<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Jack Pan</title><description>Jack Pan&apos;s blog — notes on data platforms, computer vision, and engineering</description><link>https://jackpan.me/</link><item><title>Phase 1: Canonical filenames buy zero-config batching</title><link>https://jackpan.me/posts/canonical-filenames-zero-config-batching/</link><guid isPermaLink="true">https://jackpan.me/posts/canonical-filenames-zero-config-batching/</guid><description>A short naming convention like `NN_NNN_ego.mp4` encodes enough metadata to run the entire batch with one CLI flag. Why this is cheap to add, and what it forces you to *not* build.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 1: Don&apos;t use Label Studio Source Storage for local files</title><link>https://jackpan.me/posts/dont-use-label-studio-source-storage/</link><guid isPermaLink="true">https://jackpan.me/posts/dont-use-label-studio-source-storage/</guid><description>A short warning. LS&apos;s &quot;Cloud Storage → Source Storage&quot; feature looks like exactly what you want for local data. Use it and you get tens of thousands of phantom tasks that collide with the ones you actually imported.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 1: Four signals for review-frame sampling</title><link>https://jackpan.me/posts/four-signal-frame-sampling/</link><guid isPermaLink="true">https://jackpan.me/posts/four-signal-frame-sampling/</guid><description>A uniform &quot;every Nth frame&quot; sampler wastes labeler time on near-identical frames the model already nailed. 
Four signals do better — segment boundaries, per-segment uniform, low confidence, bbox jumps.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 1: Keep tests pure-Python with lazy imports</title><link>https://jackpan.me/posts/lazy-imports-pure-python-tests/</link><guid isPermaLink="true">https://jackpan.me/posts/lazy-imports-pure-python-tests/</guid><description>`mediapipe`, `ultralytics`, `cv2` are slow to import and need model weights at runtime. The trick that keeps the test suite small, fast, and weights-free is putting those imports inside function bodies, not at module top level.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 1: One module owns every path on disk</title><link>https://jackpan.me/posts/one-module-owns-every-path/</link><guid isPermaLink="true">https://jackpan.me/posts/one-module-owns-every-path/</guid><description>Why a video pre-annotation pipeline ends up with one `layout` module that knows where everything lives, and what breaks when six different parts of the codebase each compute paths their own way.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 2: the eval set must never see pre-annotations</title><link>https://jackpan.me/posts/phase-2-clean-eval-set/</link><guid isPermaLink="true">https://jackpan.me/posts/phase-2-clean-eval-set/</guid><description>The single decision that most human-in-the-loop projects get wrong. If your eval labels were seeded by the model&apos;s own predictions, every F1 number you ever report is biased toward the model. 
The fix is cheap on day one, expensive on day forty.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 2: what counts as ground truth</title><link>https://jackpan.me/posts/phase-2-ground-truth/</link><guid isPermaLink="true">https://jackpan.me/posts/phase-2-ground-truth/</guid><description>When you harvest Label Studio exports to fine-tune the next model, treating *everything* in the export as ground truth is how you train the model on its own predictions. The filters worth applying before a single byte goes into a training set.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 2: don&apos;t retrain on every export</title><link>https://jackpan.me/posts/phase-2-retraining-cadence/</link><guid isPermaLink="true">https://jackpan.me/posts/phase-2-retraining-cadence/</guid><description>After every batch of human-corrected episodes gets exported from Label Studio, the temptation is to retrain immediately. The reasons not to, and a cheap cadence trigger that actually fires when retraining will help.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 2: version the slice, not the snapshot</title><link>https://jackpan.me/posts/phase-2-version-the-slice/</link><guid isPermaLink="true">https://jackpan.me/posts/phase-2-version-the-slice/</guid><description>When you fine-tune model v3, you need to be able to answer &quot;which exported corrections went into it&quot;. Snapshotting the whole training set is the obvious answer and the wrong one. 
Track the inputs and the derivation; the training set is a function of them.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 1: Why one episode becomes three Label Studio projects</title><link>https://jackpan.me/posts/three-label-studio-projects/</link><guid isPermaLink="true">https://jackpan.me/posts/three-label-studio-projects/</guid><description>A deep dive on the multi-project pattern for video pre-annotation — what forces the split, how one episode fans out, and when not to fight Label Studio&apos;s data model.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 1: Two fps knobs in a video pre-annotation pipeline</title><link>https://jackpan.me/posts/two-fps-knobs/</link><guid isPermaLink="true">https://jackpan.me/posts/two-fps-knobs/</guid><description>Inference frame rate and review-frame sampling look like one thing and aren&apos;t. What each knob actually buys, and what breaks if you treat them as the same.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>Phase 1: Notes from building a video pre-annotation pipeline</title><link>https://jackpan.me/posts/video-pre-annotation-pipeline/</link><guid isPermaLink="true">https://jackpan.me/posts/video-pre-annotation-pipeline/</guid><description>A Phase-1 pipeline for embodied-robot video data — MediaPipe + YOLO inference, action segmentation, Label Studio import — plus the boring path-abstraction decision that kept it from collapsing.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate></item><item><title>CIT CTF 2026 · Debug Disaster: a leaky debug page and a forgotten route</title><link>https://jackpan.me/posts/cit-ctf-2026-debug-disaster/</link><guid isPermaLink="true">https://jackpan.me/posts/cit-ctf-2026-debug-disaster/</guid><description>Flask debug=True leaks more than tracebacks — it leaked the source code of a forgotten route that dumps .env in cleartext.</description><pubDate>Sun, 26 Apr 2026 00:00:00 GMT</pubDate></item>
<item><title>CIT CTF 2026 · A Massive Problem: mass assignment via dict.update</title><link>https://jackpan.me/posts/cit-ctf-2026-mass-assignment/</link><guid isPermaLink="true">https://jackpan.me/posts/cit-ctf-2026-mass-assignment/</guid><description>The challenge name spells it out. At register time, record.update(incoming) lets the role field in the request body overwrite the hard-coded default.</description><pubDate>Sun, 26 Apr 2026 00:00:00 GMT</pubDate></item><item><title>CIT CTF 2026: a few writeups worth keeping</title><link>https://jackpan.me/posts/cit-ctf-2026-overview/</link><guid isPermaLink="true">https://jackpan.me/posts/cit-ctf-2026-overview/</guid><description>I played CIT CTF 2026 over the holiday — this is the index post for a short series of writeups covering Web, Crypto and Misc challenges.</description><pubDate>Sun, 26 Apr 2026 00:00:00 GMT</pubDate></item><item><title>CIT CTF 2026 · Baby Exponent: the most textbook RSA e=3</title><link>https://jackpan.me/posts/cit-ctf-2026-baby-exponent/</link><guid isPermaLink="true">https://jackpan.me/posts/cit-ctf-2026-baby-exponent/</guid><description>Public exponent e=3, plaintext small enough that m³ never overflowed the modulus. Integer cube root and done.</description><pubDate>Sat, 25 Apr 2026 00:00:00 GMT</pubDate></item><item><title>CIT CTF 2026 · Dog Barking: three bark durations, one custom encoding</title><link>https://jackpan.me/posts/cit-ctf-2026-dog-barking/</link><guid isPermaLink="true">https://jackpan.me/posts/cit-ctf-2026-dog-barking/</guid><description>78 seconds of dog barks. Three distinct bark durations encode bit 0, bit 1, and the byte separator.
Not Morse — a custom code.</description><pubDate>Sat, 25 Apr 2026 00:00:00 GMT</pubDate></item><item><title>CIT CTF 2026 · Server Components: RCE via Next.js 15 RSC deserialization</title><link>https://jackpan.me/posts/cit-ctf-2026-server-components/</link><guid isPermaLink="true">https://jackpan.me/posts/cit-ctf-2026-server-components/</guid><description>package.json pins next@15.0.4 — squarely inside the window for this year&apos;s React Server Components deserialization RCE.</description><pubDate>Sat, 25 Apr 2026 00:00:00 GMT</pubDate></item></channel></rss>