Jack Pan

#computer vision

6 posts

Phase 1: Four signals for review-frame sampling

A uniform "every Nth frame" sampler wastes labeler time on near-identical frames the model already nailed. Four signals do better — segment boundaries, per-segment uniform, low confidence, bbox jumps.

computer vision, data pipeline, label studio
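As a rough illustration of how the four signals named above might combine, here is a minimal sketch. The function name, its parameters, the thresholds, and the input layout (segment ranges, per-frame confidences, per-frame bbox centers) are all assumptions made for this example, not the post's actual implementation.

```python
import math


def select_review_frames(segments, confidences, centers,
                         per_segment=1, min_conf=0.5, max_jump=50.0):
    """Return sorted frame indices worth showing to a reviewer.

    segments:    list of (start, end) frame ranges, end exclusive (assumed layout)
    confidences: per-frame model confidence in [0, 1]
    centers:     per-frame (x, y) bbox center in pixels
    """
    picked = set()

    for start, end in segments:
        # Signal 1: the first frame of every segment (segment boundary).
        picked.add(start)
        # Signal 2: up to `per_segment` evenly spaced frames inside the segment.
        step = max(1, (end - start) // (per_segment + 1))
        picked.update(list(range(start + step, end, step))[:per_segment])

    # Signal 3: frames where the model reported low confidence.
    for i, conf in enumerate(confidences):
        if conf < min_conf:
            picked.add(i)

    # Signal 4: frames where the bbox center jumped far from the previous frame.
    for i in range(1, len(centers)):
        (x0, y0), (x1, y1) = centers[i - 1], centers[i]
        if math.hypot(x1 - x0, y1 - y0) > max_jump:
            picked.add(i)

    return sorted(picked)


# Toy episode: two 6-frame segments, one low-confidence frame, one bbox jump.
confidences = [0.9] * 12
confidences[4] = 0.3
centers = [(100, 100)] * 12
centers[9] = (300, 100)
frames = select_review_frames([(0, 6), (6, 12)], confidences, centers)
```

Collecting the picks in a set means a frame flagged by several signals is reviewed only once.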

Phase 1: Keep tests pure-Python with lazy imports

`mediapipe`, `ultralytics`, `cv2` are slow to import and need model weights at runtime. The trick that keeps the test suite small, fast, and weights-free is putting those imports inside function bodies, not at module top level.

python, testing, computer vision
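The lazy-import pattern described above can be sketched as follows. The helper names `box_area` and `read_first_frame` are hypothetical, chosen only for this example; the point is that importing the module and testing its pure-Python parts never loads `cv2`.

```python
def box_area(box):
    """Pure-Python helper: area of an (x1, y1, x2, y2) box. No heavy deps."""
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)


def read_first_frame(video_path):
    """Hypothetical helper: cv2 is imported only when this function runs."""
    import cv2  # deferred, so importing this module never touches cv2

    capture = cv2.VideoCapture(video_path)
    try:
        ok, frame = capture.read()
    finally:
        capture.release()
    return frame if ok else None


# A test can exercise the pure helper without cv2 being installed at all.
area = box_area((10, 20, 50, 80))
```

The cost is that a missing dependency surfaces at call time rather than at import time, which is usually an acceptable trade for a fast test suite.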

Phase 2: what counts as ground truth

When you harvest Label Studio exports to fine-tune the next model, treating *everything* in the export as ground truth is how you train the model on its own predictions. The filters worth applying before a single byte goes into a training set.

computer vision, label studio, fine-tuning

Phase 1: Why one episode becomes three Label Studio projects

A deep dive on the multi-project pattern for video pre-annotation — what forces the split, how one episode fans out, and when not to fight Label Studio's data model.

computer vision, label studio, data pipeline

Phase 1: Two fps knobs in a video pre-annotation pipeline

Inference frame rate and review-frame sampling look like one thing and aren't. What each knob actually buys, and what breaks if you treat them as the same.

computer vision, data pipeline, mediapipe

Phase 1: Notes from building a video pre-annotation pipeline

A Phase-1 pipeline for embodied-robot video data — MediaPipe + YOLO inference, action segmentation, Label Studio import — plus the boring path-abstraction decision that kept it from collapsing.

computer vision, data pipeline, label studio