Computer Vision Engineer (Detection, Tracking & 2D Metric Calibration Specialist)

Upwork, US
Contractor

£40 - £120

About the Role

Project Context

CrackCoach is an AI platform for automatic analysis of show-jumping videos. This role builds the IMAGE-level perception and geometry stack that everything depends on: detection, tracking, obstacle understanding, jump segmentation, and metric calibration in real-world competition footage. Without a rock-solid perception and geometric foundation, pose estimation, biomechanics, and AI coaching are not reliable.

Core Mission and Responsibilities

You will design, implement, and validate a production-grade computer vision pipeline capable of ingesting raw competition videos and producing robust, structured, and metric-aware outputs. Your responsibilities include:

• Video ingestion and preprocessing: handle codecs, resolutions, FPS, orientation, stabilization, and cropping policies.
• Horse-and-rider detection using state-of-the-art detectors (YOLO / RT-DETR / Detectron2 or equivalent).
• Persistent tracking across frames (ByteTrack, BoT-SORT, DeepSORT, Kalman-based trackers).
• Obstacle detection and scene understanding for show-jumping arenas (rails, poles, standards).
• Obstacle-to-jump association logic: correctly identify which obstacle is being jumped and when.
• Automatic segmentation of a full round video into individual jump clips (per-obstacle segments).
• 2D trajectory reconstruction of the horse in image space with stable, low-jitter trajectories.

2D Metric Calibration (Image → Ground Plane)

In addition to perception, this role includes implementing a robust 2D metric calibration module:

• Estimate a ground-plane homography (image → ground) using stable scene references such as obstacle bases or other ground contact points.
• Compute a pixel-to-meter scale, ideally leveraging known or user-declared obstacle heights (e.g. “course at 1.35m”) when available.
• Project horse trajectories from image space to ground-plane coordinates in meters.
• Enable metric estimates such as:
  • approach speed (m/s)
  • distances between obstacles (m)
  • take-off and landing distances at ground level (m)
  • approximate stride length at ground level (when combined later with biomechanics)
• Provide a calibration confidence indicator and gracefully fall back to relative (pixel-based) measures when calibration is unreliable.

The calibration module must be robust, non-blocking, and designed for real-world competition footage (single camera, uncontrolled viewpoints).

Required Technical Skills

• Strong background in computer vision applied to video (sports footage experience is a strong plus).
• Proven experience with object detection (YOLO family, Detectron2, RT-DETR, etc.).
• Multi-object tracking expertise (ByteTrack / BoT-SORT / DeepSORT; handling occlusions and ID switches).
• Experience with segmentation models (Mask R-CNN, YOLO-Seg, SAM-family) if needed for background removal.
• Solid understanding of image-space geometry and camera perspective limitations.
• Experience implementing 2D metric calibration using planar homography and RANSAC.
• Comfortable working with pixel-to-meter conversions and expressing metric uncertainty.
• Advanced Python and OpenCV; deep learning framework (PyTorch preferred).
• Experience building modular, maintainable pipelines with clear interfaces and exports.

Key Technical Challenges

• Highly variable camera angles, zoom levels, and lighting conditions.
• Dynamic occlusions from obstacles, rails, other horses, and spectators.
• Motion blur and compression artifacts in user-generated videos.
• Background clutter and false positives (banners, rails, similar shapes).
• Maintaining stable trajectories despite noisy detections and temporary misses.
• Correct obstacle differentiation and obstacle association in multi-obstacle scenes.
• Metric calibration with a single camera, limited scene control, and partial reference data.
• Performance constraints: processing HD videos in minutes, not hours.
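
For candidates who want a concrete picture of the 2D metric calibration described above, the following is a minimal sketch, not the project's implementation. It assumes exactly four image-to-ground correspondences (for example obstacle bases with known ground positions) and solves the homography directly in pure Python. The posting asks for planar homography with RANSAC; in practice OpenCV's `cv2.findHomography` with the `cv2.RANSAC` method would typically replace the four-point solve when correspondences are noisy. All function names, the check-point idea, and the 0.25 m error threshold are hypothetical choices made for illustration.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography_from_4_points(image_pts, ground_pts):
    """Return a 3x3 matrix H (with h33 fixed to 1) mapping pixels to meters."""
    a, b = [], []
    for (u, v), (x, y) in zip(image_pts, ground_pts):
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x]); b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y]); b.append(y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def to_ground(h, pt):
    """Project one pixel coordinate onto the ground plane (meters)."""
    u, v = pt
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)

def speed_with_fallback(h, check_px, check_m, track_px, fps, max_err_m=0.25):
    """Mean speed along a track: m/s if calibration looks reliable, else px/s.

    Reliability here is the mean reprojection error (meters) on extra
    reference points that were not used to fit H (hypothetical criterion).
    """
    errs = [math.dist(to_ground(h, p), g) for p, g in zip(check_px, check_m)]
    reliable = sum(errs) / len(errs) <= max_err_m
    pts = [to_ground(h, p) for p in track_px] if reliable else list(track_px)
    path = sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))
    return path * fps / (len(pts) - 1), ("m/s" if reliable else "px/s")
```

A caller would fit `H` once per clip from the reference points, then pass the tracked ground-contact positions and the video frame rate to `speed_with_fallback`; the returned unit string tells downstream stages whether the value is metric or only relative.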

Expected Deliverables

• A fully modular computer vision pipeline (source code) that ingests raw video and outputs:
  • detections
  • tracks
  • obstacle detections
  • jump segments
  • 2D trajectories
  • ground-plane metric projections (when calibration is reliable)
• A 2D calibration module producing pixel-to-meter scale, ground-plane mapping, and confidence scores.
• Trained detection/segmentation models (weights + training scripts) when custom training is required.
• Clean data exports (JSON / CSV) and stable ROI frame exports for pose estimation and biomechanics.
• Visual validation outputs (overlays showing boxes, tracks, obstacles, jump boundaries, and metric projections).
• Clear technical documentation defining interfaces and data formats for downstream pose estimation, biomechanics, and AI coaching stages.

Important Notes

• This role does NOT include pose estimation or biomechanics (handled by separate specialists).
• Metric calibration is 2D ground-plane based, not full 3D reconstruction.
• Robustness and graceful degra

About the Company

Upwork

Job Details

Job Type

Contractor

Location

US

Reference

EXT-MM95YP8H-U7MI

Posted

2 Mar 2026
