Computer Vision Engineer Needed for Surgical Video Analysis Benchmark Development

Budget: $250 (fixed price) / ⭐ 4.99 (62) / South Korea

pytorch, python, computer-vision, machine-learning, tensorflow, opencv, artificial-intelligence, deep-learning, cuda

Job Description

We are looking for a Computer Vision Engineer to benchmark existing state-of-the-art models on a multi-camera surgical video dataset.

Dataset

- 3 full surgeries
- 5 synchronized RGB camera views per surgery
- ~9 hours total video
- Event and role annotations available
- No depth data

Tasks

1. Role Detection

Evaluate role recognition models (e.g., MM-OR).

Deliverables:
- Run inference using pretrained models
- Fine-tune models on our dataset
- Compare inference vs. fine-tuned performance
- Report evaluation metrics and implementation details

2. Event Detection

Events include:
- Instrument handoff (successful / failed / none)
- Door opening / closing
- Entry / exit events

Candidate models:
- MMAction2
- AVA
- SlowFast
- VideoMAE
- VideoMamba
- Other suitable repositories

Deliverables:
- Run inference using pretrained models
- Fine-tune models on our dataset
- Compare inference vs. fine-tuned performance
- Report event detection accuracy

3. Workflow Recognition

Workflow phases include:
- Setup / Anesthesia / Wheels In
- Draping / Surgery / Drapes Down
- Wheels Out / Turnover

Candidate models:
- I3D
- TimeSformer
- Video Swin Transformer
- Other suitable repositories

Deliverables:
- Run inference using pretrained models
- Fine-tune models on our dataset
- Compare inference vs. fine-tuned performance
- Report workflow recognition accuracy

Requirements

- Strong PyTorch experience
- Experience reproducing and adapting CVPR/ICCV/ECCV video understanding models
- Experience with action detection, video classification, or workflow recognition
- Familiarity with MMAction2, SlowFast, VideoMAE, VideoMamba, or similar frameworks

Please include:
- Relevant projects or repositories
- Estimated timeline
- Estimated budget
