AI Football Event Spotting Engine - Milestone-Based Contract (3 Months)
Budget: $380.0
FIXED
★ 0.00 (0)
India
data-science, python, machine-learning
ββββββββββββββββββββββββββββββββββββββββ
π PROJECT OVERVIEW
ββββββββββββββββββββββββββββββββββββββββ
We are building Roots & Rise IQ, an AI-powered Football Event Spotting Engine that detects, classifies, and timestamps 15 discrete football action events from single-camera grassroots match footage in near real-time.
The camera setup is a single pitch-side PTZ (pan-tilt-zoom) unit similar to Veo. It continuously tracks the ball, producing a stabilized, cropped tracking view. The system must process a 30-second clip and return a structured JSON event stream within 25 seconds.
This is NOT a research project. We need a working, deployable system with measurable performance scores at each milestone. Every milestone is independently evaluated using a VisionScore metric on 100 held-out private test clips.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚽ WHAT THE SYSTEM MUST DO
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Detect and timestamp 15 action event classes from video:
• pass, pass_received, recovery, tackle, interception
• ball_out_of_play, clearance, take_on, substitution
• block, aerial_duel, shot, save, foul, goal
Output format (JSON array, sorted by timestamp):
[
  {"action": "shot", "timestamp": 14.40, "team": "home"},
  {"action": "save", "timestamp": 14.84, "team": "away"}
]
Hard constraints:
• Processing latency: ≤ 25 seconds per 30-second clip (non-negotiable)
• Inference must be incremental, not buffered until clip end
• Output must pass schema validation with zero violations
• Deployed on NVIDIA T4 GPU (16 GB VRAM)
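For illustration only, the output rules stated in this posting (JSON array, sorted by timestamp, one of the 15 action names, team exactly "home" or "away") can be pre-checked with a small script like the one below. This is a sketch, not the official validator: the function name `find_violations` is hypothetical, and treating `timestamp` as a number of seconds is an assumption.

```python
# Sketch of a local pre-check for the output rules listed in this posting.
# The official schema validator is not reproduced here; names are hypothetical.
ACTIONS = {
    "pass", "pass_received", "recovery", "tackle", "interception",
    "ball_out_of_play", "clearance", "take_on", "substitution",
    "block", "aerial_duel", "shot", "save", "foul", "goal",
}
TEAMS = {"home", "away"}


def find_violations(events):
    """Return a list of violation messages; an empty list means no violations found."""
    if not isinstance(events, list):
        return ["top-level value must be a JSON array"]
    violations = []
    previous = None
    for i, event in enumerate(events):
        if not isinstance(event, dict):
            violations.append(f"event {i}: not a JSON object")
            continue
        action = event.get("action")
        timestamp = event.get("timestamp")
        team = event.get("team")
        if not isinstance(action, str) or action not in ACTIONS:
            violations.append(f"event {i}: unknown action {action!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        # Assumption: timestamps are numeric seconds from clip start.
        if isinstance(timestamp, bool) or not isinstance(timestamp, (int, float)):
            violations.append(f"event {i}: timestamp must be a number")
        else:
            if previous is not None and timestamp < previous:
                violations.append(f"event {i}: array not sorted by timestamp")
            previous = timestamp
        if not isinstance(team, str) or team not in TEAMS:
            violations.append(f"event {i}: team must be 'home' or 'away'")
    return violations


example = [
    {"action": "shot", "timestamp": 14.40, "team": "home"},
    {"action": "save", "timestamp": 14.84, "team": "away"},
]
print(find_violations(example))
```

Running a check like this before every submission is cheap insurance, since a single violation is treated as a non-submission.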
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
REQUIRED SKILLS & EXPERIENCE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
This role requires genuine hands-on experience, not theoretical knowledge. You must have:
✅ MANDATORY
• PyTorch 2.x: model training, custom loss functions, TorchScript export
• Video processing: PyAV / OpenCV / FFmpeg-based frame pipelines
• Object detection: YOLOv8 fine-tuning (player, ball, pose keypoints)
• Temporal action spotting: familiarity with SoccerNet, E2E-Spot, T-DEED, or Dense Anchor
• Multi-object tracking: BoT-SORT, ByteTrack, or DeepSORT with camera motion compensation
• ONNX Runtime + TensorRT FP16 deployment pipeline
• FastAPI async inference server
• Docker containerization with NVIDIA CUDA runtime
• GPU cloud (AWS EC2/GCP): provisioning, CUDA setup, training runs
STRONGLY PREFERRED
• Direct experience with SoccerNet-v2 or Ball Action Spotting datasets
• Optical flow + homography-based camera motion compensation
• Pose estimation (YOLOv8-Pose or ViTPose) for contact event disambiguation
• DBSCAN jersey-color clustering for team attribution
• Focal loss, balanced mixup, and curriculum training for class imbalance
• T-DEED, ASTRA, or E2E-Spot architecture implementation
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📅 MILESTONE PLAN: 3 MONTHS, 3 DELIVERIES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
This is a strict milestone-based contract. Payment is released only upon verified milestone completion.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
MILESTONE 1 - Deadline: June 28, 2026
Target: VisionScore ≥ 0.40 / 1.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Deliverables:
• Full pipeline: ingestion → camera compensation → detection → tracking → spotting → JSON output
• 8 core event classes active: pass, pass_received, recovery, ball_out_of_play, clearance, tackle, shot, aerial_duel
• Architecture: ResNet-50 / RegNet-Y + Dense Detection Anchor head with dual classification + timestamp displacement regression
• Soft-NMS post-processing, DBSCAN team attribution
• Latency: ≤ 25s per clip on T4 GPU
• Zero JSON schema violations
• Working FastAPI endpoint: submit clip URL → receive JSON event stream
• VisionScore ≥ 0.40 on 100-clip Cluster A (independent evaluation)
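To make the Soft-NMS post-processing item concrete, here is a minimal sketch of a temporal variant for a single action class. Soft-NMS is usually defined on box IoU; replacing IoU with a Gaussian of the time distance is an adaptation assumed here for illustration, not the TRD's specification. The function name `soft_nms_1d` and the values of `sigma_s` and `min_score` are hypothetical.

```python
import math


def soft_nms_1d(candidates, sigma_s=1.0, min_score=0.1):
    """Temporal Soft-NMS sketch for ONE action class.

    candidates: list of (timestamp_seconds, score) tuples.
    Instead of deleting neighbours of a strong detection outright, their
    scores are decayed according to how close in time they are; candidates
    whose decayed score drops below min_score are discarded.
    """
    remaining = list(candidates)
    kept = []
    while remaining:
        # Select the current highest-scoring candidate and keep it.
        best = max(remaining, key=lambda c: c[1])
        remaining.remove(best)
        kept.append(best)
        decayed = []
        for t, score in remaining:
            dt = t - best[0]
            # Weight is ~0 for coincident timestamps and ~1 for distant ones.
            weight = 1.0 - math.exp(-(dt * dt) / (2.0 * sigma_s * sigma_s))
            new_score = score * weight
            if new_score >= min_score:
                decayed.append((t, new_score))
        remaining = decayed
    # Return survivors in timestamp order, ready for JSON serialization.
    return sorted(kept)


shots = [(14.4, 0.9), (14.5, 0.8), (20.0, 0.7)]
print(soft_nms_1d(shots))
```

In this example the candidate at 14.5 s is suppressed as a near-duplicate of the stronger one at 14.4 s, while the distant candidate at 20.0 s survives with an almost unchanged score.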
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
MILESTONE 2 - Deadline: July 28, 2026
Target: VisionScore ≥ 0.50 / 1.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Deliverables:
• Upgrade to T-DEED end-to-end architecture (EfficientNetV2 + GSF + temporal encoder-decoder)
• All 15 action classes active with valid predictions
• YOLOv8-Pose integration: 17-point skeleton for foul/tackle and block/save disambiguation
• 5-pair disambiguation heads deployed (tackle/foul, block/save, shot/clearance, interception/recovery, pass/clearance)
• Goalkeeper identity module (position prior + kit colour classifier)
• Balanced mixup training (ASTRA-style) for class imbalance
• Mean absolute timestamp error below 0.8s for goal, foul, save, shot
• Foul precision above 0.68 (each false foul costs 7.7 weight units, so false positives are unaffordable)
• VisionScore ≥ 0.50 on 100-clip Cluster B (independent evaluation)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
MILESTONE 3 - Deadline: August 28, 2026
Target: VisionScore ≥ 0.60 / 1.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Deliverables:
• Two-model ensemble: T-DEED + ASTRA Transformer encoder-decoder with late fusion
• TensorRT FP16 export and optimization: latency ≤ 20s per clip on T4
• Curriculum training on full proprietary Roots & Rise dataset (expected mid-July)
• Goal recall above 0.80, foul + save combined precision above 0.68
• Docker production container (nvcr.io/nvidia/pytorch base): runs with a single command on an NVIDIA GPU server
• Complete API documentation (endpoint spec, request/response schema, code examples)
• Model card: capabilities, known limitations, class-level metrics
• VisionScore ≥ 0.60 on 100-clip Cluster C (independent evaluation)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ CONTRACT RESTRICTIONS & CANCELLATION POLICY
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
This project has firm, non-negotiable delivery conditions. Please read carefully before applying.
1. MILESTONE DEADLINE IS FINAL
Each milestone has a fixed evaluation submission date (June 28 / July 28 / August 28). Late submissions will NOT be accepted. The evaluation clusters are released only upon submission, and there is no extension mechanism.
2. FAILED MILESTONE = CONTRACT TERMINATION
If the official VisionScore falls below the milestone target (M1: 0.40, M2: 0.50, M3: 0.60) on the independent evaluation, the contract will be terminated at that milestone. Payment for that milestone will NOT be released.
3. LATENCY FAILURE = DISQUALIFICATION
If end-to-end processing of a 30-second clip exceeds 25 seconds on a T4 GPU, the submission is automatically disqualified, regardless of accuracy. This is an evaluation system rule, not our policy.
4. SCHEMA VIOLATIONS = REJECTION
Any output that fails JSON schema validation is treated as a non-submission. The output array must be sorted by timestamp, all field types must match exactly, and team values must be exactly "home" or "away".
5. PAYMENT TERMS
• Milestone 1 payment: released only after VisionScore ≥ 0.40 is confirmed
• Milestone 2 payment: released only after VisionScore ≥ 0.50 is confirmed
• Milestone 3 payment: released only after VisionScore ≥ 0.60 is confirmed AND Docker package is delivered
• No payment is made for partial milestone completion
6. WEEKLY PROGRESS UPDATES ARE MANDATORY
You must share a written progress update every 7 days. Failure to communicate for more than 5 consecutive working days without prior notice will be treated as project abandonment and the contract will be closed.
7. CODE OWNERSHIP
All code, models, weights, and documentation produced under this contract are the full intellectual property of Roots & Rise IQ. No proprietary dataset content may be retained, shared, or reused after contract completion.
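As a practical aid for rule 3 above, end-to-end latency can be measured locally with a simple wall-clock harness such as the sketch below. It is an illustration, not the official evaluation timer: `check_latency`, `fake_pipeline`, and the clip filename are hypothetical placeholders, and the real measurement happens on the evaluation T4.

```python
import time

CLIP_SECONDS = 30.0    # clip length stated in this posting
BUDGET_SECONDS = 25.0  # hard latency limit stated in this posting


def check_latency(run_pipeline, clip_path, budget_s=BUDGET_SECONDS):
    """Time one end-to-end run of `run_pipeline` (a hypothetical callable).

    Wall-clock time is measured around the whole call, so decoding,
    inference, post-processing, and JSON assembly are all included.
    """
    start = time.perf_counter()
    events = run_pipeline(clip_path)
    elapsed = time.perf_counter() - start
    return {
        "elapsed_s": elapsed,
        # Values below 25 / 30 (about 0.83) are within the budget.
        "real_time_factor": elapsed / CLIP_SECONDS,
        "within_budget": elapsed <= budget_s,
        "events": events,
    }


def fake_pipeline(clip_path):
    """Placeholder standing in for the real pipeline; returns no events."""
    time.sleep(0.01)
    return []


report = check_latency(fake_pipeline, "clip_0001.mp4")
print(report["within_budget"], round(report["real_time_factor"], 4))
```

When timing individual GPU stages rather than the whole call, remember that CUDA kernels are launched asynchronously, so a synchronization point is needed before reading the clock.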
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📦 RESOURCES WE PROVIDE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• Public reference dataset: https://huggingface.co/datasets/amandeepds1/roots_and_riseIQ
• SoccerNet-v2 and Ball Action Spotting datasets (publicly available; you source these)
• Proprietary Roots & Rise grassroots footage dataset, expected mid-July 2026 (delivered for M2/M3 training)
• Full Technical Requirements Document (TRD) with exact architecture specifications, scoring formulas, and evaluation methodology
• Shared project board (Notion) for task tracking
• GPU cloud access: T4 GPU for evaluation (you provision your own training compute)
• Our team is available for daily async communication and weekly sync calls
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
HOW TO APPLY: REQUIRED IN YOUR PROPOSAL
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Generic proposals will be ignored. To be considered, your proposal must include:
1. RELEVANT EXPERIENCE: Describe a specific project where you built a video-based action detection or temporal spotting system. What dataset, architecture, and metric did you use? What was the result?
2. ARCHITECTURE CHOICE FOR M1: How would you approach the Dense Detection Anchor model for milestone 1? Which backbone would you use and why? (2-3 sentences; we are testing domain knowledge, not asking for a full plan.)
3. LATENCY STRATEGY: How would you ensure the full pipeline (detection + tracking + spotting + JSON) completes within 25 seconds on a T4 GPU for a 30-second clip? What would you profile and optimize first?
4. DISAMBIGUATION APPROACH: The tackle/foul pair has a combined error cost of 10.2 VisionScore units if misclassified. In one or two sentences, how would you reduce this risk?
5. YOUR BID: Provide your total bid broken down by milestone. Bids without a milestone breakdown will not be considered.
6. TIMELINE CONFIRMATION: Confirm explicitly that you accept all three deadline dates (June 28, July 28, and August 28, 2026) and understand the cancellation policy.
ββββββββββββββββββββββββββββββββββββββββ
π― WHO THIS IS FOR
ββββββββββββββββββββββββββββββββββββββββ
This project is ideal for:
• Computer vision engineers who have worked on sports analytics or temporal video understanding
• ML engineers with SoccerNet or action spotting research/competition experience
• AI engineers who are comfortable owning an end-to-end system: training, optimization, and deployment
This project is NOT for:
• Generalist ML engineers with no video understanding experience
• Freelancers who will subcontract the work without disclosure
• Anyone who cannot commit to all three milestone deadlines
We are looking for one focused engineer or a small team (max 2) who can take full ownership and deliver.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
SCORING REFERENCE (For Your Information)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
VisionScore = (total matched scores - total false-positive penalties) / total ground-truth weights
High-stakes events:
• goal: weight 10.9, tolerance ±3.0s
• foul: weight 7.7, tolerance ±2.5s
• save: weight 7.3, tolerance ±2.0s
• shot: weight 4.7, tolerance ±2.0s
A missed goal costs 10.9 units. A false foul prediction costs 7.7 units. The scoring system heavily penalizes high-weight class errors, so your model must be precision-calibrated, not just high-recall.
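The sketch below shows one way the formula above could be computed, using only the four weights and tolerances listed here. Several details are assumptions made for illustration and are defined authoritatively only in the TRD: that a matched prediction earns the full class weight, that a false positive costs the full class weight, and that matching is greedy in timestamp order with each ground-truth event matched at most once. The function name `vision_score` is hypothetical.

```python
# Weights and tolerances for the four high-stakes classes listed above.
WEIGHTS = {"goal": 10.9, "foul": 7.7, "save": 7.3, "shot": 4.7}
TOLERANCE_S = {"goal": 3.0, "foul": 2.5, "save": 2.0, "shot": 2.0}


def vision_score(predictions, ground_truth):
    """Sketch of the VisionScore formula under the assumptions stated above.

    predictions, ground_truth: lists of (action, timestamp_seconds).
    """
    gt_total = sum(WEIGHTS[action] for action, _ in ground_truth)
    if gt_total == 0:
        raise ValueError("ground truth must contain at least one event")
    unmatched = list(ground_truth)
    matched_total = 0.0
    penalty_total = 0.0
    for action, t in sorted(predictions, key=lambda p: p[1]):
        # Assumption: greedy matching to the first unmatched ground-truth
        # event of the same class within the class tolerance.
        hit = None
        for gt in unmatched:
            if gt[0] == action and abs(gt[1] - t) <= TOLERANCE_S[action]:
                hit = gt
                break
        if hit is not None:
            unmatched.remove(hit)
            matched_total += WEIGHTS[action]   # assumed: full weight on a match
        else:
            penalty_total += WEIGHTS[action]   # assumed: full weight as penalty
    return (matched_total - penalty_total) / gt_total


truth = [("goal", 10.0), ("shot", 9.5)]
preds = [("shot", 9.0), ("foul", 20.0)]  # one correct shot, one false foul
print(round(vision_score(preds, truth), 4))
```

In this example the matched shot earns 4.7, the false foul costs 7.7, and the missed goal leaves 10.9 of the 15.6 available units unclaimed, so the score is negative. Under these same assumptions a class only adds to the score when its true positives outnumber its false positives, which is why precision on high-weight classes matters so much.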
Good luck! We look forward to hearing from engineers who are serious about this challenge. 🎯