ML/AI/Computer Vision engineer
Budget: -
HOURLY / PART_TIME
⭐ 4.95 (17)
United States
tensorflow, computer-vision, artificial-intelligence, python, neural-networks, machine-learning
The role
We're looking for a data engineer to own the full pipeline from raw sensor data to trained, validated models.
The project involves multi-modal sensing in a live urban environment — LiDAR, cameras, and edge
compute — producing research-grade outputs for a government client. You'll handle data ingestion, labeling,
model training, output delivery, and field ops support.
What you'll do
TRAINING DATA PIPELINE
• Ingest and organize multi-modal sensor data from Ouster LiDAR (point clouds), video, and radar into a
structured data lake
• Build labeling pipelines for autonomous vehicle (AV) detection training: annotating vehicles by rooftop
morphology, sensor cluster geometry, and visual fleet signature
• Work with reference data (photos, 3D scans, video) to construct ground-truth datasets for model
training and validation
• Implement data augmentation strategies for edge cases: occlusion, low light, adverse weather,
high-density traffic
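To give a sense of the augmentation work, here is a minimal sketch of one common approach for point clouds: randomly dropping points to imitate partial occlusion and adding small Gaussian jitter to imitate sensor noise. The function name, parameters, and default values are illustrative assumptions, not project settings, and a real pipeline would likely use NumPy arrays rather than plain tuples.

```python
import random


def augment_point_cloud(points, drop_prob=0.2, jitter_sigma=0.01, rng=None):
    """Return a copy of `points` with random dropout and Gaussian jitter.

    points       : iterable of (x, y, z) tuples, in metres (assumed unit)
    drop_prob    : probability of removing each point (crude occlusion proxy)
    jitter_sigma : standard deviation of per-axis noise (hypothetical default)
    rng          : optional random.Random instance for reproducible runs
    """
    rng = rng or random.Random()
    augmented = []
    for x, y, z in points:
        # Skip the point with probability drop_prob.
        if rng.random() < drop_prob:
            continue
        augmented.append((
            x + rng.gauss(0.0, jitter_sigma),
            y + rng.gauss(0.0, jitter_sigma),
            z + rng.gauss(0.0, jitter_sigma),
        ))
    return augmented


if __name__ == "__main__":
    cloud = [(float(i), 0.0, 1.5) for i in range(10)]
    print(len(augment_point_cloud(cloud, rng=random.Random(0))))
```

Passing a seeded `random.Random` keeps augmented datasets reproducible, which matters when validation results must be traced back to a specific training set.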
MODEL TRAINING & VALIDATION
• Train and evaluate object detection and classification models for AV identification using 3D point cloud
and image fusion
• Train behavior detection models for objectively defined traffic events — turn maneuvers, lane changes,
acceleration/deceleration profiles — with explicit event logic, thresholds, and accuracy metrics
• Optimize models for edge deployment on constrained hardware (low power, no GPU cluster)
• Define and report precision, recall, false positive/negative rates per detection class; maintain a
validation log against ground truth
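As an illustration of the per-class reporting described above, the sketch below computes one-vs-rest precision, recall, false positive rate, and false negative rate from paired ground-truth and predicted labels. It assumes detections have already been matched to ground-truth objects; the class labels `"av"` and `"car"` are hypothetical.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """One-vs-rest precision, recall, FPR and FNR for each class label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def ratio(num, den):
        # Report 0.0 when a rate is undefined (no samples in the denominator).
        return num / den if den else 0.0

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tn = n - tp[label] - fp[label] - fn[label]
        report[label] = {
            "precision": ratio(tp[label], tp[label] + fp[label]),
            "recall": ratio(tp[label], tp[label] + fn[label]),
            "fpr": ratio(fp[label], fp[label] + tn),
            "fnr": ratio(fn[label], fn[label] + tp[label]),
        }
    return report


if __name__ == "__main__":
    truth = ["av", "av", "car", "car", "car", "av"]
    preds = ["av", "car", "car", "car", "av", "av"]
    for label, scores in per_class_metrics(truth, preds).items():
        print(label, scores)
```

Appending each run's report to a dated file alongside the model version and dataset identifier is one simple way to keep the validation log.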
DATA OUTPUT & DELIVERY
• Design and maintain the output schema: object-level tracks, event records, trajectory data, and
aggregated summaries in CSV, JSON, GeoJSON, and Parquet
• Ensure all outputs are timestamped, spatially referenced, confidence-scored, and fully documented for
downstream research use
• Build batch delivery pipelines to the data hub; handle connectivity degradation gracefully with on-device
buffering
• Strip or anonymize all PII (faces, license plates) at the edge before any data leaves the field device
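For a concrete picture of a timestamped, spatially referenced, confidence-scored output, here is a minimal sketch of one event record as a GeoJSON Feature. The property names (`track_id`, `event_type`, `timestamp`, `confidence`) are assumptions for illustration only; the actual schema is part of what this role will design.

```python
import json
from datetime import datetime, timezone


def make_event_feature(track_id, event_type, lon, lat, timestamp, confidence):
    """Build a GeoJSON Feature for a single traffic event (hypothetical schema)."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return {
        "type": "Feature",
        # GeoJSON (RFC 7946) orders coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "track_id": track_id,
            "event_type": event_type,
            # Normalize to UTC and store as an ISO 8601 string.
            "timestamp": timestamp.astimezone(timezone.utc).isoformat(),
            "confidence": confidence,
        },
    }


if __name__ == "__main__":
    feature = make_event_feature(
        track_id="trk-0001",
        event_type="lane_change",
        lon=-122.4000,
        lat=37.7700,
        timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
        confidence=0.91,
    )
    print(json.dumps(feature, indent=2))
```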
INFRASTRUCTURE & OPS
• Stand up and maintain the edge-to-cloud data pipeline: edge inference → local buffer → cellular/fiber
uplink → cloud storage → data portal
• Monitor data completeness, system uptime, and sensor health during the live pilot
• Support commissioning and calibration at field deployment locations
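The local buffer stage of the edge-to-cloud pipeline could be sketched as a small durable outbox: records are written to on-device storage first and removed only after a successful upload, so a dropped uplink delays delivery rather than losing data. This is one possible design under stated assumptions, not the project's implementation; the `LocalBuffer` class and the `send` callable are hypothetical, and SQLite is chosen here only because it ships with Python.

```python
import json
import sqlite3


class LocalBuffer:
    """Durable FIFO outbox backed by SQLite (illustrative sketch)."""

    def __init__(self, path=":memory:"):
        # Use a file path on the device for persistence across restarts.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, record):
        """Persist one JSON-serializable record before any upload attempt."""
        self.db.execute(
            "INSERT INTO outbox (payload) VALUES (?)", (json.dumps(record),)
        )
        self.db.commit()

    def pending(self):
        """Number of records still waiting for delivery."""
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]

    def flush(self, send, batch_size=100):
        """Send records oldest first; stop at the first network failure.

        `send` is a caller-supplied function that raises OSError (for
        example ConnectionError) when the uplink is unavailable.
        Returns the number of records delivered.
        """
        rows = self.db.execute(
            "SELECT id, payload FROM outbox ORDER BY id LIMIT ?", (batch_size,)
        ).fetchall()
        delivered = 0
        for row_id, payload in rows:
            try:
                send(json.loads(payload))
            except OSError:
                break  # keep the record and retry on the next flush
            self.db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            self.db.commit()
            delivered += 1
        return delivered
```

Because a record is deleted only after `send` returns, this gives at-least-once delivery: the receiving side should tolerate an occasional duplicate, for example by deduplicating on a record identifier.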
What we're looking for
Required
• Python (NumPy, Pandas, PyTorch or TensorFlow)
• Experience with point cloud data (LiDAR, PCL, Open3D)
• Object detection pipelines (YOLO, PointNet, or similar)
• Data labeling and annotation workflows
• ETL pipeline design and batch data delivery
• Geospatial data formats (GeoJSON, shapefiles, spatial referencing)
• AWS or equivalent cloud (S3, Lambda, or EC2)
• Git, CI/CD basics
Strong plus
• Multi-modal sensor fusion (LiDAR + camera)
• Edge model optimization (TensorRT, ONNX, quantization)
• Experience with ITS (intelligent transportation systems), smart city, or AV-adjacent deployments
• Computer vision for traffic or roadway monitoring
• Privacy-preserving ML (on-device anonymization)
• Ouster or Velodyne LiDAR SDK familiarity
• Research data pipeline experience (academic or government)