AI Video Automation Platform - DEVELOPER
Budget: $15.0 - $35.0
HOURLY / PART_TIME
⭐ 5.00 (1)
USA
web-application, api, artificial-intelligence, python, javascript, api-integration, machine-learning, java
I want to build an internal web platform that turns a script into a finished YouTube video automatically.
I want to upload a script, select a saved AI character (avatar) and voice, and click Create. The system then does the following automatically via API:
Generates the voiceover
Generates AI avatar footage of the character speaking the script
Pulls matching stock b-roll for each scene
Assembles the complete edited video (avatar intro, b-roll body, split-screen segments) — no human editor
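To illustrate the assembly step, here is a minimal sketch of joining pre-rendered segments with FFmpeg's concat demuxer. The segment file names, the list file name, and both helper functions are hypothetical; the snippet only builds the list file text and the command, it does not run FFmpeg.

```python
from typing import List, Sequence


def concat_list(clips: Sequence[str]) -> str:
    """Return the text of a list file for FFmpeg's concat demuxer."""
    lines = []
    for clip in clips:
        # Keep the sketch simple: refuse paths that would need quoting rules.
        if "'" in clip or "\n" in clip:
            raise ValueError(f"unsupported character in path: {clip!r}")
        lines.append(f"file '{clip}'")
    return "\n".join(lines) + "\n"


def concat_command(list_path: str, output: str) -> List[str]:
    """Build the argv that joins the listed clips without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", list_path,
        "-c", "copy",
        output,
    ]


# Hypothetical segment names following the avatar intro / b-roll / split-screen order.
segments = ["avatar_intro.mp4", "broll_body.mp4", "split_screen.mp4"]
list_text = concat_list(segments)
command = concat_command("segments.txt", "final.mp4")
```

In practice the list text would be written to segments.txt and the command passed to subprocess.run(command, check=True). Stream copy only works when all segments share codec, resolution, and frame rate; otherwise the segments need a re-encode first.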
These channels are the output format and quality we're building toward. Watch a few videos before applying:
www.youtube.com/@EliasYoderAmish/videos
www.youtube.com/@EstherYoderAmish/featured
www.youtube.com/@AmishPantry
www.youtube.com/@TheGraysonReport
www.youtube.com/@DrEdmundHale
Dashboard steps & capabilities:
Script upload → automatic scene breakdown with timestamps and b-roll keywords (LLM API)
Character library: save avatars per channel + create new ones (AI image → animated talking avatar)
Voice library: saved voices per character
Auto-pull b-roll from Pexels/Pixabay APIs matched to each scene
Auto-generate avatar footage speaking the script (typically the opening minutes + transitions, not the full runtime)
Auto-assemble the final video following editing rules (avatar open → b-roll → split-screen → avatar close)
Render queue, project history, per-video cost tracking
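The scene breakdown and editing rules above could be represented roughly as follows. This is a sketch under stated assumptions: the Scene fields, the 150 words-per-minute narration pace, and the rule that every third middle scene becomes split-screen are all hypothetical, not part of the spec. In the real system the LLM API would supply the scene text and keywords.

```python
from dataclasses import dataclass
from typing import List

WORDS_PER_MINUTE = 150  # assumed narration pace, not taken from the posting


@dataclass
class Scene:
    text: str
    keywords: List[str]      # b-roll search terms for this scene
    start_s: float = 0.0
    end_s: float = 0.0
    layout: str = "broll"


def plan_timeline(scenes: List[Scene], split_every: int = 3) -> List[Scene]:
    """Assign timestamps from word counts and a layout per the editing order."""
    t = 0.0
    last = len(scenes) - 1
    for i, scene in enumerate(scenes):
        duration = len(scene.text.split()) / WORDS_PER_MINUTE * 60
        scene.start_s, scene.end_s = round(t, 2), round(t + duration, 2)
        t += duration
        if i == 0:
            scene.layout = "avatar_open"
        elif i == last:
            scene.layout = "avatar_close"
        elif i % split_every == 0:   # hypothetical split-screen cadence
            scene.layout = "split_screen"
        else:
            scene.layout = "broll"
    return scenes


# Five placeholder scenes of five words each (two seconds at the assumed pace).
timeline = plan_timeline(
    [Scene("one two three four five", ["farm"]) for _ in range(5)]
)
```

Each scene's keywords would then drive the Pexels/Pixabay search, and the layout field tells the assembler which template to render for that time range.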
Requirements:
Total generation cost must land around $3-5 per video (15–20 min videos). Expensive convenience APIs like HeyGen can't be the default; use low-cost or open-source approaches (open-source lip-sync on rented GPUs, budget TTS, free stock APIs, FFmpeg assembly), or propose your own way to hit the number. HeyGen and other tools will still be available as options.
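A quick way to sanity-check a proposal against the cost target is a per-video estimate like the sketch below. Every rate in it is a made-up placeholder, not a real vendor price, and the Rates fields and function names are hypothetical. It mainly shows why limiting avatar footage to part of the runtime matters for the budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    tts_per_min: float      # voiceover cost per narrated minute
    avatar_per_min: float   # lip-sync / avatar cost per avatar minute
    llm_per_video: float    # scene breakdown cost per script
    render_per_min: float   # assembly / render cost per output minute


def estimate_cost(total_min: float, avatar_min: float, rates: Rates) -> float:
    """Return the estimated generation cost in dollars for one video."""
    if not 0 <= avatar_min <= total_min:
        raise ValueError("avatar minutes must be between 0 and total minutes")
    cost = (
        total_min * rates.tts_per_min
        + avatar_min * rates.avatar_per_min
        + rates.llm_per_video
        + total_min * rates.render_per_min
    )
    return round(cost, 2)


def within_budget(cost: float, ceiling: float = 5.0) -> bool:
    """Check the estimate against the upper end of the $3-5 target."""
    return cost <= ceiling


# Placeholder rates for illustration only; replace with measured numbers.
PLACEHOLDER = Rates(tts_per_min=0.05, avatar_per_min=0.50,
                    llm_per_video=0.10, render_per_min=0.02)

partial_avatar = estimate_cost(total_min=20, avatar_min=3, rates=PLACEHOLDER)
full_avatar = estimate_cost(total_min=20, avatar_min=20, rates=PLACEHOLDER)
```

With these placeholder rates, three avatar minutes in a 20-minute video stays within the ceiling, while avatar footage for the full runtime does not. The same function can feed the per-video cost tracking once real rates are logged.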
Every layer must be swappable: voice, avatar, b-roll, and rendering providers can be switched later without rebuilding. The dashboard should let me select which tool to use for each layer (for example, which voiceover tool or which avatar tool), and the API connections stay stored in the dashboard.
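One possible shape for the swappable-layer requirement is a small registry that maps each layer to named providers and remembers which one is selected. This is only a sketch: the layer names, the ProviderRegistry class, and the two stand-in voice providers are hypothetical, and real providers would call the vendor's API instead of returning a string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Layer names taken from the requirement; the identifiers are assumptions.
LAYERS = ("voice", "avatar", "broll", "render")

Provider = Callable[[str], str]


@dataclass
class ProviderRegistry:
    """Holds the providers for each layer and the currently selected one."""
    providers: Dict[str, Dict[str, Provider]] = field(
        default_factory=lambda: {layer: {} for layer in LAYERS}
    )
    active: Dict[str, str] = field(default_factory=dict)

    def register(self, layer: str, name: str, fn: Provider) -> None:
        if layer not in self.providers:
            raise ValueError(f"unknown layer: {layer}")
        self.providers[layer][name] = fn
        self.active.setdefault(layer, name)  # first registered is the default

    def select(self, layer: str, name: str) -> None:
        if name not in self.providers.get(layer, {}):
            raise KeyError(f"no provider {name!r} for layer {layer!r}")
        self.active[layer] = name

    def run(self, layer: str, payload: str) -> str:
        return self.providers[layer][self.active[layer]](payload)


# Stand-in providers for illustration only.
def budget_tts(text: str) -> str:
    return f"budget_tts:{len(text)} chars"


def premium_tts(text: str) -> str:
    return f"premium_tts:{len(text)} chars"


registry = ProviderRegistry()
registry.register("voice", "budget_tts", budget_tts)
registry.register("voice", "premium_tts", premium_tts)
```

The rest of the pipeline would only ever call registry.run("voice", script_text), so switching tools from the dashboard becomes a single registry.select call rather than a rebuild.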
Fully automated end-to-end — script in, video out. A simple review screen to swap clips before final render is a plus.
Relevant skills: FFmpeg / programmatic video assembly, AI media pipelines (ComfyUI, Replicate, fal.ai, RunPod), TTS and image/video generation APIs, full-stack web development.
To apply, start your application with the word BISON and answer:
How would you build this, layer by layer, with the specific tools you'd use and cost per video?
Show me anything you've built involving AI video, avatars, or automated media.
Timeline to a working v1?
Final step before hiring is a small paid test: I give you a short script, you produce a 2-minute sample video.