
Full-Stack Developer That Can Build an AI Video Agent SaaS

Budget: $100.0 FIXED / ⭐ 4.80 (1) Nigeria

redis, next.js, react-js, api-integration, python, node.js, celery, ffmpeg

I am building a SaaS platform that acts as an autonomous AI agent for creating UGC-style (user-generated content) videos at scale. The platform is designed for people who want to produce high-quality AI influencer videos on autopilot, without manual work per video. The agent takes an AI influencer image, a script (AI-generated or user-provided), a voice (a platform voice or a custom ElevenLabs or Minimax Voice ID), a video format, and a length, and autonomously produces finished MP4 videos delivered to a dashboard.

This is a backend-heavy, API-integration-focused build. I need someone who has real experience with video generation APIs (HeyGen), async job queues, and SaaS architecture.

AI Influencer Image Upload → Avatar Creation (HeyGen Avatar IV) → Persona Saved to Library → Brand Brief → Script Generation (AI or Custom) → Motion Prompt Generation → Voice Assignment → Video Format Selection → Avatar Video Generation → Subtitle Generation → Job Queue → Delivery to Dashboard

Tech Stack (Preferred):
Backend: Python (FastAPI) or Node.js
Frontend: Next.js + Tailwind CSS
Auth & DB: Supabase
Video Generation: HeyGen API, Avatar IV specifically (motion prompt support required)
Voice: ElevenLabs API
Script AI: OpenAI API
Subtitles: OpenAI Whisper API or AssemblyAI (transcription) + FFmpeg (burn-in)
Job Queue: Celery + Redis or BullMQ
Storage: Backblaze Storage
Billing: Stripe (subscriptions + credits)
Deployment: Railway, Render, or AWS

Key Features:

AI Influencer Image Upload & Persona Creation: The user uploads a photo of their AI influencer. The image is sent to HeyGen to create a custom Avatar IV. The avatar is saved to the user's persona library so it can be reused across all future campaigns without re-uploading. Each plan supports a set number of personas (e.g. 2 for Starter, 10 for Pro, unlimited for Agency).

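To illustrate the per-plan persona limit, here is a minimal Python sketch. It uses the example figures given above (2 for Starter, 10 for Pro, unlimited for Agency); the names PERSONA_LIMITS and can_create_persona are hypothetical, and the final limits may differ.

```python
from typing import Optional

# Example limits taken from the figures in this post; None means unlimited.
# PERSONA_LIMITS and can_create_persona are hypothetical names.
PERSONA_LIMITS: dict[str, Optional[int]] = {
    "starter": 2,
    "pro": 10,
    "agency": None,
}


def can_create_persona(plan: str, current_count: int) -> bool:
    """Return True if a user on `plan` may save one more persona."""
    key = plan.lower()
    if key not in PERSONA_LIMITS:
        raise ValueError(f"unknown plan: {plan!r}")
    limit = PERSONA_LIMITS[key]
    # Unlimited plans always pass; otherwise the user must be below the cap.
    return limit is None or current_count < limit
```

A check like this would run before the uploaded image is sent to HeyGen, so a user who is already at their limit does not trigger an avatar creation call.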
AI Influencer Video Generation: Saved Avatar IV personas are used to generate talking-head and full-body videos with full-body motion, gestures, and lip-sync via the HeyGen Avatar IV API.

Script Engine (Two Modes):
AI-generated: the user inputs product name, tone, audience, and length → the OpenAI API writes the script
Custom script: the user pastes their own script directly; the agent skips writing and proceeds to motion + video generation

Motion Prompt Engine: After the script is produced (AI or custom), the OpenAI API analyzes the script's tone and content section by section and auto-generates HeyGen-compatible motion prompt instructions (e.g. hand gestures, body posture, facial expressions) that sync with what is being said. Users can set a gesture intensity level (subtle / moderate / expressive), which the agent uses to calibrate prompt language. Zero manual work is required from the user.

Voice Engine (Two Modes):
Platform voices: a curated library pulled from ElevenLabs or Minimax
Custom Voice ID: the user pastes their own ElevenLabs or Minimax Voice ID, which is validated against the API before the job runs

Video Format Selection: The user selects the output format before generation. The agent passes the correct resolution and aspect ratio to HeyGen accordingly:
16:9 → 1920 x 1080 (YouTube, LinkedIn)
9:16 → 1080 x 1920 (TikTok, Reels, Shorts)
1:1 → 1080 x 1080 (Instagram Feed, Facebook)
4:5 → 1080 x 1350 (Instagram Feed optimal)

Subtitle Generation: After the video is generated, its audio is passed to a transcription service (Whisper API or AssemblyAI) to auto-generate timestamped subtitles. Subtitles are burned directly into the video (hardcoded, not SRT) for maximum compatibility with social autoplay. Subtitle styling is customisable: font, size, colour, and position.

Campaign Scheduler: Users set recurring video generation (e.g. 3 videos every Monday).

Multi-Tenant Dashboard: Each customer manages their own personas, campaigns, scripts, voice settings, and gesture preferences.

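As a rough illustration of the Video Format Selection step, the aspect-ratio table above can be kept as a single lookup. This is only a sketch: FORMAT_DIMENSIONS and resolve_dimensions are hypothetical names, not part of any HeyGen SDK, and how the width and height are actually passed to HeyGen is not shown here.

```python
# Aspect ratio -> (width, height), copied from the format list in this post.
# FORMAT_DIMENSIONS and resolve_dimensions are hypothetical names.
FORMAT_DIMENSIONS: dict[str, tuple[int, int]] = {
    "16:9": (1920, 1080),  # YouTube, LinkedIn
    "9:16": (1080, 1920),  # TikTok, Reels, Shorts
    "1:1": (1080, 1080),   # Instagram Feed, Facebook
    "4:5": (1080, 1350),   # Instagram Feed optimal
}


def resolve_dimensions(aspect_ratio: str) -> dict[str, int]:
    """Translate a user-selected aspect ratio into pixel dimensions."""
    try:
        width, height = FORMAT_DIMENSIONS[aspect_ratio.strip()]
    except KeyError:
        supported = ", ".join(FORMAT_DIMENSIONS)
        raise ValueError(
            f"unsupported format {aspect_ratio!r}; expected one of: {supported}"
        ) from None
    return {"width": width, "height": height}
```

Rejecting unknown formats before the job is queued avoids spending credits on a video that would come back in the wrong shape.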
Credit-Based Billing: Stripe integration with tiered subscription plans (Starter / Pro / Agency).

Video Delivery: Finished videos are downloadable from the dashboard.

Your proposal should go straight to the point and address the following:
Specific experience with the HeyGen API, specifically Avatar IV and motion prompts. If you have no specific experience here, have you integrated any API into a SaaS before?
Have you integrated HeyGen Avatar IV before? If so, describe the project. If not, how do you plan to achieve this?
How you would build the motion prompt engine: given a script, how would you use the OpenAI API to auto-generate gesture instructions that sync with the script's tone and content?
Your approach to async video jobs: how do you handle a pipeline where video generation takes 1–3 minutes per job?
How you'd structure multi-tenancy in the database (a brief explanation).
Your suggested tech stack and any changes from what I've listed above, with reasons.

If you know this is what you can do, I'd love to hear from you.