AI Developer — Build a Whisper-Based Subtitle Extraction Tool from Scratch (Python/Flask)

Budget: $30.0 FIXED / ⭐ 0.00 (0) South Korea

python, api, machine-learning, flask, artificial-intelligence, python-script, node.js

We need an AI-powered subtitle extraction tool built from the ground up: a tool that takes any video file and automatically generates accurate subtitles in the video's original spoken language, using local AI speech recognition (no cloud APIs, no translation).

What it should do:
- Accept video/audio file uploads through a simple web UI
- Extract audio and clean it up (remove silence/noise) before transcription
- Use an AI speech-to-text model (Whisper) to transcribe speech, automatically detecting the spoken language (English, Korean, Russian, Turkish, etc.; any language)
- Merge raw transcription chunks into natural, well-timed subtitle sentences
- Output a clean, properly formatted .srt file, downloadable from the browser
- Support queued processing: multiple videos can be uploaded and processed one after another in the background
- Allow canceling a queued job before it starts

Tech we expect you to use:
- Whisper (AI speech recognition model) for transcription, ideally via mlx-whisper for Apple Silicon performance, or openai-whisper / faster-whisper as alternatives
- Voice Activity Detection (AI model), e.g. Silero VAD, to strip silence before transcription for better accuracy
- Python backend (Flask or similar) with a background job queue
- ffmpeg for audio extraction
- Simple, clean web frontend (HTML/CSS/JS); no framework required

Why this is an AI project: This isn't just file handling. The core value is two AI models working together (speech detection + speech-to-text transcription) to turn raw audio into readable, well-timed subtitles automatically, without any manual transcription or paid translation API.

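For candidates scoping the merge and .srt output steps, here is a minimal sketch of one possible approach, not a required design. The `Segment` type, the function names, and the `max_gap` and `max_chars` thresholds are all illustrative assumptions. It also assumes the transcription step yields chunks with start and end times in seconds, and that sentence ends can be detected by trailing punctuation, which will not hold for every language.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcription chunk (hypothetical shape, not a Whisper type)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_segments(segments, max_gap=0.6, max_chars=84):
    """Join adjacent chunks into one cue until the previous cue ends a
    sentence, the pause exceeds max_gap seconds, or the cue would exceed
    max_chars characters. Thresholds are assumed defaults."""
    merged = []
    for seg in segments:
        text = seg.text.strip()
        if not text:
            continue  # skip empty chunks
        if merged:
            prev = merged[-1]
            ends_sentence = prev.text.endswith((".", "!", "?", "…", "。"))
            gap = seg.start - prev.end
            fits = len(prev.text) + 1 + len(text) <= max_chars
            if not ends_sentence and gap <= max_gap and fits:
                merged[-1] = Segment(prev.start, seg.end, f"{prev.text} {text}")
                continue
        merged.append(Segment(seg.start, seg.end, text))
    return merged


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render cues as SRT text: index, time range, text, blank separator."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text}\n")
    return "\n".join(blocks)
```

Calling `to_srt(merge_segments(chunks))` on the transcription output would give the text to write to the downloadable .srt file. If silence is stripped before transcription, the timestamps would also need mapping back to the original timeline, which this sketch does not cover.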
Requirements:
- Proven experience with Whisper or similar ASR (Automatic Speech Recognition) models
- Comfortable with audio preprocessing (ffmpeg, sample rates, normalization)
- Python backend experience (Flask/FastAPI)
- Bonus: experience with MLX (Apple Silicon ML framework) or CUDA-accelerated inference

Deliverable: Fully working app, source code, and brief documentation on setting it up and running it locally.
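To illustrate the queued-processing and cancel-before-start behavior described in this posting, here is a rough sketch using only the Python standard library. The `JobQueue` class, its method names, and the status strings are hypothetical. `process` stands in for whatever callable runs the full pipeline (audio extraction, VAD, transcription) on one file. State is kept in memory only, so it would be lost on restart.

```python
import queue
import threading
import uuid


class JobQueue:
    """In-memory FIFO job queue processed one job at a time (illustrative)."""

    def __init__(self, process):
        self._process = process          # hypothetical pipeline callable
        self._pending = queue.Queue()    # FIFO of (job_id, path)
        self._status = {}                # job_id -> status string
        self._lock = threading.Lock()

    def submit(self, path):
        """Enqueue a file and return its job id."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._status[job_id] = "queued"
        self._pending.put((job_id, path))
        return job_id

    def cancel(self, job_id):
        """Cancel a job only if it has not started yet."""
        with self._lock:
            if self._status.get(job_id) == "queued":
                self._status[job_id] = "canceled"
                return True
            return False

    def status(self, job_id):
        with self._lock:
            return self._status.get(job_id)

    def run_next(self):
        """Take the next job (blocking if none is queued) and process it,
        skipping jobs that were canceled while waiting."""
        job_id, path = self._pending.get()
        with self._lock:
            if self._status[job_id] == "canceled":
                return
            self._status[job_id] = "running"
        try:
            self._process(path)
            result = "done"
        except Exception:
            result = "failed"
        with self._lock:
            self._status[job_id] = result

    def start(self):
        """Run jobs one after another in a background daemon thread."""
        def loop():
            while True:
                self.run_next()
        threading.Thread(target=loop, daemon=True).start()
```

In a Flask app, the upload handler could call `submit()` and a cancel handler could call `cancel()`, with `start()` invoked once at startup. A single worker thread keeps the jobs sequential, which matches the "one after another" requirement.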