Solution Architect / Scaling Lead — Technical Requirements

Budget: $40.0 - $70.0 hourly / full-time ⭐ 4.80 (68) India

cloud-computing, docker, database-architecture, amazon-web-services, solution-architecture, digital-ocean, mongodb

AnavClouds Software Analytics · real-time voice AI, already in production

1. Scaling the live voice pipeline (the core)
- Scale concurrent calls by growing droplet pools behind a WebSocket-aware load balancer (the sizing unit is ~15 calls per 4-vCPU droplet)
- Per-tenant concurrency caps in Redis, statistical multiplexing/overbooking, and N+1 headroom
- Capacity planning (concurrent calls → compute), autoscale triggers (~70% utilization), and load testing (k6 / Locust)
- Audio-path & latency-budget tuning (sub-second first byte)

2. Production reliability & provider resilience
- Tune real-time timeout budgets (e.g., LLM read timeout ~10–12 s) so a slow provider fails fast instead of leaving dead air, with circuit breakers, API-key rotation, and cross-provider failover (already built; operate & tune)
- Manage provider rate limits / concurrent-stream caps per key
- Zero-downtime deploys, graceful shutdown (don't kill droplets with live calls)
- Observability: OpenTelemetry per-call tracing, metrics, deep health checks, alerting

3. Operating the service stack
- DigitalOcean: droplets, Load Balancers, managed databases, Spaces (scale & operate the fleet)
- Vercel: operate the Next.js frontend (auto-scales; watch bandwidth/function cost)
- MongoDB (indexing org_id+deleted_at, replica sets, region pinning) · Redis (shared session/concurrency/rate-limit store; Memurai in dev) · Qdrant (vector scaling, reranker caching) · AWS S3
- FastAPI async tuning, httpx connection pooling

4. Multi-region & data residency (operate across regions)
- Region-aware droplet pools (DO TOR/BLR/FRA/AMS), per-region data pinning, geo-routing of calls
- Residency: India DPDP on-soil, EU GDPR; region-local provider selection (Sarvam in India)
- Extend to AWS / Azure where DigitalOcean can't meet a bar (e.g., HIPAA BAA)

5. Capacity, cost & rate limiting
- Per-org rate limits + cost guards (Redis token buckets)
- Cost optimization across the provider mix and droplet fleet; concurrency-based cost modeling (~$3.20/line)
- Pre-scale for campaigns/bursts; reserved concurrency for enterprise tenants

6. Telephony at scale
- Twilio / Plivo: concurrent-channel & per-region number provisioning; outbound pacing/queueing

7. Security & compliance (operate & harden)
- Multi-tenant isolation, RBAC, secrets & encryption (Fernet, JWT/python-jose, Google OAuth/Authlib)
- Production compliance controls: SOC 2, HIPAA, GDPR, India DPDP, PCI-DSS

8. Workflows & integrations (extend, not rebuild)
- Operate & extend the existing workflow engine (nodes, scheduler, triggers) and third-party integrations (CRM, webhooks, iPaaS); scale the campaign/scheduler workers

9. Core stack familiarity
- Python 3.12 · FastAPI · httpx · Pydantic v2
- The AI provider roster: LLM (OpenAI, Anthropic, Google, Groq, Azure OpenAI), STT (Deepgram, AssemblyAI, Sarvam, Speechmatics), TTS (Cartesia, ElevenLabs, Google, Sarvam, OpenAI)
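
For candidates who want to sanity-check the capacity-planning figures in section 1, the arithmetic can be sketched roughly as follows. This is only an illustration under stated assumptions: the ~15 calls per droplet, ~70% utilization target, and N+1 headroom come from the requirements above, while the function names, the sample tenant caps, and the 1.5 overbooking ratio are hypothetical and do not describe the production system.

```python
import math

CALLS_PER_DROPLET = 15        # from the posting: ~15 calls per 4-vCPU droplet
TARGET_UTILIZATION_PCT = 70   # from the posting: autoscale trigger at ~70%
SPARE_DROPLETS = 1            # N+1 headroom


def expected_peak(tenant_caps: list[int], overbooking_ratio: float) -> int:
    """Estimate simultaneous calls when tenants rarely peak together.

    overbooking_ratio is a hypothetical planning input: 1.5 means the sum
    of per-tenant caps is assumed to be 1.5x the real combined peak.
    """
    if overbooking_ratio < 1:
        raise ValueError("overbooking_ratio must be >= 1")
    return math.ceil(sum(tenant_caps) / overbooking_ratio)


def droplets_needed(peak_concurrent_calls: int,
                    calls_per_droplet: int = CALLS_PER_DROPLET,
                    target_utilization_pct: int = TARGET_UTILIZATION_PCT,
                    spare: int = SPARE_DROPLETS) -> int:
    """Droplets that keep the pool at or below the target utilization at peak,
    plus spare droplets for headroom."""
    if peak_concurrent_calls < 0:
        raise ValueError("peak_concurrent_calls must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    usable_x100 = calls_per_droplet * target_utilization_pct
    base = -(-peak_concurrent_calls * 100 // usable_x100)
    return base + spare


if __name__ == "__main__":
    caps = [200, 150, 100]               # hypothetical per-tenant concurrency caps
    peak = expected_peak(caps, 1.5)      # 450 / 1.5 = 300 calls
    print(peak, droplets_needed(peak))   # 300 30
```

At 70% utilization each droplet is planned for 10.5 calls, so 300 concurrent calls need 29 droplets, and the N+1 spare brings the pool to 30.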
