Solution Architect / Scaling Lead — Technical Requirements
Budget: $40.0 - $70.0
HOURLY / FULL_TIME
⭐ 4.80 (68)
India
cloud-computing, docker, database-architecture, amazon-web-services, solution-architecture, digital-ocean, mongodb
AnavClouds Software Analytics · real-time voice AI, already in production
1. Scaling the live voice pipeline (the core)
Scale concurrent calls by growing droplet pools behind a WebSocket-aware load balancer (scaling unit: ~15 concurrent calls per 4-vCPU droplet)
Per-tenant concurrency caps in Redis, statistical-multiplexing/overbooking, and N+1 headroom
Capacity planning (concurrent calls → compute), autoscale triggers (~70% utilization), and load testing (k6 / Locust)
Audio-path & latency-budget tuning (sub-second first byte)
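As a rough illustration of the capacity math above, the sketch below converts a peak concurrent-call figure into a droplet count. It takes the ~15-calls-per-droplet unit and the ~70% utilization trigger from this posting and adds one spare droplet for N+1 headroom. The function and constant names are hypothetical, and a real plan would also account for per-region pools and overbooking.

```python
# Assumptions from the posting: ~15 calls per 4-vCPU droplet, ~70% autoscale trigger.
CALLS_PER_DROPLET = 15
TARGET_UTILIZATION_PCT = 70
SPARE_DROPLETS = 1  # N+1 headroom


def droplets_needed(peak_calls: int) -> int:
    """Droplets required to keep peak load at or below the target utilization, plus the spare."""
    if peak_calls < 0:
        raise ValueError("peak_calls must be non-negative")
    usable_x100 = CALLS_PER_DROPLET * TARGET_UTILIZATION_PCT  # usable calls per droplet, times 100
    base = -(-peak_calls * 100 // usable_x100)  # integer ceiling division, avoids float rounding
    return base + SPARE_DROPLETS


def should_scale_out(active_calls: int, droplets: int) -> bool:
    """True once fleet utilization reaches the autoscale trigger."""
    return active_calls * 100 >= droplets * CALLS_PER_DROPLET * TARGET_UTILIZATION_PCT


print(droplets_needed(100))
```

With these numbers, 100 peak calls need 10 droplets at 70% utilization plus one spare, so the call above prints 11.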
2. Production reliability & provider resilience
Tune real-time timeout budgets (e.g., LLM read-timeout ~10–12 s) so a slow provider fails fast instead of leaving dead air on the call; operate and tune the existing circuit breakers, API-key rotation, and cross-provider failover (already built)
Manage provider rate limits / concurrent-stream caps per key
Zero-downtime deploys, graceful shutdown (don't kill droplets with live calls)
Observability: OpenTelemetry per-call tracing, metrics, deep health checks, alerting
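The fail-fast and failover behaviour described in this section could look roughly like the sketch below: a small circuit breaker per provider, with calls falling through to the next provider when one is failing or its circuit is open. This is not the existing implementation. The class and function names, the thresholds, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable


class AllProvidersUnavailable(RuntimeError):
    """Raised when every provider failed or had an open circuit."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: float | None = None

    def allow(self) -> bool:
        """Closed circuits always allow; open circuits allow a trial call after the cool-down."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed trial call


def call_with_failover(providers: dict[str, Callable[[], str]],
                       breakers: dict[str, CircuitBreaker]) -> str:
    """Try providers in order, skipping open circuits and recording each outcome."""
    for name, call in providers.items():
        breaker = breakers[name]
        if not breaker.allow():
            continue
        try:
            result = call()
        except Exception:
            breaker.record_failure()
            continue
        breaker.record_success()
        return result
    raise AllProvidersUnavailable("no provider could serve the request")
```

In a real pipeline each `call` would wrap a provider request with its own read timeout, so a slow provider raises quickly and counts as a failure. The trial call after the cool-down is not limited to a single in-flight request here, which a production breaker would likely enforce.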
3. Operating the service stack
DigitalOcean — droplets, Load Balancers, managed databases, Spaces (scale & operate the fleet)
Vercel — operate the Next.js frontend (auto-scales; watch bandwidth/function cost)
MongoDB (indexing org_id+deleted_at, replica sets, region pinning) · Redis (shared session/concurrency/rate-limit store; Memurai in dev) · Qdrant (vector scaling, reranker caching) · AWS S3
FastAPI async tuning, httpx connection pooling
4. Multi-region & data residency (operate across regions)
Region-aware droplet pools (DO TOR/BLR/FRA/AMS), per-region data pinning, geo-routing of calls
Residency: India DPDP on-soil, EU GDPR; region-local provider selection (Sarvam in India)
Extend to AWS / Azure where DigitalOcean can't meet a bar (e.g., HIPAA BAA)
5. Capacity, cost & rate limiting
Per-org rate limits + cost guards (Redis token buckets)
Cost optimization across the provider mix and droplet fleet; concurrency-based cost modeling (~$3.20/line)
Pre-scale for campaigns/bursts; reserved concurrency for enterprise tenants
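For the per-org rate limits mentioned above, a token bucket is one common approach. The sketch below keeps bucket state in process memory purely for illustration. In the stack described here the state would presumably live in Redis and be updated atomically, which this example does not show. The class name, parameters, and injectable clock are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket: holds up to `capacity` tokens, refilled at `refill_per_s`."""

    def __init__(self, capacity: float, refill_per_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self._clock = clock
        self._tokens = capacity          # start full so a short burst is allowed
        self._updated_at = clock()

    def try_consume(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if enough are available."""
        now = self._clock()
        elapsed = now - self._updated_at
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_s)
        self._updated_at = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# One bucket per org; the org id is a made-up example.
buckets: dict[str, TokenBucket] = {}


def allow_request(org_id: str, capacity: float = 10.0, refill_per_s: float = 1.0) -> bool:
    """Look up (or lazily create) the org's bucket and try to spend one token."""
    bucket = buckets.setdefault(org_id, TokenBucket(capacity, refill_per_s))
    return bucket.try_consume()
```

The same structure can act as a cost guard by passing a `cost` proportional to the estimated spend of a request instead of a flat 1.0.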
6. Telephony at scale
Twilio / Plivo — concurrent-channel & per-region number provisioning; outbound pacing/queueing
7. Security & compliance (operate & harden)
Multi-tenant isolation, RBAC, secrets & encryption (Fernet, JWT/python-jose, Google OAuth/Authlib)
Production compliance controls: SOC 2, HIPAA, GDPR, India DPDP, PCI-DSS
8. Workflows & integrations (extend, not rebuild)
Operate & extend the existing workflow engine (nodes, scheduler, triggers) and third-party integrations (CRM, webhooks, iPaaS); scale the campaign/scheduler workers
9. Core stack familiarity
Python 3.12 · FastAPI · httpx · Pydantic v2; the AI provider roster — LLM (OpenAI, Anthropic, Google, Groq, Azure OpenAI), STT (Deepgram, AssemblyAI, Sarvam, Speechmatics), TTS (Cartesia, ElevenLabs, Google, Sarvam, OpenAI)