DevOps Engineer for Voice AI Platform (Uptime Monitoring & Reliability)

Budget: $20.00 - $45.00 hourly / full-time · ⭐ 4.90 (78) · Germany

devops, amazon-web-services, github

We are looking for an experienced **DevOps Engineer** to take ownership of the reliability, monitoring, and operational stability of our Voice AI product **Klara AI** ([www.klara-ai.de](http://www.klara-ai.de)).

Klara AI is a production-ready Voice AI used in recruiting workflows and is **integrated with Recruitee**. The system is already live; this is **not a greenfield project**.

Your task is to ensure professional-grade uptime, monitoring, and operational excellence comparable to **[https://status.close.com/](https://status.close.com/)**.

---

### Scope of Work

**Primary Responsibilities**

* Design and implement a **robust uptime and incident monitoring system** (public status page preferred)
* Proactively **monitor system health, failures, latency, and API availability**
* Set up **alerts, logging, and escalation processes**
* Ensure **high availability and reliability** across all critical services
* Ongoing **DevOps supervision** of production systems

**Secondary Responsibilities (Nice to Have)**

* Minor **bug fixes** in backend (Python) and frontend (ReactJS)
* Support CI/CD workflows and GitHub-based versioning

---

### Technical Stack

* **Backend:** Python
* **Frontend:** ReactJS
* **Infrastructure:** Amazon AWS (Region: Europe / Frankfurt)
* **AI / Voice Stack:**
  * OpenAI (LLM)
  * VAPI
  * Twilio (calling)
  * ElevenLabs (voice synthesis)
* **Version Control:** GitHub (fully versioned)

---

### Required Skills

* Strong **DevOps experience on AWS**
* Proven hands-on experience with **VAPI** and **Twilio**
* Solid understanding of **cloud monitoring, logging, and alerting**
* Experience building **status dashboards / uptime pages**
* Familiarity with **Python** and **ReactJS** for minor fixes
* Production mindset: reliability, security, and scalability

---

### What We Expect

* Independent, structured work
* Clear documentation of monitoring and alerting setup
* Focus on **stability, uptime, and prevention**, not firefighting
* Long-term availability for system supervision is a plus

---

### Project Type

* Initial setup project
* Ongoing collaboration possible if performance and reliability standards are met

If you have built or maintained **mission-critical SaaS or Voice AI systems**, this project is a strong fit.

Please include **relevant DevOps / monitoring examples** in your proposal.