
DevOps & Cloud Infrastructure Engineer — IoT / Edge AI Systems

Budget: $15.0 - $25.0 HOURLY / FULL_TIME ⭐ 4.98 (53) United States

python, devops, docker, kubernetes, automated-deployment, cicd, automation-software-release, iot-solutions-design, internet-of-things, mqtt

We are looking for a highly capable DevOps & Cloud Infrastructure Engineer with strong experience in IoT architecture, cloud-connected device fleets, and production infrastructure to help Shyld scale and operate a growing fleet of AI-powered edge devices deployed across healthcare facilities.

This role is ideal for someone who has worked on systems where physical devices, edge compute, cloud services, networking, telemetry, security, and reliability all come together. You should be comfortable thinking beyond traditional cloud infrastructure and designing the architecture needed to securely provision, monitor, update, and operate hundreds or thousands of distributed devices in real-world environments.

Responsibilities

IoT Architecture & Edge Fleet Infrastructure
- Design and improve the cloud-to-device architecture for a growing fleet of AI-powered edge devices
- Build infrastructure for secure device provisioning, identity management, configuration, and lifecycle management
- Support reliable communication between edge devices and cloud services using patterns such as MQTT, Pub/Sub, event streaming, message queues, or similar architectures
- Design scalable telemetry pipelines for device health, connectivity, logs, events, and operational status
- Help define the architecture for managing device state, software versions, configuration changes, and fleet-wide rollouts
- Support scaling from pilot deployments to hundreds or thousands of devices across healthcare facilities

Cloud Infrastructure & Platform Engineering
- Design, deploy, and maintain scalable cloud infrastructure on Google Cloud Platform
- Manage production services running on Cloud Run, GKE/Kubernetes, Pub/Sub, Cloud SQL, Artifact Registry, IAM, and VPC networking
- Build highly reliable backend infrastructure for real-time IoT, AI, and robotics workloads
- Optimize cloud performance, uptime, scalability, and operational cost

Edge Device & Fleet Operations
- Support deployment and management of Linux-based AI edge devices, including NVIDIA Jetson Orin platforms
- Design and improve OTA update pipelines, staged rollouts, rollback systems, remote monitoring, and health checks
- Manage device authentication, certificates, secure communication, and access control
- Improve device provisioning, onboarding, replacement, and recovery workflows
- Build operational tooling to help the team understand fleet health and quickly diagnose field issues

CI/CD & Automation
- Build and maintain CI/CD pipelines using GitHub Actions and related tooling
- Automate cloud and device infrastructure deployment using Terraform or similar Infrastructure-as-Code tools
- Improve release management, deployment safety, staging workflows, canary rollouts, and rollback strategies
- Support multi-architecture builds for cloud and edge environments when needed

Observability & Reliability
- Build monitoring, alerting, and logging infrastructure across both cloud services and edge devices
- Improve visibility into device health, connectivity issues, AI pipelines, cloud services, OTA status, and production incidents
- Develop dashboards, alerts, and runbooks for fleet operations
- Work with engineering teams to improve system reliability, uptime, and incident response
- Lead root-cause investigations for production and field issues

Security & Compliance
- Implement cloud, IoT, and device security best practices
- Manage secrets, certificates, IAM policies, mTLS, secure communications, and device identity
- Support HIPAA/SOC2-oriented infrastructure and operational requirements
- Improve auditability, access control, logging, and operational security posture
- Help ensure that edge devices and cloud services follow least-privilege and secure-by-default principles

Cross-Functional Collaboration
- Work closely with AI engineers, robotics engineers, backend developers, product, field operations, and customer success teams
- Support deployment readiness for hospitals and enterprise healthcare environments
- Translate real-world field issues into infrastructure, monitoring, automation, and reliability improvements