Senior DevOps Engineer (Azure, Linux, API Performance & Production Troubleshooting)
Budget: $15.0 - $40.0
HOURLY / PART_TIME
⭐ 4.85 (8)
United States
windows-azure, devops
Job Overview
We are looking for an experienced Senior DevOps Engineer to troubleshoot and resolve intermittent API performance issues in our production environment.
Our entire application stack is hosted on Microsoft Azure and consists of multiple Ubuntu servers running Apache, PHP-FPM, CodeIgniter, Redis, MySQL and Node.js background services.
We are facing intermittent production issues where APIs remain in a Pending state, become delayed, or occasionally get stuck even though server resources, Apache and MySQL appear healthy.
We need someone who can perform end-to-end root cause analysis across infrastructure, application and database layers.
This is NOT a development role. This is a Production Support, DevOps and Performance Troubleshooting role.
Current Environment
Infrastructure
Microsoft Azure
Ubuntu Linux servers
Multiple application servers
Application Stack
Apache 2.4
PHP 8.x
PHP-FPM
CodeIgniter
Redis
MySQL (AWS RDS)
Node.js background processes
External API integrations
Problems We Need Help Solving
We need an expert who can investigate and permanently fix issues such as:
APIs stuck in the Pending state in the browser
Slow API responses
Random API delays
APIs blocking subsequent API calls
Intermittent application slowness
Server performance degradation
Database connection bottlenecks
Connection pool issues
External API dependency delays
Session locking issues
Redis lock contention
Infrastructure bottlenecks
Production stability issues
The candidate should be comfortable troubleshooting issues even when CPU, memory and MySQL appear healthy.
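As a rough illustration of the symptom we mean, the toy Python simulation below shows how a fixed-size worker pool (similar in spirit to a capped PHP-FPM pool) can leave requests waiting while CPU stays idle, because every worker is blocked on slow I/O. This is only a sketch under assumptions: the pool size, delay and function names are hypothetical and do not describe our production configuration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def simulate_pool(workers, requests, io_delay):
    """Return per-request queue wait times (seconds) for a fixed-size pool.

    Each request only sleeps (simulating a slow external call), so CPU
    stays near idle while later requests wait for a free worker.
    """
    submitted = {}
    waits = {}

    def handle(req_id):
        # Time between submission and a worker actually picking it up.
        waits[req_id] = time.monotonic() - submitted[req_id]
        time.sleep(io_delay)  # blocked on I/O, not using CPU

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for req_id in range(requests):
            submitted[req_id] = time.monotonic()
            pool.submit(handle, req_id)
    return [waits[i] for i in range(requests)]

if __name__ == "__main__":
    for i, w in enumerate(simulate_pool(workers=2, requests=6, io_delay=0.2)):
        print(f"request {i}: waited {w:.2f}s for a worker")
```

With two workers and six requests, the later requests spend most of their time queued rather than executing, which is the pattern we expect the candidate to recognize and confirm or rule out.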
Key Responsibilities
API Troubleshooting
Investigate APIs getting stuck or remaining in the Pending state
Identify reasons for slow API responses
Trace API execution flow end-to-end
Identify API bottlenecks
Investigate API timeout issues
Analyze API dependencies
Troubleshoot API chaining issues
Infrastructure Troubleshooting
Analyze Azure infrastructure
Investigate networking issues
Review load balancing configurations
Analyze VM performance
Investigate server bottlenecks
Server Administration
Troubleshoot Ubuntu servers
Analyze server health
Investigate hanging processes
Monitor resource utilization
Analyze network connections
Apache & PHP-FPM
Analyze Apache worker utilization
Troubleshoot PHP-FPM worker issues
Configure and analyze slow logs
Optimize server performance
Database Analysis
Investigate MySQL connection issues
Analyze database bottlenecks
Identify slow queries
Analyze database locking
Review database connection management
Redis Analysis
Investigate Redis locks
Analyze session handling
Review queue processing
Troubleshoot concurrency issues
Monitoring & Observability
Implement monitoring solutions
Configure alerting mechanisms
Implement request tracing
Build diagnostic dashboards
Improve logging mechanisms
Required Skills
Cloud & Infrastructure
Microsoft Azure
Azure Virtual Machines
Azure Networking
Azure Load Balancer
Azure Monitor
Azure Application Insights
Linux Administration
Ubuntu
System troubleshooting
Process analysis
Networking analysis
System performance tuning
Required Linux Tools
lsof
ss
netstat
journalctl
systemctl
Web Server
Apache 2.4
Apache server-status
Apache MPM
Reverse proxy troubleshooting
Backend Technologies
PHP
PHP-FPM
CodeIgniter
Node.js
Databases
MySQL
AWS RDS
Connection pooling
Query optimization
Cache & Queue Systems
Redis
Session management
Lock management
Queue troubleshooting
Monitoring Tools
Azure Monitor
Application Insights
Log Analytics
Candidate Must Be Able To Answer These Questions
Why is an API getting stuck even though CPU usage is low?
Why is an API showing Pending status in the browser?
How do we identify where an API is stuck?
How do we determine whether the issue is Azure, Apache, PHP, Redis, MySQL or an external API?
How do we identify session locking issues?
How do we investigate database connection bottlenecks?
How do we trace an API from browser to database?
Why do subsequent APIs get delayed because of one API?
How do we implement proper monitoring to prevent these issues?
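To show the kind of reasoning we are looking for on the session-locking and "one API delays the next" questions, here is a minimal Python sketch. It assumes a per-session exclusive lock held for the whole request, which is how PHP's file-based session handler behaves between session_start() and session close. Whether our own session setup behaves this way is exactly what needs investigating; the session IDs, timings and function names are hypothetical.

```python
import threading
import time
from collections import defaultdict

session_locks = defaultdict(threading.Lock)  # one lock per session ID

def handle_request(session_id, work_seconds, results, name):
    """Simulate a request that holds its session lock while it works."""
    arrived = time.monotonic()
    with session_locks[session_id]:
        started = time.monotonic()
        time.sleep(work_seconds)
    results[name] = started - arrived  # time spent waiting on the lock

def run_demo():
    results = {}
    slow = threading.Thread(
        target=handle_request, args=("sess-A", 0.3, results, "slow"))
    fast_same = threading.Thread(
        target=handle_request, args=("sess-A", 0.0, results, "fast_same_session"))
    fast_other = threading.Thread(
        target=handle_request, args=("sess-B", 0.0, results, "fast_other_session"))
    slow.start()
    time.sleep(0.05)  # let the slow request take the lock first
    fast_same.start()
    fast_other.start()
    for t in (slow, fast_same, fast_other):
        t.join()
    return results

if __name__ == "__main__":
    for name, waited in run_demo().items():
        print(f"{name}: waited {waited:.2f}s for the session lock")
```

The fast request on the same session waits for the slow one to finish, while the fast request on a different session does not. In the browser, that first case would look like a request sitting in Pending.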
Expected Deliverables
The selected candidate should:
Identify the root cause of API delays and pending requests.
Trace requests from Browser → Server → Application → Database → External Services.
Implement permanent fixes.
Add monitoring and observability.
Configure alerts and diagnostics.
Improve application stability and performance.
Provide documentation and recommendations.
Ideal Candidate
7+ years of DevOps experience
Strong Azure knowledge
Strong Linux troubleshooting skills
Experience with Apache, PHP-FPM and MySQL
Experience troubleshooting production systems
Experience in diagnosing API latency and hanging requests
Strong root cause analysis skills
Ability to independently investigate complex issues