Senior Web-Scraping / Anti-Bot Reliability Engineer

Budget: - HOURLY / PART_TIME ⭐ 4.91 (11) Switzerland

data-scraping, chromium

We run some Python/Playwright scrapers against certain websites. The extraction works; our problem is stability and throughput at scale. We've already shipped multiple fixes and know the target sites well. We need an expert to help close the long tail, not someone to start from scratch.

Stack: Python, Playwright (headed Chromium + Xvfb, stealth), Oxylabs residential proxy, RabbitMQ worker fleet (Docker Compose, scaled) with custom retry/dead-letter/circuit-break logic, PostgreSQL, LLM post-processing.

Failure modes we're fighting:
- Proxy throttling: colliding exits trip Oxylabs ERR_TUNNEL on a shared IP under scale (one incident: 8 workers, ~2,200 tunnel errors, ~250 dropped keys). We've added unique exits/sticky sessions/backoff and want an expert second opinion.
- Transient-vs-terminal misclassification: keys silently dropped or retried for minutes, starving workers.
- Anti-bot: cookie banners, intermittent captchas, headless detection.
- Infra races: Xvfb lock/stale-DISPLAY on container restart; slow renders tripping fixed timeouts.
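
To make the transient-vs-terminal failure mode concrete, here is a minimal sketch of one way a worker could decide between retrying a key and dead-lettering it, with a capped, jittered backoff so retries cannot run for minutes. It is an illustration under assumptions, not the fleet's actual retry logic: the error labels other than ERR_TUNNEL, the attempt limit, the delay constants, and the function names are all hypothetical.

```python
import random
from enum import Enum


class Verdict(Enum):
    RETRY = "retry"              # requeue the key after a delay
    DEAD_LETTER = "dead_letter"  # stop retrying and record the failure


# Hypothetical labels. Only ERR_TUNNEL appears in the posting.
TRANSIENT_ERRORS = {"ERR_TUNNEL", "RENDER_TIMEOUT", "CAPTCHA"}

MAX_ATTEMPTS = 4      # assumed cap so a key cannot starve a worker
BASE_DELAY_S = 2.0    # assumed first-retry ceiling, in seconds
MAX_DELAY_S = 60.0    # assumed upper bound on any single delay


def classify(error_code: str, attempt: int) -> Verdict:
    """Retry only known-transient errors, and only below the attempt cap.

    Unknown errors are dead-lettered rather than dropped, so every key
    ends up either retried or visibly recorded.
    """
    if error_code in TRANSIENT_ERRORS and attempt < MAX_ATTEMPTS:
        return Verdict.RETRY
    return Verdict.DEAD_LETTER


def backoff_delay(attempt: int) -> float:
    """Exponential backoff with full jitter, capped at MAX_DELAY_S."""
    ceiling = min(MAX_DELAY_S, BASE_DELAY_S * (2 ** attempt))
    return random.uniform(0.0, ceiling)


if __name__ == "__main__":
    for attempt in range(MAX_ATTEMPTS + 1):
        verdict = classify("ERR_TUNNEL", attempt)
        print(attempt, verdict.value, round(backoff_delay(attempt), 2))
```

The jitter spreads retries from many workers over time instead of letting them hit the proxy in lockstep, and the explicit dead-letter verdict for unknown errors means no key disappears silently.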