Recovery Engineer (AI-Native) — Stabilize & De-Slop an AI-Built Next.js/Supabase Platform
Budget: $15.0 - $35.0
HOURLY / FULL_TIME
⭐ 4.95 (29)
Germany
react-js, next.js, postgresql, node.js, code-refactoring
We're a digital marketing agency. We've built an AI-heavy internal platform (content automation, SEO research, topical mapping, client management) that powers the services we deliver to our clients — it is not sold standalone as SaaS; it's the engine behind our agency work. Stack: Next.js 16 (App Router) / React 19 / TypeScript (strict) / Supabase / Inngest, on Vercel. ~300k+ lines of application code, 100% AI-built — not a single line was written by hand — in production, with real client work running through it.
Let's be honest about the state, because it determines who we're looking for:
The codebase is disciplined in some ways — strict typing is genuinely enforced (near-zero "any"), and we have (somewhat) mature AI-development tooling (rich repo guidelines, rules, agentic workflows). But it was built entirely with AI — much of it when models were far weaker than today — by a self-taught founder, not a trained engineering team. The result works but is fragile in specific, structural ways: errors get silently swallowed, tests are plentiful but mock both sides of every boundary (so integration seams break undetected and ship), and there's accumulated "AI slop." (-;
I would describe it this way: it's a reasonably ordered, fully AI-built project with specific structural weaknesses, and the gap is in code-level robustness and engineering judgment, not in process ambition.
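To make "errors get silently swallowed" concrete, here is a minimal, hypothetical TypeScript sketch of the pattern and one possible alternative. It is not taken from our codebase: the function names (parseKeywordsSwallowed, parseKeywords) and the Result type are invented purely for illustration.

```typescript
// Hypothetical illustration only; names and shapes are assumptions.
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Anti-pattern: the failure is caught and replaced with a default,
// so the caller cannot tell "no keywords" apart from "parsing failed".
function parseKeywordsSwallowed(raw: string): string[] {
  try {
    return JSON.parse(raw) as string[];
  } catch {
    return [];
  }
}

// Alternative: the failure is part of the return type,
// so the caller has to handle it explicitly.
function parseKeywords(raw: string): Result<string[]> {
  try {
    const parsed: unknown = JSON.parse(raw);
    const isStringArray =
      Array.isArray(parsed) &&
      parsed.every((k): k is string => typeof k === "string");
    if (!isStringArray) {
      return { ok: false, error: "expected a JSON array of strings" };
    }
    return { ok: true, value: parsed as string[] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}
```

With the second version, a caller has to check `ok` before touching the value, so a failed parse can no longer pass for an empty result.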
THE MISSION — TWO EQUAL GOALS
1. Make the app genuinely stable and clean.
2. Leave us with the best-practice processes so our human+AI development stops producing slop. (And work with us long term, if you are interested.)
Both matter equally. We don't just want you to fix it. We want to learn how to keep it fixed.
WHAT YOU'LL ACTUALLY DO (PHASED)
- Phase 0 — Paid trial task (small, real).
- Phase 1 — Audit. A structured assessment that surfaces what we, as non-engineers, can't see: the unknown best-practice gaps, the structural weak points, the priorities. You bring a proven recovery process here — we expect a clear method, not blind fixing.
- Phase 2 — Recovery. Execute against the prioritized plan. The single biggest lever we've already identified: our verification/gates check component-correctness (build green, unit tests green) but never check integration/behavior — so broken seams ship. Rebuilding that gate system is likely a core early deliverable. You'll fix the structural roots (silent failures, broken seams, state isolation) and establish guardrails our AI pipeline can then scale across the rest of the codebase.
- Phase 3 (optional) — Long-term. If the recovery goes well, we click, and you're interested, we'd genuinely love to keep working together. We're chronically short on capacity, so this can become an ongoing role, not just a one-off cleanup — entirely depending on mutual fit.
WHO YOU ARE (MUST-HAVES)
- Senior engineering judgment with a track record of real recovery/stabilization projects. Verifiable references required.
- You live agentic AI-orchestration — agentic / spec-driven workflows, plan-execute-verify loops (e.g. Claude Code or equivalent). This is non-negotiable: our entire model is human+AI, and transferring your AI practices is half the job. We mean genuine orchestration.
- Strong with our stack: Next.js App Router, React 19, TypeScript strict, Supabase (incl. RLS), Inngest.
- An explicit, repeatable recovery process you can describe and defend.
- Able to mentor and communicate: you'll leave behind playbooks and guardrails, and occasionally draw on our team's capacity.
- Working language English. Async-first is fine; one weekly live sync is required, so your timezone must allow at least one shared slot with Central-European working hours.
HOW WE SELECT (PLEASE READ — IT TELLS YOU HOW WE WORK)
We don't hire on written applications alone — in our experience they're now all AI-polished, hit every bullet in the post perfectly, and tell us nothing about whether the person can actually do the work. So:
1. A short Loom video with your application (details under "How to apply") — our single most important filter.
2. A small, paid, real trial task on an isolated copy of our code (no secrets, no production data, NDA first): a process/scale part and a stack-depth part, with a decision log — what you almost got wrong, what you deliberately did not touch, where you were unsure. Reasoning over diff.
3. Finalists do a short live defense of their trial work — spontaneous "what if" questions about their own decisions.
If that sounds like a lot: it's exactly the rigor we want you to bring to our codebase.
ENGAGEMENT & LOGISTICS
- Phased as above. Trial = fixed price. Recovery = hourly/time-based (scope is genuinely uncertain until the audit, so fixed-price would hurt us both), with milestone review checkpoints.
- Access is staged and minimal: trial gets an isolated snapshot with dummy keys; real work gets dedicated, scoped dev credentials and synthetic data — never production keys or real customer data.
- No fixed rate posted — propose yours.
HOW TO APPLY
Send us:
1. A short Loom video (5-10 min): introduce yourself, walk us through your recovery methodology, and present 1-2 real recovery projects you've done. This is not a hard requirement — but it makes our decision dramatically easier, and we'll be transparent about why: we get flooded with AI-generated applications that sound perfect and address every point in this post, from people who turn out to have nothing behind it — and that wastes everyone's time. Hearing you explain your own work, in your own words, is the fastest way for us to see you're the real thing. Honestly, we value this more than any written pitch.
2. Links to verify those projects (repos, references, case studies — whatever's real).
3. One line on your agentic-AI setup (which tools, how you orchestrate).