
Launture.AI: UAT & Dev Sprint

Budget: - HOURLY / PART_TIME ⭐ 5.00 (1) United States

python, react-js, artificial-intelligence

(See attached PDF for RFP)

1. The Headline

We’re ~95% built on Launture, an AI-native business launch execution platform, and we’re sprinting to MVP. We need a sharp contractor (solo or firm) to manually UAT the product as a real user, compare what they see against our Figma designs and expected product behavior, file high-quality bug tickets, and fix the bugs across two one-week sprints.

What this engagement is: human UAT. A real person clicks through every flow, on desktop and mobile, comparing actual behavior and pixels against the Figma source of truth and against what a founder using the product would reasonably expect. They notice broken affordances, off-brand spacing, awkward copy, missing empty states, bad error handling, obvious hallucinations, etc. Then they fix what they can in collaboration with the founder.

One non-negotiable: we want a partner who uses AI to multiply their own productivity on the dev side. We expect you to be a heavy daily user of Claude Code, Cursor, Copilot, or equivalent: driving codebases through LLMs, not typing every line by hand. The UAT pass itself is intentionally human; the bug-fix work that follows it should be AI-leveraged. Launture is built by an AI-leveraged team; we move at the speed of the tooling. Bidders who plan to bill hours that AI-leveraged peers would compress 3–5x will not be a fit.

2. Engagement Shape

- Duration: Two one-week sprints (10 working days total).
- Cadence: Daily stand-ups; single longer working session per sprint for kickoff + retro.
- Start: ASAP after award. Earliest start date is a screening factor.
- Output cadence: Atomic PRs against staging daily; we review same-day.
We will provide on day 1:
- GitHub repo access (contractor branch + PR rights against staging)
- Staging Supabase + Render + Vercel environments wired up
- Test Stripe keys, alpha gate code, test user credentials (anonymous, Explorer, Launch Ready, Launch Pack)
- Figma file access: desktop + mobile frames are the source of truth for every screen
- Read access to Jira (LNTR project) and Slack
- A walkthrough of the architecture, the design system, the known-gotchas list, and how to file a high-signal bug ticket

3. Scope of Work

Sprint 1: Human UAT Pass + High-Severity Bug Fixes (5 days)

Get oriented (Day 1, morning)
- Read CLAUDE.md, docs/architecture.md, the design system docs (docs/launture-design-system-v2.md, docs/brand-guide.md), and the CorpNet gotchas table.
- Tour the Figma file with us (one live session) so you know which frames are canonical.
- Run the backend test suite once locally (cd backend && pytest tests/ -v) to confirm your environment works. This is an environment check, not part of UAT.

Human UAT walkthrough (Days 1–3)

This is the heart of the engagement: a real person uses the product like a founder would, on desktop and mobile, across all four tier states (anonymous, Explorer, Launch Ready, Launch Pack). You’re not running scripts; you’re using the product. For every screen and flow, you compare what you see against three reference points:
- The Figma design: does the live UI match the canonical frame? Spacing, color, typography, hierarchy, states (default/hover/disabled/empty/error/loading), responsive breakpoints.
- What a user would expect: are affordances obvious? Is the copy clear? Are errors graceful? Are empty states helpful? Did anything surprise you in a bad way?
- The brand voice (docs/brand-guide.md): is the tone right? Anywhere it sounds robotic, off-brand, or generic?
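One way to keep a manual walkthrough systematic is to enumerate the combinations up front and tick them off as you go. The sketch below is illustrative only: the tier names and UI states come from the text above, while the viewport labels, the `build_checklist` and `remaining` helpers, and the screen names are assumptions, not Launture's actual checklist.

```python
from itertools import product

# Tier names and UI states are taken from the posting.
# Viewport labels are assumed for illustration.
TIERS = ["anonymous", "Explorer", "Launch Ready", "Launch Pack"]
VIEWPORTS = ["desktop", "mobile-ios", "mobile-android"]
UI_STATES = ["default", "hover", "disabled", "empty", "error", "loading"]


def build_checklist(screens):
    """Return one unchecked row per (screen, tier, viewport, state) combination."""
    return [
        {
            "screen": screen,
            "tier": tier,
            "viewport": viewport,
            "state": state,
            "checked": False,
        }
        for screen, tier, viewport, state in product(
            screens, TIERS, VIEWPORTS, UI_STATES
        )
    ]


def remaining(checklist):
    """Rows that have not been walked yet."""
    return [row for row in checklist if not row["checked"]]


# Hypothetical screen names, used only to show the shape of the output.
checklist = build_checklist(["signup", "chat"])
checklist[0]["checked"] = True
print(len(checklist), "rows,", len(remaining(checklist)), "left to walk")
```

Not every combination applies (hover has no meaning on a touch viewport, for instance), so in practice you would prune rows rather than walk the full product.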
Journeys we want walked end-to-end (we’ll supply a fuller checklist on day 1, but these are the must-cover flows):
- Marketing site → alpha gate → signup → first chat → first deliverable
- Anonymous chat → session limit → registration → continued chat
- Explorer free flow: 14 deliverables, phase progression, upsell prompts
- Stripe checkout → webhook tier upgrade → paywalled deliverables unlock
- Launch Pack flow: 21 deliverables, all 4 export formats (PDF, DOCX, PPTX, XLSX)
- CorpNet corporate formation (sandbox)
- Mobile: every above flow on at least one iOS and one Android viewport
- Settings, account deletion, password change, sign out
- Error paths: network drop mid-stream, expired session, payment failure, invalid input

For every defect found, file a Jira ticket in the LNTR project with:
- Severity (Highest / High / Medium / Low, using our existing prioritization rubric, which we’ll share)
- Tier + viewport reproduced on (e.g., “Explorer / iPhone 14 / Safari”)
- Repro steps (numbered, copy-pasteable)
- Expected behavior (cite the Figma frame, the brand guide, or user-expectation reasoning)
- Actual behavior (screenshot or screen recording attached)
- Suspected root cause if obvious (a code pointer is a bonus, not required)
- Figma frame URL when the bug is design-drift

Bug fixes (Days 2–5, in parallel with UAT)
- Fix all Highest- and High-severity bugs found during your own UAT pass.
- Each fix lands as an atomic PR against staging.
- Backend fixes: must include a pytest regression test (CI enforces a 70% coverage floor).
- Frontend fixes: must include before/after screenshots in the PR description, with the relevant Figma frame linked.
- You do not need to write Stagehand/Playwright tests. We own that suite and we’ll add automated coverage ourselves post-merge.
- Follow the project’s git discipline: conventional commits, no direct pushes to staging or main, no --no-verify, all CI gates green before merge.
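The ticket fields listed above can be sketched as a small data structure that flags gaps before filing. This is a minimal illustration, not Launture's Jira schema: the `BugTicket` class, its field names, and the sample bug are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Severity(Enum):
    HIGHEST = "Highest"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass
class BugTicket:
    """Hypothetical container for the ticket fields listed in the posting."""

    summary: str
    severity: Severity
    environment: str  # tier + viewport, e.g. "Explorer / iPhone 14 / Safari"
    repro_steps: List[str]
    expected: str  # cite the Figma frame, brand guide, or user expectation
    actual: str
    attachments: List[str] = field(default_factory=list)  # screenshots, recordings
    suspected_root_cause: Optional[str] = None  # a bonus, not required
    figma_frame_url: Optional[str] = None  # needed when the bug is design-drift
    is_design_drift: bool = False

    def missing_fields(self) -> List[str]:
        """Return human-readable gaps; an empty list means nothing is missing."""
        gaps = []
        if not self.repro_steps:
            gaps.append("repro steps")
        if not self.attachments:
            gaps.append("screenshot or screen recording")
        if self.is_design_drift and not self.figma_frame_url:
            gaps.append("Figma frame URL")
        return gaps

    def to_description(self) -> str:
        """Render a plain-text ticket body with numbered repro steps."""
        steps = "\n".join(
            f"{i}. {step}" for i, step in enumerate(self.repro_steps, start=1)
        )
        lines = [
            f"Severity: {self.severity.value}",
            f"Reproduced on: {self.environment}",
            "Repro steps:",
            steps,
            f"Expected: {self.expected}",
            f"Actual: {self.actual}",
        ]
        if self.suspected_root_cause:
            lines.append(f"Suspected root cause: {self.suspected_root_cause}")
        if self.figma_frame_url:
            lines.append(f"Figma frame: {self.figma_frame_url}")
        return "\n".join(lines)


# Made-up example bug, for illustration only.
ticket = BugTicket(
    summary="Upsell card padding differs from the Figma frame",
    severity=Severity.MEDIUM,
    environment="Explorer / iPhone 14 / Safari",
    repro_steps=["Sign in as an Explorer test user", "Open the deliverables list"],
    expected="Spacing matches the canonical Figma frame",
    actual="Card padding is visibly tighter than the design",
    is_design_drift=True,
)
print(ticket.missing_fields())
print(ticket.to_description())
```

Checking for gaps before filing is one way to keep tickets copy-pasteable and reviewable without back-and-forth; the exact required fields would follow the rubric shared on day 1.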
Sprint 1 deliverable:
- A bug-triage report (Jira board snapshot + 1-page written summary: themes, severity histogram, surprising findings).
- One atomic PR per Highest/High fix, merged to staging.
- A short Loom or written walkthrough of the most interesting bugs you found (we want to see your eye, not just your ticket count).

Sprint 2: Burn Down the Bug List + Polish (5 days)
- Continue bug fixes: all Medium-severity bugs from Sprint 1; any Highest/High discovered late.
- Re-walk the worst areas: pick the 2–3 flows that surfaced the most bugs in Sprint 1 and walk them again post-fix. File any regressions or near-misses.
- Design-drift cleanup pass: one focused sweep comparing the live staging site to Figma, ticket-by-ticket fixing small visual deltas (spacing, color tokens, typography weight) that didn’t merit standalone tickets but accumulated into UI noise.
- Handoff doc: final report covering what was tested, what was fixed, what’s deferred (with severity + repro), patterns you noticed (e.g., “mobile error states are consistently weaker than desktop”), and prioritized recommendations for the post-MVP sprint.

Sprint 2 deliverable: Stable staging branch, zero open Highest/High bugs, design-drift sweep complete, written handoff doc.

4. Tech Stack You’ll Be Working In

| Layer | Stack |
| --- | --- |
| Backend | FastAPI, Python 3.12, Pydantic v2, async/await, httpx |
| Frontend | React 18, TypeScript, Tailwind v4, shadcn/ui, wouter, framer-motion |
| Database | Supabase (PostgreSQL + Auth + Realtime + RLS) |
| AI | Anthropic SDK (Claude Opus 4.6 + Sonnet 4.5), streaming responses |
| Payments | Stripe (signature-verified webhooks) |
| Hosting | Vercel (frontend) + Render (backend, $25/mo box) |
| Observability | PostHog, Sentry, BetterStack, Upstash |
| UAT framework | Stagehand 3.0.8 (Playwright 1.58.2), Claude vision assertions, Zod schema extraction |
| Test runner | pytest + pytest-asyncio (backend), custom TSX runner (UAT) |
| Exports | WeasyPrint (PDF), python-docx, python-pptx, openpyxl |
| Third-party | CorpNet (corporate formation), Census/BLS/FRED/Crunchbase (Evidence Layer) |

5. What’s Already In Place (don’t bid to build these)

- Backend test suite: 194 pytest files, run via pytest tests/ -v, with @pytest.mark.integration separation. You’ll add regression tests here for any backend fix you ship.
- Automated UAT suite: 57 Stagehand + Playwright tests covering auth / chat / deliverables / payments / navigation / ventures / UI, plus 20 visual regression tests across desktop + mobile. We own and run this suite; you do not. It runs nightly against staging and we’ll cite it if needed; you’re not bidding hours to maintain or extend it.
- Helpers: alpha-gate bypass, admin fixture seeding endpoint (POST /api/admin/fixtures), test user pool spanning all four tiers.
- Figma file: desktop and mobile frames for every screen and component, the source of truth for visual correctness.
- Design system: tokens documented in backend/templates/design_tokens.json and docs/launture-design-system-v2.md; brand voice in docs/brand-guide.md.
- Worktree setup: isolated branch workflow via make setup (symlinked deps, no slow installs).
- Project bible: documentation of every convention, guardrail, and gotcha.

6. Required Qualifications

You must have:
- AI-leveraged dev workflow as a daily practice: Claude Code, Cursor, Aider, Copilot, or an equivalent agentic coding tool driving the bulk of your implementation output. The UAT itself is human; the fixes are AI-leveraged. This is the single most important qualification on the dev side; see §7a below.
- 3+ years shipping production SaaS with Python (FastAPI or similar async) and TypeScript/React (required for the bug-fix half of the engagement).
- Comfort with Supabase / PostgreSQL + RLS: you know what row-level security is and why it matters.
- Disciplined git workflow: conventional commits, atomic PRs, no --no-verify, regression tests on every backend fix.
- Strong written communication: bug tickets are a primary deliverable. We expect repro steps, screenshots, expected-vs-actual referencing Figma or brand-guide, and suspected root cause where you have one.

7. Selection Criteria

We’ll evaluate proposals on:
- Designer’s-eye UAT sample (Q2): the quality of the unsolicited bug ticket you write. This is the strongest signal that you can do the human half of the engagement well.
- AI-leverage fluency (Q1): concrete, specific answer to the AI workflow question. Vague answers do not advance.
- Relevant experience: concrete examples of similar engagements (LLM-backed SaaS, manual UAT + bug fix sprints).
- Proposed plan: how you’d structure week 1 vs week 2, what you’d prioritize, what risks you’d surface early.
- Code samples or portfolio: production code you’ve shipped, ideally including LLM integration.
- Communication quality: clarity of the proposal itself signals how you’ll communicate during the sprint.
- Bid + timeline: total cost, hours estimate, earliest start date.

We are not optimizing for the lowest bid. We are optimizing for the team most likely to ship clean, tested, merged-to-staging work in 10 days.

8. How to Apply

Your proposal should include:
- A 1–2 paragraph summary of how you’d approach this engagement.
- Total bid (fixed per sprint or hourly with cap), hours estimate, earliest start date.
- 2–3 portfolio links or code samples, at least one with LLM integration. Design or visual-QA portfolio links also welcome.
- Team composition (solo or firm; name the people and their roles, and clarify who does the human UAT pass vs the bug fixes).
- Answers to the screening questions below.

Screening Questions

Q1. AI leverage in your dev workflow: Describe one production change you shipped in the last 60 days where an AI agent (Claude Code, Cursor, Aider, Devin, etc.) did substantial work. Tell us (a) which tool, (b) what the change was, (c) what you delegated vs reviewed, (d) where the AI was wrong and how you caught it. Vague answers (“we use Copilot daily”) will not advance.

Q2. Designer’s-eye UAT: Pick any live SaaS product you don’t work on. In 200–300 words, write the kind of bug ticket you’d file if you were doing a human UAT pass on it. Pick something visual, behavioral, or copy-related (not a hard crash). We want to see how you observe, what you compare against, and how you communicate it.

Q3. Human UAT depth: Describe a recent engagement where you did manual, exploratory UAT on a real product (not automated). What did you find that automation would have missed? How did you organize the pass so you didn’t just click around aimlessly?

Q4. LLM-product UAT: Launture is an AI-native product. What kinds of bugs do you expect to find on an AI-driven flow that wouldn’t appear on a traditional CRUD app? (Examples might include: hallucinated content, missing citations, off-brand tone, latency-related UX gaps, streaming-token glitches, retry/error UX.)

Q5. Backend bug fix discipline: Our CI requires pytest --cov --cov-fail-under=70 to stay green. How do you sequence a backend bug-fix PR so you don’t accidentally drop coverage while also keeping the diff small?

Q6. SSE chat hardening (200 words max): Our streaming chat occasionally drops mid-stream under flaky network conditions. Walk us through how you’d reproduce, diagnose, and harden this.

Q7. Earliest start date, your business-hours timezone, and where you fall on the solo-contractor vs small-firm spectrum.