I-Powered Search + RAG Answer System for a Legal Education Platform

Бюджет: $75.0 - $150.0 HOURLY / PART_TIME ⭐ 5.00 (3) United States

Overview We run a continuing legal education (CLE) platform with a large catalog of courses (live, replay, and on-demand) across dozens of practice areas, plus full session transcripts and video. We're building an AI-powered search experience and need a developer to build the system behind it. The UI is already designed, and a written spec for the ranking logic is finalized — we'll share both with shortlisted candidates. We want a working v1 live quickly, then iterate. There are two connected parts to this build. Part 1 — AI Catalog Search (with custom ranking logic) Natural-language search over our course catalog that returns relevant classes and applies a specific, already-defined ranking and grouping logic. Semantic search over the catalog, filterable by practice area, topic, credit type, and jurisdiction. A ranking + fill engine implemented exactly to our spec (you implement it; you don't design it): Format tiers: Live + Replay are one combined tier and always rank above On-demand. Sort within tiers: Live/Replay by date (nearest upcoming first); On-demand by popularity (we supply the popularity list). Fill: show the total match count (e.g., "80 classes match"), render 10 — walk Live/Replay by date first, top up with most-popular On-demand, then link to the full set ("See all 80"). Grouping: 3 or fewer results = flat list; more than 3 = group by format tier, then sub-practice area. A clean, documented API the existing React front end calls. Part 2 — RAG Answer System (transcript + video) A retrieval-augmented search across our session transcripts and video that returns a single, cited answer assembled from four linked panels — all pointing to the same moment in a live CLE: AI Reasoning — a synthesized, plain-language answer grounded strictly in retrieved content (no claim without a source). Transcript — the exact transcript excerpt the answer is drawn from, with speaker and timestamp. Video — the live-session clip, deep-linked to that same timestamp. Speaker Information — the faculty member who spoke the passage, with bio/credentials. The engineering core is keeping all four panels in sync: every panel keys off the same program + speaker + timestamp anchor, so chunking must preserve speaker and timestamp at the segment level. Stack We use the Anthropic Claude API and are open to your recommendation on vector store and embeddings (pgvector, Pinecone, Qdrant, etc.). Use whatever ships a clean v1 fastest. Required Production experience building semantic / AI search and RAG systems. Strong with vector databases, embeddings, and LLM APIs. Can process long-form transcripts (chunking that preserves speaker + timestamp metadata). Can implement precise ranking/sorting/grouping rules from a written spec. Can build clean, documented APIs for a React front end. Citations that trace back to real sources — minimal hallucination. Nice to have Legal or other accuracy-critical content experience. Video deep-linking / timestamp syncing. n8n, WooCommerce/Stripe familiarity. Engagement Type: [Hourly / Fixed-price — choose one] Timeline: Start now, working v1 live quickly. Catalog search can ship first; RAG answer system can follow. Budget: [Insert range, or "propose based on scope"] To apply, briefly tell us A RAG or AI search system you've built, and the corpus size. How you'd get a basic working version live fast. Your suggested vector store, embedding model, and LLM. How you'd keep the four answer panels (reasoning, transcript, video, speaker) pointing to the same source moment.

Відкрити на Upwork