AI Engineer — RAG & Semantic Search for Team Chat Platform (Python, pgvector, Embeddings)
Budget: $15–$25/hr
HOURLY / FULL_TIME
Japan
python, natural-language-processing
We are building TeamChat, a workspace-based team collaboration platform (similar to Slack). This role owns the RAG and semantic search layer: making every message and file in a workspace searchable and usable as grounded context for AI features. We have a detailed scope document ready to share with shortlisted candidates. This is one of two AI roles we are hiring for; strong performance leads to ongoing, long-term collaboration.
CORE RESPONSIBILITIES & SCOPE OF WORK
1. Embedding Pipeline: Incremental indexing of messages and uploaded files (chunking, deduplication, token-aware splitting, metadata preservation), with re-indexing and deletion propagation when sources change.
2. Vector Store & Retrieval: pgvector or Pinecone; hybrid retrieval (BM25 + vector + recency boost); relevance evaluation. Workspace/channel-level permission filtering so users never retrieve content they cannot access.
3. Semantic Search Feature: Natural-language search over workspace history with filters (from:, in:, date ranges), source citations, and a latency budget suitable for interactive use.
4. Quality & Cost: Offline evaluation set for retrieval quality, embedding cost tracking and optimization, retrieval logging.
5. Delivery: Python service with documented internal APIs that the messaging backend and AI feature team can call; tests and an eval harness included.
REQUIRED TECH STACK
- Python 3.11+, FastAPI
- Embeddings + vector DB: pgvector or Pinecone
- Hybrid search (BM25 + vector), rerankers
- PostgreSQL, Redis, Celery or equivalent workers
PROJECT DETAILS
- Engagement: Hourly, $15–$25/hr depending on experience. ~30 hrs/week, initial 3 months, ongoing long-term for the right person.
- Process: Daily async standup (English, text), code review via GitHub PRs, 2-week sprints. At least 3–4 hours of overlap with JST (UTC+9).
- IP & Code: All code delivered in our GitHub org from day one; full source ownership by us.
- Language: English required. Urdu-speaking developers welcome.
WHO SHOULD APPLY
Please do NOT apply if your experience is limited to basic chatbot demos, simple OpenAI API wrappers, or tutorial-level LangChain projects. We will ask about production metrics (cost, latency, retrieval quality).
QUESTIONS TO ANSWER IN YOUR PROPOSAL
1. Describe a RAG system you shipped to production: corpus size, retrieval architecture, and how you measured retrieval quality.
2. How would you design RAG over chat messages where retrieval must respect per-channel permissions?
3. What was your monthly embedding + inference cost in a past project, and how did you reduce it?
4. GitHub/portfolio links, timezone, weekly availability, proposed rate.
5. Start your proposal with the word TEAMCHAT.