Add Local AI to App
Budget: $500.00 (fixed price)
⭐ 5.00 (1)
USA
ios-development, android-app-development, mobile-app-development, ipad-app-development
NLP Engineer — Natural-Language Q&A / Semantic Search Engine (Self-Hosted, No LLM API)
OVERVIEW
I need a natural-language question-answering and semantic search engine built on top of a structured dataset I'll provide, integrated into my app's existing search. A user types a plain-English question and gets a direct, grounded answer plus ranked supporting results.
Hard constraint: this must NOT call any generative LLM API (OpenAI, Anthropic, Google, Cohere, etc.) and must not depend on any paid per-query AI service. The system runs on self-hosted / local models only. To be clear about what "no LLM" means here: embedding models, classifiers, NER, and extractive QA models are expected and fine — I use neural embedding models already. What I do not want is a generative chat LLM (or an API wrapper around one) producing the answers.
WHAT YOU'LL BUILD
- A natural-language query endpoint (FastAPI) that takes a question and returns a direct answer + ranked supporting matches
- Query understanding: intent classification + named entity recognition (entity names, identifiers, dates, categories, and other fields relevant to the dataset)
- Dense vector retrieval over a knowledge base using Qdrant (I already run Qdrant), with optional hybrid keyword/BM25 search
- Answering via extractive QA (span extraction from retrieved passages) and/or deterministic template-filled answers — NOT generation
- Cross-encoder reranking for precision
- An ingestion + embedding pipeline that builds the index from a structured dataset I provide
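As an illustration of the optional hybrid step above, here is a minimal sketch of merging a dense result list with a keyword/BM25 result list using reciprocal rank fusion (RRF). RRF is an assumption on my part for the sake of the example; the posting does not prescribe a fusion method. The record IDs and the `reciprocal_rank_fusion` helper are hypothetical, and the two input lists stand in for results that would really come from Qdrant and a BM25 index.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: in a real service these would come from
# a Qdrant vector search and a BM25 keyword search respectively.
dense_hits = ["rec_7", "rec_2", "rec_9"]
keyword_hits = ["rec_2", "rec_4", "rec_7"]

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)
```

Records that rank well in both lists rise to the top, and the fused list would then be passed to the cross-encoder reranker.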
Coverage is defined by the connected data. The system answers factual questions grounded in the dataset — attribute lookups ("what is the [attribute] of [entity]?"), relationship queries ("what is linked to [entity]?"), and filtered lists ("which records match [criteria]?"), including numeric/value fields if included. It is not expected to do open-ended reasoning or opinion — that's the point of avoiding a generative LLM.
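To make the "deterministic template-filled answer" idea concrete for attribute lookups, here is a small sketch under stated assumptions. The `RECORDS` dictionary, its field names, and the regex-based parsing are all hypothetical stand-ins: the real dataset schema is not given in this posting, and a production system would use the intent classifier and NER step rather than a single pattern.

```python
import re

# Hypothetical records; the real dataset schema is not specified here.
RECORDS = {
    "widget a": {"price": "19.99 USD", "category": "hardware"},
    "widget b": {"price": "4.50 USD", "category": "accessories"},
}

# Matches questions shaped like "what is the [attribute] of [entity]?"
ATTRIBUTE_QUESTION = re.compile(
    r"^what is the (?P<attribute>[\w ]+?) of (?P<entity>.+?)\??$",
    re.IGNORECASE,
)


def answer_attribute_question(question):
    """Return a template-filled answer, or None if it cannot be grounded."""
    match = ATTRIBUTE_QUESTION.match(question.strip())
    if match is None:
        return None  # not an attribute lookup; another handler would take over
    attribute = match.group("attribute").lower()
    entity_text = match.group("entity").strip()
    record = RECORDS.get(entity_text.lower())
    if record is None or attribute not in record:
        return None  # refuse rather than guess: the data does not support an answer
    return f"The {attribute} of {entity_text} is {record[attribute]}."


print(answer_attribute_question("What is the price of Widget A?"))
print(answer_attribute_question("What is the color of Widget A?"))
```

Returning `None` when the entity or attribute is missing keeps every answer traceable to a stored value, which is the behavior the no-generation constraint is after.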
TECH STACK (must integrate with)
- Python, FastAPI
- Qdrant (existing instance)
- sentence-transformers / Hugging Face models
- PostgreSQL
- Containerized deploy (Docker; bonus if you've used Modal serverless GPU)
DELIVERABLES
- Working FastAPI service with documented endpoints
- Reproducible ingestion + embedding pipeline (rebuild the index from raw data with one command)
- Written rationale for model choices (embedding model, QA model, reranker)
- Test suite + evaluation results on a held-out question set I provide
- Dockerfile / deployment instructions
- Short handoff doc so my team can maintain and extend it
MUST NOT
- Call any external generative LLM API or paid per-query AI service
- Ship a black box — code must be readable, documented, and maintainable
- Hard-code answers; the system must generalize from the dataset
ACCEPTANCE CRITERIA
- Answers factual questions grounded in the dataset with [suggested: ≥85%] accuracy on a provided test set
- Query latency of [suggested: ≤300 ms p95] on a single GPU (or a CPU target if we agree on one)
- Clean integration with my existing FastAPI + Qdrant setup
- Index rebuildable from scratch by me, following your docs
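The accuracy and latency criteria above could be checked with a small harness along these lines. This is a sketch, not the agreed evaluation protocol: exact-match scoring after lowercasing is an assumed rule, `evaluate` and `stub_answer` are hypothetical names, and the four-question test set is a placeholder for the held-out set to be provided.

```python
import statistics
import time


def evaluate(answer_fn, test_set):
    """Return (accuracy, p95 latency in ms) over (question, expected) pairs."""
    correct = 0
    latencies_ms = []
    for question, expected in test_set:
        start = time.perf_counter()
        predicted = answer_fn(question)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
        # Exact match after trimming and lowercasing: an assumed scoring rule.
        if predicted is not None and predicted.strip().lower() == expected.strip().lower():
            correct += 1
    accuracy = correct / len(test_set)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    # Needs at least two measurements.
    p95_ms = statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]
    return accuracy, p95_ms


def stub_answer(question):
    # Placeholder for the real query endpoint.
    return {"q1": "a", "q2": "b"}.get(question)


test_set = [("q1", "a"), ("q2", "B"), ("q3", "c"), ("q4", "d")]
accuracy, p95_ms = evaluate(stub_answer, test_set)
print(f"accuracy={accuracy:.2%}, p95={p95_ms:.3f} ms")
```

In practice the latency should be measured against the deployed endpoint on the agreed hardware, with warm-up requests excluded, so that model loading does not distort the p95 figure.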
REQUIRED SKILLS
- NLP, information retrieval, semantic search
- sentence-transformers, text embeddings, vector databases (Qdrant / FAISS / Weaviate)
- Extractive QA (fine-tuning or applying BERT / RoBERTa / DistilBERT for span extraction)
- Intent classification and named entity recognition
- Python, FastAPI, Docker
NICE TO HAVE
- Prior domain-specific QA / knowledge-base retrieval work
- Hybrid search (BM25 + dense; Elasticsearch / OpenSearch)
- Cross-encoder reranking experience
TO APPLY — answer both (I'm filtering out anyone who'll just wrap an LLM):
1. Without using a generative LLM (GPT / Claude / Gemini), how would you build a system that answers a factual question like "what is the [attribute] of [a given entity]?" from a structured dataset — name the retrieval approach and the specific models you'd use and why.
2. Link a retrieval, QA, or semantic-search system you've built. What embedding model and (if any) QA/reranking models did you use, and how did you evaluate it?