Add Local AI to App

Budget: $500.0 FIXED / ⭐ 5.00 (1) USA

ios-development, android-app-development, mobile-app-development, ipad-app-development

NLP Engineer: Natural-Language Q&A / Semantic Search Engine (Self-Hosted, No LLM API)

OVERVIEW

I need a natural-language question-answering and semantic search engine built on top of a structured dataset I'll provide, integrated into my app's existing search. A user types a plain-English question and gets a direct, grounded answer plus ranked supporting results.

Hard constraint: this must NOT call any generative LLM API (OpenAI, Anthropic, Google, Cohere, etc.) and must not depend on any paid per-query AI service. The system runs on self-hosted / local models only.

To be clear about what "no LLM" means here: embedding models, classifiers, NER, and extractive QA models are expected and fine; I use neural embedding models already. What I do not want is a generative chat LLM (or an API wrapper around one) producing the answers.

WHAT YOU'LL BUILD

- A natural-language query endpoint (FastAPI) that takes a question and returns a direct answer + ranked supporting matches
- Query understanding: intent classification + named entity recognition (entity names, identifiers, dates, categories, and other fields relevant to the dataset)
- Dense vector retrieval over a knowledge base using Qdrant (I already run Qdrant), with optional hybrid keyword/BM25 search
- Answering via extractive QA (span extraction from retrieved passages) and/or deterministic template-filled answers, NOT generation
- Cross-encoder reranking for precision
- An ingestion + embedding pipeline that builds the index from a structured dataset I provide

Coverage is defined by the connected data. The system answers factual questions grounded in the dataset: attribute lookups ("what is the [attribute] of [entity]?"), relationship queries ("what is linked to [entity]?"), and filtered lists ("which records match [criteria]?"), including numeric/value fields if included. It is not expected to do open-ended reasoning or opinion; that's the point of avoiding a generative LLM.

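To illustrate what a deterministic template-filled answer for an attribute lookup might look like, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the `RECORDS` data, the regex pattern, and the function name `answer_attribute_question` are hypothetical, and in the real system the regex would be replaced by the intent classifier and NER step, with records coming from PostgreSQL / Qdrant rather than a dict.

```python
import re
from typing import Optional

# Hypothetical records standing in for the structured dataset.
RECORDS = {
    "acme pump": {"weight": "12 kg", "category": "hardware"},
    "beta valve": {"weight": "3 kg", "category": "hardware"},
}

# A single pattern for the "what is the [attribute] of [entity]?" intent.
# A trained intent classifier + NER model would take this role in practice.
ATTRIBUTE_QUESTION = re.compile(
    r"^what is the (?P<attribute>[\w ]+?) of (?P<entity>[\w ]+?)\??$",
    re.IGNORECASE,
)


def answer_attribute_question(question: str) -> Optional[str]:
    """Return a template-filled answer, or None if the question is out of scope."""
    match = ATTRIBUTE_QUESTION.match(question.strip())
    if match is None:
        return None  # not an attribute lookup
    attribute = match.group("attribute").lower()
    entity = match.group("entity").lower()
    record = RECORDS.get(entity)
    if record is None or attribute not in record:
        return None  # the dataset has no grounded answer, so refuse
    # The answer text is built only from values stored in the dataset.
    return f"The {attribute} of {entity} is {record[attribute]}."


if __name__ == "__main__":
    print(answer_attribute_question("What is the weight of Acme Pump?"))
```

The point of the sketch is the refusal path: when the entity or attribute is not in the data, the function returns nothing instead of producing text, which is the behavior that separates this approach from a generative model.
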
TECH STACK (must integrate with)

- Python, FastAPI
- Qdrant (existing instance)
- sentence-transformers / Hugging Face models
- PostgreSQL
- Containerized deploy (Docker; bonus if you've used Modal serverless GPU)

DELIVERABLES

- Working FastAPI service with documented endpoints
- Reproducible ingestion + embedding pipeline (rebuild the index from raw data with one command)
- Written rationale for model choices (embedding model, QA model, reranker)
- Test suite + evaluation results on a held-out question set I provide
- Dockerfile / deployment instructions
- Short handoff doc so my team can maintain and extend it

MUST NOT

- Call any external generative LLM API or paid per-query AI service
- Ship a black box: code must be readable, documented, and maintainable
- Hard-code answers; the system must generalize from the dataset

ACCEPTANCE CRITERIA

- Answers factual questions grounded in the dataset with [suggested: ≥85%] accuracy on a provided test set
- Query latency [suggested: 300ms p95] on a single GPU (or CPU target if we agree on one)
- Clean integration with my existing FastAPI + Qdrant setup
- Index rebuildable from scratch by me, following your docs

REQUIRED SKILLS

- NLP, information retrieval, semantic search
- sentence-transformers, text embeddings, vector databases (Qdrant / FAISS / Weaviate)
- Extractive QA (fine-tuning or applying BERT / RoBERTa / DistilBERT for span extraction)
- Intent classification and named entity recognition
- Python, FastAPI, Docker

NICE TO HAVE

- Prior domain-specific QA / knowledge-base retrieval work
- Hybrid search (BM25 + dense; Elasticsearch / OpenSearch)
- Cross-encoder reranking experience

TO APPLY: answer both (I'm filtering out anyone who'll just wrap an LLM):

1. Without using a generative LLM (GPT / Claude / Gemini), how would you build a system that answers a factual question like "what is the [attribute] of [a given entity]?" from a structured dataset? Name the retrieval approach and the specific models you'd use and why.
2. Link a retrieval, QA, or semantic-search system you've built. What embedding model and (if any) QA/reranking models did you use, and how did you evaluate
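
For applicants thinking about evaluation, the acceptance criteria (accuracy on a provided test set, p95 query latency) could be checked with something like the sketch below. It is an assumption-laden illustration, not a required harness: the function names are hypothetical, accuracy is taken to mean case-insensitive exact match, p95 uses the nearest-rank method, and the suggested figures are read as a minimum accuracy of 85% and a maximum p95 latency of 300 ms.

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions equal to the gold answer (case/whitespace-insensitive)."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and the same length")
    hits = sum(p.strip().lower() == g.strip().lower() for p, g in zip(predictions, gold))
    return hits / len(gold)


def p95_latency_ms(latencies_ms: Sequence[float]) -> float:
    """95th percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency measurement")
    ordered = sorted(latencies_ms)
    # ceil(0.95 * n) computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_acceptance(
    accuracy: float,
    p95_ms: float,
    min_accuracy: float = 0.85,  # assumed reading of the suggested accuracy target
    max_p95_ms: float = 300.0,   # assumed reading of the suggested latency target
) -> bool:
    """True when both the accuracy and latency thresholds are satisfied."""
    return accuracy >= min_accuracy and p95_ms <= max_p95_ms


if __name__ == "__main__":
    acc = exact_match_accuracy(["12 kg", "hardware"], ["12 kg", "Hardware"])
    p95 = p95_latency_ms([120.0, 180.0, 210.0, 250.0, 290.0])
    print(acc, p95, meets_acceptance(acc, p95))
```

Exact match is the strictest choice; if answers are free-form spans, a token-overlap score may be a fairer metric, and that is something to agree on before the held-out set is run.
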
