Machine learning engineer for AI chatbot development and document retrieval system
Budget: $150.00 (fixed)
⭐ 0.00 (0)
Pakistan
artificial-intelligence, machine-learning
Job Description (AI Chatbot RAG system)
We are looking for a Machine Learning Engineer to build a RAG-based chatbot system using a local LLM that can answer questions from a collection of PDF documents.
Project Overview
We have a dataset consisting of multiple PDF files (mixed structure: text-heavy documents, notices, and tabular information inside PDFs).
The goal is to build a system where users can ask natural language questions and get accurate answers grounded strictly in the content of these PDFs.
The system must:
Extract and process text from PDFs
Chunk and index the content for retrieval
Use embeddings + vector search for relevant context retrieval
Use a local LLM (not API-based) to generate answers
Ensure answers are strictly grounded in retrieved content (minimize hallucination)
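For illustration only, the requirements above can be sketched as a small retrieval pipeline. This is a minimal sketch under stated assumptions, not a prescribed design: the function names (`chunk_text`, `embed`, `retrieve`, `build_prompt`) are hypothetical, PDF text extraction is assumed to have already happened, and `embed` is a toy bag-of-words stand-in for a real embedding model and vector index such as FAISS or Chroma. The prompt returned by `build_prompt` would then be passed to the local LLM.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy stand-in for an embedding model (assumption): term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Build a prompt that tells the model to answer only from the context."""
    context = "\n\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(context_chunks)
    )
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
```

A call such as `build_prompt(question, retrieve(question, chunks))` yields the grounded prompt. In a real system the toy `embed` and the linear scan in `retrieve` would be replaced by the chosen embedding model and vector database.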
Key Responsibilities
Design and implement a RAG pipeline for PDF documents
Build a robust PDF parsing and chunking system
Create embedding + vector database pipeline (FAISS, Chroma, etc.)
Integrate a local LLM for inference (Ollama, vLLM, Transformers, etc.)
Optimize retrieval quality and context selection
Ensure the system is stable and production-ready (not just a demo script)
Required Skills
Strong experience with RAG systems
Experience with PDF parsing / document ingestion
Knowledge of embeddings and vector search
Experience with local LLMs (Ollama / vLLM / HuggingFace Transformers)
Strong Python skills
Nice to Have
Experience with hybrid retrieval (BM25 + vector search)
Experience handling noisy PDFs (OCR, scanned documents)
Understanding of chunking strategies and retrieval evaluation
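As an illustration of what hybrid retrieval can look like, the sketch below merges a BM25 ranking and a vector-search ranking with reciprocal rank fusion. The posting does not name a fusion method, so choosing this one is an assumption, as are the function name `reciprocal_rank_fusion` and the chunk ids in the usage note. The constant `k=60` is a commonly used default, not a requirement.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by chunk id so the output
    is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))
```

For example, with `bm25 = ["c1", "c2", "c3"]` and `vector = ["c3", "c1", "c4"]`, calling `reciprocal_rank_fusion([bm25, vector])` ranks the chunks that both retrievers found ahead of those found by only one.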
Deliverables
End-to-end working RAG chatbot
PDF ingestion + preprocessing pipeline
Vector index setup
Local LLM integration
Simple documentation explaining the architecture
How We Work
We prefer engineers who think in terms of system design, not just libraries.
Important Note
This is not a basic chatbot or prompt engineering task. We are specifically looking for someone with real experience building RAG-based document intelligence systems.