
NLP Specialist for Bengali Text Annotation & Metadata Extraction

Budget: $100.00 fixed price / ⭐ 0.00 (0) Canada

python, natural-language-processing, machine-learning, artificial-intelligence, artificial-neural-networks, computer-vision

We are building a Retrieval-Augmented Generation (RAG) system for Bengali content and are looking for a skilled NLP practitioner to enrich raw text data with structured metadata.

Given Bengali news or article content, your task will be to extract and generate high-quality annotations, including:
• Emotion (e.g., concerned, neutral, optimistic)
• Sentiment (positive, negative, neutral)
• Topic classification (e.g., corruption, politics, health)
• Named entities (key organizations, people, institutions)
• Keywords (relevant terms for retrieval optimization)

The output must strictly follow a predefined JSON schema and maintain consistency across large datasets.

Responsibilities:
• Design and/or implement a pipeline to generate structured annotations from Bengali text
• Ensure linguistic and contextual accuracy in both Bengali and English
• Optimize outputs for downstream retrieval systems (TF-IDF, BM25, hybrid search, etc.)
• Handle edge cases such as ambiguous sentiment or mixed topics
• Maintain clean, valid, and production-ready JSON outputs

Requirements:
• Strong proficiency in both Bengali and English (reading & writing)
• Experience with NLP tasks such as NER, sentiment analysis, topic classification
• Familiarity with LLM prompting, fine-tuning, or annotation pipelines
• Experience working with JSON schema validation and structured outputs
• Bonus: Experience with RAG systems, search optimization, or multilingual embeddings

Preferred Stack (optional):
• Python (spaCy, HuggingFace, sentence-transformers, etc.)
• Experience with Bengali NLP tools or datasets
• Understanding of vector databases (Qdrant, FAISS, etc.)

Deliverables:
• Annotated dataset in the required JSON format
• (Optional) Reusable pipeline/script for automated processing

Project Type: Freelance / Contract
Duration: Short-term (with potential for long-term collaboration)

If you have experience working with Bengali language data and understand how structured annotations improve retrieval quality, we’d love to hear from you.

This task is part of an assignment to assess the quality of your work. If the results are satisfactory, longer-term contracts will follow.
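
For applicants who want a concrete picture of the deliverable, here is a minimal sketch of what one annotation record and a basic validity check might look like. The actual JSON schema is not included in this posting, so the field names (`emotion`, `sentiment`, `topics`, `entities`, `keywords`), the helper `validate_annotation`, and the sample values are all assumptions made for illustration only.

```python
import json

# Hypothetical field layout; the real schema is predefined by the client
# and is not part of this posting.
REQUIRED_FIELDS = {
    "emotion": str,
    "sentiment": str,
    "topics": list,
    "entities": list,
    "keywords": list,
}
# The posting names these three sentiment values explicitly.
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def validate_annotation(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty if it looks valid)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    extra = set(record) - set(REQUIRED_FIELDS)
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")
    sentiment = record.get("sentiment")
    if isinstance(sentiment, str) and sentiment not in ALLOWED_SENTIMENTS:
        problems.append(f"unknown sentiment: {sentiment}")
    return problems


# Illustrative record only; not taken from any real dataset.
example = {
    "emotion": "concerned",
    "sentiment": "negative",
    "topics": ["corruption", "politics"],
    "entities": ["দুর্নীতি দমন কমিশন"],
    "keywords": ["দুর্নীতি", "তদন্ত"],
}

# ensure_ascii=False keeps Bengali text readable instead of \u escapes.
serialized = json.dumps(example, ensure_ascii=False)
assert validate_annotation(json.loads(serialized)) == []
```

Running a check like this over every record before delivery is one way to keep a large dataset consistent; a full JSON Schema validator could replace the hand-written checks once the real schema is shared.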