
Senior Data Scientist

Budget: $15.00 - $40.00 hourly / Full-time | Rating: 5.00 (16) | Germany

data-science, python, machine-learning, data-analysis

You will own the data enrichment strategy for a massive archive of world-class journalism. Your mission is to take 25 years of historical content and "hydrate" it by cleaning and tagging it with metadata so it can power next-gen AI products and search tools. You’ll act as a bridge between business leaders and engineering teams, turning complex editorial goals into smart, scalable data pipelines.

Most Important

Senior NLP & ML Experience: 5+ years of experience processing large-scale, unstructured text datasets.
Technical Stack: Advanced proficiency in Python (Pandas, PySpark) and building production-ready ETL pipelines.
NLP Frameworks: Hands-on experience with spaCy, Hugging Face, or Transformers for entity recognition and categorization.
Search Knowledge: Familiarity with OpenSearch or Elasticsearch, specifically regarding vector embeddings and index mapping.
Taxonomy Design: Ability to design metadata structures that capture the value of diverse content.
Strategy & Consultation: Experience leading technical discovery sessions and translating business needs into technical requirements.

Nice to Have

Legacy Data Handling: Experience working with messy, historical HTML and "dirty" data archives.
Efficiency Focus: Knowledge of using open-source LLMs to process data in a cost-effective way.
Modern Search: Exposure to hybrid search (Lexical + Vector) and graph-based retrieval.

Personal Traits

The Translator: You can explain complex AI concepts to non-technical people without losing them.
The Diplomat: You are great at mediating between different teams with competing priorities.
Pragmatic Thinker: You focus on results and ROI, knowing when a "good enough" model is better than a perfect one that’s too expensive.
Curious Investigator: You enjoy digging into decades of data to find patterns and solve "messy" problems.
Team Player: You enjoy working closely with backend and search engineers to ensure your data actually works in the final product.