Need data engineer / semantic data specialist

Budget: $100.0 FIXED / ⭐ 4.55 (17) HKG

python, data-science, data-modeling

Role Summary:
We are seeking a highly skilled data specialist to design and implement the foundational data architecture for an AI-enabled private equity deal platform. This role will focus on building the semantic layer, context layer, and core data structures that will help users navigate deal opportunities, company records, and a CRM database of 100K+ contacts with accuracy and speed. The ideal candidate will combine strong data engineering skills with experience in metadata management, entity resolution, data modeling, and AI-ready data design. This person will work closely with product, engineering, and business stakeholders to ensure the platform can support future LLM-powered search, recommendation, and workflow capabilities.

Key Responsibilities:
• Design and build the platform’s semantic layer to unify deal, contact, company, and investor data across multiple workflows.
• Develop the context layer that captures relationship logic, entity definitions, user intent signals, and deal-category-specific metadata.
• Create scalable data models for primary equity raises, private credit, unicorn secondaries, buyout/M&A, small business buy/sell, and CRM interactions.
• Integrate and normalize structured and semi-structured data from the CRM and other internal sources.
• Build entity resolution and deduplication logic for contacts, firms, companies, and deal records.
• Define canonical objects, taxonomies, and relationships to support accurate search and discovery.
• Partner with product and AI teams to prepare data for retrieval, semantic search, and future LLM applications.
• Establish data quality, governance, lineage, and validation processes.
• Document schemas, definitions, and operating standards so the platform can scale reliably.
• Collaborate with stakeholders to translate business workflows into durable data structures and reusable metadata.

Required Qualifications:
• 5+ years of experience in data engineering, analytics engineering, data architecture, or a closely related role.
• Strong SQL skills and experience with Python or similar data tooling.
• Hands-on experience designing data models, semantic layers, or business metric layers.
• Experience working with CRM, contact, account, or deal-related datasets.
• Familiarity with entity resolution, master data concepts, metadata management, and data governance.
• Experience building pipelines and working with warehouses such as Snowflake, BigQuery, Redshift, Databricks, or similar.
• Ability to work cross-functionally with product and business teams.
• Experience accessing, ingesting, and normalizing external content resources from databases, research libraries, and licensed content providers.
• Familiarity with major financial and market intelligence sources such as Financial Times, Thomson Reuters, CB Insights, Wall Street Journal, and similar independent research houses.
• Experience working with third-party data sources via web interfaces, enterprise platforms, file feeds, and APIs.
• Ability to design workflows for sourcing, tagging, classifying, and structuring external research content for internal platform use.
• Understanding of data licensing, usage restrictions, attribution, and compliance considerations when working with paid or restricted content sources.
• Experience integrating external research into searchable datasets, semantic layers, or knowledge bases.
• Comfort evaluating source quality, relevance, freshness, and duplication across multiple research providers.
• Strong ability to translate unstructured external content into structured metadata, entities, and context useful for deal sourcing and contact intelligence.

Preferred Qualifications:
• Experience in private equity, venture capital, investment banking, capital markets, or financial services.
• Familiarity with knowledge graphs, ontologies, vector search, or retrieval-augmented generation concepts.
• Experience preparing structured data for AI/LLM use cases.
• Exposure to CRM systems such as Salesforce or similar platforms.
• Experience with data cataloging, lineage tools, and data quality frameworks.

What Success Looks Like:
• The platform has a trusted semantic model for contacts, companies, deals, and relationships.
• Users can reliably find the right contact or company based on intent and context.
• The data foundation supports search, filtering, recommendations, and future AI assistant capabilities.
• Business teams can reuse consistent definitions across all deal categories.
• Data quality and governance are built into the platform from the start.

Nice-to-Have Experience:
• Building AI-ready data products or context-aware retrieval layers.
• Designing taxonomy or ontology structures for enterprise data.
• Working with investment platforms, marketplaces, or B2B CRM systems.
• Prior work with market data, research aggregation, or alternative data platforms.
• Exposure to content ingestion from publisher APIs, RSS feeds, document repositories, or research portals.
• Experience supporting search, ranking/matching, discovery, or recommendation use cases built on licensed or proprietary research content.
