Need data engineer / semantic data specialist
Budget: $100.0
FIXED
⭐ 4.55 (17)
HKG
python, data-science, data-modeling
Role Summary:
We are seeking a highly skilled data specialist to design and implement the foundational data architecture for an AI-enabled private equity deal platform. This role will focus on building the semantic layer, context layer, and core data structures that will help users navigate deal opportunities, company records, and a CRM database of 100K+ contacts with accuracy and speed.
The ideal candidate will combine strong data engineering skills with experience in metadata management, entity resolution, data modeling, and AI-ready data design. This person will work closely with product, engineering, and business stakeholders to ensure the platform can support future LLM-powered search, recommendation, and workflow capabilities.
Key Responsibilities:
• Design and build the platform’s semantic layer to unify deal, contact, company, and investor data across multiple workflows.
• Develop the context layer that captures relationship logic, entity definitions, user intent signals, and deal-category-specific metadata.
• Create scalable data models for primary equity raises, private credit, unicorn secondaries, buyout/M&A, small business buy/sell, and CRM interactions.
• Integrate and normalize structured and semi-structured data from the CRM and other internal sources.
• Build entity resolution and deduplication logic for contacts, firms, companies, and deal records.
• Define canonical objects, taxonomies, and relationships to support accurate search and discovery.
• Partner with product and AI teams to prepare data for retrieval, semantic search, and future LLM applications.
• Establish data quality, governance, lineage, and validation processes.
• Document schemas, definitions, and operating standards so the platform can scale reliably.
• Collaborate with stakeholders to translate business workflows into durable data structures and reusable metadata.
Required Qualifications:
• 5+ years of experience in data engineering, analytics engineering, data architecture, or a closely related role.
• Strong SQL skills and experience with Python or similar data tooling.
• Hands-on experience designing data models, semantic layers, or business metric layers.
• Experience working with CRM, contact, account, or deal-related datasets.
• Familiarity with entity resolution, master data concepts, metadata management, and data governance.
• Experience building pipelines and working with warehouses such as Snowflake, BigQuery, Redshift, Databricks, or similar.
• Ability to work cross-functionally with product and business teams.
• Experience accessing, ingesting, and normalizing external content resources from databases, research libraries, and licensed content providers.
• Familiarity with major financial and market intelligence sources such as the Financial Times, Thomson Reuters, CB Insights, and The Wall Street Journal, as well as independent research houses.
• Experience working with third-party data sources via web interfaces, enterprise platforms, file feeds, and APIs.
• Ability to design workflows for sourcing, tagging, classifying, and structuring external research content for internal platform use.
• Understanding of data licensing, usage restrictions, attribution, and compliance considerations when working with paid or restricted content sources.
• Experience integrating external research into searchable datasets, semantic layers, or knowledge bases.
• Comfort evaluating source quality, relevance, freshness, and duplication across multiple research providers.
• Strong ability to translate unstructured external content into structured metadata, entities, and context useful for deal sourcing and contact intelligence.
Preferred Qualifications:
• Experience in private equity, venture capital, investment banking, capital markets, or financial services.
• Familiarity with knowledge graphs, ontologies, vector search, or retrieval-augmented generation concepts.
• Experience preparing structured data for AI/LLM use cases.
• Exposure to CRM systems such as Salesforce or similar platforms.
• Experience with data cataloging, lineage tools, and data quality frameworks.
What Success Looks Like:
• The platform has a trusted semantic model for contacts, companies, deals, and relationships.
• Users can reliably find the right contact or company based on intent and context.
• The data foundation supports search, filtering, recommendations, and future AI assistant capabilities.
• Business teams can reuse consistent definitions across all deal categories.
• Data quality and governance are built into the platform from the start.
Nice-to-Have Experience:
• Building AI-ready data products or context-aware retrieval layers.
• Designing taxonomy or ontology structures for enterprise data.
• Working with investment platforms, marketplaces, or B2B CRM systems.
• Prior work with market data, research aggregation, or alternative data platforms.
• Exposure to content ingestion from publisher APIs, RSS feeds, document repositories, or research portals.
• Experience supporting search, ranking, matching, discovery, or recommendation use cases built on licensed or proprietary research content.