
Full-Stack Developer / Data Extraction Engineer

Budget: $6.00 - $19.00 hourly / part-time ⭐ 4.82 (79) United States

data-extraction, crawlers, etl, api, python, mysql, postgresql

Title: Data Extraction & Segmentation

Project Overview

We need to extract, normalize, and structure data from a CRM environment containing approximately:

* 100,000+ company records
* 30,000-40,000 target companies for detailed extraction
* 60,000+ contact records
* Potentially 1M+ activity records

The extracted data will be used for market intelligence, lead sourcing, lead enrichment, and AI workflows.

IMPORTANT FIRST MILESTONE

Before proposing browser automation or scraping: Developer must determine whether our CRM's APIs can be used.

Evaluate:

* CRM REST API
* CRM Bulk API
* SOQL access
* Export capabilities
* Object relationships
* Custom Salesforce objects

Deliverable: Technical assessment recommending:

* API approach
* Bulk API approach
* Browser automation approach (only if necessary)

CRM Environment

The environment contains custom healthcare data including:

Companies (Buyers)

Buyers including:

* Private Equity
* Strategic Acquirers
* Search Funds
* Family Offices
* Independent Sponsors
* Operators
* Broker/Intermediaries

Listings

Transactions across:

* Home Health
* Hospice
* Behavioral Health
* Dental
* Imaging
* Other Healthcare Services

Opportunities

Buyer interest and transaction activity.

Contacts

Buyer executives and deal professionals.

Activity History

Historical buyer engagement.

Listing History

Historical transaction changes.

PRIMARY OBJECTIVE

Create a structured buyer intelligence database. Every buyer, contact, opportunity, listing, activity, and history record should be captured and normalized.

DATA EXTRACTION REQUIREMENTS

1. Company / Buyer Accounts

Capture ALL available fields.
Examples include:

* Account ID
* Account Name
* Parent Account
* Website
* Phone
* Billing Address
* Billing City
* Billing State
* Billing Zip
* Geography
* Company Category
* Specialty Type
* Buyer Profile
* Additional Profile Notes
* Important Notes
* Revenue / Size
* Amount of Capital Available
* Source of Funding
* Healthcare Experience
* Current Healthcare Holdings
* Salesperson
* Account Owner
* Account Owner Alias
* Created Date
* Modified Date
* Last Activity Date
* Account History

Store complete Buyer Profile text. Store complete Notes fields. No truncation.

2. Buyer Contacts

Capture ALL contact records.

Examples:

* Contact ID
* Account ID
* Name
* Title
* Email
* Phone
* Mobile
* Role
* Source
* Salesman
* Created Date
* Last Activity
* Related Contacts

Preserve all contact relationships.

3. Listings

Capture ALL listing fields.

Examples:

Financial Data

* Revenue
* EBITDA
* Asking Price
* Final Sale Price
* Fee %
* Fee Amount

Transaction Data

* Listing Status
* Listing Type
* Close Date
* LOI Execution Date
* LOI Expiration Date
* Offer Price

Marketing Data

* Original Broadcast Date
* Rebroadcast Date
* Last Targeted Broadcast
* Last Book Follow-Up
* Date of Most Recent Financials

Seller Data

* Seller Name
* Seller Email
* Seller Mobile
* Seller Address

Notes

* Important Notes
* Approvals & Exceptions
* Cancellation Reason
* Post-Cancellation Follow-Up Notes
* Payment Notes

Capture all raw text. No truncation.

4. Opportunities Connected

Capture every opportunity connected to every listing.

Examples:

* Opportunity ID
* Listing ID
* Listing Title
* Buyer Account
* Primary Contact
* Stage
* Status
* Notes
* Internal Notes
* Created Date
* Close Date
* Salesperson
* Created By Alias

This table will become the core buyer activity dataset.

5. Activity History

Capture all activities including:

* Emails
* Calls
* Meetings
* Follow-Ups
* Tasks

Fields:

* Subject
* Date
* Assigned To
* Related To
* Contact
* Notes

6. Listing History

Capture all listing history records.
Examples:

* Price Changes
* Status Changes
* Buyer Changes
* LOI Events
* Sale Events
* Payment Events

Store:

* Date
* User
* Action
* Old Value
* New Value

REQUIRED CLASSIFICATION ENGINE

Developer should create derived tags where possible.

Buyer Type

* Private Equity
* Strategic Buyer
* Search Fund
* Independent Sponsor
* Family Office
* Operator
* Broker

Healthcare Focus

* Home Health
* Hospice
* Behavioral Health
* Dental
* Imaging
* Physician Practice
* Other

Geographic Focus

* State
* Region
* National

Activity Status

* Book Sent
* CA Sent
* CA Executed
* Pending Approval
* Conference Call
* Due Diligence
* LOI
* Closed Won
* Closed Lost

ANALYTICS REQUIREMENTS

Database should support:

Buyer Velocity

* Listings Received
* Books Sent
* CAs Executed
* Conference Calls
* LOIs Submitted
* Deals Closed

Buyer Activity Score

* Recency
* Frequency
* Conversion Rate

Industry Focus

Track buyer activity by industry including but not limited to:

* Home Health
* Hospice
* Behavioral Health
* Dental

Geographic Focus

Track buyer activity by:

* State
* Region
* National

High Recent Activity Active

What I'd Add to the JD

Data Warehouse & CRM Requirements

Developer will create a structured PostgreSQL/Supabase database that serves as the system of record for EVI.

Database will store:

* Companies
* Contacts
* Listings
* Opportunities
* Activities
* Listing History
* Account History
* Buyer Intelligence Scores
* Classification Tags

Supporting:

* Buyer ranking
* Buyer matching
* Buyer outreach
* Market intelligence
* AI recommendations
* Future SaaS applications

Preferred Skills

Required

* API
* Bulk API
* SOQL
* Python
* PostgreSQL
* ETL Development
* Data Engineering

If API Access Is Not Available

Experience with:

* Apify
* Playwright
* Browser Automation
* Salesforce Scraping
* Session Management
* Large Scale Data Extraction

Proposal Requirements

Please answer:

* Would you attempt CRM API/Bulk API extraction first? Why?
* Have you extracted custom objects before?
* Have you built buyer intelligence databases or platforms before?
* Have you worked with datasets exceeding 100,000 records?
* What architecture would you recommend for this project?
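
Illustration only: the Buyer Activity Score under Analytics Requirements names three inputs (recency, frequency, conversion rate) but no formula. The sketch below shows one way those inputs might be combined. Everything specific in it is an assumption made for the example, not a requirement of this project: the `BuyerActivity` field names, the 180-day half-life, the cap of 20 engagements, the definition of conversion rate as CAs executed divided by listings received, and the 0.4/0.3/0.3 weights.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class BuyerActivity:
    """Per-buyer counts. Field names are hypothetical, not CRM field names."""
    last_activity: date      # most recent engagement of any kind
    listings_received: int   # listings broadcast to this buyer
    cas_executed: int        # confidentiality agreements executed
    lois_submitted: int      # letters of intent submitted
    deals_closed: int        # opportunities that reached Closed Won


def activity_score(b: BuyerActivity, today: date,
                   half_life_days: float = 180.0) -> float:
    """Return a 0-100 score built from recency, frequency, and conversion.

    All constants here are placeholder assumptions to be tuned on real data.
    """
    # Recency: 1.0 if active today, halving every `half_life_days` of silence.
    days_idle = max((today - b.last_activity).days, 0)
    recency = 0.5 ** (days_idle / half_life_days)

    # Frequency: count of meaningful engagements, saturating at 20.
    engagements = b.cas_executed + b.lois_submitted + b.deals_closed
    frequency = min(engagements / 20.0, 1.0)

    # Conversion: share of received listings that led to an executed CA.
    if b.listings_received:
        conversion = min(b.cas_executed / b.listings_received, 1.0)
    else:
        conversion = 0.0

    # Weighted sum, scaled to 0-100 and rounded to one decimal place.
    return round(100 * (0.4 * recency + 0.3 * frequency + 0.3 * conversion), 1)


if __name__ == "__main__":
    buyer = BuyerActivity(date(2024, 1, 1), listings_received=10,
                          cas_executed=5, lois_submitted=3, deals_closed=2)
    print(activity_score(buyer, today=date(2024, 1, 1)))
```

A score of this shape could be stored alongside the Buyer Intelligence Scores table and recomputed after each extraction run. The actual inputs and weights are open for the developer to propose.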