Data Engineer / Consultant (Hands-On)

Budget: - HOURLY / FULL_TIME ⭐ 5.00 (142) United States

bigquery, database-architecture, etl-pipelines, big-data, data-science, data-management, python, sql, data-analysis, data-modeling

We're a fast-growing consumer electronics brand (connected air purifiers, expanding into the broader home environment) with a mature, senior-grade data platform already in production. We need a hands-on data engineer who can do three things: keep it running, build new flows on top of it, and help us actually turn the data into business decisions and revenue.

The platform is real and well-built; the gap is ownership and the "last mile." We have built a pipeline that runs largely headless today. We want someone who will take the wheel, keep it healthy, extend it as we add sources, and, critically, close the loop so the business consumes and acts on what it produces.

The Stack (already in production)

Sources (11+): Amazon SP-API, Shopify GraphQL, Walmart Marketplace, Flowspace 3PL, Tuya IoT (device telemetry: filter life, AQI, runtime), Klaviyo, Yotpo, Recharge, plus reference/environmental feeds (currency, AirNow AQI, NASA FIRMS wildfire data).
Ingestion: ~12 custom Python extractors (one is built on dlt), shared auth/util libraries.
Orchestration: Prefect Cloud + Google Cloud Run, scheduled flows (hourly / 6-hour / daily / monthly), freshness gates, Slack alerting.
Warehouse: BigQuery, layered raw → staging → clean → analytics, modeled in dbt (~30 staging models, clean "business logic" tables, analytics tables, 15 custom tests, macros for identity normalization).
Outbound / reverse ETL: Tuya→Klaviyo device-and-filter sync (drives recurring filter-resale revenue), Airtable inventory sync, monthly Avalara tax export, a FastAPI internal dashboard.
Infra: Cloud Run, Artifact Registry, Secret Manager, Docker.

You don't have to learn all of this on day one, but you should be immediately comfortable in BigQuery, dbt, Python, and a modern orchestrator.

What You'll Do: Three Core Jobs

1. Keep the lights on (KTLO). Own the day-to-day health of the pipeline.
Keep the existing flows running, fix extractors when source APIs change, maintain the dbt models and tests, watch freshness/alerting, and ingest new data streams as we add channels, products, and tools. This is the foundation: it has to be boringly reliable.

2. Build new flows: activate the data. Stand up new pipelines that do something with the data, not just store it. Reverse-ETL audiences and signals back into the tools the business uses (Klaviyo, ad platforms, internal tools), build identity-resolution logic across channels (e.g., resolving anonymous Amazon orders to known customers), and wire up the data behind product features (notifications, device-synced subscriptions, environmental alerts). You'll turn a warehouse of facts into operational triggers.

3. Monetize the data: help us decide and earn with it. This is what makes this role more than maintenance. Our platform was built ahead of its consumers: the engineering is strong, but the "business front door" (dashboards, clean metrics, decision-ready outputs) was never finished. You'll help close that gap: define and validate the metrics that matter (true cross-channel revenue, filter-replacement/retention cohorts, LTV, inventory/reorder signals), build the consumption layer (dashboards / Slack numbers / spreadsheets people will actually open), and partner with us on the analyses that drive real decisions: which customers to target, what to reorder, where revenue is leaking, and how to grow subscription/filter-resale revenue.

We're explicit that we want a builder who also advises: someone with a point of view on what's worth building, what's over-built, and where the leverage is. You'll work directly with our PM/ops/marketing leads, not in a silo.

What We're Looking For

Must-haves:
- Strong SQL + dbt. You've built and maintained a modeled warehouse (staging → marts), written tests, and reasoned about business definitions encoded in models.
- BigQuery (or comparable cloud warehouse) in production.
- Python for data extraction/transformation: building and debugging API extractors, handling auth, rate limits, pagination, incremental loads.
- Orchestration experience (Prefect, Airflow, Dagster, or similar) and comfort deploying on cloud infra (GCP / Cloud Run a plus).
- Reverse ETL / activation: you've pushed warehouse data back into operational tools (Klaviyo, CRMs, ad platforms, reverse-ETL tooling).
- A consultant's instinct. You can look at a stack, tell us what's earning its keep vs. what's dark/unused, and recommend where to invest, and explain it to non-technical stakeholders.

Strongly preferred:
- E-commerce / marketplace data experience: Amazon SP-API (RDT/PII quirks), Shopify, Walmart, 3PL/inventory data.
- IoT / device telemetry experience (Tuya or similar): datapoints, device-to-cloud, fleet data.
- Identity resolution / customer 360 / cross-channel stitching.
- BI / dashboarding (Looker Studio, Metabase, Hex, or similar) to build the consumption layer.
- Marketing/lifecycle data fluency (Klaviyo, attribution, cohorts, LTV).
- AI-native workflow: you use Cursor / Claude Code / Copilot to move faster.
