Data Scientist to Build Lookalike Audience Model (Prototype + Methodology)
Budget: -
HOURLY / PART_TIME
⭐ 4.99 (78)
United States
marketing-analytics, data-science, data-modeling, python, r, statistics, data-analysis
We're adding lookalike modeling to a data marketing platform and need a data scientist who has built this kind of capability for marketing and audience targeting. We have an in-house engineering team that owns the data infrastructure and will handle productionizing the model inside our tool. Your job is the model itself: design the methodology, build a working prototype, and validate that it performs.
Here's the use case. A brand brings a seed audience. That could be a customer file, say their best buyers of a particular product, or an audience they've already built in the platform. Behind the scenes we have a third-party dataset covering the US adult population with demographic and behavioral attributes. The user clicks "build a lookalike," the model learns what distinguishes their seed audience from everyone else, and we score the full universe so every person carries a likelihood score. From there the user ranks and narrows down to the audience they want to reach.
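To make that flow concrete, here is a minimal sketch of one way it could look, not a prescribed method. It labels the seed audience as 1 and a random sample of the universe as 0, fits a plain logistic regression by gradient descent, then scores and ranks everyone. The data is synthetic, the feature names are hypothetical, and the pure-Python fitting routine stands in for whatever modeling library the prototype actually uses.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(rows, labels, lr=0.5, epochs=300):
    """Fit logistic regression with batch gradient descent; returns (weights, bias)."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def score(rows, w, b):
    """Return a seed-likeness score in [0, 1] for every row."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in rows]

# Synthetic stand-in data; feature names are hypothetical.
rng = random.Random(0)
FEATURES = ["age_scaled", "income_scaled", "outdoor_interest"]

def person(is_seed_like):
    # Seed-like people are shifted upward on the last two features only.
    shift = 1.5 if is_seed_like else 0.0
    return [rng.gauss(0, 1), rng.gauss(shift, 1), rng.gauss(shift, 1)]

seed = [person(True) for _ in range(200)]
universe = [person(rng.random() < 0.1) for _ in range(2000)]

# Seed rows are positives; a random sample of the universe acts as negatives.
reference = rng.sample(universe, 600)
X = seed + reference
y = [1] * len(seed) + [0] * len(reference)
w, b = fit_logistic(X, y)

# Score the whole universe, then rank so the user can narrow to a top slice.
scores = score(universe, w, b)
ranked = sorted(range(len(universe)), key=lambda i: scores[i], reverse=True)
top_decile = ranked[: len(universe) // 10]
print(dict(zip(FEATURES, (round(wi, 2) for wi in w))))
```

Note that the reference sample may itself contain people who resemble the seed. How to handle that contamination is one of the methodology questions we'd expect a proposal to address.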
We also want to give the user some insight into the model, not just an output, so the result feels defensible rather than like a black box. Alongside the scored audience, we want to surface the variables driving the model so a marketer can see why these people resemble their customers.
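As a rough illustration of what "surfacing the drivers" could mean, the sketch below ranks the coefficients of a linear model by magnitude and turns them into plain-language statements. The coefficient values and feature names are made up for illustration, and it assumes the features were standardized so that magnitudes are comparable. A tree-based model would need a different importance measure.

```python
import math

# Hypothetical fitted coefficients on standardized features (illustrative values only).
coefficients = {
    "household_income": 0.92,
    "outdoor_interest": 1.31,
    "age": -0.08,
    "urban_density": -0.47,
}

def top_drivers(coefs, k=3):
    """Return the k features with the largest absolute coefficient, with direction and odds ratio."""
    ranked = sorted(coefs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]
    return [
        {
            "feature": name,
            "direction": "more likely" if weight > 0 else "less likely",
            # exp(coefficient) is the odds multiplier per one standard deviation increase.
            "odds_ratio": round(math.exp(weight), 2),
        }
        for name, weight in ranked
    ]

drivers = top_drivers(coefficients)
for d in drivers:
    print(
        f"Higher {d['feature']}: {d['direction']} to resemble the seed "
        f"(odds x{d['odds_ratio']} per 1 SD)"
    )
```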
We want you to define the modeling approach, build and validate a working prototype, and document the methodology clearly enough that our team can take it into production.
What we're looking for:
- Hands-on experience building lookalike or propensity models for marketing, advertising, or audience targeting
- Strong propensity and classification modeling (logistic regression, gradient boosting, and similar) plus feature engineering on large demographic and behavioral datasets
- A sound approach to scoring at scale, meaning the full US adult population across hundreds of millions of records, even if the prototype runs on a sample
- Experience modeling a seed file against a large third-party reference dataset
- Model explainability, with the ability to surface top drivers and feature importance in a way a non-technical marketer can read
- Clear documentation habits, so our engineering team can take your methodology into production without guesswork
Helpful: adtech, martech, or data marketplace background
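On the scoring-at-scale point, one pattern we have in mind (purely as a sketch, not a requirement) is streaming the universe through the model in batches and keeping only the top-k scores in a bounded heap, so memory does not grow with the population size. The record generator, weights, and batch size below are hypothetical placeholders.

```python
import heapq
import math
from itertools import islice

def batches(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

def score_row(features, w, b):
    # Logistic score; inputs here are small, so the plain formula is safe.
    z = sum(wi * xi for wi, xi in zip(w, features)) + b
    return 1.0 / (1.0 + math.exp(-z))

def top_k_streaming(records, w, b, k, batch_size=1000):
    """Score (person_id, features) records batch by batch, keeping only the k best."""
    heap = []  # min-heap of (score, person_id); the smallest kept score sits at heap[0]
    for chunk in batches(records, batch_size):
        for person_id, features in chunk:
            s = score_row(features, w, b)
            if len(heap) < k:
                heapq.heappush(heap, (s, person_id))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, person_id))
    return sorted(heap, reverse=True)

def fake_universe(n):
    # Hypothetical stand-in for reading the third-party dataset from storage.
    for i in range(n):
        yield i, [(i * 37 % 1009) / 1009, (i * 91 % 1013) / 1013]

weights, bias = [1.0, -0.5], -0.2  # illustrative values only
best = top_k_streaming(fake_universe(50_000), weights, bias, k=100)
print(best[:3])
```

In production the batches would more likely be partitions processed in parallel, with per-partition results merged afterward. The bounded-heap idea carries over.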
What to include in your proposal:
1. A short note on similar lookalike or audience modeling work you've done
2. How you'd approach this, including the modeling method and how you'd handle scoring the full population
3. The tools and stack you'd use for the prototype
4. Your pricing approach and how you typically structure work like this
5. A rough timeline
If this is the kind of problem you've solved before, we'd like to hear how you'd build it.