
Product Categorization/Taxonomy tagging

Budget: - HOURLY / FULL_TIME ⭐ 5.00 (12) United States

python, accuracy-verification, automation

We have a large catalog of about 100k products across many categories. We have several large product lists that need to be cleaned, organized, and classified within our provided taxonomy. This work cannot be done by hand and needs to be automated with a combination of scripts and AI. We need a repeatable, rule-based process.

What you'll do:
- Build an automated classification pipeline that maps products to the correct categories from our controlled vocabulary (we provide the list; no free-typing).
- Assign product attributes/specs programmatically from a defined set of valid values.
- Clean and normalize messy titles and field values, deduplicate, and fix systematic mis-tags across the whole list.
- Automatically flag the rows the rules can't confidently resolve, so only a small share needs human review instead of all 100k.
- Build QA checks and sampling so we can measure accuracy and catch errors.
- Deliver clean output in CSV, formatted for a standard eCommerce product import.

Required skills:
- Python (pandas) or equivalent scripting for data work. This is the core requirement.
- Experience with rule-based classification, text normalization, fuzzy/string matching, and deduplication on large datasets.
- Strong with CSV/spreadsheet data and working to a defined schema or controlled vocabulary.
- A QA mindset: building validation, sampling, and accuracy metrics.

Please have previous project experience using scripts/LLMs to classify large sets of data.

To apply, please answer:
- At ~100,000 rows, how would you approach automating classification against a fixed category list? What's your general method, and how do you handle the rows your rules can't confidently resolve?
- How do you measure and keep accuracy high on a job this size (validation, sampling, error checks)?
- What tools/libraries would you use, and roughly how large a dataset have you cleaned or categorized before?

Thank you
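For illustration only, the classify-then-flag step described above could be sketched roughly as follows. This is a minimal sketch, not a required design: the category list, the keyword rules, the `normalize` and `classify` helpers, and the 0.8 similarity cutoff are all hypothetical stand-ins, and it uses Python's standard-library `difflib` for fuzzy matching instead of a dedicated matching library.

```python
import difflib
import re

# Hypothetical controlled vocabulary; the real list is provided by the client.
CATEGORIES = ["Laptops", "Headphones", "Office Chairs"]

# Hypothetical keyword rules mapping a keyword to one allowed category.
KEYWORD_RULES = {
    "laptop": "Laptops",
    "notebook": "Laptops",
    "headphone": "Headphones",
    "earbud": "Headphones",
    "chair": "Office Chairs",
}


def normalize(title):
    """Lowercase, turn runs of non-alphanumerics into spaces, trim the ends."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def classify(title, cutoff=0.8):
    """Return (category, needs_review) for one product title.

    cutoff is an assumed similarity threshold for the fuzzy pass.
    """
    tokens = normalize(title).split()

    # Pass 1: exact keyword hit, with naive handling of a trailing "s".
    for tok in tokens:
        for key, category in KEYWORD_RULES.items():
            if tok == key or tok == key + "s":
                return category, False

    # Pass 2: fuzzy match each token against the rule keywords (catches typos).
    keys = list(KEYWORD_RULES)
    for tok in tokens:
        match = difflib.get_close_matches(tok, keys, n=1, cutoff=cutoff)
        if match:
            return KEYWORD_RULES[match[0]], False

    # Nothing matched confidently: leave the category empty and flag the row.
    return None, True


if __name__ == "__main__":
    for t in ['Dell Latitude 14" LAPTOP, refurbished',
              "Wireless Headphnes Pro",
              "Stainless steel water bottle"]:
        print(t, "->", classify(t))
```

In a sketch like this, every assigned value comes from the fixed vocabulary, and only the rows returned with `needs_review=True` would go to a human reviewer. A real pipeline would likely apply the same idea column-wise in pandas and add sampling-based accuracy checks on the auto-assigned rows.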