Autonomous Buying-Signal Agent Developer
Budget: $20.0 - $40.0
HOURLY / PART_TIME
⭐ 4.96 (9)
USA
data-scraping, api-integration, python, postgresql, celery, docker
Note: to be considered, include a video demonstrating familiarity with Hermes or an equivalent agent.
We run B2B outbound campaigns and want an always-on service that watches public buying signals, scores them with a low-cost open-weight LLM, and writes results to our database for our team to action. This is a single, self-contained worker running on DigitalOcean or a dedicated private Linux server. We want someone who has built signal, scraping, or lead-scoring pipelines before and can show that work. This is not a from-scratch-learning project. All sourcing is limited to public data and sources that permit automated access — no circumventing site protections or violating third-party terms.
What you'll build
A long-running Python or similar worker (event loop plus a durable job queue) deployed on DigitalOcean or a private Linux server, running as a managed service (Docker or systemd) and surviving restarts.
Signal collectors as standing jobs against public and permitted sources: SEC EDGAR filings, RSS/news and PR feeds, public APIs, and job boards that allow programmatic access. Each fires a structured record when a relevant event is found.
An LLM scoring layer that calls a low-cost open-weight model through a serverless provider (DeepInfra, Together, Groq, or OpenRouter) to classify, dedupe, score, and extract fields from raw signals. The layer relies on structured JSON output and function/tool calling.
Writes of scored, deduped signals to a Postgres database, plus a notification to a Slack channel for human review.
Secure credential handling (kept out of the filesystem and logs, loaded from a secrets manager or platform secrets) and basic host hardening (single-purpose service, restricted egress, sanitizing any external text before it reaches a model prompt).
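To make the shape of the work concrete, here is a minimal sketch of a structured signal record, a dedupe key, and text sanitization ahead of the model prompt. The names `Signal`, `dedupe_key`, and `sanitize_for_prompt`, the field set, and the 2000-character limit are illustrative assumptions, not part of our spec. Stripping tags and control characters is only a first layer and does not by itself prevent prompt injection.

```python
import hashlib
import re
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """Hypothetical structured record emitted by a collector."""
    source: str   # e.g. "sec_edgar" or "rss" (assumed labels)
    company: str
    title: str
    url: str


def dedupe_key(sig: Signal) -> str:
    """Stable hash over normalized fields, usable as a unique key in Postgres.

    The URL is deliberately left out so the same event seen at two
    addresses of one source collapses into a single record.
    """
    normalized = "|".join(
        re.sub(r"\s+", " ", part).strip().lower()
        for part in (sig.source, sig.company, sig.title)
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def sanitize_for_prompt(text: str, max_chars: int = 2000) -> str:
    """Strip markup-like tags and control/format characters, then truncate."""
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"<[^>]*>", " ", text)
    # Keep newlines; drop every other character in a Unicode "C" category.
    text = "".join(
        ch for ch in text
        if ch == "\n" or unicodedata.category(ch)[0] != "C"
    )
    text = re.sub(r"[ \t]+", " ", text).strip()
    return text[:max_chars]
```

In a design like this, the key would typically back a unique constraint so that repeated collector runs cannot insert the same signal twice.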
Required experience
Production Python, async work, and a job queue (Celery, RQ, Arq, or similar).
Integrating LLMs for structured extraction and classification, including prompt design for reliable JSON and tool calling.
Hands-on use of open-weight models through serverless inference providers, with a real opinion on which model/provider fit high-volume, low-cost classification.
Web and data sourcing: public APIs, RSS, permitted scraping, or tools like Playwright or Apify.
Postgres (Supabase a plus).
Deploying and operating a long-running worker on DigitalOcean or a Linux server (Docker, systemd, logging, restart-on-failure).
Sound credential and security hygiene.
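As an illustration of what we mean by reliable JSON, here is a minimal sketch of validating a model response before it is written to the database. The label set, the 0 to 100 score range, and the function name `parse_score` are assumptions made for the example. A real implementation might use a schema library or the provider's structured-output mode instead.

```python
import json
from typing import Optional

# Hypothetical label set; the real taxonomy would come from the project.
ALLOWED_LABELS = {"funding", "hiring", "leadership_change", "other"}


def parse_score(raw: str) -> Optional[dict]:
    """Return a validated {"label", "score"} dict, or None if the reply is unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None

    label = data.get("label")
    score = data.get("score")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return None
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    if not 0 <= score <= 100:
        return None
    return {"label": label, "score": float(score)}
```

Returning None rather than raising lets the worker retry or drop a bad reply without crashing the job.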
Nice to have
Prior work on sales/GTM signal data, intent data, ABM, or lead scoring (6sense, Clay, or similar).
Slack API integration.