Web Scraping Automation Bot
Budget: $22.50 - $30.00
Hourly / Part-time
⭐ 0.00 (0)
Singapore
python, data-extraction, data-scraping
We're looking for an experienced web scraping developer to build an automated solution that extracts relevant news articles from multiple large online archives.
Project Overview
The solution needs to scrape several pages of news articles across multiple news websites, filter results by keyword, and automatically export data to Google Sheets.
Scope of Work
1. Archive scraper: Navigate and scrape paginated news archives across multiple websites. Must handle large datasets efficiently.
2. Keyword filtering: Filter articles by keywords found in the title and body. Must support custom keyword lists. AI/NLP-powered filtering is a plus.
3. Google Sheets export: Auto-populate a Google Sheet with article headline, URL, publication date, source, summary, and category.
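To make the filtering and export items above concrete, here is a minimal sketch of how keyword matching and row shaping could work. It is an illustration only, not a required design: the `Article` class, the field names, and the helper functions are all hypothetical, and it uses only the Python standard library so it runs on its own. Matching is case-insensitive and whole-word, which is one reasonable reading of "filter by keywords", not a stated requirement.

```python
import re
from dataclasses import dataclass

# Column order follows the Google Sheets export item above.
# The attribute names themselves are assumptions made for this sketch.
SHEET_COLUMNS = ["headline", "url", "publication_date", "source", "summary", "category"]


@dataclass
class Article:
    headline: str
    url: str
    publication_date: str
    source: str
    summary: str
    category: str
    body: str = ""  # used for filtering only, not exported


def build_keyword_pattern(keywords):
    """Compile a case-insensitive, whole-word pattern from a custom keyword list."""
    escaped = [re.escape(k.strip()) for k in keywords if k.strip()]
    if not escaped:
        raise ValueError("keyword list is empty")
    # Lookarounds instead of \b so keywords ending in punctuation still match.
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)


def matches(article, pattern):
    """True if any keyword appears in the headline or the body."""
    return bool(pattern.search(article.headline) or pattern.search(article.body))


def to_sheet_row(article):
    """Flatten an article into one row, in SHEET_COLUMNS order."""
    return [getattr(article, column) for column in SHEET_COLUMNS]


def filter_rows(articles, keywords):
    """Keep matching articles and return them as sheet-ready rows."""
    pattern = build_keyword_pattern(keywords)
    return [to_sheet_row(a) for a in articles if matches(a, pattern)]
```

Calling `filter_rows(articles, ["strike", "shipping"])` would return one six-column row per matching article. An AI/NLP-based filter could replace `matches` without changing the row format.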
Preferred Tech Stack
- Python (BeautifulSoup, Scrapy, Selenium, or Newspaper3k) or Node.js (Puppeteer)
- Google Sheets API
- AI/NLP tools for content filtering (optional but preferred)
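For the Google Sheets API item, a rough sketch of how rows might be batched before being appended is shown below. The helper name, the default range `Articles!A1`, and the batch size of 500 are assumptions chosen for illustration. The dictionary keys mirror the parameters of the Sheets `spreadsheets.values.append` method (`range`, `valueInputOption`, `insertDataOption`, and a `body` containing `values`); the sketch only builds the request descriptions and does not call the API.

```python
def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_append_requests(rows, sheet_range="Articles!A1", batch_size=500):
    """Describe one values.append call per batch of rows.

    `sheet_range` and `batch_size` are illustrative defaults, not
    values taken from the project brief.
    """
    return [
        {
            "range": sheet_range,
            "valueInputOption": "RAW",         # store values exactly as sent
            "insertDataOption": "INSERT_ROWS",  # add new rows for the data
            "body": {"values": batch},
        }
        for batch in chunked(rows, batch_size)
    ]
```

With the google-api-python-client library, each description could then be passed along the lines of `service.spreadsheets().values().append(spreadsheetId=..., **request).execute()`, where `service` and the spreadsheet ID come from the developer's own setup. Batching keeps the number of API calls low when an archive yields many articles.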
Deliverables
- Fully functional scraping, filtering, and export script
- Documentation on how to run and maintain the solution
- Optional: lightweight UI/dashboard for keyword management
Requirements
- Proven experience in web scraping and large-scale data extraction
- Strong knowledge of Google Sheets API integration
- Ability to handle pagination
- Experience with NLP/text analysis is a plus
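As an example of the pagination handling we mean, the sketch below follows "next" links through an archive until none remain. It assumes the archive marks its next-page link with `rel="next"`, which many sites do but not all; real archives may need a different selector or a page-number scheme. The `fetch` argument is a hypothetical callable that returns HTML for a URL, so the loop can be tried without network access, and `max_pages` is an assumed safety limit.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkParser(HTMLParser):
    """Records the href of the first <a rel="next"> link on a page."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").split()
        if tag == "a" and "next" in rel_values and self.next_href is None:
            self.next_href = attr.get("href")


def iter_archive_pages(start_url, fetch, max_pages=50):
    """Yield (url, html) for each archive page, following rel="next" links.

    Stops when there is no next link, when a URL repeats (to avoid
    loops), or after `max_pages` pages.
    """
    seen = set()
    url = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html
        parser = NextLinkParser()
        parser.feed(html)
        # Resolve relative links such as "?page=2" against the current page.
        url = urljoin(url, parser.next_href) if parser.next_href else None
```

In a real run, `fetch` would wrap an HTTP client and include rate limiting and retries. Because the function is a generator, pages are processed one at a time, which helps with large archives.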
To Apply, Please Include
- Examples of similar web scraping projects you've completed
- The tech stack you'd use for this project
- Your estimated timeline and budget