AI / Python Developer Needed to Build Environmental Due Diligence SaaS MVP (RAG, OCR, & Web Scraping)

Budget: $3000.0 FIXED / ⭐ 0.00 (0) United States

python, data-scraping, artificial-intelligence

Job Title: AI / Python Developer Needed to Build Environmental Due Diligence SaaS MVP (RAG, OCR, & Web Scraping)

Job Description: We are looking for a talented AI/Full-Stack Developer (or Small Agency) to build a Minimum Viable Product (MVP) for a business-to-business (B2B) SaaS platform. The application automates environmental due diligence and Phase I ESA reporting. The core function of the app is to take a property address, scrape state/federal regulatory databases, use AI to scan dense/scanned PDF reports for environmental anomalies (RECs), and automatically draft FOIA requests and summary reports.

Core Responsibilities & Features to Build:
1. Data Ingestion (Scraping): Build scripts to automatically search and download public environmental records from target federal/state registries based on an address input.
2. Document Processing (OCR & RAG): Implement an advanced OCR pipeline (e.g., AWS Textract or Azure Document Intelligence) to convert messy, historical scanned government PDFs into searchable text.
3. AI Analysis Engine: Build a Retrieval-Augmented Generation (RAG) pipeline using an LLM (OpenAI GPT-4o or Anthropic Claude) and a Vector Database (Pinecone, Chroma, or similar) to accurately search records for specific environmental hazards (spills, tanks, leaks) and cite sources precisely.
4. Automation Output: Build a feature that auto-generates localized FOIA request letters and structured summary reports based on the findings.
5. Frontend/UI: A simple, clean dashboard where users can input addresses, track ongoing FOIA requests, and download reports. Open to building the frontend in a robust No-Code tool like Bubble.io if it seamlessly connects to the Python backend APIs.
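
To make the expected data flow concrete, here is a minimal Python sketch of steps 3 and 4: scanning OCR'd page text for hazard terms, keeping a page-level citation for each hit, and drafting a FOIA request from the findings. It is an illustration only, not the required design. The keyword matcher is a deliberately simple stand-in for the vector database and LLM, and every name in it (HAZARD_TERMS, Finding, find_hazards, draft_foia, the letter wording) is hypothetical.

```python
import re
from dataclasses import dataclass
from string import Template

# Hypothetical hazard vocabulary. A real build would use embeddings plus an
# LLM instead of keyword matching; this only illustrates the data flow.
HAZARD_TERMS = {"spill", "tank", "leak"}


@dataclass
class Finding:
    source: str   # document the hit came from
    page: int     # 1-based page number, kept so the hit can be cited
    terms: list   # hazard terms matched on that page, sorted
    excerpt: str  # first 80 characters of the page text


def find_hazards(source, pages):
    """Scan OCR'd page texts and return one Finding per page with a hit."""
    findings = []
    for page_no, text in enumerate(pages, start=1):
        words = set(re.findall(r"[a-z]+", text.lower()))
        # Crude plural handling: also consider each word minus a trailing "s".
        stems = words | {w[:-1] for w in words if w.endswith("s")}
        matched = sorted(HAZARD_TERMS & stems)
        if matched:
            findings.append(Finding(source, page_no, matched, text.strip()[:80]))
    return findings


# Hypothetical letter wording; real FOIA letters vary by agency and state.
FOIA_TEMPLATE = Template(
    "To: $agency Records Custodian\n"
    "Re: Public records request for $address\n\n"
    "I request copies of records concerning the following items:\n"
    "$items\n"
)


def draft_foia(agency, address, findings):
    """Fill the letter template with one cited line per finding."""
    items = "\n".join(
        f"- {', '.join(f.terms)} ({f.source}, p. {f.page})" for f in findings
    )
    return FOIA_TEMPLATE.substitute(agency=agency, address=address, items=items)


if __name__ == "__main__":
    pages = [
        "Site inspection found no issues.",
        "Two underground storage tanks were removed in 1994 after a fuel leak.",
    ]
    hits = find_hazards("report.pdf", pages)
    print(draft_foia("State DEQ", "123 Main St", hits))
```

Carrying the source name and page number on every finding is what lets the summary report and the FOIA letter cite records precisely, whichever retrieval backend ends up being used.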

Required Technical Skills:
* Language: Python (Strong expertise)
* AI Frameworks: LangChain, LlamaIndex, or similar LLM orchestration tools
* Vector Databases: Pinecone, Weaviate, Milvus, or Chroma
* APIs: OpenAI API, Anthropic API
* OCR Tools: AWS Textract, Azure Document Intelligence, or Google Cloud Document AI
* Web Scraping: BeautifulSoup, Scrapy, or Selenium
* Database & Frontend: PostgreSQL, React, or advanced Bubble.io API integration

How to Apply: Please submit a brief proposal detailing:
1. Your experience building RAG (Retrieval-Augmented Generation) applications that analyze large, messy PDFs.
2. An example of a project where you successfully combined web scraping with an AI pipeline.
3. Your estimated timeline and budget approach for a project of this scope (Milestone-based or Hourly).
4. Start your proposal with the word "EcoAI" so I know you read this post.
