2025-2026

Automated Lead Discovery

Differential scraping and automated reports

> The context

The client was manually monitoring multiple websites in search of new business opportunities. The process was slow, error-prone, and unscalable: new listings were discovered late or missed entirely, directly impacting acquisition capacity.

  • Manual monitoring of dozens of web pages
  • Leads discovered late or missed entirely
  • Process not scalable or repeatable

> The solution

I designed a differential scraping system that automatically monitors the target sites. On each run, the system captures a snapshot of each page, compares it to the previous snapshot, and identifies new listings. The results are filtered and sent as a structured report via the Gmail API. Everything is containerized with Docker Compose and scheduled via cron on a VPS.

  • Scraping with Puppeteer and headless Chromium
  • Differential comparison based on atomic snapshots
  • Automated reports via the Gmail API
  • Docker Compose deployment on a VPS with cron scheduling
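
As an illustration of the comparison step, here is a minimal sketch, not the production code. It assumes each snapshot is stored as a list of listings and that every listing carries a stable identifier such as its URL. The `Listing` shape and the `findNewListings` name are hypothetical and exist only for this example.

```typescript
// Hypothetical shape of one scraped listing (the real fields may differ).
interface Listing {
  id: string; // assumed stable identifier, e.g. the listing URL
  title: string;
}

// Return the listings present in `current` but absent from `previous`.
// Duplicates inside `current` are reported only once.
function findNewListings(previous: Listing[], current: Listing[]): Listing[] {
  const seen = new Set<string>(previous.map((listing) => listing.id));
  const fresh: Listing[] = [];
  for (const listing of current) {
    if (!seen.has(listing.id)) {
      seen.add(listing.id);
      fresh.push(listing);
    }
  }
  return fresh;
}
```

Comparing by identifier rather than by raw page HTML keeps the diff insensitive to layout changes, at the cost of requiring an identifier that stays stable between runs.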

> The result

The client receives weekly reports listing the newly identified leads, without any manual intervention. Manual research time was eliminated, and the response time to new opportunities dropped from days to hours.

  • Complete elimination of manual work
  • Automated, structured weekly reports
  • Response time reduced from days to hours

> Features

  • Automated multi-site scraping
  • Change detection via snapshots
  • Periodic report generation
  • Configurable monitoring intervals
  • Error handling and automatic retry
  • Log rotation and monitoring
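
The error handling and automatic retry feature can be pictured with a sketch like the one below. It assumes a simple exponential backoff policy. The `withRetry` helper, its default attempt count, and its delays are illustrative assumptions, not the values used in the project.

```typescript
// Hypothetical helper: run an async task, retrying on failure with
// exponential backoff (baseDelayMs, then 2x, 4x, ...).
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts: number = 3, // assumed default
  baseDelayMs: number = 1000, // assumed default
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts) {
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A scraping call for a single site could then be wrapped as `withRetry(() => scrapeSite(url))`, where `scrapeSite` stands in for whatever function loads and parses the page.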

> Tech stack

  • Node.js
  • TypeScript
  • Puppeteer
  • Docker
  • Docker Compose
  • Gmail API
  • Cron