🤖 Multi-Site AI-Enhanced Tennis Equipment Scraper

Multi-Brand + AI-Powered Tennis Equipment Extraction

A scraper that crawls multiple tennis retailer and brand websites with concurrent page fetching, AI-based data extraction for all equipment types (rackets, strings, balls, shoes, bags, accessories), and link management that skips excluded URL patterns and already-crawled pages.

Supported Sites: Tennis Warehouse, Wilson, Babolat, Head, Prince, Yonex

🚀 Enhanced Features:

  • 🌐 Multi-Site Crawling: Scrapes 6 tennis sites simultaneously (Tennis Warehouse plus 5 brand sites)
  • 🔄 Concurrent Processing: Fetches up to 5 pages at a time
  • 🤖 Advanced AI Extraction: Uses gpt-4o-mini with comprehensive equipment analysis
  • 🎾 All Equipment Types: Extracts rackets, strings, balls, shoes, bags, and accessories
  • 🎯 Product Page Detection: AI validates actual product pages vs reviews/articles
  • 📊 Category-Specific Specs: Different specifications per equipment type
  • 📸 Image Processing: Extracts product images from all supported sites
  • 🔗 Smart Link Discovery: Follows every discovered link except those matching excluded patterns (impressum pages, etc.)
  • 🛡️ Duplicate Prevention: Avoids re-crawling existing content across sites and categories
  • ⚡ Performance Optimized: Intelligent URL filtering and controlled concurrency
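
The link filtering, duplicate prevention, and controlled concurrency listed above can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the names select_urls, fetch_all, and normalize_url are hypothetical, the excluded patterns other than "impressum" are assumptions, and the fetch function is passed in by the caller so the sketch makes no claim about which HTTP client the project uses.

```python
import asyncio
from urllib.parse import urldefrag, urlparse

# Only "impressum" comes from the feature list; "privacy" is an assumed example.
EXCLUDED_PATTERNS = ("impressum", "privacy")
MAX_CONCURRENT_FETCHES = 5  # matches the "5 pages" figure in the feature list


def normalize_url(url: str) -> str:
    """Drop the fragment and trailing slash so equivalent URLs compare equal."""
    without_fragment, _fragment = urldefrag(url)
    return without_fragment.rstrip("/")


def select_urls(candidates, seen):
    """Return new, non-excluded URLs in order and record them in `seen`."""
    selected = []
    for url in candidates:
        normalized = normalize_url(url)
        path = urlparse(normalized).path.lower()
        if any(pattern in path for pattern in EXCLUDED_PATTERNS):
            continue  # matches an excluded pattern, e.g. an impressum page
        if normalized in seen:
            continue  # already crawled or queued, possibly from another site section
        seen.add(normalized)
        selected.append(normalized)
    return selected


async def fetch_all(urls, fetch, limit=MAX_CONCURRENT_FETCHES):
    """Await `fetch(url)` for every URL with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:
            return url, await fetch(url)

    pairs = await asyncio.gather(*(bounded(url) for url in urls))
    return dict(pairs)
```

A caller would keep one shared `seen` set for the whole crawl, pass each page's discovered links through select_urls, and hand the result to fetch_all together with its own async fetch coroutine. The semaphore caps concurrency without batching, so a slow page does not hold up the other four slots.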

Note: Set your OpenAI API key in the OPENAI_API_KEY environment variable, or update the configuration.
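
One way to check the key at startup is sketched below. The helper name load_api_key is hypothetical and the scraper may read its configuration differently; the only detail taken from this document is the OPENAI_API_KEY variable name.

```python
import os


def load_api_key(env=os.environ, variable="OPENAI_API_KEY"):
    """Return the API key from `env`, failing early with a clear message if absent."""
    key = env.get(variable, "").strip()
    if not key:
        raise RuntimeError(
            f"{variable} is not set; export it or update the configuration."
        )
    return key
```

Failing before the crawl starts avoids fetching pages that the AI extraction step could not process anyway.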