Hey all,

This is Jan, the founder of Apify (https://apify.com/), a full-stack web scraping platform. After the success of Crawlee for JavaScript (https://github.com/apify/crawlee/) and the demand from the Python community, we're launching Crawlee for Python today!

The main features are:

- A unified programming interface for both HTTP crawling (HTTPX with BeautifulSoup) and headless browser crawling (Playwright)
- Automatic parallel crawling based on available system resources
- Written in Python with type hints for enhanced developer experience
- Automatic retries on errors or when you’re getting blocked
- Integrated proxy rotation and session management
- Configurable request routing: direct URLs to the appropriate handlers
- Persistent queue for URLs to crawl
- Pluggable storage for both tabular data and files

For details, you can read the announcement blog post: https://crawlee.dev/blog/launching-crawlee-python

Our team and I will be happy to answer any questions you might have here.
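
P.S. For anyone wondering what "request routing" in the feature list means in practice, here is a rough, library-agnostic sketch in plain Python. It is not Crawlee's actual API: the `Router` class, its `handler`, `default_handler`, and `route` methods, and the `DETAIL` label are all hypothetical names made up for illustration. The idea is simply that each queued URL can carry a label, and the label decides which handler function processes it. Please check the docs for the real interface.

```python
from typing import Callable, Dict, Optional

# A handler takes a URL and returns some result (here just a string).
Handler = Callable[[str], str]


class Router:
    """Hypothetical label-based router; an illustration, not Crawlee's API."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self._default: Optional[Handler] = None

    def handler(self, label: str) -> Callable[[Handler], Handler]:
        """Decorator that registers a handler for requests with this label."""
        def register(func: Handler) -> Handler:
            self._handlers[label] = func
            return func
        return register

    def default_handler(self, func: Handler) -> Handler:
        """Decorator that registers the fallback handler."""
        self._default = func
        return func

    def route(self, url: str, label: Optional[str] = None) -> str:
        """Dispatch a URL to the handler for its label, else to the default."""
        func = self._handlers.get(label) if label is not None else None
        if func is None:
            func = self._default
        if func is None:
            raise LookupError(f"no handler registered for label {label!r}")
        return func(url)


router = Router()


@router.default_handler
def handle_listing(url: str) -> str:
    # Unlabeled requests (and unknown labels) end up here.
    return f"listing:{url}"


@router.handler("DETAIL")
def handle_detail(url: str) -> str:
    # Requests labeled "DETAIL" end up here.
    return f"detail:{url}"
```

In this sketch, calling `router.route(url, label="DETAIL")` runs `handle_detail`, while an unlabeled URL or one with an unregistered label falls back to `handle_listing`.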