
Crawlee: a web scraping and browser automation library for Python for building reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

License · Apache-2.0