How to Use Hrequests for Web Scraping

Lucas Mitchell

Automation Engineer

04-Sep-2024

Web scraping is a powerful way to extract information from websites, but it becomes difficult when sites use captchas, rate limiting, or IP blocking to keep out unwanted scrapers. This guide introduces hrequests, a high-performance web scraping library, walks through its basic usage, and includes a demo that combines hrequests with Capsolver to get past ReCaptcha challenges.

What is hrequests?

hrequests is a modern Python HTTP library built for speed and flexibility and designed for heavy web scraping workloads. It is essentially a feature-rich replacement for requests that gives you more control over how your requests look and behave, which matters on sites that fingerprint clients or that you reach through proxies.

The library provides several features:

  • Concurrent requests built on gevent rather than asyncio.
  • Session handling to reuse connections and cookies efficiently.
  • Proxy support for routing requests through proxies.
  • Browser TLS fingerprint replication and realistic header generation, so requests look less like a bot.
  • Optional headless browser rendering for JavaScript-heavy pages.

Rate limiting and captcha solving are not built into hrequests; you handle them in your own code or through an external service such as Capsolver.

Prerequisites

Before you dive into using hrequests, ensure you have the following installed:

pip install hrequests capsolver

Make sure you also have a Capsolver API key for solving captchas if the site you are scraping requires it. For detailed setup instructions, visit the hrequests GitHub page.

Getting Started with hrequests

Here's a basic example of how to use hrequests to scrape a webpage:

import hrequests

# URL of the webpage we want to scrape
url = 'https://example.com'

# Make a simple GET request
response = hrequests.get(url)

# Print the status code
print(f"Status Code: {response.status_code}")

# Print the content of the page
print(f"Page Content: {response.text}")

This basic script makes a GET request to the given URL and prints the status code and page content. However, many websites are more complex and require additional handling like proxy rotation, user-agent spoofing, or captcha solving.
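As a rough illustration of the proxy rotation and user-agent spoofing mentioned above, the sketch below builds per-request options from two small pools. The pools, the `request_options` helper, and the proxy hostnames are hypothetical; the `headers` and `proxies` keyword names mirror the captcha example later in this guide and should be checked against your installed hrequests version.

```python
import itertools
import random

# Hypothetical pools; replace with your own proxies and user-agent strings.
PROXIES = [
    "http://user:pass@proxy1.example:8080",
    "http://user:pass@proxy2.example:8080",
]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
]


def request_options(proxies, user_agents, rng=random):
    """Yield keyword arguments for successive requests.

    Proxies are used round-robin; the user agent is picked at random.
    """
    for proxy in itertools.cycle(proxies):
        yield {
            "headers": {"User-Agent": rng.choice(user_agents)},
            "proxies": {"http": proxy, "https": proxy},
        }


options = request_options(PROXIES, USER_AGENTS)
first = next(options)
second = next(options)
print(first["proxies"]["http"])   # first proxy in the pool
print(second["proxies"]["http"])  # second proxy in the pool
```

Each dictionary can then be unpacked into a request call, for example `hrequests.get(url, **next(options))`, assuming the call accepts those keyword arguments.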

Handling Captchas with Capsolver and hrequests

In this section, we'll explore how to integrate Capsolver with hrequests to bypass captchas. Capsolver is an external service that helps in solving various types of captchas, including ReCaptcha V2, which is commonly used on websites.

We will demonstrate solving ReCaptcha V2 using Capsolver and then scraping the content of a page that requires solving the captcha first.

Example: Solving ReCaptcha V2 with Capsolver

import capsolver
import hrequests
import os

# Consider using environment variables for sensitive information
PROXY = os.getenv("PROXY", "http://username:password@host:port")
capsolver.api_key = os.getenv("CAPSOLVER_API_KEY", "Your Capsolver API Key")
PAGE_URL = os.getenv("PAGE_URL", "PAGE_URL")
PAGE_KEY = os.getenv("PAGE_SITE_KEY", "PAGE_SITE_KEY")

def solve_recaptcha_v2(url, key):
    solution = capsolver.solve({
        "type": "ReCaptchaV2Task",
        "websiteURL": url,
        "websiteKey": key,
        "proxy": PROXY
    })
    return solution['solution']['gRecaptchaResponse']

def main():
    print("Solving reCaptcha v2")
    solution = solve_recaptcha_v2(PAGE_URL, PAGE_KEY)
    print("Solution: ", solution)

    # Now that we have solved the captcha, we can proceed with scraping
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    }

    # Sending a GET request with captcha solution
    response = hrequests.get(
        PAGE_URL, 
        headers=headers, 
        data={"g-recaptcha-response": solution},
        proxies={"http": PROXY, "https": PROXY}
    )

    # Checking the status and printing the page content
    if response.status_code == 200:
        print("Successfully fetched the page!")
        print(response.text)
    else:
        print(f"Failed to fetch the page. Status Code: {response.status_code}")

if __name__ == "__main__":
    main()

Explanation:

  1. Environment Variables: We recommend using environment variables for sensitive data like your API key, proxy credentials, and the target page details.

  2. Captcha Solving: The solve_recaptcha_v2 function asks Capsolver to solve a ReCaptcha V2 challenge for the given page URL and site key. It also passes the proxy to Capsolver, so the captcha is solved through the same proxy that later submits the token. This can matter when a site checks that the token was obtained from the same IP address that uses it.

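  3. Page Request: After solving the captcha, the script makes a GET request to the page with the solved token included as g-recaptcha-response in the request data. Whether this is enough depends on the site: many sites expect the token in the POST request that submits the protected form, so you may need to change the method and target URL to match what the site's form actually sends.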
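One way to act on the first point is to fail fast when a setting is missing, instead of falling back to placeholder strings. The sketch below is an illustration and not part of the original script: `load_settings` and `REQUIRED_VARS` are hypothetical names, and the variable names are the ones the script above reads.

```python
import os

# Environment variables the script above reads.
REQUIRED_VARS = ("PROXY", "CAPSOLVER_API_KEY", "PAGE_URL", "PAGE_SITE_KEY")


def load_settings(env=os.environ):
    """Return the required settings as a dict, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Demo with an explicit mapping instead of the real environment.
example_env = {
    "PROXY": "http://username:password@host:8080",
    "CAPSOLVER_API_KEY": "example-key",
    "PAGE_URL": "https://example.com",
    "PAGE_SITE_KEY": "example-site-key",
}
settings = load_settings(example_env)
print(settings["PAGE_URL"])
```

Calling `load_settings()` with no argument reads the real process environment, so a missing API key stops the run before any captcha request is made.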

Web Scraping Best Practices

When using web scraping libraries like hrequests, it’s essential to follow ethical guidelines and avoid getting banned. Here are some tips:

  • Respect robots.txt: Always check the robots.txt file of the website you're scraping to ensure you aren't violating any rules.
  • Rate Limiting: Make sure to limit your requests to avoid overwhelming the server and getting blocked.
  • Use Proxies: For high-volume scraping, rotate proxies to distribute the load across multiple IP addresses.
  • User-Agent Spoofing: Randomize or rotate user-agent strings to avoid being identified as a bot.
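The first two tips can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: `build_robots_checker` and `Throttle` are hypothetical helpers, the robots.txt content is a made-up example, and in a real scraper you would download the site's own robots.txt before parsing it.

```python
import time
from urllib import robotparser


def build_robots_checker(robots_txt, user_agent="*"):
    """Parse robots.txt text and return a function that says whether a URL may be fetched."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return lambda url: parser.can_fetch(user_agent, url)


class Throttle:
    """Enforce a minimum interval (in seconds) between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Sleep just long enough that calls are at least min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Made-up robots.txt content for the demo.
EXAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

allowed = build_robots_checker(EXAMPLE_ROBOTS)
throttle = Throttle(min_interval=0.1)

for url in ["https://example.com/index.html", "https://example.com/private/data"]:
    if not allowed(url):
        print(f"Skipping (disallowed by robots.txt): {url}")
        continue
    throttle.wait()
    print(f"Would fetch: {url}")  # place the actual request call here
```

The clock and sleep functions are injectable so the throttle can be exercised without real delays; a fixed interval is the simplest policy, and adding random jitter is a common refinement.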

Conclusion

With hrequests, you can efficiently scrape websites while handling the complexities of proxies and captchas. Combining it with Capsolver allows you to bypass ReCaptcha V2 challenges seamlessly, enabling access to content that would otherwise be difficult to scrape.

Feel free to extend this script to suit your scraping needs and experiment with additional features offered by hrequests. Always ensure that your scraping activities respect website terms of service and legal guidelines.

Happy scraping!

Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.
