How to Solve reCAPTCHA v2 in Crawl4AI with CapSolver Integration

Ethan Collins

Pattern Recognition Specialist

20-Oct-2025

reCAPTCHA v2, with its familiar "I'm not a robot" checkbox, serves as a crucial defense mechanism against bot traffic and automated abuse on websites. While essential for security, it often blocks legitimate web scraping and data extraction operations, so developers and businesses that rely on web automation need an efficient, automated way to solve it.

This article covers the integration of Crawl4AI, an advanced web crawler, with CapSolver, a CAPTCHA solving service, focusing specifically on reCAPTCHA v2. We explore both API-based and browser extension-based integration methods, with code examples and explanations for each.

Understanding reCAPTCHA v2 and its Challenges

reCAPTCHA v2 requires users to click a checkbox, and sometimes complete image challenges, to prove they are human. For automated systems like web crawlers, this interactive element halts the scraping process, demanding manual intervention or sophisticated bypass techniques. Without an effective solution, data collection becomes inefficient, unstable, and costly.

CapSolver offers a high-accuracy, fast-response solution for reCAPTCHA v2 by leveraging advanced AI algorithms. When integrated with Crawl4AI, it transforms a significant hurdle into a streamlined, automated step, ensuring your web automation tasks remain fluid and productive.

💡 Exclusive Bonus for Crawl4AI Integration Users:
To celebrate this integration, we're offering an exclusive 6% bonus code, CRAWL4, for all CapSolver users who register through this tutorial.
Simply enter the code when recharging in the Dashboard to receive an extra 6% credit instantly.

Integration Method 1: CapSolver API Integration with Crawl4AI

The API integration method provides fine-grained control and is generally recommended for its flexibility and precision. It involves using Crawl4AI's js_code functionality to inject the CAPTCHA token obtained from CapSolver directly into the target webpage.

How it Works:

Navigate to the CAPTCHA page: Crawl4AI accesses the target web page as usual.
Obtain Token: In your Python script, call CapSolver's API using their SDK, passing the necessary parameters (websiteKey and websiteURL) to receive the gRecaptchaResponse token.
Inject Token: Utilize Crawl4AI's js_code parameter within CrawlerRunConfig to inject the obtained token into the g-recaptcha-response textarea element on the page.
Continue Operations: After successful token injection, Crawl4AI can proceed with subsequent actions, such as form submissions or clicks, effectively bypassing the reCAPTCHA.
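
The "Obtain Token" step above calls a blocking SDK function from inside an async crawler. The following is a minimal sketch of one way to keep the event loop responsive while the solver runs. It assumes the solver is any synchronous callable that takes the task dictionary and returns a dictionary with a gRecaptchaResponse key, as capsolver.solve does in the example further down. The names build_task and solve_in_thread are hypothetical helpers, and fake_solve stands in for the real SDK so the snippet runs without an API key.

```python
import asyncio
from typing import Callable


def build_task(site_url: str, site_key: str,
               captcha_type: str = "ReCaptchaV2TaskProxyLess") -> dict:
    """Assemble the task dictionary passed to the solver."""
    return {
        "type": captcha_type,
        "websiteURL": site_url,
        "websiteKey": site_key,
    }


async def solve_in_thread(solve_fn: Callable[[dict], dict], task: dict) -> str:
    """Run a blocking solver in a worker thread and return the token."""
    solution = await asyncio.to_thread(solve_fn, task)
    return solution["gRecaptchaResponse"]


def fake_solve(task: dict) -> dict:
    """Stand-in for capsolver.solve so the sketch runs offline (hypothetical)."""
    return {"gRecaptchaResponse": "token-for-" + task["websiteKey"]}


task = build_task("https://example.com/form", "SITE_KEY")
token = asyncio.run(solve_in_thread(fake_solve, task))
print(token)
```

In a real script, capsolver.solve would be passed as solve_fn in place of fake_solve.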

Example Code: API Integration for reCAPTCHA v2

The following Python code demonstrates how to integrate CapSolver's API with Crawl4AI to solve reCAPTCHA v2. This example targets the reCAPTCHA v2 checkbox demo page.

import asyncio
import capsolver
from crawl4ai import *


# TODO: set your config
# Docs: https://docs.capsolver.com/guide/captcha/ReCaptchaV2/
api_key = "CAP-xxxxxxxxxxxxxxxxxxxxx"                                      # your api key of capsolver
site_key = "6LfW6wATAAAAAHLqO2pb8bDBahxlMxNdo9g947u9"                      # site key of your target site
site_url = "https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox.php"  # page url of your target site
captcha_type = "ReCaptchaV2TaskProxyLess"                                  # type of your target captcha
capsolver.api_key = api_key


async def main():
    browser_config = BrowserConfig(
        verbose=True,
        headless=False,
        use_persistent_context=True,
    )

    async with AsyncWebCrawler(config=browser_config) as crawler:
        await crawler.arun(
            url=site_url,
            cache_mode=CacheMode.BYPASS,
            session_id="session_captcha_test"
        )

        # get recaptcha token using capsolver sdk
        solution = capsolver.solve({
            "type": captcha_type,
            "websiteURL": site_url,
            "websiteKey": site_key,
        })
        token = solution["gRecaptchaResponse"]
        print("recaptcha token:", token)

        # Fill the hidden response field with the token, then submit the form
        js_code = """
            const textarea = document.getElementById('g-recaptcha-response');
            if (textarea) {
                textarea.value = '""" + token + """';
                document.querySelector('button.form-field[type="submit"]').click();
            }
        """

        # Wait until the page contains more than one <h2> element
        wait_condition = """() => {
            const items = document.querySelectorAll('h2');
            return items.length > 1;
        }"""

        run_config = CrawlerRunConfig(
            cache_mode=CacheMode.BYPASS,
            session_id="session_captcha_test",
            js_code=js_code,
            js_only=True,
            wait_for=f"js:{wait_condition}"
        )

        result_next = await crawler.arun(
            url=site_url,
            config=run_config,
        )
        print(result_next.markdown)


if __name__ == "__main__":
    asyncio.run(main())

Code Analysis:

CapSolver SDK Call: The capsolver.solve method is invoked with ReCaptchaV2TaskProxyLess type, websiteURL, and websiteKey to retrieve the gRecaptchaResponse token. This token is the solution provided by CapSolver.
JavaScript Injection (js_code): The js_code string contains JavaScript that locates the g-recaptcha-response textarea element on the page and assigns the obtained token to its value property. Subsequently, it simulates a click on the submit button, ensuring the form is submitted with the valid CAPTCHA token.
wait_for Condition: The wait_condition function makes Crawl4AI wait until the page contains more than one h2 element, which indicates that the submission was successful and the page has loaded new content.
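
Concatenating the token straight into the script works for well-formed tokens, but serializing it first is a little safer. The sketch below is one possible way to build the js_code string. Here build_injection_js is a hypothetical helper, not part of Crawl4AI or CapSolver, and its default submit selector is the one used for the demo page above.

```python
import json


def build_injection_js(token: str,
                       submit_selector: str = 'button.form-field[type="submit"]') -> str:
    """Return JavaScript that fills g-recaptcha-response and clicks submit."""
    # json.dumps yields a valid JavaScript string literal, quotes included.
    token_literal = json.dumps(token)
    selector_literal = json.dumps(submit_selector)
    return f"""
        const textarea = document.getElementById('g-recaptcha-response');
        if (textarea) {{
            textarea.value = {token_literal};
            const button = document.querySelector({selector_literal});
            if (button) {{ button.click(); }}
        }}
    """


# A token containing a quote is escaped instead of breaking the script.
js_code = build_injection_js('abc"123')
print(js_code)
```

The returned string can be passed as js_code to CrawlerRunConfig in the same way as the hand-written snippet.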

Integration Method 2: CapSolver Browser Extension Integration

For scenarios where direct API injection might be complex or less desirable, CapSolver's browser extension offers an alternative. This method leverages the extension's ability to automatically detect and solve CAPTCHAs within the browser context managed by Crawl4AI.

How it Works:

Launch Browser with user_data_dir: Configure Crawl4AI to launch a browser instance with a specified user_data_dir to maintain persistent context.
Install and Configure Extension: Manually install the CapSolver extension into this browser profile and configure your CapSolver API key. You can also pre-configure apiKey and manualSolving parameters in the extension's config.js file.
Navigate to CAPTCHA Page: Crawl4AI navigates to the page containing the reCAPTCHA v2.
Automatic or Manual Solving: Depending on the extension's manualSolving configuration, the CAPTCHA will either be solved automatically upon detection, or you can trigger it manually via injected JavaScript.
Verification: Once solved, the token is automatically handled by the extension, and subsequent actions like form submissions will carry the valid verification.
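
Pre-configuring the extension can also be scripted. The following is only a sketch: it assumes config.js stores the settings as simple apiKey: '...' and manualSolving: false entries, which may not match the file shipped with your version of the extension, so inspect the file before relying on it. patch_extension_config is a hypothetical helper that operates on the file's text, and the sample string is illustrative rather than a copy of the real file.

```python
import re


def patch_extension_config(text: str, api_key: str, manual_solving: bool) -> str:
    """Set apiKey and manualSolving in the text of a config.js-style file."""
    # Replace the quoted value following `apiKey:` (single or double quotes).
    text = re.sub(
        r"(apiKey\s*:\s*)(['\"]).*?\2",
        lambda m: f"{m.group(1)}{m.group(2)}{api_key}{m.group(2)}",
        text,
        count=1,
    )
    # Replace the boolean following `manualSolving:`.
    text = re.sub(
        r"(manualSolving\s*:\s*)(true|false)",
        lambda m: m.group(1) + ("true" if manual_solving else "false"),
        text,
        count=1,
    )
    return text


# Illustrative sample only; the real file layout is an assumption.
sample = "export const defaultConfig = {\n  apiKey: '',\n  manualSolving: false,\n};\n"
patched = patch_extension_config(sample, "CAP-xxxx", True)
print(patched)
```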

Example Code: Extension Integration for reCAPTCHA v2 (Automatic Solving)

This example shows how Crawl4AI can be configured to use a browser profile with the CapSolver extension for automatic reCAPTCHA v2 solving.

import asyncio
import time

from crawl4ai import *


# TODO: set your config
user_data_dir = "/browser-profile/Default1" # Ensure this path is correctly set and contains your configured extension

browser_config = BrowserConfig(
    verbose=True,
    headless=False,
    user_data_dir=user_data_dir,
    use_persistent_context=True,
    proxy="http://127.0.0.1:13120", # Optional: remove this line if you are not using a proxy
)

async def main():
    async with AsyncWebCrawler(config=browser_config) as crawler:
        result_initial = await crawler.arun(
            url="https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox.php",
            cache_mode=CacheMode.BYPASS,
            session_id="session_captcha_test"
        )

        # The extension solves the CAPTCHA automatically after the page loads.
        # Give it time to finish before proceeding; asyncio.sleep is used instead of
        # time.sleep so the event loop is not blocked while waiting.
        await asyncio.sleep(30) # Example wait, adjust as necessary

        # Continue with other Crawl4AI operations after CAPTCHA is solved
        # For instance, check for elements that appear after successful submission
        # print(result_initial.markdown) # You can inspect the page content after the wait


if __name__ == "__main__":
    asyncio.run(main())

Code Analysis:

user_data_dir: This parameter is crucial for Crawl4AI to launch a browser instance that retains the installed CapSolver extension and its configurations. Ensure the path points to a valid browser profile directory where the extension is installed.
Automatic Solving: With manualSolving set to false (the default) in the extension's configuration, the extension automatically detects and solves the reCAPTCHA v2 after the page loads. A fixed sleep is included as a placeholder to give the extension enough time to solve the CAPTCHA before any subsequent actions are attempted.
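
A fixed delay is either longer than needed or too short. One alternative, sketched below, is to reuse the wait_for mechanism from the API example and wait until the g-recaptcha-response field is non-empty. This assumes the extension writes the solved token into that textarea, which this article does not confirm. token_ready_condition is a hypothetical helper.

```python
import json


def token_ready_condition(element_id: str = "g-recaptcha-response") -> str:
    """Build a wait_for string that is true once the token field has a value."""
    id_literal = json.dumps(element_id)
    return (
        "js:() => {"
        f" const el = document.getElementById({id_literal});"
        " return !!el && el.value.length > 0;"
        " }"
    )


wait_for = token_ready_condition()
print(wait_for)
```

Under that assumption, passing this string as wait_for in a follow-up CrawlerRunConfig that reuses the same session_id should let the crawl continue as soon as the token is present, rather than always waiting the full 30 seconds.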

Example Code: Extension Integration for reCAPTCHA v2 (Manual Solving)

If you prefer to trigger the CAPTCHA solving manually at a specific point in your scraping logic, you can configure the extension's manualSolving parameter to true and then use js_code to click the solver button provided by the extension.

import asyncio
import time

from crawl4ai import *


# TODO: set your config
user_data_dir = "/browser-profile/Default1" # Ensure this path is correctly set and contains your configured extension

browser_config = BrowserConfig(
    verbose=True,
    headless=False,
    user_data_dir=user_data_dir,
    use_persistent_context=True,
    proxy="http://127.0.0.1:13120", # Optional: remove this line if you are not using a proxy
)

async def main():
    async with AsyncWebCrawler(config=browser_config) as crawler:
        result_initial = await crawler.arun(
            url="https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox.php",
            cache_mode=CacheMode.BYPASS,
            session_id="session_captcha_test"
        )

        # Wait a moment for the page to load and the extension to be ready
        # (asyncio.sleep does not block the event loop, unlike time.sleep)
        await asyncio.sleep(6)

        # Use js_code to trigger the manual solve button provided by the CapSolver extension
        js_code = """
            let solverButton = document.querySelector(\'#capsolver-solver-tip-button\');
            if (solverButton) {
                const clickEvent = new MouseEvent(\'click\', {
                    bubbles: true,
                    cancelable: true,
                    view: window
                });
                solverButton.dispatchEvent(clickEvent);
            }
        """
        print(js_code)
        run_config = CrawlerRunConfig(
            cache_mode=CacheMode.BYPASS,
            session_id="session_captcha_test",
            js_code=js_code,
            js_only=True,
        )
        result_next = await crawler.arun(
            url="https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox.php",
            config=run_config
        )
        print("JS Execution results:", result_next.js_execution_result)

        # Allow time for the CAPTCHA to be solved after the manual trigger
        await asyncio.sleep(30) # Example wait, adjust as necessary

        # Continue with other Crawl4AI operations


if __name__ == "__main__":
    asyncio.run(main())

Code Analysis:

manualSolving: Before running this code, ensure the CapSolver extension's config.js has manualSolving set to true.
Triggering Solve: The js_code simulates a click event on the #capsolver-solver-tip-button, which is the button provided by the CapSolver extension for manual solving. This gives you precise control over when the CAPTCHA resolution process is initiated.
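
Because the solver button is added by the extension, it may not exist yet when the script first looks for it, and the solve itself takes a variable amount of time. The fixed waits around the manual trigger can be replaced by polling. The sketch below shows a generic polling helper using only the standard library. poll_until is a hypothetical name, and the ready coroutine merely simulates a check that would, in practice, inspect the page through the crawler.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def poll_until(check: Callable[[], Awaitable[bool]],
                     timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Await check() repeatedly until it returns True or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        if await check():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval)


async def demo() -> tuple:
    attempts = 0

    async def ready() -> bool:
        # Simulated check: pretend the button appears on the third look.
        nonlocal attempts
        attempts += 1
        return attempts >= 3

    found = await poll_until(ready, timeout=5.0, interval=0.01)
    return found, attempts


result = asyncio.run(demo())
print(result)
```

The helper returns False on timeout instead of raising, so the caller can decide whether to retry the trigger or give up.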

Conclusion

The integration of Crawl4AI with CapSolver provides powerful and flexible solutions for bypassing reCAPTCHA v2, significantly enhancing the efficiency and reliability of web scraping operations. Whether you opt for the precise control of API integration or the simplified setup of browser extension integration, both methods ensure that reCAPTCHA v2 no longer stands as a barrier to your data collection goals.

By automating CAPTCHA resolution, developers can focus on extracting valuable data, confident that their crawlers will navigate protected websites seamlessly. This synergy between Crawl4AI's advanced crawling capabilities and CapSolver's robust CAPTCHA solving technology marks a significant step forward in automated web data extraction.

Compliance Disclaimer: The information provided on this blog is for informational purposes only. CapSolver is committed to compliance with all applicable laws and regulations. The use of the CapSolver network for illegal, fraudulent, or abusive activities is strictly prohibited and will be investigated. Our captcha-solving solutions enhance user experience while ensuring 100% compliance in helping solve captcha difficulties during public data crawling. We encourage responsible use of our services. For more information, please visit our Terms of Service and Privacy Policy.
