Compare commits

3 Commits

| Author | SHA1 | Message | Date |
| ------ | ---- | ------- | ---- |
| ilike2burnthing | 0fe9958afe | Update README.md | 2025-12-03 08:51:24 +00:00 |
| Prodan Denis | 9f8c71131f | Resolve turnstile captcha (#1634); Fixes #804 | 2025-12-03 08:50:31 +00:00 |
| ilike2burnthing | 2405c00521 | Add formatting to log file; Resolves #1635 | 2025-12-03 07:07:54 +00:00 |
4 changed files with 189 additions and 116 deletions

View File

@@ -33,13 +33,14 @@ It is recommended to install using a Docker container because the project depend
already included within the image. already included within the image.
Docker images are available in: Docker images are available in:
* GitHub Registry => https://github.com/orgs/FlareSolverr/packages/container/package/flaresolverr
* DockerHub => https://hub.docker.com/r/flaresolverr/flaresolverr - GitHub Registry => https://github.com/orgs/FlareSolverr/packages/container/package/flaresolverr
- DockerHub => https://hub.docker.com/r/flaresolverr/flaresolverr
Supported architectures are: Supported architectures are:
| Architecture | Tag | | Architecture | Tag |
|--------------|--------------| | ------------ | ------------ |
| x86 | linux/386 | | x86 | linux/386 |
| x86-64 | linux/amd64 | | x86-64 | linux/amd64 |
| ARM32 | linux/arm/v7 | | ARM32 | linux/arm/v7 |
@@ -50,6 +51,7 @@ We provide a `docker-compose.yml` configuration file. Clone this repository and
the container. the container.
If you prefer the `docker cli` execute the following command. If you prefer the `docker cli` execute the following command.
```bash ```bash
docker run -d \ docker run -d \
--name=flaresolverr \ --name=flaresolverr \
@@ -69,28 +71,29 @@ Remember to restart the Docker daemon and the container after the update.
> Precompiled binaries are only available for x64 architecture. For other architectures see Docker images. > Precompiled binaries are only available for x64 architecture. For other architectures see Docker images.
This is the recommended way for Windows users. This is the recommended way for Windows users.
* Download the [FlareSolverr executable](https://github.com/FlareSolverr/FlareSolverr/releases) from the release's page. It is available for Windows x64 and Linux x64.
* Execute FlareSolverr binary. In the environment variables section you can find how to change the configuration. - Download the [FlareSolverr executable](https://github.com/FlareSolverr/FlareSolverr/releases) from the release's page. It is available for Windows x64 and Linux x64.
- Execute FlareSolverr binary. In the environment variables section you can find how to change the configuration.
### From source code ### From source code
> **Warning** > **Warning**
> Installing from source code only works for x64 architecture. For other architectures see Docker images. > Installing from source code only works for x64 architecture. For other architectures see Docker images.
* Install [Python 3.13](https://www.python.org/downloads/). - Install [Python 3.13](https://www.python.org/downloads/).
* Install [Chrome](https://www.google.com/intl/en_us/chrome/) (all OS) or [Chromium](https://www.chromium.org/getting-involved/download-chromium/) (just Linux, it doesn't work in Windows) web browser. - Install [Chrome](https://www.google.com/intl/en_us/chrome/) (all OS) or [Chromium](https://www.chromium.org/getting-involved/download-chromium/) (just Linux, it doesn't work in Windows) web browser.
* (Only in Linux) Install [Xvfb](https://en.wikipedia.org/wiki/Xvfb) package. - (Only in Linux) Install [Xvfb](https://en.wikipedia.org/wiki/Xvfb) package.
* (Only in macOS) Install [XQuartz](https://www.xquartz.org/) package. - (Only in macOS) Install [XQuartz](https://www.xquartz.org/) package.
* Clone this repository and open a shell in that path. - Clone this repository and open a shell in that path.
* Run `pip install -r requirements.txt` command to install FlareSolverr dependencies. - Run `pip install -r requirements.txt` command to install FlareSolverr dependencies.
* Run `python src/flaresolverr.py` command to start FlareSolverr. - Run `python src/flaresolverr.py` command to start FlareSolverr.
### From source code (FreeBSD/TrueNAS CORE) ### From source code (FreeBSD/TrueNAS CORE)
* Run `pkg install chromium python313 py313-pip xorg-vfbserver` command to install the required dependencies. - Run `pkg install chromium python313 py313-pip xorg-vfbserver` command to install the required dependencies.
* Clone this repository and open a shell in that path. - Clone this repository and open a shell in that path.
* Run `python3.13 -m pip install -r requirements.txt` command to install FlareSolverr dependencies. - Run `python3.13 -m pip install -r requirements.txt` command to install FlareSolverr dependencies.
* Run `python3.13 src/flaresolverr.py` command to start FlareSolverr. - Run `python3.13 src/flaresolverr.py` command to start FlareSolverr.
### Systemd service ### Systemd service
@@ -99,6 +102,7 @@ We provide an example Systemd unit file `flaresolverr.service` as reference. You
## Usage ## Usage
Example Bash request: Example Bash request:
```bash ```bash
curl -L -X POST 'http://localhost:8191/v1' \ curl -L -X POST 'http://localhost:8191/v1' \
-H 'Content-Type: application/json' \ -H 'Content-Type: application/json' \
@@ -110,6 +114,7 @@ curl -L -X POST 'http://localhost:8191/v1' \
``` ```
Example Python request: Example Python request:
```py ```py
import requests import requests
@@ -125,6 +130,7 @@ print(response.text)
``` ```
Example PowerShell request: Example PowerShell request:
```ps1 ```ps1
$body = @{ $body = @{
cmd = "request.get" cmd = "request.get"
@@ -146,7 +152,7 @@ cookies for the browser to use.
This also speeds up the requests since it won't have to launch a new browser instance for every request. This also speeds up the requests since it won't have to launch a new browser instance for every request.
| Parameter | Notes | | Parameter | Notes |
|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| session | Optional. The session ID that you want to be assigned to the instance. If isn't set a random UUID will be assigned. | | session | Optional. The session ID that you want to be assigned to the instance. If isn't set a random UUID will be assigned. |
| proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is supported. Eg: `"proxy": {"url": "http://127.0.0.1:8888", "username": "testuser", "password": "testpass"}` | | proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is supported. Eg: `"proxy": {"url": "http://127.0.0.1:8888", "username": "testuser", "password": "testpass"}` |
@@ -160,11 +166,7 @@ Example response:
```json ```json
{ {
"sessions": [ "sessions": ["session_id_1", "session_id_2", "session_id_3..."]
"session_id_1",
"session_id_2",
"session_id_3..."
]
} }
``` ```
@@ -174,13 +176,13 @@ This will properly shutdown a browser instance and remove all files associated w
session. When you no longer need to use a session you should make sure to close it. session. When you no longer need to use a session you should make sure to close it.
| Parameter | Notes | | Parameter | Notes |
|-----------|-----------------------------------------------| | --------- | --------------------------------------------- |
| session | The session ID that you want to be destroyed. | | session | The session ID that you want to be destroyed. |
#### + `request.get` #### + `request.get`
| Parameter | Notes | | Parameter | Notes |
|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| url | Mandatory | | url | Mandatory |
| session | Optional. Will send the request from and existing browser instance. If one is not sent it will create a temporary instance that will be destroyed immediately after the request is completed. | | session | Optional. Will send the request from and existing browser instance. If one is not sent it will create a temporary instance that will be destroyed immediately after the request is completed. |
| session_ttl_minutes | Optional. FlareSolverr will automatically rotate expired sessions based on the TTL provided in minutes. | | session_ttl_minutes | Optional. FlareSolverr will automatically rotate expired sessions based on the TTL provided in minutes. |
@@ -191,6 +193,7 @@ session. When you no longer need to use a session you should make sure to close
| proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is not supported. (When the `session` parameter is set, the proxy is ignored; a session specific proxy can be set in `sessions.create`.) | | proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is not supported. (When the `session` parameter is set, the proxy is ignored; a session specific proxy can be set in `sessions.create`.) |
| waitInSeconds | Optional, default none. Length to wait in seconds after solving the challenge, and before returning the results. Useful to allow it to load dynamic content. | | waitInSeconds | Optional, default none. Length to wait in seconds after solving the challenge, and before returning the results. Useful to allow it to load dynamic content. |
| disableMedia | Optional, default false. When true FlareSolverr will prevent media resources (images, CSS, and fonts) from being loaded to speed up navigation. | | disableMedia | Optional, default false. When true FlareSolverr will prevent media resources (images, CSS, and fonts) from being loaded to speed up navigation. |
| tabs_till_verify | Optional, default none. Number of times the `Tab` button is needed to be pressed to end up on the turnstile captcha, in order to verify it. After verifying the captcha, the result will be stored in the solution under `turnstile_token`. |
> **Warning** > **Warning**
> If you want to use Cloudflare clearance cookie in your scripts, make sure you use the FlareSolverr User-Agent too. If they don't match you will see the challenge. > If you want to use Cloudflare clearance cookie in your scripts, make sure you use the FlareSolverr User-Agent too. If they don't match you will see the challenge.
@@ -244,7 +247,8 @@ Example response from running the `curl` above:
"sameSite": "None" "sameSite": "None"
} }
], ],
"userAgent": "Windows NT 10.0; Win64; x64) AppleWebKit/5..." "userAgent": "Windows NT 10.0; Win64; x64) AppleWebKit/5...",
"turnstile_token": "03AGdBq24k3lK7JH2v8uN1T5F..."
}, },
"status": "ok", "status": "ok",
"message": "", "message": "",
@@ -256,18 +260,18 @@ Example response from running the `curl` above:
### + `request.post` ### + `request.post`
This is the same as `request.get` but it takes one more param: This works like `request.get`, with the addition of the postData parameter. Note that `tabs_till_verify` is currently supported only for GET requests and requires one extra argument.
| Parameter | Notes | | Parameter | Notes |
|-----------|--------------------------------------------------------------------------| | --------- | ------------------------------------------------------------------------ |
| postData | Must be a string with `application/x-www-form-urlencoded`. Eg: `a=b&c=d` | | postData | Must be a string with `application/x-www-form-urlencoded`. Eg: `a=b&c=d` |
## Environment variables ## Environment variables
| Name | Default | Notes | | Name | Default | Notes |
|--------------------|------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------| | ------------------ | ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| LOG_LEVEL | info | Verbosity of the logging. Use `LOG_LEVEL=debug` for more information. | | LOG_LEVEL | info | Verbosity of the logging. Use `LOG_LEVEL=debug` for more information. |
| LOG_FILE | none | Path to capture log to file. Example: `/config/flaresolver.log`. | | LOG_FILE | none | Path to capture log to file. Example: `/config/flaresolverr.log`. |
| LOG_HTML | false | Only for debugging. If `true` all HTML that passes through the proxy will be logged to the console in `debug` level. | | LOG_HTML | false | Only for debugging. If `true` all HTML that passes through the proxy will be logged to the console in `debug` level. |
| PROXY_URL | none | URL for proxy. Will be overwritten by `request` or `sessions` proxy, if used. Example: `http://127.0.0.1:8080`. | | PROXY_URL | none | URL for proxy. Will be overwritten by `request` or `sessions` proxy, if used. Example: `http://127.0.0.1:8080`. |
| PROXY_USERNAME | none | Username for proxy. Will be overwritten by `request` or `sessions` proxy, if used. Example: `testuser`. | | PROXY_USERNAME | none | Username for proxy. Will be overwritten by `request` or `sessions` proxy, if used. Example: `testuser`. |
@@ -284,15 +288,17 @@ This is the same as `request.get` but it takes one more param:
| PROMETHEUS_PORT | 8192 | Listening port for Prometheus exporter. See the Prometheus section below. | | PROMETHEUS_PORT | 8192 | Listening port for Prometheus exporter. See the Prometheus section below. |
Environment variables are set differently depending on the operating system. Some examples: Environment variables are set differently depending on the operating system. Some examples:
* Docker: Take a look at the Docker section in this document. Environment variables can be set in the `docker-compose.yml` file or in the Docker CLI command.
* Linux: Run `export LOG_LEVEL=debug` and then run `flaresolverr` in the same shell. - Docker: Take a look at the Docker section in this document. Environment variables can be set in the `docker-compose.yml` file or in the Docker CLI command.
* Windows: Open `cmd.exe`, run `set LOG_LEVEL=debug` and then run `flaresolverr.exe` in the same shell. - Linux: Run `export LOG_LEVEL=debug` and then run `flaresolverr` in the same shell.
- Windows: Open `cmd.exe`, run `set LOG_LEVEL=debug` and then run `flaresolverr.exe` in the same shell.
## Prometheus exporter ## Prometheus exporter
The Prometheus exporter for FlareSolverr is disabled by default. It can be enabled with the environment variable `PROMETHEUS_ENABLED`. If you are using Docker make sure you expose the `PROMETHEUS_PORT`. The Prometheus exporter for FlareSolverr is disabled by default. It can be enabled with the environment variable `PROMETHEUS_ENABLED`. If you are using Docker make sure you expose the `PROMETHEUS_PORT`.
Example metrics: Example metrics:
```shell ```shell
# HELP flaresolverr_request_total Total requests with result # HELP flaresolverr_request_total Total requests with result
# TYPE flaresolverr_request_total counter # TYPE flaresolverr_request_total counter
@@ -328,5 +334,5 @@ to the file name of one of the adapters inside the `/captcha` directory.
## Related projects ## Related projects
* C# implementation => https://github.com/FlareSolverr/FlareSolverrSharp - C# implementation => https://github.com/FlareSolverr/FlareSolverrSharp
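For reviewers who want to try the new README parameter, here is a minimal sketch of a `request.get` call that sets `tabs_till_verify` and reads `turnstile_token` back from the solution. The helper names (`build_turnstile_request`, `extract_turnstile_token`, `solve`) are hypothetical and exist only for this illustration; the endpoint is the `http://localhost:8191/v1` address used by the README examples, and the number of `Tab` presses depends on the target page.

```python
import json
import urllib.request

# Endpoint taken from the README examples; adjust if FlareSolverr runs elsewhere.
FLARESOLVERR_URL = "http://localhost:8191/v1"


def build_turnstile_request(url, tabs_till_verify):
    """Build a request.get payload that sets the new tabs_till_verify parameter."""
    return {
        "cmd": "request.get",
        "url": url,
        "tabs_till_verify": tabs_till_verify,
    }


def extract_turnstile_token(response_body):
    """Return solution.turnstile_token, or None when the response carries no token."""
    return (response_body.get("solution") or {}).get("turnstile_token")


def solve(url, tabs_till_verify):
    """Send the payload to a running FlareSolverr instance (performs a network call)."""
    payload = json.dumps(build_turnstile_request(url, tabs_till_verify)).encode()
    request = urllib.request.Request(
        FLARESOLVERR_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_turnstile_token(json.load(response))
```

Calling `solve("https://example.com", 2)` would only work against a live instance; the two pure helpers can be checked without one.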

View File

@@ -11,6 +11,7 @@ class ChallengeResolutionResultT:
cookies: list = None cookies: list = None
userAgent: str = None userAgent: str = None
screenshot: str | None = None screenshot: str | None = None
turnstile_token: str = None
def __init__(self, _dict): def __init__(self, _dict):
self.__dict__.update(_dict) self.__dict__.update(_dict)
@@ -48,6 +49,8 @@ class V1RequestBase(object):
waitInSeconds: int = None waitInSeconds: int = None
# Optional resource blocking flag (blocks images, CSS, and fonts) # Optional resource blocking flag (blocks images, CSS, and fonts)
disableMedia: bool = None disableMedia: bool = None
# Optional when you've got a turnstile captcha that needs to be clicked after X number of Tab presses
tabs_till_verify : int = None
def __init__(self, _dict): def __init__(self, _dict):
self.__dict__.update(_dict) self.__dict__.update(_dict)
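The DTOs touched in this file rely on class-level defaults plus `self.__dict__.update(_dict)`, so a request that omits `tabs_till_verify` still exposes the attribute as `None`. The sketch below reproduces that pattern in isolation; `RequestSketch` is a hypothetical stand-in, not the project's `V1RequestBase`.

```python
class RequestSketch:
    # Class-level defaults: any key missing from the incoming dict falls back to these.
    url: str = None
    disableMedia: bool = None
    tabs_till_verify: int = None

    def __init__(self, _dict):
        # Same pattern as the DTOs in the diff: copy every incoming key onto the instance.
        self.__dict__.update(_dict)


# A client that predates the new parameter keeps working unchanged.
old_client = RequestSketch({"url": "https://example.com"})
# A client that opts in simply adds the key.
new_client = RequestSketch({"url": "https://example.com", "tabs_till_verify": 2})
# Annotations are not enforced at runtime, so a string passes through untouched.
untyped = RequestSketch({"tabs_till_verify": "2"})
```

Because the annotation is not validated, a caller sending `"2"` instead of `2` would only surface as an error later, for example when the value reaches `range(...)`.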

View File

@@ -97,6 +97,20 @@ if __name__ == "__main__":
logger_format = '%(asctime)s %(levelname)-8s %(message)s' logger_format = '%(asctime)s %(levelname)-8s %(message)s'
if log_level == 'DEBUG': if log_level == 'DEBUG':
logger_format = '%(asctime)s %(levelname)-8s ReqId %(thread)s %(message)s' logger_format = '%(asctime)s %(levelname)-8s ReqId %(thread)s %(message)s'
if log_file:
log_file = os.path.realpath(log_file)
log_path = os.path.dirname(log_file)
os.makedirs(log_path, exist_ok=True)
logging.basicConfig(
format=logger_format,
level=log_level,
datefmt='%Y-%m-%d %H:%M:%S',
handlers=[
logging.StreamHandler(sys.stdout),
logging.FileHandler(log_file)
]
)
else:
logging.basicConfig( logging.basicConfig(
format=logger_format, format=logger_format,
level=log_level, level=log_level,
@@ -105,12 +119,6 @@ if __name__ == "__main__":
logging.StreamHandler(sys.stdout) logging.StreamHandler(sys.stdout)
] ]
) )
if log_file:
log_file = os.path.realpath(log_file)
log_path = os.path.dirname(log_file)
os.makedirs(log_path, exist_ok=True)
logging.getLogger().addHandler(logging.FileHandler(log_file))
# disable warning traces from urllib3 # disable warning traces from urllib3
logging.getLogger('urllib3').setLevel(logging.ERROR) logging.getLogger('urllib3').setLevel(logging.ERROR)
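The point of this hunk appears to be that handlers passed to `logging.basicConfig(handlers=[...])` all receive the configured format, whereas a `FileHandler` attached afterwards with `addHandler` has no formatter and falls back to writing the bare message. The following is a minimal sketch of the new arrangement, not the project's code: `configure_logging` and `demo` are hypothetical names, and `force=True` is added here only so the example can be re-run in a process whose root logger is already configured.

```python
import logging
import os
import sys
import tempfile

LOG_FORMAT = '%(asctime)s %(levelname)-8s %(message)s'


def configure_logging(log_file=None, level=logging.INFO):
    """Configure the root logger so console and file share one format."""
    handlers = [logging.StreamHandler(sys.stdout)]
    if log_file:
        log_file = os.path.realpath(log_file)
        # Create the parent directory first, as the diff does.
        os.makedirs(os.path.dirname(log_file), exist_ok=True)
        handlers.append(logging.FileHandler(log_file))
    logging.basicConfig(
        format=LOG_FORMAT,
        level=level,
        datefmt='%Y-%m-%d %H:%M:%S',
        handlers=handlers,
        force=True,  # assumption for this sketch: replace any existing root handlers
    )


def demo():
    """Log one line and return what ended up in the log file."""
    path = os.path.join(tempfile.mkdtemp(), "logs", "demo.log")
    configure_logging(path)
    logging.info("hello")
    with open(path) as fh:
        return fh.read()
```

Running `demo()` returns a line that starts with a timestamp and contains the padded level name, which is the formatting the file output lacked before this change.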

View File

@@ -48,6 +48,11 @@ CHALLENGE_SELECTORS = [
# Fairlane / pararius.com # Fairlane / pararius.com
'div.vc div.text-box h2' 'div.vc div.text-box h2'
] ]
TURNSTILE_SELECTORS = [
"input[name='cf-turnstile-response']"
]
SHORT_TIMEOUT = 1 SHORT_TIMEOUT = 1
SESSIONS_STORAGE = SessionsStorage() SESSIONS_STORAGE = SessionsStorage()
@@ -253,12 +258,17 @@ def _resolve_challenge(req: V1RequestBase, method: str) -> ChallengeResolutionT:
logging.debug('A used instance of webdriver has been destroyed') logging.debug('A used instance of webdriver has been destroyed')
def click_verify(driver: WebDriver): def click_verify(driver: WebDriver, num_tabs: int = 1):
try: try:
logging.debug("Try to find the Cloudflare verify checkbox...") logging.debug("Try to find the Cloudflare verify checkbox...")
actions = ActionChains(driver) actions = ActionChains(driver)
actions.pause(5).send_keys(Keys.TAB).pause(1).send_keys(Keys.SPACE).perform() actions.pause(5)
logging.debug("Cloudflare verify checkbox found and clicked!") for _ in range(num_tabs):
actions.send_keys(Keys.TAB).pause(0.1)
actions.pause(1)
actions.send_keys(Keys.SPACE).perform()
logging.debug(f"Cloudflare verify checkbox clicked after {num_tabs} tabs!")
except Exception: except Exception:
logging.debug("Cloudflare verify checkbox not found on the page.") logging.debug("Cloudflare verify checkbox not found on the page.")
finally: finally:
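To make the new key sequence easier to review, the sketch below lists the steps that the reworked `click_verify` queues on the action chain for a given `num_tabs`, without needing a browser. `plan_key_sequence` and `total_pause` are hypothetical helpers written for this illustration; they mirror the pauses and keys in the hunk above and do not call Selenium.

```python
def plan_key_sequence(num_tabs=1):
    """Return the ordered (kind, value) steps queued for num_tabs Tab presses."""
    steps = [("pause", 5)]                 # initial wait before any key is sent
    for _ in range(num_tabs):
        steps.append(("key", "TAB"))       # move focus one element forward
        steps.append(("pause", 0.1))       # short gap between Tab presses
    steps.append(("pause", 1))             # settle before activating the checkbox
    steps.append(("key", "SPACE"))         # activate the focused element
    return steps


def total_pause(num_tabs=1):
    """Sum of all pauses in seconds for one attempt."""
    return sum(value for kind, value in plan_key_sequence(num_tabs) if kind == "pause")
```

With the default `num_tabs=1` the keys sent are the same as before the change (one `Tab`, then `Space`), with an additional 0.1 s pause after the `Tab`.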
@@ -281,6 +291,47 @@ def click_verify(driver: WebDriver):
time.sleep(2) time.sleep(2)
def _get_turnstile_token(driver: WebDriver, tabs: int):
token_input = driver.find_element(By.CSS_SELECTOR, "input[name='cf-turnstile-response']")
current_value = token_input.get_attribute("value")
while True:
click_verify(driver, num_tabs=tabs)
turnstile_token = token_input.get_attribute("value")
if turnstile_token:
if turnstile_token != current_value:
logging.info(f"Turnstile token: {turnstile_token}")
return turnstile_token
logging.debug(f"Failed to extract token possibly click failed")
# reset focus
driver.execute_script("""
let el = document.createElement('button');
el.style.position='fixed';
el.style.top='0';
el.style.left='0';
document.body.prepend(el);
el.focus();
""")
time.sleep(1)
def _resolve_turnstile_captcha(req: V1RequestBase, driver: WebDriver):
turnstile_token = None
if req.tabs_till_verify is not None:
logging.debug(f'Navigating to... {req.url} in order to pass the turnstile challenge')
driver.get(req.url)
turnstile_challenge_found = False
for selector in TURNSTILE_SELECTORS:
found_elements = driver.find_elements(By.CSS_SELECTOR, selector)
if len(found_elements) > 0:
turnstile_challenge_found = True
logging.info("Turnstile challenge detected. Selector found: " + selector)
break
if turnstile_challenge_found:
turnstile_token = _get_turnstile_token(driver=driver, tabs=req.tabs_till_verify)
else:
logging.debug(f'Turnstile challenge not found')
return turnstile_token
def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> ChallengeResolutionT: def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> ChallengeResolutionT:
res = ChallengeResolutionT({}) res = ChallengeResolutionT({})
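As shown in this hunk, the `while True` loop in `_get_turnstile_token` has no exit condition of its own, so any overall timeout would have to come from the caller, which this diff does not show. One possible alternative is a bounded version of the same "click, read, compare" loop. The sketch below is an illustration under that assumption: `poll_for_new_value` and its parameters are hypothetical and not part of the change.

```python
import time


def poll_for_new_value(read_value, attempt, previous,
                       timeout_s=30.0, interval_s=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Repeat attempt() until read_value() is non-empty and differs from previous.

    Returns the new value, or None once timeout_s has elapsed.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        attempt()                      # e.g. click the verify checkbox
        value = read_value()           # e.g. read the hidden input's value
        if value and value != previous:
            return value
        sleep(interval_s)              # back off before the next attempt
    return None
```

In the context of the diff, `attempt` would correspond to the `click_verify` call plus the focus reset, and `read_value` to reading the `cf-turnstile-response` input; returning `None` lets the caller decide how to report the failure.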
@@ -315,11 +366,15 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
# navigate to the page # navigate to the page
logging.debug(f"Navigating to... {req.url}") logging.debug(f"Navigating to... {req.url}")
turnstile_token = None
if method == "POST": if method == "POST":
_post_request(req, driver) _post_request(req, driver)
else: else:
if req.tabs_till_verify is None:
driver.get(req.url) driver.get(req.url)
else:
turnstile_token = _resolve_turnstile_captcha(req, driver)
# set cookies if required # set cookies if required
if req.cookies is not None and len(req.cookies) > 0: if req.cookies is not None and len(req.cookies) > 0:
@@ -413,6 +468,7 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
challenge_res.status = 200 # todo: fix, selenium not provides this info challenge_res.status = 200 # todo: fix, selenium not provides this info
challenge_res.cookies = driver.get_cookies() challenge_res.cookies = driver.get_cookies()
challenge_res.userAgent = utils.get_user_agent(driver) challenge_res.userAgent = utils.get_user_agent(driver)
challenge_res.turnstile_token = turnstile_token
if not req.returnOnlyCookies: if not req.returnOnlyCookies:
challenge_res.headers = {} # todo: fix, selenium not provides this info challenge_res.headers = {} # todo: fix, selenium not provides this info