Compare commits

21 Commits

Author SHA1 Message Date
ngosang
36226b34c1 Fix Dockerfile for linux/386 architecture 2022-09-24 20:33:33 +02:00
ngosang
606d84f7c0 Install undetected_chromedriver dependencies 2022-09-24 20:04:32 +02:00
ngosang
62eb363575 Reuse patched chromedriver 2022-09-24 19:54:42 +02:00
ngosang
345d27dd5a Fix Chrome version detection on Windows 2022-09-24 19:16:02 +02:00
ngosang
3b9fd0aa6a Add browser headless mode for Windows 2022-09-24 18:42:58 +02:00
ngosang
93041779fb Fork undetected-chromedriver 3.1.5.post4 2022-09-24 18:35:01 +02:00
ngosang
3dbb4e65d6 Reduce Docker image size 2022-09-24 18:29:44 +02:00
ngosang
23dd8f8725 Update readme 2022-09-24 16:18:57 +02:00
ngosang
9ab7ab1371 Add browser headless mode for Linux 2022-09-24 16:18:36 +02:00
ngosang
cf7e4f8749 Add tests for several known sites 2022-09-24 15:48:01 +02:00
ngosang
e8328adb90 Show ReqId only in Debug traces 2022-09-24 15:47:33 +02:00
ngosang
843f588859 Detect Cloudflare Access Denied 2022-09-24 15:40:52 +02:00
ngosang
f8462c86f2 Bump version to 3.0.0.beta2 2022-09-24 15:24:05 +02:00
ngosang
4bc083896b Update readme 2022-09-23 02:18:59 +02:00
ngosang
c9f2d6e954 Add Docker image and Docker compose 2022-09-23 02:18:48 +02:00
ngosang
177578d5d8 Rewrite FlareSolverr from scratch in Python + Selenium 2022-09-23 02:17:50 +02:00
ngosang
efcab83f6e Update package.json 2022-09-22 23:37:31 +02:00
ngosang
51b7bc3b92 Update license, remove FlareSolverr v1 / v2 authors 2022-09-22 21:11:40 +02:00
ngosang
e5be265026 Prepare .gitignore for Python project 2022-09-22 21:08:45 +02:00
ngosang
aed54e0bb3 Disable autotag Github Action 2022-09-22 21:08:22 +02:00
ngosang
5046f60914 Prepare for version 3.0, remove JS code 2022-09-22 20:35:03 +02:00
37 changed files with 2115 additions and 3923 deletions

32
.github/ISSUE_TEMPLATE.md vendored Normal file

@@ -0,0 +1,32 @@
**Please use the search bar** at the top of the page and make sure you are not creating an already submitted issue.
Check closed issues as well, because your issue may have already been fixed.
### How to enable debug and html traces
[Follow the instructions from this wiki page](https://github.com/FlareSolverr/FlareSolverr/wiki/How-to-enable-debug-and-html-trace)
### Environment
* **FlareSolverr version**:
* **Last working FlareSolverr version**:
* **Operating system**:
* **Are you using Docker**: [yes/no]
* **FlareSolverr User-Agent (see log traces or / endpoint)**:
* **Are you using a proxy or VPN?** [yes/no]
* **Are you using Captcha Solver:** [yes/no]
* **If using captcha solver, which one:**
* **URL to test this issue:**
### Description
[List steps to reproduce the error and details on what happens and what you expected to happen]
### Logged Error Messages
[Place any relevant error messages you noticed from the logs here.]
[Make sure you attach the full logs with your personal information removed in case we need more information]
### Screenshots
[Place any screenshots of the issue here if needed]


@@ -1,78 +0,0 @@
name: Bug report
description: Create a report of your issue
body:
- type: checkboxes
attributes:
label: Have you checked our README?
description: Please check the <a href="https://github.com/FlareSolverr/FlareSolverr/blob/master/README.md">README</a>.
options:
- label: I have checked the README
required: true
- type: checkboxes
attributes:
label: Have you followed our Troubleshooting?
description: Please follow our <a href="https://github.com/FlareSolverr/FlareSolverr/wiki/Troubleshooting">Troubleshooting</a>.
options:
- label: I have followed your Troubleshooting
required: true
- type: checkboxes
attributes:
label: Is there already an issue for your problem?
description: Please make sure you are not creating an already submitted <a href="https://github.com/FlareSolverr/FlareSolverr/issues">Issue</a>. Check closed issues as well, because your issue may have already been fixed.
options:
- label: I have checked older issues, open and closed
required: true
- type: checkboxes
attributes:
label: Have you checked the discussions?
description: Please read our <a href="https://github.com/FlareSolverr/FlareSolverr/discussions">Discussions</a> before submitting your issue, some wider problems may be dealt with there.
options:
- label: I have read the Discussions
required: true
- type: input
attributes:
label: Have you ACTUALLY checked all these?
description: Please do not waste our time and yours; these checks are there for a reason, it is not just so you can tick boxes for fun. If you type <b>YES</b> and it is clear you did not or have put in no effort, your issue will be closed and locked without comment. If you type <b>NO</b> but still open this issue, you will be permanently blocked for timewasting.
placeholder: YES or NO
validations:
required: true
- type: textarea
attributes:
label: Environment
description: Please provide the details of the system FlareSolverr is running on.
value: |
- FlareSolverr version:
- Last working FlareSolverr version:
- Operating system:
- Are you using Docker: [yes/no]
- FlareSolverr User-Agent (see log traces or / endpoint):
- Are you using a VPN: [yes/no]
- Are you using a Proxy: [yes/no]
- Are you using Captcha Solver: [yes/no]
- If using captcha solver, which one:
- URL to test this issue:
render: markdown
validations:
required: true
- type: textarea
attributes:
label: Description
description: List steps to reproduce the error and details on what happens and what you expected to happen.
validations:
required: true
- type: textarea
attributes:
label: Logged Error Messages
description: |
Place any relevant error messages you noticed from the logs here.
Make sure you attach the full logs with your personal information removed in case we need more information.
If you wish to provide debug logs, follow the instructions from this <a href="https://github.com/FlareSolverr/FlareSolverr/wiki/How-to-enable-debug-and-html-trace">wiki page</a>.
render: text
validations:
required: true
- type: textarea
attributes:
label: Screenshots
description: Place any screenshots of the issue here if needed
validations:
required: false


@@ -1,8 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: Requesting new features or changes
url: https://github.com/FlareSolverr/FlareSolverr/discussions
about: Please create a new discussion topic, grouped under "Ideas".
- name: Asking questions
url: https://github.com/FlareSolverr/FlareSolverr/discussions
about: Please create a new discussion topic, grouped under "Q&A".


@@ -1,19 +1,21 @@
name: Autotag # todo: enable in the first release
#name: autotag
on: #
push: #on:
branches: # push:
- "master" # branches:
# - "master"
jobs: #
tag-release: #jobs:
runs-on: ubuntu-latest # build:
steps: # runs-on: ubuntu-latest
- name: Checkout repository # steps:
uses: actions/checkout@v5 # -
# name: Checkout
- name: Auto Tag # uses: actions/checkout@v2
uses: Klemensas/action-autotag@stable # -
with: # name: Auto Tag
GITHUB_TOKEN: "${{ secrets.GH_PAT }}" # uses: Klemensas/action-autotag@stable
tag_prefix: "v" # with:
# GITHUB_TOKEN: "${{ secrets.GH_PAT }}"
# tag_prefix: "v"


@@ -1,67 +1,53 @@
name: Docker release name: release-docker
on: on:
push: push:
tags: tags:
- "v*.*.*" - 'v*.*.*'
pull_request:
branches:
- master
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs: jobs:
build-docker-images: build:
if: ${{ !github.event.pull_request.head.repo.fork }}
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- name: Checkout repository -
uses: actions/checkout@v5 name: Checkout
uses: actions/checkout@v2
- name: Downcase repo -
name: Downcase repo
run: echo REPOSITORY=$(echo ${{ github.repository }} | tr '[:upper:]' '[:lower:]') >> $GITHUB_ENV run: echo REPOSITORY=$(echo ${{ github.repository }} | tr '[:upper:]' '[:lower:]') >> $GITHUB_ENV
-
- name: Docker meta name: Docker meta
id: docker_meta id: docker_meta
uses: docker/metadata-action@v5 uses: crazy-max/ghaction-docker-meta@v1
with: with:
images: | images: ${{ env.REPOSITORY }},ghcr.io/${{ env.REPOSITORY }}
${{ env.REPOSITORY }},enable=${{ github.event_name != 'pull_request' }} tag-sha: false
ghcr.io/${{ env.REPOSITORY }} -
tags: | name: Set up QEMU
type=semver,pattern={{version}},prefix=v uses: docker/setup-qemu-action@v1.0.1
type=ref,event=pr -
flavor: | name: Set up Docker Buildx
latest=auto uses: docker/setup-buildx-action@v1
-
- name: Set up QEMU name: Login to DockerHub
uses: docker/setup-qemu-action@v3 uses: docker/login-action@v1
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v3
with: with:
username: ${{ secrets.DOCKERHUB_USERNAME }} username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }} password: ${{ secrets.DOCKERHUB_TOKEN }}
-
- name: Login to GitHub Container Registry name: Login to GitHub Container Registry
uses: docker/login-action@v3 uses: docker/login-action@v1
with: with:
registry: ghcr.io registry: ghcr.io
username: ${{ github.repository_owner }} username: ${{ github.repository_owner }}
password: ${{ secrets.GH_PAT }} password: ${{ secrets.GH_PAT }}
-
- name: Build and push name: Build and push
uses: docker/build-push-action@v6 uses: docker/build-push-action@v2
with: with:
context: . context: .
file: ./Dockerfile file: ./Dockerfile
platforms: linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8 platforms: linux/amd64,linux/arm/v7,linux/arm64
push: true push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.docker_meta.outputs.tags }} tags: ${{ steps.docker_meta.outputs.tags }}
labels: ${{ steps.docker_meta.outputs.labels }} labels: ${{ steps.docker_meta.outputs.labels }}


@@ -1,19 +1,30 @@
name: Release name: release
on: on:
push: push:
tags: tags:
- "v*.*.*" - 'v*.*.*'
jobs: jobs:
create-release: build:
name: Create release name: Create release
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- name: Checkout repository - name: Checkout code
uses: actions/checkout@v5 uses: actions/checkout@v2
with: with:
fetch-depth: 0 fetch-depth: 0 # get all commits, branches and tags (required for the changelog)
- name: Setup Node
uses: actions/setup-node@v2
with:
node-version: '16'
- name: Build artifacts
run: |
npm install
npm run build
npm run package
- name: Build changelog - name: Build changelog
id: github_changelog id: github_changelog
@@ -22,45 +33,23 @@ jobs:
changelog="${changelog//'%'/'%25'}" changelog="${changelog//'%'/'%25'}"
changelog="${changelog//$'\n'/'%0A'}" changelog="${changelog//$'\n'/'%0A'}"
changelog="${changelog//$'\r'/'%0D'}" changelog="${changelog//$'\r'/'%0D'}"
echo "changelog=${changelog}" >> $GITHUB_ENV echo "##[set-output name=changelog;]${changelog}"
- name: Create release - name: Create release
uses: softprops/action-gh-release@v2 id: create_release
uses: actions/create-release@v1
env:
GITHUB_TOKEN: ${{ secrets.GH_PAT }}
with: with:
tag_name: ${{ github.ref }} tag_name: ${{ github.ref }}
name: ${{ github.ref }} release_name: ${{ github.ref }}
body: ${{ env.changelog }} body: ${{ steps.github_changelog.outputs.changelog }}
env: draft: false
GITHUB_TOKEN: ${{ secrets.GH_PAT }} prerelease: false
build-package:
name: Build binaries
needs: create-release
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v6
with:
python-version: "3.13"
- name: Build artifacts
run: |
python -m pip install -r requirements.txt
python -m pip install pyinstaller==6.16.0
cd src
python build_package.py
- name: Upload release artifacts - name: Upload release artifacts
uses: softprops/action-gh-release@v2 uses: alexellis/upload-assets@0.2.2
with:
files: ./dist/flaresolverr_*
env: env:
GITHUB_TOKEN: ${{ secrets.GH_PAT }} GITHUB_TOKEN: ${{ secrets.GH_PAT }}
with:
asset_paths: '["./bin/*.zip"]'
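
The "Build changelog" step above replaces `%`, newline and carriage-return characters before publishing the changelog, so that a multi-line value survives being passed as a single-line workflow value. A minimal Python sketch of the same escaping, for illustration only: the function names are hypothetical and not part of the workflow, and the decoding side is an assumption about how such a value would be restored.

```python
def escape_workflow_value(text: str) -> str:
    """Mirror the three shell substitutions used in the workflow step.

    '%' is escaped first; otherwise the '%' introduced by '%0A' / '%0D'
    would itself be escaped a second time.
    """
    return text.replace("%", "%25").replace("\n", "%0A").replace("\r", "%0D")


def unescape_workflow_value(text: str) -> str:
    """Reverse the escaping (hypothetical helper; '%25' is decoded last)."""
    return text.replace("%0D", "\r").replace("%0A", "\n").replace("%25", "%")


if __name__ == "__main__":
    changelog = "* Fix 100% CPU usage\n* Update readme\r\n"
    escaped = escape_workflow_value(changelog)
    print(escaped)
    # The escaped value fits on one line and round-trips back to the original.
    assert unescape_workflow_value(escaped) == changelog
```

The order of the replacements matters: escaping `%` last would corrupt the `%0A` and `%0D` sequences produced by the earlier substitutions.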

4
.gitignore vendored

@@ -25,7 +25,6 @@ __pycache__/
build/ build/
develop-eggs/ develop-eggs/
dist/ dist/
dist_chrome/
downloads/ downloads/
eggs/ eggs/
.eggs/ .eggs/
@@ -124,6 +123,3 @@ venv.bak/
.mypy_cache/ .mypy_cache/
.dmypy.json .dmypy.json
dmypy.json dmypy.json
# node
node_modules/


@@ -1,458 +0,0 @@
# Changelog
## v3.4.2 (2025/10/09)
* Bump dependencies & CI actions. Thanks @flowerey
* Add optional wait time after resolving the challenge before returning. Thanks @kennedyoliveira
* Add proxy ENVs. Thanks @Robokishan
* Handle empty string and keys without value in postData. Thanks @eZ4RK0
* Add quote protection for password containing it. Thanks @warrenberberd
* Add returnScreenshot parameter to screenshot the final web page. Thanks @estebanthi
* Add log file support. Thanks @acg5159
## v3.4.1 (2025/09/15)
* Fix regex pattern syntax in utils.py
* Change access denied title check to use startswith
## v3.4.0 (2025/08/25)
* Modernize and upgrade application. Thanks @TheCrazyLex
* Remove disable software rasterizer option for ARM builds. Thanks @smrodman83
## v3.3.25 (2025/06/14)
* Remove `use-gl` argument. Thanks @qwerty12
* u_c: remove apparent c&p typo. Thanks @ok3721
* Bump requirements
## v3.3.24 (2025/06/04)
* Remove hidden character
## v3.3.23 (2025/06/04)
* Update base image to bookworm. Thanks @rwjack
## v3.3.22 (2025/06/03)
* Disable search engine choice screen
* Fix headless=false stalling. Thanks @MAKMED1337
* Change from click to keys. Thanks @sh4dowb
* Don't open devtools
* Bump Chromium to v137 for build
* Bump requirements
## v3.3.21 (2024/06/26)
* Add challenge selector to catch reloading page on non-English systems
* Escape values for generated form used in request.post. Thanks @mynameisbogdan
## v3.3.20 (2024/06/21)
* maxTimeout should always be int
* Check not running in Docker before logging version_main error
* Update Cloudflare challenge and checkbox selectors. Thanks @tenettow & @21hsmw
## v3.3.19 (2024/05/23)
* Fix occasional headless issue on Linux when set to "false". Thanks @21hsmw
## v3.3.18 (2024/05/20)
* Fix LANG ENV for Linux
* Fix Chrome v124+ not closing on Windows. Thanks @RileyXX
## v3.3.17 (2024/04/09)
* Fix file descriptor leak in service on quit(). Thanks @zkulis
## v3.3.16 (2024/02/28)
* Fix of the subprocess.STARTUPINFO() call. Thanks @ceconelo
* Add FreeBSD support. Thanks @Asthowen
* Use headless configuration properly. Thanks @hashworks
## v3.3.15 (2024/02/20)
* Fix looping challenges
## v3.3.14-hotfix2 (2024/02/17)
* Hotfix 2 - bad Chromium build, instances failed to terminate
## v3.3.14-hotfix (2024/02/17)
* Hotfix for Linux build - some Chrome files no longer exist
## v3.3.14 (2024/02/17)
* Update Chrome downloads. Thanks @opemvbs
## v3.3.13 (2024/01/07)
* Fix too many open files error
## v3.3.12 (2023/12/15)
* Fix looping challenges and invalid cookies
## v3.3.11 (2023/12/11)
* Update UC 3.5.4 & Selenium 4.15.2. Thanks @txtsd
## v3.3.10 (2023/11/14)
* Add LANG ENV - resolves issues with YGGtorrent
## v3.3.9 (2023/11/13)
* Fix for Docker build, capture TypeError
## v3.3.8 (2023/11/13)
* Fix headless=true for Chrome 117+. Thanks @NabiKAZ
* Support running Chrome 119 from source. Thanks @koleg and @Chris7X
* Fix "OSError: [WinError 6] The handle is invalid" on exit. Thanks @enesgorkemgenc
## v3.3.7 (2023/11/05)
* Bump to rebuild. Thanks @JoachimDorchies
## v3.3.6 (2023/09/15)
* Update checkbox selector, again
## v3.3.5 (2023/09/13)
* Change checkbox selector, support languages other than English
## v3.3.4 (2023/09/02)
* Update checkbox selector
## v3.3.3 (2023/08/31)
* Update undetected_chromedriver to v3.5.3
## v3.3.2 (2023/08/03)
* Fix URL domain in Prometheus exporter
## v3.3.1 (2023/08/03)
* Fix for Cloudflare verify checkbox
* Fix HEADLESS=false in Windows binary
* Fix Prometheus exporter for management and health endpoints
* Remove misleading stack trace when the verify checkbox is not found
* Revert "Update base Docker image to Debian Bookworm" #849
* Revert "Install Chromium 115 from Debian testing" #849
## v3.3.0 (2023/08/02)
* Fix for new Cloudflare detection. Thanks @cedric-bour for #845
* Add support for proxy authentication username/password. Thanks @jacobprice808 for #807
* Implement Prometheus metrics
* Fix Chromium Driver for Chrome / Chromium version > 114
* Use Chromium 115 in binary packages (Windows and Linux)
* Install Chromium 115 from Debian testing (Docker)
* Update base Docker image to Debian Bookworm
* Update Selenium 4.11.2
* Update pyinstaller 5.13.0
* Add more traces in build_package.py
## v3.2.2 (2023/07/16)
* Workaround for updated 'verify you are human' check
## v3.2.1 (2023/06/10)
* Kill dead Chrome processes in Windows
* Fix Chrome GL errors in ASUSTOR NAS
## v3.2.0 (2023/05/23)
* Support "proxy" param in requests and sessions
* Support "cookies" param in requests
* Fix Chromium exec permissions in Linux package
* Update Python dependencies
## v3.1.2 (2023/04/02)
* Fix headless mode in macOS
* Remove redundant artifact from Windows binary package
* Bump Selenium dependency
## v3.1.1 (2023/03/25)
* Distribute binary executables in compressed package
* Add icon for binary executable
* Include information about supported architectures in the readme
* Check Python version on start
## v3.1.0 (2023/03/20)
* Build binaries for Linux x64 and Windows x64
* Sessions with auto-creation on fetch request and TTL
* Fix error trace: Crash Reports/pending No such file or directory
* Fix Waitress server error with asyncore_use_poll=true
* Attempt to fix Docker ARM32 build
* Print platform information on start up
* Add Fairlane challenge selector
* Update DDOS-GUARD title
* Update dependencies
## v3.0.4 (2023/03/07)
* Click on the Cloudflare's 'Verify you are human' button if necessary
## v3.0.3 (2023/03/06)
* Update undetected_chromedriver version to 3.4.6
## v3.0.2 (2023/01/08)
* Detect Cloudflare blocked access
* Check Chrome / Chromium web browser is installed correctly
## v3.0.1 (2023/01/06)
* Kill Chromium processes properly to avoid defunct/zombie processes
* Update undetected-chromedriver
* Disable Zygote sandbox in Chromium browser
* Add more selectors to detect blocked access
* Include procps (ps), curl and vim packages in the Docker image
## v3.0.0 (2023/01/04)
* This is the first release of FlareSolverr v3. There are some breaking changes
* Docker images for linux/386, linux/amd64, linux/arm/v7 and linux/arm64/v8
* Replaced Firefox with Chrome
* Replaced NodeJS / Typescript with Python
* Replaced Puppeteer with Selenium
* No binaries for Linux / Windows. You have to use the Docker image or install from Source code
* No proxy support
* No session support
## v2.2.10 (2022/10/22)
* Detect DDoS-Guard through title content
## v2.2.9 (2022/09/25)
* Detect Cloudflare Access Denied
* Commit the complete changelog
## v2.2.8 (2022/09/17)
* Remove 30 s delay and clean legacy code
## v2.2.7 (2022/09/12)
* Temporary fix: add 30s delay
* Update README.md
## v2.2.6 (2022/07/31)
* Fix Cloudflare detection in POST requests
## v2.2.5 (2022/07/30)
* Update GitHub actions to build executables with NodeJs 16
* Update Cloudflare selectors and add HTML samples
* Install Firefox 94 instead of the latest Nightly
* Update dependencies
* Upgrade Puppeteer (#396)
## v2.2.4 (2022/04/17)
* Detect DDoS-Guard challenge
## v2.2.3 (2022/04/16)
* Fix 2000 ms navigation timeout
* Update README.md (libseccomp2 package in Debian)
* Update README.md (clarify proxy parameter) (#307)
* Update NPM dependencies
* Disable Cloudflare ban detection
## v2.2.2 (2022/03/19)
* Fix ban detection. Resolves #330 (#336)
## v2.2.1 (2022/02/06)
* Fix max timeout error in some pages
* Avoid crashing in NodeJS 17 due to Unhandled promise rejection
* Improve proxy validation and debug traces
* Remove @types/puppeteer dependency
## v2.2.0 (2022/01/31)
* Increase default BROWSER_TIMEOUT=40000 (40 seconds)
* Fix Puppeteer deprecation warnings
* Update base Docker image Alpine 3.15 / NodeJS 16
* Build precompiled binaries with NodeJS 16
* Update Puppeteer and other dependencies
* Add support for Custom CloudFlare challenge
* Add support for DDoS-GUARD challenge
## v2.1.0 (2021/12/12)
* Add aarch64 to user agents to be replaced (#248)
* Fix SOCKSv4 and SOCKSv5 proxy. resolves #214 #220
* Remove redundant JSON key (postData) (#242)
* Make test URL configurable with TEST_URL env var. resolves #240
* Bypass new Cloudflare protection
* Update donation links
## v2.0.2 (2021/10/31)
* Fix SOCKS5 proxy. Resolves #214
* Replace Firefox ESR with a newer version
* Catch startup exceptions and give some advice
* Add env var BROWSER_TIMEOUT for slow systems
* Fix NPM warning in Docker images
## v2.0.1 (2021/10/24)
* Check user home dir before testing web browser installation
## v2.0.0 (2021/10/20)
FlareSolverr 2.0.0 is out with some important changes:
* It is capable of solving the automatic challenges of Cloudflare. CAPTCHAs (hCaptcha) cannot be resolved and the old solvers have been removed.
* The Chrome browser has been replaced by Firefox. This has caused some functionality to be removed. Parameters: `userAgent`, `headers`, `rawHtml` and `download` are no longer available.
* Included `proxy` support without user/password credentials. If you are writing your own integration with FlareSolverr, make sure your client uses the same User-Agent header and Proxy that FlareSolverr uses. Those values together with the Cookie are checked and detected by Cloudflare.
* FlareSolverr has been rewritten from scratch. From now on it should be easier to maintain and test.
* If you are using Jackett make sure you have version v0.18.1041 or higher. FlareSolverSharp v2.0.0 is out too.
Complete changelog:
* Bump version 2.0.0
* Set puppeteer timeout half of maxTimeout param. Resolves #180
* Add test for blocked IP
* Avoid reloading the page in case of error
* Improve Cloudflare detection
* Fix version
* Fix browser preferences and proxy
* Fix request.post method and clean error traces
* Use Firefox ESR for Docker images
* Improve Firefox start time and code clean up
* Improve bad request management and tests
* Build native packages with Firefox
* Update readme
* Improve Docker image and clean TODOs
* Add proxy support
* Implement request.post method for Firefox
* Code clean up, remove returnRawHtml, download, headers params
* Remove outdated captcha solvers
* Refactor the app to use Express server and Jest for tests
* Fix Cloudflare resolver for Linux ARM builds
* Fix Cloudflare resolver
* Replace Chrome web browser with Firefox
* Remove userAgent parameter since any modification is detected by CF
* Update dependencies
* Remove Puppeteer stealth plugin
## v1.2.9 (2021/08/01)
* Improve "Execution context was destroyed" error handling
* Implement returnRawHtml parameter. resolves #172 resolves #165
* Capture Docker stop signal. resolves #158
* Reduce Docker image size 20 MB
* Fix page reload after challenge is solved. resolves #162 resolves #143
* Avoid loading images/css/fonts to speed up page load
* Improve Cloudflare IP ban detection
* Fix vulnerabilities
## v1.2.8 (2021/06/01)
* Improve old JS challenge waiting. Resolves #129
## v1.2.7 (2021/06/01)
* Improvements in Cloudflare redirect detection. Resolves #140
* Fix installation instructions
## v1.2.6 (2021/05/30)
* Handle new Cloudflare challenge. Resolves #135 Resolves #134
* Provide reference Systemd unit file. Resolves #72
* Fix EACCES: permission denied, open '/tmp/flaresolverr.txt'. Resolves #120
* Configure timezone with TZ env var. Resolves #109
* Return the redirected URL in the response (#126)
* Show an error in hcaptcha-solver. Resolves #132
* Regenerate package-lock.json lockfileVersion 2
* Update issue template. Resolves #130
* Bump ws from 7.4.1 to 7.4.6 (#137)
* Bump hosted-git-info from 2.8.8 to 2.8.9 (#124)
* Bump lodash from 4.17.20 to 4.17.21 (#125)
## v1.2.5 (2021/04/05)
* Fix memory regression, close test browser
* Fix release-docker GitHub action
## v1.2.4 (2021/04/04)
* Include license in release zips. resolves #75
* Validate Chrome is working at startup
* Speedup Docker image build
* Add health check endpoint
* Update issue template
* Minor improvements in debug traces
* Validate environment variables at startup. resolves #101
* Add FlareSolverr logo. resolves #23
## v1.2.3 (2021/01/10)
* CI/CD: Generate release changelog from commits. resolves #34
* Update README.md
* Add donation links
* Simplify docker-compose.yml
* Allow to configure "none" captcha resolver
* Override docker-compose.yml variables via .env resolves #64 (#66)
## v1.2.2 (2021/01/09)
* Add documentation for precompiled binaries installation
* Add instructions to set environment variables in Windows
* Build Windows and Linux binaries. resolves #18
* Add release badge in the readme
* CI/CD: Generate release changelog from commits. resolves #34
* Add a notice about captcha solvers
* Add Chrome flag --disable-dev-shm-usage to fix crashes. resolves #45
* Fix Docker CLI documentation
* Add traces with captcha solver service. resolves #39
* Improve logic to detect Cloudflare captcha. resolves #48
* Move Cloudflare provider logic to its own class
* Simplify and document the "return only cookies" parameter
* Show message when debug log is enabled
* Update readme to add more clarifications. resolves #53 (#60)
* issue_template: typo fix (#52)
## v1.2.1 (2020/12/20)
* Change version to match release tag / 1.2.0 => v1.2.0
* CI/CD Publish release in GitHub repository. resolves #34
* Add welcome message in / endpoint
* Rewrite request timeout handling (maxTimeout) resolves #42
* Add http status for better logging
* Return an error when no selectors are found, #25
* Add issue template, fix #32
* Moving log.html right after loading the page and add one on reload, fix #30
* Update User-Agent to match chromium version, ref: #15 (#28)
* Update install from source code documentation
* Update readme to add Docker instructions (#20)
* Clean up readme (#19)
* Add docker-compose
* Change default log level to info
## v1.2.0 (2020/12/20)
* Fix User-Agent detected by Cloudflare (Docker ARM) resolves #15
* Include exception message in error response
* CI/CD: Rename GitHub Action build => publish
* Bump version
* Fix TypeScript compilation and bump minor version
* CI/CD: Bump minor version
* CI/CD: Configure GitHub Actions
* CI/CD: Configure GitHub Actions
* CI/CD: Bump minor version
* CI/CD: Configure Build GitHub Action
* CI/CD: Configure AutoTag GitHub Action (#14)
* CI/CD: Build the Docker images with GitHub Actions (#13)
* Update dependencies
* Backport changes from Cloudproxy (#11)


@@ -1,4 +1,4 @@
FROM python:3.13-slim-bookworm as builder FROM python:3.10-slim-bullseye as builder
# Build dummy packages to skip installing them and their dependencies # Build dummy packages to skip installing them and their dependencies
RUN apt-get update \ RUN apt-get update \
@@ -12,25 +12,28 @@ RUN apt-get update \
&& equivs-build adwaita-icon-theme \ && equivs-build adwaita-icon-theme \
&& mv adwaita-icon-theme_*.deb /adwaita-icon-theme.deb && mv adwaita-icon-theme_*.deb /adwaita-icon-theme.deb
FROM python:3.13-slim-bookworm FROM python:3.10-slim-bullseye
# Copy dummy packages # Copy dummy packages
COPY --from=builder /*.deb / COPY --from=builder /*.deb /
# Install dependencies and create flaresolverr user # Install dependencies and create flaresolverr user
# We have to install an old version of Chromium because the latest one is not working on Raspberry Pi / ARM
# You can test Chromium running this command inside the container: # You can test Chromium running this command inside the container:
# xvfb-run -s "-screen 0 1600x1200x24" chromium --no-sandbox # xvfb-run -s "-screen 0 1600x1200x24" chromium --no-sandbox
# The error trace looks like this: "*** stack smashing detected ***: terminated" # The error trace looks like this: "*** stack smashing detected ***: terminated"
# To check the package versions available you can use this command: # To check the package versions available you can use this command:
# apt-cache madison chromium # apt-cache madison chromium
WORKDIR /app WORKDIR /app
RUN echo "\ndeb http://snapshot.debian.org/archive/debian/20210519T212015Z/ bullseye main" >> /etc/apt/sources.list \
&& echo 'Acquire::Check-Valid-Until "false";' | tee /etc/apt/apt.conf.d/00snapshot \
# Install dummy packages # Install dummy packages
RUN dpkg -i /libgl1-mesa-dri.deb \ && dpkg -i /libgl1-mesa-dri.deb \
&& dpkg -i /adwaita-icon-theme.deb \ && dpkg -i /adwaita-icon-theme.deb \
# Install dependencies # Install dependencies
&& apt-get update \ && apt-get update \
&& apt-get install -y --no-install-recommends chromium chromium-common chromium-driver xvfb dumb-init \ && apt-get install -y --no-install-recommends chromium=89.0.4389.114-1 chromium-common=89.0.4389.114-1 \
procps curl vim xauth \ chromium-driver=89.0.4389.114-1 xvfb \
# Remove temporary files and hardware decoding libraries # Remove temporary files and hardware decoding libraries
&& rm -rf /var/lib/apt/lists/* \ && rm -rf /var/lib/apt/lists/* \
&& rm -f /usr/lib/x86_64-linux-gnu/libmfxhw* \ && rm -f /usr/lib/x86_64-linux-gnu/libmfxhw* \
@@ -38,46 +41,29 @@ RUN dpkg -i /libgl1-mesa-dri.deb \
# Create flaresolverr user # Create flaresolverr user
&& useradd --home-dir /app --shell /bin/sh flaresolverr \ && useradd --home-dir /app --shell /bin/sh flaresolverr \
&& mv /usr/bin/chromedriver chromedriver \ && mv /usr/bin/chromedriver chromedriver \
&& chown -R flaresolverr:flaresolverr . \ && chown -R flaresolverr:flaresolverr .
# Create config dir
&& mkdir /config \
&& chown flaresolverr:flaresolverr /config
VOLUME /config
# Install Python dependencies # Install Python dependencies
COPY requirements.txt . COPY requirements.txt .
RUN pip install -r requirements.txt \ RUN pip install -r requirements.txt \
# Remove temporary files # Remove temporary files
&& rm -rf /root/.cache && rm -rf /root/.cache \
&& find / -name '*.pyc' -delete
USER flaresolverr USER flaresolverr
RUN mkdir -p "/app/.config/chromium/Crash Reports/pending"
COPY src . COPY src .
COPY package.json ../ COPY package.json ../
EXPOSE 8191 EXPOSE 8191
EXPOSE 8192
# dumb-init avoids zombie chromium processes
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["/usr/local/bin/python", "-u", "/app/flaresolverr.py"] CMD ["/usr/local/bin/python", "-u", "/app/flaresolverr.py"]
# Local build # Local build
# docker build -t ngosang/flaresolverr:3.4.2 . # docker build -t ngosang/flaresolverr:3.0.0.beta2 .
# docker run -p 8191:8191 ngosang/flaresolverr:3.4.2 # docker run -p 8191:8191 ngosang/flaresolverr:3.0.0.beta2
# Multi-arch build # Multi-arch build
# docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# docker buildx create --use # docker buildx create --use
# docker buildx build -t ngosang/flaresolverr:3.4.2 --platform linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8 . # docker buildx build -t ngosang/flaresolverr:3.0.0.beta2 --platform linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8 .
# add --push to publish in DockerHub # add --push to publish in DockerHub
# Test multi-arch build
# docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# docker buildx create --use
# docker buildx build -t ngosang/flaresolverr:3.4.2 --platform linux/arm/v7 --load .
# docker run -p 8191:8191 --platform linux/arm/v7 ngosang/flaresolverr:3.4.2
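
The `docker run -p 8191:8191` commands in the Dockerfile comments publish the service on port 8191, and the issue template refers to a `/` endpoint that reports the FlareSolverr User-Agent. A minimal sketch of checking a locally running container from Python follows. It assumes the `/` endpoint returns a JSON object with `version` and `userAgent` fields; those field names, the base URL and the helper names are assumptions for illustration and are not confirmed by this diff.

```python
import json
import urllib.request


def parse_index_response(body: str) -> dict:
    """Extract version and User-Agent from the JSON body of the `/` endpoint.

    The `version` and `userAgent` keys are assumed; missing keys yield None.
    """
    data = json.loads(body)
    return {"version": data.get("version"), "user_agent": data.get("userAgent")}


def fetch_index(base_url: str = "http://localhost:8191", timeout: float = 10.0) -> dict:
    """Query a running instance (hypothetical helper; requires the container to be up)."""
    with urllib.request.urlopen(base_url + "/", timeout=timeout) as resp:
        return parse_index_response(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with a made-up response body.
    sample = '{"msg": "ready", "version": "3.4.2", "userAgent": "Mozilla/5.0 (example)"}'
    print(parse_index_response(sample))
```

Keeping the parsing separate from the network call makes the check testable without a running container.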


@@ -1,6 +1,6 @@
MIT License MIT License
Copyright (c) 2025 Diego Heras (ngosang / ngosang@hotmail.es) Copyright (c) 2022 Diego Heras (ngosang / ngosang@hotmail.es)
Permission is hereby granted, free of charge, to any person obtaining a copy Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal of this software and associated documentation files (the "Software"), to deal

155
README.md

@@ -45,8 +45,7 @@ Supported architectures are:
| ARM32 | linux/arm/v7 | | ARM32 | linux/arm/v7 |
| ARM64 | linux/arm64 | | ARM64 | linux/arm64 |
We provide a `docker-compose.yml` configuration file. Clone this repository and execute We provide a `docker-compose.yml` configuration file. Clone this repository and execute `docker-compose up -d` to start
`docker-compose up -d` _(Compose V1)_ or `docker compose up -d` _(Compose V2)_ to start
the container. the container.
If you prefer the `docker cli` execute the following command. If you prefer the `docker cli` execute the following command.
@@ -59,82 +58,44 @@ docker run -d \
ghcr.io/flaresolverr/flaresolverr:latest ghcr.io/flaresolverr/flaresolverr:latest
``` ```
If your host OS is Debian, make sure `libseccomp2` version is 2.5.x. You can check the version with `sudo apt-cache policy libseccomp2` If your host OS is Debian, make sure `libseccomp2` version is 2.5.x. You can check the version with `sudo apt-cache policy libseccomp2`
and update the package with `sudo apt install libseccomp2=2.5.1-1~bpo10+1` or `sudo apt install libseccomp2=2.5.1-1+deb11u1`. and update the package with `sudo apt install libseccomp2=2.5.1-1~bpo10+1` or `sudo apt install libseccomp2=2.5.1-1+deb11u1`.
Remember to restart the Docker daemon and the container after the update. Remember to restart the Docker daemon and the container after the update.
### Precompiled binaries ### Precompiled binaries
> **Warning**
> Precompiled binaries are only available for x64 architecture. For other architectures see Docker images.
This is the recommended way for Windows users. This is the recommended way for Windows users.
* Download the [FlareSolverr executable](https://github.com/FlareSolverr/FlareSolverr/releases) from the release's page. It is available for Windows x64 and Linux x64. * Download the [FlareSolverr zip](https://github.com/FlareSolverr/FlareSolverr/releases) from the release's assets. It is available for Windows and Linux.
* Extract the zip file. FlareSolverr executable and firefox folder must be in the same directory.
* Execute FlareSolverr binary. In the environment variables section you can find how to change the configuration. * Execute FlareSolverr binary. In the environment variables section you can find how to change the configuration.
### From source code ### From source code
> **Warning** This is the recommended way for macOS users and for developers.
> Installing from source code only works for x64 architecture. For other architectures see Docker images. * Install [Python 3.10](https://www.python.org/downloads/).
* Install [Chrome](https://www.google.com/intl/en_us/chrome/) or [Chromium](https://www.chromium.org/getting-involved/download-chromium/) web browser.
* Install [Python 3.13](https://www.python.org/downloads/). * (Only in Linux / macOS) Install [Xvfb](https://en.wikipedia.org/wiki/Xvfb) package.
* Install [Chrome](https://www.google.com/intl/en_us/chrome/) (all OS) or [Chromium](https://www.chromium.org/getting-involved/download-chromium/) (just Linux, it doesn't work in Windows) web browser.
* (Only in Linux) Install [Xvfb](https://en.wikipedia.org/wiki/Xvfb) package.
* (Only in macOS) Install [XQuartz](https://www.xquartz.org/) package.
* Clone this repository and open a shell in that path. * Clone this repository and open a shell in that path.
* Run `pip install -r requirements.txt` command to install FlareSolverr dependencies. * Run `pip install -r requirements.txt` command to install FlareSolverr dependencies.
* Run `python src/flaresolverr.py` command to start FlareSolverr. * Run `python src/flaresolverr.py` command to start FlareSolverr.
### From source code (FreeBSD/TrueNAS CORE)
* Run `pkg install chromium python313 py313-pip xorg-vfbserver` command to install the required dependencies.
* Clone this repository and open a shell in that path.
* Run `python3.13 -m pip install -r requirements.txt` command to install FlareSolverr dependencies.
* Run `python3.13 src/flaresolverr.py` command to start FlareSolverr.
### Systemd service ### Systemd service
We provide an example Systemd unit file `flaresolverr.service` as reference. You have to modify the file to suit your needs: paths, user and environment variables. We provide an example Systemd unit file `flaresolverr.service` as reference. You have to modify the file to suit your needs: paths, user and environment variables.
## Usage ## Usage
Example Bash request: Example request:
```bash ```bash
curl -L -X POST 'http://localhost:8191/v1' \ curl -L -X POST 'http://localhost:8191/v1' \
-H 'Content-Type: application/json' \ -H 'Content-Type: application/json' \
--data-raw '{ --data-raw '{
"cmd": "request.get", "cmd": "request.get",
"url": "http://www.google.com/", "url":"http://www.google.com/",
"maxTimeout": 60000 "maxTimeout": 60000
}' }'
``` ```
Example Python request:
```py
import requests
url = "http://localhost:8191/v1"
headers = {"Content-Type": "application/json"}
data = {
"cmd": "request.get",
"url": "http://www.google.com/",
"maxTimeout": 60000
}
response = requests.post(url, headers=headers, json=data)
print(response.text)
```
Example PowerShell request:
```ps1
$body = @{
cmd = "request.get"
url = "http://www.google.com/"
maxTimeout = 60000
} | ConvertTo-Json
irm -UseBasicParsing 'http://localhost:8191/v1' -Headers @{"Content-Type"="application/json"} -Method Post -Body $body
```
### Commands ### Commands
#### + `sessions.create` #### + `sessions.create`
@@ -145,10 +106,10 @@ cookies for the browser to use.
This also speeds up the requests since it won't have to launch a new browser instance for every request. This also speeds up the requests since it won't have to launch a new browser instance for every request.
| Parameter | Notes | | Parameter | Notes |
|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| session | Optional. The session ID that you want to be assigned to the instance. If it isn't set, a random UUID will be assigned. | | session | Optional. The session ID that you want to be assigned to the instance. If it isn't set, a random UUID will be assigned. |
| proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is supported. Eg: `"proxy": {"url": "http://127.0.0.1:8888", "username": "testuser", "password": "testpass"}` | | proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is not supported. |
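
As a rough illustration of the `sessions.create` parameters in the table above, the following sketch builds the JSON body for that command. The helper name `build_session_create_payload` and its arguments are hypothetical and not part of FlareSolverr. Proxy credentials are only included because the 3.4.2 side of the table documents them; the 3.0.0.beta2 side states they are not supported.

```python
# Hypothetical helper, not part of FlareSolverr: builds a "sessions.create" body.
ALLOWED_PROXY_SCHEMES = ("http://", "socks4://", "socks5://")


def build_session_create_payload(session_id=None, proxy_url=None,
                                 username=None, password=None):
    payload = {"cmd": "sessions.create"}
    # "session" is optional; per the table, a random UUID is assigned when omitted.
    if session_id is not None:
        payload["session"] = session_id
    if proxy_url is not None:
        # The table requires the proxy schema to be part of the URL.
        if not proxy_url.startswith(ALLOWED_PROXY_SCHEMES):
            raise ValueError("proxy URL must start with http://, socks4:// or socks5://")
        proxy = {"url": proxy_url}
        # Credentials are documented only on the 3.4.2 side of this diff.
        if username is not None and password is not None:
            proxy["username"] = username
            proxy["password"] = password
        payload["proxy"] = proxy
    return payload
```

The resulting dict can be sent as the JSON body of a POST to `http://localhost:8191/v1`, in the same way as the `request.get` examples in the Usage section.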
#### + `sessions.list` #### + `sessions.list`
@@ -179,20 +140,16 @@ session. When you no longer need to use a session you should make sure to close
#### + `request.get` #### + `request.get`
| Parameter | Notes | | Parameter | Notes |
|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |-------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| url | Mandatory | | url | Mandatory |
| session | Optional. Will send the request from an existing browser instance. If one is not sent it will create a temporary instance that will be destroyed immediately after the request is completed. | | session | Optional. Will send the request from an existing browser instance. If one is not sent it will create a temporary instance that will be destroyed immediately after the request is completed. |
| session_ttl_minutes | Optional. FlareSolverr will automatically rotate expired sessions based on the TTL provided in minutes. | | maxTimeout | Optional, default value 60000. Max timeout to solve the challenge in milliseconds. |
| maxTimeout | Optional, default value 60000. Max timeout to solve the challenge in milliseconds. | | cookies | Optional. Will be used by the headless browser. Follow [this](https://github.com/puppeteer/puppeteer/blob/v3.3.0/docs/api.md#pagesetcookiecookies) format. |
| cookies | Optional. Will be used by the headless browser. Eg: `"cookies": [{"name": "cookie1", "value": "value1"}, {"name": "cookie2", "value": "value2"}]`. | | returnOnlyCookies | Optional, default false. Only returns the cookies. Response data, headers and other parts of the response are removed. |
| returnOnlyCookies | Optional, default false. Only returns the cookies. Response data, headers and other parts of the response are removed. | | proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is not supported. (When the `session` parameter is set, the proxy is ignored; a session specific proxy can be set in `sessions.create`.) |
| returnScreenshot | Optional, default false. Captures a screenshot of the final rendered page after all challenges and waits are completed. The screenshot is returned as a Base64-encoded PNG string in the `screenshot` field of the response. |
| proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is not supported. (When the `session` parameter is set, the proxy is ignored; a session specific proxy can be set in `sessions.create`.) |
| waitInSeconds | Optional, default none. Length to wait in seconds after solving the challenge, and before returning the results. Useful to allow it to load dynamic content. |
> **Warning** :warning: If you want to use Cloudflare clearance cookie in your scripts, make sure you use the FlareSolverr User-Agent too. If they don't match you will see the challenge.
> If you want to use Cloudflare clearance cookie in your scripts, make sure you use the FlareSolverr User-Agent too. If they don't match you will see the challenge.
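
The warning above can be sketched in code: when reusing the clearance cookie outside FlareSolverr, take the cookies and the User-Agent from the same response. This is a minimal sketch that assumes the response carries a `solution` object with a `cookies` list (entries with `name` and `value`) and a `userAgent` string, matching the fields declared in `dtos.py`. The helper name `clearance_request_args` is hypothetical.

```python
# Hypothetical helper: turns a parsed FlareSolverr JSON response into
# keyword arguments that an HTTP client such as `requests` can accept.
def clearance_request_args(flaresolverr_response):
    solution = flaresolverr_response["solution"]
    # Flatten the cookie list into a simple name -> value mapping.
    cookies = {cookie["name"]: cookie["value"] for cookie in solution["cookies"]}
    # The User-Agent must match the one FlareSolverr's browser used,
    # otherwise Cloudflare shows the challenge again.
    headers = {"User-Agent": solution["userAgent"]}
    return {"cookies": cookies, "headers": headers}
```

With the `requests` library this could be used as `requests.get(url, **clearance_request_args(data))`, where `data` is the decoded JSON returned by the `/v1` endpoint.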
Example response from running the `curl` above: Example response from running the `curl` above:
@@ -263,68 +220,34 @@ This is the same as `request.get` but it takes one more param:
## Environment variables ## Environment variables
| Name | Default | Notes | | Name | Default | Notes |
|--------------------|------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------| |-----------------|------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
| LOG_LEVEL | info | Verbosity of the logging. Use `LOG_LEVEL=debug` for more information. | | LOG_LEVEL | info | Verbosity of the logging. Use `LOG_LEVEL=debug` for more information. |
| LOG_FILE | none | Path to capture log to file. Example: `/config/flaresolver.log`. | | LOG_HTML | false | Only for debugging. If `true` all HTML that passes through the proxy will be logged to the console in `debug` level. |
| LOG_HTML | false | Only for debugging. If `true` all HTML that passes through the proxy will be logged to the console in `debug` level. | | CAPTCHA_SOLVER | none | Captcha solving method. It is used when a captcha is encountered. See the Captcha Solvers section. |
| PROXY_URL | none | URL for proxy. Will be overwritten by `request` or `sessions` proxy, if used. Example: `http://127.0.0.1:8080`. | | TZ | UTC | Timezone used in the logs and the web browser. Example: `TZ=Europe/London`. |
| PROXY_USERNAME | none | Username for proxy. Will be overwritten by `request` or `sessions` proxy, if used. Example: `testuser`. | | HEADLESS | true | Only for debugging. To run the web browser in headless mode or visible. |
| PROXY_PASSWORD | none | Password for proxy. Will be overwritten by `request` or `sessions` proxy, if used. Example: `testpass`. | | BROWSER_TIMEOUT | 40000 | If you are experiencing errors/timeouts because your system is slow, you can try to increase this value. Remember to increase the `maxTimeout` parameter too. |
| CAPTCHA_SOLVER | none | Captcha solving method. It is used when a captcha is encountered. See the Captcha Solvers section. | | TEST_URL | https://www.google.com | FlareSolverr makes a request on start to make sure the web browser is working. You can change that URL if it is blocked in your country. |
| TZ | UTC | Timezone used in the logs and the web browser. Example: `TZ=Europe/London`. | | PORT | 8191 | Listening port. You don't need to change this if you are running on Docker. |
| LANG | none | Language used in the web browser. Example: `LANG=en_GB`. | | HOST | 0.0.0.0 | Listening interface. You don't need to change this if you are running on Docker. |
| HEADLESS | true | Only for debugging. To run the web browser in headless mode or visible. |
| TEST_URL | https://www.google.com | FlareSolverr makes a request on start to make sure the web browser is working. You can change that URL if it is blocked in your country. |
| PORT | 8191 | Listening port. You don't need to change this if you are running on Docker. |
| HOST | 0.0.0.0 | Listening interface. You don't need to change this if you are running on Docker. |
| PROMETHEUS_ENABLED | false | Enable Prometheus exporter. See the Prometheus section below. |
| PROMETHEUS_PORT | 8192 | Listening port for Prometheus exporter. See the Prometheus section below. |
Environment variables are set differently depending on the operating system. Some examples: Environment variables are set differently depending on the operating system. Some examples:
* Docker: Take a look at the Docker section in this document. Environment variables can be set in the `docker-compose.yml` file or in the Docker CLI command. * Docker: Take a look at the Docker section in this document. Environment variables can be set in the `docker-compose.yml` file or in the Docker CLI command.
* Linux: Run `export LOG_LEVEL=debug` and then run `flaresolverr` in the same shell. * Linux: Run `export LOG_LEVEL=debug` and then start FlareSolverr in the same shell.
* Windows: Open `cmd.exe`, run `set LOG_LEVEL=debug` and then run `flaresolverr.exe` in the same shell. * Windows: Open `cmd.exe`, run `set LOG_LEVEL=debug` and then start FlareSolverr in the same shell.
## Prometheus exporter
The Prometheus exporter for FlareSolverr is disabled by default. It can be enabled with the environment variable `PROMETHEUS_ENABLED`. If you are using Docker make sure you expose the `PROMETHEUS_PORT`.
Example metrics:
```shell
# HELP flaresolverr_request_total Total requests with result
# TYPE flaresolverr_request_total counter
flaresolverr_request_total{domain="nowsecure.nl",result="solved"} 1.0
# HELP flaresolverr_request_created Total requests with result
# TYPE flaresolverr_request_created gauge
flaresolverr_request_created{domain="nowsecure.nl",result="solved"} 1.690141657157109e+09
# HELP flaresolverr_request_duration Request duration in seconds
# TYPE flaresolverr_request_duration histogram
flaresolverr_request_duration_bucket{domain="nowsecure.nl",le="0.0"} 0.0
flaresolverr_request_duration_bucket{domain="nowsecure.nl",le="10.0"} 1.0
flaresolverr_request_duration_bucket{domain="nowsecure.nl",le="25.0"} 1.0
flaresolverr_request_duration_bucket{domain="nowsecure.nl",le="50.0"} 1.0
flaresolverr_request_duration_bucket{domain="nowsecure.nl",le="+Inf"} 1.0
flaresolverr_request_duration_count{domain="nowsecure.nl"} 1.0
flaresolverr_request_duration_sum{domain="nowsecure.nl"} 5.858
# HELP flaresolverr_request_duration_created Request duration in seconds
# TYPE flaresolverr_request_duration_created gauge
flaresolverr_request_duration_created{domain="nowsecure.nl"} 1.6901416571570296e+09
```
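
To read values such as `flaresolverr_request_total` from the exporter output shown above, a small parser along these lines may be enough. It is a simplified sketch rather than a full implementation of the Prometheus text format: it skips comment lines, ignores samples that carry a trailing timestamp, and does not handle escaped quotes inside label values. The function name `parse_metrics` is hypothetical.

```python
import re

# One sample line: metric name, optional {labels}, then a single value.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
# One label pair inside the braces, e.g. domain="nowsecure.nl".
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exporter output."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # unsupported line shape, e.g. a sample with a timestamp
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

The text to parse would come from the port configured in `PROMETHEUS_PORT` (8192 by default). The exact path served by the exporter is not shown in this diff, so it is left out here.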
## Captcha Solvers ## Captcha Solvers
> **Warning** :warning: At this time none of the captcha solvers work. You can check the status in the open issues. Any help is welcome.
> At this time none of the captcha solvers work. You can check the status in the open issues. Any help is welcome.
Sometimes CloudFlare not only gives mathematical computations and browser tests, sometimes they also require the user to Sometimes CloudFlare not only gives mathematical computations and browser tests, sometimes they also require the user to
solve a captcha. solve a captcha.
If this is the case, FlareSolverr will return the error `Captcha detected but no automatic solver is configured.` If this is the case, FlareSolverr will return the error `Captcha detected but no automatic solver is configured.`
FlareSolverr can be customized to solve the CAPTCHA automatically by setting the environment variable `CAPTCHA_SOLVER` FlareSolverr can be customized to solve the captchas automatically by setting the environment variable `CAPTCHA_SOLVER`
to the file name of one of the adapters inside the `/captcha` directory. to the file name of one of the adapters inside the [/captcha](src/captcha) directory.
## Related projects ## Related projects
* C# implementation => https://github.com/FlareSolverr/FlareSolverrSharp * C# implementation => https://github.com/FlareSolverr/FlareSolverrSharp


@@ -7,12 +7,9 @@ services:
container_name: flaresolverr container_name: flaresolverr
environment: environment:
- LOG_LEVEL=${LOG_LEVEL:-info} - LOG_LEVEL=${LOG_LEVEL:-info}
- LOG_FILE=${LOG_FILE:-none}
- LOG_HTML=${LOG_HTML:-false} - LOG_HTML=${LOG_HTML:-false}
- CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none} - CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none}
- TZ=Europe/London - TZ=Europe/London
ports: ports:
- "${PORT:-8191}:8191" - "${PORT:-8191}:8191"
volumes:
- /var/lib/flaresolver:/config
restart: unless-stopped restart: unless-stopped


@@ -1,19 +0,0 @@
[Unit]
Description=FlareSolverr
After=network.target
[Service]
SyslogIdentifier=flaresolverr
Restart=always
RestartSec=5
Type=simple
User=flaresolverr
Group=flaresolverr
Environment="LOG_LEVEL=info"
Environment="CAPTCHA_SOLVER=none"
WorkingDirectory=/opt/flaresolverr
ExecStart=/opt/flaresolverr/flaresolverr
TimeoutStopSec=30
[Install]
WantedBy=multi-user.target


@@ -1,7 +1,7 @@
{ {
"name": "flaresolverr", "name": "flaresolverr",
"version": "3.4.2", "version": "3.0.0.beta2",
"description": "Proxy server to bypass Cloudflare protection", "description": "Proxy server to bypass Cloudflare protection",
"author": "Diego Heras (ngosang / ngosang@hotmail.es)", "author": "Diego Heras (ngosang / ngosang@hotmail.es)",
"license": "MIT" "license": "MIT"
} }


@@ -1,14 +1,9 @@
bottle==0.13.4 bottle==0.12.23
waitress==3.0.2 waitress==2.1.2
selenium==4.36.0 selenium==4.4.3
func-timeout==4.3.5 func-timeout==4.3.5
prometheus-client==0.23.1 # required by undetected_chromedriver
# Required by undetected_chromedriver requests==2.28.1
requests==2.32.5 websockets==10.3
certifi==2025.10.5 # only required for linux
websockets==15.0.1 xvfbwrapper==0.2.9
packaging==25.0
# Only required for Linux and macOS
xvfbwrapper==0.2.14; platform_system != "Windows"
# Only required for Windows
pefile==2024.8.26; platform_system == "Windows"

Binary file not shown.



@@ -1,66 +0,0 @@
import logging
import os
import urllib.parse
from bottle import request
from dtos import V1RequestBase, V1ResponseBase
from metrics import start_metrics_http_server, REQUEST_COUNTER, REQUEST_DURATION
PROMETHEUS_ENABLED = os.environ.get('PROMETHEUS_ENABLED', 'false').lower() == 'true'
PROMETHEUS_PORT = int(os.environ.get('PROMETHEUS_PORT', 8192))
def setup():
if PROMETHEUS_ENABLED:
start_metrics_http_server(PROMETHEUS_PORT)
def prometheus_plugin(callback):
"""
Bottle plugin to expose Prometheus metrics
http://bottlepy.org/docs/dev/plugindev.html
"""
def wrapper(*args, **kwargs):
actual_response = callback(*args, **kwargs)
if PROMETHEUS_ENABLED:
try:
export_metrics(actual_response)
except Exception as e:
logging.warning("Error exporting metrics: " + str(e))
return actual_response
def export_metrics(actual_response):
res = V1ResponseBase(actual_response)
if res.startTimestamp is None or res.endTimestamp is None:
# skip management and healthcheck endpoints
return
domain = "unknown"
if res.solution and res.solution.url:
domain = parse_domain_url(res.solution.url)
else:
# timeout error
req = V1RequestBase(request.json)
if req.url:
domain = parse_domain_url(req.url)
run_time = (res.endTimestamp - res.startTimestamp) / 1000
REQUEST_DURATION.labels(domain=domain).observe(run_time)
result = "unknown"
if res.message == "Challenge solved!":
result = "solved"
elif res.message == "Challenge not detected!":
result = "not_detected"
elif res.message.startswith("Error"):
result = "error"
REQUEST_COUNTER.labels(domain=domain, result=result).inc()
def parse_domain_url(url):
parsed_url = urllib.parse.urlparse(url)
return parsed_url.hostname
return wrapper


@@ -1,110 +0,0 @@
import os
import platform
import shutil
import subprocess
import sys
import zipfile
import requests
def clean_files():
try:
shutil.rmtree(os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'build'))
except Exception:
pass
try:
shutil.rmtree(os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'dist'))
except Exception:
pass
try:
shutil.rmtree(os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'dist_chrome'))
except Exception:
pass
def download_chromium():
# https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Linux_x64/
revision = "1465706" if os.name == 'nt' else '1465706'
arch = 'Win_x64' if os.name == 'nt' else 'Linux_x64'
dl_file = 'chrome-win' if os.name == 'nt' else 'chrome-linux'
dl_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'dist_chrome')
dl_path_folder = os.path.join(dl_path, dl_file)
dl_path_zip = dl_path_folder + '.zip'
# response = requests.get(
# f'https://commondatastorage.googleapis.com/chromium-browser-snapshots/{arch}/LAST_CHANGE',
# timeout=30)
# revision = response.text.strip()
print("Downloading revision: " + revision)
os.mkdir(dl_path)
with requests.get(
f'https://commondatastorage.googleapis.com/chromium-browser-snapshots/{arch}/{revision}/{dl_file}.zip',
stream=True) as r:
r.raise_for_status()
with open(dl_path_zip, 'wb') as f:
for chunk in r.iter_content(chunk_size=8192):
f.write(chunk)
print("File downloaded: " + dl_path_zip)
with zipfile.ZipFile(dl_path_zip, 'r') as zip_ref:
zip_ref.extractall(dl_path)
os.remove(dl_path_zip)
chrome_path = os.path.join(dl_path, "chrome")
shutil.move(dl_path_folder, chrome_path)
print("Extracted in: " + chrome_path)
if os.name != 'nt':
# Give executable permissions for *nix
# file * | grep executable | cut -d: -f1
print("Giving executable permissions...")
execs = ['chrome', 'chrome_crashpad_handler', 'chrome_sandbox', 'chrome-wrapper', 'xdg-mime', 'xdg-settings']
for exec_file in execs:
exec_path = os.path.join(chrome_path, exec_file)
os.chmod(exec_path, 0o755)
def run_pyinstaller():
sep = ';' if os.name == 'nt' else ':'
result = subprocess.run([sys.executable, "-m", "PyInstaller",
"--icon", "resources/flaresolverr_logo.ico",
"--add-data", f"package.json{sep}.",
"--add-data", f"{os.path.join('dist_chrome', 'chrome')}{sep}chrome",
os.path.join("src", "flaresolverr.py")],
cwd=os.pardir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if result.returncode != 0:
print(result.stderr.decode('utf-8'))
raise Exception("Error running pyInstaller")
def compress_package():
dist_folder = os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'dist')
package_folder = os.path.join(dist_folder, 'package')
shutil.move(os.path.join(dist_folder, 'flaresolverr'), os.path.join(package_folder, 'flaresolverr'))
print("Package folder: " + package_folder)
compr_format = 'zip' if os.name == 'nt' else 'gztar'
compr_file_name = 'flaresolverr_windows_x64' if os.name == 'nt' else 'flaresolverr_linux_x64'
compr_file_path = os.path.join(dist_folder, compr_file_name)
shutil.make_archive(compr_file_path, compr_format, package_folder)
print("Compressed file path: " + compr_file_path)
if __name__ == "__main__":
print("Building package...")
print("Platform: " + platform.platform())
print("Cleaning previous build...")
clean_files()
print("Downloading Chromium...")
download_chromium()
print("Building pyinstaller executable... ")
run_pyinstaller()
print("Compressing package... ")
compress_package()
# NOTE: python -m pip install pyinstaller


@@ -10,7 +10,6 @@ class ChallengeResolutionResultT:
response: str = None response: str = None
cookies: list = None cookies: list = None
userAgent: str = None userAgent: str = None
screenshot: str | None = None
def __init__(self, _dict): def __init__(self, _dict):
self.__dict__.update(_dict) self.__dict__.update(_dict)
@@ -34,7 +33,6 @@ class V1RequestBase(object):
maxTimeout: int = None maxTimeout: int = None
proxy: dict = None proxy: dict = None
session: str = None session: str = None
session_ttl_minutes: int = None
headers: list = None # deprecated v2.0.0, not used headers: list = None # deprecated v2.0.0, not used
userAgent: str = None # deprecated v2.0.0, not used userAgent: str = None # deprecated v2.0.0, not used
@@ -42,10 +40,8 @@ class V1RequestBase(object):
url: str = None url: str = None
postData: str = None postData: str = None
returnOnlyCookies: bool = None returnOnlyCookies: bool = None
returnScreenshot: bool = None
download: bool = None # deprecated v2.0.0, not used download: bool = None # deprecated v2.0.0, not used
returnRawHtml: bool = None # deprecated v2.0.0, not used returnRawHtml: bool = None # deprecated v2.0.0, not used
waitInSeconds: int = None
def __init__(self, _dict): def __init__(self, _dict):
self.__dict__.update(_dict) self.__dict__.update(_dict)
@@ -55,8 +51,6 @@ class V1ResponseBase(object):
# V1ResponseBase # V1ResponseBase
status: str = None status: str = None
message: str = None message: str = None
session: str = None
sessions: list[str] = None
startTimestamp: int = None startTimestamp: int = None
endTimestamp: int = None endTimestamp: int = None
version: str = None version: str = None


@@ -3,20 +3,14 @@ import logging
import os import os
import sys import sys
import certifi from bottle import run, response, Bottle, request
from bottle import run, response, Bottle, request, ServerAdapter
from bottle_plugins.error_plugin import error_plugin from bottle_plugins.error_plugin import error_plugin
from bottle_plugins.logger_plugin import logger_plugin from bottle_plugins.logger_plugin import logger_plugin
from bottle_plugins import prometheus_plugin from dtos import IndexResponse, V1RequestBase
from dtos import V1RequestBase
import flaresolverr_service import flaresolverr_service
import utils import utils
env_proxy_url = os.environ.get('PROXY_URL', None)
env_proxy_username = os.environ.get('PROXY_USERNAME', None)
env_proxy_password = os.environ.get('PROXY_PASSWORD', None)
class JSONErrorBottle(Bottle): class JSONErrorBottle(Bottle):
""" """
@@ -29,6 +23,10 @@ class JSONErrorBottle(Bottle):
app = JSONErrorBottle() app = JSONErrorBottle()
# plugin order is important
app.install(logger_plugin)
app.install(error_plugin)
@app.route('/') @app.route('/')
def index(): def index():
@@ -54,14 +52,7 @@ def controller_v1():
""" """
Controller v1 Controller v1
""" """
data = request.json or {} req = V1RequestBase(request.json)
if (('proxy' not in data or not data.get('proxy')) and env_proxy_url is not None and (env_proxy_username is None and env_proxy_password is None)):
logging.info('Using proxy URL ENV')
data['proxy'] = {"url": env_proxy_url}
if (('proxy' not in data or not data.get('proxy')) and env_proxy_url is not None and (env_proxy_username is not None or env_proxy_password is not None)):
logging.info('Using proxy URL, username & password ENVs')
data['proxy'] = {"url": env_proxy_url, "username": env_proxy_username, "password": env_proxy_password}
req = V1RequestBase(data)
res = flaresolverr_service.controller_v1_endpoint(req) res = flaresolverr_service.controller_v1_endpoint(req)
if res.__error_500__: if res.__error_500__:
response.status = 500 response.status = 500
@@ -69,25 +60,8 @@ def controller_v1():
if __name__ == "__main__": if __name__ == "__main__":
# check python version
if sys.version_info < (3, 9):
raise Exception("The Python version is less than 3.9, a version equal to or higher is required.")
# fix for HEADLESS=false in Windows binary
# https://stackoverflow.com/a/27694505
if os.name == 'nt':
import multiprocessing
multiprocessing.freeze_support()
# fix ssl certificates for compiled binaries
# https://github.com/pyinstaller/pyinstaller/issues/7229
# https://stackoverflow.com/questions/55736855/how-to-change-the-cafile-argument-in-the-ssl-module-in-python3
os.environ["REQUESTS_CA_BUNDLE"] = certifi.where()
os.environ["SSL_CERT_FILE"] = certifi.where()
# validate configuration # validate configuration
log_level = os.environ.get('LOG_LEVEL', 'info').upper() log_level = os.environ.get('LOG_LEVEL', 'info').upper()
log_file = os.environ.get('LOG_FILE', None)
log_html = utils.get_config_log_html() log_html = utils.get_config_log_html()
headless = utils.get_config_headless() headless = utils.get_config_headless()
server_host = os.environ.get('HOST', '0.0.0.0') server_host = os.environ.get('HOST', '0.0.0.0')
@@ -105,13 +79,6 @@ if __name__ == "__main__":
logging.StreamHandler(sys.stdout) logging.StreamHandler(sys.stdout)
] ]
) )
if log_file:
log_file = os.path.realpath(log_file)
log_path = os.path.dirname(log_file)
os.makedirs(log_path, exist_ok=True)
logging.getLogger().addHandler(logging.FileHandler(log_file))
# disable warning traces from urllib3 # disable warning traces from urllib3
logging.getLogger('urllib3').setLevel(logging.ERROR) logging.getLogger('urllib3').setLevel(logging.ERROR)
logging.getLogger('selenium.webdriver.remote.remote_connection').setLevel(logging.WARNING) logging.getLogger('selenium.webdriver.remote.remote_connection').setLevel(logging.WARNING)
@@ -120,25 +87,9 @@ if __name__ == "__main__":
logging.info(f'FlareSolverr {utils.get_flaresolverr_version()}') logging.info(f'FlareSolverr {utils.get_flaresolverr_version()}')
logging.debug('Debug log enabled') logging.debug('Debug log enabled')
# Get current OS for global variable
utils.get_current_platform()
# test browser installation # test browser installation
flaresolverr_service.test_browser_installation() flaresolverr_service.test_browser_installation()
# start bottle plugins
# plugin order is important
app.install(logger_plugin)
app.install(error_plugin)
prometheus_plugin.setup()
app.install(prometheus_plugin.prometheus_plugin)
# start webserver # start webserver
# default server 'wsgiref' does not support concurrent requests # default server 'wsgiref' does not support concurrent requests
# https://github.com/FlareSolverr/FlareSolverr/issues/680 run(app, host=server_host, port=server_port, quiet=True, server='waitress')
# https://github.com/Pylons/waitress/issues/31
class WaitressServerPoll(ServerAdapter):
def run(self, handler):
from waitress import serve
serve(handler, host=self.host, port=self.port, asyncore_use_poll=True)
run(app, host=server_host, port=server_port, quiet=True, server=WaitressServerPoll)


@@ -1,79 +1,38 @@
import logging import logging
import platform
import sys
import time import time
from datetime import timedelta from urllib.parse import unquote
from html import escape
from urllib.parse import unquote, quote
from func_timeout import FunctionTimedOut, func_timeout from func_timeout import func_timeout, FunctionTimedOut
from selenium.common import TimeoutException from selenium.common import TimeoutException
from selenium.webdriver.chrome.webdriver import WebDriver from selenium.webdriver.chrome.webdriver import WebDriver
from selenium.webdriver.common.by import By from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.expected_conditions import (
presence_of_element_located, staleness_of, title_is)
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.wait import WebDriverWait from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located, staleness_of
from dtos import V1RequestBase, V1ResponseBase, ChallengeResolutionT, ChallengeResolutionResultT, IndexResponse, \
HealthResponse, STATUS_OK, STATUS_ERROR
import utils import utils
from dtos import (STATUS_ERROR, STATUS_OK, ChallengeResolutionResultT,
ChallengeResolutionT, HealthResponse, IndexResponse,
V1RequestBase, V1ResponseBase)
from sessions import SessionsStorage
ACCESS_DENIED_TITLES = [
# Cloudflare
'Access denied',
# Cloudflare http://bitturk.net/ Firefox
'Attention Required! | Cloudflare'
]
ACCESS_DENIED_SELECTORS = [ ACCESS_DENIED_SELECTORS = [
# Cloudflare # Cloudflare
'div.cf-error-title span.cf-code-label span', 'div.main-wrapper div.header.section h1 span.code-label span'
# Cloudflare http://bitturk.net/ Firefox
'#cf-error-details div.cf-error-overview h1'
]
CHALLENGE_TITLES = [
# Cloudflare
'Just a moment...',
# DDoS-GUARD
'DDoS-Guard'
] ]
CHALLENGE_SELECTORS = [ CHALLENGE_SELECTORS = [
# Cloudflare # Cloudflare
'#cf-challenge-running', '.ray_id', '.attack-box', '#cf-please-wait', '#challenge-spinner', '#trk_jschal_js', '#turnstile-wrapper', '.lds-ring', '#cf-challenge-running', '.ray_id', '.attack-box', '#cf-please-wait', '#trk_jschal_js',
# DDoS-GUARD
'#link-ddg',
# Custom CloudFlare for EbookParadijs, Film-Paleis, MuziekFabriek and Puur-Hollands # Custom CloudFlare for EbookParadijs, Film-Paleis, MuziekFabriek and Puur-Hollands
'td.info #js_info', 'td.info #js_info'
# Fairlane / pararius.com
'div.vc div.text-box h2'
] ]
SHORT_TIMEOUT = 1 SHORT_TIMEOUT = 5
SESSIONS_STORAGE = SessionsStorage()
def test_browser_installation(): def test_browser_installation():
logging.info("Testing web browser installation...") logging.info("Testing web browser installation...")
logging.info("Platform: " + platform.platform())
chrome_exe_path = utils.get_chrome_exe_path()
if chrome_exe_path is None:
logging.error("Chrome / Chromium web browser not installed!")
sys.exit(1)
else:
logging.info("Chrome / Chromium path: " + chrome_exe_path)
chrome_major_version = utils.get_chrome_major_version()
if chrome_major_version == '':
logging.error("Chrome / Chromium version not detected!")
sys.exit(1)
else:
logging.info("Chrome / Chromium major version: " + chrome_major_version)
logging.info("Launching web browser...")
user_agent = utils.get_user_agent() user_agent = utils.get_user_agent()
logging.info("FlareSolverr User-Agent: " + user_agent) logging.info("FlareSolverr User-Agent: " + user_agent)
logging.info("Test successful!") logging.info("Test successful")
def index_endpoint() -> IndexResponse: def index_endpoint() -> IndexResponse:
@@ -121,17 +80,17 @@ def _controller_v1_handler(req: V1RequestBase) -> V1ResponseBase:
logging.warning("Request parameter 'userAgent' was removed in FlareSolverr v2.") logging.warning("Request parameter 'userAgent' was removed in FlareSolverr v2.")
# set default values # set default values
if req.maxTimeout is None or int(req.maxTimeout) < 1: if req.maxTimeout is None or req.maxTimeout < 1:
req.maxTimeout = 60000 req.maxTimeout = 60000
# execute the command # execute the command
res: V1ResponseBase res: V1ResponseBase
if req.cmd == 'sessions.create': if req.cmd == 'sessions.create':
res = _cmd_sessions_create(req) raise Exception("Not implemented yet.")
elif req.cmd == 'sessions.list': elif req.cmd == 'sessions.list':
res = _cmd_sessions_list(req) raise Exception("Not implemented yet.")
elif req.cmd == 'sessions.destroy': elif req.cmd == 'sessions.destroy':
res = _cmd_sessions_destroy(req) raise Exception("Not implemented yet.")
elif req.cmd == 'request.get': elif req.cmd == 'request.get':
res = _cmd_request_get(req) res = _cmd_request_get(req)
elif req.cmd == 'request.post': elif req.cmd == 'request.post':
@@ -178,108 +137,19 @@ def _cmd_request_post(req: V1RequestBase) -> V1ResponseBase:
return res return res
def _cmd_sessions_create(req: V1RequestBase) -> V1ResponseBase:
logging.debug("Creating new session...")
session, fresh = SESSIONS_STORAGE.create(session_id=req.session, proxy=req.proxy)
session_id = session.session_id
if not fresh:
return V1ResponseBase({
"status": STATUS_OK,
"message": "Session already exists.",
"session": session_id
})
return V1ResponseBase({
"status": STATUS_OK,
"message": "Session created successfully.",
"session": session_id
})
def _cmd_sessions_list(req: V1RequestBase) -> V1ResponseBase:
session_ids = SESSIONS_STORAGE.session_ids()
return V1ResponseBase({
"status": STATUS_OK,
"message": "",
"sessions": session_ids
})
def _cmd_sessions_destroy(req: V1RequestBase) -> V1ResponseBase:
session_id = req.session
existed = SESSIONS_STORAGE.destroy(session_id)
if not existed:
raise Exception("The session doesn't exist.")
return V1ResponseBase({
"status": STATUS_OK,
"message": "The session has been removed."
})
def _resolve_challenge(req: V1RequestBase, method: str) -> ChallengeResolutionT: def _resolve_challenge(req: V1RequestBase, method: str) -> ChallengeResolutionT:
timeout = int(req.maxTimeout) / 1000 timeout = req.maxTimeout / 1000
driver = None driver = None
try: try:
if req.session: driver = utils.get_webdriver()
session_id = req.session
ttl = timedelta(minutes=req.session_ttl_minutes) if req.session_ttl_minutes else None
session, fresh = SESSIONS_STORAGE.get(session_id, ttl)
if fresh:
logging.debug(f"new session created to perform the request (session_id={session_id})")
else:
logging.debug(f"existing session is used to perform the request (session_id={session_id}, "
f"lifetime={str(session.lifetime())}, ttl={str(ttl)})")
driver = session.driver
else:
driver = utils.get_webdriver(req.proxy)
logging.debug('New instance of webdriver has been created to perform the request')
return func_timeout(timeout, _evil_logic, (req, driver, method)) return func_timeout(timeout, _evil_logic, (req, driver, method))
except FunctionTimedOut: except FunctionTimedOut:
raise Exception(f'Error solving the challenge. Timeout after {timeout} seconds.') raise Exception(f'Error solving the challenge. Timeout after {timeout} seconds.')
except Exception as e: except Exception as e:
raise Exception('Error solving the challenge. ' + str(e).replace('\n', '\\n')) raise Exception('Error solving the challenge. ' + str(e))
finally: finally:
if not req.session and driver is not None: if driver is not None:
if utils.PLATFORM_VERSION == "nt":
driver.close()
driver.quit() driver.quit()
logging.debug('A used instance of webdriver has been destroyed')
def click_verify(driver: WebDriver):
try:
logging.debug("Try to find the Cloudflare verify checkbox...")
actions = ActionChains(driver)
actions.pause(5).send_keys(Keys.TAB).pause(1).send_keys(Keys.SPACE).perform()
logging.debug("Cloudflare verify checkbox found and clicked!")
except Exception:
logging.debug("Cloudflare verify checkbox not found on the page.")
finally:
driver.switch_to.default_content()
try:
logging.debug("Try to find the Cloudflare 'Verify you are human' button...")
button = driver.find_element(
by=By.XPATH,
value="//input[@type='button' and @value='Verify you are human']",
)
if button:
actions = ActionChains(driver)
actions.move_to_element_with_offset(button, 5, 7)
actions.click(button)
actions.perform()
logging.debug("The Cloudflare 'Verify you are human' button found and clicked!")
except Exception:
logging.debug("The Cloudflare 'Verify you are human' button not found on the page.")
time.sleep(2)
def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> ChallengeResolutionT: def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> ChallengeResolutionT:
@@ -287,37 +157,18 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
res.status = STATUS_OK res.status = STATUS_OK
res.message = "" res.message = ""
# navigate to the page # navigate to the page
logging.debug(f'Navigating to... {req.url}') logging.debug(f'Navigating to... {req.url}')
if method == 'POST': if method == 'POST':
_post_request(req, driver) _post_request(req, driver)
else: else:
driver.get(req.url) driver.get(req.url)
# set cookies if required
if req.cookies is not None and len(req.cookies) > 0:
logging.debug(f'Setting cookies...')
for cookie in req.cookies:
driver.delete_cookie(cookie['name'])
driver.add_cookie(cookie)
# reload the page
if method == 'POST':
_post_request(req, driver)
else:
driver.get(req.url)
# wait for the page
if utils.get_config_log_html(): if utils.get_config_log_html():
logging.debug(f"Response HTML:\n{driver.page_source}") logging.debug(f"Response HTML:\n{driver.page_source}")
html_element = driver.find_element(By.TAG_NAME, "html")
page_title = driver.title
# find access denied titles # wait for the page
for title in ACCESS_DENIED_TITLES: html_element = driver.find_element(By.TAG_NAME, "html")
if page_title.startswith(title):
raise Exception('Cloudflare has blocked this request. '
'Probably your IP is banned for this site, check in your web browser.')
# find access denied selectors # find access denied selectors
for selector in ACCESS_DENIED_SELECTORS: for selector in ACCESS_DENIED_SELECTORS:
found_elements = driver.find_elements(By.CSS_SELECTOR, selector) found_elements = driver.find_elements(By.CSS_SELECTOR, selector)
@@ -325,35 +176,21 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
raise Exception('Cloudflare has blocked this request. ' raise Exception('Cloudflare has blocked this request. '
'Probably your IP is banned for this site, check in your web browser.') 'Probably your IP is banned for this site, check in your web browser.')
# find challenge by title # find challenge selectors
challenge_found = False challenge_found = False
for title in CHALLENGE_TITLES: for selector in CHALLENGE_SELECTORS:
if title.lower() == page_title.lower(): found_elements = driver.find_elements(By.CSS_SELECTOR, selector)
if len(found_elements) > 0:
challenge_found = True challenge_found = True
logging.info("Challenge detected. Title found: " + page_title) logging.info("Challenge detected. Selector found: " + selector)
break break
if not challenge_found:
# find challenge by selectors
for selector in CHALLENGE_SELECTORS:
found_elements = driver.find_elements(By.CSS_SELECTOR, selector)
if len(found_elements) > 0:
challenge_found = True
logging.info("Challenge detected. Selector found: " + selector)
break
attempt = 0
if challenge_found: if challenge_found:
while True: while True:
try: try:
attempt = attempt + 1
# wait until the title changes
for title in CHALLENGE_TITLES:
logging.debug("Waiting for title (attempt " + str(attempt) + "): " + title)
WebDriverWait(driver, SHORT_TIMEOUT).until_not(title_is(title))
# then wait until all the selectors disappear # then wait until all the selectors disappear
for selector in CHALLENGE_SELECTORS: for selector in CHALLENGE_SELECTORS:
logging.debug("Waiting for selector (attempt " + str(attempt) + "): " + selector) logging.debug("Waiting for selector: " + selector)
WebDriverWait(driver, SHORT_TIMEOUT).until_not( WebDriverWait(driver, SHORT_TIMEOUT).until_not(
presence_of_element_located((By.CSS_SELECTOR, selector))) presence_of_element_located((By.CSS_SELECTOR, selector)))
@@ -362,9 +199,6 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
except TimeoutException: except TimeoutException:
logging.debug("Timeout waiting for selector") logging.debug("Timeout waiting for selector")
click_verify(driver)
# update the html (cloudflare reloads the page every 5 s) # update the html (cloudflare reloads the page every 5 s)
html_element = driver.find_element(By.TAG_NAME, "html") html_element = driver.find_element(By.TAG_NAME, "html")
@@ -386,19 +220,11 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
challenge_res.url = driver.current_url challenge_res.url = driver.current_url
challenge_res.status = 200 # todo: fix, selenium does not provide this info challenge_res.status = 200 # todo: fix, selenium does not provide this info
challenge_res.cookies = driver.get_cookies() challenge_res.cookies = driver.get_cookies()
challenge_res.userAgent = utils.get_user_agent(driver)
if not req.returnOnlyCookies: if not req.returnOnlyCookies:
challenge_res.headers = {} # todo: fix, selenium does not provide this info challenge_res.headers = {} # todo: fix, selenium does not provide this info
if req.waitInSeconds and req.waitInSeconds > 0:
logging.info("Waiting " + str(req.waitInSeconds) + " seconds before returning the response...")
time.sleep(req.waitInSeconds)
challenge_res.response = driver.page_source challenge_res.response = driver.page_source
challenge_res.userAgent = utils.get_user_agent(driver)
if req.returnScreenshot:
challenge_res.screenshot = driver.get_screenshot_as_base64()
res.result = challenge_res res.result = challenge_res
return res return res
@@ -406,10 +232,10 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
def _post_request(req: V1RequestBase, driver: WebDriver): def _post_request(req: V1RequestBase, driver: WebDriver):
post_form = f'<form id="hackForm" action="{req.url}" method="POST">' post_form = f'<form id="hackForm" action="{req.url}" method="POST">'
query_string = req.postData if req.postData and req.postData[0] != '?' else req.postData[1:] if req.postData else '' query_string = req.postData if req.postData[0] != '?' else req.postData[1:]
pairs = query_string.split('&') pairs = query_string.split('&')
for pair in pairs: for pair in pairs:
parts = pair.split('=', 1) parts = pair.split('=')
# noinspection PyBroadException # noinspection PyBroadException
try: try:
name = unquote(parts[0]) name = unquote(parts[0])
@@ -419,12 +245,10 @@ def _post_request(req: V1RequestBase, driver: WebDriver):
continue continue
# noinspection PyBroadException # noinspection PyBroadException
try: try:
value = unquote(parts[1]) if len(parts) > 1 else '' value = unquote(parts[1])
except Exception: except Exception:
value = parts[1] if len(parts) > 1 else '' value = parts[1]
# Protection of " character, for syntax post_form += f'<input type="text" name="{name}" value="{value}"><br>'
value=value.replace('"','&quot;')
post_form += f'<input type="text" name="{escape(quote(name))}" value="{escape(quote(value))}"><br>'
post_form += '</form>' post_form += '</form>'
html_content = f""" html_content = f"""
<!DOCTYPE html> <!DOCTYPE html>
@@ -434,4 +258,4 @@ def _post_request(req: V1RequestBase, driver: WebDriver):
<script>document.getElementById('hackForm').submit();</script> <script>document.getElementById('hackForm').submit();</script>
</body> </body>
</html>""" </html>"""
driver.get("data:text/html;charset=utf-8,{html_content}".format(html_content=html_content)) driver.get("data:text/html;charset=utf-8," + html_content)
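The `_post_request` hunks above split the POST body on the first `=` only and escape names and values before writing them into the generated form. A minimal, standalone sketch of that idea follows. It is not the project's code: `build_post_form` is a hypothetical helper name, and it uses plain `html.escape` rather than the `escape(quote(...))` combination shown in the diff, so it illustrates only the HTML-attribute side of the problem, not the `data:` URL encoding.

```python
from html import escape
from urllib.parse import unquote


def build_post_form(url: str, post_data: str) -> str:
    """Build an auto-submittable HTML form from a URL-encoded body.

    Hypothetical helper for illustration; not part of FlareSolverr.
    """
    # Drop an optional leading '?'; an empty body yields a form without inputs.
    query_string = post_data[1:] if post_data.startswith('?') else post_data
    form = f'<form id="hackForm" action="{escape(url)}" method="POST">'
    for pair in query_string.split('&'):
        if not pair:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        name, _, value = pair.partition('=')
        name, value = unquote(name), unquote(value)
        # html.escape (quote=True by default) turns '"' into '&quot;',
        # so a value cannot break out of the attribute.
        form += f'<input type="text" name="{escape(name)}" value="{escape(value)}"><br>'
    return form + '</form>'
```

Under these assumptions, a pair without `=` (such as `c` in `a=1&c`) produces an input with an empty value instead of raising `IndexError`, which is the failure mode the `len(parts) > 1` guards in the diff avoid.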

@@ -1,32 +0,0 @@
import logging
from prometheus_client import Counter, Histogram, start_http_server
import time
REQUEST_COUNTER = Counter(
name='flaresolverr_request',
documentation='Total requests with result',
labelnames=['domain', 'result']
)
REQUEST_DURATION = Histogram(
name='flaresolverr_request_duration',
documentation='Request duration in seconds',
labelnames=['domain'],
buckets=[0, 10, 25, 50]
)
def serve(port):
start_http_server(port=port)
while True:
time.sleep(600)
def start_metrics_http_server(prometheus_port: int):
logging.info(f"Serving Prometheus exporter on http://0.0.0.0:{prometheus_port}/metrics")
from threading import Thread
Thread(
target=serve,
kwargs=dict(port=prometheus_port),
daemon=True,
).start()

@@ -1,84 +0,0 @@
import logging
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple
from uuid import uuid1
from selenium.webdriver.chrome.webdriver import WebDriver
import utils
@dataclass
class Session:
session_id: str
driver: WebDriver
created_at: datetime
def lifetime(self) -> timedelta:
return datetime.now() - self.created_at
class SessionsStorage:
"""SessionsStorage creates, stores and processes all the sessions."""
def __init__(self):
self.sessions = {}
def create(self, session_id: Optional[str] = None, proxy: Optional[dict] = None,
force_new: Optional[bool] = False) -> Tuple[Session, bool]:
"""create creates a new WebDriver instance if necessary,
assigns the given (or a newly generated) session_id to it
and returns the session object together with a boolean flag.
Note: The function is idempotent: if session_id already exists
in the storage, no new WebDriver instance is created and the
existing session is returned. The second element of the returned
tuple tells whether a new session has been created (True) or an
existing one was used (False).
"""
session_id = session_id or str(uuid1())
if force_new:
self.destroy(session_id)
if self.exists(session_id):
return self.sessions[session_id], False
driver = utils.get_webdriver(proxy)
created_at = datetime.now()
session = Session(session_id, driver, created_at)
self.sessions[session_id] = session
return session, True
def exists(self, session_id: str) -> bool:
return session_id in self.sessions
def destroy(self, session_id: str) -> bool:
"""destroy closes the driver instance and removes session from the storage.
The function is noop if session_id doesn't exist.
The function returns True if session was found and destroyed,
and False if session_id wasn't found.
"""
if not self.exists(session_id):
return False
session = self.sessions.pop(session_id)
if utils.PLATFORM_VERSION == "nt":
session.driver.close()
session.driver.quit()
return True
def get(self, session_id: str, ttl: Optional[timedelta] = None) -> Tuple[Session, bool]:
session, fresh = self.create(session_id)
if ttl is not None and not fresh and session.lifetime() > ttl:
logging.debug(f'session\'s lifetime has expired, so the session is recreated (session_id={session_id})')
session, fresh = self.create(session_id, force_new=True)
return session, fresh
def session_ids(self) -> list[str]:
return list(self.sessions.keys())
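The idempotent `create` and the TTL check in `get` shown in the file above can be exercised without a browser. The sketch below mirrors that logic under stated assumptions: `FakeDriver`, `DemoSession` and `DemoSessionsStorage` are hypothetical stand-ins, the real code obtains its driver from `utils.get_webdriver(proxy)`, and the Windows-specific `driver.close()` call is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple
from uuid import uuid1


class FakeDriver:
    """Stand-in for a WebDriver (assumption: only quit() matters here)."""

    def __init__(self) -> None:
        self.closed = False

    def quit(self) -> None:
        self.closed = True


@dataclass
class DemoSession:
    session_id: str
    driver: FakeDriver
    created_at: datetime

    def lifetime(self) -> timedelta:
        return datetime.now() - self.created_at


class DemoSessionsStorage:
    def __init__(self) -> None:
        self.sessions: Dict[str, DemoSession] = {}

    def create(self, session_id: Optional[str] = None,
               force_new: bool = False) -> Tuple[DemoSession, bool]:
        session_id = session_id or str(uuid1())
        if force_new:
            self.destroy(session_id)
        if session_id in self.sessions:
            # Idempotent path: reuse the stored session, flag it as not fresh.
            return self.sessions[session_id], False
        session = DemoSession(session_id, FakeDriver(), datetime.now())
        self.sessions[session_id] = session
        return session, True

    def destroy(self, session_id: str) -> bool:
        session = self.sessions.pop(session_id, None)
        if session is None:
            return False
        session.driver.quit()
        return True

    def get(self, session_id: str,
            ttl: Optional[timedelta] = None) -> Tuple[DemoSession, bool]:
        session, fresh = self.create(session_id)
        # Recreate only a pre-existing session that has outlived its TTL.
        if ttl is not None and not fresh and session.lifetime() > ttl:
            session, fresh = self.create(session_id, force_new=True)
        return session, fresh
```

In this sketch, calling `create("a")` twice returns the same object with the flag `True` and then `False`; backdating `created_at` past the TTL makes `get` quit the old driver and hand back a fresh session.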

@@ -1,5 +1,5 @@
import unittest import unittest
from typing import Optional from datetime import datetime, timezone
from webtest import TestApp from webtest import TestApp
@@ -8,7 +8,7 @@ import flaresolverr
import utils import utils
def _find_obj_by_key(key: str, value: str, _list: list) -> Optional[dict]: def _find_obj_by_key(key: str, value: str, _list: list) -> dict | None:
for obj in _list: for obj in _list:
if obj[key] == value: if obj[key] == value:
return obj return obj
@@ -20,17 +20,14 @@ class TestFlareSolverr(unittest.TestCase):
proxy_url = "http://127.0.0.1:8888" proxy_url = "http://127.0.0.1:8888"
proxy_socks_url = "socks5://127.0.0.1:1080" proxy_socks_url = "socks5://127.0.0.1:1080"
google_url = "https://www.google.com" google_url = "https://www.google.com"
post_url = "https://httpbin.org/post" post_url = "https://ptsv2.com/t/qv4j3-1634496523"
cloudflare_url = "https://nowsecure.nl" cloudflare_url = "https://nowsecure.nl"
cloudflare_url_2 = "https://idope.se/torrent-list/harry/" cloudflare_url_2 = "https://idope.se/torrent-list/harry/"
ddos_guard_url = "https://anidex.info/" ddos_guard_url = "https://anidex.info/"
fairlane_url = "https://www.pararius.com/apartments/amsterdam"
custom_cloudflare_url = "https://www.muziekfabriek.org" custom_cloudflare_url = "https://www.muziekfabriek.org"
cloudflare_blocked_url = "https://cpasbiens3.fr/index.php?do=search&subaction=search" cloudflare_blocked_url = "https://avistaz.to/api/v1/jackett/torrents?in=1&type=0&search="
app = TestApp(flaresolverr.app) app = TestApp(flaresolverr.app)
# wait until the server is ready
app.get('/')
def test_wrong_endpoint(self): def test_wrong_endpoint(self):
res = self.app.get('/wrong', status=404) res = self.app.get('/wrong', status=404)
@@ -68,7 +65,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertEqual("Error: Request parameter 'cmd' = 'request.bad' is invalid.", body.message) self.assertEqual("Error: Request parameter 'cmd' = 'request.bad' is invalid.", body.message)
self.assertGreater(body.startTimestamp, 10000) self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp) self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version) self.assertEqual(utils.get_flaresolverr_version(), body.version)
def test_v1_endpoint_request_get_no_cloudflare(self): def test_v1_endpoint_request_get_no_cloudflare(self):
res = self.app.post_json('/v1', { res = self.app.post_json('/v1', {
@@ -82,7 +79,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertEqual("Challenge not detected!", body.message) self.assertEqual("Challenge not detected!", body.message)
self.assertGreater(body.startTimestamp, 10000) self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp) self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version) self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution solution = body.solution
self.assertIn(self.google_url, solution.url) self.assertIn(self.google_url, solution.url)
@@ -104,7 +101,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertEqual("Challenge solved!", body.message) self.assertEqual("Challenge solved!", body.message)
self.assertGreater(body.startTimestamp, 10000) self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp) self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version) self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution solution = body.solution
self.assertIn(self.cloudflare_url, solution.url) self.assertIn(self.cloudflare_url, solution.url)
@@ -130,7 +127,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertEqual("Challenge solved!", body.message) self.assertEqual("Challenge solved!", body.message)
self.assertGreater(body.startTimestamp, 10000) self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp) self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version) self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution solution = body.solution
self.assertIn(self.cloudflare_url_2, solution.url) self.assertIn(self.cloudflare_url_2, solution.url)
@@ -156,7 +153,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertEqual("Challenge solved!", body.message) self.assertEqual("Challenge solved!", body.message)
self.assertGreater(body.startTimestamp, 10000) self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp) self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version) self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution solution = body.solution
self.assertIn(self.ddos_guard_url, solution.url) self.assertIn(self.ddos_guard_url, solution.url)
@@ -170,32 +167,6 @@ class TestFlareSolverr(unittest.TestCase):
self.assertIsNotNone(cf_cookie, "DDOS-Guard cookie not found") self.assertIsNotNone(cf_cookie, "DDOS-Guard cookie not found")
self.assertGreater(len(cf_cookie["value"]), 10) self.assertGreater(len(cf_cookie["value"]), 10)
def test_v1_endpoint_request_get_fairlane_js(self):
res = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.fairlane_url
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Challenge solved!", body.message)
self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution
self.assertIn(self.fairlane_url, solution.url)
self.assertEqual(solution.status, 200)
self.assertIs(len(solution.headers), 0)
self.assertIn("<title>Rental Apartments Amsterdam</title>", solution.response)
self.assertGreater(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent)
cf_cookie = _find_obj_by_key("name", "fl_pass_v2_b", solution.cookies)
self.assertIsNotNone(cf_cookie, "Fairlane cookie not found")
self.assertGreater(len(cf_cookie["value"]), 50)
def test_v1_endpoint_request_get_custom_cloudflare_js(self): def test_v1_endpoint_request_get_custom_cloudflare_js(self):
res = self.app.post_json('/v1', { res = self.app.post_json('/v1', {
"cmd": "request.get", "cmd": "request.get",
@@ -208,7 +179,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertEqual("Challenge solved!", body.message) self.assertEqual("Challenge solved!", body.message)
self.assertGreater(body.startTimestamp, 10000) self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp) self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version) self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution solution = body.solution
self.assertIn(self.custom_cloudflare_url, solution.url) self.assertIn(self.custom_cloudflare_url, solution.url)
@@ -239,45 +210,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp) self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version) self.assertEqual(utils.get_flaresolverr_version(), body.version)
def test_v1_endpoint_request_get_cookies_param(self): # todo: test Cmd 'request.get' should return OK with 'cookies' param
res = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.google_url,
"cookies": [
{
"name": "testcookie1",
"value": "testvalue1"
},
{
"name": "testcookie2",
"value": "testvalue2"
}
]
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Challenge not detected!", body.message)
self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution
self.assertIn(self.google_url, solution.url)
self.assertEqual(solution.status, 200)
self.assertIs(len(solution.headers), 0)
self.assertIn("<title>Google</title>", solution.response)
self.assertGreater(len(solution.cookies), 1)
self.assertIn("Chrome/", solution.userAgent)
user_cookie1 = _find_obj_by_key("name", "testcookie1", solution.cookies)
self.assertIsNotNone(user_cookie1, "User cookie 1 not found")
self.assertEqual("testvalue1", user_cookie1["value"])
user_cookie2 = _find_obj_by_key("name", "testcookie2", solution.cookies)
self.assertIsNotNone(user_cookie2, "User cookie 2 not found")
self.assertEqual("testvalue2", user_cookie2["value"])
def test_v1_endpoint_request_get_returnOnlyCookies_param(self): def test_v1_endpoint_request_get_returnOnlyCookies_param(self):
res = self.app.post_json('/v1', { res = self.app.post_json('/v1', {
@@ -300,126 +233,12 @@ class TestFlareSolverr(unittest.TestCase):
self.assertIsNone(solution.headers) self.assertIsNone(solution.headers)
self.assertIsNone(solution.response) self.assertIsNone(solution.response)
self.assertGreater(len(solution.cookies), 0) self.assertGreater(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent) self.assertIsNone(solution.userAgent)
def test_v1_endpoint_request_get_proxy_http_param(self): # todo: test Cmd 'request.get' should return OK with HTTP 'proxy' param
""" # todo: test Cmd 'request.get' should return OK with HTTP 'proxy' param with credentials
To configure TinyProxy in local: # todo: test Cmd 'request.get' should return OK with SOCKSv5 'proxy' param
* sudo vim /etc/tinyproxy/tinyproxy.conf # todo: test Cmd 'request.get' should fail with wrong 'proxy' param
* edit => LogFile "/tmp/tinyproxy.log"
* edit => Syslog Off
* sudo tinyproxy -d
* sudo tail -f /tmp/tinyproxy.log
"""
res = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.google_url,
"proxy": {
"url": self.proxy_url
}
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Challenge not detected!", body.message)
self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution
self.assertIn(self.google_url, solution.url)
self.assertEqual(solution.status, 200)
self.assertIs(len(solution.headers), 0)
self.assertIn("<title>Google</title>", solution.response)
self.assertGreater(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent)
def test_v1_endpoint_request_get_proxy_http_param_with_credentials(self):
"""
To configure TinyProxy in local:
* sudo vim /etc/tinyproxy/tinyproxy.conf
* edit => LogFile "/tmp/tinyproxy.log"
* edit => Syslog Off
* add => BasicAuth testuser testpass
* sudo tinyproxy -d
* sudo tail -f /tmp/tinyproxy.log
"""
res = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.google_url,
"proxy": {
"url": self.proxy_url,
"username": "testuser",
"password": "testpass"
}
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Challenge not detected!", body.message)
self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution
self.assertIn(self.google_url, solution.url)
self.assertEqual(solution.status, 200)
self.assertIs(len(solution.headers), 0)
self.assertIn("<title>Google</title>", solution.response)
self.assertGreater(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent)
def test_v1_endpoint_request_get_proxy_socks_param(self):
"""
To configure Dante in local:
* https://linuxhint.com/set-up-a-socks5-proxy-on-ubuntu-with-dante/
* sudo vim /etc/sockd.conf
* sudo systemctl restart sockd.service
* curl --socks5 socks5://127.0.0.1:1080 https://www.google.com
"""
res = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.google_url,
"proxy": {
"url": self.proxy_socks_url
}
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Challenge not detected!", body.message)
self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version)
solution = body.solution
self.assertIn(self.google_url, solution.url)
self.assertEqual(solution.status, 200)
self.assertIs(len(solution.headers), 0)
self.assertIn("<title>Google</title>", solution.response)
self.assertGreater(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent)
def test_v1_endpoint_request_get_proxy_wrong_param(self):
res = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.google_url,
"proxy": {
"url": "http://127.0.0.1:43210"
}
}, status=500)
self.assertEqual(res.status_code, 500)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_ERROR, body.status)
self.assertIn("Error: Error solving the challenge. Message: unknown error: net::ERR_PROXY_CONNECTION_FAILED",
body.message)
self.assertGreater(body.startTimestamp, 10000)
self.assertGreaterEqual(body.endTimestamp, body.startTimestamp)
self.assertEqual(utils.get_flaresolverr_version(), body.version)
def test_v1_endpoint_request_get_fail_timeout(self):
res = self.app.post_json('/v1', {
@@ -462,7 +281,7 @@ class TestFlareSolverr(unittest.TestCase):
def test_v1_endpoint_request_post_no_cloudflare(self):
res = self.app.post_json('/v1', {
"cmd": "request.post",
"url": self.post_url, "url": self.post_url + '/post',
"postData": "param1=value1&param2=value2"
})
self.assertEqual(res.status_code, 200)
@@ -478,10 +297,22 @@ class TestFlareSolverr(unittest.TestCase):
self.assertIn(self.post_url, solution.url)
self.assertEqual(solution.status, 200)
self.assertIs(len(solution.headers), 0)
self.assertIn('"form": {\n "param1": "value1", \n "param2": "value2"\n }', solution.response) self.assertIn("I hope you have a lovely day!", solution.response)
self.assertEqual(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent)
# check that we sent the post data
res2 = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.post_url
})
self.assertEqual(res2.status_code, 200)
body2 = V1ResponseBase(res2.json)
self.assertEqual(STATUS_OK, body2.status)
date_hour = datetime.now(timezone.utc).isoformat().split(':')[0].replace('T', ' ')
self.assertIn(date_hour, body2.solution.response)
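The `date_hour` check above depends on how `datetime.isoformat()` renders a UTC timestamp. A minimal sketch of that string manipulation, using a fixed timestamp instead of `datetime.now()` so the result is reproducible. The helper name `utc_date_hour` is hypothetical and not part of the test suite:

```python
from datetime import datetime, timezone


def utc_date_hour(now: datetime) -> str:
    """Reduce an ISO timestamp to 'YYYY-MM-DD HH', as the test above does."""
    # isoformat() gives e.g. '2022-09-24T15:48:01+00:00';
    # everything before the first ':' is the date plus the hour.
    return now.isoformat().split(":")[0].replace("T", " ")


# Fixed, timezone-aware sample value (an assumption for illustration only).
fixed = datetime(2022, 9, 24, 15, 48, 1, tzinfo=timezone.utc)
print(utc_date_hour(fixed))  # 2022-09-24 15
```

Truncating to the hour keeps the comparison tolerant of seconds-level drift, though a run that crosses an hour boundary between the two requests could presumably still mismatch.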
def test_v1_endpoint_request_post_cloudflare(self):
res = self.app.post_json('/v1', {
"cmd": "request.post",
@@ -533,99 +364,12 @@ class TestFlareSolverr(unittest.TestCase):
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Challenge not detected!", body.message)
def test_v1_endpoint_sessions_create_without_session(self): # todo: test Cmd 'sessions.create' should return OK
res = self.app.post_json('/v1', { # todo: test Cmd 'sessions.create' should return OK with session
"cmd": "sessions.create" # todo: test Cmd 'sessions.list' should return OK
}) # todo: test Cmd 'sessions.destroy' should return OK
self.assertEqual(res.status_code, 200) # todo: test Cmd 'sessions.destroy' should fail
# todo: test Cmd 'request.get' should use session
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Session created successfully.", body.message)
self.assertIsNotNone(body.session)
def test_v1_endpoint_sessions_create_with_session(self):
res = self.app.post_json('/v1', {
"cmd": "sessions.create",
"session": "test_create_session"
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Session created successfully.", body.message)
self.assertEqual(body.session, "test_create_session")
def test_v1_endpoint_sessions_create_with_proxy(self):
res = self.app.post_json('/v1', {
"cmd": "sessions.create",
"proxy": {
"url": self.proxy_url
}
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("Session created successfully.", body.message)
self.assertIsNotNone(body.session)
def test_v1_endpoint_sessions_list(self):
self.app.post_json('/v1', {
"cmd": "sessions.create",
"session": "test_list_sessions"
})
res = self.app.post_json('/v1', {
"cmd": "sessions.list"
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("", body.message)
self.assertGreaterEqual(len(body.sessions), 1)
self.assertIn("test_list_sessions", body.sessions)
def test_v1_endpoint_sessions_destroy_existing_session(self):
self.app.post_json('/v1', {
"cmd": "sessions.create",
"session": "test_destroy_sessions"
})
res = self.app.post_json('/v1', {
"cmd": "sessions.destroy",
"session": "test_destroy_sessions"
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
self.assertEqual("The session has been removed.", body.message)
def test_v1_endpoint_sessions_destroy_non_existing_session(self):
res = self.app.post_json('/v1', {
"cmd": "sessions.destroy",
"session": "non_existing_session_name"
}, status=500)
self.assertEqual(res.status_code, 500)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_ERROR, body.status)
self.assertEqual("Error: The session doesn't exist.", body.message)
def test_v1_endpoint_request_get_with_session(self):
self.app.post_json('/v1', {
"cmd": "sessions.create",
"session": "test_request_sessions"
})
res = self.app.post_json('/v1', {
"cmd": "request.get",
"session": "test_request_sessions",
"url": self.google_url
})
self.assertEqual(res.status_code, 200)
body = V1ResponseBase(res.json)
self.assertEqual(STATUS_OK, body.status)
if __name__ == '__main__':


@@ -39,8 +39,6 @@ def asset_cloudflare_solution(self, res, site_url, site_text):
class TestFlareSolverr(unittest.TestCase):
app = TestApp(flaresolverr.app)
# wait until the server is ready
app.get('/')
def test_v1_endpoint_request_get_cloudflare(self):
sites_get = [

File diff suppressed because it is too large


@@ -0,0 +1,259 @@
#!/usr/bin/env python3
# this module is part of undetected_chromedriver
"""
888 888 d8b
888 888 Y8P
888 888
.d8888b 88888b. 888d888 .d88b. 88888b.d88b. .d88b. .d88888 888d888 888 888 888 .d88b. 888d888
d88P" 888 "88b 888P" d88""88b 888 "888 "88b d8P Y8b d88" 888 888P" 888 888 888 d8P Y8b 888P"
888 888 888 888 888 888 888 888 888 88888888 888 888 888 888 Y88 88P 88888888 888
Y88b. 888 888 888 Y88..88P 888 888 888 Y8b. Y88b 888 888 888 Y8bd8P Y8b. 888
"Y8888P 888 888 888 "Y88P" 888 888 888 "Y8888 "Y88888 888 888 Y88P "Y8888 888 88888888
by UltrafunkAmsterdam (https://github.com/ultrafunkamsterdam)
"""
import io
import logging
import os
import random
import re
import string
import sys
import zipfile
from distutils.version import LooseVersion
from urllib.request import urlopen, urlretrieve
from selenium.webdriver import Chrome as _Chrome, ChromeOptions as _ChromeOptions
TARGET_VERSION = 0
logger = logging.getLogger("uc")
class Chrome:
def __new__(cls, *args, emulate_touch=False, **kwargs):
if not ChromeDriverManager.installed:
ChromeDriverManager(*args, **kwargs).install()
if not ChromeDriverManager.selenium_patched:
ChromeDriverManager(*args, **kwargs).patch_selenium_webdriver()
if not kwargs.get("executable_path"):
kwargs["executable_path"] = "./{}".format(
ChromeDriverManager(*args, **kwargs).executable_path
)
if not kwargs.get("options"):
kwargs["options"] = ChromeOptions()
instance = object.__new__(_Chrome)
instance.__init__(*args, **kwargs)
instance._orig_get = instance.get
def _get_wrapped(*args, **kwargs):
if instance.execute_script("return navigator.webdriver"):
instance.execute_cdp_cmd(
"Page.addScriptToEvaluateOnNewDocument",
{
"source": """
Object.defineProperty(window, 'navigator', {
value: new Proxy(navigator, {
has: (target, key) => (key === 'webdriver' ? false : key in target),
get: (target, key) =>
key === 'webdriver'
? undefined
: typeof target[key] === 'function'
? target[key].bind(target)
: target[key]
})
});
"""
},
)
return instance._orig_get(*args, **kwargs)
instance.get = _get_wrapped
instance.get = _get_wrapped
instance.get = _get_wrapped
original_user_agent_string = instance.execute_script(
"return navigator.userAgent"
)
instance.execute_cdp_cmd(
"Network.setUserAgentOverride",
{
"userAgent": original_user_agent_string.replace("Headless", ""),
},
)
if emulate_touch:
instance.execute_cdp_cmd(
"Page.addScriptToEvaluateOnNewDocument",
{
"source": """
Object.defineProperty(navigator, 'maxTouchPoints', {
get: () => 1
})"""
},
)
logger.info(f"starting undetected_chromedriver.Chrome({args}, {kwargs})")
return instance
class ChromeOptions:
def __new__(cls, *args, **kwargs):
if not ChromeDriverManager.installed:
ChromeDriverManager(*args, **kwargs).install()
if not ChromeDriverManager.selenium_patched:
ChromeDriverManager(*args, **kwargs).patch_selenium_webdriver()
instance = object.__new__(_ChromeOptions)
instance.__init__()
instance.add_argument("start-maximized")
instance.add_experimental_option("excludeSwitches", ["enable-automation"])
instance.add_argument("--disable-blink-features=AutomationControlled")
return instance
class ChromeDriverManager(object):
installed = False
selenium_patched = False
target_version = None
DL_BASE = "https://chromedriver.storage.googleapis.com/"
def __init__(self, executable_path=None, target_version=None, *args, **kwargs):
_platform = sys.platform
if TARGET_VERSION:
# use global if set
self.target_version = TARGET_VERSION
if target_version:
# use explicitly passed target
self.target_version = target_version # user override
if not self.target_version:
# none of the above (default) and just get current version
self.target_version = self.get_release_version_number().version[
0
] # only major version int
self._base = base_ = "chromedriver{}"
exe_name = self._base
if _platform in ("win32",):
exe_name = base_.format(".exe")
if _platform in ("linux",):
_platform += "64"
exe_name = exe_name.format("")
if _platform in ("darwin",):
_platform = "mac64"
exe_name = exe_name.format("")
self.platform = _platform
self.executable_path = executable_path or exe_name
self._exe_name = exe_name
def patch_selenium_webdriver(self_):
"""
Patches selenium package Chrome, ChromeOptions classes for current session
:return:
"""
import selenium.webdriver.chrome.service
import selenium.webdriver
selenium.webdriver.Chrome = Chrome
selenium.webdriver.ChromeOptions = ChromeOptions
logger.info("Selenium patched. Safe to import Chrome / ChromeOptions")
self_.__class__.selenium_patched = True
def install(self, patch_selenium=True):
"""
Initialize the patch
This will:
download chromedriver if not present
patch the downloaded chromedriver
patch selenium package if <patch_selenium> is True (default)
:param patch_selenium: patch selenium webdriver classes for Chrome and ChromeDriver (for current python session)
:return:
"""
if not os.path.exists(self.executable_path):
self.fetch_chromedriver()
if not self.__class__.installed:
if self.patch_binary():
self.__class__.installed = True
if patch_selenium:
self.patch_selenium_webdriver()
def get_release_version_number(self):
"""
Gets the latest major version available, or the latest major version of self.target_version if set explicitly.
:return: version string
"""
path = (
"LATEST_RELEASE"
if not self.target_version
else f"LATEST_RELEASE_{self.target_version}"
)
return LooseVersion(urlopen(self.__class__.DL_BASE + path).read().decode())
def fetch_chromedriver(self):
"""
Downloads ChromeDriver from source and unpacks the executable
:return: on success, name of the unpacked executable
"""
base_ = self._base
zip_name = base_.format(".zip")
ver = self.get_release_version_number().vstring
if os.path.exists(self.executable_path):
return self.executable_path
urlretrieve(
f"{self.__class__.DL_BASE}{ver}/{base_.format(f'_{self.platform}')}.zip",
filename=zip_name,
)
with zipfile.ZipFile(zip_name) as zf:
zf.extract(self._exe_name)
os.remove(zip_name)
if sys.platform != "win32":
os.chmod(self._exe_name, 0o755)
return self._exe_name
@staticmethod
def random_cdc():
cdc = random.choices(string.ascii_lowercase, k=26)
cdc[-6:-4] = map(str.upper, cdc[-6:-4])
cdc[2] = cdc[0]
cdc[3] = "_"
return "".join(cdc).encode()
def patch_binary(self):
"""
Patches the ChromeDriver binary
:return: False on failure, binary name on success
"""
linect = 0
replacement = self.random_cdc()
with io.open(self.executable_path, "r+b") as fh:
for line in iter(lambda: fh.readline(), b""):
if b"cdc_" in line:
fh.seek(-len(line), 1)
newline = re.sub(b"cdc_.{22}", replacement, line)
fh.write(newline)
linect += 1
return linect
def install(executable_path=None, target_version=None, *args, **kwargs):
ChromeDriverManager(executable_path, target_version, *args, **kwargs).install()
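
`random_cdc` and `patch_binary` above replace the `cdc_` marker in the ChromeDriver binary with a random token of the same length (4 + 22 = 26 bytes), so the file size and byte offsets stay unchanged. A minimal sketch of that substitution on an in-memory byte string rather than a real binary. `patch_line` and the sample bytes are hypothetical and only serve the illustration:

```python
import random
import re
import string


def random_cdc() -> bytes:
    """Return a 26-byte token shaped like the one built by random_cdc above."""
    cdc = random.choices(string.ascii_lowercase, k=26)
    cdc[-6:-4] = map(str.upper, cdc[-6:-4])  # two upper-case letters near the end
    cdc[2] = cdc[0]
    cdc[3] = "_"
    return "".join(cdc).encode()


def patch_line(line: bytes, replacement: bytes) -> bytes:
    """Swap each 'cdc_' plus 22 following bytes for the replacement token."""
    return re.sub(b"cdc_.{22}", replacement, line)


# Made-up stand-in for one line of binary data (not real ChromeDriver content).
sample = b"\x00var key = 'cdc_" + b"a" * 22 + b"';\n"
token = random_cdc()
patched = patch_line(sample, token)
print(len(sample) == len(patched))  # the patch keeps the length unchanged
```

Keeping the replacement the same length as the matched run is what lets `patch_binary` seek back and overwrite the line in place.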


@@ -1,112 +1,112 @@
#!/usr/bin/env python3
# this module is part of undetected_chromedriver
import json
import logging
from collections.abc import Mapping, Sequence
import requests
import websockets import requests
import websockets
log = logging.getLogger(__name__)
class CDPObject(dict):
def __init__(self, *a, **k):
super().__init__(*a, **k)
self.__dict__ = self
for k in self.__dict__:
if isinstance(self.__dict__[k], dict):
self.__dict__[k] = CDPObject(self.__dict__[k])
elif isinstance(self.__dict__[k], list):
for i in range(len(self.__dict__[k])):
if isinstance(self.__dict__[k][i], dict):
self.__dict__[k][i] = CDPObject(self)
def __repr__(self):
tpl = f"{self.__class__.__name__}(\n\t{{}}\n\t)"
return tpl.format("\n ".join(f"{k} = {v}" for k, v in self.items()))
class PageElement(CDPObject):
pass
class CDP:
log = logging.getLogger("CDP")
endpoints = CDPObject(
{
"json": "/json",
"protocol": "/json/protocol",
"list": "/json/list",
"new": "/json/new?{url}",
"activate": "/json/activate/{id}",
"close": "/json/close/{id}",
}
)
def __init__(self, options: "ChromeOptions"): # noqa
self.server_addr = "http://{0}:{1}".format(*options.debugger_address.split(":"))
self._reqid = 0
self._session = requests.Session()
self._last_resp = None
self._last_json = None
resp = self.get(self.endpoints.json) # noqa
self.sessionId = resp[0]["id"]
self.wsurl = resp[0]["webSocketDebuggerUrl"]
def tab_activate(self, id=None):
if not id:
active_tab = self.tab_list()[0]
id = active_tab.id # noqa
self.wsurl = active_tab.webSocketDebuggerUrl # noqa
return self.post(self.endpoints["activate"].format(id=id))
def tab_list(self):
retval = self.get(self.endpoints["list"])
return [PageElement(o) for o in retval]
def tab_new(self, url):
return self.post(self.endpoints["new"].format(url=url))
def tab_close_last_opened(self):
sessions = self.tab_list()
opentabs = [s for s in sessions if s["type"] == "page"]
return self.post(self.endpoints["close"].format(id=opentabs[-1]["id"]))
async def send(self, method: str, params: dict):
self._reqid += 1
async with websockets.connect(self.wsurl) as ws:
await ws.send(
json.dumps({"method": method, "params": params, "id": self._reqid})
)
self._last_resp = await ws.recv()
self._last_json = json.loads(self._last_resp)
self.log.info(self._last_json)
def get(self, uri):
resp = self._session.get(self.server_addr + uri)
try:
self._last_resp = resp
self._last_json = resp.json()
except Exception:
return
else:
return self._last_json
def post(self, uri, data: dict = None):
if not data:
data = {}
resp = self._session.post(self.server_addr + uri, json=data)
try:
self._last_resp = resp
self._last_json = resp.json()
except Exception:
return self._last_resp
@property
def last_json(self):
return self._last_json


@@ -1,193 +1,191 @@
import asyncio import asyncio
from collections.abc import Mapping import logging
from collections.abc import Sequence import time
from functools import wraps import traceback
import os from collections.abc import Mapping
import logging from collections.abc import Sequence
import threading from typing import Any
import time from typing import Awaitable
import traceback from typing import Callable
from typing import Any from typing import List
from typing import Awaitable from typing import Optional
from typing import Callable from contextlib import ExitStack
from typing import List import threading
from typing import Optional from functools import wraps, partial
class Structure(dict):
"""
This is a dict-like object structure, which you should subclass
Only properties defined in the class context are used on initialization.
See example
"""
_store = {}
def __init__(self, *a, **kw):
"""
Instantiate a new instance.
:param a:
:param kw:
"""
super().__init__()
# auxiliar dict
d = dict(*a, **kw)
for k, v in d.items():
if isinstance(v, Mapping):
self[k] = self.__class__(v)
elif isinstance(v, Sequence) and not isinstance(v, (str, bytes)):
self[k] = [self.__class__(i) for i in v]
else:
self[k] = v
super().__setattr__("__dict__", self)
def __getattr__(self, item):
return getattr(super(), item)
def __getitem__(self, item):
return super().__getitem__(item)
def __setattr__(self, key, value):
self.__setitem__(key, value)
def __setitem__(self, key, value):
super().__setitem__(key, value)
def update(self, *a, **kw):
super().update(*a, **kw)
def __eq__(self, other):
return frozenset(other.items()) == frozenset(self.items())
def __hash__(self):
return hash(frozenset(self.items()))
@classmethod
def __init_subclass__(cls, **kwargs):
cls._store = {}
def _normalize_strings(self):
for k, v in self.copy().items():
if isinstance(v, (str)):
self[k] = v.strip()
def timeout(seconds=3, on_timeout: Optional[Callable[[callable], Any]] = None):
def wrapper(func):
@wraps(func)
def wrapped(*args, **kwargs):
def function_reached_timeout():
if on_timeout:
on_timeout(func)
else:
raise TimeoutError("function call timed out")
t = threading.Timer(interval=seconds, function=function_reached_timeout)
t.start()
try:
return func(*args, **kwargs)
except:
t.cancel()
raise
finally:
t.cancel()
return wrapped
return wrapper
def test():
import sys, os
sys.path.insert(0, os.path.abspath(os.path.dirname(__file__)))
import undetected_chromedriver as uc
import threading
def collector(
driver: uc.Chrome,
stop_event: threading.Event,
on_event_coro: Optional[Callable[[List[str]], Awaitable[Any]]] = None,
listen_events: Sequence = ("browser", "network", "performance"),
):
def threaded(driver, stop_event, on_event_coro):
async def _ensure_service_started():
while (
getattr(driver, "service", False)
and getattr(driver.service, "process", False)
and driver.service.process.poll()
):
print("waiting for driver service to come back on")
await asyncio.sleep(0.05)
# await asyncio.sleep(driver._delay or .25)
async def get_log_lines(typ):
await _ensure_service_started()
return driver.get_log(typ)
async def looper():
while not stop_event.is_set():
log_lines = []
try:
for _ in listen_events:
try:
log_lines += await get_log_lines(_)
except:
if logging.getLogger().getEffectiveLevel() <= 10:
traceback.print_exc()
continue
if log_lines and on_event_coro:
await on_event_coro(log_lines)
except Exception as e:
if logging.getLogger().getEffectiveLevel() <= 10:
traceback.print_exc()
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.run_until_complete(looper())
t = threading.Thread(target=threaded, args=(driver, stop_event, on_event_coro))
t.start()
async def on_event(data):
print("on_event")
print("data:", data)
def func_called(fn):
def wrapped(*args, **kwargs):
print(
"func called! %s (args: %s, kwargs: %s)" % (fn.__name__, args, kwargs)
)
while driver.service.process and driver.service.process.poll() is not None:
time.sleep(0.1)
res = fn(*args, **kwargs)
print("func completed! (result: %s)" % res)
return res
return wrapped
logging.basicConfig(level=10)
options = uc.ChromeOptions()
options.set_capability(
"goog:loggingPrefs", {"performance": "ALL", "browser": "ALL", "network": "ALL"}
)
driver = uc.Chrome(version_main=96, options=options)
# driver.command_executor._request = timeout(seconds=1)(driver.command_executor._request)
driver.command_executor._request = func_called(driver.command_executor._request)
collector_stop = threading.Event()
collector(driver, collector_stop, on_event)
driver.get("https://nowsecure.nl")
time.sleep(10)
if os.name == "nt": driver.quit()
driver.close()
driver.quit()


@@ -1,77 +1,75 @@
import atexit import multiprocessing
import logging import os
import multiprocessing import platform
import os import sys
import platform from subprocess import PIPE
import signal from subprocess import Popen
from subprocess import PIPE import atexit
from subprocess import Popen import traceback
import sys import logging
import signal
CREATE_NEW_PROCESS_GROUP = 0x00000200
DETACHED_PROCESS = 0x00000008
REGISTERED = []
def start_detached(executable, *args):
"""
Starts a fully independent subprocess (with no parent)
:param executable: executable
:param args: arguments to the executable, eg: ['--param1_key=param1_val', '-vvv' ...]
:return: pid of the grandchild process
"""
# create pipe
reader, writer = multiprocessing.Pipe(False)
# do not keep reference
process = multiprocessing.Process( multiprocessing.Process(
target=_start_detached,
args=(executable, *args),
kwargs={"writer": writer},
daemon=True,
) ).start()
process.start() # receive pid from pipe
process.join() pid = reader.recv()
# receive pid from pipe REGISTERED.append(pid)
pid = reader.recv() # close pipes
REGISTERED.append(pid) writer.close()
# close pipes reader.close()
writer.close()
reader.close() return pid
process.close()
return pid def _start_detached(executable, *args, writer: multiprocessing.Pipe = None):
# configure launch
def _start_detached(executable, *args, writer: multiprocessing.Pipe = None): kwargs = {}
# configure launch if platform.system() == "Windows":
kwargs = {} kwargs.update(creationflags=DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP)
if platform.system() == "Windows": elif sys.version_info < (3, 2):
kwargs.update(creationflags=DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP) # assume posix
elif sys.version_info < (3, 2): kwargs.update(preexec_fn=os.setsid)
# assume posix else: # Python 3.2+ and Unix
kwargs.update(preexec_fn=os.setsid) kwargs.update(start_new_session=True)
else: # Python 3.2+ and Unix
kwargs.update(start_new_session=True) # run
p = Popen([executable, *args], stdin=PIPE, stdout=PIPE, stderr=PIPE, **kwargs)
# run
p = Popen([executable, *args], stdin=PIPE, stdout=PIPE, stderr=PIPE, **kwargs) # send pid to pipe
writer.send(p.pid)
# send pid to pipe sys.exit()
writer.send(p.pid)
sys.exit()
def _cleanup():
for pid in REGISTERED:
def _cleanup(): try:
for pid in REGISTERED: logging.getLogger(__name__).debug("cleaning up pid %d " % pid)
try: os.kill(pid, signal.SIGTERM)
logging.getLogger(__name__).debug("cleaning up pid %d " % pid) except: # noqa
os.kill(pid, signal.SIGTERM) pass
except: # noqa
pass
atexit.register(_cleanup)
atexit.register(_cleanup)


@@ -1,85 +1,70 @@
#!/usr/bin/env python3
# this module is part of undetected_chromedriver
import json
import os
from selenium.webdriver.chromium.options import ChromiumOptions as _ChromiumOptions
class ChromeOptions(_ChromiumOptions):
_session = None
_user_data_dir = None
@property
def user_data_dir(self):
return self._user_data_dir
@user_data_dir.setter
def user_data_dir(self, path: str):
"""
Sets the browser profile folder to use, or creates a new profile
at given <path>.
Parameters
----------
path: str
the path to a chrome profile folder
if it does not exist, a new profile will be created at given location
"""
apath = os.path.abspath(path)
self._user_data_dir = os.path.normpath(apath)
@staticmethod
def _undot_key(key, value):
"""turn a (dotted key, value) into a proper nested dict"""
if "." in key:
key, rest = key.split(".", 1)
value = ChromeOptions._undot_key(rest, value)
return {key: value}
@staticmethod def handle_prefs(self, user_data_dir):
def _merge_nested(a, b): prefs = self.experimental_options.get("prefs")
""" if prefs:
merges b into a
leaf values in a are overwritten with values from b user_data_dir = user_data_dir or self._user_data_dir
""" default_path = os.path.join(user_data_dir, "Default")
for key in b: os.makedirs(default_path, exist_ok=True)
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict): # undot prefs dict keys
ChromeOptions._merge_nested(a[key], b[key]) undot_prefs = {}
continue for key, value in prefs.items():
a[key] = b[key] undot_prefs.update(self._undot_key(key, value))
return a
prefs_file = os.path.join(default_path, "Preferences")
def handle_prefs(self, user_data_dir): if os.path.exists(prefs_file):
prefs = self.experimental_options.get("prefs") with open(prefs_file, encoding="latin1", mode="r") as f:
if prefs: undot_prefs.update(json.load(f))
user_data_dir = user_data_dir or self._user_data_dir
default_path = os.path.join(user_data_dir, "Default") with open(prefs_file, encoding="latin1", mode="w") as f:
os.makedirs(default_path, exist_ok=True) json.dump(undot_prefs, f)
# undot prefs dict keys # remove the experimental_options to avoid an error
undot_prefs = {} del self._experimental_options["prefs"]
for key, value in prefs.items():
undot_prefs = self._merge_nested( @classmethod
undot_prefs, self._undot_key(key, value) def from_options(cls, options):
) o = cls()
o.__dict__.update(options.__dict__)
prefs_file = os.path.join(default_path, "Preferences") return o
if os.path.exists(prefs_file):
with open(prefs_file, encoding="latin1", mode="r") as f:
undot_prefs = self._merge_nested(json.load(f), undot_prefs)
with open(prefs_file, encoding="latin1", mode="w") as f:
json.dump(undot_prefs, f)
# remove the experimental_options to avoid an error
del self._experimental_options["prefs"]
@classmethod
def from_options(cls, options):
o = cls()
o.__dict__.update(options.__dict__)
return o
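
The two sides of this diff differ in how `handle_prefs` combines preferences: one merges nested dicts recursively via `_merge_nested`, the other calls `dict.update`. The sketch below is a standalone illustration of that difference, not the project's code. The function names mirror the diff but are module-level copies, and the preference keys are made up for the example.

```python
def undot_key(key, value):
    """Turn a (dotted key, value) pair into a nested dict."""
    if "." in key:
        key, rest = key.split(".", 1)
        value = undot_key(rest, value)
    return {key: value}


def merge_nested(a, b):
    """Merge b into a in place; leaf values in a are overwritten by b."""
    for key in b:
        if key in a and isinstance(a[key], dict) and isinstance(b[key], dict):
            merge_nested(a[key], b[key])
            continue
        a[key] = b[key]
    return a


# Hypothetical preference keys sharing the same top-level "profile" prefix.
prefs = {
    "profile.default_content_setting_values.images": 2,
    "profile.password_manager_enabled": False,
}

# Shallow update: the second "profile" dict replaces the first one.
shallow = {}
for key, value in prefs.items():
    shallow.update(undot_key(key, value))

# Recursive merge: both settings survive under "profile".
deep = {}
for key, value in prefs.items():
    merge_nested(deep, undot_key(key, value))

print(shallow)
print(deep)
```

With `dict.update`, only the last key under a shared prefix is kept; with the recursive merge, sibling keys under the same prefix are preserved.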

@@ -1,473 +1,276 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
# this module is part of undetected_chromedriver # this module is part of undetected_chromedriver
from packaging.version import Version as LooseVersion import io
import io import logging
import json import os
import logging import random
import os import re
import pathlib import string
import platform import sys
import random import time
import re import zipfile
import shutil from distutils.version import LooseVersion
import string from urllib.request import urlopen, urlretrieve
import subprocess import secrets
import sys
import time
from urllib.request import urlopen logger = logging.getLogger(__name__)
from urllib.request import urlretrieve
import zipfile IS_POSIX = sys.platform.startswith(("darwin", "cygwin", "linux"))
from multiprocessing import Lock
logger = logging.getLogger(__name__) class Patcher(object):
url_repo = "https://chromedriver.storage.googleapis.com"
IS_POSIX = sys.platform.startswith(("darwin", "cygwin", "linux", "linux2", "freebsd")) zip_name = "chromedriver_%s.zip"
exe_name = "chromedriver%s"
class Patcher(object): platform = sys.platform
lock = Lock() if platform.endswith("win32"):
exe_name = "chromedriver%s" zip_name %= "win32"
exe_name %= ".exe"
platform = sys.platform if platform.endswith("linux"):
if platform.endswith("win32"): zip_name %= "linux64"
d = "~/appdata/roaming/undetected_chromedriver" exe_name %= ""
elif "LAMBDA_TASK_ROOT" in os.environ: if platform.endswith("darwin"):
d = "/tmp/undetected_chromedriver" zip_name %= "mac64"
elif platform.startswith(("linux", "linux2")): exe_name %= ""
d = "~/.local/share/undetected_chromedriver"
elif platform.endswith("darwin"): if platform.endswith("win32"):
d = "~/Library/Application Support/undetected_chromedriver" d = "~/appdata/roaming/undetected_chromedriver"
else: elif platform.startswith("linux"):
d = "~/.undetected_chromedriver" d = "~/.local/share/undetected_chromedriver"
data_path = os.path.abspath(os.path.expanduser(d)) elif platform.endswith("darwin"):
d = "~/Library/Application Support/undetected_chromedriver"
def __init__( else:
self, d = "~/.undetected_chromedriver"
executable_path=None, data_path = os.path.abspath(os.path.expanduser(d))
force=False,
version_main: int = 0, def __init__(self, executable_path=None, force=False, version_main: int = 0):
user_multi_procs=False, """
):
""" Args:
Args: executable_path: None = automatic
executable_path: None = automatic a full file path to the chromedriver executable
a full file path to the chromedriver executable force: False
force: False terminate processes which are holding lock
terminate processes which are holding lock version_main: 0 = auto
version_main: 0 = auto specify main chrome version (rounded, ex: 82)
specify main chrome version (rounded, ex: 82) """
"""
self.force = force self.force = force
self._custom_exe_path = False self.executable_path = None
prefix = "undetected" prefix = secrets.token_hex(8)
self.user_multi_procs = user_multi_procs
if not os.path.exists(self.data_path):
try: os.makedirs(self.data_path, exist_ok=True)
# Try to convert version_main into an integer
version_main_int = int(version_main) if not executable_path:
# check if version_main_int is less than or equal to e.g 114 self.executable_path = os.path.join(
self.is_old_chromedriver = version_main and version_main_int <= 114 self.data_path, "_".join([prefix, self.exe_name])
except (ValueError,TypeError): )
# Check not running inside Docker
if not os.path.exists("/app/chromedriver"): if not IS_POSIX:
# If the conversion fails, log an error message if executable_path:
logging.info("version_main cannot be converted to an integer") if not executable_path[-4:] == ".exe":
# Set self.is_old_chromedriver to False if the conversion fails executable_path += ".exe"
self.is_old_chromedriver = False
self.zip_path = os.path.join(self.data_path, prefix)
# Needs to be called before self.exe_name is accessed
self._set_platform_name() if not executable_path:
self.executable_path = os.path.abspath(
if not os.path.exists(self.data_path): os.path.join(".", self.executable_path)
os.makedirs(self.data_path, exist_ok=True) )
if not executable_path: self._custom_exe_path = False
if sys.platform.startswith("freebsd"):
self.executable_path = os.path.join( if executable_path:
self.data_path, self.exe_name self._custom_exe_path = True
) self.executable_path = executable_path
else: self.version_main = version_main
self.executable_path = os.path.join( self.version_full = None
self.data_path, "_".join([prefix, self.exe_name])
) def auto(self, executable_path=None, force=False, version_main=None):
""""""
if not IS_POSIX: if executable_path:
if executable_path: self.executable_path = executable_path
if not executable_path[-4:] == ".exe": self._custom_exe_path = True
executable_path += ".exe"
if self._custom_exe_path:
self.zip_path = os.path.join(self.data_path, prefix) ispatched = self.is_binary_patched(self.executable_path)
if not ispatched:
if not executable_path: return self.patch_exe()
if not self.user_multi_procs: else:
self.executable_path = os.path.abspath( return
os.path.join(".", self.executable_path)
) if version_main:
self.version_main = version_main
if executable_path: if force is True:
self._custom_exe_path = True self.force = force
self.executable_path = executable_path
try:
# Set the correct repository to download the Chromedriver from os.unlink(self.executable_path)
if self.is_old_chromedriver: except PermissionError:
self.url_repo = "https://chromedriver.storage.googleapis.com" if self.force:
else: self.force_kill_instances(self.executable_path)
self.url_repo = "https://googlechromelabs.github.io/chrome-for-testing" return self.auto(force=not self.force)
try:
self.version_main = version_main if self.is_binary_patched():
self.version_full = None # assumes already running AND patched
return True
def _set_platform_name(self): except PermissionError:
""" pass
Set the platform and exe name based on the platform undetected_chromedriver is running on # return False
in order to download the correct chromedriver. except FileNotFoundError:
""" pass
if self.platform.endswith("win32"):
self.platform_name = "win32" release = self.fetch_release_number()
self.exe_name %= ".exe" self.version_main = release.version[0]
if self.platform.endswith(("linux", "linux2")): self.version_full = release
self.platform_name = "linux64" self.unzip_package(self.fetch_package())
self.exe_name %= "" return self.patch()
if self.platform.endswith("darwin"):
if self.is_old_chromedriver: def patch(self):
self.platform_name = "mac64" self.patch_exe()
else: return self.is_binary_patched()
self.platform_name = "mac-x64"
self.exe_name %= "" def fetch_release_number(self):
if self.platform.startswith("freebsd"): """
self.platform_name = "freebsd" Gets the latest major version available, or the latest major version of self.target_version if set explicitly.
self.exe_name %= "" :return: version string
:rtype: LooseVersion
def auto(self, executable_path=None, force=False, version_main=None, _=None): """
""" path = "/latest_release"
if self.version_main:
Args: path += f"_{self.version_main}"
executable_path: path = path.upper()
force: logger.debug("getting release number from %s" % path)
version_main: return LooseVersion(urlopen(self.url_repo + path).read().decode())
Returns: def parse_exe_version(self):
with io.open(self.executable_path, "rb") as f:
""" for line in iter(lambda: f.readline(), b""):
p = pathlib.Path(self.data_path) match = re.search(rb"platform_handle\x00content\x00([0-9.]*)", line)
if self.user_multi_procs: if match:
with Lock(): return LooseVersion(match[1].decode())
files = list(p.rglob("*chromedriver*"))
most_recent = max(files, key=lambda f: f.stat().st_mtime) def fetch_package(self):
files.remove(most_recent) """
list(map(lambda f: f.unlink(), files)) Downloads ChromeDriver from source
if self.is_binary_patched(most_recent):
self.executable_path = str(most_recent) :return: path to downloaded file
return True """
u = "%s/%s/%s" % (self.url_repo, self.version_full.vstring, self.zip_name)
if executable_path: logger.debug("downloading from %s" % u)
self.executable_path = executable_path # return urlretrieve(u, filename=self.data_path)[0]
self._custom_exe_path = True return urlretrieve(u)[0]
if self._custom_exe_path: def unzip_package(self, fp):
ispatched = self.is_binary_patched(self.executable_path) """
if not ispatched: Does what it says
return self.patch_exe()
else: :return: path to unpacked executable
return """
logger.debug("unzipping %s" % fp)
if version_main: try:
self.version_main = version_main os.unlink(self.zip_path)
if force is True: except (FileNotFoundError, OSError):
self.force = force pass
os.makedirs(self.zip_path, mode=0o755, exist_ok=True)
if self.platform_name == "freebsd": with zipfile.ZipFile(fp, mode="r") as zf:
chromedriver_path = shutil.which("chromedriver") zf.extract(self.exe_name, self.zip_path)
os.rename(os.path.join(self.zip_path, self.exe_name), self.executable_path)
if not os.path.isfile(chromedriver_path) or not os.access(chromedriver_path, os.X_OK): os.remove(fp)
logging.error("Chromedriver not installed!") os.rmdir(self.zip_path)
return os.chmod(self.executable_path, 0o755)
return self.executable_path
version_path = os.path.join(os.path.dirname(self.executable_path), "version.txt")
@staticmethod
process = os.popen(f'"{chromedriver_path}" --version') def force_kill_instances(exe_name):
chromedriver_version = process.read().split(' ')[1].split(' ')[0] """
process.close() kills running instances.
:param: executable name to kill, may be a path as well
current_version = None
if os.path.isfile(version_path) or os.access(version_path, os.X_OK): :return: True on success else False
with open(version_path, 'r') as f: """
current_version = f.read() exe_name = os.path.basename(exe_name)
if IS_POSIX:
if current_version != chromedriver_version: r = os.system("kill -f -9 $(pidof %s)" % exe_name)
logging.info("Copying chromedriver executable...") else:
shutil.copy(chromedriver_path, self.executable_path) r = os.system("taskkill /f /im %s" % exe_name)
os.chmod(self.executable_path, 0o755) return not r
with open(version_path, 'w') as f: @staticmethod
f.write(chromedriver_version) def gen_random_cdc():
cdc = random.choices(string.ascii_lowercase, k=26)
logging.info("Chromedriver executable copied!") cdc[-6:-4] = map(str.upper, cdc[-6:-4])
else: cdc[2] = cdc[0]
try: cdc[3] = "_"
os.unlink(self.executable_path) return "".join(cdc).encode()
except PermissionError:
if self.force: def is_binary_patched(self, executable_path=None):
self.force_kill_instances(self.executable_path) """simple check if executable is patched.
return self.auto(force=not self.force)
try: :return: False if not patched, else True
if self.is_binary_patched(): """
# assumes already running AND patched executable_path = executable_path or self.executable_path
return True with io.open(executable_path, "rb") as fh:
except PermissionError: for line in iter(lambda: fh.readline(), b""):
pass if b"cdc_" in line:
# return False return False
except FileNotFoundError: else:
pass return True
release = self.fetch_release_number() def patch_exe(self):
self.version_main = release.major """
self.version_full = release Patches the ChromeDriver binary
self.unzip_package(self.fetch_package())
:return: False on failure, binary name on success
return self.patch() """
logger.info("patching driver executable %s" % self.executable_path)
def driver_binary_in_use(self, path: str = None) -> bool:
""" linect = 0
naive test to check if a found chromedriver binary is replacement = self.gen_random_cdc()
currently in use with io.open(self.executable_path, "r+b") as fh:
for line in iter(lambda: fh.readline(), b""):
Args: if b"cdc_" in line:
path: a string or PathLike object to the binary to check. fh.seek(-len(line), 1)
if not specified, we check use this object's executable_path newline = re.sub(b"cdc_.{22}", replacement, line)
""" fh.write(newline)
if not path: linect += 1
path = self.executable_path return linect
p = pathlib.Path(path)
def __repr__(self):
if not p.exists(): return "{0:s}({1:s})".format(
raise OSError("file does not exist: %s" % p) self.__class__.__name__,
try: self.executable_path,
with open(p, mode="a+b") as fs: )
exc = []
try: def __del__(self):
fs.seek(0, 0) if self._custom_exe_path:
except PermissionError as e: # if the driver binary is specified by user
exc.append(e) # since some systems apprently allow seeking # we assume it is important enough to not delete it
# we conduct another test return
try: else:
fs.readline() timeout = 3 # stop trying after this many seconds
except PermissionError as e: t = time.monotonic()
exc.append(e) while True:
now = time.monotonic()
if exc: if now - t > timeout:
# we don't want to wait until the end of time
return True logger.debug(
return False "could not unlink %s in time (%d seconds)"
# ok safe to assume this is in use % (self.executable_path, timeout)
except Exception as e: )
# logger.exception("whoops ", e) break
pass try:
os.unlink(self.executable_path)
def cleanup_unused_files(self): logger.debug("successfully unlinked %s" % self.executable_path)
p = pathlib.Path(self.data_path) break
items = list(p.glob("*undetected*")) except (OSError, RuntimeError, PermissionError):
for item in items: time.sleep(0.1)
try: continue
item.unlink() except FileNotFoundError:
except: break
pass
def patch(self):
self.patch_exe()
return self.is_binary_patched()
def fetch_release_number(self):
"""
Gets the latest major version available, or the latest major version of self.target_version if set explicitly.
:return: version string
:rtype: LooseVersion
"""
# Endpoint for old versions of Chromedriver (114 and below)
if self.is_old_chromedriver:
path = f"/latest_release_{self.version_main}"
path = path.upper()
logger.debug("getting release number from %s" % path)
return LooseVersion(urlopen(self.url_repo + path).read().decode())
# Endpoint for new versions of Chromedriver (115+)
if not self.version_main:
# Fetch the latest version
path = "/last-known-good-versions-with-downloads.json"
logger.debug("getting release number from %s" % path)
with urlopen(self.url_repo + path) as conn:
response = conn.read().decode()
last_versions = json.loads(response)
return LooseVersion(last_versions["channels"]["Stable"]["version"])
# Fetch the latest minor version of the major version provided
path = "/latest-versions-per-milestone-with-downloads.json"
logger.debug("getting release number from %s" % path)
with urlopen(self.url_repo + path) as conn:
response = conn.read().decode()
major_versions = json.loads(response)
return LooseVersion(major_versions["milestones"][str(self.version_main)]["version"])
def parse_exe_version(self):
with io.open(self.executable_path, "rb") as f:
for line in iter(lambda: f.readline(), b""):
match = re.search(rb"platform_handle\x00content\x00([0-9.]*)", line)
if match:
return LooseVersion(match[1].decode())
def fetch_package(self):
"""
Downloads ChromeDriver from source
:return: path to downloaded file
"""
zip_name = f"chromedriver_{self.platform_name}.zip"
if self.is_old_chromedriver:
download_url = "%s/%s/%s" % (self.url_repo, str(self.version_full), zip_name)
else:
zip_name = zip_name.replace("_", "-", 1)
download_url = "https://storage.googleapis.com/chrome-for-testing-public/%s/%s/%s"
download_url %= (str(self.version_full), self.platform_name, zip_name)
logger.debug("downloading from %s" % download_url)
return urlretrieve(download_url)[0]
def unzip_package(self, fp):
"""
Does what it says
:return: path to unpacked executable
"""
exe_path = self.exe_name
if not self.is_old_chromedriver:
# The new chromedriver unzips into its own folder
zip_name = f"chromedriver-{self.platform_name}"
exe_path = os.path.join(zip_name, self.exe_name)
logger.debug("unzipping %s" % fp)
try:
os.unlink(self.zip_path)
except (FileNotFoundError, OSError):
pass
os.makedirs(self.zip_path, mode=0o755, exist_ok=True)
with zipfile.ZipFile(fp, mode="r") as zf:
zf.extractall(self.zip_path)
os.rename(os.path.join(self.zip_path, exe_path), self.executable_path)
os.remove(fp)
shutil.rmtree
os.chmod(self.executable_path, 0o755)
return self.executable_path
@staticmethod
def force_kill_instances(exe_name):
"""
kills running instances.
:param: executable name to kill, may be a path as well
:return: True on success else False
"""
exe_name = os.path.basename(exe_name)
if IS_POSIX:
# Using shell=True for pidof, consider a more robust pid finding method if issues arise.
# pgrep can be an alternative: ["pgrep", "-f", exe_name]
# Or psutil if adding a dependency is acceptable.
command = f"pidof {exe_name}"
try:
result = subprocess.run(command, shell=True, capture_output=True, text=True, check=True)
pids = result.stdout.strip().split()
if pids:
subprocess.run(["kill", "-9"] + pids, check=False) # Changed from -f -9 to -9 as -f is not standard for kill
return True
return False # No PIDs found
except subprocess.CalledProcessError: # pidof returns 1 if no process found
return False # No process found
except Exception as e:
logger.debug(f"Error killing process on POSIX: {e}")
return False
else:
try:
# TASKKILL /F /IM chromedriver.exe
result = subprocess.run(["taskkill", "/f", "/im", exe_name], check=False, capture_output=True)
# taskkill returns 0 if process was killed, 128 if not found.
return result.returncode == 0
except Exception as e:
logger.debug(f"Error killing process on Windows: {e}")
return False
@staticmethod
def gen_random_cdc():
cdc = random.choices(string.ascii_letters, k=27)
return "".join(cdc).encode()
def is_binary_patched(self, executable_path=None):
executable_path = executable_path or self.executable_path
try:
with io.open(executable_path, "rb") as fh:
return fh.read().find(b"undetected chromedriver") != -1
except FileNotFoundError:
return False
def patch_exe(self):
start = time.perf_counter()
logger.info("patching driver executable %s" % self.executable_path)
with io.open(self.executable_path, "r+b") as fh:
content = fh.read()
# match_injected_codeblock = re.search(rb"{window.*;}", content)
match_injected_codeblock = re.search(rb"\{window\.cdc.*?;\}", content)
if match_injected_codeblock:
target_bytes = match_injected_codeblock[0]
new_target_bytes = (
b'{console.log("undetected chromedriver 1337!")}'.ljust(
len(target_bytes), b" "
)
)
new_content = content.replace(target_bytes, new_target_bytes)
if new_content == content:
logger.warning(
"something went wrong patching the driver binary. could not find injection code block"
)
else:
logger.debug(
"found block:\n%s\nreplacing with:\n%s"
% (target_bytes, new_target_bytes)
)
fh.seek(0)
fh.write(new_content)
logger.debug(
"patching took us {:.2f} seconds".format(time.perf_counter() - start)
)
def __repr__(self):
return "{0:s}({1:s})".format(
self.__class__.__name__,
self.executable_path,
)
def __del__(self):
if self._custom_exe_path:
# if the driver binary is specified by user
# we assume it is important enough to not delete it
return
else:
timeout = 3 # stop trying after this many seconds
t = time.monotonic()
now = lambda: time.monotonic()
while now() - t > timeout:
# we don't want to wait until the end of time
try:
if self.user_multi_procs:
break
os.unlink(self.executable_path)
logger.debug("successfully unlinked %s" % self.executable_path)
break
except (OSError, RuntimeError, PermissionError):
time.sleep(0.01)
continue
except FileNotFoundError:
break
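
One side of this diff patches the driver by replacing every `cdc_` marker plus the 22 bytes that follow with a random 26-byte token, so the binary keeps its exact size. A minimal in-memory sketch of that idea follows. `gen_random_cdc` is copied from the diff, while `patch_bytes` and the sample bytes are hypothetical stand-ins and not real chromedriver content.

```python
import random
import re
import string


def gen_random_cdc():
    """Build a random 26-byte token shaped like the original marker."""
    cdc = random.choices(string.ascii_lowercase, k=26)
    cdc[-6:-4] = map(str.upper, cdc[-6:-4])
    cdc[2] = cdc[0]
    cdc[3] = "_"
    return "".join(cdc).encode()


def patch_bytes(content: bytes) -> bytes:
    """Replace each 'cdc_' + 22 bytes with a same-length random token.

    Hypothetical helper for illustration; the code in the diff does this
    line by line on the file opened in "r+b" mode.
    """
    replacement = gen_random_cdc()
    return re.sub(rb"cdc_.{22}", replacement, content)


# Made-up sample: a marker of 4 + 22 bytes surrounded by other bytes.
sample = b"head;" + b"cdc_" + b"x" * 22 + b";tail"
patched = patch_bytes(sample)

print(len(sample), len(patched))
print(patched)
```

Because the replacement has the same length as the match, offsets elsewhere in the file are unchanged, which is what makes overwriting in place safe.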

@@ -1,99 +1,102 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
# this module is part of undetected_chromedriver # this module is part of undetected_chromedriver
import asyncio import asyncio
import json import json
import logging import logging
import threading import threading
logger = logging.getLogger(__name__)
logger = logging.getLogger(__name__)
class Reactor(threading.Thread):
class Reactor(threading.Thread): def __init__(self, driver: "Chrome"):
def __init__(self, driver: "Chrome"): super().__init__()
super().__init__()
self.driver = driver
self.driver = driver self.loop = asyncio.new_event_loop()
self.loop = asyncio.new_event_loop()
self.lock = threading.Lock()
self.lock = threading.Lock() self.event = threading.Event()
self.event = threading.Event() self.daemon = True
self.daemon = True self.handlers = {}
self.handlers = {}
def add_event_handler(self, method_name, callback: callable):
def add_event_handler(self, method_name, callback: callable): """
"""
Parameters
Parameters ----------
---------- event_name: str
event_name: str example "Network.responseReceived"
example "Network.responseReceived"
callback: callable
callback: callable callable which accepts 1 parameter: the message object dictionary
callable which accepts 1 parameter: the message object dictionary
Returns
Returns -------
-------
"""
""" with self.lock:
with self.lock: self.handlers[method_name.lower()] = callback
self.handlers[method_name.lower()] = callback
@property
@property def running(self):
def running(self): return not self.event.is_set()
return not self.event.is_set()
def run(self):
def run(self): try:
try: asyncio.set_event_loop(self.loop)
asyncio.set_event_loop(self.loop) self.loop.run_until_complete(self.listen())
self.loop.run_until_complete(self.listen()) except Exception as e:
except Exception as e: logger.warning("Reactor.run() => %s", e)
logger.warning("Reactor.run() => %s", e)
async def _wait_service_started(self):
async def _wait_service_started(self): while True:
while True: with self.lock:
with self.lock: if (
if ( getattr(self.driver, "service", None)
getattr(self.driver, "service", None) and getattr(self.driver.service, "process", None)
and getattr(self.driver.service, "process", None) and self.driver.service.process.poll()
and self.driver.service.process.poll() ):
): await asyncio.sleep(self.driver._delay or 0.25)
await asyncio.sleep(self.driver._delay or 0.25) else:
else: break
break
async def listen(self):
async def listen(self):
while self.running: while self.running:
await self._wait_service_started()
await asyncio.sleep(1) await self._wait_service_started()
await asyncio.sleep(1)
try:
with self.lock: try:
log_entries = self.driver.get_log("performance") with self.lock:
log_entries = self.driver.get_log("performance")
for entry in log_entries:
try: for entry in log_entries:
obj_serialized: str = entry.get("message")
obj = json.loads(obj_serialized) try:
message = obj.get("message")
method = message.get("method") obj_serialized: str = entry.get("message")
obj = json.loads(obj_serialized)
if "*" in self.handlers: message = obj.get("message")
await self.loop.run_in_executor( method = message.get("method")
None, self.handlers["*"], message
) if "*" in self.handlers:
elif method.lower() in self.handlers: await self.loop.run_in_executor(
await self.loop.run_in_executor( None, self.handlers["*"], message
None, self.handlers[method.lower()], message )
) elif method.lower() in self.handlers:
await self.loop.run_in_executor(
# print(type(message), message) None, self.handlers[method.lower()], message
except Exception as e: )
raise e from None
# print(type(message), message)
except Exception as e: except Exception as e:
if "invalid session id" in str(e): raise e from None
pass
else: except Exception as e:
logging.debug("exception ignored :", e) if "invalid session id" in str(e):
pass
else:
logging.debug("exception ignored :", e)
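
The dispatch rule in `Reactor.listen` can be hard to read in the side-by-side view: handlers are stored under lower-cased method names, and a `"*"` handler, if registered, takes precedence over any method-specific handler because of the `if`/`elif`. The sketch below reproduces only that rule, synchronously and without Selenium. The `dispatch` function and the sample log entry are assumptions made for illustration.

```python
import json

handlers = {}


def add_event_handler(method_name, callback):
    """Register a callback under the lower-cased method name."""
    handlers[method_name.lower()] = callback


def dispatch(entry):
    """Route one performance-log entry the way Reactor.listen does.

    Hypothetical synchronous stand-in: the real code awaits
    loop.run_in_executor instead of calling the handler directly.
    """
    message = json.loads(entry["message"])["message"]
    method = message.get("method")
    if "*" in handlers:
        handlers["*"](message)
    elif method.lower() in handlers:
        handlers[method.lower()](message)


# Made-up entry shaped like an item from driver.get_log("performance").
entry = {
    "message": json.dumps(
        {"message": {"method": "Network.responseReceived", "params": {}}}
    )
}

received = []
add_event_handler("Network.responseReceived", received.append)
dispatch(entry)
print(received)

# Once a wildcard handler exists, the specific handler is no longer called.
wildcard = []
add_event_handler("*", wildcard.append)
dispatch(entry)
print(len(received), len(wildcard))
```

Registering `"*"` therefore behaves as a catch-all that shadows the other handlers, rather than running in addition to them.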

@@ -0,0 +1,4 @@
# for backward compatibility
import sys
sys.modules[__name__] = sys.modules[__package__]

@@ -1,86 +1,37 @@
from typing import List import selenium.webdriver.remote.webelement
from selenium.webdriver.common.by import By
import selenium.webdriver.remote.webelement class WebElement(selenium.webdriver.remote.webelement.WebElement):
"""
Custom WebElement class which makes it easier to view elements when
class WebElement(selenium.webdriver.remote.webelement.WebElement): working in an interactive environment.
def click_safe(self):
super().click() standard webelement repr:
self._parent.reconnect(0.1) <selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")>
def children( using this WebElement class:
self, tag=None, recursive=False <WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)>
) -> List[selenium.webdriver.remote.webelement.WebElement]:
""" """
returns direct child elements of current element
:param tag: str, if supplied, returns <tag> nodes only @property
""" def attrs(self):
script = "return [... arguments[0].children]" if not hasattr(self, "_attrs"):
if tag: self._attrs = self._parent.execute_script(
script += ".filter( node => node.tagName === '%s')" % tag.upper() """
if recursive: var items = {};
return list(_recursive_children(self, tag)) for (index = 0; index < arguments[0].attributes.length; ++index)
return list(self._parent.execute_script(script, self)) {
items[arguments[0].attributes[index].name] = arguments[0].attributes[index].value
};
class UCWebElement(WebElement): return items;
""" """,
Custom WebElement class which makes it easier to view elements when self,
working in an interactive environment. )
return self._attrs
standard webelement repr:
<selenium.webdriver.remote.webelement.WebElement (session="85ff0f671512fa535630e71ee951b1f2", element="6357cb55-92c3-4c0f-9416-b174f9c1b8c4")> def __repr__(self):
strattrs = " ".join([f'{k}="{v}"' for k, v in self.attrs.items()])
using this WebElement class: if strattrs:
<WebElement(<a class="mobile-show-inline-block mc-update-infos init-ok" href="#" id="main-cat-switcher-mobile">)> strattrs = " " + strattrs
return f"{self.__class__.__name__} <{self.tag_name}{strattrs}>"
"""
def __init__(self, parent, id_):
super().__init__(parent, id_)
self._attrs = None
@property
def attrs(self):
if not self._attrs:
self._attrs = self._parent.execute_script(
"""
var items = {};
for (index = 0; index < arguments[0].attributes.length; ++index)
{
items[arguments[0].attributes[index].name] = arguments[0].attributes[index].value
};
return items;
""",
self,
)
return self._attrs
def __repr__(self):
strattrs = " ".join([f'{k}="{v}"' for k, v in self.attrs.items()])
if strattrs:
strattrs = " " + strattrs
return f"{self.__class__.__name__} <{self.tag_name}{strattrs}>"
def _recursive_children(element, tag: str = None, _results=None):
"""
returns all children of <element> recursively
:param element: `WebElement` object.
find children below this <element>
:param tag: str = None.
if provided, return only <tag> elements. example: 'a', or 'img'
:param _results: do not use!
"""
results = _results or set()
for element in element.children():
if tag:
if element.tag_name == tag:
results.add(element)
else:
results.add(element)
results |= _recursive_children(element, tag, results)
return results

@@ -1,19 +1,13 @@
import json import json
import logging import logging
import os import os
import platform
import re import re
import shutil import shutil
import sys
import tempfile
import urllib.parse
from selenium.webdriver.chrome.webdriver import WebDriver from selenium.webdriver.chrome.webdriver import WebDriver
import undetected_chromedriver as uc import undetected_chromedriver as uc
FLARESOLVERR_VERSION = None FLARESOLVERR_VERSION = None
PLATFORM_VERSION = None
CHROME_EXE_PATH = None
CHROME_MAJOR_VERSION = None CHROME_MAJOR_VERSION = None
USER_AGENT = None USER_AGENT = None
XVFB_DISPLAY = None XVFB_DISPLAY = None
@@ -34,133 +28,24 @@ def get_flaresolverr_version() -> str:
return FLARESOLVERR_VERSION return FLARESOLVERR_VERSION
package_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'package.json') package_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'package.json')
if not os.path.isfile(package_path):
package_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'package.json')
with open(package_path) as f: with open(package_path) as f:
FLARESOLVERR_VERSION = json.loads(f.read())['version'] FLARESOLVERR_VERSION = json.loads(f.read())['version']
return FLARESOLVERR_VERSION return FLARESOLVERR_VERSION
def get_current_platform() -> str:
global PLATFORM_VERSION
if PLATFORM_VERSION is not None:
return PLATFORM_VERSION
PLATFORM_VERSION = os.name
return PLATFORM_VERSION
def get_webdriver() -> WebDriver:
def create_proxy_extension(proxy: dict) -> str: global PATCHED_DRIVER_PATH
parsed_url = urllib.parse.urlparse(proxy['url'])
scheme = parsed_url.scheme
host = parsed_url.hostname
port = parsed_url.port
username = proxy['username']
password = proxy['password']
manifest_json = """
{
"version": "1.0.0",
"manifest_version": 2,
"name": "Chrome Proxy",
"permissions": [
"proxy",
"tabs",
"unlimitedStorage",
"storage",
"<all_urls>",
"webRequest",
"webRequestBlocking"
],
"background": {"scripts": ["background.js"]},
"minimum_chrome_version": "76.0.0"
}
"""
background_js = """
var config = {
mode: "fixed_servers",
rules: {
singleProxy: {
scheme: "%s",
host: "%s",
port: %d
},
bypassList: ["localhost"]
}
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
function callbackFn(details) {
return {
authCredentials: {
username: "%s",
password: "%s"
}
};
}
chrome.webRequest.onAuthRequired.addListener(
callbackFn,
{ urls: ["<all_urls>"] },
['blocking']
);
""" % (
scheme,
host,
port,
username,
password
)
proxy_extension_dir = tempfile.mkdtemp()
with open(os.path.join(proxy_extension_dir, "manifest.json"), "w") as f:
f.write(manifest_json)
with open(os.path.join(proxy_extension_dir, "background.js"), "w") as f:
f.write(background_js)
return proxy_extension_dir
def get_webdriver(proxy: dict = None) -> WebDriver:
global PATCHED_DRIVER_PATH, USER_AGENT
logging.debug('Launching web browser...') logging.debug('Launching web browser...')
# undetected_chromedriver # undetected_chromedriver
options = uc.ChromeOptions() options = uc.ChromeOptions()
options.add_argument('--no-sandbox') options.add_argument('--no-sandbox')
options.add_argument('--window-size=1920,1080') options.add_argument('--window-size=1920,1080')
options.add_argument('--disable-search-engine-choice-screen')
# todo: this param shows a warning in chrome head-full # todo: this param shows a warning in chrome head-full
options.add_argument('--disable-setuid-sandbox') options.add_argument('--disable-setuid-sandbox')
options.add_argument('--disable-dev-shm-usage') options.add_argument('--disable-dev-shm-usage')
# this option removes the zygote sandbox (it seems that the resolution is a bit faster)
options.add_argument('--no-zygote')
# attempt to fix Docker ARM32 build
IS_ARMARCH = platform.machine().startswith(('arm', 'aarch'))
if IS_ARMARCH:
options.add_argument('--disable-gpu-sandbox')
options.add_argument('--ignore-certificate-errors')
options.add_argument('--ignore-ssl-errors')
language = os.environ.get('LANG', None) # note: headless mode is detected (options.headless = True)
if language is not None:
options.add_argument('--accept-lang=%s' % language)
# Fix for Chrome 117 | https://github.com/FlareSolverr/FlareSolverr/issues/910
if USER_AGENT is not None:
options.add_argument('--user-agent=%s' % USER_AGENT)
proxy_extension_dir = None
if proxy and all(key in proxy for key in ['url', 'username', 'password']):
proxy_extension_dir = create_proxy_extension(proxy)
options.add_argument("--load-extension=%s" % os.path.abspath(proxy_extension_dir))
elif proxy and 'url' in proxy:
proxy_url = proxy['url']
logging.debug("Using webdriver proxy: %s", proxy_url)
options.add_argument('--proxy-server=%s' % proxy_url)
# note: headless mode is detected (headless = True)
# we launch the browser in head-full mode with the window hidden # we launch the browser in head-full mode with the window hidden
windows_headless = False windows_headless = False
if get_config_headless(): if get_config_headless():
@@ -168,8 +53,6 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
windows_headless = True windows_headless = True
else: else:
start_xvfb_display() start_xvfb_display()
# For normal headless mode:
# options.add_argument('--headless')
# if we are inside the Docker container, we avoid downloading the driver # if we are inside the Docker container, we avoid downloading the driver
driver_exe_path = None driver_exe_path = None
@@ -182,29 +65,15 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
if PATCHED_DRIVER_PATH is not None: if PATCHED_DRIVER_PATH is not None:
driver_exe_path = PATCHED_DRIVER_PATH driver_exe_path = PATCHED_DRIVER_PATH
# detect chrome path
browser_executable_path = get_chrome_exe_path()
# downloads and patches the chromedriver # downloads and patches the chromedriver
# if we don't set driver_executable_path it downloads, patches, and deletes the driver each time # if we don't set driver_executable_path it downloads, patches, and deletes the driver each time
try: driver = uc.Chrome(options=options, driver_executable_path=driver_exe_path, version_main=version_main,
driver = uc.Chrome(options=options, browser_executable_path=browser_executable_path, windows_headless=windows_headless)
driver_executable_path=driver_exe_path, version_main=version_main,
windows_headless=windows_headless, headless=get_config_headless())
except Exception as e:
logging.error("Error starting Chrome: %s" % e)
# No point in continuing if we cannot retrieve the driver
raise e
# save the patched driver to avoid re-downloads # save the patched driver to avoid re-downloads
if driver_exe_path is None: if driver_exe_path is None:
PATCHED_DRIVER_PATH = os.path.join(driver.patcher.data_path, driver.patcher.exe_name) PATCHED_DRIVER_PATH = os.path.join(driver.patcher.data_path, driver.patcher.exe_name)
if PATCHED_DRIVER_PATH != driver.patcher.executable_path: shutil.copy(driver.patcher.executable_path, PATCHED_DRIVER_PATH)
shutil.copy(driver.patcher.executable_path, PATCHED_DRIVER_PATH)
# clean up proxy extension directory
if proxy_extension_dir is not None:
shutil.rmtree(proxy_extension_dir)
# selenium vanilla # selenium vanilla
# options = webdriver.ChromeOptions() # options = webdriver.ChromeOptions()
@@ -217,45 +86,23 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
return driver return driver
def get_chrome_exe_path() -> str:
global CHROME_EXE_PATH
if CHROME_EXE_PATH is not None:
return CHROME_EXE_PATH
# linux pyinstaller bundle
chrome_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'chrome', "chrome")
if os.path.exists(chrome_path):
if not os.access(chrome_path, os.X_OK):
raise Exception(f'Chrome binary "{chrome_path}" is not executable. '
f'Please, extract the archive with "tar xzf <file.tar.gz>".')
CHROME_EXE_PATH = chrome_path
return CHROME_EXE_PATH
# windows pyinstaller bundle
chrome_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'chrome', "chrome.exe")
if os.path.exists(chrome_path):
CHROME_EXE_PATH = chrome_path
return CHROME_EXE_PATH
# system
CHROME_EXE_PATH = uc.find_chrome_executable()
return CHROME_EXE_PATH
def get_chrome_major_version() -> str: def get_chrome_major_version() -> str:
global CHROME_MAJOR_VERSION global CHROME_MAJOR_VERSION
if CHROME_MAJOR_VERSION is not None: if CHROME_MAJOR_VERSION is not None:
return CHROME_MAJOR_VERSION return CHROME_MAJOR_VERSION
if os.name == 'nt': if os.name == 'nt':
# Example: '104.0.5112.79'
try: try:
complete_version = extract_version_nt_executable(get_chrome_exe_path()) stream = os.popen(
'reg query "HKLM\\SOFTWARE\\Wow6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\Google Chrome"')
output = stream.read()
# Example: '104.0.5112.79'
complete_version = extract_version_registry(output)
except Exception: except Exception:
try: # Example: '104.0.5112.79'
complete_version = extract_version_nt_registry() complete_version = extract_version_folder()
except Exception:
# Example: '104.0.5112.79'
complete_version = extract_version_nt_folder()
else: else:
chrome_path = get_chrome_exe_path() chrome_path = uc.find_chrome_executable()
process = os.popen(f'"{chrome_path}" --version') process = os.popen(f'"{chrome_path}" --version')
# Example 1: 'Chromium 104.0.5112.79 Arch Linux\n' # Example 1: 'Chromium 104.0.5112.79 Arch Linux\n'
# Example 2: 'Google Chrome 104.0.5112.79 Arch Linux\n' # Example 2: 'Google Chrome 104.0.5112.79 Arch Linux\n'
@@ -263,32 +110,24 @@ def get_chrome_major_version() -> str:
process.close() process.close()
CHROME_MAJOR_VERSION = complete_version.split('.')[0].split(' ')[-1] CHROME_MAJOR_VERSION = complete_version.split('.')[0].split(' ')[-1]
logging.info(f"Chrome major version: {CHROME_MAJOR_VERSION}")
return CHROME_MAJOR_VERSION return CHROME_MAJOR_VERSION
def extract_version_nt_executable(exe_path: str) -> str: def extract_version_registry(output) -> str:
import pefile try:
pe = pefile.PE(exe_path, fast_load=True) google_version = ''
pe.parse_data_directories( for letter in output[output.rindex('DisplayVersion REG_SZ') + 24:]:
directories=[pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_RESOURCE"]] if letter != '\n':
) google_version += letter
return pe.FileInfo[0][0].StringTable[0].entries[b"FileVersion"].decode('utf-8') else:
break
return google_version.strip()
except TypeError:
return ''
def extract_version_nt_registry() -> str: def extract_version_folder() -> str:
stream = os.popen(
'reg query "HKLM\\SOFTWARE\\Wow6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall\\Google Chrome"')
output = stream.read()
google_version = ''
for letter in output[output.rindex('DisplayVersion REG_SZ') + 24:]:
if letter != '\n':
google_version += letter
else:
break
return google_version.strip()
def extract_version_nt_folder() -> str:
# Check if the Chrome folder exists in the x32 or x64 Program Files folders. # Check if the Chrome folder exists in the x32 or x64 Program Files folders.
for i in range(2): for i in range(2):
path = 'C:\\Program Files' + (' (x86)' if i else '') + '\\Google\\Chrome\\Application' path = 'C:\\Program Files' + (' (x86)' if i else '') + '\\Google\\Chrome\\Application'
@@ -296,7 +135,7 @@ def extract_version_nt_folder() -> str:
paths = [f.path for f in os.scandir(path) if f.is_dir()] paths = [f.path for f in os.scandir(path) if f.is_dir()]
for path in paths: for path in paths:
filename = os.path.basename(path) filename = os.path.basename(path)
pattern = r'\d+\.\d+\.\d+\.\d+' pattern = '\d+\.\d+\.\d+\.\d+'
match = re.search(pattern, filename) match = re.search(pattern, filename)
if match and match.group(): if match and match.group():
# Found a Chrome version. # Found a Chrome version.
@@ -313,15 +152,11 @@ def get_user_agent(driver=None) -> str:
if driver is None: if driver is None:
driver = get_webdriver() driver = get_webdriver()
USER_AGENT = driver.execute_script("return navigator.userAgent") USER_AGENT = driver.execute_script("return navigator.userAgent")
# Fix for Chrome 117 | https://github.com/FlareSolverr/FlareSolverr/issues/910
USER_AGENT = re.sub('HEADLESS', '', USER_AGENT, flags=re.IGNORECASE)
return USER_AGENT return USER_AGENT
except Exception as e: except Exception as e:
raise Exception("Error getting browser User-Agent. " + str(e)) raise Exception("Error getting browser User-Agent. " + str(e))
finally: finally:
if driver is not None: if driver is not None:
if PLATFORM_VERSION == "nt":
driver.close()
driver.quit() driver.quit()
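
The version-handling lines in this diff are compact, so a standalone sketch may help when reviewing them. It reproduces three expressions from the code above: the `split('.')[0].split(' ')[-1]` major-version extraction, the `\d+\.\d+\.\d+\.\d+` folder-name match, and the case-insensitive `HEADLESS` removal from the User-Agent. The function names (`parse_major_version`, `find_version_in_name`, `strip_headless`) are hypothetical and do not exist in the project; the sample strings are taken from the code comments or made up for illustration.

```python
import re
from typing import Optional


def parse_major_version(complete_version: str) -> str:
    """Hypothetical helper: same expression the diff uses for CHROME_MAJOR_VERSION."""
    # Text before the first dot, then the last space-separated token of it.
    return complete_version.split('.')[0].split(' ')[-1]


def find_version_in_name(name: str) -> Optional[str]:
    """Hypothetical helper: same four-part version pattern as the folder scan."""
    match = re.search(r'\d+\.\d+\.\d+\.\d+', name)
    return match.group() if match else None


def strip_headless(user_agent: str) -> str:
    """Hypothetical helper: same substitution as the Chrome 117 User-Agent fix."""
    return re.sub('HEADLESS', '', user_agent, flags=re.IGNORECASE)


if __name__ == '__main__':
    print(parse_major_version('Chromium 104.0.5112.79 Arch Linux\n'))
    print(find_version_in_name('104.0.5112.79'))
    print(strip_headless('Mozilla/5.0 HeadlessChrome/117.0.0.0 Safari/537.36'))
```

The raw-string form of the pattern (`r'...'`) is used here because `'\d'` in a plain string literal is an invalid escape sequence that newer Python versions warn about.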

@@ -1 +1 @@
WebTest==3.0.6 WebTest==3.0.0