Compare commits

21 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| ngosang | 36226b34c1 | Fix Dockerfile for linux/386 architecture | 2022-09-24 20:33:33 +02:00 |
| ngosang | 606d84f7c0 | Install undetected_chromedriver dependencies | 2022-09-24 20:04:32 +02:00 |
| ngosang | 62eb363575 | Reuse patched chromedriver | 2022-09-24 19:54:42 +02:00 |
| ngosang | 345d27dd5a | Fix Chrome version detection on Windows | 2022-09-24 19:16:02 +02:00 |
| ngosang | 3b9fd0aa6a | Add browser headless mode for Windows | 2022-09-24 18:42:58 +02:00 |
| ngosang | 93041779fb | Fork undetected-chromedriver 3.1.5.post4 | 2022-09-24 18:35:01 +02:00 |
| ngosang | 3dbb4e65d6 | Reduce Docker image size | 2022-09-24 18:29:44 +02:00 |
| ngosang | 23dd8f8725 | Update readme | 2022-09-24 16:18:57 +02:00 |
| ngosang | 9ab7ab1371 | Add browser headless mode for Linux | 2022-09-24 16:18:36 +02:00 |
| ngosang | cf7e4f8749 | Add tests for several known sites | 2022-09-24 15:48:01 +02:00 |
| ngosang | e8328adb90 | Show ReqId only in Debug traces | 2022-09-24 15:47:33 +02:00 |
| ngosang | 843f588859 | Detect Cloudflare Access Denied | 2022-09-24 15:40:52 +02:00 |
| ngosang | f8462c86f2 | Bump version to 3.0.0.beta2 | 2022-09-24 15:24:05 +02:00 |
| ngosang | 4bc083896b | Update readme | 2022-09-23 02:18:59 +02:00 |
| ngosang | c9f2d6e954 | Add Docker image and Docker compose | 2022-09-23 02:18:48 +02:00 |
| ngosang | 177578d5d8 | Rewrite FlareSolverr from scratch in Python + Selenium | 2022-09-23 02:17:50 +02:00 |
| ngosang | efcab83f6e | Update package.json | 2022-09-22 23:37:31 +02:00 |
| ngosang | 51b7bc3b92 | Update license, remove FlareSolverr v1 / v2 authors | 2022-09-22 21:11:40 +02:00 |
| ngosang | e5be265026 | Prepare .gitignore for Python project | 2022-09-22 21:08:45 +02:00 |
| ngosang | aed54e0bb3 | Disable autotag Github Action | 2022-09-22 21:08:22 +02:00 |
| ngosang | 5046f60914 | Prepare for version 3.0, remove JS code | 2022-09-22 20:35:03 +02:00 |
22 changed files with 1834 additions and 2339 deletions

32
.github/ISSUE_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,32 @@
**Please use the search bar** at the top of the page and make sure you are not creating an already submitted issue.
Check closed issues as well, because your issue may have already been fixed.
### How to enable debug and html traces
[Follow the instructions from this wiki page](https://github.com/FlareSolverr/FlareSolverr/wiki/How-to-enable-debug-and-html-trace)
### Environment
* **FlareSolverr version**:
* **Last working FlareSolverr version**:
* **Operating system**:
* **Are you using Docker**: [yes/no]
* **FlareSolverr User-Agent (see log traces or / endpoint)**:
* **Are you using a proxy or VPN?** [yes/no]
* **Are you using Captcha Solver:** [yes/no]
* **If using captcha solver, which one:**
* **URL to test this issue:**
### Description
[List steps to reproduce the error and details on what happens and what you expected to happen]
### Logged Error Messages
[Place any relevant error messages you noticed from the logs here.]
[Make sure you attach the full logs with your personal information removed in case we need more information]
### Screenshots
[Place any screenshots of the issue here if needed]

View File

@@ -1,63 +0,0 @@
name: Bug report
description: Create a report of your issue
body:
- type: checkboxes
attributes:
label: Have you checked our README?
description: Please check the <a href="https://github.com/FlareSolverr/FlareSolverr/blob/master/README.md">README</a>.
options:
- label: I have checked the README
required: true
- type: checkboxes
attributes:
label: Is there already an issue for your problem?
description: Please make sure you are not creating an already submitted <a href="https://github.com/FlareSolverr/FlareSolverr/issues">Issue</a>. Check closed issues as well, because your issue may have already been fixed.
options:
- label: I have checked older issues, open and closed
required: true
- type: checkboxes
attributes:
label: Have you checked the discussions?
description: Please read our <a href="https://github.com/FlareSolverr/FlareSolverr/discussions">Discussions</a> before submitting your issue, some wider problems may be dealt with there.
options:
- label: I have read the Discussions
required: true
- type: textarea
attributes:
label: Environment
description: Please provide the details of the system FlareSolverr is running on.
value: |
- FlareSolverr version:
- Last working FlareSolverr version:
- Operating system:
- Are you using Docker: [yes/no]
- FlareSolverr User-Agent (see log traces or / endpoint):
- Are you using a proxy or VPN: [yes/no]
- Are you using Captcha Solver: [yes/no]
- If using captcha solver, which one:
- URL to test this issue:
render: markdown
validations:
required: true
- type: textarea
attributes:
label: Description
description: List steps to reproduce the error and details on what happens and what you expected to happen.
validations:
required: true
- type: textarea
attributes:
label: Logged Error Messages
description: |
Place any relevant error messages you noticed from the logs here.
Make sure you attach the full logs with your personal information removed in case we need more information.
If you wish to provide debug logs, follow the instructions from this <a href="https://github.com/FlareSolverr/FlareSolverr/wiki/How-to-enable-debug-and-html-trace">wiki page</a>.
render: text
validations:
required: true
- type: textarea
attributes:
label: Screenshots
description: Place any screenshots of the issue here if needed
validations:
required: false

View File

@@ -1,8 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: Requesting new features or changes
url: https://github.com/FlareSolverr/FlareSolverr/discussions
about: Please create a new discussion topic, grouped under "Ideas".
- name: Asking questions
url: https://github.com/FlareSolverr/FlareSolverr/discussions
about: Please create a new discussion topic, grouped under "Q&A".

View File

@@ -1,20 +1,21 @@
name: autotag
on:
push:
branches:
- "master"
jobs:
build:
runs-on: ubuntu-latest
steps:
-
name: Checkout
uses: actions/checkout@v3
-
name: Auto Tag
uses: Klemensas/action-autotag@stable
with:
GITHUB_TOKEN: "${{ secrets.GH_PAT }}"
tag_prefix: "v"
# todo: enable in the first release
#name: autotag
#
#on:
# push:
# branches:
# - "master"
#
#jobs:
# build:
# runs-on: ubuntu-latest
# steps:
# -
# name: Checkout
# uses: actions/checkout@v2
# -
# name: Auto Tag
# uses: Klemensas/action-autotag@stable
# with:
# GITHUB_TOKEN: "${{ secrets.GH_PAT }}"
# tag_prefix: "v"

View File

@@ -11,43 +11,43 @@ jobs:
steps:
-
name: Checkout
uses: actions/checkout@v3
uses: actions/checkout@v2
-
name: Downcase repo
run: echo REPOSITORY=$(echo ${{ github.repository }} | tr '[:upper:]' '[:lower:]') >> $GITHUB_ENV
-
name: Docker meta
id: docker_meta
uses: crazy-max/ghaction-docker-meta@v3
uses: crazy-max/ghaction-docker-meta@v1
with:
images: ${{ env.REPOSITORY }},ghcr.io/${{ env.REPOSITORY }}
tag-sha: false
-
name: Set up QEMU
uses: docker/setup-qemu-action@v2
uses: docker/setup-qemu-action@v1.0.1
-
name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v1
-
name: Login to DockerHub
uses: docker/login-action@v2
uses: docker/login-action@v1
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
-
name: Login to GitHub Container Registry
uses: docker/login-action@v2
uses: docker/login-action@v1
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GH_PAT }}
-
name: Build and push
uses: docker/build-push-action@v3
uses: docker/build-push-action@v2
with:
context: .
file: ./Dockerfile
platforms: linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8
platforms: linux/amd64,linux/arm/v7,linux/arm64
push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.docker_meta.outputs.tags }}
labels: ${{ steps.docker_meta.outputs.labels }}

View File

@@ -11,12 +11,12 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v3
uses: actions/checkout@v2
with:
fetch-depth: 0 # get all commits, branches and tags (required for the changelog)
- name: Setup Node
uses: actions/setup-node@v3
uses: actions/setup-node@v2
with:
node-version: '16'

View File

@@ -1,253 +0,0 @@
# Changelog
## v3.1.0 (upcoming)
* Kill Chromium processes properly to avoid defunct/zombie processes
* Update undetected-chromedriver
* Disable Zygote sandbox in Chromium browser
* Add more selectors to detect blocked access
* Include procps (ps), curl and vim packages in the Docker image
## v3.0.0 (2023/01/04)
* This is the first release of FlareSolverr v3. There are some breaking changes
* Docker images for linux/386, linux/amd64, linux/arm/v7 and linux/arm64/v8
* Replaced Firefox with Chrome
* Replaced NodeJS / Typescript with Python
* Replaced Puppeter with Selenium
* No binaries for Linux / Windows. You have to use the Docker image or install from Source code
* No proxy support
* No session support
## v2.2.10 (2022/10/22)
* Detect DDoS-Guard through title content
## v2.2.9 (2022/09/25)
* Detect Cloudflare Access Denied
* Commit the complete changelog
## v2.2.8 (2022/09/17)
* Remove 30 s delay and clean legacy code
## v2.2.7 (2022/09/12)
* Temporary fix: add 30s delay
* Update README.md
## v2.2.6 (2022/07/31)
* Fix Cloudflare detection in POST requests
## v2.2.5 (2022/07/30)
* Update GitHub actions to build executables with NodeJs 16
* Update Cloudflare selectors and add HTML samples
* Install Firefox 94 instead of the latest Nightly
* Update dependencies
* Upgrade Puppeteer (#396)
## v2.2.4 (2022/04/17)
* Detect DDoS-Guard challenge
## v2.2.3 (2022/04/16)
* Fix 2000 ms navigation timeout
* Update README.md (libseccomp2 package in Debian)
* Update README.md (clarify proxy parameter) (#307)
* Update NPM dependencies
* Disable Cloudflare ban detection
## v2.2.2 (2022/03/19)
* Fix ban detection. Resolves #330 (#336)
## v2.2.1 (2022/02/06)
* Fix max timeout error in some pages
* Avoid crashing in NodeJS 17 due to Unhandled promise rejection
* Improve proxy validation and debug traces
* Remove @types/puppeteer dependency
## v2.2.0 (2022/01/31)
* Increase default BROWSER_TIMEOUT=40000 (40 seconds)
* Fix Puppeter deprecation warnings
* Update base Docker image Alpine 3.15 / NodeJS 16
* Build precompiled binaries with NodeJS 16
* Update Puppeter and other dependencies
* Add support for Custom CloudFlare challenge
* Add support for DDoS-GUARD challenge
## v2.1.0 (2021/12/12)
* Add aarch64 to user agents to be replaced (#248)
* Fix SOCKSv4 and SOCKSv5 proxy. resolves #214 #220
* Remove redundant JSON key (postData) (#242)
* Make test URL configurable with TEST_URL env var. resolves #240
* Bypass new Cloudflare protection
* Update donation links
## v2.0.2 (2021/10/31)
* Fix SOCKS5 proxy. Resolves #214
* Replace Firefox ERS with a newer version
* Catch startup exceptions and give some advices
* Add env var BROWSER_TIMEOUT for slow systems
* Fix NPM warning in Docker images
## v2.0.1 (2021/10/24)
* Check user home dir before testing web browser installation
## v2.0.0 (2021/10/20)
FlareSolverr 2.0.0 is out with some important changes:
* It is capable of solving the automatic challenges of Cloudflare. CAPTCHAs (hCaptcha) cannot be resolved and the old solvers have been removed.
* The Chrome browser has been replaced by Firefox. This has caused some functionality to be removed. Parameters: `userAgent`, `headers`, `rawHtml` and `downloadare` no longer available.
* Included `proxy` support without user/password credentials. If you are writing your own integration with FlareSolverr, make sure your client uses the same User-Agent header and Proxy that FlareSolverr uses. Those values together with the Cookie are checked and detected by Cloudflare.
* FlareSolverr has been rewritten from scratch. From now on it should be easier to maintain and test.
* If you are using Jackett make sure you have version v0.18.1041 or higher. FlareSolverSharp v2.0.0 is out too.
Complete changelog:
* Bump version 2.0.0
* Set puppeteer timeout half of maxTimeout param. Resolves #180
* Add test for blocked IP
* Avoid reloading the page in case of error
* Improve Cloudflare detection
* Fix version
* Fix browser preferences and proxy
* Fix request.post method and clean error traces
* Use Firefox ESR for Docker images
* Improve Firefox start time and code clean up
* Improve bad request management and tests
* Build native packages with Firefox
* Update readme
* Improve Docker image and clean TODOs
* Add proxy support
* Implement request.post method for Firefox
* Code clean up, remove returnRawHtml, download, headers params
* Remove outdated chaptcha solvers
* Refactor the app to use Express server and Jest for tests
* Fix Cloudflare resolver for Linux ARM builds
* Fix Cloudflare resolver
* Replace Chrome web browser with Firefox
* Remove userAgent parameter since any modification is detected by CF
* Update dependencies
* Remove Puppeter steath plugin
## v1.2.9 (2021/08/01)
* Improve "Execution context was destroyed" error handling
* Implement returnRawHtml parameter. resolves #172 resolves #165
* Capture Docker stop signal. resolves #158
* Reduce Docker image size 20 MB
* Fix page reload after challenge is solved. resolves #162 resolves #143
* Avoid loading images/css/fonts to speed up page load
* Improve Cloudflare IP ban detection
* Fix vulnerabilities
## v1.2.8 (2021/06/01)
* Improve old JS challenge waiting. Resolves #129
## v1.2.7 (2021/06/01)
* Improvements in Cloudflare redirect detection. Resolves #140
* Fix installation instructions
## v1.2.6 (2021/05/30)
* Handle new Cloudflare challenge. Resolves #135 Resolves #134
* Provide reference Systemd unit file. Resolves #72
* Fix EACCES: permission denied, open '/tmp/flaresolverr.txt'. Resolves #120
* Configure timezone with TZ env var. Resolves #109
* Return the redirected URL in the response (#126)
* Show an error in hcaptcha-solver. Resolves #132
* Regenerate package-lock.json lockfileVersion 2
* Update issue template. Resolves #130
* Bump ws from 7.4.1 to 7.4.6 (#137)
* Bump hosted-git-info from 2.8.8 to 2.8.9 (#124)
* Bump lodash from 4.17.20 to 4.17.21 (#125)
## v1.2.5 (2021/04/05)
* Fix memory regression, close test browser
* Fix release-docker GitHub action
## v1.2.4 (2021/04/04)
* Include license in release zips. resolves #75
* Validate Chrome is working at startup
* Speedup Docker image build
* Add health check endpoint
* Update issue template
* Minor improvements in debug traces
* Validate environment variables at startup. resolves #101
* Add FlareSolverr logo. resolves #23
## v1.2.3 (2021/01/10)
* CI/CD: Generate release changelog from commits. resolves #34
* Update README.md
* Add donation links
* Simplify docker-compose.yml
* Allow to configure "none" captcha resolver
* Override docker-compose.yml variables via .env resolves #64 (#66)
## v1.2.2 (2021/01/09)
* Add documentation for precompiled binaries installation
* Add instructions to set environment variables in Windows
* Build Windows and Linux binaries. resolves #18
* Add release badge in the readme
* CI/CD: Generate release changelog from commits. resolves #34
* Add a notice about captcha solvers
* Add Chrome flag --disable-dev-shm-usage to fix crashes. resolves #45
* Fix Docker CLI documentation
* Add traces with captcha solver service. resolves #39
* Improve logic to detect Cloudflare captcha. resolves #48
* Move Cloudflare provider logic to his own class
* Simplify and document the "return only cookies" parameter
* Show message when debug log is enabled
* Update readme to add more clarifications. resolves #53 (#60)
* issue_template: typo fix (#52)
## v1.2.1 (2020/12/20)
* Change version to match release tag / 1.2.0 => v1.2.0
* CI/CD Publish release in GitHub repository. resolves #34
* Add welcome message in / endpoint
* Rewrite request timeout handling (maxTimeout) resolves #42
* Add http status for better logging
* Return an error when no selectors are found, #25
* Add issue template, fix #32
* Moving log.html right after loading the page and add one on reload, fix #30
* Update User-Agent to match chromium version, ref: #15 (#28)
* Update install from source code documentation
* Update readme to add Docker instructions (#20)
* Clean up readme (#19)
* Add docker-compose
* Change default log level to info
## v1.2.0 (2020/12/20)
* Fix User-Agent detected by CouldFlare (Docker ARM) resolves #15
* Include exception message in error response
* CI/CD: Rename GitHub Action build => publish
* Bump version
* Fix TypeScript compilation and bump minor version
* CI/CD: Bump minor version
* CI/CD: Configure GitHub Actions
* CI/CD: Configure GitHub Actions
* CI/CD: Bump minor version
* CI/CD: Configure Build GitHub Action
* CI/CD: Configure AutoTag GitHub Action (#14)
* CI/CD: Build the Docker images with GitHub Actions (#13)
* Update dependencies
* Backport changes from Cloudproxy (#11)

View File

@@ -1,4 +1,4 @@
FROM python:3.11-slim-bullseye as builder
FROM python:3.10-slim-bullseye as builder
# Build dummy packages to skip installing them and their dependencies
RUN apt-get update \
@@ -12,25 +12,28 @@ RUN apt-get update \
&& equivs-build adwaita-icon-theme \
&& mv adwaita-icon-theme_*.deb /adwaita-icon-theme.deb
FROM python:3.11-slim-bullseye
FROM python:3.10-slim-bullseye
# Copy dummy packages
COPY --from=builder /*.deb /
# Install dependencies and create flaresolverr user
# We have to install and old version of Chromium because its not working in Raspberry Pi / ARM
# You can test Chromium running this command inside the container:
# xvfb-run -s "-screen 0 1600x1200x24" chromium --no-sandbox
# The error traces is like this: "*** stack smashing detected ***: terminated"
# To check the package versions available you can use this command:
# apt-cache madison chromium
WORKDIR /app
RUN echo "\ndeb http://snapshot.debian.org/archive/debian/20210519T212015Z/ bullseye main" >> /etc/apt/sources.list \
&& echo 'Acquire::Check-Valid-Until "false";' | tee /etc/apt/apt.conf.d/00snapshot \
# Install dummy packages
RUN dpkg -i /libgl1-mesa-dri.deb \
&& dpkg -i /libgl1-mesa-dri.deb \
&& dpkg -i /adwaita-icon-theme.deb \
# Install dependencies
&& apt-get update \
&& apt-get install -y --no-install-recommends chromium chromium-common chromium-driver xvfb dumb-init \
procps curl vim \
&& apt-get install -y --no-install-recommends chromium=89.0.4389.114-1 chromium-common=89.0.4389.114-1 \
chromium-driver=89.0.4389.114-1 xvfb \
# Remove temporary files and hardware decoding libraries
&& rm -rf /var/lib/apt/lists/* \
&& rm -f /usr/lib/x86_64-linux-gnu/libmfxhw* \
@@ -44,7 +47,8 @@ RUN dpkg -i /libgl1-mesa-dri.deb \
COPY requirements.txt .
RUN pip install -r requirements.txt \
# Remove temporary files
&& rm -rf /root/.cache
&& rm -rf /root/.cache \
&& find / -name '*.pyc' -delete
USER flaresolverr
@@ -53,17 +57,13 @@ COPY package.json ../
EXPOSE 8191
# dumb-init avoids zombie chromium processes
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["/usr/local/bin/python", "-u", "/app/flaresolverr.py"]
# Local build
# docker build -t ngosang/flaresolverr:3.0.0 .
# docker run -p 8191:8191 ngosang/flaresolverr:3.0.0
# docker build -t ngosang/flaresolverr:3.0.0.beta2 .
# docker run -p 8191:8191 ngosang/flaresolverr:3.0.0.beta2
# Multi-arch build
# docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# docker buildx create --use
# docker buildx build -t ngosang/flaresolverr:3.0.0 --platform linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8 .
# docker buildx build -t ngosang/flaresolverr:3.0.0.beta2 --platform linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8 .
# add --push to publish in DockerHub

View File

@@ -1,6 +1,6 @@
MIT License
Copyright (c) 2023 Diego Heras (ngosang / ngosang@hotmail.es)
Copyright (c) 2022 Diego Heras (ngosang / ngosang@hotmail.es)
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

View File

@@ -1,6 +1,6 @@
{
"name": "flaresolverr",
"version": "3.0.1",
"version": "3.0.0.beta2",
"description": "Proxy server to bypass Cloudflare protection",
"author": "Diego Heras (ngosang / ngosang@hotmail.es)",
"license": "MIT"

View File

@@ -1,9 +1,9 @@
bottle==0.12.23
waitress==2.1.2
selenium==4.7.2
selenium==4.4.3
func-timeout==4.3.5
# required by undetected_chromedriver
requests==2.28.1
websockets==10.4
websockets==10.3
# only required for linux
xvfbwrapper==0.2.9

View File

@@ -7,7 +7,7 @@ from selenium.common import TimeoutException
from selenium.webdriver.chrome.webdriver import WebDriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located, staleness_of, title_is
from selenium.webdriver.support.expected_conditions import presence_of_element_located, staleness_of
from dtos import V1RequestBase, V1ResponseBase, ChallengeResolutionT, ChallengeResolutionResultT, IndexResponse, \
HealthResponse, STATUS_OK, STATUS_ERROR
@@ -15,23 +15,17 @@ import utils
ACCESS_DENIED_SELECTORS = [
# Cloudflare
'div.cf-error-title span.cf-code-label span'
# Cloudflare http://bitturk.net/ Firefox
'#cf-error-details div.cf-error-overview h1'
]
CHALLENGE_TITLE = [
# Cloudflare
'Just a moment...',
# DDoS-GUARD
'DDOS-GUARD',
'div.main-wrapper div.header.section h1 span.code-label span'
]
CHALLENGE_SELECTORS = [
# Cloudflare
'#cf-challenge-running', '.ray_id', '.attack-box', '#cf-please-wait', '#challenge-spinner', '#trk_jschal_js',
'#cf-challenge-running', '.ray_id', '.attack-box', '#cf-please-wait', '#trk_jschal_js',
# DDoS-GUARD
'#link-ddg',
# Custom CloudFlare for EbookParadijs, Film-Paleis, MuziekFabriek and Puur-Hollands
'td.info #js_info'
]
SHORT_TIMEOUT = 10
SHORT_TIMEOUT = 5
def test_browser_installation():
@@ -182,16 +176,8 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
raise Exception('Cloudflare has blocked this request. '
'Probably your IP is banned for this site, check in your web browser.')
# find challenge by title
# find challenge selectors
challenge_found = False
page_title = driver.title
for title in CHALLENGE_TITLE:
if title == page_title:
challenge_found = True
logging.info("Challenge detected. Title found: " + title)
break
if not challenge_found:
# find challenge by selectors
for selector in CHALLENGE_SELECTORS:
found_elements = driver.find_elements(By.CSS_SELECTOR, selector)
if len(found_elements) > 0:
@@ -202,11 +188,6 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
if challenge_found:
while True:
try:
# wait until the title change
for title in CHALLENGE_TITLE:
logging.debug("Waiting for title: " + title)
WebDriverWait(driver, SHORT_TIMEOUT).until_not(title_is(title))
# then wait until all the selectors disappear
for selector in CHALLENGE_SELECTORS:
logging.debug("Waiting for selector: " + selector)
@@ -239,11 +220,11 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
challenge_res.url = driver.current_url
challenge_res.status = 200 # todo: fix, selenium not provides this info
challenge_res.cookies = driver.get_cookies()
challenge_res.userAgent = utils.get_user_agent(driver)
if not req.returnOnlyCookies:
challenge_res.headers = {} # todo: fix, selenium not provides this info
challenge_res.response = driver.page_source
challenge_res.userAgent = utils.get_user_agent(driver)
res.result = challenge_res
return res
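
Note on the hunks above: one side of this diff checks the page title against `CHALLENGE_TITLE` before falling back to `CHALLENGE_SELECTORS`, while the other side relies on selectors only. The following is a minimal, browser-free sketch of the title-first ordering, not the project's actual code. The function name `detect_challenge` and the `selectors_on_page` argument are hypothetical; the argument stands in for the results of `driver.find_elements(By.CSS_SELECTOR, selector)`, and the two lists are copied from the diff.

```python
# Hypothetical sketch: title check first, then selector check.
CHALLENGE_TITLE = ['Just a moment...', 'DDOS-GUARD']
CHALLENGE_SELECTORS = [
    '#cf-challenge-running', '.ray_id', '.attack-box', '#cf-please-wait',
    '#challenge-spinner', '#trk_jschal_js', '#link-ddg', 'td.info #js_info',
]


def detect_challenge(page_title, selectors_on_page):
    """Return ('title', value), ('selector', value), or None if no challenge is found.

    `selectors_on_page` is an assumed stand-in for the set of selectors that
    currently match at least one element in the real browser.
    """
    # An exact title match short-circuits the selector scan.
    for title in CHALLENGE_TITLE:
        if title == page_title:
            return ('title', title)
    # Otherwise report the first known selector present on the page.
    for selector in CHALLENGE_SELECTORS:
        if selector in selectors_on_page:
            return ('selector', selector)
    return None


print(detect_challenge('Just a moment...', set()))
print(detect_challenge('Home', {'#link-ddg'}))
print(detect_challenge('Home', set()))
```

Keeping the decision in a pure function like this makes the ordering testable without starting Chrome; the real code would still need the Selenium waits shown in the diff.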

View File

@@ -1,4 +1,5 @@
import unittest
from datetime import datetime, timezone
from webtest import TestApp
@@ -19,12 +20,12 @@ class TestFlareSolverr(unittest.TestCase):
proxy_url = "http://127.0.0.1:8888"
proxy_socks_url = "socks5://127.0.0.1:1080"
google_url = "https://www.google.com"
post_url = "https://httpbin.org/post"
post_url = "https://ptsv2.com/t/qv4j3-1634496523"
cloudflare_url = "https://nowsecure.nl"
cloudflare_url_2 = "https://idope.se/torrent-list/harry/"
ddos_guard_url = "https://anidex.info/"
custom_cloudflare_url = "https://www.muziekfabriek.org"
cloudflare_blocked_url = "https://cpasbiens3.fr/index.php?do=search&subaction=search"
cloudflare_blocked_url = "https://avistaz.to/api/v1/jackett/torrents?in=1&type=0&search="
app = TestApp(flaresolverr.app)
@@ -232,7 +233,7 @@ class TestFlareSolverr(unittest.TestCase):
self.assertIsNone(solution.headers)
self.assertIsNone(solution.response)
self.assertGreater(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent)
self.assertIsNone(solution.userAgent)
# todo: test Cmd 'request.get' should return OK with HTTP 'proxy' param
# todo: test Cmd 'request.get' should return OK with HTTP 'proxy' param with credentials
@@ -280,7 +281,7 @@ class TestFlareSolverr(unittest.TestCase):
def test_v1_endpoint_request_post_no_cloudflare(self):
res = self.app.post_json('/v1', {
"cmd": "request.post",
"url": self.post_url,
"url": self.post_url + '/post',
"postData": "param1=value1&param2=value2"
})
self.assertEqual(res.status_code, 200)
@@ -296,10 +297,22 @@ class TestFlareSolverr(unittest.TestCase):
self.assertIn(self.post_url, solution.url)
self.assertEqual(solution.status, 200)
self.assertIs(len(solution.headers), 0)
self.assertIn('"form": {\n "param1": "value1", \n "param2": "value2"\n }', solution.response)
self.assertIn("I hope you have a lovely day!", solution.response)
self.assertEqual(len(solution.cookies), 0)
self.assertIn("Chrome/", solution.userAgent)
# check that we sent the post data
res2 = self.app.post_json('/v1', {
"cmd": "request.get",
"url": self.post_url
})
self.assertEqual(res2.status_code, 200)
body2 = V1ResponseBase(res2.json)
self.assertEqual(STATUS_OK, body2.status)
date_hour = datetime.now(timezone.utc).isoformat().split(':')[0].replace('T', ' ')
self.assertIn(date_hour, body2.solution.response)
def test_v1_endpoint_request_post_cloudflare(self):
res = self.app.post_json('/v1', {
"cmd": "request.post",
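
Note on the `date_hour` expression in the test above: it reduces the current UTC time to a "date plus hour" string and asserts that this string appears in the page returned by the follow-up GET. A small sketch of what that expression yields, using a fixed timestamp instead of `datetime.now` so the result is reproducible (the helper name `utc_date_hour` is hypothetical):

```python
from datetime import datetime, timezone


def utc_date_hour(now):
    # isoformat() gives e.g. '2022-09-24T18:35:01+00:00'.
    # Splitting on ':' keeps everything before the minutes ('2022-09-24T18'),
    # and replacing 'T' with a space gives '2022-09-24 18'.
    return now.isoformat().split(':')[0].replace('T', ' ')


fixed = datetime(2022, 9, 24, 18, 35, 1, tzinfo=timezone.utc)
print(utc_date_hour(fixed))  # 2022-09-24 18
```

Because only the hour is compared, a check of this kind can fail if the POST and the follow-up GET straddle an hour boundary.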

View File

@@ -1,4 +1,7 @@
#!/usr/bin/env python3
from __future__ import annotations
import subprocess
"""
@@ -14,38 +17,33 @@ Y88b. 888 888 888 Y88..88P 888 888 888 Y8b. Y88b 888 888 888 Y
by UltrafunkAmsterdam (https://github.com/ultrafunkamsterdam)
"""
from __future__ import annotations
__version__ = "3.2.1"
__version__ = "3.1.5r4"
import json
import logging
import os
import re
import shutil
import subprocess
import sys
import tempfile
import time
from weakref import finalize
import inspect
import threading
import selenium.webdriver.chrome.service
import selenium.webdriver.chrome.webdriver
from selenium.webdriver.common.by import By
import selenium.webdriver.common.service
import selenium.webdriver.remote.command
import selenium.webdriver.remote.webdriver
from .cdp import CDP
from .dprocess import start_detached
from .options import ChromeOptions
from .patcher import IS_POSIX
from .patcher import Patcher
from .reactor import Reactor
from .webelement import UCWebElement
from .webelement import WebElement
from .dprocess import start_detached
__all__ = (
"Chrome",
@@ -110,7 +108,6 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
port=0,
enable_cdp_events=False,
service_args=None,
service_creationflags=None,
desired_capabilities=None,
advanced_elements=False,
service_log_path=None,
@@ -122,9 +119,8 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
suppress_welcome=True,
use_subprocess=False,
debug=False,
no_sandbox=True,
windows_headless=False,
**kw,
**kw
):
"""
Creates a new instance of the chrome driver.
@@ -151,9 +147,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
If not specified, make sure the executable's folder is in $PATH
port: int, optional, default: 0
port to be used by the chromedriver executable, this is NOT the debugger port.
leave it at 0 unless you know what you are doing.
the default value of 0 automatically picks an available port.
port you would like the service to run, if left as 0, a free port will be found.
enable_cdp_events: bool, default: False
:: currently for chrome only
@@ -213,12 +207,11 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
now, in case you are nag-fetishist, or a diagnostics data feeder to google, you can set this to False.
Note: if you don't handle the nag screen in time, the browser loses it's connection and throws an Exception.
use_subprocess: bool, optional , default: True,
use_subprocess: bool, optional , default: False,
False (the default) makes sure Chrome will get it's own process (so no subprocess of chromedriver.exe or python
This fixes a LOT of issues, like multithreaded run, but mst importantly. shutting corectly after
program exits or using .quit()
you should be knowing what you're doing, and know how python works.
unfortunately, there is always an edge case in which one would like to write an single script with the only contents being:
--start script--
@@ -231,13 +224,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
in that case you can set this to `True`. The browser will start via subprocess, and will keep running most of times.
! setting it to True comes with NO support when being detected. !
no_sandbox: bool, optional, default=True
uses the --no-sandbox option, and additionally does suppress the "unsecure option" status bar
this option has a default of True since many people seem to run this as root (....) , and chrome does not start
when running as root without using --no-sandbox flag.
"""
finalize(self, self._ensure_close, self)
self.debug = debug
patcher = Patcher(
executable_path=driver_executable_path,
@@ -249,6 +236,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
if not options:
options = ChromeOptions()
try:
if hasattr(options, "_session") and options._session is not None:
# prevent reuse of options,
@@ -260,17 +248,11 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
options._session = self
if not options.debugger_address:
debug_port = (
port
if port != 0
else selenium.webdriver.common.service.utils.free_port()
)
debug_port = selenium.webdriver.common.service.utils.free_port()
debug_host = "127.0.0.1"
if not options.debugger_address:
options.debugger_address = "%s:%d" % (debug_host, debug_port)
else:
debug_host, debug_port = options.debugger_address.split(":")
debug_port = int(debug_port)
if enable_cdp_events:
options.set_capability(
@@ -281,12 +263,13 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
options.add_argument("--remote-debugging-port=%s" % debug_port)
if user_data_dir:
options.add_argument("--user-data-dir=%s" % user_data_dir)
options.add_argument('--user-data-dir=%s' % user_data_dir)
language, keep_user_data_dir = None, bool(user_data_dir)
# see if a custom user profile is specified in options
for arg in options.arguments:
if "lang" in arg:
m = re.search("(?:--)?lang(?:[ =])?(.*)", arg)
try:
@@ -311,6 +294,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
)
if not user_data_dir:
# backward compatiblity
# check if an old uc.ChromeOptions is used, and extract the user data dir
@@ -363,8 +347,6 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
if suppress_welcome:
options.arguments.extend(["--no-default-browser-check", "--no-first-run"])
if no_sandbox:
options.arguments.extend(["--no-sandbox", "--test-type"])
if headless or options.headless:
options.headless = True
options.add_argument("--window-size=1920,1080")
@@ -378,7 +360,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
or divmod(logging.getLogger().getEffectiveLevel(), 10)[0]
)
if hasattr(options, "handle_prefs"):
if hasattr(options, 'handle_prefs'):
options.handle_prefs(user_data_dir)
# fix exit_type flag to prevent tab-restore nag
@@ -394,7 +376,6 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
config["profile"]["exit_type"] = None
fs.seek(0, 0)
json.dump(config, fs)
fs.truncate() # the file might be shorter
logger.debug("fixed exit_type flag")
except Exception as e:
logger.debug("did not find a bad exit_type flag ")
@@ -422,17 +403,6 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
)
self.browser_pid = browser.pid
if service_creationflags:
service = selenium.webdriver.common.service.Service(
patcher.executable_path, port, service_args, service_log_path
)
for attr_name in ("creationflags", "creation_flags"):
if hasattr(service, attr_name):
setattr(service, attr_name, service_creationflags)
break
else:
service = None
super(Chrome, self).__init__(
executable_path=patcher.executable_path,
port=port,
@@ -441,7 +411,6 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
desired_capabilities=desired_capabilities,
service_log_path=service_log_path,
keep_alive=keep_alive,
service=service, # needed or the service will be re-created
)
self.reactor = None
@@ -456,14 +425,35 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
self.reactor = reactor
if advanced_elements:
self._web_element_cls = UCWebElement
else:
from .webelement import WebElement
self._web_element_cls = WebElement
if options.headless:
self._configure_headless()
def __getattribute__(self, item):
if not super().__getattribute__("debug"):
return super().__getattribute__(item)
else:
import inspect
original = super().__getattribute__(item)
if inspect.ismethod(original) and not inspect.isclass(original):
def newfunc(*args, **kwargs):
logger.debug(
"calling %s with args %s and kwargs %s\n"
% (original.__qualname__, args, kwargs)
)
return original(*args, **kwargs)
return newfunc
return original
def _configure_headless(self):
orig_get = self.get
logger.info("setting properties for headless")
@@ -504,107 +494,18 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
"Page.addScriptToEvaluateOnNewDocument",
{
"source": """
Object.defineProperty(navigator, 'maxTouchPoints', {get: () => 1});
Object.defineProperty(navigator.connection, 'rtt', {get: () => 100});
// https://github.com/microlinkhq/browserless/blob/master/packages/goto/src/evasions/chrome-runtime.js
window.chrome = {
app: {
isInstalled: false,
InstallState: {
DISABLED: 'disabled',
INSTALLED: 'installed',
NOT_INSTALLED: 'not_installed'
},
RunningState: {
CANNOT_RUN: 'cannot_run',
READY_TO_RUN: 'ready_to_run',
RUNNING: 'running'
}
},
runtime: {
OnInstalledReason: {
CHROME_UPDATE: 'chrome_update',
INSTALL: 'install',
SHARED_MODULE_UPDATE: 'shared_module_update',
UPDATE: 'update'
},
OnRestartRequiredReason: {
APP_UPDATE: 'app_update',
OS_UPDATE: 'os_update',
PERIODIC: 'periodic'
},
PlatformArch: {
ARM: 'arm',
ARM64: 'arm64',
MIPS: 'mips',
MIPS64: 'mips64',
X86_32: 'x86-32',
X86_64: 'x86-64'
},
PlatformNaclArch: {
ARM: 'arm',
MIPS: 'mips',
MIPS64: 'mips64',
X86_32: 'x86-32',
X86_64: 'x86-64'
},
PlatformOs: {
ANDROID: 'android',
CROS: 'cros',
LINUX: 'linux',
MAC: 'mac',
OPENBSD: 'openbsd',
WIN: 'win'
},
RequestUpdateCheckStatus: {
NO_UPDATE: 'no_update',
THROTTLED: 'throttled',
UPDATE_AVAILABLE: 'update_available'
}
}
}
// https://github.com/microlinkhq/browserless/blob/master/packages/goto/src/evasions/navigator-permissions.js
if (!window.Notification) {
window.Notification = {
permission: 'denied'
}
}
const originalQuery = window.navigator.permissions.query
window.navigator.permissions.__proto__.query = parameters =>
parameters.name === 'notifications'
? Promise.resolve({ state: window.Notification.permission })
: originalQuery(parameters)
const oldCall = Function.prototype.call
function call() {
return oldCall.apply(this, arguments)
}
Function.prototype.call = call
const nativeToStringFunctionString = Error.toString().replace(/Error/g, 'toString')
const oldToString = Function.prototype.toString
function functionToString() {
if (this === window.navigator.permissions.query) {
return 'function query() { [native code] }'
}
if (this === functionToString) {
return nativeToStringFunctionString
}
return oldCall.call(oldToString, this)
}
// eslint-disable-next-line
Function.prototype.toString = functionToString
"""
Object.defineProperty(navigator, 'maxTouchPoints', {
get: () => 1
})"""
},
)
return orig_get(*args, **kwargs)
self.get = get_wrapped
def __dir__(self):
return object.__dir__(self)
def _get_cdc_props(self):
return self.execute_script(
"""
@@ -652,11 +553,6 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
if self.reactor and isinstance(self.reactor, Reactor):
self.reactor.handlers.clear()
def window_new(self):
self.execute(
selenium.webdriver.remote.command.Command.NEW_WINDOW, {"type": "window"}
)
def tab_new(self, url: str):
"""
this opens a url in a new tab.
@@ -701,22 +597,24 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
# super(Chrome, self).start_session(capabilities, browser_profile)
def quit(self):
try:
logger.debug("closing webdriver")
if hasattr(self, "service") and getattr(self.service, "process", None):
self.service.process.kill()
self.service.process.wait(5)
logger.debug("webdriver process ended")
except (AttributeError, RuntimeError, OSError):
pass
try:
self.reactor.event.set()
if self.reactor and isinstance(self.reactor, Reactor):
logger.debug("shutting down reactor")
except AttributeError:
self.reactor.event.set()
except Exception: # noqa
pass
try:
logger.debug("killing browser")
os.kill(self.browser_pid, 15)
logger.debug("gracefully closed browser")
except Exception as e: # noqa
except TimeoutError as e:
logger.debug(e, exc_info=True)
except Exception: # noqa
pass
if (
hasattr(self, "keep_user_data_dir")
and hasattr(self, "user_data_dir")
@@ -724,6 +622,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
):
for _ in range(5):
try:
shutil.rmtree(self.user_data_dir, ignore_errors=False)
except FileNotFoundError:
pass
@@ -741,24 +640,13 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
# this must come last, otherwise it will throw 'in use' errors
self.patcher = None
def __getattribute__(self, item):
if not super().__getattribute__("debug"):
return super().__getattribute__(item)
else:
import inspect
original = super().__getattribute__(item)
if inspect.ismethod(original) and not inspect.isclass(original):
def newfunc(*args, **kwargs):
logger.debug(
"calling %s with args %s and kwargs %s\n"
% (original.__qualname__, args, kwargs)
)
return original(*args, **kwargs)
return newfunc
return original
def __del__(self):
try:
super().quit()
# self.service.process.kill()
except: # noqa
pass
self.quit()
def __enter__(self):
return self
@@ -772,27 +660,6 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
def __hash__(self):
return hash(self.options.debugger_address)
def __dir__(self):
return object.__dir__(self)
def __del__(self):
try:
self.service.process.kill()
except: # noqa
pass
self.quit()
@classmethod
def _ensure_close(cls, self):
# needs to be a classmethod so finalize can find the reference
logger.info("ensuring close")
if (
hasattr(self, "service")
and hasattr(self.service, "process")
and hasattr(self.service.process, "kill")
):
self.service.process.kill()
def find_chrome_executable():
"""
@@ -824,10 +691,8 @@ def find_chrome_executable():
)
else:
for item in map(
os.environ.get,
("PROGRAMFILES", "PROGRAMFILES(X86)", "LOCALAPPDATA", "PROGRAMW6432"),
os.environ.get, ("PROGRAMFILES", "PROGRAMFILES(X86)", "LOCALAPPDATA")
):
if item is not None:
for subitem in (
"Google/Chrome/Application",
"Google/Chrome Beta/Application",

View File

@@ -17,7 +17,6 @@ by UltrafunkAmsterdam (https://github.com/ultrafunkamsterdam)
"""
from distutils.version import LooseVersion
import io
import logging
import os
@@ -25,13 +24,11 @@ import random
import re
import string
import sys
from urllib.request import urlopen
from urllib.request import urlretrieve
import zipfile
from distutils.version import LooseVersion
from urllib.request import urlopen, urlretrieve
from selenium.webdriver import Chrome as _Chrome
from selenium.webdriver import ChromeOptions as _ChromeOptions
from selenium.webdriver import Chrome as _Chrome, ChromeOptions as _ChromeOptions
TARGET_VERSION = 0
logger = logging.getLogger("uc")

View File

@@ -3,11 +3,11 @@
import json
import logging
from collections.abc import Mapping, Sequence
import requests
import websockets
log = logging.getLogger(__name__)

View File

@@ -1,16 +1,17 @@
import asyncio
from collections.abc import Mapping
from collections.abc import Sequence
from functools import wraps
import logging
import threading
import time
import traceback
from collections.abc import Mapping
from collections.abc import Sequence
from typing import Any
from typing import Awaitable
from typing import Callable
from typing import List
from typing import Optional
from contextlib import ExitStack
import threading
from functools import wraps, partial
class Structure(dict):

View File

@@ -1,13 +1,13 @@
import atexit
import logging
import multiprocessing
import os
import platform
import signal
import sys
from subprocess import PIPE
from subprocess import Popen
import sys
import atexit
import traceback
import logging
import signal
CREATE_NEW_PROCESS_GROUP = 0x00000200
DETACHED_PROCESS = 0x00000008
@@ -27,14 +27,12 @@ def start_detached(executable, *args):
reader, writer = multiprocessing.Pipe(False)
# do not keep reference
process = multiprocessing.Process(
multiprocessing.Process(
target=_start_detached,
args=(executable, *args),
kwargs={"writer": writer},
daemon=True,
)
process.start()
process.join()
).start()
# receive pid from pipe
pid = reader.recv()
REGISTERED.append(pid)

View File

@@ -39,20 +39,6 @@ class ChromeOptions(_ChromiumOptions):
value = ChromeOptions._undot_key(rest, value)
return {key: value}
@staticmethod
def _merge_nested(a, b):
"""
merges b into a
leaf values in a are overwritten with values from b
"""
for key in b:
if key in a:
if isinstance(a[key], dict) and isinstance(b[key], dict):
ChromeOptions._merge_nested(a[key], b[key])
continue
a[key] = b[key]
return a
def handle_prefs(self, user_data_dir):
prefs = self.experimental_options.get("prefs")
if prefs:
@@ -64,14 +50,12 @@ class ChromeOptions(_ChromiumOptions):
# undot prefs dict keys
undot_prefs = {}
for key, value in prefs.items():
undot_prefs = self._merge_nested(
undot_prefs, self._undot_key(key, value)
)
undot_prefs.update(self._undot_key(key, value))
prefs_file = os.path.join(default_path, "Preferences")
if os.path.exists(prefs_file):
with open(prefs_file, encoding="latin1", mode="r") as f:
undot_prefs = self._merge_nested(json.load(f), undot_prefs)
undot_prefs.update(json.load(f))
with open(prefs_file, encoding="latin1", mode="w") as f:
json.dump(undot_prefs, f)
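Review note on the hunks above: one side merges the undotted prefs with `_merge_nested`, the other with `dict.update`. The two are not equivalent once two dotted keys share a prefix. A minimal sketch of the difference, using standalone functions modelled on the methods shown in the diff; the example pref keys are only illustrative.

```python
def undot_key(key, value):
    """Turn a dotted key such as "a.b" into a nested dict {"a": {"b": value}}."""
    if "." in key:
        key, rest = key.split(".", 1)
        value = undot_key(rest, value)
    return {key: value}


def merge_nested(a, b):
    """Merge b into a in place; leaf values in a are overwritten by values from b."""
    for key in b:
        if key in a and isinstance(a[key], dict) and isinstance(b[key], dict):
            merge_nested(a[key], b[key])
            continue
        a[key] = b[key]
    return a


# Two illustrative prefs that share the "profile" prefix.
prefs = {
    "profile.default_content_setting_values.images": 2,
    "profile.password_manager_enabled": False,
}

shallow, deep = {}, {}
for key, value in prefs.items():
    shallow.update(undot_key(key, value))       # the second "profile" dict replaces the first
    merge_nested(deep, undot_key(key, value))   # both "profile" subtrees are kept
```

With the shallow `update`, only the last pref under a shared top-level key survives, while the nested merge keeps both.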

View File

@@ -1,24 +1,23 @@
#!/usr/bin/env python3
# this module is part of undetected_chromedriver
from distutils.version import LooseVersion
import io
import logging
import os
import random
import re
import secrets
import string
import sys
import time
from urllib.request import urlopen
from urllib.request import urlretrieve
import zipfile
from distutils.version import LooseVersion
from urllib.request import urlopen, urlretrieve
import secrets
logger = logging.getLogger(__name__)
IS_POSIX = sys.platform.startswith(("darwin", "cygwin", "linux", "linux2"))
IS_POSIX = sys.platform.startswith(("darwin", "cygwin", "linux"))
class Patcher(object):
@@ -30,7 +29,7 @@ class Patcher(object):
if platform.endswith("win32"):
zip_name %= "win32"
exe_name %= ".exe"
if platform.endswith(("linux", "linux2")):
if platform.endswith("linux"):
zip_name %= "linux64"
exe_name %= ""
if platform.endswith("darwin"):
@@ -39,9 +38,7 @@ class Patcher(object):
if platform.endswith("win32"):
d = "~/appdata/roaming/undetected_chromedriver"
elif "LAMBDA_TASK_ROOT" in os.environ:
d = "/tmp/undetected_chromedriver"
elif platform.startswith(("linux","linux2")):
elif platform.startswith("linux"):
d = "~/.local/share/undetected_chromedriver"
elif platform.endswith("darwin"):
d = "~/Library/Application Support/undetected_chromedriver"
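Review note on the platform checks above: the hunks switch between `("linux", "linux2")` and plain `"linux"`. For `startswith` the two forms give the same result, but for `endswith` they differ on the string `"linux2"`. Since Python 3.3 `sys.platform` is `"linux"` on Linux, so the difference should only show up on older interpreters. A small sketch comparing both forms; the helper names are hypothetical.

```python
def starts_tuple(platform):
    return platform.startswith(("linux", "linux2"))


def starts_plain(platform):
    return platform.startswith("linux")


def ends_tuple(platform):
    return platform.endswith(("linux", "linux2"))


def ends_plain(platform):
    return platform.endswith("linux")


names = ["linux", "linux2", "darwin", "win32"]

# Platform strings on which the tuple form and the plain form disagree.
prefix_diff = [n for n in names if starts_tuple(n) != starts_plain(n)]
suffix_diff = [n for n in names if ends_tuple(n) != ends_plain(n)]
```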

View File

@@ -1,29 +1,7 @@
from selenium.webdriver.common.by import By
import selenium.webdriver.remote.webelement
from typing import List
class WebElement(selenium.webdriver.remote.webelement.WebElement):
def click_safe(self):
super().click()
self._parent.reconnect(0.1)
def children(
self, tag=None, recursive=False
) -> List[selenium.webdriver.remote.webelement.WebElement]:
"""
returns direct child elements of current element
:param tag: str, if supplied, returns <tag> nodes only
"""
script = "return [... arguments[0].children]"
if tag:
script += ".filter( node => node.tagName === '%s')" % tag.upper()
if recursive:
return list(_recursive_children(self, tag))
return list(self._parent.execute_script(script, self))
class UCWebElement(WebElement):
"""
Custom WebElement class which makes it easier to view elements when
working in an interactive environment.
@@ -36,13 +14,9 @@ class UCWebElement(WebElement):
"""
def __init__(self, parent, id_):
super().__init__(parent, id_)
self._attrs = None
@property
def attrs(self):
if not self._attrs:
if not hasattr(self, "_attrs"):
self._attrs = self._parent.execute_script(
"""
var items = {};
@@ -61,25 +35,3 @@ class UCWebElement(WebElement):
if strattrs:
strattrs = " " + strattrs
return f"{self.__class__.__name__} <{self.tag_name}{strattrs}>"
def _recursive_children(element, tag: str = None, _results=None):
"""
returns all children of <element> recursively
:param element: `WebElement` object.
find children below this <element>
:param tag: str = None.
if provided, return only <tag> elements. example: 'a', or 'img'
:param _results: do not use!
"""
results = _results or set()
for element in element.children():
if tag:
if element.tag_name == tag:
results.add(element)
else:
results.add(element)
results |= _recursive_children(element, tag, results)
return results
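Review note on the `attrs` property in this file: one side initialises `self._attrs = None` in `__init__` and tests `if not self._attrs`, the other tests `if not hasattr(self, "_attrs")`. The two caching styles behave differently when the cached value is an empty dict. A minimal sketch with a fake element in place of a real WebDriver call; all class names here are hypothetical.

```python
class FakeElement:
    """Stand-in for an element whose attributes are expensive to fetch."""

    def __init__(self, attributes):
        self._attributes = attributes
        self.queries = 0

    def _query(self):
        # Counts how often the "expensive" lookup runs.
        self.queries += 1
        return dict(self._attributes)


class SentinelCached(FakeElement):
    def __init__(self, attributes):
        super().__init__(attributes)
        self._attrs = None

    @property
    def attrs(self):
        if not self._attrs:  # an empty dict is falsy, so it is never treated as cached
            self._attrs = self._query()
        return self._attrs


class HasattrCached(FakeElement):
    @property
    def attrs(self):
        if not hasattr(self, "_attrs"):  # cached after the first access, even when empty
            self._attrs = self._query()
        return self._attrs


sentinel = SentinelCached({})
lazy = HasattrCached({})
for _ in range(3):
    sentinel.attrs
    lazy.attrs
```

For an element without attributes, the truthiness check repeats the lookup on every access, while the `hasattr` check runs it once.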

View File

@@ -44,8 +44,6 @@ def get_webdriver() -> WebDriver:
# todo: this param shows a warning in chrome head-full
options.add_argument('--disable-setuid-sandbox')
options.add_argument('--disable-dev-shm-usage')
# this option removes the zygote sandbox (it seems that the resolution is a bit faster)
options.add_argument('--no-zygote')
# note: headless mode is detected (options.headless = True)
# we launch the browser in head-full mode with the window hidden