Compare commits

...

40 Commits

Author SHA1 Message Date
ilike2burnthing
fd773e5909 Bump version 3.3.16 (#1105) 2024-02-28 21:38:53 +00:00
Justin Kromlinger
35c7bff3c8 Use headless configuration properly (#1104) 2024-02-28 20:36:38 +00:00
Raphaël
afdc1c7a8e Add FreeBSD support (#1054) 2024-02-28 02:45:04 +00:00
ilike2burnthing
0bc7a4498c Update README.md 2024-02-23 05:01:00 +00:00
Xewdy
c5a5f6d65e Add platform specifiers to dependencies (#877) 2024-02-21 21:14:25 +00:00
ilike2burnthing
aaf29be8e1 Bump version 3.3.15 (#1088) 2024-02-20 23:48:23 +00:00
Tadas Gedgaudas
800866d033 Fix looping challenges. #1036 (#1065)
Co-authored-by: Tadas Gedgaudas <tg@infrahub.io>
Co-authored-by: ilike2burnthing <59480337+ilike2burnthing@users.noreply.github.com>
2024-02-20 23:41:02 +00:00
ilike2burnthing
043f18b231 Bump UC version to 3.5.5 2024-02-17 19:28:06 +00:00
ilike2burnthing
d21a332519 Hotfix 2 - bad Chromium build, instances failed to terminate (#1072) 2024-02-17 05:53:45 +00:00
ilike2burnthing
3ca6d08f41 Hotfix for Linux build - some Chrome files no longer exist (#1071) 2024-02-17 01:15:32 +00:00
ilike2burnthing
227bd7ac72 Update Chrome downloads (#1070) 2024-02-17 00:50:14 +00:00
ilike2burnthing
e6a08584c0 Update README.md
thanks @kimboslice99
2024-02-16 04:35:37 +00:00
ilike2burnthing
df06d13cf8 Update README.md 2024-01-12 23:38:18 +00:00
ilike2burnthing
993b8c41ac Fix too many open files error. resolves #983 (#1033) 2024-01-07 21:22:14 +00:00
ilike2burnthing
a4d42d7834 Remove unnecessary comment 2023-12-17 00:33:06 +00:00
21hsmw
1c855b8af0 Fix looping challenges and invalid cookies. resolves #1006 (#1010)
Co-authored-by: ilike2burnthing <59480337+ilike2burnthing@users.noreply.github.com>
2023-12-15 22:11:58 +00:00
ilike2burnthing
745c69491f Bump version 3.3.11 (#999) 2023-12-11 20:56:14 +00:00
txtsd
f7e316fd5a updates: UC 3.5.4 & Selenium 4.15.2 (#970)
Co-authored-by: GaspardRuan <1039553124@qq.com>
Co-authored-by: ilike2burnthing <59480337+ilike2burnthing@users.noreply.github.com>
2023-12-11 20:51:16 +00:00
ilike2burnthing
16c8ab5f3d Update README.md 2023-11-14 07:54:09 +00:00
ilike2burnthing
7af311b73c Bump version 3.3.10 (#969) 2023-11-14 04:04:42 +00:00
ilike2burnthing
daec97532d Update README.md 2023-11-14 04:00:01 +00:00
ilike2burnthing
8d7ed48f21 Add LANG ENV. resolves #951 2023-11-14 03:56:57 +00:00
ilike2burnthing
220f2599ae Bump version 3.3.9 (#963) 2023-11-13 07:17:28 +00:00
ilike2burnthing
d772cf3f50 Fix for Docker build, capture TypeError. Fixes #962 2023-11-13 07:14:13 +00:00
ilike2burnthing
ab4365894b Bump version 3.3.8 (#961) 2023-11-13 04:55:49 +00:00
ilike2burnthing
3fa9631559 Fix "OSError: [WinError 6] The handle is invalid" on exit 2023-11-13 04:28:19 +00:00
ilike2burnthing
04858c22fd Support running Chrome 119 from source (#960) 2023-11-13 04:23:06 +00:00
Nabi KaramAliZadeh
5085ca6990 Fix headless=true for Chrome 117+. Fixes #910 (#921) 2023-11-13 04:03:56 +00:00
ilike2burnthing
cd4df1e061 Bump version 3.3.7 (#944) 2023-11-05 14:41:12 +00:00
ilike2burnthing
6c79783f7c Bump version 3.3.6 (#905) 2023-09-15 20:40:56 +01:00
ilike2burnthing
4139e8d47c Update checkbox selector, again 2023-09-15 20:37:26 +01:00
zax2002
1942eb5fdc Typo in README (#901) 2023-09-14 07:41:59 +01:00
ilike2burnthing
401bf5be76 Bump version 3.3.5 (#902) 2023-09-13 10:28:02 +01:00
ilike2burnthing
d8ffdd3061 Change checkbox selector, support language other than English. resolves #891 2023-09-13 10:19:19 +01:00
ilike2burnthing
2d66590b08 Bump version 3.3.4 (#884) 2023-09-02 12:30:21 +01:00
Zachary Hampton
a217510dc7 Update checkbox selector (#882) 2023-09-02 12:24:25 +01:00
ilike2burnthing
553bd8ab4f Bump version 3.3.3 (#879) 2023-08-31 20:02:17 +01:00
ilike2burnthing
1b197c3e53 Update undetected_chromedriver to v3.5.3 (#860) 2023-08-31 19:56:06 +01:00
ngosang
fd308f01be Bump version 3.3.2 2023-08-03 10:00:16 +02:00
ngosang
b5eef32615 Fix URL domain in Prometheus exporter 2023-08-03 09:02:46 +02:00
12 changed files with 365 additions and 102 deletions

View File

@@ -1,5 +1,77 @@
# Changelog
## v3.3.16 (2024/02/28)
* Fix of the subprocess.STARTUPINFO() call. Thanks @ceconelo
* Add FreeBSD support. Thanks @Asthowen
* Use headless configuration properly. Thanks @hashworks
## v3.3.15 (2024/02/20)
* Fix looping challenges
## v3.3.14-hotfix2 (2024/02/17)
* Hotfix 2 - bad Chromium build, instances failed to terminate
## v3.3.14-hotfix (2024/02/17)
* Hotfix for Linux build - some Chrome files no longer exist
## v3.3.14 (2024/02/17)
* Update Chrome downloads. Thanks @opemvbs
## v3.3.13 (2024/01/07)
* Fix too many open files error
## v3.3.12 (2023/12/15)
* Fix looping challenges and invalid cookies
## v3.3.11 (2023/12/11)
* Update UC 3.5.4 & Selenium 4.15.2. Thanks @txtsd
## v3.3.10 (2023/11/14)
* Add LANG ENV - resolves issues with YGGtorrent
## v3.3.9 (2023/11/13)
* Fix for Docker build, capture TypeError
## v3.3.8 (2023/11/13)
* Fix headless=true for Chrome 117+. Thanks @NabiKAZ
* Support running Chrome 119 from source. Thanks @koleg and @Chris7X
* Fix "OSError: [WinError 6] The handle is invalid" on exit. Thanks @enesgorkemgenc
## v3.3.7 (2023/11/05)
* Bump to rebuild. Thanks @JoachimDorchies
## v3.3.6 (2023/09/15)
* Update checkbox selector, again
## v3.3.5 (2023/09/13)
* Change checkbox selector, support languages other than English
## v3.3.4 (2023/09/02)
* Update checkbox selector
## v3.3.3 (2023/08/31)
* Update undetected_chromedriver to v3.5.3
## v3.3.2 (2023/08/03)
* Fix URL domain in Prometheus exporter
## v3.3.1 (2023/08/03)
* Fix for Cloudflare verify checkbox

View File

@@ -62,17 +62,17 @@ ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["/usr/local/bin/python", "-u", "/app/flaresolverr.py"]
# Local build
# docker build -t ngosang/flaresolverr:3.3.1 .
# docker run -p 8191:8191 ngosang/flaresolverr:3.3.1
# docker build -t ngosang/flaresolverr:3.3.16 .
# docker run -p 8191:8191 ngosang/flaresolverr:3.3.16
# Multi-arch build
# docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# docker buildx create --use
# docker buildx build -t ngosang/flaresolverr:3.3.1 --platform linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8 .
# docker buildx build -t ngosang/flaresolverr:3.3.16 --platform linux/386,linux/amd64,linux/arm/v7,linux/arm64/v8 .
# add --push to publish in DockerHub
# Test multi-arch build
# docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# docker buildx create --use
# docker buildx build -t ngosang/flaresolverr:3.3.1 --platform linux/arm/v7 --load .
# docker run -p 8191:8191 --platform linux/arm/v7 ngosang/flaresolverr:3.3.1
# docker buildx build -t ngosang/flaresolverr:3.3.16 --platform linux/arm/v7 --load .
# docker run -p 8191:8191 --platform linux/arm/v7 ngosang/flaresolverr:3.3.16

View File

@@ -83,13 +83,20 @@ This is the recommended way for Windows users.
* Run `pip install -r requirements.txt` command to install FlareSolverr dependencies.
* Run `python src/flaresolverr.py` command to start FlareSolverr.
### From source code (FreeBSD/TrueNAS CORE)
* Run `pkg install chromium python39 py39-pip xorg-vfbserver` command to install the required dependencies.
* Clone this repository and open a shell in that path.
* Run `python3.9 -m pip install -r requirements.txt` command to install FlareSolverr dependencies.
* Run `python3.9 src/flaresolverr.py` command to start FlareSolverr.
### Systemd service
We provide an example Systemd unit file `flaresolverr.service` as reference. You have to modify the file to suit your needs: paths, user and environment variables.
## Usage
Example request:
Example Bash request:
```bash
curl -L -X POST 'http://localhost:8191/v1' \
-H 'Content-Type: application/json' \
@@ -100,6 +107,32 @@ curl -L -X POST 'http://localhost:8191/v1' \
}'
```
Example Python request:
```py
import requests
url = "http://localhost:8191/v1"
headers = {"Content-Type": "application/json"}
data = {
"cmd": "request.get",
"url": "http://www.google.com/",
"maxTimeout": 60000
}
response = requests.post(url, headers=headers, json=data)
print(response.text)
```
Example PowerShell request:
```ps1
$body = @{
cmd = "request.get"
url = "http://www.google.com/"
maxTimeout = 60000
} | ConvertTo-Json
irm -UseBasicParsing 'http://localhost:8191/v1' -Headers @{"Content-Type"="application/json"} -Method Post -Body $body
```
### Commands
#### + `sessions.create`
@@ -113,7 +146,7 @@ This also speeds up the requests since it won't have to launch a new browser ins
| Parameter | Notes |
|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| session | Optional. The session ID that you want to be assigned to the instance. If isn't set a random UUID will be assigned. |
| proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is supported. Eg: `"proxy": {"url": "http://127.0.0.1:8888", username": "testuser", "password": "testpass"}` |
| proxy | Optional, default disabled. Eg: `"proxy": {"url": "http://127.0.0.1:8888"}`. You must include the proxy schema in the URL: `http://`, `socks4://` or `socks5://`. Authorization (username/password) is supported. Eg: `"proxy": {"url": "http://127.0.0.1:8888", "username": "testuser", "password": "testpass"}` |
#### + `sessions.list`
@@ -232,6 +265,7 @@ This is the same as `request.get` but it takes one more param:
| LOG_HTML | false | Only for debugging. If `true` all HTML that passes through the proxy will be logged to the console in `debug` level. |
| CAPTCHA_SOLVER | none | Captcha solving method. It is used when a captcha is encountered. See the Captcha Solvers section. |
| TZ | UTC | Timezone used in the logs and the web browser. Example: `TZ=Europe/London`. |
| LANG | none | Language used in the web browser. Example: `LANG=en_GB`. |
| HEADLESS | true | Only for debugging. To run the web browser in headless mode or visible. |
| BROWSER_TIMEOUT | 40000 | If you are experiencing errors/timeouts because your system is slow, you can try to increase this value. Remember to increase the `maxTimeout` parameter too. |
| TEST_URL | https://www.google.com | FlareSolverr makes a request on start to make sure the web browser is working. You can change that URL if it is blocked in your country. |

View File

@@ -1,6 +1,6 @@
{
"name": "flaresolverr",
"version": "3.3.1",
"version": "3.3.16",
"description": "Proxy server to bypass Cloudflare protection",
"author": "Diego Heras (ngosang / ngosang@hotmail.es)",
"license": "MIT"

View File

@@ -1,13 +1,13 @@
bottle==0.12.25
waitress==2.1.2
selenium==4.11.2
selenium==4.15.2
func-timeout==4.3.5
prometheus-client==0.17.1
# required by undetected_chromedriver
requests==2.31.0
certifi==2023.7.22
websockets==11.0.3
# only required for linux
xvfbwrapper==0.2.9
# only required for linux and macos
xvfbwrapper==0.2.9; platform_system != "Windows"
# only required for windows
pefile==2023.2.7
pefile==2023.2.7; platform_system == "Windows"

View File

@@ -2,7 +2,8 @@ import logging
import os
import urllib.parse
from dtos import V1ResponseBase
from bottle import request
from dtos import V1RequestBase, V1ResponseBase
from metrics import start_metrics_http_server, REQUEST_COUNTER, REQUEST_DURATION
PROMETHEUS_ENABLED = os.environ.get('PROMETHEUS_ENABLED', 'false').lower() == 'true'
@@ -39,8 +40,12 @@ def prometheus_plugin(callback):
domain = "unknown"
if res.solution and res.solution.url:
parsed_url = urllib.parse.urlparse(res.solution.url)
domain = parsed_url.hostname
domain = parse_domain_url(res.solution.url)
else:
# timeout error
req = V1RequestBase(request.json)
if req.url:
domain = parse_domain_url(req.url)
run_time = (res.endTimestamp - res.startTimestamp) / 1000
REQUEST_DURATION.labels(domain=domain).observe(run_time)
@@ -54,4 +59,8 @@ def prometheus_plugin(callback):
result = "error"
REQUEST_COUNTER.labels(domain=domain, result=result).inc()
def parse_domain_url(url):
parsed_url = urllib.parse.urlparse(url)
return parsed_url.hostname
return wrapper

View File

@@ -25,7 +25,7 @@ def clean_files():
def download_chromium():
# https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Linux_x64/
revision = "1140001" if os.name == 'nt' else '1140000'
revision = "1260008" if os.name == 'nt' else '1260015'
arch = 'Win_x64' if os.name == 'nt' else 'Linux_x64'
dl_file = 'chrome-win' if os.name == 'nt' else 'chrome-linux'
dl_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir, 'dist_chrome')
@@ -59,8 +59,7 @@ def download_chromium():
# Give executable permissions for *nix
# file * | grep executable | cut -d: -f1
print("Giving executable permissions...")
execs = ['chrome', 'chrome_crashpad_handler', 'chrome_sandbox', 'chrome-wrapper', 'nacl_helper',
'nacl_helper_bootstrap', 'nacl_irt_x86_64.nexe', 'xdg-mime', 'xdg-settings']
execs = ['chrome', 'chrome_crashpad_handler', 'chrome_sandbox', 'chrome-wrapper', 'xdg-mime', 'xdg-settings']
for exec_file in execs:
exec_path = os.path.join(chrome_path, exec_file)
os.chmod(exec_path, 0o755)

View File

@@ -252,11 +252,11 @@ def _resolve_challenge(req: V1RequestBase, method: str) -> ChallengeResolutionT:
def click_verify(driver: WebDriver):
try:
logging.debug("Try to find the Cloudflare verify checkbox...")
iframe = driver.find_element(By.XPATH, "//iframe[@title='Widget containing a Cloudflare security challenge']")
iframe = driver.find_element(By.XPATH, "//iframe[starts-with(@id, 'cf-chl-widget-')]")
driver.switch_to.frame(iframe)
checkbox = driver.find_element(
by=By.XPATH,
value='//*[@id="cf-stage"]//label[@class="ctp-checkbox-label"]/input',
value='//*[@id="challenge-stage"]/div/label/input',
)
if checkbox:
actions = ActionChains(driver)
@@ -287,18 +287,35 @@ def click_verify(driver: WebDriver):
time.sleep(2)
def get_correct_window(driver: WebDriver) -> WebDriver:
if len(driver.window_handles) > 1:
for window_handle in driver.window_handles:
driver.switch_to.window(window_handle)
current_url = driver.current_url
if not current_url.startswith("devtools://devtools"):
return driver
return driver
def access_page(driver: WebDriver, url: str) -> None:
driver.get(url)
driver.start_session()
driver.start_session() # required to bypass Cloudflare
def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> ChallengeResolutionT:
res = ChallengeResolutionT({})
res.status = STATUS_OK
res.message = ""
# navigate to the page
logging.debug(f'Navigating to... {req.url}')
if method == 'POST':
_post_request(req, driver)
else:
driver.get(req.url)
driver.start_session() # required to bypass Cloudflare
access_page(driver, req.url)
driver = get_correct_window(driver)
# set cookies if required
if req.cookies is not None and len(req.cookies) > 0:
@@ -310,8 +327,8 @@ def _evil_logic(req: V1RequestBase, driver: WebDriver, method: str) -> Challenge
if method == 'POST':
_post_request(req, driver)
else:
driver.get(req.url)
driver.start_session() # required to bypass Cloudflare
access_page(driver, req.url)
driver = get_correct_window(driver)
# wait for the page
if utils.get_config_log_html():
@@ -431,4 +448,5 @@ def _post_request(req: V1RequestBase, driver: WebDriver):
</body>
</html>"""
driver.get("data:text/html;charset=utf-8," + html_content)
driver.start_session()
driver.start_session() # required to bypass Cloudflare

View File

@@ -17,11 +17,12 @@ by UltrafunkAmsterdam (https://github.com/ultrafunkamsterdam)
from __future__ import annotations
__version__ = "3.5.0"
__version__ = "3.5.5"
import json
import logging
import os
import pathlib
import re
import shutil
import subprocess
@@ -373,6 +374,18 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
browser_executable_path or find_chrome_executable()
)
if not options.binary_location or not \
pathlib.Path(options.binary_location).exists():
raise FileNotFoundError(
"\n---------------------\n"
"Could not determine browser executable."
"\n---------------------\n"
"Make sure your browser is installed in the default location (path).\n"
"If you are sure about the browser executable, you can specify it using\n"
"the `browser_executable_path='{}` parameter.\n\n"
.format("/path/to/browser/executable" if IS_POSIX else "c:/path/to/your/browser.exe")
)
self._delay = 3
self.user_data_dir = user_data_dir
@@ -383,7 +396,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
if no_sandbox:
options.arguments.extend(["--no-sandbox", "--test-type"])
if headless or options.headless:
if headless or getattr(options, 'headless', None):
#workaround until a better checking is found
try:
v_main = int(self.patcher.version_main) if self.patcher.version_main else 108
@@ -438,8 +451,10 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
options.binary_location, *options.arguments
)
else:
startupinfo = subprocess.STARTUPINFO()
startupinfo = None
if os.name == 'nt' and windows_headless:
# STARTUPINFO() is Windows only
startupinfo = subprocess.STARTUPINFO()
startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
browser = subprocess.Popen(
[options.binary_location, *options.arguments],
@@ -451,11 +466,9 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
)
self.browser_pid = browser.pid
# Fix for Chrome 115
# https://github.com/seleniumbase/SeleniumBase/pull/1967
service = selenium.webdriver.chromium.service.ChromiumService(
executable_path=self.patcher.executable_path,
service_args=["--disable-build-check"]
self.patcher.executable_path
)
super(Chrome, self).__init__(
@@ -480,7 +493,7 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
else:
self._web_element_cls = WebElement
if options.headless:
if headless or getattr(options, 'headless', None):
self._configure_headless()
def _configure_headless(self):
@@ -800,7 +813,11 @@ class Chrome(selenium.webdriver.chrome.webdriver.WebDriver):
else:
logger.debug("successfully removed %s" % self.user_data_dir)
break
try:
time.sleep(0.1)
except OSError:
pass
# dereference patcher, so patcher can start cleaning up as well.
# this must come last, otherwise it will throw 'in use' errors
@@ -895,8 +912,6 @@ def find_chrome_executable():
if item is not None:
for subitem in (
"Google/Chrome/Application",
"Google/Chrome Beta/Application",
"Google/Chrome Canary/Application",
):
candidates.add(os.sep.join((item, subitem, "chrome.exe")))
for candidate in candidates:

View File

@@ -41,6 +41,7 @@ def start_detached(executable, *args):
# close pipes
writer.close()
reader.close()
process.close()
return pid

View File

@@ -3,9 +3,11 @@
from distutils.version import LooseVersion
import io
import json
import logging
import os
import pathlib
import platform
import random
import re
import shutil
@@ -19,26 +21,14 @@ from multiprocessing import Lock
logger = logging.getLogger(__name__)
IS_POSIX = sys.platform.startswith(("darwin", "cygwin", "linux", "linux2"))
IS_POSIX = sys.platform.startswith(("darwin", "cygwin", "linux", "linux2", "freebsd"))
class Patcher(object):
lock = Lock()
url_repo = "https://chromedriver.storage.googleapis.com"
zip_name = "chromedriver_%s.zip"
exe_name = "chromedriver%s"
platform = sys.platform
if platform.endswith("win32"):
zip_name %= "win32"
exe_name %= ".exe"
if platform.endswith(("linux", "linux2")):
zip_name %= "linux64"
exe_name %= ""
if platform.endswith("darwin"):
zip_name %= "mac64"
exe_name %= ""
if platform.endswith("win32"):
d = "~/appdata/roaming/undetected_chromedriver"
elif "LAMBDA_TASK_ROOT" in os.environ:
@@ -72,10 +62,29 @@ class Patcher(object):
prefix = "undetected"
self.user_multi_procs = user_multi_procs
try:
# Try to convert version_main into an integer
version_main_int = int(version_main)
# check if version_main_int is less than or equal to e.g 114
self.is_old_chromedriver = version_main and version_main_int <= 114
except (ValueError,TypeError):
# If the conversion fails, print an error message
print("version_main cannot be converted to an integer")
# Set self.is_old_chromedriver to False if the conversion fails
self.is_old_chromedriver = False
# Needs to be called before self.exe_name is accessed
self._set_platform_name()
if not os.path.exists(self.data_path):
os.makedirs(self.data_path, exist_ok=True)
if not executable_path:
if sys.platform.startswith("freebsd"):
self.executable_path = os.path.join(
self.data_path, self.exe_name
)
else:
self.executable_path = os.path.join(
self.data_path, "_".join([prefix, self.exe_name])
)
@@ -97,9 +106,36 @@ class Patcher(object):
self._custom_exe_path = True
self.executable_path = executable_path
# Set the correct repository to download the Chromedriver from
if self.is_old_chromedriver:
self.url_repo = "https://chromedriver.storage.googleapis.com"
else:
self.url_repo = "https://googlechromelabs.github.io/chrome-for-testing"
self.version_main = version_main
self.version_full = None
def _set_platform_name(self):
"""
Set the platform and exe name based on the platform undetected_chromedriver is running on
in order to download the correct chromedriver.
"""
if self.platform.endswith("win32"):
self.platform_name = "win32"
self.exe_name %= ".exe"
if self.platform.endswith(("linux", "linux2")):
self.platform_name = "linux64"
self.exe_name %= ""
if self.platform.endswith("darwin"):
if self.is_old_chromedriver:
self.platform_name = "mac64"
else:
self.platform_name = "mac-x64"
self.exe_name %= ""
if self.platform.startswith("freebsd"):
self.platform_name = "freebsd"
self.exe_name %= ""
def auto(self, executable_path=None, force=False, version_main=None, _=None):
"""
@@ -111,16 +147,15 @@ class Patcher(object):
Returns:
"""
# if self.user_multi_procs and \
# self.user_multi_procs != -1:
# # -1 being a skip value used later in this block
#
p = pathlib.Path(self.data_path)
if self.user_multi_procs:
with Lock():
files = list(p.rglob("*chromedriver*?"))
for file in files:
if self.is_binary_patched(file):
self.executable_path = str(file)
files = list(p.rglob("*chromedriver*"))
most_recent = max(files, key=lambda f: f.stat().st_mtime)
files.remove(most_recent)
list(map(lambda f: f.unlink(), files))
if self.is_binary_patched(most_recent):
self.executable_path = str(most_recent)
return True
if executable_path:
@@ -139,6 +174,35 @@ class Patcher(object):
if force is True:
self.force = force
if self.platform_name == "freebsd":
chromedriver_path = shutil.which("chromedriver")
if not os.path.isfile(chromedriver_path) or not os.access(chromedriver_path, os.X_OK):
logging.error("Chromedriver not installed!")
return
version_path = os.path.join(os.path.dirname(self.executable_path), "version.txt")
process = os.popen(f'"{chromedriver_path}" --version')
chromedriver_version = process.read().split(' ')[1].split(' ')[0]
process.close()
current_version = None
if os.path.isfile(version_path) or os.access(version_path, os.X_OK):
with open(version_path, 'r') as f:
current_version = f.read()
if current_version != chromedriver_version:
logging.info("Copying chromedriver executable...")
shutil.copy(chromedriver_path, self.executable_path)
os.chmod(self.executable_path, 0o755)
with open(version_path, 'w') as f:
f.write(chromedriver_version)
logging.info("Chromedriver executable copied!")
else:
try:
os.unlink(self.executable_path)
except PermissionError:
@@ -159,6 +223,7 @@ class Patcher(object):
self.version_main = release.version[0]
self.version_full = release
self.unzip_package(self.fetch_package())
return self.patch()
def driver_binary_in_use(self, path: str = None) -> bool:
@@ -202,7 +267,11 @@ class Patcher(object):
def cleanup_unused_files(self):
p = pathlib.Path(self.data_path)
items = list(p.glob("*undetected*"))
print(items)
for item in items:
try:
item.unlink()
except:
pass
def patch(self):
self.patch_exe()
@@ -214,13 +283,33 @@ class Patcher(object):
:return: version string
:rtype: LooseVersion
"""
path = "/latest_release"
if self.version_main:
path += f"_{self.version_main}"
# Endpoint for old versions of Chromedriver (114 and below)
if self.is_old_chromedriver:
path = f"/latest_release_{self.version_main}"
path = path.upper()
logger.debug("getting release number from %s" % path)
return LooseVersion(urlopen(self.url_repo + path).read().decode())
# Endpoint for new versions of Chromedriver (115+)
if not self.version_main:
# Fetch the latest version
path = "/last-known-good-versions-with-downloads.json"
logger.debug("getting release number from %s" % path)
with urlopen(self.url_repo + path) as conn:
response = conn.read().decode()
last_versions = json.loads(response)
return LooseVersion(last_versions["channels"]["Stable"]["version"])
# Fetch the latest minor version of the major version provided
path = "/latest-versions-per-milestone-with-downloads.json"
logger.debug("getting release number from %s" % path)
with urlopen(self.url_repo + path) as conn:
response = conn.read().decode()
major_versions = json.loads(response)
return LooseVersion(major_versions["milestones"][str(self.version_main)]["version"])
def parse_exe_version(self):
with io.open(self.executable_path, "rb") as f:
for line in iter(lambda: f.readline(), b""):
@@ -234,10 +323,16 @@ class Patcher(object):
:return: path to downloaded file
"""
u = "%s/%s/%s" % (self.url_repo, self.version_full.vstring, self.zip_name)
logger.debug("downloading from %s" % u)
# return urlretrieve(u, filename=self.data_path)[0]
return urlretrieve(u)[0]
zip_name = f"chromedriver_{self.platform_name}.zip"
if self.is_old_chromedriver:
download_url = "%s/%s/%s" % (self.url_repo, self.version_full.vstring, zip_name)
else:
zip_name = zip_name.replace("_", "-", 1)
download_url = "https://storage.googleapis.com/chrome-for-testing-public/%s/%s/%s"
download_url %= (self.version_full.vstring, self.platform_name, zip_name)
logger.debug("downloading from %s" % download_url)
return urlretrieve(download_url)[0]
def unzip_package(self, fp):
"""
@@ -245,6 +340,12 @@ class Patcher(object):
:return: path to unpacked executable
"""
exe_path = self.exe_name
if not self.is_old_chromedriver:
# The new chromedriver unzips into its own folder
zip_name = f"chromedriver-{self.platform_name}"
exe_path = os.path.join(zip_name, self.exe_name)
logger.debug("unzipping %s" % fp)
try:
os.unlink(self.zip_path)
@@ -253,10 +354,10 @@ class Patcher(object):
os.makedirs(self.zip_path, mode=0o755, exist_ok=True)
with zipfile.ZipFile(fp, mode="r") as zf:
zf.extract(self.exe_name, self.zip_path)
os.rename(os.path.join(self.zip_path, self.exe_name), self.executable_path)
zf.extractall(self.zip_path)
os.rename(os.path.join(self.zip_path, exe_path), self.executable_path)
os.remove(fp)
os.rmdir(self.zip_path)
shutil.rmtree
os.chmod(self.executable_path, 0o755)
return self.executable_path

View File

@@ -5,6 +5,7 @@ import re
import shutil
import urllib.parse
import tempfile
import sys
from selenium.webdriver.chrome.webdriver import WebDriver
import undetected_chromedriver as uc
@@ -113,7 +114,7 @@ def create_proxy_extension(proxy: dict) -> str:
def get_webdriver(proxy: dict = None) -> WebDriver:
global PATCHED_DRIVER_PATH
global PATCHED_DRIVER_PATH, USER_AGENT
logging.debug('Launching web browser...')
# undetected_chromedriver
@@ -136,6 +137,14 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
# https://peter.sh/experiments/chromium-command-line-switches/#use-gl
options.add_argument('--use-gl=swiftshader')
language = os.environ.get('LANG', None)
if language is not None:
options.add_argument('--lang=%s' % language)
# Fix for Chrome 117 | https://github.com/FlareSolverr/FlareSolverr/issues/910
if USER_AGENT is not None:
options.add_argument('--user-agent=%s' % USER_AGENT)
proxy_extension_dir = None
if proxy and all(key in proxy for key in ['url', 'username', 'password']):
proxy_extension_dir = create_proxy_extension(proxy)
@@ -145,7 +154,7 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
logging.debug("Using webdriver proxy: %s", proxy_url)
options.add_argument('--proxy-server=%s' % proxy_url)
# note: headless mode is detected (options.headless = True)
# note: headless mode is detected (headless = True)
# we launch the browser in head-full mode with the window hidden
windows_headless = False
if get_config_headless():
@@ -153,6 +162,10 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
windows_headless = True
else:
start_xvfb_display()
# For normal headless mode:
# options.add_argument('--headless')
options.add_argument("--auto-open-devtools-for-tabs")
# if we are inside the Docker container, we avoid downloading the driver
driver_exe_path = None
@@ -162,10 +175,6 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
driver_exe_path = "/app/chromedriver"
else:
version_main = get_chrome_major_version()
# Fix for Chrome 115
# https://github.com/seleniumbase/SeleniumBase/pull/1967
if int(version_main) > 114:
version_main = 114
if PATCHED_DRIVER_PATH is not None:
driver_exe_path = PATCHED_DRIVER_PATH
@@ -174,9 +183,12 @@ def get_webdriver(proxy: dict = None) -> WebDriver:
# downloads and patches the chromedriver
# if we don't set driver_executable_path it downloads, patches, and deletes the driver each time
try:
driver = uc.Chrome(options=options, browser_executable_path=browser_executable_path,
driver_executable_path=driver_exe_path, version_main=version_main,
windows_headless=windows_headless, headless=windows_headless)
windows_headless=windows_headless, headless=get_config_headless())
except Exception as e:
logging.error("Error starting Chrome: %s" % e)
# save the patched driver to avoid re-downloads
if driver_exe_path is None:
@@ -295,6 +307,8 @@ def get_user_agent(driver=None) -> str:
if driver is None:
driver = get_webdriver()
USER_AGENT = driver.execute_script("return navigator.userAgent")
# Fix for Chrome 117 | https://github.com/FlareSolverr/FlareSolverr/issues/910
USER_AGENT = re.sub('HEADLESS', '', USER_AGENT, flags=re.IGNORECASE)
return USER_AGENT
except Exception as e:
raise Exception("Error getting browser User-Agent. " + str(e))