mirror of
https://github.com/josegonzalez/python-github-backup.git
synced 2026-02-16 18:04:30 +01:00
Compare commits
20 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 60067650b0 | | |
| | 655886fa80 | | |
| | 0162f7ed46 | | |
| | 8c1a13475a | | |
| | 6268a4c5c6 | | |
| | 4b2295db0d | | |
| | be900d1f3f | | |
| | 9be6282719 | | |
| | 1102990af0 | | |
| | 311ffb40cd | | |
| | 2f5e7c2dcf | | |
| | 0d8a504b02 | | |
| | 712d22d124 | | |
| | e0c9d65225 | | |
| | 52d996f784 | | |
| | e6283f9384 | | |
| | 1181f811b7 | | |
| | 856ad5db41 | | |
| | c6fa8c7695 | | |
| | 93e505c07d | | |
127
CHANGES.rst
@@ -1,9 +1,134 @@
 Changelog
 =========

-0.61.1 (2026-01-13)
+0.61.4 (2026-02-16)
 -------------------
 ------------------------
+- Fix HTTP 451 DMCA and 403 TOS handling regression (#487) [Rodos]
+
+  The DMCA handling added in PR #454 had a bug: make_request_with_retry()
+  raises HTTPError before retrieve_data() could check the status code via
+  getcode(), making the case 451 handler dead code. This also affected
+  HTTP 403 TOS violations (e.g. jumoog/MagiskOnWSA).
+
+  Fix by catching HTTPError in retrieve_data() and converting 451 and
+  blocked 403 responses (identified by "block" key in response body) to
+  RepositoryUnavailableError. Non-block 403s (permissions, scopes) still
+  propagate as HTTPError. Also handle RepositoryUnavailableError in
+  retrieve_repositories() for the --repository case.
+
+  Rewrote tests to mock urlopen (not make_request_with_retry) to exercise
+  the real code path that was previously untested.
+
+  Closes #487
+- Chore(deps): bump setuptools in the python-packages group.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 1 update: [setuptools](https://github.com/pypa/setuptools).
+
+
+  Updates `setuptools` from 80.10.2 to 82.0.0
+  - [Release notes](https://github.com/pypa/setuptools/releases)
+  - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
+  - [Commits](https://github.com/pypa/setuptools/compare/v80.10.2...v82.0.0)
+
+  ---
+  updated-dependencies:
+  - dependency-name: setuptools
+    dependency-version: 82.0.0
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+    dependency-group: python-packages
+  ...
+- Chore(deps): bump setuptools in the python-packages group.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 1 update: [setuptools](https://github.com/pypa/setuptools).
+
+
+  Updates `setuptools` from 80.10.1 to 80.10.2
+  - [Release notes](https://github.com/pypa/setuptools/releases)
+  - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
+  - [Commits](https://github.com/pypa/setuptools/compare/v80.10.1...v80.10.2)
+
+  ---
+  updated-dependencies:
+  - dependency-name: setuptools
+    dependency-version: 80.10.2
+    dependency-type: direct:production
+    update-type: version-update:semver-patch
+    dependency-group: python-packages
+  ...
+
+
+0.61.3 (2026-01-24)
+-------------------
+- Fix KeyError: 'Private' when using --all flag (#481) [Rodos]
+
+  The repository dictionary uses lowercase "private" key. Use .get() with
+  the correct case to match the pattern used elsewhere in the codebase.
+
+  The bug only affects --all users since --security-advisories short-circuits
+  before the key access.
+- Chore(deps): bump setuptools in the python-packages group.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 1 update: [setuptools](https://github.com/pypa/setuptools).
+
+
+  Updates `setuptools` from 80.9.0 to 80.10.1
+  - [Release notes](https://github.com/pypa/setuptools/releases)
+  - [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
+  - [Commits](https://github.com/pypa/setuptools/compare/v80.9.0...v80.10.1)
+
+  ---
+  updated-dependencies:
+  - dependency-name: setuptools
+    dependency-version: 80.10.1
+    dependency-type: direct:production
+    update-type: version-update:semver-minor
+    dependency-group: python-packages
+  ...
+
+
+0.61.2 (2026-01-19)
+-------------------
+
+Fix
+~~~
+- Skip security advisories for private repos unless explicitly
+  requested. [Lukas Bestle]
+- Handle 404 errors on security advisories. [Lukas Bestle]
+
+Other
+~~~~~
+- Chore(deps): bump black in the python-packages group.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 1 update: [black](https://github.com/psf/black).
+
+
+  Updates `black` from 25.12.0 to 26.1.0
+  - [Release notes](https://github.com/psf/black/releases)
+  - [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
+  - [Commits](https://github.com/psf/black/compare/25.12.0...26.1.0)
+
+  ---
+  updated-dependencies:
+  - dependency-name: black
+    dependency-version: 26.1.0
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+    dependency-group: python-packages
+  ...
+- Docs: Explain security advisories in README. [Lukas Bestle]
+- Feat: Only make security advisory dir if successful. [Lukas Bestle]
+
+  Avoids empty directories for private repos
+
+
+0.61.1 (2026-01-13)
+-------------------
 - Refactor test fixtures to use shared create_args helper. [Rodos]

   Uses the real parse_args() function to get CLI defaults, so when

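The 0.61.3 entry above attributes the `KeyError` to a key-case mismatch. The following is a minimal sketch of that failure mode and of the `.get()` pattern the entry describes. The `repository` dictionary and both helper names are made up for illustration; they are not taken from the project.

```python
# Hypothetical repository record. The changelog only states that the real
# dictionary uses a lowercase "private" key.
repository = {"full_name": "example-owner/example-repo", "private": True}


def is_private_strict(repo):
    # Wrong case: raises KeyError because there is no "Private" key.
    return repo["Private"]


def is_private(repo):
    # Lowercase key with a default, so a missing key reads as "not private".
    return repo.get("private", False)


try:
    is_private_strict(repository)
except KeyError as exc:
    print(f"KeyError: {exc}")

print(is_private(repository))
print(is_private({"full_name": "example-owner/other-repo"}))
```

Using `.get()` with a default also keeps the check safe for records that omit the key entirely, which plain indexing would not.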
11
README.rst
@@ -284,6 +284,17 @@ The tool automatically extracts file extensions from HTTP headers to ensure file
 **Fine-grained token limitation:** Due to a GitHub platform limitation, fine-grained personal access tokens (``github_pat_...``) cannot download attachments from private repositories directly. This affects both ``/assets/`` (images) and ``/files/`` (documents) URLs. The tool implements a workaround for image attachments using GitHub's Markdown API, which converts URLs to temporary JWT-signed URLs that can be downloaded. However, this workaround only works for images - document attachments (PDFs, text files, etc.) will fail with 404 errors when using fine-grained tokens on private repos. For full attachment support on private repositories, use a classic token (``-t``) instead of a fine-grained token (``-f``). See `#477 <https://github.com/josegonzalez/python-github-backup/issues/477>`_ for details.


+About security advisories
+-------------------------
+
+GitHub security advisories are only available in public repositories. GitHub does not provide the respective API endpoint for private repositories.
+
+Therefore the logic is implemented as follows:
+- Security advisories are included in the `--all` option.
+- If only the `--all` option was provided, backups of security advisories are skipped for private repositories.
+- If the `--security-advisories` option is provided (on its own or in addition to `--all`), a backup of security advisories is attempted for all repositories, with graceful handling if the GitHub API doesn't return any.
+
+
 Run in Docker container
 -----------------------

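The three README rules above reduce to a single boolean decision per repository. This is a small sketch of that decision, assuming the two flags and the repository's privacy are already known as plain booleans. The function name `should_backup_advisories` is hypothetical and does not appear in the project.

```python
def should_backup_advisories(include_security_advisories, include_everything, private):
    """Return True when a security-advisory backup should be attempted."""
    # An explicit --security-advisories always wins, even for private repositories.
    if include_security_advisories:
        return True
    # --all on its own covers advisories only for public repositories.
    return bool(include_everything and not private)


# Walk through the flag combinations for a private and a public repository.
for private in (True, False):
    for explicit in (True, False):
        for everything in (True, False):
            decision = should_backup_advisories(explicit, everything, private)
            print(f"private={private} explicit={explicit} all={everything} -> {decision}")
```

The only case where `--all` and `--security-advisories` differ is a private repository, which is the case the README singles out.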
@@ -1 +1 @@
-__version__ = "0.61.1"
+__version__ = "0.61.4"

@@ -39,11 +39,11 @@ logger = logging.getLogger(__name__)


 class RepositoryUnavailableError(Exception):
-    """Raised when a repository is unavailable due to legal reasons (e.g., DMCA takedown)."""
+    """Raised when a repository is unavailable due to legal reasons (e.g., DMCA takedown, TOS violation)."""

-    def __init__(self, message, dmca_url=None):
+    def __init__(self, message, legal_url=None):
         super().__init__(message)
-        self.dmca_url = dmca_url
+        self.legal_url = legal_url


 # Setup SSL context with fallback chain
@@ -647,6 +647,14 @@ def retrieve_data(args, template, query_args=None, paginated=True):
         return None

     def fetch_all() -> Generator[dict, None, None]:
+        def _extract_legal_url(response_body_bytes):
+            """Extract DMCA/legal notice URL from GitHub API error response body."""
+            try:
+                data = json.loads(response_body_bytes.decode("utf-8"))
+                return data.get("block", {}).get("html_url")
+            except Exception:
+                return None
+
         next_url = None

         while True:
@@ -661,47 +669,66 @@ def retrieve_data(args, template, query_args=None, paginated=True):
                     as_app=args.as_app,
                     fine=args.token_fine is not None,
                 )
-                http_response = make_request_with_retry(request, auth, args.max_retries)
-
-                match http_response.getcode():
-                    case 200:
-                        # Success - Parse JSON response
-                        try:
-                            response = json.loads(http_response.read().decode("utf-8"))
-                            break  # Exit retry loop and handle the data returned
-                        except (
-                            IncompleteRead,
-                            json.decoder.JSONDecodeError,
-                            TimeoutError,
-                        ) as e:
-                            logger.warning(f"{type(e).__name__} reading response")
-                            if attempt < args.max_retries:
-                                delay = calculate_retry_delay(attempt, {})
-                                logger.warning(
-                                    f"Retrying read in {delay:.1f}s (attempt {attempt + 1}/{args.max_retries + 1})"
-                                )
-                                time.sleep(delay)
-                                continue  # Next retry attempt
-
-                    case 451:
-                        # DMCA takedown - extract URL if available, then raise
-                        dmca_url = None
-                        try:
-                            response_data = json.loads(
-                                http_response.read().decode("utf-8")
-                            )
-                            dmca_url = response_data.get("block", {}).get("html_url")
-                        except Exception:
-                            pass
+                try:
+                    http_response = make_request_with_retry(
+                        request, auth, args.max_retries
+                    )
+                except HTTPError as exc:
+                    if exc.code == 451:
+                        legal_url = _extract_legal_url(exc.read())
                         raise RepositoryUnavailableError(
-                            "Repository unavailable due to legal reasons (HTTP 451)",
-                            dmca_url=dmca_url,
+                            f"Repository unavailable due to legal reasons (HTTP {exc.code})",
+                            legal_url=legal_url,
                         )
+                    elif exc.code == 403:
+                        # Rate-limit 403s (x-ratelimit-remaining=0) are retried
+                        # by make_request_with_retry — re-raise if exhausted.
+                        if int(exc.headers.get("x-ratelimit-remaining", 1)) < 1:
+                            raise
+                        # Only convert to RepositoryUnavailableError if GitHub
+                        # indicates a TOS/DMCA block (response contains "block"
+                        # key). Other 403s (permissions, scopes) should propagate.
+                        body = exc.read()
+                        try:
+                            data = json.loads(body.decode("utf-8"))
+                        except Exception:
+                            data = {}
+                        if "block" in data:
+                            raise RepositoryUnavailableError(
+                                "Repository access blocked (HTTP 403)",
+                                legal_url=data.get("block", {}).get("html_url"),
+                            )
+                        raise
+                    else:
+                        raise

-                    case _:
-                        raise Exception(
-                            f"API request returned HTTP {http_response.getcode()}: {http_response.reason}"
+                # urlopen raises HTTPError for non-2xx, so only success gets here.
+                # Guard against unexpected status codes from proxies, future Python
+                # changes, or other edge cases we haven't considered.
+                status = http_response.getcode()
+                if status != 200:
+                    raise Exception(
+                        f"Unexpected HTTP {status} from {next_url or template} "
+                        f"(expected non-2xx to raise HTTPError)"
+                    )
+
+                # Parse JSON response
+                try:
+                    response = json.loads(http_response.read().decode("utf-8"))
+                    break  # Exit retry loop and handle the data returned
+                except (
+                    IncompleteRead,
+                    json.decoder.JSONDecodeError,
+                    TimeoutError,
+                ) as e:
+                    logger.warning(f"{type(e).__name__} reading response")
+                    if attempt < args.max_retries:
+                        delay = calculate_retry_delay(attempt, {})
+                        logger.warning(
+                            f"Retrying read in {delay:.1f}s (attempt {attempt + 1}/{args.max_retries + 1})"
                         )
+                        time.sleep(delay)
+                        continue  # Next retry attempt
             else:
                 logger.error(
                     f"Failed to read response after {args.max_retries + 1} attempts for {next_url or template}"
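The hunk above sorts an `HTTPError` into "repository blocked for legal or TOS reasons" versus "ordinary error that should propagate". This is a standalone sketch of the same classification, under two assumptions that differ from the diff: it returns the resulting exception instead of raising it, which keeps the example easy to inspect, and the names `classify_http_error` and `make_error` are hypothetical. The local `RepositoryUnavailableError` is a stand-in for the class shown earlier in this diff.

```python
import io
import json
from urllib.error import HTTPError


class RepositoryUnavailableError(Exception):
    """Local stand-in for the exception class shown earlier in this diff."""

    def __init__(self, message, legal_url=None):
        super().__init__(message)
        self.legal_url = legal_url


def classify_http_error(exc):
    """Return a RepositoryUnavailableError for blocks, else the original error."""
    if exc.code not in (403, 451):
        return exc
    # A 403 with an exhausted rate limit is a throttling problem, not a block.
    if exc.code == 403 and int(exc.headers.get("x-ratelimit-remaining", 1)) < 1:
        return exc
    try:
        data = json.loads(exc.read().decode("utf-8"))
    except Exception:
        data = {}
    if not isinstance(data, dict):
        data = {}
    # 451 always counts as a block; 403 only when the body carries a "block" key.
    if exc.code == 451 or "block" in data:
        block = data.get("block")
        legal_url = block.get("html_url") if isinstance(block, dict) else None
        return RepositoryUnavailableError(
            f"Repository unavailable (HTTP {exc.code})", legal_url=legal_url
        )
    return exc


def make_error(code, payload, remaining="5000"):
    """Build an HTTPError with a readable JSON body (illustrative helper)."""
    return HTTPError(
        "https://api.github.com/repos/example/repo",
        code,
        "Error",
        {"x-ratelimit-remaining": remaining},
        io.BytesIO(json.dumps(payload).encode("utf-8")),
    )


result = classify_http_error(
    make_error(451, {"block": {"html_url": "https://example.com/notice"}})
)
print(type(result).__name__, result.legal_url)
```

Checking the rate-limit header before reading the body mirrors the order in the hunk, so a throttled request is never mistaken for a block.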
@@ -1614,7 +1641,13 @@ def retrieve_repositories(args, authenticated_user):
         paginated = False
         template = "https://{0}/repos/{1}".format(get_github_api_host(args), repo_path)

-    repos = retrieve_data(args, template, paginated=paginated)
+    try:
+        repos = retrieve_data(args, template, paginated=paginated)
+    except RepositoryUnavailableError as e:
+        logger.warning(f"Repository is unavailable: {e}")
+        if e.legal_url:
+            logger.warning(f"Legal notice: {e.legal_url}")
+        return []

     if args.all_starred:
         starred_template = "https://{0}/users/{1}/starred".format(
@@ -1814,7 +1847,7 @@ def backup_repositories(args, output_directory, repositories):
             if args.include_milestones or args.include_everything:
                 backup_milestones(args, repo_cwd, repository, repos_template)

-            if args.include_security_advisories or args.include_everything:
+            if args.include_security_advisories or (args.include_everything and not repository.get("private", False)):
                 backup_security_advisories(args, repo_cwd, repository, repos_template)

             if args.include_labels or args.include_everything:
@@ -1832,11 +1865,9 @@ def backup_repositories(args, output_directory, repositories):
                     include_assets=args.include_assets or args.include_everything,
                 )
         except RepositoryUnavailableError as e:
-            logger.warning(
-                f"Repository {repository['full_name']} is unavailable (HTTP 451)"
-            )
-            if e.dmca_url:
-                logger.warning(f"DMCA notice: {e.dmca_url}")
+            logger.warning(f"Repository {repository['full_name']} is unavailable: {e}")
+            if e.legal_url:
+                logger.warning(f"Legal notice: {e.legal_url}")
             logger.info(f"Skipping remaining resources for {repository['full_name']}")
             continue

@@ -2039,13 +2070,20 @@ def backup_security_advisories(args, repo_cwd, repository, repos_template):
         return

     logger.info("Retrieving {0} security advisories".format(repository["full_name"]))
-    mkdir_p(repo_cwd, advisory_cwd)

     template = "{0}/{1}/security-advisories".format(
         repos_template, repository["full_name"]
     )

-    _advisories = retrieve_data(args, template)
+    try:
+        _advisories = retrieve_data(args, template)
+    except Exception as e:
+        if "404" in str(e):
+            logger.info("Security advisories are not available for this repository, skipping")
+            return
+        raise
+
+    mkdir_p(repo_cwd, advisory_cwd)

     advisories = {}
     for advisory in _advisories:

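The last hunk above moves directory creation after the fetch, so a repository without advisories leaves no empty directory behind. This is a minimal sketch of that ordering. It is a variation rather than the code in the diff: it inspects `HTTPError.code` instead of searching the exception text for "404", and `backup_advisories`, `fetch`, and `target_dir` are hypothetical names.

```python
import io
import os
import tempfile
from urllib.error import HTTPError


def backup_advisories(fetch, target_dir):
    """Fetch first; create target_dir only once the fetch has succeeded."""
    try:
        advisories = list(fetch())
    except HTTPError as exc:
        if exc.code == 404:
            # Endpoint not available for this repository: skip without side effects.
            return None
        raise
    os.makedirs(target_dir, exist_ok=True)
    return advisories


def fetch_missing():
    # Simulates an API call that answers 404 Not Found.
    raise HTTPError("https://api.github.com/example", 404, "Not Found", {}, io.BytesIO(b""))


def fetch_ok():
    # Simulates a successful call returning one advisory record.
    return [{"ghsa_id": "GHSA-xxxx-xxxx-xxxx"}]


with tempfile.TemporaryDirectory() as root:
    skipped_dir = os.path.join(root, "skipped")
    saved_dir = os.path.join(root, "saved")
    print(backup_advisories(fetch_missing, skipped_dir), os.path.isdir(skipped_dir))
    print(len(backup_advisories(fetch_ok, saved_dir)), os.path.isdir(saved_dir))
```

Matching on the status code avoids false positives when "404" happens to appear elsewhere in an error message, at the cost of only recognising errors that are still `HTTPError` instances when they reach this point.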
@@ -1,6 +1,6 @@
 # Linting & Formatting
 autopep8==2.3.2
-black==25.12.0
+black==26.1.0
 flake8==7.3.0

 # Testing
@@ -9,7 +9,7 @@ pytest==9.0.2
 # Release & Publishing
 twine==6.2.0
 gitchangelog==3.0.4
-setuptools==80.9.0
+setuptools==82.0.0

 # Documentation
 restructuredtext-lint==2.0.2

@@ -1,13 +1,28 @@
-"""Tests for HTTP 451 (DMCA takedown) handling."""
+"""Tests for HTTP 451 (DMCA takedown) and HTTP 403 (TOS) handling."""

+import io
 import json
-from unittest.mock import Mock, patch
+from unittest.mock import patch
+from urllib.error import HTTPError

 import pytest

 from github_backup import github_backup


+def _make_http_error(code, body_bytes, msg="Error", headers=None):
+    """Create an HTTPError with a readable body (like a real urllib response)."""
+    if headers is None:
+        headers = {"x-ratelimit-remaining": "5000"}
+    return HTTPError(
+        url="https://api.github.com/repos/test/repo",
+        code=code,
+        msg=msg,
+        hdrs=headers,
+        fp=io.BytesIO(body_bytes),
+    )
+
+
 class TestHTTP451Exception:
     """Test suite for HTTP 451 DMCA takedown exception handling."""

@@ -15,9 +30,6 @@ class TestHTTP451Exception:
         """HTTP 451 should raise RepositoryUnavailableError with DMCA URL."""
         args = create_args()

-        mock_response = Mock()
-        mock_response.getcode.return_value = 451
-
         dmca_data = {
             "message": "Repository access blocked",
             "block": {
@@ -26,66 +38,166 @@ class TestHTTP451Exception:
                 "html_url": "https://github.com/github/dmca/blob/master/2024/11/2024-11-04-source-code.md",
             },
         }
-        mock_response.read.return_value = json.dumps(dmca_data).encode("utf-8")
-        mock_response.headers = {"x-ratelimit-remaining": "5000"}
-        mock_response.reason = "Unavailable For Legal Reasons"
+        body = json.dumps(dmca_data).encode("utf-8")

-        with patch(
-            "github_backup.github_backup.make_request_with_retry",
-            return_value=mock_response,
-        ):
+        def mock_urlopen(*a, **kw):
+            raise _make_http_error(451, body, msg="Unavailable For Legal Reasons")
+
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
             with pytest.raises(github_backup.RepositoryUnavailableError) as exc_info:
                 github_backup.retrieve_data(
                     args, "https://api.github.com/repos/test/dmca/issues"
                 )

         assert (
-            exc_info.value.dmca_url
+            exc_info.value.legal_url
             == "https://github.com/github/dmca/blob/master/2024/11/2024-11-04-source-code.md"
         )
         assert "451" in str(exc_info.value)

-    def test_repository_unavailable_error_without_dmca_url(self, create_args):
+    def test_repository_unavailable_error_without_legal_url(self, create_args):
         """HTTP 451 without DMCA details should still raise exception."""
         args = create_args()

-        mock_response = Mock()
-        mock_response.getcode.return_value = 451
-        mock_response.read.return_value = b'{"message": "Blocked"}'
-        mock_response.headers = {"x-ratelimit-remaining": "5000"}
-        mock_response.reason = "Unavailable For Legal Reasons"
+        def mock_urlopen(*a, **kw):
+            raise _make_http_error(451, b'{"message": "Blocked"}')

-        with patch(
-            "github_backup.github_backup.make_request_with_retry",
-            return_value=mock_response,
-        ):
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
             with pytest.raises(github_backup.RepositoryUnavailableError) as exc_info:
                 github_backup.retrieve_data(
                     args, "https://api.github.com/repos/test/dmca/issues"
                 )

-        assert exc_info.value.dmca_url is None
+        assert exc_info.value.legal_url is None
         assert "451" in str(exc_info.value)

     def test_repository_unavailable_error_with_malformed_json(self, create_args):
         """HTTP 451 with malformed JSON should still raise exception."""
         args = create_args()

-        mock_response = Mock()
-        mock_response.getcode.return_value = 451
-        mock_response.read.return_value = b"invalid json {"
-        mock_response.headers = {"x-ratelimit-remaining": "5000"}
-        mock_response.reason = "Unavailable For Legal Reasons"
+        def mock_urlopen(*a, **kw):
+            raise _make_http_error(451, b"invalid json {")

-        with patch(
-            "github_backup.github_backup.make_request_with_retry",
-            return_value=mock_response,
-        ):
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
             with pytest.raises(github_backup.RepositoryUnavailableError):
                 github_backup.retrieve_data(
                     args, "https://api.github.com/repos/test/dmca/issues"
                 )


+class TestHTTP403TOS:
+    """Test suite for HTTP 403 TOS violation handling."""
+
+    def test_403_tos_raises_repository_unavailable(self, create_args):
+        """HTTP 403 (non-rate-limit) should raise RepositoryUnavailableError."""
+        args = create_args()
+
+        tos_data = {
+            "message": "Repository access blocked",
+            "block": {
+                "reason": "tos",
+                "html_url": "https://github.com/contact/tos-violation",
+            },
+        }
+        body = json.dumps(tos_data).encode("utf-8")
+
+        def mock_urlopen(*a, **kw):
+            raise _make_http_error(
+                403,
+                body,
+                msg="Forbidden",
+                headers={"x-ratelimit-remaining": "5000"},
+            )
+
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
+            with pytest.raises(github_backup.RepositoryUnavailableError) as exc_info:
+                github_backup.retrieve_data(
+                    args, "https://api.github.com/repos/test/blocked/issues"
+                )
+
+        assert (
+            exc_info.value.legal_url == "https://github.com/contact/tos-violation"
+        )
+        assert "403" in str(exc_info.value)
+
+    def test_403_permission_denied_not_converted(self, create_args):
+        """HTTP 403 without 'block' in body should propagate as HTTPError, not RepositoryUnavailableError."""
+        args = create_args()
+
+        body = json.dumps({"message": "Must have admin rights to Repository."}).encode(
+            "utf-8"
+        )
+
+        def mock_urlopen(*a, **kw):
+            raise _make_http_error(
+                403,
+                body,
+                msg="Forbidden",
+                headers={"x-ratelimit-remaining": "5000"},
+            )
+
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
+            with pytest.raises(HTTPError) as exc_info:
+                github_backup.retrieve_data(
+                    args, "https://api.github.com/repos/test/private/issues"
+                )
+
+        assert exc_info.value.code == 403
+
+    def test_403_rate_limit_not_converted(self, create_args):
+        """HTTP 403 with rate limit exhausted should NOT become RepositoryUnavailableError."""
+        args = create_args()
+
+        call_count = 0
+
+        def mock_urlopen(*a, **kw):
+            nonlocal call_count
+            call_count += 1
+            raise _make_http_error(
+                403,
+                b'{"message": "rate limit"}',
+                msg="Forbidden",
+                headers={"x-ratelimit-remaining": "0"},
+            )
+
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
+            with patch(
+                "github_backup.github_backup.calculate_retry_delay", return_value=0
+            ):
+                with pytest.raises(HTTPError) as exc_info:
+                    github_backup.retrieve_data(
+                        args, "https://api.github.com/repos/test/ratelimit/issues"
+                    )
+
+        assert exc_info.value.code == 403
+        # Should have retried (not raised immediately as RepositoryUnavailableError)
+        assert call_count > 1
+
+
+class TestRetrieveRepositoriesUnavailable:
+    """Test that retrieve_repositories handles RepositoryUnavailableError gracefully."""
+
+    def test_unavailable_repo_returns_empty_list(self, create_args):
+        """retrieve_repositories should return [] when the repo is unavailable."""
+        args = create_args(repository="blocked-repo")
+
+        def mock_urlopen(*a, **kw):
+            raise _make_http_error(
+                451,
+                json.dumps(
+                    {
+                        "message": "Blocked",
+                        "block": {"html_url": "https://example.com/dmca"},
+                    }
+                ).encode("utf-8"),
+                msg="Unavailable For Legal Reasons",
+            )
+
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
+            repos = github_backup.retrieve_repositories(args, {"login": None})
+
+        assert repos == []
+
+
 if __name__ == "__main__":
     pytest.main([__file__, "-v"])

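The rewritten tests above patch `urlopen` so that it raises, rather than patching `make_request_with_retry` to return a mock response. Per the changelog, the earlier mock handed back a 451 response that the real function never returns, so the handler under test was unreachable in production. This is a self-contained sketch of the same mocking pattern against a tiny stand-in function. `fetch_json`, `make_http_error`, and `run_demo` are hypothetical names, and the real tests patch `github_backup.github_backup.urlopen` instead.

```python
import io
import json
import urllib.request
from unittest.mock import patch
from urllib.error import HTTPError


def fetch_json(url):
    """Tiny stand-in for the code under test: parse JSON or let HTTPError escape."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def make_http_error(code, body_bytes):
    """Build an HTTPError whose body can be read, like a real urllib error."""
    return HTTPError(
        url="https://api.github.com/repos/test/repo",
        code=code,
        msg="Error",
        hdrs={},
        fp=io.BytesIO(body_bytes),
    )


def run_demo():
    def fake_urlopen(*args, **kwargs):
        # Raising from side_effect mimics urlopen's behaviour on non-2xx responses.
        raise make_http_error(451, b'{"block": {"html_url": "https://example.com/dmca"}}')

    with patch.object(urllib.request, "urlopen", side_effect=fake_urlopen):
        try:
            fetch_json("https://api.github.com/repos/test/repo")
        except HTTPError as exc:
            # The error body stays readable because fp is a BytesIO.
            return exc.code, json.loads(exc.read().decode("utf-8"))
    return None


print(run_demo())
```

Passing `fp=io.BytesIO(...)` is what lets the code under test call `exc.read()`, which is the path the rewritten tests are meant to exercise.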
@@ -288,6 +288,28 @@ class TestMakeRequestWithRetry:
         assert exc_info.value.code == 403
         assert call_count == 1  # No retries

+    def test_451_error_not_retried(self):
+        """HTTP 451 should not be retried - raise immediately."""
+        call_count = 0
+
+        def mock_urlopen(*args, **kwargs):
+            nonlocal call_count
+            call_count += 1
+            raise HTTPError(
+                url="https://api.github.com/test",
+                code=451,
+                msg="Unavailable For Legal Reasons",
+                hdrs={"x-ratelimit-remaining": "5000"},
+                fp=None,
+            )
+
+        with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
+            with pytest.raises(HTTPError) as exc_info:
+                make_request_with_retry(Mock(), None)
+
+        assert exc_info.value.code == 451
+        assert call_count == 1  # No retries
+
     def test_connection_error_retries_and_succeeds(self):
         """URLError (connection error) should retry and succeed if subsequent request works."""
         good_response = Mock()