mirror of https://github.com/josegonzalez/python-github-backup.git, synced 2025-12-05 16:18:02 +01:00
Compare commits
9 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 8b7512c8d8 | |
| | 995b7ede6c | |
| | 7840528fe2 | |
| | 6fb0d86977 | |
| | 9f6b401171 | |
| | bf638f7aea | |
| | c3855a94f1 | |
| | c3f4bfde0d | |
| | d3edef0622 | |
2
.github/workflows/automatic-release.yml
vendored
@@ -18,7 +18,7 @@ jobs:
     runs-on: ubuntu-24.04
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           fetch-depth: 0
           ssh-key: ${{ secrets.DEPLOY_PRIVATE_KEY }}
2
.github/workflows/docker.yml
vendored
@@ -38,7 +38,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           persist-credentials: false
 
2
.github/workflows/lint.yml
vendored
@@ -21,7 +21,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           fetch-depth: 0
       - name: Setup Python
2
.github/workflows/test.yml
vendored
@@ -21,7 +21,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           fetch-depth: 0
       - name: Setup Python
83
CHANGES.rst
@@ -1,9 +1,90 @@
 Changelog
 =========
 
-0.51.3 (2025-11-18)
+0.52.0 (2025-11-28)
 -------------------
 ------------------------
+- Skip DMCA'd repos which return a 451 response. [Rodos]
+
+  Log a warning and the link to the DMCA notice. Continue backing up
+  other repositories instead of crashing.
+
+  Closes #163
+- Chore(deps): bump restructuredtext-lint in the python-packages group.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 1 update: [restructuredtext-lint](https://github.com/twolfson/restructuredtext-lint).
+
+
+  Updates `restructuredtext-lint` from 1.4.0 to 2.0.2
+  - [Changelog](https://github.com/twolfson/restructuredtext-lint/blob/master/CHANGELOG.rst)
+  - [Commits](https://github.com/twolfson/restructuredtext-lint/compare/1.4.0...2.0.2)
+
+  ---
+  updated-dependencies:
+  - dependency-name: restructuredtext-lint
+    dependency-version: 2.0.2
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+    dependency-group: python-packages
+  ...
+- Chore(deps): bump actions/checkout from 5 to 6. [dependabot[bot]]
+
+  Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
+  - [Release notes](https://github.com/actions/checkout/releases)
+  - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
+  - [Commits](https://github.com/actions/checkout/compare/v5...v6)
+
+  ---
+  updated-dependencies:
+  - dependency-name: actions/checkout
+    dependency-version: '6'
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+  ...
+- Chore(deps): bump the python-packages group with 3 updates.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 3 updates: [click](https://github.com/pallets/click), [pytest](https://github.com/pytest-dev/pytest) and [keyring](https://github.com/jaraco/keyring).
+
+
+  Updates `click` from 8.3.0 to 8.3.1
+  - [Release notes](https://github.com/pallets/click/releases)
+  - [Changelog](https://github.com/pallets/click/blob/main/CHANGES.rst)
+  - [Commits](https://github.com/pallets/click/compare/8.3.0...8.3.1)
+
+  Updates `pytest` from 8.3.3 to 9.0.1
+  - [Release notes](https://github.com/pytest-dev/pytest/releases)
+  - [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
+  - [Commits](https://github.com/pytest-dev/pytest/compare/8.3.3...9.0.1)
+
+  Updates `keyring` from 25.6.0 to 25.7.0
+  - [Release notes](https://github.com/jaraco/keyring/releases)
+  - [Changelog](https://github.com/jaraco/keyring/blob/main/NEWS.rst)
+  - [Commits](https://github.com/jaraco/keyring/compare/v25.6.0...v25.7.0)
+
+  ---
+  updated-dependencies:
+  - dependency-name: click
+    dependency-version: 8.3.1
+    dependency-type: direct:production
+    update-type: version-update:semver-patch
+    dependency-group: python-packages
+  - dependency-name: pytest
+    dependency-version: 9.0.1
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+    dependency-group: python-packages
+  - dependency-name: keyring
+    dependency-version: 25.7.0
+    dependency-type: direct:production
+    update-type: version-update:semver-minor
+    dependency-group: python-packages
+  ...
+
+
+0.51.3 (2025-11-18)
+-------------------
 - Test: Add pagination tests for cursor and page-based Link headers.
   [Rodos]
 - Use cursor based pagination. [Helio Machado]
@@ -1 +1 @@
-__version__ = "0.51.3"
+__version__ = "0.52.0"
@@ -37,6 +37,15 @@ FNULL = open(os.devnull, "w")
 FILE_URI_PREFIX = "file://"
 logger = logging.getLogger(__name__)
 
+
+class RepositoryUnavailableError(Exception):
+    """Raised when a repository is unavailable due to legal reasons (e.g., DMCA takedown)."""
+
+    def __init__(self, message, dmca_url=None):
+        super().__init__(message)
+        self.dmca_url = dmca_url
+
+
 # Setup SSL context with fallback chain
 https_ctx = ssl.create_default_context()
 if https_ctx.get_ca_certs():
@@ -612,6 +621,19 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
 
         status_code = int(r.getcode())
 
+        # Handle DMCA takedown (HTTP 451) - raise exception to skip entire repository
+        if status_code == 451:
+            dmca_url = None
+            try:
+                response_data = json.loads(r.read().decode("utf-8"))
+                dmca_url = response_data.get("block", {}).get("html_url")
+            except Exception:
+                pass
+            raise RepositoryUnavailableError(
+                "Repository unavailable due to legal reasons (HTTP 451)",
+                dmca_url=dmca_url
+            )
+
         # Check if we got correct data
         try:
             response = json.loads(r.read().decode("utf-8"))
@@ -1668,40 +1690,47 @@ def backup_repositories(args, output_directory, repositories):
 
             continue  # don't try to back anything else for a gist; it doesn't exist
 
-        download_wiki = args.include_wiki or args.include_everything
-        if repository["has_wiki"] and download_wiki:
-            fetch_repository(
-                repository["name"],
-                repo_url.replace(".git", ".wiki.git"),
-                os.path.join(repo_cwd, "wiki"),
-                skip_existing=args.skip_existing,
-                bare_clone=args.bare_clone,
-                lfs_clone=args.lfs_clone,
-                no_prune=args.no_prune,
-            )
-        if args.include_issues or args.include_everything:
-            backup_issues(args, repo_cwd, repository, repos_template)
+        try:
+            download_wiki = args.include_wiki or args.include_everything
+            if repository["has_wiki"] and download_wiki:
+                fetch_repository(
+                    repository["name"],
+                    repo_url.replace(".git", ".wiki.git"),
+                    os.path.join(repo_cwd, "wiki"),
+                    skip_existing=args.skip_existing,
+                    bare_clone=args.bare_clone,
+                    lfs_clone=args.lfs_clone,
+                    no_prune=args.no_prune,
+                )
+            if args.include_issues or args.include_everything:
+                backup_issues(args, repo_cwd, repository, repos_template)
 
-        if args.include_pulls or args.include_everything:
-            backup_pulls(args, repo_cwd, repository, repos_template)
+            if args.include_pulls or args.include_everything:
+                backup_pulls(args, repo_cwd, repository, repos_template)
 
-        if args.include_milestones or args.include_everything:
-            backup_milestones(args, repo_cwd, repository, repos_template)
+            if args.include_milestones or args.include_everything:
+                backup_milestones(args, repo_cwd, repository, repos_template)
 
-        if args.include_labels or args.include_everything:
-            backup_labels(args, repo_cwd, repository, repos_template)
+            if args.include_labels or args.include_everything:
+                backup_labels(args, repo_cwd, repository, repos_template)
 
-        if args.include_hooks or args.include_everything:
-            backup_hooks(args, repo_cwd, repository, repos_template)
+            if args.include_hooks or args.include_everything:
+                backup_hooks(args, repo_cwd, repository, repos_template)
 
-        if args.include_releases or args.include_everything:
-            backup_releases(
-                args,
-                repo_cwd,
-                repository,
-                repos_template,
-                include_assets=args.include_assets or args.include_everything,
-            )
+            if args.include_releases or args.include_everything:
+                backup_releases(
+                    args,
+                    repo_cwd,
+                    repository,
+                    repos_template,
+                    include_assets=args.include_assets or args.include_everything,
+                )
+        except RepositoryUnavailableError as e:
+            logger.warning(f"Repository {repository['full_name']} is unavailable (HTTP 451)")
+            if e.dmca_url:
+                logger.warning(f"DMCA notice: {e.dmca_url}")
+            logger.info(f"Skipping remaining resources for {repository['full_name']}")
+            continue
 
         if args.incremental:
             if last_update == "0000-00-00T00:00:00Z":
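Review note: the control flow introduced by these hunks can be tried in isolation. The sketch below is a simplified stand-in, not the project's code: `check_status`, `backup_all`, and the `fetch` callable are hypothetical names invented for illustration, and only `RepositoryUnavailableError` mirrors the class added in the diff. It narrows the `except` clause to the errors a malformed body is expected to raise, which is an assumption about what the broad `except Exception` is meant to cover.

```python
import json


class RepositoryUnavailableError(Exception):
    """Mirrors the exception class added in the diff."""

    def __init__(self, message, dmca_url=None):
        super().__init__(message)
        self.dmca_url = dmca_url


def check_status(status_code, body):
    """Hypothetical helper: raise on HTTP 451, tolerating an unparseable body."""
    if status_code != 451:
        return
    dmca_url = None
    try:
        dmca_url = json.loads(body.decode("utf-8")).get("block", {}).get("html_url")
    except (ValueError, AttributeError):
        # Malformed JSON or an unexpected shape: keep dmca_url as None.
        pass
    raise RepositoryUnavailableError(
        "Repository unavailable due to legal reasons (HTTP 451)",
        dmca_url=dmca_url,
    )


def backup_all(repositories, fetch):
    """Hypothetical driver: skip blocked repositories and continue with the rest."""
    backed_up, skipped = [], []
    for name in repositories:
        try:
            status, body = fetch(name)
            check_status(status, body)
        except RepositoryUnavailableError as e:
            skipped.append((name, e.dmca_url))
            continue
        backed_up.append(name)
    return backed_up, skipped
```

Passing a `fetch` that returns `(451, body)` for one repository leaves the others in `backed_up` and records the blocked one, with its notice URL if present, in `skipped`.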
@@ -3,16 +3,16 @@ black==25.11.0
 bleach==6.3.0
 certifi==2025.11.12
 charset-normalizer==3.4.4
-click==8.3.0
+click==8.3.1
 colorama==0.4.6
 docutils==0.22.3
 flake8==7.3.0
 gitchangelog==3.0.4
-pytest==8.3.3
+pytest==9.0.1
 idna==3.11
 importlib-metadata==8.7.0
 jaraco.classes==3.4.0
-keyring==25.6.0
+keyring==25.7.0
 markdown-it-py==4.0.0
 mccabe==0.7.0
 mdurl==0.1.2
@@ -28,7 +28,7 @@ Pygments==2.19.2
 readme-renderer==44.0
 requests==2.32.5
 requests-toolbelt==1.0.0
-restructuredtext-lint==1.4.0
+restructuredtext-lint==2.0.2
 rfc3986==2.0.0
 rich==14.2.0
 setuptools==80.9.0
143
tests/test_http_451.py
Normal file
@@ -0,0 +1,143 @@
+"""Tests for HTTP 451 (DMCA takedown) handling."""
+
+import json
+from unittest.mock import Mock, patch
+
+import pytest
+
+from github_backup import github_backup
+
+
+class TestHTTP451Exception:
+    """Test suite for HTTP 451 DMCA takedown exception handling."""
+
+    def test_repository_unavailable_error_raised(self):
+        """HTTP 451 should raise RepositoryUnavailableError with DMCA URL."""
+        # Create mock args
+        args = Mock()
+        args.as_app = False
+        args.token_fine = None
+        args.token_classic = None
+        args.username = None
+        args.password = None
+        args.osx_keychain_item_name = None
+        args.osx_keychain_item_account = None
+        args.throttle_limit = None
+        args.throttle_pause = 0
+
+        # Mock HTTPError 451 response
+        mock_response = Mock()
+        mock_response.getcode.return_value = 451
+
+        dmca_data = {
+            "message": "Repository access blocked",
+            "block": {
+                "reason": "dmca",
+                "created_at": "2024-11-12T14:38:04Z",
+                "html_url": "https://github.com/github/dmca/blob/master/2024/11/2024-11-04-source-code.md"
+            }
+        }
+        mock_response.read.return_value = json.dumps(dmca_data).encode("utf-8")
+        mock_response.headers = {"x-ratelimit-remaining": "5000"}
+        mock_response.reason = "Unavailable For Legal Reasons"
+
+        def mock_get_response(request, auth, template):
+            return mock_response, []
+
+        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
+            with pytest.raises(github_backup.RepositoryUnavailableError) as exc_info:
+                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/dmca/issues"))
+
+        # Check exception has DMCA URL
+        assert exc_info.value.dmca_url == "https://github.com/github/dmca/blob/master/2024/11/2024-11-04-source-code.md"
+        assert "451" in str(exc_info.value)
+
+    def test_repository_unavailable_error_without_dmca_url(self):
+        """HTTP 451 without DMCA details should still raise exception."""
+        args = Mock()
+        args.as_app = False
+        args.token_fine = None
+        args.token_classic = None
+        args.username = None
+        args.password = None
+        args.osx_keychain_item_name = None
+        args.osx_keychain_item_account = None
+        args.throttle_limit = None
+        args.throttle_pause = 0
+
+        mock_response = Mock()
+        mock_response.getcode.return_value = 451
+        mock_response.read.return_value = b'{"message": "Blocked"}'
+        mock_response.headers = {"x-ratelimit-remaining": "5000"}
+        mock_response.reason = "Unavailable For Legal Reasons"
+
+        def mock_get_response(request, auth, template):
+            return mock_response, []
+
+        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
+            with pytest.raises(github_backup.RepositoryUnavailableError) as exc_info:
+                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/dmca/issues"))
+
+        # Exception raised even without DMCA URL
+        assert exc_info.value.dmca_url is None
+        assert "451" in str(exc_info.value)
+
+    def test_repository_unavailable_error_with_malformed_json(self):
+        """HTTP 451 with malformed JSON should still raise exception."""
+        args = Mock()
+        args.as_app = False
+        args.token_fine = None
+        args.token_classic = None
+        args.username = None
+        args.password = None
+        args.osx_keychain_item_name = None
+        args.osx_keychain_item_account = None
+        args.throttle_limit = None
+        args.throttle_pause = 0
+
+        mock_response = Mock()
+        mock_response.getcode.return_value = 451
+        mock_response.read.return_value = b"invalid json {"
+        mock_response.headers = {"x-ratelimit-remaining": "5000"}
+        mock_response.reason = "Unavailable For Legal Reasons"
+
+        def mock_get_response(request, auth, template):
+            return mock_response, []
+
+        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
+            with pytest.raises(github_backup.RepositoryUnavailableError):
+                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/dmca/issues"))
+
+    def test_other_http_errors_unchanged(self):
+        """Other HTTP errors should still raise generic Exception."""
+        args = Mock()
+        args.as_app = False
+        args.token_fine = None
+        args.token_classic = None
+        args.username = None
+        args.password = None
+        args.osx_keychain_item_name = None
+        args.osx_keychain_item_account = None
+        args.throttle_limit = None
+        args.throttle_pause = 0
+
+        mock_response = Mock()
+        mock_response.getcode.return_value = 404
+        mock_response.read.return_value = b'{"message": "Not Found"}'
+        mock_response.headers = {"x-ratelimit-remaining": "5000"}
+        mock_response.reason = "Not Found"
+
+        def mock_get_response(request, auth, template):
+            return mock_response, []
+
+        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
+            # Should raise generic Exception, not RepositoryUnavailableError
+            with pytest.raises(Exception) as exc_info:
+                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/notfound/issues"))
+
+        assert not isinstance(exc_info.value, github_backup.RepositoryUnavailableError)
+        assert "404" in str(exc_info.value)
+
+
+if __name__ == "__main__":
+    pytest.main([__file__, "-v"])
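Review note: the four tests build the same stub response by hand. One possible way to factor that out is sketched below, using only `unittest.mock` from the standard library. `make_response` and `extract_dmca_url` are hypothetical helper names that do not exist in the repository; the attributes set on the stub are the ones the tests in this file set.

```python
import json
from unittest.mock import Mock


def make_response(status, body, reason=""):
    """Hypothetical factory for the stub response the tests build by hand."""
    response = Mock()
    response.getcode.return_value = status
    response.read.return_value = body
    response.headers = {"x-ratelimit-remaining": "5000"}
    response.reason = reason
    return response


def extract_dmca_url(response):
    """Hypothetical helper: the same lookup as the 451 branch, on any response-like object."""
    try:
        data = json.loads(response.read().decode("utf-8"))
        return data.get("block", {}).get("html_url")
    except Exception:
        # Malformed or unexpectedly shaped bodies yield no URL.
        return None
```

With such a factory, each test would differ only in the status code and body it passes, which makes the three 451 variants (full notice, no `block` key, malformed JSON) easier to compare at a glance.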