mirror of https://github.com/josegonzalez/python-github-backup.git
synced 2025-12-11 18:41:11 +01:00

Compare commits: 49 commits, 1ec0820936...master
| SHA1 |
|---|
| 2bb83d6d8b |
| 8fcc142621 |
| 7615ce6102 |
| 3f1ef821c3 |
| 3684756eaa |
| e745b55755 |
| 75e6f56773 |
| b991c363a0 |
| 6d74af9126 |
| 381d67af96 |
| 2fbe8d272c |
| eb5779ac23 |
| 5b52931ebf |
| 1d6d474408 |
| b80049e96e |
| 58ad1c2378 |
| 6e2a7e521c |
| aba048a3e9 |
| 9f7c08166f |
| fdfaaec1ba |
| 8f9cf7ff89 |
| 899ab5fdc2 |
| 2a9d86a6bf |
| 4fd3ea9e3c |
| 041dc013f9 |
| 12802103c4 |
| bf28b46954 |
| ff2681e196 |
| 745b05a63f |
| 83ff0ae1dd |
| 6ad1959d43 |
| 5739ac0745 |
| 8b7512c8d8 |
| 995b7ede6c |
| 7840528fe2 |
| 6fb0d86977 |
| 9f6b401171 |
| bf638f7aea |
| c3855a94f1 |
| c3f4bfde0d |
| d3edef0622 |
| 9ef496efad |
| 42bfe6f79d |
| 5af522a348 |
| 6dfba7a783 |
| 7551829677 |
| 72d35a9b94 |
| 3eae9d78ed |
| 90ba839c7d |
.github/workflows/automatic-release.yml (vendored, 2 changes)

@@ -18,7 +18,7 @@ jobs:
     runs-on: ubuntu-24.04
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           fetch-depth: 0
           ssh-key: ${{ secrets.DEPLOY_PRIVATE_KEY }}
.github/workflows/docker.yml (vendored, 2 changes)

@@ -38,7 +38,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           persist-credentials: false
 
.github/workflows/lint.yml (vendored, 2 changes)

@@ -21,7 +21,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           fetch-depth: 0
       - name: Setup Python
.github/workflows/test.yml (vendored, 2 changes)

@@ -21,7 +21,7 @@ jobs:
 
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v5
+        uses: actions/checkout@v6
         with:
           fetch-depth: 0
       - name: Setup Python
CHANGES.rst (301 changes)

@@ -1,10 +1,309 @@
 Changelog
 =========
 
-0.51.1 (2025-11-16)
+0.56.0 (2025-12-11)
 -------------------
 
+Fix
+~~~
+- Replace deprecated git lfs clone with git clone + git lfs fetch --all.
+  [Rodos]
+
+  git lfs clone is deprecated - modern git clone handles LFS automatically.
+  Using git lfs fetch --all ensures all LFS objects across all refs are
+  backed up, matching the existing bare clone behavior and providing
+  complete LFS backups.
+
+  Closes #379
+- Add Windows support with entry_points and os.replace. [Rodos]
+
+  - Replace os.rename() with os.replace() for atomic file operations
+    on Windows (os.rename fails if destination exists on Windows)
+  - Add entry_points console_scripts for proper .exe generation on Windows
+  - Create github_backup/cli.py with main() entry point
+  - Add github_backup/__main__.py for python -m github_backup support
+  - Keep bin/github-backup as thin wrapper for backwards compatibility
+
+  Closes #112
+
+Other
+~~~~~
+- Docs: add "Restoring from Backup" section to README. [Rodos]
+
+  Clarifies that this tool is backup-only with no inbuilt restore.
+  Documents that git repos can be pushed back, but issues/PRs have
+  GitHub API limitations affecting all backup tools.
+
+  Closes #246
+- Chore(deps): bump urllib3 in the python-packages group.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 1 update: [urllib3](https://github.com/urllib3/urllib3).
+
+  Updates `urllib3` from 2.6.0 to 2.6.1
+  - [Release notes](https://github.com/urllib3/urllib3/releases)
+  - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
+  - [Commits](https://github.com/urllib3/urllib3/compare/2.6.0...2.6.1)
+
+  ---
+  updated-dependencies:
+  - dependency-name: urllib3
+    dependency-version: 2.6.1
+    dependency-type: direct:production
+    update-type: version-update:semver-patch
+    dependency-group: python-packages
+  ...
+- Chore(deps): bump the python-packages group with 3 updates.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 3 updates: [black](https://github.com/psf/black), [pytest](https://github.com/pytest-dev/pytest) and [platformdirs](https://github.com/tox-dev/platformdirs).
+
+  Updates `black` from 25.11.0 to 25.12.0
+  - [Release notes](https://github.com/psf/black/releases)
+  - [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
+  - [Commits](https://github.com/psf/black/compare/25.11.0...25.12.0)
+
+  Updates `pytest` from 9.0.1 to 9.0.2
+  - [Release notes](https://github.com/pytest-dev/pytest/releases)
+  - [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
+  - [Commits](https://github.com/pytest-dev/pytest/compare/9.0.1...9.0.2)
+
+  Updates `platformdirs` from 4.5.0 to 4.5.1
+  - [Release notes](https://github.com/tox-dev/platformdirs/releases)
+  - [Changelog](https://github.com/tox-dev/platformdirs/blob/main/CHANGES.rst)
+  - [Commits](https://github.com/tox-dev/platformdirs/compare/4.5.0...4.5.1)
+
+  ---
+  updated-dependencies:
+  - dependency-name: black
+    dependency-version: 25.12.0
+    dependency-type: direct:production
+    update-type: version-update:semver-minor
+    dependency-group: python-packages
+  - dependency-name: pytest
+    dependency-version: 9.0.2
+    dependency-type: direct:production
+    update-type: version-update:semver-patch
+    dependency-group: python-packages
+  - dependency-name: platformdirs
+    dependency-version: 4.5.1
+    dependency-type: direct:production
+    update-type: version-update:semver-patch
+    dependency-group: python-packages
+  ...
+
+
+0.55.0 (2025-12-07)
+-------------------
+
+Fix
+~~~
+- Improve error messages for inaccessible repos and empty wikis. [Rodos]
+- --all-starred now clones repos without --repositories. [Rodos]
+- Warn when --private used without authentication. [Rodos]
+- Warn and skip when --starred-gists used for different user. [Rodos]
+
+  GitHub's API only allows retrieving starred gists for the authenticated
+  user. Previously, using --starred-gists when backing up a different user
+  would silently return no relevant data.
+
+  Now warns and skips the retrieval entirely when the target user differs
+  from the authenticated user. Uses case-insensitive comparison to match
+  GitHub's username handling.
+
+  Fixes #93
+
+Other
+~~~~~
+- Test: add missing test coverage for case sensitivity fix. [Rodos]
+- Docs: fix RST formatting in Known blocking errors section. [Rodos]
+- Chore(deps): bump urllib3 from 2.5.0 to 2.6.0. [dependabot[bot]]
+
+  Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.0.
+  - [Release notes](https://github.com/urllib3/urllib3/releases)
+  - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
+  - [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.0)
+
+  ---
+  updated-dependencies:
+  - dependency-name: urllib3
+    dependency-version: 2.6.0
+    dependency-type: direct:production
+  ...
+
+
+0.54.0 (2025-12-03)
+-------------------
+
+Fix
+~~~
+- Send INFO/DEBUG to stdout, WARNING/ERROR to stderr. [Rodos]
+
+  Fixes #182
+
+Other
+~~~~~
+- Docs: update README testing section and add fetch vs pull explanation.
+  [Rodos]
+
+
+0.53.0 (2025-11-30)
+-------------------
+
+Fix
+~~~
+- Case-sensitive username filtering causing silent backup failures.
+  [Rodos]
+
+  GitHub's API accepts usernames in any case but returns canonical case.
+  The case-sensitive comparison in filter_repositories() filtered out all
+  repositories when user-provided case didn't match GitHub's canonical case.
+
+  Changed to case-insensitive comparison.
+
+  Fixes #198
+
+Other
+~~~~~
+- Avoid rewriting unchanged JSON files for labels, milestones, releases,
+  hooks, followers, and following. [Rodos]
+
+  This change reduces unnecessary writes when backing up metadata that changes
+  infrequently. The implementation compares existing file content before writing
+  and skips the write if the content is identical, preserving file timestamps.
+
+  Key changes:
+  - Added json_dump_if_changed() helper that compares content before writing
+  - Uses atomic writes (temp file + rename) for all metadata files
+  - NOT applied to issues/pulls (they use incremental_by_files logic)
+  - Made log messages consistent and past tense ("Saved" instead of "Saving")
+  - Added informative logging showing skip counts
+
+  Fixes #133
+
+
+0.52.0 (2025-11-28)
+-------------------
+- Skip DMCA'd repos which return a 451 response. [Rodos]
+
+  Log a warning and the link to the DMCA notice. Continue backing up
+  other repositories instead of crashing.
+
+  Closes #163
+- Chore(deps): bump restructuredtext-lint in the python-packages group.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 1 update: [restructuredtext-lint](https://github.com/twolfson/restructuredtext-lint).
+
+  Updates `restructuredtext-lint` from 1.4.0 to 2.0.2
+  - [Changelog](https://github.com/twolfson/restructuredtext-lint/blob/master/CHANGELOG.rst)
+  - [Commits](https://github.com/twolfson/restructuredtext-lint/compare/1.4.0...2.0.2)
+
+  ---
+  updated-dependencies:
+  - dependency-name: restructuredtext-lint
+    dependency-version: 2.0.2
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+    dependency-group: python-packages
+  ...
+- Chore(deps): bump actions/checkout from 5 to 6. [dependabot[bot]]
+
+  Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
+  - [Release notes](https://github.com/actions/checkout/releases)
+  - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
+  - [Commits](https://github.com/actions/checkout/compare/v5...v6)
+
+  ---
+  updated-dependencies:
+  - dependency-name: actions/checkout
+    dependency-version: '6'
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+  ...
+- Chore(deps): bump the python-packages group with 3 updates.
+  [dependabot[bot]]
+
+  Bumps the python-packages group with 3 updates: [click](https://github.com/pallets/click), [pytest](https://github.com/pytest-dev/pytest) and [keyring](https://github.com/jaraco/keyring).
+
+  Updates `click` from 8.3.0 to 8.3.1
+  - [Release notes](https://github.com/pallets/click/releases)
+  - [Changelog](https://github.com/pallets/click/blob/main/CHANGES.rst)
+  - [Commits](https://github.com/pallets/click/compare/8.3.0...8.3.1)
+
+  Updates `pytest` from 8.3.3 to 9.0.1
+  - [Release notes](https://github.com/pytest-dev/pytest/releases)
+  - [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
+  - [Commits](https://github.com/pytest-dev/pytest/compare/8.3.3...9.0.1)
+
+  Updates `keyring` from 25.6.0 to 25.7.0
+  - [Release notes](https://github.com/jaraco/keyring/releases)
+  - [Changelog](https://github.com/jaraco/keyring/blob/main/NEWS.rst)
+  - [Commits](https://github.com/jaraco/keyring/compare/v25.6.0...v25.7.0)
+
+  ---
+  updated-dependencies:
+  - dependency-name: click
+    dependency-version: 8.3.1
+    dependency-type: direct:production
+    update-type: version-update:semver-patch
+    dependency-group: python-packages
+  - dependency-name: pytest
+    dependency-version: 9.0.1
+    dependency-type: direct:production
+    update-type: version-update:semver-major
+    dependency-group: python-packages
+  - dependency-name: keyring
+    dependency-version: 25.7.0
+    dependency-type: direct:production
+    update-type: version-update:semver-minor
+    dependency-group: python-packages
+  ...
+
+
+0.51.3 (2025-11-18)
+-------------------
+- Test: Add pagination tests for cursor and page-based Link headers.
+  [Rodos]
+- Use cursor based pagination. [Helio Machado]
+
+
+0.51.2 (2025-11-16)
+-------------------
+
+Fix
+~~~
+- Improve CA certificate detection with fallback chain. [Rodos]
+
+  The previous implementation incorrectly assumed empty get_ca_certs()
+  meant broken SSL, causing false failures in GitHub Codespaces and other
+  directory-based cert systems where certificates exist but aren't pre-loaded.
+  It would then attempt to import certifi as a workaround, but certifi wasn't
+  listed in requirements.txt, causing the fallback to fail with ImportError
+  even though the system certificates would have worked fine.
+
+  This commit replaces the naive check with a layered fallback approach that
+  checks multiple certificate sources. First it checks for pre-loaded system
+  certs (file-based systems). Then it verifies system cert paths exist
+  (directory-based systems like Ubuntu/Debian/Codespaces). Finally it attempts
+  to use certifi as an optional fallback only if needed.
+
+  This approach eliminates hard dependencies (certifi is now optional), works
+  in GitHub Codespaces without any setup, and fails gracefully with clear hints
+  for resolution when SSL is actually broken rather than failing with
+  ModuleNotFoundError.
+
+  Fixes #444
+
+
+0.51.1 (2025-11-16)
+-------------------
+
 Fix
 ~~~
 - Prevent duplicate attachment downloads. [Rodos]
README.rst (55 changes)

@@ -215,6 +215,8 @@ When you use the ``--lfs`` option, you will need to make sure you have Git LFS installed
 Instructions on how to do this can be found on https://git-lfs.github.com.
 
+LFS objects are fetched for all refs, not just the current checkout, ensuring a complete backup of all LFS content across all branches and history.
+
 
 About Attachments
 -----------------
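
As an aside (illustrative, not part of the diff - the URL is a placeholder): with this change the non-bare LFS backup flow reduces to roughly::

    git clone https://github.com/OWNER/REPO.git repository
    cd repository
    git lfs fetch --all --prune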
@@ -281,11 +283,11 @@ If the incremental argument is used, this will result in the next backup only re
 It's therefore recommended to only use the incremental argument if the output/result is being actively monitored, or complemented with periodic full non-incremental runs, to avoid unexpected missing data in regular backup runs.
 
-1. **Starred public repo hooks blocking**
+**Starred public repo hooks blocking**
 
 Since the ``--all`` argument includes ``--hooks``, if you use ``--all`` and ``--all-starred`` together to clone a user's starred public repositories, the backup will likely error and block the backup continuing.
 
 This is due to needing the correct permission for ``--hooks`` on public repos.
 
 
 "bare" is actually "mirror"
@@ -301,6 +303,8 @@ Starred gists vs starred repo behaviour
 The starred normal repo cloning (``--all-starred``) argument stores starred repos separately to the user's own repositories. However, using ``--starred-gists`` will store starred gists within the same directory as the user's own gists ``--gists``. Also, all gist repo directory names are IDs, not the gist's name.
 
+Note: ``--starred-gists`` only retrieves starred gists for the authenticated user, not the target user, due to a GitHub API limitation.
+
 
 Skip existing on incomplete backups
 -----------------------------------
@@ -308,6 +312,25 @@ Skip existing on incomplete backups
 The ``--skip-existing`` argument will skip a backup if the directory already exists, even if the backup in that directory failed (perhaps due to a blocking error). This may result in unexpected missing data in a regular backup.
 
 
+Updates use fetch, not pull
+---------------------------
+
+When updating an existing repository backup, ``github-backup`` uses ``git fetch`` rather than ``git pull``. This is intentional - a backup tool should reliably download data without risk of failure. Using ``git pull`` would require handling merge conflicts, which adds complexity and could cause backups to fail unexpectedly.
+
+With fetch, **all branches and commits are downloaded** safely into remote-tracking branches. The working directory files won't change, but your backup is complete.
+
+If you look at files directly (e.g., ``cat README.md``), you'll see the old content. The new data is in the remote-tracking branches (confusingly named "remote" but stored locally). To view or use the latest files::
+
+    git show origin/main:README.md   # view a file
+    git merge origin/main            # update working directory
+
+All branches are backed up as remote refs (``origin/main``, ``origin/feature-branch``, etc.).
+
+If you want to browse files directly without merging, consider using ``--bare`` which skips the working directory entirely - the backup is just the git data.
+
+See `#269 <https://github.com/josegonzalez/python-github-backup/issues/269>`_ for more discussion.
+
+
 Github Backup Examples
 ======================
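
A side note on the ``--bare`` suggestion above (illustrative; the path is a placeholder following the layout shown later in "Restoring from Backup"): files can be read straight from the bare git data without a working directory::

    git -C /tmp/backup/repositories/REPO/repository show HEAD:README.md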
@@ -339,6 +362,25 @@ Debug an error/block or incomplete backup into a temporary directory. Omit "incr
     github-backup -f $FINE_ACCESS_TOKEN -o /tmp/github-backup/ -l debug -P --all-starred --starred --watched --followers --following --issues --issue-comments --issue-events --pulls --pull-comments --pull-commits --labels --milestones --repositories --wikis --releases --assets --pull-details --gists --starred-gists $GH_USER
 
 
+Restoring from Backup
+=====================
+
+This tool creates backups only; there is no inbuilt restore command.
+
+**Git repositories, wikis, and gists** can be restored by pushing them back to GitHub as you would any git repository. For example, to restore a bare repository backup::
+
+    cd /tmp/white-house/repositories/petitions/repository
+    git push --mirror git@github.com:WhiteHouse/petitions.git
+
+**Issues, pull requests, comments, and other metadata** are saved as JSON files for archival purposes. The GitHub API does not support recreating this data faithfully; creating issues via the API has limitations:
+
+- New issue/PR numbers are assigned (original numbers cannot be set)
+- Timestamps reflect creation time (original dates cannot be set)
+- The API caller becomes the author (original authors cannot be set)
+- Cross-references between issues and PRs will break
+
+These are GitHub API limitations that affect all backup and migration tools, not just this one. Recreating issues with these limitations via the GitHub API is an exercise for the reader. The JSON backups remain useful for searching, auditing, or manual reference.
+
+
 Development
 ===========
@@ -357,7 +399,12 @@ A huge thanks to all the contributors!
 Testing
 -------
 
-This project currently contains no unit tests. To run linting::
+To run the test suite::
+
+    pip install pytest
+    pytest
+
+To run linting::
 
     pip install flake8
     flake8 --ignore=E501
bin/github-backup (rewritten as a thin wrapper)

@@ -1,58 +1,18 @@
 #!/usr/bin/env python
+"""
+Backwards-compatible wrapper script.
+
+The recommended way to run github-backup is via the installed command
+(pip install github-backup) or python -m github_backup.
+
+This script is kept for backwards compatibility with existing installations
+that may reference this path directly.
+"""
+
-import logging
-import os
 import sys
 
-from github_backup.github_backup import (
-    backup_account,
-    backup_repositories,
-    check_git_lfs_install,
-    filter_repositories,
-    get_authenticated_user,
-    logger,
-    mkdir_p,
-    parse_args,
-    retrieve_repositories,
-)
-
-logging.basicConfig(
-    format="%(asctime)s.%(msecs)03d: %(message)s",
-    datefmt="%Y-%m-%dT%H:%M:%S",
-    level=logging.INFO,
-)
-
-
-def main():
-    args = parse_args()
-
-    if args.quiet:
-        logger.setLevel(logging.WARNING)
-
-    output_directory = os.path.realpath(args.output_directory)
-    if not os.path.isdir(output_directory):
-        logger.info("Create output directory {0}".format(output_directory))
-        mkdir_p(output_directory)
-
-    if args.lfs_clone:
-        check_git_lfs_install()
-
-    if args.log_level:
-        log_level = logging.getLevelName(args.log_level.upper())
-        if isinstance(log_level, int):
-            logger.root.setLevel(log_level)
-
-    if not args.as_app:
-        logger.info("Backing up user {0} to {1}".format(args.user, output_directory))
-        authenticated_user = get_authenticated_user(args)
-    else:
-        authenticated_user = {"login": None}
-
-    repositories = retrieve_repositories(args, authenticated_user)
-    repositories = filter_repositories(args, repositories)
-    backup_repositories(args, output_directory, repositories)
-    backup_account(args, output_directory)
-
+from github_backup.cli import main
+from github_backup.github_backup import logger
 
 if __name__ == "__main__":
     try:
@@ -1 +1 @@
-__version__ = "0.51.1"
+__version__ = "0.56.0"
github_backup/__main__.py (new file, 13 lines)

@@ -0,0 +1,13 @@
+"""Allow running as: python -m github_backup"""
+
+import sys
+
+from github_backup.cli import main
+from github_backup.github_backup import logger
+
+if __name__ == "__main__":
+    try:
+        main()
+    except Exception as e:
+        logger.error(str(e))
+        sys.exit(1)
github_backup/cli.py (new file, 82 lines)

@@ -0,0 +1,82 @@
+#!/usr/bin/env python
+"""Command-line interface for github-backup."""
+
+import logging
+import os
+import sys
+
+from github_backup.github_backup import (
+    backup_account,
+    backup_repositories,
+    check_git_lfs_install,
+    filter_repositories,
+    get_auth,
+    get_authenticated_user,
+    logger,
+    mkdir_p,
+    parse_args,
+    retrieve_repositories,
+)
+
+# INFO and DEBUG go to stdout, WARNING and above go to stderr
+log_format = logging.Formatter(
+    fmt="%(asctime)s.%(msecs)03d: %(message)s",
+    datefmt="%Y-%m-%dT%H:%M:%S",
+)
+
+stdout_handler = logging.StreamHandler(sys.stdout)
+stdout_handler.setLevel(logging.DEBUG)
+stdout_handler.addFilter(lambda r: r.levelno < logging.WARNING)
+stdout_handler.setFormatter(log_format)
+
+stderr_handler = logging.StreamHandler(sys.stderr)
+stderr_handler.setLevel(logging.WARNING)
+stderr_handler.setFormatter(log_format)
+
+logging.basicConfig(level=logging.INFO, handlers=[stdout_handler, stderr_handler])
+
+
+def main():
+    """Main entry point for github-backup CLI."""
+    args = parse_args()
+
+    if args.private and not get_auth(args):
+        logger.warning(
+            "The --private flag has no effect without authentication. "
+            "Use -t/--token, -f/--token-fine, or -u/--username to authenticate."
+        )
+
+    if args.quiet:
+        logger.setLevel(logging.WARNING)
+
+    output_directory = os.path.realpath(args.output_directory)
+    if not os.path.isdir(output_directory):
+        logger.info("Create output directory {0}".format(output_directory))
+        mkdir_p(output_directory)
+
+    if args.lfs_clone:
+        check_git_lfs_install()
+
+    if args.log_level:
+        log_level = logging.getLevelName(args.log_level.upper())
+        if isinstance(log_level, int):
+            logger.root.setLevel(log_level)
+
+    if not args.as_app:
+        logger.info("Backing up user {0} to {1}".format(args.user, output_directory))
+        authenticated_user = get_authenticated_user(args)
+    else:
+        authenticated_user = {"login": None}
+
+    repositories = retrieve_repositories(args, authenticated_user)
+    repositories = filter_repositories(args, repositories)
+    backup_repositories(args, output_directory, repositories)
+    backup_account(args, output_directory)
+
+
+if __name__ == "__main__":
+    try:
+        main()
+    except Exception as e:
+        logger.error(str(e))
+        sys.exit(1)
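
A usage note on the handler split above (illustrative; the flags are the ones used in the README examples): since INFO/DEBUG go to stdout and WARNING/ERROR to stderr, progress can be redirected to a log file while problems stay visible on the terminal::

    github-backup $GH_USER -o /tmp/github-backup --repositories > backup.log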
github_backup/github_backup.py

@@ -37,22 +37,42 @@ FNULL = open(os.devnull, "w")
 FILE_URI_PREFIX = "file://"
 logger = logging.getLogger(__name__)
 
-https_ctx = ssl.create_default_context()
-if not https_ctx.get_ca_certs():
-    import warnings
-
-    warnings.warn(
-        "\n\nYOUR DEFAULT CA CERTS ARE EMPTY.\n"
-        + "PLEASE POPULATE ANY OF:"
-        + "".join(
-            ["\n - " + x for x in ssl.get_default_verify_paths() if type(x) is str]
-        )
-        + "\n",
-        stacklevel=2,
-    )
+
+class RepositoryUnavailableError(Exception):
+    """Raised when a repository is unavailable due to legal reasons (e.g., DMCA takedown)."""
+
+    def __init__(self, message, dmca_url=None):
+        super().__init__(message)
+        self.dmca_url = dmca_url
+
+
+# Setup SSL context with fallback chain
+https_ctx = ssl.create_default_context()
+if https_ctx.get_ca_certs():
+    # Layer 1: Certificates pre-loaded from system (file-based)
+    pass
+else:
+    paths = ssl.get_default_verify_paths()
+    if (paths.cafile and os.path.exists(paths.cafile)) or (
+        paths.capath and os.path.exists(paths.capath)
+    ):
+        # Layer 2: Cert paths exist, will be lazy-loaded on first use (directory-based)
+        pass
+    else:
+        # Layer 3: Try certifi package as optional fallback
+        try:
+            import certifi
+
+            https_ctx = ssl.create_default_context(cafile=certifi.where())
+        except ImportError:
+            # All layers failed - no certificates available anywhere
+            sys.exit(
+                "\nERROR: No CA certificates found. Cannot connect to GitHub over SSL.\n\n"
+                "Solutions you can explore:\n"
+                "  1. pip install certifi\n"
+                "  2. Alpine: apk add ca-certificates\n"
+                "  3. Debian/Ubuntu: apt-get install ca-certificates\n\n"
+            )
 
 
 def logging_subprocess(
@@ -541,7 +561,7 @@ def get_github_host(args):
 
 
 def read_file_contents(file_uri):
-    return open(file_uri[len(FILE_URI_PREFIX) :], "rt").readline().strip()
+    return open(file_uri[len(FILE_URI_PREFIX):], "rt").readline().strip()
 
 
 def get_github_repo_url(args, repository):
@@ -581,27 +601,39 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
     auth = get_auth(args, encode=not args.as_app)
     query_args = get_query_args(query_args)
     per_page = 100
-    page = 0
+    next_url = None
 
     while True:
         if single_request:
-            request_page, request_per_page = None, None
+            request_per_page = None
         else:
-            page = page + 1
-            request_page, request_per_page = page, per_page
+            request_per_page = per_page
 
         request = _construct_request(
             request_per_page,
-            request_page,
             query_args,
-            template,
+            next_url or template,
             auth,
             as_app=args.as_app,
             fine=True if args.token_fine is not None else False,
         )  # noqa
-        r, errors = _get_response(request, auth, template)
+        r, errors = _get_response(request, auth, next_url or template)
 
         status_code = int(r.getcode())
 
+        # Handle DMCA takedown (HTTP 451) - raise exception to skip entire repository
+        if status_code == 451:
+            dmca_url = None
+            try:
+                response_data = json.loads(r.read().decode("utf-8"))
+                dmca_url = response_data.get("block", {}).get("html_url")
+            except Exception:
+                pass
+            raise RepositoryUnavailableError(
+                "Repository unavailable due to legal reasons (HTTP 451)",
+                dmca_url=dmca_url
+            )
+
         # Check if we got correct data
         try:
             response = json.loads(r.read().decode("utf-8"))
@@ -633,15 +665,14 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
             retries += 1
             time.sleep(5)
             request = _construct_request(
-                per_page,
-                page,
+                request_per_page,
                 query_args,
-                template,
+                next_url or template,
                 auth,
                 as_app=args.as_app,
                 fine=True if args.token_fine is not None else False,
             )  # noqa
-            r, errors = _get_response(request, auth, template)
+            r, errors = _get_response(request, auth, next_url or template)
 
             status_code = int(r.getcode())
             try:
@@ -671,7 +702,16 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
         if type(response) is list:
             for resp in response:
                 yield resp
-            if len(response) < per_page:
+            # Parse Link header for next page URL (cursor-based pagination)
+            link_header = r.headers.get("Link", "")
+            next_url = None
+            if link_header:
+                # Parse Link header: <https://api.github.com/...?per_page=100&after=cursor>; rel="next"
+                for link in link_header.split(","):
+                    if 'rel="next"' in link:
+                        next_url = link[link.find("<") + 1:link.find(">")]
+                        break
+            if not next_url:
                 break
         elif type(response) is dict and single_request:
             yield response
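
To make the pagination change concrete, a small sketch of what the Link-header loop extracts (the header value is made up for illustration; the parsing lines are the ones from the hunk above)::

    # Sample GitHub Link header carrying a cursor-style "next" URL
    link_header = (
        '<https://api.github.com/repositories/1/issues?per_page=100&after=abc123>; rel="next", '
        '<https://api.github.com/repositories/1/issues?per_page=100>; rel="first"'
    )
    next_url = None
    for link in link_header.split(","):
        if 'rel="next"' in link:
            next_url = link[link.find("<") + 1:link.find(">")]
            break
    assert next_url == "https://api.github.com/repositories/1/issues?per_page=100&after=abc123"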
@@ -724,13 +764,18 @@ def _get_response(request, auth, template):
 
 
 def _construct_request(
-    per_page, page, query_args, template, auth, as_app=None, fine=False
+    per_page, query_args, template, auth, as_app=None, fine=False
 ):
+    # If template is already a full URL with query params (from Link header), use it directly
+    if "?" in template and template.startswith("http"):
+        request_url = template
+        # Extract query string for logging
+        querystring = template.split("?", 1)[1]
+    else:
+        # Build URL with query parameters
         all_query_args = {}
         if per_page:
             all_query_args["per_page"] = per_page
-        if page:
-            all_query_args["page"] = page
         if query_args:
             all_query_args.update(query_args)
@@ -755,7 +800,7 @@ def _construct_request(
             "Accept", "application/vnd.github.machine-man-preview+json"
         )
 
-    log_url = template
+    log_url = template if "?" not in template else template.split("?")[0]
     if querystring:
         log_url += "?" + querystring
     logger.info("Requesting {}".format(log_url))
@@ -832,8 +877,7 @@ def download_file(url, path, auth, as_app=False, fine=False):
         return
 
     request = _construct_request(
-        per_page=100,
-        page=1,
+        per_page=None,
         query_args={},
         template=url,
        auth=auth,
@@ -994,7 +1038,7 @@ def download_attachment_file(url, path, auth, as_app=False, fine=False):
             bytes_downloaded += len(chunk)
 
         # Atomic rename to final location
-        os.rename(temp_path, path)
+        os.replace(temp_path, path)
 
         metadata["size_bytes"] = bytes_downloaded
         metadata["success"] = True
@@ -1415,7 +1459,7 @@ def download_attachments(
 
             # Rename to add extension (already atomic from download)
             try:
-                os.rename(filepath, final_filepath)
+                os.replace(filepath, final_filepath)
                 metadata["saved_as"] = os.path.basename(final_filepath)
             except Exception as e:
                 logger.warning(
@@ -1446,7 +1490,7 @@ def download_attachments(
     manifest_path = os.path.join(attachments_dir, "manifest.json")
     with open(manifest_path + ".temp", "w") as f:
         json.dump(manifest, f, indent=2)
-    os.rename(manifest_path + ".temp", manifest_path)  # Atomic write
+    os.replace(manifest_path + ".temp", manifest_path)  # Atomic write
     logger.debug(
         "Wrote manifest for {0} #{1}: {2} attachments".format(
             item_type_display, number, len(attachment_metadata_list)
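
The repeated ``os.rename`` to ``os.replace`` swap in the hunks above is the core of the Windows fix; a minimal standalone illustration::

    import os

    with open("manifest.json.temp", "w") as f:
        f.write("{}")

    # os.rename() raises FileExistsError on Windows when the target exists;
    # os.replace() overwrites atomically on every platform (same filesystem).
    os.replace("manifest.json.temp", "manifest.json")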
@@ -1521,6 +1565,12 @@ def retrieve_repositories(args, authenticated_user):
         repos.extend(gists)
 
     if args.include_starred_gists:
+        if not authenticated_user.get("login") or args.user.lower() != authenticated_user["login"].lower():
+            logger.warning(
+                "Cannot retrieve starred gists for '%s'. GitHub only allows access to the authenticated user's starred gists.",
+                args.user,
+            )
+        else:
             starred_gists_template = "https://{0}/gists/starred".format(
                 get_github_api_host(args)
             )
@@ -1543,7 +1593,9 @@ def filter_repositories(args, unfiltered_repositories):
     repositories = []
     for r in unfiltered_repositories:
         # gists can be anonymous, so need to safely check owner
-        if r.get("owner", {}).get("login") == args.user or r.get("is_starred"):
+        # Use case-insensitive comparison to match GitHub's case-insensitive username behavior
+        owner_login = r.get("owner", {}).get("login", "")
+        if owner_login.lower() == args.user.lower() or r.get("is_starred"):
             repositories.append(r)
 
     name_regex = None
@@ -1620,9 +1672,10 @@ def backup_repositories(args, output_directory, repositories):
         repo_url = get_github_repo_url(args, repository)
 
         include_gists = args.include_gists or args.include_starred_gists
+        include_starred = args.all_starred and repository.get("is_starred")
         if (args.include_repository or args.include_everything) or (
             include_gists and repository.get("is_gist")
-        ):
+        ) or include_starred:
             repo_name = (
                 repository.get("name")
                 if not repository.get("is_gist")
@@ -1646,6 +1699,7 @@ def backup_repositories(args, output_directory, repositories):
 
             continue  # don't try to back anything else for a gist; it doesn't exist
 
+        try:
             download_wiki = args.include_wiki or args.include_everything
             if repository["has_wiki"] and download_wiki:
                 fetch_repository(
@@ -1680,6 +1734,12 @@ def backup_repositories(args, output_directory, repositories):
                 repos_template,
                 include_assets=args.include_assets or args.include_everything,
             )
+        except RepositoryUnavailableError as e:
+            logger.warning(f"Repository {repository['full_name']} is unavailable (HTTP 451)")
+            if e.dmca_url:
+                logger.warning(f"DMCA notice: {e.dmca_url}")
+            logger.info(f"Skipping remaining resources for {repository['full_name']}")
+            continue
 
         if args.incremental:
             if last_update == "0000-00-00T00:00:00Z":
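
For context on the 451 path (the body shape is assumed from the fields the code reads; the notice URL is a placeholder)::

    # Roughly what a DMCA-blocked repository's API response body looks like
    response_data = {
        "message": "Repository access blocked",
        "block": {
            "reason": "dmca",
            "html_url": "https://github.com/github/dmca/blob/master/...",
        },
    }
    dmca_url = response_data.get("block", {}).get("html_url")  # surfaced as a warning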
@@ -1751,7 +1811,7 @@ def backup_issues(args, repo_cwd, repository, repos_template):
 
         with codecs.open(issue_file + ".temp", "w", encoding="utf-8") as f:
             json_dump(issue, f)
-        os.rename(issue_file + ".temp", issue_file)  # Unlike json_dump, this is atomic
+        os.replace(issue_file + ".temp", issue_file)  # Atomic write
 
 
 def backup_pulls(args, repo_cwd, repository, repos_template):
@@ -1826,7 +1886,7 @@ def backup_pulls(args, repo_cwd, repository, repos_template):
 
         with codecs.open(pull_file + ".temp", "w", encoding="utf-8") as f:
             json_dump(pull, f)
-        os.rename(pull_file + ".temp", pull_file)  # Unlike json_dump, this is atomic
+        os.replace(pull_file + ".temp", pull_file)  # Atomic write
 
 
 def backup_milestones(args, repo_cwd, repository, repos_template):
@@ -1847,11 +1907,21 @@ def backup_milestones(args, repo_cwd, repository, repos_template):
     for milestone in _milestones:
         milestones[milestone["number"]] = milestone
 
-    logger.info("Saving {0} milestones to disk".format(len(list(milestones.keys()))))
+    written_count = 0
     for number, milestone in list(milestones.items()):
         milestone_file = "{0}/{1}.json".format(milestone_cwd, number)
-        with codecs.open(milestone_file, "w", encoding="utf-8") as f:
-            json_dump(milestone, f)
+        if json_dump_if_changed(milestone, milestone_file):
+            written_count += 1
+
+    total = len(milestones)
+    if written_count == total:
+        logger.info("Saved {0} milestones to disk".format(total))
+    elif written_count == 0:
+        logger.info("{0} milestones unchanged, skipped write".format(total))
+    else:
+        logger.info("Saved {0} of {1} milestones to disk ({2} unchanged)".format(
+            written_count, total, total - written_count
+        ))
 
 
 def backup_labels(args, repo_cwd, repository, repos_template):
@@ -1904,19 +1974,17 @@ def backup_releases(args, repo_cwd, repository, repos_template, include_assets=F
             reverse=True,
         )
         releases = releases[: args.number_of_latest_releases]
-        logger.info("Saving the latest {0} releases to disk".format(len(releases)))
-    else:
-        logger.info("Saving {0} releases to disk".format(len(releases)))
 
     # for each release, store it
+    written_count = 0
     for release in releases:
         release_name = release["tag_name"]
         release_name_safe = release_name.replace("/", "__")
         output_filepath = os.path.join(
             release_cwd, "{0}.json".format(release_name_safe)
         )
-        with codecs.open(output_filepath, "w+", encoding="utf-8") as f:
-            json_dump(release, f)
+        if json_dump_if_changed(release, output_filepath):
+            written_count += 1
 
         if include_assets:
             assets = retrieve_data(args, release["assets_url"])
@@ -1933,6 +2001,17 @@ def backup_releases(args, repo_cwd, repository, repos_template, include_assets=F
                     fine=True if args.token_fine is not None else False,
                 )
 
+    # Log the results
+    total = len(releases)
+    if written_count == total:
+        logger.info("Saved {0} releases to disk".format(total))
+    elif written_count == 0:
+        logger.info("{0} releases unchanged, skipped write".format(total))
+    else:
+        logger.info("Saved {0} of {1} releases to disk ({2} unchanged)".format(
+            written_count, total, total - written_count
+        ))
+
 
 def fetch_repository(
     name,
@@ -1945,12 +2024,9 @@ def fetch_repository(
 ):
     if bare_clone:
         if os.path.exists(local_dir):
-            clone_exists = (
-                subprocess.check_output(
-                    ["git", "rev-parse", "--is-bare-repository"], cwd=local_dir
-                )
-                == b"true\n"
-            )
+            clone_exists = subprocess.check_output(
+                ["git", "rev-parse", "--is-bare-repository"], cwd=local_dir
+            ) == b"true\n"
         else:
             clone_exists = False
     else:
@@ -1965,10 +2041,13 @@ def fetch_repository(
         "git ls-remote " + remote_url, stdout=FNULL, stderr=FNULL, shell=True
     )
     if initialized == 128:
-        logger.info(
-            "Skipping {0} ({1}) since it's not initialized".format(
-                name, masked_remote_url
-            )
-        )
+        if ".wiki.git" in remote_url:
+            logger.info(
+                "Skipping {0} wiki (wiki is enabled but has no content)".format(name)
+            )
+        else:
+            logger.info(
+                "Skipping {0} (repository not accessible - may be empty, private, or credentials invalid)".format(name)
+            )
         return
@@ -2010,12 +2089,14 @@ def fetch_repository(
             if no_prune:
                 git_command.pop()
             logging_subprocess(git_command, cwd=local_dir)
     else:
-        if lfs_clone:
-            git_command = ["git", "lfs", "clone", remote_url, local_dir]
-        else:
-            git_command = ["git", "clone", remote_url, local_dir]
+        git_command = ["git", "clone", remote_url, local_dir]
         logging_subprocess(git_command)
+        if lfs_clone:
+            git_command = ["git", "lfs", "fetch", "--all", "--prune"]
+            if no_prune:
+                git_command.pop()
+            logging_subprocess(git_command, cwd=local_dir)
 
 
 def backup_account(args, output_directory):
@@ -2057,9 +2138,10 @@ def _backup_data(args, name, template, output_file, output_directory):
     mkdir_p(output_directory)
     data = retrieve_data(args, template)
 
-    logger.info("Writing {0} {1} to disk".format(len(data), name))
-    with codecs.open(output_file, "w", encoding="utf-8") as f:
-        json_dump(data, f)
+    if json_dump_if_changed(data, output_file):
+        logger.info("Saved {0} {1} to disk".format(len(data), name))
+    else:
+        logger.info("{0} {1} unchanged, skipped write".format(len(data), name))
 
 
 def json_dump(data, output_file):
@@ -2071,3 +2153,57 @@ def json_dump(data, output_file):
         indent=4,
         separators=(",", ": "),
     )
+
+
+def json_dump_if_changed(data, output_file_path):
+    """
+    Write JSON data to file only if content has changed.
+
+    Compares the serialized JSON data with the existing file content
+    and only writes if different. This prevents unnecessary file
+    modification timestamp updates and disk writes.
+
+    Uses atomic writes (temp file + rename) to prevent corruption
+    if the process is interrupted during the write.
+
+    Args:
+        data: The data to serialize as JSON
+        output_file_path: The path to the output file
+
+    Returns:
+        True if file was written (content changed or new file)
+        False if write was skipped (content unchanged)
+    """
+    # Serialize new data with consistent formatting matching json_dump()
+    new_content = json.dumps(
+        data,
+        ensure_ascii=False,
+        sort_keys=True,
+        indent=4,
+        separators=(",", ": "),
+    )
+
+    # Check if file exists and compare content
+    if os.path.exists(output_file_path):
+        try:
+            with codecs.open(output_file_path, "r", encoding="utf-8") as f:
+                existing_content = f.read()
+            if existing_content == new_content:
+                logger.debug(
+                    "Content unchanged, skipping write: {0}".format(output_file_path)
+                )
+                return False
+        except (OSError, UnicodeDecodeError) as e:
+            # If we can't read the existing file, write the new one
+            logger.debug(
+                "Error reading existing file {0}, will overwrite: {1}".format(
+                    output_file_path, e
+                )
+            )
+
+    # Write the file atomically using temp file + rename
+    temp_file = output_file_path + ".temp"
+    with codecs.open(temp_file, "w", encoding="utf-8") as f:
+        f.write(new_content)
+    os.replace(temp_file, output_file_path)  # Atomic write
+    return True
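
A small usage sketch of the new helper (hypothetical data; the call matches the diff above)::

    labels = [{"name": "bug"}, {"name": "docs"}]
    if json_dump_if_changed(labels, "labels.json"):
        print("labels.json written")
    else:
        print("labels.json unchanged; timestamp preserved")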
@@ -1,18 +1,18 @@
 autopep8==2.3.2
-black==25.11.0
+black==25.12.0
 bleach==6.3.0
 certifi==2025.11.12
 charset-normalizer==3.4.4
-click==8.3.0
+click==8.3.1
 colorama==0.4.6
 docutils==0.22.3
 flake8==7.3.0
 gitchangelog==3.0.4
-pytest==8.3.3
+pytest==9.0.2
 idna==3.11
 importlib-metadata==8.7.0
 jaraco.classes==3.4.0
-keyring==25.6.0
+keyring==25.7.0
 markdown-it-py==4.0.0
 mccabe==0.7.0
 mdurl==0.1.2
@@ -21,20 +21,20 @@ mypy-extensions==1.1.0
 packaging==25.0
 pathspec==0.12.1
 pkginfo==1.12.1.2
-platformdirs==4.5.0
+platformdirs==4.5.1
 pycodestyle==2.14.0
 pyflakes==3.4.0
 Pygments==2.19.2
 readme-renderer==44.0
 requests==2.32.5
 requests-toolbelt==1.0.0
-restructuredtext-lint==1.4.0
+restructuredtext-lint==2.0.2
 rfc3986==2.0.0
 rich==14.2.0
 setuptools==80.9.0
 six==1.17.0
 tqdm==4.67.1
 twine==6.2.0
-urllib3==2.5.0
+urllib3==2.6.1
 webencodings==0.5.1
 zipp==3.23.0
@@ -1 +0,0 @@
-
6
setup.py
@@ -33,7 +33,11 @@ setup(
    author="Jose Diaz-Gonzalez",
    author_email="github-backup@josediazgonzalez.com",
    packages=["github_backup"],
    scripts=["bin/github-backup"],
    entry_points={
        "console_scripts": [
            "github-backup=github_backup.cli:main",
        ],
    },
    url="http://github.com/josegonzalez/python-github-backup",
    license="MIT",
    classifiers=[
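The legacy scripts= entry ships a literal file from bin/, while entry_points has pip generate a launcher that imports and calls github_backup.cli:main, which also yields proper .exe shims on Windows. The target module itself is not shown in this diff; a hypothetical sketch of the shape setuptools expects:

# Hypothetical sketch only: the real github_backup/cli.py is not part of this
# diff. setuptools merely requires an importable callable at github_backup.cli:main.
import sys


def main():
    # parse arguments, run the backup, return a process exit code
    return 0


if __name__ == "__main__":
    sys.exit(main())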
161
tests/test_all_starred.py
Normal file
@@ -0,0 +1,161 @@
"""Tests for --all-starred flag behavior (issue #225)."""

import pytest
from unittest.mock import Mock, patch

from github_backup import github_backup


class TestAllStarredCloning:
    """Test suite for --all-starred repository cloning behavior.

    Issue #225: --all-starred should clone starred repos without requiring --repositories.
    """

    def _create_mock_args(self, **overrides):
        """Create a mock args object with sensible defaults."""
        args = Mock()
        args.user = "testuser"
        args.output_directory = "/tmp/backup"
        args.include_repository = False
        args.include_everything = False
        args.include_gists = False
        args.include_starred_gists = False
        args.all_starred = False
        args.skip_existing = False
        args.bare_clone = False
        args.lfs_clone = False
        args.no_prune = False
        args.include_wiki = False
        args.include_issues = False
        args.include_issue_comments = False
        args.include_issue_events = False
        args.include_pulls = False
        args.include_pull_comments = False
        args.include_pull_commits = False
        args.include_pull_details = False
        args.include_labels = False
        args.include_hooks = False
        args.include_milestones = False
        args.include_releases = False
        args.include_assets = False
        args.include_attachments = False
        args.incremental = False
        args.incremental_by_files = False
        args.github_host = None
        args.prefer_ssh = False
        args.token_classic = None
        args.token_fine = None
        args.username = None
        args.password = None
        args.as_app = False
        args.osx_keychain_item_name = None
        args.osx_keychain_item_account = None

        for key, value in overrides.items():
            setattr(args, key, value)

        return args

    @patch('github_backup.github_backup.fetch_repository')
    @patch('github_backup.github_backup.get_github_repo_url')
    def test_all_starred_clones_without_repositories_flag(self, mock_get_url, mock_fetch):
        """--all-starred should clone starred repos without --repositories flag.

        This is the core fix for issue #225.
        """
        args = self._create_mock_args(all_starred=True)
        mock_get_url.return_value = "https://github.com/otheruser/awesome-project.git"

        # A starred repository (is_starred flag set by retrieve_repositories)
        starred_repo = {
            "name": "awesome-project",
            "full_name": "otheruser/awesome-project",
            "owner": {"login": "otheruser"},
            "private": False,
            "fork": False,
            "has_wiki": False,
            "is_starred": True,  # This flag is set for starred repos
        }

        with patch('github_backup.github_backup.mkdir_p'):
            github_backup.backup_repositories(args, "/tmp/backup", [starred_repo])

        # fetch_repository should be called for the starred repo
        assert mock_fetch.called, "--all-starred should trigger repository cloning"
        mock_fetch.assert_called_once()
        call_args = mock_fetch.call_args
        assert call_args[0][0] == "awesome-project"  # repo name

    @patch('github_backup.github_backup.fetch_repository')
    @patch('github_backup.github_backup.get_github_repo_url')
    def test_starred_repo_not_cloned_without_all_starred_flag(self, mock_get_url, mock_fetch):
        """Starred repos should NOT be cloned if --all-starred is not set."""
        args = self._create_mock_args(all_starred=False)
        mock_get_url.return_value = "https://github.com/otheruser/awesome-project.git"

        starred_repo = {
            "name": "awesome-project",
            "full_name": "otheruser/awesome-project",
            "owner": {"login": "otheruser"},
            "private": False,
            "fork": False,
            "has_wiki": False,
            "is_starred": True,
        }

        with patch('github_backup.github_backup.mkdir_p'):
            github_backup.backup_repositories(args, "/tmp/backup", [starred_repo])

        # fetch_repository should NOT be called
        assert not mock_fetch.called, "Starred repos should not be cloned without --all-starred"

    @patch('github_backup.github_backup.fetch_repository')
    @patch('github_backup.github_backup.get_github_repo_url')
    def test_non_starred_repo_not_cloned_with_only_all_starred(self, mock_get_url, mock_fetch):
        """Non-starred repos should NOT be cloned when only --all-starred is set."""
        args = self._create_mock_args(all_starred=True)
        mock_get_url.return_value = "https://github.com/testuser/my-project.git"

        # A regular (non-starred) repository
        regular_repo = {
            "name": "my-project",
            "full_name": "testuser/my-project",
            "owner": {"login": "testuser"},
            "private": False,
            "fork": False,
            "has_wiki": False,
            # No is_starred flag
        }

        with patch('github_backup.github_backup.mkdir_p'):
            github_backup.backup_repositories(args, "/tmp/backup", [regular_repo])

        # fetch_repository should NOT be called for non-starred repos
        assert not mock_fetch.called, "Non-starred repos should not be cloned with only --all-starred"

    @patch('github_backup.github_backup.fetch_repository')
    @patch('github_backup.github_backup.get_github_repo_url')
    def test_repositories_flag_still_works(self, mock_get_url, mock_fetch):
        """--repositories flag should still clone repos as before."""
        args = self._create_mock_args(include_repository=True)
        mock_get_url.return_value = "https://github.com/testuser/my-project.git"

        regular_repo = {
            "name": "my-project",
            "full_name": "testuser/my-project",
            "owner": {"login": "testuser"},
            "private": False,
            "fork": False,
            "has_wiki": False,
        }

        with patch('github_backup.github_backup.mkdir_p'):
            github_backup.backup_repositories(args, "/tmp/backup", [regular_repo])

        # fetch_repository should be called
        assert mock_fetch.called, "--repositories should trigger repository cloning"


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
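The four cases above pin down a single guard per repository. A simplified reading of that condition follows; this is a sketch of the behavior the suite asserts, not the literal backup_repositories body, which interleaves it with wiki, gist, and metadata handling:

# Simplified sketch of the clone decision these tests exercise.
def should_clone(args, repository):
    if repository.get("is_starred"):
        # Starred repos are only cloned when --all-starred is passed.
        return bool(args.all_starred)
    # Regular repos still follow --repositories (or an include-everything run).
    return bool(args.include_repository or args.include_everything)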
112
tests/test_case_sensitivity.py
Normal file
@@ -0,0 +1,112 @@
"""Tests for case-insensitive username/organization filtering."""

import pytest
from unittest.mock import Mock

from github_backup import github_backup


class TestCaseSensitivity:
    """Test suite for case-insensitive username matching in filter_repositories."""

    def test_filter_repositories_case_insensitive_user(self):
        """Should filter repositories case-insensitively for usernames.

        Reproduces issue #198 where typing 'iamrodos' fails to match
        repositories with owner.login='Iamrodos' (the canonical case from GitHub API).
        """
        # Simulate user typing lowercase username
        args = Mock()
        args.user = "iamrodos"  # lowercase (what user typed)
        args.repository = None
        args.name_regex = None
        args.languages = None
        args.exclude = None
        args.fork = False
        args.private = False
        args.public = False
        args.all = True

        # Simulate GitHub API returning canonical case
        repos = [
            {
                "name": "repo1",
                "owner": {"login": "Iamrodos"},  # Capital I (canonical from API)
                "private": False,
                "fork": False,
            },
            {
                "name": "repo2",
                "owner": {"login": "Iamrodos"},
                "private": False,
                "fork": False,
            },
        ]

        filtered = github_backup.filter_repositories(args, repos)

        # Should match despite case difference
        assert len(filtered) == 2
        assert filtered[0]["name"] == "repo1"
        assert filtered[1]["name"] == "repo2"

    def test_filter_repositories_case_insensitive_org(self):
        """Should filter repositories case-insensitively for organizations.

        Tests the example from issue #198 where 'prai-org' doesn't match 'PRAI-Org'.
        """
        args = Mock()
        args.user = "prai-org"  # lowercase (what user typed)
        args.repository = None
        args.name_regex = None
        args.languages = None
        args.exclude = None
        args.fork = False
        args.private = False
        args.public = False
        args.all = True

        repos = [
            {
                "name": "repo1",
                "owner": {"login": "PRAI-Org"},  # Different case (canonical from API)
                "private": False,
                "fork": False,
            },
        ]

        filtered = github_backup.filter_repositories(args, repos)

        # Should match despite case difference
        assert len(filtered) == 1
        assert filtered[0]["name"] == "repo1"

    def test_filter_repositories_case_variations(self):
        """Should handle various case combinations correctly."""
        args = Mock()
        args.user = "TeSt-UsEr"  # Mixed case
        args.repository = None
        args.name_regex = None
        args.languages = None
        args.exclude = None
        args.fork = False
        args.private = False
        args.public = False
        args.all = True

        repos = [
            {"name": "repo1", "owner": {"login": "test-user"}, "private": False, "fork": False},
            {"name": "repo2", "owner": {"login": "TEST-USER"}, "private": False, "fork": False},
            {"name": "repo3", "owner": {"login": "TeSt-UsEr"}, "private": False, "fork": False},
            {"name": "repo4", "owner": {"login": "other-user"}, "private": False, "fork": False},
        ]

        filtered = github_backup.filter_repositories(args, repos)

        # Should match first 3 (all case variations of same user)
        assert len(filtered) == 3
        assert set(r["name"] for r in filtered) == {"repo1", "repo2", "repo3"}


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
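GitHub treats logins as case-insensitive, so a lowercase --user value must match the canonical casing the API returns. The owner comparison these tests imply is a plain case-fold, roughly as sketched below; the real filter_repositories also applies the name, regex, language, fork, and visibility filters set up in the fixtures:

# Sketch of the case-insensitive owner match the tests above assert.
def owner_matches(args, repository):
    return repository["owner"]["login"].lower() == args.user.lower()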
143
tests/test_http_451.py
Normal file
@@ -0,0 +1,143 @@
"""Tests for HTTP 451 (DMCA takedown) handling."""

import json
from unittest.mock import Mock, patch

import pytest

from github_backup import github_backup


class TestHTTP451Exception:
    """Test suite for HTTP 451 DMCA takedown exception handling."""

    def test_repository_unavailable_error_raised(self):
        """HTTP 451 should raise RepositoryUnavailableError with DMCA URL."""
        # Create mock args
        args = Mock()
        args.as_app = False
        args.token_fine = None
        args.token_classic = None
        args.username = None
        args.password = None
        args.osx_keychain_item_name = None
        args.osx_keychain_item_account = None
        args.throttle_limit = None
        args.throttle_pause = 0

        # Mock HTTPError 451 response
        mock_response = Mock()
        mock_response.getcode.return_value = 451

        dmca_data = {
            "message": "Repository access blocked",
            "block": {
                "reason": "dmca",
                "created_at": "2024-11-12T14:38:04Z",
                "html_url": "https://github.com/github/dmca/blob/master/2024/11/2024-11-04-source-code.md"
            }
        }
        mock_response.read.return_value = json.dumps(dmca_data).encode("utf-8")
        mock_response.headers = {"x-ratelimit-remaining": "5000"}
        mock_response.reason = "Unavailable For Legal Reasons"

        def mock_get_response(request, auth, template):
            return mock_response, []

        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
            with pytest.raises(github_backup.RepositoryUnavailableError) as exc_info:
                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/dmca/issues"))

        # Check exception has DMCA URL
        assert exc_info.value.dmca_url == "https://github.com/github/dmca/blob/master/2024/11/2024-11-04-source-code.md"
        assert "451" in str(exc_info.value)

    def test_repository_unavailable_error_without_dmca_url(self):
        """HTTP 451 without DMCA details should still raise exception."""
        args = Mock()
        args.as_app = False
        args.token_fine = None
        args.token_classic = None
        args.username = None
        args.password = None
        args.osx_keychain_item_name = None
        args.osx_keychain_item_account = None
        args.throttle_limit = None
        args.throttle_pause = 0

        mock_response = Mock()
        mock_response.getcode.return_value = 451
        mock_response.read.return_value = b'{"message": "Blocked"}'
        mock_response.headers = {"x-ratelimit-remaining": "5000"}
        mock_response.reason = "Unavailable For Legal Reasons"

        def mock_get_response(request, auth, template):
            return mock_response, []

        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
            with pytest.raises(github_backup.RepositoryUnavailableError) as exc_info:
                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/dmca/issues"))

        # Exception raised even without DMCA URL
        assert exc_info.value.dmca_url is None
        assert "451" in str(exc_info.value)

    def test_repository_unavailable_error_with_malformed_json(self):
        """HTTP 451 with malformed JSON should still raise exception."""
        args = Mock()
        args.as_app = False
        args.token_fine = None
        args.token_classic = None
        args.username = None
        args.password = None
        args.osx_keychain_item_name = None
        args.osx_keychain_item_account = None
        args.throttle_limit = None
        args.throttle_pause = 0

        mock_response = Mock()
        mock_response.getcode.return_value = 451
        mock_response.read.return_value = b"invalid json {"
        mock_response.headers = {"x-ratelimit-remaining": "5000"}
        mock_response.reason = "Unavailable For Legal Reasons"

        def mock_get_response(request, auth, template):
            return mock_response, []

        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
            with pytest.raises(github_backup.RepositoryUnavailableError):
                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/dmca/issues"))

    def test_other_http_errors_unchanged(self):
        """Other HTTP errors should still raise generic Exception."""
        args = Mock()
        args.as_app = False
        args.token_fine = None
        args.token_classic = None
        args.username = None
        args.password = None
        args.osx_keychain_item_name = None
        args.osx_keychain_item_account = None
        args.throttle_limit = None
        args.throttle_pause = 0

        mock_response = Mock()
        mock_response.getcode.return_value = 404
        mock_response.read.return_value = b'{"message": "Not Found"}'
        mock_response.headers = {"x-ratelimit-remaining": "5000"}
        mock_response.reason = "Not Found"

        def mock_get_response(request, auth, template):
            return mock_response, []

        with patch("github_backup.github_backup._get_response", side_effect=mock_get_response):
            # Should raise generic Exception, not RepositoryUnavailableError
            with pytest.raises(Exception) as exc_info:
                list(github_backup.retrieve_data_gen(args, "https://api.github.com/repos/test/notfound/issues"))

        assert not isinstance(exc_info.value, github_backup.RepositoryUnavailableError)
        assert "404" in str(exc_info.value)


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
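The suite fixes both the exception type and its dmca_url attribute. A sketch of the shape it relies on, together with the 451 branch it implies inside retrieve_data_gen, follows; the real definitions live in github_backup/github_backup.py and may differ in detail:

# Sketch only: exception shape and 451 branch as implied by the tests above.
import json


class RepositoryUnavailableError(Exception):
    """Raised when GitHub returns 451 (blocked, e.g. a DMCA takedown)."""

    def __init__(self, message, dmca_url=None):
        super().__init__(message)
        self.dmca_url = dmca_url  # link into github/dmca when the API provides one


def raise_for_451(status, body):
    if status == 451:
        dmca_url = None
        try:
            dmca_url = json.loads(body).get("block", {}).get("html_url")
        except (ValueError, AttributeError):
            pass  # malformed JSON still raises, just without a URL
        raise RepositoryUnavailableError(
            "API request returned HTTP 451: repository unavailable", dmca_url
        )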
198
tests/test_json_dump_if_changed.py
Normal file
@@ -0,0 +1,198 @@
"""Tests for json_dump_if_changed functionality."""

import codecs
import json
import os
import tempfile

import pytest

from github_backup import github_backup


class TestJsonDumpIfChanged:
    """Test suite for json_dump_if_changed function."""

    def test_writes_new_file(self):
        """Should write file when it doesn't exist."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data = {"key": "value", "number": 42}

            result = github_backup.json_dump_if_changed(test_data, output_file)

            assert result is True
            assert os.path.exists(output_file)

            # Verify content matches expected format
            with codecs.open(output_file, "r", encoding="utf-8") as f:
                content = f.read()
                loaded = json.loads(content)
                assert loaded == test_data

    def test_skips_unchanged_file(self):
        """Should skip write when content is identical."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data = {"key": "value", "number": 42}

            # First write
            result1 = github_backup.json_dump_if_changed(test_data, output_file)
            assert result1 is True

            # Get the initial mtime
            mtime1 = os.path.getmtime(output_file)

            # Second write with same data
            result2 = github_backup.json_dump_if_changed(test_data, output_file)
            assert result2 is False

            # File should not have been modified
            mtime2 = os.path.getmtime(output_file)
            assert mtime1 == mtime2

    def test_writes_when_content_changed(self):
        """Should write file when content has changed."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data1 = {"key": "value1"}
            test_data2 = {"key": "value2"}

            # First write
            result1 = github_backup.json_dump_if_changed(test_data1, output_file)
            assert result1 is True

            # Second write with different data
            result2 = github_backup.json_dump_if_changed(test_data2, output_file)
            assert result2 is True

            # Verify new content
            with codecs.open(output_file, "r", encoding="utf-8") as f:
                loaded = json.load(f)
                assert loaded == test_data2

    def test_uses_consistent_formatting(self):
        """Should use same JSON formatting as json_dump."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data = {"z": "last", "a": "first", "m": "middle"}

            github_backup.json_dump_if_changed(test_data, output_file)

            with codecs.open(output_file, "r", encoding="utf-8") as f:
                content = f.read()

            # Check for consistent formatting:
            # - sorted keys
            # - 4-space indent
            # - comma-colon-space separator
            expected = json.dumps(
                test_data,
                ensure_ascii=False,
                sort_keys=True,
                indent=4,
                separators=(",", ": "),
            )
            assert content == expected

    def test_atomic_write_always_used(self):
        """Should always use temp file and rename for atomic writes."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data = {"key": "value"}

            result = github_backup.json_dump_if_changed(test_data, output_file)

            assert result is True
            assert os.path.exists(output_file)

            # Temp file should not exist after atomic write
            temp_file = output_file + ".temp"
            assert not os.path.exists(temp_file)

            # Verify content
            with codecs.open(output_file, "r", encoding="utf-8") as f:
                loaded = json.load(f)
                assert loaded == test_data

    def test_handles_unicode_content(self):
        """Should correctly handle Unicode content."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data = {
                "emoji": "🚀",
                "chinese": "你好",
                "arabic": "مرحبا",
                "cyrillic": "Привет",
            }

            result = github_backup.json_dump_if_changed(test_data, output_file)
            assert result is True

            # Verify Unicode is preserved
            with codecs.open(output_file, "r", encoding="utf-8") as f:
                loaded = json.load(f)
                assert loaded == test_data

            # Second write should skip
            result2 = github_backup.json_dump_if_changed(test_data, output_file)
            assert result2 is False

    def test_handles_complex_nested_data(self):
        """Should handle complex nested data structures."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data = {
                "users": [
                    {"id": 1, "name": "Alice", "tags": ["admin", "user"]},
                    {"id": 2, "name": "Bob", "tags": ["user"]},
                ],
                "metadata": {"version": "1.0", "nested": {"deep": {"value": 42}}},
            }

            result = github_backup.json_dump_if_changed(test_data, output_file)
            assert result is True

            # Verify structure is preserved
            with codecs.open(output_file, "r", encoding="utf-8") as f:
                loaded = json.load(f)
                assert loaded == test_data

    def test_overwrites_on_unicode_decode_error(self):
        """Should overwrite if existing file has invalid UTF-8."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")
            test_data = {"key": "value"}

            # Write invalid UTF-8 bytes
            with open(output_file, "wb") as f:
                f.write(b"\xff\xfe invalid utf-8")

            # Should catch UnicodeDecodeError and overwrite
            result = github_backup.json_dump_if_changed(test_data, output_file)
            assert result is True

            # Verify new content was written
            with codecs.open(output_file, "r", encoding="utf-8") as f:
                loaded = json.load(f)
                assert loaded == test_data

    def test_key_order_independence(self):
        """Should treat differently-ordered dicts as same if keys/values match."""
        with tempfile.TemporaryDirectory() as tmpdir:
            output_file = os.path.join(tmpdir, "test.json")

            # Write first dict
            data1 = {"z": 1, "a": 2, "m": 3}
            github_backup.json_dump_if_changed(data1, output_file)

            # Try to write same data but different order
            data2 = {"a": 2, "m": 3, "z": 1}
            result = github_backup.json_dump_if_changed(data2, output_file)

            # Should skip because content is the same (keys are sorted)
            assert result is False


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
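The key-order case passes because serialization itself normalizes ordering: with sort_keys=True the dumped text is a canonical form, so two dicts with the same keys and values always compare equal as strings. A three-line illustration:

import json

# sort_keys=True canonicalizes ordering, which is why the second write above is skipped.
a = json.dumps({"z": 1, "a": 2}, sort_keys=True)
b = json.dumps({"a": 2, "z": 1}, sort_keys=True)
assert a == b  # both serialize to '{"a": 2, "z": 1}'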
153
tests/test_pagination.py
Normal file
@@ -0,0 +1,153 @@
"""Tests for Link header pagination handling."""

import json
from unittest.mock import Mock, patch

import pytest

from github_backup import github_backup


class MockHTTPResponse:
    """Mock HTTP response for paginated API calls."""

    def __init__(self, data, link_header=None):
        self._content = json.dumps(data).encode("utf-8")
        self._link_header = link_header
        self._read = False
        self.reason = "OK"

    def getcode(self):
        return 200

    def read(self):
        if self._read:
            return b""
        self._read = True
        return self._content

    def get_header(self, name, default=None):
        """Mock method for headers.get()."""
        return self.headers.get(name, default)

    @property
    def headers(self):
        headers = {"x-ratelimit-remaining": "5000"}
        if self._link_header:
            headers["Link"] = self._link_header
        return headers


@pytest.fixture
def mock_args():
    """Mock args for retrieve_data_gen."""
    args = Mock()
    args.as_app = False
    args.token_fine = None
    args.token_classic = "fake_token"
    args.username = None
    args.password = None
    args.osx_keychain_item_name = None
    args.osx_keychain_item_account = None
    args.throttle_limit = None
    args.throttle_pause = 0
    return args


def test_cursor_based_pagination(mock_args):
    """Link header with 'after' cursor parameter works correctly."""

    # Simulate issues endpoint behavior: returns cursor in Link header
    responses = [
        # Issues endpoint returns 'after' cursor parameter (not 'page')
        MockHTTPResponse(
            data=[{"issue": i} for i in range(1, 101)],  # Page 1 contents
            link_header='<https://api.github.com/repos/owner/repo/issues?per_page=100&after=ABC123&page=2>; rel="next"',
        ),
        MockHTTPResponse(
            data=[{"issue": i} for i in range(101, 151)],  # Page 2 contents
            link_header=None,  # No Link header - signals end of pagination
        ),
    ]
    requests_made = []

    def mock_urlopen(request, *args, **kwargs):
        url = request.get_full_url()
        requests_made.append(url)
        return responses[len(requests_made) - 1]

    with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
        results = list(
            github_backup.retrieve_data_gen(
                mock_args, "https://api.github.com/repos/owner/repo/issues"
            )
        )

    # Verify all items retrieved and cursor was used in second request
    assert len(results) == 150
    assert len(requests_made) == 2
    assert "after=ABC123" in requests_made[1]


def test_page_based_pagination(mock_args):
    """Link header with 'page' parameter works correctly."""

    # Simulate pulls/repos endpoint behavior: returns page numbers in Link header
    responses = [
        # Pulls endpoint uses traditional 'page' parameter (not cursor)
        MockHTTPResponse(
            data=[{"pull": i} for i in range(1, 101)],  # Page 1 contents
            link_header='<https://api.github.com/repos/owner/repo/pulls?per_page=100&page=2>; rel="next"',
        ),
        MockHTTPResponse(
            data=[{"pull": i} for i in range(101, 181)],  # Page 2 contents
            link_header=None,  # No Link header - signals end of pagination
        ),
    ]
    requests_made = []

    def mock_urlopen(request, *args, **kwargs):
        url = request.get_full_url()
        requests_made.append(url)
        return responses[len(requests_made) - 1]

    with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
        results = list(
            github_backup.retrieve_data_gen(
                mock_args, "https://api.github.com/repos/owner/repo/pulls"
            )
        )

    # Verify all items retrieved and page parameter was used (not cursor)
    assert len(results) == 180
    assert len(requests_made) == 2
    assert "page=2" in requests_made[1]
    assert "after" not in requests_made[1]


def test_no_link_header_stops_pagination(mock_args):
    """Pagination stops when Link header is absent."""

    # Simulate endpoint with results that fit in a single page
    responses = [
        MockHTTPResponse(
            data=[{"label": i} for i in range(1, 51)],  # Page contents
            link_header=None,  # No Link header - signals end of pagination
        )
    ]
    requests_made = []

    def mock_urlopen(request, *args, **kwargs):
        requests_made.append(request.get_full_url())
        return responses[len(requests_made) - 1]

    with patch("github_backup.github_backup.urlopen", side_effect=mock_urlopen):
        results = list(
            github_backup.retrieve_data_gen(
                mock_args, "https://api.github.com/repos/owner/repo/labels"
            )
        )

    # Verify pagination stopped after first request
    assert len(results) == 50
    assert len(requests_made) == 1
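Both the cursor-based and page-based tests pass with the same mechanism: follow the rel="next" URL from the Link header verbatim rather than incrementing a page counter, and stop when the header is absent. A sketch of that extraction, matching the behavior asserted above (the real retrieve_data_gen may parse the header differently):

# Sketch of next-URL extraction from a Link header.
import re


def next_link(link_header):
    if not link_header:
        return None  # absent Link header ends pagination
    for part in link_header.split(","):
        match = re.search(r'<([^>]+)>;\s*rel="next"', part)
        if match:
            return match.group(1)  # follow verbatim, whether cursor- or page-style
    return None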