Compare commits

...

62 Commits

Author SHA1 Message Date
GitHub Action
72d35a9b94 Release version 0.51.2 2025-11-16 23:55:36 +00:00
Jose Diaz-Gonzalez
3eae9d78ed Merge pull request #447 from Iamrodos/master
fix: Improve CA certificate detection with fallback chain
2025-11-16 18:54:58 -05:00
Rodos
90ba839c7d fix: Improve CA certificate detection with fallback chain
The previous implementation incorrectly assumed empty get_ca_certs()
meant broken SSL, causing false failures in GitHub Codespaces and other
directory-based cert systems where certificates exist but aren't pre-loaded.
It would then attempt to import certifi as a workaround, but certifi wasn't
listed in requirements.txt, causing the fallback to fail with ImportError
even though the system certificates would have worked fine.

This commit replaces the naive check with a layered fallback approach that
checks multiple certificate sources. First it checks for pre-loaded system
certs (file-based systems). Then it verifies system cert paths exist
(directory-based systems like Ubuntu/Debian/Codespaces). Finally it attempts
to use certifi as an optional fallback only if needed.

This approach eliminates hard dependencies (certifi is now optional), works
in GitHub Codespaces without any setup, and fails gracefully with clear hints
for resolution when SSL is actually broken rather than failing with
ModuleNotFoundError.

Fixes #444
2025-11-16 16:33:10 +11:00
GitHub Action
1ec0820936 Release version 0.51.1 2025-11-16 02:01:39 +00:00
Jose Diaz-Gonzalez
ca463e5cd4 Merge pull request #446 from josegonzalez/dependabot/pip/python-packages-4ff811fbf7
chore(deps): bump certifi from 2025.10.5 to 2025.11.12 in the python-packages group
2025-11-15 21:01:01 -05:00
Jose Diaz-Gonzalez
1750d0eff1 Merge pull request #448 from Iamrodos/fix/attachment-duplicate-downloads
fix: Prevent duplicate attachment downloads (with tests)
2025-11-15 21:00:00 -05:00
Rodos
e4d1c78993 test: Add pytest infrastructure and attachment tests
In making my last fix to attachments, I found it challenging not
having tests to ensure there was no regression.

Added pytest with minimal setup and isolated configuration. Created
a separate test workflow to keep tests isolated from linting.

Tests cover the key elements of the attachment logic:
- URL extraction from issue bodies
- Filename extraction from different URL types
- Filename collision resolution
- Manifest duplicate prevention
2025-11-14 10:28:30 +11:00
Rodos
7a9455db88 fix: Prevent duplicate attachment downloads
Fixes bug where attachments were downloaded multiple times with
incremented filenames (file.mov, file_1.mov, file_2.mov) when
running backups without --skip-existing flag.

I should not have used the --skip-existing flag for attachments,
it did not do what I thought it did.

The correct approach is to always use the manifest to guide what
has already been downloaded and what now needs to be done.
2025-11-14 10:28:30 +11:00
dependabot[bot]
a98ff7f23d chore(deps): bump certifi in the python-packages group
Bumps the python-packages group with 1 update: [certifi](https://github.com/certifi/python-certifi).


Updates `certifi` from 2025.10.5 to 2025.11.12
- [Commits](https://github.com/certifi/python-certifi/compare/2025.10.05...2025.11.12)

---
updated-dependencies:
- dependency-name: certifi
  dependency-version: 2025.11.12
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-12 13:11:06 +00:00
Jose Diaz-Gonzalez
7b78f06a68 Merge pull request #445 from josegonzalez/dependabot/pip/python-packages-499fb03faa
chore(deps): bump black from 25.9.0 to 25.11.0 in the python-packages group
2025-11-10 12:45:25 -05:00
dependabot[bot]
56db3ff0e8 chore(deps): bump black in the python-packages group
Bumps the python-packages group with 1 update: [black](https://github.com/psf/black).


Updates `black` from 25.9.0 to 25.11.0
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/25.9.0...25.11.0)

---
updated-dependencies:
- dependency-name: black
  dependency-version: 25.11.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-10 13:59:47 +00:00
Jose Diaz-Gonzalez
5c9c20f6ee Merge pull request #443 from josegonzalez/dependabot/pip/python-packages-7fb8ba35da
chore(deps): bump docutils from 0.22.2 to 0.22.3 in the python-packages group
2025-11-07 15:56:55 -05:00
dependabot[bot]
c8c585cbb5 chore(deps): bump docutils in the python-packages group
Bumps the python-packages group with 1 update: [docutils](https://github.com/rtfd/recommonmark).


Updates `docutils` from 0.22.2 to 0.22.3
- [Changelog](https://github.com/readthedocs/recommonmark/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rtfd/recommonmark/commits)

---
updated-dependencies:
- dependency-name: docutils
  dependency-version: 0.22.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-06 13:09:51 +00:00
GitHub Action
e7880bb056 Release version 0.51.0 2025-11-06 02:11:08 +00:00
Jose Diaz-Gonzalez
18e3bd574a Merge pull request #439 from Iamrodos/master
feat: Add attachment download support for issues and pull requests
2025-11-05 21:10:03 -05:00
Rodos
1ed3d66777 refactor: Add atomic writes for attachment files and manifests 2025-11-06 12:42:57 +11:00
Rodos
a194fa48ce feat: Add attachment download support for issues and pull requests
Adds new --attachments flag that downloads user-uploaded files from
issue and PR bodies and comments. Key features:

- Determines attachment URLs
- Tracks downloads in manifest.json with metadata
- Supports --skip-existing to avoid re-downloading
- Handles filename collisions with counter suffix
- Smart retry logic for transient vs permanent failures
- Uses Content-Disposition for correct file extensions
2025-11-06 12:42:57 +11:00
Jose Diaz-Gonzalez
8f859be355 Merge pull request #441 from Iamrodos/feat/python-version-requirements-clean
chore: Enforce Python 3.8+ requirement and add multi-version CI testing
2025-11-05 20:25:47 -05:00
rodos
80e00d31d9 Merge branch 'master' into feat/python-version-requirements-clean 2025-11-06 10:01:22 +11:00
Jose Diaz-Gonzalez
32202656ba Merge pull request #442 from Iamrodos/feat/drop-python-3.8-3.9-support
chore: Drop support for Python 3.8 and 3.9 (EOL)
2025-11-05 17:45:40 -05:00
Rodos
875e31819a feat: Drop support for Python 3.8 and 3.9 (EOL)
Both Python 3.8 and 3.9 have reached end-of-life:
- Python 3.8: EOL October 7, 2024
- Python 3.9: EOL October 31, 2025

Changes:
- Add python_requires=">=3.10" to setup.py
- Remove Python 3.8 and 3.9 from classifiers
- Add Python 3.13 and 3.14 to classifiers
- Update README to document Python 3.10+ requirement
2025-11-04 13:53:41 +11:00
Rodos
73dc75ab95 fix: Remove Python 3.8 and 3.9 from CI matrix
3.8 and 3.9 are failing because the pinned dependencies don't support them:
- autopep8==2.3.2 needs Python 3.9+
- bleach==6.3.0 needs Python 3.10+

Both are EOL now anyway (3.8 in Oct 2024, 3.9 in Oct 2025).

Just fixing CI to test 3.10-3.14 for now. Will do a separate PR to formally
drop 3.8/3.9 support with python_requires and README updates.
2025-11-04 13:30:42 +11:00
Rodos
cd23dd1a16 feat: Enforce Python 3.8+ requirement and add multi-version CI testing
- Add python_requires=">=3.8" to setup.py to enforce minimum version at install time
- Update README to explicitly document Python 3.8+ requirement
- Add CI matrix to test lint/build on Python 3.8-3.14 (7 versions)
- Aligns with actual usage patterns (~99% of downloads on Python 3.8+)
- Prevents future PRs from inadvertently using incompatible syntax

This change protects users by preventing installation on unsupported Python
versions and ensures contributors can see version requirements clearly.
2025-11-04 10:47:13 +11:00
Jose Diaz-Gonzalez
d244de1952 Merge pull request #438 from josegonzalez/dependabot/pip/python-packages-7355fe9fde
chore(deps): bump bleach from 6.2.0 to 6.3.0 in the python-packages group
2025-10-28 16:51:18 -04:00
dependabot[bot]
4dae43c58e chore(deps): bump bleach in the python-packages group
Bumps the python-packages group with 1 update: [bleach](https://github.com/mozilla/bleach).


Updates `bleach` from 6.2.0 to 6.3.0
- [Changelog](https://github.com/mozilla/bleach/blob/main/CHANGES)
- [Commits](https://github.com/mozilla/bleach/compare/v6.2.0...v6.3.0)

---
updated-dependencies:
- dependency-name: bleach
  dependency-version: 6.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-28 13:11:27 +00:00
Jose Diaz-Gonzalez
b018a91fb4 Merge pull request #437 from josegonzalez/dependabot/pip/python-packages-9224e307fb
chore(deps): bump charset-normalizer from 3.4.3 to 3.4.4 in the python-packages group
2025-10-14 17:22:22 -04:00
dependabot[bot]
759ec58beb chore(deps): bump charset-normalizer in the python-packages group
Bumps the python-packages group with 1 update: [charset-normalizer](https://github.com/jawah/charset_normalizer).


Updates `charset-normalizer` from 3.4.3 to 3.4.4
- [Release notes](https://github.com/jawah/charset_normalizer/releases)
- [Changelog](https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jawah/charset_normalizer/compare/3.4.3...3.4.4)

---
updated-dependencies:
- dependency-name: charset-normalizer
  dependency-version: 3.4.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-14 13:10:22 +00:00
Jose Diaz-Gonzalez
b43c998b65 Merge pull request #436 from josegonzalez/dependabot/pip/python-packages-df716dda22
chore(deps): bump idna from 3.10 to 3.11 in the python-packages group
2025-10-13 18:09:50 -04:00
dependabot[bot]
38b4a2c106 chore(deps): bump idna from 3.10 to 3.11 in the python-packages group
Bumps the python-packages group with 1 update: [idna](https://github.com/kjd/idna).


Updates `idna` from 3.10 to 3.11
- [Release notes](https://github.com/kjd/idna/releases)
- [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst)
- [Commits](https://github.com/kjd/idna/compare/v3.10...v3.11)

---
updated-dependencies:
- dependency-name: idna
  dependency-version: '3.11'
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-13 13:42:50 +00:00
Jose Diaz-Gonzalez
6210ec3845 Merge pull request #435 from josegonzalez/dependabot/pip/python-packages-7695ffecbf
chore(deps): bump the python-packages group across 1 directory with 2 updates
2025-10-12 17:29:47 -04:00
dependabot[bot]
90396d2bdf chore(deps): bump the python-packages group across 1 directory with 2 updates
Bumps the python-packages group with 2 updates in the / directory: [platformdirs](https://github.com/tox-dev/platformdirs) and [rich](https://github.com/Textualize/rich).


Updates `platformdirs` from 4.4.0 to 4.5.0
- [Release notes](https://github.com/tox-dev/platformdirs/releases)
- [Changelog](https://github.com/tox-dev/platformdirs/blob/main/CHANGES.rst)
- [Commits](https://github.com/tox-dev/platformdirs/compare/4.4.0...4.5.0)

Updates `rich` from 14.1.0 to 14.2.0
- [Release notes](https://github.com/Textualize/rich/releases)
- [Changelog](https://github.com/Textualize/rich/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Textualize/rich/compare/v14.1.0...v14.2.0)

---
updated-dependencies:
- dependency-name: platformdirs
  dependency-version: 4.5.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
- dependency-name: rich
  dependency-version: 14.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-10 13:09:42 +00:00
Jose Diaz-Gonzalez
aa35e883b0 Merge pull request #433 from josegonzalez/dependabot/pip/python-packages-a446e2f58d
chore(deps): bump the python-packages group with 3 updates
2025-10-07 14:05:20 -04:00
dependabot[bot]
963ed3e6f6 chore(deps): bump the python-packages group with 3 updates
Bumps the python-packages group with 3 updates: [certifi](https://github.com/certifi/python-certifi), [click](https://github.com/pallets/click) and [markdown-it-py](https://github.com/executablebooks/markdown-it-py).


Updates `certifi` from 2025.8.3 to 2025.10.5
- [Commits](https://github.com/certifi/python-certifi/compare/2025.08.03...2025.10.05)

Updates `click` from 8.1.8 to 8.3.0
- [Release notes](https://github.com/pallets/click/releases)
- [Changelog](https://github.com/pallets/click/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/click/compare/8.1.8...8.3.0)

Updates `markdown-it-py` from 3.0.0 to 4.0.0
- [Release notes](https://github.com/executablebooks/markdown-it-py/releases)
- [Changelog](https://github.com/executablebooks/markdown-it-py/blob/master/CHANGELOG.md)
- [Commits](https://github.com/executablebooks/markdown-it-py/compare/v3.0.0...v4.0.0)

---
updated-dependencies:
- dependency-name: certifi
  dependency-version: 2025.10.5
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
- dependency-name: click
  dependency-version: 8.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
- dependency-name: markdown-it-py
  dependency-version: 4.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-06 13:53:31 +00:00
Jose Diaz-Gonzalez
b710547fdc Merge pull request #432 from josegonzalez/dependabot/pip/python-packages-ba4e83c9d8
chore(deps): bump docutils from 0.22.1 to 0.22.2 in the python-packages group
2025-09-22 21:18:39 -04:00
dependabot[bot]
64b5667a16 chore(deps): bump docutils in the python-packages group
Bumps the python-packages group with 1 update: [docutils](https://github.com/rtfd/recommonmark).


Updates `docutils` from 0.22.1 to 0.22.2
- [Changelog](https://github.com/readthedocs/recommonmark/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rtfd/recommonmark/commits)

---
updated-dependencies:
- dependency-name: docutils
  dependency-version: 0.22.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-22 13:12:10 +00:00
Jose Diaz-Gonzalez
b0c8cfe059 Merge pull request #431 from josegonzalez/dependabot/pip/python-packages-ecd2129f1c
chore(deps): bump the python-packages group across 1 directory with 2 updates
2025-09-19 23:26:26 -04:00
dependabot[bot]
5bedaf825f chore(deps): bump the python-packages group across 1 directory with 2 updates
Bumps the python-packages group with 2 updates in the / directory: [black](https://github.com/psf/black) and [docutils](https://github.com/rtfd/recommonmark).


Updates `black` from 25.1.0 to 25.9.0
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/25.1.0...25.9.0)

Updates `docutils` from 0.22 to 0.22.1
- [Changelog](https://github.com/readthedocs/recommonmark/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rtfd/recommonmark/commits)

---
updated-dependencies:
- dependency-name: black
  dependency-version: 25.9.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
- dependency-name: docutils
  dependency-version: 0.22.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-19 13:09:40 +00:00
Jose Diaz-Gonzalez
9d28d9c2b0 Update feature.yaml 2025-09-11 16:34:50 -04:00
Jose Diaz-Gonzalez
eb756d665c Delete .github/ISSUE_TEMPLATE.md 2025-09-11 16:34:18 -04:00
Jose Diaz-Gonzalez
3d5f61aa22 Create feature.yaml 2025-09-11 16:33:49 -04:00
Jose Diaz-Gonzalez
d6bf031bf7 Delete .github/ISSUE_TEMPLATE/bug_report.md 2025-09-11 16:32:32 -04:00
Jose Diaz-Gonzalez
85ab54e514 Update issue templates 2025-09-11 16:31:38 -04:00
Jose Diaz-Gonzalez
df4d751be2 Rename bug.md to bug.yaml 2025-09-11 16:30:46 -04:00
Jose Diaz-Gonzalez
03c660724d chore: create bug template 2025-09-11 16:30:10 -04:00
Jose Diaz-Gonzalez
39848e650c chore: Rename PULL_REQUEST.md to .github/PULL_REQUEST.md 2025-09-11 16:27:23 -04:00
Jose Diaz-Gonzalez
12ac519e9c chore: Rename ISSUE_TEMPLATE.md to .github/ISSUE_TEMPLATE.md 2025-09-11 16:26:53 -04:00
Jose Diaz-Gonzalez
9e25473151 Merge pull request #428 from josegonzalez/dependabot/github_actions/actions/setup-python-6
chore(deps): bump actions/setup-python from 5 to 6
2025-09-08 00:20:12 -04:00
dependabot[bot]
d3079bfb74 chore(deps): bump actions/setup-python from 5 to 6
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-08 04:10:35 +00:00
Jose Diaz-Gonzalez
3b9ff1ac14 Merge pull request #427 from josegonzalez/dependabot/pip/python-packages-bc0667daba
chore(deps): bump twine from 6.1.0 to 6.2.0 in the python-packages group
2025-09-05 15:00:30 -04:00
dependabot[bot]
268a989b09 chore(deps): bump twine from 6.1.0 to 6.2.0 in the python-packages group
Bumps the python-packages group with 1 update: [twine](https://github.com/pypa/twine).


Updates `twine` from 6.1.0 to 6.2.0
- [Release notes](https://github.com/pypa/twine/releases)
- [Changelog](https://github.com/pypa/twine/blob/main/docs/changelog.rst)
- [Commits](https://github.com/pypa/twine/compare/6.1.0...6.2.0)

---
updated-dependencies:
- dependency-name: twine
  dependency-version: 6.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-05 13:09:08 +00:00
Jose Diaz-Gonzalez
45a3b87892 Merge pull request #426 from josegonzalez/dependabot/pip/python-packages-177133f34b
chore(deps): bump more-itertools from 10.7.0 to 10.8.0 in the python-packages group
2025-09-03 21:22:11 -04:00
dependabot[bot]
1c465f4d35 chore(deps): bump more-itertools in the python-packages group
Bumps the python-packages group with 1 update: [more-itertools](https://github.com/more-itertools/more-itertools).


Updates `more-itertools` from 10.7.0 to 10.8.0
- [Release notes](https://github.com/more-itertools/more-itertools/releases)
- [Commits](https://github.com/more-itertools/more-itertools/compare/v10.7.0...v10.8.0)

---
updated-dependencies:
- dependency-name: more-itertools
  dependency-version: 10.8.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-03 23:43:31 +00:00
Jose Diaz-Gonzalez
3ad9b02b26 Merge pull request #425 from josegonzalez/dependabot/pip/python-packages-c9f8c21b21
chore(deps): bump platformdirs from 4.3.8 to 4.4.0 in the python-packages group
2025-09-03 00:16:58 -04:00
dependabot[bot]
8bfad9b5b7 chore(deps): bump platformdirs in the python-packages group
Bumps the python-packages group with 1 update: [platformdirs](https://github.com/tox-dev/platformdirs).


Updates `platformdirs` from 4.3.8 to 4.4.0
- [Release notes](https://github.com/tox-dev/platformdirs/releases)
- [Changelog](https://github.com/tox-dev/platformdirs/blob/main/CHANGES.rst)
- [Commits](https://github.com/tox-dev/platformdirs/compare/4.3.8...4.4.0)

---
updated-dependencies:
- dependency-name: platformdirs
  dependency-version: 4.4.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-27 20:52:18 +00:00
Jose Diaz-Gonzalez
985d79c1bc Merge pull request #423 from josegonzalez/dependabot/github_actions/actions/checkout-5
chore(deps): bump actions/checkout from 4 to 5
2025-08-22 02:59:12 -04:00
Jose Diaz-Gonzalez
7d1b7f20ef Merge pull request #424 from josegonzalez/dependabot/pip/python-packages-23f06ca20f
chore(deps): bump requests from 2.32.4 to 2.32.5 in the python-packages group
2025-08-22 02:59:04 -04:00
dependabot[bot]
d3b67f884a chore(deps): bump requests in the python-packages group
Bumps the python-packages group with 1 update: [requests](https://github.com/psf/requests).


Updates `requests` from 2.32.4 to 2.32.5
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.4...v2.32.5)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-19 20:54:47 +00:00
dependabot[bot]
65749bfde4 chore(deps): bump actions/checkout from 4 to 5
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-18 06:33:46 +00:00
Jose Diaz-Gonzalez
aeeb0eb9d7 Merge pull request #422 from mhajder/chore/dockerfile
chore: update Dockerfile
2025-08-15 16:57:21 -04:00
Mateusz Hajder
f027760ac5 chore: update Dockerfile to use Python 3.12 and improve dependency installation 2025-08-15 08:47:02 +02:00
Jose Diaz-Gonzalez
a9e48f8c4e Merge pull request #421 from josegonzalez/dependabot/pip/python-packages-cf9d3ddef5
chore(deps): bump the python-packages group with 2 updates
2025-08-12 11:46:49 -04:00
dependabot[bot]
338d5a956b chore(deps): bump the python-packages group with 2 updates
Bumps the python-packages group with 2 updates: [certifi](https://github.com/certifi/python-certifi) and [charset-normalizer](https://github.com/jawah/charset_normalizer).


Updates `certifi` from 2025.7.14 to 2025.8.3
- [Commits](https://github.com/certifi/python-certifi/compare/2025.07.14...2025.08.03)

Updates `charset-normalizer` from 3.4.2 to 3.4.3
- [Release notes](https://github.com/jawah/charset_normalizer/releases)
- [Changelog](https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jawah/charset_normalizer/compare/3.4.2...3.4.3)

---
updated-dependencies:
- dependency-name: certifi
  dependency-version: 2025.8.3
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-packages
- dependency-name: charset-normalizer
  dependency-version: 3.4.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-11 20:51:37 +00:00
21 changed files with 1716 additions and 66 deletions

75
.dockerignore Normal file
View File

@@ -0,0 +1,75 @@
# Docker ignore file to reduce build context size
# Temp files
*~
~*
.*~
\#*
.#*
*#
dist
# Build files
build
dist
pkg
*.egg
*.egg-info
# Debian Files
debian/files
debian/python-github-backup*
# Sphinx build
doc/_build
# Generated man page
doc/github_backup.1
# Annoying macOS files
.DS_Store
._*
# IDE configuration files
.vscode
.atom
.idea
*.code-workspace
# RSA
id_rsa
id_rsa.pub
# Virtual env
venv
.venv
# Git
.git
.gitignore
.gitchangelog.rc
.github
# Documentation
*.md
!README.md
# Environment variables files
.env
.env.*
!.env.example
*.log
# Cache files
**/__pycache__/
*.py[cod]
# Docker files
docker-compose.yml
Dockerfile*
# Other files
release
*.tar
*.zip
*.gzip

28
.github/ISSUE_TEMPLATE/bug.yaml vendored Normal file
View File

@@ -0,0 +1,28 @@
---
name: Bug Report
description: File a bug report.
body:
- type: markdown
attributes:
value: |
# Important notice regarding filed issues
This project already fills my needs, and as such I have no real reason to continue it's development. This project is otherwise provided as is, and no support is given.
If pull requests implementing bug fixes or enhancements are pushed, I am happy to review and merge them (time permitting).
If you wish to have a bug fixed, you have a few options:
- Fix it yourself and file a pull request.
- File a bug and hope someone else fixes it for you.
- Pay me to fix it (my rate is $200 an hour, minimum 1 hour, contact me via my [github email address](https://github.com/josegonzalez) if you want to go this route).
In all cases, feel free to file an issue, they may be of help to others in the future.
- type: textarea
id: what-happened
attributes:
label: What happened?
description: Also tell us, what did you expect to happen?
placeholder: Tell us what you see!
validations:
required: true

27
.github/ISSUE_TEMPLATE/feature.yaml vendored Normal file
View File

@@ -0,0 +1,27 @@
---
name: Feature Request
description: File a feature request.
body:
- type: markdown
attributes:
value: |
# Important notice regarding filed issues
This project already fills my needs, and as such I have no real reason to continue it's development. This project is otherwise provided as is, and no support is given.
If pull requests implementing bug fixes or enhancements are pushed, I am happy to review and merge them (time permitting).
If you wish to have a feature implemented, you have a few options:
- Implement it yourself and file a pull request.
- File an issue and hope someone else implements it for you.
- Pay me to implement it (my rate is $200 an hour, minimum 1 hour, contact me via my [github email address](https://github.com/josegonzalez) if you want to go this route).
In all cases, feel free to file an issue, they may be of help to others in the future.
- type: textarea
id: what-would-you-like-to-happen
attributes:
label: What would you like to happen?
description: Please describe in detail how the new functionality should work as well as any issues with existing functionality.
validations:
required: true

View File

@@ -18,7 +18,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v5
with:
fetch-depth: 0
ssh-key: ${{ secrets.DEPLOY_PRIVATE_KEY }}
@@ -27,7 +27,7 @@ jobs:
git config --local user.email "action@github.com"
git config --local user.name "GitHub Action"
- name: Setup Python
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: '3.12'
- name: Install prerequisites

View File

@@ -38,7 +38,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v5
with:
persist-credentials: false

View File

@@ -15,16 +15,19 @@ jobs:
lint:
name: lint
runs-on: ubuntu-24.04
strategy:
matrix:
python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
steps:
- name: Checkout repository
uses: actions/checkout@v4
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: "3.12"
python-version: ${{ matrix.python-version }}
cache: "pip"
- run: pip install -r release-requirements.txt && pip install wheel
- run: flake8 --ignore=E501,E203,W503

33
.github/workflows/test.yml vendored Normal file
View File

@@ -0,0 +1,33 @@
---
name: "test"
# yamllint disable-line rule:truthy
on:
pull_request:
branches:
- "*"
push:
branches:
- "main"
- "master"
jobs:
test:
name: test
runs-on: ubuntu-24.04
strategy:
matrix:
python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
cache: "pip"
- run: pip install -r release-requirements.txt
- run: pytest tests/ -v

4
.gitignore vendored
View File

@@ -1,4 +1,4 @@
*.py[oc]
*.py[cod]
# Temp files
*~
@@ -33,6 +33,7 @@ doc/github_backup.1
# IDE configuration files
.vscode
.atom
.idea
README
@@ -42,3 +43,4 @@ id_rsa.pub
# Virtual env
venv
.venv

View File

@@ -1,9 +1,489 @@
Changelog
=========
0.50.3 (2025-08-08)
0.51.2 (2025-11-16)
-------------------
------------------------
Fix
~~~
- Improve CA certificate detection with fallback chain. [Rodos]
The previous implementation incorrectly assumed empty get_ca_certs()
meant broken SSL, causing false failures in GitHub Codespaces and other
directory-based cert systems where certificates exist but aren't pre-loaded.
It would then attempt to import certifi as a workaround, but certifi wasn't
listed in requirements.txt, causing the fallback to fail with ImportError
even though the system certificates would have worked fine.
This commit replaces the naive check with a layered fallback approach that
checks multiple certificate sources. First it checks for pre-loaded system
certs (file-based systems). Then it verifies system cert paths exist
(directory-based systems like Ubuntu/Debian/Codespaces). Finally it attempts
to use certifi as an optional fallback only if needed.
This approach eliminates hard dependencies (certifi is now optional), works
in GitHub Codespaces without any setup, and fails gracefully with clear hints
for resolution when SSL is actually broken rather than failing with
ModuleNotFoundError.
Fixes #444
0.51.1 (2025-11-16)
-------------------
Fix
~~~
- Prevent duplicate attachment downloads. [Rodos]
Fixes bug where attachments were downloaded multiple times with
incremented filenames (file.mov, file_1.mov, file_2.mov) when
running backups without --skip-existing flag.
I should not have used the --skip-existing flag for attachments,
it did not do what I thought it did.
The correct approach is to always use the manifest to guide what
has already been downloaded and what now needs to be done.
Other
~~~~~
- Chore(deps): bump certifi in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [certifi](https://github.com/certifi/python-certifi).
Updates `certifi` from 2025.10.5 to 2025.11.12
- [Commits](https://github.com/certifi/python-certifi/compare/2025.10.05...2025.11.12)
---
updated-dependencies:
- dependency-name: certifi
dependency-version: 2025.11.12
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Test: Add pytest infrastructure and attachment tests. [Rodos]
In making my last fix to attachments, I found it challenging not
having tests to ensure there was no regression.
Added pytest with minimal setup and isolated configuration. Created
a separate test workflow to keep tests isolated from linting.
Tests cover the key elements of the attachment logic:
- URL extraction from issue bodies
- Filename extraction from different URL types
- Filename collision resolution
- Manifest duplicate prevention
- Chore(deps): bump black in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [black](https://github.com/psf/black).
Updates `black` from 25.9.0 to 25.11.0
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/25.9.0...25.11.0)
---
updated-dependencies:
- dependency-name: black
dependency-version: 25.11.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Chore(deps): bump docutils in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [docutils](https://github.com/rtfd/recommonmark).
Updates `docutils` from 0.22.2 to 0.22.3
- [Changelog](https://github.com/readthedocs/recommonmark/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rtfd/recommonmark/commits)
---
updated-dependencies:
- dependency-name: docutils
dependency-version: 0.22.3
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: python-packages
...
0.51.0 (2025-11-06)
-------------------
Fix
~~~
- Remove Python 3.8 and 3.9 from CI matrix. [Rodos]
3.8 and 3.9 are failing because the pinned dependencies don't support them:
- autopep8==2.3.2 needs Python 3.9+
- bleach==6.3.0 needs Python 3.10+
Both are EOL now anyway (3.8 in Oct 2024, 3.9 in Oct 2025).
Just fixing CI to test 3.10-3.14 for now. Will do a separate PR to formally
drop 3.8/3.9 support with python_requires and README updates.
Other
~~~~~
- Refactor: Add atomic writes for attachment files and manifests.
[Rodos]
- Feat: Add attachment download support for issues and pull requests.
[Rodos]
Adds new --attachments flag that downloads user-uploaded files from
issue and PR bodies and comments. Key features:
- Determines attachment URLs
- Tracks downloads in manifest.json with metadata
- Supports --skip-existing to avoid re-downloading
- Handles filename collisions with counter suffix
- Smart retry logic for transient vs permanent failures
- Uses Content-Disposition for correct file extensions
- Feat: Drop support for Python 3.8 and 3.9 (EOL) [Rodos]
Both Python 3.8 and 3.9 have reached end-of-life:
- Python 3.8: EOL October 7, 2024
- Python 3.9: EOL October 31, 2025
Changes:
- Add python_requires=">=3.10" to setup.py
- Remove Python 3.8 and 3.9 from classifiers
- Add Python 3.13 and 3.14 to classifiers
- Update README to document Python 3.10+ requirement
- Feat: Enforce Python 3.8+ requirement and add multi-version CI
testing. [Rodos]
- Add python_requires=">=3.8" to setup.py to enforce minimum version at install time
- Update README to explicitly document Python 3.8+ requirement
- Add CI matrix to test lint/build on Python 3.8-3.14 (7 versions)
- Aligns with actual usage patterns (~99% of downloads on Python 3.8+)
- Prevents future PRs from inadvertently using incompatible syntax
This change protects users by preventing installation on unsupported Python
versions and ensures contributors can see version requirements clearly.
- Chore(deps): bump bleach in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [bleach](https://github.com/mozilla/bleach).
Updates `bleach` from 6.2.0 to 6.3.0
- [Changelog](https://github.com/mozilla/bleach/blob/main/CHANGES)
- [Commits](https://github.com/mozilla/bleach/compare/v6.2.0...v6.3.0)
---
updated-dependencies:
- dependency-name: bleach
dependency-version: 6.3.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Chore(deps): bump charset-normalizer in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [charset-normalizer](https://github.com/jawah/charset_normalizer).
Updates `charset-normalizer` from 3.4.3 to 3.4.4
- [Release notes](https://github.com/jawah/charset_normalizer/releases)
- [Changelog](https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jawah/charset_normalizer/compare/3.4.3...3.4.4)
---
updated-dependencies:
- dependency-name: charset-normalizer
dependency-version: 3.4.4
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: python-packages
...
- Chore(deps): bump idna from 3.10 to 3.11 in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [idna](https://github.com/kjd/idna).
Updates `idna` from 3.10 to 3.11
- [Release notes](https://github.com/kjd/idna/releases)
- [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst)
- [Commits](https://github.com/kjd/idna/compare/v3.10...v3.11)
---
updated-dependencies:
- dependency-name: idna
dependency-version: '3.11'
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Chore(deps): bump the python-packages group across 1 directory with 2
updates. [dependabot[bot]]
Bumps the python-packages group with 2 updates in the / directory: [platformdirs](https://github.com/tox-dev/platformdirs) and [rich](https://github.com/Textualize/rich).
Updates `platformdirs` from 4.4.0 to 4.5.0
- [Release notes](https://github.com/tox-dev/platformdirs/releases)
- [Changelog](https://github.com/tox-dev/platformdirs/blob/main/CHANGES.rst)
- [Commits](https://github.com/tox-dev/platformdirs/compare/4.4.0...4.5.0)
Updates `rich` from 14.1.0 to 14.2.0
- [Release notes](https://github.com/Textualize/rich/releases)
- [Changelog](https://github.com/Textualize/rich/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Textualize/rich/compare/v14.1.0...v14.2.0)
---
updated-dependencies:
- dependency-name: platformdirs
dependency-version: 4.5.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
- dependency-name: rich
dependency-version: 14.2.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Chore(deps): bump the python-packages group with 3 updates.
[dependabot[bot]]
Bumps the python-packages group with 3 updates: [certifi](https://github.com/certifi/python-certifi), [click](https://github.com/pallets/click) and [markdown-it-py](https://github.com/executablebooks/markdown-it-py).
Updates `certifi` from 2025.8.3 to 2025.10.5
- [Commits](https://github.com/certifi/python-certifi/compare/2025.08.03...2025.10.05)
Updates `click` from 8.1.8 to 8.3.0
- [Release notes](https://github.com/pallets/click/releases)
- [Changelog](https://github.com/pallets/click/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/click/compare/8.1.8...8.3.0)
Updates `markdown-it-py` from 3.0.0 to 4.0.0
- [Release notes](https://github.com/executablebooks/markdown-it-py/releases)
- [Changelog](https://github.com/executablebooks/markdown-it-py/blob/master/CHANGELOG.md)
- [Commits](https://github.com/executablebooks/markdown-it-py/compare/v3.0.0...v4.0.0)
---
updated-dependencies:
- dependency-name: certifi
dependency-version: 2025.10.5
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
- dependency-name: click
dependency-version: 8.3.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
- dependency-name: markdown-it-py
dependency-version: 4.0.0
dependency-type: direct:production
update-type: version-update:semver-major
dependency-group: python-packages
...
- Chore(deps): bump docutils in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [docutils](https://github.com/rtfd/recommonmark).
Updates `docutils` from 0.22.1 to 0.22.2
- [Changelog](https://github.com/readthedocs/recommonmark/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rtfd/recommonmark/commits)
---
updated-dependencies:
- dependency-name: docutils
dependency-version: 0.22.2
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: python-packages
...
- Chore(deps): bump the python-packages group across 1 directory with 2
updates. [dependabot[bot]]
Bumps the python-packages group with 2 updates in the / directory: [black](https://github.com/psf/black) and [docutils](https://github.com/rtfd/recommonmark).
Updates `black` from 25.1.0 to 25.9.0
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/25.1.0...25.9.0)
Updates `docutils` from 0.22 to 0.22.1
- [Changelog](https://github.com/readthedocs/recommonmark/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rtfd/recommonmark/commits)
---
updated-dependencies:
- dependency-name: black
dependency-version: 25.9.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
- dependency-name: docutils
dependency-version: 0.22.1
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: python-packages
...
- Delete .github/ISSUE_TEMPLATE.md. [Jose Diaz-Gonzalez]
- Create feature.yaml. [Jose Diaz-Gonzalez]
- Delete .github/ISSUE_TEMPLATE/bug_report.md. [Jose Diaz-Gonzalez]
- Rename bug.md to bug.yaml. [Jose Diaz-Gonzalez]
- Chore: create bug template. [Jose Diaz-Gonzalez]
- Chore: Rename PULL_REQUEST.md to .github/PULL_REQUEST.md. [Jose Diaz-
Gonzalez]
- Chore: Rename ISSUE_TEMPLATE.md to .github/ISSUE_TEMPLATE.md. [Jose
Diaz-Gonzalez]
- Chore(deps): bump actions/setup-python from 5 to 6. [dependabot[bot]]
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v5...v6)
---
updated-dependencies:
- dependency-name: actions/setup-python
dependency-version: '6'
dependency-type: direct:production
update-type: version-update:semver-major
...
- Chore(deps): bump twine from 6.1.0 to 6.2.0 in the python-packages
group. [dependabot[bot]]
Bumps the python-packages group with 1 update: [twine](https://github.com/pypa/twine).
Updates `twine` from 6.1.0 to 6.2.0
- [Release notes](https://github.com/pypa/twine/releases)
- [Changelog](https://github.com/pypa/twine/blob/main/docs/changelog.rst)
- [Commits](https://github.com/pypa/twine/compare/6.1.0...6.2.0)
---
updated-dependencies:
- dependency-name: twine
dependency-version: 6.2.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Chore(deps): bump more-itertools in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [more-itertools](https://github.com/more-itertools/more-itertools).
Updates `more-itertools` from 10.7.0 to 10.8.0
- [Release notes](https://github.com/more-itertools/more-itertools/releases)
- [Commits](https://github.com/more-itertools/more-itertools/compare/v10.7.0...v10.8.0)
---
updated-dependencies:
- dependency-name: more-itertools
dependency-version: 10.8.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Chore(deps): bump platformdirs in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [platformdirs](https://github.com/tox-dev/platformdirs).
Updates `platformdirs` from 4.3.8 to 4.4.0
- [Release notes](https://github.com/tox-dev/platformdirs/releases)
- [Changelog](https://github.com/tox-dev/platformdirs/blob/main/CHANGES.rst)
- [Commits](https://github.com/tox-dev/platformdirs/compare/4.3.8...4.4.0)
---
updated-dependencies:
- dependency-name: platformdirs
dependency-version: 4.4.0
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
...
- Chore(deps): bump actions/checkout from 4 to 5. [dependabot[bot]]
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)
---
updated-dependencies:
- dependency-name: actions/checkout
dependency-version: '5'
dependency-type: direct:production
update-type: version-update:semver-major
...
- Chore(deps): bump requests in the python-packages group.
[dependabot[bot]]
Bumps the python-packages group with 1 update: [requests](https://github.com/psf/requests).
Updates `requests` from 2.32.4 to 2.32.5
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.4...v2.32.5)
---
updated-dependencies:
- dependency-name: requests
dependency-version: 2.32.5
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: python-packages
...
- Chore: update Dockerfile to use Python 3.12 and improve dependency
installation. [Mateusz Hajder]
- Chore(deps): bump the python-packages group with 2 updates.
[dependabot[bot]]
Bumps the python-packages group with 2 updates: [certifi](https://github.com/certifi/python-certifi) and [charset-normalizer](https://github.com/jawah/charset_normalizer).
Updates `certifi` from 2025.7.14 to 2025.8.3
- [Commits](https://github.com/certifi/python-certifi/compare/2025.07.14...2025.08.03)
Updates `charset-normalizer` from 3.4.2 to 3.4.3
- [Release notes](https://github.com/jawah/charset_normalizer/releases)
- [Changelog](https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jawah/charset_normalizer/compare/3.4.2...3.4.3)
---
updated-dependencies:
- dependency-name: certifi
dependency-version: 2025.8.3
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: python-packages
- dependency-name: charset-normalizer
dependency-version: 3.4.3
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: python-packages
...
0.50.3 (2025-08-08)
-------------------
- Revert "Add conditional check for git checkout in development path"
[Eric Wheeler]

View File

@@ -1,16 +1,38 @@
FROM python:3.9.18-slim
FROM python:3.12-alpine3.22 AS builder
RUN --mount=type=cache,target=/var/cache/apt \
apt-get update && apt-get install -y git git-lfs
RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir uv
WORKDIR /usr/src/app
WORKDIR /app
COPY release-requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
pip install -r release-requirements.txt
RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=requirements.txt,target=requirements.txt \
--mount=type=bind,source=release-requirements.txt,target=release-requirements.txt \
uv venv \
&& uv pip install -r release-requirements.txt
COPY . .
RUN --mount=type=cache,target=/root/.cache/pip \
pip install .
ENTRYPOINT [ "github-backup" ]
RUN --mount=type=cache,target=/root/.cache/uv \
uv pip install .
FROM python:3.12-alpine3.22
ENV PYTHONUNBUFFERED=1
RUN apk add --no-cache \
ca-certificates \
git \
git-lfs \
&& addgroup -g 1000 appuser \
&& adduser -D -u 1000 -G appuser appuser
COPY --from=builder --chown=appuser:appuser /app /app
WORKDIR /app
USER appuser
ENV PATH="/app/.venv/bin:$PATH"
ENTRYPOINT ["github-backup"]

View File

@@ -1,13 +0,0 @@
# Important notice regarding filed issues
This project already fills my needs, and as such I have no real reason to continue it's development. This project is otherwise provided as is, and no support is given.
If pull requests implementing bug fixes or enhancements are pushed, I am happy to review and merge them (time permitting).
If you wish to have a bug fixed, you have a few options:
- Fix it yourself and file a pull request.
- File a bug and hope someone else fixes it for you.
- Pay me to fix it (my rate is $200 an hour, minimum 1 hour, contact me via my [github email address](https://github.com/josegonzalez) if you want to go this route).
In all cases, feel free to file an issue, they may be of help to others in the future.

View File

@@ -9,8 +9,8 @@ The package can be used to backup an *entire* `Github <https://github.com/>`_ or
Requirements
============
- Python 3.10 or higher
- GIT 1.9+
- Python
Installation
============
@@ -50,7 +50,7 @@ CLI Help output::
[--keychain-name OSX_KEYCHAIN_ITEM_NAME]
[--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT]
[--releases] [--latest-releases NUMBER_OF_LATEST_RELEASES]
[--skip-prerelease] [--assets]
[--skip-prerelease] [--assets] [--attachments]
[--exclude [REPOSITORY [REPOSITORY ...]]
[--throttle-limit THROTTLE_LIMIT] [--throttle-pause THROTTLE_PAUSE]
USER
@@ -133,6 +133,9 @@ CLI Help output::
--skip-prerelease skip prerelease and draft versions; only applies if including releases
--assets include assets alongside release information; only
applies if including releases
--attachments download user-attachments from issues and pull requests
to issues/attachments/{issue_number}/ and
pulls/attachments/{pull_number}/ directories
--exclude [REPOSITORY [REPOSITORY ...]]
names of repositories to exclude from backup.
--throttle-limit THROTTLE_LIMIT
@@ -213,6 +216,29 @@ When you use the ``--lfs`` option, you will need to make sure you have Git LFS i
Instructions on how to do this can be found on https://git-lfs.github.com.
About Attachments
-----------------
When you use the ``--attachments`` option with ``--issues`` or ``--pulls``, the tool will download user-uploaded attachments (images, videos, documents, etc.) from issue and pull request descriptions and comments. In some circumstances attachments contain valuable data related to the topic, and without their backup important information or context might be lost inadvertently.
Attachments are saved to ``issues/attachments/{issue_number}/`` and ``pulls/attachments/{pull_number}/`` directories, where ``{issue_number}`` is the GitHub issue number (e.g., issue #123 saves to ``issues/attachments/123/``). Each attachment directory contains:
- The downloaded attachment files (named by their GitHub identifier with appropriate file extensions)
- If multiple attachments have the same filename, conflicts are resolved with numeric suffixes (e.g., ``report.pdf``, ``report_1.pdf``, ``report_2.pdf``)
- A ``manifest.json`` file documenting all downloads, including URLs, file metadata, and download status
The tool automatically extracts file extensions from HTTP headers to ensure files can be more easily opened by your operating system.
**Supported URL formats:**
- Modern: ``github.com/user-attachments/{assets,files}/*``
- Legacy: ``user-images.githubusercontent.com/*`` and ``private-user-images.githubusercontent.com/*``
- Repo files: ``github.com/{owner}/{repo}/files/*`` (filtered to current repository)
- Repo assets: ``github.com/{owner}/{repo}/assets/*`` (filtered to current repository)
**Repository filtering** for repo files/assets handles renamed and transferred repositories gracefully. URLs are included if they either match the current repository name directly, or redirect to it (e.g., ``willmcgugan/rich`` redirects to ``Textualize/rich`` after transfer).
Run in Docker container
-----------------------
@@ -303,7 +329,7 @@ Quietly and incrementally backup useful Github user data (public and private rep
export FINE_ACCESS_TOKEN=SOME-GITHUB-TOKEN
GH_USER=YOUR-GITHUB-USER
github-backup -f $FINE_ACCESS_TOKEN --prefer-ssh -o ~/github-backup/ -l error -P -i --all-starred --starred --watched --followers --following --issues --issue-comments --issue-events --pulls --pull-comments --pull-commits --labels --milestones --repositories --wikis --releases --assets --pull-details --gists --starred-gists $GH_USER
github-backup -f $FINE_ACCESS_TOKEN --prefer-ssh -o ~/github-backup/ -l error -P -i --all-starred --starred --watched --followers --following --issues --issue-comments --issue-events --pulls --pull-comments --pull-commits --labels --milestones --repositories --wikis --releases --assets --attachments --pull-details --gists --starred-gists $GH_USER
Debug an error/block or incomplete backup into a temporary directory. Omit "incremental" to fill a previous incomplete backup. ::

View File

@@ -1 +1 @@
__version__ = "0.50.3"
__version__ = "0.51.2"

View File

@@ -37,22 +37,33 @@ FNULL = open(os.devnull, "w")
FILE_URI_PREFIX = "file://"
logger = logging.getLogger(__name__)
# Setup SSL context with fallback chain
https_ctx = ssl.create_default_context()
if not https_ctx.get_ca_certs():
import warnings
warnings.warn(
"\n\nYOUR DEFAULT CA CERTS ARE EMPTY.\n"
+ "PLEASE POPULATE ANY OF:"
+ "".join(
["\n - " + x for x in ssl.get_default_verify_paths() if type(x) is str]
)
+ "\n",
stacklevel=2,
)
if https_ctx.get_ca_certs():
# Layer 1: Certificates pre-loaded from system (file-based)
pass
else:
paths = ssl.get_default_verify_paths()
if (paths.cafile and os.path.exists(paths.cafile)) or (
paths.capath and os.path.exists(paths.capath)
):
# Layer 2: Cert paths exist, will be lazy-loaded on first use (directory-based)
pass
else:
# Layer 3: Try certifi package as optional fallback
try:
import certifi
https_ctx = ssl.create_default_context(cafile=certifi.where())
except ImportError:
# All layers failed - no certificates available anywhere
sys.exit(
"\nERROR: No CA certificates found. Cannot connect to GitHub over SSL.\n\n"
"Solutions you can explore:\n"
" 1. pip install certifi\n"
" 2. Alpine: apk add ca-certificates\n"
" 3. Debian/Ubuntu: apt-get install ca-certificates\n\n"
)
def logging_subprocess(
@@ -420,6 +431,12 @@ def parse_args(args=None):
dest="include_assets",
help="include assets alongside release information; only applies if including releases",
)
parser.add_argument(
"--attachments",
action="store_true",
dest="include_attachments",
help="download user-attachments from issues and pull requests",
)
parser.add_argument(
"--throttle-limit",
dest="throttle_limit",
@@ -814,6 +831,8 @@ class S3HTTPRedirectHandler(HTTPRedirectHandler):
request = super(S3HTTPRedirectHandler, self).redirect_request(
req, fp, code, msg, headers, newurl
)
# Only delete Authorization header if it exists (attachments may not have it)
if "Authorization" in request.headers:
del request.headers["Authorization"]
return request
@@ -867,6 +886,585 @@ def download_file(url, path, auth, as_app=False, fine=False):
)
def download_attachment_file(url, path, auth, as_app=False, fine=False):
"""Download attachment file directly (not via GitHub API).
Similar to download_file() but for direct file URLs, not API endpoints.
Attachment URLs (user-images, user-attachments) are direct downloads,
not API endpoints, so we skip _construct_request() which adds API params.
URL Format Support & Authentication Requirements:
| URL Format | Auth Required | Notes |
|----------------------------------------------|---------------|--------------------------|
| github.com/user-attachments/assets/* | Private only | Modern format (2024+) |
| github.com/user-attachments/files/* | Private only | Modern format (2024+) |
| user-images.githubusercontent.com/* | No (public) | Legacy CDN, all eras |
| private-user-images.githubusercontent.com/* | JWT in URL | Legacy private (5min) |
| github.com/{owner}/{repo}/files/* | Repo filter | Old repo files |
- Modern user-attachments: Requires GitHub token auth for private repos
- Legacy public CDN: No auth needed/accepted (returns 400 with auth header)
- Legacy private CDN: Uses JWT token embedded in URL, no GitHub token needed
- Repo files: Filtered to current repository only during extraction
Returns dict with metadata:
- success: bool
- http_status: int (200, 404, etc.)
- content_type: str or None
- original_filename: str or None (from Content-Disposition)
- size_bytes: int or None
- error: str or None
"""
import re
from datetime import datetime, timezone
metadata = {
"url": url,
"success": False,
"http_status": None,
"content_type": None,
"original_filename": None,
"size_bytes": None,
"downloaded_at": datetime.now(timezone.utc).isoformat(),
"error": None,
}
# Create simple request (no API query params)
request = Request(url)
request.add_header("Accept", "application/octet-stream")
# Add authentication header only for modern github.com/user-attachments URLs
# Legacy CDN URLs (user-images.githubusercontent.com) are public and don't need/accept auth
# Private CDN URLs (private-user-images) use JWT tokens embedded in the URL
if auth is not None and "github.com/user-attachments/" in url:
if not as_app:
if fine:
# Fine-grained token: plain token with "token " prefix
request.add_header("Authorization", "token " + auth)
else:
# Classic token: base64-encoded with "Basic " prefix
request.add_header("Authorization", "Basic ".encode("ascii") + auth)
else:
# App authentication
auth = auth.encode("ascii")
request.add_header("Authorization", "token ".encode("ascii") + auth)
# Reuse S3HTTPRedirectHandler from download_file()
opener = build_opener(S3HTTPRedirectHandler)
temp_path = path + ".temp"
try:
response = opener.open(request)
metadata["http_status"] = response.getcode()
# Extract Content-Type
content_type = response.headers.get("Content-Type", "").split(";")[0].strip()
if content_type:
metadata["content_type"] = content_type
# Extract original filename from Content-Disposition header
# Format: attachment; filename=example.mov or attachment;filename="example.mov"
content_disposition = response.headers.get("Content-Disposition", "")
if content_disposition:
# Match: filename=something or filename="something" or filename*=UTF-8''something
match = re.search(r'filename\*?=["\']?([^"\';\r\n]+)', content_disposition)
if match:
original_filename = match.group(1).strip()
# Handle RFC 5987 encoding: filename*=UTF-8''example.mov
if "UTF-8''" in original_filename:
original_filename = original_filename.split("UTF-8''")[1]
metadata["original_filename"] = original_filename
# Fallback: Extract filename from final URL after redirects
# This handles user-attachments/assets URLs which redirect to S3 with filename.ext
if not metadata["original_filename"]:
from urllib.parse import urlparse, unquote
final_url = response.geturl()
parsed = urlparse(final_url)
# Get filename from path (last component before query string)
path_parts = parsed.path.split("/")
if path_parts:
# URL might be encoded, decode it
filename_from_url = unquote(path_parts[-1])
# Only use if it has an extension
if "." in filename_from_url:
metadata["original_filename"] = filename_from_url
# Download file to temporary location
chunk_size = 16 * 1024
bytes_downloaded = 0
with open(temp_path, "wb") as f:
while True:
chunk = response.read(chunk_size)
if not chunk:
break
f.write(chunk)
bytes_downloaded += len(chunk)
# Atomic rename to final location
os.rename(temp_path, path)
metadata["size_bytes"] = bytes_downloaded
metadata["success"] = True
except HTTPError as exc:
metadata["http_status"] = exc.code
metadata["error"] = str(exc.reason)
logger.warning(
"Skipping download of attachment {0} due to HTTPError: {1}".format(
url, exc.reason
)
)
except URLError as e:
metadata["error"] = str(e.reason)
logger.warning(
"Skipping download of attachment {0} due to URLError: {1}".format(
url, e.reason
)
)
except socket.error as e:
metadata["error"] = str(e.strerror) if hasattr(e, "strerror") else str(e)
logger.warning(
"Skipping download of attachment {0} due to socket error: {1}".format(
url, e.strerror if hasattr(e, "strerror") else str(e)
)
)
except Exception as e:
metadata["error"] = str(e)
logger.warning(
"Skipping download of attachment {0} due to error: {1}".format(url, str(e))
)
# Clean up temp file if it was partially created
if os.path.exists(temp_path):
try:
os.remove(temp_path)
except Exception:
pass
return metadata
def extract_attachment_urls(item_data, issue_number=None, repository_full_name=None):
"""Extract GitHub-hosted attachment URLs from issue/PR body and comments.
What qualifies as an attachment?
There is no "attachment" concept in the GitHub API - it's a user behavior pattern
we've identified through analysis of real-world repositories. We define attachments as:
- User-uploaded files hosted on GitHub's CDN domains
- Found outside of code blocks (not examples/documentation)
- Matches known GitHub attachment URL patterns
This intentionally captures bare URLs pasted by users, not just markdown/HTML syntax.
Some false positives (example URLs in documentation) may occur - these fail gracefully
with HTTP 404 and are logged in the manifest.
Supported URL formats:
- Modern: github.com/user-attachments/{assets,files}/*
- Legacy: user-images.githubusercontent.com/* (including private-user-images)
- Repo files: github.com/{owner}/{repo}/files/* (filtered to current repo)
- Repo assets: github.com/{owner}/{repo}/assets/* (filtered to current repo)
Repository filtering (repo files/assets only):
- Direct match: URL is for current repository → included
- Redirect match: URL redirects to current repository → included (handles renames/transfers)
- Different repo: URL is for different repository → excluded
Code block filtering:
- Removes fenced code blocks (```) and inline code (`) before extraction
- Prevents extracting URLs from code examples and documentation snippets
Args:
item_data: Issue or PR data dict
issue_number: Issue/PR number for logging
repository_full_name: Full repository name (owner/repo) for filtering repo-scoped URLs
"""
import re
urls = []
# Define all GitHub attachment patterns
# Stop at markdown punctuation: whitespace, ), `, ", >, <
# Trailing sentence punctuation (. ! ? , ; : ' ") is stripped in post-processing
patterns = [
r'https://github\.com/user-attachments/(?:assets|files)/[^\s\)`"<>]+', # Modern
r'https://(?:private-)?user-images\.githubusercontent\.com/[^\s\)`"<>]+', # Legacy CDN
]
# Add repo-scoped patterns (will be filtered by repository later)
# These patterns match ANY repo, then we filter to current repo with redirect checking
repo_files_pattern = r'https://github\.com/[^/]+/[^/]+/files/\d+/[^\s\)`"<>]+'
repo_assets_pattern = r'https://github\.com/[^/]+/[^/]+/assets/\d+/[^\s\)`"<>]+'
patterns.append(repo_files_pattern)
patterns.append(repo_assets_pattern)
def clean_url(url):
"""Remove trailing sentence and markdown punctuation that's not part of the URL."""
return url.rstrip(".!?,;:'\")")
def remove_code_blocks(text):
"""Remove markdown code blocks (fenced and inline) from text.
This prevents extracting URLs from code examples like:
- Fenced code blocks: ```code```
- Inline code: `code`
"""
# Remove fenced code blocks first (```...```)
# DOTALL flag makes . match newlines
text = re.sub(r"```.*?```", "", text, flags=re.DOTALL)
# Remove inline code (`...`)
# Non-greedy match between backticks
text = re.sub(r"`[^`]*`", "", text)
return text
def is_repo_scoped_url(url):
"""Check if URL is a repo-scoped attachment (files or assets)."""
return bool(
re.match(r"https://github\.com/[^/]+/[^/]+/(?:files|assets)/\d+/", url)
)
def check_redirect_to_current_repo(url, current_repo):
"""Check if URL redirects to current repository.
Returns True if:
- URL is already for current repo
- URL redirects (301/302) to current repo (handles renames/transfers)
Returns False otherwise (URL is for a different repo).
"""
# Extract owner/repo from URL
match = re.match(r"https://github\.com/([^/]+)/([^/]+)/", url)
if not match:
return False
url_owner, url_repo = match.groups()
url_repo_full = f"{url_owner}/{url_repo}"
# Direct match - no need to check redirect
if url_repo_full.lower() == current_repo.lower():
return True
# Different repo - check if it redirects to current repo
# This handles repository transfers and renames
try:
import urllib.request
import urllib.error
# Make HEAD request with redirect following disabled
# We need to manually handle redirects to see the Location header
request = urllib.request.Request(url, method="HEAD")
request.add_header("User-Agent", "python-github-backup")
# Create opener that does NOT follow redirects
class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
def redirect_request(self, req, fp, code, msg, headers, newurl):
return None # Don't follow redirects
opener = urllib.request.build_opener(NoRedirectHandler)
try:
_ = opener.open(request, timeout=10)
# Got 200 - URL works as-is but for different repo
return False
except urllib.error.HTTPError as e:
# Check if it's a redirect (301, 302, 307, 308)
if e.code in (301, 302, 307, 308):
location = e.headers.get("Location", "")
# Check if redirect points to current repo
if location:
redirect_match = re.match(
r"https://github\.com/([^/]+)/([^/]+)/", location
)
if redirect_match:
redirect_owner, redirect_repo = redirect_match.groups()
redirect_repo_full = f"{redirect_owner}/{redirect_repo}"
return redirect_repo_full.lower() == current_repo.lower()
return False
except Exception:
# On any error (timeout, network issue, etc.), be conservative
# and exclude the URL to avoid downloading from wrong repos
return False
# Extract from body
body = item_data.get("body") or ""
# Remove code blocks before searching for URLs
body_cleaned = remove_code_blocks(body)
for pattern in patterns:
found_urls = re.findall(pattern, body_cleaned)
urls.extend([clean_url(url) for url in found_urls])
# Extract from issue comments
if "comment_data" in item_data:
for comment in item_data["comment_data"]:
comment_body = comment.get("body") or ""
# Remove code blocks before searching for URLs
comment_cleaned = remove_code_blocks(comment_body)
for pattern in patterns:
found_urls = re.findall(pattern, comment_cleaned)
urls.extend([clean_url(url) for url in found_urls])
# Extract from PR regular comments
if "comment_regular_data" in item_data:
for comment in item_data["comment_regular_data"]:
comment_body = comment.get("body") or ""
# Remove code blocks before searching for URLs
comment_cleaned = remove_code_blocks(comment_body)
for pattern in patterns:
found_urls = re.findall(pattern, comment_cleaned)
urls.extend([clean_url(url) for url in found_urls])
regex_urls = list(set(urls)) # dedupe
# Filter repo-scoped URLs to current repository only
# This handles repository transfers/renames via redirect checking
if repository_full_name:
filtered_urls = []
for url in regex_urls:
if is_repo_scoped_url(url):
# Check if URL belongs to current repo (or redirects to it)
if check_redirect_to_current_repo(url, repository_full_name):
filtered_urls.append(url)
# else: skip URLs from other repositories
else:
# Non-repo-scoped URLs (user-attachments, CDN) - always include
filtered_urls.append(url)
regex_urls = filtered_urls
return regex_urls
def get_attachment_filename(url):
"""Get filename from attachment URL, handling all GitHub formats.
Formats:
- github.com/user-attachments/assets/{uuid} → uuid (add extension later)
- github.com/user-attachments/files/{id}/{filename} → filename
- github.com/{owner}/{repo}/files/{id}/{filename} → filename
- user-images.githubusercontent.com/{user}/{hash}.{ext} → hash.ext
- private-user-images.githubusercontent.com/...?jwt=... → extract from path
"""
from urllib.parse import urlparse
parsed = urlparse(url)
path_parts = parsed.path.split("/")
# Modern: /user-attachments/files/{id}/{filename}
if "user-attachments/files" in parsed.path:
return path_parts[-1]
# Modern: /user-attachments/assets/{uuid}
elif "user-attachments/assets" in parsed.path:
return path_parts[-1] # extension added later via detect_and_add_extension
# Repo files: /{owner}/{repo}/files/{id}/{filename}
elif "/files/" in parsed.path and len(path_parts) >= 2:
return path_parts[-1]
# Legacy: user-images.githubusercontent.com/{user}/{hash-with-ext}
elif "githubusercontent.com" in parsed.netloc:
return path_parts[-1] # Already has extension usually
# Fallback: use last path component
return path_parts[-1] if path_parts[-1] else "unknown_attachment"
def resolve_filename_collision(filepath):
"""Resolve filename collisions using counter suffix pattern.
If filepath exists, returns a new filepath with counter suffix.
Pattern: report.pdf → report_1.pdf → report_2.pdf
Also protects against manifest.json collisions by treating it as reserved.
Args:
filepath: Full path to file that might exist
Returns:
filepath that doesn't collide (may be same as input if no collision)
"""
directory = os.path.dirname(filepath)
filename = os.path.basename(filepath)
# Protect manifest.json - it's a reserved filename
if filename == "manifest.json":
name, ext = os.path.splitext(filename)
counter = 1
while True:
new_filename = f"{name}_{counter}{ext}"
new_filepath = os.path.join(directory, new_filename)
if not os.path.exists(new_filepath):
return new_filepath
counter += 1
if not os.path.exists(filepath):
return filepath
name, ext = os.path.splitext(filename)
counter = 1
while True:
new_filename = f"{name}_{counter}{ext}"
new_filepath = os.path.join(directory, new_filename)
if not os.path.exists(new_filepath):
return new_filepath
counter += 1
def download_attachments(
args, item_cwd, item_data, number, repository, item_type="issue"
):
"""Download user-attachments from issue/PR body and comments with manifest.
Args:
args: Command line arguments
item_cwd: Working directory (issue_cwd or pulls_cwd)
item_data: Issue or PR data dict
number: Issue or PR number
repository: Repository dict
item_type: "issue" or "pull" for logging/manifest
"""
import json
from datetime import datetime, timezone
item_type_display = "issue" if item_type == "issue" else "pull request"
urls = extract_attachment_urls(
item_data, issue_number=number, repository_full_name=repository["full_name"]
)
if not urls:
return
attachments_dir = os.path.join(item_cwd, "attachments", str(number))
manifest_path = os.path.join(attachments_dir, "manifest.json")
# Load existing manifest to prevent duplicate downloads
existing_urls = set()
existing_metadata = []
if os.path.exists(manifest_path):
try:
with open(manifest_path, "r") as f:
existing_manifest = json.load(f)
all_metadata = existing_manifest.get("attachments", [])
# Only skip URLs that were successfully downloaded OR failed with permanent errors
# Retry transient failures (5xx, timeouts, network errors)
for item in all_metadata:
if item.get("success"):
existing_urls.add(item["url"])
else:
# Check if this is a permanent failure (don't retry) or transient (retry)
http_status = item.get("http_status")
if http_status in [404, 410, 451]:
# Permanent failures - don't retry
existing_urls.add(item["url"])
# Transient failures (5xx, auth errors, timeouts) will be retried
existing_metadata = all_metadata
except (json.JSONDecodeError, IOError):
# If manifest is corrupted, re-download everything
logger.warning(
"Corrupted manifest for {0} #{1}, will re-download".format(
item_type_display, number
)
)
existing_urls = set()
existing_metadata = []
# Filter to only new URLs
new_urls = [url for url in urls if url not in existing_urls]
if not new_urls and existing_urls:
logger.debug(
"Skipping attachments for {0} #{1} (all {2} already downloaded)".format(
item_type_display, number, len(urls)
)
)
return
if new_urls:
logger.info(
"Downloading {0} new attachment(s) for {1} #{2}".format(
len(new_urls), item_type_display, number
)
)
mkdir_p(item_cwd, attachments_dir)
# Collect metadata for manifest (start with existing)
attachment_metadata_list = existing_metadata[:]
for url in new_urls:
filename = get_attachment_filename(url)
filepath = os.path.join(attachments_dir, filename)
# Download and get metadata
metadata = download_attachment_file(
url,
filepath,
get_auth(args, encode=not args.as_app),
as_app=args.as_app,
fine=args.token_fine is not None,
)
# If download succeeded but we got an extension from Content-Disposition,
# we may need to rename the file to add the extension
if metadata["success"] and metadata.get("original_filename"):
original_ext = os.path.splitext(metadata["original_filename"])[1]
current_ext = os.path.splitext(filepath)[1]
# Add extension if not present
if original_ext and current_ext != original_ext:
final_filepath = filepath + original_ext
# Check for collision again with new extension
final_filepath = resolve_filename_collision(final_filepath)
logger.debug(
"Adding extension {0} to {1}".format(original_ext, filepath)
)
# Rename to add extension (already atomic from download)
try:
os.rename(filepath, final_filepath)
metadata["saved_as"] = os.path.basename(final_filepath)
except Exception as e:
logger.warning(
"Could not add extension to {0}: {1}".format(filepath, str(e))
)
metadata["saved_as"] = os.path.basename(filepath)
else:
metadata["saved_as"] = os.path.basename(filepath)
elif metadata["success"]:
metadata["saved_as"] = os.path.basename(filepath)
else:
metadata["saved_as"] = None
attachment_metadata_list.append(metadata)
# Write manifest
if attachment_metadata_list:
manifest = {
"issue_number": number,
"issue_type": item_type,
"repository": f"{args.user}/{args.repository}"
if hasattr(args, "repository") and args.repository
else args.user,
"manifest_updated_at": datetime.now(timezone.utc).isoformat(),
"attachments": attachment_metadata_list,
}
manifest_path = os.path.join(attachments_dir, "manifest.json")
with open(manifest_path + ".temp", "w") as f:
json.dump(manifest, f, indent=2)
os.rename(manifest_path + ".temp", manifest_path) # Atomic write
logger.debug(
"Wrote manifest for {0} #{1}: {2} attachments".format(
item_type_display, number, len(attachment_metadata_list)
)
)
def get_authenticated_user(args):
template = "https://{0}/user".format(get_github_api_host(args))
data = retrieve_data(args, template, single_request=True)
@@ -1157,6 +1755,10 @@ def backup_issues(args, repo_cwd, repository, repos_template):
if args.include_issue_events or args.include_everything:
template = events_template.format(number)
issues[number]["event_data"] = retrieve_data(args, template)
if args.include_attachments:
download_attachments(
args, issue_cwd, issues[number], number, repository, item_type="issue"
)
with codecs.open(issue_file + ".temp", "w", encoding="utf-8") as f:
json_dump(issue, f)
@@ -1228,6 +1830,10 @@ def backup_pulls(args, repo_cwd, repository, repos_template):
if args.include_pull_commits or args.include_everything:
template = commits_template.format(number)
pulls[number]["commit_data"] = retrieve_data(args, template)
if args.include_attachments:
download_attachments(
args, pulls_cwd, pulls[number], number, repository, item_type="pull"
)
with codecs.open(pull_file + ".temp", "w", encoding="utf-8") as f:
json_dump(pull, f)

6
pytest.ini Normal file
View File

@@ -0,0 +1,6 @@
[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts = -v

View File

@@ -1,39 +1,40 @@
autopep8==2.3.2
black==25.1.0
bleach==6.2.0
certifi==2025.7.14
charset-normalizer==3.4.2
click==8.1.8
black==25.11.0
bleach==6.3.0
certifi==2025.11.12
charset-normalizer==3.4.4
click==8.3.0
colorama==0.4.6
docutils==0.22
docutils==0.22.3
flake8==7.3.0
gitchangelog==3.0.4
idna==3.10
pytest==8.3.3
idna==3.11
importlib-metadata==8.7.0
jaraco.classes==3.4.0
keyring==25.6.0
markdown-it-py==3.0.0
markdown-it-py==4.0.0
mccabe==0.7.0
mdurl==0.1.2
more-itertools==10.7.0
more-itertools==10.8.0
mypy-extensions==1.1.0
packaging==25.0
pathspec==0.12.1
pkginfo==1.12.1.2
platformdirs==4.3.8
platformdirs==4.5.0
pycodestyle==2.14.0
pyflakes==3.4.0
Pygments==2.19.2
readme-renderer==44.0
requests==2.32.4
requests==2.32.5
requests-toolbelt==1.0.0
restructuredtext-lint==1.4.0
rfc3986==2.0.0
rich==14.1.0
rich==14.2.0
setuptools==80.9.0
six==1.17.0
tqdm==4.67.1
twine==6.1.0
twine==6.2.0
urllib3==2.5.0
webencodings==0.5.1
zipp==3.23.0

View File

@@ -1 +0,0 @@

View File

@@ -40,15 +40,16 @@ setup(
"Development Status :: 5 - Production/Stable",
"Topic :: System :: Archiving :: Backup",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
],
description="backup a github user or organization",
long_description=open_file("README.rst").read(),
long_description_content_type="text/x-rst",
install_requires=open_file("requirements.txt").readlines(),
python_requires=">=3.10",
zip_safe=True,
)

1
tests/__init__.py Normal file
View File

@@ -0,0 +1 @@
"""Tests for python-github-backup."""

353
tests/test_attachments.py Normal file
View File

@@ -0,0 +1,353 @@
"""Behavioral tests for attachment functionality."""
import json
import os
import tempfile
from pathlib import Path
from unittest.mock import Mock
import pytest
from github_backup import github_backup
@pytest.fixture
def attachment_test_setup(tmp_path):
"""Fixture providing setup and helper for attachment download tests."""
from unittest.mock import patch
issue_cwd = tmp_path / "issues"
issue_cwd.mkdir()
# Mock args
args = Mock()
args.as_app = False
args.token_fine = None
args.token_classic = None
args.username = None
args.password = None
args.osx_keychain_item_name = None
args.osx_keychain_item_account = None
args.user = "testuser"
args.repository = "testrepo"
repository = {"full_name": "testuser/testrepo"}
def call_download(issue_data, issue_number=123):
"""Call download_attachments with mocked HTTP downloads.
Returns list of URLs that were actually downloaded.
"""
downloaded_urls = []
def mock_download(url, path, auth, as_app, fine):
downloaded_urls.append(url)
return {
"success": True,
"saved_as": os.path.basename(path),
"url": url,
}
with patch(
"github_backup.github_backup.download_attachment_file",
side_effect=mock_download,
):
github_backup.download_attachments(
args, str(issue_cwd), issue_data, issue_number, repository
)
return downloaded_urls
return {
"issue_cwd": str(issue_cwd),
"args": args,
"repository": repository,
"call_download": call_download,
}
class TestURLExtraction:
"""Test URL extraction with realistic issue content."""
def test_mixed_urls(self):
issue_data = {
"body": """
## Bug Report
When uploading files, I see this error. Here's a screenshot:
https://github.com/user-attachments/assets/abc123def456
The logs show: https://github.com/user-attachments/files/789/error-log.txt
This is similar to https://github.com/someorg/somerepo/issues/42 but different.
You can also see the video at https://user-images.githubusercontent.com/12345/video-demo.mov
Here's how to reproduce:
```bash
# Don't extract this example URL:
curl https://github.com/user-attachments/assets/example999
```
More info at https://docs.example.com/guide
Also see this inline code `https://github.com/user-attachments/files/111/inline.pdf` should not extract.
Final attachment: https://github.com/user-attachments/files/222/report.pdf.
""",
"comment_data": [
{
"body": "Here's another attachment: https://private-user-images.githubusercontent.com/98765/secret.png?jwt=token123"
},
{
"body": """
Example code:
```python
url = "https://github.com/user-attachments/assets/code-example"
```
But this is real: https://github.com/user-attachments/files/333/actual.zip
"""
},
],
}
# Extract URLs
urls = github_backup.extract_attachment_urls(issue_data)
expected_urls = [
"https://github.com/user-attachments/assets/abc123def456",
"https://github.com/user-attachments/files/789/error-log.txt",
"https://user-images.githubusercontent.com/12345/video-demo.mov",
"https://github.com/user-attachments/files/222/report.pdf",
"https://private-user-images.githubusercontent.com/98765/secret.png?jwt=token123",
"https://github.com/user-attachments/files/333/actual.zip",
]
assert set(urls) == set(expected_urls)
def test_trailing_punctuation_stripped(self):
"""URLs with trailing punctuation should have punctuation stripped."""
issue_data = {
"body": """
See this file: https://github.com/user-attachments/files/1/doc.pdf.
And this one (https://github.com/user-attachments/files/2/image.png).
Check it out! https://github.com/user-attachments/files/3/data.csv!
"""
}
urls = github_backup.extract_attachment_urls(issue_data)
expected = [
"https://github.com/user-attachments/files/1/doc.pdf",
"https://github.com/user-attachments/files/2/image.png",
"https://github.com/user-attachments/files/3/data.csv",
]
assert set(urls) == set(expected)
def test_deduplication_across_body_and_comments(self):
"""Same URL in body and comments should only appear once."""
duplicate_url = "https://github.com/user-attachments/assets/abc123"
issue_data = {
"body": f"First mention: {duplicate_url}",
"comment_data": [
{"body": f"Second mention: {duplicate_url}"},
{"body": f"Third mention: {duplicate_url}"},
],
}
urls = github_backup.extract_attachment_urls(issue_data)
assert set(urls) == {duplicate_url}
class TestFilenameExtraction:
"""Test filename extraction from different URL types."""
def test_modern_assets_url(self):
"""Modern assets URL returns UUID."""
url = "https://github.com/user-attachments/assets/abc123def456"
filename = github_backup.get_attachment_filename(url)
assert filename == "abc123def456"
def test_modern_files_url(self):
"""Modern files URL returns filename."""
url = "https://github.com/user-attachments/files/12345/report.pdf"
filename = github_backup.get_attachment_filename(url)
assert filename == "report.pdf"
def test_legacy_cdn_url(self):
"""Legacy CDN URL returns filename with extension."""
url = "https://user-images.githubusercontent.com/123456/abc-def.png"
filename = github_backup.get_attachment_filename(url)
assert filename == "abc-def.png"
def test_private_cdn_url(self):
"""Private CDN URL returns filename."""
url = "https://private-user-images.githubusercontent.com/98765/secret.png?jwt=token123"
filename = github_backup.get_attachment_filename(url)
assert filename == "secret.png"
def test_repo_files_url(self):
"""Repo-scoped files URL returns filename."""
url = "https://github.com/owner/repo/files/789/document.txt"
filename = github_backup.get_attachment_filename(url)
assert filename == "document.txt"
class TestFilenameCollision:
"""Test filename collision resolution."""
def test_collision_behavior(self):
"""Test filename collision resolution with real files."""
with tempfile.TemporaryDirectory() as tmpdir:
# No collision - file doesn't exist
result = github_backup.resolve_filename_collision(
os.path.join(tmpdir, "report.pdf")
)
assert result == os.path.join(tmpdir, "report.pdf")
# Create the file, now collision exists
Path(os.path.join(tmpdir, "report.pdf")).touch()
result = github_backup.resolve_filename_collision(
os.path.join(tmpdir, "report.pdf")
)
assert result == os.path.join(tmpdir, "report_1.pdf")
# Create report_1.pdf too
Path(os.path.join(tmpdir, "report_1.pdf")).touch()
result = github_backup.resolve_filename_collision(
os.path.join(tmpdir, "report.pdf")
)
assert result == os.path.join(tmpdir, "report_2.pdf")
def test_manifest_reserved(self):
"""manifest.json is always treated as reserved."""
with tempfile.TemporaryDirectory() as tmpdir:
# Even if manifest.json doesn't exist, should get manifest_1.json
result = github_backup.resolve_filename_collision(
os.path.join(tmpdir, "manifest.json")
)
assert result == os.path.join(tmpdir, "manifest_1.json")
class TestManifestDuplicatePrevention:
"""Test that manifest prevents duplicate downloads (the bug fix)."""
def test_manifest_filters_existing_urls(self, attachment_test_setup):
"""URLs in manifest are not re-downloaded."""
setup = attachment_test_setup
# Create manifest with existing URLs
attachments_dir = os.path.join(setup["issue_cwd"], "attachments", "123")
os.makedirs(attachments_dir)
manifest_path = os.path.join(attachments_dir, "manifest.json")
manifest = {
"attachments": [
{
"url": "https://github.com/user-attachments/assets/old1",
"success": True,
"saved_as": "old1.pdf",
},
{
"url": "https://github.com/user-attachments/assets/old2",
"success": True,
"saved_as": "old2.pdf",
},
]
}
with open(manifest_path, "w") as f:
json.dump(manifest, f)
# Issue data with 2 old URLs and 1 new URL
issue_data = {
"body": """
Old: https://github.com/user-attachments/assets/old1
Old: https://github.com/user-attachments/assets/old2
New: https://github.com/user-attachments/assets/new1
"""
}
downloaded_urls = setup["call_download"](issue_data)
# Should only download the NEW URL (old ones filtered by manifest)
assert len(downloaded_urls) == 1
assert downloaded_urls[0] == "https://github.com/user-attachments/assets/new1"
def test_no_manifest_downloads_all(self, attachment_test_setup):
"""Without manifest, all URLs should be downloaded."""
setup = attachment_test_setup
# Issue data with 2 URLs
issue_data = {
"body": """
https://github.com/user-attachments/assets/url1
https://github.com/user-attachments/assets/url2
"""
}
downloaded_urls = setup["call_download"](issue_data)
# Should download ALL URLs (no manifest to filter)
assert len(downloaded_urls) == 2
assert set(downloaded_urls) == {
"https://github.com/user-attachments/assets/url1",
"https://github.com/user-attachments/assets/url2",
}
def test_manifest_skips_permanent_failures(self, attachment_test_setup):
"""Manifest skips permanent failures (404, 410) but retries transient (503)."""
setup = attachment_test_setup
# Create manifest with different failure types
attachments_dir = os.path.join(setup["issue_cwd"], "attachments", "123")
os.makedirs(attachments_dir)
manifest_path = os.path.join(attachments_dir, "manifest.json")
manifest = {
"attachments": [
{
"url": "https://github.com/user-attachments/assets/success",
"success": True,
"saved_as": "success.pdf",
},
{
"url": "https://github.com/user-attachments/assets/notfound",
"success": False,
"http_status": 404,
},
{
"url": "https://github.com/user-attachments/assets/gone",
"success": False,
"http_status": 410,
},
{
"url": "https://github.com/user-attachments/assets/unavailable",
"success": False,
"http_status": 503,
},
]
}
with open(manifest_path, "w") as f:
json.dump(manifest, f)
# Issue data has all 4 URLs
issue_data = {
"body": """
https://github.com/user-attachments/assets/success
https://github.com/user-attachments/assets/notfound
https://github.com/user-attachments/assets/gone
https://github.com/user-attachments/assets/unavailable
"""
}
downloaded_urls = setup["call_download"](issue_data)
# Should only retry 503 (transient failure)
# Success, 404, and 410 should be skipped
assert len(downloaded_urls) == 1
assert (
downloaded_urls[0]
== "https://github.com/user-attachments/assets/unavailable"
)