mirror of
https://github.com/josegonzalez/python-github-backup.git
synced 2025-12-08 17:28:02 +01:00
Compare commits

21 commits:
eb5779ac23, 5b52931ebf, 1d6d474408, b80049e96e, 58ad1c2378, 6e2a7e521c,
aba048a3e9, 9f7c08166f, fdfaaec1ba, 8f9cf7ff89, 899ab5fdc2, 2a9d86a6bf,
4fd3ea9e3c, 041dc013f9, 12802103c4, bf28b46954, ff2681e196, 745b05a63f,
83ff0ae1dd, 6ad1959d43, 5739ac0745

CHANGES.rst (91 lines changed)
@@ -1,9 +1,98 @@
 Changelog
 =========

-0.52.0 (2025-11-28)
--------------------
+0.55.0 (2025-12-07)
+-------------------
+
+Fix
+~~~
+- Improve error messages for inaccessible repos and empty wikis. [Rodos]
+- --all-starred now clones repos without --repositories. [Rodos]
+- Warn when --private used without authentication. [Rodos]
+- Warn and skip when --starred-gists used for a different user. [Rodos]
+
+  GitHub's API only allows retrieving starred gists for the authenticated
+  user. Previously, using --starred-gists when backing up a different user
+  would silently return no relevant data.
+
+  Now warns and skips the retrieval entirely when the target user differs
+  from the authenticated user. Uses case-insensitive comparison to match
+  GitHub's username handling.
+
+  Fixes #93
+
+Other
+~~~~~
+- Test: add missing test coverage for case sensitivity fix. [Rodos]
+- Docs: fix RST formatting in Known blocking errors section. [Rodos]
+- Chore(deps): bump urllib3 from 2.5.0 to 2.6.0. [dependabot[bot]]
+
+  Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.0.
+
+  - [Release notes](https://github.com/urllib3/urllib3/releases)
+  - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
+  - [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.0)
+
+  ---
+  updated-dependencies:
+  - dependency-name: urllib3
+    dependency-version: 2.6.0
+    dependency-type: direct:production
+  ...
+
+
+0.54.0 (2025-12-03)
+-------------------
+
+Fix
+~~~
+- Send INFO/DEBUG to stdout, WARNING/ERROR to stderr. [Rodos]
+
+  Fixes #182
+
+Other
+~~~~~
+- Docs: update README testing section and add fetch vs pull explanation.
+  [Rodos]
+
+
+0.53.0 (2025-11-30)
+-------------------
+
+Fix
+~~~
+- Case-sensitive username filtering causing silent backup failures.
+  [Rodos]
+
+  GitHub's API accepts usernames in any case but returns canonical case.
+  The case-sensitive comparison in filter_repositories() filtered out all
+  repositories when the user-provided case didn't match GitHub's canonical case.
+
+  Changed to case-insensitive comparison.
+
+  Fixes #198
+
+Other
+~~~~~
+- Avoid rewriting unchanged JSON files for labels, milestones, releases,
+  hooks, followers, and following. [Rodos]
+
+  This change reduces unnecessary writes when backing up metadata that changes
+  infrequently. The implementation compares existing file content before writing
+  and skips the write if the content is identical, preserving file timestamps.
+
+  Key changes:
+
+  - Added json_dump_if_changed() helper that compares content before writing
+  - Uses atomic writes (temp file + rename) for all metadata files
+  - NOT applied to issues/pulls (they use incremental_by_files logic)
+  - Made log messages consistent and past tense ("Saved" instead of "Saving")
+  - Added informative logging showing skip counts
+
+  Fixes #133
+
+
+0.52.0 (2025-11-28)
+-------------------
 - Skip DMCA'd repos which return a 451 response. [Rodos]

   Log a warning and the link to the DMCA notice. Continue backing up
README.rst (30 lines changed)
@@ -281,7 +281,7 @@ If the incremental argument is used, this will result in the next backup only re

 It's therefore recommended to only use the incremental argument if the output/result is being actively monitored, or complemented with periodic full non-incremental runs, to avoid unexpected missing data in regular backup runs.

-1. **Starred public repo hooks blocking**
+**Starred public repo hooks blocking**

 Since the ``--all`` argument includes ``--hooks``, if you use ``--all`` and ``--all-starred`` together to clone a user's starred public repositories, the backup will likely error and block the backup from continuing.

@@ -301,6 +301,8 @@ Starred gists vs starred repo behaviour

 The starred normal repo cloning (``--all-starred``) argument stores starred repos separately from the user's own repositories. However, using ``--starred-gists`` will store starred gists within the same directory as the user's own gists (``--gists``). Also, all gist repo directory names are IDs, not the gist's name.

+Note: ``--starred-gists`` only retrieves starred gists for the authenticated user, not the target user, due to a GitHub API limitation.
+
 Skip existing on incomplete backups
 -----------------------------------

@@ -308,6 +310,25 @@ Skip existing on incomplete backups
 The ``--skip-existing`` argument will skip a backup if the directory already exists, even if the backup in that directory failed (perhaps due to a blocking error). This may result in unexpected missing data in a regular backup.

+
+Updates use fetch, not pull
+---------------------------
+
+When updating an existing repository backup, ``github-backup`` uses ``git fetch`` rather than ``git pull``. This is intentional: a backup tool should reliably download data without risk of failure. Using ``git pull`` would require handling merge conflicts, which adds complexity and could cause backups to fail unexpectedly.
+
+With fetch, **all branches and commits are downloaded** safely into remote-tracking branches. The working directory files won't change, but your backup is complete.
+
+If you look at files directly (e.g., ``cat README.md``), you'll see the old content. The new data is in the remote-tracking branches (confusingly named "remote" but stored locally). To view or use the latest files::
+
+    git show origin/main:README.md   # view a file
+    git merge origin/main            # update working directory
+
+All branches are backed up as remote refs (``origin/main``, ``origin/feature-branch``, etc.).
+
+If you want to browse files directly without merging, consider using ``--bare``, which skips the working directory entirely: the backup is just the git data.
+
+See `#269 <https://github.com/josegonzalez/python-github-backup/issues/269>`_ for more discussion.
+
 Github Backup Examples
 ======================

@@ -357,7 +378,12 @@ A huge thanks to all the contributors!

 Testing
 -------

-This project currently contains no unit tests. To run linting::
+To run the test suite::
+
+    pip install pytest
+    pytest
+
+To run linting::

     pip install flake8
     flake8 --ignore=E501
@@ -9,6 +9,7 @@ from github_backup.github_backup import (
     backup_repositories,
     check_git_lfs_install,
     filter_repositories,
     get_auth,
     get_authenticated_user,
     logger,
     mkdir_p,
@@ -16,16 +17,33 @@ from github_backup.github_backup import (
     retrieve_repositories,
 )

-logging.basicConfig(
-    format="%(asctime)s.%(msecs)03d: %(message)s",
+# INFO and DEBUG go to stdout, WARNING and above go to stderr
+log_format = logging.Formatter(
+    fmt="%(asctime)s.%(msecs)03d: %(message)s",
     datefmt="%Y-%m-%dT%H:%M:%S",
-    level=logging.INFO,
 )
+
+stdout_handler = logging.StreamHandler(sys.stdout)
+stdout_handler.setLevel(logging.DEBUG)
+stdout_handler.addFilter(lambda r: r.levelno < logging.WARNING)
+stdout_handler.setFormatter(log_format)
+
+stderr_handler = logging.StreamHandler(sys.stderr)
+stderr_handler.setLevel(logging.WARNING)
+stderr_handler.setFormatter(log_format)
+
+logging.basicConfig(level=logging.INFO, handlers=[stdout_handler, stderr_handler])


 def main():
     args = parse_args()

+    if args.private and not get_auth(args):
+        logger.warning(
+            "The --private flag has no effect without authentication. "
+            "Use -t/--token, -f/--token-fine, or -u/--username to authenticate."
+        )
+
     if args.quiet:
         logger.setLevel(logging.WARNING)
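The split-stream handler setup above can be exercised in isolation. Below is a minimal, self-contained sketch of the same pattern; the function name and the injectable `out`/`err` streams are illustrative conveniences, not part of the project's API:

```python
import logging
import sys


def make_split_logger(name, out=sys.stdout, err=sys.stderr):
    """Route INFO/DEBUG to `out` and WARNING and above to `err`.

    Mirrors the 0.54.0 change: a filter blocks WARNING+ records on the
    stdout handler, while the stderr handler's level admits only WARNING+.
    """
    fmt = logging.Formatter(
        fmt="%(asctime)s.%(msecs)03d: %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records off the root logger in this demo
    logger.handlers.clear()

    out_handler = logging.StreamHandler(out)
    out_handler.setLevel(logging.DEBUG)
    out_handler.addFilter(lambda record: record.levelno < logging.WARNING)
    out_handler.setFormatter(fmt)

    err_handler = logging.StreamHandler(err)
    err_handler.setLevel(logging.WARNING)
    err_handler.setFormatter(fmt)

    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    return logger
```

This keeps progress output pipeable (e.g. to a log file) while errors still surface on stderr for cron and monitoring.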
@@ -1 +1 @@
-__version__ = "0.52.0"
+__version__ = "0.55.0"
@@ -1565,6 +1565,12 @@ def retrieve_repositories(args, authenticated_user):
         repos.extend(gists)

     if args.include_starred_gists:
+        if not authenticated_user.get("login") or args.user.lower() != authenticated_user["login"].lower():
+            logger.warning(
+                "Cannot retrieve starred gists for '%s'. GitHub only allows access to the authenticated user's starred gists.",
+                args.user,
+            )
+        else:
             starred_gists_template = "https://{0}/gists/starred".format(
                 get_github_api_host(args)
             )
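The guard above reduces to a small predicate. A sketch (the function name is invented for illustration; the comparison logic matches the diff):

```python
def can_fetch_starred_gists(target_user, authenticated_user):
    """GitHub's /gists/starred endpoint only returns the authenticated user's
    starred gists, so retrieval must be skipped when the backup target is
    someone else. Comparison is case-insensitive, matching GitHub's
    username handling.
    """
    login = (authenticated_user or {}).get("login")
    return bool(login) and target_user.lower() == login.lower()
```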
@@ -1587,7 +1593,9 @@ def filter_repositories(args, unfiltered_repositories):
     repositories = []
     for r in unfiltered_repositories:
         # gists can be anonymous, so need to safely check owner
-        if r.get("owner", {}).get("login") == args.user or r.get("is_starred"):
+        # Use case-insensitive comparison to match GitHub's case-insensitive username behavior
+        owner_login = r.get("owner", {}).get("login", "")
+        if owner_login.lower() == args.user.lower() or r.get("is_starred"):
             repositories.append(r)

     name_regex = None
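The corrected ownership test can be stated as a standalone predicate. A sketch (the helper name is illustrative; the logic mirrors the fixed lines above):

```python
def repo_matches_user(repo, user):
    """Case-insensitive owner check from the filter_repositories() fix.

    Anonymous gists may have no owner, so the login is read defensively;
    starred repos always pass regardless of owner.
    """
    owner_login = repo.get("owner", {}).get("login", "")
    return owner_login.lower() == user.lower() or bool(repo.get("is_starred"))
```

This is exactly the mismatch from issue #198: the GitHub API canonicalizes `iamrodos` to `Iamrodos`, so a case-sensitive `==` silently filtered out every repository.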
@@ -1664,9 +1672,10 @@ def backup_repositories(args, output_directory, repositories):
         repo_url = get_github_repo_url(args, repository)

         include_gists = args.include_gists or args.include_starred_gists
+        include_starred = args.all_starred and repository.get("is_starred")
         if (args.include_repository or args.include_everything) or (
             include_gists and repository.get("is_gist")
-        ):
+        ) or include_starred:
             repo_name = (
                 repository.get("name")
                 if not repository.get("is_gist")
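The gating condition after this change can be sketched as a pure function, which is easier to reason about than the inline expression. Names below are illustrative; `args` stands in for the parsed CLI flags:

```python
from types import SimpleNamespace


def should_clone(args, repository):
    """Sketch of the backup_repositories() gating after the --all-starred fix:
    clone when repositories are requested explicitly, when it is a requested
    gist, or when it is a starred repo under --all-starred.
    """
    include_gists = args.include_gists or args.include_starred_gists
    include_starred = args.all_starred and bool(repository.get("is_starred"))
    return bool(
        args.include_repository
        or args.include_everything
        or (include_gists and repository.get("is_gist"))
        or include_starred
    )
```

Before the fix, a starred repo with only `--all-starred` set fell through every branch and was never fetched (issue #225).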
@@ -1898,11 +1907,21 @@ def backup_milestones(args, repo_cwd, repository, repos_template):
     for milestone in _milestones:
         milestones[milestone["number"]] = milestone

-    logger.info("Saving {0} milestones to disk".format(len(list(milestones.keys()))))
+    written_count = 0
     for number, milestone in list(milestones.items()):
         milestone_file = "{0}/{1}.json".format(milestone_cwd, number)
-        with codecs.open(milestone_file, "w", encoding="utf-8") as f:
-            json_dump(milestone, f)
+        if json_dump_if_changed(milestone, milestone_file):
+            written_count += 1
+
+    total = len(milestones)
+    if written_count == total:
+        logger.info("Saved {0} milestones to disk".format(total))
+    elif written_count == 0:
+        logger.info("{0} milestones unchanged, skipped write".format(total))
+    else:
+        logger.info("Saved {0} of {1} milestones to disk ({2} unchanged)".format(
+            written_count, total, total - written_count
+        ))


 def backup_labels(args, repo_cwd, repository, repos_template):
@@ -1955,19 +1974,17 @@ def backup_releases(args, repo_cwd, repository, repos_template, include_assets=F
             reverse=True,
         )
         releases = releases[: args.number_of_latest_releases]
-        logger.info("Saving the latest {0} releases to disk".format(len(releases)))
-    else:
-        logger.info("Saving {0} releases to disk".format(len(releases)))

     # for each release, store it
+    written_count = 0
     for release in releases:
         release_name = release["tag_name"]
         release_name_safe = release_name.replace("/", "__")
         output_filepath = os.path.join(
             release_cwd, "{0}.json".format(release_name_safe)
         )
-        with codecs.open(output_filepath, "w+", encoding="utf-8") as f:
-            json_dump(release, f)
+        if json_dump_if_changed(release, output_filepath):
+            written_count += 1

         if include_assets:
             assets = retrieve_data(args, release["assets_url"])
@@ -1984,6 +2001,17 @@ def backup_releases(args, repo_cwd, repository, repos_template, include_assets=F
             fine=True if args.token_fine is not None else False,
         )

+    # Log the results
+    total = len(releases)
+    if written_count == total:
+        logger.info("Saved {0} releases to disk".format(total))
+    elif written_count == 0:
+        logger.info("{0} releases unchanged, skipped write".format(total))
+    else:
+        logger.info("Saved {0} of {1} releases to disk ({2} unchanged)".format(
+            written_count, total, total - written_count
+        ))
@@ -1996,12 +2024,9 @@ def fetch_repository(
 ):
     if bare_clone:
         if os.path.exists(local_dir):
-            clone_exists = (
-                subprocess.check_output(
-                    ["git", "rev-parse", "--is-bare-repository"], cwd=local_dir
-                )
-                == b"true\n"
-            )
+            clone_exists = subprocess.check_output(
+                ["git", "rev-parse", "--is-bare-repository"], cwd=local_dir
+            ) == b"true\n"
         else:
             clone_exists = False
     else:
@@ -2016,10 +2041,13 @@ def fetch_repository(
             "git ls-remote " + remote_url, stdout=FNULL, stderr=FNULL, shell=True
         )
         if initialized == 128:
-            logger.info(
-                "Skipping {0} ({1}) since it's not initialized".format(
-                    name, masked_remote_url
-                )
-            )
+            if ".wiki.git" in remote_url:
+                logger.info(
+                    "Skipping {0} wiki (wiki is enabled but has no content)".format(name)
+                )
+            else:
+                logger.info(
+                    "Skipping {0} (repository not accessible - may be empty, private, or credentials invalid)".format(name)
+                )
             return
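The improved skip messages hinge on interpreting `git ls-remote`'s exit code. A sketch of that decision as a pure function (the helper name is invented; the messages match the diff, and git exits with 128 when the remote is missing, empty, or unauthorized):

```python
def skip_reason(returncode, remote_url, name):
    """Return the skip message for an unreachable remote, or None when the
    `git ls-remote` probe succeeded and the backup can proceed.

    Exit code 128 covers several cases: an enabled-but-empty wiki, a deleted
    or DMCA'd repo, or invalid credentials; wikis get a dedicated message.
    """
    if returncode != 128:
        return None
    if ".wiki.git" in remote_url:
        return "Skipping {0} wiki (wiki is enabled but has no content)".format(name)
    return "Skipping {0} (repository not accessible - may be empty, private, or credentials invalid)".format(name)
```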
@@ -2108,9 +2136,10 @@ def _backup_data(args, name, template, output_file, output_directory):
     mkdir_p(output_directory)
     data = retrieve_data(args, template)

-    logger.info("Writing {0} {1} to disk".format(len(data), name))
-    with codecs.open(output_file, "w", encoding="utf-8") as f:
-        json_dump(data, f)
+    if json_dump_if_changed(data, output_file):
+        logger.info("Saved {0} {1} to disk".format(len(data), name))
+    else:
+        logger.info("{0} {1} unchanged, skipped write".format(len(data), name))
@@ -2122,3 +2151,57 @@ def json_dump(data, output_file):
         indent=4,
         separators=(",", ": "),
     )
+
+
+def json_dump_if_changed(data, output_file_path):
+    """
+    Write JSON data to file only if content has changed.
+
+    Compares the serialized JSON data with the existing file content
+    and only writes if different. This prevents unnecessary file
+    modification timestamp updates and disk writes.
+
+    Uses atomic writes (temp file + rename) to prevent corruption
+    if the process is interrupted during the write.
+
+    Args:
+        data: The data to serialize as JSON
+        output_file_path: The path to the output file
+
+    Returns:
+        True if file was written (content changed or new file)
+        False if write was skipped (content unchanged)
+    """
+    # Serialize new data with consistent formatting matching json_dump()
+    new_content = json.dumps(
+        data,
+        ensure_ascii=False,
+        sort_keys=True,
+        indent=4,
+        separators=(",", ": "),
+    )
+
+    # Check if file exists and compare content
+    if os.path.exists(output_file_path):
+        try:
+            with codecs.open(output_file_path, "r", encoding="utf-8") as f:
+                existing_content = f.read()
+                if existing_content == new_content:
+                    logger.debug(
+                        "Content unchanged, skipping write: {0}".format(output_file_path)
+                    )
+                    return False
+        except (OSError, UnicodeDecodeError) as e:
+            # If we can't read the existing file, write the new one
+            logger.debug(
+                "Error reading existing file {0}, will overwrite: {1}".format(
+                    output_file_path, e
+                )
+            )
+
+    # Write the file atomically using temp file + rename
+    temp_file = output_file_path + ".temp"
+    with codecs.open(temp_file, "w", encoding="utf-8") as f:
+        f.write(new_content)
+    os.rename(temp_file, output_file_path)  # Atomic on POSIX systems
+    return True
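The helper above boils down to a compare-then-atomic-rename pattern. A reduced standalone sketch (stdlib only; `dump_if_changed` is an illustrative name, and the serialization settings mirror the project's `json_dump`):

```python
import json
import os


def dump_if_changed(data, path):
    """Serialize `data` deterministically; rewrite `path` only when the text differs.

    An interrupted write never corrupts the destination, because the new
    content lands in a temp file that is renamed into place (atomic on POSIX).
    """
    new = json.dumps(data, ensure_ascii=False, sort_keys=True, indent=4,
                     separators=(",", ": "))
    try:
        with open(path, "r", encoding="utf-8") as f:
            if f.read() == new:
                return False  # unchanged: skip write, preserve mtime
    except (OSError, UnicodeDecodeError):
        pass  # missing or unreadable: fall through and (re)write
    tmp = path + ".temp"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(new)
    os.rename(tmp, path)
    return True
```

Because keys are sorted before serializing, two dicts with the same entries in different insertion order produce identical text, so re-running a backup over unchanged metadata performs zero writes.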
@@ -35,6 +35,6 @@ setuptools==80.9.0
 six==1.17.0
 tqdm==4.67.1
 twine==6.2.0
-urllib3==2.5.0
+urllib3==2.6.0
 webencodings==0.5.1
 zipp==3.23.0
tests/test_all_starred.py (new file, 161 lines)

@@ -0,0 +1,161 @@
"""Tests for --all-starred flag behavior (issue #225)."""
|
||||
|
||||
import pytest
|
||||
from unittest.mock import Mock, patch
|
||||
|
||||
from github_backup import github_backup
|
||||
|
||||
|
||||
class TestAllStarredCloning:
|
||||
"""Test suite for --all-starred repository cloning behavior.
|
||||
|
||||
Issue #225: --all-starred should clone starred repos without requiring --repositories.
|
||||
"""
|
||||
|
||||
def _create_mock_args(self, **overrides):
|
||||
"""Create a mock args object with sensible defaults."""
|
||||
args = Mock()
|
||||
args.user = "testuser"
|
||||
args.output_directory = "/tmp/backup"
|
||||
args.include_repository = False
|
||||
args.include_everything = False
|
||||
args.include_gists = False
|
||||
args.include_starred_gists = False
|
||||
args.all_starred = False
|
||||
args.skip_existing = False
|
||||
args.bare_clone = False
|
||||
args.lfs_clone = False
|
||||
args.no_prune = False
|
||||
args.include_wiki = False
|
||||
args.include_issues = False
|
||||
args.include_issue_comments = False
|
||||
args.include_issue_events = False
|
||||
args.include_pulls = False
|
||||
args.include_pull_comments = False
|
||||
args.include_pull_commits = False
|
||||
args.include_pull_details = False
|
||||
args.include_labels = False
|
||||
args.include_hooks = False
|
||||
args.include_milestones = False
|
||||
args.include_releases = False
|
||||
args.include_assets = False
|
||||
args.include_attachments = False
|
||||
args.incremental = False
|
||||
args.incremental_by_files = False
|
||||
args.github_host = None
|
||||
args.prefer_ssh = False
|
||||
args.token_classic = None
|
||||
args.token_fine = None
|
||||
args.username = None
|
||||
args.password = None
|
||||
args.as_app = False
|
||||
args.osx_keychain_item_name = None
|
||||
args.osx_keychain_item_account = None
|
||||
|
||||
for key, value in overrides.items():
|
||||
setattr(args, key, value)
|
||||
|
||||
return args
|
||||
|
||||
@patch('github_backup.github_backup.fetch_repository')
|
||||
@patch('github_backup.github_backup.get_github_repo_url')
|
||||
def test_all_starred_clones_without_repositories_flag(self, mock_get_url, mock_fetch):
|
||||
"""--all-starred should clone starred repos without --repositories flag.
|
||||
|
||||
This is the core fix for issue #225.
|
||||
"""
|
||||
args = self._create_mock_args(all_starred=True)
|
||||
mock_get_url.return_value = "https://github.com/otheruser/awesome-project.git"
|
||||
|
||||
# A starred repository (is_starred flag set by retrieve_repositories)
|
||||
starred_repo = {
|
||||
"name": "awesome-project",
|
||||
"full_name": "otheruser/awesome-project",
|
||||
"owner": {"login": "otheruser"},
|
||||
"private": False,
|
||||
"fork": False,
|
||||
"has_wiki": False,
|
||||
"is_starred": True, # This flag is set for starred repos
|
||||
}
|
||||
|
||||
with patch('github_backup.github_backup.mkdir_p'):
|
||||
github_backup.backup_repositories(args, "/tmp/backup", [starred_repo])
|
||||
|
||||
# fetch_repository should be called for the starred repo
|
||||
assert mock_fetch.called, "--all-starred should trigger repository cloning"
|
||||
mock_fetch.assert_called_once()
|
||||
call_args = mock_fetch.call_args
|
||||
assert call_args[0][0] == "awesome-project" # repo name
|
||||
|
||||
@patch('github_backup.github_backup.fetch_repository')
|
||||
@patch('github_backup.github_backup.get_github_repo_url')
|
||||
def test_starred_repo_not_cloned_without_all_starred_flag(self, mock_get_url, mock_fetch):
|
||||
"""Starred repos should NOT be cloned if --all-starred is not set."""
|
||||
args = self._create_mock_args(all_starred=False)
|
||||
mock_get_url.return_value = "https://github.com/otheruser/awesome-project.git"
|
||||
|
||||
starred_repo = {
|
||||
"name": "awesome-project",
|
||||
"full_name": "otheruser/awesome-project",
|
||||
"owner": {"login": "otheruser"},
|
||||
"private": False,
|
||||
"fork": False,
|
||||
"has_wiki": False,
|
||||
"is_starred": True,
|
||||
}
|
||||
|
||||
with patch('github_backup.github_backup.mkdir_p'):
|
||||
github_backup.backup_repositories(args, "/tmp/backup", [starred_repo])
|
||||
|
||||
# fetch_repository should NOT be called
|
||||
assert not mock_fetch.called, "Starred repos should not be cloned without --all-starred"
|
||||
|
||||
@patch('github_backup.github_backup.fetch_repository')
|
||||
@patch('github_backup.github_backup.get_github_repo_url')
|
||||
def test_non_starred_repo_not_cloned_with_only_all_starred(self, mock_get_url, mock_fetch):
|
||||
"""Non-starred repos should NOT be cloned when only --all-starred is set."""
|
||||
args = self._create_mock_args(all_starred=True)
|
||||
mock_get_url.return_value = "https://github.com/testuser/my-project.git"
|
||||
|
||||
# A regular (non-starred) repository
|
||||
regular_repo = {
|
||||
"name": "my-project",
|
||||
"full_name": "testuser/my-project",
|
||||
"owner": {"login": "testuser"},
|
||||
"private": False,
|
||||
"fork": False,
|
||||
"has_wiki": False,
|
||||
# No is_starred flag
|
||||
}
|
||||
|
||||
with patch('github_backup.github_backup.mkdir_p'):
|
||||
github_backup.backup_repositories(args, "/tmp/backup", [regular_repo])
|
||||
|
||||
# fetch_repository should NOT be called for non-starred repos
|
||||
assert not mock_fetch.called, "Non-starred repos should not be cloned with only --all-starred"
|
||||
|
||||
@patch('github_backup.github_backup.fetch_repository')
|
||||
@patch('github_backup.github_backup.get_github_repo_url')
|
||||
def test_repositories_flag_still_works(self, mock_get_url, mock_fetch):
|
||||
"""--repositories flag should still clone repos as before."""
|
||||
args = self._create_mock_args(include_repository=True)
|
||||
mock_get_url.return_value = "https://github.com/testuser/my-project.git"
|
||||
|
||||
regular_repo = {
|
||||
"name": "my-project",
|
||||
"full_name": "testuser/my-project",
|
||||
"owner": {"login": "testuser"},
|
||||
"private": False,
|
||||
"fork": False,
|
||||
"has_wiki": False,
|
||||
}
|
||||
|
||||
with patch('github_backup.github_backup.mkdir_p'):
|
||||
github_backup.backup_repositories(args, "/tmp/backup", [regular_repo])
|
||||
|
||||
# fetch_repository should be called
|
||||
assert mock_fetch.called, "--repositories should trigger repository cloning"
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
pytest.main([__file__, "-v"])
|
||||
tests/test_case_sensitivity.py (new file, 112 lines)

@@ -0,0 +1,112 @@
"""Tests for case-insensitive username/organization filtering."""
|
||||
|
||||
import pytest
|
||||
from unittest.mock import Mock
|
||||
|
||||
from github_backup import github_backup
|
||||
|
||||
|
||||
class TestCaseSensitivity:
|
||||
"""Test suite for case-insensitive username matching in filter_repositories."""
|
||||
|
||||
def test_filter_repositories_case_insensitive_user(self):
|
||||
"""Should filter repositories case-insensitively for usernames.
|
||||
|
||||
Reproduces issue #198 where typing 'iamrodos' fails to match
|
||||
repositories with owner.login='Iamrodos' (the canonical case from GitHub API).
|
||||
"""
|
||||
# Simulate user typing lowercase username
|
||||
args = Mock()
|
||||
args.user = "iamrodos" # lowercase (what user typed)
|
||||
args.repository = None
|
||||
args.name_regex = None
|
||||
args.languages = None
|
||||
args.exclude = None
|
||||
args.fork = False
|
||||
args.private = False
|
||||
args.public = False
|
||||
args.all = True
|
||||
|
||||
# Simulate GitHub API returning canonical case
|
||||
repos = [
|
||||
{
|
||||
"name": "repo1",
|
||||
"owner": {"login": "Iamrodos"}, # Capital I (canonical from API)
|
||||
"private": False,
|
||||
"fork": False,
|
||||
},
|
||||
{
|
||||
"name": "repo2",
|
||||
"owner": {"login": "Iamrodos"},
|
||||
"private": False,
|
||||
"fork": False,
|
||||
},
|
||||
]
|
||||
|
||||
filtered = github_backup.filter_repositories(args, repos)
|
||||
|
||||
# Should match despite case difference
|
||||
assert len(filtered) == 2
|
||||
assert filtered[0]["name"] == "repo1"
|
||||
assert filtered[1]["name"] == "repo2"
|
||||
|
||||
def test_filter_repositories_case_insensitive_org(self):
|
||||
"""Should filter repositories case-insensitively for organizations.
|
||||
|
||||
Tests the example from issue #198 where 'prai-org' doesn't match 'PRAI-Org'.
|
||||
"""
|
||||
args = Mock()
|
||||
args.user = "prai-org" # lowercase (what user typed)
|
||||
args.repository = None
|
||||
args.name_regex = None
|
||||
args.languages = None
|
||||
args.exclude = None
|
||||
args.fork = False
|
||||
args.private = False
|
||||
args.public = False
|
||||
args.all = True
|
||||
|
||||
repos = [
|
||||
{
|
||||
"name": "repo1",
|
||||
"owner": {"login": "PRAI-Org"}, # Different case (canonical from API)
|
||||
"private": False,
|
||||
"fork": False,
|
||||
},
|
||||
]
|
||||
|
||||
filtered = github_backup.filter_repositories(args, repos)
|
||||
|
||||
# Should match despite case difference
|
||||
assert len(filtered) == 1
|
||||
assert filtered[0]["name"] == "repo1"
|
||||
|
||||
def test_filter_repositories_case_variations(self):
|
||||
"""Should handle various case combinations correctly."""
|
||||
args = Mock()
|
||||
args.user = "TeSt-UsEr" # Mixed case
|
||||
args.repository = None
|
||||
args.name_regex = None
|
||||
args.languages = None
|
||||
args.exclude = None
|
||||
args.fork = False
|
||||
args.private = False
|
||||
args.public = False
|
||||
args.all = True
|
||||
|
||||
repos = [
|
||||
{"name": "repo1", "owner": {"login": "test-user"}, "private": False, "fork": False},
|
||||
{"name": "repo2", "owner": {"login": "TEST-USER"}, "private": False, "fork": False},
|
||||
{"name": "repo3", "owner": {"login": "TeSt-UsEr"}, "private": False, "fork": False},
|
||||
{"name": "repo4", "owner": {"login": "other-user"}, "private": False, "fork": False},
|
||||
]
|
||||
|
||||
filtered = github_backup.filter_repositories(args, repos)
|
||||
|
||||
# Should match first 3 (all case variations of same user)
|
||||
assert len(filtered) == 3
|
||||
assert set(r["name"] for r in filtered) == {"repo1", "repo2", "repo3"}
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
pytest.main([__file__, "-v"])
|
||||
tests/test_json_dump_if_changed.py (new file, 198 lines)

@@ -0,0 +1,198 @@
"""Tests for json_dump_if_changed functionality."""
|
||||
|
||||
import codecs
|
||||
import json
|
||||
import os
|
||||
import tempfile
|
||||
|
||||
import pytest
|
||||
|
||||
from github_backup import github_backup
|
||||
|
||||
|
||||
class TestJsonDumpIfChanged:
|
||||
"""Test suite for json_dump_if_changed function."""
|
||||
|
||||
def test_writes_new_file(self):
|
||||
"""Should write file when it doesn't exist."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data = {"key": "value", "number": 42}
|
||||
|
||||
result = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
|
||||
assert result is True
|
||||
assert os.path.exists(output_file)
|
||||
|
||||
# Verify content matches expected format
|
||||
with codecs.open(output_file, "r", encoding="utf-8") as f:
|
||||
content = f.read()
|
||||
loaded = json.loads(content)
|
||||
assert loaded == test_data
|
||||
|
||||
def test_skips_unchanged_file(self):
|
||||
"""Should skip write when content is identical."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data = {"key": "value", "number": 42}
|
||||
|
||||
# First write
|
||||
result1 = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
assert result1 is True
|
||||
|
||||
# Get the initial mtime
|
||||
mtime1 = os.path.getmtime(output_file)
|
||||
|
||||
# Second write with same data
|
||||
result2 = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
assert result2 is False
|
||||
|
||||
# File should not have been modified
|
||||
mtime2 = os.path.getmtime(output_file)
|
||||
assert mtime1 == mtime2
|
||||
|
||||
def test_writes_when_content_changed(self):
|
||||
"""Should write file when content has changed."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data1 = {"key": "value1"}
|
||||
test_data2 = {"key": "value2"}
|
||||
|
||||
# First write
|
||||
result1 = github_backup.json_dump_if_changed(test_data1, output_file)
|
||||
assert result1 is True
|
||||
|
||||
# Second write with different data
|
||||
result2 = github_backup.json_dump_if_changed(test_data2, output_file)
|
||||
assert result2 is True
|
||||
|
||||
# Verify new content
|
||||
with codecs.open(output_file, "r", encoding="utf-8") as f:
|
||||
loaded = json.load(f)
|
||||
assert loaded == test_data2
|
||||
|
||||
def test_uses_consistent_formatting(self):
|
||||
"""Should use same JSON formatting as json_dump."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data = {"z": "last", "a": "first", "m": "middle"}
|
||||
|
||||
github_backup.json_dump_if_changed(test_data, output_file)
|
||||
|
||||
with codecs.open(output_file, "r", encoding="utf-8") as f:
|
||||
content = f.read()
|
||||
|
||||
# Check for consistent formatting:
|
||||
# - sorted keys
|
||||
# - 4-space indent
|
||||
# - comma-colon-space separator
|
||||
expected = json.dumps(
|
||||
test_data,
|
||||
ensure_ascii=False,
|
||||
sort_keys=True,
|
||||
indent=4,
|
||||
separators=(",", ": "),
|
||||
)
|
||||
assert content == expected
|
||||
|
||||
def test_atomic_write_always_used(self):
|
||||
"""Should always use temp file and rename for atomic writes."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data = {"key": "value"}
|
||||
|
||||
result = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
|
||||
assert result is True
|
||||
assert os.path.exists(output_file)
|
||||
|
||||
# Temp file should not exist after atomic write
|
||||
temp_file = output_file + ".temp"
|
||||
assert not os.path.exists(temp_file)
|
||||
|
||||
# Verify content
|
||||
with codecs.open(output_file, "r", encoding="utf-8") as f:
|
||||
loaded = json.load(f)
|
||||
assert loaded == test_data
|
||||
|
||||
def test_handles_unicode_content(self):
|
||||
"""Should correctly handle Unicode content."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data = {
|
||||
"emoji": "🚀",
|
||||
"chinese": "你好",
|
||||
"arabic": "مرحبا",
|
||||
"cyrillic": "Привет",
|
||||
}
|
||||
|
||||
result = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
assert result is True
|
||||
|
||||
# Verify Unicode is preserved
|
||||
with codecs.open(output_file, "r", encoding="utf-8") as f:
|
||||
loaded = json.load(f)
|
||||
assert loaded == test_data
|
||||
|
||||
# Second write should skip
|
||||
result2 = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
assert result2 is False
|
||||
|
||||
def test_handles_complex_nested_data(self):
|
||||
"""Should handle complex nested data structures."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data = {
|
||||
"users": [
|
||||
{"id": 1, "name": "Alice", "tags": ["admin", "user"]},
|
||||
{"id": 2, "name": "Bob", "tags": ["user"]},
|
||||
],
|
||||
"metadata": {"version": "1.0", "nested": {"deep": {"value": 42}}},
|
||||
}
|
||||
|
||||
result = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
assert result is True
|
||||
|
||||
# Verify structure is preserved
|
||||
with codecs.open(output_file, "r", encoding="utf-8") as f:
|
||||
loaded = json.load(f)
|
||||
assert loaded == test_data
|
||||
|
||||
def test_overwrites_on_unicode_decode_error(self):
|
||||
"""Should overwrite if existing file has invalid UTF-8."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
test_data = {"key": "value"}
|
||||
|
||||
# Write invalid UTF-8 bytes
|
||||
with open(output_file, "wb") as f:
|
||||
f.write(b"\xff\xfe invalid utf-8")
|
||||
|
||||
# Should catch UnicodeDecodeError and overwrite
|
||||
result = github_backup.json_dump_if_changed(test_data, output_file)
|
||||
assert result is True
|
||||
|
||||
# Verify new content was written
|
||||
with codecs.open(output_file, "r", encoding="utf-8") as f:
|
||||
loaded = json.load(f)
|
||||
assert loaded == test_data
|
||||
|
||||
def test_key_order_independence(self):
|
||||
"""Should treat differently-ordered dicts as same if keys/values match."""
|
||||
with tempfile.TemporaryDirectory() as tmpdir:
|
||||
output_file = os.path.join(tmpdir, "test.json")
|
||||
|
||||
# Write first dict
|
||||
data1 = {"z": 1, "a": 2, "m": 3}
|
||||
github_backup.json_dump_if_changed(data1, output_file)
|
||||
|
||||
# Try to write same data but different order
|
||||
data2 = {"a": 2, "m": 3, "z": 1}
|
||||
result = github_backup.json_dump_if_changed(data2, output_file)
|
||||
|
||||
# Should skip because content is the same (keys are sorted)
|
||||
assert result is False
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
pytest.main([__file__, "-v"])
|
||||