Separate release assets and skip re-downloading

Currently the script puts all release assets into a single folder called `releases`, so whenever two release assets have the same name, only the last one downloaded is actually saved. A particularly bad example is MacDownApp/macdown, where every release's asset is named `MacDown.app.zip`. Even though they have 36 releases and all 36 are downloaded, only the last one is kept.

With this change, each release's assets are stored in a subfolder inside `releases` named after the release name. There could still be edge cases if two releases have the same name, but this is still much safer than the previous behavior.
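To illustrate the collision, here is a minimal sketch comparing the flat layout with the per-release layout. The helper names and the release names `v0.7.1` and `v0.7.2` are hypothetical and chosen only for illustration; they are not taken from the script or from the MacDown repository.

```python
import os


def flat_asset_path(releases_dir, asset_name):
    # Old layout: every asset lands directly in the shared folder.
    return os.path.join(releases_dir, asset_name)


def per_release_asset_path(releases_dir, release_name, asset_name):
    # New layout: one subfolder per release, named after the release.
    return os.path.join(releases_dir, release_name, asset_name)


# Hypothetical releases that reuse the same asset file name.
releases = [
    {"name": "v0.7.1", "assets": ["MacDown.app.zip"]},
    {"name": "v0.7.2", "assets": ["MacDown.app.zip"]},
]

flat_paths = {
    flat_asset_path("releases", asset)
    for release in releases
    for asset in release["assets"]
}
nested_paths = {
    per_release_asset_path("releases", release["name"], asset)
    for release in releases
    for asset in release["assets"]
}
```

In this sketch `flat_paths` collapses to a single path, so the second download would overwrite the first, while `nested_paths` keeps one distinct path per release.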

This change also checks whether the asset file already exists on disk and, if so, skips downloading it. This drastically speeds up additional syncs, as the script no longer downloads every single release every single time. It now downloads only new releases, which I believe is the expected behavior.
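The skip-if-exists behavior can be sketched in isolation as follows. This is a simplified stand-in, not the script's `download_file`: `download_if_missing` and `fake_fetch` are hypothetical names, and the network request is replaced by a callable that returns bytes.

```python
import os
import tempfile


def download_if_missing(path, fetch):
    """Write fetch() to path unless path already exists.

    Returns True if the file was written, False if it was skipped.
    """
    if os.path.exists(path):
        return False
    with open(path, "wb") as f:
        f.write(fetch())
    return True


calls = []


def fake_fetch():
    # Stand-in for the HTTP download; records each invocation.
    calls.append(1)
    return b"asset bytes"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "MacDown.app.zip")
    first = download_if_missing(target, fake_fetch)   # file is absent, so it is fetched
    second = download_if_missing(target, fake_fetch)  # file exists, so it is skipped
```

One consequence of checking only for existence is that a file left behind by an interrupted download would also be treated as already present.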

closes https://github.com/josegonzalez/python-github-backup/issues/126
Ben Baron
2020-01-06 12:40:47 -05:00
parent fac8e4274f
commit 869f761c90


@@ -565,6 +565,10 @@ class S3HTTPRedirectHandler(HTTPRedirectHandler):
 def download_file(url, path, auth):
+    # Skip downloading release assets if they already exist on disk so we don't redownload on every sync
+    if os.path.exists(path):
+        return
     request = Request(url)
     request.add_header('Accept', 'application/octet-stream')
     request.add_header('Authorization', 'Basic '.encode('ascii') + auth)
@@ -958,8 +962,14 @@ def backup_releases(args, repo_cwd, repository, repos_template, include_assets=F
         if include_assets:
             assets = retrieve_data(args, release['assets_url'])
             if len(assets) > 0:
+                # give release asset files somewhere to live
+                release_assets_cwd = os.path.join(release_cwd, release_name)
+                mkdir_p(release_assets_cwd)
                 # download any release asset files (not including source archives)
                 for asset in assets:
-                    download_file(asset['url'], os.path.join(release_cwd, asset['name']), get_auth(args))
+                    download_file(asset['url'], os.path.join(release_assets_cwd, asset['name']), get_auth(args))
 def fetch_repository(name,