Compare commits

28 Commits

Author SHA1 Message Date
Jose Diaz-Gonzalez
196acd0aca Release version 0.28.0 2020-02-03 11:41:34 -05:00
Jose Diaz-Gonzalez
679ac841f6 Merge pull request #143 from smiley/patch-1
Remove deprecated (and removed) "git lfs fetch" flags
2020-02-03 11:41:10 -05:00
Jose Diaz-Gonzalez
498d9eba32 Release version 0.27.0 2020-01-21 21:29:44 -05:00
Jose Diaz-Gonzalez
0f82b1717c Merge pull request #142 from einsteinx2/issue/141-import-error-version
Fixed script fails if not installed from pip
2020-01-21 21:28:22 -05:00
Ben Baron
4d5126f303 Fixed script fails if not installed from pip
At the top of the script, the line from github_backup import __version__ gets the script's version number to use if the script is called with the -v or --version flags. The problem is that if the script hasn't been installed via pip (for example I cloned the repo directly to my backup server), the script will fail due to an import exception.

Also presumably it will always use the version number from pip even if running a modified version from git or a fork or something, though this does not fix that as I have no idea how to check if it's running the pip installed version or not. But at least the script will now work fine if cloned from git or just copied to another machine.

closes https://github.com/josegonzalez/python-github-backup/issues/141
2020-01-21 21:15:57 -05:00
smiley
b864218b44 Remove deprecated (and removed) git lfs flags
"--tags" and "--force" were removed at some point from "git lfs fetch". This broke our backup script.
2020-01-20 15:40:52 +02:00
Jose Diaz-Gonzalez
98919c82c9 Merge pull request #136 from einsteinx2/issue/88-macos-keychain-broken-python3
Fixed macOS keychain access when using Python 3
2020-01-07 11:44:36 -05:00
Jose Diaz-Gonzalez
045eacbf18 Merge pull request #137 from einsteinx2/issue/134-only-use-auth-token-when-needed
Public repos no longer include the auth token
2020-01-07 11:44:23 -05:00
Jose Diaz-Gonzalez
7a234ba7ed Merge pull request #130 from einsteinx2/issue/129-fix-crash-on-release-asset-download-error
Crash when an release asset doesn't exist
2020-01-07 11:44:00 -05:00
Ben Baron
e8a255b450 Public repos no longer include the auth token
When backing up repositories using an auth token and https, the GitHub personal auth token is leaked in each backed up repository. It is included in the URL of each repository's git remote url.

This is not needed as they are public and can be accessed without the token and can cause issues in the future if the token is ever changed, so I think it makes more sense not to have the token stored in each repo backup. I think the token should only be "leaked" like this out of necessity, e.g. it's a private repository and the --prefer-ssh option was not chosen so https with auth token was required to perform the clone.
2020-01-06 21:25:54 -05:00
Ben Baron
81a2f762da Fixed macOS keychain access when using Python 3
Python 3 is returning bytes rather than a string, so the string concatenation to create the auth variable was throwing an exception which the script was interpreting to mean it couldn't find the password. Adding a conversion to string first fixed the issue.
2020-01-06 21:10:50 -05:00
Ben Baron
cb0293cbe5 Fixed comment typo 2020-01-06 14:15:41 -05:00
Jose Diaz-Gonzalez
252c25461f Merge pull request #132 from einsteinx2/issue/126-prevent-overwriting-release-assets
Separate release assets and skip re-downloading
2020-01-06 13:12:33 -05:00
Jose Diaz-Gonzalez
e8ed03fd06 Merge pull request #131 from einsteinx2/improve-gitignore
Improved gitignore, macOS files and IDE configs
2020-01-06 13:11:06 -05:00
Ben Baron
38010d7c39 Switched log_info to log_warning in download_file 2020-01-06 13:06:22 -05:00
Ben Baron
71b4288e6b Added newline to end of file 2020-01-06 13:04:40 -05:00
Ben Baron
ba4fa9fa2d Moved asset downloading loop inside the if block 2020-01-06 12:50:33 -05:00
Ben Baron
869f761c90 Separate release assets and skip re-downloading
Currently the script puts all release assets into the same folder called `releases`. So any time 2 release files have the same name, only the last one downloaded is actually saved. A particularly bad example of this is MacDownApp/macdown where all of their releases are named `MacDown.app.zip`. So even though they have 36 releases and all 36 are downloaded, only the last one is actually saved.

With this change, each release's assets are now stored in a subfolder inside `releases` named after the release name. There could still be edge cases if two releases have the same name, but this is still much safer than the previous behavior.

This change also now checks if the asset file already exists on disk and skips downloading it. This drastically speeds up additional syncs as it no longer downloads every single release every single time. It will now only download new releases which I believe is the expected behavior.

closes https://github.com/josegonzalez/python-github-backup/issues/126
2020-01-06 12:40:47 -05:00
Ben Baron
195e700128 Improved gitignore, macOS files and IDE configs
Ignores the annoying hidden macOS files .DS_Store and ._* as well as the IDE configuration folders for contributors using the popular Visual Studio Code and Atom IDEs (more can be added later as needed).
2020-01-06 11:26:06 -05:00
Ben Baron
27441b71b6 Crash when an release asset doesn't exist
Currently, the script crashes whenever a release asset is unable to download (for example a 404 response). This change instead logs the failure and allows the script to continue. No retry logic is enabled, but at least it prevents the crash and allows the backup to complete. Retry logic can be implemented later if wanted.

closes https://github.com/josegonzalez/python-github-backup/issues/129
2020-01-06 11:13:25 -05:00
Jose Diaz-Gonzalez
cfeaee7309 Update ISSUE_TEMPLATE.md 2020-01-06 10:20:07 -05:00
Jose Diaz-Gonzalez
fac8e4274f Release version 0.26.0 2019-09-23 11:45:01 -04:00
Jose Diaz-Gonzalez
17fee66f31 Merge pull request #128 from Snawoot/master
Workaround gist clone in `--prefer-ssh` mode
2019-09-23 11:44:21 -04:00
Vladislav Yarmak
a56d27dd8b workaround gist clone in --prefer-ssh mode 2019-09-21 19:22:27 +03:00
Jose Diaz-Gonzalez
e57873b6dd Create PULL_REQUEST.md 2019-08-14 17:51:19 -04:00
Jose Diaz-Gonzalez
2658b039a1 Create ISSUE_TEMPLATE.md 2019-08-14 17:47:47 -04:00
Jose Diaz-Gonzalez
fd684a71fb Update README.rst 2019-07-11 13:40:25 -07:00
Jose Diaz-Gonzalez
bacd77030b Update README.rst 2019-07-11 13:39:41 -07:00
7 changed files with 130 additions and 19 deletions
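
The file-name collision described in commit 869f761c90 can be illustrated with a small sketch. This is not the project's code: `flat_asset_path` and `per_release_asset_path` are hypothetical helpers, and the release names are invented for the example.

```python
import os


def flat_asset_path(releases_dir, asset_name):
    # Old layout: every asset lands directly in the shared `releases` folder.
    return os.path.join(releases_dir, asset_name)


def per_release_asset_path(releases_dir, release_name, asset_name):
    # New layout: one subfolder per release, named after the release.
    return os.path.join(releases_dir, release_name, asset_name)


# Hypothetical releases that all ship an asset with the same file name.
releases = [("v0.1", "MacDown.app.zip"), ("v0.2", "MacDown.app.zip")]

flat_paths = {flat_asset_path("releases", asset) for _, asset in releases}
nested_paths = {
    per_release_asset_path("releases", name, asset) for name, asset in releases
}

# In the flat layout both assets share one path, so the later download
# overwrites the earlier one; in the nested layout each keeps its own path.
print(len(flat_paths), len(nested_paths))
```

As the commit message notes, two releases with the same name would still collide under the nested layout.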

9
.gitignore vendored

@@ -25,3 +25,12 @@ doc/_build
 # Generated man page
 doc/aws_hostname.1
+# Annoying macOS files
+.DS_Store
+._*
+# IDE configuration files
+.vscode
+.atom

View File

@@ -1,9 +1,63 @@
 Changelog
 =========
-0.25.0 (2019-07-03)
+0.28.0 (2020-02-03)
 -------------------
 ------------------------
+- Remove deprecated (and removed) git lfs flags. [smiley]
+"--tags" and "--force" were removed at some point from "git lfs fetch". This broke our backup script.
+0.27.0 (2020-01-22)
+-------------------
+- Fixed script fails if not installed from pip. [Ben Baron]
+At the top of the script, the line from github_backup import __version__ gets the script's version number to use if the script is called with the -v or --version flags. The problem is that if the script hasn't been installed via pip (for example I cloned the repo directly to my backup server), the script will fail due to an import exception.
+Also presumably it will always use the version number from pip even if running a modified version from git or a fork or something, though this does not fix that as I have no idea how to check if it's running the pip installed version or not. But at least the script will now work fine if cloned from git or just copied to another machine.
+closes https://github.com/josegonzalez/python-github-backup/issues/141
+- Fixed macOS keychain access when using Python 3. [Ben Baron]
+Python 3 is returning bytes rather than a string, so the string concatenation to create the auth variable was throwing an exception which the script was interpreting to mean it couldn't find the password. Adding a conversion to string first fixed the issue.
+- Public repos no longer include the auth token. [Ben Baron]
+When backing up repositories using an auth token and https, the GitHub personal auth token is leaked in each backed up repository. It is included in the URL of each repository's git remote url.
+This is not needed as they are public and can be accessed without the token and can cause issues in the future if the token is ever changed, so I think it makes more sense not to have the token stored in each repo backup. I think the token should only be "leaked" like this out of necessity, e.g. it's a private repository and the --prefer-ssh option was not chosen so https with auth token was required to perform the clone.
+- Fixed comment typo. [Ben Baron]
+- Switched log_info to log_warning in download_file. [Ben Baron]
+- Crash when an release asset doesn't exist. [Ben Baron]
+Currently, the script crashes whenever a release asset is unable to download (for example a 404 response). This change instead logs the failure and allows the script to continue. No retry logic is enabled, but at least it prevents the crash and allows the backup to complete. Retry logic can be implemented later if wanted.
+closes https://github.com/josegonzalez/python-github-backup/issues/129
+- Moved asset downloading loop inside the if block. [Ben Baron]
+- Separate release assets and skip re-downloading. [Ben Baron]
+Currently the script puts all release assets into the same folder called `releases`. So any time 2 release files have the same name, only the last one downloaded is actually saved. A particularly bad example of this is MacDownApp/macdown where all of their releases are named `MacDown.app.zip`. So even though they have 36 releases and all 36 are downloaded, only the last one is actually saved.
+With this change, each release's assets are now stored in a subfolder inside `releases` named after the release name. There could still be edge cases if two releases have the same name, but this is still much safer than the previous behavior.
+This change also now checks if the asset file already exists on disk and skips downloading it. This drastically speeds up additional syncs as it no longer downloads every single release every single time. It will now only download new releases which I believe is the expected behavior.
+closes https://github.com/josegonzalez/python-github-backup/issues/126
+- Added newline to end of file. [Ben Baron]
+- Improved gitignore, macOS files and IDE configs. [Ben Baron]
+Ignores the annoying hidden macOS files .DS_Store and ._* as well as the IDE configuration folders for contributors using the popular Visual Studio Code and Atom IDEs (more can be added later as needed).
+0.26.0 (2019-09-23)
+-------------------
+- Workaround gist clone in `--prefer-ssh` mode. [Vladislav Yarmak]
+- Create PULL_REQUEST.md. [Jose Diaz-Gonzalez]
+- Create ISSUE_TEMPLATE.md. [Jose Diaz-Gonzalez]
+0.25.0 (2019-07-03)
+-------------------
 - Issue 119: Change retrieve_data to be a generator. [2a]
 See issue #119.
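
The "Public repos no longer include the auth token" entry above can be sketched roughly as follows. This is a simplified illustration, not the script's actual function: `build_clone_url`, its parameters, and the use of a `full_name` key are assumptions made for the example.

```python
def build_clone_url(repository, host="github.com", token=None, prefer_ssh=False):
    # Gists are checked first, then the SSH preference.
    if repository.get("is_gist"):
        return repository["git_pull_url"]
    if prefer_ssh:
        return repository["ssh_url"]
    # Embed the token only when the repository is private; a public
    # repository can be cloned over HTTPS without credentials.
    if token and repository.get("private"):
        return "https://{0}@{1}/{2}.git".format(token, host, repository["full_name"])
    return "https://{0}/{1}.git".format(host, repository["full_name"])


# Hypothetical repository records, shaped loosely like API responses.
public_repo = {"full_name": "octocat/hello", "private": False}
private_repo = {"full_name": "octocat/secret", "private": True}

print(build_clone_url(public_repo, token="TOKEN"))
print(build_clone_url(private_repo, token="TOKEN"))
```

The point of the design is that the remote URL stored in each public backup stays valid if the token is later rotated or revoked.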

13
ISSUE_TEMPLATE.md Normal file

@@ -0,0 +1,13 @@
+# Important notice regarding filed issues
+This project already fills my needs, and as such I have no real reason to continue its development. This project is otherwise provided as is, and no support is given.
+If pull requests implementing bug fixes or enhancements are pushed, I am happy to review and merge them (time permitting).
+If you wish to have a bug fixed, you have a few options:
+- Fix it yourself and file a pull request.
+- File a bug and hope someone else fixes it for you.
+- Pay me to fix it (my rate is $200 an hour, minimum 1 hour, contact me via my [github email address](https://github.com/josegonzalez) if you want to go this route).
+In all cases, feel free to file an issue, as it may be of help to others in the future.

7
PULL_REQUEST.md Normal file

@@ -0,0 +1,7 @@
+# Important notice regarding filed pull requests
+This project already fills my needs, and as such I have no real reason to continue its development. This project is otherwise provided as is, and no support is given.
+I will attempt to review pull requests at _my_ earliest convenience. If I am unable to get to your pull request in a timely fashion, it is what it is. This repository does not pay any bills, and I am not required to merge any pull request from any individual.
+If you wish to jump my personal priority queue, you may pay me for my time to review. My rate is $200 an hour - minimum 1 hour - feel free to contact me via my github email address if you want to go this route.

View File

@@ -4,6 +4,8 @@ github-backup
 |PyPI| |Python Versions|
+This project is considered feature complete for the primary maintainer. If you would like a bugfix or enhancement and cannot sponsor the work, pull requests are welcome. Feel free to contact the maintainer for consulting estimates if desired.
 backup a github user or organization
 Requirements

View File

@@ -41,7 +41,11 @@ except ImportError:
     from urllib2 import HTTPRedirectHandler
     from urllib2 import build_opener
-from github_backup import __version__
+try:
+    from github_backup import __version__
+    VERSION = __version__
+except ImportError:
+    VERSION = 'unknown'
 FNULL = open(os.devnull, 'w')
@@ -302,7 +306,7 @@ def parse_args():
                         help='Clone repositories using SSH instead of HTTPS')
     parser.add_argument('-v', '--version',
                         action='version',
-                        version='%(prog)s ' + __version__)
+                        version='%(prog)s ' + VERSION)
     parser.add_argument('--keychain-name',
                         dest='osx_keychain_item_name',
                         help='OSX ONLY: name field of password item in OSX keychain that holds the personal access or OAuth token')
@@ -337,6 +341,8 @@ def get_auth(args, encode=True):
                     '-s', args.osx_keychain_item_name,
                     '-a', args.osx_keychain_item_account,
                     '-w'], stderr=devnull).strip())
+                if not PY2:
+                    token = token.decode('utf-8')
                 auth = token + ':' + 'x-oauth-basic'
         except:
             log_error('No password item matching the provided name and account could be found in the osx keychain.')
@@ -387,14 +393,14 @@ def get_github_host(args):
 def get_github_repo_url(args, repository):
-    if args.prefer_ssh:
-        return repository['ssh_url']
     if repository.get('is_gist'):
         return repository['git_pull_url']
+    if args.prefer_ssh:
+        return repository['ssh_url']
     auth = get_auth(args, False)
-    if auth:
+    if auth and repository['private'] == True:
         repo_url = 'https://{0}@{1}/{2}/{3}.git'.format(
             auth,
             get_github_host(args),
@@ -565,19 +571,35 @@ class S3HTTPRedirectHandler(HTTPRedirectHandler):
 def download_file(url, path, auth):
+    # Skip downloading release assets if they already exist on disk so we don't redownload on every sync
+    if os.path.exists(path):
+        return
     request = Request(url)
     request.add_header('Accept', 'application/octet-stream')
     request.add_header('Authorization', 'Basic '.encode('ascii') + auth)
     opener = build_opener(S3HTTPRedirectHandler)
-    response = opener.open(request)
-    chunk_size = 16 * 1024
-    with open(path, 'wb') as f:
-        while True:
-            chunk = response.read(chunk_size)
-            if not chunk:
-                break
-            f.write(chunk)
+    try:
+        response = opener.open(request)
+        chunk_size = 16 * 1024
+        with open(path, 'wb') as f:
+            while True:
+                chunk = response.read(chunk_size)
+                if not chunk:
+                    break
+                f.write(chunk)
+    except HTTPError as exc:
+        # Gracefully handle 404 responses (and others) when downloading from S3
+        log_warning('Skipping download of asset {0} due to HTTPError: {1}'.format(url, exc.reason))
+    except URLError as e:
+        # Gracefully handle other URL errors
+        log_warning('Skipping download of asset {0} due to URLError: {1}'.format(url, e.reason))
+    except socket.error as e:
+        # Gracefully handle socket errors
+        # TODO: Implement retry logic
+        log_warning('Skipping download of asset {0} due to socket error: {1}'.format(url, e.strerror))
 def get_authenticated_user(args):
@@ -958,8 +980,12 @@ def backup_releases(args, repo_cwd, repository, repos_template, include_assets=F
         if include_assets:
             assets = retrieve_data(args, release['assets_url'])
-            for asset in assets:
-                download_file(asset['url'], os.path.join(release_cwd, asset['name']), get_auth(args))
+            if len(assets) > 0:
+                # give release asset files somewhere to live & download them (not including source archives)
+                release_assets_cwd = os.path.join(release_cwd, release_name)
+                mkdir_p(release_assets_cwd)
+                for asset in assets:
+                    download_file(asset['url'], os.path.join(release_assets_cwd, asset['name']), get_auth(args))
 def fetch_repository(name,
@@ -1010,7 +1036,7 @@ def fetch_repository(name,
         logging_subprocess(git_command, None, cwd=local_dir)
         if lfs_clone:
-            git_command = ['git', 'lfs', 'fetch', '--all', '--force', '--tags', '--prune']
+            git_command = ['git', 'lfs', 'fetch', '--all', '--prune']
         else:
             git_command = ['git', 'fetch', '--all', '--force', '--tags', '--prune']
         logging_subprocess(git_command, None, cwd=local_dir)
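
The skip-if-present and log-and-continue behaviour added to `download_file` can be sketched in a Python 3 only form. This is an illustration under assumptions, not the script's code: `fetch_asset` and its return values are hypothetical, authentication and the S3 redirect handler are left out, and writing to a temporary `.part` file is an extra precaution that the diff above does not include.

```python
import os
import shutil
import urllib.request


def fetch_asset(url, path):
    """Return 'skipped', 'downloaded' or 'failed' instead of raising."""
    # Skip assets that are already on disk so repeated syncs stay cheap.
    if os.path.exists(path):
        return "skipped"
    tmp_path = path + ".part"
    try:
        with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as f:
            shutil.copyfileobj(response, f, 16 * 1024)
    except OSError as exc:
        # In Python 3, HTTPError, URLError and socket.error are all
        # subclasses of OSError, so one handler covers the three cases.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        print("Skipping download of asset {0}: {1}".format(url, exc))
        return "failed"
    # Only a complete download is moved to the final path, so a partial
    # file is never mistaken for an existing asset on the next run.
    os.replace(tmp_path, path)
    return "downloaded"
```

A caller looping over assets can ignore the return value and simply continue, which matches the intent stated in the commit message: one missing asset no longer aborts the whole backup.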

View File

@@ -1 +1 @@
-__version__ = '0.25.0'
+__version__ = '0.28.0'
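
The import fallback from the 0.27.0 fix can be tried in isolation with a short sketch. The module name `package_that_is_not_installed` is a deliberate placeholder so that the `ImportError` branch runs; the `prog` value is likewise chosen only for the example.

```python
import argparse

try:
    # Placeholder module name (an assumption for this sketch): it is not
    # expected to be importable, which exercises the fallback branch.
    from package_that_is_not_installed import __version__ as VERSION
except ImportError:
    VERSION = "unknown"

parser = argparse.ArgumentParser(prog="github-backup")
parser.add_argument("-v", "--version",
                    action="version",
                    version="%(prog)s " + VERSION)
```

With this arrangement the `--version` flag still works when the script is run from a plain checkout, at the cost of reporting `unknown` instead of a real version number.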