Compare commits

..

64 Commits

Author SHA1 Message Date
Jose Diaz-Gonzalez
f157ea107f Release version 0.14.0 2017-10-11 11:52:16 -04:00
Jose Diaz-Gonzalez
a129cc759a Merge pull request #68 from pieterclaerhout/master
Added support for LFS clones
2017-10-11 11:51:57 -04:00
pieterclaerhout
bb551a83f4 Updated the readme 2017-10-11 15:14:13 +02:00
pieterclaerhout
9b1b4a9ebc Added a check to see if git-lfs is installed when doing an LFS clone 2017-10-11 15:11:14 +02:00
pieterclaerhout
e6b6eb8bef Added support for LFS clones 2017-10-10 19:52:07 +02:00
Jose Diaz-Gonzalez
0b3f120e2b Merge pull request #66 from albertyw/python3
Explicitly support python 3
2017-09-30 21:01:27 -04:00
Albert Wang
990249b80b Add pypi info to readme 2017-09-30 17:16:38 -07:00
Albert Wang
cefb226545 Explicitly support python 3 in package description 2017-09-30 17:13:47 -07:00
Jose Diaz-Gonzalez
ea22ffdf26 Merge pull request #65 from mumblez/master
add couple examples to help new users
2017-05-30 11:58:42 -06:00
Yusuf Tran
0f21d7b8a4 add couple examples to help new users 2017-05-30 18:52:11 +01:00
Jose Diaz-Gonzalez
cb33b9bab7 Release version 0.13.2 2017-05-06 14:14:08 -06:00
Jose Diaz-Gonzalez
68c48cb0b3 Merge pull request #64 from karlicoss/fix-remotes
Fix remotes while updating repository
2017-05-06 14:13:46 -06:00
Dima Gerasimov
922a3c5a6e Fix remotes while updating repository 2017-05-06 12:58:42 +01:00
Jose Diaz-Gonzalez
d4055eb99c Release version 0.13.1 2017-04-11 09:40:13 -06:00
Jose Diaz-Gonzalez
d8a330559c Merge pull request #61 from McNetic/fix_empty_updated_at
Fix error when repository has no updated_at value
2017-04-11 09:37:15 -06:00
Nicolai Ehemann
de93824498 Fix error when repository has no updated_at value 2017-04-11 11:10:03 +02:00
Jose Diaz-Gonzalez
2efeaa7580 Release version 0.13.0 2017-04-05 11:49:49 -04:00
Jose Diaz-Gonzalez
647810a2f0 Merge pull request #59 from martintoreilly/master
Add support for storing PAT in OSX keychain
2017-04-05 11:49:24 -04:00
Martin O'Reilly
0dfe5c342a Add OS check for OSX specific keychain args
Keychain arguments are only supported on Mac OSX.
Added check for operating system so we give a
"Keychain arguments are only supported on Mac OSX"
error message rather than a "No password item matching the
provided name and account could be found in the osx keychain"
error message
2017-04-05 16:36:52 +01:00
Martin O'Reilly
1d6e1abab1 Add support for storing PAT in OSX keychain
Added additional optional arguments and README guidance for storing
and accessing a Github personal access token (PAT) in the OSX
keychain
2017-04-05 15:17:52 +01:00
Jose Diaz-Gonzalez
dd2b96b172 Release version 0.12.1 2017-03-27 14:55:11 -06:00
Jose Diaz-Gonzalez
7a589f1e63 Merge pull request #57 from acdha/reuse-existing-remotes
Avoid remote branch name churn
2017-03-27 14:54:02 -06:00
Chris Adams
92c619cd01 Avoid remote branch name churn
This avoids the backup output having lots of "[new branch]" messages
because removing the old remote name removed all of the existing branch
references.
2017-03-27 16:26:19 -04:00
Jose Diaz-Gonzalez
9a91dd7733 Merge pull request #55 from amaczuga/master
Fix detection of bare git directories
2016-11-22 13:36:52 -07:00
Andrzej Maczuga
6592bd8196 Fix detection of bare git directories 2016-11-22 20:11:26 +00:00
Jose Diaz-Gonzalez
e9e3b18512 Release version 0.12.0 2016-11-22 10:56:56 -07:00
Jose Diaz-Gonzalez
88148b4c95 pep8: E501 line too long (83 > 79 characters) 2016-11-22 10:55:37 -07:00
Jose Diaz-Gonzalez
8448add464 pep8: E128 continuation line under-indented for visual indent 2016-11-22 10:51:04 -07:00
Jose Diaz-Gonzalez
5b30b7ebdd fix: properly import version from github_backup package 2016-11-22 10:49:18 -07:00
Jose Diaz-Gonzalez
c3a17710d3 fix: support alternate git status output 2016-11-22 10:48:07 -07:00
Jose Diaz-Gonzalez
4462412ec7 Merge pull request #54 from amaczuga/master
Support archivization using bare git clones
2016-11-22 09:44:54 -07:00
Andrzej Maczuga
8d61538e5e Support archivization using bare git clones 2016-11-22 13:07:52 +00:00
Jose Diaz-Gonzalez
4d37ad206f Merge pull request #53 from trel/master
fix typo, 3x
2016-11-18 15:35:53 -05:00
Terrell Russell
1f983863fc fix typo, 3x 2016-11-18 15:17:42 -05:00
Jose Diaz-Gonzalez
f0b28567b9 Release version 0.11.0 2016-10-26 14:14:00 -06:00
Jose Diaz-Gonzalez
77ede50b19 Merge pull request #52 from bjodah/fix-gh-51
Support --token file:///home/user/token.txt (fixes gh-51)
2016-10-26 14:13:35 -06:00
Björn Dahlgren
97e4fbbacb Support --token file:///home/user/token.txt (fixes gh-51) 2016-10-26 01:57:33 +02:00
Jose Diaz-Gonzalez
03604cc654 Merge pull request #48 from albertyw/python3
Support Python 3
2016-10-25 17:38:05 -06:00
Albert Wang
73a62fdee1 Fix some linting 2016-09-11 01:14:36 -07:00
Albert Wang
94e1d62ad5 Fix byte/string conversion for python 3 2016-09-11 01:14:31 -07:00
Albert Wang
54cef11ce7 Support python 3 2016-09-11 01:14:19 -07:00
Jose Diaz-Gonzalez
56397eba1c Merge pull request #46 from remram44/encode-password
Encode special characters in password
2016-09-06 14:31:35 -04:00
Remi Rampin
9f861efccf Encode special characters in password 2016-09-06 14:27:47 -04:00
Jose Diaz-Gonzalez
c1c9ce6dca Merge pull request #45 from remram44/cli-programname
Fix program name
2016-09-06 12:42:45 -04:00
Jose Diaz-Gonzalez
ab18d8aee0 Merge pull request #44 from remram44/readme-git-https
Don't install over insecure connection
2016-09-06 12:42:30 -04:00
Remi Rampin
9d7d98b19e Update README.rst 2016-09-06 12:28:44 -04:00
Remi Rampin
0233bff696 Don't pretend program name is "Github Backup" 2016-09-06 12:24:51 -04:00
Remi Rampin
6154ceda15 Don't install over insecure connection
The git:// protocol is unauthenticated and unencrypted, and no longer advertised by GitHub. Using HTTPS shouldn't impact performance.
2016-09-06 12:11:29 -04:00
Jose Diaz-Gonzalez
9023052e9c Release version 0.10.3 2016-08-20 20:50:29 -04:00
Jose Diaz-Gonzalez
874c235ba5 Merge pull request #30 from jonasrmichel/master
Fixes #29
2016-08-20 20:50:25 -04:00
Jose Diaz-Gonzalez
b7b234d8a5 Release version 0.10.2 2016-08-20 20:49:46 -04:00
Jose Diaz-Gonzalez
ed160eb0ca Add a note regarding git version requirement
Closes #37
2016-08-20 20:49:42 -04:00
Jose Diaz-Gonzalez
1d11d62b73 Release version 0.10.1 2016-08-20 20:45:27 -04:00
Jose Diaz-Gonzalez
9e1cba9817 Release version 0.10.0 2016-08-18 14:20:46 -04:00
Jose Diaz-Gonzalez
3859a80b7a Merge pull request #42 from robertwb/master
Implement incremental updates
2016-08-18 14:20:05 -04:00
Robert Bradshaw
8c12d54898 Implement incremental updates
Guarded with an --incremental flag.

Stores the time of the last update and only downloads issue and
pull request data since this time.  All other data is relatively
small (likely fetched with a single request) and so is simply
re-populated from scratch as before.
2016-08-17 21:31:59 -07:00
Jose Diaz-Gonzalez
b6b6605acd Release version 0.9.0 2016-03-29 13:23:45 -04:00
Jose Diaz-Gonzalez
ff5e0aa89c Merge pull request #36 from zlabjp/fix-cloning-private-repos
Fix cloning private repos with basic auth or token
2016-03-29 13:21:57 -04:00
Kazuki Suda
79726c360d Fix cloning private repos with basic auth or token 2016-03-29 15:23:54 +09:00
Jose Diaz-Gonzalez
a511bb2b49 Release version 0.8.0 2016-02-14 16:04:54 -05:00
Jose Diaz-Gonzalez
aedf9b2c66 Merge pull request #35 from eht16/issue23_store_pullrequests_once
Don't store issues which are actually pull requests
2016-02-14 16:02:18 -05:00
Enrico Tröger
b9e35a50f5 Don't store issues which are actually pull requests
This prevents storing pull requests twice since the Github API returns
pull requests also as issues. Those issues will be skipped but only if
retrieving pull requests is requested as well.
Closes #23.
2016-02-14 16:36:40 +01:00
Jonas Michel
1e5a90486c Fixes #29
Reporting an error when the user's rate limit is exceeded causes
the script to terminate after resuming execution from a rate limit
sleep. Instead of generating an explicit error we just want to
inform the user that the script is going to sleep until their rate
limit count resets.
2016-01-20 14:48:02 -06:00
Jonas Michel
9b74aff20b Fixes #29
The errors list was not being cleared out after resuming a backup
from a rate limit sleep. When the backup was resumed, the non-empty
errors list caused the backup to quit after the next `retrieve_data`
request.
2016-01-17 11:10:28 -06:00
6 changed files with 474 additions and 75 deletions

View File

@@ -1,6 +1,161 @@
Changelog
=========
0.14.0 (2017-10-11)
-------------------
- Added a check to see if git-lfs is installed when doing an LFS clone.
[pieterclaerhout]
- Added support for LFS clones. [pieterclaerhout]
- Add pypi info to readme. [Albert Wang]
- Explicitly support python 3 in package description. [Albert Wang]
- Add couple examples to help new users. [Yusuf Tran]
0.13.2 (2017-05-06)
-------------------
- Fix remotes while updating repository. [Dima Gerasimov]
0.13.1 (2017-04-11)
-------------------
- Fix error when repository has no updated_at value. [Nicolai Ehemann]
0.13.0 (2017-04-05)
-------------------
- Add OS check for OSX specific keychain args. [Martin O'Reilly]
Keychain arguments are only supported on Mac OSX.
Added check for operating system so we give a
"Keychain arguments are only supported on Mac OSX"
error message rather than a "No password item matching the
provided name and account could be found in the osx keychain"
error message
- Add support for storing PAT in OSX keychain. [Martin O'Reilly]
Added additional optional arguments and README guidance for storing
and accessing a Github personal access token (PAT) in the OSX
keychain
0.12.1 (2017-03-27)
-------------------
- Avoid remote branch name churn. [Chris Adams]
This avoids the backup output having lots of "[new branch]" messages
because removing the old remote name removed all of the existing branch
references.
- Fix detection of bare git directories. [Andrzej Maczuga]
0.12.0 (2016-11-22)
-------------------
Fix
~~~
- Properly import version from github_backup package. [Jose Diaz-
Gonzalez]
- Support alternate git status output. [Jose Diaz-Gonzalez]
Other
~~~~~
- Pep8: E501 line too long (83 > 79 characters) [Jose Diaz-Gonzalez]
- Pep8: E128 continuation line under-indented for visual indent. [Jose
Diaz-Gonzalez]
- Support archivization using bare git clones. [Andrzej Maczuga]
- Fix typo, 3x. [Terrell Russell]
0.11.0 (2016-10-26)
-------------------
- Support --token file:///home/user/token.txt (fixes gh-51) [Björn
Dahlgren]
- Fix some linting. [Albert Wang]
- Fix byte/string conversion for python 3. [Albert Wang]
- Support python 3. [Albert Wang]
- Encode special characters in password. [Remi Rampin]
- Don't pretend program name is "Github Backup" [Remi Rampin]
- Don't install over insecure connection. [Remi Rampin]
The git:// protocol is unauthenticated and unencrypted, and no longer advertised by GitHub. Using HTTPS shouldn't impact performance.
0.10.3 (2016-08-21)
-------------------
- Fixes #29. [Jonas Michel]
Reporting an error when the user's rate limit is exceeded causes
the script to terminate after resuming execution from a rate limit
sleep. Instead of generating an explicit error we just want to
inform the user that the script is going to sleep until their rate
limit count resets.
- Fixes #29. [Jonas Michel]
The errors list was not being cleared out after resuming a backup
from a rate limit sleep. When the backup was resumed, the non-empty
errors list caused the backup to quit after the next `retrieve_data`
request.
0.10.2 (2016-08-21)
-------------------
- Add a note regarding git version requirement. [Jose Diaz-Gonzalez]
Closes #37
0.10.0 (2016-08-18)
-------------------
- Implement incremental updates. [Robert Bradshaw]
Guarded with an --incremental flag.
Stores the time of the last update and only downloads issue and
pull request data since this time. All other data is relatively
small (likely fetched with a single request) and so is simply
re-populated from scratch as before.
0.9.0 (2016-03-29)
------------------
- Fix cloning private repos with basic auth or token. [Kazuki Suda]
0.8.0 (2016-02-14)
------------------
- Don't store issues which are actually pull requests. [Enrico Tröger]
This prevents storing pull requests twice since the Github API returns
pull requests also as issues. Those issues will be skipped but only if
retrieving pull requests is requested as well.
Closes #23.
0.7.0 (2016-02-02)
------------------

View File

@@ -2,8 +2,15 @@
github-backup
=============
|PyPI| |Python Versions|
backup a github user or organization
Requirements
============
- GIT 1.9+
Installation
============
@@ -13,21 +20,24 @@ Using PIP via PyPI::
Using PIP via Github::
pip install git+git://github.com/josegonzalez/python-github-backup.git#egg=github-backup
pip install git+https://github.com/josegonzalez/python-github-backup.git#egg=github-backup
Usage
=====
CLI Usage is as follows::
Github Backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN]
[-o OUTPUT_DIRECTORY] [--starred] [--watched] [--all]
[--issues] [--issue-comments] [--issue-events] [--pulls]
[--pull-comments] [--pull-commits] [--labels] [--hooks]
[--milestones] [--repositories] [--wikis]
[--skip-existing] [-L [LANGUAGES [LANGUAGES ...]]]
[-N NAME_REGEX] [-H GITHUB_HOST] [-O] [-R REPOSITORY]
[-P] [-F] [--prefer-ssh] [-v]
github-backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN]
[-o OUTPUT_DIRECTORY] [-i] [--starred] [--watched]
[--all] [--issues] [--issue-comments] [--issue-events]
[--pulls] [--pull-comments] [--pull-commits] [--labels]
[--hooks] [--milestones] [--repositories] [--bare] [--lfs]
[--wikis] [--skip-existing]
[-L [LANGUAGES [LANGUAGES ...]]] [-N NAME_REGEX]
[-H GITHUB_HOST] [-O] [-R REPOSITORY] [-P] [-F]
[--prefer-ssh] [-v]
[--keychain-name OSX_KEYCHAIN_ITEM_NAME]
[--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT]
USER
Backup a github account
@@ -46,6 +56,7 @@ CLI Usage is as follows::
personal access or OAuth token
-o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
directory at which to backup the repositories
-i, --incremental incremental backup
--starred include starred repositories in backup
--watched include watched repositories in backup
--all include everything in backup
@@ -60,6 +71,8 @@ CLI Usage is as follows::
authenticated)
--milestones include milestones in backup
--repositories include repository clone in backup
--bare clone bare repositories
--lfs clone LFS repositories (requires Git LFS to be installed, https://git-lfs.github.com)
--wikis include wiki clone in backup
--skip-existing skip project if a backup directory exists
-L [LANGUAGES [LANGUAGES ...]], --languages [LANGUAGES [LANGUAGES ...]]
@@ -75,6 +88,12 @@ CLI Usage is as follows::
-F, --fork include forked repositories
--prefer-ssh Clone repositories using SSH instead of HTTPS
-v, --version show program's version number and exit
--keychain-name OSX_KEYCHAIN_ITEM_NAME
OSX ONLY: name field of password item in OSX keychain
that holds the personal access or OAuth token
--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT
OSX ONLY: account field of password item in OSX
keychain that holds the personal access or OAuth token
The package can be used to backup an *entire* organization or repository, including issues and wikis in the most appropriate format (clones for wikis, json files for issues).
@@ -83,3 +102,46 @@ Authentication
==============
Note: Password-based authentication will fail if you have two-factor authentication enabled.
Using the Keychain on Mac OSX
=============================
Note: On Mac OSX the token can be stored securely in the user's keychain. To do this:
1. Open Keychain from "Applications -> Utilities -> Keychain Access"
2. Add a new password item using "File -> New Password Item"
3. Enter a name in the "Keychain Item Name" box. You must provide this name to github-backup using the --keychain-name argument.
4. Enter an account name in the "Account Name" box, enter your Github username as set above. You must provide this name to github-backup using the --keychain-account argument.
5. Enter your Github personal access token in the "Password" box
Note: When you run github-backup, you will be asked whether you want to allow "security" to use your confidential information stored in your keychain. You have two options:
1. **Allow:** In this case you will need to click "Allow" each time you run `github-backup`
2. **Always Allow:** In this case, you will not be asked for permission when you run `github-backup` in future. This is less secure, but is required if you want to schedule `github-backup` to run automatically
About Git LFS
=============
When you use the "--lfs" option, you will need to make sure you have Git LFS installed.
Instructions on how to do this can be found on https://git-lfs.github.com.
Examples
========
Backup all repositories::
export ACCESS_TOKEN=SOME-GITHUB-TOKEN
github-backup WhiteHouse --token $ACCESS_TOKEN --organization --output-directory /tmp/white-house --repositories
Backup a single organization repository with everything else (wiki, pull requests, comments, issues etc)::
export ACCESS_TOKEN=SOME-GITHUB-TOKEN
ORGANIZATION=docker
REPO=cli
# e.g. git@github.com:docker/cli.git
github-backup $ORGANIZATION -P -t $ACCESS_TOKEN -o . --all -O -R $REPO
.. |PyPI| image:: https://img.shields.io/pypi/v/github-backup.svg
:target: https://pypi.python.org/pypi/github-backup/
.. |Python Versions| image:: https://img.shields.io/pypi/pyversions/github-backup.svg
:target: https://github.com/albertyw/github-backup

View File

@@ -16,8 +16,23 @@ import select
import subprocess
import sys
import time
import urllib
import urllib2
import platform
try:
# python 3
from urllib.parse import urlparse
from urllib.parse import quote as urlquote
from urllib.parse import urlencode
from urllib.error import HTTPError, URLError
from urllib.request import urlopen
from urllib.request import Request
except ImportError:
# python 2
from urlparse import urlparse
from urllib import quote as urlquote
from urllib import urlencode
from urllib2 import HTTPError, URLError
from urllib2 import urlopen
from urllib2 import Request
from github_backup import __version__
@@ -79,8 +94,8 @@ def logging_subprocess(popenargs,
rc = child.wait()
if rc != 0:
print(u'{} returned {}:'.format(popenargs[0], rc), file=sys.stderr)
print('\t', u' '.join(popenargs), file=sys.stderr)
print('{} returned {}:'.format(popenargs[0], rc), file=sys.stderr)
print('\t', ' '.join(popenargs), file=sys.stderr)
return rc
@@ -96,9 +111,19 @@ def mkdir_p(*args):
raise
def mask_password(url, secret='*****'):
parsed = urlparse(url)
if not parsed.password:
return url
elif parsed.password == 'x-oauth-basic':
return url.replace(parsed.username, secret)
return url.replace(parsed.password, secret)
def parse_args():
parser = argparse.ArgumentParser(description='Backup a github account',
prog='Github Backup')
parser = argparse.ArgumentParser(description='Backup a github account')
parser.add_argument('user',
metavar='USER',
type=str,
@@ -116,12 +141,17 @@ def parse_args():
parser.add_argument('-t',
'--token',
dest='token',
help='personal access or OAuth token')
help='personal access or OAuth token, or path to token (file://...)') # noqa
parser.add_argument('-o',
'--output-directory',
default='.',
dest='output_directory',
help='directory at which to backup the repositories')
parser.add_argument('-i',
'--incremental',
action='store_true',
dest='incremental',
help='incremental backup')
parser.add_argument('--starred',
action='store_true',
dest='include_starred',
@@ -165,7 +195,7 @@ def parse_args():
parser.add_argument('--hooks',
action='store_true',
dest='include_hooks',
help='include hooks in backup (works only when authenticated)')
help='include hooks in backup (works only when authenticated)') # noqa
parser.add_argument('--milestones',
action='store_true',
dest='include_milestones',
@@ -174,6 +204,14 @@ def parse_args():
action='store_true',
dest='include_repository',
help='include repository clone in backup')
parser.add_argument('--bare',
action='store_true',
dest='bare_clone',
help='clone bare repositories')
parser.add_argument('--lfs',
action='store_true',
dest='lfs_clone',
help='clone LFS repositories (requires Git LFS to be installed, https://git-lfs.github.com)')
parser.add_argument('--wikis',
action='store_true',
dest='include_wiki',
@@ -218,23 +256,61 @@ def parse_args():
parser.add_argument('-v', '--version',
action='version',
version='%(prog)s ' + __version__)
parser.add_argument('--keychain-name',
dest='osx_keychain_item_name',
help='OSX ONLY: name field of password item in OSX keychain that holds the personal access or OAuth token')
parser.add_argument('--keychain-account',
dest='osx_keychain_item_account',
help='OSX ONLY: account field of password item in OSX keychain that holds the personal access or OAuth token')
return parser.parse_args()
def get_auth(args):
if args.token:
return base64.b64encode(args.token + ':' + 'x-oauth-basic')
def get_auth(args, encode=True):
auth = None
if args.username:
if args.osx_keychain_item_name:
if not args.osx_keychain_item_account:
log_error('You must specify both name and account fields for osx keychain password items')
else:
if platform.system() != 'Darwin':
log_error("Keychain arguments are only supported on Mac OSX")
try:
with open(os.devnull,'w') as devnull:
token = (subprocess.check_output([
'security','find-generic-password',
'-s',args.osx_keychain_item_name,
'-a',args.osx_keychain_item_account,
'-w' ], stderr=devnull).strip())
auth = token + ':' + 'x-oauth-basic'
except:
log_error('No password item matching the provided name and account could be found in the osx keychain.')
elif args.osx_keychain_item_account:
log_error('You must specify both name and account fields for osx keychain password items')
elif args.token:
_path_specifier = 'file://'
if args.token.startswith(_path_specifier):
args.token = open(args.token[len(_path_specifier):],
'rt').readline().strip()
auth = args.token + ':' + 'x-oauth-basic'
elif args.username:
if not args.password:
args.password = getpass.getpass()
return base64.b64encode(args.username + ':' + args.password)
if args.password:
if encode:
password = args.password
else:
password = urlquote(args.password)
auth = args.username + ':' + password
elif args.password:
log_error('You must specify a username for basic auth')
if not auth:
return None
if not encode:
return auth
return base64.b64encode(auth.encode('ascii'))
def get_github_api_host(args):
if args.github_host:
@@ -245,7 +321,7 @@ def get_github_api_host(args):
return host
def get_github_ssh_host(args):
def get_github_host(args):
if args.github_host:
host = args.github_host
else:
@@ -254,6 +330,23 @@ def get_github_ssh_host(args):
return host
def get_github_repo_url(args, repository):
if args.prefer_ssh:
return repository['ssh_url']
auth = get_auth(args, False)
if auth:
repo_url = 'https://{0}@{1}/{2}/{3}.git'.format(
auth,
get_github_host(args),
args.user,
repository['name'])
else:
repo_url = repository['clone_url']
return repo_url
def retrieve_data(args, template, query_args=None, single_request=False):
auth = get_auth(args)
query_args = get_query_args(query_args)
@@ -273,7 +366,7 @@ def retrieve_data(args, template, query_args=None, single_request=False):
errors.append(template.format(status_code, r.reason))
log_error(errors)
response = json.loads(r.read())
response = json.loads(r.read().decode('utf-8'))
if len(errors) == 0:
if type(response) == list:
data.extend(response)
@@ -305,11 +398,11 @@ def _get_response(request, auth, template):
while True:
should_continue = False
try:
r = urllib2.urlopen(request)
except urllib2.HTTPError as exc:
r = urlopen(request)
except HTTPError as exc:
errors, should_continue = _request_http_error(exc, auth, errors) # noqa
r = exc
except urllib2.URLError:
except URLError:
should_continue = _request_url_error(template, retry_timeout)
if not should_continue:
raise
@@ -322,14 +415,14 @@ def _get_response(request, auth, template):
def _construct_request(per_page, page, query_args, template, auth):
querystring = urllib.urlencode(dict({
querystring = urlencode(dict(list({
'per_page': per_page,
'page': page
}.items() + query_args.items()))
}.items()) + list(query_args.items())))
request = urllib2.Request(template + '?' + querystring)
request = Request(template + '?' + querystring)
if auth is not None:
request.add_header('Authorization', 'Basic ' + auth)
request.add_header('Authorization', 'Basic '.encode('ascii') + auth)
return request
@@ -356,10 +449,9 @@ def _request_http_error(exc, auth, errors):
print('Exceeded rate limit of {} requests; waiting {} seconds to reset'.format(limit, delta), # noqa
file=sys.stderr)
ratelimit_error = 'No more requests remaining'
if auth is None:
ratelimit_error += '; authenticate to raise your GitHub rate limit' # noqa
errors.append(ratelimit_error)
print('Hint: Authenticate to raise your GitHub rate limit',
file=sys.stderr)
time.sleep(delta)
should_continue = True
@@ -379,6 +471,13 @@ def _request_url_error(template, retry_timeout):
return False
def check_git_lfs_install():
exit_code = subprocess.call(['git', 'lfs', 'version'])
if exit_code != 0:
log_error('The argument --lfs requires you to have Git LFS installed.\nYou can get it from https://git-lfs.github.com.')
sys.exit(1)
def retrieve_repositories(args):
log_info('Retrieving repositories')
single_request = False
@@ -399,10 +498,13 @@ def retrieve_repositories(args):
return retrieve_data(args, template, single_request=single_request)
def filter_repositories(args, repositories):
def filter_repositories(args, unfiltered_repositories):
log_info('Filtering repositories')
repositories = [r for r in repositories if r['owner']['login'] == args.user]
repositories = []
for r in unfiltered_repositories:
if r['owner']['login'] == args.user:
repositories.append(r)
name_regex = None
if args.name_regex:
@@ -428,28 +530,38 @@ def backup_repositories(args, output_directory, repositories):
log_info('Backing up repositories')
repos_template = 'https://{0}/repos'.format(get_github_api_host(args))
if args.incremental:
last_update = max(list(repository['updated_at'] for repository in repositories) or [time.strftime('%Y-%m-%dT%H:%M:%SZ', time.localtime())]) # noqa
last_update_path = os.path.join(output_directory, 'last_update')
if os.path.exists(last_update_path):
args.since = open(last_update_path).read().strip()
else:
args.since = None
else:
args.since = None
for repository in repositories:
backup_cwd = os.path.join(output_directory, 'repositories')
repo_cwd = os.path.join(backup_cwd, repository['name'])
repo_dir = os.path.join(repo_cwd, 'repository')
if args.prefer_ssh:
repo_url = repository['ssh_url']
else:
repo_url = repository['clone_url']
repo_url = get_github_repo_url(args, repository)
if args.include_repository or args.include_everything:
fetch_repository(repository['name'],
repo_url,
repo_dir,
skip_existing=args.skip_existing)
skip_existing=args.skip_existing,
bare_clone=args.bare_clone,
lfs_clone=arg.lfs_clone)
download_wiki = (args.include_wiki or args.include_everything)
if repository['has_wiki'] and download_wiki:
fetch_repository(repository['name'],
repo_url.replace('.git', '.wiki.git'),
os.path.join(repo_cwd, 'wiki'),
skip_existing=args.skip_existing)
skip_existing=args.skip_existing,
bare_clone=args.bare_clone,
lfs_clone=arg.lfs_clone)
if args.include_issues or args.include_everything:
backup_issues(args, repo_cwd, repository, repos_template)
@@ -466,6 +578,9 @@ def backup_repositories(args, output_directory, repositories):
if args.include_hooks or args.include_everything:
backup_hooks(args, repo_cwd, repository, repos_template)
if args.incremental:
open(last_update_path, 'w').write(last_update)
def backup_issues(args, repo_cwd, repository, repos_template):
has_issues_dir = os.path.isdir('{0}/issues/.git'.format(repo_cwd))
@@ -477,26 +592,42 @@ def backup_issues(args, repo_cwd, repository, repos_template):
mkdir_p(repo_cwd, issue_cwd)
issues = {}
issues_skipped = 0
issues_skipped_message = ''
_issue_template = '{0}/{1}/issues'.format(repos_template,
repository['full_name'])
should_include_pulls = args.include_pulls or args.include_everything
issue_states = ['open', 'closed']
for issue_state in issue_states:
query_args = {
'filter': 'all',
'state': issue_state
}
if args.since:
query_args['since'] = args.since
_issues = retrieve_data(args,
_issue_template,
query_args=query_args)
for issue in _issues:
# skip pull requests which are also returned as issues
# if retrieving pull requests is requested as well
if 'pull_request' in issue and should_include_pulls:
issues_skipped += 1
continue
issues[issue['number']] = issue
log_info('Saving {0} issues to disk'.format(len(issues.keys())))
if issues_skipped:
issues_skipped_message = ' (skipped {0} pull requests)'.format(
issues_skipped)
log_info('Saving {0} issues to disk{1}'.format(
len(list(issues.keys())), issues_skipped_message))
comments_template = _issue_template + '/{0}/comments'
events_template = _issue_template + '/{0}/events'
for number, issue in issues.iteritems():
for number, issue in list(issues.items()):
if args.include_issue_comments or args.include_everything:
template = comments_template.format(number)
issues[number]['comment_data'] = retrieve_data(args, template)
@@ -526,19 +657,24 @@ def backup_pulls(args, repo_cwd, repository, repos_template):
for pull_state in pull_states:
query_args = {
'filter': 'all',
'state': pull_state
'state': pull_state,
'sort': 'updated',
'direction': 'desc',
}
# It'd be nice to be able to apply the args.since filter here...
_pulls = retrieve_data(args,
_pulls_template,
query_args=query_args)
for pull in _pulls:
if not args.since or pull['updated_at'] >= args.since:
pulls[pull['number']] = pull
log_info('Saving {0} pull requests to disk'.format(len(pulls.keys())))
log_info('Saving {0} pull requests to disk'.format(
len(list(pulls.keys()))))
comments_template = _pulls_template + '/{0}/comments'
commits_template = _pulls_template + '/{0}/commits'
for number, pull in pulls.iteritems():
for number, pull in list(pulls.items()):
if args.include_pull_comments or args.include_everything:
template = comments_template.format(number)
pulls[number]['comment_data'] = retrieve_data(args, template)
@@ -572,8 +708,9 @@ def backup_milestones(args, repo_cwd, repository, repos_template):
for milestone in _milestones:
milestones[milestone['number']] = milestone
log_info('Saving {0} milestones to disk'.format(len(milestones.keys())))
for number, milestone in milestones.iteritems():
log_info('Saving {0} milestones to disk'.format(
len(list(milestones.keys()))))
for number, milestone in list(milestones.items()):
milestone_file = '{0}/{1}.json'.format(milestone_cwd, number)
with codecs.open(milestone_file, 'w', encoding='utf-8') as f:
json_dump(milestone, f)
@@ -610,32 +747,72 @@ def backup_hooks(args, repo_cwd, repository, repos_template):
log_info("Unable to read hooks, skipping")
def fetch_repository(name, remote_url, local_dir, skip_existing=False):
def fetch_repository(name,
remote_url,
local_dir,
skip_existing=False,
bare_clone=False,
lfs_clone=False):
if bare_clone:
if os.path.exists(local_dir):
clone_exists = subprocess.check_output(['git',
'rev-parse',
'--is-bare-repository'],
cwd=local_dir) == "true\n"
else:
clone_exists = False
else:
clone_exists = os.path.exists(os.path.join(local_dir, '.git'))
if clone_exists and skip_existing:
return
initalized = subprocess.call('git ls-remote ' + remote_url,
masked_remote_url = mask_password(remote_url)
initialized = subprocess.call('git ls-remote ' + remote_url,
stdout=FNULL,
stderr=FNULL,
shell=True)
if initalized == 128:
log_info("Skipping {0} ({1}) since it's not initalized".format(name, remote_url))
if initialized == 128:
log_info("Skipping {0} ({1}) since it's not initialized".format(
name, masked_remote_url))
return
if clone_exists:
log_info('Updating {0} in {1}'.format(name, local_dir))
remotes = subprocess.check_output(['git', 'remote', 'show'],
cwd=local_dir)
remotes = [i.strip() for i in remotes.decode('utf-8').splitlines()]
if 'origin' not in remotes:
git_command = ['git', 'remote', 'rm', 'origin']
logging_subprocess(git_command, None, cwd=local_dir)
git_command = ['git', 'remote', 'add', 'origin', remote_url]
logging_subprocess(git_command, None, cwd=local_dir)
git_command = ['git', 'fetch', '--all', '--tags', '--prune']
else:
git_command = ['git', 'remote', 'set-url', 'origin', remote_url]
logging_subprocess(git_command, None, cwd=local_dir)
if lfs_clone:
git_command = ['git', 'lfs', 'fetch', '--all', '--force', '--tags', '--prune']
else:
git_command = ['git', 'fetch', '--all', '--force', '--tags', '--prune']
logging_subprocess(git_command, None, cwd=local_dir)
else:
log_info('Cloning {0} repository from {1} to {2}'.format(name,
remote_url,
log_info('Cloning {0} repository from {1} to {2}'.format(
name,
masked_remote_url,
local_dir))
if bare_clone:
if lfs_clone:
git_command = ['git', 'lfs', 'clone', '--mirror', remote_url, local_dir]
else:
git_command = ['git', 'clone', '--mirror', remote_url, local_dir]
else:
if lfs_clone:
git_command = ['git', 'lfs', 'clone', remote_url, local_dir]
else:
git_command = ['git', 'clone', remote_url, local_dir]
logging_subprocess(git_command, None)
@@ -693,6 +870,9 @@ def main():
log_info('Create output directory {0}'.format(output_directory))
mkdir_p(output_directory)
if args.lfs_clone:
check_git_lfs_install()
log_info('Backing up user {0} to {1}'.format(args.user, output_directory))
repositories = retrieve_repositories(args)

View File

@@ -1 +1 @@
__version__ = '0.7.0'
__version__ = '0.14.0'

View File

@@ -34,7 +34,7 @@ fi
echo -e "\n${GREEN}STARTING RELEASE PROCESS${COLOR_OFF}\n"
set +e;
git status | grep "working directory clean" &> /dev/null
git status | grep -Eo "working (directory|tree) clean" &> /dev/null
if [ ! $? -eq 0 ]; then # working directory is NOT clean
echo -e "${RED}WARNING: You have uncomitted changes, you may have forgotten something${COLOR_OFF}\n"
exit 1

View File

@@ -39,6 +39,8 @@ setup(
'License :: OSI Approved :: MIT License',
'Programming Language :: Python :: 2.6',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
],
description='backup a github user or organization',
long_description=open_file('README.rst').read(),