Compare commits


57 Commits
0.2.0 ... 0.8.0

Author SHA1 Message Date
Jose Diaz-Gonzalez
a511bb2b49 Release version 0.8.0 2016-02-14 16:04:54 -05:00
Jose Diaz-Gonzalez
aedf9b2c66 Merge pull request #35 from eht16/issue23_store_pullrequests_once
Don't store issues which are actually pull requests
2016-02-14 16:02:18 -05:00
Enrico Tröger
b9e35a50f5 Don't store issues which are actually pull requests
This prevents storing pull requests twice, since the GitHub API also returns
pull requests as issues. Those issues are skipped, but only if retrieving
pull requests was requested as well.
Closes #23.
2016-02-14 16:36:40 +01:00
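
The change itself is a small guard in backup_issues (shown in full in the bin/github-backup diff below); condensed::

    # An issue that carries a 'pull_request' key is really a pull request;
    # skip it here when pull requests are being backed up separately.
    for issue in _issues:
        if 'pull_request' in issue and (args.include_pulls or args.include_everything):
            issues_skipped += 1
            continue
        issues[issue['number']] = issue
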
Jose Diaz-Gonzalez
d0e239b3ef Release version 0.7.0 2016-02-02 14:52:07 -05:00
Jose Diaz-Gonzalez
29c9373d9d Merge pull request #32 from albertyw/soft-fail-hooks
Softly fail if not able to read hooks
2016-01-29 03:36:48 -05:00
Jose Diaz-Gonzalez
eb8b22c81c Merge pull request #33 from albertyw/update-readme
Add note about 2-factor auth in readme
2016-01-29 03:35:28 -05:00
Jose Diaz-Gonzalez
03739ce1be Merge pull request #31 from albertyw/fix-private-repos
Fix reading user's private repositories
2016-01-29 03:34:21 -05:00
Albert Wang
d2bb205b4b Add note about 2-factor auth 2016-01-29 00:33:53 -08:00
Albert Wang
17141c1bb6 Softly fail if not able to read hooks 2016-01-29 00:20:53 -08:00
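
The soft failure is a try/except around the hook download in backup_hooks (see the diff below)::

    # _backup_data() reports failures via log_error(), which raises SystemExit;
    # catching it turns an unreadable hooks endpoint into a logged skip
    # instead of aborting the whole backup.
    try:
        _backup_data(args, 'hooks', template, output_file, hook_cwd)
    except SystemExit:
        log_info("Unable to read hooks, skipping")
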
Albert Wang
d362adbbca Make user repository search go through endpoint capable of reading private repositories 2016-01-28 22:52:58 -08:00
Jose Diaz-Gonzalez
89df625e04 Merge pull request #28 from alexmojaki/getpass
Prompt for password if only username given
2016-01-15 10:34:29 -05:00
Alex Hall
675484a215 Update README with new CLI usage 2016-01-12 14:40:29 +02:00
Alex Hall
325f77dcd9 Prompt for password if only username given 2016-01-12 11:18:26 +02:00
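
get_auth in the diff below now prompts interactively when only a username is supplied (Python 2, as in the rest of the script)::

    import base64
    import getpass

    def get_auth(args):
        if args.token:
            return base64.b64encode(args.token + ':' + 'x-oauth-basic')
        if args.username:
            if not args.password:
                # no password given: ask for it instead of erroring out
                args.password = getpass.getpass()
            return base64.b64encode(args.username + ':' + args.password)
        if args.password:
            log_error('You must specify a username for basic auth')
        return None
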
Jose Diaz-Gonzalez
f12e9167aa Release version 0.6.0 2015-11-10 15:36:20 -05:00
Jose Diaz-Gonzalez
816447af19 Force proper remote url 2015-11-10 15:36:12 -05:00
Jose Diaz-Gonzalez
d9e15e2be2 Merge pull request #24 from eht16/add_backup_hooks
Add backup hooks
2015-10-21 16:47:26 -04:00
Enrico Tröger
534145d178 Improve error handling in case of HTTP errors
In case of an HTTP status code 404, the returned 'r' was never assigned.
In case of URL errors which are not timeouts, we probably should bail
out.
2015-10-21 22:40:34 +02:00
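
In the reworked _get_response (further down in the diff), the HTTPError object itself is kept as the response, and URL errors that are not retried are re-raised::

    while True:
        should_continue = False
        try:
            r = urllib2.urlopen(request)
        except urllib2.HTTPError as exc:
            # HTTPError behaves like a response, so 'r' is always assigned
            errors, should_continue = _request_http_error(exc, auth, errors)
            r = exc
        except urllib2.URLError:
            should_continue = _request_url_error(template, retry_timeout)
            if not should_continue:
                raise
        if should_continue:
            continue
        break
    return r, errors
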
Enrico Tröger
fe162eedd5 Add --hooks to also include web hooks into the backup 2015-10-21 22:39:45 +02:00
Jose Diaz-Gonzalez
53a9a22afb Merge pull request #22 from eht16/issue_17_create_output_directory
Create the user specified output directory if it does not exist
2015-10-16 15:05:29 -04:00
Jose Diaz-Gonzalez
2aa7d4cf1e Merge pull request #21 from eht16/fix_get_response_missing_auth
Add missing auth argument to _get_response()
2015-10-16 15:05:14 -04:00
Jose Diaz-Gonzalez
804843c128 Merge pull request #20 from eht16/improve_error_msg_on_non_existing_repo
Add repository URL to error message for non-existing repositories
2015-10-16 15:05:05 -04:00
Enrico Tröger
5fc27a4d42 Create the user specified output directory if it does not exist
Fixes #17.
2015-10-16 14:16:47 +02:00
Enrico Tröger
c8b3f048f5 Add repository URL to error message for non-existing repositories
This makes it easier for the user to identify which repository does not
exist or is not initialised, i.e. whether it is the main repository or
the wiki repository and which clone URL was used to check.
2015-10-16 14:09:13 +02:00
Enrico Tröger
2d98251992 Add missing auth argument to _get_response()
When running unauthenticated and GitHub starts rate-limiting the client,
github-backup crashes because the auth variable used in _get_response()
was not available. This change should fix it.
2015-10-16 14:00:56 +02:00
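
The fix threads the auth value through to both request construction and the retry loop, as in retrieve_data below::

    # auth is passed explicitly so the Authorization header and the
    # rate-limit error message can be built inside the helpers.
    request = _construct_request(per_page, page, query_args, template, auth)
    r, errors = _get_response(request, auth, template)
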
Jose Diaz-Gonzalez
050f5f1c17 Release version 0.5.0 2015-10-10 00:19:45 -04:00
Jose Diaz-Gonzalez
348a238770 Add release script 2015-10-10 00:19:31 -04:00
Jose Diaz-Gonzalez
708b377918 Refactor to both simplify codepath as well as follow PEP8 standards 2015-10-10 00:16:30 -04:00
Jose Diaz-Gonzalez
6193efb798 Merge pull request #19 from Embed-Engineering/retry-timeout
Retry 3 times when the connection times out
2015-09-04 10:36:57 -04:00
Mathijs Jonker
4b30aaeef3 Retry 3 times when the connection times out 2015-09-04 14:07:45 +02:00
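
The retry budget lives in _get_response (retry_timeout = 3); the helper that decides whether to retry is, condensed from the diff::

    def _request_url_error(template, retry_timeout):
        # A timed-out connection is retried until the budget is used up;
        # only then is the URL skipped.
        log_info('{0} timed out'.format(template))
        retry_timeout -= 1
        if retry_timeout >= 0:
            return True
        log_error('{0} timed out too much, skipping!'.format(template))
        return False
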
Jose Diaz-Gonzalez
762059d1a6 Merge pull request #15 from kromkrom/master
Preserve Unicode characters in the output file
2015-05-04 14:13:11 -04:00
Kirill Grushetsky
a440bc1522 Update github-backup 2015-05-04 19:16:23 +03:00
Kirill Grushetsky
43793c1e5e Update github-backup 2015-05-04 19:15:55 +03:00
Kirill Grushetsky
24fac46459 Made unicode output default 2015-05-04 19:12:47 +03:00
Kirill Grushetsky
c9916e28a4 Import alphabetised 2015-05-04 13:45:39 +03:00
Kirill Grushetsky
ab4b28cdd4 Preserve Unicode characters in the output file
Added option to preserve Unicode characters in the output file
2015-05-04 13:38:28 +03:00
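
Output files are now opened through codecs as UTF-8 and dumped with ensure_ascii disabled (json_dump in the diff below); the file name and data here are only illustrative::

    import codecs
    import json

    def json_dump(data, output_file):
        # ensure_ascii=False writes Unicode characters verbatim instead of
        # escaping them as \uXXXX sequences.
        json.dump(data, output_file, ensure_ascii=False,
                  sort_keys=True, indent=4, separators=(',', ': '))

    with codecs.open('issue.json', 'w', encoding='utf-8') as f:
        json_dump({u'title': u'Grüße'}, f)
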
Jose Diaz-Gonzalez
6feb409fc2 Merge pull request #14 from aensley/master
Added backup of labels and milestones.
2015-04-23 15:37:14 -04:00
aensley
8bdbc2cee2 josegonzales/python-github-backup#12 Added backup of labels and milestones. 2015-04-23 14:05:48 -05:00
Jose Diaz-Gonzalez
a4d6272b50 Merge pull request #11 from Embed-Engineering/master
Added test for uninitialized repos (or wikis)
2015-04-15 11:03:25 -04:00
Mathijs Jonker
7ce61202e5 Fixed indent 2015-04-15 12:21:58 +02:00
mjonker-embed
3e82d829e4 Update github-backup 2015-04-15 12:14:55 +02:00
mjonker-embed
339ad96876 Skip uninitialized repos
These gave me errors which caused mails from crontab.
2015-04-15 12:10:53 +02:00
Jose Diaz-Gonzalez
b2a942eb43 Merge pull request #10 from Embed-Engineering/master
Added prefer-ssh
2015-03-20 10:54:42 -04:00
mjonker-embed
e8aa38f395 Added prefer-ssh
This was needed for my back-up setup; the code includes it but the readme wasn't updated.
2015-03-20 14:22:53 +01:00
Jose Diaz-Gonzalez
86bdb1420c Merge pull request #9 from acdha/ratelimit-retries
Retry API requests which failed due to rate-limiting
2015-03-13 17:55:25 -04:00
Chris Adams
2e7f325475 Retry API requests which failed due to rate-limiting
This allows operation to continue, albeit at a slower pace,
if you have enough data to trigger the API rate limits
2015-03-13 17:37:01 -04:00
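
The wait time is computed from the X-RateLimit-Reset header rather than polled (see _request_http_error in the diff below)::

    import calendar
    import time

    # Sleep until the reset timestamp, but never less than 10 seconds,
    # then let the caller retry the request.
    gm_now = calendar.timegm(time.gmtime())
    reset = int(headers.get('x-ratelimit-reset', 0)) or gm_now
    delta = max(10, reset - gm_now)
    time.sleep(delta)
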
Jose Diaz-Gonzalez
8bf62cd932 Release 0.4.0 2015-03-13 16:33:28 -04:00
Jose Diaz-Gonzalez
63bf7267a6 Merge pull request #7 from acdha/repo-backup-overhaul
Repo backup overhaul
2015-03-13 16:32:49 -04:00
Chris Adams
5612e51153 Update repository back up handling for wikis
* Now wikis will follow the same logic as the main repo
  checkout for --prefer-ssh.
* The regular repository and wiki paths both use the same
  function to handle either cloning or updating a local copy
  of the remote repo
* All git updates will now use “git fetch --all --tags”
  to ensure that tags and branches other than master will
  also be backed up
2015-03-13 15:50:30 -04:00
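
Condensed from backup_repositories and fetch_repository in the diff below: one helper handles both the repository and its wiki, and updates fetch everything::

    fetch_repository(repository['name'], repo_url, repo_dir,
                     skip_existing=args.skip_existing)
    if repository['has_wiki'] and download_wiki:
        # the wiki clone URL is derived from the repository URL
        fetch_repository(repository['name'],
                         repo_url.replace('.git', '.wiki.git'),
                         os.path.join(repo_cwd, 'wiki'),
                         skip_existing=args.skip_existing)
    # for an existing clone, fetch_repository runs:
    #   git remote rm origin && git remote add origin <url>
    #   git fetch --all --tags --prune
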
Chris Adams
c81bf98627 logging_subprocess: always log when a command fails
Previously git clones could fail without any indication 
unless you edited the source to change `logger=None` to use
a configured logger.

Now a non-zero return code will always output a message to
stderr and will display the executed command so it can be
rerun for troubleshooting.
2015-03-13 15:50:04 -04:00
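
The relevant tail of logging_subprocess (see the diff below; the script imports print_function)::

    rc = child.wait()
    if rc != 0:
        # always surface the failure and the exact command that failed
        print(u'{} returned {}:'.format(popenargs[0], rc), file=sys.stderr)
        print('\t', u' '.join(popenargs), file=sys.stderr)
    return rc
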
Chris Adams
040516325a Switch to using ssh_url
The previous commit used the wrong URL for a private repo. This was
masked by the lack of error logging in logging_subprocess (which will be
in a separate branch)
2015-03-13 15:39:35 -04:00
Jose Diaz-Gonzalez
dca9f8051b Merge pull request #6 from acdha/allow-clone-over-ssh
Add an option to prefer checkouts over SSH
2015-03-12 17:15:19 -04:00
Chris Adams
3bc23473b8 Add an option to prefer checkouts over SSH
This is really useful with private repos to avoid being nagged
for credentials for every repository
2015-03-12 16:10:46 -04:00
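
With --prefer-ssh the clone URL reported by the API is swapped (backup_repositories in the diff below)::

    if args.prefer_ssh:
        repo_url = repository['ssh_url']     # git@... URL, authenticates via SSH key
    else:
        repo_url = repository['clone_url']   # https:// URL
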
Jose Diaz-Gonzalez
2c9eb80cf2 Release 0.3.0 2015-02-20 12:41:25 -05:00
Jose Diaz-Gonzalez
bb86f0582e Merge pull request #4 from klaude/pull_request_support
Add pull request support
2015-01-16 11:06:01 -05:00
Kevin Laude
e8387f9a7f Add pull request support
Back up repository pull requests by passing the --include-pulls
argument. Pull requests are saved to
repositories/<repository name>/pulls/<pull request number>.json. Include
the --pull-request-comments argument to add review comments to the pull
request backup and pass the --pull-request-commits argument to add
commits to the pull request backup.

Pull requests are automatically backed up when the --all argument is
used.
2015-01-16 09:57:05 -06:00
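
Each pull request ends up as repositories/<repository name>/pulls/<number>.json, optionally enriched with review comments and commits (backup_pulls in the diff below, Python 2 iteritems as in the script)::

    for number, pull in pulls.iteritems():
        if args.include_pull_comments or args.include_everything:
            pulls[number]['comment_data'] = retrieve_data(args, comments_template.format(number))
        if args.include_pull_commits or args.include_everything:
            pulls[number]['commit_data'] = retrieve_data(args, commits_template.format(number))
        pull_file = '{0}/{1}.json'.format(pulls_cwd, number)
        with codecs.open(pull_file, 'w', encoding='utf-8') as f:
            json_dump(pull, f)
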
Jose Diaz-Gonzalez
39b173f173 Merge pull request #5 from klaude/github-enterprise-support
Add GitHub Enterprise Support
2015-01-15 22:05:33 -05:00
Kevin Laude
883c92753d Add GitHub Enterprise support
Pass the -H or --github-host argument with a GitHub Enterprise hostname
to back up from that GitHub Enterprise host. If no argument is passed
then back up from github.com.
2015-01-15 20:20:33 -06:00
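
API endpoints are built from a host helper so the same URL templates work for github.com and Enterprise installs (see get_github_api_host in the diff below)::

    def get_github_api_host(args):
        if args.github_host:
            return args.github_host + '/api/v3'   # GitHub Enterprise API prefix
        return 'api.github.com'

    template = 'https://{0}/orgs/{1}/repos'.format(get_github_api_host(args),
                                                   args.user)
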
5 changed files with 855 additions and 140 deletions

CHANGES.rst (new file, 175 lines)

@@ -0,0 +1,175 @@
Changelog
=========
0.8.0 (2016-02-14)
------------------
- Don't store issues which are actually pull requests. [Enrico Tröger]
This prevents storing pull requests twice, since the GitHub API also returns
pull requests as issues. Those issues are skipped, but only if retrieving
pull requests was requested as well.
Closes #23.
0.7.0 (2016-02-02)
------------------
- Softly fail if not able to read hooks. [Albert Wang]
- Add note about 2-factor auth. [Albert Wang]
- Make user repository search go through endpoint capable of reading
private repositories. [Albert Wang]
- Prompt for password if only username given. [Alex Hall]
0.6.0 (2015-11-10)
------------------
- Force proper remote url. [Jose Diaz-Gonzalez]
- Improve error handling in case of HTTP errors. [Enrico Tröger]
In case of an HTTP status code 404, the returned 'r' was never assigned.
In case of URL errors which are not timeouts, we probably should bail
out.
- Add --hooks to also include web hooks into the backup. [Enrico Tröger]
- Create the user specified output directory if it does not exist.
[Enrico Tröger]
Fixes #17.
- Add missing auth argument to _get_response() [Enrico Tröger]
When running unauthenticated and GitHub starts rate-limiting the client,
github-backup crashes because the auth variable used in _get_response()
was not available. This change should fix it.
- Add repository URL to error message for non-existing repositories.
[Enrico Tröger]
This makes it easier for the user to identify which repository does not
exist or is not initialised, i.e. whether it is the main repository or
the wiki repository and which clone URL was used to check.
0.5.0 (2015-10-10)
------------------
- Add release script. [Jose Diaz-Gonzalez]
- Refactor to both simplify codepath as well as follow PEP8 standards.
[Jose Diaz-Gonzalez]
- Retry 3 times when the connection times out. [Mathijs Jonker]
- Made unicode output default. [Kirill Grushetsky]
- Import alphabetised. [Kirill Grushetsky]
- Preserve Unicode characters in the output file. [Kirill Grushetsky]
Added option to preserve Unicode characters in the output file
- Josegonzales/python-github-backup#12 Added backup of labels and
milestones. [aensley]
- Fixed indent. [Mathijs Jonker]
- Skip uninitialized repos. [mjonker-embed]
These gave me errors which caused mails from crontab.
- Added prefer-ssh. [mjonker-embed]
This was needed for my back-up setup; the code includes it but the readme wasn't updated.
- Retry API requests which failed due to rate-limiting. [Chris Adams]
This allows operation to continue, albeit at a slower pace,
if you have enough data to trigger the API rate limits
- Logging_subprocess: always log when a command fails. [Chris Adams]
Previously git clones could fail without any indication
unless you edited the source to change `logger=None` to use
a configured logger.
Now a non-zero return code will always output a message to
stderr and will display the executed command so it can be
rerun for troubleshooting.
- Switch to using ssh_url. [Chris Adams]
The previous commit used the wrong URL for a private repo. This was
masked by the lack of error logging in logging_subprocess (which will be
in a separate branch)
- Add an option to prefer checkouts over SSH. [Chris Adams]
This is really useful with private repos to avoid being nagged
for credentials for every repository
- Add pull request support. [Kevin Laude]
Back up repository pull requests by passing the --include-pulls
argument. Pull requests are saved to
repositories/<repository name>/pulls/<pull request number>.json. Include
the --pull-request-comments argument to add review comments to the pull
request backup and pass the --pull-request-commits argument to add
commits to the pull request backup.
Pull requests are automatically backed up when the --all argument is
used.
- Add GitHub Enterprise support. [Kevin Laude]
Pass the -H or --github-host argument with a GitHub Enterprise hostname
to back up from that GitHub Enterprise host. If no argument is passed
then back up from github.com.
0.2.0 (2014-09-22)
------------------
- Add support for retrieving repositories. Closes #1. [Jose Diaz-
Gonzalez]
- Fix PEP8 violations. [Jose Diaz-Gonzalez]
- Add authorization to header only if specified by user. [Ioannis
Filippidis]
- Fill out readme more. [Jose Diaz-Gonzalez]
- Fix import. [Jose Diaz-Gonzalez]
- Properly name readme. [Jose Diaz-Gonzalez]
- Create MANIFEST.in. [Jose Diaz-Gonzalez]
- Create .gitignore. [Jose Diaz-Gonzalez]
- Create setup.py. [Jose Diaz-Gonzalez]
- Create requirements.txt. [Jose Diaz-Gonzalez]
- Create __init__.py. [Jose Diaz-Gonzalez]
- Create LICENSE.txt. [Jose Diaz-Gonzalez]
- Create README.md. [Jose Diaz-Gonzalez]
- Create github-backup. [Jose Diaz-Gonzalez]

README.rst

@@ -22,13 +22,15 @@ CLI Usage is as follows::
Github Backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN]
[-o OUTPUT_DIRECTORY] [--starred] [--watched] [--all]
[--issues] [--issue-comments] [--issue-events]
[--repositories] [--wikis] [--skip-existing]
[-L [LANGUAGES [LANGUAGES ...]]] [-N NAME_REGEX] [-O]
[-R REPOSITORY] [-P] [-F] [-v]
[--issues] [--issue-comments] [--issue-events] [--pulls]
[--pull-comments] [--pull-commits] [--labels] [--hooks]
[--milestones] [--repositories] [--wikis]
[--skip-existing] [-L [LANGUAGES [LANGUAGES ...]]]
[-N NAME_REGEX] [-H GITHUB_HOST] [-O] [-R REPOSITORY]
[-P] [-F] [--prefer-ssh] [-v]
USER
Backup a github users account
Backup a github account
positional arguments:
USER github username
@@ -38,7 +40,8 @@ CLI Usage is as follows::
-u USERNAME, --username USERNAME
username for basic auth
-p PASSWORD, --password PASSWORD
password for basic auth
password for basic auth. If a username is given but
not a password, the password will be prompted for.
-t TOKEN, --token TOKEN
personal access or OAuth token
-o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
@@ -49,6 +52,13 @@ CLI Usage is as follows::
--issues include issues in backup
--issue-comments include issue comments in backup
--issue-events include issue events in backup
--pulls include pull requests in backup
--pull-comments include pull request review comments in backup
--pull-commits include pull request commits in backup
--labels include labels in backup
--hooks include hooks in backup (works only when
authenticated)
--milestones include milestones in backup
--repositories include repository clone in backup
--wikis include wiki clone in backup
--skip-existing skip project if a backup directory exists
@@ -56,11 +66,20 @@ CLI Usage is as follows::
only allow these languages
-N NAME_REGEX, --name-regex NAME_REGEX
python regex to match names against
-O, --organization whether or not this is a query for an organization
-H GITHUB_HOST, --github-host GITHUB_HOST
GitHub Enterprise hostname
-O, --organization whether or not this is an organization user
-R REPOSITORY, --repository REPOSITORY
name of repository to limit backup to
-P, --private include private repositories
-F, --fork include forked repositories
--prefer-ssh Clone repositories using SSH instead of HTTPS
-v, --version show program's version number and exit
The package can be used to backup an *entire* organization or repository, including issues and wikis in the most appropriate format (clones for wikis, json files for issues).
Authentication
==============
Note: Password-based authentication will fail if you have two-factor authentication enabled.

bin/github-backup (normal file → executable, 632 lines changed)

@@ -1,8 +1,13 @@
#!/usr/bin/env python
from __future__ import print_function
import argparse
import base64
import calendar
import codecs
import errno
import getpass
import json
import logging
import os
@@ -10,11 +15,14 @@ import re
import select
import subprocess
import sys
import time
import urllib
import urllib2
from github_backup import __version__
FNULL = open(os.devnull, 'w')
def log_error(message):
if type(message) == str:
@@ -34,7 +42,11 @@ def log_info(message):
sys.stdout.write("{0}\n".format(msg))
def logging_subprocess(popenargs, logger, stdout_log_level=logging.DEBUG, stderr_log_level=logging.ERROR, **kwargs):
def logging_subprocess(popenargs,
logger,
stdout_log_level=logging.DEBUG,
stderr_log_level=logging.ERROR,
**kwargs):
"""
Variant of subprocess.call that accepts a logger instead of stdout/stderr,
and logs stdout messages via logger.debug and stderr messages via
@@ -47,7 +59,10 @@ def logging_subprocess(popenargs, logger, stdout_log_level=logging.DEBUG, stderr
child.stderr: stderr_log_level}
def check_io():
ready_to_read = select.select([child.stdout, child.stderr], [], [], 1000)[0]
ready_to_read = select.select([child.stdout, child.stderr],
[],
[],
1000)[0]
for io in ready_to_read:
line = io.readline()
if not logger:
@@ -61,7 +76,13 @@ def logging_subprocess(popenargs, logger, stdout_log_level=logging.DEBUG, stderr
check_io() # check again to catch anything after the process exits
return child.wait()
rc = child.wait()
if rc != 0:
print(u'{} returned {}:'.format(popenargs[0], rc), file=sys.stderr)
print('\t', u' '.join(popenargs), file=sys.stderr)
return rc
def mkdir_p(*args):
@@ -76,77 +97,180 @@ def mkdir_p(*args):
def parse_args():
parser = argparse.ArgumentParser(description='Backup a github users account', prog='Github Backup')
parser.add_argument('user', metavar='USER', type=str, help='github username')
parser.add_argument('-u', '--username', dest='username', help='username for basic auth')
parser.add_argument('-p', '--password', dest='password', help='password for basic auth')
parser.add_argument('-t', '--token', dest='token', help='personal access or OAuth token')
parser.add_argument('-o', '--output-directory', default='.', dest='output_directory', help='directory at which to backup the repositories')
parser.add_argument('--starred', action='store_true', dest='include_starred', help='include starred repositories in backup')
parser.add_argument('--watched', action='store_true', dest='include_watched', help='include watched repositories in backup')
parser.add_argument('--all', action='store_true', dest='include_everything', help='include everything in backup')
parser.add_argument('--issues', action='store_true', dest='include_issues', help='include issues in backup')
parser.add_argument('--issue-comments', action='store_true', dest='include_issue_comments', help='include issue comments in backup')
parser.add_argument('--issue-events', action='store_true', dest='include_issue_events', help='include issue events in backup')
parser.add_argument('--repositories', action='store_true', dest='include_repository', help='include repository clone in backup')
parser.add_argument('--wikis', action='store_true', dest='include_wiki', help='include wiki clone in backup')
parser.add_argument('--skip-existing', action='store_true', dest='skip_existing', help='skip project if a backup directory exists')
parser.add_argument('-L', '--languages', dest='languages', help='only allow these languages', nargs='*')
parser.add_argument('-N', '--name-regex', dest='name_regex', help='python regex to match names against')
parser.add_argument('-O', '--organization', action='store_true', dest='organization', help='whether or not this is a query for an organization')
parser.add_argument('-R', '--repository', dest='repository', help='name of repository to limit backup to')
parser.add_argument('-P', '--private', action='store_true', dest='private', help='include private repositories')
parser.add_argument('-F', '--fork', action='store_true', dest='fork', help='include forked repositories')
parser.add_argument('-v', '--version', action='version', version='%(prog)s ' + __version__)
parser = argparse.ArgumentParser(description='Backup a github account',
prog='Github Backup')
parser.add_argument('user',
metavar='USER',
type=str,
help='github username')
parser.add_argument('-u',
'--username',
dest='username',
help='username for basic auth')
parser.add_argument('-p',
'--password',
dest='password',
help='password for basic auth. '
'If a username is given but not a password, the '
'password will be prompted for.')
parser.add_argument('-t',
'--token',
dest='token',
help='personal access or OAuth token')
parser.add_argument('-o',
'--output-directory',
default='.',
dest='output_directory',
help='directory at which to backup the repositories')
parser.add_argument('--starred',
action='store_true',
dest='include_starred',
help='include starred repositories in backup')
parser.add_argument('--watched',
action='store_true',
dest='include_watched',
help='include watched repositories in backup')
parser.add_argument('--all',
action='store_true',
dest='include_everything',
help='include everything in backup')
parser.add_argument('--issues',
action='store_true',
dest='include_issues',
help='include issues in backup')
parser.add_argument('--issue-comments',
action='store_true',
dest='include_issue_comments',
help='include issue comments in backup')
parser.add_argument('--issue-events',
action='store_true',
dest='include_issue_events',
help='include issue events in backup')
parser.add_argument('--pulls',
action='store_true',
dest='include_pulls',
help='include pull requests in backup')
parser.add_argument('--pull-comments',
action='store_true',
dest='include_pull_comments',
help='include pull request review comments in backup')
parser.add_argument('--pull-commits',
action='store_true',
dest='include_pull_commits',
help='include pull request commits in backup')
parser.add_argument('--labels',
action='store_true',
dest='include_labels',
help='include labels in backup')
parser.add_argument('--hooks',
action='store_true',
dest='include_hooks',
help='include hooks in backup (works only when authenticated)')
parser.add_argument('--milestones',
action='store_true',
dest='include_milestones',
help='include milestones in backup')
parser.add_argument('--repositories',
action='store_true',
dest='include_repository',
help='include repository clone in backup')
parser.add_argument('--wikis',
action='store_true',
dest='include_wiki',
help='include wiki clone in backup')
parser.add_argument('--skip-existing',
action='store_true',
dest='skip_existing',
help='skip project if a backup directory exists')
parser.add_argument('-L',
'--languages',
dest='languages',
help='only allow these languages',
nargs='*')
parser.add_argument('-N',
'--name-regex',
dest='name_regex',
help='python regex to match names against')
parser.add_argument('-H',
'--github-host',
dest='github_host',
help='GitHub Enterprise hostname')
parser.add_argument('-O',
'--organization',
action='store_true',
dest='organization',
help='whether or not this is an organization user')
parser.add_argument('-R',
'--repository',
dest='repository',
help='name of repository to limit backup to')
parser.add_argument('-P', '--private',
action='store_true',
dest='private',
help='include private repositories')
parser.add_argument('-F', '--fork',
action='store_true',
dest='fork',
help='include forked repositories')
parser.add_argument('--prefer-ssh',
action='store_true',
help='Clone repositories using SSH instead of HTTPS')
parser.add_argument('-v', '--version',
action='version',
version='%(prog)s ' + __version__)
return parser.parse_args()
def get_auth(args):
auth = None
if args.token:
auth = base64.b64encode(args.token + ':' + 'x-oauth-basic')
elif args.username and args.password:
auth = base64.b64encode(args.username + ':' + args.password)
elif args.username and not args.password:
log_error('You must specify a password for basic auth when specifying a username')
elif args.password and not args.username:
log_error('You must specify a username for basic auth when specifying a password')
return base64.b64encode(args.token + ':' + 'x-oauth-basic')
return auth
if args.username:
if not args.password:
args.password = getpass.getpass()
return base64.b64encode(args.username + ':' + args.password)
if args.password:
log_error('You must specify a username for basic auth')
return None
def get_github_api_host(args):
if args.github_host:
host = args.github_host + '/api/v3'
else:
host = 'api.github.com'
return host
def get_github_ssh_host(args):
if args.github_host:
host = args.github_host
else:
host = 'github.com'
return host
def retrieve_data(args, template, query_args=None, single_request=False):
auth = get_auth(args)
query_args = get_query_args(query_args)
per_page = 100
page = 0
data = []
if not query_args:
query_args = {}
while True:
page = page + 1
querystring = urllib.urlencode(dict({
'per_page': per_page,
'page': page
}.items() + query_args.items()))
request = _construct_request(per_page, page, query_args, template, auth) # noqa
r, errors = _get_response(request, auth, template)
request = urllib2.Request(template + '?' + querystring)
if auth is not None:
request.add_header('Authorization', 'Basic ' + auth)
r = urllib2.urlopen(request)
status_code = int(r.getcode())
errors = []
if int(r.getcode()) != 200:
errors.append('Bad response from api')
if 'X-RateLimit-Limit' in r.headers and int(r.headers['X-RateLimit-Limit']) == 0:
ratelimit_error = 'No more requests remaining'
if auth is None:
ratelimit_error = ratelimit_error + ', specify username/password or token to raise your github ratelimit'
errors.append(ratelimit_error)
if int(r.getcode()) != 200:
if status_code != 200:
template = 'API request returned HTTP {0}: {1}'
errors.append(template.format(status_code, r.reason))
log_error(errors)
response = json.loads(r.read())
@@ -167,22 +291,119 @@ def retrieve_data(args, template, query_args=None, single_request=False):
return data
def get_query_args(query_args=None):
if not query_args:
query_args = {}
return query_args
def _get_response(request, auth, template):
retry_timeout = 3
errors = []
# We'll make requests in a loop so we can
# delay and retry in the case of rate-limiting
while True:
should_continue = False
try:
r = urllib2.urlopen(request)
except urllib2.HTTPError as exc:
errors, should_continue = _request_http_error(exc, auth, errors) # noqa
r = exc
except urllib2.URLError:
should_continue = _request_url_error(template, retry_timeout)
if not should_continue:
raise
if should_continue:
continue
break
return r, errors
def _construct_request(per_page, page, query_args, template, auth):
querystring = urllib.urlencode(dict({
'per_page': per_page,
'page': page
}.items() + query_args.items()))
request = urllib2.Request(template + '?' + querystring)
if auth is not None:
request.add_header('Authorization', 'Basic ' + auth)
return request
def _request_http_error(exc, auth, errors):
# HTTPError behaves like a Response so we can
# check the status code and headers to see exactly
# what failed.
should_continue = False
headers = exc.headers
limit_remaining = int(headers.get('x-ratelimit-remaining', 0))
if exc.code == 403 and limit_remaining < 1:
# The X-RateLimit-Reset header includes a
# timestamp telling us when the limit will reset
# so we can calculate how long to wait rather
# than inefficiently polling:
gm_now = calendar.timegm(time.gmtime())
reset = int(headers.get('x-ratelimit-reset', 0)) or gm_now
# We'll never sleep for less than 10 seconds:
delta = max(10, reset - gm_now)
limit = headers.get('x-ratelimit-limit')
print('Exceeded rate limit of {} requests; waiting {} seconds to reset'.format(limit, delta), # noqa
file=sys.stderr)
ratelimit_error = 'No more requests remaining'
if auth is None:
ratelimit_error += '; authenticate to raise your GitHub rate limit' # noqa
errors.append(ratelimit_error)
time.sleep(delta)
should_continue = True
return errors, should_continue
def _request_url_error(template, retry_timeout):
# In case of a connection timing out, we can retry a few times
# but we won't crash, so the rest of the backup still runs
log_info('{} timed out'.format(template))
retry_timeout -= 1
if retry_timeout >= 0:
return True
log_error('{0} timed out too much, skipping!'.format(template))
return False
def retrieve_repositories(args):
log_info('Retrieving repositories')
single_request = False
template = 'https://api.github.com/users/{0}/repos'.format(args.user)
template = 'https://{0}/user/repos'.format(
get_github_api_host(args))
if args.organization:
template = 'https://api.github.com/orgs/{0}/repos'.format(args.user)
template = 'https://{0}/orgs/{1}/repos'.format(
get_github_api_host(args),
args.user)
if args.repository:
single_request = True
template = 'https://api.github.com/repos/{0}/{1}'.format(args.user, args.repository)
template = 'https://{0}/repos/{1}/{2}'.format(
get_github_api_host(args),
args.user,
args.repository)
return retrieve_data(args, template, single_request=single_request)
def filter_repositories(args, repositories):
log_info('Filtering repositories')
repositories = [r for r in repositories if r['owner']['login'] == args.user]
name_regex = None
if args.name_regex:
name_regex = re.compile(args.name_regex)
@@ -196,7 +417,7 @@ def filter_repositories(args, repositories):
if not args.private:
repositories = [r for r in repositories if not r['private']]
if languages:
repositories = [r for r in repositories if r['language'] and r['language'].lower() in languages]
repositories = [r for r in repositories if r['language'] and r['language'].lower() in languages] # noqa
if name_regex:
repositories = [r for r in repositories if name_regex.match(r['name'])]
@@ -205,101 +426,273 @@ def filter_repositories(args, repositories):
def backup_repositories(args, output_directory, repositories):
log_info('Backing up repositories')
issue_template = "https://api.github.com/repos"
wiki_template = "git@github.com:{0}.wiki.git"
repos_template = 'https://{0}/repos'.format(get_github_api_host(args))
issue_states = ['open', 'closed']
for repository in repositories:
backup_cwd = os.path.join(output_directory, 'repositories')
repo_cwd = os.path.join(backup_cwd, repository['name'])
repo_dir = os.path.join(repo_cwd, 'repository')
if args.prefer_ssh:
repo_url = repository['ssh_url']
else:
repo_url = repository['clone_url']
if args.include_repository or args.include_everything:
mkdir_p(backup_cwd, repo_cwd)
exists = os.path.isdir('{0}/repository/.git'.format(repo_cwd))
if args.skip_existing and exists:
continue
fetch_repository(repository['name'],
repo_url,
repo_dir,
skip_existing=args.skip_existing)
if exists:
log_info('Updating {0} repository'.format(repository['full_name']))
git_command = ["git", "pull", 'origin', 'master']
logging_subprocess(git_command, logger=None, cwd=os.path.join(repo_cwd, 'repository'))
else:
log_info('Cloning {0} repository'.format(repository['full_name']))
git_command = ["git", "clone", repository['clone_url'], 'repository']
logging_subprocess(git_command, logger=None, cwd=repo_cwd)
if repository['has_wiki'] and (args.include_wiki or args.include_everything):
mkdir_p(backup_cwd, repo_cwd)
exists = os.path.isdir('{0}/wiki/.git'.format(repo_cwd))
if args.skip_existing and exists:
continue
if exists:
log_info('Updating {0} wiki'.format(repository['full_name']))
git_command = ["git", "pull", 'origin', 'master']
logging_subprocess(git_command, logger=None, cwd=os.path.join(repo_cwd, 'wiki'))
else:
log_info('Cloning {0} wiki'.format(repository['full_name']))
git_command = ["git", "clone", wiki_template.format(repository['full_name']), 'wiki']
logging_subprocess(git_command, logger=None, cwd=repo_cwd)
download_wiki = (args.include_wiki or args.include_everything)
if repository['has_wiki'] and download_wiki:
fetch_repository(repository['name'],
repo_url.replace('.git', '.wiki.git'),
os.path.join(repo_cwd, 'wiki'),
skip_existing=args.skip_existing)
if args.include_issues or args.include_everything:
if args.skip_existing and os.path.isdir('{0}/issues/.git'.format(repo_cwd)):
continue
backup_issues(args, repo_cwd, repository, repos_template)
if args.include_pulls or args.include_everything:
backup_pulls(args, repo_cwd, repository, repos_template)
if args.include_milestones or args.include_everything:
backup_milestones(args, repo_cwd, repository, repos_template)
if args.include_labels or args.include_everything:
backup_labels(args, repo_cwd, repository, repos_template)
if args.include_hooks or args.include_everything:
backup_hooks(args, repo_cwd, repository, repos_template)
def backup_issues(args, repo_cwd, repository, repos_template):
has_issues_dir = os.path.isdir('{0}/issues/.git'.format(repo_cwd))
if args.skip_existing and has_issues_dir:
return
log_info('Retrieving {0} issues'.format(repository['full_name']))
issue_cwd = os.path.join(repo_cwd, 'issues')
mkdir_p(backup_cwd, repo_cwd, issue_cwd)
mkdir_p(repo_cwd, issue_cwd)
issues = {}
_issue_template = '{0}/{1}/issues'.format(issue_template, repository['full_name'])
issues_skipped = 0
issues_skipped_message = ''
_issue_template = '{0}/{1}/issues'.format(repos_template,
repository['full_name'])
issue_states = ['open', 'closed']
for issue_state in issue_states:
query_args = {
'filter': 'all',
'state': issue_state
}
_issues = retrieve_data(args, _issue_template, query_args=query_args)
_issues = retrieve_data(args,
_issue_template,
query_args=query_args)
for issue in _issues:
# skip pull requests which are also returned as issues
# if retrieving pull requests is requested as well
if 'pull_request' in issue and (args.include_pulls or args.include_everything):
issues_skipped += 1
continue
issues[issue['number']] = issue
log_info('Saving {0} issues to disk'.format(len(issues.keys())))
for number, issue in issues.iteritems():
if issues_skipped:
issues_skipped_message = ' (skipped {0} pull requests)'.format(issues_skipped)
log_info('Saving {0} issues to disk{1}'.format(len(issues.keys()), issues_skipped_message))
comments_template = _issue_template + '/{0}/comments'
events_template = _issue_template + '/{0}/events'
for number, issue in issues.iteritems():
if args.include_issue_comments or args.include_everything:
issues[number]['comment_data'] = retrieve_data(args, comments_template.format(number))
template = comments_template.format(number)
issues[number]['comment_data'] = retrieve_data(args, template)
if args.include_issue_events or args.include_everything:
issues[number]['event_data'] = retrieve_data(args, events_template.format(number))
template = events_template.format(number)
issues[number]['event_data'] = retrieve_data(args, template)
with open('{0}/{1}.json'.format(issue_cwd, number), 'w') as issue_file:
json.dump(issue, issue_file, sort_keys=True, indent=4, separators=(',', ': '))
issue_file = '{0}/{1}.json'.format(issue_cwd, number)
with codecs.open(issue_file, 'w', encoding='utf-8') as f:
json_dump(issue, f)
def backup_pulls(args, repo_cwd, repository, repos_template):
has_pulls_dir = os.path.isdir('{0}/pulls/.git'.format(repo_cwd))
if args.skip_existing and has_pulls_dir:
return
log_info('Retrieving {0} pull requests'.format(repository['full_name'])) # noqa
pulls_cwd = os.path.join(repo_cwd, 'pulls')
mkdir_p(repo_cwd, pulls_cwd)
pulls = {}
_pulls_template = '{0}/{1}/pulls'.format(repos_template,
repository['full_name'])
pull_states = ['open', 'closed']
for pull_state in pull_states:
query_args = {
'filter': 'all',
'state': pull_state
}
_pulls = retrieve_data(args,
_pulls_template,
query_args=query_args)
for pull in _pulls:
pulls[pull['number']] = pull
log_info('Saving {0} pull requests to disk'.format(len(pulls.keys())))
comments_template = _pulls_template + '/{0}/comments'
commits_template = _pulls_template + '/{0}/commits'
for number, pull in pulls.iteritems():
if args.include_pull_comments or args.include_everything:
template = comments_template.format(number)
pulls[number]['comment_data'] = retrieve_data(args, template)
if args.include_pull_commits or args.include_everything:
template = commits_template.format(number)
pulls[number]['commit_data'] = retrieve_data(args, template)
pull_file = '{0}/{1}.json'.format(pulls_cwd, number)
with codecs.open(pull_file, 'w', encoding='utf-8') as f:
json_dump(pull, f)
def backup_milestones(args, repo_cwd, repository, repos_template):
milestone_cwd = os.path.join(repo_cwd, 'milestones')
if args.skip_existing and os.path.isdir(milestone_cwd):
return
log_info('Retrieving {0} milestones'.format(repository['full_name']))
mkdir_p(repo_cwd, milestone_cwd)
template = '{0}/{1}/milestones'.format(repos_template,
repository['full_name'])
query_args = {
'state': 'all'
}
_milestones = retrieve_data(args, template, query_args=query_args)
milestones = {}
for milestone in _milestones:
milestones[milestone['number']] = milestone
log_info('Saving {0} milestones to disk'.format(len(milestones.keys())))
for number, milestone in milestones.iteritems():
milestone_file = '{0}/{1}.json'.format(milestone_cwd, number)
with codecs.open(milestone_file, 'w', encoding='utf-8') as f:
json_dump(milestone, f)
def backup_labels(args, repo_cwd, repository, repos_template):
label_cwd = os.path.join(repo_cwd, 'labels')
output_file = '{0}/labels.json'.format(label_cwd)
template = '{0}/{1}/labels'.format(repos_template,
repository['full_name'])
_backup_data(args,
'labels',
template,
output_file,
label_cwd)
def backup_hooks(args, repo_cwd, repository, repos_template):
auth = get_auth(args)
if not auth:
log_info("Skipping hooks since no authentication provided")
return
hook_cwd = os.path.join(repo_cwd, 'hooks')
output_file = '{0}/hooks.json'.format(hook_cwd)
template = '{0}/{1}/hooks'.format(repos_template,
repository['full_name'])
try:
_backup_data(args,
'hooks',
template,
output_file,
hook_cwd)
except SystemExit:
log_info("Unable to read hooks, skipping")
def fetch_repository(name, remote_url, local_dir, skip_existing=False):
clone_exists = os.path.exists(os.path.join(local_dir, '.git'))
if clone_exists and skip_existing:
return
initialized = subprocess.call('git ls-remote ' + remote_url,
stdout=FNULL,
stderr=FNULL,
shell=True)
if initialized == 128:
log_info("Skipping {0} ({1}) since it's not initialized".format(name, remote_url))
return
if clone_exists:
log_info('Updating {0} in {1}'.format(name, local_dir))
git_command = ['git', 'remote', 'rm', 'origin']
logging_subprocess(git_command, None, cwd=local_dir)
git_command = ['git', 'remote', 'add', 'origin', remote_url]
logging_subprocess(git_command, None, cwd=local_dir)
git_command = ['git', 'fetch', '--all', '--tags', '--prune']
logging_subprocess(git_command, None, cwd=local_dir)
else:
log_info('Cloning {0} repository from {1} to {2}'.format(name,
remote_url,
local_dir))
git_command = ['git', 'clone', remote_url, local_dir]
logging_subprocess(git_command, None)
def backup_account(args, output_directory):
account_cwd = os.path.join(output_directory, 'account')
if args.include_starred or args.include_everything:
if not args.skip_existing or not os.path.exists('{0}/starred.json'.format(account_cwd)):
log_info('Retrieving {0} starred repositories'.format(args.user))
mkdir_p(account_cwd)
starred_template = "https://api.github.com/users/{0}/starred"
starred = retrieve_data(args, starred_template.format(args.user))
log_info('Writing {0} starred repositories'.format(len(starred)))
with open('{0}/starred.json'.format(account_cwd), 'w') as starred_file:
json.dump(starred, starred_file, sort_keys=True, indent=4, separators=(',', ': '))
if args.include_starred or args.include_everything:
output_file = '{0}/starred.json'.format(account_cwd)
template = "https://{0}/users/{1}/starred"
template = template.format(get_github_api_host(args), args.user)
_backup_data(args,
'starred repositories',
template,
output_file,
account_cwd)
if args.include_watched or args.include_everything:
if not args.skip_existing or not os.path.exists('{0}/watched.json'.format(account_cwd)):
log_info('Retrieving {0} watched repositories'.format(args.user))
mkdir_p(account_cwd)
output_file = '{0}/watched.json'.format(account_cwd)
template = "https://{0}/users/{1}/subscriptions"
template = template.format(get_github_api_host(args), args.user)
_backup_data(args,
'watched repositories',
template,
output_file,
account_cwd)
watched_template = "https://api.github.com/users/{0}/subscriptions"
watched = retrieve_data(args, watched_template.format(args.user))
log_info('Writing {0} watched repositories'.format(len(watched)))
with open('{0}/watched.json'.format(account_cwd), 'w') as watched_file:
json.dump(watched, watched_file, sort_keys=True, indent=4, separators=(',', ': '))
def _backup_data(args, name, template, output_file, output_directory):
skip_existing = args.skip_existing
if not skip_existing or not os.path.exists(output_file):
log_info('Retrieving {0} {1}'.format(args.user, name))
mkdir_p(output_directory)
data = retrieve_data(args, template)
log_info('Writing {0} {1} to disk'.format(len(data), name))
with codecs.open(output_file, 'w', encoding='utf-8') as f:
json_dump(data, f)
def json_dump(data, output_file):
json.dump(data,
output_file,
ensure_ascii=False,
sort_keys=True,
indent=4,
separators=(',', ': '))
def main():
@@ -307,7 +700,8 @@ def main():
output_directory = os.path.realpath(args.output_directory)
if not os.path.isdir(output_directory):
log_error('Specified output directory is not a directory: {0}'.format(output_directory))
log_info('Create output directory {0}'.format(output_directory))
mkdir_p(output_directory)
log_info('Backing up user {0} to {1}'.format(args.user, output_directory))

github_backup/__init__.py

@@ -1 +1 @@
__version__ = '0.2.0'
__version__ = '0.8.0'

release (new executable file, 127 lines)

@@ -0,0 +1,127 @@
#!/usr/bin/env bash
set -eo pipefail; [[ $RELEASE_TRACE ]] && set -x
PACKAGE_NAME='github-backup'
INIT_PACKAGE_NAME='github_backup'
PUBLIC="true"
# Colors
COLOR_OFF="\033[0m" # unsets color to term fg color
RED="\033[0;31m" # red
GREEN="\033[0;32m" # green
YELLOW="\033[0;33m" # yellow
MAGENTA="\033[0;35m" # magenta
CYAN="\033[0;36m" # cyan
# ensure wheel is available
pip install wheel > /dev/null
command -v gitchangelog >/dev/null 2>&1 || {
echo -e "${RED}WARNING: Missing gitchangelog binary, please run: pip install gitchangelog==2.2.0${COLOR_OFF}\n"
exit 1
}
command -v rst-lint > /dev/null || {
echo -e "${RED}WARNING: Missing rst-lint binary, please run: pip install restructuredtext_lint${COLOR_OFF}\n"
exit 1
}
if [[ "$@" != "major" ]] && [[ "$@" != "minor" ]] && [[ "$@" != "patch" ]]; then
echo -e "${RED}WARNING: Invalid release type, must specify 'major', 'minor', or 'patch'${COLOR_OFF}\n"
exit 1
fi
echo -e "\n${GREEN}STARTING RELEASE PROCESS${COLOR_OFF}\n"
set +e;
git status | grep "working directory clean" &> /dev/null
if [ ! $? -eq 0 ]; then # working directory is NOT clean
echo -e "${RED}WARNING: You have uncomitted changes, you may have forgotten something${COLOR_OFF}\n"
exit 1
fi
set -e;
echo -e "${YELLOW}--->${COLOR_OFF} Updating local copy"
git pull -q origin master
echo -e "${YELLOW}--->${COLOR_OFF} Retrieving release versions"
current_version=$(cat ${INIT_PACKAGE_NAME}/__init__.py |grep '__version__ ='|sed 's/[^0-9.]//g')
major=$(echo $current_version | awk '{split($0,a,"."); print a[1]}')
minor=$(echo $current_version | awk '{split($0,a,"."); print a[2]}')
patch=$(echo $current_version | awk '{split($0,a,"."); print a[3]}')
if [[ "$@" == "major" ]]; then
major=$(($major + 1));
minor="0"
patch="0"
elif [[ "$@" == "minor" ]]; then
minor=$(($minor + 1));
patch="0"
elif [[ "$@" == "patch" ]]; then
patch=$(($patch + 1));
fi
next_version="${major}.${minor}.${patch}"
echo -e "${YELLOW} >${COLOR_OFF} ${MAGENTA}${current_version}${COLOR_OFF} -> ${MAGENTA}${next_version}${COLOR_OFF}"
echo -e "${YELLOW}--->${COLOR_OFF} Ensuring readme passes lint checks (if this fails, run rst-lint)"
rst-lint README.rst > /dev/null
echo -e "${YELLOW}--->${COLOR_OFF} Creating necessary temp file"
tempfoo=$(basename $0)
TMPFILE=$(mktemp /tmp/${tempfoo}.XXXXXX) || {
echo -e "${RED}WARNING: Cannot create temp file using mktemp in /tmp dir ${COLOR_OFF}\n"
exit 1
}
find_this="__version__ = '$current_version'"
replace_with="__version__ = '$next_version'"
echo -e "${YELLOW}--->${COLOR_OFF} Updating ${INIT_PACKAGE_NAME}/__init__.py"
sed "s/$find_this/$replace_with/" ${INIT_PACKAGE_NAME}/__init__.py > $TMPFILE && mv $TMPFILE ${INIT_PACKAGE_NAME}/__init__.py
find_this="${PACKAGE_NAME}.git@$current_version"
replace_with="${PACKAGE_NAME}.git@$next_version"
echo -e "${YELLOW}--->${COLOR_OFF} Updating README.rst"
sed "s/$find_this/$replace_with/" README.rst > $TMPFILE && mv $TMPFILE README.rst
if [ -f docs/conf.py ]; then
echo -e "${YELLOW}--->${COLOR_OFF} Updating docs"
find_this="version = '${current_version}'"
replace_with="version = '${next_version}'"
sed "s/$find_this/$replace_with/" docs/conf.py > $TMPFILE && mv $TMPFILE docs/conf.py
find_this="version = '${current_version}'"
replace_with="release = '${next_version}'"
sed "s/$find_this/$replace_with/" docs/conf.py > $TMPFILE && mv $TMPFILE docs/conf.py
fi
echo -e "${YELLOW}--->${COLOR_OFF} Updating CHANGES.rst for new release"
version_header="$next_version ($(date +%F))"
set +e; dashes=$(yes '-'|head -n ${#version_header}|tr -d '\n') ; set -e
gitchangelog |sed "4s/.*/$version_header/"|sed "5s/.*/$dashes/" > $TMPFILE && mv $TMPFILE CHANGES.rst
echo -e "${YELLOW}--->${COLOR_OFF} Adding changed files to git"
git add CHANGES.rst README.rst ${INIT_PACKAGE_NAME}/__init__.py
if [ -f docs/conf.py ]; then git add docs/conf.py; fi
echo -e "${YELLOW}--->${COLOR_OFF} Creating release"
git commit -q -m "Release version $next_version"
echo -e "${YELLOW}--->${COLOR_OFF} Tagging release"
git tag -a $next_version -m "Release version $next_version"
echo -e "${YELLOW}--->${COLOR_OFF} Pushing release and tags to github"
git push -q origin master && git push -q --tags
if [[ "$PUBLIC" == "true" ]]; then
echo -e "${YELLOW}--->${COLOR_OFF} Creating python release"
cp README.rst README
python setup.py sdist bdist_wheel upload > /dev/null
rm README
fi
echo -e "\n${CYAN}RELEASED VERSION ${next_version}!${COLOR_OFF}\n"