Compare commits

12 Commits

Author SHA1 Message Date
Jose Diaz-Gonzalez
70f711ea68 Release version 0.33.0 2020-04-13 17:14:20 -04:00
Jose Diaz-Gonzalez
3fc9957aac Merge pull request #149 from eht16/simple_api_request_throttling
Add basic API request throttling
2020-04-13 17:13:58 -04:00
Enrico Tröger
78098aae23 Add basic API request throttling
A simple approach to throttle API requests and so keep within the rate
limits of the API. Can be enabled with "--throttle-limit" to specify
when throttling should start.
"--throttle-pause" defines the time to sleep between further API
requests.
2020-04-13 23:06:09 +02:00
Jose Diaz-Gonzalez
fb7cc5ed53 Release version 0.32.0 2020-04-13 17:02:59 -04:00
Jose Diaz-Gonzalez
c0679b9cc3 Merge pull request #148 from eht16/logging_with_timestamp
Add timestamp to log messages
2020-04-13 16:38:36 -04:00
Enrico Tröger
03b9d1b2d8 Add timestamp to log messages 2020-04-13 22:11:48 +02:00
Jose Diaz-Gonzalez
5025f69878 Merge pull request #147 from tomhoover/update-readme
Update README.rst to match 'github-backup -h'
2020-03-24 11:17:44 -04:00
Tom Hoover
a351cdc103 Update README.rst to match 'github-backup -h' 2020-03-22 08:48:50 -05:00
Jose Diaz-Gonzalez
85e4399408 Release version 0.31.0 2020-02-25 14:41:22 -05:00
Jose Diaz-Gonzalez
c8171b692a Merge pull request #146 from timm3/upstream-123
Authenticate as Github App
2020-02-25 14:39:27 -05:00
ethan
523c811cc6 #123 update: changed --as-app 'help' description 2020-02-25 13:13:20 -06:00
ethan
857ad0afab #123: Support Authenticating As Github Application 2020-02-25 12:35:24 -06:00
5 changed files with 105 additions and 39 deletions
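
The throttling commit above (78098aae23) describes pausing between API requests once the remaining quota drops to a configured threshold. A minimal sketch of that decision follows, assuming the remaining count has already been read from the `x-ratelimit-remaining` response header. The function name `maybe_throttle` and the injectable `sleep` parameter are hypothetical and not part of github-backup.

```python
import time


def maybe_throttle(limit_remaining, throttle_limit=0, throttle_pause=30.0,
                   sleep=time.sleep):
    """Pause when the remaining API quota is at or below the threshold.

    Hypothetical helper: mirrors the comparison used in the diff
    (limit_remaining <= throttle_limit) but is not the project's function.
    Returns the number of seconds slept, or 0.0 when no pause was needed.
    """
    if limit_remaining <= throttle_limit:
        sleep(throttle_pause)
        return throttle_pause
    return 0.0


if __name__ == '__main__':
    # Plenty of quota left: no pause.
    print(maybe_throttle(4000, throttle_limit=100, throttle_pause=0.1))
    # Quota at the threshold: pause for throttle_pause seconds.
    print(maybe_throttle(100, throttle_limit=100, throttle_pause=0.1))
```

Passing `sleep` as a parameter is only a convenience for testing; the values correspond to what `--throttle-limit` and `--throttle-pause` would supply.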

@@ -1,8 +1,30 @@
Changelog
=========
0.30.0 (2020-02-14)
0.33.0 (2020-04-13)
-------------------
------------------------
- Add basic API request throttling. [Enrico Tröger]
A simple approach to throttle API requests and so keep within the rate
limits of the API. Can be enabled with "--throttle-limit" to specify
when throttling should start.
"--throttle-pause" defines the time to sleep between further API
requests.
0.32.0 (2020-04-13)
-------------------
- Add timestamp to log messages. [Enrico Tröger]
0.31.0 (2020-02-25)
-------------------
- #123 update: changed --as-app 'help' description. [ethan]
- #123: Support Authenticating As Github Application. [ethan]
0.29.0 (2020-02-14)
-------------------
- #50 update: keep main() in bin. [ethan]
- #50 - refactor for friendlier import. [ethan]

@@ -29,19 +29,19 @@ Usage
CLI Usage is as follows::
github-backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN]
github-backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN] [--as-app]
[-o OUTPUT_DIRECTORY] [-i] [--starred] [--all-starred]
[--watched] [--followers] [--following] [--all]
[--issues] [--issue-comments] [--issue-events] [--pulls]
[--pull-comments] [--pull-commits] [--labels] [--hooks]
[--milestones] [--repositories] [--releases] [--assets]
[--pull-comments] [--pull-commits] [--pull-details]
[--labels] [--hooks] [--milestones] [--repositories]
[--bare] [--lfs] [--wikis] [--gists] [--starred-gists]
[--skip-existing]
[-L [LANGUAGES [LANGUAGES ...]]] [-N NAME_REGEX]
[-H GITHUB_HOST] [-O] [-R REPOSITORY] [-P] [-F]
[--prefer-ssh] [-v]
[--skip-existing] [-L [LANGUAGES [LANGUAGES ...]]]
[-N NAME_REGEX] [-H GITHUB_HOST] [-O] [-R REPOSITORY]
[-P] [-F] [--prefer-ssh] [-v]
[--keychain-name OSX_KEYCHAIN_ITEM_NAME]
[--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT]
[--releases] [--assets]
USER
Backup a github account
@@ -57,36 +57,36 @@ CLI Usage is as follows::
password for basic auth. If a username is given but
not a password, the password will be prompted for.
-t TOKEN, --token TOKEN
personal access or OAuth token, or path to token
(file://...)
personal access, OAuth, or JSON Web token, or path to
token (file://...)
--as-app authenticate as github app instead of as a user.
-o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
directory at which to backup the repositories
-i, --incremental incremental backup
--starred include JSON output of starred repositories in backup
--all-starred include starred repositories in backup
--watched include watched repositories in backup
--all-starred include starred repositories in backup [*]
--watched include JSON output of watched repositories in backup
--followers include JSON output of followers in backup
--following include JSON output of following users in backup
--all include everything in backup
--all include everything in backup (not including [*])
--issues include issues in backup
--issue-comments include issue comments in backup
--issue-events include issue events in backup
--pulls include pull requests in backup
--pull-comments include pull request review comments in backup
--pull-commits include pull request commits in backup
--pull-details include more pull request details in backup [*]
--labels include labels in backup
--hooks include hooks in backup (works only when
authenticated)
--milestones include milestones in backup
--repositories include repository clone in backup
--releases include repository releases' information without assets or binaries
--assets include assets alongside release information; only applies if including releases
--bare clone bare repositories
--lfs clone LFS repositories (requires Git LFS to be
installed, https://git-lfs.github.com)
installed, https://git-lfs.github.com) [*]
--wikis include wiki clone in backup
--gists include gists in backup
--starred-gists include starred gists in backup
--gists include gists in backup [*]
--starred-gists include starred gists in backup [*]
--skip-existing skip project if a backup directory exists
-L [LANGUAGES [LANGUAGES ...]], --languages [LANGUAGES [LANGUAGES ...]]
only allow these languages
@@ -97,8 +97,8 @@ CLI Usage is as follows::
-O, --organization whether or not this is an organization user
-R REPOSITORY, --repository REPOSITORY
name of repository to limit backup to
-P, --private include private repositories
-F, --fork include forked repositories
-P, --private include private repositories [*]
-F, --fork include forked repositories [*]
--prefer-ssh Clone repositories using SSH instead of HTTPS
-v, --version show program's version number and exit
--keychain-name OSX_KEYCHAIN_ITEM_NAME
@@ -107,6 +107,10 @@ CLI Usage is as follows::
--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT
OSX ONLY: account field of password item in OSX
keychain that holds the personal access or OAuth token
--releases include release information, not including assets or
binaries
--assets include assets alongside release information; only
applies if including releases
The package can be used to backup an *entire* organization or repository, including issues and wikis in the most appropriate format (clones for wikis, json files for issues).
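
The `-t TOKEN` help text above says the value may be either a token or a path given as `file://...`. The following is a minimal sketch of how such a value could be resolved, assuming the prefix is exactly `file://` and that the token is the first line of the file. The names `PATH_SPECIFIER` and `resolve_token` are hypothetical and not taken from the project.

```python
PATH_SPECIFIER = 'file://'  # assumed prefix, based on the help text


def resolve_token(token):
    """Return the token itself, or the first line of the file it points to.

    Hypothetical helper for illustration; not github-backup's own function.
    """
    if token.startswith(PATH_SPECIFIER):
        path = token[len(PATH_SPECIFIER):]
        with open(path, 'rt') as handle:
            # Only the first line is used; surrounding whitespace is dropped.
            return handle.readline().strip()
    return token
```

For example, `resolve_token('file:///home/me/.github-token')` would read the token from that (illustrative) path, while a plain token string is returned unchanged.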

@@ -26,9 +26,12 @@ def main():
if args.lfs_clone:
check_git_lfs_install()
if not args.as_app:
log_info('Backing up user {0} to {1}'.format(args.user, output_directory))
authenticated_user = get_authenticated_user(args)
else:
authenticated_user = {'login': None}
repositories = retrieve_repositories(args, authenticated_user)
repositories = filter_repositories(args, repositories)
backup_repositories(args, output_directory, repositories)

@@ -1 +1 @@
__version__ = '0.30.0'
__version__ = '0.33.0'

@@ -7,6 +7,7 @@ import argparse
import base64
import calendar
import codecs
import datetime
import errno
import getpass
import json
@@ -50,6 +51,10 @@ except ImportError:
FNULL = open(os.devnull, 'w')
def _get_log_date():
return datetime.datetime.isoformat(datetime.datetime.now())
def log_error(message):
"""
Log message (str) or messages (List[str]) to stderr and exit with status 1
@@ -66,7 +71,7 @@ def log_info(message):
message = [message]
for msg in message:
sys.stdout.write("{0}\n".format(msg))
sys.stdout.write("{0}: {1}\n".format(_get_log_date(), msg))
def log_warning(message):
@@ -77,7 +82,7 @@ def log_warning(message):
message = [message]
for msg in message:
sys.stderr.write("{0}\n".format(msg))
sys.stderr.write("{0}: {1}\n".format(_get_log_date(), msg))
def logging_subprocess(popenargs,
@@ -168,7 +173,11 @@ def parse_args():
parser.add_argument('-t',
'--token',
dest='token',
help='personal access or OAuth token, or path to token (file://...)') # noqa
help='personal access, OAuth, or JSON Web token, or path to token (file://...)') # noqa
parser.add_argument('--as-app',
action='store_true',
dest='as_app',
help='authenticate as github app instead of as a user.')
parser.add_argument('-o',
'--output-directory',
default='.',
@@ -322,10 +331,20 @@ def parse_args():
action='store_true',
dest='include_assets',
help='include assets alongside release information; only applies if including releases')
parser.add_argument('--throttle-limit',
dest='throttle_limit',
type=int,
default=0,
help='start throttling of GitHub API requests after this amount of API requests remain')
parser.add_argument('--throttle-pause',
dest='throttle_pause',
type=float,
default=30.0,
help='wait this amount of seconds when API request throttling is active (default: 30.0, requires --throttle-limit to be set)')
return parser.parse_args()
def get_auth(args, encode=True):
def get_auth(args, encode=True, for_git_cli=False):
auth = None
if args.osx_keychain_item_name:
@@ -353,7 +372,13 @@ def get_auth(args, encode=True):
if args.token.startswith(_path_specifier):
args.token = open(args.token[len(_path_specifier):],
'rt').readline().strip()
if not args.as_app:
auth = args.token + ':' + 'x-oauth-basic'
else:
if not for_git_cli:
auth = args.token
else:
auth = 'x-access-token:' + args.token
elif args.username:
if not args.password:
args.password = getpass.getpass()
@@ -399,7 +424,7 @@ def get_github_repo_url(args, repository):
if args.prefer_ssh:
return repository['ssh_url']
auth = get_auth(args, False)
auth = get_auth(args, encode=False, for_git_cli=True)
if auth and repository['private'] == True:
repo_url = 'https://{0}@{1}/{2}/{3}.git'.format(
auth,
@@ -413,24 +438,32 @@ def get_github_repo_url(args, repository):
def retrieve_data_gen(args, template, query_args=None, single_request=False):
auth = get_auth(args)
auth = get_auth(args, encode=not args.as_app)
query_args = get_query_args(query_args)
per_page = 100
page = 0
while True:
page = page + 1
request = _construct_request(per_page, page, query_args, template, auth) # noqa
request = _construct_request(per_page, page, query_args, template, auth, as_app=args.as_app) # noqa
r, errors = _get_response(request, auth, template)
status_code = int(r.getcode())
# be gentle with API request limit and throttle requests if remaining requests getting low
limit_remaining = int(r.headers.get('x-ratelimit-remaining', 0))
if limit_remaining <= args.throttle_limit:
log_info(
'API request limit hit: {} requests left, pausing further requests for {}s'.format(
limit_remaining,
args.throttle_pause))
time.sleep(args.throttle_pause)
retries = 0
while retries < 3 and status_code == 502:
print('API request returned HTTP 502: Bad Gateway. Retrying in 5 seconds')
log_warning('API request returned HTTP 502: Bad Gateway. Retrying in 5 seconds')
retries += 1
time.sleep(5)
request = _construct_request(per_page, page, query_args, template, auth) # noqa
request = _construct_request(per_page, page, query_args, template, auth, as_app=args.as_app) # noqa
r, errors = _get_response(request, auth, template)
status_code = int(r.getcode())
@@ -495,7 +528,7 @@ def _get_response(request, auth, template):
return r, errors
def _construct_request(per_page, page, query_args, template, auth):
def _construct_request(per_page, page, query_args, template, auth, as_app=None):
querystring = urlencode(dict(list({
'per_page': per_page,
'page': page
@@ -503,7 +536,13 @@ def _construct_request(per_page, page, query_args, template, auth):
request = Request(template + '?' + querystring)
if auth is not None:
if not as_app:
request.add_header('Authorization', 'Basic '.encode('ascii') + auth)
else:
if not PY2:
auth = auth.encode('ascii')
request.add_header('Authorization', 'token '.encode('ascii') + auth)
request.add_header('Accept', 'application/vnd.github.machine-man-preview+json')
log_info('Requesting {}?{}'.format(template, querystring))
return request
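
The hunk above picks between two `Authorization` header forms depending on `as_app`. As a rough illustration in Python 3 only, the two forms could be built as below. It assumes the user form is standard HTTP Basic auth over `token:x-oauth-basic`, as the `get_auth` hunk suggests. The function name `build_auth_headers` is hypothetical and the sketch leaves out the project's Python 2 handling.

```python
import base64


def build_auth_headers(token, as_app=False):
    """Return request headers for a user token or a GitHub App token.

    Hypothetical helper for illustration; not github-backup's own function.
    """
    if as_app:
        # App tokens are sent as-is, with the preview media type from the diff.
        return {
            'Authorization': 'token ' + token,
            'Accept': 'application/vnd.github.machine-man-preview+json',
        }
    # User tokens: HTTP Basic auth with the token as user name (assumption).
    credentials = (token + ':x-oauth-basic').encode('ascii')
    encoded = base64.b64encode(credentials).decode('ascii')
    return {'Authorization': 'Basic ' + encoded}
```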
@@ -528,12 +567,10 @@ def _request_http_error(exc, auth, errors):
delta = max(10, reset - gm_now)
limit = headers.get('x-ratelimit-limit')
print('Exceeded rate limit of {} requests; waiting {} seconds to reset'.format(limit, delta), # noqa
file=sys.stderr)
log_warning('Exceeded rate limit of {} requests; waiting {} seconds to reset'.format(limit, delta)) # noqa
if auth is None:
print('Hint: Authenticate to raise your GitHub rate limit',
file=sys.stderr)
log_info('Hint: Authenticate to raise your GitHub rate limit')
time.sleep(delta)
should_continue = True
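
The last hunk waits `max(10, reset - gm_now)` seconds once the rate limit has been exceeded. A small sketch of that calculation follows, assuming `reset` comes from the `x-ratelimit-reset` response header (a UTC epoch timestamp) and `gm_now` is the current UTC epoch time. The function name `seconds_until_reset` and its `now` and `minimum` parameters are hypothetical.

```python
import calendar
import time


def seconds_until_reset(headers, now=None, minimum=10):
    """Compute how long to wait before the rate limit window resets.

    Hypothetical helper for illustration; not github-backup's own function.
    `headers` is any mapping with a .get() method, such as a dict.
    """
    reset = int(headers.get('x-ratelimit-reset', 0))
    if now is None:
        # Current time as seconds since the epoch, in UTC.
        now = calendar.timegm(time.gmtime())
    # Never wait less than `minimum` seconds, even if the reset has passed.
    return max(minimum, reset - now)
```

The floor of 10 seconds matches the diff and guards against a reset time that is already in the past or only moments away.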