Compare commits

...

28 Commits

Author SHA1 Message Date
Jose Diaz-Gonzalez
031a984434 Release version 0.36.0 2020-08-29 02:37:48 -04:00
Jose Diaz-Gonzalez
9e16f39e3e Merge pull request #157 from albertyw/lint 2020-08-29 02:37:19 -04:00
Albert Wang
2de96390be Add flake8 instructions to readme 2020-08-28 23:13:24 -07:00
Albert Wang
78cff47a91 Fix regex string 2020-08-28 23:13:24 -07:00
Albert Wang
fa27988c1c Update boolean check 2020-08-28 23:13:23 -07:00
Albert Wang
bb2e2b8c6f Fix whitespace issues 2020-08-28 23:13:23 -07:00
Albert Wang
8fd0f2b64f Do not use bare excepts 2020-08-28 23:13:23 -07:00
Jose Diaz-Gonzalez
753a551961 Merge pull request #161 from albertyw/circleci-project-setup
Add circleci config
2020-08-29 01:48:49 -04:00
Albert Wang
607b6ca69b Add .circleci/config.yml 2020-08-28 02:33:51 -07:00
Jose Diaz-Gonzalez
ef71655b01 Merge pull request #160 from wbolster/patch-1
Include --private flag in example
2020-08-27 13:23:28 -04:00
wouter bolsterlee
d8bcbfa644 Include --private flag in example
By default, private repositories are not included. This is surprising.
It took me a while to figure this out, and making that clear in the
example can help others to be aware of that.
2020-08-27 17:01:56 +02:00
Jose Diaz-Gonzalez
751b0d6e82 Release version 0.35.0 2020-08-05 12:02:21 -04:00
Jose Diaz-Gonzalez
ea633ca2bb Merge pull request #156 from samanthaq/restore-optional-throttling
Make API request throttling optional
2020-08-05 12:01:56 -04:00
Samantha Baldwin
a2115ce3e5 Make API request throttling optional 2020-08-05 11:53:17 -04:00
Jose Diaz-Gonzalez
8a00bb1903 Release version 0.34.0 2020-07-24 13:31:03 -04:00
Jose Diaz-Gonzalez
e53f8d4724 Merge pull request #153 from 0x6d617474/gist_ssh
Add logic for transforming gist repository urls to ssh
2020-07-24 13:30:40 -04:00
Matt Fields
356f5f674b Add logic for transforming gist repository urls to ssh 2020-07-07 17:54:16 -04:00
Jose Diaz-Gonzalez
13128635cb Release version 0.33.1 2020-05-28 16:44:40 -04:00
Jose Diaz-Gonzalez
6e6842b025 Merge pull request #151 from garymoon/readme-update-0.33 2020-05-28 16:43:57 -04:00
Gary Moon
272177c395 Update the readme for new switches added in 0.33 2020-05-26 19:59:47 -04:00
Jose Diaz-Gonzalez
70f711ea68 Release version 0.33.0 2020-04-13 17:14:20 -04:00
Jose Diaz-Gonzalez
3fc9957aac Merge pull request #149 from eht16/simple_api_request_throttling
Add basic API request throttling
2020-04-13 17:13:58 -04:00
Enrico Tröger
78098aae23 Add basic API request throttling
A simple approach to throttle API requests and so keep within the rate
limits of the API. Can be enabled with "--throttle-limit" to specify
when throttling should start.
"--throttle-pause" defines the time to sleep between further API
requests.
2020-04-13 23:06:09 +02:00
Jose Diaz-Gonzalez
fb7cc5ed53 Release version 0.32.0 2020-04-13 17:02:59 -04:00
Jose Diaz-Gonzalez
c0679b9cc3 Merge pull request #148 from eht16/logging_with_timestamp
Add timestamp to log messages
2020-04-13 16:38:36 -04:00
Enrico Tröger
03b9d1b2d8 Add timestamp to log messages 2020-04-13 22:11:48 +02:00
Jose Diaz-Gonzalez
5025f69878 Merge pull request #147 from tomhoover/update-readme
Update README.rst to match 'github-backup -h'
2020-03-24 11:17:44 -04:00
Tom Hoover
a351cdc103 Update README.rst to match 'github-backup -h' 2020-03-22 08:48:50 -05:00
5 changed files with 158 additions and 39 deletions

.circleci/config.yml (new file)

@@ -0,0 +1,23 @@
version: 2.1

orbs:
  python: circleci/python@0.3.2

jobs:
  build-and-test:
    executor: python/default
    steps:
      - checkout
      - python/load-cache
      - run:
          command: pip install flake8
          name: Install dependencies
      - python/save-cache
      - run:
          command: flake8 --ignore=E501
          name: Lint

workflows:
  main:
    jobs:
      - build-and-test

@@ -1,9 +1,49 @@
 Changelog
 =========

-0.31.0 (2020-02-25)
+0.36.0 (2020-08-29)
 -------------------
 ------------------------
+- Add flake8 instructions to readme. [Albert Wang]
+- Fix regex string. [Albert Wang]
+- Fix whitespace issues. [Albert Wang]
+- Do not use bare excepts. [Albert Wang]
+- Add .circleci/config.yml. [Albert Wang]
+- Include --private flag in example. [wouter bolsterlee]
+
+  By default, private repositories are not included. This is surprising.
+  It took me a while to figure this out, and making that clear in the
+  example can help others to be aware of that.
+
+
+0.35.0 (2020-08-05)
+-------------------
+- Make API request throttling optional. [Samantha Baldwin]
+
+
+0.34.0 (2020-07-24)
+-------------------
+- Add logic for transforming gist repository urls to ssh. [Matt Fields]
+
+
+0.33.0 (2020-04-13)
+-------------------
+- Add basic API request throttling. [Enrico Tröger]
+
+  A simple approach to throttle API requests and so keep within the rate
+  limits of the API. Can be enabled with "--throttle-limit" to specify
+  when throttling should start.
+  "--throttle-pause" defines the time to sleep between further API
+  requests.
+
+
+0.32.0 (2020-04-13)
+-------------------
+- Add timestamp to log messages. [Enrico Tröger]
+
+
+0.31.0 (2020-02-25)
+-------------------
 - #123 update: changed --as-app 'help' description. [ethan]
 - #123: Support Authenticating As Github Application. [ethan]

@@ -29,19 +29,20 @@ Usage
 CLI Usage is as follows::

-    github-backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN]
+    github-backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN] [--as-app]
                   [-o OUTPUT_DIRECTORY] [-i] [--starred] [--all-starred]
                   [--watched] [--followers] [--following] [--all]
                   [--issues] [--issue-comments] [--issue-events] [--pulls]
-                  [--pull-comments] [--pull-commits] [--labels] [--hooks]
-                  [--milestones] [--repositories] [--releases] [--assets]
+                  [--pull-comments] [--pull-commits] [--pull-details]
+                  [--labels] [--hooks] [--milestones] [--repositories]
                   [--bare] [--lfs] [--wikis] [--gists] [--starred-gists]
-                  [--skip-existing]
-                  [-L [LANGUAGES [LANGUAGES ...]]] [-N NAME_REGEX]
-                  [-H GITHUB_HOST] [-O] [-R REPOSITORY] [-P] [-F]
-                  [--prefer-ssh] [-v]
+                  [--skip-existing] [-L [LANGUAGES [LANGUAGES ...]]]
+                  [-N NAME_REGEX] [-H GITHUB_HOST] [-O] [-R REPOSITORY]
+                  [-P] [-F] [--prefer-ssh] [-v]
                   [--keychain-name OSX_KEYCHAIN_ITEM_NAME]
                   [--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT]
+                  [--releases] [--assets] [--throttle-limit THROTTLE_LIMIT]
+                  [--throttle-pause THROTTLE_PAUSE]
                   USER

     Backup a github account
@@ -57,36 +58,36 @@ CLI Usage is as follows::
                             password for basic auth. If a username is given but
                             not a password, the password will be prompted for.
       -t TOKEN, --token TOKEN
-                            personal access or OAuth token, or path to token
-                            (file://...)
+                            personal access, OAuth, or JSON Web token, or path to
+                            token (file://...)
+      --as-app              authenticate as github app instead of as a user.
       -o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
                             directory at which to backup the repositories
       -i, --incremental     incremental backup
       --starred             include JSON output of starred repositories in backup
-      --all-starred         include starred repositories in backup
-      --watched             include watched repositories in backup
+      --all-starred         include starred repositories in backup [*]
+      --watched             include JSON output of watched repositories in backup
       --followers           include JSON output of followers in backup
       --following           include JSON output of following users in backup
-      --all                 include everything in backup
+      --all                 include everything in backup (not including [*])
       --issues              include issues in backup
       --issue-comments      include issue comments in backup
       --issue-events        include issue events in backup
       --pulls               include pull requests in backup
       --pull-comments       include pull request review comments in backup
       --pull-commits        include pull request commits in backup
+      --pull-details        include more pull request details in backup [*]
       --labels              include labels in backup
       --hooks               include hooks in backup (works only when
                             authenticated)
       --milestones          include milestones in backup
       --repositories        include repository clone in backup
-      --releases            include repository releases' information without assets or binaries
-      --assets              include assets alongside release information; only applies if including releases
       --bare                clone bare repositories
       --lfs                 clone LFS repositories (requires Git LFS to be
-                            installed, https://git-lfs.github.com)
+                            installed, https://git-lfs.github.com) [*]
       --wikis               include wiki clone in backup
-      --gists               include gists in backup
-      --starred-gists       include starred gists in backup
+      --gists               include gists in backup [*]
+      --starred-gists       include starred gists in backup [*]
       --skip-existing       skip project if a backup directory exists
       -L [LANGUAGES [LANGUAGES ...]], --languages [LANGUAGES [LANGUAGES ...]]
                             only allow these languages
@@ -97,8 +98,8 @@ CLI Usage is as follows::
       -O, --organization    whether or not this is an organization user
       -R REPOSITORY, --repository REPOSITORY
                             name of repository to limit backup to
-      -P, --private         include private repositories
-      -F, --fork            include forked repositories
+      -P, --private         include private repositories [*]
+      -F, --fork            include forked repositories [*]
       --prefer-ssh          Clone repositories using SSH instead of HTTPS
       -v, --version         show program's version number and exit
       --keychain-name OSX_KEYCHAIN_ITEM_NAME
@@ -107,6 +108,17 @@ CLI Usage is as follows::
       --keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT
                             OSX ONLY: account field of password item in OSX
                             keychain that holds the personal access or OAuth token
+      --releases            include release information, not including assets or
+                            binaries
+      --assets              include assets alongside release information; only
+                            applies if including releases
+      --throttle-limit THROTTLE_LIMIT
+                            start throttling of GitHub API requests after this
+                            amount of API requests remain
+      --throttle-pause THROTTLE_PAUSE
+                            wait this amount of seconds when API request
+                            throttling is active (default: 30.0, requires
+                            --throttle-limit to be set)


 The package can be used to backup an *entire* organization or repository, including issues and wikis in the most appropriate format (clones for wikis, json files for issues).
@@ -141,10 +153,10 @@ Instructions on how to do this can be found on https://git-lfs.github.com.
 Examples
 ========

-Backup all repositories::
+Backup all repositories, including private ones::

     export ACCESS_TOKEN=SOME-GITHUB-TOKEN
-    github-backup WhiteHouse --token $ACCESS_TOKEN --organization --output-directory /tmp/white-house --repositories
+    github-backup WhiteHouse --token $ACCESS_TOKEN --organization --output-directory /tmp/white-house --repositories --private

 Backup a single organization repository with everything else (wiki, pull requests, comments, issues etc)::
@@ -154,6 +166,15 @@ Backup a single organization repository with everything else (wiki, pull request
     # e.g. git@github.com:docker/cli.git
     github-backup $ORGANIZATION -P -t $ACCESS_TOKEN -o . --all -O -R $REPO

+Testing
+=======
+
+This project currently contains no unit tests. To run linting::
+
+    pip install flake8
+    flake8 --ignore=E501
+
+
 .. |PyPI| image:: https://img.shields.io/pypi/v/github-backup.svg
     :target: https://pypi.python.org/pypi/github-backup/
 .. |Python Versions| image:: https://img.shields.io/pypi/pyversions/github-backup.svg

@@ -1 +1 @@
-__version__ = '0.31.0'
+__version__ = '0.36.0'

@@ -7,6 +7,7 @@ import argparse
 import base64
 import calendar
 import codecs
+import datetime
 import errno
 import getpass
 import json
@@ -29,9 +30,11 @@ try:
     from urllib.request import Request
     from urllib.request import HTTPRedirectHandler
     from urllib.request import build_opener
+    from subprocess import SubprocessError
 except ImportError:
     # python 2
     PY2 = True
+    from subprocess import CalledProcessError as SubprocessError
     from urlparse import urlparse
     from urllib import quote as urlquote
     from urllib import urlencode
@@ -50,6 +53,10 @@ except ImportError:
 FNULL = open(os.devnull, 'w')


+def _get_log_date():
+    return datetime.datetime.isoformat(datetime.datetime.now())
+
+
 def log_error(message):
     """
     Log message (str) or messages (List[str]) to stderr and exit with status 1
@@ -66,7 +73,7 @@ def log_info(message):
         message = [message]

     for msg in message:
-        sys.stdout.write("{0}\n".format(msg))
+        sys.stdout.write("{0}: {1}\n".format(_get_log_date(), msg))


 def log_warning(message):
@@ -77,7 +84,7 @@ def log_warning(message):
         message = [message]

     for msg in message:
-        sys.stderr.write("{0}\n".format(msg))
+        sys.stderr.write("{0}: {1}\n".format(_get_log_date(), msg))


 def logging_subprocess(popenargs,
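
The two logging hunks above prefix every message with an ISO 8601 timestamp. A minimal standalone sketch of that pattern follows; `format_log_line` is a hypothetical helper introduced only for illustration and is not part of the project.

```python
import datetime
import sys


def _get_log_date():
    # Same expression as in the diff: local time in ISO 8601 form.
    return datetime.datetime.isoformat(datetime.datetime.now())


def format_log_line(msg):
    # Hypothetical helper (not in the project): builds the line that the
    # patched log_info/log_warning write out.
    return "{0}: {1}\n".format(_get_log_date(), msg)


def log_info(message):
    # Accept a single string or a list of strings, as the original does.
    if isinstance(message, str):
        message = [message]
    for msg in message:
        sys.stdout.write(format_log_line(msg))


if __name__ == "__main__":
    log_info("Backing up user")
    log_info(["first message", "second message"])
```

Because `datetime.datetime.now()` is timezone-naive, the timestamps carry no UTC offset; that is a property of the approach shown in the diff, not something added here.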
@@ -326,6 +333,16 @@ def parse_args():
                         action='store_true',
                         dest='include_assets',
                         help='include assets alongside release information; only applies if including releases')
+    parser.add_argument('--throttle-limit',
+                        dest='throttle_limit',
+                        type=int,
+                        default=0,
+                        help='start throttling of GitHub API requests after this amount of API requests remain')
+    parser.add_argument('--throttle-pause',
+                        dest='throttle_pause',
+                        type=float,
+                        default=30.0,
+                        help='wait this amount of seconds when API request throttling is active (default: 30.0, requires --throttle-limit to be set)')
     return parser.parse_args()
@@ -348,7 +365,7 @@ def get_auth(args, encode=True, for_git_cli=False):
                 if not PY2:
                     token = token.decode('utf-8')
                 auth = token + ':' + 'x-oauth-basic'
-            except:
+            except SubprocessError:
                 log_error('No password item matching the provided name and account could be found in the osx keychain.')
     elif args.osx_keychain_item_account:
         log_error('You must specify both name and account fields for osx keychain password items')
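
The hunk above narrows a bare `except:` to `SubprocessError`. The sketch below illustrates the effect, assuming Python 3 (where `CalledProcessError` is a subclass of `SubprocessError`). `read_secret` and the commands it runs are hypothetical stand-ins for the keychain lookup, not project code.

```python
import subprocess
import sys


def read_secret(command):
    """Return the command's stripped stdout, or None if the command fails.

    Hypothetical stand-in for the keychain lookup in get_auth.
    """
    try:
        output = subprocess.check_output(command, stderr=subprocess.DEVNULL)
    except subprocess.SubprocessError:
        # Only subprocess failures are handled here; KeyboardInterrupt,
        # SystemExit and programming errors still propagate.
        return None
    return output.strip().decode("utf-8")


if __name__ == "__main__":
    ok = [sys.executable, "-c", "print('token-value')"]
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(read_secret(ok))
    print(read_secret(failing))
```

One consequence of the narrower handler: a missing executable raises `OSError` (`FileNotFoundError`), which is not a `SubprocessError` and therefore is no longer swallowed.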
@@ -404,13 +421,19 @@ def get_github_host(args):
 def get_github_repo_url(args, repository):
     if repository.get('is_gist'):
-        return repository['git_pull_url']
+        if args.prefer_ssh:
+            # The git_pull_url value is always https for gists, so we need to transform it to ssh form
+            repo_url = re.sub(r'^https?:\/\/(.+)\/(.+)\.git$', r'git@\1:\2.git', repository['git_pull_url'])
+            repo_url = re.sub(r'^git@gist\.', 'git@', repo_url)  # strip gist subdomain for better hostkey compatibility
+        else:
+            repo_url = repository['git_pull_url']
+        return repo_url

     if args.prefer_ssh:
         return repository['ssh_url']

     auth = get_auth(args, encode=False, for_git_cli=True)
-    if auth and repository['private'] == True:
+    if auth and repository['private'] is True:
         repo_url = 'https://{0}@{1}/{2}/{3}.git'.format(
             auth,
             get_github_host(args),
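
To see what the two substitutions in the gist branch do, they can be run in isolation. In this sketch `gist_https_to_ssh` is a hypothetical wrapper and the gist ID is a made-up example value; the regular expressions are copied from the diff.

```python
import re


def gist_https_to_ssh(git_pull_url):
    # Hypothetical wrapper around the two substitutions from the diff.
    # Step 1: https://HOST/ID.git -> git@HOST:ID.git
    repo_url = re.sub(r'^https?:\/\/(.+)\/(.+)\.git$', r'git@\1:\2.git', git_pull_url)
    # Step 2: drop the "gist." subdomain so the host matches github.com's host key.
    return re.sub(r'^git@gist\.', 'git@', repo_url)


if __name__ == "__main__":
    # Made-up gist ID, for illustration only.
    print(gist_https_to_ssh('https://gist.github.com/0123456789abcdef.git'))
    # prints "git@github.com:0123456789abcdef.git"
```

A URL that does not match the first pattern (for example one already in SSH form) passes through both substitutions unchanged.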
@@ -434,10 +457,18 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
         r, errors = _get_response(request, auth, template)
         status_code = int(r.getcode())
+        # be gentle with API request limit and throttle requests if remaining requests getting low
+        limit_remaining = int(r.headers.get('x-ratelimit-remaining', 0))
+        if args.throttle_limit and limit_remaining <= args.throttle_limit:
+            log_info(
+                'API request limit hit: {} requests left, pausing further requests for {}s'.format(
+                    limit_remaining,
+                    args.throttle_pause))
+            time.sleep(args.throttle_pause)
         retries = 0
         while retries < 3 and status_code == 502:
-            print('API request returned HTTP 502: Bad Gateway. Retrying in 5 seconds')
+            log_warning('API request returned HTTP 502: Bad Gateway. Retrying in 5 seconds')
             retries += 1
             time.sleep(5)
             request = _construct_request(per_page, page, query_args, template, auth, as_app=args.as_app)  # noqa
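
The throttling condition added in this hunk can be isolated into a small function for experimentation. This is a sketch under stated assumptions: `throttle_if_needed` and its injectable `sleep` parameter are hypothetical names introduced here; only the condition itself comes from the diff.

```python
import time


def throttle_if_needed(limit_remaining, throttle_limit, throttle_pause, sleep=time.sleep):
    """Pause when the remaining request budget drops to the threshold.

    Returns True if a pause happened. A throttle_limit of 0 (the default
    in the diff) disables throttling. The `sleep` parameter is an addition
    for this sketch so the behaviour can be checked without waiting.
    """
    if throttle_limit and limit_remaining <= throttle_limit:
        sleep(throttle_pause)
        return True
    return False


if __name__ == "__main__":
    pauses = []
    # 50 requests left with a threshold of 100: a pause is requested.
    print(throttle_if_needed(50, 100, 30.0, sleep=pauses.append))
    # 500 requests left: no pause.
    print(throttle_if_needed(500, 100, 30.0, sleep=pauses.append))
    # Throttling disabled (limit 0): no pause even with 0 requests left.
    print(throttle_if_needed(0, 0, 30.0, sleep=pauses.append))
    print(pauses)
```

Note that the diff reads `x-ratelimit-remaining` with a default of 0, so a response without that header would be treated as having no requests left and would trigger a pause whenever `--throttle-limit` is set.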
@@ -466,9 +497,11 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
         if single_request:
             break

+
 def retrieve_data(args, template, query_args=None, single_request=False):
     return list(retrieve_data_gen(args, template, query_args, single_request))

+
 def get_query_args(query_args=None):
     if not query_args:
         query_args = {}
@@ -544,12 +577,10 @@ def _request_http_error(exc, auth, errors):
         delta = max(10, reset - gm_now)

         limit = headers.get('x-ratelimit-limit')
-        print('Exceeded rate limit of {} requests; waiting {} seconds to reset'.format(limit, delta),  # noqa
-              file=sys.stderr)
+        log_warning('Exceeded rate limit of {} requests; waiting {} seconds to reset'.format(limit, delta))  # noqa

         if auth is None:
-            print('Hint: Authenticate to raise your GitHub rate limit',
-                  file=sys.stderr)
+            log_info('Hint: Authenticate to raise your GitHub rate limit')

         time.sleep(delta)
         should_continue = True
@@ -874,18 +905,22 @@ def backup_pulls(args, repo_cwd, repository, repos_template):
         pull_states = ['open', 'closed']
         for pull_state in pull_states:
             query_args['state'] = pull_state
-            _pulls = retrieve_data_gen(args,
-                                       _pulls_template,
-                                       query_args=query_args)
+            _pulls = retrieve_data_gen(
+                args,
+                _pulls_template,
+                query_args=query_args
+            )
             for pull in _pulls:
                 if args.since and pull['updated_at'] < args.since:
                     break
                 if not args.since or pull['updated_at'] >= args.since:
                     pulls[pull['number']] = pull
     else:
-        _pulls = retrieve_data_gen(args,
-                                   _pulls_template,
-                                   query_args=query_args)
+        _pulls = retrieve_data_gen(
+            args,
+            _pulls_template,
+            query_args=query_args
+        )
         for pull in _pulls:
             if args.since and pull['updated_at'] < args.since:
                 break