Compare commits


23 Commits

Author SHA1 Message Date
Jose Diaz-Gonzalez
031a984434 Release version 0.36.0 2020-08-29 02:37:48 -04:00
Jose Diaz-Gonzalez
9e16f39e3e Merge pull request #157 from albertyw/lint 2020-08-29 02:37:19 -04:00
Albert Wang
2de96390be Add flake8 instructions to readme 2020-08-28 23:13:24 -07:00
Albert Wang
78cff47a91 Fix regex string 2020-08-28 23:13:24 -07:00
Albert Wang
fa27988c1c Update boolean check 2020-08-28 23:13:23 -07:00
Albert Wang
bb2e2b8c6f Fix whitespace issues 2020-08-28 23:13:23 -07:00
Albert Wang
8fd0f2b64f Do not use bare excepts 2020-08-28 23:13:23 -07:00
Jose Diaz-Gonzalez
753a551961 Merge pull request #161 from albertyw/circleci-project-setup
Add circleci config
2020-08-29 01:48:49 -04:00
Albert Wang
607b6ca69b Add .circleci/config.yml 2020-08-28 02:33:51 -07:00
Jose Diaz-Gonzalez
ef71655b01 Merge pull request #160 from wbolster/patch-1
Include --private flag in example
2020-08-27 13:23:28 -04:00
wouter bolsterlee
d8bcbfa644 Include --private flag in example
By default, private repositories are not included. This is surprising.
It took me a while to figure this out, and making that clear in the
example can help others to be aware of that.
2020-08-27 17:01:56 +02:00
Jose Diaz-Gonzalez
751b0d6e82 Release version 0.35.0 2020-08-05 12:02:21 -04:00
Jose Diaz-Gonzalez
ea633ca2bb Merge pull request #156 from samanthaq/restore-optional-throttling
Make API request throttling optional
2020-08-05 12:01:56 -04:00
Samantha Baldwin
a2115ce3e5 Make API request throttling optional 2020-08-05 11:53:17 -04:00
Jose Diaz-Gonzalez
8a00bb1903 Release version 0.34.0 2020-07-24 13:31:03 -04:00
Jose Diaz-Gonzalez
e53f8d4724 Merge pull request #153 from 0x6d617474/gist_ssh
Add logic for transforming gist repository urls to ssh
2020-07-24 13:30:40 -04:00
Matt Fields
356f5f674b Add logic for transforming gist repository urls to ssh 2020-07-07 17:54:16 -04:00
Jose Diaz-Gonzalez
13128635cb Release version 0.33.1 2020-05-28 16:44:40 -04:00
Jose Diaz-Gonzalez
6e6842b025 Merge pull request #151 from garymoon/readme-update-0.33 2020-05-28 16:43:57 -04:00
Gary Moon
272177c395 Update the readme for new switches added in 0.33 2020-05-26 19:59:47 -04:00
Jose Diaz-Gonzalez
70f711ea68 Release version 0.33.0 2020-04-13 17:14:20 -04:00
Jose Diaz-Gonzalez
3fc9957aac Merge pull request #149 from eht16/simple_api_request_throttling
Add basic API request throttling
2020-04-13 17:13:58 -04:00
Enrico Tröger
78098aae23 Add basic API request throttling
A simple approach to throttle API requests and so keep within the rate
limits of the API. Can be enabled with "--throttle-limit" to specify
when throttling should start.
"--throttle-pause" defines the time to sleep between further API
requests.
2020-04-13 23:06:09 +02:00
5 changed files with 121 additions and 14 deletions
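The message of commit 78098aae23 describes the throttling rule in prose: once the remaining API quota falls to `--throttle-limit` or below, each further request is followed by a sleep of `--throttle-pause` seconds. The following is a minimal sketch of that rule only, not the project's code. The function name `maybe_throttle` and the injectable `sleep` parameter are assumptions made so the example can run without waiting.

```python
import time


def maybe_throttle(headers, throttle_limit, throttle_pause, sleep=time.sleep):
    """Pause when the remaining API quota is at or below throttle_limit.

    Returns True if a pause happened. A throttle_limit of 0 disables throttling.
    """
    # A missing header is read as 0 remaining requests, as in the diff.
    remaining = int(headers.get('x-ratelimit-remaining', 0))
    if throttle_limit and remaining <= throttle_limit:
        sleep(throttle_pause)
        return True
    return False


# Record the requested pauses instead of sleeping (hypothetical test double).
pauses = []
maybe_throttle({'x-ratelimit-remaining': '50'}, 100, 30.0, sleep=pauses.append)
maybe_throttle({'x-ratelimit-remaining': '4000'}, 100, 30.0, sleep=pauses.append)
maybe_throttle({'x-ratelimit-remaining': '50'}, 0, 30.0, sleep=pauses.append)
print(pauses)  # [30.0]
```

Only the first call pauses: the second still has plenty of quota, and the third has throttling disabled by a limit of 0. Note that a response without the header counts as 0 remaining, so it would trigger a pause whenever a limit is set.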

.circleci/config.yml Normal file

@@ -0,0 +1,23 @@
version: 2.1

orbs:
  python: circleci/python@0.3.2

jobs:
  build-and-test:
    executor: python/default
    steps:
      - checkout
      - python/load-cache
      - run:
          command: pip install flake8
          name: Install dependencies
      - python/save-cache
      - run:
          command: flake8 --ignore=E501
          name: Lint

workflows:
  main:
    jobs:
      - build-and-test


@@ -1,9 +1,44 @@
Changelog
=========
0.32.0 (2020-04-13)
0.36.0 (2020-08-29)
-------------------
------------------------
- Add flake8 instructions to readme. [Albert Wang]
- Fix regex string. [Albert Wang]
- Fix whitespace issues. [Albert Wang]
- Do not use bare excepts. [Albert Wang]
- Add .circleci/config.yml. [Albert Wang]
- Include --private flag in example. [wouter bolsterlee]
By default, private repositories are not included. This is surprising.
It took me a while to figure this out, and making that clear in the
example can help others to be aware of that.
0.35.0 (2020-08-05)
-------------------
- Make API request throttling optional. [Samantha Baldwin]
0.34.0 (2020-07-24)
-------------------
- Add logic for transforming gist repository urls to ssh. [Matt Fields]
0.33.0 (2020-04-13)
-------------------
- Add basic API request throttling. [Enrico Tröger]
A simple approach to throttle API requests and so keep within the rate
limits of the API. Can be enabled with "--throttle-limit" to specify
when throttling should start.
"--throttle-pause" defines the time to sleep between further API
requests.
0.32.0 (2020-04-13)
-------------------
- Add timestamp to log messages. [Enrico Tröger]


@@ -41,7 +41,8 @@ CLI Usage is as follows::
[-P] [-F] [--prefer-ssh] [-v]
[--keychain-name OSX_KEYCHAIN_ITEM_NAME]
[--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT]
-[--releases] [--assets]
+[--releases] [--assets] [--throttle-limit THROTTLE_LIMIT]
+[--throttle-pause THROTTLE_PAUSE]
USER
Backup a github account
@@ -111,6 +112,13 @@ CLI Usage is as follows::
binaries
--assets include assets alongside release information; only
applies if including releases
--throttle-limit THROTTLE_LIMIT
start throttling of GitHub API requests after this
amount of API requests remain
--throttle-pause THROTTLE_PAUSE
wait this amount of seconds when API request
throttling is active (default: 30.0, requires
--throttle-limit to be set)
The package can be used to backup an *entire* organization or repository, including issues and wikis in the most appropriate format (clones for wikis, json files for issues).
@@ -145,10 +153,10 @@ Instructions on how to do this can be found on https://git-lfs.github.com.
Examples
========
-Backup all repositories::
+Backup all repositories, including private ones::
     export ACCESS_TOKEN=SOME-GITHUB-TOKEN
-    github-backup WhiteHouse --token $ACCESS_TOKEN --organization --output-directory /tmp/white-house --repositories
+    github-backup WhiteHouse --token $ACCESS_TOKEN --organization --output-directory /tmp/white-house --repositories --private
Backup a single organization repository with everything else (wiki, pull requests, comments, issues etc)::
@@ -158,6 +166,15 @@ Backup a single organization repository with everything else (wiki, pull request
# e.g. git@github.com:docker/cli.git
github-backup $ORGANIZATION -P -t $ACCESS_TOKEN -o . --all -O -R $REPO
Testing
=======
This project currently contains no unit tests. To run linting::
pip install flake8
flake8 --ignore=E501
.. |PyPI| image:: https://img.shields.io/pypi/v/github-backup.svg
:target: https://pypi.python.org/pypi/github-backup/
.. |Python Versions| image:: https://img.shields.io/pypi/pyversions/github-backup.svg


@@ -1 +1 @@
-__version__ = '0.32.0'
+__version__ = '0.36.0'


@@ -30,9 +30,11 @@ try:
     from urllib.request import Request
     from urllib.request import HTTPRedirectHandler
     from urllib.request import build_opener
+    from subprocess import SubprocessError
 except ImportError:
     # python 2
     PY2 = True
+    from subprocess import CalledProcessError as SubprocessError
     from urlparse import urlparse
     from urllib import quote as urlquote
     from urllib import urlencode
@@ -331,6 +333,16 @@ def parse_args():
                         action='store_true',
                         dest='include_assets',
                         help='include assets alongside release information; only applies if including releases')
+    parser.add_argument('--throttle-limit',
+                        dest='throttle_limit',
+                        type=int,
+                        default=0,
+                        help='start throttling of GitHub API requests after this amount of API requests remain')
+    parser.add_argument('--throttle-pause',
+                        dest='throttle_pause',
+                        type=float,
+                        default=30.0,
+                        help='wait this amount of seconds when API request throttling is active (default: 30.0, requires --throttle-limit to be set)')
     return parser.parse_args()
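Note on the hunk above: the two options are independent argparse arguments, so `--throttle-pause` is accepted without `--throttle-limit` but has no effect while the limit keeps its default of 0. A small sketch of how the defaults and type conversions behave, assuming a stand-alone parser (the `build_parser` helper and the `throttle-demo` program name are hypothetical, not part of the project):

```python
import argparse


def build_parser():
    # Stand-alone parser holding only the two options added in this hunk.
    parser = argparse.ArgumentParser(prog='throttle-demo')  # hypothetical name
    parser.add_argument('--throttle-limit',
                        dest='throttle_limit',
                        type=int,
                        default=0,
                        help='start throttling after this amount of API requests remain')
    parser.add_argument('--throttle-pause',
                        dest='throttle_pause',
                        type=float,
                        default=30.0,
                        help='seconds to wait while throttling is active')
    return parser


defaults = build_parser().parse_args([])
custom = build_parser().parse_args(['--throttle-limit', '100', '--throttle-pause', '5'])
print(defaults.throttle_limit, defaults.throttle_pause)  # 0 30.0
print(custom.throttle_limit, custom.throttle_pause)  # 100 5.0
```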
@@ -353,7 +365,7 @@ def get_auth(args, encode=True, for_git_cli=False):
                 if not PY2:
                     token = token.decode('utf-8')
                 auth = token + ':' + 'x-oauth-basic'
-            except:
+            except SubprocessError:
                 log_error('No password item matching the provided name and account could be found in the osx keychain.')
         elif args.osx_keychain_item_account:
             log_error('You must specify both name and account fields for osx keychain password items')
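Note on the hunk above: replacing the bare `except:` means only subprocess failures are reported as a missing keychain item, while `KeyboardInterrupt` and unrelated errors propagate. A rough Python 3 sketch of the same pattern follows. The `read_secret` helper is hypothetical, and it runs the current interpreter instead of the macOS keychain lookup so that it works on any platform.

```python
import subprocess
import sys


def read_secret(command):
    """Return the command's stripped stdout, or None if it exits non-zero."""
    try:
        output = subprocess.check_output(command, stderr=subprocess.DEVNULL)
    except subprocess.SubprocessError:
        # check_output raises CalledProcessError, a SubprocessError subclass,
        # when the command exits with a non-zero status.
        return None
    return output.decode('utf-8').strip()


ok = read_secret([sys.executable, '-c', 'print("s3cret")'])
failed = read_secret([sys.executable, '-c', 'import sys; sys.exit(1)'])
print(ok, failed)  # s3cret None
```

One consequence of the narrower clause: a missing executable raises `FileNotFoundError`, which is an `OSError` and not a `SubprocessError`, so it is no longer swallowed.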
@@ -409,13 +421,19 @@ def get_github_host(args):
 def get_github_repo_url(args, repository):
     if repository.get('is_gist'):
-        return repository['git_pull_url']
+        if args.prefer_ssh:
+            # The git_pull_url value is always https for gists, so we need to transform it to ssh form
+            repo_url = re.sub(r'^https?:\/\/(.+)\/(.+)\.git$', r'git@\1:\2.git', repository['git_pull_url'])
+            repo_url = re.sub(r'^git@gist\.', 'git@', repo_url)  # strip gist subdomain for better hostkey compatibility
+        else:
+            repo_url = repository['git_pull_url']
+        return repo_url
     if args.prefer_ssh:
         return repository['ssh_url']
     auth = get_auth(args, encode=False, for_git_cli=True)
-    if auth and repository['private'] == True:
+    if auth and repository['private'] is True:
         repo_url = 'https://{0}@{1}/{2}/{3}.git'.format(
             auth,
             get_github_host(args),
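Note on the gist branch in the hunk above: the two substitutions first rewrite the https URL into scp-style ssh form, then drop the `gist.` subdomain. A minimal sketch that applies them to a sample URL (the `gist_ssh_url` helper name and the gist id are made up for illustration, and the patterns are written without the redundant `\/` escapes):

```python
import re


def gist_ssh_url(git_pull_url):
    """Rewrite an https gist clone URL into ssh form without the gist subdomain."""
    # https://HOST/ID.git -> git@HOST:ID.git
    url = re.sub(r'^https?://(.+)/(.+)\.git$', r'git@\1:\2.git', git_pull_url)
    # git@gist.HOST:ID.git -> git@HOST:ID.git
    return re.sub(r'^git@gist\.', 'git@', url)


print(gist_ssh_url('https://gist.github.com/0123456789abcdef.git'))
# git@github.com:0123456789abcdef.git
```

A URL that does not match the first pattern, such as one already in ssh form with no `gist.` prefix, passes through unchanged.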
@@ -439,6 +457,14 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
         r, errors = _get_response(request, auth, template)
         status_code = int(r.getcode())
+        # be gentle with API request limit and throttle requests if remaining requests getting low
+        limit_remaining = int(r.headers.get('x-ratelimit-remaining', 0))
+        if args.throttle_limit and limit_remaining <= args.throttle_limit:
+            log_info(
+                'API request limit hit: {} requests left, pausing further requests for {}s'.format(
+                    limit_remaining,
+                    args.throttle_pause))
+            time.sleep(args.throttle_pause)
         retries = 0
         while retries < 3 and status_code == 502:
@@ -471,9 +497,11 @@ def retrieve_data_gen(args, template, query_args=None, single_request=False):
if single_request:
break
def retrieve_data(args, template, query_args=None, single_request=False):
return list(retrieve_data_gen(args, template, query_args, single_request))
def get_query_args(query_args=None):
if not query_args:
query_args = {}
@@ -877,18 +905,22 @@ def backup_pulls(args, repo_cwd, repository, repos_template):
         pull_states = ['open', 'closed']
         for pull_state in pull_states:
             query_args['state'] = pull_state
-            _pulls = retrieve_data_gen(args,
-                                       _pulls_template,
-                                       query_args=query_args)
+            _pulls = retrieve_data_gen(
+                args,
+                _pulls_template,
+                query_args=query_args
+            )
             for pull in _pulls:
                 if args.since and pull['updated_at'] < args.since:
                     break
                 if not args.since or pull['updated_at'] >= args.since:
                     pulls[pull['number']] = pull
     else:
-        _pulls = retrieve_data_gen(args,
-                                   _pulls_template,
-                                   query_args=query_args)
+        _pulls = retrieve_data_gen(
+            args,
+            _pulls_template,
+            query_args=query_args
+        )
         for pull in _pulls:
             if args.since and pull['updated_at'] < args.since:
                 break