Mirror of https://github.com/josegonzalez/python-github-backup.git (synced 2025-12-05 16:18:02 +01:00)
Compare commits
94 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 70f711ea68 | | |
| | 3fc9957aac | | |
| | 78098aae23 | | |
| | fb7cc5ed53 | | |
| | c0679b9cc3 | | |
| | 03b9d1b2d8 | | |
| | 5025f69878 | | |
| | a351cdc103 | | |
| | 85e4399408 | | |
| | c8171b692a | | |
| | 523c811cc6 | | |
| | 857ad0afab | | |
| | 3f65eadee1 | | |
| | a8e8841b26 | | |
| | 8e542fd6b6 | | |
| | 1865941b14 | | |
| | 03c68561a5 | | |
| | 196acd0aca | | |
| | 679ac841f6 | | |
| | 498d9eba32 | | |
| | 0f82b1717c | | |
| | 4d5126f303 | | |
| | b864218b44 | | |
| | 98919c82c9 | | |
| | 045eacbf18 | | |
| | 7a234ba7ed | | |
| | e8a255b450 | | |
| | 81a2f762da | | |
| | cb0293cbe5 | | |
| | 252c25461f | | |
| | e8ed03fd06 | | |
| | 38010d7c39 | | |
| | 71b4288e6b | | |
| | ba4fa9fa2d | | |
| | 869f761c90 | | |
| | 195e700128 | | |
| | 27441b71b6 | | |
| | cfeaee7309 | | |
| | fac8e4274f | | |
| | 17fee66f31 | | |
| | a56d27dd8b | | |
| | e57873b6dd | | |
| | 2658b039a1 | | |
| | fd684a71fb | | |
| | bacd77030b | | |
| | b73079daf2 | | |
| | eca8a70666 | | |
| | e74765ba7f | | |
| | 6db5bd731b | | |
| | 7305871c20 | | |
| | baf7b1a9b4 | | |
| | 121fa68294 | | |
| | 44dfc79edc | | |
| | 89f59cc7a2 | | |
| | ad8c5b8768 | | |
| | 921aab3729 | | |
| | ea4c3d0f6f | | |
| | 9b6400932d | | |
| | de0c3f46c6 | | |
| | 73b069f872 | | |
| | 3d3f512074 | | |
| | 1c3078992d | | |
| | 4b40ae94d7 | | |
| | a18fda9faf | | |
| | 41130fc8b0 | | |
| | 2340a02fc6 | | |
| | cafff4ae80 | | |
| | 3193d120e5 | | |
| | da4b29a2d6 | | |
| | d05c96ecef | | |
| | c86163bfe6 | | |
| | eff6e36974 | | |
| | 63e458bafb | | |
| | 57ab5ce1a2 | | |
| | d148f9b900 | | |
| | 89ee22c2be | | |
| | 9e472b74e6 | | |
| | 4b459f9af8 | | |
| | b70ea87db7 | | |
| | f8be34562b | | |
| | ec05204aa9 | | |
| | 628f2cbf73 | | |
| | 38bf438d2f | | |
| | 899cf42b57 | | |
| | b5972aaaf0 | | |
| | d860f369e9 | | |
| | 77ab1bda15 | | |
| | 4a4a317331 | | |
| | 5a8e1ac275 | | |
| | 0de341eab4 | | |
| | b0130fdf94 | | |
| | b49f399037 | | |
| | 321414d352 | | |
| | 413d4381cc | | |
9
.gitignore
vendored
@@ -25,3 +25,12 @@ doc/_build

# Generated man page
doc/aws_hostname.1

# Annoying macOS files
.DS_Store
._*

# IDE configuration files
.vscode
.atom

281
CHANGES.rst
@@ -1,19 +1,187 @@
|
||||
Changelog
|
||||
=========
|
||||
|
||||
0.19.1 (2018-03-24)
|
||||
0.33.0 (2020-04-13)
|
||||
-------------------
|
||||
------------------------
|
||||
- Add basic API request throttling. [Enrico Tröger]
|
||||
|
||||
A simple approach to throttle API requests and so keep within the rate
|
||||
limits of the API. Can be enabled with "--throttle-limit" to specify
|
||||
when throttling should start.
|
||||
"--throttle-pause" defines the time to sleep between further API
|
||||
requests.
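
The throttling described in this entry could look roughly like the sketch below. It is written for Python 3 and is not the project's code. The helper names are hypothetical, and it assumes "when throttling should start" means the remaining request quota has dropped to the `--throttle-limit` value or below.

```python
import time


def throttle_pause_for(remaining, throttle_limit, throttle_pause):
    """Return how many seconds to sleep before the next API request.

    `remaining` is the number of requests left in the current rate-limit
    window (an assumed input). Throttling starts once it drops to
    `throttle_limit` or below; a limit of 0 disables throttling.
    """
    if throttle_limit and remaining <= throttle_limit:
        return throttle_pause
    return 0.0


def throttled_call(request, remaining, throttle_limit=0, throttle_pause=30.0,
                   sleep=time.sleep):
    """Run `request()` and then pause if the remaining quota is low."""
    result = request()
    pause = throttle_pause_for(remaining, throttle_limit, throttle_pause)
    if pause > 0:
        sleep(pause)
    return result
```

Passing `sleep` as a parameter keeps the pause testable without actually waiting.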
|
||||
|
||||
|
||||
0.32.0 (2020-04-13)
|
||||
-------------------
|
||||
- Add timestamp to log messages. [Enrico Tröger]
|
||||
|
||||
|
||||
0.31.0 (2020-02-25)
|
||||
-------------------
|
||||
- #123 update: changed --as-app 'help' description. [ethan]
|
||||
- #123: Support Authenticating As Github Application. [ethan]
|
||||
|
||||
|
||||
0.29.0 (2020-02-14)
|
||||
-------------------
|
||||
- #50 update: keep main() in bin. [ethan]
|
||||
- #50 - refactor for friendlier import. [ethan]
|
||||
|
||||
|
||||
0.28.0 (2020-02-03)
|
||||
-------------------
|
||||
- Remove deprecated (and removed) git lfs flags. [smiley]
|
||||
|
||||
"--tags" and "--force" were removed at some point from "git lfs fetch". This broke our backup script.
|
||||
|
||||
|
||||
0.27.0 (2020-01-22)
|
||||
-------------------
|
||||
- Fixed script fails if not installed from pip. [Ben Baron]
|
||||
|
||||
At the top of the script, the line from github_backup import __version__ gets the script's version number to use if the script is called with the -v or --version flags. The problem is that if the script hasn't been installed via pip (for example I cloned the repo directly to my backup server), the script will fail due to an import exception.
|
||||
|
||||
Also presumably it will always use the version number from pip even if running a modified version from git or a fork or something, though this does not fix that as I have no idea how to check if it's running the pip installed version or not. But at least the script will now work fine if cloned from git or just copied to another machine.
|
||||
|
||||
closes https://github.com/josegonzalez/python-github-backup/issues/141
|
||||
- Fixed macOS keychain access when using Python 3. [Ben Baron]
|
||||
|
||||
Python 3 is returning bytes rather than a string, so the string concatenation to create the auth variable was throwing an exception which the script was interpreting to mean it couldn't find the password. Adding a conversion to string first fixed the issue.
|
||||
- Public repos no longer include the auth token. [Ben Baron]
|
||||
|
||||
When backing up repositories using an auth token and https, the GitHub personal auth token is leaked in each backed up repository. It is included in the URL of each repository's git remote url.
|
||||
|
||||
This is not needed as they are public and can be accessed without the token and can cause issues in the future if the token is ever changed, so I think it makes more sense not to have the token stored in each repo backup. I think the token should only be "leaked" like this out of necessity, e.g. it's a private repository and the --prefer-ssh option was not chosen so https with auth token was required to perform the clone.
|
||||
- Fixed comment typo. [Ben Baron]
|
||||
- Switched log_info to log_warning in download_file. [Ben Baron]
|
||||
- Crash when a release asset doesn't exist. [Ben Baron]
|
||||
|
||||
Currently, the script crashes whenever a release asset is unable to download (for example a 404 response). This change instead logs the failure and allows the script to continue. No retry logic is enabled, but at least it prevents the crash and allows the backup to complete. Retry logic can be implemented later if wanted.
|
||||
|
||||
closes https://github.com/josegonzalez/python-github-backup/issues/129
|
||||
- Moved asset downloading loop inside the if block. [Ben Baron]
|
||||
- Separate release assets and skip re-downloading. [Ben Baron]
|
||||
|
||||
Currently the script puts all release assets into the same folder called `releases`. So any time 2 release files have the same name, only the last one downloaded is actually saved. A particularly bad example of this is MacDownApp/macdown where all of their releases are named `MacDown.app.zip`. So even though they have 36 releases and all 36 are downloaded, only the last one is actually saved.
|
||||
|
||||
With this change, each release's assets are now stored in a subfolder inside `releases` named after the release name. There could still be edge cases if two releases have the same name, but this is still much safer than the previous behavior.
|
||||
|
||||
This change also now checks if the asset file already exists on disk and skips downloading it. This drastically speeds up additional syncs as it no longer downloads every single release every single time. It will now only download new releases which I believe is the expected behavior.
|
||||
|
||||
closes https://github.com/josegonzalez/python-github-backup/issues/126
|
||||
- Added newline to end of file. [Ben Baron]
|
||||
- Improved gitignore, macOS files and IDE configs. [Ben Baron]
|
||||
|
||||
Ignores the annoying hidden macOS files .DS_Store and ._* as well as the IDE configuration folders for contributors using the popular Visual Studio Code and Atom IDEs (more can be added later as needed).
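
The release-asset change in this list (one subfolder per release, skip files already on disk) can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit. `asset_path` and `pending_assets` are hypothetical names, and replacing path separators in the release name is an extra precaution added here.

```python
import os


def asset_path(releases_dir, release_name, asset_name):
    """Place each asset in a subfolder named after its release."""
    # Assumption: replace path separators so a release name stays
    # inside releases_dir.
    safe_name = release_name.replace("/", "_").replace(os.sep, "_")
    return os.path.join(releases_dir, safe_name, asset_name)


def pending_assets(releases_dir, release_name, asset_names,
                   exists=os.path.exists):
    """Return (name, path) pairs for assets that are not yet on disk."""
    pending = []
    for name in asset_names:
        path = asset_path(releases_dir, release_name, name)
        if not exists(path):
            pending.append((name, path))
    return pending
```

With this layout, two releases that both ship `MacDown.app.zip` no longer overwrite each other, and a repeat run only downloads what `pending_assets` returns.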
|
||||
|
||||
|
||||
0.26.0 (2019-09-23)
|
||||
-------------------
|
||||
- Workaround gist clone in `--prefer-ssh` mode. [Vladislav Yarmak]
|
||||
- Create PULL_REQUEST.md. [Jose Diaz-Gonzalez]
|
||||
- Create ISSUE_TEMPLATE.md. [Jose Diaz-Gonzalez]
|
||||
|
||||
|
||||
0.25.0 (2019-07-03)
|
||||
-------------------
|
||||
- Issue 119: Change retrieve_data to be a generator. [2a]
|
||||
|
||||
See issue #119.
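
Turning a paginated fetch into a generator, as this entry describes for `retrieve_data`, might look like the sketch below. It does not reproduce the real function. `retrieve_pages` and the `fetch_page(page, per_page)` callable are hypothetical stand-ins for the actual API request.

```python
def retrieve_pages(fetch_page, per_page=100):
    """Yield items one at a time instead of building one large list.

    `fetch_page(page, per_page)` is assumed to return a list of items
    for the given 1-based page number.
    """
    page = 1
    while True:
        items = fetch_page(page, per_page)
        for item in items:
            yield item
        # A short (or empty) page means there is nothing left to fetch.
        if len(items) < per_page:
            break
        page += 1
```

A caller can then process each item as it arrives, so memory use no longer grows with the size of the repository.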
|
||||
|
||||
|
||||
0.24.0 (2019-06-27)
|
||||
-------------------
|
||||
- QKT-45: include assets - update readme. [Ethan Timm]
|
||||
|
||||
update readme with flag information for including assets alongside their respective releases
|
||||
- Make assets its own flag. [Harrison Wright]
|
||||
- Fix super call for python2. [Harrison Wright]
|
||||
- Fix redirect to s3. [Harrison Wright]
|
||||
- WIP: download assets. [Harrison Wright]
|
||||
- QKT-42: releases - add readme info. [ethan]
|
||||
- QKT-42 update: shorter command flag. [ethan]
|
||||
- QKT-42: support saving release information. [ethan]
|
||||
- Fix pull details. [Harrison Wright]
|
||||
|
||||
|
||||
0.23.0 (2019-06-04)
|
||||
-------------------
|
||||
- Avoid crashing on an HTTP 502 error. [Gael de Chalendar]
|
||||
|
||||
Also survive socket.error failures, as is already done for HTTPError and URLError.
|
||||
|
||||
This should solve issue #110.
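
A minimal Python 3 sketch of the idea in this entry: treat `socket.error` like `HTTPError` and `URLError`, so that one failed request is recorded instead of ending the run. `fetch_or_none` and its arguments are hypothetical names, not the script's actual functions.

```python
import socket
from urllib.error import HTTPError, URLError


def fetch_or_none(open_url, url, errors):
    """Call open_url(url); on a network-level failure, record it and return None."""
    try:
        return open_url(url)
    except (HTTPError, URLError, socket.error) as exc:
        # Keep going: remember what failed so it can be reported later.
        errors.append("{0}: {1}".format(url, exc))
        return None
```

In Python 3, `socket.error` is an alias of `OSError`, so the tuple is partly redundant there. It is spelled out to mirror the three error types the entry names.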
|
||||
|
||||
|
||||
0.22.2 (2019-02-21)
|
||||
-------------------
|
||||
|
||||
Fix
|
||||
~~~
|
||||
- Warn instead of error. [Jose Diaz-Gonzalez]
|
||||
|
||||
Refs #106
|
||||
|
||||
|
||||
0.22.1 (2019-02-21)
|
||||
-------------------
|
||||
- Log URL error https://github.com/josegonzalez/python-github-backup/issues/105. [JOHN STETIC]
|
||||
|
||||
|
||||
0.22.0 (2019-02-01)
|
||||
-------------------
|
||||
- Remove unnecessary sys.exit call. [W. Harrison Wright]
|
||||
- Add org check to avoid incorrect log output. [W. Harrison Wright]
|
||||
- Fix accidental system exit with better logging strategy. [W. Harrison
|
||||
Wright]
|
||||
|
||||
|
||||
0.21.1 (2018-12-25)
|
||||
-------------------
|
||||
- Mark options which are not included in --all. [Bernd]
|
||||
|
||||
As discussed in Issue #100
|
||||
|
||||
|
||||
0.21.0 (2018-11-28)
|
||||
-------------------
|
||||
- Correctly download repos when user arg != authenticated user. [W.
|
||||
Harrison Wright]
|
||||
|
||||
|
||||
0.20.1 (2018-09-29)
|
||||
-------------------
|
||||
- Clone the specified user's gists, not the authenticated user. [W.
|
||||
Harrison Wright]
|
||||
- Clone the specified user's starred repos, not the authenticated user.
|
||||
[W. Harrison Wright]
|
||||
|
||||
|
||||
0.20.0 (2018-03-24)
|
||||
-------------------
|
||||
- Chore: drop Python 2.6. [Jose Diaz-Gonzalez]
|
||||
- Feat: simplify release script. [Jose Diaz-Gonzalez]
|
||||
|
||||
|
||||
0.19.2 (2018-03-24)
|
||||
-------------------
|
||||
|
||||
Fix
|
||||
~~~
|
||||
- Cleanup pep8 violations. [Jose Diaz-Gonzalez]
|
||||
|
||||
|
||||
0.19.0 (2018-03-24)
|
||||
-------------------
|
||||
- Add additional output for the current request. [Robin Gloster]
|
||||
|
||||
This is useful to have some progress indication for huge repositories.
|
||||
|
||||
|
||||
- Add option to backup additional PR details. [Robin Gloster]
|
||||
|
||||
Some payload is only included when requesting a single pull request
|
||||
|
||||
|
||||
- Mark string as binary in comparison for skip_existing. [Johannes
|
||||
Bornhold]
|
||||
|
||||
@@ -24,66 +192,53 @@ Changelog
|
||||
|
||||
0.18.0 (2018-02-22)
|
||||
-------------------
|
||||
|
||||
- Add option to fetch followers/following JSON data. [Stephen Greene]
|
||||
|
||||
|
||||
0.17.0 (2018-02-20)
|
||||
-------------------
|
||||
|
||||
- Short circuit gists backup process. [W. Harrison Wright]
|
||||
|
||||
- Formatting. [W. Harrison Wright]
|
||||
|
||||
- Add ability to backup gists. [W. Harrison Wright]
|
||||
|
||||
|
||||
0.16.0 (2018-01-22)
|
||||
-------------------
|
||||
|
||||
- Change option to --all-starred. [W. Harrison Wright]
|
||||
|
||||
- JK don't update documentation. [W. Harrison Wright]
|
||||
|
||||
- Put starred clone repositories under a new option. [W. Harrison
|
||||
Wright]
|
||||
|
||||
- Add comment. [W. Harrison Wright]
|
||||
|
||||
- Add ability to clone starred repos. [W. Harrison Wright]
|
||||
|
||||
|
||||
0.14.1 (2017-10-11)
|
||||
-------------------
|
||||
|
||||
- Fix arg not defined error. [Edward Pfremmer]
|
||||
|
||||
Ref: https://github.com/josegonzalez/python-github-backup/issues/69
|
||||
|
||||
0.14.0 (2017-10-11)
|
||||
-------------------
|
||||
|
||||
- Added a check to see if git-lfs is installed when doing an LFS clone.
|
||||
[pieterclaerhout]
|
||||
|
||||
- Added support for LFS clones. [pieterclaerhout]
|
||||
|
||||
- Add pypi info to readme. [Albert Wang]
|
||||
|
||||
- Explicitly support python 3 in package description. [Albert Wang]
|
||||
|
||||
- Add couple examples to help new users. [Yusuf Tran]
|
||||
|
||||
|
||||
0.13.2 (2017-05-06)
|
||||
-------------------
|
||||
|
||||
- Fix remotes while updating repository. [Dima Gerasimov]
|
||||
|
||||
|
||||
0.13.1 (2017-04-11)
|
||||
-------------------
|
||||
|
||||
- Fix error when repository has no updated_at value. [Nicolai Ehemann]
|
||||
|
||||
|
||||
0.13.0 (2017-04-05)
|
||||
-------------------
|
||||
|
||||
- Add OS check for OSX specific keychain args. [Martin O'Reilly]
|
||||
|
||||
Keychain arguments are only supported on Mac OSX.
|
||||
@@ -92,8 +247,6 @@ Changelog
|
||||
error message rather than a "No password item matching the
|
||||
provided name and account could be found in the osx keychain"
|
||||
error message
|
||||
|
||||
|
||||
- Add support for storing PAT in OSX keychain. [Martin O'Reilly]
|
||||
|
||||
Added additional optional arguments and README guidance for storing
|
||||
@@ -103,62 +256,48 @@ Changelog
|
||||
|
||||
0.12.1 (2017-03-27)
|
||||
-------------------
|
||||
|
||||
- Avoid remote branch name churn. [Chris Adams]
|
||||
|
||||
This avoids the backup output having lots of "[new branch]" messages
|
||||
because removing the old remote name removed all of the existing branch
|
||||
references.
|
||||
|
||||
|
||||
- Fix detection of bare git directories. [Andrzej Maczuga]
|
||||
|
||||
|
||||
0.12.0 (2016-11-22)
|
||||
-------------------
|
||||
|
||||
Fix
|
||||
~~~
|
||||
|
||||
- Properly import version from github_backup package. [Jose Diaz-
|
||||
Gonzalez]
|
||||
|
||||
- Support alternate git status output. [Jose Diaz-Gonzalez]
|
||||
|
||||
Other
|
||||
~~~~~
|
||||
|
||||
- Pep8: E501 line too long (83 > 79 characters) [Jose Diaz-Gonzalez]
|
||||
|
||||
- Pep8: E128 continuation line under-indented for visual indent. [Jose
|
||||
Diaz-Gonzalez]
|
||||
|
||||
- Support archivization using bare git clones. [Andrzej Maczuga]
|
||||
|
||||
- Fix typo, 3x. [Terrell Russell]
|
||||
|
||||
|
||||
0.11.0 (2016-10-26)
|
||||
-------------------
|
||||
|
||||
- Support --token file:///home/user/token.txt (fixes gh-51) [Björn
|
||||
Dahlgren]
|
||||
|
||||
- Fix some linting. [Albert Wang]
|
||||
|
||||
- Fix byte/string conversion for python 3. [Albert Wang]
|
||||
|
||||
- Support python 3. [Albert Wang]
|
||||
|
||||
- Encode special characters in password. [Remi Rampin]
|
||||
|
||||
- Don't pretend program name is "Github Backup" [Remi Rampin]
|
||||
|
||||
- Don't install over insecure connection. [Remi Rampin]
|
||||
|
||||
The git:// protocol is unauthenticated and unencrypted, and no longer advertised by GitHub. Using HTTPS shouldn't impact performance.
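
The `--token file:///home/user/token.txt` form added in this release can be sketched as below. This is an illustration rather than the project's implementation. `resolve_token` is a hypothetical helper, and reading only the first line of the file is an assumption.

```python
FILE_URI_PREFIX = "file://"


def resolve_token(value, opener=open):
    """Return the token itself, or read it from a file:// path."""
    if value.startswith(FILE_URI_PREFIX):
        path = value[len(FILE_URI_PREFIX):]
        with opener(path) as handle:
            # Assumption: the token is on the first line of the file.
            return handle.readline().strip()
    return value
```

Reading the token from a file keeps it out of the shell history and the process list.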
|
||||
|
||||
|
||||
0.10.3 (2016-08-21)
|
||||
-------------------
|
||||
|
||||
- Fixes #29. [Jonas Michel]
|
||||
|
||||
Reporting an error when the user's rate limit is exceeded causes
|
||||
@@ -166,8 +305,6 @@ Other
|
||||
sleep. Instead of generating an explicit error we just want to
|
||||
inform the user that the script is going to sleep until their rate
|
||||
limit count resets.
|
||||
|
||||
|
||||
- Fixes #29. [Jonas Michel]
|
||||
|
||||
The errors list was not being cleared out after resuming a backup
|
||||
@@ -178,14 +315,13 @@ Other
|
||||
|
||||
0.10.2 (2016-08-21)
|
||||
-------------------
|
||||
|
||||
- Add a note regarding git version requirement. [Jose Diaz-Gonzalez]
|
||||
|
||||
Closes #37
|
||||
|
||||
|
||||
0.10.0 (2016-08-18)
|
||||
-------------------
|
||||
|
||||
- Implement incremental updates. [Robert Bradshaw]
|
||||
|
||||
Guarded with an --incremental flag.
|
||||
@@ -198,12 +334,11 @@ Other
|
||||
|
||||
0.9.0 (2016-03-29)
|
||||
------------------
|
||||
|
||||
- Fix cloning private repos with basic auth or token. [Kazuki Suda]
|
||||
|
||||
|
||||
0.8.0 (2016-02-14)
|
||||
------------------
|
||||
|
||||
- Don't store issues which are actually pull requests. [Enrico Tröger]
|
||||
|
||||
This prevents storing pull requests twice since the Github API returns
|
||||
@@ -214,43 +349,31 @@ Other
|
||||
|
||||
0.7.0 (2016-02-02)
|
||||
------------------
|
||||
|
||||
- Softly fail if not able to read hooks. [Albert Wang]
|
||||
|
||||
- Add note about 2-factor auth. [Albert Wang]
|
||||
|
||||
- Make user repository search go through endpoint capable of reading
|
||||
private repositories. [Albert Wang]
|
||||
|
||||
- Prompt for password if only username given. [Alex Hall]
|
||||
|
||||
|
||||
0.6.0 (2015-11-10)
|
||||
------------------
|
||||
|
||||
- Force proper remote url. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Improve error handling in case of HTTP errors. [Enrico Tröger]
|
||||
|
||||
In case of a HTTP status code 404, the returned 'r' was never assigned.
|
||||
In case of URL errors which are not timeouts, we probably should bail
|
||||
out.
|
||||
|
||||
|
||||
- Add --hooks to also include web hooks into the backup. [Enrico Tröger]
|
||||
|
||||
- Create the user specified output directory if it does not exist.
|
||||
[Enrico Tröger]
|
||||
|
||||
Fixes #17.
|
||||
|
||||
|
||||
- Add missing auth argument to _get_response() [Enrico Tröger]
|
||||
|
||||
When running unauthenticated and Github starts rate-limiting the client,
|
||||
github-backup crashes because the used auth variable in _get_response()
|
||||
was not available. This change should fix it.
|
||||
|
||||
|
||||
- Add repository URL to error message for non-existing repositories.
|
||||
[Enrico Tröger]
|
||||
|
||||
@@ -261,40 +384,28 @@ Other
|
||||
|
||||
0.5.0 (2015-10-10)
|
||||
------------------
|
||||
|
||||
- Add release script. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Refactor to both simplify codepath as well as follow PEP8 standards.
|
||||
[Jose Diaz-Gonzalez]
|
||||
|
||||
- Retry 3 times when the connection times out. [Mathijs Jonker]
|
||||
|
||||
- Made unicode output default. [Kirill Grushetsky]
|
||||
|
||||
- Import alphabetised. [Kirill Grushetsky]
|
||||
|
||||
- Preserve Unicode characters in the output file. [Kirill Grushetsky]
|
||||
|
||||
Added option to preserve Unicode characters in the output file
|
||||
|
||||
- Josegonzales/python-github-backup#12 Added backup of labels and
|
||||
milestones. [aensley]
|
||||
|
||||
- Fixed indent. [Mathijs Jonker]
|
||||
|
||||
- Skip uninitialized repos. [mjonker-embed]
|
||||
|
||||
These gave me errors which caused mails from crontab.
|
||||
|
||||
- Added prefer-ssh. [mjonker-embed]
|
||||
|
||||
Was needed for my back-up setup, code includes this but readme wasn't updated
|
||||
|
||||
- Retry API requests which failed due to rate-limiting. [Chris Adams]
|
||||
|
||||
This allows operation to continue, albeit at a slower pace,
|
||||
if you have enough data to trigger the API rate limits
|
||||
|
||||
- Logging_subprocess: always log when a command fails. [Chris Adams]
|
||||
|
||||
Previously git clones could fail without any indication
|
||||
@@ -304,21 +415,15 @@ Other
|
||||
Now a non-zero return code will always output a message to
|
||||
stderr and will display the executed command so it can be
|
||||
rerun for troubleshooting.
|
||||
|
||||
|
||||
- Switch to using ssh_url. [Chris Adams]
|
||||
|
||||
The previous commit used the wrong URL for a private repo. This was
|
||||
masked by the lack of error logging in logging_subprocess (which will be
|
||||
in a separate branch)
|
||||
|
||||
|
||||
- Add an option to prefer checkouts over SSH. [Chris Adams]
|
||||
|
||||
This is really useful with private repos to avoid being nagged
|
||||
for credentials for every repository
|
||||
|
||||
|
||||
- Add pull request support. [Kevin Laude]
|
||||
|
||||
Back up repository pull requests by passing the --include-pulls
|
||||
@@ -330,8 +435,6 @@ Other
|
||||
|
||||
Pull requests are automatically backed up when the --all argument is
|
||||
used.
|
||||
|
||||
|
||||
- Add GitHub Enterprise support. [Kevin Laude]
|
||||
|
||||
Pass the -H or --github-host argument with a GitHub Enterprise hostname
|
||||
@@ -341,35 +444,21 @@ Other
|
||||
|
||||
0.2.0 (2014-09-22)
|
||||
------------------
|
||||
|
||||
- Add support for retrieving repositories. Closes #1. [Jose Diaz-
|
||||
Gonzalez]
|
||||
|
||||
- Fix PEP8 violations. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Add authorization to header only if specified by user. [Ioannis
|
||||
Filippidis]
|
||||
|
||||
- Fill out readme more. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Fix import. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Properly name readme. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create MANIFEST.in. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create .gitignore. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create setup.py. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create requirements.txt. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create __init__.py. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create LICENSE.txt. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create README.md. [Jose Diaz-Gonzalez]
|
||||
|
||||
- Create github-backup. [Jose Diaz-Gonzalez]
|
||||
|
||||
|
||||
|
||||
13
ISSUE_TEMPLATE.md
Normal file
@@ -0,0 +1,13 @@
# Important notice regarding filed issues

This project already fills my needs, and as such I have no real reason to continue its development. This project is otherwise provided as is, and no support is given.

If pull requests implementing bug fixes or enhancements are pushed, I am happy to review and merge them (time permitting).

If you wish to have a bug fixed, you have a few options:

- Fix it yourself and file a pull request.
- File a bug and hope someone else fixes it for you.
- Pay me to fix it (my rate is $200 an hour, minimum 1 hour, contact me via my [github email address](https://github.com/josegonzalez) if you want to go this route).

In all cases, feel free to file an issue; it may be of help to others in the future.
7
PULL_REQUEST.md
Normal file
@@ -0,0 +1,7 @@
# Important notice regarding filed pull requests

This project already fills my needs, and as such I have no real reason to continue its development. This project is otherwise provided as is, and no support is given.

I will attempt to review pull requests at _my_ earliest convenience. If I am unable to get to your pull request in a timely fashion, it is what it is. This repository does not pay any bills, and I am not required to merge any pull request from any individual.

If you wish to jump my personal priority queue, you may pay me for my time to review. My rate is $200 an hour - minimum 1 hour - feel free to contact me via my github email address if you want to go this route.
43
README.rst
@@ -4,6 +4,8 @@ github-backup
|
||||
|
||||
|PyPI| |Python Versions|
|
||||
|
||||
This project is considered feature complete for the primary maintainer. If you would like a bugfix or enhancement and cannot sponsor the work, pull requests are welcome. Feel free to contact the maintainer for consulting estimates if desired.
|
||||
|
||||
backup a github user or organization
|
||||
|
||||
Requirements
|
||||
@@ -27,18 +29,19 @@ Usage
|
||||
|
||||
CLI Usage is as follows::
|
||||
|
||||
github-backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN]
|
||||
github-backup [-h] [-u USERNAME] [-p PASSWORD] [-t TOKEN] [--as-app]
|
||||
[-o OUTPUT_DIRECTORY] [-i] [--starred] [--all-starred]
|
||||
[--watched] [--followers] [--following] [--all]
|
||||
[--issues] [--issue-comments] [--issue-events] [--pulls]
|
||||
[--pull-comments] [--pull-commits] [--labels] [--hooks]
|
||||
[--milestones] [--repositories] [--bare] [--lfs]
|
||||
[--wikis] [--gists] [--starred-gists] [--skip-existing]
|
||||
[-L [LANGUAGES [LANGUAGES ...]]] [-N NAME_REGEX]
|
||||
[-H GITHUB_HOST] [-O] [-R REPOSITORY] [-P] [-F]
|
||||
[--prefer-ssh] [-v]
|
||||
[--pull-comments] [--pull-commits] [--pull-details]
|
||||
[--labels] [--hooks] [--milestones] [--repositories]
|
||||
[--bare] [--lfs] [--wikis] [--gists] [--starred-gists]
|
||||
[--skip-existing] [-L [LANGUAGES [LANGUAGES ...]]]
|
||||
[-N NAME_REGEX] [-H GITHUB_HOST] [-O] [-R REPOSITORY]
|
||||
[-P] [-F] [--prefer-ssh] [-v]
|
||||
[--keychain-name OSX_KEYCHAIN_ITEM_NAME]
|
||||
[--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT]
|
||||
[--releases] [--assets]
|
||||
USER
|
||||
|
||||
Backup a github account
|
||||
@@ -54,23 +57,25 @@ CLI Usage is as follows::
|
||||
password for basic auth. If a username is given but
|
||||
not a password, the password will be prompted for.
|
||||
-t TOKEN, --token TOKEN
|
||||
personal access or OAuth token, or path to token
|
||||
(file://...)
|
||||
personal access, OAuth, or JSON Web token, or path to
|
||||
token (file://...)
|
||||
--as-app authenticate as github app instead of as a user.
|
||||
-o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
|
||||
directory at which to backup the repositories
|
||||
-i, --incremental incremental backup
|
||||
--starred include JSON output of starred repositories in backup
|
||||
--all-starred include starred repositories in backup
|
||||
--watched include watched repositories in backup
|
||||
--all-starred include starred repositories in backup [*]
|
||||
--watched include JSON output of watched repositories in backup
|
||||
--followers include JSON output of followers in backup
|
||||
--following include JSON output of following users in backup
|
||||
--all include everything in backup
|
||||
--all include everything in backup (not including [*])
|
||||
--issues include issues in backup
|
||||
--issue-comments include issue comments in backup
|
||||
--issue-events include issue events in backup
|
||||
--pulls include pull requests in backup
|
||||
--pull-comments include pull request review comments in backup
|
||||
--pull-commits include pull request commits in backup
|
||||
--pull-details include more pull request details in backup [*]
|
||||
--labels include labels in backup
|
||||
--hooks include hooks in backup (works only when
|
||||
authenticated)
|
||||
@@ -78,10 +83,10 @@ CLI Usage is as follows::
|
||||
--repositories include repository clone in backup
|
||||
--bare clone bare repositories
|
||||
--lfs clone LFS repositories (requires Git LFS to be
|
||||
installed, https://git-lfs.github.com)
|
||||
installed, https://git-lfs.github.com) [*]
|
||||
--wikis include wiki clone in backup
|
||||
--gists include gists in backup
|
||||
--starred-gists include starred gists in backup
|
||||
--gists include gists in backup [*]
|
||||
--starred-gists include starred gists in backup [*]
|
||||
--skip-existing skip project if a backup directory exists
|
||||
-L [LANGUAGES [LANGUAGES ...]], --languages [LANGUAGES [LANGUAGES ...]]
|
||||
only allow these languages
|
||||
@@ -92,8 +97,8 @@ CLI Usage is as follows::
|
||||
-O, --organization whether or not this is an organization user
|
||||
-R REPOSITORY, --repository REPOSITORY
|
||||
name of repository to limit backup to
|
||||
-P, --private include private repositories
|
||||
-F, --fork include forked repositories
|
||||
-P, --private include private repositories [*]
|
||||
-F, --fork include forked repositories [*]
|
||||
--prefer-ssh Clone repositories using SSH instead of HTTPS
|
||||
-v, --version show program's version number and exit
|
||||
--keychain-name OSX_KEYCHAIN_ITEM_NAME
|
||||
@@ -102,6 +107,10 @@ CLI Usage is as follows::
|
||||
--keychain-account OSX_KEYCHAIN_ITEM_ACCOUNT
|
||||
OSX ONLY: account field of password item in OSX
|
||||
keychain that holds the personal access or OAuth token
|
||||
--releases include release information, not including assets or
|
||||
binaries
|
||||
--assets include assets alongside release information; only
|
||||
applies if including releases
|
||||
|
||||
|
||||
The package can be used to backup an *entire* organization or repository, including issues and wikis in the most appropriate format (clones for wikis, json files for issues).
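
As a rough illustration of the flags in the usage text above, the sketch below assembles a `github-backup` command line for backing up a user's or organization's repositories, issues and wikis. `build_backup_command` is a hypothetical helper. It only builds the argument list and does not run anything.

```python
def build_backup_command(user, token, output_dir, organization=False):
    """Build an argument list for the github-backup CLI.

    Only flags shown in the usage text are used: -t, -o, -O,
    --repositories, --issues and --wikis.
    """
    command = ["github-backup", user,
               "-t", token,
               "-o", output_dir,
               "--repositories", "--issues", "--wikis"]
    if organization:
        # -O marks USER as an organization rather than a regular user.
        command.append("-O")
    return command
```

The resulting list can be handed to `subprocess.call` if the tool is installed. Passing the token as `file://...` instead of a literal value keeps it out of the process list.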
|
||||
|
||||
@@ -1,971 +1,18 @@
#!/usr/bin/env python

from __future__ import print_function

import argparse
import base64
import calendar
import codecs
import errno
import getpass
import json
import logging
import os
import re
import select
import subprocess
import sys
import time
import platform
try:
    # python 3
    from urllib.parse import urlparse
    from urllib.parse import quote as urlquote
    from urllib.parse import urlencode
    from urllib.error import HTTPError, URLError
    from urllib.request import urlopen
    from urllib.request import Request
except ImportError:
    # python 2
    from urlparse import urlparse
    from urllib import quote as urlquote
    from urllib import urlencode
    from urllib2 import HTTPError, URLError
    from urllib2 import urlopen
    from urllib2 import Request

from github_backup import __version__

FNULL = open(os.devnull, 'w')


def log_error(message):
    if type(message) == str:
        message = [message]

    for msg in message:
        sys.stderr.write("{0}\n".format(msg))

    sys.exit(1)


def log_info(message):
    if type(message) == str:
        message = [message]

    for msg in message:
        sys.stdout.write("{0}\n".format(msg))


def logging_subprocess(popenargs,
                       logger,
                       stdout_log_level=logging.DEBUG,
                       stderr_log_level=logging.ERROR,
                       **kwargs):
    """
    Variant of subprocess.call that accepts a logger instead of stdout/stderr,
    and logs stdout messages via logger.debug and stderr messages via
    logger.error.
    """
    child = subprocess.Popen(popenargs, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE, **kwargs)
    if sys.platform == 'win32':
        log_info("Windows operating system detected - no subprocess logging will be returned")

    log_level = {child.stdout: stdout_log_level,
                 child.stderr: stderr_log_level}

    def check_io():
        if sys.platform == 'win32':
            return
        ready_to_read = select.select([child.stdout, child.stderr],
                                      [],
                                      [],
                                      1000)[0]
        for io in ready_to_read:
            line = io.readline()
            if not logger:
                continue
            if not (io == child.stderr and not line):
                logger.log(log_level[io], line[:-1])

    # keep checking stdout/stderr until the child exits
    while child.poll() is None:
        check_io()

    check_io()  # check again to catch anything after the process exits

    rc = child.wait()

    if rc != 0:
        print('{} returned {}:'.format(popenargs[0], rc), file=sys.stderr)
        print('\t', ' '.join(popenargs), file=sys.stderr)

    return rc


def mkdir_p(*args):
    for path in args:
        try:
            os.makedirs(path)
        except OSError as exc:  # Python >2.5
            if exc.errno == errno.EEXIST and os.path.isdir(path):
                pass
            else:
                raise


def mask_password(url, secret='*****'):
|
||||
parsed = urlparse(url)
|
||||
|
||||
if not parsed.password:
|
||||
return url
|
||||
elif parsed.password == 'x-oauth-basic':
|
||||
return url.replace(parsed.username, secret)
|
||||
|
||||
return url.replace(parsed.password, secret)
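
Because `str.replace` substitutes every occurrence, a short password such as `git` would also be replaced inside the host and path. A possible alternative, sketched here for Python 3 only, rebuilds just the userinfo part of the URL. `mask_credentials` is a hypothetical helper, and it assumes a plain host name (the host is lowercased and IPv6 brackets are not handled).

```python
from urllib.parse import urlsplit, urlunsplit


def mask_credentials(url, secret='*****'):
    """Return url with the sensitive half of its userinfo replaced by secret."""
    parts = urlsplit(url)
    if not parts.password:
        return url  # nothing to hide

    if parts.password == 'x-oauth-basic':
        # Token auth: the token sits in the username slot.
        userinfo = '{0}:{1}'.format(secret, parts.password)
    else:
        userinfo = '{0}:{1}'.format(parts.username, secret)

    host = parts.hostname
    if parts.port is not None:
        host = '{0}:{1}'.format(host, parts.port)
    netloc = '{0}@{1}'.format(userinfo, host)
    return urlunsplit(parts._replace(netloc=netloc))
```

Only the network-location component is rewritten, so the path and query are returned untouched.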


def parse_args():
    parser = argparse.ArgumentParser(description='Backup a github account')
    parser.add_argument('user',
                        metavar='USER',
                        type=str,
                        help='github username')
    parser.add_argument('-u',
                        '--username',
                        dest='username',
                        help='username for basic auth')
    parser.add_argument('-p',
                        '--password',
                        dest='password',
                        help='password for basic auth. '
                             'If a username is given but not a password, the '
                             'password will be prompted for.')
    parser.add_argument('-t',
                        '--token',
                        dest='token',
                        help='personal access or OAuth token, or path to token (file://...)')  # noqa
    parser.add_argument('-o',
                        '--output-directory',
                        default='.',
                        dest='output_directory',
                        help='directory at which to backup the repositories')
    parser.add_argument('-i',
                        '--incremental',
                        action='store_true',
                        dest='incremental',
                        help='incremental backup')
    parser.add_argument('--starred',
                        action='store_true',
                        dest='include_starred',
                        help='include JSON output of starred repositories in backup')
    parser.add_argument('--all-starred',
                        action='store_true',
                        dest='all_starred',
                        help='include starred repositories in backup')
    parser.add_argument('--watched',
                        action='store_true',
                        dest='include_watched',
                        help='include watched repositories in backup')
    parser.add_argument('--followers',
                        action='store_true',
                        dest='include_followers',
                        help='include JSON output of followers in backup')
    parser.add_argument('--following',
                        action='store_true',
                        dest='include_following',
                        help='include JSON output of following users in backup')
    parser.add_argument('--all',
                        action='store_true',
                        dest='include_everything',
                        help='include everything in backup')
    parser.add_argument('--issues',
                        action='store_true',
                        dest='include_issues',
                        help='include issues in backup')
    parser.add_argument('--issue-comments',
                        action='store_true',
                        dest='include_issue_comments',
                        help='include issue comments in backup')
    parser.add_argument('--issue-events',
                        action='store_true',
                        dest='include_issue_events',
                        help='include issue events in backup')
    parser.add_argument('--pulls',
                        action='store_true',
                        dest='include_pulls',
                        help='include pull requests in backup')
    parser.add_argument('--pull-comments',
                        action='store_true',
                        dest='include_pull_comments',
                        help='include pull request review comments in backup')
    parser.add_argument('--pull-commits',
                        action='store_true',
                        dest='include_pull_commits',
                        help='include pull request commits in backup')
    parser.add_argument('--pull-details',
                        action='store_true',
                        dest='include_pull_details',
                        help='include more pull request details in backup')
    parser.add_argument('--labels',
                        action='store_true',
                        dest='include_labels',
                        help='include labels in backup')
    parser.add_argument('--hooks',
                        action='store_true',
                        dest='include_hooks',
                        help='include hooks in backup (works only when authenticated)')  # noqa
    parser.add_argument('--milestones',
                        action='store_true',
                        dest='include_milestones',
                        help='include milestones in backup')
    parser.add_argument('--repositories',
                        action='store_true',
                        dest='include_repository',
                        help='include repository clone in backup')
    parser.add_argument('--bare',
                        action='store_true',
                        dest='bare_clone',
                        help='clone bare repositories')
    parser.add_argument('--lfs',
                        action='store_true',
                        dest='lfs_clone',
                        help='clone LFS repositories (requires Git LFS to be installed, https://git-lfs.github.com)')
    parser.add_argument('--wikis',
                        action='store_true',
                        dest='include_wiki',
                        help='include wiki clone in backup')
    parser.add_argument('--gists',
                        action='store_true',
                        dest='include_gists',
                        help='include gists in backup')
    parser.add_argument('--starred-gists',
                        action='store_true',
                        dest='include_starred_gists',
                        help='include starred gists in backup')
    parser.add_argument('--skip-existing',
                        action='store_true',
                        dest='skip_existing',
                        help='skip project if a backup directory exists')
    parser.add_argument('-L',
                        '--languages',
                        dest='languages',
                        help='only allow these languages',
                        nargs='*')
    parser.add_argument('-N',
                        '--name-regex',
                        dest='name_regex',
                        help='python regex to match names against')
    parser.add_argument('-H',
                        '--github-host',
                        dest='github_host',
                        help='GitHub Enterprise hostname')
    parser.add_argument('-O',
                        '--organization',
                        action='store_true',
                        dest='organization',
                        help='whether or not this is an organization user')
    parser.add_argument('-R',
                        '--repository',
                        dest='repository',
                        help='name of repository to limit backup to')
    parser.add_argument('-P', '--private',
                        action='store_true',
                        dest='private',
                        help='include private repositories')
    parser.add_argument('-F', '--fork',
                        action='store_true',
                        dest='fork',
                        help='include forked repositories')
    parser.add_argument('--prefer-ssh',
                        action='store_true',
                        help='Clone repositories using SSH instead of HTTPS')
    parser.add_argument('-v', '--version',
                        action='version',
                        version='%(prog)s ' + __version__)
    parser.add_argument('--keychain-name',
                        dest='osx_keychain_item_name',
                        help='OSX ONLY: name field of password item in OSX keychain that holds the personal access or OAuth token')
    parser.add_argument('--keychain-account',
                        dest='osx_keychain_item_account',
                        help='OSX ONLY: account field of password item in OSX keychain that holds the personal access or OAuth token')
    return parser.parse_args()


def get_auth(args, encode=True):
    auth = None

    if args.osx_keychain_item_name:
        if not args.osx_keychain_item_account:
            log_error('You must specify both name and account fields for osx keychain password items')
        else:
            if platform.system() != 'Darwin':
                log_error("Keychain arguments are only supported on Mac OSX")
            try:
                with open(os.devnull, 'w') as devnull:
                    token = (subprocess.check_output([
                        'security', 'find-generic-password',
                        '-s', args.osx_keychain_item_name,
                        '-a', args.osx_keychain_item_account,
                        '-w'], stderr=devnull).strip())
                # check_output returns bytes on Python 3, so decode
                # before concatenating with a str
                token = token.decode('utf-8')
                auth = token + ':' + 'x-oauth-basic'
            except Exception:
                log_error('No password item matching the provided name and account could be found in the osx keychain.')
    elif args.osx_keychain_item_account:
        log_error('You must specify both name and account fields for osx keychain password items')
    elif args.token:
        _path_specifier = 'file://'
        if args.token.startswith(_path_specifier):
            args.token = open(args.token[len(_path_specifier):],
                              'rt').readline().strip()
        auth = args.token + ':' + 'x-oauth-basic'
    elif args.username:
        if not args.password:
            args.password = getpass.getpass()
        if encode:
            password = args.password
        else:
            password = urlquote(args.password)
        auth = args.username + ':' + password
    elif args.password:
        log_error('You must specify a username for basic auth')

    if not auth:
        return None

    if not encode:
        return auth

    return base64.b64encode(auth.encode('ascii'))
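
The value produced by the `encode=True` path is the payload of an HTTP Basic `Authorization` header. A minimal Python 3 sketch of that encoding follows; `basic_auth_header` is a hypothetical helper, and it uses UTF-8 where the code above uses ASCII.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    raw = '{0}:{1}'.format(username, password).encode('utf-8')
    # b64encode returns bytes; decode so the header value is a str.
    return 'Basic ' + base64.b64encode(raw).decode('ascii')
```

For token authentication as used above, the token goes in the username position and the literal `x-oauth-basic` in the password position, for example `basic_auth_header(token, 'x-oauth-basic')`.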


def get_github_api_host(args):
    if args.github_host:
        host = args.github_host + '/api/v3'
    else:
        host = 'api.github.com'

    return host


def get_github_host(args):
    if args.github_host:
        host = args.github_host
    else:
        host = 'github.com'

    return host


def get_github_repo_url(args, repository):
    if args.prefer_ssh:
        return repository['ssh_url']

    if repository.get('is_gist'):
        return repository['git_pull_url']

    auth = get_auth(args, False)
    if auth:
        repo_url = 'https://{0}@{1}/{2}/{3}.git'.format(
            auth,
            get_github_host(args),
            repository['owner']['login'],
            repository['name'])
    else:
        repo_url = repository['clone_url']

    return repo_url


def retrieve_data(args, template, query_args=None, single_request=False):
    auth = get_auth(args)
    query_args = get_query_args(query_args)
    per_page = 100
    page = 0
    data = []

    while True:
        page = page + 1
        request = _construct_request(per_page, page, query_args, template, auth)  # noqa
        r, errors = _get_response(request, auth, template)

        status_code = int(r.getcode())

        if status_code != 200:
            template = 'API request returned HTTP {0}: {1}'
            errors.append(template.format(status_code, r.reason))
            log_error(errors)

        response = json.loads(r.read().decode('utf-8'))
        if len(errors) == 0:
            if type(response) == list:
                data.extend(response)
                if len(response) < per_page:
                    break
            elif type(response) == dict and single_request:
                data.append(response)

        if len(errors) > 0:
            log_error(errors)

        if single_request:
            break

    return data
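
The loop above stops paging as soon as a page comes back with fewer than `per_page` items. That termination rule can be sketched in isolation. Both `paginate` and `make_fake_api` are hypothetical names, and the fake source stands in for the HTTP request.

```python
def paginate(fetch_page, per_page=100):
    """Collect items from a 1-indexed paged source until a short page arrives."""
    items = []
    page = 0
    while True:
        page += 1
        batch = fetch_page(page, per_page)
        items.extend(batch)
        if len(batch) < per_page:
            break
    return items


def make_fake_api(total):
    """Hypothetical stand-in for the HTTP call: serves the integers 0..total-1."""
    def fetch_page(page, per_page):
        start = (page - 1) * per_page
        return list(range(start, min(start + per_page, total)))
    return fetch_page
```

One consequence of this rule: when the total is an exact multiple of `per_page`, one extra request is made and returns an empty page.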


def get_query_args(query_args=None):
    if not query_args:
        query_args = {}
    return query_args


def _get_response(request, auth, template):
    retry_timeout = 3
    errors = []
    # We'll make requests in a loop so we can
    # delay and retry in the case of rate-limiting
    while True:
        should_continue = False
        try:
            r = urlopen(request)
        except HTTPError as exc:
            errors, should_continue = _request_http_error(exc, auth, errors)  # noqa
            r = exc
        except URLError:
            should_continue = _request_url_error(template, retry_timeout)
            if not should_continue:
                raise

        if should_continue:
            continue

        break
    return r, errors


def _construct_request(per_page, page, query_args, template, auth):
    querystring = urlencode(dict(list({
        'per_page': per_page,
        'page': page
    }.items()) + list(query_args.items())))

    request = Request(template + '?' + querystring)
    if auth is not None:
        request.add_header('Authorization', 'Basic '.encode('ascii') + auth)
    log_info('Requesting {}?{}'.format(template, querystring))
    return request
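
The query string above merges the paging defaults with the caller's arguments, with the caller's values winning on a key clash. A compact Python 3 sketch of the same merge is shown below. `build_url` is a hypothetical helper, and sorting the parameters is an addition here to make the output deterministic.

```python
from urllib.parse import urlencode


def build_url(template, page, per_page=100, query_args=None):
    """Append paging parameters plus any caller-supplied ones to template."""
    params = {'per_page': per_page, 'page': page}
    # Caller-supplied values override the paging defaults.
    params.update(query_args or {})
    return template + '?' + urlencode(sorted(params.items()))
```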


def _request_http_error(exc, auth, errors):
    # HTTPError behaves like a Response so we can
    # check the status code and headers to see exactly
    # what failed.

    should_continue = False
    headers = exc.headers
    limit_remaining = int(headers.get('x-ratelimit-remaining', 0))

    if exc.code == 403 and limit_remaining < 1:
        # The X-RateLimit-Reset header includes a
        # timestamp telling us when the limit will reset
        # so we can calculate how long to wait rather
        # than inefficiently polling:
        gm_now = calendar.timegm(time.gmtime())
        reset = int(headers.get('x-ratelimit-reset', 0)) or gm_now
        # We'll never sleep for less than 10 seconds:
        delta = max(10, reset - gm_now)

        limit = headers.get('x-ratelimit-limit')
        print('Exceeded rate limit of {} requests; waiting {} seconds to reset'.format(limit, delta),  # noqa
              file=sys.stderr)

        if auth is None:
            print('Hint: Authenticate to raise your GitHub rate limit',
                  file=sys.stderr)

        time.sleep(delta)
        should_continue = True
    return errors, should_continue
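
The wait computation can be isolated into a pure function, which makes the 10-second floor and the missing-header fallback easy to check. This is a sketch only: `seconds_until_reset` is a hypothetical name, and it takes a plain dict, whose lookups are case-sensitive unlike real HTTP header objects.

```python
def seconds_until_reset(headers, now, minimum=10):
    """Return how long to sleep before the rate limit window resets.

    headers: mapping that may contain 'x-ratelimit-reset' (epoch seconds).
    now: current time as epoch seconds (UTC).
    """
    # A missing or zero header falls back to "now", giving the minimum wait.
    reset = int(headers.get('x-ratelimit-reset', 0)) or now
    return max(minimum, reset - now)
```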


def _request_url_error(template, retry_timeout):
    # In case of a connection timing out, we can retry a few times,
    # but we won't crash and fail to back up the rest now
    log_info('{} timed out'.format(template))
    retry_timeout -= 1

    if retry_timeout >= 0:
        return True

    log_error('{} timed out too much, skipping!'.format(template))
    return False
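
Note that `retry_timeout` is an integer passed by value, so the decrement above does not reach the caller's counter. As written, `_get_response` appears to pass 3 on every call and the function always returns `True`. A bounded retry keeps the counter in the loop that owns it. The following is a generic sketch, not the project's implementation, and `call_with_retries` is a hypothetical name.

```python
def call_with_retries(operation, retries=3, retry_on=(OSError,)):
    """Call operation(); on a listed exception retry up to `retries` more times."""
    attempts_left = retries
    while True:
        try:
            return operation()
        except retry_on:
            if attempts_left <= 0:
                raise  # out of retries: surface the last error
            attempts_left -= 1
```

On Python 3, `urllib.error.URLError` is a subclass of `OSError`, so the default `retry_on` would cover it.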


def check_git_lfs_install():
    exit_code = subprocess.call(['git', 'lfs', 'version'])
    if exit_code != 0:
        log_error('The argument --lfs requires you to have Git LFS installed.\nYou can get it from https://git-lfs.github.com.')
        sys.exit(1)


def retrieve_repositories(args):
    log_info('Retrieving repositories')
    single_request = False
    template = 'https://{0}/user/repos'.format(
        get_github_api_host(args))
    if args.organization:
        template = 'https://{0}/orgs/{1}/repos'.format(
            get_github_api_host(args),
            args.user)

    if args.repository:
        single_request = True
        template = 'https://{0}/repos/{1}/{2}'.format(
            get_github_api_host(args),
            args.user,
            args.repository)

    repos = retrieve_data(args, template, single_request=single_request)

    if args.all_starred:
        starred_template = 'https://{0}/user/starred'.format(get_github_api_host(args))
        starred_repos = retrieve_data(args, starred_template, single_request=False)
        # flag each repo as starred for downstream processing
        for item in starred_repos:
            item.update({'is_starred': True})
        repos.extend(starred_repos)

    if args.include_gists:
        gists_template = 'https://{0}/gists'.format(get_github_api_host(args))
        gists = retrieve_data(args, gists_template, single_request=False)
        # flag each repo as a gist for downstream processing
        for item in gists:
            item.update({'is_gist': True})
        repos.extend(gists)

    if args.include_starred_gists:
        starred_gists_template = 'https://{0}/gists/starred'.format(get_github_api_host(args))
        starred_gists = retrieve_data(args, starred_gists_template, single_request=False)
        # flag each repo as a starred gist for downstream processing
        for item in starred_gists:
            item.update({'is_gist': True,
                         'is_starred': True})
        repos.extend(starred_gists)

    return repos


def filter_repositories(args, unfiltered_repositories):
    log_info('Filtering repositories')

    repositories = []
    for r in unfiltered_repositories:
        # gists can be anonymous, so need to safely check owner
        if r.get('owner', {}).get('login') == args.user or r.get('is_starred'):
            repositories.append(r)

    name_regex = None
    if args.name_regex:
        name_regex = re.compile(args.name_regex)

    languages = None
    if args.languages:
        languages = [x.lower() for x in args.languages]

    if not args.fork:
        repositories = [r for r in repositories if not r.get('fork')]
    if not args.private:
        repositories = [r for r in repositories if not r.get('private') or r.get('public')]
    if languages:
        repositories = [r for r in repositories if r.get('language') and r.get('language').lower() in languages]  # noqa
    if name_regex:
        repositories = [r for r in repositories if name_regex.match(r['name'])]

    return repositories
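
The fork, private, language and name filters can be exercised on plain dictionaries. Below is a simplified sketch: `filter_repos` and its keyword arguments are hypothetical, and it leaves out the owner check and the `public` key that the function above also consults.

```python
import re


def filter_repos(repos, include_forks=False, include_private=False,
                 languages=None, name_regex=None):
    """Apply the fork, private, language and name filters in sequence."""
    wanted = {lang.lower() for lang in languages or []}
    pattern = re.compile(name_regex) if name_regex else None

    kept = []
    for repo in repos:
        if repo.get('fork') and not include_forks:
            continue
        if repo.get('private') and not include_private:
            continue
        # Repositories with no detected language never match a language filter.
        if wanted and (repo.get('language') or '').lower() not in wanted:
            continue
        # re.match anchors at the start of the name only.
        if pattern and not pattern.match(repo['name']):
            continue
        kept.append(repo)
    return kept
```

Because `re.match` only anchors at the start, a pattern such as `app` also keeps `app-fork`; add `$` to the pattern for an exact match.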


def backup_repositories(args, output_directory, repositories):
    log_info('Backing up repositories')
    repos_template = 'https://{0}/repos'.format(get_github_api_host(args))

    if args.incremental:
        last_update = max(list(repository['updated_at'] for repository in repositories) or [time.strftime('%Y-%m-%dT%H:%M:%SZ', time.localtime())])  # noqa
        last_update_path = os.path.join(output_directory, 'last_update')
        if os.path.exists(last_update_path):
            args.since = open(last_update_path).read().strip()
        else:
            args.since = None
    else:
        args.since = None

    for repository in repositories:
        if repository.get('is_gist'):
            repo_cwd = os.path.join(output_directory, 'gists', repository['id'])
        elif repository.get('is_starred'):
            # put starred repos in -o/starred/${owner}/${repo} to prevent collision of
            # any repositories with the same name
            repo_cwd = os.path.join(output_directory, 'starred', repository['owner']['login'], repository['name'])
        else:
            repo_cwd = os.path.join(output_directory, 'repositories', repository['name'])

        repo_dir = os.path.join(repo_cwd, 'repository')
        repo_url = get_github_repo_url(args, repository)

        include_gists = (args.include_gists or args.include_starred_gists)
        if (args.include_repository or args.include_everything) \
                or (include_gists and repository.get('is_gist')):
            repo_name = repository.get('name') if not repository.get('is_gist') else repository.get('id')
            fetch_repository(repo_name,
                             repo_url,
                             repo_dir,
                             skip_existing=args.skip_existing,
                             bare_clone=args.bare_clone,
                             lfs_clone=args.lfs_clone)

            if repository.get('is_gist'):
                # dump gist information to a file as well
                output_file = '{0}/gist.json'.format(repo_cwd)
                with codecs.open(output_file, 'w', encoding='utf-8') as f:
                    json_dump(repository, f)

                continue  # don't try to back anything else for a gist; it doesn't exist

        download_wiki = (args.include_wiki or args.include_everything)
        if repository['has_wiki'] and download_wiki:
            fetch_repository(repository['name'],
                             repo_url.replace('.git', '.wiki.git'),
                             os.path.join(repo_cwd, 'wiki'),
                             skip_existing=args.skip_existing,
                             bare_clone=args.bare_clone,
                             lfs_clone=args.lfs_clone)

        if args.include_issues or args.include_everything:
            backup_issues(args, repo_cwd, repository, repos_template)

        if args.include_pulls or args.include_everything:
            backup_pulls(args, repo_cwd, repository, repos_template)

        if args.include_milestones or args.include_everything:
            backup_milestones(args, repo_cwd, repository, repos_template)

        if args.include_labels or args.include_everything:
            backup_labels(args, repo_cwd, repository, repos_template)

        if args.include_hooks or args.include_everything:
            backup_hooks(args, repo_cwd, repository, repos_template)

    if args.incremental:
        open(last_update_path, 'w').write(last_update)
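
The incremental mode relies on the fact that fixed-width UTC timestamps of the form `YYYY-MM-DDTHH:MM:SSZ` sort lexicographically in chronological order, so plain string comparison and `max` are enough. A small sketch of both halves follows, using the hypothetical helper names `newest_update` and `changed_since`.

```python
def newest_update(repositories, fallback):
    """Return the latest 'updated_at' stamp, or fallback for an empty list."""
    stamps = [repo['updated_at'] for repo in repositories]
    return max(stamps or [fallback])


def changed_since(items, since):
    """Keep items updated at or after `since`; keep everything if since is unset."""
    if not since:
        return list(items)
    return [item for item in items if item['updated_at'] >= since]
```

The comparison only holds when every stamp really is UTC. The fallback above formats `time.localtime()` with a literal `Z` suffix, so a sketch that wants a true UTC stamp would pass `time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())` as `fallback`.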


def backup_issues(args, repo_cwd, repository, repos_template):
    has_issues_dir = os.path.isdir('{0}/issues/.git'.format(repo_cwd))
    if args.skip_existing and has_issues_dir:
        return

    log_info('Retrieving {0} issues'.format(repository['full_name']))
    issue_cwd = os.path.join(repo_cwd, 'issues')
    mkdir_p(repo_cwd, issue_cwd)

    issues = {}
    issues_skipped = 0
    issues_skipped_message = ''
    _issue_template = '{0}/{1}/issues'.format(repos_template,
                                              repository['full_name'])

    should_include_pulls = args.include_pulls or args.include_everything
    issue_states = ['open', 'closed']
    for issue_state in issue_states:
        query_args = {
            'filter': 'all',
            'state': issue_state
        }
        if args.since:
            query_args['since'] = args.since

        _issues = retrieve_data(args,
                                _issue_template,
                                query_args=query_args)
        for issue in _issues:
            # skip pull requests which are also returned as issues
            # if retrieving pull requests is requested as well
            if 'pull_request' in issue and should_include_pulls:
                issues_skipped += 1
                continue

            issues[issue['number']] = issue

    if issues_skipped:
        issues_skipped_message = ' (skipped {0} pull requests)'.format(
            issues_skipped)

    log_info('Saving {0} issues to disk{1}'.format(
        len(list(issues.keys())), issues_skipped_message))
    comments_template = _issue_template + '/{0}/comments'
    events_template = _issue_template + '/{0}/events'
    for number, issue in list(issues.items()):
        if args.include_issue_comments or args.include_everything:
            template = comments_template.format(number)
            issues[number]['comment_data'] = retrieve_data(args, template)
        if args.include_issue_events or args.include_everything:
            template = events_template.format(number)
            issues[number]['event_data'] = retrieve_data(args, template)

        issue_file = '{0}/{1}.json'.format(issue_cwd, number)
        with codecs.open(issue_file, 'w', encoding='utf-8') as f:
            json_dump(issue, f)


def backup_pulls(args, repo_cwd, repository, repos_template):
    has_pulls_dir = os.path.isdir('{0}/pulls/.git'.format(repo_cwd))
    if args.skip_existing and has_pulls_dir:
        return

    log_info('Retrieving {0} pull requests'.format(repository['full_name']))  # noqa
    pulls_cwd = os.path.join(repo_cwd, 'pulls')
    mkdir_p(repo_cwd, pulls_cwd)

    pulls = {}
    _pulls_template = '{0}/{1}/pulls'.format(repos_template,
                                             repository['full_name'])
    query_args = {
        'filter': 'all',
        'state': 'all',
        'sort': 'updated',
        'direction': 'desc',
    }

    if not args.include_pull_details:
        pull_states = ['open', 'closed']
        for pull_state in pull_states:
            query_args['state'] = pull_state
            # It'd be nice to be able to apply the args.since filter here...
            _pulls = retrieve_data(args,
                                   _pulls_template,
                                   query_args=query_args)
            for pull in _pulls:
                if not args.since or pull['updated_at'] >= args.since:
                    pulls[pull['number']] = pull
    else:
        _pulls = retrieve_data(args,
                               _pulls_template,
                               query_args=query_args)
        for pull in _pulls:
            if not args.since or pull['updated_at'] >= args.since:
                pulls[pull['number']] = retrieve_data(
                    args,
                    _pulls_template + '/{}'.format(pull['number']),
                    single_request=True
                )

    log_info('Saving {0} pull requests to disk'.format(
        len(list(pulls.keys()))))
    comments_template = _pulls_template + '/{0}/comments'
    commits_template = _pulls_template + '/{0}/commits'
    for number, pull in list(pulls.items()):
        if args.include_pull_comments or args.include_everything:
            template = comments_template.format(number)
            pulls[number]['comment_data'] = retrieve_data(args, template)
        if args.include_pull_commits or args.include_everything:
            template = commits_template.format(number)
            pulls[number]['commit_data'] = retrieve_data(args, template)

        pull_file = '{0}/{1}.json'.format(pulls_cwd, number)
        with codecs.open(pull_file, 'w', encoding='utf-8') as f:
            json_dump(pull, f)


def backup_milestones(args, repo_cwd, repository, repos_template):
    milestone_cwd = os.path.join(repo_cwd, 'milestones')
    if args.skip_existing and os.path.isdir(milestone_cwd):
        return

    log_info('Retrieving {0} milestones'.format(repository['full_name']))
    mkdir_p(repo_cwd, milestone_cwd)

    template = '{0}/{1}/milestones'.format(repos_template,
                                           repository['full_name'])

    query_args = {
        'state': 'all'
    }

    _milestones = retrieve_data(args, template, query_args=query_args)

    milestones = {}
    for milestone in _milestones:
        milestones[milestone['number']] = milestone

    log_info('Saving {0} milestones to disk'.format(
        len(list(milestones.keys()))))
    for number, milestone in list(milestones.items()):
        milestone_file = '{0}/{1}.json'.format(milestone_cwd, number)
        with codecs.open(milestone_file, 'w', encoding='utf-8') as f:
            json_dump(milestone, f)


def backup_labels(args, repo_cwd, repository, repos_template):
    label_cwd = os.path.join(repo_cwd, 'labels')
    output_file = '{0}/labels.json'.format(label_cwd)
    template = '{0}/{1}/labels'.format(repos_template,
                                       repository['full_name'])
    _backup_data(args,
                 'labels',
                 template,
                 output_file,
                 label_cwd)


def backup_hooks(args, repo_cwd, repository, repos_template):
    auth = get_auth(args)
    if not auth:
        log_info("Skipping hooks since no authentication provided")
        return
    hook_cwd = os.path.join(repo_cwd, 'hooks')
    output_file = '{0}/hooks.json'.format(hook_cwd)
    template = '{0}/{1}/hooks'.format(repos_template,
                                      repository['full_name'])
    try:
        _backup_data(args,
                     'hooks',
                     template,
                     output_file,
                     hook_cwd)
    except SystemExit:
        log_info("Unable to read hooks, skipping")


def fetch_repository(name,
                     remote_url,
                     local_dir,
                     skip_existing=False,
                     bare_clone=False,
                     lfs_clone=False):
    if bare_clone:
        if os.path.exists(local_dir):
            clone_exists = subprocess.check_output(['git',
                                                    'rev-parse',
                                                    '--is-bare-repository'],
                                                   cwd=local_dir) == b"true\n"
        else:
            clone_exists = False
    else:
        clone_exists = os.path.exists(os.path.join(local_dir, '.git'))

    if clone_exists and skip_existing:
        return

    masked_remote_url = mask_password(remote_url)

    initialized = subprocess.call('git ls-remote ' + remote_url,
                                  stdout=FNULL,
                                  stderr=FNULL,
                                  shell=True)
    if initialized == 128:
        log_info("Skipping {0} ({1}) since it's not initialized".format(
            name, masked_remote_url))
        return

    if clone_exists:
        log_info('Updating {0} in {1}'.format(name, local_dir))

        remotes = subprocess.check_output(['git', 'remote', 'show'],
                                          cwd=local_dir)
        remotes = [i.strip() for i in remotes.decode('utf-8').splitlines()]

        if 'origin' not in remotes:
            git_command = ['git', 'remote', 'rm', 'origin']
            logging_subprocess(git_command, None, cwd=local_dir)
            git_command = ['git', 'remote', 'add', 'origin', remote_url]
            logging_subprocess(git_command, None, cwd=local_dir)
        else:
            git_command = ['git', 'remote', 'set-url', 'origin', remote_url]
            logging_subprocess(git_command, None, cwd=local_dir)

        if lfs_clone:
            git_command = ['git', 'lfs', 'fetch', '--all', '--force', '--tags', '--prune']
        else:
            git_command = ['git', 'fetch', '--all', '--force', '--tags', '--prune']
        logging_subprocess(git_command, None, cwd=local_dir)
    else:
        log_info('Cloning {0} repository from {1} to {2}'.format(
            name,
            masked_remote_url,
            local_dir))
        if bare_clone:
            if lfs_clone:
                git_command = ['git', 'lfs', 'clone', '--mirror', remote_url, local_dir]
            else:
                git_command = ['git', 'clone', '--mirror', remote_url, local_dir]
        else:
            if lfs_clone:
                git_command = ['git', 'lfs', 'clone', remote_url, local_dir]
            else:
                git_command = ['git', 'clone', remote_url, local_dir]
        logging_subprocess(git_command, None)


def backup_account(args, output_directory):
    account_cwd = os.path.join(output_directory, 'account')

    if args.include_starred or args.include_everything:
        output_file = "{0}/starred.json".format(account_cwd)
        template = "https://{0}/users/{1}/starred".format(get_github_api_host(args), args.user)
        _backup_data(args,
                     "starred repositories",
                     template,
                     output_file,
                     account_cwd)

    if args.include_watched or args.include_everything:
        output_file = "{0}/watched.json".format(account_cwd)
        template = "https://{0}/users/{1}/subscriptions".format(get_github_api_host(args), args.user)
        _backup_data(args,
                     "watched repositories",
                     template,
                     output_file,
                     account_cwd)

    if args.include_followers or args.include_everything:
        output_file = "{0}/followers.json".format(account_cwd)
        template = "https://{0}/users/{1}/followers".format(get_github_api_host(args), args.user)
        _backup_data(args,
                     "followers",
                     template,
                     output_file,
                     account_cwd)

    if args.include_following or args.include_everything:
        output_file = "{0}/following.json".format(account_cwd)
        template = "https://{0}/users/{1}/following".format(get_github_api_host(args), args.user)
        _backup_data(args,
                     "following",
                     template,
                     output_file,
                     account_cwd)


def _backup_data(args, name, template, output_file, output_directory):
    skip_existing = args.skip_existing
    if not skip_existing or not os.path.exists(output_file):
        log_info('Retrieving {0} {1}'.format(args.user, name))
        mkdir_p(output_directory)
        data = retrieve_data(args, template)

        log_info('Writing {0} {1} to disk'.format(len(data), name))
        with codecs.open(output_file, 'w', encoding='utf-8') as f:
            json_dump(data, f)


def json_dump(data, output_file):
    json.dump(data,
              output_file,
              ensure_ascii=False,
              sort_keys=True,
              indent=4,
              separators=(',', ': '))
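
With `sort_keys=True` and fixed separators, the serialized text does not depend on dictionary insertion order, which keeps backup files stable between runs. A short sketch using the same `json.dump` options shows this. `dump_stable` is a hypothetical helper that writes to an in-memory buffer instead of a file.

```python
import io
import json


def dump_stable(data):
    """Serialize data with the same options as json_dump and return the text."""
    buf = io.StringIO()
    json.dump(data,
              buf,
              ensure_ascii=False,  # keep non-ASCII characters readable
              sort_keys=True,      # key order independent of insertion order
              indent=4,
              separators=(',', ': '))
    return buf.getvalue()
```

Two dictionaries with the same content but different insertion order produce identical output, and non-ASCII text is written as-is rather than as `\u` escapes.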


from github_backup.github_backup import (
    backup_account,
    backup_repositories,
    check_git_lfs_install,
    filter_repositories,
    get_authenticated_user,
    log_info,
    mkdir_p,
    parse_args,
    retrieve_repositories,
)


def main():
@@ -979,9 +26,13 @@ def main():
     if args.lfs_clone:
         check_git_lfs_install()
 
+    if not args.as_app:
         log_info('Backing up user {0} to {1}'.format(args.user, output_directory))
+        authenticated_user = get_authenticated_user(args)
+    else:
+        authenticated_user = {'login': None}
 
-    repositories = retrieve_repositories(args)
+    repositories = retrieve_repositories(args, authenticated_user)
     repositories = filter_repositories(args, repositories)
     backup_repositories(args, output_directory, repositories)
     backup_account(args, output_directory)
@@ -1 +1 @@
-__version__ = '0.19.1'
+__version__ = '0.33.0'
1156
github_backup/github_backup.py
Normal file
File diff suppressed because it is too large
9
release
@@ -1,8 +1,13 @@
 #!/usr/bin/env bash
 set -eo pipefail; [[ $RELEASE_TRACE ]] && set -x
 
-PACKAGE_NAME='github-backup'
-INIT_PACKAGE_NAME='github_backup'
+if [[ ! -f setup.py ]]; then
+    echo -e "${RED}WARNING: Missing setup.py${COLOR_OFF}\n"
+    exit 1
+fi
+
+PACKAGE_NAME="$(cat setup.py | grep "name='" | head | cut -d "'" -f2)"
+INIT_PACKAGE_NAME="$(echo "${PACKAGE_NAME//-/_}")"
 PUBLIC="true"
 
 # Colors
1
setup.py
@@ -37,7 +37,6 @@ setup(
         'Development Status :: 5 - Production/Stable',
         'Topic :: System :: Archiving :: Backup',
         'License :: OSI Approved :: MIT License',
-        'Programming Language :: Python :: 2.6',
         'Programming Language :: Python :: 2.7',
         'Programming Language :: Python :: 3.5',
         'Programming Language :: Python :: 3.6',