GitHub storage¶
The github storage uses GitHub’s API to store your metadata as a Git
repository on GitHub. It uses Git revisions and tags to keep track of
changes, and automatically creates Git LFS pointers and configuration
where applicable.
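For readers unfamiliar with Git LFS pointers, the following is a minimal sketch of the pointer file format defined by the Git LFS specification. It only illustrates what such a pointer contains; the helper name make_lfs_pointer is hypothetical, and this is not necessarily how metastore-lib builds its pointers internally.

```python
import hashlib

# Version line defined by the Git LFS pointer specification.
LFS_SPEC_VERSION = "https://git-lfs.github.com/spec/v1"


def make_lfs_pointer(content: bytes) -> str:
    """Return the text of a Git LFS pointer file for the given content.

    Hypothetical helper for illustration only. A pointer records the
    SHA-256 digest and the size in bytes of the real file, which is
    stored on the LFS server rather than in the Git repository.
    """
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_VERSION}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # Print the pointer that would stand in for a 5-byte file.
    print(make_lfs_pointer(b"hello"), end="")
```

The pointer is a small text file committed in place of the actual resource, which is why the repository stays lightweight even when the dataset includes large files.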
GitHub Credentials¶
Currently, metastore-lib’s GitHub backend supports authentication using
a GitHub username and password (not recommended), or a Personal Access Token.
In the future, we plan to add support for GitHub App based authentication. See this issue for discussion and progress details.
Username and Password Authentication¶
The following example demonstrates instantiating a GitHub storage backend with username / password authentication:
import metastore
# Using your user name and password to authenticate with GitHub
config = {"github_options": {"login_or_token": "mr_username",
                             "password": "s0mena5tys3c4et!!1one"}}
backend = metastore.create_metastore('github', config)
Personal Access Token Authentication¶
To obtain a Personal Access Token, follow the instructions in
the relevant section in the GitHub Documentation.
The following permission scopes are required by metastore-lib and should be
granted:
- repo and repo:status (other sub-scopes of repo are not required)
- delete_repo
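One way to confirm that a token carries the needed scopes is to inspect the X-OAuth-Scopes response header that GitHub's API returns for requests made with a classic Personal Access Token. The sketch below only parses such a header value and makes no network calls. The helper names are hypothetical, and it assumes the full repo scope was granted (which covers repo:status) and that the repository deletion scope appears under GitHub's name for it, delete_repo.

```python
# Scopes as GitHub reports them in the X-OAuth-Scopes header.
# Assumption: the full "repo" scope is granted, which includes repo:status.
REQUIRED_SCOPES = frozenset({"repo", "delete_repo"})


def parse_scopes(header_value: str) -> set:
    """Split a comma-separated X-OAuth-Scopes header value into a set."""
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def missing_scopes(header_value: str, required=REQUIRED_SCOPES) -> list:
    """Return a sorted list of required scopes absent from the header."""
    return sorted(set(required) - parse_scopes(header_value))


if __name__ == "__main__":
    # Example header value for a token that lacks the deletion scope.
    print(missing_scopes("repo, gist"))
```

An empty list means every scope in REQUIRED_SCOPES was reported for the token.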
If your GitHub organization requires SSO authentication, follow the steps described here after creating the token.
The following example demonstrates instantiating a GitHub storage backend using a Personal Access Token instead of a username and password:
import metastore
# Using a generated Personal Access Token to authenticate with GitHub
config = {"github_options": {"login_or_token": "averylongtokenthatwasgeneratedespeciallyforthis"}}
backend = metastore.create_metastore('github', config)
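In practice you may prefer not to hard-code the token in source files. The following is a minimal sketch that reads it from an environment variable and builds the same config dictionary. The function name github_config_from_env and the variable name GITHUB_TOKEN are assumptions made for illustration; metastore-lib does not require either.

```python
import os


def github_config_from_env(env=None, var_name="GITHUB_TOKEN"):
    """Build a GitHub backend config dict from an environment variable.

    Hypothetical helper: `var_name` is an arbitrary choice, and `env`
    defaults to the process environment (a plain dict can be passed
    instead, for example in tests).
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return {"github_options": {"login_or_token": token}}


if __name__ == "__main__":
    # A plain dict stands in for the real environment here.
    config = github_config_from_env({"GITHUB_TOKEN": "example-token"})
    print(sorted(config["github_options"]))
    # The resulting dict can then be passed on as in the example above:
    # backend = metastore.create_metastore('github', config)
```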
Configuration Options¶
The following configuration options can be passed to the GitHub storage backend constructor or factory function:
- github_options: dict of keyword arguments to pass to the PyGithub client. This should, at the very least, include some authentication credentials
- lfs_server_url: The base URL of the Git LFS server in use. Providing this will make the GitHub backend create Git LFS configuration and pointer files for resources where applicable
- default_owner: The GitHub organization or user name to use as the default owner for created repositories, if dataset names do not include an owner/ prefix
- default_author: A default Author object to use when committing changes, if no author is otherwise specified
- default_branch: The name of the default branch in the repository (typically, this would be master)
- default_commit_message: The default message to use when committing changes, if not otherwise specified
- private: Whether to use private repositories. False by default. For this to work, private repositories must be enabled for the organization or user, and the token used for authenticating with GitHub must be allowed to access them
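The options above can be combined into a single config dictionary. The sketch below shows one such combination, together with a small helper illustrating the default_owner rule as described here: a dataset name with an owner/ prefix keeps its own owner, otherwise the default is used. All values are placeholders, the LFS URL is made up, the helper resolve_repo_name is hypothetical rather than part of metastore-lib, and default_author is omitted because constructing an Author object is not covered in this section.

```python
# Option names taken from the list above.
KNOWN_OPTIONS = {
    "github_options", "lfs_server_url", "default_owner", "default_author",
    "default_branch", "default_commit_message", "private",
}

# Placeholder values for illustration only.
config = {
    "github_options": {
        "login_or_token": "averylongtokenthatwasgeneratedespeciallyforthis",
    },
    "lfs_server_url": "https://lfs.example.com",  # made-up URL
    "default_owner": "my-org",
    "default_branch": "master",
    "default_commit_message": "Update dataset metadata",
    "private": False,
}


def resolve_repo_name(dataset_name, default_owner=None):
    """Illustrate the default_owner rule: return an (owner, repo) pair.

    Hypothetical helper, not the library's own implementation.
    """
    if "/" in dataset_name:
        owner, repo = dataset_name.split("/", 1)
        return owner, repo
    if default_owner is None:
        raise ValueError("No owner in dataset name and no default_owner set")
    return default_owner, dataset_name


if __name__ == "__main__":
    # Guard against misspelled option names before using the config.
    unknown = set(config) - KNOWN_OPTIONS
    print("unknown options:", sorted(unknown))
    print(resolve_repo_name("my-dataset", config["default_owner"]))
    # The dict would then be passed on as in the earlier examples:
    # backend = metastore.create_metastore('github', config)
```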
Git LFS Support¶
TBD