The Git integration lets users manage and version-control their codebase and workflows using Git directly from the Hopsworks UI. The integration currently supports repositories hosted on GitHub, GitLab, and Bitbucket.
The feature is currently in Beta and will be improved in the upcoming releases.
Git operations on Hopsworks that need to interact with the remote repository use the Git HTTPS protocol. Authentication with the remote repository happens through a token generated by the Git repository hosting service (GitHub, GitLab, or Bitbucket).
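As background, Git over HTTPS commonly authenticates with HTTP Basic authentication, sending the token where a password would go. The sketch below illustrates only that general mechanism; it does not describe how Hopsworks handles the token internally, and the function name, username, and token are placeholders.

```python
import base64


def basic_auth_header(username: str, token: str) -> str:
    """Build an HTTP Basic Authorization header value from a username and token."""
    # HTTP Basic auth joins the two parts with a colon and Base64-encodes the result.
    credentials = f"{username}:{token}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Placeholder values for illustration only; never hard-code a real token.
header = basic_auth_header("alice", "example-token")
```

Base64 is an encoding, not encryption, which is why this scheme is only safe over HTTPS.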
Documentation on how to generate a token for the supported Git hosting services is available here:
Once the token has been generated, you need to provide it to Hopsworks so that it can be used to authenticate Git operations. On the Account Settings page, the Git Providers section shows which providers are already configured and can be used to clone new repositories.
Click Edit Configuration to change a provider's username or token, or to configure a new provider.
Tick the checkbox next to the provider you want to configure and insert the username and the token to use for that provider.
Tokens are personal
Tokens are personal to each user. When you perform operations on a repository, your own token is used, even if the repository belongs to a different user.
Repositories are cloned and managed within the scope of a project. The content of a repository resides on the Hopsworks File System. It can be edited from Jupyter notebooks and can, for example, be used to configure Jobs. Repositories can be managed from the Git section in the project settings. The Git overview in the project settings lists the repositories currently cloned within the project, the location of their content, and the branch and commit their HEAD currently points to.
The list of past and ongoing activities on the repositories is also displayed. For each activity, the list shows the repository on which it is performed, the user performing the operation, the status of the operation, and a message describing the outcome.
Clone a repository
To clone a new repository, click on the Clone repository button on the Git overview page.
The clone page asks you to specify the URL of the repository you want to clone. As mentioned above, the supported protocol is HTTPS. As an example, if the repository is hosted on GitHub, the URL should look like:
Additionally, the UI asks you to specify which branch you want to clone. By default, the UI clones the main branch; to use a different branch or commit, select clone from a specific branch.
You can select the folder within your project into which the repository should be cloned. By default, the repository is cloned into the Resources dataset. To choose a different location, click on the location button.
Finally, click on the Clone repository button to trigger the cloning of the repository.
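For readers more familiar with the Git command line, the clone form maps roughly onto a plain `git clone` invocation. The sketch below is not how Hopsworks performs the clone; it only builds the equivalent command as an argument list, rejecting non-HTTPS URLs. The function names, the example URL, and the use of `Resources` as a local directory name are illustrative assumptions.

```python
from typing import List
from urllib.parse import urlparse


def is_https_url(url: str) -> bool:
    """Return True if the URL uses the https scheme and names a host."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)


def clone_command(url: str, branch: str = "main", target: str = "Resources") -> List[str]:
    """Build a `git clone` command for an HTTPS repository URL."""
    if not is_https_url(url):
        raise ValueError(f"expected an HTTPS repository URL, got: {url}")
    # `--branch` accepts a branch or tag name; a specific commit would need
    # a separate checkout after cloning.
    return ["git", "clone", "--branch", branch, url, target]


# Hypothetical repository URL, used only for illustration.
command = clone_command("https://github.com/example-org/example-repo.git")
```

An SSH-style address such as `git@github.com:example-org/example-repo.git` fails the check, which mirrors the HTTPS-only requirement described above.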
On each repository a set of actions can be performed.
Pull latest changes
You can pull changes from the remote repository into your local branch on Hopsworks. If you need to resolve conflicts, you can do so from a Jupyter notebook or externally on your local machine.
The Switch branch action allows you to change the current branch of the repository. You can provide either the name of a branch or a commit to check out.
If you tick the Create new option, a new branch will be created on the Hopsworks local repository. This is useful if you are developing a new feature.
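In plain Git terms, the Switch branch action resembles `git checkout`, and the Create new option resembles its `-b` flag. The following is a minimal sketch of that correspondence, not the Hopsworks implementation; the function name and branch names are hypothetical.

```python
from typing import List


def switch_branch_command(name: str, create_new: bool = False) -> List[str]:
    """Build a `git checkout` command for a branch name or commit."""
    if create_new:
        # `-b` creates the branch from the current HEAD and switches to it.
        return ["git", "checkout", "-b", name]
    # Without `-b`, the name must refer to an existing branch or commit.
    return ["git", "checkout", name]


existing = switch_branch_command("main")
new_feature = switch_branch_command("my-feature", create_new=True)
```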
The commit action allows you to commit changes you made to the Git repository cloned on Hopsworks. To be able to commit, you need to provide a commit message.
The push action allows you to push the local changes to the remote repository. The changes will be pushed to the same remote branch as the one you have checked out locally on Hopsworks.
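The commit and push actions correspond roughly to `git commit` and `git push` on the command line. The sketch below builds those commands under two assumptions that the text above does not confirm: that the changes to commit are already staged, and that the remote is named `origin`. The function names are hypothetical.

```python
from typing import List


def commit_command(message: str) -> List[str]:
    """Build a `git commit` command; a non-empty message is required."""
    if not message.strip():
        raise ValueError("a commit message is required")
    # Assumes the changes to commit have already been staged.
    return ["git", "commit", "-m", message]


def push_command(current_branch: str, remote: str = "origin") -> List[str]:
    """Build a `git push` command targeting the remote branch of the same name."""
    # Passing the local branch name pushes to the same-named branch on the remote.
    return ["git", "push", remote, current_branch]


commit = commit_command("Add feature pipeline notebook")
push = push_command("my-feature")
```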
The delete action allows you to remove the local repository from Hopsworks.
Potential data loss
When you remove a repository, the data/code is also removed from the Hopsworks file system. Use this option carefully, or you might lose any uncommitted changes or changes that have not been pushed to the remote repository.
For every repository, the metadata of the 20 most recent commits is available in Hopsworks. The history is available by clicking on the History button on the repository overview page. This allows you to compare the history of the local repository with the list of commits available on the remote repository.
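The comparison itself can be thought of as a difference between two lists of commit hashes. Here is a minimal sketch, assuming you have copied the local and remote commit hashes into two lists; the function name and the abbreviated hashes are made up for illustration.

```python
from typing import List, Tuple


def compare_history(local: List[str], remote: List[str]) -> Tuple[List[str], List[str]]:
    """Return (commits only in local, commits only in remote), preserving order."""
    local_set = set(local)
    remote_set = set(remote)
    only_local = [commit for commit in local if commit not in remote_set]
    only_remote = [commit for commit in remote if commit not in local_set]
    return only_local, only_remote


# Hypothetical abbreviated hashes, newest first.
local_commits = ["c3d4", "b2c3", "a1b2"]
remote_commits = ["e5f6", "b2c3", "a1b2"]
unpushed, unpulled = compare_history(local_commits, remote_commits)
```

Because only the 20 most recent commits are available, a commit that falls outside that window on one side can appear to be missing even though it exists.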