Civis Platform supports version control for Scripts, Workflows, HTML Reports, Notebooks, and Services through GitHub and Bitbucket. This article details how version control works from the version control side pane as well as from the command line in a Notebook. The Git integration works slightly differently for Container Scripts; see here for more details.
Connecting Platform to GitHub/Bitbucket
Platform offers Git-based version control via GitHub and Bitbucket.
To connect your Platform account with a GitHub or Bitbucket repository, follow these steps:
- Open the My Profile page using the initials menu.
- Click on the Git Repos tab at the top of the page.
- This page will show you a list of repositories that you have bookmarked to your profile.
- Click Add Repository to add a new GitHub or Bitbucket repository to your account.
- If this is your first time using version control, click Connect to GitHub or Connect to Bitbucket to connect to the relevant service.
- Authorize the account.
- After connecting your account, search for the repository you’d like to add and click Add Repository.
- The repository will now be available for you to connect to Scripts, Notebooks, Reports, etc.
Adding Version Control to a Platform Object
To add version control to a specific Platform object, navigate to the object’s detail page. This can be an existing or new SQL, R, or Python Script, Workflow, Notebook, or HTML Report. The example below is for a Python script.
Click Manage Versions to open the side pane, and fill out the relevant fields under Git Connection.
Repository: The repository field is a drop-down of the list of repositories that you have bookmarked in your profile.
Branch: The specific branch you would like to use to version control this Script. The branch must already exist in your repository.
Path: The path to a file in your repository that you would like to use to version control your Script. If the file does not yet exist in your repository, Platform will create a new file for you in the repository. Please note that this path must be an individual file and not a directory.
Managing Code Changes
Commit + Push
As you make iterative edits to your script in Platform, you can commit changes to the specified path. Once you make edits to the script, an icon will appear that prompts you to “Commit Changes.” Navigate to the Commit tab of the Manage Versions pane, add a commit message, and click Commit and Push. Platform will not automatically commit and push these changes.
Autosaving vs Committing
Platform automatically saves progress and changes to your script. However, Platform does not automatically commit to or pull from the Git repository when your script or the repository changes, unless you have the Auto-Checkout feature enabled. In other words, changes made to a Platform script that are autosaved in Platform will not be reflected in Git unless manually pushed through the “Manage Versions” pane.
Auto-Checkout
If you want your Platform script to always run the latest committed code from Git, toggle Auto-Checkout on.
Scripts with Auto-Checkout enabled will check out the latest commit from the specified branch and path at run time, regardless of whether that commit was made from Platform or externally. You can view the commit history and the commit message for the latest commit on the Commit tab of the Manage Versions pane.
Editing script code in Platform will be disabled when Auto-Checkout is on to prevent conflicting changes. If you need to make changes to the script code, you can toggle Auto-Checkout off, make edits and commit the code.
Merge Conflicts
In the event that a commit causes a merge conflict, Platform will show an error message and will neither commit and push the change nor resolve the merge conflict. Instead, the error message will prompt you to click the “update history” button. This will update the commit log with the latest commits from Git.
After refreshing the pane, the merge conflict will disappear. However, there is a possibility that committing and pushing changes at this point may overwrite code pushed from outside of Platform, or from another user. We recommend you check the code in your Git repo and compare to your Platform code before proceeding.
Git Commit Log
Platform will automatically pull a log of commits for the file in the specific repository and branch you have selected. These commits will only reflect commits where the selected file has changed -- it is not the full commit history for the branch and/or repository.
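If you have a local clone of the repository, you can reproduce a similar file-scoped view with plain Git. The sketch below is only an approximation of what the pane shows, not Platform's own implementation; the helper name file_history is hypothetical.

```shell
# Hypothetical helper: list the commits on <branch> that changed <path>.
# Commits that did not touch the file are left out, much like the
# commit log in the Manage Versions pane.
# Usage: file_history <branch> <path>
file_history() {
  git log --oneline "$1" -- "$2"
}
```

For example, file_history main scripts/query.sql (a placeholder branch and path) prints one line per commit that modified that file.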
The Git commit log will also indicate whether the current Script contains code that was checked out from a specific Git commit, marked by a green dot on the left. In this example, the script builds on code from the commit “Change date to 2018” (hash “a5bc7c1”).
Git Checkout
Platform allows you to check out the version of your code at a specific commit, for files up to 1 MB in size. Platform will warn you if checking out code will overwrite uncommitted changes that you have. Checking out a previous commit does not automatically commit and push back to Git; to do that, use the Commit and Push button.
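If you are unsure whether a file at a given commit falls under the size limit, you can check it ahead of time in a local clone. This is a minimal sketch: the helper name fits_checkout_limit is hypothetical, and it assumes 1 MB means 1024 * 1024 bytes, which this article does not specify.

```shell
# Hypothetical helper: succeed if <path> at <commit> is at most 1 MB.
# Usage: fits_checkout_limit <commit> <path>
fits_checkout_limit() {
  local limit=1048576   # assumption: 1 MB counted as 1024 * 1024 bytes
  local size
  # git cat-file -s prints the size in bytes of the file stored at that commit.
  size="$(git cat-file -s "$1:$2")" || return 2   # unknown commit or path
  [ "$size" -le "$limit" ]
}
```

The function returns 0 when the file fits, 1 when it is too large, and 2 when the commit or path cannot be found.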
Credentials + Secrets
You should not store text such as passwords, API keys, or other sensitive information in Git. We offer a secure way to store your credentials in Civis Platform. For more information, visit our Credentials documentation.
GitHub and Bitbucket are external services that are not managed by Civis. While we do securely connect to both, Civis security controls do not extend to third-party services.
Command Line Version Control for Notebooks
Civis Platform supports version control in Jupyter Notebooks through the Notebook terminal as well as the version control side pane. To use version control from the command line in a Notebook, go to the Platform Notebook where you would like to establish a Git integration.
- To bring an existing Notebook from Git into Platform, create a new Platform Notebook with the server stopped.
- To commit an existing Platform Notebook to Git, open the existing Platform Notebook with the server stopped.
To integrate Git with Notebooks, you need to use image tag 1.5 (Python 3) or 1.3.0 (Python 2/R) or later.
Before starting the server, add a Git repository to your notebook by filling out the three Git-relevant fields.
Repository: The repository field is a drop-down of the list of repositories that you have bookmarked in your profile.
Branch: The branch field is the specific branch, tag, or commit hash. The reference you use must already exist in your repository.
Path: The path field is the path to a Notebook in your Git repository. If you are connecting this Notebook to your repository for the first time, you can type the path of the notebook file you would like to be created. Please note that the file must have an .ipynb extension, or your Notebook will fail to start.
When you have configured all of your Notebook settings, select Manage Versions and choose the version that you want to use. As you edit and save your Notebook, you’ll see an “uncommitted changes” notification appear at the top of the Notebook.
Committing and Pushing Changes to Git
Notebook changes are committed and pushed to Git through the terminal window in your Notebook. Git commands executed in the terminal use the authentication from the repository you set up in your profile to push changes to your repository. You can access the terminal by clicking on “Uncommitted Changes” or by clicking on the terminal icon in the Notebook.
To orient yourself within the terminal, first type “git status” at the command prompt.
You’ll see the branch you are working in as well as the name of the Notebook that has been modified and needs to be committed to Git. Note that the HTML preview from your most recent save is also listed; we do not recommend including this file in your commit.
Next, the following steps are required to commit your changes to Git:
- git add <filename>: stage your changes for commit
- git commit -m "<commit message>": commit your changes and specify a commit message
- git push: publish your commits to this notebook’s repository and branch
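The three steps above can be wrapped in a small shell function if you repeat them often. This is a minimal sketch, not a Platform feature: the name commit_notebook is hypothetical, and it assumes the current branch already tracks the remote branch you configured for the Notebook.

```shell
# Hypothetical helper: stage one notebook file, commit it, and push.
# Usage: commit_notebook <notebook-file> <commit message>
commit_notebook() {
  # Stage only the notebook, leaving the HTML preview out of the commit.
  git add -- "$1" || return 1
  # Record the change locally with your message.
  git commit -m "$2" || return 1
  # Publish the commit to this notebook's repository and branch.
  git push
}
```

For example, commit_notebook k_means.ipynb "Tune cluster count" stages only that file, so the HTML preview stays untracked. Running git status afterwards confirms that nothing else was included.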
Requirements and Custom Packages
In addition to the Custom Packages option for adding packages to the Docker image, you also have the option of using a requirements.txt file from your Git repository. When you launch a Notebook that has a Git repository specified, Platform will search your repository for a requirements.txt file. For example, if you provide a path to a Notebook at /notebooks/clustering/k_means.ipynb, Platform will search up the directory tree from that location until it finds a requirements.txt file. If you don’t have a requirements.txt file in your repository, you can add packages to the “custom packages” list in the packages pane. However, Platform uses only one source or the other: if you list any custom packages in the packages pane, the requirements.txt file will be ignored.
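To predict which requirements.txt file would be picked up for a given Notebook path, you can walk the same tree in a local clone. The sketch below is an illustration under assumptions, not Platform's actual lookup: the helper name find_requirements is hypothetical, and it assumes the search starts in the Notebook's own directory and stops at the repository root.

```shell
# Hypothetical helper: print the nearest requirements.txt at or above the
# notebook's directory, stopping at the repository root.
# Usage: find_requirements <repo-root> <notebook-path-within-repo>
find_requirements() {
  local root="$1"
  local rel="${2#/}"          # tolerate a leading slash in the notebook path
  local dir
  dir="$(dirname "$rel")"

  while :; do
    if [ -f "$root/$dir/requirements.txt" ]; then
      echo "$dir/requirements.txt"
      return 0
    fi
    if [ "$dir" = "." ]; then
      return 1                # reached the repository root without a match
    fi
    dir="$(dirname "$dir")"   # move one directory up
  done
}
```

With the path from the example above, the function checks notebooks/clustering, then notebooks, then the repository root, and prints the first match. It returns a non-zero status if no requirements.txt exists along that route.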
Autosaving vs Committing
Platform will continuously save a copy of the Notebook file as a backup in case you forget to commit your changes back to your Git repository. When you restart your Notebook, if the copy saved in Platform is not empty, Platform will launch that autosaved copy. If you want to use the version that is committed to Git, you can use the terminal to access a past commit.
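One way to bring back a committed version from the Notebook terminal is to check out the file at a chosen commit. This is a sketch using standard Git commands rather than documented Platform guidance; the helper name restore_from_commit is hypothetical, and the commit hash and file name in the usage comments are placeholders.

```shell
# Hypothetical helper: replace the working copy of <notebook-file> with the
# version stored at <commit>. This overwrites uncommitted (autosaved) edits
# to that file, so commit or copy anything you want to keep first.
# Usage: restore_from_commit <commit> <notebook-file>
restore_from_commit() {
  git checkout "$1" -- "$2"
}

# Typical use from the Notebook terminal:
#   git log --oneline -- k_means.ipynb        # find the commit you want
#   restore_from_commit a5bc7c1 k_means.ipynb # restore the file from it
```

Because only the file is checked out, you stay on your current branch; the restored content shows up as a staged change until you commit it or discard it.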