Docker is a tool that makes it easy to run software in a consistent environment. A Docker image is a snapshot of a computing environment, including the command-line utilities, libraries, and programming language tools that you need (e.g., your preferred version of Python and the packages you depend on). For a more complete introduction, see this page from Docker.
Civis offers a set of Public Civis Docker Images to provide consistent computing environments for Scripts, Notebooks, and Services. Civis’s images support many common data processing and data science use cases, but if you find that you need additional packages or customization, Civis Platform supports custom images as well. You can use any publicly available image with Civis Platform or publish your own to Docker Hub, as described below.
Building a Custom Docker Image
Building and hosting Docker images requires additional tools outside of Civis Platform, and there are many options available. This section provides some basic information and links to various tools. We do not necessarily endorse or support these third-party tools.
Typically, there are three components needed to build and host a custom Docker image:
- A code repository (e.g., GitHub, GitLab) that contains a Dockerfile that defines how to build the Docker image.
- A registry to store your Docker image and make it available to pull (i.e., download). Docker Hub is a popular registry that can be used with Civis Platform for public or private images, but there are other registries that Civis Platform can pull public images from, as noted below.
- A CI/CD tool or other build tool that will retrieve the Dockerfile from your code repository, use it as a set of instructions to build your Docker image, and then push it to the registry. Examples include CircleCI, Travis CI, GitHub Actions, and AWS CodeBuild. Docker Hub also has support for automated builds. (It is possible to manually build a Docker image and push it to a registry, though it’s generally best to set up an automated build.)
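As a rough illustration of the manual route mentioned in the last item, the sketch below assembles the two Docker CLI commands involved (`docker build -t` and `docker push`) without running them. The helper name `build_and_push_commands` and the repository `my-org/my-image` are hypothetical placeholders, not anything defined by Civis Platform.

```python
import shlex


def build_and_push_commands(repository, tag="latest", context="."):
    """Return the docker CLI commands for a manual build followed by a push.

    `repository` is an "account/name" string; the value used below is a
    made-up placeholder, not a real repository.
    """
    if not repository:
        raise ValueError("repository must be non-empty")
    reference = f"{repository}:{tag}"
    return [
        # Build the image from the Dockerfile in `context` and tag it.
        ["docker", "build", "-t", reference, context],
        # Upload the tagged image to the registry.
        ["docker", "push", reference],
    ]


# Print the commands instead of executing them, so the sketch is safe to run.
for command in build_and_push_commands("my-org/my-image", tag="1.0.0"):
    print(shlex.join(command))
```

Pushing normally requires being logged in to the registry first (for example with `docker login`). An automated build tool would run the same two steps on each change to the repository.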
Which tools to use is up to your organization, but one potential workflow is to use GitHub, CircleCI, and Docker Hub, as described in this blog post from CircleCI.
For an example of a data science-focused Dockerfile, take a look at Civis’s datascience-python or civis-jupyter-python3 Dockerfiles, which are used for Python Scripts and Notebooks, respectively, in Civis Platform.
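A custom image often just extends an existing one with a few extra packages. The sketch below renders the text of such a Dockerfile, assuming the base image already provides `pip`. The helper `render_dockerfile`, the `latest` tag, and the package name `example-package` are illustrative assumptions; in practice you would pin a specific base image tag and list your own packages.

```python
def render_dockerfile(base_image, tag, pip_packages):
    """Return Dockerfile text that layers pip packages on top of a base image."""
    lines = [f"FROM {base_image}:{tag}"]
    if pip_packages:
        # Sort so the generated file is stable between runs.
        packages = " ".join(sorted(pip_packages))
        lines.append(f"RUN pip install --no-cache-dir {packages}")
    return "\n".join(lines) + "\n"


# The tag and package name below are placeholders for illustration only.
print(
    render_dockerfile(
        "civisanalytics/datascience-python",
        "latest",
        ["example-package==1.2.3"],
    )
)
```

Saving the output as a file named `Dockerfile` in your code repository gives the build tool the instructions it needs.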
Making Your Image Available for Use in Platform
Public images may be pulled from any registry, including Docker Hub, Amazon ECR, and GitHub Container Registry. Pulls from arbitrary registries are unauthenticated.
Images from Docker Hub are pulled using the "crobot" Docker Hub user. Private images may be shared with this user, which will make them available to anyone using Platform. To grant Civis's background Docker user access to a private image:
- Navigate to the private Docker Hub repo.
- Click on the Collaborators tab.
- Add the user "crobot".
Some additional details:
- Images pulled from Docker Hub are not subject to the rate limits associated with anonymous pulls.
- If you have a NAT gateway enabled to allow your Civis Platform Scripts to connect to third-party resources, all image pulls will be made from the IP addresses listed on this page.
A given image will be downloaded only once during the lifecycle of a compute instance. See this page for more information on monitoring your active compute instances. Any Job running on an instance that has already accessed the image will be able to use the previously downloaded Docker image. If a Job is allocated to an instance where the image is not already present, it will be downloaded.
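The download-once behavior can be pictured with a small toy model. This is only a sketch of the rule described above, not how Civis Platform is implemented; the class `ComputeInstance` and its method names are hypothetical.

```python
class ComputeInstance:
    """Toy model of an instance that keeps every image it has downloaded."""

    def __init__(self):
        self._images = set()

    def run_job(self, image):
        """Run a job with `image`; return True if a download was needed."""
        if image in self._images:
            # The image is already present, so the job reuses it.
            return False
        # First use on this instance: download and remember the image.
        self._images.add(image)
        return True


instance = ComputeInstance()
image = "civisanalytics/datascience-python:latest"
print(instance.run_job(image))           # first job on this instance
print(instance.run_job(image))           # later job on the same instance
print(ComputeInstance().run_job(image))  # job placed on a different instance
```

In this model, only the first job on each instance pays the download cost, which is why the first run on a new instance can take longer to start.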
Changing the Image Used by Your Platform Job or Service
By default, Platform uses the civisanalytics/datascience-python image for Container Scripts and the civisanalytics/civis-services-shiny image for Civis Services. To use a different image, follow these steps:
- Open the settings tab on your Container Script or Civis Service.
- Enter the account that owns the image and the name of the image in the “Image” parameter.
- Optional: supply an image tag in the “Tag” parameter. If no tag is supplied, the latest image will be used.
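To make the two parameters concrete, the sketch below splits an “Image” value into its account and name and applies the default when no tag is given. It assumes the “Image” value has the form `account/name` and that the default corresponds to the `latest` tag; the function `resolve_image_settings` is a hypothetical helper, not part of Civis Platform.

```python
def resolve_image_settings(image, tag=None):
    """Return (account, name, tag) for the Image and Tag settings.

    Assumes `image` looks like "account/name". A missing or empty tag
    falls back to "latest".
    """
    account, separator, name = image.rpartition("/")
    if not separator or not account or not name:
        raise ValueError("expected an image of the form 'account/name'")
    return account, name, tag or "latest"


# Default Container Script image from the text, with no tag supplied.
print(resolve_image_settings("civisanalytics/datascience-python"))
```

Supplying an explicit tag instead of relying on the default keeps a Job on a known image version even after newer images are published.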