What are scripts?
Scripts enable users to query, modify, import, and export data in any language. Users can write scripts directly in Civis Platform, or check out code from any GitHub repository, and execute that code in a special run-time environment. Scripts can be executed manually or run on a schedule.
How to find and use scripts
- Navigate to ‘Code’ in the Civis Platform menu at the top of your screen
- To create a new script, select the appropriate script for your programming language preference (SQL, R, Python, JavaScript, Container)
- To view an indexed list of previously created scripts, select the word ‘Scripts’
- From the index page, you can filter and sort existing scripts. You can also find existing scripts through the top-level search. Read more here.
Types of scripts
Civis Platform supports five types of scripts:
Script Lifecycle and Script Runs
Every time you create a new script, you are creating a Civis Platform object that has:
- A unique ID
- A name
- The ability to run; either on-demand, on a schedule, or in response to various triggers
- A runtime status
All new scripts start in Idle status. After a script is started, it moves to Queued status. Once a script worker has been provisioned for your script, it moves to Running status. When the script finishes, it moves to Succeeded, Failed, or Cancelled status. At this point the script can be queued to run again.
A script cannot have multiple concurrent runs. If you want to run the same script multiple times in parallel, consider cloning the script (creating a copy) or creating a script template and then multiple custom scripts referencing that script template.
Scripts are only guaranteed to run for 5 days. After this time, the script may be terminated by the platform.
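The status lifecycle described above can be sketched as a small transition table. This is an illustrative model only, not Civis Platform code: the names `TRANSITIONS` and `can_transition` are hypothetical, and the table encodes only the transitions named in this section.

```python
# Status transitions transcribed from the lifecycle described in this section.
# Hypothetical helper for illustration; Civis Platform enforces the real rules.
TRANSITIONS = {
    "Idle": {"Queued"},
    "Queued": {"Running"},
    "Running": {"Succeeded", "Failed", "Cancelled"},
    # A finished script can be queued to run again.
    "Succeeded": {"Queued"},
    "Failed": {"Queued"},
    "Cancelled": {"Queued"},
}


def can_transition(current: str, new: str) -> bool:
    """Return True if the section above describes a move from current to new."""
    return new in TRANSITIONS.get(current, set())


print(can_transition("Idle", "Queued"))     # a new script can be started
print(can_transition("Running", "Queued"))  # no concurrent runs of one script
```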
Runs
Each time you run a script, a new Run object is created and associated with your script. Just like jobs, runs have a status: Queued, Running, Succeeded, Failed, Cancelled. Runs are only used once. If you re-run a script, a new run is created. It is possible to list all the runs of a given job.
Runs also have an associated log. This log contains a record of the activity which occurs during the run. If you create a container script, you can write to the run's log by writing to STDOUT. Note that job run logs are retained for 6 months, after which they are deleted permanently.
The results of runs can be published as run outputs.
Run With Your Credentials
To run a job that is using a credential you don't have access to (e.g. Redshift credentials), you'll need to run it with your credentials. For jobs that you have Editor or Manager permissions on, you'll be prompted with the following message:
Clicking "run with your credentials" will give you the ability to run the job, but with your credentials instead of a different user's credentials.
For jobs like SQL scripts where you'll be querying data in your Redshift cluster, make sure your Redshift credential has the proper permissions on the queried data.
Script Results
As mentioned above, each time your script runs it starts in a pristine environment. Changes made by your script are discarded at the end of the script run. To save results from a script you have several options:
- Save results to Redshift by running a query
- Save the results as a file in Civis Platform
- Save the results as a new Civis Platform report
Because your script can run any code you write, many other options are available. Your script could even create other scripts as its output.
Common Actions
You can complete the following actions on scripts by accessing the standard action menu in the top right of the script page. Read more about these actions here.
- Add Success/Failure Notifications
- Clone
- Archive
- Automate
- Share
- Manage Versions
Script Parameters
Creating and inserting parameters into your code offers an easy and convenient way to make your script flexible and reusable. Parameter “arguments” can be set or updated for each script run to customize the result. Parameterization is available on every script type.
Terms to Know
- Parameters: Known as “params” in the API, the parameters associated with a script define a framework of possible runtime changes. The “params” attribute is an array of objects, where each object specifies the parameter’s name and type, along with a few other configuration options (see “Creating a Parameter” for more). The expectation is that the creator of the script references each given parameter in the code’s body. In the UI, parameters are managed via the “Set Parameters” side pane.
- Arguments: An object of key-value pairs, where the keys match parameters’ names, and the corresponding values are the runner-supplied arguments for that run. Arguments are stored in the “arguments” attribute in the API, and are modifiable in the UI via the form in the script’s main page body. For a given parameter name, the given argument (aka “value”) must match the parameter’s configuration (type, allowed values, etc.).
Creating a Parameter
To create a parameter, click on "Set Parameters" on the top-right section of the script page. Follow the instructions on the Parameters pane to configure your parameter as desired.
Parameter Names
Parameter names must start and end with a number or English letter. Only numbers, letters, and underscores are allowed. Parameters may NOT be named PATH, HOME, HOSTNAME, PWD, CIVIS_API_KEY, DATABASE, RUNNER_USER, or RUNNER_USER_ID.
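The naming rules above can be checked before calling the API. The following is a minimal sketch: `is_valid_parameter_name` is a hypothetical helper, and comparing reserved names case-insensitively is an assumption (this article does not say whether a lowercase `path` is rejected).

```python
import re

# Reserved names listed in this section.
RESERVED_NAMES = {
    "PATH", "HOME", "HOSTNAME", "PWD",
    "CIVIS_API_KEY", "DATABASE", "RUNNER_USER", "RUNNER_USER_ID",
}

# Starts and ends with a letter or digit; only letters, digits, and
# underscores in between. A single character is also accepted.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9_]*[A-Za-z0-9])?")


def is_valid_parameter_name(name: str) -> bool:
    """Check a candidate parameter name against the rules in this section."""
    if _NAME_PATTERN.fullmatch(name) is None:
        return False
    # Assumption: reserved names are compared case-insensitively.
    return name.upper() not in RESERVED_NAMES


for candidate in ("NUM_DOGS", "_private", "trailing_", "PATH", "my-param"):
    print(candidate, is_valid_parameter_name(candidate))
```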
Parameter Types
Valid parameter types are the following: string, multi_line_string, integer, float, bool, file, database, credential_aws, credential_redshift, and credential_custom. Valid arguments for each of the parameter types are listed below.
- String - Any string value. Any "numbers" used will just be treated as the string representation of that number.
- Multi-line String - Any string value. Presents the argument for the parameter as a multi-line text box.
- Integer - Any integer value. Float values are not allowed.
- Float - Any float value.
- Boolean - Either true or false.
- Redshift, Custom, or AWS Credential - Any integer value. Use the ID of the credential in Civis Platform, which can be found via the Credentials endpoint.
- Database - A dictionary containing the keys 'database' and 'credential'. The values for these keys are the id of the database and the id of the database credential respectively.
- File - Any integer value. Use the ID of the file in Civis Platform. The file must be readable by the runner of the job. The file must not be expired when it is added as an argument, and the job will fail if the file is expired at runtime.
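The argument rules in the list above can be approximated with a local check before a script is run. This is a sketch under stated assumptions, not the validation Civis Platform performs: `argument_matches_type` is a hypothetical helper, whole numbers are accepted for float parameters as an assumption, and the check does not verify that an ID refers to a real, readable object.

```python
# Parameter types whose argument is the integer ID of a Civis Platform object.
ID_TYPES = {"file", "credential_aws", "credential_redshift", "credential_custom"}


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def argument_matches_type(param_type: str, value) -> bool:
    """Rough local check of an argument against a parameter type."""
    if param_type in ("string", "multi_line_string"):
        return isinstance(value, str)
    if param_type == "integer" or param_type in ID_TYPES:
        return _is_int(value)
    if param_type == "float":
        # Assumption: whole numbers are acceptable float arguments.
        return _is_int(value) or isinstance(value, float)
    if param_type == "bool":
        return isinstance(value, bool)
    if param_type == "database":
        return (
            isinstance(value, dict)
            and _is_int(value.get("database"))
            and _is_int(value.get("credential"))
        )
    raise ValueError(f"unknown parameter type: {param_type!r}")


print(argument_matches_type("integer", 3.5))
print(argument_matches_type("database", {"database": 989, "credential": 1504}))
```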
Default Values
When creating a parameter, you can specify a default value for the parameter. This is the value for the parameter that will be used if no argument is supplied. Defaults cannot be set on required parameters since by definition, an argument must be supplied for them.
If a parameter is 1) optional, 2) has no default, and 3) no argument is supplied for it, the following will happen:
- For SQL Scripts, an argument will still be set since valid SQL must be produced. Strings will be the empty string. Integers and Floats will be 0. Booleans will be false. Credential values (.id, .username, .password, .access_key_id, and .secret_access_key) will be the empty string.
- For Python, R, and Container Scripts, there will be no environment variable created.
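The fallback rules above can be summarized in a short sketch. It is illustrative only: `effective_argument` and `NO_VALUE` are hypothetical names, treating a missing “required” key as optional is an assumption, and only the scalar parameter types are covered (credential parameters in SQL scripts, whose attributes become empty strings, are left out).

```python
# Values substituted in SQL scripts for optional parameters that have neither
# a default nor an argument, as listed above.
SQL_FALLBACKS = {
    "string": "",
    "multi_line_string": "",
    "integer": 0,
    "float": 0.0,
    "bool": False,
}

# Sentinel meaning "no environment variable is created" (non-SQL scripts).
NO_VALUE = object()


def effective_argument(param: dict, arguments: dict, is_sql: bool):
    """Return the value a run would see for one parameter."""
    name = param["name"]
    if name in arguments:
        return arguments[name]
    # Assumption: a parameter without a "required" key is optional.
    if param.get("required", False):
        raise ValueError(f"missing argument for required parameter {name!r}")
    if param.get("default") is not None:
        return param["default"]
    if is_sql:
        return SQL_FALLBACKS[param["type"]]
    return NO_VALUE


optional_int = {"name": "limit", "type": "integer", "required": False}
print(effective_argument(optional_int, {}, is_sql=True))
print(effective_argument(optional_int, {}, is_sql=False) is NO_VALUE)
```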
Fixed Parameters
This feature is currently only available via the API. If you would like to define a parameter that holds a predetermined value that is not editable by custom scripts users, you can use a fixed parameter. Fixed parameters are most useful for affixing credentials and databases to your script. The values for fixed parameters are placed in environment variables in the same way arguments are. To use a fixed parameter, set the “value” attribute on the parameter to the value you would like to use.
The example code below shows how to create fixed credentials and then call them when you update a Python script. Note that when you create a database credential, json.dumps() is required.
import json
import civis
client = civis.APIClient()
params = []
# a normal integer parameter
params.append({'name': 'NUM_DOGS', 'type': 'integer'})
# a fixed custom credential
params.append({'name': 'PRIVATE_CUSTOM_CRED', 'type': 'credential_custom', 'value': my_secret_custom_cred_id})
# a fixed database credential
params.append({'name': 'PRIVATE_POSTGIS_DB', 'type': 'database', 'value': json.dumps({'database': my_secret_database_id, 'credential': my_secret_cred_id})})
# now params is [
#     {'name': 'NUM_DOGS', 'type': 'integer'},
#     {'name': 'PRIVATE_CUSTOM_CRED', 'type': 'credential_custom', 'value': 6142},
#     {'name': 'PRIVATE_POSTGIS_DB', 'type': 'database', 'value': '{"database": 989, "credential": 1504}'}
# ]
client.scripts.patch_python3(job_id, params=params)
Note: Fixed files must be readable by the author of a template.
Allowed Values (Custom Drop-Down Parameters)
These are parameters that appear in the UI as dropdown lists of pre-selected choices. For example, you could create a custom drop-down parameter for 'Month' and have it appear as a list of the twelve months.
To create a custom drop-down parameter, click “Add Allowed Values” in the new parameter form. “Value Name” is the value that will be used in your code, while “Display Name” is the string that will show as an option in the drop-down.
You can also create allowed values via the API, example below:
import civis
client = civis.APIClient()
client.scripts.patch_python3(job_id, params=[
    {
        "name": "month",
        "required": False,
        "type": "integer",
        "allowedValues": [
            { "label": "January", "value": "1" },
            { "label": "February", "value": "2" },
            { "label": "March", "value": "3" },
            { "label": "April", "value": "4" },
            { "label": "May", "value": "5" },
            { "label": "June", "value": "6" },
            { "label": "July", "value": "7" },
            { "label": "August", "value": "8" },
            { "label": "September", "value": "9" },
            { "label": "October", "value": "10" },
            { "label": "November", "value": "11" },
            { "label": "December", "value": "12" }
        ]
    }
])
On the script page, the parameter appears as follows:
In this example, “January”, “February”, “March”, “April”, “May”, “June”, “July”, “August”, “September”, “October”, “November”, and “December” are the only valid (allowed) arguments for the parameter “Month”.
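For longer drop-down lists, the “allowedValues” array can be generated rather than typed by hand. The sketch below builds the same month parameter with the standard library; `month_allowed_values` is a hypothetical helper, and the English month names assume Python's default locale. The resulting dictionary would be passed in the `params` list of the `patch_python3` call shown earlier.

```python
import calendar


def month_allowed_values() -> list:
    """Build the label/value pairs for a twelve-month drop-down."""
    # calendar.month_name[1..12] holds the month names for the active locale.
    return [
        {"label": calendar.month_name[number], "value": str(number)}
        for number in range(1, 13)
    ]


month_param = {
    "name": "month",
    "required": False,
    "type": "integer",
    "allowedValues": month_allowed_values(),
}

print(len(month_param["allowedValues"]))
print(month_param["allowedValues"][6])
```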
Editing Parameters
Once your parameter is created, you can edit the Input Display Name (‘label’), Input Helper Text (‘description’), Default Value (‘default’) and Allowed Values (‘allowedValues’) by clicking on the edit icon. You cannot edit the Parameter Type (‘type’) or Parameter Name (‘name’) - if you need to make changes to these fields we suggest deleting the existing parameter and recreating it as intended. You can rearrange how the parameters display on the script page by dragging and dropping parameters in the table on the Set Parameters side pane.
Adding a Parameter to Your Code
Once your parameter is created, you'll need to reference it in your code. Here are the ways to use parameters within the different script types. The variables available to the script vary based on the parameter type.
SQL Scripts
If the parameter is named "foo", the following variables are available by wrapping them in double curly braces. Note: the valid parameter type to be used through the API is the name listed but lowercased, unless noted by parentheses.
Parameter type | Variables available within the script |
String, Multiline String (multi_line_string) |
|
Integer, Float, Boolean (bool) |
|
Redshift Credential (credential_redshift), Custom Credential (credential_custom) |
|
AWS Credential (credential_aws) |
|
Database |
|
File |
|
Once parameters are created and filled out on a SQL script page, you'll see a generated code section that shows you the code that will be executed. This allows you to ensure parameter arguments are applied as expected before running the script.
Python, R, and Container Scripts
For these script types, parameter names are turned into environment variables. Environment variables are always strings; however, their values are validated against the parameter type, so you can cast them to that type. Environment variables are made available under both the name you enter and the name in all uppercase letters.
The following environment variables are made available for a parameter "foo". Note: the valid parameter type to be used through the API is the name listed but lowercased unless noted by parentheses.
Parameter type | Available environment variables |
String, Multiline String (multi_line_string) |
|
Integer, Float, Boolean (bool) |
|
Redshift Credential (credential_redshift), Custom Credential (credential_custom) |
|
AWS Credential (credential_aws) |
|
Database |
|
File |
|
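Because every environment variable arrives as a string, a script usually casts each one on read. The following is a minimal sketch for a Python script: `read_param` and `to_bool` are hypothetical helpers, the parameter name `NUM_DOGS` is borrowed from the fixed-parameter example, and the string spellings accepted as true are an assumption since this article does not state how booleans are rendered.

```python
import os


def to_bool(raw: str) -> bool:
    # Assumption: "true" or "1" (any letter case) means True.
    return raw.strip().lower() in {"true", "1"}


def read_param(name, cast=str, default=None, environ=None):
    """Read a script parameter from the environment and cast it.

    The variable is looked up under the given name first, then under the
    uppercase name. If neither exists (an optional parameter with no default
    and no argument), the supplied default is returned.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name, env.get(name.upper()))
    if raw is None:
        return default
    return cast(raw)


# Example with a stand-in environment instead of the real os.environ.
fake_env = {"NUM_DOGS": "3", "VERBOSE": "True"}
print(read_param("num_dogs", int, environ=fake_env))
print(read_param("VERBOSE", to_bool, environ=fake_env))
print(read_param("MISSING", int, default=0, environ=fake_env))
```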
Required Resource Parameters
The resources allocated to a script can be set through parameters named REQUIRED_CPU, REQUIRED_MEMORY, or REQUIRED_DISK_SPACE. If any of these parameters are present and arguments are supplied, the given values will be used to allocate resources. Otherwise, resources will be set based on the values in the script's requiredResources attribute. Note that parameters must be of the correct types (Integer for REQUIRED_CPU and REQUIRED_MEMORY, and Float for REQUIRED_DISK_SPACE) in order to have an effect. In the Platform UI, these fields are accessible under Script Settings.
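Since a resource parameter with the wrong type has no effect, it can help to check a parameter list before sending it to the API. This is a small sketch; `ineffective_resource_params` is a hypothetical helper, and the type strings are the API spellings from the Parameter Types section.

```python
# Parameter types required for the resource parameters to take effect.
EXPECTED_TYPES = {
    "REQUIRED_CPU": "integer",
    "REQUIRED_MEMORY": "integer",
    "REQUIRED_DISK_SPACE": "float",
}


def ineffective_resource_params(params: list) -> list:
    """Return the names of resource parameters declared with the wrong type."""
    return [
        p["name"]
        for p in params
        if p["name"] in EXPECTED_TYPES and p.get("type") != EXPECTED_TYPES[p["name"]]
    ]


params = [
    {"name": "REQUIRED_CPU", "type": "integer"},
    {"name": "REQUIRED_MEMORY", "type": "float"},        # wrong: must be integer
    {"name": "REQUIRED_DISK_SPACE", "type": "float"},
    {"name": "NUM_DOGS", "type": "integer"},             # not a resource parameter
]
print(ineffective_resource_params(params))
```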
Run Outputs
Many scripts produce new objects within Civis Platform. Scripts may write their results to a file, generate a new report, or even create other new scripts. Run Outputs are a feature that provides a structured way for a script to document its results.
While running, the script may call the API to 'publish' objects as outputs of the current run. Published run outputs can later be obtained through the API or from the script’s Run History pane in Civis Platform. Currently the run outputs feature supports files, reports (including Tableau reports), tables, and projects. Run outputs are a robust and easy to use way for a script to clearly communicate the results of its work. Rather than rely on naming conventions or parsing log results, run outputs are clear and unambiguous.
When you create a run output, the “objectType” parameter can be one of the following: File, Report, Table, Project, Credential, or JSONValue.
Note: Files created through Run Outputs will automatically expire after 30 days.
Example: Publishing a file as an output and accessing outputs from a script's 'Run History' pane
Example: Reading outputs from a completed script
GET /scripts/containers/65812/runs/887611/outputs
[
    {
        "objectType": "File",
        "objectId": 5561,
        "name": "My output file",
        "link": "api.civisanalytics.com/files/5561",
        "value": null
    }
]
If you have some component which cares about the results of your script (such as a dynamic report or a separate script), you can simplify coordination between the two by having the component poll for changes to the script's outputs. Once the expected output(s) are visible, the second component can continue its work. For example, you may have a report that wants to present the results of a script. As soon as the script publishes a file as a run output, the UI may then load the file and display it.
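The polling pattern described above can be sketched as follows. This is a sketch under stated assumptions: `wait_for_output` is a hypothetical helper, and `fetch_outputs` stands for any callable you supply that returns the current list of run outputs (for example, a wrapper around the run outputs endpoint). The timeout and interval values are arbitrary.

```python
import time


def wait_for_output(fetch_outputs, name, timeout=300.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until a run output with the given name appears, then return it.

    fetch_outputs is a caller-supplied function returning a list of output
    dictionaries shaped like the example response above.
    """
    deadline = clock() + timeout
    while True:
        for output in fetch_outputs():
            if output.get("name") == name:
                return output
        if clock() >= deadline:
            raise TimeoutError(f"no run output named {name!r} after {timeout} seconds")
        sleep(interval)


# Stand-in for the API: the output only shows up on the third poll.
responses = iter([
    [],
    [],
    [{"objectType": "File", "objectId": 5561, "name": "My output file"}],
])
found = wait_for_output(lambda: next(responses), "My output file", interval=0)
print(found["objectId"])
```

Passing `sleep` and `clock` as arguments keeps the helper easy to test without real waiting.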
Run Outputs and Author Context
Please see this article for more information about how run outputs interact with author context. The most important point to know is that run outputs will be owned by the runner of the script.
Target Project
When publishing run outputs, you can automatically collect the outputs published by the script into a target project. Any project can serve as a target project.
You can set the target project on any script (container, custom, R, SQL, or Python). You can change the target project after creating a script.
If you set a target project on a script, you must have proper permissions to add the object to the project when you attempt to publish the object to the project.
When publishing run outputs to a target project in a script running in author context, permissions are evaluated using the script runner's identity. In short - it matters whether the script runner is able to add objects to the project.
JSONValue Run Outputs
Terms to know
JSON: “JavaScript Object Notation”. Despite the name, it is a language-independent data format. Valid JSON can be scalar values like a number, a string, or boolean. It can also be an array of values, or a name-value pair where the name is a string and the value is a scalar or array.
JSONValue run outputs are not typical Civis Platform objects. They allow you to set any JSON values as outputs of your scripts. This allows users of your scripts to retrieve values, as opposed to object IDs, as outputs. It also enables workflows to pass outputs of scripts into other jobs, which you can read about here. Furthermore, JSONValue run outputs allow you to display actual values in the run pane as outputs of your script (as opposed to Civis Platform object names with links to other pages or download urls for files).
Example JSON
JSON | Explanation |
8 | The number 8 |
"Hello Civis" | The string "Hello Civis" |
[1, 1, 2, 3, 5, 8] | An array of values. Values can be of any type, even mixed types within the same array. |
{ "age": 25, "favoriteColor": "blue", "hobbies": ["reading", "coding", "hiking"] } | An Object with multiple name-value pairs, analogous to a Python Dictionary or Ruby Hash. One name-value pair maps the name "age" to the number 25. Another maps the name "hobbies" to the array ["reading", "coding", "hiking"] |
Creating JSONValue Run Outputs
When creating a run output that is a JSONValue, you must first create a JSONValue with the `POST /json_values` endpoint. You must set your json value as a serialized string representing the value. You pass in this serialized string as the “valueStr” attribute. For example:
POST /json_values
{
    "name": "My output JSON",
    "valueStr": "{\"foo\": \"bar\"}"
}
Note how the `valueStr` attribute is a string which escapes any double quotes inside it. Many libraries do this for you, for example the `json.dumps` function in Python.
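To make the escaping concrete, the sketch below builds the request body above with Python's standard `json` module. The helper name `json_value_payload` is hypothetical; only the “name” and “valueStr” attributes come from this article.

```python
import json


def json_value_payload(name: str, value) -> dict:
    """Build the body for POST /json_values; the value is serialized to a string."""
    return {"name": name, "valueStr": json.dumps(value)}


payload = json_value_payload("My output JSON", {"foo": "bar"})

# Serializing the whole body escapes the quotes inside valueStr.
body = json.dumps(payload)
print(body)
# prints {"name": "My output JSON", "valueStr": "{\"foo\": \"bar\"}"}
```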
A successful POST request will return a response with the `id` of the JSONValue object. This `id` is used to create a run output when calling the `POST Script Run Outputs API` endpoint.
{
    "id": 1,
    "name": "My output JSON",
    "valueStr": "{\"foo\": \"bar\"}",
    "value": {"foo": "bar"}
}
Retrieving JSONValue Run Outputs
When fetched from the Civis API, the JSONValue will have deserialized values in the `value` attribute, in addition to the original serialized string in the `valueStr` field.
GET /json_values/1
{
    "id": 1,
    "name": "My output JSON",
    "valueStr": "{\"foo\": \"bar\"}",
    "value": {"foo": "bar"}
}
The deserialized JSONValue is also available through the `LIST Script Run Outputs` endpoint in the `value` attribute of each item. Run outputs of other types, e.g. Files, Tables, Reports, will have null value in this attribute.
Reading JSONValue Run Outputs
GET /scripts/containers/65813/runs/887612/outputs
[
    {
        "objectType": "JSONValue",
        "objectId": 5562,
        "name": "My output JSON",
        "link": "api.civisanalytics.com/json_values/5562",
        "value": {"foo": "bar"}
    },
    {
        "objectType": "File",
        "objectId": 5563,
        "name": "My output File",
        "link": "api.civisanalytics.com/files/5563",
        "value": null
    }
]
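A consumer of this response usually wants only the JSONValue entries. The sketch below filters an already-parsed response; `json_values_by_name` is a hypothetical helper, and the sample data mirrors the response above with JSON `null` shown as Python `None`.

```python
def json_values_by_name(outputs: list) -> dict:
    """Map each JSONValue run output's name to its deserialized value."""
    return {
        output["name"]: output["value"]
        for output in outputs
        if output["objectType"] == "JSONValue"
    }


# Parsed form of the example response above.
outputs = [
    {
        "objectType": "JSONValue",
        "objectId": 5562,
        "name": "My output JSON",
        "link": "api.civisanalytics.com/json_values/5562",
        "value": {"foo": "bar"},
    },
    {
        "objectType": "File",
        "objectId": 5563,
        "name": "My output File",
        "link": "api.civisanalytics.com/files/5563",
        "value": None,
    },
]

print(json_values_by_name(outputs))
```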
Example of how to create a JSONValue run output with the Python client
When you create the JSONValue object, you must send your value as a valid JSON string. You can do this in Python with the `dumps` function of the `json` module:
import json
import civis
client = civis.APIClient()
value = 8  # could be a string or dictionary or list too!
json_value_object = client.json_values.post(json.dumps(value), name="My Awesome Value")
Great! You’ve created the JSONValue object, which is in the `json_value_object` variable.
Now you need to publish it as a run output to your script, using the same method as other run output types. We’ll assume you have a `job_id` and a `run_id`, and that you are running as a container script:
client.scripts.post_containers_runs_outputs(job_id, run_id, 'JSONValue', json_value_object.id)
Note that when you retrieve the outputs in a later script, the `value` attribute will be the deserialized json value, as opposed to the serialized string you sent in!
How JSONValue outputs show up in the run pane
JSONValue run outputs show up directly in the run pane. This is a handy way to summarize and display the output of a script to users. Previously, script developers wrote this data to a file or dumped to logs.
The JSONValue outputs show up with the name in bold followed by the value itself. Below is an example where the JSONValue’s name is “My awesome score” and the value is 0.84:
Adding Run Outputs to the Body of Notification Emails
You can leverage this feature to embed variables from a Python, R or container script into the body of notification emails. For guidance and code examples on this, please visit the respective script page.
Canceling Scripts
Overview
If you find your script is incorrect or running longer than expected, open the script's Run Log and click the 'x' button next to the spinner to cancel the script. The script will usually be canceled within one minute but some processes can take up to five minutes to cancel.
Cleanup Tasks
If you have a script that needs to do cleanup tasks when it is canceled, you can set the "cancel_timeout" parameter via the API. This feature is available when creating a new Python, R, or Container script.
The parameter should be an integer between 0 and 60 (default 0) that determines how long to wait, in seconds, between sending SIGTERM and SIGKILL signals.
You can then catch the SIGTERM signal in your script, and clean up before the script is shut down.
Python3 Code Example:
import signal
import sys
import time

def handler(signum, frame):
    print('Doing cleanup before exit...')
    time.sleep(20)
    print('Finished cleanup!')
    sys.exit(0)

signal.signal(signal.SIGTERM, handler)
Container Script Example:
trap "echo Doing cleanup before exit...; sleep 20; echo Finished cleanup!" TERM
while :
do
    echo 'waiting for cancellation'
    sleep 5
done
Caveats
- With container scripts, you need to be careful about launching subprocesses, since signals will not be forwarded to those subprocesses.
- For example, if you have a container script with the command `python /app/script.py`, you should instead do `exec python /app/script.py` so that the Python process will run with the same process ID (PID). You could also trap and forward the signals yourself, but that's more complicated. Python and R scripts will take care of this for you.
- If you run these examples yourself and watch the logs in real-time, you might not see the printed messages show up immediately. If you refresh the page you will be able to see the messages printed after the script was canceled.
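If using `exec` is not an option, the "trap and forward the signals yourself" approach mentioned in the caveats can be sketched in Python. This is one possible shape, not a Civis-provided utility: `run_and_forward` is a hypothetical wrapper that starts a child process, passes SIGTERM on to it, and returns the child's exit code. It must be called from the main thread, because Python only allows signal handlers to be installed there.

```python
import signal
import subprocess
import sys


def run_and_forward(cmd, signals=(signal.SIGTERM,)):
    """Run cmd as a child process and forward the given signals to it."""
    state = {"child": None}

    def forward(signum, frame):
        child = state["child"]
        # Only forward while the child is still running.
        if child is not None and child.poll() is None:
            child.send_signal(signum)

    # Install the handlers before starting the child so no signal is missed.
    previous = {sig: signal.signal(sig, forward) for sig in signals}
    try:
        state["child"] = subprocess.Popen(cmd)
        return state["child"].wait()
    finally:
        # Restore whatever handlers were in place before.
        for sig, old_handler in previous.items():
            if old_handler is not None:
                signal.signal(sig, old_handler)


# Example: wrap a short-lived child and report its exit code.
exit_code = run_and_forward([sys.executable, "-c", "print('child running')"])
print("child exited with", exit_code)
```

The child still needs its own SIGTERM handler, like the Python3 example above, to perform the actual cleanup.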
Glossary
Script - An object created in Civis Platform which allows you to run your own code on Civis Platform
Backing script - A script that is published and provides the base/reference for a script template
Container script - A script that runs in a Docker container which you specify
Custom script - A script that gets its code and configuration through a script template
Script template - An object that wraps a script you want other users to reuse
Script Parameter - A mechanism for allowing users to pass values into a script which can vary from run to run
Script author - The user that created a script
Script runner - The user that last updated a script. Only the current script runner can change the script
User context - When running a script, the user context determines which user's identity and credentials are linked to the API key presented to the script and which server cluster the job runs on