Integrating Hustle with Civis
There are two steps to moving data from Hustle to Civis:
- Move data from Hustle’s S3 bucket to your organization's AWS S3 bucket
- Move data from your organization's S3 bucket to Civis-managed Redshift cluster
This means that, in order to import data from Hustle, your organization will need its own AWS account to provision an s3 bucket. Hustle provides thorough documentation on this first step.
This help article documents the second step of data transfer: moving data from the client’s s3 bucket into their Civis-managed Redshift cluster.
Overview
The Import Hustle JSON from S3 script template allows users to import selected Hustle tables from a specified S3 bucket.
You can create your script using this link, or in Platform navigate to Code > Scripts > More Script Templates and search for “Import Hustle JSON from S3” or “52143”. If you are not shared on this template, please submit a support request through the help button in Platform.
Parameters
AWS Credential: a saved “Amazon Web Services S3” platform credential with read access to the S3 bucket
Source S3 Bucket: bucket name
Source S3 Bucket AWS Region: defaults to the same region as the Redshift destination database
Destination Redshift Database: destination database name
Destination Redshift database credential: database credential with write access to destination schema
Destination Database Schema Name: destination schema where data should be refreshed. Note all Hustle tables being imported must already exist in this schema.
Organization to Import: comma separated list of Hustle organizations for which to import data. Note all organizations’ data will be stored in the same destination table, indicated with an org_id field where applicable.
Tables To Import: comma separated list of Hustle tables to import. If blank default list will be imported: actions, agent_actions, agents, goal_steps, goals, group_memberships, groups, integrations, leads, messages, organizations, tags, threads. Tables must exist in the destination schema already.
Nested Values
Some Hustle tables (ex: leads) have nested JSON values (ex: tags, externals, and custom_fields) which may require you to normalize the values based on the following examples because of the limited JSON support in Redshift. Querying these views could be slow for large tables; if that's the case, you can create them as regular tables on an ongoing basis after new data is loaded to the source tables.
The queries below require a numbers table created once with an integer column called ordinal, filled with integers 1 through 100 (or the maximum things you’ll need from a JSON array). To create this table, run the following Python script:
import civis
import pandas
numbers_df = pandas.DataFrame(dict(ordinal=range(1, 101)))
civis.io.dataframe_to_civis(numbers_df, 'YOUR DATABASE', 'hustle.numbers').result()
Example SQL to create a view to extract custom fields from a simple JSON object:
CREATE VIEW hustle.leads_custom_fields AS (
SELECT
id,
CASE WHEN custom_fields ilike '%Zip_code%' THEN json_extract_path_text(custom_fields, 'Zip_Code', true) ELSE json_extract_path_text(custom_fields, 'ZipCode', true) END as zip,
json_extract_path_text(custom_fields, 'City', true) as city
FROM hustle_20.leads
) WITH NO SCHEMA BINDING;
Example SQL to create a view to extract tags into one row per lead ID and tag:
CREATE VIEW hustle.leads_tags AS (
WITH
joined AS (
SELECT
id,
json_array_length(tags, true) AS number_of_tags,
json_extract_array_element_text(
tags,
numbers.ordinal::int,
true
) AS item
FROM hustle.leads
CROSS JOIN hustle.numbers
--only generate the number of records in the cross join that corresponds
--to the number of items in the json array
WHERE numbers.ordinal < json_array_length(tags, true)
)
SELECT id, item FROM joined
) WITH NO SCHEMA BINDING'
Comments
0 comments
Please sign in to leave a comment.