Civis takes a table with address information and outputs geocodes as well as census information to help your organization create mapping visualizations. As a best practice, you should first use the CASS/NCOA job to standardize your addresses. This can greatly improve geocoding accuracy.
Location
1) Go to "Tools" and then select "Labs".
2) Choose "New Job".
3) Under "Data" select "Geocode".
On the Page
- Database: The Redshift cluster containing the data you would like to geocode.
- Credential: The Redshift credential that has access to the data you would like to geocode.
- Table: The table of address records you would like to geocode.
- Multipart Key: The primary and foreign keys on the source table.
- WHERE clause: If you’d like to limit which records are geocoded, you would do so here.
- Schema Name: The schema you would like the job output to be written to.
- Table Name: The name of the table the job will create.
- Output parsed address: This will take the address Civis is geocoding and parse it out into individual columns based on data type (e.g. housenumber, streetname, city, state, zip).
Required Fields
There are three sets of column types that this job can accept:
- full_address
- addr (or address, street_address, street_address1) + city + state (or state_code) + zip (or zip_code, zip5, zip9)
- latitude (or lat) and longitude (or lon)
Note that the columns must match the names listed above. If your table’s column names don’t match those listed, you can name them using the following SQL template:
alter table schema.table rename column state_abbreviation to state_code;
If you need to concatenate multiple columns, you can also use the following SQL template:
create table schema.new_table as (
select id , address || ‘, ‘ || city || ‘, ‘ || state || ‘ ‘ || zip_code as full_address
from schema.old_table );
Job Output
Once the job is complete, the output will be the multipart keys you selected when you set up the job, as well as the following columns that can be found when you click the Toggle Data Dictionary button at the bottom of the job setup page:
-
geocode_rating: This field is calculated differently for each provider. Lower is better and a rating of 1 can be considered highly accurate for all providers.
PostGIS: String distance from closest known location name. Ratings typically range from 0-100 but can be higher. A value of 0 indicates an exact match.
Geocoder_CA: Similar to PostGIS ratings but the maximum value is 21. - civis_latitude: Latitude
- civis_longitude: Longitude
- statefp: State FIPS code
- countyfp: County FIPS code
- tractce: Census tract code
- blkgrpce: Block group number
- blockce: Tabulation block number
- cousubfp: County subdivision FIPS code
- submcdfp: Subminor civil division FIPS code
- conctyfp: Consolidated city FIPS code
- placefp: Place FIPS code
- aiannhfp: American Indian/Alaska Native/Native Hawaiian area FIPS code
- aiannhce: American Indian/Alaska Native/Native Hawaiian area census code
- comptyp: American Indian/Alaska Native/Native Hawaiian area reservation/statistical area or off-reservation trust land Hawaiian home land indicator
- trsubfp: American Indian tribal subdivision FIPS code
- trsubce: American Indian tribal subdivision code
- anrcfp: Alaska Native regional corporation FIPS code
- ttractce: Tribal census tract code
- tblkgpce: Tribal block group letter
- elsdlea: Elementary school district local education agency code
- scsdlea: Secondary school district local education agency code
- unsdlea: Unified school district local education agency code
- uace: Urban area code
- cd113fp: 113th congressional district code
- sldust: State legislative district upper chamber code
- sldlst: State legislative district lower chamber code
- vtdst: Voting district code
- zcta5ce: 5-digit Zip Code Tabulation Area code
- tazce: Transportation analysis zone code
- ugace: Urban growth area code
- puma5ce: Public use microdata area code
- csfp: Combined statistical area code
- cbsafp: Metropolitan statistical area/micropolitan statistical area code
- metdivfp: Metropolitan division code
- cnectafp: Combined New England city and town area
- nectafp: New England city and town area code
- nctadvfp: New England city and town area division code
Additionally, if you selected the Output parsed address option there will be the additional location columns previously mentioned in the On the Page section.
How It Works
The Civis geocoder uses a PostGIS server to query TIGER/Line data. Note that these are not rooftop geocodes but are instead interpolated. TIGER/Line knows the first and last house number on a street, estimates where a given house number is within that range, and adds a few meters to the left/right of the road based on whether the house number is odd or even.
Geocoding Canadian Addresses
You can geocode Canadian addresses by selecting Canada as your country on the job setup page. When geocoding Canadian addresses, we strongly recommend combining all address fields into a single full_address column.
Our formula for rescaling the output from our Canadian geocoding provider is:
IF (1.0 - canadian_geocode_score) < 1e-3 {
RETURN 0
} ELSE {
RETURN INT((1.0 - canadian_geocode_score) * 20.0 +1)
}
Any error returned by the provider should result in a NULL.
Comments
0 comments
Please sign in to leave a comment.