The following script imports a file from S/FTP and loads it directly onto your database. If you need to import multiple files from S/FTP, please visit our help documentation here.
Adding a S/FTP Credential
To add your S/FTP credential to Platform go to New Credential and fill out the fields and click "Save"
- Name: what you would like this credential to be named
- Credential Type: Custom
- Username: S/FTP Username
- Password: S/FTP Password
Creating the Job
- On the top navigation menu, click Data, and then under Imports, select "S/FTP."
- Fill our the following required parameters. You may also include additional info in the optional and advanced fields as necessary:
Required
- Protocol- specify the transfer protocol to use. The options are 'ftp', 'ftp_tls', or 'sftp'.
- Host - enter the url of the s/ftp location, e.g. ‘ftp.examplesite.com’
- Credential - select the credential containing the appropriate username and password for the host.
- Host directory - specify the directory within the host from which to import files, e.g. '/Download/'.
- Port - specify the port on which to establish the connection. If left blank the default is 21 for ftp and 22 for sftp.
- Destination database- specify the database in which the data should be stored; optional if only one database is available.
- Destination schema and table name - specify the schema and table name you would like to import the data into, e.g. 'Schema.table'. If the single table import option is not checked, multiple tables can be created with suffixes of _1, _2, etc. To match the number of files imported.
- If table(s) exist - specify an action to take if any of the destination tables already exist. The options are 'fail', 'append', 'wipe', or 'drop'. The default is 'fail'.
- First row is header? - check the box if the first row of the file is a header.
- Column delimiter - specify whether the delimiter is a comma, tab, or pipe. The default is 'comma.'
Optional
- Credential password is a private key? - if the password to your ftp credential is a private key, check this box. Ultimately, this option depends on your server implementation and is only used for ftp_tls and sftp. In most cases it does not need to be checked.
- Single table import? - uncheck the box to import each matched file into its own table. If there are multiple files, each file will import to a different table with a suffix of _1, _2, etc.
- Extension match - import only files with the following extensions, e.g. 'Txt', ‘csv’
- Filename match - import only files matching this text, e.g. Type 'users' to import all files with the word 'users' in the filename. Supports regular expressions.
- Start timestamp, utc- specify the earliest timestamp of files to import, e.g. 'Yyyy-mm-dd hh:mi:ss am/pm'. Must be in utc.
- End timestamp, utc - specify the latest timestamp of files to import, e.g. 'Yyyy-mm-dd hh:mi:ss am/pm'. Must be in utc.
- Max errors - set this value to increase the number of allowable errors. If left blank, it will default to 0.
- Dist key- enter a column name to set a dist key. Distribute rows by the values in this column. Applicable for redshift databases.
- Sort key 1 - enter a column name to set a sort key. Sort your data by the column you join on most. Applicable for redshift databases.
- Sort key 2 - enter a column name to set a second sort key. Applicable for redshift databases.
- Incremental sync - check the box to import only matched files newer than the newest file previously imported by this job.
- Extra auth password - custom credential, for sftp only. 'Password' should be a password to decrypt the key in your ftp credential, if that key is encrypted. 'Username' is not needed. This credential is not needed in most cases.
Advanced
- Search directories within host directory? - check the box to search all subdirectories for files, within the host directory that is specified.
Max chunk size - if left blank, uses connection default. The default works for most connections. - Required cpu (mcpu)- set the desired cpu to allocate to this script. The default works for most connections.
- Required memory (mib) - set the desired memory to allocate to this script. The default works for most connections.
- Required disk space (gb)- set the desired disk space to allocate to this script. The default works for most connections.
Notes
- This will work for all delimited files less than 5GB.
- Importing can be scheduled and included in a workflow.
- The "incremental sync" option will set the tool to only import files newer than what was imported on the last run.
Comments
0 comments
Please sign in to leave a comment.