DLH.io Documentation

Setup Steps

DLH.io Setup Guide for AWS S3 Storage

Setup Instructions

DLH.io securely connects to your AWS S3 Storage. Using the form in the DLH.io portal, complete the following basic steps.

  1. Enter a Name or Alias for this connection in the 'Name/Alias' field. It must be unique among your connectors.
  2. Enter a 'Target Schema Prefix', which will be the prefix for the schema at the target into which your data files are synced. This applies only when the connection is used as a source.
  3. Enter a 'Bucket' name, where your files are stored
    • Bucket addresses typically start with s3:// or https://. Enter only the bucket name, without that prefix.
      • For every file in the Bucket, a separate table will be created and loaded into the Target Connector connected via a Sync Bridge.
  4. Select your 'Region'
  5. Enter your 'Access Key', part of the credentials used to access the bucket
  6. Enter your 'Secret Key', part of the credentials used to access the bucket
  7. Enter any other optional details in the available fields (see the setup video or contact support if you need help)
    • Folder Path is a prefix path, relative to the bucket root, from which the desired files will be retrieved
      • For JSON/GZ files stored within nested folders, every file in the subfolders will be inserted into the same Target Connection table as the files in the parent folder. The presumption is that the file structure is the same across all the files within the nested folders.
        • Folder paths should always end with a forward-slash (/)
      • JINJA Usage:
        • Basic JINJA can be used for timestamp logic in the Folder Path when the S3 bucket uses a dynamic, time-based folder structure. For example, if files are stored under a folder prefix such as 2025/11/10/myfiles/otherfiles/, and that date structure is used consistently over time, the following JINJA expression can be entered in the Folder Path field so that DLH.io dynamically retrieves from the matching folder each day: {{ today() | format_date('YYYY') }}/{{ today() | format_date('MM') }}/
    • File Pattern is a regular expression (RegEx) used to isolate only certain files to be retrieved. The regex is limited to 100 characters.
    • Folder Pattern is a regular expression (RegEx) and JINJA field that can only be used in certain circumstances, usually by the DLH.io support team when working with customers on special data integration use cases.
    • File Type allows a pre-determined type of file extension to be retrieved
      • JSON files stored in .gz compressed files will get ingested in the same manner as JSON files not stored in a .gz file
  8. Click the Save & Test button. Once your credentials are accepted, a successful connection message should appear.
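
Before filling in the form, it can help to check the values locally. The following is a minimal Python sketch of the rules stated in the steps above (bucket name without the s3:// prefix, Folder Path ending in a forward-slash, File Pattern of at most 100 characters). The function names are hypothetical and are not part of DLH.io. The sketch also assumes Python's regular expression dialect, which may differ from the one DLH.io uses, and it does not attempt to parse https:// addresses because their layout varies.

```python
import re

MAX_PATTERN_LENGTH = 100  # limit stated in this guide for the File Pattern field


def bucket_name_only(value: str) -> str:
    """Return the bare bucket name from a plain name or an s3:// URI."""
    value = value.strip()
    if value.startswith("https://"):
        # Assumption: https addresses are not parsed here; enter the name by hand.
        raise ValueError("enter the bucket name itself, not an https:// address")
    if value.startswith("s3://"):
        value = value[len("s3://"):]
    # Anything after the first slash is a folder path, not part of the bucket name.
    return value.split("/", 1)[0]


def normalize_folder_path(path: str) -> str:
    """Ensure a non-empty folder path ends with a forward-slash."""
    path = path.strip()
    if path and not path.endswith("/"):
        path += "/"
    return path


def file_pattern_problems(pattern: str) -> list[str]:
    """List reasons why a File Pattern might be rejected (empty list if none found)."""
    problems = []
    if len(pattern) > MAX_PATTERN_LENGTH:
        problems.append(f"pattern is {len(pattern)} characters; limit is {MAX_PATTERN_LENGTH}")
    try:
        re.compile(pattern)  # Python dialect only; an assumption about DLH.io
    except re.error as exc:
        problems.append(f"not a valid regular expression: {exc}")
    return problems


if __name__ == "__main__":
    print(bucket_name_only("s3://my-bucket/exports/"))
    print(normalize_folder_path("exports/daily"))
    print(file_pattern_problems(r".*\.json$"))
```

The bucket and folder names in the usage lines are placeholders; substitute your own values.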

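To illustrate how a date-based Folder Path and a File Pattern narrow down the files that are retrieved, here is a small Python sketch. It is not DLH.io's implementation: the function names and object keys are hypothetical, the date formatting only mimics the year/month shape of the JINJA expression shown in the steps above, and applying the pattern to the file name (rather than the full key) is an assumption.

```python
import re
from datetime import date


def dated_folder_prefix(day: date) -> str:
    """Build a year/month folder prefix such as '2025/11/' for the given day."""
    return f"{day:%Y}/{day:%m}/"


def select_keys(keys, folder_path, file_pattern):
    """Keep keys under folder_path whose file name matches file_pattern."""
    pattern = re.compile(file_pattern)
    selected = []
    for key in keys:
        if not key.startswith(folder_path):
            continue  # outside the folder prefix
        file_name = key.rsplit("/", 1)[-1]
        # Assumption: the pattern is tested against the file name only.
        if pattern.search(file_name):
            selected.append(key)
    return selected


if __name__ == "__main__":
    # Hypothetical object keys in a bucket
    keys = [
        "2025/11/10/myfiles/otherfiles/a.json",
        "2025/11/readme.txt",
        "2025/10/b.json",
    ]
    prefix = dated_folder_prefix(date(2025, 11, 10))
    print(prefix)
    print(select_keys(keys, prefix, r"\.json$"))
```

Files in nested folders under the prefix are kept in this sketch, in line with the nested-folder behavior described for JSON/GZ files.
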
How to Setup