You might not have access to this feature!

This feature is only available on our premium and enterprise plans. [Talk to our team](mailto:premium@customer.io) about upgrading your plan.

# Amazon S3 (Advanced)

[Premium](/accounts-and-workspaces/plan-features/) [Enterprise](/accounts-and-workspaces/plan-features/)

## About this integration

Amazon S3 is an object storage service that offers industry-leading scalability, data availability, security, and performance.

*   [**Mode**](/cdp/destinations/getting-started/#connection-mode): How we forward source data to the destination: through Customer.io's servers or directly from our JavaScript client.
*   [**Web sources**](/cdp/sources/getting-started/#types-of-sources): Whether this integration supports our JavaScript client.
*   [**API sources**](/cdp/sources/getting-started/#types-of-sources): Whether this integration supports our server libraries (Go, NodeJS, Python), API, Mobile SDK, and other data sources.
*   [**Supported calls**](/cdp/sources/source-spec/source-events/): The API methods this integration supports. Standard: [alias](/api/cdp/#operation/alias), [group](/api/cdp/#operation/group), [identify](/api/cdp/#operation/identify), [page](/api/cdp/#operation/page), [screen](/api/cdp/#operation/screen), and [track](/api/cdp/#operation/track).
*   [**Integration name**](/cdp/sources/source-spec/common-fields/#the-integrations-object): Amazon Simple Storage Service (S3). Use this name if you want to enable or disable the integration in the `integrations` object.

## How it works

This integration sends CSV, JSON, or [parquet](https://parquet.apache.org/) files containing your data to your Amazon S3 (Advanced) bucket. Then you can ingest the files in your storage bucket to your data warehouse of choice.

We write files for each type of incoming call to your storage bucket every 10 minutes. So you’ll have files for `identify` calls, `track` calls, and so on. Files are named with an incrementing number, so it’s easy to determine the sequence of files, and the order of incoming calls.

### Sync frequency and file names

Syncs occur every 10 minutes, and each sync file contains data from the preceding interval. For example, if the last sync ran at 12:00 PM, the next sync at 12:10 PM sends only the data received from 12:00:00 PM through 12:09:59 PM.
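Given those 10-minute windows, you can compute which interval a given timestamp falls into. The sketch below assumes windows aligned to the top of the hour, as in the example above; Customer.io's actual scheduling may differ:

```python
from datetime import datetime, timedelta

def sync_window(ts: datetime, interval_minutes: int = 10):
    """Return the (start, end) of the sync interval containing `ts`,
    assuming intervals aligned to the hour (12:00:00-12:09:59, etc.)."""
    minute = (ts.minute // interval_minutes) * interval_minutes
    start = ts.replace(minute=minute, second=0, microsecond=0)
    end = start + timedelta(minutes=interval_minutes) - timedelta(seconds=1)
    return start, end

start, end = sync_window(datetime(2024, 5, 1, 12, 4, 30))
print(start.time(), end.time())  # 12:00:00 12:09:59
```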

Each sync generates new files for each data type in your storage bucket. Files are named in the format `<integration id>.<integration action id>.<current position>.<type>`.

*   The integration ID and action ID are unique identifiers generated by Customer.io. You’ll see them with the first sync.
*   `current position` is an incrementing number beginning at 1 that indicates the order of syncs. So your first sync is 1, the next one is 2, etc.
*   `type` is the type of incoming call—`identify`, `track`, `page`, `screen`, `alias`, or `group`.

So, if your file is called `2184.13699.1.track.json`, it's the first sync file for the `track` call type, where `2184` is the integration ID and `13699` is the action ID.
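When ingesting these files, the naming convention above makes it straightforward to recover the sync order. A minimal sketch (the file names are illustrative):

```python
def parse_export_filename(name: str) -> dict:
    """Split '<integration id>.<action id>.<position>.<type>[.<ext>]'."""
    parts = name.split(".")
    return {
        "integration_id": parts[0],
        "action_id": parts[1],
        "position": int(parts[2]),  # incrementing sync number, starts at 1
        "type": parts[3],           # identify, track, page, screen, alias, group
    }

files = ["2184.13699.2.track.json", "2184.13699.1.track.json"]
ordered = sorted(files, key=lambda f: parse_export_filename(f)["position"])
print(ordered)  # ['2184.13699.1.track.json', '2184.13699.2.track.json']
```

Sorting on the numeric `position` field (rather than the raw file name) keeps the order correct once the counter passes 9.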

## Getting started

1.  Go to **Data & Integrations > Integrations** and select Amazon S3 (Advanced) in the *Directory* tab.
    
2.  Connect to your storage bucket:
    
    1.  **Endpoint**: Endpoint for the internal ETL API.
        
    2.  **Token**: Authentication token for the internal ETL API.
        
    3.  **Format**: Format of the data files that will be created.
        
    4.  **Bucket Name**: Name of an existing bucket. Learn more about [S3 buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) and [bucket naming rules](https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html).
        
    5.  **Bucket Path**: An optional folder inside the bucket where files will be written.
        
    6.  **Access Key**: The AWS Access Key ID that will be used to connect to your S3 Bucket. Your Access Key ID can be found in the *My Security Credentials* section of your AWS Console. Learn more about [AWS credentials](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html).
        
    7.  **Secret Key**: The AWS Secret Access Key that will be used to connect to your S3 Bucket. Your Secret Access Key can be found in the *My Security Credentials* section of your AWS Console. Learn more about [AWS credentials](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html).
        
    8.  **Region**: The AWS Region where your S3 Bucket resides. Learn more about [AWS Regions](https://docs.aws.amazon.com/general/latest/gr/rande.html).
        
3.  Review your setup and click **Finish** to enable your integration.
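The access key pair you provide must belong to an IAM user or role that can write to the bucket. As a sketch, a policy along these lines grants that access; the bucket name `example-bucket` is a placeholder, and you should verify the exact actions you need against AWS's IAM documentation:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-bucket",
        "arn:aws:s3:::example-bucket/*"
      ]
    }
  ]
}
```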
    

## Schemas

The following schemas represent JSON for the different *types* of files we export to your storage bucket (`identify`, `track`, and so on). For CSV and Parquet files, we stringify objects and arrays. For example, if `identify` calls contain a `traits` object with a `first_name` and `last_name`, CSV files output to your storage bucket will contain a `traits` column with data that looks like this for each row: `"{ \"first_name\": \"Bugs\", \"last_name\": \"Bunny\" }"`.
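To recover the nested object from a CSV or Parquet export, parse the stringified column with a standard JSON parser. A minimal sketch using the example above (the exact escaping in your files may vary by format):

```python
import json

# The `traits` cell of a CSV/Parquet export holds a stringified object.
traits_cell = '{ "first_name": "Bugs", "last_name": "Bunny" }'

traits = json.loads(traits_cell)  # stringified object -> dict
print(traits["first_name"])  # Bugs
```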


#### identify

*Identifies* files contain [identify](/integrations/api/cdp/#operation/identify) calls sent to Customer.io. The `context` and `traits` in the schema below are objects in JSON. In CSV and parquet files, these columns contain stringified objects.

*   traits object
    
    Additional properties that you know about a person. We’ve listed some common/reserved traits below, but you can add any traits that you might use in another system.
    
    *   createdAt string  (date-time)
        
        We recommend that you pass date-time values as ISO 8601 date-time strings. We convert this value to fit destinations where appropriate.
        
    *   email string
        
        A person’s email address. In some cases, you can pass an empty `userId` and we’ll use this value to identify a person.
        
    *   *Additional Traits\** any type
        
        Traits that you want to set on a person. These can take any JSON shape.
        

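For illustration, a single record in a JSON identify export might look like the following sketch. The field values are hypothetical, and the exact set of top-level fields depends on what you send:

```json
{
  "userId": "user-123",
  "traits": {
    "createdAt": "2024-05-01T12:00:00Z",
    "email": "bugs@example.com",
    "plan": "premium"
  },
  "context": { "library": { "name": "analytics.js" } }
}
```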

#### group

*Groups* files contain `group` calls sent to Customer.io. If your integration outputs CSV or parquet files, the `context` and `traits` columns contain stringified objects.

*   traits object
    
    Additional data points that the call assigns to the group.
    
    *   *Additional Traits\** any type
        
        Traits can have any name, like `account_name` or `total_employees`. These can take any JSON shape.
        


#### track

*Tracks* files contain entries for the `track` calls you send to Customer.io, showing information about the events your users perform.

If your integration outputs CSV or parquet files, the `context` and `properties` columns contain stringified objects. If your integration outputs JSON files, the `context` and `properties` columns contain objects.

*   event string
    
    The slug of the event name, mapping to an event-specific table.
    
*   event\_text string
    
    The name of the event.
    
*   properties object
    
    Additional properties sent with the track call. We’ve listed some common/reserved properties captured by our `Analytics.js` library, but you can add any properties that you might use in another system.
    
    *   *Event Properties\** any type
        

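For illustration, a single record in a JSON track export might look like the following sketch. The field values are hypothetical, and the exact set of top-level fields depends on what you send:

```json
{
  "userId": "user-123",
  "event": "order_completed",
  "event_text": "Order Completed",
  "properties": { "order_id": "A-1001", "total": 42.5 }
}
```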

#### page

*Pages* files contain entries for the `page` calls sent to Customer.io. If your integration outputs CSV or parquet files, the `context` and `properties` columns contain stringified objects. If your integration outputs JSON files, the `context` and `properties` columns contain objects.

*   properties object
    
    Additional properties sent with the page call. We’ve listed some common/reserved properties captured by our `Analytics.js` library, but you can add any properties that you might use in another system.
    
    *   category string
        
        The category of the page. This might be useful if you have single-page routes or a flattened URL structure.
        
    *   path string
        
        The path of the page. This defaults to `location.pathname`, but can be overridden.
        
    *   referrer string
        
        The referrer of the page, if applicable. This defaults to `document.referrer`, but can be overridden.
        
    *   search string
        
        The search query in the URL, if present. This defaults to `location.search`, but can be overridden.
        
    *   title string
        
        The title of the page. This defaults to `document.title`, but can be overridden.
        
    *   url string
        
        The URL of the page. This defaults to the canonical URL if available, and falls back to `document.location.href`.
        
    *   *Page Properties\** any type
        


#### screen

*Screens* files contain entries for the `screen` calls sent to Customer.io. If your integration outputs CSV or parquet files, the `context` and `properties` columns contain stringified objects. If your integration outputs JSON files, the `context` and `properties` columns contain objects.

*   properties object
    
    Additional properties that you sent in your screen event.
    
    *   *Additional event properties\** any type
        
        Properties that you sent in the event. These can take any JSON shape.
        


#### alias

The Alias schema contains entries for the `alias` calls you send to Customer.io. It shows information about the users you merge, with each entry showing a user’s new `user_id` and their `previous_id`.
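For illustration, a single record in a JSON alias export might look like the following sketch (the identifier values are hypothetical):

```json
{
  "user_id": "user-123",
  "previous_id": "anonymous-456"
}
```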