This guide walks through the steps needed to set up your Public API container on AWS S3. Before starting, please reach out to your CSM to begin the process.
What is an S3 Bucket?
S3 stands for Simple Storage Service, AWS's cloud storage service. S3 lets you store, retrieve, access, and back up any amount of data at any time, from anywhere. Because S3 is object-based storage, all data is stored as objects.
Feature Overview
This feature was created to share data between Metadata and its customers. Although the data in Metadata can be visualized, and in some cases exported, through the website, some customers need to receive data on a recurring basis so they can use it however suits them best (ETL, BI tools…).
To provide that data, we created a system that generates reports and automatically exports the results to an S3 bucket owned by the customer. These reports are rerun and exported daily.
Because the buckets are controlled by the client, the client decides how best to use and protect the data.
To begin, the following has to be completed on your end:
- Set up an S3 bucket on AWS. Please note that if you have multiple instances on the Metadata Platform, each instance should get its own bucket.
- Set up the necessary policies to protect the data (block public access, enable encryption).
- Set up the necessary policy for Metadata to write the data on a daily basis.
- Create an AWS S3 bucket to receive Metadata’s report deliveries and note down the S3 bucket ARN.
- Grant arn:aws:iam::581693852494:role/MetadataAppExportExternalReports write access to this new bucket. A sample S3 bucket policy is shown below:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowListObjects",
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::581693852494:role/MetadataAppExportExternalReports"
]
},
"Action": [
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:ListBucketMultipartUploads"
],
"Resource": "arn:aws:s3:::{{customer-s3-reports-bucket-name}}"
},
{
"Sid": "AllowObjectsFetchAndCreate",
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::581693852494:role/MetadataAppExportExternalReports"
]
},
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:GetObjectTagging",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts",
"s3:PutObjectAcl",
"s3:PutObjectVersionAcl",
"s3:DeleteObject"
],
"Resource": "arn:aws:s3:::{{customer-s3-reports-bucket-name}}/*"
}
]
}
- Additional points to note:
  - If you choose to encrypt the S3 bucket with a KMS key, arn:aws:iam::581693852494:role/MetadataAppExportExternalReports must be granted permission to use that KMS key (through the key policy).
  - If you use the provided sample S3 bucket policy, edit it to:
    - Update the S3 bucket ARNs (“Resource” value)
    - Grant principals from your own account the required permissions to the S3 bucket
- Create folders in the bucket labeled:
experiment_daily_data
experiment_totals
account_report
- Export a copy of your bucket policy and, if the bucket is encrypted with a KMS key, your KMS key policy in JSON format, and share it with the Metadata support team.
- Please also include which AWS Region your bucket is in.
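The customer-side steps above can be sketched in a few lines of Python. This is a minimal illustration, not an official Metadata tool: it fills the bucket name into the sample bucket policy from this guide and lists the folder keys to create. The function names and the bucket name `example-reports-bucket` are hypothetical. It also assumes that a "folder" is created as an empty object whose key ends in `/`, which is how the S3 console represents folders.

```python
import json

# Role ARN quoted in this guide.
METADATA_ROLE_ARN = "arn:aws:iam::581693852494:role/MetadataAppExportExternalReports"

# Folder names listed in this guide.
FOLDERS = ["experiment_daily_data", "experiment_totals", "account_report"]


def build_bucket_policy(bucket_name):
    """Return the sample bucket policy with the bucket name filled in."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListObjects",
                "Effect": "Allow",
                "Principal": {"AWS": [METADATA_ROLE_ARN]},
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:ListBucketMultipartUploads",
                ],
                # Bucket-level actions apply to the bucket ARN itself.
                "Resource": bucket_arn,
            },
            {
                "Sid": "AllowObjectsFetchAndCreate",
                "Effect": "Allow",
                "Principal": {"AWS": [METADATA_ROLE_ARN]},
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:GetObjectTagging",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                    "s3:PutObjectAcl",
                    "s3:PutObjectVersionAcl",
                    "s3:DeleteObject",
                ],
                # Object-level actions apply to every key in the bucket.
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


def folder_keys():
    """Return the object keys that represent the required folders."""
    return [f"{name}/" for name in FOLDERS]


if __name__ == "__main__":
    # "example-reports-bucket" is a placeholder, not a real bucket.
    print(json.dumps(build_bucket_policy("example-reports-bucket"), indent=2))
    print(folder_keys())
```

The printed JSON can be pasted into the bucket policy editor after you add statements for principals from your own account.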
To be completed on Metadata's end:
Once you have provided Metadata with the bucket name, Metadata will configure the process that delivers the data to your bucket on a daily basis. Extra configurations that Metadata can make on request:
- Folder path for each data delivery
- File type (csv, json, parquet…)
- Frequency (daily, weekly, monthly)
- Create an AWS IAM policy to allow write operations on the customer’s S3 bucket and attach the policy to the arn:aws:iam::581693852494:role/MetadataAppExportExternalReports role. A sample IAM policy is shown below:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:ListBucketMultipartUploads"
],
"Resource": [
"arn:aws:s3:::{{customer-s3-reports-bucket-name}}"
]
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:GetObjectTagging",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts",
"s3:PutObjectAcl",
"s3:PutObjectVersionAcl",
"s3:DeleteObject"
],
"Resource": [
"arn:aws:s3:::{{customer-s3-reports-bucket-name}}/*"
]
}
]
}
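Policy documents must be valid JSON with every placeholder replaced, and mistakes such as a missing comma between two actions are easy to overlook. The sketch below shows one possible pre-flight check using only the Python standard library. The function name `check_policy_text` is hypothetical, and the check covers syntax and leftover `{{...}}` placeholders only, not whether AWS will accept the policy.

```python
import json
import re

# Matches template placeholders such as {{customer-s3-reports-bucket-name}}.
PLACEHOLDER = re.compile(r"\{\{[^{}]*\}\}")


def check_policy_text(text):
    """Return a list of problems found in a policy document (empty if none)."""
    try:
        policy = json.loads(text)
    except json.JSONDecodeError as err:
        # Stop at the first syntax error; nothing else can be checked.
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]

    problems = []
    if not isinstance(policy, dict) or "Statement" not in policy:
        problems.append('missing "Statement"')
    for match in PLACEHOLDER.findall(text):
        problems.append(f"unresolved placeholder: {match}")
    return problems
```

For example, `check_policy_text(open("policy.json").read())` returns an empty list when the file parses and contains no placeholders; `policy.json` is an illustrative file name.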
Data Sample
experiment_daily_data -> Daily metrics of Experiments.
experiment_totals -> Aggregated Campaign Totals.
account_report -> Account Report.
experiment_metadata -> Experiment setup.
Files are placed daily at 08:30 UTC.
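If a downstream job (ETL, BI refresh) should run only after the files arrive, it helps to compute the next expected delivery. The following is a minimal sketch, assuming the delivery time is 08:30 UTC and the default daily frequency. The function name `next_delivery` is hypothetical, and actual arrival may lag the scheduled time.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: files are placed at 08:30 UTC every day.
DELIVERY_TIME_UTC = time(8, 30)


def next_delivery(now):
    """Return the first 08:30 UTC instant at or after `now`.

    `now` must be a timezone-aware datetime.
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), DELIVERY_TIME_UTC, tzinfo=timezone.utc)
    if candidate < now_utc:
        # Today's delivery time has already passed; use tomorrow's.
        candidate += timedelta(days=1)
    return candidate
```

Calling `next_delivery(datetime.now(timezone.utc))` gives the next scheduled time relative to the current moment.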
High level overview