Storing files to multiple AWS S3 buckets

Option 1, Single bucket replication (files added to Bucket A are automatically added to Bucket B)

If you want files to be stored in a second S3 bucket automatically as they are uploaded, S3's built-in "Cross Region Replication" feature is the method to use.

It's very easy to set up, with just a few clicks in the AWS console.

  • Open the properties of the S3 bucket you want to copy from and, under "Versioning", click "Enable Versioning".
  • Under "Cross Region Replication", click "Enable Cross-Region Replication".
  • Source: You can tell S3 to use the entire bucket as a source, or only objects with a given prefix.
  • Destination Region: You need to pick a different region to copy to; replication within the same region isn't supported.
  • Destination Bucket: If you have already created a bucket in that region you can select it, or you can create one on the fly.
  • Destination storage class: You can choose how files are stored in the destination bucket.
  • Create/select IAM role: This lets you use an existing IAM role, or create a new role with the appropriate permissions to copy files to the destination bucket.

Once you press Save, Cross Region Replication is set up. Any files you upload to the source bucket from now on will automatically appear in the destination bucket a moment later.

It doesn't copy across any pre-existing files from the source; only new files are acted upon.

Also, Cross Region Replication can't (currently) be set up to copy from one source to more than one destination; however, there's a way to do that using Lambda, described in Option 2 below.
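If you'd rather script Option 1 than click through the console, the same setup can be applied with the AWS SDK. Here's a rough sketch using the Node.js aws-sdk; the bucket names and role ARN are placeholders for your own:

    var AWS = require('aws-sdk');
    var s3 = new AWS.S3();

    // Versioning must be enabled on the source (and destination) bucket first.
    s3.putBucketVersioning({
      Bucket: 'my-source-bucket',
      VersioningConfiguration: { Status: 'Enabled' }
    }, function (err) {
      if (err) return console.error(err);

      // Then attach a replication rule pointing at the destination bucket.
      s3.putBucketReplication({
        Bucket: 'my-source-bucket',
        ReplicationConfiguration: {
          Role: 'arn:aws:iam::123456789012:role/my-replication-role',
          Rules: [{
            Prefix: '',  // empty prefix = replicate the entire bucket
            Status: 'Enabled',
            Destination: {
              Bucket: 'arn:aws:s3:::my-destination-bucket',
              StorageClass: 'STANDARD'
            }
          }]
        }
      }, function (err) {
        if (err) console.error(err);
      });
    });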

Option 2, Multiple bucket replication (files added to Bucket A are automatically added to Buckets B, C, and D)

In a nutshell, AWS Lambda can be used to trigger some code to run based on events, such as a file landing in an S3 bucket.

The following steps create a Lambda function that monitors a source bucket and copies any files that are created to one or more target buckets.
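To make the mechanism concrete, here's a stripped-down sketch of the general idea. The actual code in the repository linked below is more thorough; this is an illustration, not a drop-in replacement:

    var AWS = require('aws-sdk');
    var s3 = new AWS.S3();

    exports.handler = function (event, context) {
      // S3 invokes the function with the bucket and key of the new object
      // (a production version should URL-decode the key).
      var record = event.Records[0];
      var sourceBucket = record.s3.bucket.name;
      var key = record.s3.object.key;

      // Look up the source bucket's 'TargetBucket' tag for the target list.
      s3.getBucketTagging({ Bucket: sourceBucket }, function (err, data) {
        if (err) return context.done(err);
        var tag = data.TagSet.filter(function (t) {
          return t.Key === 'TargetBucket';
        })[0];
        if (!tag) return context.done(new Error('No TargetBucket tag set'));

        // Copy the object to each target; entries look like
        // 'name' or 'name@region'.
        var targets = tag.Value.split(' ');
        var remaining = targets.length;
        targets.forEach(function (target) {
          var parts = target.split('@');
          var client = parts[1] ? new AWS.S3({ region: parts[1] }) : s3;
          client.copyObject({
            Bucket: parts[0],
            Key: key,
            CopySource: sourceBucket + '/' + key
          }, function (err) {
            if (err) return context.done(err);
            if (--remaining === 0) context.done(null, 'Copied to all targets');
          });
        });
      });
    };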

Pre-Lambda steps

  1. Create one or more buckets that you want to use as destinations.
  2. Clone this Node.js repository locally (https://github.com/eleven41/aws-lambda-copy-s3-objects) and run the following to install its dependencies and package it up:
    1. npm install async
    2. npm install aws-sdk
    3. Compress all the files into a Zip file
  3. In AWS's 'Identity and Access Management' (IAM), click on 'Policies', then 'Create Policy', and paste in the JSON from the 'IAM Role' section of the repository linked above (a scripted alternative is sketched after this list).
  4. In IAM Roles, create a new Role. Under 'AWS Service Roles', click on Lambda and attach the Policy you created in the previous step.
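If you'd prefer to script the IAM side as well, the policy can be created with the SDK. The policy document in this sketch is only an approximation of what the function needs; use the actual JSON from the repository's 'IAM Role' section:

    var AWS = require('aws-sdk');
    var iam = new AWS.IAM();

    // Approximation only -- the authoritative JSON is in the repository's
    // 'IAM Role' section. Bucket names are placeholders.
    var policyDocument = {
      Version: '2012-10-17',
      Statement: [
        { // read new objects and the TargetBucket tag from the source bucket
          Effect: 'Allow',
          Action: ['s3:GetObject', 's3:GetBucketTagging'],
          Resource: ['arn:aws:s3:::my-source-bucket',
                     'arn:aws:s3:::my-source-bucket/*']
        },
        { // write copies into the destination bucket(s)
          Effect: 'Allow',
          Action: ['s3:PutObject'],
          Resource: ['arn:aws:s3:::my-destination-bucket/*']
        },
        { // allow the function to write logs to CloudWatch
          Effect: 'Allow',
          Action: ['logs:CreateLogGroup', 'logs:CreateLogStream',
                   'logs:PutLogEvents'],
          Resource: 'arn:aws:logs:*:*:*'
        }
      ]
    };

    iam.createPolicy({
      PolicyName: 'lambda-s3-copy-policy',
      PolicyDocument: JSON.stringify(policyDocument)
    }, function (err, data) {
      if (err) console.error(err);
      else console.log('Created policy:', data.Policy.Arn);
    });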

Lambda steps

  • Log in to the AWS Console, select Lambda, and click "Create a Lambda function".
  • Skip the pre-made blueprints by clicking Skip at the bottom of the page.
  • Name: Give your Lambda function a name.
  • Description: Give it a brief description
  • Runtime: Choose ‘Node.js’
  • Code entry type: Choose 'upload a .zip file' and upload the pre-made Zip file from earlier; no changes are needed to its code.
  • Handler: Select ‘index.handler’
  • Role: Select the IAM role created earlier from the Pre-Lambda steps.

You can leave the remaining Advanced settings at their default values and create the Lambda function.
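For reference, creating the function from a script instead of the console looks roughly like this (the function name, role ARN, and zip filename are placeholders):

    var AWS = require('aws-sdk');
    var fs = require('fs');
    var lambda = new AWS.Lambda();

    lambda.createFunction({
      FunctionName: 'copy-s3-objects',
      Description: 'Copies new objects from a source bucket to target buckets',
      Runtime: 'nodejs',            // the Node.js runtime identifier of the period
      Handler: 'index.handler',
      Role: 'arn:aws:iam::123456789012:role/my-lambda-copy-role',
      Code: { ZipFile: fs.readFileSync('aws-lambda-copy-s3-objects.zip') }
    }, function (err, data) {
      if (err) console.error(err);
      else console.log('Created function:', data.FunctionArn);
    });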

  • Once the function is created, its properties page will open. Click on the 'Event Sources' tab.
    • Event Source Type: S3
    • Bucket: Choose your source bucket here
    • Event Type: Object Created
    • Prefix: Leave blank
    • Suffix: Leave blank
    • Leave ‘Enable Now’ selected and press ‘Submit’
  • Go back to your original source S3 bucket and create a new Tag called 'TargetBucket'
  • In the 'TargetBucket' value, add a space-separated list of the target buckets you want files copied to. If a bucket is in a different region, specify it after an '@', for example:
    • destination1 destination2@us-west-2 destination3@us-east-1
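The tag can also be set with the SDK; here's a minimal sketch using the example targets above (the bucket name is a placeholder):

    var AWS = require('aws-sdk');
    var s3 = new AWS.S3();

    // Note: putBucketTagging replaces ALL existing tags on the bucket,
    // so merge any tags you already have into the TagSet below.
    s3.putBucketTagging({
      Bucket: 'my-source-bucket',
      Tagging: {
        TagSet: [{
          Key: 'TargetBucket',
          Value: 'destination1 destination2@us-west-2 destination3@us-east-1'
        }]
      }
    }, function (err) {
      if (err) console.error(err);
    });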

You can use Lambda's built-in Test section to check that the function works. Don't forget to change the test event to specify your source bucket.
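If you want a quick sanity check outside the console, you can also invoke the handler locally with a minimal S3 event. This assumes your AWS credentials are configured locally; the event below is a trimmed version of the real S3 put notification:

    // Hypothetical local test -- run from the repository directory.
    // 'index.handler' means index.js exports the handler used above.
    var handler = require('./index').handler;

    var testEvent = {
      Records: [{
        s3: {
          bucket: { name: 'my-source-bucket' },  // change to your source bucket
          object: { key: 'example.txt' }         // an object that already exists
        }
      }]
    };

    // Minimal stand-in for the Lambda context object of the period.
    handler(testEvent, {
      done: function (err, message) {
        if (err) console.error('Failed:', err);
        else console.log('OK:', message);
      },
      succeed: function (message) { console.log('OK:', message); },
      fail: function (err) { console.error('Failed:', err); }
    });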

If there are errors, there will be a link to the CloudWatch logs to help diagnose the problem.

Any files added to the source bucket will now automatically be copied to the target bucket(s) specified in the 'TargetBucket' tag on the source bucket.
