Below we cover 10 ways to use Amazon S3, with code examples showing how each one can help your business thrive in the cloud.
- Storing and accessing user-uploaded files:
A common use case for Amazon S3 is storing images, videos, and other files uploaded by users of a website. Your application code can integrate with Amazon S3 to save these files and retrieve them as needed.
Example code:
import boto3
s3 = boto3.resource('s3')
bucket_name = 'my-bucket'
# Uploading file
file_name = 'my_image.jpg'
object_key = 'uploads/' + file_name
s3.meta.client.upload_file(file_name, bucket_name, object_key)
# Retrieving file
response = s3.Object(bucket_name, object_key).get()
content = response['Body'].read()
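To enumerate what has already been uploaded, the same resource object can list keys under the uploads/ prefix. A minimal sketch continuing the snippet above (the prefix and bucket name are the same assumptions as before):
# List every file stored under the uploads/ prefix
bucket = s3.Bucket(bucket_name)
for obj in bucket.objects.filter(Prefix='uploads/'):
    print(obj.key, obj.size)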
- Website hosting:
Amazon S3 can also be used to host static websites, which can help reduce the load on your web server and improve performance. The code below shows how to configure a bucket for static website hosting:
import boto3
import json
s3 = boto3.client('s3')
bucket_name = 'my-website-bucket'
# Create website configuration
website_configuration = {
    'ErrorDocument': {'Key': 'error.html'},
    'IndexDocument': {'Suffix': 'index.html'}
}
# Set bucket policy for website access
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::%s/*" % bucket_name]
    }]
}
# Set bucket to host website
s3.create_bucket(Bucket=bucket_name)
s3.put_bucket_website(
    Bucket=bucket_name,
    WebsiteConfiguration=website_configuration
)
s3.put_bucket_policy(
    Bucket=bucket_name,
    Policy=json.dumps(policy)
)
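One gotcha worth noting: newly created buckets have Block Public Access enabled by default, so a public-read bucket policy like the one above will be rejected until those settings are relaxed. A minimal sketch, assuming you genuinely intend the bucket to be public:
# Relax Block Public Access so the public-read bucket policy can be applied
s3.put_public_access_block(
    Bucket=bucket_name,
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': False,
        'IgnorePublicAcls': False,
        'BlockPublicPolicy': False,
        'RestrictPublicBuckets': False
    }
)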
For a full example of creating, listing, and updating buckets and uploading files, see the article referenced here, which walks through each step in detail and includes a complete source code repo:
- Serverless computing:
Amazon S3 can be used with AWS Lambda to create event-driven serverless applications that respond to changes in an S3 bucket. For example, you could trigger a Lambda function to resize images when they are added to a specific S3 bucket.
Example code:
import boto3
import json
s3 = boto3.client('s3')
def lambda_handler(event, context):
    # Get bucket name and object key from the S3 event record
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    file_key = event['Records'][0]['s3']['object']['key']
    # Resize the image bytes using an external library
    # (resize_image is a placeholder for your own Pillow/ImageMagick helper)
    image_bytes = s3.get_object(Bucket=bucket_name, Key=file_key)['Body'].read()
    resized_image = resize_image(image_bytes)
    # Upload the resized image under a new key
    new_key = 'resized/' + file_key
    s3.put_object(Bucket=bucket_name, Key=new_key, Body=resized_image)
    return {
        'statusCode': 200,
        'body': json.dumps('Image resized successfully')
    }
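For the trigger itself, the bucket needs an event notification that points at the Lambda function. A minimal sketch using the same client as above, assuming the function already exists and its resource policy allows S3 to invoke it (the ARN and prefix below are placeholders):
# Fire the Lambda function whenever a new object lands under uploads/
s3.put_bucket_notification_configuration(
    Bucket='my-bucket',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:resize-image',
            'Events': ['s3:ObjectCreated:*'],
            'Filter': {'Key': {'FilterRules': [{'Name': 'prefix', 'Value': 'uploads/'}]}}
        }]
    }
)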
- Backup and disaster recovery:
Storing backups and data archives is another common use case for Amazon S3. The following code demonstrates how to copy files from a local machine to an S3 bucket for backup purposes:
import os
import boto3
s3 = boto3.resource('s3')
source_path = '/path/to/local/files'
bucket_name = 'backup-bucket'
# Upload each file in the local directory to S3
for file_path in os.listdir(source_path):
    s3_file_path = 'backup/' + file_path
    s3.meta.client.upload_file(os.path.join(source_path, file_path),
                               bucket_name, s3_file_path)
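To keep backup costs down, this can be paired with a lifecycle rule that moves older backups to a colder storage class and eventually expires them. A minimal sketch, with the 30-day and 365-day thresholds chosen purely as an example:
# Transition backups to Glacier after 30 days and delete them after a year
s3.meta.client.put_bucket_lifecycle_configuration(
    Bucket=bucket_name,
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'archive-old-backups',
            'Filter': {'Prefix': 'backup/'},
            'Status': 'Enabled',
            'Transitions': [{'Days': 30, 'StorageClass': 'GLACIER'}],
            'Expiration': {'Days': 365}
        }]
    }
)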
- Content delivery and distribution:
Amazon S3 buckets can be configured with CloudFront to provide scalable content delivery and distribution. This can help to improve site speed and reduce latency for visitors. The following code demonstrates how to create a new CloudFront distribution:
import boto3
import json
import time
s3 = boto3.client('s3')
cf = boto3.client('cloudfront')
bucket_name = 'my-bucket'
origin_id = 'my-bucket-origin'
domain_name = 'example.com'
# Create CloudFront origin access identity
response = cf.create_cloud_front_origin_access_identity(
    CloudFrontOriginAccessIdentityConfig={
        'CallerReference': str(time.time()),
        'Comment': 'Access identity for bucket origin'
    }
)
identity_id = response['CloudFrontOriginAccessIdentity']['Id']
s3_iam_principal = 'origin-access-identity/cloudfront/' + identity_id
# Configure S3 bucket policy for CloudFront access
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "GrantCloudFrontAccess",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity " + identity_id},
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::%s/*" % bucket_name]
    }]
}
s3.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))
# Create CloudFront distribution
cf.create_distribution(
    DistributionConfig={
        'CallerReference': str(time.time()),
        'Aliases': {'Quantity': 1, 'Items': [domain_name]},
        'DefaultRootObject': 'index.html',
        'Origins': {'Quantity': 1, 'Items': [{
            'Id': origin_id,
            'DomainName': '%s.s3.amazonaws.com' % bucket_name,
            'S3OriginConfig': {
                'OriginAccessIdentity': s3_iam_principal
            },
            'CustomHeaders': {
                'Quantity': 1, 'Items': [{
                    'HeaderName': 'Access-Control-Allow-Origin',
                    'HeaderValue': '*'
                }]
            }
        }]},
        'DefaultCacheBehavior': {
            'TargetOriginId': origin_id,
            'ViewerProtocolPolicy': 'redirect-to-https',
            'ForwardedValues': {
                'QueryString': False,
                'Cookies': {'Forward': 'none'}
            },
            'MinTTL': 3600
        },
        'Comment': 'CloudFront distribution for my S3 bucket',
        'Enabled': True  # required field; the request is rejected without it
    }
)
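After updating objects in the bucket, you may want to purge CloudFront's cached copies so visitors see the new content. A minimal sketch using a wildcard invalidation (the distribution ID placeholder below would come from the create_distribution response):
# Invalidate every cached path on the distribution (use sparingly; invalidations are billed)
cf.create_invalidation(
    DistributionId='YOUR_DISTRIBUTION_ID',  # e.g. the Id returned by create_distribution
    InvalidationBatch={
        'Paths': {'Quantity': 1, 'Items': ['/*']},
        'CallerReference': str(time.time())
    }
)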
- Big Data processing:
Data stored within S3 can be queried and transformed using services such as Amazon Athena, Glue, or EMR. The following code demonstrates how to use Athena to query a dataset stored in S3:
import pandas as pd
from pyathena import connect
bucket_name = 'my-data-bucket'
query = 'SELECT * FROM my_dataset WHERE column_value > 100'
# Query data using Athena; query results are written to a staging location in S3
conn = connect(s3_staging_dir='s3://%s/athena-results/' % bucket_name,
               region_name='us-east-1')
df = pd.read_sql(query, conn)
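Before the query above will run, Athena needs a table definition that maps the S3 data to columns. Here is a hedged sketch of registering a CSV dataset as an external table through the same connection (the column names, CSV format, and data/ prefix are assumptions about your dataset):
cursor = conn.cursor()
# Register the files under s3://my-data-bucket/data/ as a queryable table
cursor.execute("""
    CREATE EXTERNAL TABLE IF NOT EXISTS my_dataset (
        id STRING,
        column_value INT
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 's3://my-data-bucket/data/'
""")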
- Machine learning:
Large datasets stored in S3 can be accessed and processed by machine learning models running on EC2 instances or SageMaker notebooks. The following code demonstrates how to read image data from an S3 bucket to train a machine learning model:
import boto3
import numpy as np
from io import BytesIO
from PIL import Image
from keras.preprocessing.image import ImageDataGenerator
s3 = boto3.client('s3')
bucket_name = 'image-bucket'
train_prefix = 'train_data/'
batch_size = 64
# Augmentation pipeline applied to the images once they are loaded from S3
gen = ImageDataGenerator(rescale=1./255,
                         shear_range=0.2,
                         zoom_range=0.2,
                         horizontal_flip=True)
def load_images_from_s3(prefix):
    """Download every image under the prefix and return it as a numpy array."""
    paginator = s3.get_paginator('list_objects_v2')
    images = []
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        for obj in page.get('Contents', []):
            body = s3.get_object(Bucket=bucket_name, Key=obj['Key'])['Body'].read()
            img = Image.open(BytesIO(body)).convert('RGB').resize((224, 224))
            images.append(np.asarray(img))
    return np.array(images)
x_train = load_images_from_s3(train_prefix)
y_train = np.zeros(len(x_train))  # placeholder labels; replace with your real labels
# Feed augmented batches of the S3-hosted images to the model
train_data = gen.flow(x_train, y_train, batch_size=batch_size)
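The iterator can then be fed straight into training. A minimal sketch with a toy binary classifier, assuming a recent tf.keras where fit() accepts the iterator directly (the architecture is illustrative, not a recommendation):
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
# Toy CNN just to show how the S3-backed iterator plugs into training
model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D(),
    Flatten(),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_data, epochs=10)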
- IoT devices:
IoT devices can store data in S3 that can then be analyzed or processed in real-time. The following code demonstrates how to upload telemetry data from an IoT device directly to S3:
import boto3
import json
import time
s3 = boto3.resource('s3')
bucket_name = 'iot-data-bucket'
device_id = 'my-device'
def handle_telemetry(telemetry_data):
    # Store each telemetry reading as a timestamped JSON object in S3
    file_name = device_id + '/' + str(time.time()) + '.json'
    data = json.dumps(telemetry_data)
    s3.Object(bucket_name, file_name).put(Body=data)
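Once the telemetry is in S3, it can be filtered in place with S3 Select instead of downloading whole objects. A hedged sketch, assuming each JSON document contains a numeric temperature field (the field name and object key are assumptions about your payload):
s3_client = boto3.client('s3')
# Run a SQL filter directly against one telemetry object
response = s3_client.select_object_content(
    Bucket=bucket_name,
    Key=device_id + '/example-reading.json',  # hypothetical object key
    ExpressionType='SQL',
    Expression="SELECT s.* FROM S3Object s WHERE s.temperature > 30",
    InputSerialization={'JSON': {'Type': 'DOCUMENT'}},
    OutputSerialization={'JSON': {}}
)
for event in response['Payload']:
    if 'Records' in event:
        print(event['Records']['Payload'].decode('utf-8'))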
- Mobile app backend:
Amazon S3 can be used to store user-generated content, such as photos or videos, that is created through a mobile app. The following code demonstrates how to upload a photo taken from a mobile app directly to an S3 bucket:
import boto3
import base64
import time
s3 = boto3.resource('s3')
bucket_name = 'image-bucket'
# image_string (the base64-encoded photo) and user_id come from the app's upload request
image_data = base64.b64decode(image_string)
file_name = 'images/' + user_id + '_' + str(time.time()) + '.jpeg'
# Save image to S3
s3.Object(bucket_name, file_name).put(Body=image_data)
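An alternative that keeps large photo payloads off your backend entirely is to hand the app a pre-signed POST so it uploads straight to S3. A minimal sketch (the one-hour expiry is arbitrary):
s3_client = boto3.client('s3')
# The app POSTs the photo to presigned['url'] along with presigned['fields']
presigned = s3_client.generate_presigned_post(
    Bucket=bucket_name,
    Key='images/' + user_id + '_' + str(time.time()) + '.jpeg',
    ExpiresIn=3600
)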
- File sharing:
Amazon S3 provides a secure way to share files across different teams or organizations. The following code demonstrates how to generate a pre-signed URL that allows a user to download a specific file from an S3 bucket:
import boto3
s3 = boto3.client('s3')
bucket_name = 'docs-bucket'
object_key = 'confidential.pdf'
# Generate a pre-signed URL that allows downloading the object for one hour
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': bucket_name,
        'Key': object_key
    },
    ExpiresIn=3600  # seconds
)
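Anyone holding the URL can then fetch the file without AWS credentials until it expires, for example:
import requests
# Download the shared file using only the pre-signed URL
response = requests.get(url)
with open('confidential.pdf', 'wb') as f:
    f.write(response.content)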