Amazon S3 Notes
Advantages
a. Compliance Capability
Amazon S3 helps customers satisfy compliance requirements for virtually every
regulatory agency around the globe.
b. Flexible Management
Flexible management features let storage administrators classify, report on, and
visualize data usage trends, which helps monitor data and reduce costs while
improving service. Together with AWS Lambda, Amazon S3 can log activities,
define alerts, and automate many other functions without any additional
infrastructure to manage.
c. Easy Data Transfer
Amazon S3 provides many ways to transfer data in. The simplest is the API, which
transfers data over the internet. AWS Direct Connect transfers data over a
dedicated private network connection instead of the public internet. AWS
Snowball provides petabyte-scale data transfer using physical appliances, and
AWS Storage Gateway is an on-premises storage gateway that sends data from the
user's premises directly to the cloud.
d. Query-in-Place Analytics
AWS S3 lets the user run big data analytics on data in place, without moving it
to a separate analytics system. Amazon Redshift Spectrum runs queries that span
both the data warehouse and S3. Amazon S3 Select retrieves just the subset of an
object's data that the user needs. Amazon Athena gives users familiar with SQL
query access to vast amounts of unstructured data.
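To make the S3 Select idea concrete, here is a hedged sketch of the request parameters such a query might use. The bucket name, key, and column names are hypothetical; with boto3 installed, a dict like this could be passed to the client's select_object_content call.

```python
# Hedged sketch of Amazon S3 Select request parameters. S3 Select returns only
# the subset of an object's data matched by a SQL expression, instead of the
# whole object. Bucket, key, and columns below are hypothetical placeholders.

select_params = {
    "Bucket": "example-bucket",       # hypothetical bucket name
    "Key": "logs/records.csv",        # hypothetical object key
    "ExpressionType": "SQL",
    # Retrieve only the matching rows, not the entire object:
    "Expression": "SELECT s.name FROM S3Object s WHERE CAST(s.age AS INT) > 20",
    "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
    "OutputSerialization": {"CSV": {}},
}

print(select_params["Expression"])
```

Because only the filtered rows come back over the network, this pattern can substantially reduce the data transferred for large objects.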
Use Cases
a. Data Archiving
AWS S3 provides storage that meets the needs of compliance archives while
organizations keep fast access to their data. To meet retention requirements,
Amazon Glacier provides write once, read many (WORM) storage. Lifecycle policies
make transitioning data between Amazon S3 storage classes simple.
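As an illustration of such a lifecycle policy, here is a minimal sketch of a lifecycle configuration, assuming a hypothetical key prefix and day counts: objects transition to STANDARD_IA after 30 days and to GLACIER after a year.

```python
# Minimal sketch of an S3 lifecycle configuration (prefix and day counts are
# hypothetical). With boto3, a dict like this could be passed to
# put_bucket_lifecycle_configuration as the LifecycleConfiguration argument.

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-old-logs",          # hypothetical rule name
            "Filter": {"Prefix": "logs/"},     # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

for t in lifecycle_configuration["Rules"][0]["Transitions"]:
    print(t["Days"], t["StorageClass"])
```

Once such a rule is attached to a bucket, S3 applies the transitions automatically; no application code runs on a schedule.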
b. Big Data Analytics
A large amount of data of many different types can be stored in an S3 bucket and
used as a data lake for big data analytics. AWS S3 provides many services that
help manage big data while reducing cost and increasing the speed of innovation.
c. Backup and Recovery
AWS S3 provides a highly durable and secure place for backing up and archiving
data. Amazon S3 and Amazon Glacier together offer four different storage
classes, which helps optimize cost and performance while meeting Recovery Point
Objective (RPO) and Recovery Time Objective (RTO) requirements.
Amazon S3 is intentionally built with a minimal feature set that focuses on simplicity and
robustness. Following are some of the advantages of the Amazon S3 service:
Note: In Amazon S3, "container" simply means a storage container that holds
objects. It is not a software container (the standard unit of software that
packages up code and all its dependencies so an application runs reliably from
one computing environment to another).
Create Buckets – Create and name a bucket that stores data. Buckets are the
fundamental container in Amazon S3 for data storage.
Store data in Buckets – Store an infinite amount of data in a bucket.
Upload as many objects as you like into an Amazon S3 bucket.
Each object can contain up to 5 TB of data. Each object is stored and retrieved using a
unique developer-assigned key.
Download data – Download your data or enable others to do so. Download your data
any time you like or allow others to do the same.
Permissions – Grant or deny access to others who want to upload data into or
download data from your Amazon S3 bucket.
Grant upload and download permissions to three types of users. Authentication
mechanisms can help keep data secure from unauthorized access.
Standard interfaces – Use standards-based REST and SOAP interfaces designed to
work with any Internet-development toolkit.
Amazon S3 Concepts
Buckets
A bucket is a container for objects stored in Amazon S3. Every object is contained in a
bucket.
Buckets serve several purposes: they organize the Amazon S3 namespace at the highest
level, they identify the account responsible for storage and data transfer charges, they
play a role in access control, and they serve as the unit of aggregation for usage reporting.
Objects
Objects are the fundamental entities stored in Amazon S3. Objects consist of object data
and metadata. The data portion is opaque to Amazon S3. The metadata is a set of name-
value pairs that describe the object. These include some default metadata, such as the date
last modified, and standard HTTP metadata, such as Content-Type. You can also specify
custom metadata at the time the object is stored.
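The name-value structure of object metadata can be sketched as plain dictionaries. The entries below are hypothetical; user-defined metadata keys carry the x-amz-meta- prefix, while system metadata such as Content-Type is standard HTTP metadata.

```python
# Illustrative sketch of object metadata as name-value pairs (all values are
# hypothetical). System metadata is set or tracked by S3 itself; user-defined
# metadata is supplied at upload time under the x-amz-meta- prefix.

system_metadata = {
    "Content-Type": "image/jpeg",
    "Content-Length": "524288",
    "Last-Modified": "Wed, 12 Oct 2022 17:50:00 GMT",
}

user_metadata = {
    "x-amz-meta-camera": "DSLR-01",       # hypothetical custom entry
    "x-amz-meta-location": "warehouse",   # hypothetical custom entry
}

# User-defined keys are distinguishable by their prefix:
custom_keys = [k for k in user_metadata if k.startswith("x-amz-meta-")]
print(custom_keys)
```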
Keys
A key is the unique identifier for an object within a bucket. Every object in a bucket has
exactly one key. Because the combination of bucket, key, and version ID uniquely
identifies each object, Amazon S3 can be thought of as a basic data map between "bucket +
key + version" and the object itself. Every object in Amazon S3 can be uniquely
addressed through the combination of the web service endpoint, bucket name, key, and
optionally, a version. For example, in the URL http://doc.s3.amazonaws.com/2006-03-
01/AmazonS3.wsdl, "doc" is the name of the bucket and "2006-03-01/AmazonS3.wsdl"
is the key.
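The "endpoint + bucket + key" addressing scheme can be illustrated with a small helper that splits a virtual-hosted-style URL, using the document's own example. It assumes the host has the form bucket.s3.amazonaws.com.

```python
# Small helper illustrating how a virtual-hosted-style S3 URL decomposes into
# endpoint, bucket, and key. Assumes a host of the form <bucket>.s3.amazonaws.com.
from urllib.parse import urlparse

def parse_s3_url(url):
    parsed = urlparse(url)
    bucket = parsed.netloc.split(".")[0]  # first host label is the bucket
    key = parsed.path.lstrip("/")         # everything after the host is the key
    return bucket, key

bucket, key = parse_s3_url("http://doc.s3.amazonaws.com/2006-03-01/AmazonS3.wsdl")
print(bucket, key)  # doc 2006-03-01/AmazonS3.wsdl
```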
Regions
You can choose the geographical region where Amazon S3 will store the buckets you
create. You might choose a region to optimize latency, minimize costs, or address
regulatory requirements. Objects stored in a region never leave the region unless you
explicitly transfer them to another region. For example, objects stored in the EU (Ireland)
region never leave it.
Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in
all regions with one caveat. The caveat is that if you make a HEAD or GET request to the key
name (to find if the object exists) before creating the object, Amazon S3 provides eventual
consistency for read-after-write.
Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions.
Updates to a single key are atomic. For example, if you PUT to an existing key, a subsequent
read might return the old data or the updated data, but it will never return corrupted or partial
data.
Amazon S3 achieves high availability by replicating data across multiple servers within
Amazon's data centers. If a PUT request is successful, your data is safely stored. However,
information about the changes must replicate across Amazon S3, which can take some time, and
so you might observe the following behaviors:
A process writes a new object to Amazon S3 and immediately lists keys within its bucket.
Until the change is fully propagated, the object might not appear in the list.
A process replaces an existing object and immediately attempts to read it. Until the
change is fully propagated, Amazon S3 might return the prior data.
A process deletes an existing object and immediately attempts to read it. Until the
deletion is fully propagated, Amazon S3 might return the deleted data.
A process deletes an existing object and immediately lists keys within its bucket. Until
the deletion is fully propagated, Amazon S3 might list the deleted object.
Amazon S3 Features
Storage Classes
Amazon S3 offers a range of storage classes designed for different use cases. These
include Amazon S3 STANDARD for general-purpose storage of frequently accessed
data, Amazon S3 STANDARD_IA for long-lived, but less frequently accessed data, and
GLACIER for long-term archive.
Bucket Policies
Bucket policies provide centralized access control to buckets and objects based on a variety of
conditions, including Amazon S3 operations, requesters, resources, and aspects of the request
(e.g., IP address). The policies are expressed in the access policy language and enable
centralized management of permissions. The permissions attached to a bucket apply to all of the
objects in that bucket.
Individuals as well as companies can use bucket policies. When companies register with Amazon
S3 they create an account. Thereafter, the company becomes synonymous with the account.
Accounts are financially responsible for the Amazon resources they (and their employees) create.
Accounts have the power to grant bucket policy permissions and assign employees permissions
based on a variety of conditions. For example, an account could create a policy that gives a user
write access:
1. To a particular S3 bucket
2. From an account's corporate network
3. During business hours
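The three-condition example above might look like the following policy document. The account ID, user name, bucket name, IP range, and time window are all hypothetical placeholders, and a fixed date window stands in for "business hours" (policy conditions compare against absolute times).

```python
# Hedged sketch of a bucket policy granting a user write access to one bucket,
# only from a corporate network and within a time window. Account ID, user,
# bucket, IP range, and times are hypothetical placeholders.
import json

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowWriteFromOfficeDuringWindow",
            "Effect": "Allow",
            # 1. a particular user writing to a particular bucket:
            "Principal": {"AWS": "arn:aws:iam::111122223333:user/example-user"},
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
            "Condition": {
                # 2. only from the (hypothetical) corporate network:
                "IpAddress": {"aws:SourceIp": "192.0.2.0/24"},
                # 3. only within a fixed window standing in for business hours:
                "DateGreaterThan": {"aws:CurrentTime": "2024-01-01T09:00:00Z"},
                "DateLessThan": {"aws:CurrentTime": "2024-01-01T17:00:00Z"},
            },
        }
    ],
}

print(json.dumps(bucket_policy)[:40])
```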
AWS Identity and Access Management
AWS Identity and Access Management (IAM) enables you to manage access to AWS
services and resources securely. Using IAM, you can create and manage AWS users and
groups, and use permissions to allow and deny their access to AWS resources.
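For contrast with a bucket policy, here is a sketch of an IAM identity-based policy: because it attaches to a user or group, it names no Principal element. The bucket name is hypothetical.

```python
# Sketch of an IAM identity-based policy allowing read and list access to a
# hypothetical bucket. Unlike a bucket policy, it has no Principal element,
# because the identity it is attached to is the implicit principal.
import json

iam_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",    # for ListBucket
                "arn:aws:s3:::example-bucket/*",  # for GetObject
            ],
        }
    ],
}

print("Principal" in iam_policy["Statement"][0])  # False
```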
Access control lists (ACLs) are one of the resource-based access policy options that you
can use to manage access to your buckets and objects. You can use ACLs to grant basic
read/write permissions to other AWS accounts. There are limits to managing permissions
using ACLs. For example, you can grant permissions only to other AWS accounts; you
cannot grant permissions to users in your account. You cannot grant conditional
permissions, nor can you explicitly deny permissions. ACLs are suitable for specific
scenarios. For example, if a bucket owner allows other AWS accounts to upload objects,
permissions to those objects can be managed only via the object ACL, by the AWS account
that owns the object.
Versioning
Use versioning to keep multiple versions of an object in one bucket. For example, you
could store my-image.jpg (version 111111) and my-image.jpg (version 222222) in a
single bucket. Versioning protects you from the consequences of unintended overwrites
and deletions. You can also use versioning to archive objects so you have access to
previous versions.
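The my-image.jpg example can be illustrated with a toy in-memory model (not an AWS API): each write to the same key appends a new version instead of overwriting the old one, so earlier versions remain retrievable.

```python
# Toy in-memory model of versioning (not the real S3 API): every PUT to a key
# appends a new (version_id, data) pair, so overwrites never destroy old data.

versions = {}  # key -> list of (version_id, data), oldest first

def put(key, version_id, data):
    versions.setdefault(key, []).append((version_id, data))

def get(key, version_id=None):
    history = versions[key]
    if version_id is None:
        return history[-1][1]          # no version given: latest wins
    return dict(history)[version_id]   # otherwise fetch that exact version

put("my-image.jpg", "111111", b"first upload")
put("my-image.jpg", "222222", b"accidental overwrite")

print(get("my-image.jpg"))             # latest: b"accidental overwrite"
print(get("my-image.jpg", "111111"))   # recovered: b"first upload"
```

This is exactly the protection described above: the unintended overwrite becomes the current version, but version 111111 is still there to restore from.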
Operations
1. Create a Bucket – Create and name your own bucket in which to store your objects.
2. Write an Object – Store data by creating or overwriting an object. When you write an
object, you specify a unique key in the namespace of your bucket. This is also a good
time to specify any access control you want on the object.
3. Read an Object – Read data back. You can download the data via HTTP or BitTorrent.
4. Delete an Object – Delete some of your data.
5. List Keys – List the keys contained in one of your buckets. You can filter the key list
based on a prefix.
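The five operations above can be walked through with a toy in-memory bucket (again, a local illustration, not the real S3 API); the key names are hypothetical.

```python
# Toy in-memory bucket (not the real S3 API) walking through the five basic
# operations: create a bucket, write, read, delete, and list keys by prefix.

class ToyBucket:
    def __init__(self, name):
        self.name = name          # 1. create and name the bucket
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data  # 2. write: creates or overwrites an object

    def get(self, key):
        return self.objects[key]  # 3. read the data back

    def delete(self, key):
        del self.objects[key]     # 4. delete some of your data

    def list_keys(self, prefix=""):
        # 5. list keys, optionally filtered by a prefix
        return sorted(k for k in self.objects if k.startswith(prefix))

b = ToyBucket("example-bucket")
b.put("photos/cat.jpg", b"...")
b.put("photos/dog.jpg", b"...")
b.put("docs/readme.txt", b"...")
b.delete("photos/dog.jpg")
print(b.list_keys("photos/"))  # ['photos/cat.jpg']
```

Prefix filtering is how S3 key listings simulate folders: keys like photos/cat.jpg share a flat namespace, and the prefix merely narrows the listing.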