S3 Introduction
Simple Storage Service (S3) provides developers and IT teams with secure, durable, highly scalable object storage. Amazon S3 is easy to use, with a simple web services interface to store and retrieve any amount of data from anywhere on the web.
- S3 is a safe place to store your files
- S3 is object-based storage (not block storage)
- The data is spread across multiple devices and facilities for high availability and disaster recovery
The Basics
- S3 is object-based, i.e. it allows you to upload files.
- Files can be from 0 bytes to 5 TB in size.
- There is unlimited storage.
- Files are stored in Buckets (similar to a folder).
- S3 uses a universal namespace. That is, bucket names must be unique globally. E.g. https://s3-us-west-1.amazonaws.com/acloudguru
- When you upload a file to S3, you will receive an HTTP 200 status code if the upload was successful.
- The S3 platform is built for 99.99% availability.
- Amazon guarantees 99.9% availability.
- S3 is designed for 99.999999999% durability of stored data (remember 11 x 9s). Regardless, you should still keep backups of important data.
- Tiered Storage Available
- Lifecycle Management
- Versioning
- Encryption
- Secure your data (Access)
  - Access Control Lists and Bucket Policies
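The availability and durability percentages above are easier to remember once converted into concrete numbers. The following is a minimal sketch of that arithmetic; the function names are made up for illustration and are not part of any AWS SDK.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def max_downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def expected_objects_lost_per_year(object_count: int, durability_percent: float) -> float:
    """Return the expected number of objects lost per year at a given durability."""
    return object_count * (1 - durability_percent / 100)


if __name__ == "__main__":
    # 99.99% is the "built for" figure, 99.9% the guaranteed figure.
    for pct in (99.99, 99.9):
        minutes = max_downtime_minutes_per_year(pct)
        print(f"{pct}% availability allows about {minutes:.1f} minutes of downtime per year")

    # 11 x 9s durability applied to a hypothetical 10 million stored objects.
    lost = expected_objects_lost_per_year(10_000_000, 99.999999999)
    print(f"Expected objects lost per year out of 10 million: {lost:.6f}")
```

In other words, 99.99% availability leaves room for roughly 52.6 minutes of downtime per year and 99.9% for roughly 8.76 hours, while 11 x 9s durability on 10 million objects works out to an expected loss of about one object every 10,000 years.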
Data Consistency Model for S3
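- Read-after-write consistency for PUTs of new objects (as soon as you add a file, it is available to be read).
- Eventual consistency for overwrite PUTs and DELETEs (changes can take some time to propagate). Note: since December 2020, S3 provides strong read-after-write consistency for all PUT and DELETE operations, so this distinction describes the older behaviour.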
- S3 is object-based. Objects consist of the following:
  - Key: the name of the object
  - Value: the data, which is made up of a sequence of bytes
  - Version ID: important for versioning
  - Metadata: data about the data you are storing
  - Subresources: bucket-specific configuration
    - Bucket Policies, Access Control Lists
    - Cross-Origin Resource Sharing (CORS)
    - Transfer Acceleration (accelerates transfer speeds when uploading to S3)
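To make the key / value / version ID / metadata breakdown concrete, here is a toy in-memory model. It is only a sketch: `S3Object` and `ToyBucket` are hypothetical names, the `v1`, `v2` version IDs are invented for readability (real S3 version IDs are opaque strings), and nothing here talks to AWS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class S3Object:
    """Toy stand-in for an S3 object; not the real AWS data model."""
    key: str                                   # the name of the object
    value: bytes                               # the data, a sequence of bytes
    version_id: Optional[str] = None           # identifies one version of the key
    metadata: Dict[str, str] = field(default_factory=dict)  # data about the data


class ToyBucket:
    """Hypothetical in-memory bucket that keeps every version of each key."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._versions: Dict[str, List[S3Object]] = {}

    def put(self, key: str, value: bytes,
            metadata: Optional[Dict[str, str]] = None) -> S3Object:
        """Store a new version of `key` and return it."""
        history = self._versions.setdefault(key, [])
        obj = S3Object(key, value,
                       version_id=f"v{len(history) + 1}",  # invented ID scheme
                       metadata=dict(metadata or {}))
        history.append(obj)
        return obj

    def get(self, key: str, version_id: Optional[str] = None) -> S3Object:
        """Return the latest version of `key`, or a specific one if requested."""
        history = self._versions[key]
        if version_id is None:
            return history[-1]
        for obj in history:
            if obj.version_id == version_id:
                return obj
        raise KeyError(f"{key} has no version {version_id}")


if __name__ == "__main__":
    bucket = ToyBucket("example-bucket")
    bucket.put("notes.txt", b"first draft", {"content-type": "text/plain"})
    bucket.put("notes.txt", b"second draft")
    print(bucket.get("notes.txt").value)        # latest version
    print(bucket.get("notes.txt", "v1").value)  # older version, still retrievable
```

The point of the sketch is that an overwrite does not destroy the earlier value when versioning is in play: each PUT of the same key adds a new version, and the version ID is what distinguishes them.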
S3 Storage Tiers/Classes
- S3 (Standard): 99.99% availability, 99.999999999% durability. Data is stored redundantly across multiple devices in multiple facilities, and the service is designed to sustain the loss of 2 facilities concurrently.
- S3 IA (Infrequent Access): For data that is accessed less frequently, but requires rapid access when needed. Lower fee than S3, but you are charged a retrieval fee.
- S3 One Zone IA: Same as IA, except that data is stored in a single AZ only. Still 99.999999999% durability, but only 99.5% availability (less than normal S3). Cost is 20% less than regular S3 IA.
- Reduced Redundancy Storage: Designed to provide 99.99% durability and 99.99% availability of objects over a given year. Used for data that can be recreated if lost, e.g. thumbnails. (Starting to disappear from AWS documentation, but may still feature in the exam.)
- Glacier: (Not strictly part of S3, but interacts closely with S3.) Very cheap, but used for archival only. Optimised for data that is infrequently accessed; it takes 3-5 hours to restore from Glacier.
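The descriptions above can be read as a rough decision procedure. The sketch below encodes one possible reading; the function, its parameters, and the order of the checks are assumptions made for illustration, not an official AWS recommendation. The returned strings follow the storage class identifiers used by the S3 API.

```python
def suggest_storage_class(*, frequently_accessed: bool,
                          needs_rapid_access: bool,
                          recreatable_if_lost: bool = False,
                          tolerates_az_loss: bool = False) -> str:
    """Suggest a storage class from coarse access requirements (heuristic only)."""
    if not needs_rapid_access:
        # Archival data that can wait hours for a restore.
        return "GLACIER"
    if frequently_accessed:
        # Recreatable data (e.g. thumbnails) can accept lower durability.
        return "REDUCED_REDUNDANCY" if recreatable_if_lost else "STANDARD"
    # Infrequently accessed, but needed quickly when it is requested.
    return "ONEZONE_IA" if tolerates_az_loss else "STANDARD_IA"


if __name__ == "__main__":
    print(suggest_storage_class(frequently_accessed=True, needs_rapid_access=True))
    print(suggest_storage_class(frequently_accessed=False, needs_rapid_access=True))
    print(suggest_storage_class(frequently_accessed=False, needs_rapid_access=False))
```

A real decision would also weigh retrieval fees, minimum storage durations, and object sizes, which this heuristic deliberately ignores.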
| Storage Class | Durability (Designed For) | Availability (Designed For) | Other Considerations |
|---|---|---|---|
| STANDARD | 99.999999999% | 99.99% | None |
| STANDARD INFREQUENTLY ACCESSED | 99.999999999% | 99.9% | Retrieval fee for all S3 IA objects |
| ONE-ZONE INFREQUENTLY ACCESSED | 99.999999999% | 99.5% | Not resilient to loss of the AZ |
| GLACIER | 99.999999999% | 99.99% (after you restore objects) | No real-time access; 3-5 hours to access |
| REDUCED REDUNDANCY STORAGE | 99.99% | 99.99% | None |
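For revision purposes, the table can also be held as data and queried. The sketch below copies the figures from the table and filters classes by minimum durability and availability; `StorageClass` and `classes_meeting` are hypothetical helper names.

```python
from typing import List, NamedTuple


class StorageClass(NamedTuple):
    name: str
    durability_percent: float
    availability_percent: float
    note: str


# Figures copied from the table above ("designed for" values).
STORAGE_CLASSES = [
    StorageClass("STANDARD", 99.999999999, 99.99, "None"),
    StorageClass("STANDARD INFREQUENTLY ACCESSED", 99.999999999, 99.9,
                 "Retrieval fee for all S3 IA objects"),
    StorageClass("ONE-ZONE INFREQUENTLY ACCESSED", 99.999999999, 99.5,
                 "Not resilient to loss of the AZ"),
    StorageClass("GLACIER", 99.999999999, 99.99,
                 "No real-time access; availability applies after restore"),
    StorageClass("REDUCED REDUNDANCY STORAGE", 99.99, 99.99, "None"),
]


def classes_meeting(min_durability: float, min_availability: float) -> List[str]:
    """Return the names of classes meeting both minimum percentages."""
    return [
        c.name
        for c in STORAGE_CLASSES
        if c.durability_percent >= min_durability
        and c.availability_percent >= min_availability
    ]


if __name__ == "__main__":
    # Which classes offer 11 x 9s durability and at least 99.9% availability?
    for name in classes_meeting(99.999999999, 99.9):
        print(name)
```

Note that a pure numeric filter still returns GLACIER, even though its availability figure only applies after a restore; the "Other Considerations" column matters as much as the percentages.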
S3 Charges
You are charged for:
- Storage per GB
- Requests (GET, PUT, COPY, etc.)
- Storage Management Pricing
  - Inventory, Analytics, and Object Tags
- Data Management Pricing
  - Data transferred out of S3
- Transfer Acceleration
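A bill is essentially the sum of these line items. The following sketch shows the shape of that calculation for storage, requests, data transfer out, and Transfer Acceleration; storage management features are left out for brevity. All prices are round placeholder values chosen for the example, not actual AWS rates, and the names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class HypotheticalPrices:
    """Placeholder unit prices in USD; NOT real AWS pricing."""
    storage_per_gb_month: float = 0.02
    per_1000_requests: float = 0.005
    transfer_out_per_gb: float = 0.10
    acceleration_per_gb: float = 0.04


def estimate_monthly_cost(storage_gb: float,
                          request_count: int,
                          transfer_out_gb: float,
                          accelerated_gb: float = 0.0,
                          prices: HypotheticalPrices = HypotheticalPrices()) -> Dict[str, float]:
    """Return a per-line-item cost breakdown plus the total."""
    breakdown = {
        "storage": storage_gb * prices.storage_per_gb_month,
        "requests": request_count / 1000 * prices.per_1000_requests,
        "transfer_out": transfer_out_gb * prices.transfer_out_per_gb,
        "acceleration": accelerated_gb * prices.acceleration_per_gb,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    # 100 GB stored, 10,000 requests, 10 GB transferred out in a month.
    for item, cost in estimate_monthly_cost(100, 10_000, 10).items():
        print(f"{item}: ${cost:.2f}")
```

Real rates vary by region, storage class, and request type, so the current AWS pricing page is the place to get actual numbers.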