S3 Performance Optimization
S3 is designed to support very high request rates. However, if your S3 buckets routinely receive more than 100 PUT/LIST/DELETE or more than 300 GET requests per second, there are some best-practice guidelines that will help optimize S3 performance.
The guidance is based on the type of workload you are running:
- GET-Intensive Workload: Use the CloudFront CDN to get the best performance. CloudFront will cache your most frequently accessed objects and will reduce latency for your GET requests.
- Mixed Request Type Workloads (DEPRECATED GUIDANCE): a mix of GET, PUT, DELETE, and GET Bucket (LIST) requests. The key names you use for your objects can impact performance for intensive workloads.
- The use of sequential key names, e.g. names prefixed with a timestamp or an alphabetical sequence, increases the likelihood of having multiple objects stored on the same partition. For heavy workloads this can cause I/O issues and contention.
- By using a random prefix to key names, you can force S3 to distribute your keys across multiple partitions, distributing the I/O workload.
- Suboptimal Performance Examples:
- mybucket/2018-03-04-15-00-00/cust1234234/photo1.jpg
- mybucket/2018-03-04-15-00-00/cust3857422/photo2.jpg
- mybucket/2018-03-04-15-00-00/cust1248473/photo2.jpg
- Optimal Performance Examples:
- mybucket/7eh4-2018-03-04-15-00-00/cust1234234/photo1.jpg
- mybucket/h35d-2018-03-04-15-00-00/cust3857422/photo2.jpg
- mybucket/o3n6-2018-03-04-15-00-00/cust1248473/photo2.jpg
- Using this pattern forces S3 to spread out the I/O workload thus reducing the chance of I/O contention.
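The deprecated randomization guidance above can be sketched as a small helper that derives a short hash prefix from each key, mirroring the 4-character prefixes in the examples. This is a minimal illustration; the function name and prefix length are not from any AWS tooling:

```python
import hashlib

def randomized_key(original_key: str, prefix_len: int = 4) -> str:
    """Prepend a short, hash-derived prefix to an S3 key so that keys
    spread across partitions (the now-deprecated guidance)."""
    # A hash of the key gives a stable but effectively random prefix.
    digest = hashlib.md5(original_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{original_key}"

key = "2018-03-04-15-00-00/cust1234234/photo1.jpg"
print(randomized_key(key))  # e.g. "a1b2-2018-03-04-15-00-00/cust1234234/photo1.jpg"
```

Because the prefix is derived from the key itself, the same object always maps to the same randomized name, which keeps lookups deterministic.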
July Performance Update
In July 2018, Amazon announced a massive increase in S3 performance:
- 3,500 PUT/COPY/POST/DELETE requests per second per prefix
- 5,500 GET/HEAD requests per second per prefix
This increased performance negates the previous guidance to randomize your object key names. Logical and sequential naming patterns can now be used without any performance implications.