AWS CloudFront Hands-On as a CDN (Content Delivery Network)
A CDN is a system of distributed servers (a network) that delivers web pages and other web content to a user based on the user's geographic location, the origin of the web page, and the content delivery server.
What is CloudFront
Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds, all within a developer-friendly environment.
CloudFront is integrated with AWS — both with physical locations that are directly connected to the AWS global infrastructure, as well as other AWS services.
CloudFront works seamlessly with services including AWS Shield for DDoS mitigation, Amazon S3, ELB or Amazon EC2 as origins for your applications, and Lambda@Edge to run custom code closer to customers’ users and to customize the user experience. Lastly, if you use AWS origins such as Amazon S3, Amazon EC2 or Elastic Load Balancing, you don’t pay for any data transferred between these services and CloudFront.
S3 vs CloudFront
Amazon S3 is designed for large-capacity, low-cost file storage in one specific geographical region. The storage and bandwidth costs are quite low.
Amazon CloudFront is a Content Delivery Network (CDN) which proxies and caches web data at edge locations as close to users as possible.
When end users request an object using the distribution's domain name, they are automatically routed to the nearest edge location for high-performance delivery of your content.
The data served by CloudFront may or may not come from S3. Because CloudFront is optimized for delivery speed, its bandwidth costs are a little higher.
CloudFront has two distribution types:
- Web Distribution: Typically used for websites.
- RTMP Distribution: Used for media streaming, such as video. (AWS has since discontinued RTMP distributions.)
Terminology used for CloudFront
Edge Location: This is the location where content is cached. Always remember that it is separate from an AWS Region/AZ. Edge locations are not read-only: you can also write to them, which means you can put objects in them.
Origin: This is the origin of all the files that the CDN will distribute. It can be an S3 bucket, an EC2 instance, an ELB, or any other HTTP server reachable by a domain name (for example, one managed in Route 53).
Distribution: This is the name given to the CDN, which consists of a collection of edge locations.
How a CDN works
A CDN is used for caching. Caching helps apps perform dramatically faster and cost less at scale. A cache is a high-speed data storage layer that stores a subset of data, generally in fast-access hardware such as RAM.
For example, if content hosted in the Mumbai region is requested from Frankfurt, the first request is served from Mumbai and the static content is then cached at an edge location near Frankfurt. The next request is served from that edge location and not from Mumbai, which gives a faster experience.
Caching means taking a copy of static content and storing it on a nearby server for fast access, so the second and later requests are served quickly. Objects are cached for the life of the TTL (Time to Live). You can clear cached objects before the TTL expires, but you may be charged for it.
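The behaviour described above can be sketched as a toy model of a single edge location. This is only an illustration of TTL-based caching, not how CloudFront is implemented; the `EdgeCache` class, its method names, and the injected `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge location (hypothetical, for illustration only)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin   # stands in for the request to the origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}  # key -> (expiry, body)
        self.origin_fetches = 0           # how many times we went back to the origin

    def get(self, key: str) -> bytes:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]               # cache hit: served from the edge
        body = self._fetch(key)           # cache miss or expired: go to the origin
        self.origin_fetches += 1
        self._store[key] = (now + self._ttl, body)
        return body

    def invalidate(self, key: str) -> None:
        """Drop a cached object before its TTL expires."""
        self._store.pop(key, None)
```

With a 60-second TTL, two requests for the same key inside that window reach the origin only once; a request after the TTL expires, or after `invalidate`, goes back to the origin.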
A Few Important Points about CloudFront
- 136 points of presence (edge locations) globally at the time of writing.
- CloudFront can help protect against network attacks.
- CloudFront can provide SSL encryption (HTTPS) at the edge using ACM.
- CloudFront can use SSL encryption (HTTPS) to talk to your applications.
Hands-On: Let’s get our hands dirty and play in console ;)
- We’ll create an S3 bucket.
- We’ll create a CloudFront distribution.
- We’ll create an Origin Access Identity and limit the S3 bucket to be accessed only using this identity.
Create an S3 bucket and upload some files.
Now let’s create a CloudFront distribution. Go to the console, click Create Distribution, and choose the Web type.
Edit the following options: set Origin Domain Name to the S3 bucket, set Restrict Bucket Access to “Yes”, and create a new identity under Origin Access Identity. Set the viewer protocol policy to redirect HTTP to HTTPS, and allow the GET and HEAD methods.
I’ll keep the rest of the options the same and click Create Distribution. The distribution can take a while to deploy.
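For readers who prefer code to console clicks, here is a rough sketch of the same choices expressed as a configuration dictionary. It is an assumption-laden illustration: the field names follow the CloudFront API's `DistributionConfig` shape as I understand it, the function name, origin ID, and comment text are made up, and the result should be checked against the current API reference before being passed to an SDK call.

```python
def build_distribution_config(bucket_domain: str, oai_id: str,
                              caller_reference: str) -> dict:
    """Build a DistributionConfig-style dict mirroring the console choices.

    bucket_domain, oai_id and caller_reference are placeholders supplied by
    the caller; nothing here talks to AWS.
    """
    origin_id = "s3-origin"  # hypothetical label linking origin and behavior
    methods = {"Quantity": 2, "Items": ["GET", "HEAD"]}
    return {
        "CallerReference": caller_reference,  # any unique string per request
        "Comment": "Hands-on distribution",   # hypothetical comment
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": origin_id,
                "DomainName": bucket_domain,
                # Restrict Bucket Access = Yes, using the Origin Access Identity
                "S3OriginConfig": {
                    "OriginAccessIdentity":
                        f"origin-access-identity/cloudfront/{oai_id}",
                },
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": origin_id,
            "ViewerProtocolPolicy": "redirect-to-https",  # HTTP to HTTPS
            "AllowedMethods": {**methods, "CachedMethods": dict(methods)},
            # Legacy cache settings; newer setups may use a cache policy instead.
            "ForwardedValues": {"QueryString": False,
                                "Cookies": {"Forward": "none"}},
            "MinTTL": 0,
        },
    }
```

The dictionary is only built here, not submitted; with boto3 it would be the kind of value passed as `DistributionConfig` to the CloudFront client's `create_distribution` call.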
Now if we look under Origin Access Identity, we can see that it has created an identity for us: “E8*******6”.
Now if we go back to our S3 bucket and look at the bucket policy, we can see that it has been updated with the access identity above.
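The updated bucket policy has roughly the following shape. This is a sketch, not a copy of the policy from this walkthrough: the function name, the `Sid`, and the example bucket name and identity ID are placeholders, and the console may generate slightly different values (for example a different `Version` or `Sid`).

```python
import json


def build_oai_bucket_policy(bucket_name: str, oai_id: str) -> dict:
    """Bucket policy that lets one Origin Access Identity read every object."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCloudFrontOAIRead",  # hypothetical statement ID
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/"
                       f"CloudFront Origin Access Identity {oai_id}",
            },
            "Action": "s3:GetObject",                      # read-only access
            "Resource": f"arn:aws:s3:::{bucket_name}/*",   # every object in the bucket
        }],
    }


# Placeholder names; substitute your own bucket and identity ID.
print(json.dumps(build_oai_bucket_policy("my-example-bucket", "E8EXAMPLE"), indent=2))
```

Because the only principal allowed is the Origin Access Identity, direct S3 requests stay denied while requests that come through the distribution succeed.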
Our objects in the S3 bucket are private. If we test with an S3 link, we get an access denied error.
But our bucket policy grants access to CloudFront, so our objects are accessible only through CloudFront. Now we go to the CloudFront console, where we can see that our distribution has been deployed.
Now if we open the distribution's domain name followed by the name of an uploaded object, the object is served. Here is my CloudFront domain name/object name:
https://d28mj5j4g5mpzp.cloudfront.net/Github-Microsoft-BIZ-FINAL.jpg
So our S3 objects are accessible through the distribution network.
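The URL above is simply the distribution's domain name followed by the object key. A small helper, hypothetical and shown only to make that pattern explicit, could build it like this:

```python
from urllib.parse import quote


def cloudfront_object_url(distribution_domain: str, object_key: str) -> str:
    """Join a distribution domain and an S3 object key into an HTTPS URL."""
    # Percent-encode characters such as spaces, but keep "/" so that
    # keys with folder-like prefixes still map to the same path.
    path = quote(object_key.lstrip("/"), safe="/")
    return f"https://{distribution_domain}/{path}"


print(cloudfront_object_url("d28mj5j4g5mpzp.cloudfront.net",
                            "Github-Microsoft-BIZ-FINAL.jpg"))
```

For the domain and object used in this article, this prints the same URL as shown above.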
Before I close this article, let’s look into one important concept.
CloudFront Signed URL / Signed Cookies
Suppose you want to distribute paid content to premium users around the world, and the content lives in S3. If S3 can only be accessed through CloudFront, we cannot use S3 pre-signed URLs.
In this case, we can use a CloudFront signed URL. We attach a policy that includes:
- The URL expiration.
- The IP ranges allowed to access the data.
- Trusted signers (which AWS accounts can create signed URLs).
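To make the first two items concrete, here is a sketch of a CloudFront custom policy document carrying an expiration time and an optional IP range. The function and parameter names are hypothetical, and the example URL and CIDR range are placeholders. Trusted signers do not appear in the policy document itself; as far as I know they are configured on the distribution.

```python
import json
from typing import Optional


def build_custom_policy(resource_url: str, expires_epoch: int,
                        source_ip_cidr: Optional[str] = None) -> str:
    """Return a custom policy as compact JSON (no extra whitespace)."""
    # The URL stops working once this Unix timestamp has passed.
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    if source_ip_cidr is not None:
        # Only requests from this IP range are allowed.
        condition["IpAddress"] = {"AWS:SourceIp": source_ip_cidr}
    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}
    return json.dumps(policy, separators=(",", ":"))


# Placeholder URL, expiry and CIDR range.
print(build_custom_policy("https://d111111abcdef8.cloudfront.net/movie.mp4",
                          1700000000, "203.0.113.0/24"))
```

The resulting string is what gets signed and attached to the URL when a custom policy is used.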
A CloudFront signed URL is typically created in code using the AWS SDK, so you have to write an application that verifies users and generates these URLs.
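As a rough sketch of what such an application does, the function below assembles a signed URL with a canned (expiry-only) policy. The RSA signing step is injected as a callable so the example stays standard-library only; `sign_url_canned`, `rsa_sha1_sign`, and the key pair ID are hypothetical names, and in real code an SDK helper such as botocore's `CloudFrontSigner` would normally do this work.

```python
import base64
from typing import Callable


def _cloudfront_b64(data: bytes) -> str:
    """Base64-encode, then swap the characters that are unsafe in URLs."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def sign_url_canned(url: str, expires_epoch: int, key_pair_id: str,
                    rsa_sha1_sign: Callable[[bytes], bytes]) -> str:
    """Build a signed URL using a canned policy.

    rsa_sha1_sign is assumed to sign a message with the private key that
    matches key_pair_id and return the raw signature bytes.
    """
    # The canned policy must be compact JSON with no extra whitespace.
    policy = ('{"Statement":[{"Resource":"%s","Condition":'
              '{"DateLessThan":{"AWS:EpochTime":%d}}}]}' % (url, expires_epoch))
    signature = _cloudfront_b64(rsa_sha1_sign(policy.encode("utf-8")))
    separator = "&" if "?" in url else "?"
    return (f"{url}{separator}Expires={expires_epoch}"
            f"&Signature={signature}&Key-Pair-Id={key_pair_id}")
```

The application would call this only after it has verified that the user is allowed to see the object, and would pick `expires_epoch` according to how long the link should stay valid.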
How long should the URL be valid for?
- Shared content (movie, music): make it short (a few minutes).
- Private content (private to the user): you can make it last for years.
Hit clap if you like it :)