A modern Spark stack most likely runs its jobs on a Kubernetes cluster, especially under heavy usage. Workloads are moving away from EMR on EC2 to either EMR on EKS or open-source Spark on EKS.
When you’re running Spark on EKS, you probably want to scale your Kubernetes nodes up and down as you need them. You might only need to run a few jobs per day, or you might need to run hundreds of jobs, each with different resource requirements.
This post originally started as a post about Terraform, but I decided to break that out into a separate post.
It turned out that I had a wishlist for improvements I’d like to see in CloudFormation.
I’ve been using CloudFormation for years and have been pushing teams I work with to do the same - with mixed results.
Problems with CloudFormation
Writing Templates
CloudFormation templates are tedious to write.
CloudFormation works from JSON or YAML template files that define “stacks”.
If you’re doing any production-level work in AWS, you should be using AWS CloudFormation. It’s really easy to get started. Let’s walk through the basics.
Why use CloudFormation?
Here’s a common scenario: creating an EC2 instance and assigning an Elastic IP address. Let’s say it’s for a web server. Great! That’s easy. Just spin up an EC2 instance. Choose the correct image, size, security groups, VPC, subnet, keypair, and so on.
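To make the web server scenario concrete, here is a minimal sketch of what the same setup could look like as a CloudFormation template, built as a Python dictionary and serialized to JSON. The function name `build_web_server_template`, the logical IDs `WebServer` and `WebServerEIP`, and the AMI ID in the usage note are all placeholders I made up for illustration. A real template would also set the security groups, subnet, and keypair mentioned above.

```python
import json


def build_web_server_template(image_id, instance_type="t3.micro"):
    """Return a CloudFormation template (as a dict) describing one
    EC2 instance with an Elastic IP attached to it.

    The logical IDs "WebServer" and "WebServerEIP" are illustrative
    names, not anything CloudFormation requires.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Web server with an Elastic IP (illustrative sketch)",
        "Resources": {
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                },
            },
            "WebServerEIP": {
                "Type": "AWS::EC2::EIP",
                "Properties": {
                    "Domain": "vpc",
                    # Ref ties the address to the instance defined above.
                    "InstanceId": {"Ref": "WebServer"},
                },
            },
        },
        "Outputs": {
            "PublicIp": {
                "Value": {"Fn::GetAtt": ["WebServerEIP", "PublicIp"]},
            },
        },
    }


def render_template(template):
    """Serialize the template dict to the JSON text CloudFormation accepts."""
    return json.dumps(template, indent=2)
```

Calling `render_template(build_web_server_template("ami-12345678"))` (with a real AMI ID in place of the placeholder) produces JSON text you could save to a file and use to create a stack.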
Another quick post: I found this in the AWS Console UI, and it’s handy if you ever need to share your AWS Canonical ID with someone, e.g. to share S3 buckets.
You can find your AWS Canonical ID by using various APIs — but I was also able to find it using the AWS Console UI.
Open the S3 console, select a bucket you own, and look at the Access Control List in the Permissions tab to see the Canonical ID.
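For the API route, here is a rough sketch. It relies on the S3 ListBuckets response carrying an `Owner` entry whose `ID` field is the canonical user ID. The helper name `get_canonical_id` is my own, and the function takes the client as an argument so it can be tried without real credentials.

```python
def get_canonical_id(s3_client):
    """Return the canonical user ID for the account behind s3_client.

    s3_client is expected to behave like boto3.client("s3"): its
    list_buckets() response includes an "Owner" dict with an "ID" key.
    """
    response = s3_client.list_buckets()
    return response["Owner"]["ID"]


# Usage with boto3 (needs AWS credentials configured):
#   import boto3
#   print(get_canonical_id(boto3.client("s3")))
```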
Recently I ran into a problem while working with Amazon EC2 servers. Servers without dedicated Elastic IP addresses would get a different IP address every time they were started up! This proved to be a challenge when trying to SSH into the servers.
How can I have a dynamic domain name that always points to my EC2 server?
Amazon’s Route53 came to mind. Route53, however, does not have a simple way to point a subdomain directly to an EC2 instance.
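One possible workaround, sketched here under my own assumptions and not necessarily the approach this post ends up with, is to have the server update its own A record whenever it boots. The function below only builds the change batch that the Route53 ChangeResourceRecordSets API expects. The name `build_upsert_batch`, the 60-second TTL, and the hosted zone ID in the usage comment are illustrative.

```python
import ipaddress


def build_upsert_batch(record_name, ip_address, ttl=60):
    """Build a Route53 change batch that creates or updates an A record.

    Raises ValueError if ip_address is not a valid IPv4 address.
    """
    # Validate early so a bad value never reaches the API call.
    ipaddress.IPv4Address(ip_address)
    return {
        "Comment": "Point the subdomain at the instance's current public IP",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip_address}],
                },
            }
        ],
    }


# Usage with boto3 (needs credentials; the zone ID is a placeholder):
#   import boto3
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="ZEXAMPLE",
#       ChangeBatch=build_upsert_batch("web.example.com.", "203.0.113.10"),
#   )
```

A short TTL keeps stale addresses from lingering in DNS caches after a restart, at the cost of more lookups.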