If you’re doing any production-level work in AWS, you should be using AWS CloudFormation. It’s really easy to get started. Let’s walk through the basics.
Why use CloudFormation?
Here’s a common scenario: creating an EC2 instance and assigning an Elastic IP address. Let’s say it’s for a web server. Great! That’s easy. Just spin up an EC2 instance. Choose the correct image, size, security groups, VPC, subnet, keypair, and so on. Then create and assign it an Elastic IP address. No problem!
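With CloudFormation, that whole setup can be captured in a single template instead of a series of console clicks. Here's a minimal sketch — the AMI ID, keypair, and subnet are placeholders you'd replace with your own values:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Web server EC2 instance with an Elastic IP (illustrative sketch)
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-12345678      # placeholder - use your region's AMI
      InstanceType: t2.micro
      KeyName: my-keypair        # placeholder
      SubnetId: subnet-12345678  # placeholder
  WebServerEip:
    Type: AWS::EC2::EIP
    Properties:
      InstanceId: !Ref WebServer
```

Deleting the stack tears both resources down together, which is exactly the kind of lifecycle management that's painful to do by hand.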
Continue reading “AWS CloudFormation”
Another quick post – I found this in the AWS Console UI. It's useful if you ever need to share your AWS Canonical ID with someone, e.g. to share S3 buckets.
You can find your AWS Canonical ID by using various APIs – but I was also able to find it using the AWS Console UI.
Continue reading “Easily Finding your AWS S3 Canonical ID”
Quick post – I’ve been busy studying for the AWS Certified Solutions Architect – Associate exam for the past few weeks – good news, I passed it a few days ago! Shoot me a note if you ever need some solutions architected.
I primarily did this because I’ve been using AWS for years now – but so has everyone else, so the certification would be a differentiator. Studying also filled in a lot of gaps that had fallen between the cracks (for example, I learned how to give instances in a private subnet Internet access to install and update software, without giving them public IP addresses and without spending hours reading Stack Overflow posts).
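That private-subnet trick is worth sketching. The standard answer is a NAT gateway in a public subnet, plus a default route in the private subnet's route table. In CloudFormation terms it looks roughly like this (a fragment, not a full template – `PublicSubnet` and `PrivateRouteTable` are assumed to be defined elsewhere):

```yaml
# Sketch: outbound-only Internet access for a private subnet
NatEip:
  Type: AWS::EC2::EIP
  Properties:
    Domain: vpc
NatGateway:
  Type: AWS::EC2::NatGateway
  Properties:
    AllocationId: !GetAtt NatEip.AllocationId
    SubnetId: !Ref PublicSubnet          # NAT gateway lives in a public subnet
PrivateDefaultRoute:
  Type: AWS::EC2::Route
  Properties:
    RouteTableId: !Ref PrivateRouteTable # route table of the private subnet
    DestinationCidrBlock: 0.0.0.0/0
    NatGatewayId: !Ref NatGateway
```

Instances in the private subnet can then reach out for updates, but nothing on the Internet can initiate a connection to them.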
Continue reading “AWS Certified Solutions Architect”
One of the biggest, most time-consuming parts of data science is analysis and experimentation, and one of the most popular tools for doing it in a graphical, interactive environment is Jupyter.
Combining Jupyter with Apache Spark (through PySpark) merges two extremely powerful tools. AWS EMR lets you set up all of these tools with just a few clicks. In this tutorial I’ll walk through creating a cluster of machines running Spark with a Jupyter notebook sitting on top of it all.
Continue reading “Jupyter Notebooks with PySpark on AWS EMR”
You’ll definitely want to read this if you’re using AWS Kinesis with Apache Spark to stream data – it’s been extremely valuable:
Recently I ran into a problem while working with Amazon EC2 servers. Servers without dedicated Elastic IP addresses would get a different IP address every time they were started up! This proved to be a challenge when trying to SSH into the servers.
How can I have a dynamic domain name that always points to my EC2 server?
Amazon’s Route53 came to mind. Route53, however, does not have a simple way to point a subdomain directly to an EC2 instance. You can set up load balancers between Route53 and your instance, but that’s a hassle. You can also set up an elaborate private network with port forwarding – yuck.
I wanted a simple way to set a Route53 subdomain’s A record to point to an EC2 instance’s public IP address, on startup.
Enter go-route53-dyn-dns. This is a simple Go project that solves this problem. It is a small binary that reads a JSON configuration file and updates Route53 with an EC2 instance’s public IP address.
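The core of the idea is small enough to sketch. The tool itself is written in Go, but the Route53 request it needs to make is the same in any language: an UPSERT of an A record via the ChangeResourceRecordSets API. Here's a Python sketch of building that change batch (the subdomain, IP, and TTL below are made-up example values; actually sending it would go through the Route53 API, e.g. with a client library or the AWS CLI):

```python
import json

def make_change_batch(subdomain, ip, ttl=60):
    """Build a Route53 change batch that UPSERTs an A record,
    pointing `subdomain` at `ip`."""
    return {
        "Comment": "dynamic DNS update on instance startup",
        "Changes": [
            {
                "Action": "UPSERT",  # create the record, or update it if it exists
                "ResourceRecordSet": {
                    "Name": subdomain,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip}],
                },
            }
        ],
    }

# Example: point dev.example.com at the instance's current public IP.
batch = make_change_batch("dev.example.com.", "203.0.113.10")
print(json.dumps(batch, indent=2))
```

UPSERT is what makes this work as dynamic DNS: the same request is valid whether the record already exists or not, so the tool can run unconditionally on every boot.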
The README.md file on GitHub explains how to set everything up.
The project is here: go-route53-dyn-dns.