Recently I’ve been playing around with Jekyll to create some simple websites. I’ve used Jekyll in the past and I remember that the set-up was a multi-step process.
Jekyll is a Ruby application that uses several Gems and Bundler. That means installing several dependencies. In my case I don’t have a Ruby development environment already set up, so I would have to install all these packages just to use a static site generator.
Then I found the official Jekyll Docker image.
I already have Docker installed to play around with other containers, so downloading a Jekyll container and using it was as easy as:
docker run --rm --label=jekyll --volume="$(pwd):/srv/jekyll" \
-it -p 127.0.0.1:4000:4000 jekyll/jekyll jekyll serve
That’s all there is to it. This command downloads the latest Jekyll image and starts serving the site in the current directory at http://127.0.0.1:4000. No need to install Ruby, gems, Bundler, or a bunch of other dependencies.
If you really want to get into the details of Python and learn about how the language was built and how some of its internals are implemented, Fluent Python is the book for you.
It’s a great book to refresh your knowledge of coroutines, asyncio, and other Python goodies.
You’ll definitely want to read this if you’re using AWS Kinesis with Apache Spark to stream data; it’s been extremely valuable:
If you’re just getting started with Flask, or you want to learn about the innards of Django (yep, that’s right), “Flask Web Development” is the perfect place to start. This book dives right in with building a full web application, including Jinja templates, authentication, a REST API, forms, databases, security, and deployment to Heroku using Git.
In my opinion, though, Flask is best suited to small applications, and this book ends up building half of Django on top of it to get a full web application.
With that in mind, this book is great for learning about Django – how would you implement CSRF token checks? How would you set up database migrations from scratch? How would you handle forms? Django does all of that, but hides it all from developers. This book goes into full detail reimplementing a lot of what Django gives you out-of-the-box, which is great.
Overall I highly recommend “Flask Web Development” if you’re learning Flask, Django, or web back-end development in general. Don’t just use what Django gives you out of the box and ignore how it’s implemented. This book will answer questions like “Why does my Django app need a SECRET_KEY? What is this CSRF error I keep seeing? How do database migrations work? How do I write my own mail handler?”, making you a better Django developer.
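To make the SECRET_KEY and CSRF questions a bit more concrete, here is a minimal sketch of the general idea: the server signs a random value with a secret only it knows, and later checks that signature. This is not how the book, Flask, or Django actually implement it; the function names, the token layout, and the key value are all made up for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical value; a real app would load this from configuration, not source code.
SECRET_KEY = b"change-me"


def make_csrf_token(session_id: str, secret_key: bytes = SECRET_KEY) -> str:
    """Return a token of the form '<nonce>.<signature>' bound to one session."""
    nonce = secrets.token_hex(16)
    message = f"{session_id}:{nonce}".encode()
    signature = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return f"{nonce}.{signature}"


def check_csrf_token(session_id: str, token: str, secret_key: bytes = SECRET_KEY) -> bool:
    """Recompute the signature and compare it in constant time."""
    nonce, _, signature = token.partition(".")
    message = f"{session_id}:{nonce}".encode()
    expected = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature.encode(), expected.encode())


token = make_csrf_token("session-abc")
```

A form would carry the token in a hidden field, and the server would reject any POST where check_csrf_token returns False. Without the secret key, another site can't forge a token that passes the check, which is roughly why the framework insists on having one.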
Get it here: http://a.co/73ERCK9
New tiny GitHub project: https://github.com/mikestaszel/spark_cluster_vagrant
Over the past few weeks I’ve been working on benchmarking Spark as well as learning more about setting up clusters of Spark machines both locally and on cloud providers.
I decided to work on a simple Vagrantfile that spins up a Spark cluster with a head node and as many worker nodes as desired. I’ve seen a few of these, but they either used some third-party box, had an older version of Spark, or only spun up one node.
By running only one command I could have a fully configured Spark cluster ready to use and test. Vagrant also easily extends beyond simple VirtualBox machines to many providers, including AWS EC2 and DigitalOcean, and this Vagrantfile can be extended to provision clusters on those providers.
Check it out here: https://github.com/mikestaszel/spark_cluster_vagrant
I just finished reading “Hello, Startup” by Yevgeniy Brikman, a book written for programmers about starting a startup. All the basics are covered, including hiring, teamwork, startup culture, and development methodology while scaling a startup. It’s a nice quick read. I skimmed through the chapters about development, programming, databases, and other technical topics, but I found the other content to be a great place to start learning about what it takes to build a startup.
Check it out here (also available on Safari Books): http://www.hello-startup.net
I like to start my projects using Flask and Python because the combination is lightweight and gets most things done quickly.
By default, Flask doesn’t give you much in terms of test frameworks, application settings, deployment, or running the application in production. I always end up making a skeleton that does some of these things, so I decided to put together a GitHub repository with a skeleton Flask project that does it for me.
Have a look here: https://github.com/mikestaszel/flask_startup
This weekend while running a rather large Python job, I ran into a memory error. It turned out that a dictionary I was populating could potentially become too big to fit into RAM. This is where DiskDict saved me some time.
It’s definitely not the best way to solve an issue, but in this case I was working with a limited system where rewriting the surrounding code would have been intrusive. Plus, the job didn’t have time constraints, so DiskDict was a decent workaround.
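For anyone curious what this kind of workaround looks like, here is a minimal sketch of the same idea using shelve from Python's standard library as a stand-in. It is not DiskDict's API, and the word-counting job is just a made-up example; the point is only that the mapping lives in a file on disk instead of in RAM.

```python
import os
import shelve
import tempfile


def count_words(lines, db_path):
    """Count word frequencies, keeping the counts on disk instead of in RAM."""
    with shelve.open(db_path) as counts:  # behaves like a dict with str keys
        for line in lines:
            for word in line.split():
                counts[word] = counts.get(word, 0) + 1
        return dict(counts)  # copied into memory only because this demo is tiny


with tempfile.TemporaryDirectory() as tmp:
    result = count_words(["a b a", "b a"], os.path.join(tmp, "counts"))
```

Every read and write goes through the file, so this is much slower than a regular dictionary. That was an acceptable trade for a job with no time constraints.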
Wanted to share because it proved useful to me!
Hello world! I’ve moved from WordPress to Ghost to Jekyll, and now back to WordPress.
Recently I ran into a problem while working with Amazon EC2 servers. Servers without dedicated elastic IP addresses would get a different IP address every time they were started up! This proved to be a challenge when trying to SSH into the servers.
How can I have a dynamic domain name that always points to my EC2 server?
Amazon’s Route53 came to mind. Route53, however, does not have a simple way to point a subdomain directly to an EC2 instance. You can set up load balancers between Route53 and your instance, but that’s a hassle. You can also set up an elaborate private network with port forwarding – yuck.
I wanted a simple way to set a Route53 subdomain’s A record to point to an EC2 instance’s public IP address, on startup.
Enter go-route53-dyn-dns. This is a simple Go project that solves this problem. It is a small binary that reads a JSON configuration file and updates Route53 with an EC2 instance’s public IP address.
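The project itself is written in Go, but the core of the update is small enough to sketch in a few lines of Python. This is not the project's code: the config field names (hosted_zone_id, record_name, ttl) and the example values are assumptions made up for illustration. The sketch only builds the change request that tells Route53 to create or overwrite an A record.

```python
import json


def build_change_batch(config_text: str, public_ip: str) -> dict:
    """Build an UPSERT for one A record from a JSON config and an IP address."""
    config = json.loads(config_text)
    return {
        "Comment": "Point subdomain at this instance's current public IP",
        "Changes": [
            {
                "Action": "UPSERT",  # create the record, or overwrite it if present
                "ResourceRecordSet": {
                    "Name": config["record_name"],
                    "Type": "A",
                    "TTL": config.get("ttl", 60),
                    "ResourceRecords": [{"Value": public_ip}],
                },
            }
        ],
    }


# Hypothetical config; these field names are not taken from go-route53-dyn-dns.
example_config = '{"hosted_zone_id": "Z123EXAMPLE", "record_name": "dev.example.com.", "ttl": 300}'
batch = build_change_batch(example_config, "203.0.113.10")
```

With boto3, a dictionary like this could be passed as ChangeBatch to the Route53 client's change_resource_record_sets call, along with the HostedZoneId from the config. On the instance, the public IP would typically come from the EC2 metadata endpoint at http://169.254.169.254/latest/meta-data/public-ipv4. A short TTL keeps stale addresses from lingering after a restart.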
The README.md file on GitHub explains how to set everything up.
The project is here: go-route53-dyn-dns.