Google recently launched a preview of Colaboratory, a new service that lets you edit and run IPython notebooks right from Google Drive — free! It’s similar to Databricks — give that a try if you’re looking for a better-supported way to run Spark in the cloud, launch clusters, and much more.
Google has published some tutorials showing how to use Tensorflow and various other Google APIs and tools on Colaboratory, but I wanted to try installing Apache Spark.
This weekend while running a rather large Python job, I ran into a memory error. It turned out that a dictionary I was populating could potentially become too big to fit into RAM. This is where DiskDict saved me some time.
It’s definitely not the best way to solve an issue, but in this case I was working with a limited system where rewriting the surrounding code would have been intrusive.