When Localhost Isn't Enough

Alex Ioannides on Data Science

Although I am interested in many things, this blog is focused on the discipline of ‘data science’. Given that this is a nebulous and over-used catch-all phrase, I’ll be more specific: this is a blog about everything involved in turning raw data into information that one could ‘do something’ with. As I see it, this covers the methods and tools used for:

  • data storage,
  • data extraction and transformation (ETL),
  • data exploration,
  • data modeling,
  • serving up results.

I am particularly interested in R, Scala, Spark, the Elasticsearch stack and AWS, all from an OS X user’s frame of reference. These are my day-to-day tools along with pencil and (squared) paper.

And as the title of this blog suggests, it’s about getting things off my laptop and into a production environment that can scale – when localhost is not enough, you’re gonna need a bigger boat.