In the age of unprecedented tech innovation, it can be easy to forget our roots.
A few months into my first data warehouse project, I learned that most of the difficult decisions involved designing data models that would perform well when users submitted complex queries. This was no easy feat. We had to anticipate how users were going to query the data before we could design the data models. We also had to write significant amounts of code to pre-aggregate data so we could present it just as the user would expect it. Even a small change request made us tremble because we knew it could ripple through our code and data models.
Then, about 10 years ago, a new set of data visualization tools emerged that promised to ease the pain. We bought into that promise and were early adopters. Now we could just take raw data and create our own visualizations without having to wait on perfect data models. But we quickly realized the approach didn’t scale well: as soon as we reached 100 million records, we started to have major performance problems. Despite their quirks, these new tools represented a qualitative victory: they helped us envision data uses that would have seemed impossible just a few years prior.
They were not a silver bullet, but they were a step in the right direction.
MemSQL, a small startup founded by two former Facebook employees, took the next step. Recognizing the challenges faced by our data team and many others, they developed an in-memory database specifically to address performance issues. I won’t go into all the details, but MemSQL’s in-memory database performs so well that it eliminates the need to pre-aggregate data. This means users can query data almost as quickly as it is collected. It also means development teams don’t need to spend their time building perfect data models that anticipate a user’s every interaction with the warehouse. Instead, they can shift their energy to delivering better insights.
I’m sure this cocoon of smart kids in Silicon Valley will continue to push the envelope with innovative database solutions that enable us to unlock the value hidden in our data. I just wish those solutions had been around when I worked on my first data warehouse project.