Digital Media A-Z: Big Data
2.76 million people ride the TTC daily. 38% of customers abandoned online shopping carts when delivery estimates exceeded 7 days. Facebook posts that include 'should', 'would', and 'who' get the most comments; posts that include 'why' and 'how' get the fewest. 3.2 million vinyl records were sold in 2012. Every two days we create as much data as we did from the dawn of civilization until 2003.
We are living in a sea of data. Whether we're swimming or drowning is a matter of opinion.
In the past decade we've experienced a precipitous fall in the cost of digital storage technology, as well as the rise of simple tools that allow us to take ever more, and ever more accurate, measurements. With tools ranging from elaborate research and surveillance instruments to hugely popular personal quantification bracelets like the Fitbit Flex and Jawbone Up, we are generating enormous amounts of information about the world around us, and about ourselves. We could say that Big Data is a consequence of these new, affordable technologies.
To paraphrase James Gleick from his book The Information: A History, A Theory, A Flood, the great revolution in computers occurred when we went from using numbers to count, to using numbers to do things.
At our current stage of Big Data, we still seem to be counting. I think the great hope is that we'll eventually be able to do something with all our data, and we do see glimmers of that hope being realized, whether that's maximizing the use of hard-to-find urban parking spaces, sleeping the perfect amount to avoid interfering with precious REM cycles, or even reducing ambulance response times in New York City.
But of course, data isn't new. It's only Big Data that's the new phenomenon. And as Hilary Mason, former Chief Scientist at bit.ly, explains, "Big Data is data you can't analyze on one computer." Her definition may seem overly simple, but there's something interesting about it: we use computers to measure and calculate data – literally, to compute. Data is 'Big' if it can't be computed on one computer, yet we couldn't have Big Data if it weren't for our computers.
I hear the sound of one hand clapping.
The problem, in the end, is one of finding meaning in the madness, which, in one way or another, is the existential dilemma. Cited frequently in discussions of Big Data is the short story The Library of Babel by Jorge Luis Borges. It's a tale of a seemingly infinite library filled with books. "Each book is of four hundred and ten pages; each page, of forty lines, each line, of some eighty letters which are black in colour." From that point forward, it's madness. There are countless books of complete gibberish, and there are books that describe the order behind a certain book of gibberish, and millions that do the same thing with only a few misspelled words. There are books that condemn this very blog post as libelous, and books that accuse those very books of slander. Yet on some shelf amidst all the turmoil, there sits the master plan. The book that describes the order of the library. The book of all books.
This tale has been cited so frequently because of how well it describes our dance with Big Data. As time goes by, the data will keep getting bigger and more overwhelming. We'll develop super-duper quantum computers that will compute all the world's data in a nanosecond, and all that speed will really do is hasten the moment when a single computer can no longer compute our new data on its own. We'll be right back where we began thousands of years ago, staring at the stars, tightrope walkers teetering between nihilism and rapture.
All the while, at least some of us will kindle the hope that somewhere in the air-conditioned server farms of the world, we've collected just the right data from just the right place. It's the key to open... well, what does it open?
The more clearly we can answer that question, the sooner we can start channeling our flood into digital dams and reservoirs with which we can really do something amazing.
CODE: Hilary Mason
On February 28th, 2014, Hilary Mason – former Chief Scientist at bit.ly – spoke at the University of Waterloo Stratford Campus to kick off the Canadian Open Data Experience. Beyond her keynote address, she was kind enough to give a private interview, which can be heard below. The video above was created by combining clips from both her keynote address and her interview with original music and a stunning array of visuals that aim to explore the phenomenon of Big Data.
CODE: Hilary Mason Interview