A young man called Andrew Pole loved data, especially the patterns that would emerge. Some of his research was questionable, such as working out the optimal amount of beer that you would drink to have the confidence to chat up a woman, but not so much that you made a fool of yourself. The experiment was a failure.
Pole joined Target in 2002 as a number cruncher. Customers in its more than 1 000 stores handed over terabytes of data via their store cards, credit cards, coupons and other channels. He was in data heaven. More on him and his successful pregnancy detection later. (1)
It all started in outer space
The first documented use of the term ‘big data’ appeared in a 1997 paper by scientists at NASA, who described the problem they had with computer graphics as one that “provides an interesting challenge for computer systems…” (2)
In 2001, analyst Doug Laney described the “3Vs”—volume, variety, and velocity—as the key “data management challenges” for enterprises. The amount of data, since the advent of the computer, has exploded in all three areas.
I’m an analog person. I like things I can touch and count. However, big data has been thrust upon us, and the statistics of yesterday is the abacus of today. There are numerous definitions of the term ‘big data’, which I have summarised as follows: it is a collection of data from all sources, so plentiful and so diverse that traditional statistical techniques are as useless as my knowledge of particle physics, especially Feynman’s string theory.
Data gets big
Unlike the big bang, there is no moment in time that we can identify as the point when big data became a commercial phenomenon. Perhaps it was when store accounts could first be opened (involving paper and the post), but credit cards greatly increased what retailers knew about our spending habits. Add store cards, web behaviour, social network interactions, mailings sent, surveys filled out, calls to the customer help line, emails opened, visits to the website (with all its measures), online purchases and so on, and you get data sets that are really big, with data that can’t all be measured in the same units. Apples and pears.
Andrew Pole detects who’s up the pole
One day someone from the marketing department asked Pole if he could figure out which of Target’s customers were pregnant based on their buying patterns. The reason is that pregnant women and new parents are the holy grail of a department store like Target. They are veritable gold for retailers, because they’re so sleep deprived. If they buy nappies or formula at a Target, they’ll buy just about everything else they need. Anything just to get a few more hours’ sleep.
Target already had a baby registry, so the shopping patterns of pregnant women and new mothers could be analysed. Extrapolate those patterns to the whole customer base and you get a list of thousands of women who are likely to be pregnant. So Target did what a retailer does: it mailed them coupons for everything they would need for a new baby. This went on for about a year, and then the glitch came in the form of an angry parent, mailer in hand, demanding to see the manager.
“My daughter got this in the mail. She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”
The manager apologised profusely and, just to make sure the customer had been placated, called him the next day. This time it was the father who apologised: he had had a talk with his daughter, and she was due in August. That’s big data for you.
1. Charles Duhigg. The Power of Habit. Random House Books, 2013.
2. Gil Press. 12 Big Data Definitions: What’s Yours? Forbes, 3 September 2014. http://www.forbes.com/sites/gilpress/2014/09/03/12-big-data-definitions-whats-yours/