We can name that ‘anonymized’ person with just three data points, researchers say

Three data points were enough to pick people out of the crowd (MIT)

The next time a company tells you that it wants your information, but “Don’t worry, it’s anonymized,” don’t believe it.

A group of researchers from MIT proved this week that they can identify almost anyone using just four pieces of information in a supposedly anonymized database of credit card transactions, or three under certain conditions. The research follows similar work done with a database generated by cell phone usage, and other research showing how easy it is to “guess” a person’s Social Security number when given a couple of facts about them.

“That means that someone with copies of just three of your recent receipts — or one receipt, one Instagram photo of you having coffee with friends, and one tweet about the phone you just bought — would have a 94 percent chance of extracting your credit card records from those of a million other people,” said MIT in its announcement of the research.

It’s a problem that people who study big data sets understand well. Pick a slice of data, find matches, do some cross-referencing, and the cloak of anonymity disappears — people can be uniquely identified.

In this project, study author Yves-Alexandre de Montjoye and his colleagues examined three months of credit card data covering 1.1 million consumers. Picking two dates within those three months, they found one (and only one) person who shopped at both a particular coffee shop and a restaurant on those dates. Once they had that match, they could see everything that person purchased during the three months.
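To make the matching step concrete, here is a minimal sketch in Python. It is not the researchers' code or data: the records, the pseudonymous IDs, and the helper names `matching_users` and `full_history` are all invented for illustration, and the real study worked with far richer data on 1.1 million people.

```python
from collections import defaultdict

# Toy "anonymized" records: (pseudonymous id, date, shop). All values are invented.
transactions = [
    ("u1", "2014-01-03", "coffee shop"),
    ("u1", "2014-01-09", "restaurant"),
    ("u1", "2014-02-11", "pharmacy"),
    ("u2", "2014-01-03", "coffee shop"),
    ("u2", "2014-01-09", "bookstore"),
    ("u3", "2014-01-09", "restaurant"),
    ("u3", "2014-01-20", "coffee shop"),
]


def matching_users(records, known_points):
    """Return the pseudonymous ids whose history contains every known (date, shop) point."""
    visits = defaultdict(set)
    for user, date, shop in records:
        visits[user].add((date, shop))
    wanted = set(known_points)
    return sorted(user for user, seen in visits.items() if wanted <= seen)


def full_history(records, user):
    """Return every (date, shop) purchase recorded for one pseudonymous id."""
    return [(date, shop) for u, date, shop in records if u == user]


# What an outsider might know from, say, a receipt and a photo (hypothetical).
known = [("2014-01-03", "coffee shop"), ("2014-01-09", "restaurant")]

candidates = matching_users(transactions, known)
if len(candidates) == 1:
    # A single match means the "anonymous" id now has a name attached,
    # and the rest of that person's purchases come along with it.
    print(full_history(transactions, candidates[0]))
```

In this toy data, one known point (the coffee shop visit) still leaves two candidates, while two known points narrow it to one. That narrowing is the effect the article describes, just at a vastly smaller scale.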

“We are showing that the privacy we are told that we have isn’t real,” study co-author Alex “Sandy” Pentland told The Associated Press.

The research was published in this week’s edition of the journal Science.

Anonymized data is now used for everything from serving Internet ads to traffic planning to conducting medical research.  Lack of trust in anonymity could severely dampen the potential of big data.

“Sandy and I do really believe that this data has great potential and should be used,” de Montjoye said. “We, however, need to be aware and account for the risks of re-identification.”


About Bob Sullivan
BOB SULLIVAN is a veteran journalist and the author of four books, including the 2008 New York Times Best-Seller, Gotcha Capitalism, and the 2010 New York Times Best-Seller, Stop Getting Ripped Off! His latest, The Plateau Effect, was published in 2013, and as a paperback called Getting Unstuck in 2014. He has won the Society of Professional Journalists’ prestigious Public Service award, a Peabody award, and The Consumer Federation of America Betty Furness award, and has been given Consumer Action’s Consumer Excellence Award.
