Hate fake reviews, followers? This algorithm spots cheaters

Hunting for fakes

Anyone who’s read oddly positive online reviews or happened on Twitter users with suspiciously large followings knows that the Internet is full of fakes. Bogus followers or fake reviews aren’t just annoying — they call into question the very notion of crowdsourcing, one of the Internet’s most valuable tools.

Researchers at Carnegie Mellon University are hard at work developing automated tools to separate fake from real, attempting to restore the credibility of the wisdom of crowds. They call their new system Fraudar, a play on the word radar, and it works by first identifying users who are almost certainly real and setting them aside. The Fraudar algorithm is open source, meaning it’s free for any company to use.

(This story first appeared on Credit.com. Read it there.)

Those who create fake reviews and followers have long been engaged in a cat-and-mouse game with sites that facilitate sharing. Simple techniques like buying Twitter followers still work, but they are fairly easily uncovered. You’ve probably seen Twitter users with huge numbers of both followers and accounts followed, for example, which is a sign that the user simply engages in follow-for-follow schemes.

Taking that scheme to the next level, criminals create two layers of fake accounts: “fraud” and “accomplice.” Accomplices are designed to act more like real users, and work to connect with actual users. These accomplices then interact with “fraud” accounts (accounts used to sell 1,000 Twitter followers, for example, or to commit actual scams on online auction sites), lending them an air of legitimacy. These two-sided arrangements form what researchers call a bipartite core.

Plotted on a graph, the lines between various users trying to game the system on both sides form a dense center, hence the name, said Christos Faloutsos, professor of machine learning and computer science at Carnegie Mellon.

“It creates a very strange constellation [wherein] 1,000 people agree to admire the same two or three people,” he said. “This is a red flag.”
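To make that “constellation” concrete, here is a minimal Python sketch that measures how tightly one group of followers is linked to one group of accounts. The account names, the toy data and the block_density helper are all invented for illustration; none of it comes from the Carnegie Mellon code.

```python
from itertools import product

def block_density(edges, followers, targets):
    """Return the fraction of possible follower -> target links that exist."""
    edge_set = set(edges)
    possible = len(followers) * len(targets)
    if possible == 0:
        return 0.0
    present = sum((f, t) in edge_set for f, t in product(followers, targets))
    return present / possible

# Toy data, invented for illustration only.
fake_followers = [f"bot{i}" for i in range(5)]
boosted = ["seller_a", "seller_b"]
organic_followers = [f"user{i}" for i in range(5)]
organic_targets = [f"page{i}" for i in range(5)]

# Every bot follows both boosted accounts: the suspicious block.
edges = [(f, t) for f in fake_followers for t in boosted]
# Each ordinary user follows a single page of their own choosing.
edges += [(f"user{i}", f"page{i}") for i in range(5)]

core_density = block_density(edges, fake_followers, boosted)
organic_density = block_density(edges, organic_followers, organic_targets)
print(core_density, organic_density)
```

In this toy data every possible link between the bots and the two boosted accounts exists, so that block has a density of 1.0, while the ordinary users share only 5 of 25 possible links, a density of 0.2. Real accounts rarely agree that completely.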

To avoid detection, fake account managers further try to camouflage themselves by mixing in authentic users. That’s where you come in. If you’ve ever been proud that some seemingly Internet famous person asked to follow you, you were probably being used. Fake accounts also follow real famous folks, like Lady Gaga or President Barack Obama, to add additional camouflage.

“They do this to make their accounts look normal,” Faloutsos said. Such camouflage works more often than not, he added, because current methods to detect fakes are not “adversarially robust.”

Detecting Normal Activity

Fraudar works because, ironically, it’s good at spotting normal activity. It picks out likely real users, then separates them from the clusters it spots. When those users are removed, the bipartite core can easily be revealed.

“The algorithm begins by finding accounts that it can confidently identify as legitimate — accounts that may follow a few random people, those that post only an occasional review and those that otherwise have normal behaviors. This pruning occurs repeatedly and rapidly,” Carnegie Mellon said. “As these legitimate accounts are eliminated, so is the camouflage the fraudsters rely upon. This makes bipartite cores easier to spot.”
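The repeated pruning Carnegie Mellon describes resembles a general technique sometimes called greedy peeling: keep removing the least-connected account and remember the densest group seen along the way. The Python sketch below is a simplified illustration under that assumption, not the released Fraudar code. It scores a group by plain link count divided by group size, which may differ from how Fraudar measures suspiciousness, and the densest_block function and all account names are hypothetical.

```python
from collections import defaultdict

def densest_block(edges):
    """Greedy peeling: drop the least-connected node again and again,
    keeping the group with the highest links-per-node score."""
    neighbors = defaultdict(set)
    for follower, target in edges:
        neighbors[("follower", follower)].add(("target", target))
        neighbors[("target", target)].add(("follower", follower))

    remaining = set(neighbors)
    edge_count = len(set(edges))  # count each distinct link once
    best_score, best_nodes = 0.0, set()

    while remaining:
        score = edge_count / len(remaining)
        if score > best_score:
            best_score, best_nodes = score, set(remaining)
        # Least-connected node; ties are broken by name so runs are repeatable.
        # A linear scan keeps the sketch short; large graphs need a priority queue.
        victim = min(remaining, key=lambda n: (len(neighbors[n]), n))
        removed = neighbors.pop(victim)
        for other in removed:
            neighbors[other].discard(victim)
        edge_count -= len(removed)
        remaining.remove(victim)
    return best_nodes, best_score

# Toy data, invented for illustration only.
bots = [f"bot{i}" for i in range(6)]
sellers = ["seller_a", "seller_b", "seller_c"]
edges = [(b, s) for b in bots for s in sellers]         # the fraud block
edges += [(f"user{i}", f"page{i}") for i in range(6)]   # ordinary users
edges += [(f"bot{i}", f"page{i}") for i in range(6)]    # camouflage links

nodes, score = densest_block(edges)
flagged = {name for kind, name in nodes if kind == "follower"}
print(sorted(flagged), score)
```

On this toy graph the ordinary users and the pages they follow are peeled away first, which also strips the bots’ camouflage links, and what remains is the block of six bots and three sellers.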

The algorithm works in part because it’s capable of scanning massive amounts of data very quickly. In real-world experiments using Twitter data for 41.7 million users and 1.47 billion followers, Fraudar fingered more than 4,000 accounts not previously identified as fraudulent, including many that used known follower-buying services, Carnegie Mellon said.

“The algorithm is very fast and doesn’t require us to target anybody,” Faloutsos said. “We hope that by making this code available as open source, social media platforms can put it to good use.”

Twitter did not immediately respond to Credit.com’s request for comment on the experiments.

Ultimately, Carnegie Mellon hopes the tool can be used to spot fake product reviews, fake advertising offers and even politicians who exaggerate their Internet popularity.

During the previous presidential election cycle, there were accusations of fake followers. In 2011, Newt Gingrich was accused of buying nearly a million Twitter followers. Then in August 2012, a CNET report noted Mitt Romney gained 116,000 Twitter followers in one day, which was wildly out of pattern. The issue isn’t one-sided. The Daily Mail said Barack Obama has nearly 20 million fake followers.

The issue is important because if fake reviews are allowed to crowd out real content, users will begin to ignore reviews altogether.

“It’s widespread right now because it works,” Faloutsos said. “But it distorts reality and translates dollars into false impressions. We are trying to stop that.”


About Bob Sullivan
BOB SULLIVAN is a veteran journalist and the author of four books, including the 2008 New York Times Best-Seller, Gotcha Capitalism, and the 2010 New York Times Best Seller, Stop Getting Ripped Off! His latest, The Plateau Effect, was published in 2013, and as a paperback, called Getting Unstuck, in 2014. He has won the Society of Professional Journalists’ prestigious Public Service award, a Peabody award, and The Consumer Federation of America Betty Furness award, and been given Consumer Action’s Consumer Excellence Award.
