How much information we share, and how comfortable we are sharing it, varies from user to user, but most of us accept that anything visible on a public profile is in the public domain. But what would happen if a hacker collected all your available information and added it to a huge spreadsheet, along with the data of millions of other users, to sell it on the internet to the highest bidder?
That is precisely what a man who calls himself Tom Liner did last month. He compiled information on 700 million LinkedIn users around the world into a database and put it up for sale for about $5,000. And he did it for fun.
The incident, and other similar cases of so-called scraping on social networks, have sparked a fierce debate about whether the basic information we share publicly on our profiles should be better protected or not.
How do they sell so much information?
It was at 8:57 am UK time that the post appeared on a well-known hacker forum. It was an oddly civilized time for hackers, but of course, we have no idea what time zone the hacker calling himself Tom Liner lives in.
"Hello, I have 700 million 2021 LinkedIn records," he wrote.
Included in the post was a link to a sample of one million and an invitation for other hackers to contact him privately and make offers for the database.
Understandably, the sale caused a sensation in the hacker world. Tom tells me that he is selling his haul for about $5,000 to multiple clients. He doesn't reveal who they are or why they want the information, but he does say that the data is likely to be used for other malicious hacks.
The news has also caused a stir in the world of cybersecurity and privacy and sparked a debate about whether we should be concerned about this growing trend of large-scale scraping.
What is scraping?
These databases are not created by breaking into the servers or websites of social networks. For the most part, they are built by scraping the public-facing surface of the platforms, using automated programs to grab whatever information about users is freely available.
In theory, most of the data could be found simply by viewing individual social media profiles one at a time. Of course, it would take a very long time to gather by hand all the data that hackers are capable of collecting. So far this year, there have been three other major scraping incidents:
In April, a hacker sold another database of about 500 million records pulled from LinkedIn.
In the same week, another hacker posted a database of information gleaned from 1.3 million Clubhouse profiles on a forum for free.
Also in April, the details of 533 million Facebook users were compiled from a mix of old and new scraping and then posted on a hacking forum with a request for donations.
The hacker responsible for that Facebook database was also Tom Liner.
The BBC reporter spoke to Tom over three weeks on Telegram. Some messages, and even missed calls, came in the middle of the night and others during business hours, so there was no way to tell where he was.
The only clues about his life came when he told me that he couldn't talk on the phone because his wife was sleeping, and that he has a day job and hacking is his hobby.
Tom said he created the database of 700 million LinkedIn records using almost exactly the same technique that he used to create the Facebook list. "It took me several months to do it. It was very complex. I had to hack the LinkedIn API. If you make too many requests for user data at the same time, the system will permanently ban you," he said.
API stands for Application Programming Interface, and most social networks sell API partnerships that allow other companies to access data on the platform, for example for marketing or for building apps.
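The ban Tom describes for making too many requests at once is a form of what is generally called rate limiting. The sketch below is a minimal illustration of the idea, assuming a simple sliding window per client. The class name, the limits and the client identifiers are hypothetical, and nothing here reflects how LinkedIn's systems actually work.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical per-client request limiter, for illustration only."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests      # assumed cap per window
        self.window_seconds = window_seconds  # assumed window length
        self.clock = clock                    # injectable so it can be tested
        self.history = defaultdict(deque)     # client id -> accepted timestamps

    def allow(self, client_id):
        """Return True if this request is accepted, False if it is refused."""
        now = self.clock()
        timestamps = self.history[client_id]
        # Forget requests that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False
        timestamps.append(now)
        return True
```

A platform could call `allow()` once per incoming request and refuse, or eventually block, clients that keep getting `False`. A scraper that spreads its requests out over months, as Tom says he did, stays under a limit of this kind.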
Digital security site Privacy Sharks, which first discovered the database sale, examined the free sample and found that it included full names, email addresses, gender, phone numbers, and industry information.
It was not a data breach
LinkedIn says its investigation suggests that Tom Liner did not use its API, but it confirmed that the data set includes information scraped from LinkedIn, as well as information obtained from other sources. "This was not a LinkedIn data breach and no LinkedIn member's private data was exposed. Scraping data from LinkedIn is a violation of our Terms of Service and we are constantly working to ensure that our members' privacy is protected," the company said.
Facebook made similar statements regarding the April incident. However, the fact that hackers are making money from these databases worries some cyber experts.
SOS Intelligence founder and CEO Amir Hadzipasic monitors hacker forums on the dark web day and night. As soon as news of the 700 million-record LinkedIn database spread, he and his team began analyzing it.
The specialist said that large-scale thefts like this are concerning given how detailed some of the information is, such as geographic locations, email addresses and private phone numbers. For most people, he said, it comes as a surprise that these services hold so much information.
Tom Liner says he knows that his database is likely to be used for malicious attacks and that this bothers him, but he does not explain why he continues to carry out these scraping operations.
Amir argues that hackers who buy the LinkedIn data could use it to launch targeted hacking campaigns against high-level targets, such as company executives. He also says the large number of active email addresses in the database is valuable in itself, because they can be used for mass phishing campaigns.
The data is public
Cybersecurity expert Troy Hunt, who has spent most of his working life analyzing the contents of hacked databases, is less concerned about the recent scraping incidents and says we should accept them as part of having a public profile.
"These are definitely not breaches. Most of this data is public anyway. The question to ask in each case is how much of this information is publicly available by the user's choice, and how much is not expected to be," he says.
Troy agrees with Amir that social media controls need to be improved and says we cannot ignore these incidents.
"I do not disagree with the position of Facebook and other companies, but I think that the response of 'this is not a problem', although it may be technically accurate, loses sight of what this data means and perhaps downplays their role in the creation of these databases," he says.
Tom is likely to be sued for theft of intellectual property or infringement of rights.
But when asked if he was worried about being arrested, he said no one would be able to find him, and ended the conversation by saying: "Have a nice time."
Note: this article is based on the BBC report "700 million LinkedIn users stolen, just for fun".