An extensive study of the top 100,000 websites found that many of them leaked information people typed into site forms to third-party trackers before they even pressed the submit button. The study identified thousands of such websites, which leaked everything from email addresses to passwords, though thankfully, many fixed the issues once the researchers contacted them. “It is concerning to see websites leaking passwords,” Rick McElroy, Principal Cybersecurity Strategist at VMware, told Lifewire over email, reacting to the research. “I am happy to see that once notified, the organizations made changes to their code to stop that practice.”
Enter to Leak
The study was conducted to determine whether online trackers misuse access to web forms. The researchers point to a survey in which 81% of the respondents admitted to abandoning online forms at some point. “We believe it is strongly against users’ expectations to collect personal data from web forms for tracking purposes prior to submitting a form,” noted the researchers. “We wanted to measure this behavior to assess its prevalence.” In all, they tested 2.8 million pages on the world’s highest-ranking sites. Among the sites tested, 1,844 allowed trackers to exfiltrate email addresses before submission when visited from Europe. When visited from the US, the number of sites collecting information before submission increased to 2,950. The researchers note that the data leaks were apparently unintentional in some instances, with incidental password collection on 52 websites being resolved thanks to the study’s findings. “Some websites told us that they were not aware of this data collection and rectified the issue upon our disclosures,” wrote the researchers, who will present their findings at the upcoming USENIX Security Symposium in Boston, Massachusetts.
Stay Safe
Chris Hauk, consumer privacy champion at Pixel Privacy, said that while the data leaks are coming from the websites, there are a couple of things people can do on their end to at least slow the data leaks. “Users can visit Electronic Frontier Foundation’s Cover Your Tracks website to determine how website trackers see your browser, revealing how sites can track you while online, and what you can do to at least partially prevent it,” Hauk suggested to Lifewire over email. The usual advice of using a VPN to cover your online tracks won’t be of much use in preventing this sort of leak. Hauk suggests using a disposable email address, separate from your usual personal email account, on websites that ask for such information. McElroy advised people to either use a web browser built for privacy, like Brave, or install privacy add-ons, such as Privacy Badger, on their regular browser. He also advocated for multi-factor authentication to minimize the damage of password leaks. Additionally, the researchers have developed a proof-of-concept browser add-on called Leak Inspector that warns and protects against data exfiltration.
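To make the idea of spotting a leak concrete, here is a minimal Python sketch of one way a tool could check whether something typed into a form shows up in an outgoing request. It is an illustration only, not the researchers' method or how Leak Inspector works: the function names, the sample tracker URL, and the short list of encodings are all assumptions made for this example.

```python
import base64
import hashlib
from urllib.parse import quote


def encodings_of(value: str) -> dict[str, str]:
    """Return a few common representations of a typed value.

    The list is illustrative and far from exhaustive; real trackers
    may use other encodings, hashes, or compression.
    """
    raw = value.encode("utf-8")
    return {
        "plain": value,
        "url-encoded": quote(value, safe=""),
        "base64": base64.b64encode(raw).decode("ascii"),
        "sha256": hashlib.sha256(raw).hexdigest(),
    }


def find_leaks(typed_value: str, outgoing_payload: str) -> list[str]:
    """List which representations of typed_value appear in a request payload."""
    return [
        name
        for name, encoded in encodings_of(typed_value).items()
        if encoded in outgoing_payload
    ]


if __name__ == "__main__":
    # Hypothetical request sent to a third party before the form is submitted.
    request_url = "https://tracker.example/collect?e=user%40example.com"
    print(find_leaks("user@example.com", request_url))
```

A check like this only catches the encodings it knows about, which is one reason the disposable email address and privacy add-ons mentioned above remain the more practical defenses for most people.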
Data Economy
Expressing his surprise at the extent of the collection, McElroy said people must understand that human-generated data is a commodity that’ll be collected, shared, analyzed, and used for multiple purposes. “Most of the time these purposes aren’t necessarily malicious (like sharing data with a third-party advertiser) however the flow between and amongst systems with various levels of security makes all consumers vulnerable and creates a ripe landscape for attackers to take advantage of,” explained McElroy. David Rickard, CTO North America at Cipher, a Prosegur company, thinks that people should presume that every form they fill out on the internet is saving data while data entry is underway, and that whatever they enter becomes the property of the website and is resold to third parties. “Personal data and its value form the business model for many modern digital enterprises for the past 20+ years, even if their privacy policies explicitly state that they don’t gather PII [Personally Identifiable Information] and sell it,” Rickard told Lifewire over email. He said data aggregators work around privacy regulations by collecting several different datasets that may not include names, addresses, and the like. These datasets aren’t PII as such, but when matched against hundreds of additional data points from other datasets, they can identify individuals with a success rate of over 90%. “This gives rise to services that are something like actuarial tables (or believed to actually be actuarial tables) indicating credit worthiness, insurability, employability, likelihood of different addictions, likely political and religious affiliations, you name it,” said Rickard.
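The matching Rickard describes can be pictured with a small Python sketch. It assumes two made-up datasets, one holding interests with no names and one holding names with no interests, and links them on a few shared attributes. Every record, field name, and function here is invented for illustration; real aggregators work with far more data points and different methods.

```python
# Hypothetical dataset A: behavior with no names attached.
browsing_profiles = [
    {"zip": "02139", "birth_year": 1985, "gender": "F", "interest": "debt consolidation"},
    {"zip": "02139", "birth_year": 1990, "gender": "M", "interest": "running shoes"},
]

# Hypothetical dataset B: names with no behavior attached.
public_directory = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1985, "gender": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1990, "gender": "M"},
    {"name": "Carol Example", "zip": "94105", "birth_year": 1985, "gender": "F"},
]

# Attributes that are harmless alone but identifying in combination (assumed set).
SHARED_FIELDS = ("zip", "birth_year", "gender")


def key(record: dict) -> tuple:
    """Build a lookup key from the shared attributes of a record."""
    return tuple(record[field] for field in SHARED_FIELDS)


def link(profiles: list[dict], directory: list[dict]) -> list[tuple[str, str]]:
    """Pair a name with an interest wherever the shared attributes match exactly one person."""
    names_by_key: dict[tuple, list[str]] = {}
    for person in directory:
        names_by_key.setdefault(key(person), []).append(person["name"])

    linked = []
    for profile in profiles:
        names = names_by_key.get(key(profile), [])
        if len(names) == 1:  # only a unique match identifies someone
            linked.append((names[0], profile["interest"]))
    return linked


if __name__ == "__main__":
    for name, interest in link(browsing_profiles, public_directory):
        print(f"{name}: {interest}")
```

Neither dataset on its own ties a name to an interest, yet three ordinary attributes are enough to connect them in this toy case. Adding more shared fields makes unique matches more likely, which is the effect Rickard points to.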