These new techniques expose your browsing history to attackers

An example of code the researchers used for their attacks. Credit: University of California San Diego

Security researchers at UC San Diego and Stanford have discovered four new ways to expose Internet users' browsing histories. These techniques could be used by hackers to learn which websites users have visited as they surf the web.

The techniques fall into the category of "history sniffing," a concept dating back to the early 2000s. But the attacks demonstrated by the researchers at the 2018 USENIX Workshop on Offensive Technologies (WOOT) in Baltimore can profile or 'fingerprint' a user's online activity in a matter of seconds, and work across recent versions of major web browsers.

All of the attacks the researchers developed in their WOOT 2018 paper worked on Google Chrome. Two of the attacks also worked on a range of other browsers, from Mozilla Firefox to Microsoft Edge, as well as various security-focused research browsers. The only browser that proved immune to all of the attacks is the Tor Browser, which doesn't keep a record of browsing history in the first place.

"My hope is that the severity of some of our published attacks will push browser vendors to revisit how they handle history data, and I'm happy to see folks from Mozilla, Google, and the broader World Wide Web Consortium (W3C) community already engage in this," said Deian Stefan, an assistant professor in computer science at the Jacobs School of Engineering at UC San Diego and the paper's senior author.

"History sniffing": smelling out your trail across the web

Most Internet users are by now familiar with "phishing": cyber-criminals build fake websites that mimic, say, banks, to trick users into entering their login details. The more the phisher can learn about their potential victim, the more likely the con is to succeed. For example, a Chase customer is much more likely to be fooled when presented with a fake Chase login page than if the phisher pretends to be Bank of America.

After conducting an effective history sniffing attack, a criminal could carry out a smart phishing scheme, which automatically matches each victim to a faked page corresponding to their actual bank. The phisher preloads the attack code with their list of target banking websites, and conceals it in, for example, an ordinary-looking advertisement. When a victim navigates to a page containing the attack, the code runs through this list, testing or 'sniffing' the victim's browser for signs that it's been used to visit each target site. When one of these sites tests positive, the phisher could then redirect their victim to the corresponding faked version.

The faster the attack, the longer the list of target sites an attacker can 'sniff' in a reasonable amount of time. The fastest history sniffing attacks have reached rates of thousands of URLs tested per second, allowing attackers to quickly put together detailed profiles of web surfers' online activity. Criminals could put this sensitive data to work in a number of ways besides phishing: for example, by blackmailing users with embarrassing or compromising details of their browsing histories.

History sniffing can also be deployed by legitimate, yet unscrupulous, companies, for purposes like marketing and advertising. A 2010 study from UC San Diego documented widespread commercial abuse of previously known history sniffing attack techniques, before these were subsequently fixed by browser vendors.

"You had internet marketing firms popping up, hawking pre-packaged, commercial history sniffing 'solutions', positioned as analytics tools," said Michael Smith, a computer science Ph.D. student at UC San Diego and the paper's lead author. The tools purported to offer insights into the activity of their clients' customers on competitors' websites, as well as detailed profiling information for ad targeting—but at the expense of those customers' privacy.

"Though we don't believe this is happening now, similar spying tools could be built today by abusing the flaws we discovered," said Smith.

New attacks

The attacks the researchers developed, in the form of JavaScript code, cause web browsers to behave differently based on whether a website has been visited or not. The code can observe these differences (for example, the time an operation takes to execute, or the way a certain graphic element is handled) to collect the computer's browsing history. To design the attacks, the researchers exploited features that allow programmers to customize the appearance of their web page, controlling fonts, colors, backgrounds, and so forth, using Cascading Style Sheets (CSS), as well as a cache meant to improve the performance of web code.
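
To make this concrete, here is a deliberately simplified, hypothetical sketch of the general shape of such a probe in JavaScript. It is not the researchers' code: the timed operation (a forced style computation) and the 1 ms threshold are stand-in assumptions, and modern browsers deliberately defeat this particular naive version.

```js
// Hypothetical sketch of a timing-based history sniffing probe.
// The timed operation and the threshold are stand-ins for the paper's
// real techniques; modern browsers defeat this naive version.
const TARGETS = ['https://bank-a.example', 'https://bank-b.example'];

function probe(url) {
  const link = document.createElement('a');
  link.href = url;
  document.body.appendChild(link);

  const start = performance.now();
  // Force the browser to compute the link's styles; in a real attack the
  // timed operation is one whose cost depends on the :visited state.
  getComputedStyle(link).color;
  const elapsed = performance.now() - start;

  link.remove();
  return elapsed;
}

// Classify each target using a made-up 1 ms threshold.
for (const url of TARGETS) {
  console.log(url, probe(url) > 1 ? 'likely visited' : 'likely not visited');
}
```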

The researchers' four attacks target flaws in relatively new browser features. For example, one attack takes advantage of a feature added to Chrome in 2017, dubbed the "CSS Paint API", which lets web pages provide custom code for drawing parts of their visual appearance. Using this feature, the attack measures when Chrome re-renders a picture linked to a particular target website URL, in a way invisible to the user. When a re-render is detected, it indicates that the user has previously visited the target URL. "This attack would let an attacker check around 6,000 URLs a second and develop a profile of a user's browsing habits at an alarming rate," said Fraser Brown, a Ph.D. student at Stanford, who worked closely with Smith.
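
The outline below suggests how such a CSS Paint API probe could be structured. It is a hypothetical reconstruction, not the researchers' exploit: the painter name, worklet file, target URL, and the 50 ms frame-timing heuristic are all invented here, and the underlying flaw is patched in current Chrome.

```js
// Hypothetical outline only; the flaw is patched and this is not
// working exploit code.
//
// paint-worklet.js would register a deliberately slow painter that runs
// whenever the element it backs needs to be repainted:
//
//   registerPaint('probe', class {
//     paint(ctx, size) {
//       for (let i = 0; i < 200000; i++) ctx.fillRect(0, 0, 1, 1);
//     }
//   });

CSS.paintWorklet.addModule('paint-worklet.js').then(() => {
  // Style a link to the target URL so its background is drawn by the
  // painter; if the browser applies a :visited style to the link, the
  // background is repainted, and the slow painter makes that repaint
  // visible in frame timing.
  const link = document.createElement('a');
  link.href = 'https://target-bank.example';
  link.textContent = '\u00a0';
  link.style.backgroundImage = 'paint(probe)';
  document.body.appendChild(link);

  // Watch for an unusually long frame right after the link is inserted.
  let last = performance.now();
  requestAnimationFrame(function watch(now) {
    if (now - last > 50) console.log('long frame:', (now - last).toFixed(1), 'ms');
    last = now;
    requestAnimationFrame(watch);
  });
});
```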

Though Google immediately patched this flaw, the most egregious of the attacks the researchers developed, the computer scientists describe three other attacks in their WOOT 2018 paper that, put together, work not only on Chrome but also on Firefox, Edge, Internet Explorer, and Brave. The Tor Browser is the only browser known to be totally immune to all of the attacks, as it intentionally avoids storing any information about a user's browsing history.

As new browsers add new features, these kinds of attacks on privacy are bound to resurface.

A proposed defense

The researchers propose a bold fix to these issues: they believe browsers should set explicit boundaries controlling how users' browsing histories are used to display web pages from different sites. One major source of information leakage was the mechanism which colors links either blue or purple depending on whether the user has visited their destination pages, so that, for example, someone clicking down a Google search results page can keep their place. Under the researchers' model, clicking links on one website (e.g., Google) wouldn't affect the color of links appearing on another website (e.g., Facebook). Users could potentially grant exceptions to certain websites of their choosing. The researchers are prototyping this fix and evaluating the trade-offs of such a privacy-conscious browser.
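
As a toy model of that proposal (all names here are invented for illustration; this is not the researchers' prototype), the browser would key visited-link state by the pair of top-level site and destination URL, rather than by destination alone:

```js
// Toy model of partitioned visited-link state: a link only looks
// "visited" on the site where it was clicked, so one site can't probe
// history accumulated on another. Names are invented for illustration.
const visited = new Set();

function recordClick(topLevelSite, destination) {
  visited.add(JSON.stringify([topLevelSite, destination]));
}

function shouldColorPurple(topLevelSite, destination) {
  return visited.has(JSON.stringify([topLevelSite, destination]));
}

// Clicking a result on google.com colors it purple there...
recordClick('google.com', 'https://example.com/article');
console.log(shouldColorPurple('google.com', 'https://example.com/article'));   // true
// ...but the same URL still looks unvisited when rendered on facebook.com.
console.log(shouldColorPurple('facebook.com', 'https://example.com/article')); // false
```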



User comments

Oct 30, 2018
Good lord, who actually allows their browser to keep history, or doesn't use tracks eraser software? Too many apparently. They can't hack your internet history if there isn't any.

Nov 01, 2018
This is another kind of attack similar to timing attacks (like Spectre and Meltdown, which exploit the measurable timing effects of caches used to optimize software through speculative execution, branch prediction, or out-of-order execution, all meant to avoid delays while waiting for a reply). This time it is used for "sniffing", but the goal is still the same: harvest private information through covert channels. All the new alarming covert channels are based on time, because ALL our software and hardware (at every scale, from deep inside chips to worldwide across the Internet and clouds) has accumulated caches everywhere, and those caches can reveal private data even when everything runs in "private sessions".
Even HTTPS and REST APIs are attackable by timing attacks (just consider the "proxy" support built natively into HTTP, including its "cache control" mechanisms).
Caches everywhere are becoming a target for spying: most caches are not protected at all, and they have measurable, highly predictable response times.

Nov 01, 2018
For browsers, a solution would be to restrict the use of caches by segregating them by domain: either no two domains may share the same cache, or an internal "cache hit" in a shared cache must generate the same delay as a "cache miss" would for a first visit from a third-party domain, with some randomness added to the response time of both "hit" and "miss" events (see the sketch below).
We cannot eliminate caches, or we would return to the old age of computing before the 1980s: the performance of everything would be reduced by at least three orders of magnitude! But we CAN segregate caches per "security realm".
This is now a goal for processors (Spectre/Meltdown) and any NUMA architecture, and it is now also a goal for web browsers, as well as network appliances (such as routers, whether local or at your ISP).
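
A minimal sketch of the first option, per-domain pools, in hypothetical JavaScript (a made-up in-memory cache, not a real browser API); the second option, a shared pool with equalized and jittered "miss" timing, would close the same leak at the cost of slower lookups:

```js
// Sketch of per-origin cache segregation: each origin can only ever see
// entries it cached itself, so a fast "hit" no longer reveals whether
// some *other* site fetched the same URL. Invented API, for illustration.
class SegregatedCache {
  constructor() { this.pools = new Map(); } // origin -> Map(url -> response)

  get(origin, url) {
    const pool = this.pools.get(origin);
    return pool ? pool.get(url) : undefined;
  }

  put(origin, url, response) {
    let pool = this.pools.get(origin);
    if (!pool) { pool = new Map(); this.pools.set(origin, pool); }
    pool.set(url, response);
  }
}

// evil.example and bank.example fetch the same URL, but evil.example's
// first lookup is still a miss: timing reveals nothing about bank.example.
const cache = new SegregatedCache();
cache.put('bank.example', 'https://bank.example/logo.png', '<bytes>');
console.log(cache.get('evil.example', 'https://bank.example/logo.png')); // undefined
```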

Nov 01, 2018
Caches used by web services are also extremely interesting targets for spies! They allow secretly monitoring the activity of any web service, and ultimately targeting its users. HTTPS alone will not be enough for webservers to protect their private data. So EACH private user connected to a server should use their own separate cache, not contaminated by the sessions of other users of the same webserver.

Nov 01, 2018
Anyway, using "Tor" does not make you totally immune to such attacks.
Tor only protects against timing attacks on the local browser's cache, but there are other caches on the local system (for example, Tor does not protect against Spectre or Meltdown at the CPU memory-cache level, or against other local filesystem caches, even when the filesystem is on a "fast" SSD or in the "flash memory" built into a mobile device), or in the local routers and proxies that it uses, or in the routers and proxies operated by the webservers the user is known to visit.

Most caches operate using a simple "LRU" eviction policy, without segregating entries into separate pools for separate security domains. All caches are designed to return data much faster when there's a "hit" and much slower when there's a "miss", and if there's a single pool for everyone, the cache becomes attackable (see the sketch below).

There are caches at many layers of computing systems, from within a single local process to across networks.
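
For illustration, here is a toy version of the kind of single-pool LRU cache described above; the fast-hit/slow-miss asymmetry is exactly the timing difference an attacker measures:

```js
// Minimal single-pool LRU cache: one shared Map for everyone, hits
// answered immediately, misses forcing a slow refetch by the caller.
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map(); // Map preserves insertion order
  }

  get(key) {
    if (!this.map.has(key)) return undefined; // "miss": caller refetches (slow)
    const value = this.map.get(key);
    this.map.delete(key);                     // move to most-recently-used
    this.map.set(key, value);
    return value;                             // "hit": returned immediately (fast)
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.capacity) {
      // Evict the least-recently-used entry (first in insertion order).
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(key, value);
  }
}
```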

Nov 01, 2018
And because caches sit at so many layers of computing systems, including across networks (distributed computing, including today's "clouds") and even within the same local process, and because today's performance depends so heavily on them, removing caches would dramatically slow these systems down to the age before the 1980s.
We do need caches.
But we must immunize them by designing them to use separate pools for separate users, or for separate processes running with distinct privileges (distinct security realms, identified by their origin domain).

If a query for a cache entry comes from a different security realm than the one the entry was stored under, it should return the same "miss" result with the same response time whether the entry exists or not, and it is easy to severely slow down these "miss" responses. A querier will then only be able to probe the cache within its own security realm.

Nov 01, 2018
As well, no query should be able to completely deplete the cache pools owned by other realms. A system can also improve its own security by running a secret local process that generates a randomized workload in one pool of the cache, proportional to the total workload in the other pools.
This randomized secret workload should be about 50% of the busiest other pools. It has the same effect on caches as a DoS attack, except that a DoS attack against a cache split into multiple pools cannot deplete the secret pool owned by the system, and the system can then also detect and react quickly to DoS attacks under heavy workloads.

Nov 01, 2018
When there is no DoS attack (and the system is mostly idle except for a query made by a timing attacker), a timing attack is at its most effective for stealing data through time-based covert channels; but the secret workload will still randomize the response times of the queries the attacker measures, making the timing attack ineffective at detecting and using the covert channel.

Nov 01, 2018
Another way to protect caches with segregated pools is for the secret thread running on the system to regularly redistribute the size of each pool at random between unprivileged contexts: e.g., offer all of them the same guaranteed minimum size, for a total not exceeding 50% of the pool space usable by unprivileged contexts, then distribute the rest completely at random, and redistribute it frequently, too frequently to allow statistically significant timing measurements even over the span of the fastest timing attacks (those that can retrieve data via covert channels at high bit rates). A rough sketch follows.
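
A rough sketch of that redistribution step (the numbers, and the pool.resize() method, are invented for illustration):

```js
// Sketch of randomized pool-size redistribution: every pool gets the same
// guaranteed minimum, totalling at most 50% of the unprivileged space, and
// the remainder is re-dealt at random on a fast timer.
function redistribute(pools, totalSlots) {
  const min = Math.floor((totalSlots * 0.5) / pools.length);
  const sizes = pools.map(() => min);
  let rest = totalSlots - min * pools.length;
  while (rest-- > 0) sizes[Math.floor(Math.random() * pools.length)]++;
  pools.forEach((pool, i) => pool.resize(sizes[i])); // hypothetical resize()
  return sizes;
}

// Re-deal frequently, faster than any statistically useful timing measurement:
// setInterval(() => redistribute(pools, 4096), 50);
```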

Nov 01, 2018
This will also protect against attacks based on the colors of visited links (the colors an attacker sees will be based only on their own cache, not the cache used by the private user). A browser could then color all links blue for everyone, and add a layer on top, visible only to the private user, that makes a link appear purple to that user alone because the link is in their own segregated cache pool, which the attacker can neither deplete nor fill artificially. The attacker would not be able to detect that secret recoloring layer...
