There are few things more frustrating than a slow-loading webpage. For companies, what’s even worse is what comes after: users abandoning their site in droves. Amazon, for example, estimates that every 100-millisecond delay cuts their profits by 1 percent.
To help combat this problem, researchers from MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) and Harvard have developed a system that decreases page-load times by 34 percent. Dubbed “Polaris,” the framework determines how to overlap the downloading of a page's objects, such that the overall page requires less time to load.
“It can take up to 100 milliseconds each time a browser has to cross a mobile network to fetch a piece of data,” says PhD student Ravi Netravali, who is first author on a paper about Polaris that he will present at this week’s USENIX Symposium on Networked Systems Design and Implementation (NSDI '16). “As pages increase in complexity, they often require multiple trips that create delays that really add up. Our approach minimizes the number of round trips so that we can substantially speed up a page’s load-time.”
The paper’s co-authors include graduate student Ameesh Goyal and professor Hari Balakrishnan, as well as Harvard professor James Mickens, who started working on the project during his stint as a visiting professor at MIT in 2014. The researchers evaluated their system across a range of network conditions on 200 of the world’s most popular websites, including ESPN, Weather.com, and Wikipedia.
How web pages work
The problem is that browsers can’t actually see all of these dependencies because of the way that objects are represented by HTML (the standard format for expressing a webpage’s structure). As a result, browsers have to be conservative about the order in which they load objects, which tends to increase the number of cross-network trips and slow down the page load.
How Polaris fits in
What Polaris does is automatically track all of the interactions between objects, which can number in the thousands for a single page. For example, it notes when one object reads the data in another object, or updates a value in another object. It then uses its detailed log of these interactions to create a “dependency graph” for the page.
Mickens offers the analogy of a travelling businessperson. When you visit one city, you sometimes discover more cities you have to visit before going home. If someone gave you the entire list of cities ahead of time, you could plan the fastest possible route. Without the list, though, you have to discover new cities as you go, which results in unnecessary zig-zagging between far-away cities.
“For a web browser, loading all of a page’s objects is like visiting all of the cities,” says Mickens. “Polaris effectively gives you a list of all the cities before your trip actually begins. It’s what allows the browser to load a webpage more quickly.”
Dependency-trackers have existed before, but are all as constrained as the browsers themselves. This is because their method of comparing lexical relationships - specifically, the text in HTML tags - mimics the way browsers load pages and does not capture more subtle dependencies.
“We are hopeful that the system will eventually be integrated into the browser,” he says. “Doing so will enable additional optimizations that can further accelerate page loads.”
A better approach to a faster web
Tech companies like Google and Amazon have also tried to improve load-times, with an emphasis on lowering costs for data-usage. This means that they often focus on the challenge of more quickly transferring information via data compression. The CSAIL team, meanwhile, has demonstrated that Polaris’ gains on load-time are more consistent and more substantive.
“Recent work has shown that slow load-times are more strongly related to network delays than available bandwidth,” says Balakrishnan. “Rather than decreasing the number of transferred bytes, we think that reducing the effect of network delays will lead to the most significant speedups.”
“Tracking fine-grained dependencies has the potential to greatly reduce page-load times, especially for low-bandwidth or high-latency connections,” says Mark Marron, a senior research software development engineer at Microsoft. “On top of that, the availability of detailed dependence information has a wide range of possible applications, such as tracking the source statement of an unexpected value that led to a crash at runtime.”