TR14-05 Martindell
DCDN: A Distributed CDN based on modern web technologies
by
Nick J. Martindell
Bachelor of Science
With Departmental Honors
University of Washington
June 5, 2014
Abstract
The current client-server model used on the web is inefficient for the delivery
of large static content. Content is transmitted independently to each consumer
from the central infrastructure without respect to the copies that exist in the
caches of other visitors. As demand grows, capacity remains fixed and all users
experience delays. This problem has been exacerbated by the rise in popularity
of large streaming content such as high definition web video. Desktop peer-
to-peer (P2P) systems have shown great promise in alleviating these issues by
leveraging the underutilized upstream bandwidth of each client to deliver con-
tent. Bringing these techniques to the web in a way that doesn’t require user
intervention would allow for a more efficient web.
This thesis explores the state of the art in browser-based P2P content de-
livery for the web and seeks to answer whether such systems can be used to
efficiently and invisibly deliver content. It presents the DCDN (Distributed
Content Delivery Network) research platform which serves content to the users
of a website using only their HTML5-enabled web browser. It then uses the
platform to explore several possible optimizations for this method of content
delivery and evaluate their success. Through this investigation, it shows that
while browser-based P2P systems can be implemented quite simply, at this time
their performance characteristics limit them to certain content types. High def-
inition web video and long-duration audio streaming are key examples. In order
to expand the possible use cases, significant roadblocks will need to be overcome.
A few emerging technologies which may provide solutions within the next year
are discussed, as well as somewhat far-fetched concepts for future improvement.
At present this technology has great value, especially when considering that its
ideal content types make up a large portion of the bandwidth currently used on
the Internet.
Contents
1 Introduction 2
2 Dependencies 3
2.1 WebSockets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 WebRTC Peer Connections . . . . . . . . . . . . . . . . . . . . . 4
2.3 WebRTC Data Channels . . . . . . . . . . . . . . . . . . . . . . . 4
4 Evaluation 8
4.1 Assessment Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.2 The basic system . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.3 HTTP Pre-fetch . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4.4 HTTP HEAD-start . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.5 Mixed HTTP and P2P . . . . . . . . . . . . . . . . . . . . . . . . 13
4.6 Comparison of configurations . . . . . . . . . . . . . . . . . . . . 14
5 Discussion 15
5.1 Roadblocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
5.1.1 JavaScript binary API . . . . . . . . . . . . . . . . . . . . 16
5.1.2 DOM interaction . . . . . . . . . . . . . . . . . . . . . . . 16
5.1.3 User behavior . . . . . . . . . . . . . . . . . . . . . . . . . 17
5.2 Improvements in the pipeline . . . . . . . . . . . . . . . . . . . . 18
5.2.1 Service Workers . . . . . . . . . . . . . . . . . . . . . . . . 18
5.2.2 Heuristic peer recommendation . . . . . . . . . . . . . . . 19
5.3 Moonshot Improvements . . . . . . . . . . . . . . . . . . . . . . . 20
5.3.1 Donation of idle resources . . . . . . . . . . . . . . . . . . 20
5.3.2 Heuristic optimizations for P2P networking . . . . . . . . 20
6 Conclusion 21
1 Introduction
As the load placed on web infrastructure grows with the ever-increasing demand
for large multimedia content, it is increasingly important to deliver it in efficient
ways. Characteristics of the Hypertext Transfer Protocol (HTTP), used to
serve almost all web content, make it somewhat inefficient as a means of
delivering static content (images, videos, and other resources that rarely
change over time) to users. The most notable issue is its focus on a strictly client-server relationship.
With this system, the cost of serving content increases with the number of users
due to the need for more servers and greater bandwidth to handle peak load.
Large Internet companies spend a considerable sum on this infrastructure, but
this increase in capacity through capital expenditure fails to address the under-
lying inefficiency of serving content in this manner: all the load is placed on
the central server even though copies of the content exist in the caches of every
client on the site.
This practice leads to an underutilization of the client’s Internet connection.
In practice, the majority of content on the web flows from large corporate data
centers to home and office consumers while comparatively little content moves
in the opposite direction. Additionally, each new visitor to a web site downloads
a full copy of all page content from the central server without regard to other
equally viable copies on the network. This focus on central infrastructure causes
problems in times of high load, during which service quality for users drops
since they have to share the host's finite capacity. Both of these problems are
addressed by peer-to-peer (P2P) content delivery techniques.
If such a P2P content delivery system could be used by a web host in a
way which would be effortless and invisible to users, it could enable a more
efficient web. Until recently, this would have required the user to install 3rd
party software and/or browser plug-ins. However, recent developments in the
JavaScript application programming interface (API) exposed to web browsers
have enabled a new class of P2P applications requiring nothing more than a
modern web browser. While the possibility of creating such applications has
been explored in recent commercial endeavors, little is known to the research
community about how this class of applications performs compared to conven-
tional content delivery systems.
My research shows that while the existing knowledge about P2P systems is
sufficient to create a functional browser-based system, there are new obstacles to
overcome in order to provide the high performance users expect. Web security
requirements, aspects of JavaScript's binary APIs, interaction with the DOM
(the Document Object Model used to lay out the content of a web page), and the
comparatively erratic behavior of the client cause the performance of browser-
based P2P systems to differ in ways that merit new investigation. This thesis
investigates several of those characteristics and suggests techniques to overcome
them. It also briefly explores technologies still in development that may provide
for a better system.
2 Dependencies
In order to discuss the performance characteristics of browser-based P2P sys-
tems it is first necessary to understand the characteristics of the APIs on which
such systems are built. These P2P primitives resemble those used by desktop
applications but often operate at a higher level of abstraction and with a differ-
ent security model. The speed of establishing various connection types makes
up a significant portion of the performance overhead associated with DCDN.
2.1 WebSockets
WebSockets are an implementation of full-duplex TCP (Transmission Control
Protocol) sockets in the browser. Like standard sockets, they provide bidirec-
tional push communication as well as efficient sending and receipt of binary
data. Unlike desktop sockets however, they are message-oriented and by default
use UTF-8 text rather than binary streams. A configuration option allows the
use of raw binary, but all data is still framed in messages. This design is re-
flective of a central use case: enabling server-push messaging, or the sending of
updates from the server without a request from the client [12].
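As a concrete sketch, the following shows how a page might open a binary-capable WebSocket. The coordination-server URL and the JSON message shape are my own illustration, not part of DCDN's actual protocol or the WebSocket standard.

```javascript
// Hypothetical sketch of a binary-capable WebSocket client. The server URL
// and the chunk-request message format are illustrative only.

// Encode a chunk request as a UTF-8 JSON text message (pure helper).
function encodeChunkRequest(fileId, chunkIndex) {
  return JSON.stringify({ type: 'request', file: fileId, chunk: chunkIndex });
}

// Wire up a socket; binary frames arrive as ArrayBuffer, text frames as strings.
function connectToCoordinator(url) {
  const ws = new WebSocket(url);
  ws.binaryType = 'arraybuffer'; // opt in to raw binary instead of Blob
  ws.onopen = () => ws.send(encodeChunkRequest('video.webm', 0));
  ws.onmessage = (ev) => {
    if (ev.data instanceof ArrayBuffer) {
      console.log('binary chunk of', ev.data.byteLength, 'bytes');
    } else {
      console.log('text message:', ev.data);
    }
  };
  return ws;
}
```

Note that even after `onopen` fires, every payload is still framed as a discrete message; there is no raw byte stream to read from.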
Performance-wise, their behavior is comparable to that of regular sockets once
the connection has been established, and their round-trip messaging latency is
near that of a ping. Their handshake, on the other hand, is complex and slow:
WebSockets first establish a TCP connection, then use HTTP to request an
'upgrade' before handing the socket off to the WebSocket server. This requires
several round trips before data can be sent or received, causing a significant
amount of startup latency even when connecting to a nearby server.
The following data shows the results of request latency tests for 2 types of
connections: a WebSocket and an HTTP HEAD Request. The latter is identical
to the HTTP GET request used to ‘get’ most web content, except that it only
returns the response headers rather than the content itself. It is thus the smallest
and fastest type of HTTP request typically used on the web.
Table 1: Comparison of startup latency for WebSocket and HTTP HEAD re-
quests to localhost
The 3rd and 4th row of data point to another factor affecting startup latency:
the amount of other network activity occurring at the same time. In these cases,
the connections are made after the page has loaded which allows the requests
to work without contention from the network activity associated with loading
the page. In both cases, the request latency is reduced to a few microseconds
rather than tens or hundreds. Additionally, the 3rd row demonstrates
how, once connected, a WebSocket can outperform even small HTTP requests.
In either the tunneled or regular mode, SCTP implements most of the con-
gestion control and reliability features expected from TCP, but allows each to
be toggled by the programmer. The latest Data Channel implementation in
Google Chrome allows for all of the Data Channel reliability modes specified
by the W3C (World Wide Web Consortium) [2]. This includes the choice of
reliable, unreliable, or partially reliable delivery as well as in-order or
unordered messaging [6]. For the purposes of P2P content delivery, the most
applicable mode is reliable, unordered delivery, which ensures that sent
messages arrive without the overhead of buffering and reordering incoming
messages.
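A minimal sketch of creating a channel in this mode follows; the channel label `'dcdn-chunks'` is hypothetical, while the option names come from the W3C API.

```javascript
// Sketch: create a Data Channel in the reliable, unordered mode described
// above. The channel label is illustrative.

// Pure helper: ordered:false removes in-order buffering; omitting both
// maxRetransmits and maxPacketLifeTime keeps delivery fully reliable.
function reliableUnorderedOptions() {
  return { ordered: false };
}

function createChunkChannel(peerConnection) {
  const channel = peerConnection.createDataChannel('dcdn-chunks',
                                                   reliableUnorderedOptions());
  channel.binaryType = 'arraybuffer'; // receive binary payloads as ArrayBuffer
  return channel;
}
```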
Data Channels, as currently implemented, have some limitations that are
more restrictive than those of WebSockets. The most notable is a message
size limit of around 16 kilobytes, although this is said to be temporary [6].
Additionally, all WebRTC connections are required to be encrypted. While this
protects users' private video and audio streams, it also imposes a modest
performance cost compared to unencrypted communication over Data Channels. In
the case of content distribution it is unlikely that this encryption is truly
desirable, since the content is already public.
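Given the roughly 16-kilobyte message limit noted above, a sender must split each content chunk into smaller messages. A minimal sketch (the exact limit and framing DCDN uses may differ):

```javascript
// Split a binary buffer into Data Channel messages that stay under the
// ~16 KB per-message limit. The precise limit is implementation-defined.
const MAX_MESSAGE_BYTES = 16 * 1024;

function splitIntoMessages(bytes, maxBytes = MAX_MESSAGE_BYTES) {
  const messages = [];
  for (let offset = 0; offset < bytes.length; offset += maxBytes) {
    // subarray creates a view over the same buffer, so no data is copied here
    messages.push(bytes.subarray(offset, offset + maxBytes));
  }
  return messages;
}
```

The receiver must then reassemble the pieces, which is one more source of overhead that desktop P2P systems with raw sockets do not pay.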
With regards to performance, WebRTC connections require the exchange of
many session descriptions, NAT traversal candidates and an offer or acceptance
over the message-passing channel before useful data can be sent. This means
a lengthy setup period and one that can only begin after the message-passing
channel is operational.
away from centralized infrastructure and onto the active users of the content.
Due to the similarity in purpose, the protocol used by DCDN, and presum-
ably the related systems, closely resembles that used by desktop P2P content
delivery applications such as BitTorrent [3]. While most of the knowledge of
these desktop systems likely translates to the new medium, it is unclear from
previous research what new challenges these new systems face. To aid in un-
derstanding the adaptations described later in this investigation, the following
description covers the basic implementation of DCDN. The remainder of the
paper will explore variations on the protocol that address the challenges faced
by browser-based P2P systems.
Figure 1: DCDN Protocol Flow
of this onprogress handler is to enable media streaming before content finishes
loading or the partial display of images as data becomes available.
4 Evaluation
The purpose of DCDN is to provide a simple platform for developing and
evaluating web-based P2P systems. In this section, I will describe several of
the systems studied during the course of my research and evaluate their
benefit in the context of P2P content delivery for the web. The purpose of
this research is to provide a statistically substantiated idea of best
practices for such systems. Most of the systems evaluated here arise from
observations about the differences between desktop peer-to-peer systems and
those in the web browser.
The following is not a comprehensive list of optimizations but rather a small
subset which attempts to characterize the key differences between desktop and
web P2P systems.
Each of the following systems will be evaluated in ideal circumstances to
provide a view of their best possible results. These circumstances specify that
all parties communicate on localhost, that several (5) peers are available, and
that each peer has the full contents of the file in question (in BitTorrent parlance,
they are ‘seeds’). The test page used for these trials is the ‘Large Image’ example
available in DCDN’s GitHub repository [15]. The page contains 2 HTML img
tags, one that uses regular HTTP and one that uses DCDN, to obtain and
display a 3.2 megabyte JPEG image. Each system will be tested according to
the following procedure:
Trial Procedure:
1. The relevant code will be activated in the DCDN source
2. The coordination server will be restarted with the new code
3. The Chrome browser cache will be cleared completely
4. The test page will be opened in 5 tabs on the same computer to serve as
‘peers’
5. In a new window, the ‘client’ will open the test page
6. The client will be refreshed 3 times to ‘warm up’ itself and the peers
7. The client will be used to make 10 trial visits to the test page (by refreshing
10 more times)
8. DCDN’s built in statistics object will be dumped to a text file for each
trial
Figure 2: Basic DCDN Protocol
Results: The basic system showed a lengthy startup time as expected. Nearly
half a second passed before the first byte was received. However, once data
began to flow, the system delivered about 3.5 MBps which would be more than
sufficient to stream audio or video. On the other hand, the high latency to
first byte makes this system untenable for delivering image content as it was
noticeably slower than the HTTP image download alongside it on the test page.
Regarding the user experience, the HTTP image painted in as data was received
but the DCDN image was entirely blank until the last byte was received at
around 1.5 seconds.
perform before useful data can be transmitted, HTTP can be used to obtain
chunks much sooner in the page-load time-line than DCDN’s P2P. Much of this
is due to the lengthy setup period of WebRTC Data Channels which require
the transmission of several session descriptions before the connection is opened.
Fetching chunks before peers become available could greatly improve the time
to first byte. For streaming audio or video, this can make the difference between
playback starting nearly immediately or after a long period.
On the other hand, using HTTP for content retrieval reduces the load bal-
ancing effect of using DCDN to distribute content since the HTTP server is
once again involved in at least part of every download. However, this technique
allows the content distributor to selectively prioritize their customer experience
or their needs for bandwidth reduction. In this evaluation the client will obtain
the first 20% of the file's chunks via HTTP as soon as it has received the
file's metadata. All remaining chunks will be retrieved from peers.
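The 20/80 split described above can be sketched as a simple planning function. The 0.2 ratio matches this evaluation; the function and field names are my own.

```javascript
// Plan which chunk indices to fetch over HTTP versus P2P. httpRatio = 0.2
// reproduces the 20% pre-fetch used in this evaluation.
function planChunkSources(totalChunks, httpRatio = 0.2) {
  const httpCount = Math.ceil(totalChunks * httpRatio);
  return {
    // First httpCount chunks come straight from the origin server...
    http: Array.from({ length: httpCount }, (_, i) => i),
    // ...and the remainder are requested from peers once channels open.
    p2p: Array.from({ length: totalChunks - httpCount }, (_, i) => httpCount + i),
  };
}
```

Raising `httpRatio` trades away load balancing for a faster start, which is exactly the lever a content distributor would tune.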
This variation exhibited an unexpected result. The time to last byte was
longer and thus the effective bandwidth of the system was actually decreased.
The cause of this anomaly merits further investigation, but I suspect it is due
to contention for the network stack between the HTTP pre-fetch and WebRTC
/ WebSocket communication for peer connection.
Figure 4: Changes for HTTP HEAD-start
Results: This change resulted in the best performance of all the variations
tried. It maintained the lowered time to first byte seen in the HTTP Pre-fetch
configuration while also achieving the highest effective bandwidth of
all variations. Compared to the basic system’s bandwidth of 3.41 MBps, this
system’s increase to 3.60 MBps is significant. I suspect that allowing the HTTP
activity to occur before the metadata was fetched allowed the traffic to com-
plete before peer coordination began. In this way, the time to last byte was not
adversely impacted and was actually lowered by about 160ms compared to the
basic system since fewer chunks had to be fetched from peers.
served via each protocol.
Results: This system did not live up to expectations, instead yielding speeds
nearly equal to the basic system but with a weaker load-balancing effect. The
only significant change in the statistics was the percentage of content served
over P2P, which dropped to nearly 50%. In other words, the HTTP
server took half the load of serving the content but without noticeable gains in
performance.
                     Time to first  Time to last  Percent served  Effective
                     byte (ms)      byte (ms)     over P2P        bandwidth (MBps)
Basic system             464.71        1411.92       100.0            3.41
HTTP Pre-fetch           369.11        1536.02        79.63           2.77
HTTP 'HEAD-start'        353.77        1251.81        79.63           3.60
Mixed HTTP and P2P       386.13        1312.77        54.89           3.49
5 Discussion
The HTML5 P2P framework is still in development and has yet to reach adop-
tion outside of the Chrome and Firefox browsers. Due to its developmental
nature, I encountered a number of roadblocks while working on DCDN that
could be alleviated by soon-to-be-released improvements to web technology.
Lastly, I'll discuss the possibility of two 'moonshot' improvements that could
enable new user experiences or simply provide better performance.
5.1 Roadblocks
In its current form, DCDN is limited by both the technologies it relies on and
the support of various web browsers for them. In this section, I will discuss
some of the lessons learned while building this system as well as comment on
the work-in-progress nature of the technologies I’ve built upon.
5.1.1 JavaScript binary API
While the additions of WebSockets and WebRTC to JavaScript provide surpris-
ingly effective control over networking in the browser, they are recent additions
to a language designed for far simpler tasks. This is most apparent when working
with the binary versions of WebSockets or RTC Data Channels, since JavaScript's
support for raw binary is somewhat patchwork. While excellent primitives like
Typed Arrays exist, allowing efficient viewing, modification, and creation of
raw binary buffers, they can only be used with APIs that explicitly support
them.
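To illustrate, Typed Arrays are views over a shared ArrayBuffer rather than containers of their own:

```javascript
// Two Typed Array views over the same underlying ArrayBuffer: writes through
// one view are visible through the other, with no copying.
const buffer = new ArrayBuffer(8);
const bytes = new Uint8Array(buffer);   // 8 one-byte elements
const words = new Uint32Array(buffer);  // the same 8 bytes as two 32-bit words

bytes[0] = 0xff; // write through the byte view
// words[0] now reflects the change (its exact value depends on endianness)
```

This zero-copy property is exactly what makes Typed Arrays attractive for moving chunks between the network and the DOM, and what makes APIs that refuse them so painful.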
Binary-mode WebSockets, WebRTC Data Channels, and XMLHttpRequests are
examples of APIs that support Typed Arrays. On the other hand, the persistence
infrastructure, such as HTML5 local storage and IndexedDB, only allows the
storage of UTF-8 text. (Note: IndexedDB in Firefox supports Blob storage at
the time of writing. Note: hours before the publication of this thesis, it was
announced that Chrome will get Blob support in the next few days [1].) While
it may be possible to serialize arbitrary binary data into and out of text
strings, this is neither a reliable nor an efficient solution. Chrome's File
System API provides
an effective solution that supports all of the binary types, but the API has failed
to be adopted by other browsers and is considered dead by many [13].
So far we have only covered static arrays of binary data, but there is a
second type of binary primitive which is necessary for efficient content delivery:
the stream. A stream is the most natural representation of partially complete
binary content and yet JavaScript does not support a Stream primitive. A
simple stream would allow chunks to be obtained by DCDN and consumed by
the DOM easily and efficiently via a straightforward and familiar interface. If
it were possible to create a URL to a stream, in the same way one can be made
for a Blob, this would solve the currently difficult problem of inserting content
into DOM tags. This kind of DOM interaction brings about the next roadblock
to seamless content delivery.
One solution is to generate Blob URLs to partial content as it is downloaded
and then repeatedly switch out the source URL of the tag. This works as
expected and shows a partially loaded image, or partially watchable video, but is
a hack at best. The expense of creating multiple blobs increases as the available
content grows to the point where a large part of the browser’s processing time
is spent Blob’ing new data. While acceptable for small to medium images, this
solution is entirely untenable for video content.
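A sketch of this hack, with the chunk concatenation factored out; the element handling, function names, and MIME type are illustrative rather than DCDN's actual code.

```javascript
// Pure helper: concatenate received chunks into one contiguous buffer.
function concatChunks(chunks) {
  const total = chunks.reduce((sum, c) => sum + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// The hack itself: rebuild a Blob from all data so far and swap the element's
// source URL, revoking the previous one to free memory.
function showPartialImage(imgElement, chunks, mimeType) {
  const blob = new Blob([concatChunks(chunks)], { type: mimeType });
  const url = URL.createObjectURL(blob);
  const previous = imgElement.src;
  imgElement.src = url;
  if (previous && previous.startsWith('blob:')) {
    URL.revokeObjectURL(previous);
  }
}
```

Because every call re-copies all data received so far, the total work grows quadratically with content size, which is the cost described above.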
Luckily, there is a kind of stream API for audio and video tags, appropriately
called the Media Source API. This API allows exactly the behavior we desire:
the ability to hand binary buffers off to a stream that is then used to fill
the contents of the media tag. That said, the system as implemented is nowhere
near the simplicity of a binary stream. Aside from flaky implementation details
I'll describe next, the Media Source API only works for audio and video tags,
making it fairly narrow in applicability. It's further limited to a subset of
the audio and video codecs supported by HTML5. Even then, it is picky
about the encoding details of the source material. Of the three WebM videos
I attempted to use with it, one worked. I was unable to find an explanatory
difference between the 3 videos. Other quirks include seemingly random freezes
or hesitations in the streaming media even when the stream has been passed the
full content. As previously mentioned, this system does not provide a solution
for filling img tags, which are still the most common media element on the web.
Even when coerced into functioning, the Media Source API as currently
implemented is not easy to work with. The API is quirky and has strange
(undocumented) restrictions, such as allowing only one buffer to be appended
in a row without events being handled in between. A functional hack around
this issue involves using 0-millisecond timeout callbacks to append buffers. This
causes each buffer to be appended inside a separate event handler which allows
the Media Source API to do its work between each event. This seems to imply
that the Media Source API’s buffer management is done on the main JavaScript
thread. This is supported by my observation that the second of two buffer
appends will fail unless other events are allowed to process even if a significant
(several second) delay is added between each append.
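The workaround can be sketched as follows; the SourceBuffer would come from a MediaSource attached to the media tag, with that setup omitted here.

```javascript
// Append each buffer in its own event-loop turn so the Media Source API can
// do its work between appends; appending twice in a row without yielding fails.
function appendChunksSequentially(sourceBuffer, chunks) {
  let index = 0;
  function appendNext() {
    if (index >= chunks.length) return;
    sourceBuffer.appendBuffer(chunks[index++]);
    setTimeout(appendNext, 0); // 0 ms timeout: yield, then append the next one
  }
  appendNext();
}
```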
the swarm between the time the server recommends them to a client and the
time the client receives that message. More importantly, it means that any
models of file chunk availability maintained by the coordination server must
be refreshed much more frequently than in a typical long-lived P2P system.
Peers obtain chunks very quickly, seed them for a matter of seconds and then
disappear just as quickly. Luckily, this effect is somewhat mitigated by the fact
that the largest content also takes the longest time to consume. Videos, which
make up much of the bandwidth usage in the US, are easily served by P2P
systems and encourage a client to stay online for the duration of the video.
section regarding JavaScript's binary APIs, there is no easy and efficient way
to persist chunks already obtained via P2P in a way comparable to the browser's
large, low-latency HTTP cache. Service Workers enable the worker to add items
to a worker-specific cache, solving this problem. Once a URL has been obtained
via DCDN, the Service Worker can store its contents in the cache, preventing
the need to re-download it the next time. Because the data in the cache is
programmatically accessible to the worker, this solution even preserves the
ability to serve the content when it has not been downloaded by DCDN on that
particular page-load.
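A hypothetical Service Worker sketch of this caching flow; the cache name and the `storeDcdnResult` helper are my own, and the partial-specification caveats below still apply.

```javascript
// Hypothetical Service Worker sketch: store DCDN-assembled responses in a
// worker-specific cache and answer later fetches from it.
const CACHE_NAME = 'dcdn-content-v1';

// Called once DCDN has assembled the full bytes for a URL.
async function storeDcdnResult(url, bytes, mimeType) {
  const cache = await caches.open(CACHE_NAME);
  await cache.put(url, new Response(bytes, {
    headers: { 'Content-Type': mimeType },
  }));
}

// Serve from the cache when possible, falling back to the network.
if (typeof self !== 'undefined' && 'addEventListener' in self) {
  self.addEventListener('fetch', (event) => {
    event.respondWith(
      caches.match(event.request).then((hit) => hit || fetch(event.request))
    );
  });
}
```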
At the moment, there is no indication that Service Workers will be able to
yield partial content to the user agent in the way that an HTTP server can
using chunked encoding. This is necessary to enable the ideal user experience
in which content loaded using P2P would yield partial content as it becomes
available. Users who have watched an image ‘paint in’ as the HTTP server
returns data will be familiar with this technique. In response to a ticket I filed
on the subject, a developer confirmed that it is not possible to yield partial
content to a request, but that this ability is being strongly considered [19].
Service Workers are only partially specified at the time of writing, and the
implementation in Chrome Canary is far from complete. Further information
can be obtained at the project's GitHub page [4] at https://github.com/
slightlyoff/ServiceWorker/.
the coordination server when recommending peers to the client. Peers who have
the required chunks in their caches can provide better service to the client and
would be better candidates for peer recommendation. In the case of stream-
ing media, peers that started shortly before the client are very likely to have
already obtained the chunks the client now needs. The client can then request
these chunks with a very high success rate without the need for time-consuming
negotiation between peers such as the exchange of cache-status bitmaps.
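This heuristic can be sketched as a ranking function on the coordination server; the field names and the exact ranking rule are my illustration of the idea above.

```javascript
// Recommend peers that joined a stream shortly before the client: they are
// the most likely to already hold the chunks the client will need next.
function recommendPeers(peers, clientStartTime, count) {
  return peers
    .filter((p) => p.startTime <= clientStartTime) // joined before the client
    .sort((a, b) => b.startTime - a.startTime)     // most recently joined first
    .slice(0, count)
    .map((p) => p.id);
}
```

A real server would blend this with chunk-availability reports, but even this start-time-only rule avoids the cache-status bitmap exchange for the common streaming case.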
before. The systemd init system on Linux uses a similar technique to create
network connections before the network card is fully online [11].
6 Conclusion
Peer-to-peer content delivery would restore some balance to the Internet which
is currently dominated by companies with large financial resources. By dramat-
ically reducing the capital investment required to share audio and video it would
allow anyone with a computer and an HTTP server to once again share their
content on a level playing field, much as was the case with the original
text-based Internet. It would also begin to restore the balance of upload and download
traffic to end users which has the potential to make more effective use of the
Internet capacity we already have.
In its current form DCDN proves that this kind of invisible P2P content
delivery is possible and even viable for certain types of media. Compared to
HTTP it exhibits a lengthy time to first byte, but this is not a bar to its use for
streaming audio and video. Additionally, some of the less-than-perfect aspects
of its performance may be remedied with coming advances in web technology.
In its current form, this project could be used in limited commercial
applications, although it needs a bit more work to be fully stable. Regardless, even at this
early stage, it is clear that the technology is viable, that its use is motivated
and that it has the potential to solve some of the capacity issues faced by the
web as its use continues to scale beyond its original conception.
It should be clear by now that the technologies involved in creating web
based P2P applications have not yet reached maturity. Many are still being
standardized, many are incomplete and many more are just ideas at this point.
For this reason, the work described here on DCDN and the results demonstrated
should be taken as a work in progress. As standards are completed and new
technologies invented, I hope that this research will serve as a building block on
the road to a more distributed web.
References
[1] Anonymous Google Developer jsb[OBSCURED]@google.com.
(2014, Jun 5). Issue 108012: IndexedDB should sup-
port storing File/Blob objects [Online]. Available:
https://code.google.com/p/chromium/issues/detail?id=108012#c153
[2] A. Bergkvist et al. (2013, September 10). WebRTC 1.0: Real-time Com-
munication Between Browsers (W3C Working Draft) [Online]. Available:
http://www.w3.org/TR/webrtc/
[3] B. Cohen. (2012, Oct 20). The BitTorrent Protocol Specification (3rd revision)
[Online]. Available: http://www.bittorrent.org/beps/bep_0003.html
[4] A. Russell (2014, June 5). The Service Worker Specification [Online]. Avail-
able: https://github.com/slightlyoff/ServiceWorker
[5] D. P. Anderson et al. (2011). About SETI@home [Online]. Available:
http://setiathome.ssl.berkeley.edu
[6] D. Ristic. (2014, February 4). WebRTC data channels [Online]. Available:
http://www.html5rocks.com/en/tutorials/webrtc/datachannels/
[7] Headlight Software, Inc. (2006, March 13). HTTP/FTP Seeding for BitTorrent
[Online]. Available: http://www.getright.com/seedtorrent.html
[8] J. Archibald. (2012, May 08). Application Cache is a Douchebag [Online].
Available: http://alistapart.com/article/application-cache-is-a-douchebag
[9] J. Hiesey, F. Aboukhadijeh and A. Raja. (2013). PeerCDN [Online]. Avail-
able: https://peercdn.com
[10] J. Hoffman. (2011, August 22). HTTP-Based Seeding Specification [On-
line]. Available: http://www.webcitation.org/6184q7Pjn
[11] L. Poettering. (2011, May 18). Systemd for Developers: Socket Ac-
tivation [Online]. Available: http://0pointer.de/blog/projects/socket-
activation.html
[12] Mozilla Developer Community. (2014, March 18). WebSockets [Online].
Available: https://developer.mozilla.org/en-US/docs/WebSockets
[13] Mozilla Developer Community. (2013, Sept 6). File System
API guide [Online]. Available: https://developer.mozilla.org/en-
US/docs/WebGuide/API/File System
[14] Netflix Inc. (2014, April). USA ISP Speed Index [Online]. Available:
http://ispspeedindex.netflix.com/usa
[15] N. Martindell. (2014, May 18). DCDN: A Distributed
CDN based on modern web technologies [Online]. Available:
https://github.com/DaemonF/dcdn
[16] Swarm Labs, LLC. (2014). What Is Swarmify? [Online]. Available:
http://swarmify.com
[17] S. Dutton. (2012, July 23). Getting Started with WebRTC [Online]. Avail-
able: http://www.html5rocks.com/en/tutorials/webrtc/basics/
[18] S. Guha and P. Francis. “Characterization and Measurement of TCP
Traversal through NATs and Firewalls,” in Proceedings of Internet Mea-
surement Conference (IMC), Berkeley, CA, Oct 2005.
[19] tabatkins (pseudonym). (2014, May 17). Partial data
/ Range requests - Issue #280 [Online]. Available:
https://github.com/slightlyoff/ServiceWorker/issues/280