Report from the IAB Workshop on Exploring Synergy between Content Aggregation and the Publisher Ecosystem (ESCAPE)

mt@lowentropy.net, mnot@mnot.net

Keywords: web, security, origin, packaging, bundle

The Exploring Synergy between Content Aggregation and the Publisher Ecosystem
(ESCAPE) Workshop was convened by the Internet Architecture Board (IAB) in
July 2019. This report summarizes its significant points of discussion and
identifies topics that may warrant further consideration.

Note that this document is a report on the proceedings of the
workshop. The views and positions documented in this report are
those of the workshop participants and do not necessarily reflect IAB
views and positions.

Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This document is a product of the Internet Architecture Board
(IAB) and represents information that the IAB has deemed valuable
to provide for permanent record. It represents the consensus of the Internet
Architecture Board (IAB). Documents approved for publication
by the IAB are not candidates for any level of Internet Standard; see
Section 2 of RFC 7841.
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.
Table of Contents
Introduction
Mention of Specific Entities
Use Cases
Instant Navigation
Offline Content Sharing
Other Use Cases
Book Publishing
Web Archiving
Interactions between Web Publishers and Aggregators
Incentives for Web Packages
Operational Costs
Content Regulation
Web Performance
Systemic Effects
Consolidation
Consolidation of Power in Linking Sites
Consolidation of Power in Publishers
Consolidation of User Preferences
Effect on Web Security
Privacy of Content
AMP Issues Unrelated to Web Packaging
AMP Governance
Constraints on the AMP Format
Performance
Implementation of Paywalls
Venues for Future Discussion
Security Considerations
Informative References
About the Workshop
Agenda
Thursday 2019-07-18
Friday 2019-07-19
Workshop Attendees
Web Packaging Overview
Authority in HTTPS
Authority in Web Packaging
Applicability
The AMP Format, Google Search Results, and Web Packaging
IAB Members at the Time of Approval
Authors' Addresses

Introduction

The Internet Architecture Board (IAB) holds occasional workshops
designed to consider long-term issues and strategies for the
Internet, and to suggest future directions for the Internet
architecture. This long-term planning function of the IAB is
complementary to the ongoing engineering efforts performed by working
groups of the Internet Engineering Task Force (IETF).

The IAB convened the ESCAPE Workshop to examine some proposed changes to the Internet
and the Web, and their potential effects on the Internet publishing landscape.
Of particular interest was the Web Packaging proposal from Google, under
consideration in the IETF, the W3C's Web Incubator Community Group (WICG), and
the Web Hypertext Application Technology Working Group (WHATWG).

In considering these proposals, we heard about both positive effects of Web
Packaging and concerns that it could have significant effects on the
relationship between publishers (e.g., news web sites) and content aggregators
(e.g., search engines and social networks). As such, our focus was primarily on
this relationship, rather than technical discussion.

Online publishers do not regularly participate in standards activities
directly. A workshop format was used to solicit input from them. The workshop
had 27 participants from a diverse set of backgrounds, including a small number
of attendees from publishers, one aggregator (Google), plus representatives from
browsers, the Accelerated Mobile Pages (AMP) community, Content Distribution Networks (CDNs),
network operators, academia, and standards
bodies. See the workshop call for papers for more information
and a complete listing of submissions.

As intended, the workshop was primarily a forum for discussion, so it did not
reach definite conclusions. Instead, this report is the primary output of the
workshop, as a record of that discussion.

This report documents the use cases discussed in "Use Cases" and explains the
interactions between publishers and aggregators that might be affected by Web
Packaging in "Interactions between Web Publishers and Aggregators". "About the
Workshop" includes more details about the workshop itself. For those
unfamiliar with Web Packaging, "Web Packaging Overview" provides a summary
as background material.

Mention of Specific Entities

Participants agreed to conduct the workshop under the Chatham House Rule,
so this report does not attribute statements to individuals
or organizations without express permission. Submissions to the workshop were
public and thus attributable; they are used here to provide substance and
context.

Use Cases

Much of the workshop concentrated on discussion of the validity and relative
merits of the use cases that might be enabled by Web Packaging. See
"Web Packaging Overview" for an overview of Web Packaging.

Instant Navigation

The largest use of Web Packaging so far is in Google Search, where packages are
intended to improve the perceived performance of navigation to pages that are
linked from search results when "clicked".

To enable this, when a linking (or referring) web page includes links to pages
on another site, it also provides the browser with a packaged copy of the target
content, signed by the origin of the target content. In effect, the referring
page provides a cache for the target page's content. If navigation to one of
those links occurs, having the Web Package gives a browser the assurance that
the cache didn't change the content, so it can treat that content as if it were
acquired directly from the server for the target page -- even though it came from
a different server. In many cases, this results in significantly lower perceived
delay in displaying the target page.

A vital characteristic of this technique is that the browser does not contact
the target site before navigation. The browser does not make any requests to
sites until after navigation occurs, and only then if the site requires
additional content or makes a request directly.

Similar improvements could also be realized by downloading content (packaged or
otherwise) directly from the target site through a technique called
"prefetching". However, doing so would reveal information about the user's
activity on the linking page to the target sites -- even when the user never
actually navigates to them.

Sites bundled with Web Packaging can additionally be constructed in a way that
ensures that they render without needing any additional network access. This
makes it possible to provide near-instantaneous navigation. The proposed changes
to web navigation in support of loading Web Packages are designed to support this
use case.

Workshop participants recognized the value of web performance for usability, as
well as for business metrics like retention and bounce rates. Such improvements
were seen as a valuable goal, but publishers raised questions about whether they
justified the cost of supporting an additional format, while others raised
concerns about different aspects of the Web Packaging proposal.

Offline Content Sharing

Another primary use case discussed was the ability to share web content between
devices where neither has an active connection to the Internet. One of the
stated goals of Web Packaging is to enable sharing of content offline.

Several participants reported that in areas where Internet access is expensive,
slow, or intermittent, the use of direct peer-to-peer file exchange (e.g.,
"saving a website and sharing it on a USB stick") is commonplace. Most web
browsers already have some affordances for this, but these are recognized as
needing improvement.

In the discussion, several participants rejected an assumed requirement of this
use case -- that there be no difference between the treatment of a "normal" web page and
that of one loaded from an offline Web Package.

The ability for a Web Package to provide clear attribution for content was seen
as valuable by some participants for a range of reasons. However, reservations
were expressed about the subtleties of the properties that signatures provide
and the effect of this on web security; see also Sections and .

Many participants pointed out that using "unsigned bundles" -- that is, Web
Packages without signed exchanges -- could be adequate for this use case, since
most users don't need cryptographic proof of the site's identity. However, some
expressed concerns that this might worsen the propagation of falsehood.

Some suggested that the value of signed exchanges was not realized in
small-scale interpersonal exchange of information but in the building of
systems for content delivery that might include capabilities like discovery and
automated distribution. The contention here was that effective use of digital
signatures in offline distribution of content implied considerably more
infrastructure than was described in current proposals.

No definite conclusions about offline sharing were reached during the workshop.

Other Use Cases

A session on the second morning concentrated on two other significant potential
use cases for Web Packages: book publishing and Web archiving. These were not
seen as "primary" by the proponents of Web Packaging; the original intent was
not to spend significant time on these subjects, but there was considerable
interest from attendees.

Book Publishing

The potential application of a packaging format to book publishing was
discussed, with particular reference to ways that books differ from web
content. Specialists from that industry pointed out that book delivery can vary
greatly from typical web content delivery.

Workshop participants briefly explored existing solutions. PDF was seen as
particularly challenging for this use case, due to its limitations, and EPUB
has constraints that also make it challenging for publishers.

Although Web Packaging might help to address this use case, the question of how
to identify book content was not resolved. The use of signed exchanges in this
context might offer means of tying content in books to a website, but several
limitations inherent in doing that were identified.

In particular, book publication specialists represented that books don't have
the same requirements for timeliness or currency as web pages. For instance,
Dave Cramer's submission observed that Moby Dick was published
over 61,000 days ago, which is considerably longer than the proposed limit of 7
days for signed exchanges. The limited length of time that a Web Package can be
considered valid was discussed at some length.

Additionally, the risk of a publisher going out of business during the lifetime
of a book is significant, because books -- at least successful ones -- often span
generations in their applicability. To that end, having a means of attributing
content to a publisher was considered less practical and potentially
undesirable (much like the discussion above regarding "unsigned bundles").

There were other aspects of book publication that participants saw as
challenging for packaging. For example, it is currently not understood what it
means to refer to distinct parts of a book. Participants saw this as an area where
providing stable references for bundles of content might offer possibilities,
but nothing concrete came from that discussion.

The potential for active content in a bundle to use web APIs to enrich content
or enable new features was considered valuable. Models for enabling paywalls
were discussed at some length (see "Implementation of Paywalls").

Web Archiving

Web archiving is a complicated discipline that is made more difficult by the
complex nature of the Web itself.

From an archival standpoint, the potential for web content to be provided in a
self-contained form was viewed positively. Several improvements to the
structure of Web Packaging were considered, such as providing complete sets of
content and the use of Memento.

Though there were potential applications of a packaging scheme, many challenges
were recognized as requiring additional work on the part of content producers to
be fully effective. For example, JavaScript is needed to render some archived
content faithfully, but attributing that content to an origin in all scenarios
is challenging.

If packaging were to be widely deployed, it might improve the situation for
archival replay. In particular, the speculation is that there would be less "live
leakage", as packaged content might be less likely to refer to live resources
that currently tend to "leak" into views of archives. It was also noted that
subresources might be more likely to be packaged, especially those that are
needed for deferred representations (i.e., after JavaScript execution on the
page or some user interactions). Other potential applications and enhancements
are discussed in .

Participants discussed the use of a signature for non-repudiation at some
length. In one case related to the Internet Archive, a public figure disputed the
accuracy of archived content, asserting that the original content was
modified either at the source or in the archive.

Some participants initially saw digital signatures as a way to address such
issues of provenance. As similar problems exist in other areas, such as in book
publication, medical research, and news, a solution to this problem was
considered to have broad applicability.

However, the discussion ultimately concluded that providing non-repudiation in
retrospect is challenging. Signing keys are not expected to remain secure for
long periods. If keys are leaked afterwards, an attacker could retroactively
generate fraudulent signatures. Alternative solutions were discussed, such as
providing independent archives for the same data, using consensus protocols, or
using an append-only construct like a Haber-Stornetta log, all of which can be
used to increase the difficulty of altering or misrepresenting established
archives.

Interactions between Web Publishers and Aggregators

A significant motivation for holding the workshop was to provide a forum where
publishers could discuss the impact of Web Packaging on the online publishing
ecosystem. Of primary interest was whether Web Packages might effectively enable
a transfer of power from publishers to aggregators.

Both publishers and aggregators at the workshop expressed the importance of
maintaining a positive relationship. Publishers in particular expressed the
need to be able to trust that aggregators won't misrepresent their work or
de-emphasize it for reasons unrelated to quality and perceived value to the
user.

One key question from a workshop submission was discussed:

   Web Packaging has other uses, but it is primarily seen by a large proportion
   of its stakeholders as a solution to problems that AMP created. Before we agree
   to solve those issues, should we not ask if AMP was a useful approach in the
   first place -- and useful to whom?

In examining this issue, discussion focused on the current incentive model
offered by aggregators. The costs that publishers incur for participation in
that system were considered. Considerable time was spent on AMP; a summary of
that discussion can be found in "AMP Issues Unrelated to Web Packaging".

We also considered the question of whether standardizing Web Packaging confers
credibility on aggregators exercising unwelcome control over publisher content
or whether the technical safeguards Web Packaging provides could allow
aggregators to relax their restrictions on the kinds of content they're willing
to cache and serve. No conclusions were drawn.

Incentives for Web Packages

Submissions to the workshop indicated that the use of inducements involving
better placement and formatting of links to publisher content had a significant
effect on the uptake of related technology. For example, one submission stated:

   [...] The Washington Post has always placed a great deal of trust in Google to
   represent its content--and their reward for doing so is more traffic, which
   positively impacts the business.

During the workshop, several online publishers indicated that if it weren't for
the privileged position in the Google Search carousel given to AMP content,
they would not publish in that format.

Publishers that do produce AMP said they see a non-trivial increase in traffic
as a result of deploying AMP content. For example, Yahoo Japan reported a 60%
increase in traffic as a result of deploying AMP on Yahoo Travel.
No data was presented as to whether this increase was due to better
placement in Google Search results, the inherent benefits of the AMP Cache,
or the use of the AMP format.

Anecdotal evidence was offered by another large publisher that saw a 10% drop
in traffic as a result of accidentally disabling AMP content. However,
increases in traffic might not result in similarly proportioned increases in
revenue, as observed in .

Operational Costs

Several participants pointed out that introducing a new, parallel
format for Web content incurs operational costs. In particular,
supporting any new format -- such as Web Packaging, Apple News, or
Facebook Instant Articles -- requires not only initial development of
tooling (some generic and some specific to a site's requirements) but
also an ongoing investment in maintaining its operability. Some
participants expressed concern about the impact upon small publishers
with limited technical and financial resources, especially in the
current publishing climate.

Increased exposure from new formats might not always justify the added expense
of providing articles in that format. However, a standardized
format might help publishers reduce the cost of maintaining multiple formats.

Content Regulation

The use of Web Packaging as a tool for avoiding censorship was not a
significant topic of discussion, except to note that publishers often have
regulatory requirements regarding removal or correction of content.

Reference was made to the desire to remove videos of a recent shooting
and the potential difficulty in doing so if content were
available as Web Packages. Legal requirements to remove content come from
multiple angles; copyright violations, illegal content, editorial corrections or
errors, and right-to-erasure provisions in the European Union General Data
Protection Regulation were all mentioned. One participant speculated that
making it more difficult to remove material in this way might discourage
regulators from censoring content.

In this context, participants observed that it would be difficult to create
mechanisms to track and control content served as a Web Package without compromising the stated
goal of censorship resistance.

Web Performance

Understanding the effect that Web Packaging might have on web performance was a
matter of some contention.

Some informal analysis from the Google Search deployment was presented (later
published in ) that showed significant performance improvements in
metrics related to navigation time resulting from the combination of prefetch,
prerendering, and the AMP format. These results suggest that Web Packaging
could provide some of that improvement on its own, but no
data was presented that apportioned the improvement among the three components.

Though the data was presented to demonstrate potential rather than to be a definitive
result, discussions raised a number of questions that suggest the need for
further study. Attendees suggested that future measurements consider the effect
of signed bundles distinct from the enhancements derived from the AMP
format. Future research in this area might also consider the effectiveness of
different strategies on devices with varying capabilities, bandwidth, power
consumption requirements, or network conditions.

Of particular interest is the additional work required to fetch and render
multiple web pages in preparation for navigation. This might ultimately use fewer
connections but comes with an increased network and CPU cost for clients. Some
participants pointed out that different clients or applications might require
different tuning -- for example, when users have limited (or expensive) bandwidth
or for sites with less clear knowledge about the use of outbound links.

Workshop participants also expressed interest in learning about the effect of
Web Packages on subsequent navigations within the target site.

In discussion, some participants suggested that their experience supported a
theory that operating a cache at the linking site was most effective and the
additional work done prior to navigation in terms of fetching and preparing
content was what provided the most gains; others suggested that the benefits
inherent in the AMP format were a dominant factor.

Understanding the complete effect of Web Packaging on web performance will
require further work.

Systemic Effects

It is not straightforward to estimate how a proposed technology change might
affect all of the parts of a system -- including not only other components, but
also things like end-user rights and the balance of power between parties --
ahead of time. To date, when evaluating proposals, the IETF has generally
focused on more immediate concerns, such as interoperability and security.

Moreover, people often find new uses for successful standards
after they are deployed. It is rarely possible to
accurately predict all applications of a protocol or format, whether they are
harmful or beneficial. Refusing standardization only impedes both outcomes.

With the understanding that predictions are difficult to make, there was
considerable speculation at the workshop about the possible effect of Web
Packaging on the Web. Some of that speculation is informed by experience, but
that experience is necessarily limited in scope. This section attempts to
capture that discussion.

Consolidation

Concerns about the consolidation of power on the Internet have significantly
increased lately, as a result of several factors. While the IAB, the Internet
Society, and others are examining this phenomenon to understand it better, it is
nevertheless prudent to consider whether proposals for changes to how the
Internet works favor or counter consolidation. Favoring entities with existing
advantages -- like resources, size, or market share -- is not necessarily a factor
that disqualifies a new proposal, but it needs to be considered as a cost of
enabling that technology.

Although the outcomes of adopting Web Packaging are unclear,
the workshop revealed several concerns about consolidation risks for all
involved parties: users, publisher sites, linking sites, and services they each
rely on.

Consolidation of Power in Linking Sites

Several participants noted that Web Packaging's enabling of instant navigation
(see "Instant Navigation") might advantage larger linking sites -- such as social networks or
search engines -- over smaller ones in the same industry because doing so
requires careful selection of which links to optimize, so as not to create
unneeded traffic.

For example, a news article often has many links, but not all of them are
equally likely to be followed. Deciding which ones to prefetch requires
considerable data collection and engineering, so this technique might not be
feasible for smaller entities. Additionally, some participants noted that this
technique favors sites that have a linear set of ranked links, like search
results; it is more difficult to apply to a page of news (for example) because
it is harder to predict which link a user will follow.

This technique also requires access to a cache with terms of use compatible
with the requirements of the site. It was pointed out that the Google AMP Cache
has policies that might be acceptable to many, and there are other caches.
Sites operated by entities other than Google already use this cache, though it
was observed that a site that does not host its own cache suffers a minor
performance degradation.

Consolidation of Power in Publishers

Participants seemed to agree that if performance is a strong enough
differentiator, the effective use of Web Packaging might turn out to be a
condition for success for online publishers. Google Search's choice to
privilege content that is served using HTTPS was pointed out as showing that
this sort of influence can be effective. Equally, it is not necessarily the
case that standardization of new capabilities will affect such policies
materially, as noted in one submission:

   It seems unlikely that any decisions we make in a packaging or distribution
   system will affect the considerations aggregators use when deciding how to rank
   recommendations or the power this gives them over publishers.

The most common concern raised in the discussion was the effect of this
technology on smaller publishers who might be less able to optimize the packages
they produce, where their primary differentiation in the market has previously
been the quality of their content.

Consolidation of User Preferences

In typical operation of the Web, servers have an opportunity to tailor content
to the needs of their users. In contrast, a static Web Package has few options
for individualization, as the content is generated once and used by many.

As a result, publishers noted that AMP provides less opportunity to customize
content for their customers. Their concerns included not only personalizing
content based on what they know about the user but also optimizing the package
for specific browsers. Other participants observed in relation to this that Web
Packaging might also have a consolidating effect in the browser market.

Some participants brought up the possibility of customization by providing
multiple packages, including multiple variants of resources in a single package,
or performing customization after the package was loaded. However, other
participants pointed out that all of these options have negative side effects,
either in complexity or reduced performance arising from larger bundles or
delayed customization.

Effect on Web Security

One session explored the impact of introducing a new security model for the
Web. Currently, sites rely on connection-oriented security (provided by TLS),
but Web Packaging adds a limited form of object security.
That is, the package protects the integrity of a message, rather than providing
integrity and confidentiality for its delivery. Object security is not a new
concept in the context of the Web; designs like SHTTP are as
old as HTTPS. Though the intent is for Web Packaging to have a far more narrow
applicability, it provides fewer security guarantees than HTTPS, since it
provides only authentication, no confidentiality with respect to the cache, and
no assurance of liveness.

Object-based security -- such as that proposed in Web Packaging -- allows the use of
content regardless of how it is obtained; some participants noted that third
parties gain greater control over the distribution of content, reducing the
ability of publishers to retract or alter content over the validity period of
signed content.

Another topic of discussion was composition attacks. In its proposed form, Web
Packaging only provides authentication of independent resources, not a web page
as a single unit, allowing an attacker to control the composition of resources.
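
As a rough illustration of the gap between per-resource and page-level authentication, the following Python sketch signs each resource of two releases separately and then mixes them. It is not how signed exchanges are implemented: an HMAC with a made-up key stands in for the publisher's signature, and the URLs, contents, and function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical key; an HMAC stands in for the publisher's real signature scheme.
PUBLISHER_KEY = b"hypothetical-publisher-key"


def sign(url: str, body: bytes) -> bytes:
    """Authenticate a single resource, independently of all others."""
    message = url.encode() + b"\0" + body
    return hmac.new(PUBLISHER_KEY, message, hashlib.sha256).digest()


def verify(url: str, body: bytes, signature: bytes) -> bool:
    """Check one resource in isolation, as a per-resource scheme would."""
    return hmac.compare_digest(sign(url, body), signature)


def sign_release(release: dict) -> dict:
    """Sign every resource of a release separately."""
    return {url: (body, sign(url, body)) for url, body in release.items()}


# Two releases of the same hypothetical page; the second fixes a script flaw.
release_1 = {"/index.html": b"<p>layout v1</p>", "/app.js": b"/* v1: flawed */"}
release_2 = {"/index.html": b"<p>layout v2</p>", "/app.js": b"/* v2: fixed */"}
signed_1 = sign_release(release_1)
signed_2 = sign_release(release_2)

# An intermediary combines the new page with the old, flawed script.
mixed = {"/index.html": signed_2["/index.html"], "/app.js": signed_1["/app.js"]}

# Each resource passes its own check ...
every_resource_verifies = all(
    verify(url, body, signature) for url, (body, signature) in mixed.items()
)

# ... yet the combination was never published as a whole.
mixed_bodies = {url: body for url, (body, _) in mixed.items()}
matches_a_published_release = mixed_bodies in (release_1, release_2)

print(every_resource_verifies)      # True
print(matches_a_published_release)  # False
```
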
This weakness was acknowledged as a known shortcoming of the current proposal
that would be addressed.

The issue of managing the trade-off between control and performance in caches
arose. While participants recognized that problems with resource composition
already occur by accident -- for example, when a cache stores different versions
of resources -- Web Packaging allows an attacker more direct control over what
resources are available to clients.

For example, an attacker might be able to cause content with a security flaw to
be used up to a week past the time that the defect was fixed.

As an example of how Web Packaging might change the risk profile for sites,
participants discussed recovery from cross-site scripting attacks. It is already
the case that a brief exposure to this class of attack can result in an attacker
gaining persistent access, but mechanisms exist that can be used to avoid or
correct issues, like cache validation and Clear Site Data. These
measures are not available to clients unless they connect to the site.

The discussion pointed out that these concerns are not new or uniquely enabled
by Web Packaging. However, it was also noted that new features are routinely
subject to higher security and privacy expectations. In an example unrelated to
Web Packaging but with similar trade-offs, shared compression of multiple
resources has significant performance benefits. The risk with shared compression
is the potential for exposing encrypted information through
side channels. Though sites can use shared compression without this exposure,
shared compression will likely only be enabled once it is clear that measures to
prevent accidental information exposure are understood to be effective in a
broad set of deployments.

The discussion also addressed the question of whether these concerns might equally
apply to the typical use of a CDN as a
third-party provider of the content. Some participants concluded that CDNs are
typically in a contractual relationship with the sites they serve and so are
more likely to have their interests aligned.

Privacy of Content

Discussion and submissions raised concerns regarding how serving content using
Web Packages might adversely affect privacy of individuals. There are
challenges here, but the very narrow applicability of Web Packaging to what is
effectively static content limits the privacy risk. The conclusion was that,
provided sufficient care is taken in implementation, the use of Web Packages does
not substantially increase the information that an aggregator gains about what
content is consumed.

Concretely, an aggregator knows what content it serves in anticipation of
navigation. This is -- at least in theory -- substantially the same as the
content that the aggregator might receive if it performed the navigation
itself. Assuming that content is stripped of personalization, the aggregator
gains no new information.

AMP Issues Unrelated to Web Packaging

On multiple occasions, discussion at the workshop concentrated on problems that
arise as a result of constraints on the AMP format or details of its inclusion
in Google Search. For instance, the requirement to make pages expose their
metadata is unlikely to be affected by any standardization of a
packaging format as that requirement is independent of the process of
delivering content.

This section provides some detail on aspects of the discussion that touched on
AMP more generally in this way. Some treatment of these points is considered
relevant as some of the discussion at the workshop, even under the remit of
discussing Web Packaging, concentrated on the effect of AMP on the ecosystem.

Discussion and submissions referred to a commitment to allow
publishers to use content that met specific criteria to access privileged
positions in search results, regardless of their adoption of AMP. Participants
felt that this approach might address some of these concerns if it were adopted
and durable. For instance, the use of Web Packaging might be sufficient to
remove some constraints on active content on the basis that the active content
would be attributed to the publisher and not the AMP Cache.

AMP Governance

There was interest from workshop participants in the governance model used for
AMP. In particular, the question of how independent the AMP project would be of
Google and Google Search arose.

Three of the seven members of the AMP Technical Steering Committee, the body
that governs AMP, are Google employees, which gives Google considerable
influence over the project. It was asserted that the governance structure was
intended to be more independent of Google over time. The understanding was that
any consumer of the format, such as Google Search, would make an independent
assessment about whether to use or require different aspects of the AMP project
products.

Constraints on the AMP Format

Sites often implement AMP by creating a separate set of content in parallel to
their regular HTML content. Publishers noted this as a high cost, particularly
for smaller sites. It was pointed out that websites can serve AMP-compliant
content exclusively. However, several publishers referred to limitations in the
format that made it unsuitable for their needs.

Many of the reasons cited for this duplication related to the necessity of
running arbitrary active content (typically, JavaScript). For example:
* AMP provides a framework for supporting user authentication, but publishers
  asserted that using this framework was not practical.
* AMP content does not support rendering of certain content, which can affect
  the ability of publishers to innovate in content production.
* The AMP model for the implementation of paywalls () was claimed
  to be inimical to some publisher business models.
More broadly, they considered AMP's constraints on the use of active content as
problematic, since they prevent the use of capabilities that are provided on
equivalent non-AMP pages. Reference was made to a proposed <amp-script>
element -- which has since been made fully available -- that seeks to provide
limited access to some dynamic content.

Performance

Publishers observed that using the AMP format does not provide any guarantee of
performance gains and, in some cases, could contribute to performance
degradation. It was suggested that this was most problematic for sites that are
already well-tuned for performance.

Implementation of Paywalls

The use of paywalls by web publishers to control access to content in return
for payment is increasingly common. One popular approach is to offer a limited
number of articles without payment while insisting on a paid subscription to
access further articles.

On several occasions, participants expressed dissatisfaction with the difficulty
of integrating paywall authorization when using AMP. In particular, they said
AMP encourages publishers to include an article's full content, hidden by
default but easily accessible to motivated users.
The discussion extended to workarounds like cookie syncing ,
which is used as part of authorization and is a consequence of having cached content hosted on the
linking site rather than the target site.

The same topic came up concerning book publication, where publishers indicated
that having a means of enabling different methods of distribution without also
facilitating unconstrained copying of book content was necessary.

This conflation of AMP issues with those addressed by Web Packaging was
recurrent in the discussion. As observed in , these concerns might be
addressed by linking to a signed bundle.

Venues for Future Discussion

Web Packaging work continues in multiple forums. Questions about the
core format and signatures are being discussed on the wpack@ietf.org
mailing list. Changes to web browsers as proposed in will be discussed on the Fetch specification
repository.

Security Considerations

Proposals discussed at the workshop might have a significant security impact,
and these topics were discussed in some depth; see .

Informative References

Supporting Web Archiving via Web Packaging. Old Dominion University, Los Alamos National Laboratory, Data Archiving and Networked Services.
Standardizing lessons learned from AMP. Google.
The Speed Benefit of AMP Prerendering. Google.
How to time-stamp a digital document. Bellcore. Journal of Cryptology, Vol. 3, Issue 2, pp. 99-111.
ESCAPE: The New York Times Position. The New York Times Company.
ESCAPE Position / Patch.com. Patch.com.
Bundled HTTP Exchanges. Work in Progress.
Exploring Synergy between Content Aggregation and the Publisher Ecosystem Workshop 2019. Internet Architecture Board.
Chatham House Rule. Chatham House.
'Thousands' of Christchurch shootings videos removed from YouTube, Google says. Stuff Limited.
Clear Site Data. Google. W3C Working Draft.
The Web Never Forgets. CSS '14: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 674-689.
Packaging Books. Hachette Book Group.
The Implication of Signed Exchanges on E-Commerce. 1-800-Flowers.com.
Signed Exchanges and The Importance of Trust in Aggregator/Publisher relationships. The Washington Post.
General Data Protection Regulation. European Union. EU Regulation 2016/679.
Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing.
Loading Signed Exchanges. Google.
HTTP Framework for Time-Based Access to Resource States -- Memento.
The Web Origin Concept.
Deployment Experience of Signed HTTP Exchanges with AMP as a Publisher. Yahoo Japan Corporation.
The Secure HyperText Transfer Protocol.
What Makes for a Successful Protocol?
Signed HTTP Exchanges. Work in Progress.
Distributed and syndicated content. W3C TAG Finding.
The Transport Layer Security (TLS) Protocol Version 1.3.
Chrome's position on the ESCAPE workshop. Google.

About the Workshop

The ESCAPE Workshop was held on 2019-07-18 and the morning of 2019-07-19 at
Cisco's facility in Herndon, Virginia, USA.

Workshop attendees were asked to submit position papers. These papers
are published on the IAB website.

The workshop was conducted under the Chatham House Rule, meaning that statements
cannot be attributed to individuals or organizations without explicit
authorization.

Agenda

This section outlines the broad areas of discussion on each day.

Thursday 2019-07-18
Web Packaging Overview:
A technical summary of Web Packaging was provided, plus a longer discussion
of a range of use cases.
Web Packaging and Aggregators:
The use of Web Packaging was described from the perspective of a content
aggregator.
Web Packaging and Publishers:
After a break, presentations from web publishers covered the benefits
and costs of Web Packaging. This included some discussion of the effect of
developing AMP-conformant versions of content from a publisher perspective.
Web Packaging and Security:
This session concentrated on how the Web Packaging proposal might affect the
web security model.
Alternatives to Web Packaging:
This session looked at alternative technologies, including those that were
attempted in the past and some more recent ideas for addressing the use case of
making web navigations more performant.
Friday 2019-07-19
Web Archival:
This session discussed the potential application of a technology like Web
Packaging in addressing some of the myriad problems faced by web archival
systems.
Book Publishing:
The effect of technologies for bundling and distribution of
books was discussed.
Conclusions:
A wrap-up session attempted to capture key takeaways from the workshop.
Workshop Attendees

Attendees of the workshop are listed with their primary affiliation as it
appeared in submissions. Attendees from the program committee (PC), the
Internet Architecture Board (IAB), and the Internet Engineering Steering Group
(IESG) are also marked.
, Old Dominion University
, Ericsson (IAB)
, Cisco
, New York Times (PC)
, Cloudflare
, Patch.com
, Cisco (IESG, IAB)
, Hachette Book Group
, Washington Post
, AMP Advisory Committee
, Google
, Center for Democracy & Technology (PC)
, Washington Post
, Old Dominion University
, Fastly (IAB, PC)
, Yahoo
, Mozilla
, Mozilla (IESG)
, Akamai Technologies
, W3C
, Pantheon (PC)
, Hughes
, W3C
, Mozilla (IAB, PC)
, Google
, Internet Society
, John Wiley & Sons
Web Packaging Overview

Web Packaging comprises two separate technologies: resource bundling
and signed exchanges.

In both the submissions and workshop discussion, the most controversial aspect
of the technology is the use of signed exchanges as an alternative means of
providing authority over a particular resource, for a few different reasons.

This appendix explains how authority works on the Web and how Web Packaging
proposes to change that.

Authority in HTTPS

The Web currently uses HTTPS to establish a server's
authority -- that is, to give an assurance that the content came from where the
URL implies. The combination of URI scheme (https), domain name (or host), and
port number is formed into a single identifier, the origin,
to which content is attributed.

Web browsers use the certificate offered as part of a TLS connection
to servers in determining whether a server is authoritative
for that origin; see and
.
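
The origin described above can be illustrated with a minimal sketch. It assumes only http and https URLs and ignores details that a real browser handles (internationalized host names, opaque origins, and so on). The names `origin`, `same_origin`, and `DEFAULT_PORTS` are hypothetical and are not taken from the workshop material or from any browser implementation.

```python
from urllib.parse import urlsplit

# Default ports applied when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    host = parts.hostname  # urlsplit lowercases the host name
    if not host:
        raise ValueError("URL has no host")
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return (scheme, host, port)


def same_origin(a, b):
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin(a) == origin(b)


if __name__ == "__main__":
    print(origin("https://example.com/index.html"))
    print(same_origin("https://example.com/index.html",
                      "https://example.com:443/other"))
    print(same_origin("https://example.com/", "http://example.com/"))
```

In this sketch, an explicit port 443 and an omitted port on an https URL compare as the same origin, while a change of scheme or host yields a different origin.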
Content is attributed to a given URL only if it is received from a connection
to a server that is authoritative for the associated origin.

As an example, a web browser seeking to load https://example.com/index.html
makes a TLS connection to a server. As part of the TLS connection
establishment, the server offers a certificate for the name example.com. If
the browser accepts the certificate, it will then make requests for URLs on the
https://example.com origin on that connection and consider any answers from the
server to be authoritative.

This notion of authority is a crucial property of web security: only content
that is attributed to the same web origin can access all information in that
origin, including the content of most resources as well as state associated
with the origin, such as cookies. This separation ensures that sites can keep
secrets from each other, even when they are both loaded in the same browser.

Authority in Web Packaging

Web Packaging, through the use of signed exchanges, aims to provide an
alternative means of establishing authority. A signed exchange is an expression
of an HTTP request and response (an exchange) with certain information stripped
and a digital signature applied.

The signature is made with a certificate similar to the one a server might
offer in HTTPS -- that certificate can also be used for HTTPS -- but it includes
a special attribute that denotes its suitability for signed exchanges.

A web browser that has been provided with a signed exchange can verify the
signature and, if the signature is valid and the certificate is acceptable,
use the content from the signed exchange. Critically, the web browser does not
make an HTTPS connection to a server to get the content or to verify the
signature.

In effect, Web Packaging moves from a model where authority is derived from the
delivery method (i.e., TLS) to an object security model, where authority is
derived from a signature on objects. In doing so, it aims to render the means
of delivery irrelevant to determinations of security.

Applicability

Web Packaging does not claim to supplant the authority model of the Web
completely, but it does provide an alternative that might be used under certain
narrow conditions. In particular, Web Packaging is intended for use with
content that is not secret from an entity that is aware of the existence of
that content.

In aid of this goal, Web Packaging does not include information
from exchanges that is related to the process of acquiring content,
nor does it include any information that is related to individual requests.
For instance, use of the Set-Cookie header field is expressly forbidden, as it
often contains information that is related to a particular user.

The AMP Format, Google Search Results, and Web Packaging

The relationship between the AMP Project and Web Packaging is
complicated. The AMP Project, sponsored by Google, establishes a profile of HTML
with a stated goal of providing support for the best practices for the format,
with a strong emphasis on performance. The format tightly constrains the use of
HTML features but also offers a library of components that provide sanitized
implementations of many commonly used capabilities.

The connection to Web Packaging is bound up in the way that Google Search
treats AMP content specially. AMP content provides two properties that Google
Search exploits: metadata exposure and static analysis of active content.

AMP content provides metadata in a form that can be reliably extracted, using
the microformats defined by the Schema.org project. This
aspect of AMP has no effect on the discussion, except to the extent that it
relates to Google Search and its use of this metadata in populating the
carousel.

Constrained use of active content -- such as JavaScript -- in AMP makes it
possible to analyze content to verify that actions taken are narrowly limited.
This static analysis ensures that AMP content can be served without affecting
other content on the same site. For Google Search, this is what enables the
loading of AMP content alongside search content and other AMP resources.

To provide preloading, Google operates the Google AMP Cache, from which AMP
content is served.
As a consequence, browsers attribute the content to the origin
of the AMP Cache and not the publisher, creating some
confusion about how content is attributed, as discussed in the W3C finding on
distributed content.

An important goal of Web Packaging is to attribute content loaded from a cache,
such as the Google AMP Cache, to the publisher that created that content. For more on
this, see .

IAB Members at the Time of Approval

Internet Architecture Board members at the time this document was approved
for publication were: