Where Am I? NYTimes or Google? (theinternetbytes.com)
You are supposed to trust Google.
And when your browser says 'Google' - you know it is all good.
I live Out In The Styx, in a Shithole country, at the end of an allegedly 2MB/s piece of wet string masquerading as an internet connection that seldom lives up to its advertised performance. AMP has never once made any significant difference to my web experience.
My point is that it's so easy to just see the drawbacks and none of the benefits when you're sitting on a good connection. All threads on HN become completely one-sided where everyone is just backslapping each other's complaints.
Well, this is one kind of modern skepticism I particularly like: Does gravity kill if one jumps off a cliff? Is a sphere round? Is it really bad if we give up our freedom? Who are we to think for ourselves?
When questions like this are asked, the damage is already done. And it seems like it's already beyond repair.
Image AMP? No URL for You! https://news.ycombinator.com/item?id=23322730
[0]: https://addons.mozilla.org/en-GB/firefox/addon/amp2html
[1]: https://addons.mozilla.org/en-GB/firefox/addon/privacy-redir...
Google can do this stuff if they like... on their own network, in their own ecosystem.
Insane that they got rich from hyperlinks and now want to fiddle with them so others can't.
Hearing people mention low quality search results was what kept me off, but I’ve actually only needed to do a google search about once a week, far less than I was expecting.
I know Google wants browsers to lie to the user about the website they're visiting. But the article screenshot is a case where that's not happening, it's displaying the real URL.
Well at least under Google AMP, the pages loaded on time.
Your comment is pure flamebait without any insight.
Oh, exactly that: Google/AMP with the web, systemd with Linux.
The point though isn't about the technology or the tactics. It's about the seemingly-benign apologia from third parties that bit-by-bit chips away at the objectors' arguments. It's part of how X wins their long game and takes control of Y.
I don't even think the original comment to which I replied does this, but it reminded me of a pattern. The ways in which AMP/web and systemd/Linux are playing out are similar enough to be worth thinking about.
(I almost certainly lack any kind of meaningful insight and – according to quite a few others here – an ability to write. It's disappointing to be accused of pure flamebait though.)
Yeah, because Google is cheating.
https://www.economist.com/science-and-technology/2018/11/03/...
I would classify that as cheating.
Once signed exchanges become a thing, that may change, but as seen in this thread, there's a lot of push back for that.
Well of course it's cheating if you compare load times between pre-loaded sites vs. not preloaded ones. And then argue that "AMP is faster", which is obviously wrong because the conditions are not the same.
This quip is generally used sarcastically or wryly. "Say what you like about [something that's seriously bad], at least [frivolous matter] has improved."
No it isn't.
Gorgoiler says "a common English aphorism about punctuality on 20th century Italian railways" and links to https://www.economist.com/science-and-technology/2018/11/03/... Both are clearly about Mussolini making trains run on time (or rather, not actually doing so.) Wikipedia describes the origins of the quip as such:
> Mussolini was keen to take the credit for major public works in Italy, particularly the railway system.[109] His reported overhauling of the railway network led to the popular saying, "Say what you like about Mussolini, he made the trains run on time."[109] Kenneth Roberts, journalist and novelist, wrote in 1924: "The difference between the Italian railway service in 1919, 1920 and 1921 and that which obtained during the first year of the Mussolini regime was almost beyond belief. The cars were clean, the employees were snappy and courteous, and trains arrived at and left the stations on time — not fifteen minutes late, and not five minutes late; but on the minute.[110]"
The dubious premise of Mussolini being responsible for reliable trains predates the Holocaust.
As for "it’s in bad taste and offensive", I agree that comparing fascism to systemd is in bad taste.
Mozilla (and Apple) are strictly against it and thank god for Mozilla. If Google had a bigger market share this would already be something we would have been living with. I'm sure there are better sources for this, but here is the first result:
https://9to5google.com/2019/04/18/apple-mozilla-google-amp-s...
Sure, you could move it somewhere else and have it show up in the address bar the same, but the actual URL has changed and you need to somehow get the new URL into people's hands. And ultimately you've centralized a lot of websites under a smaller number of service providers which, before, would have been on their own domains.
How so?
AMP is a scourge. It's a bad idea being pushed by bad actors.
Stated another way, with a typical CDN setup the user has to trust their browser, the CDN, and the source. With signed exchanges we're back to the minimal requirement of trusting the browser and the source; the distributor isn't able to make modifications.
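To make that trust argument concrete, here is a minimal Python sketch. It is not the real signed exchange format: an HMAC with a key held by the publisher stands in for the publisher's certificate-backed signature (a real exchange uses public-key signatures, so the verifier never holds a secret), and the names `publisher_sign`, `distributor_serve` and `browser_verify` are hypothetical. The only point it illustrates is that a distributor who alters the body cannot produce a matching signature.

```python
import hashlib
import hmac

# Assumption: a shared secret stands in for the publisher's private key.
# Real signed exchanges use public-key signatures tied to a certificate.
PUBLISHER_KEY = b"hypothetical-publisher-key"


def _sign(url: str, body: bytes) -> str:
    """Compute the stand-in signature over the URL and the body."""
    message = url.encode() + b"\n" + body
    return hmac.new(PUBLISHER_KEY, message, hashlib.sha256).hexdigest()


def publisher_sign(url: str, body: bytes) -> dict:
    """Publisher packages the content together with a signature."""
    return {"url": url, "body": body, "signature": _sign(url, body)}


def distributor_serve(exchange: dict, tamper: bool = False) -> dict:
    """Distributor (a cache) relays the package; it cannot re-sign it."""
    served = dict(exchange)
    if tamper:
        served["body"] = served["body"] + b" <injected>"
    return served


def browser_verify(exchange: dict) -> bool:
    """Browser recomputes the signature and rejects modified content."""
    expected = _sign(exchange["url"], exchange["body"])
    return hmac.compare_digest(expected, exchange["signature"])


original = publisher_sign("https://publisher.example/article", b"<html>story</html>")
untouched_ok = browser_verify(distributor_serve(original))
tampered_ok = browser_verify(distributor_serve(original, tamper=True))
```

In this toy model the untouched copy verifies and the tampered copy does not, which is the "distributor isn't able to make modifications" property described above.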
Google controls the AMP project and the AMP library. They can start rewriting all links in AMP containers to Google’s AMP cache and track you across the entire internet, even when you are 50 clicks away from google.com.
They cannot be allowed to become the gatekeeper for the web.
It seems to me, a lot of the security concerns come from the requirements to make pages served live and pages served from bundles indistinguishable to a user - a requirement that really only makes sense if you're Google and want to make people trust your AMP cache more.
I'd be excited about an alternative proposal for bundles that explicitly distinguishes bundle use in the URL (and also uses a unique origin for all files of the bundle).
https://m.youtube.com/watch?v=gqGEMQveoqg
(Google Tech Talk from Van Jacobson on CCN many years ago)
I don’t support ads but I also don’t support Google serving a version of the page that steals money from content creators. So, therein lies the problem: choice.
I can imagine a future where amp is ubiquitous and Google begins serving ads on amp content. Luckily, companies have to make money and amp is not in most people’s or company’s best interests.
If amp was opt-in only, this would be much more ethically sound.
Perhaps that's a great thing to do, but it's not something to do quietly.
https://www.techradar.com/uk/news/google-is-phasing-out-thir...
Cloudflare allows AMP to be served from the same domain. In this case, the content is served from Cloudflare's CDN.
As I didn't know about this addon, thanks for sharing it.
[1] https://addons.mozilla.org/en-US/android/addon/amp2html/
One should never forget that at a certain point, Google will likely invoke the loser's argument ("protect you from terrorists and pedophiles") to require proof of identity prior to granting access to any resource or service it controls.
Anything that helps them advance in that direction must be fought fiercely.
It's Google's way of combatting phone apps.
If all of the world's information — especially current news and similar information — moves from the open web into apps, then Google can no longer crawl, index, or scrape that information for its own use. The rise of the mobile phone app is a threat to Google on so many levels from ad revenue to data for training its AIs.
So Google comes up with Amp to convince publishers to keep their content on the open web, where it can be collated, indexed, and otherwise used by Google for Google's services like search and those search result cards that keep people from visiting the content creators.
Google's explicit carrot in all this is the user benefit of page loading speed. Google's implicit carrot in all of this is page rank. But Google's real motivation is to have all of that information available to itself.
Can you imagine what would happen if content from even one of the big providers was no longer visible to Google? New York Times, WaPo, or even Medium? It would create a huge hole in a number of Google products and services, make its search results look even weaker than they already are, and cause people to look for search alternatives.
That's my theory, anyway.
AMP project by itself is open-source and it explicitly states 'Other companies may build their own AMP cache as well'.[1] There are only 2 AMP Cache providers - Google, Bing. Further, 'As a publisher, you don't choose an AMP Cache, it's actually the platform that links to your content that chooses the AMP Cache (if any) to use.'[2]
Say, if Cloudflare provided an AMP Cache and the site publisher could choose their own cache provider, this could be resolved effectively, as AMP by design makes it easy for a layman to create high-performance websites; of course, there is no excuse for hiding URLs.
[1]https://amp.dev/support/faq/overview/
[2]https://amp.dev/documentation/guides-and-tutorials/learn/amp...
I agree. IMO, Google has been using 'open-source' for weaponized marketing, same way Apple has been using 'Privacy'. But either of them could be much worse without those.
It's a really bad look on Google's part to be pushing this.
> To be blunt, this is a really dangerous pattern: Google serves NYTimes’ controlled content on a Google domain.
No, "Google serves NYTimes' controlled content" is an oxymoron. Google controls the content that is served, and that's all your browser is verifying. Google could very well make the NYTimes content on there display something else and your browser wouldn't show a warning. NYTimes could do nothing about that.
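The gap between the origin the browser verifies and the publisher named in the path can be sketched in Python. This assumes the viewer URL shape `https://www.google.com/amp/s/<publisher-host>/<path>` (with `s/` marking an HTTPS source), which matches URLs commonly seen from search results but is not taken from the comment above; `split_amp_viewer_url` is a hypothetical helper and the example URL is made up.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def split_amp_viewer_url(url: str) -> Tuple[str, Optional[str]]:
    """Return (host the browser verifies, publisher URL embedded in the path)."""
    parts = urlsplit(url)
    serving_host = parts.hostname or ""
    publisher_url = None
    if parts.path.startswith("/amp/"):
        rest = parts.path[len("/amp/"):]
        # Assumption: a leading "s/" marks an HTTPS source, otherwise HTTP.
        if rest.startswith("s/"):
            scheme, rest = "https", rest[2:]
        else:
            scheme = "http"
        if rest:
            publisher_url = f"{scheme}://{rest}"
    return serving_host, publisher_url


serving_host, publisher_url = split_amp_viewer_url(
    "https://www.google.com/amp/s/www.nytimes.com/2020/example-story.amp.html"
)
```

Here `serving_host` is the only thing the TLS certificate vouches for; `publisher_url` is just text in the path that the serving host chose to put there.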
I disagree that this pattern is dangerous. While Google taking over serving the world's content is hardly a thing to celebrate, at least we're seeing that it's doing so here.
Trying to copy the domain of a url without the protocol just infuriates me.
Really, I couldn't care less about stuff getting pruned from the URL bar, as long as there's an easy and permanent way to show everything.
https://www.theverge.com/2020/5/28/21272543/google-search-re...
So much of the distrust here is that google wants to be everything: to host their content and publisher content and user content; to broker ads and recommend links; to run their software on your computer and phone, to store your data on their servers. They serve too many masters.
So it is kind of frustrating to see someone offering to fix a problem they helped create in the first place through neglect or carelessness.
I just made https://sites.google.com/view/whalefacts, took me literally ten seconds, confirmed it was accessible from multiple IPs and multiple browsers.
Google wants to be a content host and an ad broker and a search engine. Each of these is reasonable in isolation. Yet you can search on google, and Google will serve you an ad linking to a google.com site, and that site scams you out of money. This isn't theoretical, I know because my family was hit.
Screenshot if it gets taken down: https://i.imgur.com/T6hVHr5.png
If NYTimes and every other news organisation refused to participate then yes, Google would be in trouble. But they can rely on good old divide and conquer: these news organisations all compete with each other. All it would take is for one to start producing AMP content again and they’d vacuum up all the search traffic, and all the other sites would follow them immediately.
But right now we are not living in that ideal world and because all other publications are doing that they have to follow if they don't want to risk losing visibility against the competition.
So of course they don't "have to" but they also kinda do.
It's a tempting Ponzi scheme.
In addition to this, I previously stumbled upon a few situations where I visited an AMP site to read an article and I noted down the site name in my mind. A few days later I tried to visit that site and when I put the site name in the address bar in hopes of getting helped by autocomplete, guess what?! It was nowhere to be found.
How to fight back against Google AMP https://markosaric.com/google-amp/
And the original thread https://news.ycombinator.com/item?id=21712733
Even if it didn't have all of the problems associated with it, I just don't get the point. I don't need Google to repackage a website with less usability. It's frequently not even faster.
As a user, before I learned anything about computers, I was so thankful for and amazed by those AMP pages, because they are really fast! And I barely looked at the URL to care about security, which is a huge deal to those conspiracy queens, because as a non-tech user I didn't know a thing about URLs; all I cared about was how fast a page was presented to me.
So, no, the problem is only you. Yes, you can use a dramatic title just because you are so bored with your life that you need to cause a scene, but you are only embarrassing yourself and bringing more noise to this already chaotic world. Please, go find yourself something to do instead of trying so hard to be internet famous. Thank you.
That itself sounds awesome and something we should promote. The other part of AMP is of course that it's served through Google's servers. While their global edge caches probably bring the speed up I think that's less important.
In other words: AMP as a framework to force users to build light-weight pages without bloat is a good thing. Google's control is a bad thing.
I think many of the comments here make it a borderline topic where there's either all or nothing. I want to see a more nuanced discussion on what the possible alternatives and solutions are instead of just "Google bad, AMP bad".
One interpretation is that Google is changing the URL bar from "where" to "who", which may be the more relevant information for most users. Signed exchanges are an interesting way to achieve that.
I guess times have changed.
They could argue that Google is using The New York Times's branding and domain name to make it look like this content is controlled and provided by The New York Times, when in fact it isn't, and that an average person (“idiot in a hurry”) could be deceived.
If The New York Times willingly gives Google permission (or The New York Times willingly abets Google's monopoly position), then I guess Google can do whatever they like.
To be clear, I don't think Google or any other large company is evil. It's just the way things turn out, how the incentives are structured.
I feel like in the first few years it was not yet an advertising giant. Wikipedia says they had small text ads in 2000, but seems to imply that the advertising didn't get really huge for them until after IPO. Correct me if I am wrong, I was not following super closely in those years. But that would provide a few years of "not being evil".
Meh, they preserved 2/3rds of it.
1/ Apple and Facebook were hosting all the content.
2/ The content did not come with megabytes of JS and other unnecessary crap.
Amp is an attempt at saving the web, and Google is interested in that for the reason that you gave: they make their money from the web.
Yes; attempting to save the web in much the same way that the parasitic wasp is trying to oviposit in your thorax and take over your behavior, in order to save you from being eaten by the spider.
No thank you, sawfly.
TIL
Imagine if instead of having all news stories a quick search away you instead had to install apps from X different news sources (and inevitably grant them permission to access your location, contacts list, name of first born child etc.). It'd create lots of little silos of news with very little ability to go outside those silos.
Put another way, the web is a great platform for news. It does benefit Google, but it also benefits the billions of people who can freely access a huge range of sources.
Besides, you don't need the app on your mobile.
Also, for techie people, do you consider RSS as part of the "web"? To me, an RSS aggregator app is superior to browsing 20 different news websites, all with different formats.
"web is just way more practical" isn't obvious. It depends on what you put in the "web" bag, and the use cases. Most apps use "web" protocols, so they are technically part of the web.
This is the point.
People easily confuse "open source" with "free software" and "community driven".
A lot of corporate-driven open source greenwashed the dark patterns of closed source: centralized development, user lock-in, walled gardens, poor backward compatibility, forced software and hardware upgrades.
And it appears to be a problem.
Another problem is, there's effectively no distinction between regulator and regulatee.
Yes, AMP is an anti-competitive move by Google
At the same time AMP is "faster" because it gets rid of all the nagware and JS crap that the original page has.
So yeah, I don't like what Google is doing but I don't like what NYT is doing either
It’s been one of the primary things that’s driven me away from google and into DDG. I don’t really care about privacy enough to leave google, but I end up leaving more and more of their services because their competition is just less annoying.
Google gives preference to AMP content whether the source page is lightning fast or not. I get the frustration with crappy web pages, but a big part of the reason web pages are getting increasingly crappy is because Google and Facebook (and to a much lesser extent Amazon in a weird way) have a stranglehold on the web advertising market and publishers are getting smaller and smaller slices of advertising revenue. AMP increases Google's lock on the market. Since AMP pages can only really be monetized by the publisher, this puts even more power in Google's hands.
AMP is faster only for poorly-optimized JS-heavy pages but the design is fundamentally flawed to require all of its own large amount of JavaScript to run before anything displays, whereas most of the traditional bloat doesn’t block rendering. That means any optimized page - Washington Post, NYT, etc. – loads noticeably faster even before you factor in how often you need to wait for AMP to load, realize that some part of the content is missing, and then wait for the real page to load anyway.
That design forces it to be less reliable, too: before I stopped using Google on mobile to avoid AMP, I would see on a near-daily basis failed page loads due to the AMP JS failing in some way and when it wasn’t failing it was still notably slow (5+ seconds or worse on LTE). Since all of that JavaScript is forced into the critical path, anything less than unrealistically high cache rates means the experience is worse than a normal web page.
WPT examples:
https://www.webpagetest.org/result/200704_GR_62165b7f695e300...
https://www.webpagetest.org/result/200704_5F_f5c36a7c41cf4c2...
So you can see why there must be some kind of internal struggle at Google. They understand the value of a faster web but they also cannot go after the main cause of the slow web. And this is how technology such as AMP gets invented and makes things worse.
From the non-technical people I've talked to, the answer is no, they don't know what a URL is, and that was happening before AMP came around.
This change would restore the idea that the URL indicates the provenance of the content,
I don't like AMP nor much of how Google has behaved with it (http://exple.tive.org/blarg/2020/05/05/the-shape-of-the-mach... largely matches my thoughts), but let's stick to what's actually happening with SXG.
No, Signed HTTP exchanges are something that Google dreamed up so people don't have to see their hegemony over the modern web (or as the article you linked calls it, a shakedown). It's not a browser standard so far, because of Apple and Mozilla's resistance.
There are legitimate ways for NYTimes to allow Google to serve content on behalf of them, like so many other CDNs around the world (it usually involves the CDN generating the certificate for your site as well). Why should people create new standards for HTTPS and URLs simply for Google's benefit?
I don't deny that there's a way to make "nytimes.com" work where everything is served by "google.com". What I'm questioning is why we need a completely new web standard for doing so that affects the URL, something that has been standard for decades.
It's also much faster to render, which makes a huge difference on the crappy Android phones that are everywhere. Hell, I'm using a $200 Android phone right now because my iPhone broke and browsing the web is painful on it. And with the terrible Huawei $40 phones that have taken over Africa, most of the web is unusable.
I don't like Google's control of Amp, but it exists because of the original sin of html and js. Everything about html is terrible: bloated, pointlessly verbose, etc.
I have a dream that we all just start using Gopher and dump the www, but it's never going to happen. Maybe even browser vendors could get together and design a super lightweight markup based on S-exps or something, but that's probably not going to happen either. Amp is the best we got and it solves a real problem. And it solves the problem well.
I appreciate that not everyone has fast data, but not having data speed to read a basic web page is really becoming the exception, not the norm. Data transmission is getting cheaper and faster and available in more remote places every year.
I wouldn't have a huge problem with AMP if I could opt out. Unfortunately I can't. So despite my blazing fast unlimited plan on a flagship device, I'm getting served crippled pages with degraded performance. It's like I own a Ferrari kitted out with all the extras and Google is saying "here have you tried out this cool bicycle? It has special pedals so you can't go too fast and we reconfigured the handlebars so you don't accidentally do something like steering! It even has a bell. Ting-ting, ting-ting! How cool is that?"
In all seriousness, it is neat if it makes the web more useable for low-connectivity users, but maybe then limit AMP to those places (which are shrinking every year) and don't serve needlessly crippled pages when I'm standing in downtown Amsterdam or Hong Kong at the center of the internet, connected to blazing fast Wifi.
Huh? Yes. Hugely. I'm on my fast home internet using a new iPhone I bought two months ago, and loading a NYTimes article just took 8 seconds. God only knows if it's bounded by network or CPU or both, if the problem is frameworks or ads or what. And it isn't even "stuck" on anything -- I watch the blue loading bar in Safari move pretty smoothly across the top.
I did a search for a NYT article on Google, clicked it, and it appeared instantaneously.
That's an insane difference. I know everyone hates AMP here, but when I've got my user hat on rather than my developer hat... it's unbelievably more performant.
But even if I can load both pages in roughly the same time, the AMP experience is just so much better: AMP pages always load at least as fast as the original website, there's no weird scrolling implemented, there are no annoying popups, etc.
I always choose AMP pages when possible, compared to the "native" ones - because I know for a fact that I'll get fast loading, and other stuff mentioned above.
Google has never injected ads into any cache served AMP document (technically if the publisher uses AdSense, this is false, but that's not the point you are making).
It's difficult to follow what definition of theft is being suggested. The cache does not modify the document rendering, it's essentially a proxy. In a semantic sense, this is no different than your ISP delivering the page or your WiFi router.
Aren't signed exchanges basically CDNs without having to set up DNS? It's in theory no different than using Cloudflare to serve your content, except any CDN can just serve it without giving them access to your domain.
Creation: https://www.cloudflare.com/press-releases/2017/cloudflare-an...
Death: https://blog.cloudflare.com/announcing-amp-real-url/
I agree with you that users should be able to choose their Amp Cache.
(Albeit, that’s far more blockable)
I'm confused, you make it sound like a free CDN is somehow a bad thing. You do realize people actually pay money to have their content on a CDN. I don't think Bing makes money on their AMP cache, and doubt they would want or even allow Google to link to content on their AMP cache...
The point of AMP cache is for Google (and Bing) to waste money making content faster for their users, in the hope that the user will then spend more time on search so they see more ads. The cache itself has nothing to do with the monopoly, and the fact that Bing can use AMP at all (since it's open source) to get the same benefits actually shows the exact opposite.
That's nonsensical. That would reveal what the person searched for to a third party (Microsoft) even if they don't click on any results. The AMP Cache has to be controlled by the link aggregator in order to support safe prerendering, so Bing's AMP cache is used to prerender Bing results, and Google's AMP cache is used to prerender Google results. Compare to directly integrating with Google, in which case, Bing wouldn't get to take advantage of prerendering. The latter (the Apple News setup) is anti-competitive. AMP is not.
Had enough of HN ...
This place is bullsh!t.
Ban me.
Your web browser will show a scary warning and refuse to display the bundle if it's not correctly signed. Google is not going to fake signatures for other sites, as certificate mis-issuance would open up Google to legal consequences.
Because of the exact reasons that people are complaining about in this very thread: they want NYT to control the content and display the domain name appropriately, but they want to serve it from Google servers and allow for eager prefetching without leaking private details.
Today it would be easily possible if NYT just gave Google their private cert, but then Google would be able to serve any content they want as NYT. With the proposed solution they can display the content NYT wants without being able to serve arbitrary other content.
$ ping http://whatever.com [furious line editing ensues]
But thankfully it's fixed now, with the "Always show full URLs" option. More information is strictly superior.
Publisher hosted copies are in the pipeline, as I referenced in the parent comment. My choice of verbiage was a bit confusing it appears.
Yes, and? What’s your point? It’s actually a security weakness to include third party JS. The whole thing runs on trust.
"AMP HTML documents MUST..."
"The AMP runtime is loaded via the mandatory <script src="https://cdn.ampproject.org/v0.js"></script> tag in the AMP document <head>."
Do a whois on ampproject.org:
"Registrant Organization: Google LLC Registrant State/Province: CA Registrant Country: US Admin Organization: Google LLC"
Note that jQuery, as mentioned in some GP comment, has no such requirement. Google AMP is quite unique in this regard. This is NOT some general CDN type issue. Also... agreed, WTF is "open intent"?
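The requirement quoted above can be checked mechanically. The following is a rough Python sketch using only the standard library; the runtime URL is the one quoted from the AMP spec, while `RuntimeScriptFinder`, `loads_amp_runtime` and both sample pages (including the `cdn.example` script URL) are made up for illustration. It only looks for the script tag inside `<head>` and is in no way a full AMP validator.

```python
from html.parser import HTMLParser

# Runtime URL as quoted from the AMP spec in the comment above.
AMP_RUNTIME_SRC = "https://cdn.ampproject.org/v0.js"


class RuntimeScriptFinder(HTMLParser):
    """Records whether the AMP runtime <script> appears inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            if dict(attrs).get("src") == AMP_RUNTIME_SRC:
                self.found = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def loads_amp_runtime(html: str) -> bool:
    """Return True if the document head pulls in the AMP runtime."""
    finder = RuntimeScriptFinder()
    finder.feed(html)
    return finder.found


# Hypothetical sample documents.
amp_page = (
    '<html amp><head><script async src="https://cdn.ampproject.org/v0.js">'
    "</script></head><body>story</body></html>"
)
plain_page = (
    '<html><head><script src="https://cdn.example/jquery.js"></script>'
    "</head><body>story</body></html>"
)
```

A page that self-hosts or omits its scripts fails this check, which is the asymmetry with jQuery being pointed out: the dependency on a Google-registered host is part of the format itself.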
What matters is that the domain points to where the NYT considers is the correct source of their content.
Personally I don't think there's anything wrong with the fundamental concept of signed exchanges. The only problem is that it's just that: a signed exchange of content, which should have nothing to do with the domain name authority in the URL. By all means, display "Content from: a.com" in a box next to the URL, but don't change b.com to a.com in the URL as though it doesn't already have a well defined meaning.
The issue is that the technical meaning of the URL is very far from what most users think of.
Is the URL an address for NYT's server? Not really because you are actually hitting Fastly's server. So when NYT sets up a magical DNS config, it suddenly is fine, but using crypto to sign the package and serve it on a CDN that way, then it's suddenly "subverting the meaning of the URL"?
We can have a real discussion of what the meaning of a URL is, but I think your interpretation is unfair. I think it's entirely fair to argue that it makes sense for URLs to be an address to a specific content.
For your question about Fastly, I already answered that in the comment you replied to. The Fastly CDN requires that the DNS is configured to point at Fastly servers. Take a look at https://docs.fastly.com/en/guides/sign-up-and-create-your-fi... under "Start serving traffic through Fastly".
"Once you’re ready, all you need to do to complete your service
setup and start serving traffic through Fastly is set your domain's
CNAME DNS record to point to Fastly. For more information, see
the instructions in our Adding CNAME records guide."
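The CNAME setup in that quote can be illustrated with a toy resolver in Python. This is only a sketch of the aliasing idea, not how a real DNS resolver works: every name and address in `RECORDS` is made up, and `resolve` is a hypothetical helper.

```python
# Toy zone data; every name and address here is made up for illustration.
RECORDS = {
    "www.publisher.example": ("CNAME", "publisher.cdn-provider.example"),
    "publisher.cdn-provider.example": ("A", "192.0.2.10"),
}


def resolve(name: str, max_hops: int = 8) -> tuple:
    """Follow CNAME aliases until an address record is reached."""
    chain = [name]
    for _ in range(max_hops):
        record_type, value = RECORDS[chain[-1]]
        if record_type == "A":
            return chain, value
        chain.append(value)
    raise RuntimeError("CNAME chain too long")


chain, address = resolve("www.publisher.example")
```

The name the user typed stays `www.publisher.example` throughout; only the machine that answers for it changes, and that change is made by whoever controls the publisher's DNS.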
A CNAME record is a DNS mechanism that aliases an alternate domain to a canonical domain.
https://blog.chromium.org/2020/01/building-more-private-web-...
I'm going to copy paste my older comment on this:
I find their "removing 3rd party cookies will incentivise businesses to rely on fingerprinting" discourse dangerous.
It implies that other browser vendors (Mozilla, Safari/WebKit, new Edge) are in fact making the Web a more dangerous place.
I believe it's dangerous because it creates a harmful, unproductive PR narrative—people might just assume this is a true statement, without learning about both sides of the problem. I'm not trying to strip anyone of agency, I just don't think most of my friends would have time to research this topic and might decide to follow the main opinion instead.
The answer I'd like to hear: Yes, it does push some actors towards fingerprinting, but preventing fingerprinting should be dealt with regardless. Changes should happen both on legislative and browser-vendor level.
This concern has been raised time and again with every major Google open-source project, e.g. Android, Chromium, Golang etc., and those concerns have helped improve certain aspects of the projects.
But I wonder whether a huge corporation like Google can build such large-scale projects without such criticism. If the project is to be successful, they need to gain from it; after all, they are investing their employees and other resources in it. And them being invested in it, is a major reason for adoption by other parties and resulting in a successful open-source project.
Moreover, such large projects have helped overall SW ecosystem and even startups economically. I for one would say that without such large open-source projects I wouldn't even have been able to build products from a village in India and compete with products from the Valley.
All I'm saying is, them being open-source at least helps us raise concerns and make them take actions; being a complete walled garden and just asking to 'trust us' is much worse.
Yes: they could at least develop large projects in a foundation with many other companies
> And them being invested in it, is a major reason for adoption by other parties and resulting in a successful open-source project.
...and the main source of pain when the projects are "pivoted" or just dropped due to a single company business needs, as it happened many times.
> such large projects have helped overall SW ecosystem and even startups economically.
They hugely harmed competing projects and competing companies including Mozilla, many phone OSes, many grassroots programming languages.
It's well known that google developed various projects to kill competitors or buy startups cheaply and drop the project afterwards.
There isn't an infinite pool of open source developers - far from it!
Any large corporation that drains the pool to create a competitor to already existing FLOSS projects is actively harming the ecosystem.
> being a complete walled garden and just asking to 'trust us' is much worse.
Closed source can be less harmful than fake-open source. A lot of people actively avoid closed source and fall for the latter.
IMO, we're the reason it failed. We as consumers didn't buy FirefoxOS phones over Android or iOS. We haven't adopted the Firefox browser enough for it to have the major market share. The same argument can be levelled against any proprietary product vs. open-source product.
That proves my point: being a 'completely community driven' open-source project isn't the only criterion for the success of a project.
AMP’s design is very fragile: if you are using Google search results, they correctly guess what you’re going to tap on before you do and your browser fully preloads it, it _might_ be faster to run all of that JavaScript before anything is allowed to load and render. If any part of that chain fails, it will almost certainly be slower or, because it disables standard browser behavior, prevent you from seeing content at all.
It is. AMP results load instantly for me.
> you would find it educational to learn about the issues with detecting user intent, reliably prefetching dependencies, and the relatively small / frequently purged caches on mobile browsers.
And you might find it educational to learn why AMP doesn't rely on these things. There are no dependencies that need to be fetched for the initial render.
This idea isn't surprising. Multiple other systems use the same ideas, including Apple News, many RSS readers, and Facebook Instant Articles. AMP just does it in a way that isn't anti-competitive (like the former) and allows for multiple monetization schemes and rich formatting (unlike RSS).
> if you are using Google search results, they correctly guess what you’re going to tap on before you do and your browser fully preloads it, it _might_ be faster to run all of that JavaScript before anything is allowed to load and render
AMP doesn't rely on fully prerendering the page, only the portion above the fold, which it can calculate because the link aggregator page knows the display size, and the elements allowed in AMP are required to report their dimensions. This allows multiple pages to be prerendered.
> because it disables standard browser behavior,
What standard browser behavior does it disable?
AMP's documentation seems to indicate that the LTS is stable only for one month (new features released via the same URL each month), and so is not compatible with SRI (see https://github.com/ampproject/amphtml/blob/master/contributi...)
You can specify a version (i.e., https://cdn.ampproject.org/rtv/somenum/v0.js), but the AMP validator complains about that.
If they do that, it's not really visible. I don't see any regulation of how Google is behaving regarding search and the web; if anything, it looks like anti-competitive monopoly behaviour.
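To make the SRI conflict concrete, here is a minimal Python sketch of how an integrity value is derived: a base64-encoded SHA-384 digest of the exact script bytes. The script contents below are made-up placeholders, not the real v0.js; the point is only that any change to the bytes served at a fixed URL produces a different value, so a pinned hash stops matching.

```python
import base64
import hashlib


def sri_hash(script_bytes: bytes) -> str:
    """Return an SRI-style integrity value: 'sha384-' + base64(SHA-384 digest)."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


# Placeholder contents standing in for two monthly releases served at one URL.
release_may = b"console.log('runtime, May build');"
release_june = b"console.log('runtime, June build');"

# The page author pins the hash of the build that was live at the time.
pinned = sri_hash(release_may)

print(pinned == sri_hash(release_may))   # same bytes, hash still matches
print(pinned == sri_hash(release_june))  # changed bytes, hash no longer matches
```

A browser enforcing the pinned value would refuse to run the second build, which is why a URL whose contents change every month cannot be combined with SRI.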
Self-regulating in the same way an alcoholic meth addict self-regulates.
This expiration can also never be set more than 7 days in the future.
"Signing a bad response can affect more users than simply serving a bad response, since a served response will only affect users who make a request while the bad version is live, while an attacker can forward a signed response until its signature expires. Publishers should consider shorter signature expiration times than they use for cache expiration times."
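The 7-day cap and the advice in the quote can be sketched as a validity check. This is a simplified illustration, not the actual browser logic: the function name and parameters are hypothetical, and only the 7-day maximum comes from the text above.

```python
from datetime import datetime, timedelta, timezone

MAX_SIGNATURE_LIFETIME = timedelta(days=7)  # the cap mentioned above


def signature_is_acceptable(signed_at: datetime, expires_at: datetime, now: datetime) -> bool:
    """Hypothetical check: reject lifetimes over 7 days and anything outside the window."""
    if expires_at - signed_at > MAX_SIGNATURE_LIFETIME:
        return False  # lifetime exceeds the cap
    return signed_at <= now < expires_at


signed = datetime(2020, 6, 1, tzinfo=timezone.utc)
three_days_later = signed + timedelta(days=3)

# A 1-day signature stops being forwardable much sooner than a 7-day one.
print(signature_is_acceptable(signed, signed + timedelta(days=1), three_days_later))
print(signature_is_acceptable(signed, signed + timedelta(days=7), three_days_later))
# An 8-day lifetime is over the cap and is rejected outright.
print(signature_is_acceptable(signed, signed + timedelta(days=8), three_days_later))
```

Under these assumptions, a bad response signed with a 7-day expiry can still be forwarded on day 3, while one signed with a 1-day expiry cannot, which is the reason the quote recommends short signature lifetimes.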
Really? Could you publish how you are inspecting an unknown program to determine if it exhibits a specific behavior? There are a lot of computer scientists interested in your solution to the halting problem.
Joking aside, we already know from the halting problem[1] that you cannot determine if a program will execute the simplest behavior: halting. Inspecting a program for more complex behaviors is almost always undecidable[2].
In this particular situation where Google is serving an unknown Javascript program, a look at the company's history and business model suggests that the probability they are using that Javascript to track user behavior is very high.
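def divisors(n):
    # Proper divisors of n (every divisor smaller than n itself).
    for d in range(1, n):
        if n % d == 0:
            yield d

n = 1
while True:
    # Stop at the first odd number equal to the sum of its proper divisors.
    if n == sum(divisors(n)):
        break
    n += 2
print(n)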
I don’t know if this program halts. But I’m pretty sure it won’t steal my data and send it to third parties. Why? Because at no point does it read my data or communicate with third parties in any way: it would have to have those things programmed into it for that to be a possibility. At no point did I have to solve the halting problem to know this.
Also, if I execute a program and it does exhibit that behaviour, that’s a proof right there.
The same kind of analysis can be applied to Google’s scripts: look at what data it collects and where it pushes data to the outside world. If there are any undecidable problems along the way, then Google has no plausible deniability that some nefarious behaviour is possible. Now, whether that is a practical thing to do is another matter; but the halting problem is just a distraction.
This has nothing to do with the halting problem, because the halting problem is concerned with all possible programs, not some particular programs.
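As a rough illustration of that kind of analysis, the sketch below uses Python's `ast` module to list what a program imports and flag anything that could talk to the network. The set of "network-capable" module names is an assumption chosen for the example and is far from complete; this is a conservative syntactic check, not a proof, and it would not see through `eval` or dynamic imports.

```python
import ast

# Assumed, incomplete list of modules that can open network connections.
NETWORK_MODULES = {"socket", "urllib", "http", "requests", "ftplib", "smtplib"}


def imported_modules(source: str) -> set:
    """Collect top-level module names from import statements in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found


def may_use_network(source: str) -> bool:
    """True if the source imports any module from the assumed network list."""
    return bool(imported_modules(source) & NETWORK_MODULES)


# The source is only parsed, never executed, so the infinite loop is harmless here.
perfect_number_search = """
n = 1
while True:
    if n == sum(d for d in range(1, n) if n % d == 0):
        break
    n += 2
"""

phone_home = "import urllib.request\nurllib.request.urlopen('https://example.com/ping')\n"

print(may_use_network(perfect_number_search))  # no imports at all
print(may_use_network(phone_home))             # imports urllib
```

The check answers "can this program reach the network at all?" without ever deciding whether the program halts, which is the distinction being argued here.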
We obviously know if some programs halt.
    while true: nop
Is an infinite loop.
    X = 1
    Y = X + 2
Halts.
More complex behaviours can be easier. Neither of my programs there makes network calls.
Likewise, AMP pages are mostly accessed from Google search that's already tracked.
AMP requires that you consume other Google products, which requires that additional JS is loaded. When your mobile site doesn't use AMP, Google limits the SEO rankings your mobile site can have. Google AMP requires that your pages meet Google's Content Policies or they won't host them.
AMP and CDN delivered pages are architected differently and Google imposes restrictions and requirements that don't exist in a CDN.
I'm still opposed to the change, I see this centralization of the web through CDNs as a bad thing, I don't want to make it easier.
My argument is not really concerned with what most users think of, but humor me, what do they think of?
> So when NYT sets up a magical DNS config, it suddenly is fine, but using crypto to sign the package and serve it on a CDN that way, then it's suddenly "subverting the meaning of the URL"?
Yes, because HTTP/S scheme URLs have a definition that implies a meaning, which is subverted when you create exceptions to that meaning. NYT setting up a "magical" DNS config that resolves to some third party server is perfectly fine by that definition, and resolving one FQDN while displaying another is not. It's not sudden, this standard has existed in one form or another since 1994.
> We can have a real discussion of what the meaning of a URL is
Yeah, let's do that instead of harping on about what's fair and unfair. It's not a matter of fairness, it's a matter of standardized definitions. By all means, create a new "amp:" URI scheme where the naming authority refers to whoever signed the data and resolves to your favorite AMP cache, but don't call it http or https.
An example of where this occurs today is caching. You could be hitting a cache anywhere along the way. Hell you could be seeing an "offline" version, but the website would still show you the "address" of the content.
This is no different, you're hitting a different cache, but the "URL" you see is the canonical address of the content you are looking at, not where it was actually fetched from.
The only sense in which content is located anywhere is as data on a memory device somewhere. With the traditional URI in which the host part of the authority is an address of or a domain name pointing towards an actual host, you have a better indication of where the content is located than you do if this is misrepresented as being some other domain name which in fact does not at all refer to the location of the content.
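For reference, the "host part of the authority" is what a standard URL parser hands back. A small sketch with Python's `urllib.parse`; both URLs are made-up examples (the cache hostname is hypothetical), used only to show that the two ways of addressing the same article name different hosts.

```python
from urllib.parse import urlsplit

# Hypothetical URLs: the publisher's own address and a third-party cache copy.
publisher_url = "https://news.example.com/2020/05/story.html"
cache_url = "https://cache.example.net/c/s/news.example.com/2020/05/story.html"

for url in (publisher_url, cache_url):
    parts = urlsplit(url)
    # The scheme and hostname say how, and from which host, the content is fetched.
    print(parts.scheme, parts.hostname, parts.path)

# The publisher's name appears in the cache URL only as part of the path,
# not as the host that is actually contacted.
print(urlsplit(cache_url).hostname == urlsplit(publisher_url).hostname)
```

Displaying the first URL while fetching from the second is the mismatch under discussion: the parser's notion of "host" and the displayed name stop agreeing.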
The shift, if any, is that people may be less interested in where the content is located and more interested in its publishing origin.
> An example of where this occurs today is caching. You could be hitting a cache anywhere along the way. Hell you could be seeing an "offline" version, but the website would still show you the "address" of the content.
Yes, because that's how domain names work.
> This is no different, you're hitting a different cache, but the "URL" you see is the canonical address of the content you are looking at, not where it was actually fetched from.
It's different in the sense that a host name as displayed by the browser then has multiple, conflicting meanings that have no standardized precedent.
I think what Spivak is saying though is right. If we could move from location addressing (DNS + IP) to content addressing, but not via the AMP cache servers, then in general anyone could serve any content on the web. Add signing on top of the content addressing, and you can also verify that content is coming from NYTimes, for example.
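A minimal sketch of content addressing, assuming SHA-256 as the hash and a plain dictionary standing in for any untrusted server: the address is derived from the bytes, so a client can check what it received no matter who served it. The class and function names are hypothetical.

```python
import hashlib


def content_address(data: bytes) -> str:
    """The address is just the SHA-256 hex digest of the content."""
    return hashlib.sha256(data).hexdigest()


class UntrustedMirror:
    """Stand-in for any server willing to host blobs; it may lie."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self.blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self.blobs[address]


def fetch_verified(mirror: UntrustedMirror, address: str) -> bytes:
    """Fetch from any mirror and reject content that does not hash to the address."""
    data = mirror.get(address)
    if content_address(data) != address:
        raise ValueError("content does not match its address")
    return data


mirror = UntrustedMirror()
addr = mirror.put(b"article body")
print(fetch_verified(mirror, addr) == b"article body")

# A tampering mirror is detected because the hash no longer matches.
mirror.blobs[addr] = b"tampered body"
try:
    fetch_verified(mirror, addr)
except ValueError as err:
    print("rejected:", err)
```

Hashing alone shows the bytes are intact, not who published them; that is the part signing would add.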
Also, I'd say that the internet (transports, piping, glue) is decentralized. The web is not. Nothing seems to work with each other and most web properties are fighting against each other, not together. Not at all like the internet is built. The web is basically ~10 big silos right now, that would probably kill their API endpoints if they could.
I don't think this should be shoehorned into the URL bar or into some meta info that no one ever reads hidden behind some obscure icon.
It actually makes perfect sense in Doublespeak. /s
But AMP is a much narrower technology, I’d imagine only Google would be able to impersonate other websites, essentially centralised as you say. The generic idea would just be a distraction to push AMP.
Everything would be so much better if the original websites were not so overloaded with trackers, ads and banners, then there would be no need for these “accelerated” versions.
Could there be net-neutrality-like questions in all this as well?
Create a new “original URL” field or something.
What you're saying would be described as distributed... Not decentralized.
The serving nodes are not necessarily under control of a well intended party that complies with upgrade requests.
What a bummer.
(Disclosure: I work for Google, speaking only for myself)
https://blog.cloudflare.com/keyless-ssl-the-nitty-gritty-tec... is a thing now.
To put it simply, Cloudflare still controls the content. The proposal here would avoid that, by allowing Cloudflare to transmit only pre-signed content.
And by hovering, or one-clicking, a popup could show both the distributor's address (say, Cloudflare), and the content's/publisher's address (say, NYT)?
Google AMP doesn't show Google on the page. Google is pushing for the URL to show the origin site's URL instead of Google[2].
If an attacker poisons a nytimes.com article served by Google AMP, how does a browser's domain blacklisting help? Block google? Block nytimes.com? Neither makes sense.
1. https://web.archive.org/web/20050401090916/http://www.google...
2. https://9to5google.com/2019/04/18/apple-mozilla-google-amp-s...
example.com generates a content bundle and signs it. Google.com downloads the bundle and decides to mirror it from their domain. Your browser downloads the bundle from google.com, and verifies that the signature comes from example.com. Your browser is now confident that the content did originate from example.com, and so can freely say that the "canonical URL" for the content is example.com.
Malicious.org does the same thing, and the browser spots that malicious.org is blocked. At this point it doesn't matter if the content came from google.com, because the browser knows that the content is signed by malicious.org and so it originated from there.
Hope this helps clarify. Obviously blacklisting isn't a great security mechanism; my point is just that signed exchanges don't really open any NEW vectors for attack.
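The flow described above (publisher signs, a mirror serves, the browser verifies against the publisher's key) can be sketched with a toy signature scheme. This uses a Lamport one-time signature only because it fits in the Python standard library; it is a stand-in for illustration and not what signed exchanges actually use. All names here are hypothetical.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes):
    """Yield the 256 bits of SHA-256(message), most significant bit first."""
    for byte in _h(message):
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes):
    """Reveal one secret per bit of the message digest (a key must sign only once)."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    return len(signature) == 256 and all(
        _h(signature[i]) == public[i][bit] for i, bit in enumerate(_bits(message))
    )


# example.com (the publisher) signs a bundle; the mirror only copies bytes around.
publisher_private, publisher_public = generate_keypair()
bundle = b"<html>article from example.com</html>"
signature = sign(publisher_private, bundle)

mirrored_bundle, mirrored_signature = bundle, signature  # served by a third party

print(verify(publisher_public, mirrored_bundle, mirrored_signature))          # untouched copy
print(verify(publisher_public, b"<html>altered</html>", mirrored_signature))  # altered copy
```

The mirror never holds the private key, so it can relay the bundle but cannot alter it without the check failing. A blocklist decision can therefore be keyed on whose public key verified the content rather than on which host delivered it.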
Imagine that example.com builds the bundle by pulling data from a database. If an attacker can find a way to store malicious content in that database (stored XSS) and that content ends up in a signed bundle that Google AMP serves (similar to cache poisoning), then users will see malicious content. When the stored XSS is removed from the database, Google AMP may continue to serve the malicious signed bundle. So an extra step may be needed to clear the malicious content from Google AMP.
How exactly the attacker influences the bundle is going to be implementation dependent, so some sites may be safe while others are exploitable.
If, purely as a hypothetical, Russian operatives got a credible propaganda story posted on the NYT website 24 hours before the November elections, and an AMP-hosted version of it stayed live long after the actual post got removed from nyt.com, I'd certainly call that "malicious". Of course, just like archive.org, I suspect that in a case as high-profile as that, you'd see a human from the NYT on the phone with a human at Google to get the cached copy yanked ASAP, but maybe on a slightly smaller scale the delay could be hours-to-days, which is bad enough.
Tracking doesn't require reading any of your data. All that is necessary is to trigger some kind of signal back to Google's servers on whatever user behavior they are interested in tracking.
> or communicate with third parties
Third parties like Google? Which is kind of the point?
> [example source code]
Of course you can generate examples that are trivial to inspect. Real world problems are far harder to understand. Source is minified/uglified/obfuscated, and "bad" behaviors might intermingle with legitimate actions.
Instead of speculating, here is Google's JS for AMP pages:
https://cdn.ampproject.org/v0.js
How much tracking does that library implement? What data does it exfiltrate from the user's browser back to Google? It obviously communicates with Google's servers; can you characterize if these communications are "good" or "bad"?
Even if you spent the time and effort to manually answer these questions, the javascript might change at any time. Unless you're willing to stop using all AMP pages every time Google changes their JS and you perform another manual inspection, you are going to need some sort of automated process that can inspect and characterize unknown programs. Which is where you will run into the halting problem.
Be cool if you did ;)