Welcome to Internal Tech Emails: internal tech industry emails that surface in public records. 🔍 If you haven’t signed up, join 37,000+ others and get the newsletter:
Internal Tech Emails is brought to you by… TidyCal
The best tech founders, consultants, and executives use TidyCal to help manage their schedule.
Book meetings and manage your calendar from a beautiful, simple interface — with tons of advanced features. Organize paid coaching calls, free intro chats, group coaching meetings, date polls, and so much more.
And with key integrations — like Google, Zoom, and iCal — your availability is updated in real-time so you're never double-booked.
Make booking your next meeting and managing your calendar simpler than ever. Join 100,000+ people using TidyCal.
Google engineer: AI is a serious risk to our business
On Wed, Dec 26, 2018 at 4:48 PM Eric Lehman wrote:
I’d like to offer a thought for contemplation over the break:
Within the near future, a deep ML system will clearly outperform Google’s 20-year accumulation of relevance algorithms for web search.
Here, I’m just talking about relevance; that is, determining whether a document and query are talking about the same thing. There is a lot more to web ranking for which ML seems much less appropriate. But I think basic relevance is the major task in web ranking and probably “objective” enough to go after pretty effectively with ML.
None of us can see the future, but my bet is that this is nearly certain to be true within 5 years and could be true even within 6 months. One problem after another that is similar in flavor to web ranking has fallen, and there is little reason to think that web ranking is somehow exceptional. Indeed, this holiday thought stems from recent advances in web answers, where deep ML (in the form of BERT) abruptly subsumed essentially all preceding work.
For the web answers team, the tidal wave of deep ML that arrived in the last few weeks was a complete shock. With this warning, we should not allow ourselves to be caught off-guard again; rather, we should start thinking through the implications now. And *now* is really the time, because in the new year I expect a lot of web ranking engineers to reflect on BERT and start thinking along these same lines.
One consideration is that such a deep ML system could well be developed outside of Google-- at Microsoft, Baidu, Yandex, Amazon, Apple, or even a startup. My impression is that the Translate team experienced this. Deep ML reset the translation game; past advantages were sort of wiped out. Fortunately, Google's huge investment in deep ML largely paid off, and we excelled in this new game. Nevertheless, our new ML-based translator was still beaten on benchmarks by a small startup. The risk that Google could similarly be beaten in relevance by another company is highlighted by a startling conclusion from BERT: huge amounts of user feedback can be largely replaced by unsupervised learning from raw text. That could have heavy implications for Google.
Relevance in web search may not fall quickly to deep ML, because we rely on memorization systems that are much larger than any current ML model and capture a ton of seemingly-crucial knowledge about language and the world. And there are lots of performance challenges, specialized considerations, etc. Still, my guess is that the advantages of our current approach will eventually crumble; ML is advancing very fast, and traditional techniques are not.
I don’t know how others think about this. Maybe this prospect was already obvious to you. Or you might think this view of the future is just wrong. Personally, I’m inclined to think that this future is near-inevitable, but-- despite that-- I hadn’t taken the next step of thinking through implications. Some questions to ponder might include:
• Can we take actions now so that this transition is something that we drive rather than something to which we fall victim? Personally, I don’t want the perception in a few years to be, “Those old school web ranking types just got steamrolled and somehow never saw it comin’...” Could we instead, say, formulate some 2019 collaboration goal with research to beat our best existing [REDACTED] prediction with a deep model?
• How might we discuss this possible future with people working on web ranking without crushing morale? [REDACTED]
I think I heard that the Translate team decided to go “all in” on large-scale ML some years ago, which seems wise in retrospect. I’m skeptical of such an extreme step around relevance today, because we might sacrifice significant gains by traditional means between now and the time a deep-ML approach really takes over-- which I think is probably at least a couple years out. Yet hearing the BERT wake-up call and not adjusting our plans seems unwise as well.
Anyway, this is on my mind during this quieter time, and I thought I’d share.
/Eric
[This document is from U.S. v. Google (2024).]
Behind the scenes
Google ranks its search results with the help of user interaction data. By observing how users interact with a search results page — clicking on a result, backing up, clicking on something else — Google gains valuable information about which pages are most relevant to a given query. And for years, this has helped Google maintain the upper hand in search relevance, since it has more user interaction data than any other search engine.
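The click-feedback loop described above can be caricatured in a few lines. This is purely an illustrative toy, not Google's actual system — the event fields, the 30-second dwell threshold, and the bounce penalty are all invented for the sketch — but it shows the basic idea: aggregated user interactions become a per-(query, document) relevance signal, which is exactly the kind of data a rival without Google's traffic could not collect.

```python
# Toy sketch of click feedback as a relevance proxy (NOT Google's method).
# Assumptions: a log of (query, doc, clicked, dwell_seconds) events; a long
# dwell counts as a satisfied visit, a quick bounce counts against the page.
from collections import defaultdict

def relevance_from_clicks(log):
    """Return {(query, doc): score} averaged over impressions."""
    clicks = defaultdict(float)
    shows = defaultdict(int)
    for query, doc, clicked, dwell in log:
        shows[(query, doc)] += 1
        if clicked:
            # A quick bounce suggests the page wasn't actually relevant.
            clicks[(query, doc)] += 1.0 if dwell >= 30 else -0.5
    return {key: clicks[key] / shows[key] for key in shows}

events = [
    ("best ide", "a.com", True, 120),   # satisfied click
    ("best ide", "b.com", True, 3),     # quick bounce back to results
    ("best ide", "a.com", False, 0),    # shown but not clicked
]
scores = relevance_from_clicks(events)
```

The startling point of the BERT result, in these terms, is that a model trained only on raw text could approximate `scores` without ever seeing `events`.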
But in late 2018, Google’s engineers came to a startling realization: sophisticated language models would eventually be able to understand a web page from its text alone, without the benefit of any user feedback. And that could put Google’s 20-year advantage in search at risk — potentially even from a small startup.
Eric Lehman, a veteran software engineer at Google, wrote the above email after seeing the early results of Google’s BERT language model on the featured snippets (a.k.a. “web answers”) on the search results page. At the U.S. v. Google trial, he recalled:
[BERT] had outperformed everything that we'd done — you know, dozens of engineers over roughly a decade. And so I sent this email to reflect on what the implications of that were for the future.
It was quite a shock. I always felt that Google's — you know, its really big advantage is that by focusing on search for essentially so many years, we developed this deep body of theory and practices and we made a lot of mistakes but we learned over time. And I thought that was kind of our treasure chest.
And then along comes this system and just like, no, never mind, beats everything. And at this point we had just seen that this kind of, sort of junior problem, web answers, had experienced this just dramatic disruption, where machine learning made everything previous kind of irrelevant.
And the thought I'm considering here is that that's going to happen to web search, too. I was guessing that these kinds of advances would — yeah, just kind of clear the table of all past work.
Lehman worked at Google for more than 17 years. By the time he left the company in November 2022, the importance of LLMs was becoming clear, both inside and outside Google: “Things had begun to accelerate so fast, and it looked like the implications were just going to be staggering, and not just for search but for the world at large.”
Aravind Srinivas, cofounder and CEO of Perplexity: “In the first few weeks of launching Perplexity, an early OG Google employee told me the same thing as this: ‘don't worry too much about getting a lot of users to compete with Google. You're living in an age where unsupervised learning from raw internet text works. you do not need as much click stream data to build good index and ranking, and there lies your opportunity’”
Elon Musk on Tesla compensation
From: Elon Musk
Sent: 7/30/2017 12:20 PM
To: Todd Maron
Subject: Re: My comp stuff
The added comp is just so that I can put as much as possible towards minimizing existential risk by putting the money towards Mars if I am successful in leading Tesla to be one of the world's most valuable companies. This is kinda crazy, but it is true. Hope it isn't misinterpreted.
[This document is from Richard J. Tornetta v. Elon Musk (2024).]
Further context from Jef Feeley for Bloomberg: “Elon Musk’s $55 billion pay package at Tesla Inc. was struck down by a Delaware judge after a shareholder challenged it as excessive ... That is if the ruling survives a likely appeal.” (January 30, 2024)
Sponsor the Internal Tech Emails newsletter, and reach a curious and influential audience of 37,000+ tech founders, investors, operators, and enthusiasts. We’re booked for the first half of 2024, but sign up here to learn about future openings. 🔍
Thanks for reading!
-Internal Tech Emails
Great post; I liked the “behind the scenes” bit too. Thanks for writing it!
On Google: it’s fascinating to look at this through the lens of a well-run company worried about disruption. OTOH it’s also a reminder that customers will always demand and value a better experience — which is the driving force behind why disruption is a necessary force.