Responsible optimism for ML, which is still Programming's Asbestos
Wednesday, July 19, 2023 :: Tagged under: engineering essay. ⏰ 11 minutes.
🎵 The song for this post is Hardware Store, by Weird Al. Truly one of his best. 🎵
Four years ago, I wrote that ML/AI is programming's asbestos: amazing for the applications we're finding, but the long-term effects aren't known and I suspect going to bring a lot more complications than peddlers are willing to admit (this was hot on the heels of reading a horribly credulous and inaccurate book on it). In the last year, a LOT happened: we had Midjourney and DALL-E and generative images which, freaky hands aside, produced stunning work (and they've largely fixed the hands). Now we've got ChatGPT and LLMs.
I think the original article holds up, but I firmly believe these were game changers, and as a technologist I'm pretty blown away. I didn't think I'd see results like this until way later in my lifetime. I feel pretty confident we'll find at least a few great uses for this.
So I'm optimistic, but I still get treated like a hater for bringing up what I think are very reasonable observations about the limits. Here are some notes on shortcomings: I'd consider basic respect for these a prerequisite for "responsible optimism" for the tech. We've got a lot of companies AI-washing their product offerings (a bit like when a Long Island Ice Tea company added "Blockchain" to their name, and their stock soared). As someone who actually, you know, builds things and not just talks about them? Here are some weaknesses you've got to keep in mind before you start running your mouth on how this is going to Change Everything Forever And Ever and if you don't, you're not gonna make it.
Note: I gathered most of these links and notes ~6 months ago, and I really wish I'd published this then, when people were REALLY losing their minds, because I would have seemed so much smarter 😛. A lot of the wilder hype seems to have died down. Also, a lot of my outbound links point to Twitter threads, and Elon made the change that you have to be logged-in to see tweets, which absolutely sucks, so, sorry about that, hope you have an account.

On Grander visions: LLM progress does not carry over to other AI domains
"AI" means "Artificial Intelligence," but the most recent things to get all the attention (LLMs, or "Large Language Models") do not carry over into other domains. So just because ChatGPT is really cool, this doesn't mean anything new for, say, robotics, or playing chess.
LLMs are just a button you press to say "give me the next word." It happens to be really, really good at it. But other fields demand other skills. Robotics is hideously hard because of challenges with interactions in the real world and sensors: it's very hard for a robot to know, for example, how to hold an orange without squeezing it too hard. No amount of "what's the next word in the sequence" is going to help the robot with that task.
Another great example is self-driving cars: they've been "around the corner" since 2012. But they're still not widely deployed because that last 10% is really, really hard to do safely, and LLMs or Stable Diffusion aren't going to help them with that.
This is Stockfish (a chess program) playing as White, and ChatGPT playing as Black. If you don't understand chess, you should understand that at a certain point, ChatGPT is making up the rules.
So celebrate the wins! But if someone is talking about The Future and how AI will solve so much, bear in mind that each of these cases is quite limited, and don't cross-pollinate easily.
It still fails. A lot. And that's fundamental to what they are
AI is often still dumb as rocks. Here's a military simulation that was defeated with cardboard boxes. The strategy game Go was considered "solved by AI" because it was beating grandmasters, but it turns out you can do the equivalent of "cardboard boxes" and win in a way no human would ever allow. Translation was considered "solved" but if you type random Latin-sounding words in Google Translate it will make up new meanings for your gibberish. The first paragraph of this article has a lot of examples of how all this auto-generated crap is making all the most popular products way worse.
AI is still pretty powered by humans. Not in the loop (e.g. the old "AI" companies like x.ai said they were AI-powered but run by a person); but these models are trained on $2/hour international labor (stories here and here). People think AIs will free us up to live fuller lives, but it turns out the future we built was one where humans do all the drudgery and the computers get to make all the art. Technology making humans do more drudgery and spend less time doing life-fulfilling things is the historical precedent, by the way.
Everyone knows this already and if you don't care already, you're never gonna care, but: the results are as biased as the data, and locks us into those biases harder. If you ask it to "professionalize" something, it makes it Whiter and Male-er.
The models, which were already incredibly opaque, are becoming moreso, especially after OpenAI removed any semblance of "Open" from their mission with the release of GPT4. This sucks, in part, because now you can't audit their work: there's evidence of contamination in the training and we just now have to take on faith they built these things correctly and ethically (it's not like someone would ship software with bugs, right?). We're increasingly relying on proprietary, opaque models that lock in biases as Microsoft fires their entire AI Ethics team and Google had a scandal with firing their researchers the year prior.
The same models, when running at different times, perform with wildly different accuracies on the same task. This is a new finding! But it turns out if you asked GPT4 in March 2023 to identify prime numbers, it was correct ~98% of the time. If you asked it in June with the same questions it was right 2.4% of the time. This is likely a regression with updates (again, we ship bugs all the time), but again: how good do you feel betting anything significant on an opaque product behind a company wall that can regress that quickly? How safe do you feel betting your company on it?
Complications when incorporating into a product
Okay, functionally, all those previous things, but: if you're a company and you want to put AI in your product, what risks should you be aware of?
First, there seems to be no way to secure these things particularly well from injection attacks. (here, here). If you're giving user inputs to an LLM (which is what most people are doing, because it's cool and the part that definitely works and feels like magic), you have to be very, very, very careful.
COGS. This article speculates it cost $100m to train GPT4. Even if they're exaggerating (IMO press about OpenAI isn't skeptical enough, other figures put it at $40m), it's still extremely expensive to do in-house and many people don't have the core competencies yet. Also, good luck getting your hands on those GPUs.
Okay, so you use a third-party model. Congrats, a core feature of your product is now handled by someone else. It's okay to outsource parts of your business for sure, but if you want to use AI for an actual product category, it's always dangerous to outsource a core competency. Additionally, you're locked into their pricing.
Which brings us back to COGS. If you have a button that costs entire dollars to click (vs., say, an ElasticSearch query that you could hit millions of times before it costs you pennies) you've suddenly made your software much more expensive to run, especially if it's users who press that button. And how well are VC-backed companies doing right now on cost controls? I'll die on this hill but I think software organizations should care about what things cost.
Lastly, data privacy. If you're using a third-party, you're now submitting your data and maybe your customer's data to someone you don't know, who's probably going to train on it, may log sensitive data by accident, could get hacked (this is distinct from the data privacy of from injection attacks, mentioned above).
"Oh, but their contract says they won't retain or train on it!" lol, okay, but listen to boosters when an artist or a writer wants to opt-out of their work being used to train these models. The peddlers become entitled and selfish, accusing the artists of being afraid of or holding back progress. The contempt for the desires of the artists are plain and naked. The companies are also desperate and afraid of falling behind (because they have no moat) to someone less ethical. If you trust these companies with data privacy or data security after the last two decades of tech in addition to the ways they talk about the data they've already stolen, you're a chump.
Long term outlook (Ouroborous effect)
This one is harder to explain succinctly, but: the current AI results were from training on a human-made corpus, the greatest one to ever exist. We're now putting AI output into that corpus, which does 2 things:
-
Subsequent AI models will be trained on the output of previous models, so it'll be less "human" in all the ways human output is different (creativity, cadence, tone…). Additionally, it'll get harder to make improvements because we've poisoned the corpus with substandard training material. Quality will go down.
-
We'll slow the advancement of culture. If a large portion of 2023's Internet is built using models trained on 2021's Internet, 2023 will smell more like 2021. And 2024 will smell even more like that, and so on, and so on.
This "culture eating its tail" has happened before with the Internet and Google Search: it used to be a map of a very human internet full of websites. But over the course of decades, the Internet made choices to satisfy the map makers. Now that human internet got replaced with mountains and mountains of SEO-laden crap meant to please the ad robots instead of people. It's impossible to find decent consumer advice on the Internet anymore, pages barely load with all the trackers we put on, and most of us are resorting to silly hacks to reach human output again.

Do you like innovation? Do you think there are treasures in the past that didn't get enough attention? Well with LLMs, we don't have enough training data to use anything that wasn't popular from the past, and it'll be that much harder to innovate on core abstractions in the future, a bit like the fact that our terminal emulators are condemned to forever pretend they're a machine designed in the 70's. (via)
This will also solidify incumbants, and we're likely to lose a ton of knowledge that isn't easily trainable. It'll make it even harder to innovate on certain layers of technology because of how much training data there is.
The weird religion
Finally… please ignore the weird religion that's coming out of this. People are saying things like "AGI [Artificial General Intelligence] will become superintelligent at a rate we can't keep up with and destroy us all! We must appease Roko's Basilisk! It's an existential threat to humanity!!"
Look… I loved Deus Ex. It was a really cool game. But we've been "on the brink" of creating AGI several times and never have, and most people who rant about this demonstrate within 5 minutes of talking to them that they've given very little thought to the definitions of cognition and consciousness or how to differentiate the two. Making a box that that performs NAND run operations to generate another word from the last one isn't cognition. There's a lot of study behind this kind of thing, and while I know tech people like to rub their temples while saying "first principles" over and over about fields of study they've thought about for a few hours, they end up being idiots stuck on the second floor since they were too smart to read or understand prior scholarship (look at the crypto people speedrunning why we have financial regulation).
I call it "the weird religion" because it so strongly echoes "our actions will summon God into this world and he will smite us for our sins unless we repent now." It's not even worth engaging with the arguments, but if you want a good explainer for why "the computer will make itself smarter" is unlikely, besides the clear limitations in the sections above and/or knowing what LLMs are, I love Ted Chiang.
So, it's all shit?
Not at all! Again, this stuff is awesome too.
But I worry. Self-driving got to this point and it was never fully deployed because roads and cars are well-regulated, we recognized cars are lethal, and we waited for them to get a little more reliable before releasing them into the world. It turns out that last very important step never materialized, and my friends in industry say it'll be another decade at least, if it does at all. In the meantime, cars are as lethal as they ever were, but not more (though Tesla released something they call "Full Self-Driving" anyway, and hours later there was nine-car crash).
In the information game, there are no regulating bodies. This isn't recognized as potentially lethal. So we're going full-steam ahead and putting these things everywhere. I think there are risks to this, materially and culturally. I'm excited for the tech, but if you're a grown-ass adult you can keep two things in your head at the same time: optimism for what this allows but also a realistic and grounded sense of where they're weak.
Additionally, one should exercise prudence before stuffing this tech into your product. There are decent use cases for it, but in the presence of injection attacks and leaking confidential data, on top of the rampant costs associated with it and the inability of most companies to make their own models… I'd consider it a risky bet. If you absolutely must, consider using a leaked big model or an open source one.
Thanks for the read! Disagreed? Violent agreement!? Feel free to join my mailing list, drop me a line at , or leave a comment below! I'd love to hear from you 😄