The paradox of AI

I remember back in 1999 when I got to try speech recognition software for the first time. It required a training phase where I had to read paragraphs and paragraphs of text out loud so it could “pattern match” my voice. I was excited enough that I did the chore and waited for the moment of truth. I was about to experience magic; I was about to use “AI” for the first time.

Spoiler alert: it didn’t last more than 10 minutes. I had a couple of “wow” moments, but then I ditched it and never looked at it again.

Why? 

Because it was good as a novelty, but not good enough to be part of my life.

Ring a bell? You bet. It happens all the time with AI. One could even argue it’s the main reason behind past “AI winters”.

This is the paradox that (some) AI systems suffer from: they are impressive at first sight, but if they do not work ~100% of the time, they are utterly unreliable, and hence depressingly useless.

Speech recognition is one example of this. The tech has been around for over 20 years, but its wide adoption had to wait until deep learning made it consistently accurate.

(By the way, I am not saying it wasn’t useful prior to that: it definitely made it into accessibility features and helped a large number of disabled people. But those who benefited basically did not have a choice; it was that flawed software or nothing.)

Back to the paradox of AI. When I see a major social media platform boasting that “94% of hate speech is taken down by AI”, I can’t help but cringe. At that rate, and for such an application, I would not even call it working at all. Not to mention the disasters it creates in languages like Ge’ez and Burmese, where genocides are involved. When I read a prominent ML researcher prophesying (in 2016) that AI would replace all radiologists, and in today’s debates that AI will eventually replace artists and creators, I can’t help but think we haven’t fully learned the lesson (in addition to thinking “we should leave these prognostics to labor experts”).

Now we are seeing large language models like ChatGPT and image generators like Stable Diffusion produce truly impressive stuff. We also know they are not fully reliable and fail rather unpredictably. Their answers are sometimes as convincing as they are wrong, when they are not plain dangerous: for example, when a search engine powered by such models advises harmful measures to people searching for what to do in case of a seizure.

The bitter lesson of machine learning here is that it is probably approximately correct (as Gael nicely puts it, hinting at a well-known ML theoretical framework), and as such, we need to be careful with its definition of “working” and remain extremely cautious about its impact on people’s lives.
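(To make the pun concrete, here is a rough sketch of the PAC guarantee, using the standard accuracy parameter ε and confidence parameter δ: the learner is only required to output a hypothesis h such that

$$\Pr\big[\mathrm{err}(h) \le \epsilon\big] \ge 1 - \delta.$$

Both the “probably” (the 1 − δ) and the “approximately” (the ε) are baked into the guarantee; certainty never is.)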
