Running To Stand Still

The challenge of interpreting silence

This Week In Voice VIP will now be syndicated once a week in The UN Brief, a publication on global digital transformation enjoyed by over 100,000 diplomats, technologists, and executives around the world.

In 1992, the NFL and CBS (which aired the Super Bowl) lost 20-25 million viewers, and a sizable 10 ratings points, to then-upstart FOX, which dared to counterprogram with In Living Color during the typically boring Super Bowl halftime show.

Incensed, the NFL decided to make sure that didn’t happen again the following year, hiring the biggest and baddest entertainer in the world to perform during halftime: Michael Jackson.

The rest is history. Michael Jackson’s otherworldly, memorable performance catapulted the Super Bowl halftime show into one of the biggest and most prestigious gigs in all of entertainment.

What often gets forgotten, though, is what Michael Jackson did at the start of this performance, when he first showed up on stage.


Michael Jackson stood there, almost motionless, for nearly two entire minutes.

A 30-second Super Bowl ad that year cost approximately $850,000, so this silent introduction, which filled the equivalent of four ad slots, was worth roughly $3.4 million in airtime, in 1993 money.

That’s one loud statement.

Back in 2019, I had a chance to hear a talk by Cathy Pearl of Google, in which she mentioned that part of the complexity of human communication is how humans use silence.

One person’s silence toward another can stand in for a wide variety of words or expressions, and the challenge for voice assistants and AI is to use context effectively to figure out what a given silence might mean.

Imagine a voice assistant being thrust into this situation from the 2001 remake of Ocean’s Eleven, in which Danny Ocean (George Clooney) has a one-sided conversation with his friend Rusty (Brad Pitt) about whether they need one more person for their upcoming heist.

Humans have been talking - and sometimes, not talking - to each other for a long time.

We are born with 43 muscles in our face, capable of producing approximately 10,000 unique expressions, all without saying a word.

Adding to the complexity of non-verbal communication, one look, one expression, or one glance might mean one thing in one culture and something completely different in another.

It’s a problem so large that it will take a vast multitude of companies to solve it.

The more we start talking about it, the better.

You’ve got to cry, without weeping.
Talk, without speaking.
Scream, without raising your voice.

Subscribe to This Week In Voice VIP - the daily letter to the voice technology and AI communities.

