David Senra is contagious. Wait, no. David Senra’s enthusiasm for studying founders is contagious. Yeah, that’s probably right. And he spreads his enthusiasm through his voice.
Last week, Senra replayed an older episode of Founders on Mike Bloomberg (founder of Bloomberg L.P. and former Mayor of New York City). I still have this line echoing through my mind: “Why not just quit then? […] Real entrepreneurs never do.” I’ve read a lot of books about entrepreneurship. I love reading books. I love holding books in my hand and sitting next to a wall-to-wall bookcase every day. But, man, the magnified impact of voice is hard to ignore. Let me demonstrate. I recorded this excerpt from the Founders episode where Bloomberg talks about quitting in response to the myriad headaches that come when a company grows large.
Real entrepreneurs never stop.
Hearing it makes you more likely to remember it. If you want to make it even more useful, say it out loud yourself. Write it on the whiteboard. Draw it. Play with it. Audio, I have found, allows me to play with ideas in different ways than just writing them on the digital page (Ben Thompson will express a similar sentiment later in this article).
Listen to this
Voice and audio have really changed our lives over the past few years. Podcasts and audiobooks have embedded themselves into our culture. For me, I started listening to EconTalk in 2011. I can still remember the scenery of my runs listening to Russ Roberts. Founders, Invest Like the Best, Freakonomics, Knowledge Project, Lenny’s Podcast, Analytics Power Hour, and Practical AI are among the podcasts I listen to on a weekly basis now. I also usually go search for specific topics (such as “voice and AI”).
Pew Research (via Edison Research) shows that 70% of people in the US are active [weekly] listeners of online audio.
Radio has been around for a long time, yes, but the variety of the content today is orders of magnitude larger than it used to be.
Lots of people tell me that they’ve read such and such books, only to specify that they listened to the audiobook. In terms of revenue, printed books accounted for 74.7% of the industry’s $26 billion revenue, while e-book sales generated only 7.48%. That means other formats, primarily audiobooks, accounted for the remaining 17.82% of revenue.
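If you want to check that arithmetic yourself, here is a minimal sketch in Python. The percentages and the $26 billion total are the ones quoted above; the function and variable names are just illustrative, and the dollar figure for "other formats" is my own back-of-the-envelope derivation rather than a number from the source. It also can't tell you how much of that remainder is audiobooks specifically.

```python
# Revenue shares quoted above, as percentages of total industry revenue.
TOTAL_REVENUE_BILLIONS = 26.0
PRINT_SHARE = 74.7
EBOOK_SHARE = 7.48


def remaining_share(*known_shares: float) -> float:
    """Percentage left over after subtracting the known shares from 100."""
    return round(100.0 - sum(known_shares), 2)


# Everything that is neither print nor e-book (hypothetical grouping name).
other_share = remaining_share(PRINT_SHARE, EBOOK_SHARE)

# Convert that share into dollars, rounded to the nearest $10 million.
other_revenue_billions = round(TOTAL_REVENUE_BILLIONS * other_share / 100, 2)

print(f"Other formats: {other_share}% of revenue")
print(f"About ${other_revenue_billions} billion of ${TOTAL_REVENUE_BILLIONS} billion")
```

In other words, the leftover slice works out to somewhere around $4.6 billion, which is more than double what e-books bring in on these numbers.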
And Speechify. I can’t forget Speechify. Over 1 million people have downloaded the Speechify Chrome plugin. The ability to listen to text is getting easier every single day.
Anecdotally at least, audio technology has changed our lives on the listening side. What about the creation side? The ability to create a podcast or distribute an audiobook has exploded in the past decade. The ways in which we can send voice messages to others have also dramatically improved. Although the use of audio technology has transformed our listening experiences, the creation side remains asymmetrical.
One way to look at this: if I estimate that there are 5-10 million people making podcasts… that’s 1.5% of the population who are combining their voice with technology to create [1]. If you are thinking that this is also true for the percentage of people that write books (since 80% of people are literate), it doesn't appear to be the case. This small survey estimates that ~8% of people have written a book. And even though practically everyone writes text messages or emails, only 30% of people send audio messages on a weekly basis.
Voice and audio creation might be underutilized.
Ben Thompson on ChatGPT voice
Technology is getting us much closer to doing all sorts of things with voice. When I say technology, I mean AI.
A few weeks ago on the Sharp Tech podcast, Ben Thompson talked about the potential for AI smart glasses and ChatGPT voice. By default, he expected the vision capability to be the killer product and the voice-only feature to be a novelty. I augmented the transcript of what he said (with the help of ChatGPT) and recorded it myself. Take a listen…
The talking (ChatGPT voice) was so compelling.
This is a step change in the experience of bots.
Emerging landscape of voice technology
Looking across the landscape of voice and AI for creating, there are several interesting ideas to be aware of…
→ Audio messages. Voice mail is a thing of the past. It got rejected hard in favor of text messages. In 2023, the middle-of-the-bell-curve sentiment is that voice is for podcasts, audiobooks & synchronous meetings. Not so fast! Audio messages are making a comeback, especially in the younger generation and especially in cross-timezone work teams. I remember Patrick O’Shaughnessy talking about this last year:
Despite voice aversion, the capability of AI to make your voice sound clear and crisp will attract more and more people to this medium as the days go by.
→ ChatGPT voice-only. As mentioned by Ben Thompson above, this is an early beta feature.
and I have experimented with the voice-only LLM experience… and it’s complex!
→ Intelligent agents. Robb Wilson has been working on voice technology for decades. In Age of Invisible Machines, he writes:
Nothing could make interactions with machines easier than using our most natural forms of communication. Speech and text are methods for sharing information that nearly everyone on the planet can leverage without special training.
Robb sees voice as the way to build rapid automation systems. Just imagine being able to create Zapier Zaps in minutes using voice guidance.
→ Speech to text. Dragon speech is a decades-old technology. The problem is, it’s decades old. Speech to text is still foundational, but LLMs can now take us several strides beyond transcripts.
→ Voice to writing. The days of saying stuff like “Hey Bill, comma, I am ten minutes away, period” are over. The number of voice-to-writing tools is growing (Otter.ai, Audio pen, Audio notes, Ramble Fix, The Oasis). And of course my tool, Storied, falls into this group.
→ Emotion detection. Emotion detection in text has been around for decades [2], but emotion detection in audio is starting to become a real thing. It might help with understanding your own sentiments about something as well as those of others. However, it looks like there are still a lot of issues with this due to recording equipment, environment, and placement (distance or angle) of the recording device relative to the source.
The future expands with the use of voice
One of the most memorable articles I’ve ever read was from my friend Chris Sweeney called The Remarkable Life of Roxie Laybourne. The content was surely engaging; a lot of it was about planes flying into birds and the consequences. The reason it has stuck with me for so long, I believe, goes beyond Chris’s written words. The article also includes images and audio clips of Roxie. The multimodal effect really let me play with the story in my mind.
Some of the most memorable conversations I’ve ever had at work were through audio messages. During my last few months at Target, I had conversations with my team members through an asynchronous voice exchange. If inclusion is about being heard, these conversations helped me hear 10x better than any regular 1:1 discussion.
I’ve encountered individuals who are certain that innovative combinations of voice and AI won’t change the way we communicate at work. I’m 100% certain [3] that, given the rapid technology advancements in the space and the powers that voice gives us (speed & human connection), these individuals have fooled themselves into believing that the future world looks like the world today. And that’s 100% wrong, because entrepreneurs never stop.
[1] These are ballpark estimates. You’d be wise to not reuse them :)
[2] An earlier piece of writing in which I tried to extract emotions from vision docs (documents used in coaching engagements).
[3] I never say I’m 100% certain of anything. But I feel comfortable saying that “I know” the future doesn’t look like today.