Table of Contents
- What Sparked the Blow-Up: “Sky,” GPT-4o, and a Very Familiar Vibe
- Johansson’s Version of Events: “No Thanks”… Then “Wait, What?”
- OpenAI’s Response: “Not Her,” But Also “Paused”
- Why a Voice Can Be a “Likeness” in U.S. Law
- The Ethics Question: Even If It’s “Not Her,” Why Does It Feel Like Her?
- Hollywood’s Bigger AI Anxiety (It’s Not Just About One Voice)
- What Brands, Creators, and AI Companies Can Learn From This
- 1) Consent is not a “nice-to-have.” It’s the product requirement.
- 2) “Sound-alike” is a reputational risk, even when it’s legal.
- 3) Transparency beats mystery, especially with voices.
- 4) Product design choices communicate “intent,” even when you don’t say it out loud.
- 5) Pay voice actors well, and still protect them from backlash.
- Where Things Stand and What to Watch Next
- Real-World Experiences: What This Dispute Feels Like Across the Ecosystem
- Conclusion: The Future Will Talk Back, But Who Gets to Own the Sound?
For one brief moment, the internet lived in a very specific sci-fi fantasy: your chatbot could talk back like a calm,
charming, human-sounding assistant: no robotic beep-boop vibes, no “press 1 for disappointment.” Then came the part where
everyone asked the same question at once: “Wait… is that Scarlett Johansson?”
In May 2024, OpenAI demoed a more natural, more expressive voice experience for ChatGPT, and one voice option, “Sky,” sparked
immediate comparisons to Johansson’s famously recognizable performance as the AI assistant “Samantha” in the film
Her. Johansson soon issued a statement saying OpenAI had previously asked her to voice the system, she declined,
and yet the final product used a voice she described as “eerily similar” to her own. OpenAI denied that Sky was Johansson,
said it was performed by a different professional voice actor, and paused the Sky voice anyway.
If this story feels like a tech-meets-Hollywood soap opera, that’s because it is. But it’s also a real-time case study in
how AI voice tools collide with consent, celebrity voice rights, and the messy human fact that “similar” can be legally
complicated, even when the code is shiny and the demo is smooth.
What Sparked the Blow-Up: “Sky,” GPT-4o, and a Very Familiar Vibe
OpenAI’s big reveal in May 2024 centered on a newer model and a more fluid voice experience, something closer to normal
conversation than the classic “I will now read my answer in a monotone paragraph” approach. The trouble (for OpenAI) was
that voice isn’t just an interface choice. It’s identity. It’s vibe. It’s brand. It’s also the fastest route to “Hey, I
know that person,” even when you don’t.
“Sky” wasn’t marketed as Johansson. But to many listeners, it carried a familiar cadence: warm, intimate, and
Her-adjacent enough that comparisons spread like glitter: instantly and everywhere. Adding fuel to the fire, OpenAI
CEO Sam Altman posted the single word “her” on social media after the demo, a wink that, fairly or not, many people read as
a deliberate reference to Johansson’s film. Suddenly, the question wasn’t just “Is that her voice?” It became: “Did OpenAI
want people to think it was her?”
Johansson’s Version of Events: “No Thanks”… Then “Wait, What?”
In her statement (shared through her publicist), Johansson described being contacted about voicing a ChatGPT system months
earlier. She said Altman told her that her voice could help bridge the gap between tech companies and creatives, and make
people feel comfortable with what he called a “seismic shift” around humans and AI. She declined “for personal reasons.”
Then, when OpenAI’s updated voice experience arrived, she said friends, family, and the public noticed how much “Sky”
sounded like her. Johansson said she was “shocked, angered and in disbelief” that OpenAI would pursue a voice so close to
hers that, in her view, people around her couldn’t tell the difference.
She also said Altman reached back out shortly before the demo, asking her team to reconsider, yet the demo went live before
they could connect. According to her statement, she hired legal counsel, and her lawyers sent letters requesting details
about how the Sky voice was created and selected. She framed her response as bigger than one product: in an era of
deepfakes and identity misuse, she argued, people deserve “absolute clarity,” and she called for transparency and
“appropriate legislation” to protect individual rights.
OpenAI’s Response: “Not Her,” But Also “Paused”
OpenAI’s official position was direct: Sky was not Scarlett Johansson, and it was not intended to mimic her. The company
published a post explaining how ChatGPT voices were chosen, describing a months-long casting process with professional
voice actors, agencies, and casting directors. OpenAI said it believes AI voices should not deliberately mimic a celebrity’s
distinctive voice, and that Sky belonged to a different professional actress using her natural speaking voice. OpenAI also
said it would not name the voice talent to protect privacy.
Here’s the detail that matters: OpenAI said it had cast the voice actor behind Sky before any outreach to Johansson, and
that Johansson was approached separately about potentially becoming an additional voice option. In the same breath, OpenAI
said it paused Sky “out of respect” for Johansson and apologized for not communicating better.
That combination (deny, explain, pause) left the public with two competing impressions:
(1) OpenAI did nothing wrong but stepped back to reduce conflict, or
(2) OpenAI sensed reputational smoke and decided not to wait for flames.
Either way, pausing the voice wasn’t just a product tweak. It was a message: voice likeness is sensitive enough that even
“we didn’t mean it” won’t automatically end the conversation.
Why a Voice Can Be a “Likeness” in U.S. Law
Most people think of identity misuse in terms of faces and names. But U.S. law has long treated voice as part of a person’s
recognizable identity, especially when used commercially. That’s where the “right of publicity” comes in: a state-based
set of laws (not a single federal statute) that can give individuals control over the commercial use of their identity.
Two classic voice cases that still shape the conversation
Long before AI voice cloning, courts dealt with sound-alikes. In Midler v. Ford Motor Co. (1988), Bette Midler
successfully sued after Ford used a singer to imitate her voice in an ad. In Waits v. Frito-Lay (1992), Tom Waits
won a case involving a vocal imitation used in a commercial. The underlying point is simple: if you use a voice that’s
deliberately designed to evoke a specific person, “but it’s not literally them” may not be a get-out-of-jail-free card.
That doesn’t mean every similar-sounding voice is illegal. Voices overlap. Accents overlap. “Warm mezzo-soprano with a
calm, intimate cadence” is not one human being’s exclusive property. The legal fight often turns on intent, marketing,
and whether the use is likely to confuse the public about endorsement or identity.
The Ethics Question: Even If It’s “Not Her,” Why Does It Feel Like Her?
This dispute lives in a uniquely modern gray area: a company can hire a real actor, pay them fairly, and still end up with
a product that feels like a celebrity impression, especially if cultural references nudge the audience in that
direction.
Consider the full context: a voice assistant that sounds similar to a star who voiced an AI assistant in a famous movie;
a CEO who openly enjoys that movie; a “her” social post that reads like a wink; and a product demo that went viral fast.
None of that proves wrongdoing. But it illustrates how “implied association” can happen without a single explicit label.
In other words: people don’t experience technology in a courtroom. They experience it in a swirl of vibes, context, and
pop culture. If a company wants the “Samantha from Her” feeling without paying for “Samantha from Her,”
it shouldn’t be surprised when the internet plays detective.
Hollywood’s Bigger AI Anxiety (It’s Not Just About One Voice)
Johansson’s complaint landed in the middle of broader entertainment-industry worries about AI: synthetic performers, voice
cloning, digital replicas, and the growing ease of generating convincing fakes. Actor unions and creative guilds have been
pushing for clearer rules around consent, compensation, and transparency.
And Johansson didn’t stop at the OpenAI moment. In 2025, she publicly urged lawmakers to prioritize AI safety legislation
after an AI-generated video using celebrity likenesses spread online, arguing that the misuse of AI can distort reality and
amplify harm at scale. Whether you see that as caution, frustration, or both, the through-line is consistent: she’s
positioning voice and likeness as personal rights that shouldn’t become default training data for the future.
What Brands, Creators, and AI Companies Can Learn From This
1) Consent is not a “nice-to-have.” It’s the product requirement.
If your product resembles someone’s distinctive identity (voice, face, name, or signature style), permission should be
explicit, written, and specific. “We asked once” is not consent. “We didn’t literally copy” is not consent. And “people
just happened to think of her” is not a strategy you want to defend in public.
2) “Sound-alike” is a reputational risk, even when it’s legal.
The court of public opinion moves faster than the court system. If thousands of users hear “Sky” and immediately think of
Johansson, the company has a perception problem even if it believes it has a legal defense.
3) Transparency beats mystery, especially with voices.
OpenAI said it couldn’t disclose voice talent names for privacy. That may be reasonable. But the tradeoff is predictable:
secrecy invites speculation. Companies working with synthetic voices should be ready to explain process, safeguards, and
guardrails in plain language, before the controversy, not after.
4) Product design choices communicate “intent,” even when you don’t say it out loud.
Names, marketing language, demo staging, and yes, cheeky social posts can suggest inspiration. If you don’t want audiences
to connect the dots, don’t hand them the marker and a fresh sheet of paper.
5) Pay voice actors well, and still protect them from backlash.
One overlooked part of this story: when a voice becomes controversial, the performer behind it can get caught in the
crossfire. Companies should not only compensate talent fairly, but also plan for how to protect voice actors from being
harassed, doxxed, or pressured to “prove” they’re the real source.
Where Things Stand and What to Watch Next
The immediate outcome was practical: OpenAI paused Sky and publicly reaffirmed it would not intentionally mimic a
celebrity’s voice. Johansson, meanwhile, framed her response as part of a larger push for rights protection in an AI era.
What happens next in similar disputes will likely turn on three questions:
- How do we define “distinctive voice likeness” when AI makes near-infinite variations easy?
- What standard of proof matters: training data, casting records, internal intent, or public confusion?
- Which laws will lead: state publicity rights, federal consumer protection, union contracts, or new AI-specific legislation?
The bigger takeaway is that voice is becoming a new “face” online: instantly recognizable, easily imitated, and hard to
control once it escapes into the wild. If you’re building voice AI, you’re not just shipping a feature. You’re shipping
identity-adjacent tech, and people will react accordingly.
Real-World Experiences: What This Dispute Feels Like Across the Ecosystem
To understand why the Johansson–OpenAI moment hit such a nerve, it helps to look beyond headlines and into the lived
experiences orbiting voice AI.
For performers and voice actors, the “Sky” controversy feels like a preview of the next decade of work
anxiety. Voice acting has always involved range (accents, tones, characters), but AI adds a new fear: that your “natural
speaking voice” could be turned into a product feature, endlessly reusable, and forever associated with a brand you didn’t
choose. Even when a company hires a legitimate actor, other creatives watch closely because the precedent matters. If a
famous voice can be “evoked” without being hired, what does that mean for everyone else who’s less famous and has fewer
lawyers on speed dial?
For tech teams building voice experiences, the lesson is that audio isn’t just another skin. Engineers
can optimize latency, reduce interruptions, and polish turn-taking, but the moment a voice reminds users of a specific
person, the conversation changes from “cool feature” to “who authorized this?” Product managers learn quickly that voice
selection requires cultural sensitivity: what sounds “warm and approachable” to one listener can sound like “celebrity
imitation” to another. And once social media latches onto a comparison, no FAQ can fully put the toothpaste back in the
tube.
For everyday users, the experience is oddly emotional. People don’t just hear a voice; they build trust
with it. A voice can feel comforting, patient, and present, especially when it responds quickly and remembers context. So
when users suspect a voice is borrowed from a real person without permission, the product suddenly feels less magical and
more unsettling. Users may still want the convenience of an AI assistant, but they don’t want to feel like they’re
participating in something ethically questionable every time they ask for a grocery list.
For brands and marketers, this controversy is a cautionary tale about “familiarity” as a growth hack.
In the short term, a recognizable vibe can drive attention. In the long term, it can trigger backlash, legal threats, and
a trust deficit. The safest path is also the simplest: if you want a celebrity voice, license it transparently. If you
want a “celebrity-adjacent” vibe, don’t design your product so that consumers reasonably conclude it’s the celebrity, and
don’t wink at the comparison. In an era of deepfakes, ambiguity reads like intent.
For policymakers and advocates, this is the kind of story that turns abstract AI ethics into a concrete
dinner-table argument. People might ignore debates about “synthetic media,” but they understand “someone sounds like me
without my permission.” That’s why voice likeness and digital identity laws keep popping up in conversations about AI
governance: the tech is advancing faster than the rules, and high-profile incidents become the pressure points that force
clarity.
Put all that together and you get the real reason this story lingered: it’s not only about Scarlett Johansson. It’s about
how quickly “your voice” is becoming something that can be copied, approximated, and productized, and how urgently society
needs shared norms about what’s allowed, what’s ethical, and what requires a clear, documented “yes.”
Conclusion: The Future Will Talk Back, But Who Gets to Own the Sound?
Scarlett Johansson’s criticism of OpenAI wasn’t just celebrity outrage; it was a flare shot up over a messy new frontier.
When AI can speak naturally, the human instincts around recognition and trust kick in instantly. That makes voice a powerful
design choice, and a legal and ethical tripwire.
OpenAI says Sky was not Johansson and wasn’t meant to mimic her, yet it still paused the voice. Johansson says the similarity
was obvious enough to alarm the people closest to her, and she wants transparency and stronger protections. Between those two
claims sits the reality of modern AI: “close enough” can be the whole controversy.
The safest future for voice AI probably looks boring on paper: clear consent, clean documentation, transparent casting, and
fewer winks at pop culture. But if the alternative is a world where anyone’s identity becomes a vibe anyone can monetize,
maybe boring is the new revolutionary.