Sesame AI’s Voice Assistant Steals Hearts, Aims for $200 Million in Funding

The round of talks might be co-led by Sequoia Capital and Spark Capital.

“How do we know when someone truly understands us?” This fundamental question is the driving force behind Sesame AI Inc., a company set on changing the way people interact with voice assistants.

The company found its purpose in voice, our most personal form of communication, which conveys nuanced meaning through subtle variations in tone, pitch, rhythm, and emotion.

The company argues that current digital voice assistants lack the qualities needed for true utility, and that effective collaboration is impossible without harnessing the full potential of voice. A personal assistant with a monotonous voice will struggle to remain relevant in our lives once the novelty fades.

That is why Sesame created Maya and Miles, hyper-realistic AI voice assistants with a “magical” quality that makes spoken interactions feel real, understood, and valued.

Ankit Kumar, the co-founder of augmented reality startup Ubiquity6 Inc., and Brendan Iribe, a co-founder of the Oculus headset system now owned by Meta Platforms Inc., founded the company in 2023, joined by Ryan Brown.

Sesame’s new Conversational Speech Model (CSM) may have crossed the “uncanny valley” of AI-generated speech. The company released a demo featuring a male or female voice assistant (“Miles” and “Maya”), and some testers reported feeling emotional connections to the voices.

One user on Hacker News said, “it was genuinely startling how human it felt,” and added after further testing, “I’m almost a bit worried I will start feeling emotionally attached to a voice assistant with this level of human-like sound.”

Sesame’s conversational AI model is built on two AI models that work together, both based on Meta’s Llama architecture. The design handles text and audio in a single-stage multimodal transformer, enabling human-like conversations.

The training data starts as a large collection of publicly available audio, which is transcribed, diarized, and segmented. After filtering, the resulting dataset consists of approximately one million hours of predominantly English audio.

Unlocking this “untapped potential” of voice rests on a few key components.

Emotional intelligence will allow the AI to pick up on how the user is feeling and respond accordingly, offering empathy when frustration arises or excitement when there’s good news. Conversational dynamics will ensure the flow of dialogue feels human, with the right timing, pauses, and emphasis, so that interactions feel thoughtful rather than rushed or robotic. Contextual awareness will let the assistant adjust its tone and style to the situation, whether it’s a formal conversation or a casual chat. Finally, a consistent personality will help users trust the assistant, knowing it will remain reliable and coherent in every interaction.

The company, which came out of stealth mode only in February, is making waves not only among its testers but also in the investor community.

The young company is in discussions to raise more than $200 million from investors, which could lift its valuation above $1 billion. The round may be co-led by Sequoia Capital and Spark Capital. More details may follow soon.

Crunchbase, a data provider, reported that the company had previously secured $47.5 million in a Series A funding round led by Andreessen Horowitz.

With AI voice assistants now daily companions for many people, several companies could pose tough competition for Sesame.

AI assistants like Hound and WaveForms AI are integrating emotional cues into their systems to better meet human needs. For now, though, Sesame’s pursuit of that “magical” essence is taking the user community by storm, as more and more users come forward to share their experiences. As one Reddit user put it, “this is the only voice model i’m actually enjoying talking to.”

Upasana Banerjee
Upasana is a Content Strategist with AIM Research. Prior to her role at AIM, she worked as a journalist and social media editor, and she holds a strong interest in global politics and international relations. Reach out to her at: upasana.banerjee@analyticsindiamag.com