Twelve Labs Secures $30M to Build Deeper, More Adaptable Video AI Solutions

Our sole focus on video allows us to develop solutions that go deeper and are more adaptable.

Video content is exploding, but pinpointing a specific moment inside a video remains surprisingly hard. Jae Lee, co-founder and CEO of Twelve Labs, recognized this challenge and built a company dedicated to solving it. Twelve Labs, a San Francisco-based startup, is pioneering video understanding AI, and its latest funding round, $30 million from key strategic investors, signals the industry’s confidence in its groundbreaking approach.

Nvidia has backed Twelve Labs in its effort to build AI that understands video the way humans do. Lee mentioned that Jensen Huang, Nvidia’s CEO, has always had a special place in his heart for computer vision, with video understanding being one of the first use cases powered by Nvidia chips. According to Lee, Nvidia’s vision and expertise in video understanding made for a perfect match with Twelve Labs’ mission.

The new investment round included heavyweights like Databricks, Snowflake Ventures, SK Telecom, HubSpot Ventures, and In-Q-Tel (IQT). This funding not only brings the company’s total raised to $107.1 million but also solidifies partnerships that promise to reshape enterprise video analytics.

At the heart of Twelve Labs’ innovation are its proprietary multimodal foundation models, Marengo and Pegasus. These models go beyond basic keyword searches, diving into the nuances of video content, from actions and objects to background sounds. Whether it’s precise semantic search, summarization, or automated content moderation, these tools bring human-like understanding to video data.

Jae Lee, a data scientist by training, believes current video search tools fall short. “Finding a specific moment or angle in videos can be like looking for a needle in a haystack,” Lee explained. Existing keyword searches may pull up titles or tags but fail to penetrate the actual content. Twelve Labs addresses this gap by mapping text to what’s happening inside a video.

The company’s flagship models aren’t just tools for content creators—they’re platforms for building next-gen applications. Developers can integrate these models into their workflows to generate highlight reels, moderate content, or even guide ad insertions.
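
To make that concrete, here is a minimal sketch of what a semantic search call might look like with Twelve Labs’ Python SDK. The method names, parameters, and response fields follow the SDK’s documented surface but may differ by version, and the API key and index ID are placeholders, so treat this as an illustration rather than a drop-in snippet.

```python
# Minimal sketch: semantic search over an existing video index with the
# `twelvelabs` Python SDK (pip install twelvelabs). Method names and
# response fields are assumptions based on the SDK docs and may vary
# by version.
from twelvelabs import TwelveLabs

client = TwelveLabs(api_key="YOUR_API_KEY")  # placeholder credential

# Describe the moment in natural language; the model matches against
# what is seen and heard, not just titles or tags.
results = client.search.query(
    index_id="YOUR_INDEX_ID",  # placeholder index of uploaded videos
    query_text="game-winning touchdown in the final minute",
    options=["visual", "audio"],
)

for clip in results.data:
    # Each hit carries a time range into the source video, which is the
    # building block for highlight reels, moderation queues, or ad cues.
    print(clip.video_id, clip.start, clip.end, clip.score)
```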

How It All Started

Twelve Labs’ journey began in 2020, founded by Lee after he completed his computer science studies at UC Berkeley and served in the Ministry of National Defense’s Cyber Operations Command in South Korea. While serving, Lee and his colleagues, who shared a passion for AI, began discussing AI technologies and research papers, conversations that ultimately led to the founding of Twelve Labs. Despite the market’s focus on text- and image-based AI at the time, Lee and his co-founders saw an opportunity in video, a field they believed could be revolutionized even with limited initial investment.

This drive to make an impact in the AI field with video led to the development of their multimodal AI models. Their first major breakthrough was the creation of Pegasus, a model that could analyze videos and answer questions about their content. Since then, Twelve Labs has expanded its reach, working with major organizations like the National Football League (NFL) to help them monetize their vast video archives.

What Sets Twelve Labs Apart

While major players like Google and Microsoft offer video analytics, Twelve Labs distinguishes itself with its video-first philosophy and unparalleled customization options. “General-purpose multimodal models aren’t optimized for video,” said Lee. “Our sole focus on video allows us to develop solutions that go deeper and are more adaptable.”

One of Twelve Labs’ standout products, Marengo, now in version 2.7, applies a multi-vector approach to video understanding: rather than compressing a clip into a single embedding, it represents each clip with several vectors. The company credits this methodology with significant gains in accuracy, positioning it as a game-changer for industries like media, entertainment, and professional sports.
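
Twelve Labs has not published Marengo’s internals, but the general idea behind multi-vector retrieval is easy to illustrate: keep several embeddings per clip (for example, one per segment or modality) and score a query against all of them, letting the best-matching facet win. A minimal NumPy sketch with made-up dimensions:

```python
# Illustration of multi-vector vs. single-vector retrieval scoring.
# Dimensions and data are synthetic; this shows the general technique,
# not Marengo's actual implementation.
import numpy as np

def normalize(v):
    # Unit-normalize so dot products equal cosine similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n_clips, k, dim = 100, 8, 512  # clips, vectors per clip, embedding width

clip_single = normalize(rng.normal(size=(n_clips, dim)))    # 1 vector/clip
clip_multi = normalize(rng.normal(size=(n_clips, k, dim)))  # k vectors/clip
query = normalize(rng.normal(size=(dim,)))

# Single-vector score: one cosine similarity per clip.
single_scores = clip_single @ query

# Multi-vector score: compare the query with every vector of a clip,
# then take the maximum, so a clip matches if any of its facets match.
multi_scores = (clip_multi @ query).max(axis=1)

print("best clip (single-vector):", int(single_scores.argmax()))
print("best clip (multi-vector): ", int(multi_scores.argmax()))
```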

Beyond video, Twelve Labs is branching into “any-to-any” search, enabling queries across images, text, and audio. Its Embed API creates multimodal embeddings, numerical vector representations that place video, text, audio, and images in a shared space so that complex cross-modal queries can be answered efficiently and accurately.
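
As a sketch of how such embeddings enable any-to-any search: embed a text query and candidate items from other modalities into the shared space, then rank everything by cosine similarity. The vectors below are synthetic stand-ins for Embed API output.

```python
# Any-to-any search sketch: rank items from mixed modalities against a
# text query in one shared embedding space. Vectors here are synthetic
# stand-ins for what an embedding API would return.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

dim = 1024  # illustrative embedding width
rng = np.random.default_rng(1)

# Pretend these came back from the Embed API for different media types.
catalog = {
    "video:press_conference.mp4": rng.normal(size=dim),
    "image:stadium_photo.jpg": rng.normal(size=dim),
    "audio:podcast_clip.mp3": rng.normal(size=dim),
}
text_query = rng.normal(size=dim)  # embedding of "coach postgame remarks"

# Because all modalities share one space, a single ranking works.
ranked = sorted(catalog, key=lambda k: cosine(catalog[k], text_query),
                reverse=True)
for item in ranked:
    print(item, round(cosine(catalog[item], text_query), 3))
```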

The new funding round isn’t just about capital—it’s about strategic alignment. Databricks and Snowflake, two of the world’s leading enterprise infrastructure providers, are integrating Twelve Labs’ technology into their platforms, a move that validates the company’s growing influence.

Databricks has developed an integration that allows customers to invoke Twelve Labs’ embedding service directly from their data pipelines. This reduces development time and enhances workflows, enabling real-time video analytics and large-scale content classification. Andrew Ferguson, VP at Databricks Ventures, emphasized the importance of this partnership: “Twelve Labs’ technology fills an important gap in the current AI ecosystem. Integrating their Embed API with our Mosaic AI Vector Search overcomes challenges in processing large-scale video datasets.”
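
A hedged sketch of what that pipeline step might look like with the databricks-vectorsearch client: an embedding produced upstream (for example, by Twelve Labs’ Embed API) is used to query a Mosaic AI Vector Search index. The endpoint, index, and column names are placeholders, and the query vector is a dummy stand-in.

```python
# Sketch: querying a Mosaic AI Vector Search index with a precomputed
# video embedding (pip install databricks-vectorsearch). Endpoint,
# index, and column names are placeholders; this assumes a configured
# Databricks workspace.
from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()
index = vsc.get_index(
    endpoint_name="video-search-endpoint",      # placeholder
    index_name="main.media.video_embeddings",   # placeholder
)

# In a real pipeline this vector would come from the Embed API rather
# than being a dummy value.
query_vector = [0.0] * 1024  # stand-in embedding

hits = index.similarity_search(
    query_vector=query_vector,
    columns=["video_id", "start_sec", "end_sec"],  # placeholder columns
    num_results=5,
)
print(hits)
```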

Snowflake, meanwhile, is leveraging Twelve Labs’ multimodal video embeddings in its Cortex AI service. This collaboration aims to enhance video search, personalization, and creative applications, all while maintaining Snowflake’s rigorous data security and governance standards. Bill Stratton, Snowflake’s Global Head of Media, Entertainment, and Advertising, noted the potential: “Our investment will unlock opportunities for customers to leverage AI without copying or moving their data.”

New Leadership Appointment

To keep pace with its ambitious growth plans, Twelve Labs recently hired Yoon Kim as President and Chief Strategy Officer. Dr. Kim brings decades of experience, including pivotal roles in the development of Apple’s Siri and AI innovation at SK Telecom. He will focus on scaling Twelve Labs’ global presence, securing top AI talent, and driving strategic acquisitions.

“Yoon is the right person to help us execute,” said Jae Lee. “His expertise in AI and global business strategies will be instrumental in achieving our ambitious goals.”

Dr. Kim echoed this optimism, stating, “Twelve Labs is on the verge of establishing clear and sustainable leadership in video understanding AI. I’m excited to help fulfill the mission of enabling humans and machines to understand all video content in the world.”

Twelve Labs is set on expanding into new verticals, including automotive and security. While Lee declined to specify, the involvement of In-Q-Tel—a venture arm known for supporting U.S. intelligence capabilities—hints at potential applications in national security and defense. “We’re always open to exploring opportunities where our technology can have a positive, meaningful, and responsible impact,” said Lee.
