Higgsfield AI Thinks Generative Video Still Lacks One Critical Thing

While platforms like Runway, Pika Labs, and OpenAI focus on visual fidelity, Higgsfield has prioritized the grammar of film: the motion, perspective, and spatial composition that shape a story.

“We kept hearing the same thing from creators: AI video looks better, but it doesn’t feel like cinema,” said Alex Mashrabov, founder of Higgsfield AI and former head of AI at Snap. “There’s no intention behind the camera.”

That critique became the foundation for Higgsfield AI, a generative video company focused on bringing cinematic language into AI video, not by enhancing visual fidelity, but by giving creators direct control over how the camera moves through a scene.

Founded by Mashrabov, a pioneer in the AI video space, Higgsfield recently raised $15 million in seed funding, launching into a market where venture capital has flooded into AI video. Luma raised $45 million, while companies like Runway and Pika Labs now command valuations exceeding $500 million. Yet, despite the excitement, Mashrabov is clear-eyed about who’s actually using the technology: “Most of the adoption I think of the video AI comes from professionals today 100%.”

Higgsfield’s technology stems from lessons learned during the launch of Diffuse, a viral app Mashrabov previously developed that lets users create personalized AI clips. While Diffuse found traction, it also revealed the creative limits of short-form, gag-driven content. The Higgsfield team shifted their focus to storytelling, specifically serialized short dramas for TikTok, YouTube Shorts, and other mobile-first platforms.

At the heart of Higgsfield’s offering is a control engine that allows users to craft complex camera movements such as dolly-ins, crash zooms, overhead sweeps, and body-mounted shots using nothing more than a single image and a text prompt. These kinds of movements traditionally demand professional rigs and crews. Now, they’re accessible through presets.

The idea is not just to produce good-looking frames but to make AI video feel intentional and cinematic. Higgsfield is tackling one of the most common criticisms of AI-generated content: that it lacks structure, rhythm, and authorship.

“We’re not just solving style—we’re solving structure,” said Yerzat Dulat, Higgsfield’s Chief Research Officer. The platform directly addresses character and scene consistency over time, still a persistent challenge in generative video tools. Murat Abdrakhmanov, a venture capitalist and experienced angel investor, noted that his rule of thumb is to invest in people, not products. So as much as Higgsfield’s technology revolutionizes AI video generation and content creation, getting to know its founder was just as important.

Higgsfield DoP I2V-01-preview

The company’s proprietary model, Higgsfield DoP I2V-01-preview, is an Image-to-Video (I2V) architecture that blends diffusion models with reinforcement learning. Unlike traditional systems that simply denoise static frames, this model is trained to understand and direct motion, lighting, lensing, and spatial composition: the essential components of cinematography.

By introducing reinforcement learning after diffusion, the model learns to inject coherence, intentionality, and expressive movement into scenes. This approach draws from how RL has been used to give large language models reasoning and planning capabilities.
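For illustration only, since Higgsfield has not published its training code, the toy PyTorch loop below sketches the general pattern of reward-based fine-tuning layered on top of a pretrained denoiser. The TinyDenoiser module and the motion_smoothness_reward signal are hypothetical stand-ins, not the company’s actual model or reward.

```python
# Hypothetical sketch: reward-weighted fine-tuning of a diffusion-style
# denoiser, loosely analogous to applying RL after diffusion training.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Stand-in for a pretrained image-to-video diffusion backbone."""
    def __init__(self, channels=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(channels, 32, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv3d(32, channels, kernel_size=3, padding=1),
        )

    def forward(self, noisy_latents):
        # Predict the noise residual, as in standard diffusion training.
        return self.net(noisy_latents)

def motion_smoothness_reward(latents):
    """Hypothetical reward: penalize large frame-to-frame jumps in latent
    space, a crude proxy for coherent camera motion."""
    frame_diff = latents[:, :, 1:] - latents[:, :, :-1]
    return -frame_diff.pow(2).mean(dim=(1, 2, 3, 4))  # one scalar per sample

model = TinyDenoiser()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for step in range(100):
    # Fake batch of video latents: (batch, channels, frames, height, width).
    latents = torch.randn(4, 8, 16, 32, 32)
    noise = torch.randn_like(latents)
    denoised = latents + noise - model(latents + noise)  # toy "sample"

    # Reward-weighted objective: nudge the denoiser toward samples the
    # reward scores highly (a simplified policy-gradient-style surrogate).
    reward = motion_smoothness_reward(denoised.detach())
    fit = -(model(latents + noise) - noise).pow(2).mean(dim=(1, 2, 3, 4))
    loss = -(reward * fit).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice the reward would come from a learned model of cinematic quality rather than a hand-written heuristic, but the structure of the loop (pretrained denoiser, scalar reward, reward-weighted update) is the same general idea.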

Built on AMD Instinct™ MI300X with TensorWave

Higgsfield built and tested its model in partnership with TensorWave, deploying on AMD Instinct™ MI300X GPUs. Using TensorWave’s AMD-based infrastructure and pre-configured PyTorch and ROCm™ environments, the team ran inference workloads without custom setup—allowing them to evaluate model performance and stability under real-world conditions.
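For context, the snippet below is a minimal, hypothetical smoke test of the kind of PyTorch inference workload that runs unchanged on ROCm, where AMD GPUs are exposed through PyTorch’s familiar "cuda" device API. The stand-in transformer layer is not Higgsfield’s model, which has not been released.

```python
# Hypothetical ROCm/PyTorch inference smoke test; not Higgsfield's workload.
import torch

def describe_backend():
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        hip = getattr(torch.version, "hip", None)  # set on ROCm builds
        print(f"Accelerator: {name} (ROCm/HIP: {hip})")
    else:
        print("No GPU visible; falling back to CPU.")

def run_inference_smoke_test():
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Stand-in model: a small transformer encoder layer instead of the
    # proprietary Higgsfield DoP I2V model.
    model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
    model = model.to(device).eval()
    batch = torch.randn(4, 128, 512, device=device)
    with torch.no_grad():
        out = model(batch)
    print(f"Output shape: {tuple(out.shape)} on {device}")

if __name__ == "__main__":
    describe_backend()
    run_inference_smoke_test()
```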

Filmmaker and creative technologist Jason Zada, known for Take This Lollipop and brand work with Intel and Lexus, produced a short demo titled Night Out using Higgsfield’s platform. The video features stylized neon visuals and fluid, high-impact camera motion—all generated within Higgsfield’s interface.

“Tools like the Snorricam, which traditionally require complex rigging and choreography, are now accessible with a click,” Zada said. “These shots are notoriously difficult to pull off, and seeing them as presets opens up a level of visual storytelling that’s both freeing and inspiring.”

John Gaeta, the Academy Award–winning visual effects supervisor behind The Matrix and founder of escape.ai, praised Higgsfield’s system for pushing creators closer to having “total creative control over the camera and the scene.” Gaeta’s platform escape.ai focuses on films created with AI, game engines, and other emerging tools.
