
Galileo Releases ‘Luna’ to Light Up Enterprise Gen AI Evaluation

The key innovation in Luna is the use of right-sized, purpose-built EFMs rather than massively overparameterized models.

In a move that could accelerate the enterprise adoption of generative AI, startup Galileo has unveiled a novel suite of AI models specifically designed for evaluating the outputs of large language models like GPT-3. 

The new offering, dubbed Galileo Luna, represents a first-of-its-kind approach to GenAI evaluation using what the company calls Evaluation Foundation Models (EFMs). These specialized models are finely tuned for tasks like detecting hallucinations, toxic language, data leaks, and malicious prompts in the responses from AI systems.

“For gen AI to achieve mass adoption, it’s crucial that enterprises can evaluate hundreds of thousands of AI responses for hallucinations, toxicity, security risk, and more, in real time,” said Vikram Chatterji, Co-Founder and CEO of Galileo. “In speaking with customers, we found that existing approaches, such as human evaluation or LLM-based evaluation, were too expensive and slow, so we set out to solve that. With Galileo Luna®, we’re setting new benchmarks for speed, accuracy, and cost efficiency. Luna® can evaluate millions of responses per month 97% cheaper, 11x faster, and 18% more accurately than evaluating using OpenAI GPT3.5.”


The key innovation in Luna is the use of right-sized, purpose-built EFMs rather than massively overparameterized models. This approach yields major gains in evaluation speed, cost, and precision over conventional techniques, Galileo claims.

According to Galileo, Luna exceeds industry benchmarks for detecting issues such as hallucinations by up to 20%, costs 30 times less than traditional methods, delivers evaluations in milliseconds, and does not require expensive ground truth datasets. The company also says the models can be rapidly customized to reach over 95% accuracy on specialized enterprise use cases.

“Evaluations are essential for safe, reliable AI products, but existing methods have been very costly and slow,” said Alex Klug, Head of Data Science & AI at HP Inc., one of Galileo’s customers. “Luna overcomes those hurdles in a way that’s a real game changer.”

Already integrated into Galileo’s AI governance platforms Protect and Evaluate, Luna is being used by major enterprises to handle millions of GenAI queries monthly while safeguarding against harmful outputs and reducing operating costs.

With generative AI poised for broader enterprise adoption, Galileo’s Luna models could help overcome one of the biggest remaining bottlenecks: evaluating these powerful but still flawed AI systems robustly and affordably.

Mansi Singh
Mansi's interest centers on the use of Gen AI to enhance daily life, and she is dedicated to exploring the latest trends and tools in AI.