Fastino Raises $17.5M To Launch TLMs That Do One Job Really Well

Fastino's models, developed by in-house AI researchers from Google DeepMind, Stanford, Carnegie Mellon, and Apple Intelligence, deliver performance up to 99 times faster than standard large language models (LLMs), making them both cost-effective and scalable.

“AI is most valuable when it’s specialised, fast, and deployed exactly where it’s needed.” That is the idea behind Fastino.

Fastino’s journey began with a realisation that general-purpose models, while versatile, often come with high costs and inefficiencies. Drawing on experiences from their previous ventures, co-founders Ash Lewis and George Hurn-Maloney recognised the need for models tailored to particular tasks. This insight led to the development of TLMs, or task-specific language models, which are optimised for tasks like summarisation, function calling, text-to-JSON conversion, PII redaction, text classification, and profanity censoring.

Explaining the mission behind the startup, Lewis stated, “We started this company after our last startup went viral and our infrastructure costs went through the roof. At one point, we were spending more on language models than on our entire team.”

Smaller Models Are Winning

What sets Fastino apart is its approach to model training. Unlike traditional methods that rely on expensive hardware, Fastino’s models were trained on Nvidia’s low-end gaming GPUs for less than $100,000 in total. Despite the modest training budget, the company says the resulting models are both cost-effective and scalable.

“AI developers don’t need an LLM trained on trillions of irrelevant data points—they need the right model for their task,” said George Hurn-Maloney, COO and co-founder of Fastino. “That’s why we’re making highly accurate, lightweight models with the first-ever flat monthly pricing—and a free tier so devs can integrate the right model into their workflow without compromise.”

Fastino’s approach has garnered significant attention from investors. The company recently secured a $17.5 million seed round led by Khosla Ventures, bringing its total funding to nearly $25 million. It had previously raised $7 million in a pre-seed round led by Microsoft’s VC arm M12 and Insight Partners, with notable backing from industry leaders like Scott Johnson and Lukas Biewald. This influx of capital will enable Fastino to expand its research team and further refine its TLMs.

Jon Chu, Partner at Khosla Ventures, said, “Fastino’s tech allows enterprises to create a model with better-than-frontier model performance for just the set of tasks you care about and package it into a small, lightweight model that’s portable enough to run on CPUs, all while being orders of magnitude faster with latency guarantees. These tradeoffs open up new use cases for generative models that historically haven’t been practical before.”

Runs on Everyday Hardware

Understanding the needs of developers, Fastino offers a free tier allowing up to 10,000 requests per month, making it accessible for experimentation and integration. For enterprises, the models can be deployed within private environments, ensuring data privacy and compliance. The company’s pricing model is transparent, moving away from unpredictable per-token charges to a flat monthly subscription, catering to both startups and large enterprises.

Fastino’s TLMs are already making an impact across various industries. From document parsing in finance and healthcare to real-time search query intelligence in e-commerce, Fortune 500 companies are leveraging these models to streamline operations and enhance productivity.

The team claims that testing on Fastino’s initial model family reveals that TLMs are not just smaller and faster; they’re also smarter where it counts.

The company has developed internal benchmarks based on real-world customer and partner use cases to showcase its TLMs’ performance. These benchmarks assess accuracy and latency for PII redaction, information extraction, and classification. Fastino’s models were evaluated zero-shot, mirroring expected customer usage without prompting or fine-tuning.

As AI continues to advance, Fastino’s focus on task-specific models positions it as a leader in the field. By prioritising efficiency, cost-effectiveness, and developer needs, Fastino is paving the way for a future where AI is both powerful and accessible.


Upasana Banerjee
Upasana is a Content Strategist with AIM Research. Prior to her role at AIM, she worked as a journalist and social media editor, and holds a strong interest in global politics and international relations. Reach out to her at: upasana.banerjee@analyticsindiamag.com