The advent of ChatGPT in November 2022 marked a watershed moment, ushering in an era where natural language interaction with computers became tangible, opening a realm of new possibilities. Reflecting on the journey spanning over three decades, from decision support systems to the emergence of Generative AI, we are joined by Anand Raghavan, Data Science and AI Leader.
As we delve into the impact of trends over the past one, five, and ten years on Large Language Models (LLMs), the discourse around Artificial General Intelligence (AGI) in 2024 gains prominence. This introspection not only elucidates the trajectory of LLMs but also underscores the imperative to evaluate emerging technologies through the lens of genuine customer pain points, scalability, and ethical considerations.
AIM Research: How have trends over the past one, five, and ten years shaped the current status of Large Language Models (LLMs) and brought Artificial General Intelligence (AGI) into more realistic discussions in 2024?
“The launch of ChatGPT in November 2022 represents a revolutionary development, making natural language interaction with computers a reality and opening a vast spectrum of new application possibilities.”
Anand Raghavan: This journey extends back 30 to 40 years. Whether termed decision support systems, machine learning, AI, or Generative AI, there has been a consistent evolution in technological capability across several decades. Taken comprehensively, AI can be categorized into at least four distinct areas. First, there are statistical techniques, including anomaly detection and outlier identification, foundational elements that persist over time. Next came traditional machine learning techniques: classifiers, from binary ones such as those used in spam detection to multi-class classifiers. Alongside these, computer vision algorithms advanced, focusing predominantly on image processing, while natural language processing techniques, used for tasks like sentiment analysis or named entity recognition, represent another significant stride. Finally, the advent of neural networks, generative AI, and large language models marks the latest phase in this technological progression.
Historically, the introduction of ImageNet was a pivotal moment, demonstrating the profound capabilities of neural networks in accurately recognizing faces and objects, a feat that became feasible in the early 2010s. This milestone initiated further advancements in language processing, highlighted by the implementation of sequence-to-sequence transformers and enhancements in tools like Google Translate. The seminal “Attention is All You Need” paper further accelerated progress in this domain. However, the launch of ChatGPT in November 2022 signifies a revolutionary development. For the first time, it became possible to interact with computers using natural language, facilitating a direct and intuitive interface. This breakthrough has unveiled a vast spectrum of new application possibilities.
When discussing technological scalability, a frequently mentioned concept is ‘technology readiness’. This term encompasses various dimensions, beginning with governance. The ethical use of data, including proper anonymization and securing consent for its utilization, is paramount. Auditability also plays a critical role, offering insights into whether internal activities comply with governance standards. Infrastructure readiness is another crucial consideration, addressing the technical and economic feasibility of deploying these models.
In the rush to employ large language models (LLMs) for diverse applications, it’s vital to discern whether an LLM is genuinely required to solve specific customer challenges, or if the motivation is merely to chase the latest technological trend. Identifying the appropriate technology for each application, ensuring its responsible use, and engaging in a continuous cycle of feedback and improvement are essential for achieving technology readiness. This process, driven by ongoing learning and adaptation based on customer feedback, is fundamental for refining models and optimizing their effectiveness.
AIM Research: Considering factors such as infrastructure, the composition of technology stacks, and organizational readiness, which criteria would you prioritize to evaluate the suitability of emerging technologies for 2024? Specifically, why has Generative AI emerged as a critical area of discussion within these contexts?
“In many scenarios, a trillion-parameter model might not be necessary; a model with a few billion parameters, like Mistral, could suffice. The key lies in identifying the correct application, fine-tuning as needed, and ensuring the right data sets and controls are in place for effective model training and guardrails against abuse.”
Anand Raghavan: The first consideration is that certain applications require frontier models. You will need GPT-4 or GPT-3.5 because you are aiming to generate responses that presume knowledge of the entire internet and an understanding of countless nuances of language. For instance, if you’re attempting to condense a large body of text into a few words, you will need a large language model. In many scenarios, however, a trillion-parameter model might not be necessary; a model with a few billion parameters, like Mistral, could suffice. Then the question arises: can you use it straight out of the box, or do you need to fine-tune it? Perhaps building a custom model on top of an open-source model is the right approach. There are also situations where much smaller models could be utilized, for example, taking an ELECTRA model, training it for a specific domain-related task, and launching an application targeted at that particular niche.
These represent various strategies for deploying language model-based applications in the market. Alongside these strategies, a few critical components must be developed. First, it involves identifying the correct data sets, ensuring the data is appropriately tagged, and using the right data to train the models. Second, it’s about implementing controls to establish feedback loops based on customer interactions with the prompts and their responses, such as likes or dislikes. This feedback is crucial for training the models effectively. Third, establishing guardrails to prevent abuse of the product is essential. This includes measures to avoid prompt injection attacks, toxicity, and other potential abuse by attackers within your ecosystem.
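The feedback-loop and guardrail components described above can be sketched in a few lines. This is a minimal, illustrative example only; the injection patterns, class names, and review threshold are assumptions, and a production system would use trained classifiers rather than regular expressions:

```python
import re
from collections import defaultdict

# Illustrative patterns for catching obvious prompt-injection attempts;
# real guardrails would rely on far more robust detection.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard your system prompt",
]

def passes_guardrails(prompt: str) -> bool:
    """Reject prompts that match known injection patterns."""
    lowered = prompt.lower()
    return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)

class FeedbackLog:
    """Collect per-prompt likes and dislikes to feed back into training."""
    def __init__(self):
        self.votes = defaultdict(lambda: {"like": 0, "dislike": 0})

    def record(self, prompt_id: str, liked: bool) -> None:
        self.votes[prompt_id]["like" if liked else "dislike"] += 1

    def flagged_for_review(self, threshold: int = 3) -> list:
        # Prompts whose responses drew repeated dislikes become
        # candidates for fine-tuning data or prompt revision.
        return [pid for pid, v in self.votes.items()
                if v["dislike"] >= threshold]
```

The two pieces are deliberately separate: the guardrail runs before any model call, while the feedback log accumulates signals afterwards for the continuous improvement cycle the interview describes.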
AIM Research: Considering various technologies like AR, VR, and blockchain experienced hype but struggled to scale despite functionality, what factors contribute to this? Why might Generative AI succeed where others have not as we approach 2024?
“Real applications are emerging, paving the way for the development of enduring companies and products in the market.”
Anand Raghavan: Whenever we encounter a new technology cycle, hype is inevitable. As we strive to comprehend the capabilities of the technology and assess the market potential of being the first to introduce such innovations, a gold rush mentality often ensues. The examples you cited indeed experienced this phenomenon. However, these examples also demonstrate real-world applications that are currently being pursued. Initially, there’s a surge of enthusiasm, leading to its use in unsuitable areas or exaggerated claims to attract investment and spawn startups. But eventually, the dust settles, revealing genuine, valuable applications.
We’re witnessing a similar pattern with large language models (LLMs). Not every task requires an LLM, yet there are numerous instances where they can significantly simplify our lives. The media has highlighted scenarios akin to finding a needle in a haystack. For instance, individuals have input a plethora of symptoms observed in a pet into ChatGPT, which then identifies likely diseases—a process validated by veterinarians confirming the accuracy of these diagnoses. Additionally, LLMs excel in summarizing extensive texts, enabling rapid comprehension of documents, and alleviating the burden of repetitive tasks. This is evident in the legal field, where paralegals are utilizing LLMs to swiftly generate briefs based on voluminous data. Real applications are emerging, paving the way for the development of enduring companies and products in the market.
AIM Research: How can we identify technologies that are more hype than substance? What characteristics indicate a technology may not sustain long-term success, despite initial excitement and investment?
“It’s essential to start with a genuine customer pain point and ask, ‘Is this a real issue in the market today, and how can I address it?”
Anand Raghavan: One crucial aspect to evaluate is whether the technology in question is akin to a hammer searching for a nail. It’s essential to start with a genuine customer pain point and ask, “Is this a real issue in the market today, and how can I address it?” There might have been times when certain technological solutions were not feasible due to the absence of the necessary technology. However, if today a particular technology seamlessly integrates into solving a customer’s problem, it is considered valuable for addressing real-world challenges. This approach contrasts with inventing scenarios where the technology’s application is uncertain and not directly linked to solving a specific customer pain point.
For example, when planning a vacation and traveling for a week, you encounter a series of tasks regardless of the trip’s nature. You need to find the most affordable or preferable flight, decide on accommodations, and perhaps choose airlines or hotel chains based on personal preferences. Traditionally, a human travel agent excellently manages these tasks. Yet, this is a pain point that can be automated using a large language model within an agent framework, capable of navigating various providers’ options via APIs to offer possible alternatives. This represents a genuine pain point in the market.
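The travel scenario above, an agent framework fanning out across provider APIs and comparing the options, can be reduced to a small sketch. Everything here is hypothetical: the provider functions, fares, and `FlightOption` type are stand-ins for real booking APIs, and the LLM layer that would interpret the traveler’s preferences is omitted:

```python
from dataclasses import dataclass

@dataclass
class FlightOption:
    provider: str
    price: float

# Stand-ins for real provider APIs; both functions and their fares are made up.
def provider_a_search(route: str) -> list:
    return [FlightOption("AirA", 420.0), FlightOption("AirA", 515.0)]

def provider_b_search(route: str) -> list:
    return [FlightOption("AirB", 389.0)]

def find_cheapest(route: str) -> FlightOption:
    """Fan out to each provider and return the lowest fare found."""
    options = []
    for search in (provider_a_search, provider_b_search):
        options.extend(search(route))
    return min(options, key=lambda o: o.price)
```

In a fuller agent framework, a language model would translate “affordable flight, preferred airline” into the query and constraints, while deterministic code like this does the actual comparison.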
Similarly, in the enterprise sector, consider a scenario where an individual is attempting to familiarize themselves with a product or solution. Sifting through extensive documentation to understand how to use the product may not be as effective as posing a question and having generative AI summarize the information, providing a concise response. This approach not only saves time but also enhances the efficiency of using the product. Such instances are real use cases, highlighting that beginning with a customer pain point and ensuring the technology is appropriately applied to address it is a reliable method to determine whether the technology will have lasting value.
AIM Research: Do technologies ever emerge without addressing a pre-identified customer pain point, or is identifying such pain points a universal first step in technology development?
“To develop an authentic application, it’s imperative to establish appropriate guardrails and tailor it to specific customer use cases.”
Anand Raghavan: Today, we’re witnessing a trend that serves as a compelling illustration. In the rush to deploy ChatGPT-powered applications, numerous companies are merely adding a user interface (UI) to OpenAI’s technology and claiming to have developed a bot. The issue arises when these applications are not properly constructed and lack the necessary enterprise guardrails. For instance, if a company launches a chat application by simply integrating OpenAI’s API without implementing any safeguards, it essentially provides unrestricted access to OpenAI. Without guardrails, users can pose any question, not necessarily related to the company’s services, extracting extensive value while the company incurs significant costs.
This year marks a turning point, as there’s a growing realization that merely encasing OpenAI with a UI is insufficient for creating a genuine application. To develop an authentic application, it’s imperative to establish appropriate guardrails and tailor it to specific customer use cases. This approach will ultimately define the success of technology implementations like these.
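As a sketch of the scoping guardrail described above, a relevance check can sit in front of the paid model call so the company is not funding arbitrary queries. The keyword allow-list and all names here are illustrative assumptions; a real deployment would use an embedding-based or model-based relevance classifier:

```python
# Hypothetical allow-list of topics this company's bot is meant to cover.
ALLOWED_TOPICS = {"billing", "shipping", "returns", "account"}

def is_in_scope(question: str) -> bool:
    """Crude relevance check: does the question mention an allowed topic?"""
    words = set(question.lower().split())
    return bool(words & ALLOWED_TOPICS)

def handle_question(question: str, call_model) -> str:
    """Only forward in-scope questions to the (metered, paid) model API."""
    if not is_in_scope(question):
        return "Sorry, I can only help with questions about our services."
    return call_model(question)
```

The point is architectural rather than the specific check: the guardrail runs locally and cheaply, and only questions that survive it reach the expensive external API.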
AIM Research: Doesn’t effective technology development demand a clearly defined problem statement, focusing on understanding the issue’s root cause rather than merely its superficial aspects?
“There will be a crucial moment of reflection to determine whether the integration of AI genuinely adds value and if it’s worth continuing down this path of development.”
Anand Raghavan: In the short term, engaging in “AI washing” is straightforward. Companies can adorn their websites with a plethora of buzzwords, claiming their products are AI-enhanced. In the long term, however, the true measure of value becomes pivotal. To illustrate, consider profit margins: a product minimally reliant on AI likely boasts significantly higher margins than one deeply embedded with AI technology. This difference stems primarily from the cost of computing power and the substantial operational expenses that heavy AI use incurs.
If a company resorts to AI washing without thoroughly evaluating the necessity of AI in its product, it will inevitably affect the company’s financial health. Eventually, there will be a crucial moment of reflection to determine whether the integration of AI genuinely adds value and if it’s worth continuing down this path of development.
AIM Research: In the context of technology adoption, how does the interplay between large enterprises’ desire to innovate, media hype, and vendors’ exaggerated promises contribute to a cycle that overlooks actual problem-solving, potentially hindering technology scalability?
“Currently, there’s a surge in startups within each of these categories, but eventually, it will consolidate around a few that offer unique value, continuing to build sustainable businesses.”
Anand Raghavan: It’s a combination of various factors. When we examine use cases within an enterprise, the value of enhancing a developer’s productivity through GitHub Copilot is obvious. Such use cases are inevitable. Similarly, when we think about automating QA, MLOps, or DevOps, we identify internal applications where AI, especially Generative AI, can be beneficial.
Companies developing these products undoubtedly add lasting value. Furthermore, there’s the value I aim to deliver to my end customers as a Fortune 500, a Global 2000 firm, or a company of any size. The question arises on how I utilize Generative AI or AI to achieve this. This requires deeper thought, which has been the focus of most of our discussion. Are we employing this technology in scenarios where it genuinely enhances customer value? Is the use of Generative AI necessary to deliver this value? Looking at the vendors and the ecosystem creating these tools, we see a burgeoning ecosystem focusing on security, relevance, automation of the toolchain, and model evaluation. All these aspects are essential.
Currently, there’s a surge in startups within each of these categories, but eventually, it will consolidate around a few that offer unique value, continuing to build sustainable businesses. This pattern resembles the typical technology curve where initial investment leads to the emergence of successful companies.
AIM Research: For enterprise applications, how do you envision the deployment and impact of Generative AI in the next 5 to 10 years? Could you specify potential successful use cases and areas where it might face challenges?
“Generative AI serves as a tool for enhancing productivity, allowing individuals to accomplish more in the same amount of time within an enterprise environment.”
Anand Raghavan: Generative AI can offer substantial benefits in situations where customers use products that are complex or require navigating through extensive features to accomplish tasks. Additionally, it proves useful in workflows that involve numerous steps, which can be time-consuming for humans to execute. Take, for instance, a tier one analyst in a Security Operations Center (SOC), who lacks the experience of a tier three analyst. The U.S. faces a significant shortage of tier three analysts. The question then becomes how to empower a tier one analyst to exceed their current capabilities. In such scenarios, Generative AI tools can be incredibly valuable by synthesizing information from various data sources, offering recommendations for next steps, identifying patterns in the data, and highlighting areas of focus. Moreover, they can streamline the process by reducing the number of alerts analysts see, focusing only on those that are most relevant.
An example of this in action is email security. Many employees use the “report phishing” button when they encounter a suspicious email. In many organizations, these reports are sent to analysts who must then determine whether the emails are genuinely malicious. I’ve spoken with analysts who manually sift through 2000 emails a day, only to find that 80% are not phishing attempts. Often, these are cases where an employee is simply dissatisfied with a marketing email. The challenge lies in automating this process to avoid manually reviewing 2000 emails a day. If we could reduce this number to around 10, analysts would have more time to dedicate to upskilling, learning new things, and being more productive. Hence, Generative AI serves as a tool for enhancing productivity, allowing individuals to accomplish more in the same amount of time within an enterprise environment.
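The triage idea above, cutting 2,000 reported emails down to the handful worth an analyst’s attention, can be sketched as a scoring filter. The suspicion signals and threshold here are illustrative assumptions; a real system would use a trained classifier or an LLM rather than substring matching:

```python
# Illustrative suspicion signals; real triage would use a trained model.
SUSPICIOUS_SIGNALS = ("urgent", "verify your account", "password", "wire transfer")

def suspicion_score(email_body: str) -> int:
    """Count how many known phishing signals appear in the email."""
    body = email_body.lower()
    return sum(signal in body for signal in SUSPICIOUS_SIGNALS)

def triage(reported_emails: list, min_score: int = 2) -> list:
    """Return only the reported emails that look genuinely suspicious."""
    return [e for e in reported_emails if suspicion_score(e) >= min_score]
```

Even this crude filter shows the shape of the win: the reports that are merely unwanted marketing score zero and never reach the analyst’s queue.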
AIM Research: As someone with extensive experience in the data and AI industries, what are your top recommendations for enterprises to align their business strategies with upcoming technologies, not limited to Generative AI? How should they approach talent management, governance, and strategic decision-making to prepare for these technological advancements?
“Their deep understanding of language becomes a valuable asset in collaborating with AI experts to develop products, showcasing how diverse skills can play a critical role in the AI-driven transformation of industries.”
Anand Raghavan: Indeed, the advent of AI and Generative AI is ushering in a variety of new job roles and subcategories. For instance, legal teams will require individuals with a deep understanding of AI to develop robust governance structures for responsible AI use. Additionally, there’s a need for products capable of managing the data governance and security for the input data of models. A new category known as MLOps or LLMOps, encompassing DevOps and SRE roles, is emerging. These professionals must grasp how to automate the infrastructure supporting these technologies.
Furthermore, there’s an evolving field for AI engineers, who occupy a unique niche. They are not purely backend engineers nor are they machine learning engineers with a deep understanding of ML; instead, they possess the skills to turn ML models into tangible products, leveraging their expertise in backend or platform engineering.
Data analysts are also adapting, learning to craft effective prompts and engage in prompt engineering to fine-tune models for enhanced performance. Moreover, individuals without a technological background, including those with PhDs in English or linguistics, are finding opportunities to contribute. Their deep understanding of language becomes a valuable asset in collaborating with AI experts to develop products, showcasing how diverse skills can play a critical role in the AI-driven transformation of industries.