Exploring the transformative role of artificial intelligence in enhancing software development workflows, our focus shifts to how Generative AI technologies are making significant strides in automating routine tasks and augmenting developer capabilities. This realm is witnessing a paradigm shift, where AI-driven tools are not merely assistants but powerful collaborators that elevate efficiency and innovation within the coding process.
In this week’s episode, we have the pleasure of welcoming Nandan Thor, a Head of AI at Palo Alto Networks. With his expertise, Nandan sheds light on the multifaceted challenges developers encounter daily and how generative AI emerges as a game-changing solution. From automating the trivial to redefining productivity metrics and balancing the integration of open-source and proprietary AI tools, our discussion with Nandan promises to offer a comprehensive insight into the current and future state of developer efficiency. As we delve into these critical aspects, readers will gain a deeper understanding of the evolving landscape of software development in the age of artificial intelligence.
AIM: Before we delve into developer productivity, could you clarify why this topic is of interest to you? What drives your passion for discussing it?
“Developer productivity, to me, means using AI to automate trivial tasks and massively increase efficiency.”
Nandan Thor: Developer productivity is one of the areas in which AI has the potential to fully reshape the landscape and how we think about a software developer’s work day to day. So this is for me one of the key areas that I’m really focusing on day to day.
To me, developer productivity is short for AI-driven software developer productivity. Things change so rapidly in the field of AI, so the way I’m considering it right now is this: how do we use existing AI tools as AI assistants, almost in the sense of junior developers, enabling existing software developers to really home in on what matters, to automate some of the more trivial tasks, and to really see these massive increases in efficiency?
AIM: What are some of the day-to-day challenges that coders face today, and how can these challenges be effectively addressed?
“AI can actually help with almost each of these steps, making the process much more streamlined, standardized, and simple for developers.”
Nandan Thor: The way to answer that question is to really take a step back and look at the current software developer project lifecycle. Then, we can examine each task that happens and explore how AI can assist in each of those tasks.
In my mind, the developer project lifecycle comprises a few distinct components. The first is actually getting the task from the manager and understanding it. After comprehending the task, we move on to the planning stage, which often involves creating a ticket to track the work and ensure everything flows in the correct direction.
Next, we consider prototyping, which involves creating an initial solution, followed by debugging, testing, and finally, deploying or committing that bit of code. If we break each of these steps down, AI has a significant role in all of them. Starting at the beginning, AI has the potential to act as a sounding board to ensure accurate understanding of what’s being asked. One could use it as a tool by typing in, “Hey, my manager said I need to do this,” and then have that conversation.
The next bit, around planning, involves creating tickets to track your work. Again, AI is incredibly helpful here: ask it to take the project, break it down into smaller, meaningful chunks, and create draft tickets for them. Then, there’s the area everyone thinks about: prototyping and coming up with the initial draft of the code. A model is exceptionally good at this, making the manual version of this part almost obsolete. The research that goes into creating the first prototype can also be done with AI.
But then comes the really interesting step: developing test cases and finding bugs. At this point, you would take the code that’s been created, upload it to the model, and ask it to identify potential bugs. It’s also essential to come up with test cases to actually test the code.
The final step is to deploy or commit it, which, at this point, AI cannot assist with. But, taking a step back, we see that AI can actually help with almost each of these steps, making the process much more streamlined, standardized, and simple for developers.
AIM: How can today’s tools, including those leveraging technologies like generative AI, be used to reconcile these differences and enhance overall developer productivity?
“This requires a broader level of thinking, elevating all developers to perform more senior-level functions with AI handling tasks typically assigned to junior developers.”
Nandan Thor: I’m going to break your question down into two aspects. The first aspect is about metrics: How do we think about metrics in this new AI-driven world? The second aspect is about hands-on implementation, particularly from an onboarding standpoint with new employees.
Starting with the first aspect, in terms of metrics, I find metrics to be extremely challenging for any organization. It’s hard to think of any organization that has really nailed it in terms of focusing on the metrics that accurately determine developer productivity. Common metrics include lines of code, number of bugs, and time to create, among others. However, it’s worth stepping back to assess which of these metrics remain relevant in the new world of AI-driven developer productivity. This involves considering what tasks AI can excel at right now. For instance, if asked to create a piece of code that counts all the commas in a given bit of text, I would simply use a Generative AI tool, type in the request, and get a response. In this scenario, lines of code become an irrelevant metric. Instead, the focus shifts to the quality of the code. Generative AI can assist with this as well, such as by identifying bugs, finding test cases, or even transitioning code from a legacy language to a more modern one with ease.
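The comma-counting request mentioned above is small enough to show in full. The following is a minimal sketch in Python; the function name `count_commas` and the sample string are illustrative choices, not something taken from the interview or from any particular tool’s output.

```python
def count_commas(text: str) -> int:
    """Return the number of comma characters in the given text."""
    # str.count counts non-overlapping occurrences of the substring.
    return text.count(",")


# Illustrative sample input (hypothetical, chosen only for the demo).
sample = "lines of code, number of bugs, and time to create, among others"
comma_total = count_commas(sample)
print(f"Found {comma_total} commas")
```

The whole solution is one line of logic, which illustrates the point being made: a line count says very little about the value or quality of what was delivered.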
Consequently, we move away from superficial metrics like lines of code or number of bugs and shift towards qualitative measures. These include how well the code integrates into the overall architecture, its impact on system safety and robustness, among other factors.
This requires a broader level of thinking, elevating all developers to perform more senior-level functions with AI handling tasks typically assigned to junior developers. Therefore, the metrics need to evolve to reflect these changes, focusing on the quality of the code, its integration, and its overall contribution to the project. This signifies a massive shift in how enterprises are reevaluating metrics, moving from quantity to quality.
AIM: How does the implementation of AI or Generative AI to enhance developer productivity balance the initial investment of time and resources against the potential savings for a company or organization, especially when strategic evaluation and customization for various use cases are required?
“Leveraging generative AI technologies significantly boosts efficiency, reducing tasks that once took a week to potentially just a day.”
Nandan Thor: I believe we are in agreement that leveraging generative AI technologies significantly boosts efficiency, reducing tasks that once took a week to potentially just a day. The question then becomes how to best utilize the newfound time. While focusing on metrics is one valuable area, there are numerous other beneficial activities to consider. For instance, dedicating more time to the integration of new code into the entire codebase or to enhancing the architecture of systems represents a higher-level, senior staff software engineer approach, rather than tasks typically associated with junior engineers. This shift not only allows for an increase in productivity but also enables the existing workforce to upskill. By investing additional time in these areas, employees can focus on more strategic, high-value tasks, thereby elevating the overall quality and robustness of projects.
AIM: What is your perspective on the balance between using open source and enterprise AI coding assistants or technologies, considering factors such as cost and security? How can organizations find the right equilibrium between these options? I’m interested in your thoughts on this debate.
“When we engage with large language models, it’s more pragmatic to view them as a Software as a Service (SaaS) product. Each company will ultimately make its own choice, balancing privacy, security, and performance factors according to its priorities.”
Nandan Thor: That’s an excellent question, and indeed, every enterprise adopting generative AI is contemplating this very issue. Let’s delve into a somewhat philosophical perspective, which may not have been anticipated. During my journey of learning data science and AI, everything was accessible for free, including the models. Utilizing a random forest model, for example, incurred no charges, which seemed fitting in the older era of AI and data science. However, as we transition to the present, large language models, despite being labeled as models, increasingly resemble sophisticated software. This distinction arises from the substantial investment required by enterprises to develop these foundation or frontier models, which can reach tens or even hundreds of millions of dollars. This investment covers not only training but also the necessary talent, and notably, the hosting post-instruction tuning.
Consequently, when we engage with large language models, it’s more pragmatic to view them as a Software as a Service (SaaS) product. This perspective helps rationalize the associated costs, which include not just the model itself but also the hosting services provided by the developing companies. Despite this, it’s important to acknowledge the significant contributions and performance of open-source models in the AI field. Although a performance gap exists between hosted enterprise models and open-source alternatives, this disparity is gradually narrowing.
The conversation also extends to security and privacy concerns. Some enterprises have inadvertently contributed proprietary data to train large language models, risking exposure of sensitive information. This scenario underscores the importance of considering open-source models for critical applications, especially where maintaining on-premises (physically or virtually) data control is crucial for security and confidentiality.
From a cost perspective, the assumption that open-source models are significantly cheaper does not always hold true. Hosting a model with billions of parameters on one’s own infrastructure can be costly. Therefore, the decision between open-source and enterprise-hosted models hinges on the specific needs for privacy, security, and performance.
Each company will ultimately make its own choice, balancing these factors according to its priorities. Nevertheless, the advancements emanating from both the open-source community and enterprise endeavors are exciting and promising for the future of AI.
AIM: When considering the decision-making process for adopting open source or enterprise models to enhance developer productivity, which stakeholders would you involve, and what roles would they play? This can be discussed in the context of a general work guide or specific use cases.
“It’s essential to first identify the current priorities and strategies at the enterprise level and then explore how Gen AI can be integrated to support these goals.”
Nandan Thor: That’s indeed an insightful observation. What we’re witnessing across the industry is the eagerness of enterprises to harness the incredible potential of generative AI (Gen AI) by developing use cases derived from its capabilities. While this approach can yield results, optimizing for the highest return on investment (ROI) actually requires a reversed methodology. It’s essential to first identify the current priorities and strategies at the enterprise level and then explore how Gen AI can be integrated to support these goals.
This strategic alignment should not commence at the AI engineer level but rather at the business level, where the focus is on achieving specific goals, milestones, and strategic objectives. Business leaders should then engage with their teams to determine how Gen AI can contribute to these aims, necessitating a collaborative effort across a broad spectrum of stakeholders, including business units, IT departments (especially if hosting an open-source model is considered), and finance departments to evaluate the costs associated with model hosting and usage.
Interestingly, launching an open-source model often involves a more fixed cost structure, attributed to the expense of resources, whereas utilizing a hosted enterprise model incurs variable costs based on usage frequency. This distinction further emphasizes the need for comprehensive planning and collaboration across the enterprise.
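The fixed-versus-variable distinction can be made concrete with a small break-even calculation. The sketch below is only an illustration: every number in it (the monthly infrastructure cost, tokens per request, and price per 1,000 tokens) is a hypothetical placeholder, not a real vendor price, and the function names are invented for this example.

```python
def hosted_monthly_cost(requests: int, tokens_per_request: int,
                        price_per_1k_tokens: float) -> float:
    """Variable cost of a hosted model: grows with usage."""
    return requests * (tokens_per_request / 1000) * price_per_1k_tokens


def break_even_requests(fixed_monthly_cost: float, tokens_per_request: int,
                        price_per_1k_tokens: float) -> float:
    """Monthly request volume at which hosted cost equals the fixed cost."""
    cost_per_request = (tokens_per_request / 1000) * price_per_1k_tokens
    return fixed_monthly_cost / cost_per_request


# Hypothetical placeholder figures, for illustration only.
FIXED_MONTHLY_COST = 10_000.0   # self-hosted infrastructure, flat per month
TOKENS_PER_REQUEST = 2_000
PRICE_PER_1K_TOKENS = 0.01      # hosted API price per 1,000 tokens

threshold = break_even_requests(
    FIXED_MONTHLY_COST, TOKENS_PER_REQUEST, PRICE_PER_1K_TOKENS
)
print(f"Break-even at roughly {threshold:,.0f} requests per month")

for requests in (100_000, 1_000_000):
    hosted = hosted_monthly_cost(requests, TOKENS_PER_REQUEST, PRICE_PER_1K_TOKENS)
    cheaper = "hosted API" if hosted < FIXED_MONTHLY_COST else "self-hosted"
    print(f"{requests:>9,} requests: hosted costs {hosted:,.2f}, cheaper option: {cheaper}")
```

Under these assumed figures, low usage favors the pay-per-use hosted model and high usage favors the fixed-cost self-hosted one. This is the kind of estimate a finance team would need real numbers for.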
Gen AI’s accessibility marks a significant departure from traditional AI and data science approaches, which typically required stakeholders to trust the predictions without tangible interaction. Now, Gen AI allows individuals to directly engage with AI tools online, broadening the scope of AI’s appeal and understanding. This evolution has expanded the community of AI enthusiasts and practitioners, including those who might not have previously considered exploring AI. This democratization of AI technology not only facilitates innovation but also fosters a more inclusive environment for AI exploration and application across various sectors.
AIM: How do you ensure that your team members are effectively involved in the dialogue to enable their contribution to solving problems, considering factors such as security and cost, while also prioritizing their needs as clients?
“Keeping coders at the heart of the conversation is essential, not just for developer productivity, but for all use cases. They are both the builders and the users, uniquely positioned to understand both how to build it and how to use it effectively.”
Nandan Thor: I think about it in two ways. One is where the software developers or coders are the implementers: they’re the ones who are actually sitting there writing the code to make these Gen AI calls and to build up the systems. The other bit, relevant to the point at hand, is where they’re also the target audience. They’re building these systems for themselves, which is super interesting because that doesn’t happen often. It’s really important to listen to them. Gen AI developer productivity has so many different use cases around it: of course, there’s the coding aspect, the debugging, the translation of code from one language to another. Where should it sit? Should it be a desktop thing? Should it be in your IDE?
You have all these various considerations. And it’s really interesting to talk to coders about this because they think about it in both senses. One is how do I build it? And two is how do I use it? So that’s why it’s often becoming one of the most successful use cases for enterprises, because you have the person building it also being the one that uses it, so they can build it exactly how they want.
It’s important to keep the coders really at the heart of this, of course for developer productivity, but also for the other use cases, because they’re also there to tell you what’s technically feasible and what they think about it. The conversation can often switch to: how involved is a coder going to be with these technologies? My point of view is that they will be even more involved, especially in terms of developer productivity.
AIM: How crucial is it to ensure technology fits specific use cases, especially in coding applications developed by your team? Where does prompt engineering come into play in this process, particularly in organizations like Palo Alto Networks transitioning to open source or enterprise solutions? And how does prompt engineering contribute to maximizing developer productivity in such contexts?
“Prompt engineering addresses the challenge of receiving undesired outputs from an LLM by strategically modifying the input prompt to steer the model towards generating the preferred response.”
Nandan Thor: Prompt engineering indeed caught the spotlight, particularly when job postings for prompt engineers with salaries around $600,000 per year emerged, drawing widespread attention. This development highlights the critical role of prompt engineering in optimizing interactions with Large Language Models (LLMs). Essentially, prompt engineering addresses the challenge of receiving undesired outputs from an LLM by strategically modifying the input prompt to steer the model towards generating the preferred response.
There are two main strategies for achieving this alignment: fine-tuning and prompt engineering. Fine-tuning involves adjusting the model’s parameters directly with specific data, which, despite being effective, is notably costly and time-consuming due to the model’s complexity. On the other hand, prompt engineering maintains the original model parameters but alters the input prompt to guide the model’s output in the desired direction. Although fine-tuning presents a considerable initial investment in model customization and hosting, it becomes financially sensible when the alternative involves extensive prompt engineering to achieve the desired outcome.
Prompt engineering demands careful consideration, especially when incorporating few-shot learning, where a small number of input and output examples are provided to the model to improve its responses. Advanced techniques in prompt engineering, such as chain of thought, tree of thought, and graph of thought, further enhance the model’s reasoning by breaking down complex tasks into manageable segments, significantly improving accuracy.
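Few-shot prompting, as described above, comes down to placing a handful of input and output examples ahead of the real query. The sketch below assembles such a prompt as plain text. The `Input:`/`Output:` layout, the function name `build_few_shot_prompt`, and the ticket-writing examples are all assumptions made for illustration; real tools and models may expect a different format.

```python
from typing import List, Tuple


def build_few_shot_prompt(instruction: str,
                          examples: List[Tuple[str, str]],
                          query: str) -> str:
    """Assemble an instruction, worked examples, and the new query into one prompt."""
    parts = [instruction.strip(), ""]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}")
        parts.append(f"Output: {example_output}")
        parts.append("")  # blank line between examples
    # The final block leaves the output empty for the model to complete.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical examples: turning a task description into a draft ticket title.
examples = [
    ("Users cannot reset their password from the mobile app",
     "Fix password reset flow on mobile"),
    ("The nightly report job takes six hours to finish",
     "Investigate slow nightly report job"),
]
prompt = build_few_shot_prompt(
    "Write a short ticket title for each task description.",
    examples,
    "Login page shows a blank screen on older browsers",
)
print(prompt)
```

The model parameters are untouched here; only the input text changes, which is the contrast with fine-tuning drawn earlier.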
Moreover, studies have shown that increasing the number of interactions with an LLM improves outcomes. For example, asking an LLM to rank multiple solutions and select the best one yields more accurate results than requesting a direct answer. While this approach may be slower and more expensive, it is invaluable for tasks requiring utmost precision.
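One way to picture the “generate several, then pick the best” pattern is the sketch below. It is not tied to any real LLM API: `generate` and `score` are hypothetical callables supplied by the caller, and the stand-in versions used in the demo simply return canned strings and a toy score so that the example runs on its own. In practice, `score` might be a second model call that ranks the candidates.

```python
from typing import Callable


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 3) -> str:
    """Generate n candidate answers and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    # Each candidate is one extra model call, hence slower and more expensive.
    return max(candidates, key=lambda candidate: score(prompt, candidate))


# Stand-ins for real model calls (assumptions for this demo only).
_canned_answers = iter(["text.count(';')", "text.count(',')", "len(text)"])


def fake_generate(prompt: str) -> str:
    """Pretend model: returns the next canned answer."""
    return next(_canned_answers)


def fake_score(prompt: str, candidate: str) -> float:
    """Pretend ranker: favors the answer that counts commas."""
    return 1.0 if "count(',')" in candidate else 0.0


best = best_of_n("Write code that counts commas in text.",
                 fake_generate, fake_score, n=3)
print(best)
```

The cost trade-off mentioned above is visible in the structure: `n` generation calls plus the scoring calls replace a single direct request.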
In practice, prompt engineering is a collaborative effort involving both the AI team and business stakeholders. This partnership ensures that the outputs not only meet technical specifications but also align with business objectives. Business stakeholders contribute by verifying outputs and suggesting scenarios for few-shot learning, enhancing the model’s relevance to specific business contexts. This synergy between technical expertise and business insight is essential for maximizing the benefits of prompt engineering in addressing complex challenges.
AIM: Reflecting on your mention of AI’s potential in automating simple tasks, with the ultimate goal for generative AI and large language models moving towards more complex, intelligent functions, do you think these models will advance to also tackle intelligent tasks, beyond the mundane? What are your thoughts on the implications of such an evolution?
“We’re at the beginning of not just a technological revolution, but also a societal, economic, and even cultural revolution.”
Nandan Thor: I’ll start with one of my favorite quotes, which I believe is from George Box, the statistician: “All predictions or all models are wrong, but some are useful.” So, I’ll provide some predictions, and hopefully they’re useful. We really want to consider AI from the standpoint of its potential impact on society. From an economic point of view, there are a lot of estimates suggesting that the value added to the economy purely through AI will be about the size of the American economy. Taking a step back, that’s unbelievable: essentially cloning the biggest economy and saying, “AI is going to have a big role.” So, we’re at the beginning of not just a technological revolution, but also a societal, economic, and even cultural revolution.
I’ll briefly discuss three areas that we need to make advancements in to reach that AGI potential, and then we’ll wrap everything up. The first area, and I see a lot of really amazing work coming out of it, is multimodal models. The way I explain this to people is that if we want to teach a model what a dog is, at this point, we’ve given it every snippet of text on the Internet about dogs. Now, think about how much richer an understanding the LLM (large language model) is going to have of dogs when you also input things like every dog video out there, every image of a dog, every snippet of dog audio, whatever exists. So, you get this way more representative understanding. We’re just scratching the surface with multimodal models, but we’re going to see some really great advancements out there, and we already have. I see that as being a rocket ship towards AGI.
The next bit is really about how we’re scaling input sequence lengths. At this point, some vendors have come out with massive sequence lengths. In order for this to be really tractable, you need to not have a limit there, so that your input sequence length is essentially infinite. We have seen some really cool techniques here, such as linear biases in attention and attention sinks. Removing that obstacle, so that you don’t need to worry about input sequence length limits, also allows us to take a big step towards AGI. This becomes super relevant when going back to the last point: text takes up very little space, whereas videos take up massive amounts of space. So, one and two go hand in hand.
Now, the final bit for me is really around explainability, and stepping around the notion of these models being non-deterministic. For the audience, non-deterministic essentially means that when you input the same data multiple times, you may get different responses. Again, that’s great when you are looking at creative use cases. But for enterprises, it would be super valuable to have some switch that flips on and off in terms of how deterministic something is. We do have parameters for this, but you need it at the actual model architecture level. Just quickly going back to the concept of explainability: right now, it’s a very active research field not just for large language models but for all neural networks and deep learning models out there. Some great enterprises are researching this, really being able to trace an output that was produced and say, “Here is why,” at the model level. I think that’ll increase adoption, decrease hallucinations, and really allow widespread use. Those are the three areas that we’re going to see some massive progress in.
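To make the idea of non-determinism and of “parameters” concrete, the sketch below shows one commonly used sampling parameter, usually called temperature. This is an assumption about what kind of parameter is meant; the interview does not name one. The token scores are invented for the demo, and a real model would produce them itself.

```python
import math
import random
from typing import Dict


def sample_token(logits: Dict[str, float], temperature: float,
                 rng: random.Random) -> str:
    """Pick the next token from raw scores, with temperature-controlled randomness."""
    if temperature <= 0:
        # Greedy choice: always the highest-scoring token, so repeatable.
        return max(logits, key=lambda token: logits[token])
    # Softmax weights with temperature; subtract the peak for numerical stability.
    peak = max(logits.values())
    weights = [math.exp((value - peak) / temperature) for value in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]


# Hypothetical scores for three candidate next words.
logits = {"secure": 2.0, "robust": 1.5, "fast": 0.5}
rng = random.Random(0)

greedy = {sample_token(logits, 0.0, rng) for _ in range(50)}
sampled = {sample_token(logits, 1.0, rng) for _ in range(200)}
print("temperature 0 outputs:", greedy)
print("temperature 1 outputs:", sampled)
```

With the temperature at zero, the same input always yields the same token in this sketch. With a higher temperature, repeated calls can differ. This is control at the sampling level only, which is distinct from the architecture-level switch described above.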
All of this is just my opinion. I’m still super interested in how regular people are applying and using AI. That’s where we’re also going to see a huge amount of super interesting things, as we see how even people who aren’t in the field of AI are interacting with these tools and models. That’s also going to be a great hint as to where the field is going.