In recent years, the integration of artificial intelligence (AI) into various aspects of business operations has ushered in transformative changes. One domain where the potential of AI is gaining significant attention is coding assistance within companies. AI coding assistants, powered by advanced machine learning algorithms, have emerged as game-changers, promising to enhance developer productivity, streamline software development, and facilitate innovation. However, as with any technological advancement, there are both promises and perils associated with the adoption of AI coding assistants.
To explore this further, we held a roundtable discussion on the topic "The Potential of AI Coding Assistants in Companies – Promises and Perils". The session was moderated by Ravindra Patil, Sr. Director of Data Science at Tredence, with panelists Dhruv Rastogi, Vice President & Head of Data Science at IKS Health; Issac Mathew, Data Science Leader; Biswanath Banik, Director of Data at FinAccel; Narasimha Medeme, VP Head Data Science at MakeMyTrip; and Uday Nedunuri, Head of CoE for Digital & Analytics at CEAT.
AI Coding Assistants for Enhanced Productivity
This is definitely a winner, and we are at that phase. We’re actually in the exploration and experimentation phase right now, and some of the early results indicate that it definitely has potential. We use different languages in the technology group, different platforms, different IDEs, so we’ll have to wait and figure out the relative impact across the different technologies. But one thing is for sure: until now, we as data scientists have been trying to increase the productivity of others. Now it’s time we increase our own productivity, and this is an excellent way to start.
– Dhruv Rastogi, Vice President & Head of Data Science at IKS Health
A lot of new use cases have emerged that were previously not even thought of. Even the business side has started actively exploring it, and I think there’s going to be a lot of disintermediation. From what I’ve seen so far in the experimental stages, business stakeholders never had the patience to wait for results; they always wanted the results yesterday. Now nothing is stopping an SME or a business person, along with a solution architect, from using an AutoGPT-style tool to generate code while they’re having a discussion. So you no longer really need a translator. It is not going to replace everything overnight, but it’s a great start. It’s good to see productivity go through the roof, and how people are using these tools. That’s the positive side of it. On the negative side, there’s plenty. The one that concerned me most immediately is how it is disrupting the hiring pipeline, even without Gen AI in its current state. But things are going to change.
– Issac Mathew, Data Science Leader
Accelerating Innovation with ChatGPT
One thing that is quite remarkable for me as a data leader is that whenever a new problem or a new topic comes up and the team is brainstorming, ChatGPT gives us the initial framework. Although it’s just a boilerplate kind of structure, it can still serve as the baseline and a very good starting point. So before everyone gets into the first brainstorming session, they’re already a bit more researched and a bit more accelerated in their understanding of the topic and best practices. The initial brainstorming is therefore more productive. This has an immediate impact upstream, where all the solution design actually takes place; it’s helping us leapfrog the solution design. To me this is a great upstream productivity gain!
– Biswanath Banik, Director of Data at FinAccel
The Evolution of AI-Assisted Coding and its Impact on Productivity
If you look just at AI-assisted coding, not the entire range of generative AI applications: previously it was more of an individualistic drive to sign up and use these tools, but ChatGPT has made them so accessible, so close to the individual, that everybody has seen the productivity improvement, both bottom-up and top-down. So the need for proofs of concept has reduced, and everybody has the opportunity to be more productive in that area. What is challenging is measuring how much developer efficiency has actually improved; that’s very subjective. The problem is not just about these tools and the models. It’s also about the right tooling and the coding environment: the tool or plugin has to be available right where individuals are coding, and it has to be seamless enough that it’s easy for them to use. Overall it is positive, and it’s going to accelerate at an even faster pace.
– Narasimha Medeme, VP Head Data Science at MakeMyTrip
Enhancing Productivity and Efficiency in Development
We tested use cases where we supplied the code and we actually got the right documentation, but that’s just one use case. Secondly, there are many models being built and operationalized, and we’re often looking to write code or functions as quickly as possible, so we’re currently exploring coding assistants to help with code writing. The second important aspect is diagnostics, specifically troubleshooting, and this is where we’ve faced challenges. Today, most enterprise-grade tools are SaaS-based, requiring us to input sample code as part of our prompt. These tools have their own language models trained on various open-source languages, which then provide targeted code as output. Here’s the challenge for the use case: I have a sample of code that could be submitted to a coding assistant, and it could give us answers like identifying an undefined variable causing an error in a specific function. However, scaling this use case becomes a roadblock. To scale it, we need to grant the tool access to our code bases, train it, and get relevant code out, and this is where I hit a roadblock. So we’re going back to coding assistants that combine multiple solutions. There’s great promise in this, but we need to set clear rules for organizations to use it effectively.
– Uday Nedunuri, Head of CoE for Digital & Analytics at CEAT
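The diagnostics use case above can be made concrete. Here is a minimal, hypothetical example of the kind of bug a coding assistant can flag from a pasted snippet: a function that references a variable that was never defined, which raises a `NameError` at runtime, followed by the fix an assistant would typically suggest.

```python
# Hypothetical sample of the kind of snippet one might paste into a coding
# assistant for troubleshooting: `discount_rate` is referenced but never
# defined, so calling the function raises a NameError.
def apply_discount_buggy(price):
    return price - price * discount_rate

# The fix an assistant would typically suggest: make the dependency explicit
# by passing it in as a parameter with a sensible default.
def apply_discount(price, discount_rate=0.1):
    return price - price * discount_rate

try:
    apply_discount_buggy(100)
except NameError as e:
    # This is the diagnosis a coding assistant can surface automatically.
    print(f"diagnosis: {e}")

print(apply_discount(100, discount_rate=0.2))  # 80.0
```

Spotting this in a single pasted function is easy; the scaling problem Uday describes is doing it across an entire private code base without handing that code base to the tool.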
Overcoming Challenges in Adapting New Development and Data Science Tools
Our team experimented with some automated code generation tools eight or nine months ago. In our experience, the LLMs available at that time were not very good at generating custom code that was useful enough. The auto-generated code was still very generic, and we needed to do a lot of manual tweaking to make it work! But one area where it still proved useful was code documentation and code commenting.
GPT-4, however, is much better at code generation, though we have to write a very detailed prompt to get back well-structured code that is useful or customisable in our scenarios. We are also planning to explore AWS CodeWhisperer, which was recently launched in the SEA region on an experimental basis and is available only through VS Code or SageMaker Studio.
As a FinTech company, we are heavily regulated and have to be very cautious about how much we can share with public LLMs while writing prompts. There is a lot of hesitation about what information to put in a prompt or code explainer, and this cautious approach is actually a good thing to prevent accidental leakage of data or intellectual property.
I think the best way to auto-generate customised code is to fine-tune open-source LLMs on the in-house codebase. However, they will work better on the engineering and ETL side, maybe not so much on the data science side. For example, feature engineering from new data sources is a very critical and time-consuming process and needs a lot of hypothesis generation by the data scientists. In-house LLMs can probably be more effective at standardising the feature engineering code after the data scientists create new features. They can also be useful for auto-generating some baseline features from new data sources, but this area will still need a lot of manual effort.
We have experimented with LLMs for chatbots, auto-responders, and marketing content generation, and they have shown good results.
– Biswanath Banik, Director of Data at FinAccel
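The "very detailed prompt" Biswanath describes for documentation and code commenting can be sketched as a template. The helper below is illustrative only, assembling the kind of structured instruction one might send to an LLM; the template wording and function name are assumptions, not any specific vendor API.

```python
# Illustrative sketch: build a detailed documentation prompt for an LLM.
# The template and constraints here are hypothetical examples of the
# "detailed prompt" approach, not a specific product's API.
def build_documentation_prompt(code: str, language: str = "python") -> str:
    return (
        f"You are a senior {language} reviewer.\n"
        "Add a docstring and inline comments to the code below.\n"
        "Constraints:\n"
        "- Describe parameters, return value, and raised exceptions.\n"
        "- Do not change the code's behaviour.\n"
        "- Keep comments short and factual.\n\n"
        f"```{language}\n{code}\n```"
    )

snippet = "def area(r):\n    return 3.14159 * r * r"
prompt = build_documentation_prompt(snippet)
print(prompt)
```

Spelling out constraints like "do not change the code's behaviour" is what separates a reusable documentation prompt from an ad hoc one, and it is also where regulated teams would strip anything sensitive before the prompt leaves the building.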
Custom Copilot: Enhancing Data Privacy and Model Tuning for Industry-Specific Solutions
Most organizations could benefit from utilizing open-source repositories, which grant access to extensive foundation models. The quality of these large foundation models, such as GPT-3, remains robust, verbose, and refined. The challenge isn’t so much about coding structures for technologies like 5G (there are several tools and assistants available for that) but rather lies in data structuring, especially when you consider the other end of the spectrum, which is fine-tuning. Fine-tuning involves complexities beyond coding structures, and data structuring is not a straightforward problem. Additionally, the training process requires high-quality datasets that can’t be prepared easily in a short time; curating data demands careful preparation and can be time-consuming. Many organizations might lack the resources to dedicate to this process if they’re seeking swift results.
Hence, there are applications that can make use of the Foundation Model Canvas or coding assistants. Alternatively, an intermediary solution exists, allowing organizations to tap into their own databases. This could include domain-specific data or even their own code, perhaps for code modularization. This intermediary approach addresses the data structuring and preparation challenges. On the other end of the spectrum is fine-tuning. Currently, fine-tuning open-source models might not yield results as polished and eloquent as one might expect, especially in achieving a conversational style; the work in progress recognizes the need for meticulous data curation and assembling of relevant metrics. The intermediary solution offers a quicker path to success, providing practical applications that suit organizational needs.
– Narasimha Medeme, VP Head Data Science at MakeMyTrip
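The "intermediary" approach Narasimha outlines, tapping an organization's own code instead of fine-tuning, can be sketched as retrieval: find the in-house snippets most relevant to a request and attach them to the prompt. The toy below uses plain word overlap purely to stay self-contained; real systems use embeddings and a vector store, and the corpus here is entirely hypothetical.

```python
# Toy sketch of retrieval over an in-house snippet corpus. Word overlap
# stands in for the embedding similarity a production system would use.
def tokenize(text: str) -> set:
    return set(text.lower().replace("(", " ").replace(")", " ").split())

def retrieve(query: str, corpus: dict, k: int = 2) -> list:
    """Return names of the k snippets sharing the most words with the query."""
    scores = {
        name: len(tokenize(query) & tokenize(code))
        for name, code in corpus.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical in-house snippets, standing in for a real code base.
corpus = {
    "load_orders": "def load_orders(path): return read_csv(path)",
    "dedupe_users": "def dedupe_users(df): return df.drop_duplicates('user_id')",
    "score_model": "def score_model(model, X): return model.predict(X)",
}

print(retrieve("how do we dedupe users in a df", corpus, k=1))
```

Because the model only ever sees the retrieved snippets at inference time, this sidesteps the dataset curation burden of fine-tuning that both panelists flag.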
What are the new skills needed?
We are continuously learning new things, and the rate at which we take in new things has increased tremendously. Learning how to learn is ever more important, and learning to know your own style better than anybody else will help a lot. The exact content or knowledge base you currently have seems less and less relevant as we go along. When we started off, it was SAS; hardly anybody uses SAS these days. It’s not enough to accumulate knowledge. Just as when you’re training a model, your learning has to be a little adaptive, and right now is the best time to do that. If you know how to pick up things that are relevant for you on the fly, that could be more important than exactly what you pick up. What you learn today is not going to be relevant five years from now, but learning how to learn and picking things up on the fly is a must. I don’t think that changes.
– Issac Mathew, Data Science Leader
All these tools and technologies are enabling us to focus more on problem solving rather than coding, and that’s where having the right knowledge of the tools becomes important: knowing the tools well is what lets us be good at problem solving. That’s how these roles will evolve, from coders into problem solvers. Secondly, they’re also enabling us to foster our own creativity. Many times in our day-to-day jobs, our creativity goes for a toss; we don’t really have time for creativity or thinking outside the box. But with all these things coming in, a lot of mundane, repetitive tasks will go away.
– Dhruv Rastogi, Vice President & Head of Data Science at IKS Health