Cohere Technologies’ Command Beta model secured the top position in Stanford University’s HELM (Holistic Evaluation of Language Models) benchmark earlier this month. It ranked first among 36 large language models (LLMs), including Meta’s Galactica, OpenAI’s Davinci, Google’s Flan, and Bloom.
Despite the accolade, Ed Grefenstette, the Head of Machine Learning at Cohere, remained admirably modest, considering the achievement a “nice marketing moment” rather than a definitive success.
“Leaderboards are always things that you should take with a grain of salt. We do not want to be complacent and imagine that just because we topped this leaderboard, we will be better than other models close behind us,” said Grefenstette. The natural language processing (NLP) expert explained that he is more excited about the week-on-week progress of their models than the leading position.
Grefenstette also mentioned that OpenAI’s latest GPT-4 model, which is undeniably powerful, had not been benchmarked at the time. He acknowledged that the results might differ in the next run if Stanford benchmarks the current models against GPT-4.
Tracing his journey into artificial intelligence, Grefenstette shared that his interest in the field was sparked by studying the philosophy of mind during his undergraduate studies, as well as reading science fiction and cyberpunk novels in his teens. He chose to pursue computer science due to the lack of job opportunities in philosophy.
As a budding philosopher, Grefenstette was intrigued by the question of what makes humans and other species intelligent. He was particularly interested in understanding how humans use intelligence to reason about the physical world and abstract concepts. However, he soon realized the complexity of this task and began exploring how artificial intelligence could aid in reasoning.
While completing his doctoral work in natural language processing at the University of Oxford, Grefenstette discovered that the approach he and his colleagues were developing shared similarities with rudimentary neural networks. This realization led them to establish Dark Blue Labs in 2014 to commercialize their ideas. Google acquired the company a few months later.
Shortly before the acquisition, Google had also purchased British AI company DeepMind. As a result, Grefenstette merged his team into DeepMind, where he contributed to the establishment of the NLP group and a program synthesis and understanding group.
During his time at DeepMind, Grefenstette observed the company’s exponential growth from 80 employees to over 1,000 in just four years. He noted that this rapid expansion can make it difficult for individual voices to be heard and can change the culture and dynamics of a workplace.
Grefenstette later joined Facebook AI Research in London, where he helped build a new research lab. Although he had considered launching his own startup, he found that Facebook offered a balance: the entrepreneurial opportunity to start something small and grow it into something larger.
However, after three years, Grefenstette was once again drawn to smaller organizations and early-stage ventures. Cohere, with its focus on conversational AI and potential for growth, proved to be the perfect fit for Grefenstette’s entrepreneurial spirit.
Grefenstette commended OpenAI’s fantastic progress in the field of AI, stating that many stakeholders are now looking to close the gap created by OpenAI’s head-start. He also expressed interest in creating the next groundbreaking moment in AI, similar to ChatGPT.
Addressing the ethical and societal implications of AI technology, Grefenstette emphasized that these issues cannot be left entirely to technologists. He called for broader education among the population and government regulations to address potential risks and concerns associated with AI applications. According to Grefenstette, open dialogues and collaboration between various stakeholders, including technologists, policymakers, educators, and the general public, are crucial in tackling these challenges.
Turning to concerns about language models being used in medical applications, Grefenstette pointed out that people should be educated about the intrinsic risk of hallucination associated with these models. He explained that AI models are trained to generate plausible responses based on the data they were trained on, rather than always providing the truth. Although plausibility and truth occasionally overlap, they are not synonymous, which leaves users who expect AI-generated responses to be accurate vulnerable to errors.
Grefenstette stressed the importance of cultivating a sense of “healthy skepticism” within the broader population when it comes to AI applications. Encouraging critical thinking and awareness of AI limitations can help prevent misuse and overreliance on these technologies in sensitive areas.
When asked about his views on the race for AI companies to achieve human-level intelligence, also known as Artificial General Intelligence (AGI), Grefenstette classified himself as a skeptic. He argued that humans are highly adaptable and versatile but are not the ultimate benchmark for what is possible in physics and biology.
In Grefenstette’s view, the pursuit of AGI should be framed within a broader understanding of the limitations and potential of human intelligence. It is essential to recognize that our capabilities as humans have been shaped by evolutionary pressures and environmental constraints. Consequently, the development of AGI should not solely aim to replicate or surpass human abilities but should also explore novel capabilities that may extend beyond our innate capacities.
In conclusion, Ed Grefenstette’s journey in AI and his role at Cohere Technologies illustrate the importance of adaptability, collaboration, and ethical considerations in the rapidly evolving field of artificial intelligence. By fostering an environment that encourages both technical innovation and social responsibility, the AI community can continue to make significant strides in developing AI solutions that benefit society at large.