🛏 Death of Large Language Models?

Reeshabh Choudhary
2 min read · Jan 9, 2024

📢 Microsoft’s latest release, Phi-2, with 2.7 billion parameters, aims to be an LLM killer. Phi-2 currently challenges models like Llama-2 and Mistral, which weigh in at around 7B parameters, but is it really something innovative?
People move on pretty quickly in politics and the information industry. Back in the day, when GPT-3 and GPT-3.5 were released, there was a lot of hype around their performance and usage. Computing was always going to be expensive, though. With the price of GPT models in Azure AI Studio and their failure to perform even basic mathematical operations in Excel, eyebrows were raised about the potential of LLMs. Not every organization could afford to leverage GPT models in production, and engineers, being engineers, were looking for a way around this problem.

🔫 Enter models like GPT4All, which were trained on a (randomly selected) subset of data from the original GPT models and could be run on a local machine with 4GB–16GB of RAM. And to make things more interesting, they did a fair job compared to full LLM capabilities, given the limited resources they had access to.
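To get a feel for how low that barrier is, here is a minimal sketch using the open-source gpt4all Python package to run a quantized model entirely on a CPU. The package and its API are real, but the specific model filename below is an assumption; substitute any model from the current GPT4All catalogue.

```python
# pip install gpt4all
from gpt4all import GPT4All

# Downloads the model file on first run, then runs it on the CPU.
# NOTE: this filename is an assumption; pick any model the GPT4All
# catalogue lists (most are 2-8 GB quantized GGUF files).
model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")

with model.chat_session():
    reply = model.generate(
        "In two sentences, why are small language models useful?",
        max_tokens=120,
    )
    print(reply)
```

No GPU, no cloud subscription: the whole thing fits in the RAM of an ordinary laptop, which was exactly the point.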
However, the marketplace was soon crowded with open-source LLMs, led by Meta’s Llama, and people soon forgot about the naïve attempt made by the developers of GPT4All. Amidst the chaos of the LLM boom, with models competing with each other over billions and trillions of parameters, Microsoft seemed to have learnt its lesson from its failure to convert LLM sales into production deployments, as companies and their clients were not ready to shell out money for the capabilities on offer. Perhaps the attempt of GPT4All was not in vain: it was noticed and taken even further when Microsoft released its paper “Textbooks Are All You Need”, which focused on textbook-quality training data. Microsoft then quietly released Phi-1, trained on synthetically curated data to teach the model common-sense reasoning and general knowledge, including science, daily activities, and theory of mind, among others.

💡 The idea was simple and derived from the lessons of human history: you cannot dump garbage in one place and expect to create something of value out of it. To achieve goals, you need a goal-centric approach. Rather than learning everything off the internet, or whatever else is available, strategically choose what you want to learn and then apply it to solve problems.

💲 Cost was definitely a factor. Soon, like AWS, Microsoft also allowed LLMs other than GPT onto its platform, meanwhile developing and improving the Phi-2 model. To make it available to common folks, a product has to be made accessible to them. With the possibility of running these new small language models on-device, the next logical step is being able to run them in the browser, which will be the ultimate tipping point for Gen-AI! Till then, all we can do is keep playing around with the resources available for free! 😎
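Phi-2 itself is one of those free resources: its weights are published on Hugging Face as microsoft/phi-2. A minimal sketch of trying it locally could look like the following, assuming the transformers library is installed; older transformers releases may also require trust_remote_code=True, and the Instruct/Output prompt shape follows the model card.

```python
# pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"  # ~2.7 billion parameters
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # float16 roughly halves memory on a GPU; stay on float32 for CPU.
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

prompt = "Instruct: Explain in one paragraph why curated training data matters.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

A 2.7B-parameter model loaded this way fits comfortably on a single consumer GPU, or even on a CPU if you are patient, which is precisely the accessibility argument the small-model camp is making.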

#genai #llm #costsavings #ai #gpt #bard #microsoft

Reeshabh Choudhary

Software Architect and Developer | Author: Objects, Data & AI.