Microsoft has launched the latest iteration of its lightweight AI model, Phi-3 Mini, marking the first of three small models slated for release by the company.
Phi-3 Mini has 3.8 billion parameters and was trained on a dataset that is small relative to large language models like GPT-4.
Eric Boyd, Corporate Vice President of Microsoft's Azure AI Platform, who announced the release on Tuesday, described Phi-3 Mini as being as capable as large language models (LLMs) like GPT-3.5, just in a smaller form factor.
He added that Phi-3 is now accessible on Azure, Hugging Face, and Ollama, with Microsoft planning subsequent releases of Phi-3 Small (7B parameters) and Phi-3 Medium (14B parameters). Parameters are a rough measure of how complex the instructions are that a model can understand.
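For developers who want to try the model, a minimal sketch of loading Phi-3 Mini through Hugging Face's transformers library might look like the following; the model ID (microsoft/Phi-3-mini-4k-instruct) and the generation settings are assumptions for illustration, not details from Microsoft's announcement.

```python
# Minimal sketch: loading Phi-3 Mini from Hugging Face with transformers.
# The model ID below is an assumption; check the Hugging Face hub for the
# exact name and context-length variants.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # pick a dtype appropriate for the hardware
    device_map="auto",       # requires the `accelerate` package
    trust_remote_code=True,  # Phi models may ship custom model code
)

# Phi-3 Mini ships as an instruct model, so format input as a chat turn.
messages = [{"role": "user", "content": "Why are small models cheaper to run?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

On Ollama, the model can reportedly be pulled with a single command (e.g., `ollama run phi3`), though the exact model tag may vary.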
Microsoft claims that Phi-3 improves on Phi-2, released in December, which already performed comparably to larger models like Llama 2; the company says Phi-3 delivers responses on par with models ten times its size.
Compared to their larger counterparts, small AI models are often cheaper to run and perform well on personal devices such as phones and laptops.
Earlier this year, The Information reported Microsoft's establishment of a dedicated team focused on developing lightweight AI models. Alongside Phi, the company has also developed Orca-Math, a model tailored for solving mathematical problems.
Rival tech giants offer small AI models of their own, targeting simpler tasks such as document summarization or coding assistance. Google's Gemma 2B and 7B, for instance, are suited to chatbots and language tasks, Anthropic's Claude 3 Haiku can summarize dense research papers, and Meta's recently released Llama 3 8B is aimed at chatbots and coding assistance.
Boyd explained that developers trained Phi-3 with a "curriculum," inspired by the way children learn from simplified stories and books. They used a list of more than 3,000 words to generate "children's books" that could teach Phi.
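To make the idea concrete, here is a hypothetical sketch of what such vocabulary-driven data generation could look like; the word list, prompt wording, and function names are invented for illustration and are not details from Boyd's description.

```python
import random

# Hypothetical illustration of the "children's book" idea Boyd describes:
# sample a few words from a fixed simple vocabulary and ask a larger
# "teacher" model to write a short story using them. The vocabulary and
# prompt wording here are stand-ins, not Microsoft's actual pipeline.

SIMPLE_WORDS = ["dog", "ball", "tree", "run", "happy", "rain", "friend", "home"]

def make_story_prompt(rng: random.Random, num_words: int = 3) -> str:
    """Build one prompt asking a teacher model for a simple story."""
    words = rng.sample(SIMPLE_WORDS, num_words)
    return ("Write a short, simple story for a young child "
            f"that uses the words: {', '.join(words)}.")

rng = random.Random(42)  # seeded so the sketch is reproducible
for _ in range(3):
    print(make_story_prompt(rng))
# Each prompt would go to a large teacher model; the resulting stories
# become synthetic training text for the small model.
```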
Boyd also noted that Phi-3 builds on what its predecessors learned: Phi-1 focused on coding, Phi-2 began to learn reasoning, and Phi-3 is stronger at both.
Still, while the Phi-3 family has some general knowledge, it cannot match the breadth of GPT-4 or other large language models.
Boyd said that many companies find that smaller models like Phi-3 are better suited for their custom applications, especially given that their internal datasets are typically smaller. Additionally, these models require less computing power, making them more affordable options for many businesses.