Meta has released the first wave of its next-generation open-source generative AI models, Llama 3.
Announced in a blog post on Meta's official website, Llama 3's initial offering includes two models: Llama 3 8B and Llama 3 70B, with more to follow in the future.
Meta claims these models represent a significant leap forward compared to their predecessors, boasting superior performance on various AI benchmarks. These benchmarks measure an AI model's ability in tasks like knowledge comprehension, skill acquisition, and reasoning with text.
While the validity of these benchmarks is debated, they currently serve as a standard for comparing AI models. Here's how Llama 3 stacks up:
- Llama 3 8B: This model outperforms other publicly available options with a similar parameter count, like Mistral 7B and Google's Gemma 7B, on at least nine benchmarks. These benchmarks test knowledge, code generation, problem-solving, and commonsense reasoning.
- Llama 3 70B: This larger model competes with industry leaders like Google's latest Gemini 1.5 Pro, surpassing it on benchmarks such as MMLU (knowledge understanding), code generation, and math word problems. While it does not overtake Anthropic's top-of-the-line Claude 3 Opus, Llama 3 70B bests the mid-tier Claude 3 Sonnet on several benchmarks, showcasing its capabilities.
It's important to note that benchmarks are just one way to assess AI models. Real-world performance and user experience are also crucial factors.
With the release of Llama 3, Meta positions itself at the forefront of open-source generative AI. It will be interesting to see how these models are adopted in practice and how they compare to future offerings from competitors.