Our series on AI evals continues with an exploration of the types of benchmarks. The opinion series explores the trend of all the major AI labs building the same primitives (research, reasoning, search, etc.) and its implications. In research, we dive into the new Llama 4 release. Engineering explores another cool framework.

I had written an editorial for this newsletter, and then Meta dropped the Llama 4 release on Saturday. Oh well, time to rewrite the whole thing, but it was definitely worth it because this Llama release is a big deal: new architectures and enhanced multimodality.

Llama 4 debuts with two models: Llama 4 Scout and Llama 4 Maverick. Both are multimodal, capable of processing images as well as text. This versatility positions them as foundational models for next-generation applications requiring rich contextual understanding across modalities.

At the core of Llama 4 lies a sophisticated mixture-of-experts (MoE) architecture. Llama 4 Scout incorporates...