Links Justin Sends 4/11

🦙💥 Llama 4, 🧬 De-Extinct Wolves, 🤖 Combat Bots, 🧠 AutoRAG & More AI Breakthroughs This Week

👋 Happy Friday!

One of the most common questions I get, especially from folks not deep in the trenches, is: "These numbers and model names are cool, but what do they actually mean for progress?"

Think of it this way: If AI progress in 2018 was like building better calculators, today's progress is more like building helpful coworkers. Each year, we're not just making models a bit better; we're expanding what they can do. A task that took hours of custom fine-tuning code five years ago can now be handled in seconds with an API call, with far better results.

Benchmarks give us a glimpse, like a standardized test score, but real-world gains show up when models:

  • Understand long documents without losing context (like Llama 4 Scout's 10M-token window),

  • Write or debug code at near-human levels,

  • Interpret images or even plan multi-step tasks autonomously, or

  • Run locally on your laptop instead of needing a supercomputer.

We're moving fast, and each link below is a snapshot of how AI is becoming less abstract and more capable, week by week.

Now on to the links you came here for 👇

🦙 Meta Releases Llama 4 Models: Meta dropped Llama 4 over the weekend, announcing Scout, Maverick, and Behemoth. Scout supports a wild 10M-token context window. Maverick is a 17B-active-parameter multimodal model with impressive image understanding. Behemoth, which Meta has previewed but says is still in training, is a mixture-of-experts (MoE) model with 288B active parameters out of roughly 2T total, designed to train smaller models through distillation. Like DeepSeek's models, it activates only a subset of experts per token to stay compute-efficient.

💾 AMD Supports Llama 4 on Day Zero: AMD announced immediate support for Llama 4 on its Instinct GPUs. Groq has also confirmed compatibility, which should make testing across hardware a lot more interesting.

🛠️ Cloudflare's Agent Toolkit for MCP: Cloudflare launched support for the Model Context Protocol (MCP), a standard interface that lets AI agents call external tools and services.

🐺 Colossal Biosciences De-Extincts Direwolves β€” Using CRISPR and AI-driven synthetic biology, Colossal brought back the direwolf. Romulus and Remus, born in October 2024, are the first de-extinct animals in over 10,000 years. HBO, if you're listening, Ghost needs some more siblings.

📜 Shopify's Internal AI Manifesto: Shopify CEO Tobi Lütke published an internal memo stating that AI use is now a baseline expectation at Shopify. A strong example of executive leadership driving an AI-first culture.

🧠 Cloudflare Launches AutoRAG β€” A fully-managed retrieval-augmented generation pipeline. Upload documents, integrate your system, and let Cloudflare handle the embeddings, vector DB, and API-based querying.

🧑‍💻 VS Code Launches Agent Mode with MCP: VS Code now supports Agent Mode, integrated with MCP. It looks like they're targeting Cursor and Windsurf in the next-gen AI dev tooling wars.

🧪 Mixed Reactions to Llama 4: Testing shows polarized results. Some praise it as state-of-the-art; others say it was over-tuned for benchmarks. I'll report back once I run it through multi-agent systems in depth.

🍎 Apple’s AI Dev Tools Gap Widens β€” iOS developers are jumping ship from native Swift apps to Expo React to keep up with AI development speed. Apple’s historic developer platform strength is slipping without AI-enabled tooling.

🧪 OpenAI Launches Evals API: A new API to define tests, run evals, and iterate on prompts directly in your pipeline. It's a closed-source alternative to one of my favorite tools, Promptfoo (which I'm still going to use).

📦 Firebase Launches Firebase Studio: Firebase Studio enters the arena against Bolt.new and Lovable. For now, it looks more like hands-free dev than a Cursor/Windsurf competitor, but it could grow into an all-in-one AI dev suite.

🥊 Unitree Combat Bots Incoming: Unitree revealed the Iron Fist King, a humanoid robot built for combat. Priced under $50K, it opens up a whole new category of robot sports... or robotic bodyguards 🤖👊?

📈 ZR1-1.5B Outperforms Llama 3.1 70B on Code: Zyphra's ZR1-1.5B model beat Llama-3.1-70B on hard coding tasks and scored 37.91% pass@1 on GPQA-Diamond, a graduate-level science benchmark. Big win for compact reasoning models.
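If "pass@1" is new to you: it is the chance that a single sampled answer is correct, averaged over all problems. Below is a small sketch of the commonly used unbiased pass@k estimator (pass@1 is the k=1 case). The per-problem sample counts are invented for illustration and are not from the ZR1 evaluation.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples is correct,
    given c correct answers observed among n samples for one problem."""
    if n - c < k:
        # Too few incorrect samples to fill k draws, so a hit is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each entry is (n_samples, n_correct)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)

# Hypothetical results: (samples drawn, samples correct) per problem.
RESULTS = [(10, 3), (10, 0), (10, 10), (10, 5)]

if __name__ == "__main__":
    print(f"pass@1 = {mean_pass_at_k(RESULTS, k=1):.2%}")
```

For k=1 the formula reduces to the plain fraction of correct samples per problem, which is why a 37.91% pass@1 reads like an ordinary accuracy score.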

Have a great weekend!

-Justin

aka the guy with great AI links

Co-founder & Head of Technology @ BetterFutureLabs