AirLLM: Run 70B Models on Your 4GB GPU (But Pack a Lunch)
AirLLM lets you run Llama 3.1 405B on 8GB VRAM and 70B models on 4GB GPUs through layer-wise inference. Here’s the catch: it’s insanely slow. Is the tradeoff worth it?
OpenAI’s GPT-5.2 derived a new result in theoretical physics, proving that single-minus gluon amplitudes are nonzero. The discovery, verified by…
The “Blue Collar” Agent is here. While OpenAI and Anthropic fight for the $20/month subscription slot, a Chinese lab just…
Canvas-of-Thought replaces linear Chain-of-Thought with mutable DOM-based reasoning. New paper shows it beats CoT, ToT, and PoT on VCode, RBench-V, and MathVista.
University of Michigan’s Prima AI can diagnose brain MRIs in 3 seconds with 97.5% accuracy. This vision-language model trained on 200K+ scans identifies 50+ neurological conditions and could solve radiology’s workforce crisis.
GLM-5’s 744B open-source beast crushes Opus 4.6 on price at $1/1M tokens, but Anthropic’s 1M context window dominates knowledge work. The February 2026 coding showdown explained.
A reference to “Gemini 3.1 Pro Preview” appeared on Artificial Analysis Arena. A Google DeepMind employee hinted “Thursday seems likely.” Here’s what’s happening.
Zhipu AI launches GLM-5, a 744B-parameter open-source model trained purely on Huawei chips. It hits 77.8% on SWE-bench Verified and finally solves the OpenRouter “Pony Alpha” mystery. Here’s why it matters.
Anthropic is opening its first India office in Bengaluru in early 2026. Here’s why the AI giant is betting on India’s talent and what it means for the global AI race.
The Pentagon’s rebrand to the ‘Department of War’ came with a $13.4 billion AI war chest. Here’s why OpenAI and Google are fighting for the soul of GenAI.mil.