Tencent’s open-source SearchAgent-8B performs 10+ retrieval turns for deep research at zero API cost. Run it locally on your own data.
Devin’s 67% PR merge rate and 4x faster problem-solving signal a shift in AI software engineering. Here’s the technical breakdown.
QwenLong-L1.5 processes up to 4M tokens with state-of-the-art reasoning. Learn how Alibaba’s memory-augmented architecture matches GPT-5 performance.
NVIDIA pays $20B to acquire Groq’s inference technology and team, but not the company itself. Here’s why this strategic move reshapes the AI chip market and what it means for AMD.
ChatGPT Atlas browser failed privacy tests with alarming scores: 1/100 for fingerprinting and 0/100 for tracker blocking. Here’s why OpenAI’s “once-a-decade” browser might be the biggest privacy disaster of 2025.
GLM 4.7 represents a leap in open-source agentic AI. With “Interleaved Thinking” and “Preserved Thinking” modes, it challenges proprietary models in complex reasoning. We analyze its new agentic architecture.
MiniMax M2.1 review: A $0.20/1M token model with 4M context that challenges Claude Opus 4.5. Deep dive into its Lightning Attention architecture, benchmarking scores, and real-world coding performance.
Pentagon awards xAI contract to integrate Grok into GenAI.mil platform for 3 million military and civilian personnel. Critics raise misinformation concerns.
Research reveals 32.67% of SWE-bench successes come from solution leakage, not real problem-solving. PatchDiff study shows 7.8% of “correct” patches actually fail. What this means for AI coding tool claims.
Anthropic partners with US Department of Energy on Genesis Mission, deploying Claude AI across 17 national labs and 40,000 scientists for energy and bioscience research.

