
🤖 Together AI ships serverless inference, Sakana AI & NVIDIA speed sparse models

May 10, 2026 · Morning brief · 14 news · 5:48

Audio in Mandarin Chinese · English transcript below

Serverless models at rock-bottom prices, sparse LLMs 20% faster, a legal fumble over AI toys, caption glasses that double as a real-time cheat sheet, SIGGRAPH films showing off generative techniques: the rush-hour subway edition.

Today's Top 3 Headlines

  1. AI Industry News

    🤖 Together AI serverless: DeepSeek-V4-Pro 512k ctx, $2.1/M in, $8.4/M out

    Together AI's docs show its serverless inference now hosts 20+ LLMs, including DeepSeek-V4-Pro at $2.1 per million input tokens, $4.4 per million output tokens, and $0.2 per million cached input tokens. Developers can run LLMs with no reserved compute and no minimum spend, which suits prototyping and low-traffic apps.

  2. AI Technology

    🤖 1B-LLM 20% faster: Sakana AI×NVIDIA debut TwELL format

    Sakana AI and NVIDIA unveil the TwELL sparse format plus custom CUDA kernels, speeding up LLM inference by more than 20% at the 1B-parameter scale while reducing peak memory and energy use. Developers can run bigger models or higher concurrency on the same hardware, which makes edge deployment more energy-efficient.

  3. AI Industry News

    🤖 Overconfident Agent? Intent-Chaos Test Exposes Flaws Pre-Launch

    VentureBeat: a 2026 enterprise rollout was halted 4h after an observability agent mis-flagged a batch job and triggered a rollback. The causes cited are LLM non-determinism and multi-agent "poison inputs." Intent-driven chaos testing shifts the metric from "task done" to "intent preserved."

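
For readers who want to sanity-check the Together AI pricing in item 1, here is a minimal sketch of the per-request arithmetic. It uses the rates quoted in the summary text ($2.1, $4.4, and $0.2 per million tokens). The `Rates` class and `request_cost` function are hypothetical names made up for illustration, not part of any Together AI SDK. The sketch also assumes cached input tokens are billed at the cached rate instead of the regular input rate, which the brief does not confirm. The rates are parameters, since the headline and the summary quote different output prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per one million tokens, taken from the summary text of item 1."""
    input_per_m: float = 2.1
    output_per_m: float = 4.4
    cached_input_per_m: float = 0.2


def request_cost(input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0,
                 rates: Rates = Rates()) -> float:
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the brief): cached tokens are a subset of
    the input tokens and are billed at the cached rate instead of the
    regular input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed input tokens")
    fresh_input = input_tokens - cached_input_tokens
    total_micro_usd = (fresh_input * rates.input_per_m
                       + cached_input_tokens * rates.cached_input_per_m
                       + output_tokens * rates.output_per_m)
    return total_micro_usd / 1_000_000


# Example: a 100k-token prompt, 80k of it served from cache, 2k tokens out.
cost = request_cost(100_000, 2_000, cached_input_tokens=80_000)
print(f"${cost:.4f}")
```

Passing a different `Rates(...)` instance lets the same function be rerun with whichever output price turns out to be current.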

+9 more headlines

  • 🤖 Palantir CEO: AI is product & target; biz may be replaced by LLM
  • 🤖 Dyson 360 Vis Nav drops $919 → $279.99
  • 🤖 1,500 firms race into AI kids’ toys; cheap rush sparks content chaos, regulators called
  • 🤖 WIRED crowns Even Realities G2 2026’s best real-time caption glasses
  • 🤖 SIGGRAPH 2026 drops first slate: Mandalorian, Avatar 3, Toy Story 5 VFX wizardry
  • 🤖 Redis creator ships ds4; DeepSeek V4 inference takes off on Apple Silicon
  • 🤖 19 hottest EVs at 2026 Beijing Auto Show: China leads with electric+AI
  • 🤖 9 spec-driven AI tools for 2026: AWS Kiro, BMAD, GSD lead
  • 🤖 Palantir £600M UK win: history grad lobbyist Moseley