Episodes

  • Issue #8: Anthropic ships Managed Agents, UC Berkeley breaks every major AI benchmark, AWS Agent Registry launches in preview
    2026/04/15
    Anthropic ships Managed Agents, UC Berkeley breaks every major AI benchmark, and AWS Agent Registry launches in preview. Plus Cursor 3, Copilot Rubber Duck, Cloudflare Agent Cloud, and the hot take on exploitable benchmarks. Subscribe to the newsletter: https://theagenticengineer.waltsoft.net YouTube: https://www.youtube.com/@theagenticengineerpod Twitter: https://x.com/natearcher_ai
    15 min
  • Issue #7: Anthropic published the blueprint for multi-hour coding agents
    2026/04/09
    Anthropic published the blueprint for multi-hour coding agents. GitHub shipped /fleet for parallel multi-agent coding. Amazon Nova Act MCP gives your agent a browser with one install. Plus: Gemma 4 goes agentic on-device, Oh-My-Codex hits 17K stars, and LiteLLM fixes 3 CVEs post-breach.
    17 min
  • Issue #6: JetBrains Central, ARC-AGI-3, Claude Mythos Leak, Copilot Ads in PRs
    2026/04/01
    This week: JetBrains Central launches an open control plane for coding agents. ARC-AGI-3 drops and frontier AI scores below 1%. Claude Mythos gets leaked via CMS misconfiguration. MolmoWeb beats GPT-4o at 8B parameters. AI Scientist v2 passes peer review. 177K MCP tools show agents shifted from reading to writing. AWS Labs ships Agent Plugins for Claude Code and Cursor. Microsoft merges Semantic Kernel and AutoGen. And Copilot literally put an ad in someone's pull request.
    16 min
  • Issue #5: OpenCode 120K Stars, Claude Code Channels, Agent Memory Wars
    2026/03/24
    This week: OpenCode crosses 120K GitHub stars and 5M monthly devs. Claude Code ships Channels for event-driven coding agents. Hindsight hits #1 on LongMemEval for agent memory. Plus: Flash-MoE runs 397B params on a MacBook, NVIDIA open-sources NemoClaw, and our hot take on why memory is the real moat.
    15 min
  • Issue #4: An Autonomous Agent Hacked McKinsey in 2 Hours
    2026/03/18
    This week: An autonomous agent hacked McKinsey's AI platform in 2 hours with no credentials and no human in the loop. Amazon mandates senior engineer sign-off on all AI-assisted code. Claude gets 1M context at standard pricing. METR proves SWE-bench scores are misleading. Agent Browser Protocol freezes JavaScript for deterministic agent browsing. George Hotz says stop running 69 agents.
    17 min
  • Issue #3: LangChain Just Open-Sourced a Claude Code Replacement
    2026/03/11
    This week: LangChain releases Deep Agents, an MIT-licensed coding agent built on LangGraph that works with any model. GPT-5.4 ships native computer use (75% OSWorld score). Karpathy drops autoresearch for autonomous ML experiments. Claude finds 22 Firefox zero-days in two weeks. Anthropic's labor market study shows junior hiring slowing. Alibaba OpenSandbox provides agent isolation infrastructure. SWE-CI benchmark tests long-term code maintenance. Shannon AI pentester only reports verified exploits. And the Clinejection attack: how a GitHub issue title compromised 4,000 developer machines.
    18 min
  • Issue #2: Claude Code Is Picking Your Stack, Anthropic's Wild Week, Mercury 2
    2026/03/04
    This week: Researchers analyzed 2,430 Claude Code responses and mapped the default developer stack. Anthropic gets designated a supply-chain risk AND drops its safety pledge in the same week. Mercury 2 hits 1,009 tokens/sec via diffusion. Steerling-8B explains every token it generates. CLIHub cuts MCP token costs by 94%. Plus the Agent Index and a hot take on the end of the "responsible AI" era.
    13 min
  • Issue #1: Mercury 2, Agentic IDEs & The Plumbing Era
    2026/02/25
    This week: Mercury 2's diffusion-based decoding hits 1,000+ tokens/sec. Emdash runs 21 coding agents in parallel. Cloudflare ships a full agent hosting SDK. Hugging Face standardizes agent skills. And why "agentic" just became an infrastructure category.
    5 min