
Is Claude Opus 4.7 Mythos Distilled, Running Qwen 3.6 Locally, and the AI-On-AI Arena



Is Claude Opus 4.7 really burning tokens? Is open source dead after mythos? Co-hosts Shimin Zhang and Dan Lasky, with recurring guest Rahul Yadav, ran the experiments this week on ADI Pod #22 (April 21, 2026).

This episode covers Anthropic's Claude Opus 4.7 release (the "mythos slice"), Alibaba's open-source Qwen 3.6 35B A3B, cal.com going closed source for security reasons, and a HIPAA-violating vibe-coded patient portal that is, in Dan's words, the bullshit future already here.

In this episode

▸ **Claude Opus 4.7 review**: the new mythos-derived tokenizer (3× bloat on plain English), stricter instruction-following, and why Shimin's SVG experiments suggest the token-burn panic is overblown: 35¢ on Opus 4.7 vs $2 on Opus 4.6 for the same task, with ~40× fewer reasoning tokens.
▸ **Qwen 3.6 35B A3B**: Alibaba's open-source mixture-of-experts model (3B active params at any time) running locally on Shimin's laptop at 90–95 tokens/sec via llama.cpp + Unsloth. The first model to break Simon Willison's pelican-on-a-bicycle benchmark against a larger frontier model.
▸ **cal.com goes closed source**: why the AI Security Institute's $12,000-per-attempt mythos pentesting data ($125,000 for 10 runs) is changing the open-source calculus, and Drew Breunig's prediction of a three-phase dev/review/hardening cycle.
▸ **Jesse Vincent's "Rules and Gates"**: a coding-agent prompting technique that reformulates optional preferences into directed preconditions, and whether agents can "weasel out" by rewriting the gate itself.
▸ **AI vibe coding horror story**: a German doctor who inlined a full patient portal into a single HTML page with database credentials client-side. HIPAA, meet DSGVO.
▸ **Kyle Kingsbury's "The Future of Everything is Lies"**: the Jepsen author's 8-step action list on AI's second- and third-order societal effects.
▸ **The AI-on-AI Arena**: Shimin's weekend project grading 11 frontier models against each other. The "delusion index" reads almost exactly like Dunning-Kruger in humans: GPT-5.4 scored -1.6 (humble), while Gemini 3.1 Pro Preview rated itself well even though its peers ranked it last.
▸ **Two Minutes to Midnight**: Paul Graham's log-scale chart comparing AI capex (~1% of US GDP) to the US railroad peak (~10%). We dialed the AI bubble clock back 45 seconds to 3 min 30 sec.

Key takeaways

- Opus 4.7's token-burn reputation may be overblown. Stricter instruction-following can reduce total reasoning tokens by up to 40× vs Opus 4.6 on the same task.
- Security-driven closed-sourcing may spread as mythos-class agents make open repos easier to exploit. Hardening could make software capital-intensive again.
- Cognitive debt is real: Dan's wake-up call was a production bug that a pre-LLM colleague solved in 5 minutes. His first instinct was to double down on the tool.
- Shimin's defense against skill atrophy: read 100% of LLM-generated PR lines (except tests).
- Weaker models rate themselves higher than stronger ones. Calibration appears to improve with capability.

Chapters

(00:00) - Introduction to AI and Software Development
(02:25) - Alibaba's Qwen 3.6 Model Overview
(08:06) - Anthropic's Claude Opus 4.7 Release
(18:08) - Cal.com Goes Closed Source: Implications for Security
(20:40) - The Future of Vibe Coding
(23:41) - Techniques for Effective AI Utilization
(27:13) - Post-Processing and AI in Real-World Applications
(33:07) - The Cultural Impact of AI and Technology
(41:30) - Navigating Code Review Challenges
(42:57) - Exploring AI's Societal Impact
(45:16) - Evaluating AI Models: Performance and Insights
(49:09) - The Future of Data Centers and AI
(50:54) - Investment Trends and Economic Perspectives
(57:58) - Reflections on Historical Investment Cycles
(59:35) - Optimism Amidst Uncertainty

Resources mentioned

Claude Opus 4.7 & Qwen 3.6
• Introducing Claude Opus 4.7 (Anthropic): https://www.anthropic.com/news/claude-opus-4-7
• Claude Opus 4.7 System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf
• Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All: https://qwen.ai/blog?id=qwen3.6-35b-a3b
• Simon Willison, "Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7": https://simonwillison.net/2026/Apr/16/qwen-beats-opus/
• Shimin, "Opus 4.7 isn't dumb, it's just lazy": https://shimin.io/journal/opus-4-7-just-lazy/

Security & open source
• Cal.com is going closed source. Here's why: https://cal.com/blog/cal-com-goes-closed-source-why
• Drew Breunig, "Cybersecurity Looks Like Proof of Work Now": https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html

Technique & commentary
• Jesse Vincent, "Rules and Gates": https://blog.fsck.com/2026/04/07/rules-and-gates/
• An AI Vibe Coding Horror Story: https://www.tobru.ch/an-ai-vibe-coding-horror-story/
• Kyle Kingsbury (Aphyr), "The Future of Everything is Lies, I Guess": https://aphyr.com/posts/411-...