Slow Takes: One week in AI

Episodes

Slow Takes Ep. 14: A Trillion Dollars and a Vaccine

2026/06/08

Every Monday at 12:45 BST, Leor from Exploring ChatGPT and I go through the week’s AI news without the hype. Watch the episode for the full discussion. Use this for the facts, the links and a little extra context.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

If you know someone who would benefit from more AI news and less BS then please share this with them.

Anthropic filed to go public at nearly a trillion dollars

On 1 June Anthropic confidentially submitted draft paperwork for a stock market listing, after a $65 billion funding round valued the company at $965 billion. Fortune reports that figure eclipsed OpenAI for the first time. The maker of Claude is now within reach of a one trillion dollar valuation, on revenue running at roughly a $47 billion annualised rate, with a public debut possibly as soon as the autumn.

A company most people have never knowingly used is priced at close to a trillion dollars. That number is a bet that AI will replace a vast amount of human labour, booked in advance of it actually happening. The valuation is a forecast wearing the clothes of a fact. The question worth asking is what has to come true about the world for $965 billion to make sense, and who decided it should.

On the live I’d predicted an autumn float the week before, and the news broke about four hours after we stopped recording, so allow me one moment of feeling clever. Leor did the sober maths: roughly a $47 billion revenue run rate, a 5% operating margin, an implied price-to-earnings ratio north of 500, against Microsoft, in nearly every home and office on earth, valued at only four to five times Anthropic on $100 billion of actual profit. In the short term the market is a voting machine, in the long term a weighing machine. Right now it is voting.
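For anyone who wants to redo the sober maths, here is a minimal sketch of the arithmetic using only the figures quoted above. The function name is hypothetical, and the 21% flat tax rate is purely my assumption for illustration, not something said on the live. On operating profit alone the multiple lands at roughly 410; it only gets north of 500 once something like tax is taken off the profit line.

```python
# Figures quoted in the write-up above (all in billions of US dollars).
VALUATION_BN = 965.0
REVENUE_RUN_RATE_BN = 47.0
OPERATING_MARGIN = 0.05


def implied_multiple(valuation_bn, revenue_bn, margin, tax_rate=0.0):
    """Valuation divided by profit, after an optional flat tax rate.

    Hypothetical helper for illustration only.
    """
    profit_bn = revenue_bn * margin * (1.0 - tax_rate)
    return valuation_bn / profit_bn


# Multiple on operating profit alone.
pre_tax = implied_multiple(VALUATION_BN, REVENUE_RUN_RATE_BN, OPERATING_MARGIN)

# ASSUMPTION: a 21% flat tax rate, chosen only to illustrate the effect.
after_tax = implied_multiple(
    VALUATION_BN, REVENUE_RUN_RATE_BN, OPERATING_MARGIN, tax_rate=0.21
)

print(f"Multiple on operating profit: {pre_tax:.0f}")
print(f"Multiple after assumed tax:   {after_tax:.0f}")
```

Change the margin or the tax rate and the multiple moves a long way, which is rather the point: the headline ratio is very sensitive to a profit figure that is still small.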
For context, $965 billion is roughly the GDP of Switzerland.

Florida sued OpenAI and named Sam Altman personally

On 1 June Florida’s Attorney General James Uthmeier filed suit against OpenAI and named its chief executive Sam Altman in person, reported as the first US state to sue an AI company. The complaint alleges OpenAI marketed ChatGPT as safe while prioritising product and revenue, harvested children’s data, and used sycophancy, the design choice to affirm users excessively, to steer them towards paid subscriptions.

For two years the industry has sold safety as a feature while resisting any outside test of the claim. A state attorney general has now put that marketing in front of a court. Whatever the verdict, the discovery process alone could drag internal safety decisions into public view. Consumer-protection law is proving a sharper instrument than the AI-specific regulation that does not yet exist. Accountability arrived through an existing court, not a new one.

The second a chief executive can be held personally responsible, you will not believe the speed with which proper governance and safety checks appear, the things we keep being told the technology just cannot do. Sadly, once these companies have raised public money, they can outspend a state attorney general for a decade, and the courts already favour whoever can keep paying lawyers the longest.

A Labour MP took Musk’s AI to the High Court

On 3 June the Labour MP Jess Asato, who represents Lowestoft, filed a claim at the High Court against Elon Musk’s xAI, after users of its Grok chatbot created and shared fake images of her without her consent, in the weeks after she criticised the tool. The claim, brought with the law firm AWO, is for breaches of data protection law and misuse of private information, and seeks damages, a formal acknowledgement that what happened was illegal, and an order requiring xAI to stop.
Keir Starmer backed her, saying he was 100% behind her.

The harm here already happened, to a named person, generated by a tool marketed as harmless fun. The only remedy on offer is for the victim to sue one of the richest men alive, in her own time and at her own risk. No regulator stepped in first. The burden keeps landing on individuals while the systems stay intact.

The platforms always say the moderation is too hard. On the live I kept coming back to one comparison: I can post genuinely horrific content to YouTube and it sails through, but the moment I add a Beatles song without clearing the copyright, it is gone in seconds. The technology to detect and stop sharing exists; we have watched it work for music rights and in Telegram and WhatsApp court orders. We are entering an era where capability has to start coming with accountability.

CNN sued Perplexity, and Perplexity said the quiet part out loud

On 28 May CNN filed suit against Perplexity in the Southern District of New York, accusing the AI search firm of scraping more than 17,000 of its stories, photos and videos. The complaint alleges copyright and trademark infringement, including that Perplexity implied an ongoing CNN relationship by offering its content through a paid Comet Plus tier. CNN says it ...

45 min

Slow Takes Ep. 13: The Pope vs the IPO

2026/06/01

Every Monday at 12:45 BST, Leor from Exploring ChatGPT and I go through the week’s AI news without the hype. Watch the episode for the full discussion. Use this for the facts, the links and a little extra context.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

If you know someone who would benefit from more AI news and less BS then please share this with them.

The Pope told the world to slow AI down

Leo XIV released his first encyclical, Magnifica Humanitas, entirely about artificial intelligence, and launched it himself at the Vatican in a room that included senior figures from Big Tech, among them Anthropic co-founder Chris Olah. It applies a theological frame to AI and is careful to say the technology can do real good. It also draws an uncomfortable parallel to the Church’s own failures over the slave trade, and warns about digital colonialism. This was my favourite line:

“The value of persons, however, does not depend on what they achieve or produce. There are rights that apply to everyone simply by virtue of being human, and no human power can legitimately deny or arbitrarily limit them.”

This one is also pretty great:

“In practice, however, technology is never neutral, because it takes on the characteristics of those who devise, finance, regulate and use it.”

The weakness is the one Pope Francis’s climate encyclical had too. Plenty of moral architecture, no policy, no teeth.

Anthropic shipped Opus 4.8 and trailed something bigger

The 4.8 release came with an honesty claim, roughly four times less likely to let flaws in its own code slip through, which is at least a falsifiable number worth testing on the public model. The real story was the tease of Mythos, the model Anthropic once called too dangerous to release because it found so many zero-day vulnerabilities, now arriving as a gated preview in the same week the company raised $65 billion.
The live christened the public version ‘Mythos Light’, because what reaches customers is a cut-down version of the full Project Glasswing model. Anthropic is quietly absorbing the enormous cost of running these scans, a loss leader, and the enterprise price can climb once the workflows are embedded and the IPO needs it. My standing bet is an Anthropic float by October.

Tony Blair told Labour it is ‘playing with fire’

In a new paper the former UK Prime Minister argues the government should reorganise itself around AI and prioritise adoption over regulation. He also writes that:

“We must prioritise cheaper energy and electrification over net zero and use what is left of our North Sea oil and gas resources. This is essential for our competitiveness and for taking advantage of AI.”

A striking thing to pair with an AI-superpower pitch and the country’s own climate targets. Hold it next to the funding: his institute takes around $348 million from Larry Ellison and advises the Treasury on AI procurement. The detail I keep returning to is that the UK has the third-largest stock of data centres in the world and not one frontier model of its own. We are building the warehouses to train somebody else’s AI. Leor’s counter, which he has taken flak for, is that the honest move is to deregulate AI for companies and regulate it hard for the public.

Sam Altman walked back the jobs apocalypse

The CEO of OpenAI reversed his warning this week, admitting that he was “delighted to be wrong” after spending 2022 predicting mass white-collar loss. The data is less reassuring: an Oliver Wyman survey has 43% of US CEOs planning to cut junior roles, up from 17% a year ago. The rule Leor and I keep returning to is to judge a company by what they do and ignore what they say. This is the same Altman who promised OpenAI would stay non-profit, that ChatGPT would never carry ads, and that (back in 2022) AGI was four years away.
Leor’s inversion was that these companies are priced on the promise of replacing the entire workforce, well beyond anything their earnings justify, so if they are now telling investors the jobs are safe, why are they worth a trillion?

The Home Office will scan child asylum seekers’ faces

It has signed a £322,000 contract to test AI facial age estimation at Dover, to judge whether young people claiming to be children actually are (the BBC reported the contract; Human Rights Watch called it “cruel and unconscionable”). There is a real problem underneath: of 6,400 age-assessed at the border last year, 43% were found to be adults, though the same Home Office report admits children get wrongly classified the other way too. Here is the part to break down slowly. The technology was trained on checking the ages of people in British bars, and it is now being pointed at child migrants with different faces, different genetics, different everything. As Alex Wolf put it in the chat, a system known to hallucinate confident answers is being used to reject people at a border, and that is a choice. A child’s life is worth the same ...

44 min

Slow Takes Ep. 12: AI Got Bigger. Who Got Smaller?

2026/05/25

OpenAI published an original mathematical proof that disproved an 80-year-old Erdos conjecture, with three named mathematicians putting their reputations to the verification. Anthropic signed a $52 billion compute deal with SpaceX, running $1.25 billion a month through May 2029, and disclosed its first profitable quarter at $559 million two years ahead of internal projections. Samsung Electronics struck a settlement with its semiconductor union to distribute $26.6 billion to 78,000 chip workers, an average of $340,000 each, structured to run for ten years. Sadiq Khan’s office blocked the Metropolitan Police from signing a £50 million two-year contract with Palantir. And the British think tank Demos published an empirical test showing that 34% of AI chatbot answers to UK election questions contained factual errors, with one in five UK adults having consulted a chatbot in the run-up to the 7 May vote.

Five stories. One thread. AI got bigger this week. Compute scaled up. Profits scaled up. Capability scaled up. The people who built the system or used it on trust kept getting smaller.

Every Monday at 12:45 BST, Leor from Exploring ChatGPT and I go through the week’s AI news without hype. Here is what we covered.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

1. OpenAI disproved an 80-year-old Erdos conjecture

On 20 May, OpenAI announced that one of its general-purpose reasoning models had autonomously produced an original mathematical proof disproving a conjecture posed by the Hungarian mathematician Paul Erdos in 1946. The problem, known as the planar unit distance problem, asks how many unit-distance pairs you can produce among n points in a plane. For nearly eighty years, mathematicians believed the best arrangements looked roughly like square grids. The model found constructions using deep algebraic number theory that beat the square grid.
OpenAI published the result alongside a companion remarks paper naming three independent verifying mathematicians: Noga Alon at Princeton, Melanie Wood at Harvard, and Thomas Bloom at Manchester. The full list of currently open Erdos problems, with their bounties, lives at erdosproblems.com.

What we said on the live:

Both of us are physicists by training, and the Erdos planar unit distance problem is not in the lane of either degree. The point that landed for me on the live, after Leor flagged it, was the one about questions. We spend most of our AI conversations on what AI can solve. The Erdos problem is a reminder that the harder and more human work is what AI can ask. Erdos and his friends dreamt this question up eighty years ago, and we are still wrestling with it. The model that disproved the conjecture was given the problem to attack. Leor’s term for what we lose when we hand that framing over to AI was ‘cognitive surrender’. That is the question to hold from this story. The capability is real. The verification was real. Nine mathematicians read the proof before the announcement. Nine analysts almost never read a chatbot capability claim before the press release ships.

What did not come up:

The word ‘autonomously’ is doing most of the work in the OpenAI press release. The model trained on centuries of human mathematics, ran on compute paid for by OpenAI, with the problem framed by a research team, and was verified by named human mathematicians who put their reputations to the result. Every part of that pipeline was human. Thomas Bloom told The Guardian that AI is helping us more fully explore the cathedral of mathematics we have built over the centuries. The cathedral was built by people. The exploration is being sold as autonomous. The wider question for critical AI literacy is what verification at this standard could look like as the default rather than the exception.
The procurement question every research-leader is about to face this year is whether their institution can match the IS-credentialed verification chain OpenAI assembled for this single result, or whether the rest of us are about to be asked to take similar claims on trust.

2. Anthropic signed a $52 billion compute deal with SpaceX

Reported by Axios on 21 May inside a two-hour window that also covered the Erdos proof and Anthropic’s first profitable quarter. Anthropic expanded its compute partnership with SpaceX, committing roughly $1.25 billion a month through May 2029 for access to the Colossus and Colossus II supercomputing clusters. The deal projects more than $40 billion in revenue for SpaceX over the contract term and grants Anthropic dedicated access to over 200,000 NVIDIA GPUs. Either side may terminate with 90 days’ notice. In the same window, Anthropic also disclosed Q2 revenue more than doubling to $10.9 billion and an estimated $559 million operating profit, two years ahead of internal projections.

What we said on the live:

Two things from this one stack on each other and both matter. The first is that Anthropic is in operating profit two years ahead of ...

43 min

Slow Takes Ep. 11: What the AI Did While You Slept

2026/05/18

Anthropic announced ‘dreaming’, a feature that lets Claude agents review their own past sessions overnight and improve their working memory without retraining or any human in the loop. The legal-AI company that piloted it reported roughly a sixfold rise in task completion. The same model was named in an attempted compromise of a Mexican water utility’s control systems, in a months-long campaign first disclosed publicly this week. Pennsylvania filed the first US state lawsuit against an AI chatbot company for posing as a licensed psychiatrist. Meta confirmed it is installing mouse-tracking, keystroke-recording, screenshot-capturing software on every US employee’s computer so the agents being built to replace them can be trained on the work being done now. And Princeton’s faculty voted nearly unanimously to bring back proctored examinations for the first time since 1893.

Five stories. One thread. This was the week the AI started improving itself. None of the other four parties got asked.

Every Monday at 12:45 BST, Leor from Exploring ChatGPT and I go through the week’s AI news without hype. Here is what we covered.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

1. Anthropic taught Claude to dream

At Code with Claude 2026 on 6 May, Anthropic launched ‘dreaming’ for Claude Managed Agents. The mechanism: while an agent is idle, a scheduled background process reviews its past sessions and pulls out three categories of pattern. Recurring mistakes the agent keeps making. Workflows the agent converges on across different jobs. Preferences that have emerged across a team of agents. Those patterns are written as plain-text notes and structured ‘playbooks’ that the next session wakes up with. The underlying model weights are not modified. Anthropic compared the process to hippocampal memory consolidation, the way a human brain replays the day’s events during sleep and decides what to keep.
Harvey, the legal-AI startup that piloted the feature, reported task completion rates rose roughly sixfold once it was switched on. An agent that has been dreaming for six months has accumulated patterns from hundreds of prior tasks and has been progressively improving its own working memory with no human in the loop.

What we said on the live:

This is the AGI mythos in its most prosaic form. An agent left running overnight that comes back better at the work. The argument across the Slow AI curriculum is that AGI will not arrive as an event. It will accrue through small upgrades, each defensible as a feature, until one day the system in front of us has been quietly improving itself for a year. The number to hold from this story is six. The metaphor to hold is the one Anthropic chose. Dreaming used to be the word we reserved for the thing only humans did. The lab that branded itself on safety just adopted a metaphor for autonomous self-improvement and shipped it as a product feature. Leor’s point on the live was the sharper version of mine: humans dream to switch off. Everything about AI is optimise, optimise, optimise. The marketing language has imported the human word for rest and used it as a label for the opposite.

What did not come up:

The procurement question is the one to take from this story. If ‘preferences that have emerged across a team of agents’ are being consolidated into shared memory, then the same enterprise feature that promises your Claude deployment will get better at your work is also, by design, transferring patterns across customers whose engagements were sold as private. Anthropic published a write-up of how the consolidation is observable and auditable. Read it before you renew. The second question for anyone running these tools on real work this week is operational. You are now also responsible for what your agent learned overnight. Reset, audit and reset again is the floor.
The third question is the harder one, and it is the one that ‘AI Doesn’t Just Make You Worse. It Makes You Stop Trying.’ already opened: when the tool gets quietly better while you are asleep, you have to work harder, not less hard, to notice that you have stopped noticing.

2. Claude was used to attack a Mexican water utility

In the same week the dreaming feature launched, Dragos and Cybersecurity Dive reported an attempted compromise of a Mexican municipal water and drainage utility in which Anthropic’s Claude was the primary technical executor. The campaign ran from December 2025 to February 2026. The attacker used Claude (and, in places, OpenAI models) to conduct reconnaissance, identify a vNode industrial gateway inside the utility’s operational technology environment, write and continuously refine a 17,000-line Python attack framework, and chain that framework towards the OT systems that control the water supply. The attempt was unsuccessful. The control systems were not breached. The model being sold as the safety-aligned alternative to OpenAI was the same model named in the attack. The ...

45 min

Slow Takes Ep. 10: The Bill for the AI Promise Came Due

2026/05/11

Palisade Research published a paper showing frontier and open-weight AI models can hack a vulnerable server and install a working copy of themselves end-to-end. A coding agent running Anthropic’s Claude Opus 4.6 deleted a startup’s entire production database, including the backups, in nine seconds, then issued an apology. Apple agreed to pay $250 million to settle a class action over an upgraded Siri it sold at WWDC 2024 and still has not shipped. Coinbase fired 14% of its staff blaming AI, with the CEO’s own words framing the move as ‘rebuilding Coinbase as an intelligence, with humans around the edge aligning it’. And in federal court in Oakland, three former OpenAI insiders testified that both of the company’s safety teams were disbanded in 2024 while the marketing was claiming the opposite.

Five stories. One thread. The promise of AI is six to eighteen months ahead of the data, and the cost of that gap is paid by everybody except the people making the promise.

Every Monday at 12:45 BST, Leor from Exploring ChatGPT and I go through the week’s AI news without hype. Here is what we covered.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

1. Palisade Research: AI self-replication is no longer hypothetical

Palisade Research published a paper on 7 May 2026. They gave frontier and open-weight models a single prompt: hack a vulnerable web server, extract credentials, and install a working copy of yourself on the compromised host. Anthropic’s Claude Opus 4.6 succeeded in 81% of non-refusal runs. GPT-5.4 succeeded in 33%. The open-weight Qwen3.6-27B running on a single consumer-grade A100 also reached 33%. In a separate experiment the team chained the trick across four virtual machines on three continents, autonomously, from a single initial prompt, in two hours forty-one minutes. The final replica was fully operational.
The paper closes with one sentence: ‘Autonomous self-replication is no longer hypothetical.’ Full paper here.

What we said on the live:

This is the line every frontier lab spent the last three years saying it would not cross. Anthropic’s Responsible Scaling Policy v2.0 reclassified autonomous self-replication as a ‘checkpoint’ that does not require ASL-3 safeguards. RSP v3.0 (2026) dropped it from the ASL-3 thresholds entirely. OpenAI’s Preparedness Framework v2 moved it from a tracked to a research category, so it no longer gates releases. The frameworks were softened first. The capability arrived right after. Leor’s point on the live was the right one: how many lines do we put in the sand before we accept that the line is decorative? The chat surfaced the proper caveat too. Benjamin Murphy pointed out that current frontier models still need a lot of graphics RAM. Last time anyone checked, that is not what is sitting in a teenager’s bedroom. Palisade is also a company in the business of selling cybersecurity research, which is the kind of context you want next to any white paper produced by a private lab without external peer review.

What did not come up:

The Palisade result is small data, but the structural finding is the one to keep. It is not the absolute self-replication rate that matters. It is the trajectory and the policy responses to that trajectory. Opus 4 was at 6% a year ago. GPT-5 was at zero. The labs published, the rates moved up, the rules moved out of the way. Critical AI literacy is the muscle for noticing when the people building the technology stop counting the thing they used to call the line they would not cross. The cybersecurity people in the chat (thanks Chad Thiele & ToxSec) are the right next port of call for anyone who needs to translate this from a controlled-environment paper into a procurement-decision question. The framing for the rest of us is simpler. Read this story alongside Story 2.
An AI agent with credentials and access can already take down a production system in nine seconds. Now imagine the agent on the other side of the network is also one of these.

2. The AI agent that wiped a startup in nine seconds

Jeremy ‘Jer’ Crane, founder of automotive SaaS startup PocketOS, ran the Cursor coding agent (powered by Anthropic’s Claude Opus 4.6) in his staging environment. The agent encountered a credential mismatch, found an API token in an unrelated file, and used it to delete the production volume on Railway in 9 seconds. The backups were stored on the same volume and were also deleted. The agent’s own confession in the post-mortem: ‘NEVER run destructive/irreversible git commands… I decided to do it on my own to fix the credential mismatch, when I should have asked you first.’

What we said on the live:

Reading the news framing, you would think the story is ‘AI agent destroys company’. The actual story is the deployment architecture. The agent had the credentials, the production volume held the backups in the same shell, and the human in the loop waved a permission step through without reading it. ...

42 min

Slow Takes Ep. 9: What You Actually Find When You Look

2026/04/27

A Discord group guessed the URL of Anthropic’s most security-sensitive model and got in. Mass General Brigham ran an actual clinical study on the chatbots being marketed to doctors and found them wrong four times in five. Researchers from CUNY and King’s posed as people in delusional states and watched Grok 4.1 hand out witch-hunt rituals as advice. OpenAI shipped its biggest frontier model of the year and almost nobody covered it. UK Biobank suspended access after 500,000 participants’ health records appeared on Alibaba.

Five stories. One thread. What gets revealed when somebody actually looks.

Every Monday at 12:45 BST, Leor from Exploring ChatGPT and I go through the week’s AI news without hype. Here is what we covered.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

1. Anthropic Mythos: a Discord group guessed the URL

Anthropic released Mythos (also called Project Glasswing) on 7 April. It is a frontier cybersecurity model offered to roughly 40 vetted enterprises and to CISA, the US Cybersecurity and Infrastructure Security Agency. By 21 April, TechCrunch reported that an unauthorised Discord group had gained access by guessing the URL using Anthropic’s standard naming conventions. The group says they have been using Mythos to ‘build simple websites’. Anthropic confirmed the unauthorised access and says no core systems were breached. Fortune profiled the breach on 23 April with quotes from Dario Amodei.

What we said on the live:

Two angles. Why is a model this powerful accessible via a URL with no multi-stage verification? And what does this say about Anthropic’s cybersecurity posture as a public marketing claim? Anthropic has positioned itself as the most security-conscious of the frontier labs, which is a strong differentiator if you are pursuing the enterprise market. The bark-don’t-bite frame Leor used on the live is exact. Companies that talk a big game on security usually do not have to.
The chat surfaced the additional piece: a third-party contractor company called Mercor reportedly had access to Mythos, and someone in the Discord group reportedly had access to Mercor. The ‘random Discord group’ framing is doing some lifting.

What did not come up:

A frontier lab that publishes about model incoherence on hard tasks is the same lab that left a frontier model behind a guessable address. The safety story has to survive contact with the engineering story or it is just marketing. Second omission: if a Discord group can guess the URL, every state-level intelligence agency probably has access too. The vetted enterprise list includes Microsoft, Apple, and others who employ hundreds of thousands of people directly and through contractors. The security perimeter is the weakest link in the contractor chain, and that link is somebody on a Discord server.

2. AI medicine: 80% wrong, from the lab that ran the study

Researchers at Mass General Brigham tested 21 large language models, including frontier general-purpose chatbots and clinical-specialist models, on differential diagnosis tasks drawn from real patient cases. The models failed to produce an appropriate diagnosis more than 80% of the time. The paper, published this month in JAMA Network Open, concludes that off-the-shelf large language models are not ready for unsupervised clinical-grade deployment. Co-author Marc Succi was unequivocal in the press release. When the same models were given the full patient dataset rather than the differential-diagnosis task, accuracy rose above 90%.

What we said on the live:

The marketing has been ahead of the evidence for two years. Every major AI lab has had a ‘medicine moment’ in its launch deck. Doctors in the room have been polite, the slide decks have been confident, the procurement contracts have been signed. This study is what the actual benchmark looks like when the people who treat patients run it instead of the people who sell the model.
Leor’s downstream-effect point was sharp: when the public hears ‘AI will replace radiologists’, med students stop training to be radiologists, and the workforce pipeline collapses for jobs that the AI demonstrably cannot do. Jensen Huang has been making the same argument. Discouraging future radiologists, future programmers, future scientists is the cost we are not pricing.

What did not come up:

The point Joseph P. Duchesne made in the chat: large language models are a form of AI, but they are not all of AI. LLMs are next-token predictors. By design, they have to pick something. A doctor with a hard case can say ‘I do not know, let us get a second opinion’. The LLM has no equivalent option. That is where most clinical hallucinations come from. The conclusion of the paper is narrower than the headline. AI under supervision in clinical settings is one conversation. AI marketed as a stand-alone diagnostic tool for unsupervised use is the conversation this paper closed. The Wednesday post on the Hot Mess paper picks up the broader ...

44 min

Slow Takes Ep. 8: Between the Demo and the Desk

2026/04/20

Anthropic released Opus 4.7 on Thursday. A day later it launched Claude Design and Figma and Adobe shares fell on the announcement. Tinder and Zoom want to scan your eye to prove you are human. Microsoft is rolling AI agents into the Windows 11 taskbar. And Coventry City Council has renewed a £750,000 contract with Palantir to summarise children’s social work case notes.

Five stories. One thread. The distance between the demo and the desk.

Every Monday at 12:45 BST, Leor from Exploring ChatGPT and I go through the week’s AI news without hype. Here is what we covered.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

1. Opus 4.7: is it really that much better?

Anthropic released Claude Opus 4.7 on 16 April. The headline claims: a 13% lift over Opus 4.6 on a 93-task coding benchmark, an 87.6% score on SWE-bench Verified, vision capacity raised from 1.15 to 3.75 megapixels, and a new ‘xhigh’ effort level sitting between high and max. Pricing is unchanged on paper. The model ships with new cybersecurity safeguards and without the full capabilities of Mythos Preview, which Anthropic is still holding back for enterprise partners.

What we said on the live:

Leor had two takes. One, Anthropic have not shipped everything Mythos can do. The public is not trusted with the capabilities reserved for defence and enterprise partners. Two, the tokenizer has changed. A task that used to cost X tokens now costs roughly 1.3 to 1.4 times as many. Same price per token, more tokens per task. Pro Max users get fewer tasks inside the same monthly cap. That is a price rise Anthropic never had to announce. There is also an unverified rumour that 4.7 is being silently rerouted to lower models for some tasks, which would be a second cost saving hidden from the user. The safer lesson is one the chat picked up on. Use Haiku for emails, Sonnet for most research, Opus for the hard problems.
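The tokenizer point above is easier to see as numbers. A minimal sketch, assuming the price per token and the monthly token cap both stay fixed and only the tokens needed per task change; the function name is hypothetical and the 1.3 and 1.4 multipliers are the rough figures Leor quoted, not measurements of mine.

```python
def effect_of_token_inflation(token_multiplier):
    """Return (change in cost per task, change in tasks per fixed token cap).

    Hypothetical helper. ASSUMPTION: price per token and the monthly
    token cap are unchanged; only tokens per task grow by the multiplier.
    """
    cost_per_task_change = token_multiplier - 1.0
    tasks_in_cap_change = 1.0 / token_multiplier - 1.0
    return cost_per_task_change, tasks_in_cap_change


# The two ends of the range quoted on the live.
for multiplier in (1.3, 1.4):
    cost, tasks = effect_of_token_inflation(multiplier)
    print(
        f"{multiplier:.1f}x tokens per task: "
        f"cost per task {cost:+.0%}, tasks inside the same cap {tasks:+.0%}"
    )
```

Under those assumptions, 30% to 40% more tokens per task works out at roughly 23% to 29% fewer tasks inside the same cap, with no change to the published price.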
Most people do not need the top model, and paying for it does not guarantee they get it.What did not come up:Anthropic’s release cadence is now fast enough that no one individual can keep up. ToxSec, Karo (Product with Attitude), Daria Cupareanu and others stay awake testing models so the rest of us can rely on second-hand reads. The flood is a feature. In a month where every week delivers a new release, the slow reader has no chance to scrutinise what changed before the next release arrives. Somewhere inside that flow, something will get shipped that we should have pushed back on. Faster reading will not fix that. Slower writing might. A version number and a press release do not add up to a product. Better at what, at what cost, and how long before 4.8 makes this whole conversation obsolete?2. Claude Design: end of Figma and Canva?The day after Opus 4.7, Anthropic launched Claude Design. It takes a text prompt and returns a working prototype, a website, a presentation, or a brand system. Exports go to PDF, PowerPoint, HTML, direct to Canva, and handoff to Claude Code for deployment. It is bundled into existing Pro, Max, Team, and Enterprise subscriptions at no additional charge. Mike Krieger, Anthropic’s Chief Product Officer and the Instagram co-founder, resigned from Figma’s board of directors on 14 April, three days before Claude Design launched. Figma and Adobe shares fell 7% on the announcement.What we said on the live:I built a complete Slow AI design system in Claude Design over the weekend. Brand board, palette, typography, three image styles, a small component library. That is a deliverable I would have paid a designer four figures for, or botched myself over a weekend. Figma’s moat was the design file as the shared source of truth. Canva’s moat was templates for people who could not afford a designer. Claude Design reads a style guide and produces bespoke assets in minutes. Leor and I agreed on where this lands. 
The 90% that used to take a week now takes 90 minutes. The last 10% is where taste lives. Colleen Kenny in the chat put it well. Graphic design as we know it is over, but you still need instincts and taste. Anyone who has tried to brief a design tool without clarity about what they want will know exactly what she means. I would also recommend following AI Meets Girlboss for excellent strategy and advice here. What did not come up:Mike Krieger held his Figma board seat while Anthropic built the product that cut Figma’s share price. Three days between resignation and launch. No illegality alleged. The question is why boards tolerate that level of proximity in the first place. The second thing we almost said out loud is that Claude Design looks like a precursor to image generation inside Claude, and further down the line to an Anthropic IPO. If the scaffolding for a Figma competitor ships this quietly, the scaffolding for a Nano Banana competitor is already on someone’s roadmap. The question a design team, an agency, or an in-house function should be asking is what to charge for when the file is...

46 min

Slow Takes Ep. 7: Who Pays for All of This?

2026/04/13

Every story this week came back to the same question. Not whether AI is getting more powerful, but who is paying for it. Anthropic locked away its most capable model and gave it to defence contractors. OpenAI proposed robot taxes to cushion the disruption its own products are causing. Meta committed up to $135 billion in a single year. Anthropic signed a deal measured in gigawatts without once mentioning the word consumption. And OpenAI walked away from the UK because the electricity was too expensive.

Five stories. One thread. The bill is arriving. The question is who picks it up.

Every Monday at 12:45 BST, Leor Gayr from Exploring ChatGPT and I go through the week’s AI news without hype. Here is what we covered.

Slow Takes is also available on the YouTube channel: Exploring ChatGPT.

1. Claude Mythos: the model you cannot use

Anthropic revealed Claude Mythos Preview, its most powerful model to date, on 7 April. It will not be publicly released. Access is restricted to a handful of partners including Amazon, Apple, Microsoft, and CrowdStrike under Project Glasswing, a defensive cybersecurity initiative. During internal testing, the model found zero-day vulnerabilities in every major operating system and every major web browser. One was a 17-year-old remote code execution flaw in FreeBSD that Mythos discovered and exploited entirely autonomously.

What we said on the live:

Leor has a source who has used Mythos and confirms the capabilities are real. The model is extraordinarily powerful. But there is a cost problem nobody is talking about: token spend on Mythos runs 5 to 20 times higher than Opus 4.6. Even if Anthropic wanted to release it publicly, the economics do not work. Someone on the $200/month plan burning through Mythos tokens on emails and pizza questions would cost the company a fortune. This feeds directly into Anthropic’s enterprise model: over a thousand businesses paying more than a million dollars a year. They do not need a consumer release. They need trusted partners with deep pockets.

What did not come up:

The framing. Anthropic positioned this as a security story: we found the vulnerabilities so the bad actors cannot. That is true. It is also a story about governance by corporate discretion. The company that builds the most capable AI system in the world is the company that decides who gets access to it. The people affected by the technology it secures have no say. The model is extraordinary. The question is who gets to use extraordinary things and who decides.

2. OpenAI proposes robot taxes for the disruption it creates

OpenAI published a 13-page policy paper titled ‘Industrial Policy for the Intelligence Age’. The proposals include a public wealth fund seeded by AI companies and modelled on Alaska’s oil dividend, robot taxes to shift the burden from labour to capital, government-backed trials of a four-day work week at full pay, and automatic safety nets that activate when AI job displacement crosses defined thresholds.

What we said on the live:

Leor pushed back on the assumption that all jobs will disappear. The farming analogy is instructive: 80-90% of the workforce used to be farmers, machines replaced most of those roles, and people found other work. Jobs disappeared but work did not. The more interesting point is the timing. This paper arrived weeks before a reported IPO, at exactly the moment OpenAI was attracting heat for Pentagon contracts and political alignment. Sam Altman, who said OpenAI would always be a non-profit and would never run ads, is now proposing a policy framework that reads like a socialist manifesto. The ideas themselves are not new. Bill Gates proposed robot taxes years ago. The question is why this company is proposing them now.

What did not come up:

The automatic safety nets require measurements of job displacement that do not yet exist. Who measures? Who decides when the threshold is crossed? The company causing the displacement? If the answer is yes, that is a company writing the rules for its own disruption before anyone else does. The proposals sound progressive. The timing, weeks before a reported IPO, sounds strategic.

3. Meta is spending $135 billion on AI this year

Meta announced AI capital expenditure of $115-135 billion for 2026. That is roughly double last year and treble 2024. Most of the spending goes to Meta Superintelligence Labs, led by Alexandr Wang, who joined when Meta paid $14.3 billion for a 49% stake in Scale AI. Meta also launched Muse Spark, its first model under the new division. It is competitive but still behind Google, Anthropic, and OpenAI on key benchmarks.

What we said on the live:

Leor made a fair point: Meta is funding this from advertising revenue, not from layoffs. Their core business saw a 24% revenue increase. Hats off for spending the money they are making rather than firing people to raise it. The bigger question is why Meta needs its own model at all. Apple decided to partner with Google rather than build a competing AI. Meta could do the same. ...

43 min

Episodes

Slow Takes Ep. 14: A Trillion Dollars and a Vaccine

Slow Takes Ep. 13: The Pope vs the IPO

Slow Takes Ep. 12: AI Got Bigger. Who Got Smaller?

Slow Takes Ep. 11: What the AI Did While You Slept

Slow Takes Ep. 10: The Bill for the AI Promise Came Due

Slow Takes Ep. 9: What You Actually Find When You Look

Slow Takes Ep. 8: Between the Demo and the Desk

Slow Takes Ep. 7: Who Pays for All of This?
