为了测试这个新模型的理解极限,他随手甩出了一道极其刁钻的测试题:「给我画一张设定在古威尼斯的《寻找沃尔多(Where’s Waldo)》,但里面要找的不能是人,得是一只穿着蓝色条纹飞行服的水獭。」
I completely ignored Anthropic’s advice and wrote a more elaborate test prompt based on a use case I’m familiar with and therefore can audit the agent’s code quality. In 2021, I wrote a script to scrape YouTube video metadata from videos on a given channel using YouTube’s Data API, but the API is poorly and counterintuitively documented and my Python scripts aren’t great. I subscribe to the SiIvagunner YouTube account which, as a part of the channel’s gimmick (musical swaps with different melodies than the ones expected), posts hundreds of videos per month with nondescript thumbnails and titles, making it nonobvious which videos are the best other than the view counts. The video metadata could be used to surface good videos I missed, so I had a fun idea to test Opus 4.5:
。爱思助手下载最新版本对此有专业解读
Nature, Published online: 25 February 2026; doi:10.1038/d41586-026-00599-5
目前,已有不少媒体机构与个人创作者开始尝试这一功能,发布内容多集中于深度故事、社会新闻以及个人成长等相对长周期、非即时性的叙事方向。