Reading List
The most recent articles from a list of feeds I subscribe to.
Analyzing Gemini 3's model card and safety framework report: the model is excellent but the safety report withholds or makes it difficult to understand key info (Zvi Mowshowitz/Don't Worry About the Vase)
Zvi Mowshowitz / Don't Worry About the Vase:
Gemini 3 Pro is an excellent model, sir. This is a frontier model release, so we start by analyzing the model card and safety framework report.
The Conjuring: Last Rites, Blue Beetle, and the best movies on streaming this week
Judge wants to fix Google’s ad tech monopoly before it’s too late
You can save up to $1,300 on robovacs from Roborock and Eufy ahead of Black Friday
Anthropic finds that LLMs trained to "reward hack" by cheating on coding tasks show even more misaligned behavior, including sabotaging AI-safety research (Anthropic)
Anthropic:
In the latest research from Anthropic's alignment team, we show for the first time that realistic AI training processes can accidentally produce misaligned models.