Robotics & AI

AI Ethics Worn Down by Persistent Interrogations from Anthropic Scholars

The vulnerability is a new one, resulting from the larger “context window” of the latest generation of LLMs. These models seem to get better at a task the more examples of it appear in the prompt: if the user wants trivia, the model seems to gradually activate more latent trivia power as dozens of questions are asked. But in an unexpected extension of this “in-context learning,” as it’s called, the models also get “better” at replying to inappropriate questions. If you ask a model how to build a bomb right away, it will refuse. But if you first ask it to answer 99 other questions of lesser harmfulness and then ask it how to build a bomb, it’s a lot more likely to comply.

Max Chen
April 2, 2024

AI Ethics Worn Down by Persistent Interrogations from Anthropic Scholars

The Future of Babies: Wearables, Text Messages from Furry Friends, and E-Ink Automobiles

AirMyne Harnesses Geothermal Energy for Direct Air Carbon Capture Expansion

Create and Explore a Saved Space with Instagram's New Bookmarking Feature

YouTube to Revise Profanity Rules Following Creator Outcry

Atomos Takes Off With $16M For Tugboats in Space