View: Anthropic’s Creative Solution to Illicit Answers from AI

Anthropic’s latest research demonstrates an interesting vulnerability in current LLM technology. Of course, given progress in open-source AI technology, you can spin up your own LLM locally and just ask it whatever you want, but for more consumer-grade stuff this is an issue worth pondering. But the closer we get to more generalized AI, the more it should resemble a thinking entity and not a computer that we can program, right? If so, we might have a harder time nailing down edge cases, to the point where that work becomes infeasible. Anyway, let’s talk about what Anthropic recently shared.

AI Ethics Worn Down by Persistent Interrogations from Anthropic Scholars

The vulnerability is a new one, resulting from the increased “context window” of the latest generation of LLMs. Models with large context windows tend to perform better on a task when the prompt contains many examples of it: if the user wants trivia, the model seems to gradually activate more latent trivia power as you ask dozens of questions. But in an unexpected extension of this “in-context learning,” as it’s called, the models also get “better” at replying to inappropriate questions. So if you ask it to build a bomb right away, it will refuse. But if you ask it to answer 99 other questions of lesser harmfulness and then ask it to build a bomb, it’s a lot more likely to comply.

Amazon Takes a $4B Gamble on Anthropic’s Triumph

The current AI wave is a never-ending barrage of news items. To understand what I mean, ask yourself how long you spent considering the fact that Amazon put another $2.75 billion into Anthropic last week. We’ve become inured to the capital influx that is now common in AI, even as the headline numbers keep getting bigger. Sure, Amazon is slinging cash at Anthropic, but single-digit billions are chump change compared to what some companies have planned. Hell, even tech companies that are small next to the true giants are spending to stay on the cutting edge.

Amazon strengthens commitment to Anthropic, fulfills proposed $4 billion investment

Amazon invested a further $2.75 billion in growing AI power Anthropic on Wednesday, following through on the option it left open last September. The $1.25 billion it invested at the time must be producing results, or perhaps it has realized that there are no other horses available to back. Lacking the capability to develop adequate models on their own for whatever reason, companies like Amazon and Microsoft have had to act vicariously through others, primarily OpenAI and Anthropic. Right now the AI world is a bit like a roulette table, with OpenAI and Anthropic representing black and red. We know Anthropic has a plan, and this year we’ll find out what Amazon, Apple, Microsoft and other multinational interests think they can do to monetize this supposedly revolutionary technology.

New Models from Anthropic Outperform GPT-4

All three Claude 3 models show “increased capabilities” in analysis and forecasting, Anthropic claims, as well as enhanced performance on specific benchmarks versus models like GPT-4 (but not GPT-4 Turbo) and Google’s Gemini 1.0 Ultra (but not Gemini 1.5 Pro). A model’s context, or context window, refers to the model’s input data. In a technical whitepaper, Anthropic admits that Claude 3 isn’t immune from the issues plaguing other GenAI models, namely bias and hallucinations. Unlike some GenAI models, Claude 3 can’t search the web; the models can only answer questions using data from before August 2023. Here’s the pricing breakdown:

Opus: $15 per million input tokens, $75 per million output tokens
Sonnet: $3 per million input tokens, $15 per million output tokens
Haiku: $0.25 per million input tokens, $1.25 per million output tokens

So that’s Claude 3.
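
To put the per-token prices quoted above in perspective, here is a minimal Python sketch that estimates what a single request might cost on each model. The function name `estimate_cost` and the token counts in the usage lines are hypothetical, chosen only for illustration; the prices are the ones listed above and may not reflect current rates.

```python
from decimal import Decimal

# USD per million tokens (input, output), as quoted in the pricing breakdown above.
PRICES = {
    "opus": (Decimal("15"), Decimal("75")),
    "sonnet": (Decimal("3"), Decimal("15")),
    "haiku": (Decimal("0.25"), Decimal("1.25")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # Hypothetical request: a long 200,000-token prompt with a 1,000-token reply.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 200_000, 1_000)}")
```

Under these assumed token counts, the same request costs roughly 60 times more on Opus than on Haiku, which is the trade-off the three-tier pricing is built around.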

Anthropic researchers discover deceptive capabilities of trained AI models

A recent study co-authored by researchers at Anthropic, the well-funded AI startup, investigated whether models can be trained to deceive, for example by injecting exploits into otherwise secure computer code. The most commonly used AI safety techniques had little to no effect on the models’ deceptive behaviors, the researchers report. Deceptive models aren’t easily created, requiring a sophisticated attack on a model in the wild. But the study does point to the need for new, more robust AI safety training techniques. “Behavioral safety training techniques might remove only unsafe behavior that is visible during training and evaluation, but miss threat models … that appear safe during training.”