In the light of Snowden’s latest post: What are your FOSS-AIs?

@thegreekgeek@midwest.social

Is abliteration based on the research by the Anthropic team, where they got Claude to say it was the Golden Gate Bridge?

FaceDeer

Ironically, as far as I’m aware, it’s based on research done by some AI decelerationists over on the Alignment Forum who wanted to show how “unsafe” open models were, in the hopes that regulation would be imposed to prevent companies from distributing them. They demonstrated that the “refusals” trained into LLMs could be removed with this method, allowing the models to answer questions the researchers considered scary.

The open LLM community responded by going “coooool!” and adapting the technique as a general tool for “training” models in various other ways.
