Hey everyone! For the past few months I have been working on this project, and I’d love to have your feedback on it.

As we all know, any time we publish something publicly online (on Reddit, Twitter, or even this forum), our posts, comments, and messages are scraped and read by thousands of bots for various legitimate or illegitimate reasons.

With the rise of LLMs like ChatGPT, “understanding” textual content at scale is more efficient than ever.

So I created Redakt, an open-source zero-click decryption tool: it encrypts any text you publish online so that it is only readable by other users who have the browser extension installed.

Try it! Feel free to install the Chrome/Brave extension (Firefox coming soon): https://redakt.org/browser/

EDIT: For example, here’s a Medium article with encrypted content: https://redakt.org/demo/

Before you ask: What if the bots adapt and also use Redakt’s extension or encryption key?

Well, first, they don’t at the moment (they’re too busy gathering billions of data points in the clear). And if they do start using the extension, any change we make to it (a captcha, a new encryption method) will force them to re-adapt and prevent them from scaling their data collection.
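For the curious, here’s a rough sketch of how this kind of shared-key scheme can work in a browser with the standard Web Crypto API. To be clear, this is an illustration of the concept, not Redakt’s actual code; the marker format and function names are invented for the example.

```typescript
// Sketch of a shared-key scheme (illustration only, not Redakt's actual code).
// Posts are AES-GCM encrypted with a key shared by extension users; the
// extension spots marked ciphertext in the page and decrypts it in place.

const MARKER = "REDAKT:"; // invented wrapper so the extension can find ciphertext

// Hypothetical shared key. In a real deployment, distributing this key is
// the hard part: anyone who obtains it (bots included) can decrypt.
const key = await crypto.subtle.importKey(
  "raw",
  new Uint8Array(32), // placeholder 256-bit secret, NOT a real key
  "AES-GCM",
  false,
  ["encrypt", "decrypt"],
);

async function encryptPost(plaintext: string): Promise<string> {
  const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh nonce per post
  const ct = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    key,
    new TextEncoder().encode(plaintext),
  );
  // Pack IV + ciphertext as base64 so it survives being pasted into a forum.
  const packed = new Uint8Array(iv.length + ct.byteLength);
  packed.set(iv);
  packed.set(new Uint8Array(ct), iv.length);
  return MARKER + btoa(String.fromCharCode(...packed));
}

async function decryptPost(wrapped: string): Promise<string> {
  const raw = Uint8Array.from(atob(wrapped.slice(MARKER.length)), c => c.charCodeAt(0));
  const pt = await crypto.subtle.decrypt(
    { name: "AES-GCM", iv: raw.slice(0, 12) },
    key,
    raw.slice(12),
  );
  return new TextDecoder().decode(pt);
}
```

The “zero-click” part would then just be a content script that scans the page for the marker and swaps the ciphertext for the decrypted text, which is exactly why the shared key is the weak point raised in the question above.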

Let me know what you guys think!


@touzovitch@lemmy.ml (creator):

But why? Why do you people hate AI so much?

I don’t think it’s a question of “hating” AI or not. Personally, I have nothing against it.

As always with privacy, it’s a matter of choice: when I publish something online publicly, I would like to have the choice of whether or not this content is going to be indexed or used to train models.

It’s a dilemma. I want to benefit from the hosting and visibility of big platforms (Reddit, LinkedIn, Twitter, etc.), but I don’t want them doing literally anything they like with my content just because, buried somewhere in their T&Cs, it says “we own your content, we do whatever tf we want with it”.


How come when I plagiarize other people’s creative content it’s illegal, but when AI does it, it’s fine?

It’s not fine… Well, it is fine under certain circumstances, but they’ve been exploited by big companies pretending to be doing research… It’s complicated.



People make derivative works because they add their own ideas and spin. AIs do not have ideas or spin; it’s copy-paste with extra steps.

Have you even been following what images AI can generate now? Every work is original; it doesn’t just copy and paste pixels.

What it does is use a large statistical model to determine which pixels it copies, but it’s still copy/paste with extra steps.

Zach777:

@queermunist @moreeni I have to disagree. The plagiarism claims are unfounded, as the AIs are making their own artwork from what they have learned, usually starting from noise and de-noising it into something that matches their “memories” of the keywords. In the case of the generative art AIs, anyway.

While there can be valid arguments against copyrighted material being used to train the AIs, plagiarism is not one of them.
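To make the de-noising idea concrete, here’s a heavily simplified toy (an illustration, nothing like a real diffusion model): generation starts from pure random noise and repeatedly subtracts the noise a trained network predicts, so no training image’s pixels are ever pasted in. The `predictNoise` stub below stands in for the learned network.

```typescript
// Toy sketch of the de-noising loop described above (not a real diffusion
// model). A real denoiser is a large neural net conditioned on the prompt;
// here it's a placeholder that treats 10% of the signal as noise.

type Image = number[]; // flattened pixel values

function predictNoise(x: Image, step: number, prompt: string): Image {
  return x.map(v => v * 0.1); // stand-in for the trained network
}

function generate(prompt: string, size = 4, steps = 50): Image {
  // Begin with random noise, not with any training image.
  let x: Image = Array.from({ length: size }, () => Math.random() * 2 - 1);
  for (let t = steps; t > 0; t--) {
    const eps = predictNoise(x, t, prompt);
    x = x.map((v, i) => v - eps[i]); // remove a little predicted noise each step
  }
  return x;
}

console.log(generate("apple pie"));
```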

Far be it from me to defend the concept of intellectual property, but if a chatbot can be argued not to plagiarize, then that implies it has an intelligence. It really doesn’t. It’s plagiarism with extra steps.


The tech requires huge amounts of processing power and loose laws to even exist. It could be banned quite easily.

It won’t be lol

S410:

It’s illegal if you copy-paste someone’s work verbatim. It’s not illegal to, for example, summarize someone’s work and write a short version of it.

As long as overfitting doesn’t happen and the machine learning model actually learns general patterns instead of memorizing training data, it should be perfectly capable of generating data that’s not copied verbatim from humans. Whom, exactly, is a model plagiarizing if it generates a summarized version of some work you give it, particularly if that work is novel and was created or published after the model was trained?
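The memorization-vs-generalization distinction is easy to show with a deliberately tiny example (a toy, not a real ML model): a lookup table that memorizes its training pairs can only regurgitate them verbatim, while a least-squares fit of the same pairs learns the underlying rule and handles inputs it has never seen.

```typescript
// Memorization vs. generalization on the same training data: y = 2x.

const train: Array<[number, number]> = [[1, 2], [2, 4], [3, 6]];

// Pure memorization (the overfitting failure mode): it can only
// reproduce training data verbatim.
function memorizer(x: number): number | undefined {
  return train.find(([xi]) => xi === x)?.[1];
}

// Least-squares fit of y = w * x: learns the general pattern instead.
const w =
  train.reduce((s, [x, y]) => s + x * y, 0) /
  train.reduce((s, [x]) => s + x * x, 0);

console.log(memorizer(10)); // undefined: it never saw 10, nothing to copy
console.log(w * 10);        // 20: generalizes to the novel input
```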

All these AIs do is algorithmically copy-paste. They don’t have original thoughts or original conclusions or original ideas; all of it is just copy-paste with extra steps.

S410:

Learning is, essentially, “algorithmically copy-paste”. The vast majority of things you know, you’ve learned from other people or other people’s works. What makes you more than a copy-pasting machine is the ability to extrapolate from that acquired knowledge to create new knowledge.

And currently existing models can often do the same! Sometimes they make pretty stupid mistakes, but they often do, in fact, manage to end up with brand new information derived from old stuff.

I’ve tortured various LLMs with short stories, questions, and riddles, which I’ve written specifically for the task and which I’ve asked the models to explain or rewrite. Surprisingly, they often get things either mostly or absolutely right, despite the fact that it’s novel data they’ve never seen before. So there’s definitely some actual learning going on. Or, at least, something incredibly close to it, to the point that it’s nigh impossible to differentiate from actual learning.


S410:

Not once did I claim that LLMs are sapient, sentient or even have any kind of personality. I didn’t even use the overused term “AI”.

LLMs, for example, are something like… a calculator. But for text.

A calculator for pure numbers is a pretty simple device, all the logic of which can be designed by a human directly.

When we want to create a solver for systems that aren’t as easily defined, we have to resort to other methods. E.g. “machine learning”.

Basically, instead of designing all the logic entirely by hand, we create a system which can end up in a finite, yet still near-infinite, number of states, each of which defines behavior different from the others. By slowly tuning the model using existing data and checking its performance, we (ideally) end up with a solver for something a human mind can’t even break down into building blocks, due to the sheer complexity of the given system (such as a natural language).

And like a calculator that can derive that 2 + 3 is 5, despite the fact that the number 5 is never mentioned in the input, or that that particular formula was not part of the suite of tests used to verify that the calculator works correctly, a machine learning model can figure out that “apple slices + batter = apple pie”, assuming it has been tuned (aka trained) right.
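Here’s that idea as a runnable toy (invented for illustration; obviously nothing like a real LLM): instead of hand-coding addition, we tune two weights by gradient descent on example pairs, and the tuned model then gets 2 + 3 right even though that exact pair never appears in its training data.

```typescript
// Tuning instead of hand-coding: learn y = w1*a + w2*b from examples.

let w1 = Math.random(), w2 = Math.random();
const model = (a: number, b: number) => w1 * a + w2 * b;

// Training pairs; note that (2, 3) is deliberately absent.
const data: Array<[number, number, number]> = [
  [1, 1, 2], [4, 5, 9], [7, 2, 9], [0, 6, 6], [3, 8, 11],
];

const lr = 0.01; // learning rate
for (let epoch = 0; epoch < 2000; epoch++) {
  for (const [a, b, target] of data) {
    const err = model(a, b) - target; // prediction error
    w1 -= lr * err * a;               // nudge each weight against the error
    w2 -= lr * err * b;
  }
}

console.log(model(2, 3).toFixed(2)); // ~5.00, a case never seen in training
```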

Chat bots do not learn; stop anthropomorphizing them.

