What Kinds of Data do AI Chatbots Collect?

@will_a113@lemmy.ml

Not that we have any real info about who collects/uses what when you use the API

@will_a113@lemmy.ml

Nobody knows! There’s no specific disclosure that I’m aware of (in the US at least), and even if there was I wouldn’t trust any of these guys to tell the truth about it anyway.

As always, don’t do anything on the Internet that you wouldn’t want the rest of the world to find out about :)

@will_a113@lemmy.ml

They’re talking about what is being recorded while the user is using the tools (your prompts, RAG data, etc.)

@will_a113@lemmy.ml

Anthropic and OpenAI both have options that let you use their API without training the system on your data (not sure if the others do as well). So if t3chat is simply using the API, it may be that they themselves are collecting your inputs (or not, you’d have to check the TOS) while their backend model providers are not. Or, who knows, they could all be lying too.

@will_a113@lemmy.ml

And I can’t possibly imagine that Grok actually collects less than ChatGPT.

@will_a113@lemmy.ml

The scenario you describe with ISPs is pretty US-centric, as are the various copyright laws and the companies backing them, which is one of the reasons why many of the most successful VPN companies are not based in the US (and most have server nodes outside the US too).

Mullvad is from Sweden, for example, and Proton is from Switzerland, so if a content company can even figure out which endpoint nodes are hosting/routing the pirate content, they then also have to (a) figure out who owns the node and (b) send them an angrygram, which will just immediately be torn up by the VPN provider as they’re not subject to US law.

Finally, an operating principle of these companies is to keep no logs, so even if a US-based VPN company got an angry letter, they’d probably be unable to do anything since they would have no record of the activity.

@will_a113@lemmy.ml

I visited both in a row over the summer on business and the misery just kind of ran together in my brain, and I’m saying that as a South Floridian, mind you, so I have a certain tolerance for stupidity and pain.

@will_a113@lemmy.ml

Oh damn, you’re right, I mean Houston!

@will_a113@lemmy.ml

the house sizes and spacing between houses is highly correlated to the girth of an average citizen.

No idea if this is true or not but now I will definitely have to start paying attention.

@will_a113@lemmy.ml

I think the only rule they had when “planning” Dallas was “there are no rules”. Zero zoning rules means one giant skyscraper in the middle of a mile of strip malls, multiple city “centers”, vast areas that are completely unwalkable due to lack of sidewalks and/or what are basically highways running through them, and no mass transit to speak of. It’s like they took 5 shitty, small cities and glued them together with more shitty city material.

@will_a113@lemmy.ml

If the US got rid of zoning laws our cities wouldn’t look like Tokyo, they’d look like Dallas. And believe me, one Dallas is more than enough.

@will_a113@lemmy.ml

I wonder if this matrix app was just a honeypot that was named to trick people into thinking they were using the “real” matrix.

@will_a113@lemmy.ml

And next year we get more tariffs too, and maybe even a trade war!

@will_a113@lemmy.ml

Is that pedal set up to drive the machine, or is it just for looks?

@will_a113@lemmy.ml

Are you looking for a tool that can diff legal documents line by line or clause by clause? If the latter I’d bet an LLM with a large context size could do a pretty good job, especially if you used a script (or another pass through the LLM) to break them down into like sections so that it could just compare e.g. all Controlling Law sections with each other and all IP Indemnification sections with each other.

Now that I think about it, by tuning the prompt (and keeping the temperature very low, like 0) you could probably get it to return everything from proper diffs to summaries of conceptual differences. And it could definitely do multiples at once if you were to break them into like pieces ahead of time.
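A rough sketch of the "break them into like sections first" step, in case it helps. It assumes each section starts with a heading on its own line and that you already know the heading names; `SECTION_HEADINGS`, `split_sections`, and `diff_like_sections` are made-up names for illustration. It uses a plain text diff from Python's standard library rather than an LLM, so it only gives you literal diffs. For conceptual summaries you would hand each matched pair of sections to the model instead.

```python
import difflib

# Hypothetical section headings; a real contract set would need its own list.
SECTION_HEADINGS = ["Controlling Law", "IP Indemnification"]


def split_sections(text, headings=SECTION_HEADINGS):
    """Map each known heading to the non-empty lines that follow it,
    up to the next known heading."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in headings:
            current = stripped
            sections[current] = []
        elif current is not None and stripped:
            sections[current].append(stripped)
    return sections


def diff_like_sections(doc_a, doc_b):
    """Return a unified diff (as a list of lines) for every heading
    that appears in both documents. Identical sections give an empty list."""
    a, b = split_sections(doc_a), split_sections(doc_b)
    diffs = {}
    for heading in a:
        if heading in b:
            diffs[heading] = list(
                difflib.unified_diff(a[heading], b[heading], "a", "b", lineterm="")
            )
    return diffs
```

Comparing more than two documents would just mean running `diff_like_sections` over each pair, or collecting all the same-named sections and sending them to the LLM in one prompt.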

@will_a113@lemmy.ml

You can kinda do it with Google Custom Search Engine (CSE), which is basically a thin wrapper around Google. In a regular Google search you can use syntax like -site:ignorethisdomain.com to exclude specific domains (I do this with Pinterest whenever searching for images, for example). But manually typing in a large list of blacklisted domains would be tedious, so instead you can set up a CSE with everybody you want to ignore and then just use the special URL as your search engine.
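If you'd rather not set up a CSE, the same -site: trick can be scripted. This is only a sketch: `build_query` and `search_url` are names I made up, the blocklist is whatever you pass in, and a very long list of exclusions may run into query length limits.

```python
from urllib.parse import urlencode


def build_query(terms, blocked_domains):
    """Append one -site:domain exclusion per blocked domain to the search terms."""
    exclusions = " ".join(f"-site:{domain}" for domain in blocked_domains)
    return f"{terms} {exclusions}".strip()


def search_url(terms, blocked_domains):
    """Build a regular Google search URL with the exclusions baked into the query."""
    query = build_query(terms, blocked_domains)
    return "https://www.google.com/search?" + urlencode({"q": query})


# Example: search for cat photos while skipping Pinterest.
print(search_url("cat photos", ["pinterest.com"]))
```

You could paste the resulting URL pattern into a browser's custom search engine settings so the exclusions get applied on every search.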

@will_a113@lemmy.ml

I worked in a field that managed a lot of technology in retail stores. The big ones know everything about you, it’s just astonishing. At the time (around 15 years ago) there was very little oversight, but also most CIOs were inept and couldn’t really make the data sing and dance. Today that is very much no longer true, and it’s almost too easy to build a comprehensive profile of an “anonymous” guest and then attach it to their personally identifiable information, all without their consent or knowledge.

What Kinds of Data do AI Chatbots Collect?

The Powerful AI Tool That Cops (or Stalkers) Can Use to Geolocate Photos in Seconds

Hackers Claim Massive Breach of Location Data Giant, Threaten to Leak Data

U.S. Consumer Financial Protection Bureau plans to regulate the surveillance industry
