It’s going to depend on what types of data you are looking to protect, how your wifi is configured, what types of sites you are accessing, and whom you are willing to trust.

To start with, if you are accessing unencrypted websites (HTTP), at least part of the communications will be in the clear and open to inspection. You can mitigate this somewhat with a VPN. However, this means that you need to implicitly trust the VPN provider with a lot of data. Your communications to the VPN provider would be encrypted, though anyone observing your connection (e.g. your ISP) would be able to see that you are communicating with that VPN provider. And any communications between the VPN provider and the unencrypted website would also be in the clear and could be read by someone sniffing the VPN exit node’s traffic (e.g. the ISP used by the VPN exit node). Lastly, the VPN provider would have a very clear view of the traffic and be able to associate it with you.
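If you want to see just how readable plain HTTP is, something like this will show it (a rough sketch; it assumes tcpdump is installed and uses wlan0 as a stand-in for whatever your wifi interface is actually called):

    # Watch unencrypted HTTP traffic on a wifi interface.
    # "wlan0" is a placeholder; check your interface name with "ip link".
    sudo tcpdump -i wlan0 -A 'tcp port 80'
    # -A prints payloads as ASCII, so GET lines, Host: headers, cookies
    # and form data on plain HTTP all scroll by in readable text.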

For encrypted websites (HTTPS), the data portion of the communications will usually be well encrypted and safe from spying (more on this in a sec). However, it may be possible for someone (e.g. your ISP) to snoop on what domains you are visiting. There are two common ways to do this. The first is via DNS requests. Any time you visit a website, your browser needs to translate the domain name to an IP address. This is what DNS does, and it is not encrypted by default. Also, unless you have taken steps to avoid it, it is likely that your ISP is providing DNS for you. This means that they can simply log all your requests, giving them a good view of the domains you are visiting. You can use something like DNS over HTTPS (DoH), which encrypts DNS requests and sends them to a specific resolver; this usually requires extra setup, but it works the same whether you are on your local WiFi or a 4G/5G network.

The second way to track HTTPS connections is via the Server Name Indication (SNI). In short, when you first connect to a web server, your browser needs to tell that server which domain it wants, so that the server can send back the correct TLS certificate. This is sent unencrypted, and anyone in between (e.g. your ISP) can simply read that SNI value to know what domains you are connecting to. There are mitigations for this, specifically Encrypted SNI (ESNI), but that requires the web server to implement it, and it’s not widely used. This is also where a VPN can be useful, as the SNI is encrypted between your system and the VPN exit node. Though again, it puts a lot of trust in the VPN provider, and the VPN provider’s ISP could still see the SNI as it leaves the VPN network. Though associating it with you specifically might be hard.
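To make the DNS and SNI parts concrete, here is a rough sketch of what each looks like from the command line (assuming dig, a reasonably recent curl, and tshark are installed; 1.1.1.1 is just an example DoH resolver, and wlan0 is a placeholder interface name):

    # Plain DNS: this lookup goes out unencrypted, so anyone on the path
    # (e.g. your ISP's resolver) can see you asked about lemmy.ml.
    dig lemmy.ml

    # DNS over HTTPS: curl can do the lookup inside an HTTPS request to
    # a DoH resolver instead, hiding the query from the local network.
    curl -s --doh-url https://1.1.1.1/dns-query https://lemmy.ml/ -o /dev/null

    # SNI: even with HTTPS, the domain name is sent in the clear in the
    # TLS ClientHello. Watch your own traffic leak it with tshark:
    sudo tshark -i wlan0 -Y 'tls.handshake.type == 1' \
      -T fields -e tls.handshake.extensions_server_name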

As for the encrypted data of an HTTPS connection, it is generally safe. So, someone might know you are visiting lemmy.ml, but they wouldn’t be able to see what communities you are reading or what you are posting. That is, unless either your device or the server is compromised. This is why mobile device malware is a common attack vector for state-level threat actors. If they have malware on your device, then all the encryption in the world ain’t helping you. There are also some attacks around forcing your browser to use weaker encryption, or the attacker compromising the server’s certificate. Though these are likely in the realm of targeted attacks and unlikely to be used on a mass scale.
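If you’re curious what your connection to a given server actually negotiates, openssl can show you (a sketch, assuming openssl is installed; a server that still accepts old protocol versions gives a downgrade attack more room to work with):

    # Print the negotiated TLS protocol version and cipher for a server.
    openssl s_client -connect lemmy.ml:443 -servername lemmy.ml </dev/null \
      | grep -E 'Protocol|Cipher'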

So ya, not exactly an ELI5 answer, as there isn’t a simple answer. To try and simplify: if you are visiting encrypted websites (HTTPS), you don’t mind your mobile carrier knowing what domains you are visiting, and your device isn’t compromised, then mobile data is fine. If you would prefer your home ISP to be the one tracking you, then use your home wifi. If you don’t like either of them tracking you, then you’ll need to pick a VPN provider you are comfortable letting know what sites you visit, and use their software on your device. And if your device is compromised, well, you’re fucked anyway and it doesn’t matter what network you are using.


That was just the first example to pop to mind where you couldn’t just run grep against the files directly, and I didn’t want to get into a bunch of specific file formats. For something like epub you could probably just use zcat and then pipe the output to grep. Perhaps using a for loop if you want to do other fancy stuff along the way (e.g. output file names as headers), something like the sketch below.
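For example (a minimal sketch; since an epub is really just a zip archive, unzip -p is probably the more reliable way to stream the contents out, as zcat may or may not cope with a multi-file zip):

    # Search a pile of epubs for a string, printing each file name as a
    # header. unzip -p streams the packed HTML/XHTML files to stdout.
    for book in *.epub; do
        echo "=== $book ==="
        # -a forces grep to treat the stream as text, since images in
        # the zip would otherwise make it look like binary input.
        unzip -p "$book" | grep -ai 'search term'
    done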

So ya, “hard” may have been a bit overblown; “not simple” may have been better. But without the OP actually stating what format the ebooks were in, I wasn’t going to write a primer on dealing with every format out there.


It’s going to be different for different file formats. For example, something like epub is going to be hard because the format is really just a zip file with a specific internal file structure. So, it’s not really the .epub file you want to grep, but the files inside that zip archive. Ebooks stored as PDFs could be a bit easier, as PDF is a monolithic file format with text often (though not always) stored as plain text. However, the text streams can be encrypted and/or compressed (FlateDecode), so there is no guarantee of seeing plain text.
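For the PDF case, the usual trick is to decode the text streams first and grep the result, e.g. (a sketch, assuming poppler-utils is installed for pdftotext; 'search term' is obviously a placeholder):

    # Decode the (possibly FlateDecode compressed) text streams, then
    # grep the resulting plain text. "-" sends the output to stdout.
    for book in *.pdf; do
        echo "=== $book ==="
        pdftotext "$book" - | grep -i 'search term'
    done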

I’m sure there are more formats, but I think you get the idea: how you would do a string search comes down to the actual file format, and some are not going to be easily greppable. It’s not impossible, just not straightforward.


But they pinky promised that only the “good guys” would use the “front doors”. /s


There may also be a (very weak) reason around bounds checking and avoiding buffer overflows. By rejecting anything longer than 20 characters, the developer can be sure that nothing longer gets sent to the back end code. While they should still be doing bounds checking in the rest of the code, if the team making the UI is not the same as the team making the back end code, the UI team may see it as a reasonable restriction to prevent a screw-up further down the stack from being exploited. Again, it’s a very weak argument, but I can see such an argument being made in a large organization with lots of teams who don’t talk to each other. Or worse yet, different contractors standing up the front end and the back end.


I don’t know how anyone makes it without a password manager at this point.

Password reuse. Password reuse everywhere.


Another Cybersec worker here, and I’ll broadly agree with all of this. That said, I’d also point out that, depending on how your site is set up, the browser history may be nothing more than one more source to correlate against information we already have from elsewhere.

Several sites I have been at have used Data Loss Prevention (DLP) software which automagically records (and possibly blocks) data moving into and out of the environment. This can be very detailed, to the point of knowing when someone copies and pastes data into a web form. I’ve also been at sites which sniff web traffic at the firewall, record full pcaps, and extract metadata for quick analysis. So yes, for those not aware, deleting browser history, using “in private” browsing, or other steps to avoid us seeing your porn browsing may not be as effective as you think.
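As a rough illustration of how little effort pulling that metadata back out takes (a sketch; the field names are from newer Wireshark/tshark builds, and capture.pcap is a placeholder file name):

    # List every domain seen in a capture, from HTTP Host headers and
    # TLS SNI values, both of which sit in the traffic unencrypted.
    tshark -r capture.pcap \
      -Y 'http.host or tls.handshake.extensions_server_name' \
      -T fields -e http.host -e tls.handshake.extensions_server_name \
      | sort -u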

All that said, I’ve never been on a Cybersec team which has had enough time to really care about porn browsing, so long as you are not putting the network at risk. And, so long as HR/Management doesn’t tell us to care. We have better things to spend our time on.

Lastly, if you don’t want us seeing it, don’t do it on a work computer. Look, we have lots of ways to see what you are doing. Just do that stuff at home, on your own hardware, and leave the work computer for work. Writing up misuse reports is something I really hate doing.