Fuck Work@slrpnk.net to Privacy@lemmy.ml • Is there a simple way to severely impede web scraping and LLM data collection of my website?
5 months ago
Look through your access logs, e.g. /var/log/nginx/access.log, for user agents that indicate things like chatgptbot, etc. Then add
if ($http_user_agent ~* "useragent1|useragent2|... useragents") { return 403; }
to the server block of your website's config file in /etc/nginx/sites-enabled/, and reload nginx so the change takes effect. You can also add a robots.txt that forbids scraping. ChatGPT generally checks and respects that… for now. This paired with some of the stuff above should work.
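If you don't feel like eyeballing the log, here is a rough Python sketch for the "look for user agents" step. It assumes nginx's default "combined" log format, where the user agent is the last quoted field on each line. The function names and the keyword list ("bot", "crawler", "spider") are made up for illustration, so adjust them to whatever actually shows up in your logs.

```python
import re
from collections import Counter

# Assumption: nginx "combined" log format, where the user agent
# is the last double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Count how often each user agent string appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def likely_bots(counts, keywords=("bot", "crawler", "spider")):
    """Keep only user agents containing one of the (hypothetical) keywords."""
    return {
        ua: n
        for ua, n in counts.items()
        if any(k in ua.lower() for k in keywords)
    }
```

To use it, open /var/log/nginx/access.log (you may need root to read it), pass the file object to count_user_agents, and run the result through likely_bots. Whatever comes out is a candidate for the useragent1|useragent2 list in the if block. Scrapers that fake a browser user agent won't show up this way.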
There are tons of defendants across Amerikkka in similar situations and the fact that the ACLU is constantly defending scumbags in the name of protecting all of our rights is the most asinine liberal bullshit I can think of. How many poor black people are there sitting in Rikers on some stop-and-frisk bullshit, while the ACLU is prioritizing the rights of Neo-Nazis? There are plenty of other ways to address warrantless searches and ironically this dipshit would probably benefit from any precedent that addresses this issue, but at the very least… and this is an example of the bar being in hell and them trying to lower it… but at the least, don’t provide free legal defense counsel for society’s best arguments against civilization. But if none of that resonates, just don’t defend people that spend a not insignificant amount of time and resources to attack the existence of things like the ACLU before and after you defend them. Because when they get off, maybe their next attack will annihilate the right to free counsel for everybody else in the country. Or as Aus Rotten put it: don’t give them freedom because they’re not going to give you yours. Fuck Nazi Sympathy. Of any kind.