7c0h

Writer's block

I wrote my last post in November 2024, which is a long time ago. Unlike other dry spells (seven months in 2017, six in 2019), this one was not because I forgot about the blog but because I have a severe case of writer's block due to, you guessed it, AI.

I receive lots of bot traffic every day — way more, I suspect, than actual human readers. That's usually not a problem because I think of this blog as a message in a bottle, drifting lazily through the oceans until one day someone finds it by chance, reads it and thinks "yeah, VGA cables are indeed cool". But AI changes the equation — if for every human I reach there are one hundred bots, then all I'm effectively doing is creating AI training data for free.

I used to think that the terms under which I release this content (Creative Commons BY-SA 4.0) would help: the "ShareAlike" terms give you permission to share and adapt my work, but only as long as you give proper attribution and distribute the result under the same license. This is all well and good until you learn that AI companies have decided that they can use all content ever created under a "fair use" policy, licenses be damned, and sell you the results without even attributing the original author (let's not even get started on compensation).

People much more talented than me have given their reasons for why this whole situation is a travesty: Dorothy Gambrell, the creator of Cat and Girl, has made an eloquent case for why AI feels so much like theft, and Mike Krahulik of Penny Arcade has made comics like this one, this other one and this third one. None of those texts includes the words "fucking assholes", but that's understandable. Mine does.

A cartoon of several artists put in a blender and being unhappy about it

Social problems cannot be solved with technology, but given my limited power, technology is what I have to work with. I have decided to implement a three-step process:

  1. Add all known AI crawlers to my robots.txt file and restrict them to the PR part of this website (who I am, what I do, publications). Everything else is off-limits.
  2. Add some AI protection — I like the spirit behind Anubis but I may end up sticking to a combination of Apache's mod_evasive and Fail2Ban.
  3. If all that fails, a tarpit for particularly misbehaved bots. How to generate an infinite maze of URLs on the cheap is an interesting problem to think about, and one that I've started working on.
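
As a rough sketch of step 1, a robots.txt along these lines would do it. Note that the crawler names below are just a few well-known examples (real lists are much longer and change often), and `/about/` is a placeholder for the actual PR pages of this site:

```
# Illustrative only: a handful of known AI crawlers,
# restricted to the PR section; everything else is off-limits.
User-agent: GPTBot
User-agent: CCBot
User-agent: Google-Extended
Disallow: /
Allow: /about/

# Everyone else is still welcome.
User-agent: *
Allow: /
```

Whether a given bot actually honors robots.txt is, of course, exactly the problem that steps 2 and 3 exist to handle.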
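
On the tarpit idea, one cheap trick is to make every page a pure function of its URL: hash the path, seed a random generator with it, and emit links to pseudo-random child paths. The maze branches forever but requires no stored state, and the same URL always serves the same page. This is a minimal sketch of that idea, not the post's actual implementation; the word list and link count are made up:

```python
import hashlib
import random

# Illustrative vocabulary for generating plausible-looking child URLs.
WORDS = ["vga", "cable", "signal", "pixel", "sync", "scan", "beam", "phosphor"]

def maze_page(path: str, links: int = 5) -> str:
    """Return a tiny HTML page whose links depend only on `path`.

    Because the RNG is seeded with a hash of the path, no state needs
    to be stored: the maze is infinite but fully deterministic.
    """
    seed = hashlib.sha256(path.encode()).digest()
    rng = random.Random(seed)
    anchors = []
    for _ in range(links):
        # Each child URL extends the current path with a pseudo-random
        # word and number, so every page leads deeper into the maze.
        child = f"{path.rstrip('/')}/{rng.choice(WORDS)}-{rng.randrange(10**6)}"
        anchors.append(f'<a href="{child}">{child}</a>')
    return "<html><body>" + "<br>".join(anchors) + "</body></html>"
```

Serving this from a catch-all route costs one hash and a handful of string operations per request, which is about as cheap as "infinite" gets.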

One of my personal goals has always been to make this website as quick to load as possible, which rules out CAPTCHAs of any kind. A Proof-of-Work solution like the above-mentioned Anubis could work, but that remains to be seen. Other measures are less drastic, the simplest one being that I'll try and get at least one curse word per post. The reasons for this are complex enough that they deserve their own post, so stay tuned.