OK, it is over. End of an era for me. No more self-hosted git. I had been running a public git server since 2011, and a public cvs server before that. AI scrapers have hammered the poor little server to death by flooding the cgit frontend with tons of pointless² requests. That actually happened a few months ago already.

Now I have finally decided not to try to rebuild the server, be it with or without the cgit web frontend. I don't feel like taking up the fight with the scrapers in my spare time; I leave that to people who are in a better position to do so. Most repositories already had mirrors on one or two of the large git forges. Those are the primary repositories now. Go look at GitLab and GitHub.

Last week I fixed all (I hope) dangling links to the cgit repositories to point to the forges instead.

Now I'm down to one self-hosted service, which is the webserver hosting mainly this blog and a few more little things. In 2018 I migrated the blog from WordPress to Jekyll, so it is all static pages. Taking this out by AI scrapers overloading the machine should be next to impossible, and so far this has held up.

Nevertheless, AI scrapers already managed to trigger one outage. Apparently millions of 404 answers were not enough to convince the bots that there is no cgit service (any more). Apache had no problem delivering those, but the logs filled up the disk so fast that logrotate didn't manage to keep things under control with the default configuration. Fixed the config. Knock on wood.
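
For anyone who wants to check whether their own logs show the same pattern, here is a minimal sketch (not the actual fix, just an illustration) that tallies 404 responses by the first path component. It assumes Apache's common/combined log format; the function name `count_404_by_prefix` and the sample lines are made up for the example.

```python
import re
from collections import Counter

# Matches the request and status fields of an Apache common/combined
# log line, e.g.: ... "GET /cgit/foo HTTP/1.1" 404 196 ...
LOG_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_404_by_prefix(lines):
    """Count 404 responses, grouped by the first path component."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m is None or m.group("status") != "404":
            continue
        # "/cgit/repo/tree/x" -> "/cgit"
        prefix = "/" + m.group("path").lstrip("/").split("/", 1)[0]
        counts[prefix] += 1
    return counts


# Hypothetical sample lines (addresses are from the documentation range).
SAMPLE = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /cgit/repo/tree/a.c HTTP/1.1" 404 196 "-" "bot"',
    '203.0.113.6 - - [01/Jan/2025:00:00:02 +0000] "GET /cgit/repo/log/?id=abc HTTP/1.1" 404 196 "-" "bot"',
    '203.0.113.7 - - [01/Jan/2025:00:00:03 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "browser"',
    '203.0.113.8 - - [01/Jan/2025:00:00:04 +0000] "GET /favicon.ico HTTP/1.1" 404 196 "-" "browser"',
]

counts = count_404_by_prefix(SAMPLE)
for prefix, n in counts.most_common():
    print(prefix, n)
```

To run it against a real log, pass an open file instead of `SAMPLE`; the log path depends on the distribution and Apache configuration.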


¹ Title inspired by the 2025 edition of Security Nightmares. Fun to watch if you speak German.
² Most inefficient way to get the complete repo. Just clone it, ok?