Tuesday, March 10, 2026

Spiders on a rampage

 kw: blogs, blogging, spider scanning, ai training

In the past two hours this blog has received 4,016 views. I am pretty sure that no more than 100 of them are from real people who might, perhaps, read a post or two. The rest are spiders of one kind or another. When I first noticed surges of spidering a few years ago, the source was Russia, most likely Russian hackers, or perhaps Russian government agents looking for negative press in the blogosphere. In the past year or so, multiple governments have taken an interest in what bloggers have to say, but the bigger source of suction is AI training.

4,016 views in two hours averages 33+ per minute. Longer term, over the past two days, the daily total has been about 12,000. Here is the minute-by-minute rate:


This minute-by-minute variability is quite different from the steady blasts of 50 hits at a time, every few minutes, that characterized the Russian spidering a decade ago. And the next chart shows how the hits are scattered around the world:


This shows that numerous entities are active, probably all using VPNs. There's no telling where any of these hits actually originate. And notice the large remnant in "Other", fully one-third of the total, scattered among the other 190 or so countries of the Earth. To reprise: The number of "legitimate" hits in this two-hour interval is about five.
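For anyone who wants to check the arithmetic, here is a small sketch in Python. It uses only the figures quoted above (4,016 views in two hours, about 12,000 per day, and "Other" at about one-third of the total). The variable names are my own, and the results are only as good as those rounded inputs.

```python
# Figures quoted in the post; the variable names are illustrative.
views_two_hours = 4016       # views counted in the past two hours
window_minutes = 2 * 60      # length of that window in minutes
daily_total = 12_000         # approximate daily total over the past two days
other_fraction = 1 / 3       # "Other" is about one-third of the total

# Average rate during the two-hour window
recent_per_minute = views_two_hours / window_minutes

# Average rate implied by the daily total (1,440 minutes in a day)
daily_per_minute = daily_total / (24 * 60)

# Approximate number of hits that fall in the "Other" bucket
other_hits = views_two_hours * other_fraction

print(f"Two-hour window: {recent_per_minute:.1f} views per minute")
print(f"Daily average:   {daily_per_minute:.1f} views per minute")
print(f"'Other' bucket:  about {other_hits:.0f} hits")
```

By these numbers the two-hour window ran at roughly four times the daily average rate, so this was a surge even by recent standards.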

Considering the tendency of blogs to be more inflammatory and biased than actual journalism, and much more so than material published by a publishing house or in a peer-reviewed journal, training LLMs on the blogosphere ought not to be done. It is creating digital "snowflakes" and digital sociopaths in great numbers.
