Another, equally annoying type of spam is web page comment spam. This blog gets maybe one or two a week at most, because a) the content-management system is non-standard and b) it's very low traffic. It's still obviously spammable, though, since I have to nuke spammy comments every so often. Now, I run another site on this same CMS (well, a slightly outdated version with a few custom changes to fit the content of the page), linked on the left: The Mantis-Eye Experiment. It's a site I put entirely too much time into, centered around one of my favorite shows, The Venture Bros., and it is high traffic. Just to give you an idea, this blog gets anywhere from 1,000 to 2,200 hits per month (low of 1,037, high of 2,256), whereas Mantis-Eye got 51,606 hits in June, 72,318 in July, and 91,469 in August after the new season of the show started back in late June. Before then it pulled in anywhere from 10k to 15k hits per month.
Since I implemented the new comment system back in May (I believe), the page has gotten 24,469 user comments. Of those, 14,524 were not nuked, meaning 9,945 have been nuked (they stay in the database but are no longer displayed to users). 9,525 of those were flagged as spam and auto-nuked by the system's content scanning, which is very simple and pretty much just looks for common spam words like 'viagra' or any post with excessive use of bbcode.
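To make the "very simple" content scan concrete, here's a minimal sketch of that kind of filter. The actual CMS code isn't shown anywhere, so everything here is assumed: the word list, the bbcode-counting approach, and the threshold of ten tags are all made-up illustrations of the two checks described above.

```python
import re

# Hypothetical word list and threshold -- not the real site's values.
SPAM_WORDS = {"viagra", "cialis", "casino"}
MAX_BBCODE_TAGS = 10

def looks_like_spam(comment: str) -> bool:
    """Flag a comment containing a known spam word or excessive bbcode."""
    # Check for common spam words, case-insensitively.
    words = re.findall(r"[a-z]+", comment.lower())
    if any(w in SPAM_WORDS for w in words):
        return True
    # Count bbcode tags like [b], [/b], or [url=...] and flag
    # comments that use an excessive number of them.
    bbcode_tags = re.findall(r"\[/?\w+[^\]]*\]", comment)
    return len(bbcode_tags) > MAX_BBCODE_TAGS

print(looks_like_spam("buy viagra now"))            # True
print(looks_like_spam("great episode last night"))  # False
```

A real scanner would presumably normalize obfuscated spellings ("v1agra") and so on, but even a check this crude catches the bulk of low-effort bot comments, as the auto-nuke numbers above suggest.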
420 comments have been nuked by hand without being flagged as spam. A few of those are double-posts, or me nuking someone for posting how or where to pirate the show, but that's not overly common. So in about four months I've had to personally deal with and nuke around 420 comments. That's not too bad, but I can't be around all the time, and the spam is really annoying and just makes a page look shitty in general.
Today I think I found a very simple way of tricking spam bots. Apparently what they do is scan the HTML of a page and look for the first