My apologies if you’ve tried to access my personal blogs recently. I’ve been inundated by comment spammers and it has been a tremendous pain in the buttocks getting them straightened out. For a while, I was getting only a half dozen or so a day. Short comments about what an amazing blog/post it was and that they’d definitely be back and/or bookmark/subscribe.
I could manually delete them without too much inconvenience for a while. Lately, though, there’s been a staggering increase in these weasels so I’ve adopted measures a little more… drastic.
A Comment Filter BlogEngine.Net Extension
I noticed that most of these spammers shared some distinctive characteristics. Many of them put down the same email address, for example. I also noticed that there were only three or four websites generally involved. Since the spam exists for the purpose of Google pagerank manipulation, the website is probably the important thing to note.
Now, I looked for a BE.Net extension that’d do this already. Unfortunately, most of the comment filters I found were tied into Akismet or some other blog filter service. That’s more overhead than I really want (in terms of configuration, registering, and complexity etc.). All I really need is something to check the email address, website, and maybe IP address against a known blacklist I can maintain myself. That shouldn’t be difficult, right?
Adventures in Comment Filtering
On the surface, these things weren’t that hard to accomplish. BlogEngine.Net has some quirks, though, that got in my way until I figured them out. For those interested, I’m going to explain them here. If you want to skip the gory details, head down to the next section. Or if you just want the extension, download it, pop it into the App_Data/Extensions folder and season to taste.
Finding the Right Event
My first impulse was to look at the Comment object for useful events to extend. Comment.Validating looked like a good candidate so I tried that one out. Unfortunately, that event never got hit on my blog. It took me a bit to realize that this is because I don’t actually validate comments. Validating comments is a setting where a comment doesn’t show up until it is approved. Since I only do blog maintenance once a day or so, I don’t want to prevent comments from showing up for that long. Validating comments would pretty much stop discussions in their tracks and I don’t want that.
Once I remembered that comments are managed on the Page object, things went much better. The Page.AddingComment event turned out to be the one I wanted.
ExtensionParameter Fun
This is the one that held me up the longest. ExtensionParameters can be assigned types that include things like “DropDown” and “ListBox”. That seemed like exactly the kind of thing I could use for my filters. You see, each filter will be of a limited number of valid types: “Website”, “Email”, “IP Address”, or “Length” (I added Length when I noticed that all these messages are really short and I might want to account for that in my filter).
Unfortunately, these ParamType values are a complete red herring for tabular data storage. I noticed that BE.Net wasn’t actually storing my selection when I tried to add filter entries. The thing is that BE.Net stores tabular values on each parameter in the DataStore and only maintains a link to them by the order in which they appear. So my parameters in the DataStore look like this once saved:
<Parameters>
<Name>Filter</Name>
<Label>Filter</Label>
<MaxLength>100</MaxLength>
<Required>true</Required>
<KeyField>true</KeyField>
<Values>http://www.sonicity.com/</Values>
<Values>http://www.unlockprivateprofiles.com/</Values>
<Values>http://www.lastminutejoy.de/</Values>
<Values>http://www.mooladays.com/</Values>
<Values>http://www.dbpclan.com/</Values>
<Values>200</Values>
<Values>email002545@hotmail.com</Values>
<Values>http://www.ramshyam.com/</Values>
<ParamType>String</ParamType>
<SelectedValue/>
</Parameters>
<Parameters>
<Name>FilterType</Name>
<Label>Filter Type</Label>
<MaxLength>100</MaxLength>
<Required>true</Required>
<KeyField>false</KeyField>
<Values>Website</Values>
<Values>Website</Values>
<Values>Website</Values>
<Values>Website</Values>
<Values>Website</Values>
<Values>Length</Values>
<Values>Email</Values>
<Values>Website</Values>
<ParamType>String</ParamType>
<SelectedValue/>
</Parameters>
It looks to me like list types (DropDown, ListBox, etc.) were mainly implemented with scalar settings in mind rather than tabular settings as this needs to be. This is unfortunate, but I can’t see an easy way to alter the architecture to enable list types easily. I could create my own custom admin page for the extension (and I still may) but that’s more work than I wanted to do to get this running.
The Extension
So my comment extension has been up and working for a day or two now and things have calmed down a lot. This is a good thing. I can’t say that it is extensively tested for the simple reason that I don’t get many legitimate comments on a regular basis.
Configuration is pretty simple as long as you don’t typo the Filter Type value. Each filter is its own entry in the tabular list on top.
(Click image to enlarge)
Talking Back to Spammers
When I noticed that it still looks to the user like their comment is saved (because the comment is still part of the page object, it just isn’t saved to the DataStore), I had an inspiration. Since the comment is still displayed to the person who posted it (though not to anyone else), that’s an opportunity to make sure that someone running afoul of my length requirement doesn’t end up wondering what happened. Plus, it gives me a chance to tell spammers that they’ve been noticed (yeah, that’s of dubious value and I may rethink this, but for now, it just makes me feel better). If you enlarged the image above, you’ll see that there are templated values that will be used to replace the comment content. I can be as nasty as I want and the only ones who see it will be the spammers—though you’ll probably want to take it easy on those who stumble on your length filter (if any).
Spammers Should Die
A day or so after this filter went into effect I started to get new messages. These are clever little plays for sympathy saying things like “my comment got eaten but anyway… <regular spiel here>”. Or another “my blog is getting lots of comment spam, do know any way to help?” The website links were still classic spam sites so these weren’t real users looking for help. Cheeky little locusts, aren’t they? Seriously, someone with the right skills needs to hunt these bastards down and rearrange key organs into innovative new patterns.