Spammers Are Vermin

by Jacob 19. July 2009 09:32

My apologies if you’ve tried to access my personal blogs recently. I’ve been inundated by comment spammers and it has been a tremendous pain in the buttocks getting them straightened out. For a while, I was getting only a half dozen or so a day. Short comments about what an amazing blog/post it was and that they’d definitely be back and/or bookmark/subscribe.

I could manually delete them without too much inconvenience for a while. Lately, though, there’s been a staggering increase in these weasels so I’ve adopted measures a little more… drastic.

A Comment Filter BlogEngine.Net Extension

I noticed that most of these spammers shared some distinctive characteristics. Many of them put down the same email address, for example. I also noticed that there were only three or four websites generally involved. Since the spam exists for the purpose of Google PageRank manipulation, the website is probably the important thing to note.

Now, I looked for a BE.Net extension that’d do this already. Unfortunately, most of the comment filters I found were tied into Akismet or some other blog filter service. That’s more overhead than I really want (in terms of configuration, registration, and complexity). All I really need is something to check the email address, website, and maybe IP address against a known blacklist I can maintain myself. That shouldn’t be difficult, right?
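As a rough illustration of that kind of check, here is a minimal Python sketch. It is not the extension's code (which targets BlogEngine.Net); the `Comment` class, the blacklist entries, and the `normalize_site` helper are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """Hypothetical stand-in for a submitted comment; not a BlogEngine.Net type."""
    author: str
    email: str
    website: str
    ip: str
    content: str


# Hand-maintained blacklists. The entries are illustrative placeholders.
BLOCKED_EMAILS = {"spammer@example.com"}
BLOCKED_WEBSITES = {"spam.example.com"}
BLOCKED_IPS = {"203.0.113.7"}


def normalize_site(url: str) -> str:
    """Reduce a URL to a bare host so trivial variations still match."""
    host = url.strip().lower()
    for prefix in ("http://", "https://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.split("/", 1)[0]  # drop any path after the host
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def is_blacklisted(comment: Comment) -> bool:
    """True if the email, website, or IP address is on a blacklist."""
    return (
        comment.email.strip().lower() in BLOCKED_EMAILS
        or normalize_site(comment.website) in BLOCKED_WEBSITES
        or comment.ip.strip() in BLOCKED_IPS
    )
```

Normalizing the website before the lookup is a design choice of this sketch: it keeps a spammer from dodging the list by adding `www.` or a trailing path.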

Adventures in Comment Filtering

On the surface, these things weren’t that hard to accomplish. BlogEngine.Net has some quirks, though, that got in my way until I figured them out. For those interested, I’m going to explain them here. If you want to skip the gory details, head down to the next section. Or if you just want the extension, download it, pop it into the App_Code/Extensions folder and season to taste.

Finding the Right Event

My first impulse was to look at the Comment object for useful events to extend. Comment.Validating looked like a good candidate so I tried that one out. Unfortunately, that event never got hit on my blog. It took me a bit to realize that this is because I don’t actually validate comments. Validating comments is a setting where a comment doesn’t show up until it is approved. Since I only do blog maintenance once a day or so, I don’t want to prevent comments from showing up for that long. Validating comments would pretty much stop discussions in their tracks and I don’t want that.

Once I remembered that comments are managed on the Page object, things went much better. The Page.AddingComment event turned out to be the one I wanted.

ExtensionParameter Fun

This is the one that held me up the longest. ExtensionParameters can be assigned types that include things like “DropDown” and “ListBox”. That seemed like exactly the kind of thing I could use for my filters. You see, each filter will be one of a limited number of valid types: “Website”, “Email”, “IP Address”, or “Length” (I added Length when I noticed that all these messages are really short and I might want to account for that in my filter).

Unfortunately, these ParamType values are a complete red herring for tabular data storage. I noticed that BE.Net wasn’t actually storing my selection when I tried to add filter entries. The thing is that BE.Net stores each column of tabular values on its own parameter in the DataStore and links the values of a row only by the order in which they appear. So my parameters in the DataStore look like this once saved:

<Parameters>
  <Name>Filter</Name>
  <Label>Filter</Label>
  <MaxLength>100</MaxLength>
  <Required>true</Required>
  <KeyField>true</KeyField>
  <Values>http://www.sonicity.com/</Values>
  <Values>http://www.unlockprivateprofiles.com/</Values>
  <Values>http://www.lastminutejoy.de/</Values>
  <Values>http://www.mooladays.com/</Values>
  <Values>http://www.dbpclan.com/</Values>
  <Values>200</Values>
  <Values>email002545@hotmail.com</Values>
  <Values>http://www.ramshyam.com/</Values>
  <ParamType>String</ParamType>
  <SelectedValue/>
</Parameters>
<Parameters>
  <Name>FilterType</Name>
  <Label>Filter Type</Label>
  <MaxLength>100</MaxLength>
  <Required>true</Required>
  <KeyField>false</KeyField>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Website</Values>
  <Values>Length</Values>
  <Values>Email</Values>
  <Values>Website</Values>
  <ParamType>String</ParamType>
  <SelectedValue/>
</Parameters>

It looks to me like list types (DropDown, ListBox, etc.) were mainly implemented with scalar settings in mind rather than tabular settings like the ones this extension needs. This is unfortunate, but I can’t see an easy way to alter the architecture to support list types in tabular settings. I could create my own custom admin page for the extension (and I still may) but that’s more work than I wanted to do to get this running.
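To make the positional link concrete, here is a small Python sketch that rebuilds the rows from a shortened copy of the saved settings. It is only an illustration of the layout, not how BlogEngine.Net reads its DataStore; the `<Settings>` wrapper element and the `load_rows` function are assumptions added so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the saved settings, wrapped in a made-up <Settings> root
# so the fragment parses as a single XML document.
SAVED = """
<Settings>
  <Parameters>
    <Name>Filter</Name>
    <Values>http://www.sonicity.com/</Values>
    <Values>200</Values>
    <Values>email002545@hotmail.com</Values>
  </Parameters>
  <Parameters>
    <Name>FilterType</Name>
    <Values>Website</Values>
    <Values>Length</Values>
    <Values>Email</Values>
  </Parameters>
</Settings>
"""


def load_rows(xml_text: str) -> list[tuple[str, str]]:
    """Return (filter_type, filter_value) rows, paired purely by position."""
    root = ET.fromstring(xml_text)
    # One list of values per parameter name, in document order.
    columns = {
        param.findtext("Name"): [v.text or "" for v in param.findall("Values")]
        for param in root.findall("Parameters")
    }
    values, types = columns["Filter"], columns["FilterType"]
    if len(values) != len(types):
        # Nothing but the ordering ties a row together, so a length
        # mismatch means the rows can no longer be reconstructed.
        raise ValueError("columns are out of step; rows cannot be rebuilt")
    return list(zip(types, values))


rows = load_rows(SAVED)
```

Here the second row comes back as the pair `("Length", "200")`. Nothing in the file says those two values belong together except that each is the second entry in its column.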

The Extension

So my comment extension has been up and working for a day or two now and things have calmed down a lot. This is a good thing. I can’t say that it is extensively tested for the simple reason that I don’t get many legitimate comments on a regular basis.

Configuration is pretty simple as long as you don’t typo the Filter Type value. Each filter is its own entry in the tabular list on top.

[Image: CommentFilterConfiguration]

Talking Back to Spammers

When I noticed that it still looks to the user like their comment is saved (the comment is still part of the page object; it just isn’t saved to the DataStore), I had an inspiration. Since the comment is still displayed to the person who posted it (though not to anyone else), that’s an opportunity to make sure that someone running afoul of my length requirement doesn’t end up wondering what happened. Plus, it gives me a chance to tell spammers that they’ve been noticed (yeah, that’s of dubious value and I may rethink this, but for now, it just makes me feel better). If you enlarged the image above, you’ll see that there are templated values that will be used to replace the comment content. I can be as nasty as I want and the only ones who see it will be the spammers, though you’ll probably want to take it easy on those who stumble on your length filter (if any).
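The reject-and-replace behavior can be sketched in Python as follows. This is a guess at the shape of the logic, not the extension's code: the message texts, the `Comment` class, and both function names are hypothetical. The sketch also assumes that a Length filter rejects comments shorter than its value and that a Website filter matches by URL prefix.

```python
from dataclasses import dataclass

# Hypothetical per-type replacement messages; the extension's real template
# text lives in its settings and is not reproduced here.
MESSAGES = {
    "Website": "Comment rejected: the linked website is blacklisted.",
    "Email": "Comment rejected: the email address is blacklisted.",
    "IP Address": "Comment rejected: the IP address is blacklisted.",
    "Length": "Comment not saved: it is shorter than {limit} characters.",
}


@dataclass
class Comment:
    """Hypothetical stand-in for a submitted comment."""
    email: str
    website: str
    ip: str
    content: str


def first_match(comment, rows):
    """Return the first (filter_type, value) row the comment trips, or None."""
    for filter_type, value in rows:
        if filter_type == "Website":
            hit = comment.website.lower().startswith(value.lower())
        elif filter_type == "Email":
            hit = comment.email.lower() == value.lower()
        elif filter_type == "IP Address":
            hit = comment.ip == value
        elif filter_type == "Length":
            # Assumption: the value is a minimum length in characters.
            hit = len(comment.content) < int(value)
        else:
            hit = False  # unknown (for example, mistyped) filter types never match
        if hit:
            return filter_type, value
    return None


def on_adding_comment(comment, rows) -> bool:
    """Return True if the comment should be saved.

    On a match the content is overwritten, so the poster sees the notice
    in place of what they wrote, and the caller skips the save.
    """
    match = first_match(comment, rows)
    if match is None:
        return True
    filter_type, value = match
    comment.content = MESSAGES[filter_type].format(limit=value)
    return False
```

Keeping a separate message per filter type is what lets the Length notice stay polite while the blacklist notices can be as blunt as you like.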

Spammers Should Die

A day or so after this filter went into effect I started to get new messages. These are clever little plays for sympathy saying things like “my comment got eaten but anyway… <regular spiel here>”. Or another “my blog is getting lots of comment spam, do know any way to help?” The website links were still classic spam sites so these weren’t real users looking for help. Cheeky little locusts, aren’t they? Seriously, someone with the right skills needs to hunt these bastards down and rearrange key organs into innovative new patterns.


Programming | Software

Comments


July 14. 2009 14:27
David
Is it really that much work to add Akismet support? Once you get it running there is very little maintenance. Your blogs may be target for more comment spam than mine but Akismet works really well for me, saving me from thousands of spam comments.


July 14. 2009 17:13
Jacob
The truth is that my familiarity with Akismet is low and I just don't want to take the time to learn the intricacies and trade-offs. Particularly when I find posts others have made regarding bugs they've had implementing their Akismet interfaces. Nothing against Akismet. It's an invaluable service for those with higher-traffic blogs than mine.

My extension is simple and straight-forward and it does the job I need it to do. It is reactive and thus little more than a stop-gap. Still, it's a useful one for now.


August 3. 2009 21:37
Justin
Akismet is awesome and not that hard to use with WP but I have had problems with getting BlogEngine to work with it so I changed to WP and installed disqus as my commenting system. Although with the code that you have provided I think I will switch back and reinstall BlogEngine. Thanks for the help!


August 6. 2009 11:25
Oscar
Pretty cool that you wrote your own thing. That's cool. I also use Akismet via its plugin for WordPress, but if you think doing your own thing is a better option then cool! Glad you can figure it out, and thanks for sharing.


August 20. 2009 11:38
Hamming
One of my WP blogs is using Akismet as well. However, I recently found that it has started marking some non-spam comments as spam. I wonder if it's because of Akismet or just my site. By the way, the spam protection you've built is simply awesome.


September 13. 2009 14:38
Sunny
I also wonder why you don't use Akismet. I had a blog running without it and added certain keywords manually to the filter, and it didn't work out properly for me. It was like Sisyphus ...
But of course Akismet has the danger of marking non-spam as spam.

So this is a solution for you as long as it doesn't become too much.


September 18. 2009 07:59
Malkav
As I don't have problems with spam (a combination of country, language and low traffic, I suppose), I think I'll give your option a try.

I'll just add that it was this blog that made me set up a domain with BlogEngine and give blogging a try. (And now this will be marked as spam, I know it!)


December 19. 2009 16:28
article directory
I've had a similar problem on a few of my sites.  I'm developing a .net blog at the moment.  One of the main considerations I have at the moment is comment problems.  I think I have a pretty good solution, will share it when I am live!


January 31. 2010 14:14
BizTron
Jacob, it's been a long time. You have the same views of this slime as I do. I've been thinking about this approach as well, but imagine a central repository for everyone to subscribe to. New spammers can be submitted and reviewed by people...so the community can contribute.  I know there are other cans of worms, but it's easier and currently more legal than bodily harm (at least for now.)

I'm wondering whether, if DoS is not illegal, these sites could be slammed for spamming, just as they would do to legit sites for "hurting" them. Another approach could be to target them at the other end. When a browser hits a site, it could be flagged as a spammer and people could decide whether or not to visit.

I'm sure this isn't over yet.


January 31. 2010 20:16
Jacob
A shared repository for blacklisted sites and/or emails could be useful, but you'd have to be careful about who could submit new blacklists. It wouldn't be that hard to create a webservice that'd handle the basics and be queryable in a BE.Net extension. I wouldn't want to take the initiative on anything retaliatory, though. That just leads to escalation and spammers have financial incentive to commit resources and I don't.


February 4. 2010 20:33
sikat ang pinoy
Which do you think is best, BlogEngine or Subtext?

    Disclaimer
    The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

    © Copyright 2010 Scruffy-looking Cat Herder