New Kind Of Comment Spam Targeting WordPress
November 2nd, 2009 | by admin
Today I discovered a new type of attempted comment spam targeting my WordPress blog. The spam bot extracts part of a legitimate comment, imitates it, and then appends a spam link to the end. Technically this is quite easy to do using regex or some simple string functions once a page has been “spidered” by the spam bot.
Here is a quick example of what is going on. The following is an original comment that was automatically extracted from my blog and had spam attached to it before being reposted…
I just came back from Betong last 2 weeks… i really forgot about this Blog… hehe… sorry. Well, I might plan to go there next 2 months…. really miss my girl there. Well, you haven’t added me in YM! yet… try adding me in MSN since i’m always on-line…william from a free insurance quots company.
I removed the actual link but you can guess which few words were added as links (including the spelling mistake!).
The reason spammers are trying this new tactic is that it could have a high success rate at getting past manual comment moderation. If a blog admin does not pay enough attention, a comment like this could slip through and end up being approved. The best way to combat this kind of spam is to make sure you skim-read the comments in your moderation queue before approving them, which is easier said than done if you have a steady flow of hundreds or even thousands of comments! Luckily I have very few comments on my blog, so spotting this spam attack was easy enough.
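For blogs where skim-reading every comment isn't practical, part of that check could in principle be automated. The following is only a rough Python sketch of the idea, not something WordPress ships with: it flags a pending comment when it contains a link and also shares a long verbatim run of text with a comment that has already been approved. The function names and the 60-character threshold are my own assumptions for illustration, and how you would obtain the pending and approved comment texts is left out.

```python
import re
from difflib import SequenceMatcher

# Matches a bare URL or an opening anchor tag (a deliberately crude link test).
LINK_RE = re.compile(r"https?://\S+|<a\s[^>]*>", re.IGNORECASE)


def normalize(text):
    """Strip HTML tags, collapse whitespace and lowercase the text."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def looks_like_recycled_comment(pending, approved_comments, min_chars=60):
    """Return True if `pending` has a link AND repeats a long run of text
    from any already-approved comment.

    `min_chars` is an arbitrary assumed threshold; tune it for your blog.
    """
    if not LINK_RE.search(pending):
        return False  # no link, so nothing for a spammer to gain
    candidate = normalize(pending)
    for existing in approved_comments:
        known = normalize(existing)
        matcher = SequenceMatcher(None, candidate, known, autojunk=False)
        match = matcher.find_longest_match(0, len(candidate), 0, len(known))
        if match.size >= min_chars:
            return True
    return False


approved = [
    "I just came back from Betong last 2 weeks... i really forgot about "
    "this Blog... hehe... sorry. Well, I might plan to go there next 2 months."
]
# example.com stands in for the spam link, which I removed from the post.
pending = approved[0] + ' william from a <a href="http://example.com">free insurance quots</a> company.'
print(looks_like_recycled_comment(pending, approved))
```

A hit from a check like this should only mean "look at this one more closely", since a genuine reader who quotes an earlier comment and adds a link would trigger it too. Comparing every pending comment against every approved one is also slow on a busy blog, so treat it as an illustration of the idea and nothing more.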
How does this WordPress comment spam thing work? Quite simple really… Here is the source code for a comment on my blog:
<div id="comment-20456" class="rounded margined">
<div class="paginated-comments-number" style="float: left; color: rgb(34, 34, 34);"><p>[28]</p></div>
<div style="float: right;">
<img alt="" src="http://www.gravatar.com/avatar/94d6dd058297144c15747b9bad969fbf?s=32&d=http%3A%2F%2Fwww.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D32&r=G" class="avatar avatar-32 photo" width="32" height="32">
</div>
<p>On <a href="http://hygen.net/blog/?p=112#comment-20456" title="">October 9th, 2008 at 6:06 am</a> <a class="comment-edit-link" href="http://hygen.net/blog/wp-admin/comment.php?action=editcomment&c=20456" title="Edit comment">edit</a> <cite><a href="http://www.hygen.net" rel="external nofollow" class="url">admin</a></cite> wrote:</p>
<p>Hi Shamyl,</p>
<p>That’s correct but only useful if you are creating documents. Then you can save as .doc instead of .docx i guess. The problem is more for people who get docx documents and can’t open them.</p>
<p>Dan</p>
</div>
Note the part div id="comment-20456": this is all a program needs to look for when dissecting one of your blog posts. Every comment in WordPress is identifiable. Once the program has extracted every section of HTML that sits inside a div with id="comment-XXXXX", where XXXXX can be any number, it can process that text, for example by filtering out any markup it wishes, before finally adding its own spam links to the end of the string.
If you still don’t follow what I’m saying, that’s because it is best understood by looking at some actual code. However, I’d shy away from posting that, as a script kiddie could copy and paste it into yet another WordPress comment-spamming robot! Programmers, you understand me though, right?
Finally, there is one more possible way that the spam bot could be working. The bot could be digesting your WordPress comments RSS feed (mine is located here: http://hygen.net/blog/?feed=comments-rss2), which is potentially even easier to dissect because it’s in basic XML format.
