SPAM is annoying, I get it.

Wow, I still do not know how I managed to be so patient over the past 6 weeks, during which I purposely stopped writing on this blog, partly for personal reasons and partly because I was trying to understand what this SPAM flood hitting the miserable four blog entries I have written so far is all about. I stopped counting the types of SPAM, as well as the number of different languages the comments were written in, mainly some kind of Cyrillic-script language, I assume mostly Russian or from other Eastern European countries. Topics ranged from medicine sales, human “organ” enlargement stuff and all sorts of services for sale to a few commenters advertising their very own blogs, all of which, I assume, were trying to drive traffic to other websites.

I FINALLY DECIDED TO STOP ACCEPTING COMMENTS ON MY BLOG POSTS!!!! Well, at least I had a couple of legitimate commenters; thank you both, and sorry I am not showing your comments anymore. I am in the process of figuring out the best way to handle this SPAM, but that might require installing the newest version of WordPress, which will take some time because it is very low on my to-do list.

For the few of you who may be asking yourselves what in the world this guy is writing about: SPAM (electronic, NOT food) is a form of abuse of an electronic service in which unsolicited or unexpected (read: unrelated to the topic in question) messages, or blog comments in my case, are received in bulk or sent indiscriminately. Even though the WordPress application does a great job of helping you manage comments on posts and pages, the number of them I have been receiving daily over the past few weeks is ridiculous.

Let’s see if I can provide some help in this post beyond ranting about the issue. If you are suffering from the same problem, try reading the official article on this topic, “Combating Comment SPAM”. There is also a list of “Anti-SPAM plug-ins” for WordPress, but I cannot recommend any of them just yet. CAPTCHA seems to be a very popular way to keep out non-human SPAMers and is claimed to stop a big chunk of this SPAM traffic, but you would still need to deal with the human kind, which might be somewhat manageable. Most WordPress “Anti-SPAM” plug-ins use some sort of heuristics to decide whether a given comment is likely to be SPAM. Others use an external database that keeps track of the domain names and IP addresses of most of the well-known SPAMers.
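To make the two ideas above a bit more concrete, here is a minimal Python sketch of how a heuristic score and a blocklist lookup could be combined. This is NOT how any particular WordPress plug-in works; the function names, the word list, the blocked domains and IP address, the weights and the threshold are all made up for illustration.

```python
import re

# Hypothetical sample data for illustration only; a real service would
# maintain far larger, constantly updated lists.
BLOCKED_DOMAINS = {"spam-pills.example", "cheap-traffic.example"}
BLOCKED_IPS = {"203.0.113.7"}
SUSPICIOUS_WORDS = {"viagra", "enlargement", "casino", "cheap"}

# Captures the host part of every http(s) URL in a comment.
URL_PATTERN = re.compile(r"https?://([^/\s]+)", re.IGNORECASE)


def linked_domains(text):
    """Return the lowercase host names of all URLs found in the text."""
    return [host.lower() for host in URL_PATTERN.findall(text)]


def spam_score(author_ip, text):
    """Add up simple penalties; a higher score means more suspicious."""
    score = 0
    domains = linked_domains(text)

    # Blocklist checks: known sender address or known linked domain.
    if author_ip in BLOCKED_IPS:
        score += 5
    if any(domain in BLOCKED_DOMAINS for domain in domains):
        score += 5

    # Heuristic: more than two links in a single comment.
    if len(domains) > 2:
        score += 2

    # Heuristic: one point per distinct sales-type word.
    words = set(re.findall(r"[a-z]+", text.lower()))
    score += len(words & SUSPICIOUS_WORDS)
    return score


def is_spam(author_ip, text, threshold=3):
    """Flag the comment once its score reaches the (arbitrary) threshold."""
    return spam_score(author_ip, text) >= threshold
```

With these made-up values, a plain comment such as "Nice post, thanks for sharing!" scores 0 and passes, while anything coming from a blocked IP address or linking to a blocked domain is flagged right away. A scheme this simple would not stop a human SPAMer who avoids the listed words and links.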

Here is a list of recent articles from the ACM Digital Library, in case you would like a somewhat deeper review of this topic.

TrackBack spam: abuse and prevention

Elie Bursztein, Peifung E. Lam, John C. Mitchell
November 2009
CCSW ’09: Proceedings of the 2009 ACM workshop on Cloud computing security
Contemporary blogs receive comments and TrackBacks, which result in cross-references between blogs. We conducted a longitudinal study of TrackBack spam, collecting and analyzing almost 10 million samples from a massive spam campaign over a one-year period. …

Keywords: blog, linkback, linksback, pingback, refback, secure trackback, spam, talkback, trackback

Spam filtering for short messages

Gordon V. Cormack, José María Gómez Hidalgo, Enrique Puertas Sánz
November 2007
CIKM ’07: Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
We consider the problem of content-based spam filtering for short text messages that arise in three contexts: mobile (SMS) communication, blog comments, and email summary information such as might be displayed by a low-bandwidth client. Short messages …

Keywords: blog, classification, email, filtering, sms, spam

Detecting splogs via temporal dynamics using self-similarity analysis

Yu-Ru Lin, Hari Sundaram, Yun Chi, Junichi Tatemura, Belle L. Tseng
February 2008
Transactions on the Web (TWEB), Volume 2, Issue 1
This article addresses the problem of spam blog (splog) detection using temporal and structural regularity of content, post time and links. Splogs are undesirable blogs meant to attract search engine traffic, used solely for promoting affiliate sites. …

Keywords: Blogs, regularity, self-similarity, spam, splog detection, temporal dynamics, topology

Identifying commented passages of documents using implicit hyperlinks

Jean-Yves Delort
August 2006
HYPERTEXT ’06: Proceedings of the seventeenth conference on Hypertext and hypermedia
This paper addresses the issue of automatically selecting passages of blog posts using readers’ comments. The problem is difficult because: (i) the textual content of blogs is often noisy, (ii) comments do not always target passages of the posts and, …

Keywords: implicit links, passage extraction, weblogs

Relaxed online SVMs for spam filtering

D. Sculley, Gabriel M. Wachman
July 2007
SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Spam is a key problem in electronic communication, including large-scale email systems and the growing number of blogs. Content-based filtering is one reliable method of combating this threat in its various forms, but some academic researchers and industrial …

Keywords: blogs, spam filtering, splogs, support vector machines
