Sunday, November 23, 2014

Spam gets a personal touch: Human 1, Machine 1

Blogging and spamming practically come hand in hand. The obvious spam comments have been pretty well controlled by the major blogging platforms' spam filters, thanks to advances in text analysis and machine learning algorithms. However, the filtering is not perfect. Or is it? You be the judge in this case.

This could be an example of how creative spammers are at combating algorithms.

Or, it could be an example of a business owner trying to do his own selective SEO (search engine optimization).

An old post on mandatory school uniforms got the following spam:
I think school uniforms must be compulsory in schools because after one time-investment in the uniform, it prevents the child from the traits of social inequality,inferiority complex etc.And If you have decided to buy the uniform, buy it from Wang Uniforms (link removed)

I suspect that a human wrote the comment, because it makes sense, and also because of its grammatical, punctuation and spacing errors.

However, the link, which I removed for this post, does point to a legitimate school uniform maker in the UAE. I suppose there are two possibilities:
1) Someone at the uniform business genuinely read the article, had something sincere to say, and also wanted to promote the business.
2) The uniform business hired a spammer / mass commenter to do the job for SEO purposes.

I had a bit of a hard time deciding whether this is spam or not. Since I cannot edit the comment to remove the link, I rejected it, especially after I found out that the commenter's profile belonged to a jewelry shop in South East Asia, which has nothing to do with uniforms.

Algorithms are never perfect. The underlying uncertainty is why we build algorithms at all. Given that I, the human, had trouble judging the authenticity of this comment, I'm glad the machine (the spam filter) didn't just rule it out.

So... Human vs Machine: Human 1, Machine 1?


P.S. Unrelated, but this is quite funny. Don't be fooled by the title.
Visualizing Big Data