HAVE YOU HAD any spam emails recently? If so, there’s a decent chance they involved quotes from the Harry Potter novels.
There is a rash of Harry Potter-based spam descending on the internet. Dozens of Irish and international Twitter users have noted it.
People are pretty puzzled.
The quotes are one or two sentences from the novels, selected apparently at random.
So what’s causing all the Harry Potter spam? We asked Sal McDonagh of Irish internet security firm Copperfasten Technologies, who make the SpamTitan anti-spam engine.
He told DailyEdge.ie that the wave of Harry Potter quotes is an attempt to evade one of the main anti-spam technologies. Basically, anti-spam engines work with three levels of filters:
- RBLs – essentially blacklists of IP addresses known to be used by spammers
- URIBLs – engines that look up any links in the email and check them against a blacklist of spam sites.
- And lastly, a Bayes engine, which acts on anything that’s got through the other two filters. This is the part that’s relevant here.
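For the curious, the three levels can be sketched in a few lines of Python. This is only an illustration of the order in which the checks run, not SpamTitan's code: the blacklist entries, the `Email` class, the `classify` function and the 0.3 threshold are all made up for the example, and the Bayes stage is reduced to a crude word count.

```python
from dataclasses import dataclass, field

# Hypothetical blacklists. Real RBLs and URIBLs are looked up remotely,
# not kept in a set like this.
RBL = {"203.0.113.7"}
URIBL = {"cheap-pills.example"}


@dataclass
class Email:
    sender_ip: str
    link_domains: list = field(default_factory=list)
    body: str = ""


def bayes_score(body: str) -> float:
    """Crude stand-in for the statistical stage: share of known spam words."""
    spam_words = {"viagra", "pharmacy"}
    words = body.lower().split()
    if not words:
        return 0.0
    return sum(w in spam_words for w in words) / len(words)


def classify(email: Email, threshold: float = 0.3) -> str:
    # Level 1: is the sending IP address on a blacklist?
    if email.sender_ip in RBL:
        return "blocked: RBL"
    # Level 2: does any link point at a blacklisted site?
    if any(domain in URIBL for domain in email.link_domains):
        return "blocked: URIBL"
    # Level 3: whatever got through the first two is scored on its words.
    if bayes_score(email.body) >= threshold:
        return "blocked: Bayes"
    return "delivered"
```

An email from a clean address with clean links only ever meets the third check, which is why spammers put so much effort into fooling the Bayes engine.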
So what on earth is that?
A Bayes engine, says McDonagh, is basically a statistical system for working out which words and phrases spam emails are likely to use.
It can be used to assign scores to different words and say this word is more likely to appear in spam. So ‘Viagra’ or ‘Canadian pharmacy’ would be simplistic examples.
The Harry Potter quotes, then, are an attempt to fool the Bayes engine into thinking it’s dealing with Harry Potter fans rather than spam.
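Here is a rough sketch of why the padding works. It assumes a simple naive-Bayes-style filter that adds up per-word scores, with spam and legitimate mail treated as equally likely beforehand. The probabilities in `WORD_SPAM_PROB` are invented for the example, and real engines are more sophisticated than this.

```python
import math

# Invented numbers: the chance that an email containing this word is spam.
WORD_SPAM_PROB = {
    "viagra": 0.99,
    "pharmacy": 0.95,
    "harry": 0.10,
    "hermione": 0.10,
    "wand": 0.10,
    "quidditch": 0.10,
}
UNKNOWN_WORD_PROB = 0.5  # a word the filter has never seen tells it nothing


def spam_probability(text: str) -> float:
    """Combine per-word scores by summing log-odds (naive Bayes style)."""
    log_odds = 0.0
    for word in text.lower().split():
        p = WORD_SPAM_PROB.get(word, UNKNOWN_WORD_PROB)
        log_odds += math.log(p / (1 - p))
    return 1 / (1 + math.exp(-log_odds))


plain = "viagra pharmacy"
padded = plain + " harry hermione wand quidditch"

print(round(spam_probability(plain), 3))   # very close to 1
print(round(spam_probability(padded), 3))  # falls below 0.5
```

With these made-up numbers, four innocent-looking wizarding words are enough to drag an obvious spam message below the halfway mark.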
But why suddenly loads of Harry Potter?
The reason that all the Harry Potter spam is coming at once is that Bayes engines learn – so spammers have to move quickly to beat them. “Spam comes in waves,” McDonagh says:
What happens is that as people mark email as spam, the engine learns from that. And the more [Harry Potter] emails people click on, the more it learns. It’ll probably learn quite quickly to identify them, so they’ll then move on to the next author or popular meme.
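The learning McDonagh describes can be sketched as a running tally of which words turn up in mail that people have marked as spam. The `LearningFilter` class below is a hypothetical toy with add-one smoothing, not how any particular product does it.

```python
from collections import Counter


class LearningFilter:
    """Toy filter that updates word counts each time a user marks an email."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()  # "ham" is the usual name for non-spam

    def mark(self, text: str, is_spam: bool) -> None:
        # Count each word once per email, however often it is repeated.
        target = self.spam_counts if is_spam else self.ham_counts
        target.update(set(text.lower().split()))

    def word_spam_prob(self, word: str) -> float:
        # Add-one smoothing keeps unseen words at a neutral 0.5.
        s = self.spam_counts[word]
        h = self.ham_counts[word]
        return (s + 1) / (s + h + 2)


f = LearningFilter()
for _ in range(2):
    f.mark("hermione raised her wand", is_spam=False)
before = f.word_spam_prob("hermione")

for _ in range(8):
    f.mark("hermione raised her wand viagra", is_spam=True)
after = f.word_spam_prob("hermione")

print(before, after)  # the score for "hermione" climbs as reports come in
```

Once enough people hit the spam button, the wizarding words stop being a disguise and start being a giveaway.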
It’s not just fiction, either.
They tend as well to use the latest news items. So if there’s some celebrity on the news, they’ll use quotes from that because they’re much harder to filter.
Wait. Does this mean I’ll never be able to quote Harry Potter in an email ever again?
No. The Bayes engines also learn to forget, says McDonagh – so that as the Harry Potter spam dies out, it drops out of the filter. Give it a month or so and you can send all the fanfic you want.
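One common way to make a filter forget is to let old evidence fade a little at regular intervals. The sketch below assumes exactly that: the `DecayingCounts` class, the halving rate and the length of a "period" are all assumptions for illustration, and McDonagh did not say how SpamTitan handles it.

```python
from collections import defaultdict


class DecayingCounts:
    """Toy word tallies that shrink over time, so old spam waves fade out."""

    def __init__(self, decay: float = 0.5):
        self.decay = decay  # assumed: counts halve at the end of each period
        self.spam = defaultdict(float)
        self.ham = defaultdict(float)

    def mark(self, text: str, is_spam: bool) -> None:
        target = self.spam if is_spam else self.ham
        for word in set(text.lower().split()):
            target[word] += 1.0

    def end_of_period(self) -> None:
        # Shrink every tally; words that stop appearing drift back to neutral.
        for table in (self.spam, self.ham):
            for word in table:
                table[word] *= self.decay

    def word_spam_prob(self, word: str) -> float:
        s = self.spam.get(word, 0.0)
        h = self.ham.get(word, 0.0)
        return (s + 1) / (s + h + 2)


counts = DecayingCounts()
for _ in range(8):
    counts.mark("hermione viagra", is_spam=True)
peak = counts.word_spam_prob("hermione")

for _ in range(5):  # five quiet periods with no new Potter spam
    counts.end_of_period()
faded = counts.word_spam_prob("hermione")

print(peak, faded)  # the score drifts back towards a neutral 0.5
```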