Spam Ratios

By: Dave - December 1, 2006

I recently read that 90 to 95% of email traffic is junk mail. That seemed exaggerated until I actually compared the number of junk emails entering my Yahoo Bulk folder to the number of legitimate emails entering my Inbox folder. Yup, 90 to 95% junk.

Unfortunately, blog comment spam has the same distribution. The WordPress filter here at BT catches most of it and holds it in a queue for 15 days. There are presently 2443 comment spam entries in the queue, which averages out to about 163 spam comments per day. BT gets maybe ten or fifteen legit comments per day. So about 90 to 95% of BT comments are spam. Very sad. Is this all thanks to Google and their page rank algorithms? Is this the future of the Internet, spawning new junk transmission technologies?
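
For anyone who wants to check that arithmetic, here is a rough Python sketch. The counts are the ones quoted above; the function name spam_share is made up for illustration, and the legit-comment figures are the same ballpark guesses as in the post, not measurements.

```python
def spam_share(spam_per_day: float, legit_per_day: float) -> float:
    """Fraction of all incoming comments that are spam."""
    return spam_per_day / (spam_per_day + legit_per_day)


queued_spam = 2443    # entries currently held in the spam queue
retention_days = 15   # how long the filter keeps them

# Average spam arriving per day, assuming the queue is full for all 15 days.
spam_per_day = queued_spam / retention_days

for legit_per_day in (10, 15):  # rough guesses from the post
    share = spam_share(spam_per_day, legit_per_day)
    print(f"{legit_per_day} legit/day -> {share:.1%} spam")
```

Both guesses land inside the 90 to 95% range, so the comment queue tracks the email figure fairly closely.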

11 Comments

  1. As someone who makes a living off helping the junk mail process, I’m not too bothered by it. I am more surprised that people do it even though they know it is ineffective. I mean, something like 95% of spam is caught by filters and never reaches its destination. 95% of outbound telemarketing calls are not answered. 95% of junk mail is not read.

    Comment by Matt W. — December 1, 2006 @ 12:31 pm

  2. It must be cost-effective, otherwise people would not do it.

    Comment by Mike Parker — December 1, 2006 @ 2:04 pm

  3. So – I’m guessing this is why it seems every tenth or so post I submit to BT vanishes into the ether?

    Comment by Ivan Wolfe — December 1, 2006 @ 2:38 pm

  4. If you do have a comment that disappears, contact one of the Blog managers and they will track the comment down. The spam software is smart and learns if we release false positives.

    Comment by J. Stapley — December 1, 2006 @ 3:16 pm

  5. Oh man, I wrote an entire research paper on the topic of anti-SPAM legislation as a law student a couple years ago.

    Now I’m going to have to dig it out…

    Comment by Seth R. — December 1, 2006 @ 3:41 pm

  6. The cost is almost nothing.

    However, the reason for the huge increase in spam over the last six months is Russian mafia scams that use computers taken over and turned into spam zombies, primarily in the US. So a lot of spam is actually phishing attacks for the stupid. Since they cost nothing, even a 0.001% success rate is great if you can rob the people involved.

    Comment by Clark Goble — December 1, 2006 @ 11:24 pm

  7. BTW – I thought I had a higher than normal spam ranking. I had installed SpamSieve (a Bayesian spam filter to improve Apple’s spam filter) and it keeps track of statistics.

    I’m getting on average 1164 Spam messages per day which make up 89% of my traffic. Interestingly my spam filter is 99.5% accurate with (thus far) 70 false negatives (primarily as I’m still training it) and 2 false positives.

    Comment by Clark Goble — December 1, 2006 @ 11:26 pm

  8. Aside from the initial startup costs of enslaving several “zombie computers” to route the emails, covering your tracks, and perhaps getting some offshore web hosting, the continuing costs for professional spammers are negligible.

    It costs essentially the same dollar amount to send 100,000 emails as it does to send 10. And professional spammers send millions of messages in a typical “ad campaign.” All you need is for 5 people out of every 1,000 to actually buy something and you’re making a REALLY nice profit for very little advertising outlay.

    Spamming is extremely effective marketing from a pure cost-benefit analysis.

    Back in the spring of 2005, it was estimated that about 7 professional spammers were responsible for about 90% of the spam sent worldwide. These guys easily make $10,000 per month hiring out to commercial sellers to conduct SPAM ad campaigns, and that’s a conservative estimate.

    It’s almost impossible to track these people. Blocking the computer that originated the spam emails or, as some angry recipients have suggested, retaliating against the computer of origin with your own spam or viruses, or whatever… is also a complete waste of time.

    Spammers will hack into unprotected computers, enslave the computer, and direct that computer to send spam emails. Retaliating, therefore, is more likely to hurt your relatively low-tech parents (who are unaware that their computer has been hacked) than the original professional spammer. Suddenly, mom’s been blacklisted or hit with some revenge viruses and she has no idea why.

    Even if you can trace it back to the host server, it’s likely located in China or some other offshore location and is therefore beyond the reach of US law enforcement.

    Couple years back, Congress enacted a much talked-about “Do not SPAM registry.” Kind of similar to the “Do Not Call List” for telemarketers.

    Do not put your email address on the “Do Not SPAM Registry!”

    The moment you do that, you will see an increase in spam emails. Professional spammers don’t give a flying leap about the law. They’ve already insulated themselves from the law. All the Registry is to them, is a publicly available listing of valid email addresses that they can use to send more spam.

    Thanks Congress!

    Never respond to a spam email. Once the spammer gets your angry response, they will simply add your email address to their listing of “confirmed as valid emails.” Until you were dumb enough to respond, they actually didn’t know whether your email address was valid or not. But now, since you were kind enough to clear that question up for them, you can expect an increase in spam emails for your troubles.

    At present, there really isn’t a good solution to the spam problem. All we can really do is continue the technological arms race between spammers and filter programmers.

    Comment by Seth R. — December 2, 2006 @ 8:40 am

  9. Just a handy vocab word.

    splog = SPAM-like comments on a blog

    Bill Gates was criticized a couple of years ago for saying that the SPAM problem was mostly solved (how could he claim that! the blusterers blustered). But when you consider the contention that 95% of email is SPAM, and any email classifier worth its salt is about 99.5% accurate … and you look at how many false positives end up in your SPAM folder (not many) … I’d say Bill was largely correct.

    [Full disclaimer: I published a paper on SPAM classification once that improved one of the state-of-the-art algorithms by about 10%. Although, I'm not any richer.]

    Comment by queuno — December 2, 2006 @ 2:13 pm

  10. Queuno, the problem with spam filters is that there is still the bandwidth cost that someone has to pay. If 90% of email is spam, that’s a lot of bandwidth.

    Secondly, any spam filter creates the problem of false positives. This isn’t a problem if you have a small collection of friends you email. However, if, like me, your workflow depends upon strangers contacting you, then false positives can lead to lost sales.

    Comment by Clark Goble — December 3, 2006 @ 2:28 am

  11. Various ISPs have been attempting to haul spammers into court on various common law concepts, but judges have been a little slow to innovate in this new legal territory. One avenue that has been tried before is “trespass to chattels.” Problem is that internet bandwidth isn’t exactly a tangible good that spammers could be said to be interfering with, and judges are usually the last people in government to start innovating with new ideas and solutions (no matter what the media hype says).

    State statutes provide for law enforcement options, but unfortunately don’t do much to give private parties a remedy.

    Comment by Seth R. — December 3, 2006 @ 5:02 pm
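
Several of the comments above lean on back-of-the-envelope numbers, so here is a rough Python sketch for playing with them. It is only an illustration: the function names expected_buyers and filter_outcomes are invented, the campaign size of one million messages is a made-up round number, and the filter model assumes the same error rate on spam and on legitimate mail, which real filters do not guarantee.

```python
def expected_buyers(messages_sent: int, response_rate: float) -> float:
    """Expected number of people who respond to a campaign."""
    return messages_sent * response_rate


def filter_outcomes(total: float, spam_fraction: float, accuracy: float):
    """Split `total` messages into missed spam and lost legit mail.

    Simplifying assumption: the filter is wrong equally often on
    spam and on legitimate mail.
    """
    spam = total * spam_fraction
    legit = total - spam
    error_rate = 1.0 - accuracy
    missed_spam = spam * error_rate   # false negatives, land in the inbox
    lost_legit = legit * error_rate   # false positives, land in the spam folder
    inbox = missed_spam + (legit - lost_legit)
    return missed_spam, lost_legit, missed_spam / inbox


campaign = 1_000_000  # hypothetical campaign size

# Response rates quoted in the comments: 5 in 1,000 and 0.001%.
for label, rate in (("5 per 1,000", 5 / 1000), ("0.001%", 0.001 / 100)):
    print(f"{label}: about {expected_buyers(campaign, rate):.0f} responses")

# Figures quoted in the comments: 95% spam, 99.5% accurate filter.
missed, lost, inbox_share = filter_outcomes(1000, 0.95, 0.995)
print(f"per 1,000 messages: {missed:.2f} spam missed, {lost:.2f} legit lost")
print(f"spam share of what reaches the inbox: {inbox_share:.1%}")
```

Under those assumptions the filter turns a 95% spam stream into an inbox that is mostly legitimate mail, which fits the "largely solved" view, while the small but nonzero count of lost legitimate messages is the false-positive worry raised in comment 10. The bandwidth cost is untouched either way, since every message still has to be delivered before it can be filtered.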