Runworks
 Spamalot
Submitted by Rickshaw :: Thu Mar 01, 2007 11:54 pm
I spent some time de-spamming Runworks today. If you visited in the last couple of days, you probably noticed a whole bunch of posts about what Lindsay Lohan is doing with Catherine Zeta-Jones, or something else not safe for viewing at work. Thanks, but no thanks. I decided I needed to tighten up security around here to help keep spam out, so I opened up my toolbox and got to work.

Since about a year ago, new users registering at Runworks have been required to solve a captcha (an image containing distorted/hard-to-recognize text) and reply to a confirmation email. I figured that would prevent anybody from creating an automated tool to create new user accounts for posting spam. I was wrong.

Solving the captcha in an automated way proved to be easier than I would have thought. Do a Google search for "defeat captcha", and you'll see that the captchas for many popular web sites can be solved in an automated manner by using image filters and pattern recognition programs. It's not something a casual user could automate in 5 minutes, but for someone knowledgeable enough and with a big enough potential reward, it's doable.

Part of my problem was that I'm using phpBB, a very popular bulletin board program, so the captchas on Runworks looked identical to those on every other phpBB site in the world. That makes spending the time to defeat the phpBB captcha a worthwhile goal for a spammer, where defeating a captcha that was unique to Runworks wouldn't be. So I removed the phpBB captcha, and with a little experimentation, I created my own text-obfuscating algorithm to use in its place. I'm not sure if it's any more robust than the phpBB-provided one, but since it's unique to this site, it's doubtful anyone will be interested enough to try to defeat it.

http://www.runworks.com/templates/runworksClean/images/photos/captcha.png

Can you read this? Neither can I.

The confirmation email proved to be easy to automate as well. Free email sites like Hotmail and Gmail can be checked for and handled specially, but now spammers have discovered a way to generate randomly-named temporary domains for receiving email. I had lots of registrations from users with email addresses like asterbast@fkdj39jdkh291.info. Spammers (or an automated tool they wrote) would have to actually receive and reply to email at this address in order to complete the registration process. I believe this is possible because of a technicality in the rules governing domain registration. You can "test" a domain name for 24 or 48 hours before you have to pay for it, so spammers register hundreds or thousands of random domain names, use them for a day or two, and then dump them. There's not much I can do about that. Hopefully the new captcha will be enough.
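
As a rough illustration of the "checked for and handled specially" step for free email sites, here is a minimal Python sketch. It is not the code Runworks or phpBB uses; the provider list and the function names are assumptions made up for the example.

```python
# Hypothetical list of free providers; a real list would be much longer.
FREE_EMAIL_DOMAINS = {"hotmail.com", "gmail.com"}


def email_domain(address):
    """Return the lower-cased domain of an email address, or None if malformed."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


def needs_special_handling(address):
    """True if the address is malformed or belongs to a known free provider."""
    domain = email_domain(address)
    return domain is None or domain in FREE_EMAIL_DOMAINS
```

Note that an address on a throwaway domain, such as the asterbast@fkdj39jdkh291.info example, passes this check untouched, which is exactly the gap described above.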

For good measure, I also went through and disabled about 300 relatively recently created user accounts, so already-existing accounts couldn't be used to post new spam. I think these were all bogus accounts, but if I accidentally nuked a real user's account in the process, please let me know.

The most interesting moment in this whole process came when the spammer happened to log in and start posting more spams while I was in the middle of making the security changes. I was able to determine the spammer's IP address and trace it back to an ISP in Hong Kong. Then I permanently banned that IP.
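
For the curious, an IP ban boils down to a membership check on each request. The Python sketch below is only an illustration under assumptions: the `BanList` class is hypothetical (phpBB has its own ban mechanism), and the addresses in the usage note come from ranges reserved for documentation, not from the actual spammer.

```python
import ipaddress


class BanList:
    """Hypothetical in-memory ban list of single addresses or whole networks."""

    def __init__(self):
        self._networks = []

    def ban(self, address_or_network):
        # strict=False accepts both "203.0.113.7" and "203.0.113.0/24".
        self._networks.append(ipaddress.ip_network(address_or_network, strict=False))

    def is_banned(self, address):
        ip = ipaddress.ip_address(address)
        return any(ip in network for network in self._networks)
```

After `bans = BanList()` and `bans.ban("203.0.113.7")`, the call `bans.is_banned("203.0.113.7")` is true while a neighboring address is not. A single-address ban only helps as long as the spammer keeps posting from that address.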

Hopefully this will keep things quiet for a while, but I doubt it's the last we'll see of spam. The problem with solutions like captchas is that they only determine who's a computer and who's a human (and imperfectly at that), not who's got good intentions and who's got bad ones. A spammer could easily employ people in a sweatshop somewhere to solve captchas for target sites, and then turn over the actual posting of the spams to an automated program. You could even imagine a black-hat web site hosting illegal software or porn that required users to solve a captcha (pulled from the spammer's target site list) before the next warez file could be downloaded.

The internet is a bit of a mess. It's amazing that it actually works as well as it does.

Now back to running...



mfox

South Orange, New Jersey
Joined: 19 Dec 2004
Posts: 367

Re: Spamalot Posted: Fri Mar 02, 2007 7:14 pm 

I'm not hip to what the spammers are truly capable of, but would it help to create a captcha that, rather than displaying text you have to re-enter, displayed a simple equation whose answer the user has to enter? Some text before the captcha would prompt the user to "Enter the solution." The captcha would display something like "5+5=", and the user would enter "10" or "ten."

This assumes that the automated tools that use pattern recognition to defeat the captcha would (should?) incorrectly try to input "5+5=" rather than the true answer. Or are these tools smart enough to recognize the text prompt and figure out what input is actually expected?
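
For illustration only, the equation idea could look something like the Python sketch below. The function names and the single-digit addition format are assumptions, and the sketch covers only generating the challenge and checking the reply, not rendering the prompt as a distorted image.

```python
import random

# Spelled-out answers for every possible sum of two single digits (0 to 18).
NUMBER_WORDS = {
    word: value
    for value, word in enumerate(
        ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen"]
    )
}


def make_challenge(rng=random):
    """Return a prompt such as '5+5=' together with its expected answer."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"{a}+{b}=", a + b


def check_answer(expected, reply):
    """Accept the answer as digits ('10') or as a word ('ten')."""
    text = reply.strip().lower()
    try:
        return int(text) == expected
    except ValueError:
        return NUMBER_WORDS.get(text) == expected
```

A tool that simply echoes the recognized text fails here: `check_answer(10, "5+5=")` is false, while `check_answer(10, "10")` and `check_answer(10, "ten")` are both true.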


Rickshaw
Runworks 2005 5M Racer
San Francisco, CA
Joined: 26 Nov 2004
Posts: 1157

Re: Spamalot Posted: Fri Mar 02, 2007 7:48 pm 

An automated program that first encountered something like 5+5 would likely be stumped. But as soon as the spammer noticed that lots of sites were starting to use a test like that, it would be pretty easy to write a program to solve it in an automated way. I think the general idea is that a single automated program could never solve any arbitrary test, but a skilled programmer could probably write a program to automatically solve any one style of test, like a math problem or an image recognition test.

The most promising kind of test I've seen is a large set of general knowledge questions, like "Please enter President Washington's first name." That probably couldn't be defeated by any current program. However, you'd need to write thousands of different questions to prevent the spammer from simply hand-entering answers to them all one time, and you'd also run the risk that real humans couldn't answer the questions if they were too difficult or relied on particular cultural knowledge, etc.
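
Just to make the idea concrete, a question-based check might be sketched in Python like this. The two-entry question bank and the function names are hypothetical, and a bank this small would be hand-answered by a spammer in seconds, which is why thousands of questions would be needed.

```python
import random

# Hypothetical question bank: each question maps to its accepted answers (lower case).
QUESTIONS = {
    "Please enter President Washington's first name.": {"george"},
    "What country is directly north of the United States?": {"canada"},
}


def pick_question(rng=random):
    """Choose one question at random to show on the registration form."""
    return rng.choice(sorted(QUESTIONS))


def is_correct(question, reply):
    """Compare the reply to the accepted answers, ignoring case and padding."""
    return reply.strip().lower() in QUESTIONS.get(question, set())
```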

Long term, the solution probably involves several different approaches: better tools for tracking and identifying spammers, more consistent punishment of spammers to deter new ones, and automated identification of "bad" behavior (like posting an identical message twice within a short time) instead of a simple human/computer check.
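
The "identical message twice within a short time" check is the easiest of these to sketch. The Python below is a minimal illustration under assumptions: the class name, the five-minute default window, and the choice to compare messages across all users (rather than per account) are mine, not anything phpBB provides.

```python
import hashlib
import time


class DuplicatePostDetector:
    """Flags a message that repeats an earlier one within a time window."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._last_seen = {}  # message digest -> time it was last posted

    def is_duplicate(self, message, now=None):
        now = time.time() if now is None else now
        # Normalize case and whitespace so trivial edits don't dodge the check.
        normalized = " ".join(message.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        previous = self._last_seen.get(digest)
        self._last_seen[digest] = now
        return previous is not None and now - previous <= self.window_seconds
```

The dictionary grows without bound in this sketch; a real version would need to discard old entries, and a spammer who varies each message slightly would still get through.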


BGibbsLMT

Southington, CT
Joined: 12 Dec 2004
Posts: 68

Re: Spamalot Posted: Sat Mar 17, 2007 6:34 pm 

Unfortunately, entering President Washington's first name would be a problem for most adults in this country. Besides, I like spam, fried with a little lettuce and tomato on toast. Maybe some hot sauce. Yumm

Rickshaw
Runworks 2005 5M Racer
San Francisco, CA
Joined: 26 Nov 2004
Posts: 1157

Re: Spamalot Posted: Sat Mar 17, 2007 9:21 pm 

Sad, but true. I happened to catch an episode of the game show "Are You Smarter than a Fifth Grader?" recently. Boy, was that funny. One contestant didn't know what country was to the north of the United States. Overall, the 5th graders definitely seemed to be a smarter bunch than the contestants.

mfox

South Orange, New Jersey
Joined: 19 Dec 2004
Posts: 367

Re: Spamalot Posted: Sat Mar 17, 2007 10:31 pm 

Something can be said for using this approach to screen for reasonably intelligent forum discussions. You could argue that if someone doesn't know Washington's first name, then that person might not have much to contribute to the forum. ;)

BGibbsLMT

Southington, CT
Joined: 12 Dec 2004
Posts: 68

Re: Spamalot Posted: Sun Mar 18, 2007 6:37 am 

I think this would unfairly eliminate many trail runners. Most of us have suffered some form of brain damage from running headlong into low-hanging tree limbs and can barely remember our own first names, let alone that of some guy who's been dead for 200 years.

mfox

South Orange, New Jersey
Joined: 19 Dec 2004
Posts: 367

Re: Spamalot Posted: Sun Mar 18, 2007 11:17 am 

Ha, ha...point well taken. I guess there's no perfect mouse trap, huh.



All times are GMT - 8 Hours
 


Copyright © 2014 Runworks. All rights reserved.   Powered by phpBB © phpBB Group
