
If You Use the Web, You May Have Already Been Enlisted as a Human Scanner

This topic is archived.
FloridaJudy Donating Member (1000+ posts) Tue Aug-19-08 07:04 PM
Original message
If You Use the Web, You May Have Already Been Enlisted as a Human Scanner
SciAm.com, August 18, 2008 By Adam Hadhazy

You're just about ready to buy a pair of tickets on Ticketmaster, but before you can take the next step, an annoying box with wavy letters and numbers shows up on your screen. You dutifully enter in what you see—and what a bot presumably can't—in the name of security.

But what you may not know is that you also have helped archivists decipher distorted characters in old books and newspapers so that they can be posted on the Web.

You might think that computer scientists would have figured out a way to get computers to decipher those characters. But they haven't, so instead they've figured out a way to harness all that effort you're making to protect your security. "When you're reading those squiggly characters, you are doing something that computers cannot," says Luis von Ahn, a computer scientist at Carnegie Mellon University (C.M.U.) in Pittsburgh.


This sounds like an inspired idea, but I think they need to work on the ethics. There's something a little creepy about being used as an uninformed research subject, though I can't see any obvious risks to the user.

Full article here: http://www.sciam.com/article.cfm?id=human-book-scanners-on-the-web&sc=rss

dbonds Donating Member (1000+ posts) Tue Aug-19-08 07:08 PM
Response to Original message
1. I don't see any harm and lots of benefits
Many sites need a CAPTCHA system to distinguish people from programs; this one just has the added benefit of helping archive books. The system is called reCAPTCHA. It is a nice, complete CAPTCHA system for anyone who doesn't want to code their own, and ones that aren't program-readable are harder to code than you might think.

notesdev Donating Member (1000+ posts) Tue Aug-19-08 07:14 PM
Response to Original message
2. Wait a sec...
This doesn't quite make sense. If they don't already know what the characters are, how can it be used in a verification process? What are you going to compare the user input to, to see if it's real or a bot trying to bluff?

Indenturedebtor Donating Member (1000+ posts) Tue Aug-19-08 07:18 PM
Response to Reply #2
4. Hahaha true. Perhaps they could insert a "dummy" character
If you enter the dummy character you're good? :shrug:

FiveGoodMen Donating Member (1000+ posts) Tue Aug-19-08 07:25 PM
Response to Reply #2
5. My first thought as well.
Maybe there's an answer, but it seems crazy that the article didn't address this.

caraher Donating Member (1000+ posts) Tue Aug-19-08 11:08 PM
Response to Reply #5
7. It could still function as a "screen"
If there's one ambiguous letter, chances are there are only 2-3 reasonable choices anyway. If you think it looks like a "6" and I think it looks like a "b", they could accept either entry but reject someone who enters a "j".
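
To make the "screen" idea concrete, here is a minimal Python sketch. It assumes the site keeps a small set of plausible readings for each character position; the function name and the candidate sets are hypothetical, and nothing in the article says reCAPTCHA actually works this way.

```python
def passes_screen(entry: str, plausible: list[set[str]]) -> bool:
    """Accept an entry if every character is one of the plausible readings
    for its position; reject anything outside those sets."""
    if len(entry) != len(plausible):
        return False
    return all(ch in options for ch, options in zip(entry, plausible))


# Hypothetical word where only the last character is ambiguous ("6" or "b").
plausible_readings = [{"c"}, {"a"}, {"6", "b"}]

print(passes_screen("ca6", plausible_readings))  # True
print(passes_screen("cab", plausible_readings))  # True
print(passes_screen("caj", plausible_readings))  # False
```

Either reasonable guess gets through, while an entry like "j" that no human would plausibly read is rejected.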

FiveGoodMen Donating Member (1000+ posts) Wed Aug-20-08 12:08 PM
Response to Reply #7
9. Okay. That could work.
On the other hand, do you suppose someone is going to all the trouble to set that up just to see if other people agree with their guess on what a smudged character really is?

mainegreen Donating Member (1000+ posts) Thu Sep-04-08 06:49 AM
Response to Reply #2
13. The same text goes out to multiple people simultaneously.
Your answer is compared to the mob's answer.

The mob is considered to be correct.
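
Here is a rough Python sketch of that "compare to the mob" check, as described in this post. The function names, the vote thresholds, and the choice to accept an entry when no consensus exists yet are all assumptions for illustration, not details of the real service.

```python
from collections import Counter


def mob_answer(answers: list[str], min_votes: int = 3, min_share: float = 0.6):
    """Return the consensus reading, or None if the mob has not settled yet."""
    if len(answers) < min_votes:
        return None
    normalized = [a.strip().lower() for a in answers]
    reading, count = Counter(normalized).most_common(1)[0]
    return reading if count / len(normalized) >= min_share else None


def check_entry(entry: str, previous_answers: list[str]) -> bool:
    """Compare one user's entry with the mob's answer."""
    consensus = mob_answer(previous_answers)
    if consensus is None:
        # Assumption: with no consensus yet, accept the entry and treat it as a vote.
        return True
    return entry.strip().lower() == consensus


previous = ["upon", "upon", "up0n", "upon"]
print(check_entry("Upon", previous))  # True
print(check_entry("apon", previous))  # False
```

The weak spot in this sketch is the no-consensus case: until enough people have answered, there is nothing to compare against, which is the gap notesdev raises in Reply #2.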

FiveGoodMen Donating Member (1000+ posts) Thu Sep-04-08 02:12 PM
Response to Reply #13
16. That makes sense.

mike_c Donating Member (1000+ posts) Tue Aug-19-08 07:16 PM
Response to Original message
3. you're not "an uninformed research subject" at all....
Research subjects are the topics of research, not the tools used to conduct research. If anything, we're uncompensated research assistants, but a little altruism never hurt anyone.

FloridaJudy Donating Member (1000+ posts) Tue Aug-19-08 07:31 PM
Response to Reply #3
6. I think over 99% of people
Including me would say "Sure! Why not?" when asked to help out - after all, I lend my computer to Stanford for their protein folding project when I'm not using it, and I don't see any possible risks to the user from being used as a scanner. But it could lead to a flood of "parasitic computing" which might not be so benign.

http://www.sciam.com/article.cfm?id=parasitic-computing-can-u

Occulus Donating Member (1000+ posts) Wed Aug-20-08 12:50 AM
Response to Original message
8. Wow, that's wrong
Edited on Wed Aug-20-08 12:52 AM by kgfnally
"You might think that computer scientists would have figured out a way to get computers to decipher those characters. But they haven't,"

"When you're reading those squiggly characters, you are doing something that computers cannot,"

Complete nonsense. I work in a postal distribution plant, and we have industrial-grade optical character readers. The hardware and software can read cursive handwriting, for frak's sake.

AT TENS OF THOUSANDS OF PIECES PER HOUR.

Did they not realize this when the article was written?

skoalyman Donating Member (751 posts) Wed Sep-03-08 02:02 PM
Response to Reply #8
10. I wouldn't be so fast to dismiss it. Here, have a try

zipplewrath Donating Member (1000+ posts) Wed Sep-03-08 03:10 PM
Response to Reply #8
11. Different problem
Those machines know they are looking for addresses. They even have a basic format from which they are attempting to decipher information. Ambiguities can be sorted out by cross-correlation (uncertain ZIP codes can be matched against vague state abbreviations). This problem is slightly more complicated than that.
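
As a rough illustration of that cross-correlation, here is a Python sketch. The lookup table is deliberately partial (a few first-digit ZIP regions only), and the function name and candidate readings are hypothetical; real postal software is certainly far more elaborate.

```python
# Partial, illustrative table: first ZIP digit -> states in that region.
ZIP_FIRST_DIGIT_STATES = {
    "0": {"CT", "MA", "ME", "NH", "NJ", "RI", "VT"},
    "3": {"AL", "FL", "GA", "MS", "TN"},
    "8": {"AZ", "CO", "ID", "NM", "NV", "UT", "WY"},
    "9": {"AK", "CA", "HI", "OR", "WA"},
}


def consistent_pairs(zip_candidates, state_candidates):
    """Keep only (zip, state) pairs whose first digit and state agree."""
    return [
        (z, s)
        for z in zip_candidates
        for s in state_candidates
        if s in ZIP_FIRST_DIGIT_STATES.get(z[:1], set())
    ]


# The reader is unsure whether the first digit is 3 or 8,
# and whether the state reads "FL" or "EL".
print(consistent_pairs(["33101", "83101"], ["FL", "EL"]))  # [('33101', 'FL')]
```

Two uncertain fields each rule out the other's wrong readings, which is a luxury an isolated smudged word in an old book does not have.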

Posteritatis Donating Member (1000+ posts) Wed Sep-03-08 05:48 PM
Response to Reply #8
12. Sure they did
Can your postal computers read variant alphabets? Make out faded or multiply-overwritten text? Font/script changes? Non-linear writing? Blocks of text which are obviously in nothing near normal-postal-address format? Writing 'projected' onto curved surfaces, or which is heavily overlapped by damage or other script/illustration/etc?

OCR ain't that hot, and postal services use a very specialized version of that. They'd fall down dealing with margin notes in sixteenth-century manuscripts and the like.

joshcryer Donating Member (1000+ posts) Thu Sep-04-08 07:13 AM
Response to Reply #8
14. Postal addresses are composed of a few small variables. Just the zip code alone...
...allows the 'text' for the City/State to be completely ignored, reducing the problem significantly.

If you use an extended zip code, then you need to post no text at all, just a 9-digit number.

Canuckistanian Donating Member (1000+ posts) Thu Sep-04-08 11:06 AM
Response to Original message
15. Speaking of Ticketmaster
There was a documentary on CBC that asked the question - How do these ticket reseller companies manage to snatch up nearly ALL the good tickets to popular events in a matter of minutes?

The answer was that these companies were using sophisticated scanners that could figure out almost any scheme of distorted characters.

The company spokespeople vehemently denied that they had even heard of any such software, but some insiders said that they had seen it working.

No investigations are even pending against these guys.

skoalyman Donating Member (751 posts) Thu Sep-04-08 10:23 PM
Response to Reply #15
17. I wouldn't doubt it. I spent a little time on the website that's under my
other post. Typing away on their website, I managed to slip up a couple of times and it said I was correct when I wasn't, so it wouldn't be hard to believe a bot gaining entry where it's not welcome.

Important Notices: By participating on this discussion board, visitors agree to abide by the rules outlined on our Rules page. Messages posted on the Democratic Underground Discussion Forums are the opinions of the individuals who post them, and do not necessarily represent the opinions of Democratic Underground, LLC.


© 2001 - 2011 Democratic Underground, LLC