A Small Amazon Mechanical Turk Success Story
Today, I got the opportunity to try out Amazon Mechanical Turk (as a requester). For a long time, I've been excited about the potential a system like Mechanical Turk creates, and I'd been wanting to put it to a real-world test. If you're not familiar with Mechanical Turk, it's a marketplace for (typically small) jobs that require human intelligence. The saying goes, "artificial intelligence is hard, so Amazon's Mechanical Turk is Artificial Artificial Intelligence." I first heard about an interesting application of the Turk while reading Girl Turk, where Andy Baio leverages Mechanical Turk to research the genres of the music sampled in the Girl Talk album Feed the Animals (which I ended up buying).
Fast forward to today, when I was stuck with a problem. I had a four-page real estate agreement that I needed to customize, but I only had a scanned copy. I ran it through Acrobat's OCR and tesseract-ocr, neither of which produced results even close to usable (correcting the output would have taken longer than retyping the document by hand).
I tried typing the document by hand, but I found that typing a single paragraph took me approximately 10 minutes, so I felt it would be worth my time to investigate a process that would be repeatable.
So I thought I would give the Turk a try. I registered and started to create my first Human Intelligence Task (HIT). Since HITs work best when they're balanced and small, I decided to break up my job into four HITs (one for each page). I would have preferred to break it up into even smaller chunks (just paragraphs), but that would have required more intelligent processing, whereas the pages I could separate mechanically.
The Amazon sample HIT templates didn't include anything for typing a page (most samples are for small things like "answer a simple question" or "flag questionable content"), so I created my own template. It was simple and easy. I used their HTML editor to create some short instructions, an iframe to contain the PDF page, and a text box below it for entering the text. At Amazon's suggestion, I also included a comment box at the end where the worker could leave whatever feedback they wanted. The template contains exactly one variable, "PAGE_URL", which is the URL of the PDF page to be displayed.
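For a sense of how a one-variable template like this gets filled in, here is a minimal sketch in Python. The markup is a reconstruction from the description above, not the actual template, and the field names `page_text` and `comments` are made up for illustration. It assumes the variable is written as `${PAGE_URL}`, which happens to be the same placeholder syntax Python's `string.Template` understands.

```python
from string import Template

# Hypothetical reconstruction of the HIT layout: instructions, an iframe for
# the PDF page, a text box for the typed text, and a comment box.
# The field names "page_text" and "comments" are assumptions.
HIT_TEMPLATE = Template("""\
<p>Please type the text of the page shown below as accurately as you can.</p>
<iframe src="${PAGE_URL}" width="100%" height="600"></iframe>
<textarea name="page_text" rows="30" cols="100"></textarea>
<p>Comments (optional):</p>
<textarea name="comments" rows="4" cols="100"></textarea>
""")


def render_hit(page_url):
    """Fill in the single PAGE_URL variable for one page."""
    return HIT_TEMPLATE.substitute(PAGE_URL=page_url)


if __name__ == "__main__":
    # example.com stands in for wherever the split pages are hosted.
    print(render_hit("https://example.com/agreement/page-1.pdf"))
```

Each row of the input file produces one such rendered page, so four URLs yield four HITs.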
In order to create each of the HITs, I had to do two things:
- Split the document into separate pages.
- Supply Amazon with a CSV containing the PAGE_URL values. Since a CSV with only one field is trivial, I just needed a text file containing the word PAGE_URL on the first line, followed by each of the URLs on its own line.
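The second step can be sketched in a few lines of Python. This is an illustration rather than the script actually used: the base URL and the `page-N.pdf` naming scheme are assumptions about where the split pages end up.

```python
import csv
import io

# Assumed location of the uploaded, already-split pages (hypothetical).
BASE_URL = "https://example.com/agreement"


def build_input_csv(page_count, base_url=BASE_URL):
    """Return CSV text: a PAGE_URL header row, then one URL per page."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["PAGE_URL"])
    for page in range(1, page_count + 1):
        writer.writerow([f"{base_url}/page-{page}.pdf"])
    return buffer.getvalue()


if __name__ == "__main__":
    # Four pages, so four data rows under the header.
    print(build_input_csv(4), end="")
```

Using the `csv` module is overkill for a single column, but it keeps the quoting correct if a URL ever contains a comma.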
Finally, I loaded $4.40 into my Amazon Mechanical Turk account and created the HITs (worth $1 each, plus a $0.10 fee that Amazon takes on each). I wasn't sure the job would get done at that price, but in less than two hours I had all four pages typed. Each page took its worker about 45 minutes on average. Two of the workers commented that they hoped I was happy with the job and were eager to do more.
Amazon then gave me a link to the results, which come back as a comma-separated file; one of the columns contains the full text of the page as hand-typed by the worker.
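Stitching the typed pages back into one document from that file might look like the sketch below. The column names are assumptions: `Input.PAGE_URL` for the template variable and `Answer.page_text` for a text box hypothetically named `page_text`. Check the header row of the real download before relying on them.

```python
import csv
import io


def pages_in_order(results_csv):
    """Return the typed text of each page, ordered by its page URL.

    Plain string sorting is enough for URLs like page-1.pdf .. page-4.pdf;
    ten or more pages would need a numeric sort key instead.
    """
    reader = csv.DictReader(io.StringIO(results_csv))
    rows = sorted(reader, key=lambda row: row["Input.PAGE_URL"])
    return [row["Answer.page_text"] for row in rows]


def join_document(results_csv):
    """Join the pages into one text, separated by blank lines."""
    return "\n\n".join(pages_in_order(results_csv))


# Made-up sample data with the assumed column names; results can arrive
# out of order, and typed text can contain commas and line breaks.
SAMPLE = (
    "HITId,Input.PAGE_URL,Answer.page_text\n"
    'h2,https://example.com/agreement/page-2.pdf,"Second page, typed."\n'
    'h1,https://example.com/agreement/page-1.pdf,"First page,\ntyped."\n'
)

if __name__ == "__main__":
    print(join_document(SAMPLE))
```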
I was so happy with the results that I swiped my credit card for another $4 and gave a $1 bonus to each worker.
But what a bargain. Instead of typing something up, I got to program in Python and create something that was repeatable, while at the same time trading approximately 3 hours of typing for an hour of programming and $8.40.
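The arithmetic behind those figures is easy to check. The sketch below uses only numbers quoted in this post: four pages, a $1 reward and $0.10 fee per HIT, a $1 bonus per worker, and roughly 45 minutes of typing per page. It assumes no extra fee on the bonuses, since none is mentioned here.

```python
from decimal import Decimal

PAGES = 4
REWARD = Decimal("1.00")   # paid to the worker per HIT
FEE = Decimal("0.10")      # Amazon's cut per HIT, as quoted in the post
BONUS = Decimal("1.00")    # bonus per worker (assumed to carry no fee)
MINUTES_PER_PAGE = 45      # average typing time reported per page


def total_cost(pages, reward=REWARD, fee=FEE, bonus=BONUS):
    """Total spend in dollars for the given number of one-page HITs."""
    return pages * (reward + fee + bonus)


def typing_hours(pages, minutes_per_page=MINUTES_PER_PAGE):
    """Hours of typing handed off to the workers."""
    return pages * minutes_per_page / 60


if __name__ == "__main__":
    print(total_cost(PAGES))    # 4 * (1.00 + 0.10 + 1.00)
    print(typing_hours(PAGES))  # 4 * 45 / 60
```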
The next step, of course, is to learn the Amazon Mechanical Turk API and set up a site where others can upload their scanned PDFs and swipe their credit card with me (for the service of handling the pages and brokering the exchange).
Written on May 1, 2010