Many of you might already be aware of it, but I thought it might be helpful to let other experimental philosophers know about an easy way to get participants for experimental studies. It's called Amazon's Mechanical Turk, or MTurk for short, and it allows you to put together questionnaires, have participants complete them online, and export the results to an Excel file.
In creating the questionnaires (referred to as Human Intelligence Tasks or HITs), it's helpful to know a little bit of HTML code, but you can also create them in a rich text editor or from stock forms. You get to set exactly how much participants are paid for completing the questionnaire and can reject participants who don't answer control questions correctly or complete the questionnaire without reading.
MTurk can be a really useful resource, especially for piloting data, but there are also a few potential problems: Participants often answer the questions as quickly as possible without reading carefully and many of them are not from countries where English is the native language. But, that being said, it is an extremely easy way to get data very quickly, and I thought some x-phiers might find it useful.
I would be really interested to hear what those who have used MTurk think about it. Is it a reliable way to collect data for experimental philosophy studies?
Jonathan--
Thanks for posting this. It would be great to hear more about how to increase the likelihood that participants read the instructions and respond to the questions carefully. Have you found some ways to successfully do that?
I too have found that participants tend to take a VERY short time in responding, even when offered higher pay ($1, when the usual is more like $0.25, I gather). A survey that tends to take ~10 minutes in lab settings was completed in less than 3 minutes by nearly half of the participants in one study that I ran.
Some other potential problems: I am not sure if there is an option to do otherwise, but it seems that all questions (and vignettes) have to be on the same page. There is also no easy way to counterbalance the order other than manually. As you say, MTurk might be good for piloting, but I have some doubts about its use for a more rigorous study.
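For reference, manual counterbalancing of the kind mentioned above could be sketched roughly as follows. This is only an illustration, not an MTurk feature: the vignette labels and the function names `counterbalanced_orders` and `assign_order` are hypothetical, and the idea is simply to generate every presentation order and cycle through them, posting one version of the questionnaire per order.

```python
from itertools import permutations


def counterbalanced_orders(vignettes):
    """Return every possible presentation order of the vignettes."""
    return [list(order) for order in permutations(vignettes)]


def assign_order(participant_index, orders):
    """Cycle through the orders so each one is used about equally often."""
    return orders[participant_index % len(orders)]


# Hypothetical labels standing in for three vignettes.
vignettes = ["A", "B", "C"]
orders = counterbalanced_orders(vignettes)

# The first six participants each get a different order.
first_six = [assign_order(i, orders) for i in range(6)]
```

Full counterbalancing grows factorially with the number of vignettes, so with more than three or four items a subset of orders (for example, a Latin square) would probably be more practical.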
At any rate, it'd be nice to hear how people use this tool! One helpful resource page that I've found (through Nina Strohminger) is http://experimentalturk.wordpress.com/resources/ (including demographic information of MTurk workers).
Posted by: Shen-yi Liao | Monday, January 04, 2010 at 05:56 AM
Sam,
I think the best way to make sure participants are carefully reading the vignettes and instructions is to include control questions or comprehension checks which can only be answered with careful reading. It is then easy to determine which participants did, in fact, follow the instructions and read carefully.
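As a minimal sketch of that check, assuming the results have been exported and loaded into a dictionary keyed by participant: the question names, the answer key, and the function `passed_checks` are all hypothetical, and a skipped question is treated as a wrong answer.

```python
def passed_checks(responses, answer_key):
    """True only if every comprehension question was answered correctly."""
    return all(responses.get(q) == correct for q, correct in answer_key.items())


# Hypothetical answer key for two comprehension questions.
answer_key = {"check1": "B", "check2": "True"}

# Hypothetical exported responses.
participants = {
    "P1": {"check1": "B", "check2": "True"},
    "P2": {"check1": "A", "check2": "True"},  # one wrong answer
    "P3": {"check1": "B"},                    # skipped a question
}

careful = [pid for pid, r in participants.items() if passed_checks(r, answer_key)]
```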
When you create a survey, you can also control who is able to participate (e.g. allowing only people whose work has very rarely been rejected). I also find this to be helpful.
You are definitely right, though, that you can't counterbalance the order of questions or break up an experiment into multiple pages. For more rigorous, technical experiments, it may be better to use online services like Cotterweb or Qualtrics.
Thanks for including that link! It really does have a lot of very helpful information.
Posted by: Jonathan Phillips | Monday, January 04, 2010 at 02:24 PM
I'd like to use this resource, but it sounds problematic. Is there any way to make payment contingent on getting comprehension questions right? One thing Dylan Murray came up with in our recent studies is, in addition to dropping those participants who miss any of the 2-3 comprehension questions, we drop those whose completion time is more than two standard deviations below the mean (and there's lots of overlap between these fast-takers and those who miss comp. questions).
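A rough sketch of the timing rule, reading it as "drop anyone faster than the mean minus two standard deviations". The completion times and the function name `too_fast` are made up for illustration, and the sample standard deviation is an assumption, since the comment does not say which was used.

```python
from statistics import mean, stdev


def too_fast(times, n_sd=2.0):
    """Return the cutoff and the participants who finished below it.

    times maps participant IDs to completion times in seconds.
    """
    values = list(times.values())
    cutoff = mean(values) - n_sd * stdev(values)
    flagged = [pid for pid, t in times.items() if t < cutoff]
    return cutoff, flagged


# Hypothetical completion times in seconds.
times = {
    "P1": 600, "P2": 620, "P3": 580, "P4": 610, "P5": 590,
    "P6": 605, "P7": 595, "P8": 615, "P9": 585, "P10": 60,
}

cutoff, flagged = too_fast(times)
```

Completion times are often right-skewed, in which case the cutoff can come out at or below zero and flag no one; a rule based on the median, or on log-transformed times, may then be worth considering.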
While we're at it, does anyone know if there are any standards in psychology for the best format for comprehension questions (and whether they should appear before or after experimental questions)?
And has anyone discovered any other ways to get diverse subject pools (other than Josh's park roving)? I still use students, though they are at Georgia State, so more diverse in SES, race, and religion than most universities. But I'd like to try some other populations.
Posted by: Eddy Nahmias | Monday, January 04, 2010 at 09:10 PM
Hi guys,
I used MTurk to run two surveys (Sam turned me on to it). I ran 280 people in 18 hours total (in two separate blocks). I chose only people from the USA, assuming that would provide more homogeneity in English competence/idiolect. They averaged close to 3 minutes per survey, which seems brisk to me, but maybe ok.
I directed the Turkers to a U. Michigan secure site that hosts the survey. This allowed me to do multiple pages, full counterbalancing, etc. When the survey is done, it gives the Turker a confirmation code to enter on the MTurk page. If you have Qualtrics access, this would be a perfect way to use MTurk and Qualtrics together.
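How the confirmation code is produced is not described here, so the following is only one possible scheme, offered as a sketch: the survey site appends a keyed checksum to its own session ID, and the researcher later recomputes it to confirm that a code entered on MTurk really came from a finished survey. `SECRET_KEY`, `confirmation_code`, and `is_valid` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical private key known only to the survey site and the researcher.
SECRET_KEY = b"replace-with-a-private-key"


def confirmation_code(session_id):
    """Build a code of the form '<session_id>-<8 hex characters>'."""
    digest = hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256)
    return f"{session_id}-{digest.hexdigest()[:8].upper()}"


def is_valid(submitted):
    """Check that a submitted code matches the checksum for its session ID."""
    submitted = submitted.strip()
    session_id, _, _tag = submitted.rpartition("-")
    if not session_id:
        return False
    expected = confirmation_code(session_id)
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))


code = confirmation_code("s123")  # shown to the participant on the last page
```

A code that is the same for every participant would be simpler, but it could be shared among workers; tying the code to a session makes each one unique.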
In terms of data quality, I had run a subset of these questions paper and pencil with more than 200 students. The results are very, very similar (all statistical tests for differences are highly non-significant). Also, in my MTurk survey, I asked the same question multiple times in slightly different ways (if this sounds weird, it's a long story why I did this). There was good within-subject consistency even though similar question items were often separated by a dozen other questions (making it hard for someone entering random responses to be consistent).
Interestingly, I also ran a similar survey using Craig's List ads and got about 100 responses. Quality was much worse with many, many more missing items and strange responses that are hard to believe or inconsistent. And it took a LOT of time and effort to get the ads placed.
All in all, I am pretty satisfied with the MTurk experience and will likely use it a lot in the future, for pilots if not for actual stuff to be published.
-Chandra
Posted by: Chandra Sripada | Monday, January 04, 2010 at 10:26 PM
Eddy,
Using MTurk, you can either 'accept' or 'reject' any completed survey, and participants are only paid for accepted surveys. So there is no need to compensate participants who took the survey too quickly or who failed the comprehension questions. You get to choose who gets paid.
I like Dylan's way of selecting how quickly is too quickly for survey completion.
As far as the standards for comprehension questions, my sense is that they are generally at the end of the survey because you don't want to risk having them affect participants' answers. However, I am definitely not an expert on this.
Chandra,
This is really encouraging, and I really like your approach of combining MTurk and Qualtrics! This definitely seems like the way to go, and I think I'll take that approach in the future for studies that are complex enough to need multiple pages. Thanks for the tip.
Posted by: Jonathan Phillips | Tuesday, January 05, 2010 at 02:41 PM
Eddy,
As a reminder, Justin Sytsma, Jonathan Livengood, Adam Feltz and I are running a website called PhilosophicalPersonality (google it if you do not know it).
We are perfectly willing to add some vignettes, and it costs something between 20 and 25 dollars (if memory serves) per vignette.
Cheers
Edouard
Posted by: Edouard Machery | Tuesday, January 05, 2010 at 04:27 PM
Thanks for posting this, Jonathan! Sorry I'm sort of late to the discussion.
While I share Sam's worry about the speed at which MTurk subjects work, it might not be such a bad thing. After all, most of the time we want snap, intuitive judgments based on simple cases. And these MTurk subjects have the monetary incentive both to move quickly (so they can do more HITs) and to comprehend the material (else they don't get paid). Given Chandra's positive experience comparing data from more usual subjects, this sounds like a good resource for more than just piloting.
A major issue then is whether Institutional Review Boards would be at all wary of the use of human subjects on MTurk. My IRB at least is quite picky!
Posted by: Josh May | Wednesday, January 13, 2010 at 01:29 PM