Simon Cullen--a student at Melbourne University working on a thesis under the direction of Frank Jackson and Neil Thomason--just sent me a link to a very interesting project that he has been working on that is entitled "Survey Driven Romanticism." On the surface, at least, it appears that he has taken the time to flesh out some of Antti Kauppinen's worries about x-phi in greater detail. More importantly, Simon has run studies of his own to support the claims he puts forward! As such, this paper is certainly one that we should all read and address. Here is the abstract (the link to the paper is here):
Survey-Driven Romanticism: What's wrong with experimental philosophy
Many experimental philosophers assume that subjects' intuitions about the intrinsic, philosophically interesting, features of survey-based thought-experiments can be simply "read off" from subjects' survey responses. The experimental results presented here demonstrate that this assumption is false: responses to some of the most influential (and oft quoted) experimental philosophy surveys vary systematically according to a variety of philosophically irrelevant factors. I present the results of an eight-month study involving over 5,000 subjects---by far the largest in experimental philosophy yet.
I conclude that experimental philosophers have not yet managed to reveal folk intuitions about the philosophically interesting features of their survey vignettes. Rather, conversational norms and the formal pragmatic features of the surveys themselves provide meaningful and highly effective cues which subjects rely on when reasoning about survey vignettes. Further, slight and prima facie philosophically irrelevant variations to the semantic content of survey vignettes can drastically affect subject responses.
Despite experimental philosophers' insistence that they are "unified behind ... the application of methods of experimental psychology to the study of the nature of intuitions", they have worked largely in isolation from current and well-established social and cognitive scientific research on survey methodology. Until this is corrected, their results are likely to be of little philosophical significance.


Dear Simon,
I have not yet read the paper (I just printed it), but the abstract strikes me as dialectically strange in two respects.
- An important project in experimental philosophy is intuition debunking (what I like to call the Rutgers Plan). The argument goes as follows: intuitions are affected by philosophically irrelevant intuitions (e.g., culture, SES), so they cannot play the role they are assumed to play. It seems from your abstract that your findings provide more evidence for the premise of this argument. So, rather than undermining this aspect of experimental philosophy, it supports it.
- The last paragraph is dubious, given that at least some papers by experimental philosophers have been published in highly regarded PSYCH journals and given that some of us are actively collaborating with psychologists.
Edouard
Posted by: Edouard Machery | Monday, March 24, 2008 at 12:18 PM
Hi Edouard!
I should have been more specific in the abstract -- in the intro I do say that the paper is focused on a particular (I think until now quite common) way of doing X-phi. I definitely don't want to go on the record as claiming that *all* experimental philosophers are guilty of this kind of methodological neglect -- I make a point of saying that some are much more conscious of these issues! (And I get the impression that it's getting better all the time.)
About the "restrictionist" argument. My results might be taken to strongly support that view -- but the point of my paper is that I think another, more plausible, way of interpreting the results is by very clearly distinguishing _survey results_ from _intuitions_.
Anyway, I'm looking forward to hearing what you think of the paper.
Cheers!
Simon
Posted by: Simon | Monday, March 24, 2008 at 01:15 PM
I obviously meant "affected by philosophically irrelevant variables" rather than "affected by philosophically irrelevant intuitions".
Posted by: Edouard Machery | Monday, March 24, 2008 at 02:02 PM
Thank you, Simon, for an interesting and provocative paper. There are a number of comments to be made. At this point, I am going to limit myself to three.
First, you offer the following characterization of restrictionism:
“Quite briefly, restrictionism is the view that “what it is intuitive to say” should be restricted to quite specific groups; and as these groups become smaller and more various, the traditional questions about “our” concepts become proportionally less interesting.” (p. 3)
This might be a faithful summary of one of the main points made in the WNS paper, but it doesn’t faithfully capture the general restrictionist position. Restrictionism is better summarized, I think, in something like the following way. There have been a series of recent empirical studies that suggest that some particularly prominent, and commonly appealed to, philosophical case intuitions have epistemically undesirable properties. In particular, these studies suggest that some philosophical intuitions are sensitive to philosophically irrelevant features such as who is considering the hypothetical case, the presence or absence of affective content, and the context in which the hypothetical case is being considered. Let’s use the term “instability” to pick out this epistemically undesirable characteristic. Even though only a few philosophical intuitions have been examined, we can neither explain what it is about any of these intuitions that makes them unstable nor predict, of any other philosophical intuition to which we might like to appeal, whether or not that intuition will also be unstable. These findings thus present a challenge to the well-functioning of the philosophical practice of appealing to such intuitions: some restriction of that practice is called for. The restriction might involve restricting the class of intuitions to which we can appeal as evidence. The restriction might involve restricting the class of people whose intuitions can be appealed to as evidence. The restriction might involve restricting under what conditions we might appeal to intuitions as evidence.
Second, if I understand correctly, your central worry is with our methodological presupposition that “intuitions can simply be read off from survey responses”. You seem willing to grant that we have shown, for example, that survey responses vary according to the context in which a hypothetical case is being considered, but want to contend that we are wrong to claim that we have shown that intuitions vary according to the context in which a hypothetical case is being considered. So, why should we think that intuitions can’t be read off from survey responses? If I understand correctly, you want to say that: (1) If a person’s judgments are sensitive to the wrong kinds of things, then that person’s judgments don’t count as intuitions; and (2) survey responses are sensitive to the wrong kinds of things. I think that we agree that (2) is true (and, furthermore, that we have a great deal of empirical evidence that it is true). My worry is that you have done little to show that (1) is true. That is, I don’t see what reason you have given us for thinking that if a person’s judgments are sensitive to the wrong kinds of things then they don’t count as intuitions.
Finally, I feel obligated to comment on the tone of your paper. I hasten to add that the spirit of this comment is entirely constructive. My impression is that the tone of the paper is too combative, snarky, and loaded with histrionics. There are a number of reasons why you want to avoid this kind of tone in your work. First, you seriously risk alienating those with whom you are trying to hold a philosophical conversation. Beginning a conversation by challenging the positions and arguments of one’s philosophical opponents is, of course, fine; but it is quite another thing to begin a conversation by challenging their intelligence. Second, you run the risk of obscuring what important insights your contribution to the conversation might actually have. If the focus of the conversation becomes whether or not you have been appropriately charitable to your opponents, this draws the attention away from where it really should be. Third, you run the risk of devaluing the contribution that you are trying to make. There are a great many things written in philosophy (and elsewhere) that are so wrongheaded that they simply aren’t worth either our time or attention. If you spend too much time casting your opponent’s position or arguments as completely wrongheaded, then it becomes hard to see why you spent any time or energy at all trying to show what is wrong with that position or those arguments. After all, we remember those who slay dragons, not those who take out the trash.
Posted by: Joshua Alexander | Tuesday, March 25, 2008 at 05:19 PM
Hi Joshua!
Thank you for your helpful comments re defining restrictionism. I need to fix that bit up.
On to number two. Unless I've misunderstood your reply, I don't think you've responded to the point of the paper. I think the point in the priming experiments is that people's responses do not at all reflect their judgments about the intrinsic philosophically interesting features of the cases (which I assumed are what you wanted to measure).
So, in the priming example (pp.12-18) my point was that people interpret the survey in line with conversational norms, and that means understanding the questions as requests for non-redundant information. The only way for them to do that is to locate the second, Truetemp question in the context of the first priming question, and that means _understanding the question comparatively_ with respect to the priming question (p.13). So, on my account, subjects read “Does Charles really know that it's 71 degrees in the room, or does he only believe it?” as asking, essentially, “Is Charles' case more obviously, or less obviously, a case of knowledge, than Dave's (or Karen's) case?” (p.17)
And what I want to say is that this is exactly what we should expect them to do – given that they are trying to provide informative responses to your surveys. You say: “We found that intuitions in response to [Charles’ case] vary according to whether, and which, other thought experiments are considered first.” And my point is simply: your subjects are not giving you their intuitions about Charles' case.
The story is a little different in the dichotomous/non-dichotomous and forced choice/Likert scale experiments. The point there is simply that what you, Nichols, and Stich are measuring is, inter alia, people's reactions to the formal pragmatic features of your surveys. So, when you find that Western and East Asian subjects respond differently to the same survey, it might well be because Westerners and East Asians respond differentially to the survey pragmatics, i.e., the response alternatives, and not the philosophically interesting features of the vignette. This possibility is very well-supported by decades of survey research (pp.23-7).
About your third point. For all the reasons you mention, I certainly didn't mean it to be too “combative, snarky, and loaded with histrionics”, much less, to challenge anyone's intelligence!! So let me apologise for that here. I will keep it in mind for later revisions of the paper.
My critique of your papers was pretty detailed, and I don't think you've really responded to it. So, for now, I can only ask that you try to overlook any snarkiness on my part!
Cheers,
Simon
Posted by: Simon | Wednesday, March 26, 2008 at 03:58 AM
Simon,
First, I just wanted to point out that you have not responded adequately to Joshua's second point. Indeed, until you have done so it seems to me that your entire project falls short. Keep in mind that your main point--as far as I can tell--is that experimental philosophers have merely been exploring people's responses to survey questions. As such, a number of problems arise. Most importantly, by your lights, these problems show that we are not really getting at folk intuitions at all--i.e., we cannot make the move from “participant P judged that x is an instance of y on a survey” to “P's intuitions are such that P judges that x is an instance of y.” As far as I can tell, Joshua is correct that your evidence for this seems to be that insofar as participants' responses to the surveys of experimental philosophers are affected by philosophically irrelevant features of the wording and order of the questionnaires, the experimental setting, and the like, we cannot simply assume that these responses give us insight into the participants' intuitions. My question for you is: How are you defining “intuition”? If you are simply defining intuition as an “intuitive seeming that is not affected by philosophically irrelevant factors,” it seems that you are begging the question from the start. I think most of us experimentalists are operating with a notion of intuition that is roughly in line with the account put forward by Goldman and Pust (1998)--viz., “the contents of intuitions are usually singular classificational propositions, to the effect that such-and-such an example is or is not an instance of knowledge, of justice, of personal identity, and so forth.” You seem to want to limit the class of singular classificational judgments and propositions that are to count as intuitions. I suppose I would like to hear your argument for doing so.
Second, setting issues of tone aside, it seems to me that the stance you take in the paper is only possible if you (a) do not provide a very detailed analysis of the meta-philosophical background of the works you criticize (e.g., no mention of Stich's early work with Nisbett or the more recent work by Bishop and Trout), and (b) ignore the work of several others working in the experimentalist tradition whose work is part of a project that Eddy and I call Experimental Descriptivism. This project includes work by Marc Hauser, John Mikhail, Liane Young, Fiery Cushman, Joshua Greene, David Pizarro, John Doris, Rob Woolfolk, John Darley, Jen Wright, and others. There are two things worth pointing out. First, this list contains many trained (and in some cases very prominent) psychologists—which makes your charges of methodological shortsightedness look either overblown or ill-informed. Second, even the experimental philosophers you treat as your primary stalking horses—i.e., the experimental restrictionists (Alexander, Stich, Nichols, Weinberg, Swain, Machery, and others) and experimental analysts (me, Nahmias, Knobe, and others)—often also work on the descriptive project of trying to figure out the underlying cognitive and neural mechanisms that undergird and generate our intuitions. Experimental philosophy is not only a big umbrella, but those who consider themselves to be doing experimentalist work in philosophy often wear many hats--something you do not seem to adequately take into account.
Third, you try to get mileage out of the fact that you purportedly ran “by far the largest experimental philosophy study yet”—but I don’t think you are correct on this front either. I am pretty sure Cushman’s Moral Sense Test is the largest on-going experimental philosophy study. That being said, I am curious whether you think the work being done with that data set is susceptible to your criticisms. Of course, you could always once again opt for a scope limiting move which would cast the meaning of “experimental philosophy” very narrowly so as to exclude the work being done by the psychologists. But this would be a suspicious move that would require argumentation to show that you weren’t just begging questions against us yet again. Moreover, since Cushman, Young, Pizarro, Greene, and others explicitly place themselves under the large umbrella of experimental philosophy, you would need an argument for why they are misguided in doing so.
In short, I think your argument ultimately depends on two unmotivated restrictive moves. First, you rely on a narrow (and arguably question begging) notion of what it means for a judgment to be an intuition. Second, despite drawing the distinction between the projects of experimental analysis and experimental restrictionism, you neglect to either acknowledge or address the work done in the descriptivist tradition. By my lights, this is particularly problematic since the work in this area is least susceptible to your criticisms.
Posted by: tnadelhoffer | Wednesday, March 26, 2008 at 09:20 AM
Hi Thomas,
I think you're right to pull me up on defining “intuition” – it's something I seriously need to answer. I think the definition I work with is basically the same as yours---intuitions are spontaneous judgements about whether a case is or is not a case of knowledge (free agency, justice, etc). But *some* points in my paper don't rely on any particular definition of “intuition”. So, I'll discuss these first!
Of all the parts of the paper, the discussion of Swain, Alexander, and Weinberg's priming results is, I think, the most immune to your criticism. Their major premise is that they have “found that intuitions in response to this [Truetemp] case vary according to whether, and which, other thought experiments are considered first.” My first point is, even granting that subjects are producing intuitions (on pretty much any interpretation of the word), they are not intuitions in response to the Truetemp case. Subjects perceive the cases and questions as belonging together, as they are presented to them in the survey. But in their conclusion, Swain, Alexander, and Weinberg reason from their subjects' responses to just one of the questions (the Truetemp case) considered in isolation.
From this they conclude that their subjects' responses are influenced by “irrelevant factors”. But it seems to me that the fairer way to present their results is: subjects respond that “Karen's case is a more obvious case of knowledge than Charles'”, and “Dave's case is a less obvious case of knowledge than Charles'”. And they'd be right! So let me ask if we agree about this much and whether what I've said relies on any very particular view about intuitions?
Examples of this sort of thing are all over the survey research literature. The example I give in the paper is that of part-whole questions. When you ask a specific question followed by a general question, people interpret the general question, in line with Grice's Maxim of Quantity, as asking for additional information. My example was: imagine asking someone “Do you enjoy eating Jelly-beans?” and they respond, “Yes, I absolutely love Jelly-beans!” And then you immediately ask them, “Do you enjoy eating junk food?”. They are going to perceive the two questions together: the second question is asking whether they enjoy eating junk food _other than_ Jelly-beans.
So this is an example where, if I presented their response to the second question (suppose they answered “no”) in isolation, it could give a very misleading impression. My claim was that something like this is going on in Swain, Alexander, and Weinberg's priming experiments. Their subjects perceive the questions as related – and respond to them together. I give an analysis of why they do this in terms of contrast and assimilation effects in the paper.
So I'm happy to go with Goldman and Pust's definition, “the contents of intuitions are usually singular classificational propositions, to the effect that such-and-such an example is or is not an instance of knowledge ...” And what I want to say is, these survey responses (or intuitions) are not about a single case – they are about cases considered in conjunction. So that's the first bit.
The point where I think your comment is really important is when I argue from the dichotomous/non-dichotomous and forced choice/Likert scale results to the claim that intuitions can't simply be “read off” from survey responses.
I'll just quickly re-hash those results for other readers of the blog who mightn't have read the paper. I found that when you ask subjects “Does Charles really know that it is 71 degrees in the room or does he only believe it?”, and you allow them to respond by selecting either “Really knows” or “Only believes”, about 30% answer that he “really knows” (this is also what Weinberg, Nichols, and Stich found). But, when you ask simply “Does Charles know that it is 71 degrees in the room”, and allow as possible answers “knows” and “does not know”, only 57% say “knows”. And I found this effect seems to work on a variety of survey vignettes.
Now your point is that restrictionists might respond “Look! Their intuitions about the case change depending on how you ask the question!” And the worry is that my response is question begging in the following sense. If I define “intuition” to mean “judgement that is not influenced by irrelevant factors” then it follows trivially from the premise that how you ask the question is irrelevant to whether or not the case is a case of knowledge, that subjects aren't producing intuitions. So, if I understand you, you think I might rely on something like the following argument:
1. Intuitions are spontaneous judgements that are not influenced by irrelevant factors
2. In survey-based thought-experiments, how you ask a question is irrelevant to whether a case is a case of knowledge
3. How you ask the question in (e.g.) the Truetemp case dramatically affects subjects' responses; therefore,
4. Responses to (e.g.) the Truetemp case are not intuitions
And Joshua might (reasonably) respond: OK: you've got lots of evidence for (2), BUT (1) is just the negation of the restrictionist claim!
I think I've already explained why this isn't relevant to what I've said about Swain, Alexander, and Weinberg's experiments. So again, I'll start with the easier response---how my results bear on inter-cultural studies like Weinberg, Nichols, and Stich's.
They found that Western and East Asian subjects respond differently to the same question, and they conclude that Western and East Asian subjects have different epistemic intuitions. What I've found is that the pragmatic features of the survey itself drastically affect subject responses. Now my argument is that the difference between Western and East Asian subjects' survey responses might be accounted for, not in terms of their having different concepts of _knowledge_, and thus different epistemic intuitions, but in terms of their differing sensitivity to survey pragmatics. That is, we know that how you phrase the question has a substantial effect on how people respond. Could that effect be culturally differential?
The answer seems to be, Yes. (I list some evidence from cross-cultural psychologists and survey researchers on pp.22-27.) So while W and EA subjects might have exactly the same epistemic intuitions (the same concept of knowledge), EAs might be more sensitive to (e.g.) the presence of the intensifying adverb “really” in the question and in response alternatives. (And a whole bunch of other pragmatic features of the survey.) So, I don't *think* that argument relies on a very particular view of what intuitions are.
Now, there is a harder question, which I'm not going to face up to right now: how do my results challenge restrictionist arguments which rely only on results from single subject population experiments? I've been aware of this problem for a while – so thank you for forcing me to face up to it. I'm going to think about it a bit before I reply. But let me just say that there's a lot in the paper which doesn't rely on any of this stuff, and doesn't have anything to do with restrictionism in particular -- so even if my analysis of (e.g.) the dichotomous/non-dichotomous results is ultimately wrong, I think much of what I've said should still be important for experimental philosophers.
About ignoring the meta-philosophical background. My stance is just that the methods being employed in this particular part of X-phi (which is not *all* of X-phi -- but still a considerable part of it) urgently need to be improved, and that they have given rise to a number of bad arguments. (Obviously what I say doesn't have much to do with Josh Greene's fMRI work!) I think it's a really interesting hypothesis that epistemic intuitions vary between cultures and other sub-populations, and I don't have any problem with the motivation for restrictionists' or analysts' work (though I am more interested in analysis myself). Ignoring experimental descriptivism is a failing of mine – but like I said in the paper, it's about a very particular way of doing experimental philosophy – one which I think has been common enough to ensure the paper's interest.
Thanks for your excellent comments – I'm sure they will help to improve the paper. I'll think some more and write soon.
Simon
Posted by: Simon Cullen | Wednesday, March 26, 2008 at 12:12 PM
Sorry, I meant to say, I'll return to the general worries about how my results bear on both restrictionists' arguments which rely only on results found within a single sub-population, where responses to survey pragmatics can be assumed to be uniform, and experimental philosophers' survey-based investigations of folk concepts generally!
Posted by: Simon Cullen | Wednesday, March 26, 2008 at 12:19 PM
Simon,
I am sure you realize that the argument against Weinberg, Nichols and Stich is not very impressive.
You write "Could that effect be culturally differential? The answer seems to be, Yes." Sure, that COULD be true, as could a bunch of other interpretations of the finding. What would be more impressive is some evidence that different reactions to the pragmatic context of the study DO explain the finding. This, of course, calls for more experimental philosophy.
Furthermore, you should not lose track of the fact that Weinberg, Nichols, and Stich (and for that matter Machery, Mallon, Nichols, and Stich) have always been very cautious about the significance of their findings. The philosophical arguments have always been conditional: If the findings are not an artifact and if the proposed interpretation is correct, then here would be the philosophical implications.
Edouard
Posted by: Edouard Machery | Wednesday, March 26, 2008 at 12:34 PM
Edouard,
There's definitely motivation for some experimental philosophy here---but we're not in the dark, thanks to a lot of excellent survey researchers' work. So, while I certainly don't claim to have demonstrated that these pragmatic effects DO account for Weinberg, Nichols, and Stich's results, it remains a very real possibility, and one which has not, to my knowledge, been seriously explored by experimental philosophers.
Simon
Posted by: Simon Cullen | Wednesday, March 26, 2008 at 12:49 PM
Simon,
You claim that some points in your paper don’t turn on your ability to provide either an account of what you mean by “intuition” or a defense for your claim that if a person’s judgments are sensitive to the wrong kinds of things, then that person’s judgments don’t count as intuitions (presumably, the former would ground the latter).
You then restate your objection to the SAW study:
“even granting that subjects are producing intuitions (on pretty much any interpretation of the word), they are not intuitions in response to the Truetemp case.”
Why aren’t they intuitions in response to the Truetemp case? You seem to have substituted the problem of having to identify what you think an intuition is with the problem of having to defend a view of how to individuate intuitions. What does it mean for an intuition to be an intuition about a specific hypothetical case? If I understand your view, an intuition counts as being about a specific hypothetical case only if the intuition is responsive only to intrinsic features of that case. Okay, but why?
You suggest that a fairer way of understanding our results is that subjects respond “Karen’s case is a more obvious case of knowledge than Charles’” and “Dave’s case is a less obvious case of knowledge than Charles’”. It is important to point out that these aren’t actually the responses that we received. Of course, it may well be that the responses that we received are best explained as having been influenced (either consciously or unconsciously) by the pro-attitudes that you are attributing to the subjects. But—and here is where the individuation problem I mentioned in the previous paragraph emerges—it isn’t clear that this means that their responses aren’t intuitions about the Truetemp case. You need an account of what it means for an intuition to be an intuition about a particular hypothetical case. Let me reiterate: we claim that our subjects have intuitions about the Truetemp case and that these intuitions are influenced by philosophically irrelevant factors. You claim that our subjects have intuitions that are influenced by philosophically irrelevant factors and that this means that they don’t have intuitions about the Truetemp case. It seems to me that you need to do something to motivate that claim.
Posted by: Joshua Alexander | Wednesday, March 26, 2008 at 04:26 PM
Simon,
I also wanted to add, in response to your earlier apology, that there aren’t any hard feelings about the tone. I felt that it was something that needed to be addressed, but never thought that you intended to set that kind of tone in your paper.
Josh
Posted by: | Wednesday, March 26, 2008 at 07:17 PM
Hi Jonathan!
First, I want to thank you for this excellent discussion – it's definitely helped me to articulate my position more carefully!
OK. Now, I have to pick you up on one thing. I do not claim that if a person's judgements are sensitive to the wrong kinds of things, then they are ipso facto not intuitions! Nor do I claim that because your “subjects have intuitions that are influenced by philosophically irrelevant factors ... this means that they don’t have intuitions about the Truetemp case”. I want to distinguish what we might call “survey intuitions” (what I call simply “survey responses” in my paper), as a particular kind of intuitive judgement which is highly sensitive to a host of pragmatic influences, particular to the context of an experimental survey. Many of these influences are well-known to survey researchers – so they should not come as any surprise to us.
So I endorse a limited kind of restrictionism, which, to give it a name, we might call “survey restrictionism”. Josh Knobe suggested that it might be helpful for me to make the distinction in terms of “intuitions”, which are a kind of mental state, and “survey responses”, which are a kind of overt behaviour. Norbert Schwarz pointed out an interesting related debate in psychology about whether people have stable attitudes which can be distorted by context effects, or whether it's all “made up on the spot”, as he put it. Schwarz directed me to his 2007 article in Social Cognition, “Attitude Construction: Evaluation in Context” (doi:10.1521/soco.2007.25.5.638) which you might take a look at.
My reason for resisting full-blown restrictionism is simply that much of the survey research you take to support restrictionism exploits context sensitivities which are unique to (or at any rate, profoundly amplified by) the survey context. So, response order effects, assimilation and contrast, different responses to open vs. closed questions, dichotomous vs. non-dichotomous response alternatives, and so forth, either do not exist or are not too important in the relatively stable contexts in which philosophers consider thought-experiments. (You suggest that readers of the original Truetemp case might have been primed by an earlier section where Lehrer considers clear cases of knowledge. This seems unlikely since many priming effects are known to disappear in explicit conversational contexts like those in which philosophers consider thought-experiments (p.16). Further, the effects are likely amplified by subjects' naivety as to the purpose of considering the hypothetical scenarios, in addition to the unsettling strangeness of their narratives (p.11) -- hence subjects' need to draw on contextual clues. I talk a lot about this in my paper – especially around p.11 RHS, p.13, and p.18 LHS, and I suggest an experiment which might help to further the discussion in footnote 14.)
That being said, I do think there are contextual and social cognitive issues which *might* influence philosophers' intuitions (e.g., it would be a brave philosopher indeed who denied having “Gettier intuitions”!). But you haven't provided evidence for that (just yet!).
So my view is that restrictionists' survey research _does_ provide a serious objection to experimental philosophers. But the inference from “survey intuitions are sensitive to 'irrelevant factors'” (remember they are not irrelevant to the respondents), to full-blown restrictionism, remains unjustified, and will remain unjustified so long as well-known pragmatic _survey issues_ can easily account for your results, as I believe they clearly do in your paper with Swain and Weinberg.
In your last post you asked “Why aren’t they [the survey responses] intuitions in response to the Truetemp case?” And my answer is: of course they are! -- In part. They are responses to the Truetemp case in conjunction with either one of the priming cases. In just the same way, a response to the question “Do you like junk food?” which immediately followed a response to the question “Do you like eating Jelly-beans?” would be a response to those two questions in conjunction (i.e., a response to the question “Do you like eating junk-food other than Jelly-beans?”).
I'm not going to provide a general principle for individuating intuitions (that sounds scary!) -- but I can tell you, in any particular case, what counts as a response to a single case, and what counts as a response to some number of cases considered in conjunction. The way to do that is exactly as I did it just now with the Jelly-bean question: by analysing the context in which the cases are considered. If I establish a different conversational context by prefacing the questions with “I'm interested in your opinion about two separate dietary areas, (1) Jelly-beans and (2) Junk-food”, then people would consider the questions independently, and their intuitions would (I hope!) be correspondingly independent. So I don't see that there's any problem of “individuating intuitions” -- it's something you've got to do case by case.
So let me clarify my position. First, I take your results (and lots of survey researchers' too) to present a serious objection to straightforwardly identifying “survey intuitions” with philosophers' intuitions, or lay-people's intuitions in other, non-experimental contexts. (In my paper I called “survey intuitions” simply “survey responses” -- but I now see that might suggest that I deny survey responses are intuitive classificatory judgements to the effect of X is a case of Y -- I don't.) So, you might call my position “survey restrictionism”.
The difficulty it presents for experimental restrictionists is that it tempers the inference from “survey results are sensitive to various (relative-to-the-observer-) irrelevant factors”, to “philosophers' intuitions (or lay-people's in non-experimental contexts) are sensitive to irrelevant factors” – since you would need to argue that the sensitivities you take to undermine ordinary intuitions' evidentiary propriety are not unique to the survey context. I'd argue that they are unique to the survey in the case of your work with Swain and Weinberg, and that they quite plausibly are in the case of Weinberg, Nichols, and Stich's inter-cultural and inter-social experiments.
To experimental analysts, my results present the same trouble as those of experimental restrictionists more generally. But I'm optimistic that lots of these issues can be ameliorated by a better methodology (which is what I'm trying to think about now!).
Cheers!
Simon
Posted by: Simon Cullen | Friday, March 28, 2008 at 10:43 AM
"...so long as well-known pragmatic _survey issues_ can easily account for your results, as I believe they clearly do in your paper with Swain and Weinberg."
I think the most fundamental problem here is that this is just not obviously true. Indeed, with your "easily" and "clearly" in there, we can say that your claim as stated is just not true, full stop. There is no actual hypothesis presented in your paper that accounts for our result. And if there is a survey-pragmatics confound out there, you have yet to show us what it could be.
For starters, at a bare minimum, you owe us an explanation as to why there's a Karen/Dave effect only on Truetemp, and not on Karen, Dave, or most importantly, on "Fake Barn" Suzy. It does not seem that the kind of effects you are invoking would be so finely discriminating in where they would apply. And thus they are not terribly good candidates for explaining our findings.
(Ideally, your hypothesis would also make sense of the combination of our results with the results found by Jen Wright:)
http://experimentalphilosophy.typepad.com/experimental_philosophy/2006/12/thoughts_about_.html
It's really just not the case at all that anyone in x-phi (_pace_, that is, your attributions of naivete to us) is confused about the idea that there is OF COURSE an inferential step from observed behavior on surveys to whatever is philosophically relevant going on in our subjects' heads. (I think if you had read, e.g., the objections and replies section of the WNS article with even a hint of charity, you would see that we are not unsympathetic to the idea that there is no immediate inference from survey results to philosophical intuitions.) However, the way the game is played is that in order to cast doubt on that inferential step, you need an actual rival hypothesis that fits the data. In the absence of any competitor explanation, the default interpretation of differences between survey responses is that they do indeed track differences in the underlying psychology. But, you have not yet presented an actual competitor. And I don't see any obvious way to build one from the materials that you try to bring to bear. Your idea of hunting for one in the survey-pragmatics literature really was a fine idea, and I commend you for it, but you still have to actually _succeed_ in that hunt, before you will be in a position to raise an objection here. (I would also add to what Edouard said above, that the cross-cultural materials you cite do not even come close to suggesting an actual confound for the patterns in the WNS or MMNS findings.)
So, you haven't really put any actual plausible confounds into play. This seems to be, in part, because you simply have not formulated one, but I also think that partly this is because of a confusion that runs throughout your paper, conflating together the literature on pragmatics-based effects on survey responses with the literature on other contextual effects on judgment itself. For example, the Damisch et al. paper that you pull a big quote from is a paper that, because it concerns context effects on expert judges (see study 4), is much more consistent with our interpretation (that these survey differences are reflecting induced differences in the actual judgments) than with your interpretation (that these survey differences are just artifacts of the experimental conditions on untrained subjects). So it is unsurprising that you have not seen how to apply all of this literature to your ends, because not all of this literature actually supports your ends.
You might also want to consider the way in which Schwarz's position, if embraced, is a much more potent threat to traditional philosophical method of intuitions than it is to the restrictionist program -- if there's simply no fact of the matter about what "the" underlying attitude is about Gettier cases and the like, then the would-be starting point for traditional epistemological theorizing would seem to be nonexistent. Schwarz concludes his paper with, "Context sensitivity is not noise that we need to overcome -- it is the message. We should heed it." I and my fellow restrictionists have heard this message loud and clear, and we've been working hard to relay it to the rest of philosophy. But what, I wonder, would it mean for analytic philosophers to hear & heed it? Now _that_ would be a question worth pursuing.
Posted by: jonathan weinberg | Friday, March 28, 2008 at 07:59 PM
Hi Jonathan W!
Thanks for your comments.
About the Schwarz paper I mentioned to Josh. I'm not sure if you meant to alert me to Schwarz's conclusion at the end of your comment? I mentioned his paper in SC because of the objection it provides to my suggestion (made immediately beforehand) that we might distinguish survey responses from intuitions considered as mental states. You seem to have picked up on this. I think it's a really interesting hypothesis which provides, prima facie, a deep threat to the most popular way of interpreting the practice of conceptual analysis. But it seems to me you don't need to take Schwarz's line on that topic to make use of his survey research. So I'm not sure if you meant your comment to bear on my paper in this way?
I state my hypothesis for the SAW data in a few places in the paper (and again in the comments above):
“Swain et al. present their subjects with an extreme case (Dave's or Karen's) followed by a more moderate case (Charles'), within the one conversational context. The explanation for their results is the same too: the correct responses to the extreme cases will seem to subjects---who are totally unaware of the motivation for the experiment---so obvious as to clearly violate conversational norms [see, Haviland and Clark, 1974]. And since asking very peculiar questions makes people draw most heavily on contextual clues [Schwarz, 1995, p.156], it is not at all surprising that in the experimental conditions Swain et al. have created, we should find strong contrast effects. The conclusion is, Swain et al.'s experimental results are also but minor additions to the already extensive psychological literature on contrast and assimilation.
...to explain Swain et al.'s results there is no need to make recourse to their subjects' intuitions about the intrinsic epistemically relevant features of the Truetemp case. A plausible if less exciting hypothesis is that their survey results are products of the well-known phenomena of contrast and assimilation. ... On this account, Swain et al.'s subjects tried to provide meaningful, informative responses to the survey questions. They were primed with stunningly obvious cases of knowledge or non-knowledge, and they (mistakenly) gave their researchers the benefit of the doubt. Thus, they turned to the conversational context of the survey and interpreted the questions as requesting comparative judgments and not their ``intuitions'' about the intrinsic features of each case [p.15-16].”
If you don't think this constitutes a serious, testable hypothesis, I'd be interested in your reasons for this. A few pages on I say: “What [Swain et al.] have not considered is that within the conversational context in which their subjects consider the Truetemp vignette, the ordering is highly relevant: it helps to determine the very meaning of the question.”
You do offer one reason for rejecting this hypothesis:
“For starters, at a bare minimum, you owe us an explanation as to why there's a Karen/Dave effect only on Truetemp, and not on Karen, Dave, or most importantly, on "Fake Barn" Suzy. It does not seem that the kind of effects you are invoking would be so finely discriminating in where they would apply. And thus they are not terribly good candidates for explaining our findings.”
I'll take a look at Jen Wright's results in more detail–but for now I can point you to where I explain “why there's a Karen/Dave effect only on Truetemp, and not on Karen, Dave, or most importantly, on "Fake Barn" Suzy”. (I only explicitly mention the Fake Barn case, but the explanation can be extended naturally):
“Swain et al. have themselves provided further evidence for the conversational-pragmatics hypothesis. They ``expected [the less obvious fake-barn case] to generate mixed intuitions; with some subjects willing to attribute knowledge, and others not''. And ``since the other cases were designed to test the effects of presenting a clear case of knowledge and a clear case of non-knowledge before the Truetemp Case, we included the last case to test the effects of presenting a mixed case before the Truetemp Case''. Swain et al. ``found that subjects' intuitions about [the fake-barn] case were, given the Truetemp Case's liability, surprisingly stable'', which they take to ``raise an interesting question for the philosopher who relies on intuitions: which intuitions, if any, are resistant to the potential effects of irrelevant factors?''
On the conversational pragmatics hypothesis Swain et al.'s results are not at all surprising. In fact, Swain et al. have themselves provided the explanation: they expected the fake-barn case ``to generate mixed intuitions; with some subjects willing to attribute knowledge, and others not''. Because subjects find the Fake Barn case less extreme than either Karen's or Dave's cases, the contrast between the priming question and the target question is accordingly less extreme. The pragmatic hypothesis predicts that under these conditions there will be a less extreme shift in judgements; which is exactly what Swain et al. found” [p.17, ftnt 15].
I'll quickly mention that what Jen Wright found seems, at first blush, at least, completely compatible with this hypothesis too:
“Of particular interest, however is the fact that neither Coin Flip nor Chemist were vulnerable to the order effect. In both cases, participants’ knowledge attributions remained relatively stable regardless of the order of presentation. Across presentation order, the strong majority stably judged that Chemist knows that p (yes=79-90%) and that Coin Flip does not know that p (no=94-100%).”
This is what we'd expect since, as I explained in the paper as well as my comments here, it's only in relatively ambiguous situations that subjects turn most strongly to contextual clues to help decide their responses. (Too, my gut response is to agree with Jen's analysis: “Having only just recently formed a judgment about another case, it makes sense that participants would rely on that case as a relevant point of comparison, as it were.”)
About the WNS study. I'm the first to acknowledge that I provide no serious hypothesis of my own. But that wasn't my point. The point of the discussion there isn't to develop an alternative hypothesis, but only to suggest that the methodology of the study didn't take into account a range of serious, commonly recognised methodological issues.
About my supposedly “conflating together the literature on pragmatics-based effects on survey responses with the literature on other contextual effects on judgment itself”. To tell you the truth, I just don't see the distinction. Conversational pragmatics based effects are just another side of the more general context sensitivity of cognition. Both Olympic judges and experimental philosophy subjects in your study are placed in the position of evaluating stimuli sequentially. The point of my paper and previous comments -- which you've not responded to -- is that you purposefully foster conditions which are highly conducive to C&A effects in the SAW study, and then seem to generalise without argument to the contexts in which philosophers consider thought-experiments. But philosophers do not assess thought-experiments in the manner of your experimental subjects responding to survey vignettes, and they do not score them in the manner of Olympic judges scoring sports events. Until you provide _some_ evidence that we can reasonably expect contrast and assimilation effects to have a substantial effect in the contexts in which analytic philosophers typically consider thought-experiments, it seems to me that you haven't done anything to undermine the evidential propriety of their intuitions.
Simon
Posted by: Simon Cullen | Saturday, March 29, 2008 at 10:22 AM
Simon,
Let me begin with a few comments concerning your Friday post.
I didn’t mean to suggest that you actually made those claims explicitly. I meant only to suggest that your criticism of restrictionism seems to commit you to those claims (or something like them) and that those are, subsequently, claims that you need to defend. But, let’s set that issue aside for the moment.
If I understand your Friday post, you now want to distinguish between survey intuitions (what you were calling “survey responses”) and intuitions generated in philosophical (or, at least, non-survey) contexts. Let’s call these latter intuitions, for want of a better name, “philosophical intuitions”. You seem willing to concede that restrictionists have shown that survey intuitions are unreliable. What you want to claim is that we haven’t shown that philosophical intuitions are unreliable.
In order to defend this claim, you must give us some reason to think that the kind of sensitivity displayed by survey intuitions is unique to survey intuitions (or, at least, isn’t displayed by philosophical intuitions). That is, you must give us some reason for thinking that philosophical intuitions are more reliable than survey intuitions. If I understand your position correctly, you want to say that there is either something special about philosophical contexts (the conversational context is more explicit in philosophical contexts than in survey contexts) that make philosophical intuitions more reliable than survey intuitions or something special about philosophers (philosophers better understand the nature and purpose of philosophical analysis and attempt to provide non-comparative judgments to thought-experiments) that make philosophical intuitions more reliable than survey intuitions.
The problem that I have is that you seem content to merely speculate both about what differences there are between philosophical and survey contexts (and between philosophers and survey subjects) and about how those differences make philosophical intuitions more reliable than survey intuitions.
Why should we agree with you that the conversational context is more explicit in philosophical contexts than in survey contexts? Why should we think that this, if true, makes philosophical intuitions less sensitive than survey intuitions to philosophically irrelevant factors? Why should we think that philosophers, unlike survey subjects, attempt to provide non-comparative judgments to thought-experiments? Why should we think that one person’s having a better understanding than another person of the nature and purpose of philosophical analysis will make the first person’s intuitions less likely than the second person’s intuitions to be sensitive to philosophically irrelevant factors? It’s not that I don’t think that you can answer these questions; it’s that you must answer them in order to provide a compelling challenge to restrictionism and that I don’t think that you have yet answered them. (For example, you claim, in your paper, that priming effects can disappear when the conversational context is made explicit. But, in order to explain why philosophical intuitions are more reliable than survey intuitions, you need more than this; you need to show that they do disappear in philosophical contexts. And, I don’t see that you have done this.)
Let me see, then, if I can summarize what has been worrying me from the start. You seem to want to argue that experimental philosophers haven’t been studying the right kind of thing and that, therefore, restrictionism leaves the philosophical practice of appealing to the right kind of thing untouched and unscathed. At times, it seems like you want to distinguish between survey responses and intuitions; at other times, it seems like you want to distinguish between, what we might call, single-case intuitions and comparative intuitions; and, at still other times, it seems like you want to distinguish between survey intuitions and (what I have called) philosophical intuitions. My general worry throughout has been, and remains, that you haven’t yet provided sufficient motivation for making these distinctions nor shown that, whatever it is that philosophers are interested in appealing to as evidence, that kind of thing is more reliable than the kinds of things that have been studied by experimental philosophers.
This brings me to a comment that you made today in your response to Jonathan’s comments:
“The point of my paper and previous comments -- which you've not responded to -- is that you purposefully foster conditions which are highly conducive to C&A effects in the SAW study, and then seem to generalise without argument to the contexts in which philosophers consider thought-experiments. But philosophers do not assess thought-experiments in the manner of your experimental subjects responding to survey vignettes, and they do not score them in the manner of Olympic judges scoring sports events. Until you provide _some_ evidence that we can reasonably expect contrast and assimilation effects to have a substantial effect in the contexts in which analytic philosophers typically consider thought-experiments, it seems to me that you haven't done anything to undermine the evidential propriety of their intuitions.”
The problem, I am afraid, is that you are confusing on whose shoulders the dialectical burden rests. You are suggesting that there is a difference between survey intuitions and philosophical intuitions (between survey responses and intuitions, between single-case intuitions and comparative judgments). The burden is, thus, on you to provide reason for thinking that such a distinction is substantive and to show that, whatever it is that philosophers are interested in appealing to as evidence, that kind of thing is more reliable than the kinds of things that have been studied by experimental philosophers.
Josh
Posted by: Joshua Alexander | Saturday, March 29, 2008 at 01:24 PM
re: Schwarz - My bad. I thought you were citing his article in defense of your line of argument, but I see now that that wasn't what you were doing.
Posted by: jonathan weinberg | Saturday, March 29, 2008 at 02:35 PM
Hi Josh,
I think there are at least two things going on here. First, you make a good point -- I do owe you some reason to distinguish the experimental survey context from those contexts more typical of philosophers considering thought-experiments. (I think I've provided some good reasons for this and that more can be elaborated – to which I will return.) The second thing is, I think you (and Jonathan) may have misunderstood the aims of my paper and my own stance on the issues, and that this is causing some confusion about what I intend my argument to achieve.
You say: "you want to say that there is either something special about philosophical contexts ... that make[s] philosophical intuitions more reliable than survey intuitions or something special about philosophers' [intuitions]..." Well, I'm not sure that I want to say anything quite _that_ strong. Perhaps Nadelhoffer's introducing the paper as "fleshing out some of Antti Kauppinen's worries about x-phi" has given the impression that I want to defend philosophers' right in general to make claims about "what we find it intuitive to say" -- but that's not my end at all. (I say in my paper that "I think [experimental philosophers] are right to suspect that what philosophers find intuitive might diverge, possibly quite often, from what lay-people find intuitive" [p.35], and that the empirical investigation of intuitions is well-motivated. -- Which is, incidentally, why I'm interested in experimental philosophy.) _If_ I were out to defend analytic philosophers' right to uncritically rely on their own intuitions in general, then I'd _really_ owe you a story about what makes their intuitions good and the contexts they appear in special.
You say: "But, in order to explain why philosophical intuitions are more reliable than survey intuitions, you need more than [the fact that priming effects disappear in explicit conversational contexts]; you need to show that they do disappear in philosophical contexts. And, I don't see that you have done this." All I am saying is that the evidence you have presented against philosophers' use of intuition isn't any good. I do not think the kind of context sensitivity which you have demonstrated in your experimental subjects is plausibly having much of an effect on philosophers' consideration of thought-experiments. Now, it may yet turn out that philosophers' intuitions are unreliable – but it seems to me that you have not shown this.
So given that I'm not out to defend analytic philosophers' practice of appealing to intuitions in general, but only to defend them from your claim to have (prima facie) demonstrated that "philosophical intuitions", to borrow your phrase, are sensitive to the order in which cases are considered, I don't have to argue that "there's something special ... that make[s] philosophical intuitions more reliable than survey intuitions". All I have to argue is that whatever force is responsible for the instability in your subjects' intuitions, is plausibly not present in philosophical contexts. And I think that once we correctly identify that force as a contrast and assimilation effect driven by the survey presentation, and once we see that its peculiar strength comes from subjects' interpreting the vignettes and questions in very particular circumstances (the characteristics of which I have elaborated several times), it's not hard to see that this force isn't likely to play much of a role in determining philosophers' judgements. So, to further the debate, I don't think any further experiments are required: all that needs to be shown is that the effect is not likely to operate in philosophical contexts.
You say: "The problem that I have is that you seem content to merely speculate both about what differences there are between philosophical and survey contexts (and between philosophers and survey subjects) and about how those differences make philosophical intuitions more reliable than survey intuitions." It is always better to have empirical data specific to each case over which one wishes to speculate. However, it is also possible, and very common, to extract general principles from cases so far considered, and to apply these principles to unconsidered cases. We know a lot about the processes which govern question-answering, etc., so empirically well-informed speculation is possible and should constitute a serious move in this debate. Demanding that I collect experimental data on the conversational contexts in which philosophers consider thought-experiments before I can legitimately respond to your experiments seems unfair. More so since I'm not concerned to defend philosophers' reliance on intuition in general, but only to defend them from your claim to have challenged it empirically.
A good deal is known about the factors which foster and mediate pragmatic influences in survey responses – and we can assume certain things about the contexts in which most philosophers consider thought-experiments. For example, we can safely assume that philosophers do not take into account the likely interests of survey researchers – however it is widely accepted that survey respondents frequently consider this. More seriously, the many classic survey issues deriving from the formal features of the survey, the presentation of response alternatives, etc., are simply irrelevant in philosophers' cases. And most importantly for your experiments, we can assume that philosophers do not consider series of contrasting thought-experiments on the assumption that survey researchers have purposefully presented them in that order to convey their meaning and to communicate their purpose. (I know that sounds silly, but the situations are so different it is impossible to find an analogue. Perhaps you might say, philosophers do not assume that the order in which an author (or colleague) presents thought-experiments within the one chapter (or lecture) is meant to communicate their purpose and their meaning – unless it is done explicitly, in which case it is desirable.)
You ask: "Why should we agree with you that the conversational context is more explicit in philosophical contexts than in survey contexts?" Do you agree that a subject interpreting a bizarre survey vignette, without any knowledge of the purpose for which it was created, is in a substantially different position to a philosopher considering a thought-experiment which she knows has been designed to elicit her intuitions about a particular concept? We can safely assume, for example, that philosophers are aware of the complete irrelevance of most of the narrative details (details which are apt to confuse the pants off a naïve experimental subject) of a thought-experiment. This sort of knowledge is what I refer to when I say that the conversational context in which philosophers consider thought experiments is more explicit than that in which subjects consider survey vignettes. Survey respondents, in contrast to philosophers, operate on the assumption that all contributions to the survey are relevant to their task of generating responses. This is well established in the survey literature; I emphasise it in my paper several times and I provide references to respected survey research. It doesn't seem that any more research is required to demonstrate this.
I hope this helps to clarify and motivate the distinction between the conversational contexts in which philosophers and experimental subjects operate. If not, I have to ask (as politely as I possibly can!!) that you have a look for yourself in the survey literature which I've referenced in my paper.
You ask: "Why should we think that this, if true, makes philosophical intuitions less sensitive than survey intuitions to philosophically irrelevant factors? Why should we think that philosophers, unlike survey subjects, attempt to provide non-comparative judgments to thought-experiments? Why should we think that one person's having a better understanding than another person of the nature and purpose of philosophical analysis will make the first person's intuitions less likely than the second person's intuitions to be sensitive to philosophically irrelevant factors?" The answer is: because people turn to contextual clues (e.g., surrounding questions) when they are uncertain of the meaning of the question (vignette) or their task. They do this in proportion to the ambiguity of the question/task/purpose of the experiment. This too is empirically well-established, and I provide several references. Now, it is possible that other contextual influences have a substantial effect on philosophers' responses to thought-experiments, but I don't see that you have demonstrated any.
You say: "At times, it seems like you want to distinguish between survey responses and intuitions; at other times, it seems like you want to distinguish between, what we might call, single-case intuitions and comparative intuitions; and, at still other times, it seems like you want to distinguish between survey intuitions and (what I have called) philosophical intuitions." First, the distinction between "philosophical intuitions" and "survey intuitions" just is the distinction between survey responses and philosophical intuitions. You introduced the first term, and I only introduced the second term to be clear that I wasn't denying that survey responses are classificatory judgements to the effect that C is a case of X. The distinction between single-case intuitions and comparative intuitions is simply the distinction between a single response to a single case considered in isolation, and a single response to two cases considered jointly. It seems like a pretty natural and well-motivated distinction to me (see my Jelly-bean example above). I explained in a previous comment that in any particular case we can determine whether a response is likely a response to a single case considered in isolation, or to two cases considered jointly, "by analysing the context in which the cases are considered. ... it's something you've got to do case by case."
Turning from analytic philosophers to experimental analysts. Your line of argument might seem very threatening, given that restrictionists and analysts both work in very similar experimental contexts. On the reading of the SAW study which I think you favour, the results provide a prima facie challenge to the _very possibility_ of analysis, since you take the study to show something deep about intuitions, viz., that they are unstable. But on the reading I favour, where we explain the results not in terms of intuitions' being sensitive to objectively epistemically irrelevant factors, but in terms of their reflecting subjectively relevant pragmatic cues contained in the survey, the SAW study provides an important methodological _lesson_ about, inter alia, presenting sequences of cases for subjects' consideration.
You say: "The problem, I am afraid, is that you are confusing on whose shoulders the dialectal burden rests. You are suggesting that there is a difference between survey intuitions and philosophical intuitions (between survey responses and intuitions, between single-case intuitions and comparative judgments). The burden is, thus, on you to provide reason for thinking that such a distinction is substantive and to show that, whatever it is that philosophers are interested in appealing to as evidence, that kind of thing is more reliable than the kinds of things that have been studied by experimental philosophers." I hope this clarifies the motivation for distinguishing philosophical and experimental contexts, and clarifies how these differences are relevant to the appearance and mediation of contextual effects like those exploited in the SAW study. All of this has been, to my knowledge, pretty well empirically established by survey methodologists. Given that I'm not out to defend philosophers' reliance on intuition in general, I will make one comment about where the burden of argument rests.
It seems to me that you have created a specific and highly contrived experimental situation to foster the appearance of contrast and assimilation effects in your subjects' judgements. It would seem, then, that the burden is on you to argue that this also plausibly occurs in philosophers' cases.
Simon
Posted by: Simon Cullen | Monday, March 31, 2008 at 11:09 AM
Simon,
I've had a chance now to review the relevant sections of your paper, and some of your sources, and I'm afraid that the original diagnosis still stands: you have not yet presented us with a confound of the SAW results.
But first, let's step back for a minute and get clear on the dialectical situation. Here's what I take to be the standard line of reasoning: Study X (such as WNS, MMNS, or SAW) presents preliminary, but nonetheless real prima facie evidence that survey responses aimed at eliciting judgments very similar to the sort typically cited as intuitions in the philosophical literature, are sensitive to factors that are not philosophically relevant (like ethnicity, or question order). Therefore, in the absence of any specific demonstrable reason to think that intuitions of the sort cited in the philosophical literature are psychologically distinct in an appropriate way from the judgments elicited in survey X, practitioners of intuition-driven philosophy now have a reason to worry that their own preferred source of evidence is itself sensitive to philosophically irrelevant factors.
(Note, btw, that this argument does not require that the context, pragmatics, etc. of surveys and philosophical intuitionizing are _identical_ – merely that they are not _relevantly_ different. And note also that this argument does not in the slightest require any immediate inference from what a response on a survey is to what the philosophically-meaningful cognition is. So, those are two claims that you attribute to the restrictionists that you're really going to need to un-attribute.)
Now, there are at least three clear lines of attack against this restrictionist argument. (1) One can attempt to cast doubt on the very first step, that study X presents even prima facie evidence concerning the survey responses; perhaps the researchers have not conducted their statistics properly. (2) One can try to argue that the variation in question is, in fact, philosophically relevant; such would be one kind of contextualist response to SAW. And (3) one might grant the conclusion as worded, but attempt to devise a relevant specific demonstrable reason to think that philosophers' intuitions work differently in some appropriate way from the judgments measured by the surveys.
I take it that your paper is meant to be pursuing strategy (3). Now, there are two distinct components that one can consider, when offering a response of type (3): an account of what's going on in the survey judgments, and an account of what is going on in whatever it is that philosophers cite as intuitions. Moreover, these are both empirical components, though it is certainly possible that various aspects of them will not require special scientific forms of investigation. (E.g., that philosophers do not typically respond using a Likert scale.) And it is incumbent upon someone pursuing this sort of response to tell a filled-in enough story about either or both of these components, so that we can see how it really does draw a substantial psychological difference where it needs to be drawn; and moreover, it is incumbent upon them to provide sufficient appropriate empirical evidence for that story. You have some sketches of some materials from both, but none of it, as it stands currently, makes it to the level of clarity and justification required.
As Josh has noted, your discussion of the cognition of philosophical intuition is pretty slim, and you would do well to take a bit more time to consider what the pragmatics, and the psychology more generally, of typical philosophical argumentation are. Philosophy papers are not especially less artificial than surveys (especially if J. L. Austin knew what he was talking about!). There are an awful lot of serial considerations that go on in many intuition-based papers; consider BonJour's original deployment of his range of slightly-varied clairvoyance cases, to take one example that comes easily to mind. And there is a distinct pragmatic element of the way in which we are usually told very directly, and it is usually clear antecedently from the argumentative context, what intuition we are expected by the author to have about the case in question. And – most importantly, I think – though you disclaimed the distinction in an earlier comment, a great many context effects are not generated by pragmatics, and so it may be simply irrelevant that philosophy papers and surveys have different pragmatic contexts. The very work that you cite talks at length about the importance of activated information and the like, which does not require anything survey-like at all.
This is not an automatic killer for your argument, but it does very substantially raise the stakes for your account of the psychology of the survey context, which now has to do pretty much all the work. Basically, you need to show, using only aspects of the survey context that are patently not in the philosophical context, that you can explain the entirety of the pattern in the data in your targets' studies. This doesn't give you much to work with, mostly just the presence of the Likert scales and your claims about the subjects' finding the context itself confusing. And here is where you run aground. Your proposed explanation is that subjects are using the Likert scale to indicate a degree of difference between any given case and the one(s) previous to it. But then you should predict that Fakebarn Suzy will also show a Karen/Dave effect; but no such effect is observed. Moreover, your explanation is symmetrical between the different cases, as all that is supposed to matter is the perceived comparative sameness/difference across cases. So whatever effect Karen or Dave has on Charles, one should expect a similar effect of Charles on Karen or Dave; likewise with Suzy on Karen or Dave. But no such effects are observed. And you should also predict that Karen and Dave will also have an effect on each other; but no such effect is observed. Now, one might very reasonably point out that the Dave responses are so consistently close to the floor, that there's no room for them to move further south in response to comparisons to other cases, and that would be fair enough. But the Karen responses are not at ceiling – their overall mean is just shy of 4 on a 5-point scale – so there's no reason, on your account, for there not to be an effect of the other cases on Karen. And, if one adds in Jen Wright's findings, the issue of Fakebarn Suzy gets much worse, since she is influenced by Charles, even though she shows no Karen/Dave effect; your theory would predict the opposite.
Your footnote that you pointed at in your recent comment, btw, only addresses the question of why there is no Suzy effect on Charles or vice-versa, which is indeed what we reported. But it doesn't seem at all to address the rest of the pattern, and indeed, it seems to commit you pretty solidly to treating Suzy as an intermediate case like Charles, and hence committing you to an incorrect prediction that there would be a similar Karen/Dave effect on Suzy. (And this is not the sort of thing you should have been trying to address in a footnote, either! Demonstrating that you've got a real confound for us here is, or at least should be, at the very heart of your argument.)
Note that we have not given any reasons to think that the sort of factors you're invoking aren't in play at all. I suspect that those factors are indeed present, but that they just aren't making too big of an impact; we would probably detect them if we had a much larger n. But what we're saying is that the factors you're pointing to do not make for a good contender to explain the pattern that we did detect. Just because you have an explanation to offer that does successfully explain some sorts of order effects that might really occur in our surveys, it does not follow that you have an explanation for any and every order effect that might crop up there. And ours seems to be a case of an order effect that is not well-explained by those factors.
There are some further theoretical problems here as well, in that the psychological literature you invoke to underwrite your account doesn't really fit our survey contexts, either. There's nothing at all like a part/whole relationship in our survey items, and no obvious candidates in them for the conversational norm of non-redundancy to take hold. Here's a key difference between what's going on in our surveys, and what's going on both in the part/whole cases and in your disjunction study: in the latter, but not the former, there is a very clear candidate for an alternative construal of the terms in question. You make a conjecture about the comparative judgment, but it's only that: a conjecture. (I don't find it _obvious_ that the subjects must be doing something other than answering our questions in a fairly literal way: "Hm, gosh… is this, or is this not, a case of knowledge?" They _might_ not be doing it that way, but then again, they might.) What would be ideal for you, as a way to make a different version of your argument, is if there were a clear alternative construal of "knows", such that you could tell a story as to how it could be alternatively cued up by different question orders. But I'm not seeing what one could be. The best candidates I can think of for alternative construals of "knows" are the "subjective certainty" sense, and the "animal knowledge" sense, but at least right now I'm just not seeing how either of those would be particularly well cued up by our surveys in any way that would generate our pattern of findings. Our questions seem more like the _conjunctive_ question in your disjunction study – there's no easily available alternative reading of the English "and", so I'll bet that, if you put those questions in the opposite order, you just aren't going to find much of a survey-pragmatics effect on those questions. One of the key raw materials for the kind of explanation you want to offer is just not obviously present in these surveys.
It's really important for you to see that, so long as you don't have an actual confound to offer, the rest of your arguments just can't quite get off the ground. Your part of the exchange that you've been having with Josh depends crucially on the idea that our variations can be accounted for entirely in terms of artifacts of the survey context. I think you recognize this, e.g., you write in part of your last comment, "…once we correctly identify that force as a contrast and assimilation effect _driven by the survey presentation_…" (emphasis added). But my argument here is that you precisely have not yet accomplished any such identification. And the data once taken on the whole, as they stand so far, are not particularly amenable to your doing so. (All of this applies, mutatis mutandis, to the WNS and MMNS stuff, too.)
Now, of course, none of this settles the matter. Well, it _does_ settle the matter that it was wrong of you to claim that there was any sort of obvious, easy, clear, etc. confound for our results – but it doesn't settle the question of whether there's _some_ confound for our results out there, maybe one you could find by looking for it in the right way. That you haven't gotten one yet does not entail that you won't be able to find one down the line. Here are some ideas for some studies that it would make sense for you to run, in order to try to develop a version of your theory that might fit our data:
--the one in your footnote 14;
--some sort of study to reveal subjects' own conception of what they've been asked, how weird they find the survey structure, do they take themselves to be making comparative judgments, etc. – i.e., checking to see if the empirical presuppositions of your argument obtain or not
--versions with other 'intermediate' cases, with very different contents and structures than Truetemp/Charles. Since your account says that the intermediateness is what's doing the trick, you get a very clear prediction: all such intermediate cases should show a Karen/Dave effect. (Though, as noted above, the Fakebarn Suzy case is a really bad problem for your theory.)
--getting clearer on what you're calling the dichotomous/nondichotomous distinction – since you contrast the RK/OB probe with a K/DK probe, there are two differences in play – the nondichotomous/dichotomous distinction, and the presence or absence of the intensifying adverbs. You need to run a version with a dichotomous "really knows/doesn't really know" probe, in order to figure out which (or both) of these factors is driving the difference in your data.
--once you discern whether it's the dichotomosity or the intensifiers that you think are doing the trick, redoing the SAW study with your preferred probe, and seeing whether our effects disappear. Or, for that matter, you might as well do two versions, one with the intensified and one with the non-intensified dichotomous probes.
We’d be very interested in seeing the results of any of these studies and think that conducting these studies might (depending on their outcome) put you in a stronger position to criticize restrictionism.
Posted by: jonathan weinberg | Tuesday, April 01, 2008 at 02:22 PM
Simon,
I think that Jonathan has done a really nice job of pointing out what worries us about the current form of your argument and of suggesting avenues that you might pursue from here. At this point, I don't have much to add, except to say that we have really enjoyed the discussion and we both look forward to seeing a future version of your paper.
Posted by: Joshua Alexander | Tuesday, April 01, 2008 at 04:41 PM