
Comments

Chandra Sripada

Hi David and colleagues,

I’m going to try to avoid getting into a long back-and-forth exchange again. This is time-consuming and may not be that helpful, as the discussion tends not to be presented in a way that many can join in. So I will keep my response here brief and stop there.

1. I have been working with colleagues at Michigan to look more closely at the highly unique Tetrad analysis from your paper. Our conclusion is that the analysis is statistically illegitimate. In essence, the Bayesian information criterion (BIC) cannot be used as both the scoring variable in a global Tetrad search, and also as the goodness of fit index in a model comparison. This is a kind of overfitting error. In the reply to your article requested by the Editor of Philosophical Psychology, we elaborate on this critical point. My co-authors in the reply are established figures in statistical learning theory and former Chairs of their respective departments. I note their expertise and titles because it is important for all of us to have our analyses thoroughly vetted by senior figures who know the methods inside and out. In short, I am afraid that your paper rests on a faulty analysis. To be clear, this is not a problem with Tetrad (which is a very useful approach that I use myself) but rather with the unique way you employ Tetrad to try to reject competing models, which is not at all its usual purpose.

2. Our disagreement centers on the direction of a causal arrow: Do Deep Self attributions cause Intentionality Judgments, as I claim, or do Intentionality Judgments cause Deep Self attributions, as you claim? Structural equation modeling (SEM) is generally very poor at resolving the question of whether a single causal arrow goes in one or the reverse direction (we use SEM for a quite different purpose in our paper). Everyone in the field will tell you that other methods, especially manipulations, are far preferable to address this kind of question.

I have performed five follow-up studies encompassing over 1500 subjects using manipulations, reaction time, and other methods to demonstrate there is a substantial causal arrow running from Deep Self to Intentionality Judgments. I think that arguing about a single SEM study won’t help much. I will let my follow-up studies speak for themselves.

3. Many have criticized the Deep Self Concordance Model claiming that normative factors interact with Deep Self attributions to influence intentionality judgments. Josh Knobe has a view related to this, as do Florian Cova and Hichem Naar, and also Jason Shepard. I take this line of criticism very seriously and think it has a lot of merit. So I am not dogmatically claiming that the Deep Self Concordance Model is necessarily right. But I do reject the critique you’ve offered.

Edouard Machery

Chandra

Just a few quick points in reply to your first two points:

Point 1. As far as I can see from your brief description, this point addresses only one of the three points we are making in the paper - the least important one at that.

In addition, we use several *other* measures of fit besides BIC. Thus, I am puzzled by this response. I may understand it better when I read your response.

(And, since you like arguments from authority, we ran the paper by some of the folks at CMU.)

Point 2. You are mischaracterizing our views. We are not endorsing the best fitting model. We are not even arguing that the deep self model is false. What we are saying is that it is not supported by the data that you have published.

Chandra Sripada

Edouard, all the relevant fit statistics are related. You cannot search with the BIC and compare fit between the search-outputted model and an a priori model with another fit statistic for the purposes of rejecting the latter. So the statistical error remains. You are right that I do not mention the other two points from the paper. This is because in my view, these are completely and totally off base, and it is a mistake to direct much attention to them. I and my co-authors directly but briefly address these two points in the printed response. I do not plan to have a ‘spirited’ discussion about this on the blog, as these discussions do not seem to be helpful.

Jonathan Livengood

Chandra,

I'm more than a little worried that you are missing the point of our paper.

As I see it, we made two claims in the paper. First, we made a methodological claim: systematic model search is a better approach than the guess-and-check approach favored by most social scientists and deployed in the paper of yours that we were writing about. Second, we made a claim about the evidence you presented in that paper. Specifically, we said that the evidence presented was not strong support for your conclusion. We never even implicitly denied that the DSCM might be well-supported by *other* evidence.

I entirely agree with you that genuine experiments are preferable to observational studies. And I am glad that your follow-up studies have been fruitful. As I said in our last exhausting round of discussion, I don't have a dog in the fight about intentional action judgments.

As to your first point, I suppose I will hold off until I see the criticism. Just looking back at our paper, I don't see anything obviously wrong in the way of over-fitting, and I'm not sure how it would matter to our critique if your technical objection were correct.

Alistair Isaac

I don't have a dog in the intentionality fight either, but I am interested in scientific method.  And I'm especially interested in the practice of deriving conclusions from data without "feigning hypotheses."  I just read the paper, though, and the methodological suggestion seems unsound.  In particular, I don't think the overfit worry (already addressed in the comments) is taken seriously. Generally speaking, one can't perform a model fitting procedure on a single data set and take the resulting model very seriously.

In domains where one needs to produce results (e.g. classifiers like spam filters) and one really only has a single data set to play with, standard practice is to fit the model to half the data set, then test it for predictive accuracy on the other half. Fitting a model to a data set with an iterative optimizing technique always optimizes over a subset of the population of interest (that subset represented by the data).  Since there is no in principle guarantee that this will also constitute an optimization over the entire population of interest (in this case, all possible intentionality attributions by all humans? all US citizens?), checking the model against another subset from the same population provides a (still only partial) check against overfit.
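The split-half check described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the data are synthetic, the "model" is a simple least-squares line rather than an SEM or a Tetrad search, and every name (`fit_line`, `mse`, `split_half`) is hypothetical rather than taken from any of the papers under discussion.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def split_half(data, seed=0):
    """Shuffle a copy of the data and cut it into two halves."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Synthetic data (an assumption for illustration): y = 2x + 1 plus unit noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)))

# Fit on one half only, then score on the half the fit never saw.
train, test = split_half(data, seed=0)
model = fit_line([x for x, _ in train], [y for _, y in train])
train_err = mse(model, [x for x, _ in train], [y for _, y in train])
test_err = mse(model, [x for x, _ in test], [y for _, y in test])

print("train MSE:", round(train_err, 3))
print("held-out MSE:", round(test_err, 3))
```

A held-out error far above the training error would be a sign of overfit. Similar errors on both halves offer only the partial reassurance mentioned above, since both halves come from the same sample.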

Furthermore, characterizing the alternative reasoning procedure as "guess and check" seems deeply inappropriate.  In the example of Sripada's deep self model, he cites Hume, and puts extensive effort into motivating the model.  It clearly arises from prior considerations, and is not a random "guess."  Beginning with a prior hypothesis space is, of course, the hallmark of Bayesian methods, and maybe the intent here is just to endorse classical statistical methods over their Bayesian counterparts - if so, fair enough.

Nevertheless, classical statistical tools (and the model fitting procedure described in the paper) do not provide a general scientific method, just a means for analyzing data sets. There's still the question of where a data set comes from. And here it looks like some prior "guess" is needed to motivate an experiment in the first place. So, it doesn't look like this could possibly be a criticism of the practice of social scientists in general, just their treatment of data sets after an experiment has been performed. But if that's the case, a more subtle alternative than model-fitting needs to be proposed in order to avoid the overfit worry.

Jonathan Livengood

Alistair,

Thanks for the comment. I agree that a leave-some-out data sampling technique would have made the inferences (on both sides) more robust. Once Sripada gave us some of his actual data (after our first round of discussion of this paper), I should have gone back and done what you recommend. From the perspective of the paper as it was written, however, you need to understand that the only data resource available was a covariance matrix for the entire data set. (Had Sripada done what you recommend, then we would have had to do the same in order to argue that his evidence did not support his conclusions. I don't think that what we did was inappropriate given the dialectic, however.)

With respect to "guess-and-check" and Bayesianism, well, I think I'm going to disagree. I don't think that taking cues from Hume makes a hypothesis any more respectable than simply pulling it out of the air. That is, your guess is as good as Hume's.

As to Bayesianism, our complaint is that the specific model building techniques used by Sripada (which are standard in the relevant literatures) do a poor job of exploring the relevant hypothesis space. The problem isn't that Sripada is beginning with a prior hypothesis space. In fact, we both begin with the same prior hypothesis space (a set of directed graphs over the variables that Sripada measured). What we claim is that GES does a better job of searching the agreed-on hypothesis space.

Your last remarks are correct as far as they go: no one has a generic scientific search procedure. That strikes me as too strong a demand at this stage, though. It's like saying that two chess-playing computer programs cannot be compared with respect to their chess-playing ability because neither one can pass the Turing test. So, you're right, we are not addressing the larger, much harder search problems involved in (a) finding an interesting problem and (b) selecting and measuring appropriate variables. Then again, we never said that we were addressing that larger problem.

Chandra Sripada

On overfitting: Alistair is absolutely right that the overfitting issue in your ‘Deep Trouble…’ analysis is not a ‘technical objection’. Rather, it represents a very fundamental issue. Nor will cross-validation save you, as we will point out in the printed reply. Cross-validation in SEM is very complicated and not yet well developed enough to provide reliable conclusions, especially given the proliferation of cross-validation strategies (e.g., ‘tight’ versus multiple kinds of ‘partial’ strategies; see MacCallum 1994). There aren’t yet any clear guidelines about which of these to deploy. The idea that one would use cross-validation in SEM to decide between the Tetrad model and my a priori model, where the two differ in the direction of a single causal arrow, is pretty much insane. My approach has been to collect brand new data sets (>1500 subjects) using varied study designs (including manipulations and reaction time measures) to show my model is superior to the model outputted by Tetrad’s blind search procedure. This is clearly the right way to resolve the question of which is the better model.

On guess-and-check: I quoted Hume to motivate the Deep Self Concordance Account, but the basic idea that intentional action requires some or other pro-attitude directed at the outcome is found in Anscombe, Davidson, E.J. Lowe, and many, many others. There is also a large literature on moral responsibility that directly parallels the DSCA approach to intentional action (e.g., T.M. Scanlon 1998 and A. Smith 2005). In short, the DSCA is built on very strong a priori hypotheses. If you think that all these thinkers are just plain ‘guessing’, then I think we’ve reached an impasse where productive discussion just breaks down.

Jonathan Livengood

Chandra,

Although we still disagree about a few things, I just wanted to say that I entirely agree with this remark of yours:

"My approach has been to collect brand new data sets (>1500 subjects) using varied study designs (including manipulations and reaction time measures) to show my model is superior to the model outputted by Tetrad’s blind search procedure. This is clearly the right way to resolve the question of which is the better model."

The comments to this entry are closed.
