Are there really such things as artistic masterworks? That is, do works belong to the artistic canon because critics and museum curators have correctly discerned their merits? An experiment carried out by Matthew Kieran, Aaron Meskin, and myself in connection with the experiment month initiative was designed to engage with a previous study that suggested we should be sceptical.
There are several difficult philosophical issues involved here, but for the sake of this discussion let’s focus on two possible views: the first, call it Humeanism, is the view that when we make evaluative judgements of artworks, we are sensitive to what is good or bad. On this Humean view, the works that form the artistic canon are there for good reason. Over time, critics, curators, and art historians arrive at a consensus about the best works; these works become known as masterworks, are widely reproduced, and prized as highlights of museum collections. However, a second view—call it scepticism—challenges these claims about the role of value in both artistic judgement and canon formation. A sceptic will point to other factors that can sway critics and curators such as personal or political considerations, or even chance exposure to particular works, arguing that value plays a much less important role than the Humean would lead us to believe. According to such a view, if a minor work had fallen into the right hands, or if a minor painter had had the right connections, the artistic canon might have an entirely different shape.
How is one to determine whether we are sensitive to value when we form judgements about artworks? In a 2003 study, psychologist James Cutting (2003, 2006) briefly exposed undergraduate psychology students to canonical and lesser-known Impressionist paintings (the lesser-known works exposed four times as often), with the result that after exposure, subjects preferred the lesser-known works more often than did the control group. Cutting took this result to show that canon formation is a result of cultural exposure over time. He further took this to show that the subjects’ judgements were not merely a product of the quality of the works. “If observers were able to judge quality alone in the image pairs, their judgments should not have been contaminated by appearance differences in the classroom. To be sure, quality could still play a role, but such an account must then rely on two processes- mere exposure and quality assessment (however that might be done). My proposal is that these are one-process results and done on the basis of mere exposure inside and outside the classroom” (Cutting 2003, 335).
However, because all the paintings used in Cutting’s study were of high artistic quality, an alternative and broadly Humean explanation is available for the effects of exposure. It could be that exposure gives subjects an opportunity to learn what is good in a painting, and so does not by itself control preference, but rather facilitates evaluation, whether positive or negative. If this latter explanation were right, whether the exposed paintings are good or bad should make a difference. This is what our study examined. We replicated Cutting’s study, exposing subjects to 12 little-known late landscapes by John Everett Millais alongside 48 paintings by the American artist Thomas Kinkade (again, half of each group of paintings were exposed four times as often). We asked control groups[1] and the experimental group to express the extent to which they liked each painting using a 10-point Likert scale. We found that with bad paintings by Kinkade, exposure decreased, rather than increased, liking in relation to our control groups. This is consistent with the Humean challenge to Cutting's conclusions.
The experiment subjects had been exposed to all 60 paintings in the study at least once. In light of this, we distinguished between those paintings to which that group had been exposed once versus those to which they had been exposed multiple times. That is, we compiled results for four groups of paintings: Millais (single exposure); Millais (multiple exposure); Kinkade (single exposure); and Kinkade (multiple exposure).
Comparing the ratings given by our experimental subjects to those given by the members of our philosophy control group, we observed almost uniformly lower ratings for the Kinkade paintings. 47 out of 48 Kinkades received lower mean liking scores from the experimental subjects than they received from those in the unexposed control group. This resulted in mean scores of 5.9 (control) versus 5.1 (experiment) for the single exposure Kinkade paintings, and mean scores of 5.74 (control) versus 4.75 (experiment) for the multiple exposure Kinkades. [Chart 1]
For the experiment subjects, the difference in mean degree of liking expressed for the singly and multiply exposed Millais paintings was not significant (p = .721).[2] The singly exposed and multiply exposed groups of Millais paintings both received mean ratings close to 5.7, with the mean of the multiply exposed group less than 0.06 lower than that of the singly exposed group. However, the difference in mean degree of liking for the two groups of Kinkades is significant (p < .001). The singly exposed group of Kinkade paintings received a mean score of 5.1, whereas the multiply exposed Kinkades received a mean score of 4.75. [Chart 2]
We conclude from these results that mere exposure will not always produce an increase in liking for paintings. This puts pressure on Cutting’s conclusions that canon formation is simply a function of cultural exposure, and that quality is not playing a role in artistic judgement.
[1] We had several control groups available to us: a small group of graduate music students, a larger group of music undergraduates, and a similarly-sized group of philosophy undergraduates. Where we refer to ‘combined control’, we have combined the ratings from these three groups. Elsewhere we make comparisons just to the philosophy control.
[2] These were t-tests.
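For readers who want to see the shape of such a comparison, here is a minimal sketch. It assumes a paired (within-participant) t-test, which the post does not specify, and it uses made-up ratings rather than the study's data. The function names `condition_means` and `paired_t` are hypothetical. Each participant's ratings are first averaged over the paintings in a condition, and the test is then run on those per-participant averages.

```python
from math import sqrt
from statistics import mean, stdev

def condition_means(ratings):
    """Average each participant's ratings over the paintings in one condition."""
    return [mean(participant) for participant in ratings]

def paired_t(a, b):
    """Return (t, df) for a paired t-test on two equal-length lists of scores."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical data: 4 participants, 2 paintings per condition (NOT the study's ratings).
single_exposure = [[6, 6], [5, 7], [4, 6], [7, 7]]
multiple_exposure = [[5, 5], [5, 5], [4, 4], [6, 4]]

t, df = paired_t(condition_means(single_exposure),
                 condition_means(multiple_exposure))
print(f"t({df}) = {t:.2f}")
```

The p-value would then be read off a t distribution with `df` degrees of freedom. That step is left out here because the Python standard library has no t-distribution function.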


Interesting but the analysis may be flawed. The authors should have employed a 2x2 ANOVA to analyze the results, rather than t-tests. Also, on the graphs, show us your error bars.
Posted by: Ray Lopez | Monday, October 24, 2011 at 11:24 AM
@Ray - Is your concern about the assumption of equality of variance? Is that really such a problem, if the sample sizes are about the same?
Error bars would be nice, though.
Posted by: Jonathan Weinberg | Tuesday, October 25, 2011 at 12:09 PM
Really cool stuff!!!
I wasn't sure what Figure 2 is supposed to show. Is the difference between Kinkade one-exposure and Kinkade multi-exposure statistically significant?
Posted by: Shen-yi Liao | Tuesday, October 25, 2011 at 11:23 PM
I’ll respond to the questions about the analyses since, in connection with experiment month, I ran them.
Ray, thanks for the question. I opted for t-tests rather than an ANOVA because the numbers of Millais and Kinkade paintings were so different. Remember, the study looked at sixty paintings, 12 of which were late Millais paintings and 48 of which were Kinkade paintings. The relevant analyses (the means of which are captured in the second bar graph) compare via t-tests 1) mean liking ratings for the 6 single-exposed Millais paintings to mean liking ratings for the 6 multiply-exposed Millais paintings, and 2) mean liking ratings for the 24 single-exposed Kinkade paintings to mean liking ratings for the 24 multiply-exposed Kinkade paintings. In other words, the dependent variable for each comparison is an average of the liking-ratings participants gave for all the paintings in the relevant condition. But since the number of paintings in each of the Kinkade conditions was much higher than the number of paintings in each of the Millais conditions, there’s a sense in which the dependent variable for the Millais paintings is not the same as the dependent variable for the Kinkade paintings. True, both are average liking-ratings, but they are averages over very different numbers of paintings. Doing t-tests on each painter instead of an overall ANOVA standardized the number of paintings from which the dependent variable is constructed: for one t-test, we are comparing the average liking-score for 6 singly exposed Millais paintings to the average liking-score for 6 multiply exposed Millais paintings; for the other t-test, we are comparing the average liking-score for 24 singly exposed Kinkade paintings to the average liking-score for 24 multiply exposed Kinkade paintings. (Perhaps Margaret can address the reason why the numbers were so disparate for the good and bad artist. I believe it had to do with the desire to avoid canonical good landscapes and the difficulty of finding a large number of those by the same artist.)
I’m willing to consider, though, that perhaps the fact that the average is based on such dissimilar numbers of paintings is irrelevant (after all, it is the same dependent variable in the sense that it is an average liking-rating). In that case, the ANOVA is the appropriate analysis. Thus I have now gone back and run a 2 factor (Artist, Exposure) repeated measures ANOVA. Here, we find a main effect for Artist (Millais preferred to Kinkade, p=.011, Greenhouse-Geisser test), and a main effect for Exposure in the opposite direction from Cutting's (single-exposed preferred to multi-exposed, p=.022, Greenhouse-Geisser). However, the interaction is not significant (p=.095, Greenhouse-Geisser). So, these results tell against Cutting's hypothesis: here mere exposure resulted in lower preference overall. But these results don't, in and of themselves, reveal a difference between good and bad art. However, the overall means and the previously conducted t-test results do suggest such a difference. And, anyway, the real key points in favor of the hypothesis are the between-participant results for unexposed controls versus the exposed test group.
I believe Margaret et al. have the standard errors for the means, and can report those.
Posted by: Mark Phelan | Thursday, October 27, 2011 at 12:18 PM
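A note for readers following the ANOVA discussion above: in a 2x2 design where every participant contributes to all four cells, the Artist x Exposure interaction can be tested as a one-sample t-test on each participant's "difference of differences", and the interaction F(1, n-1) is the square of that t. The sketch below illustrates this with made-up cell means. It is not the analysis reported in the comment, and the names `one_sample_t` and `interaction_F` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values):
    """Return (t, df) for a one-sample t-test of the mean against zero."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n)), n - 1

def interaction_F(cells):
    """
    cells: one tuple per participant of mean liking ratings in the order
    (good_single, good_multiple, bad_single, bad_multiple).
    Returns (F, df_error) for the Artist x Exposure interaction.
    """
    # Exposure effect for the good artist minus exposure effect for the bad artist.
    contrast = [(gs - gm) - (bs - bm) for gs, gm, bs, bm in cells]
    t, df = one_sample_t(contrast)
    return t * t, df

# Hypothetical per-participant cell means (NOT the study's data).
cells = [(6, 6, 5, 4), (5, 6, 6, 4), (7, 6, 5, 4), (6, 6, 6, 4)]
F, df_error = interaction_F(cells)
print(f"F(1, {df_error}) = {F:.2f}")
```

Seen this way, a non-significant interaction means only that the exposure effect was not reliably different between the two artists across participants. It is a separate question from whether exposure lowered liking within the Kinkade paintings alone, which is what the per-artist t-test addresses.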