Category Archives: The Hardest Science

What is counterintuitive? – Sanjay Srivastava (The Hardest Science)

Simine Vazire has a great post contemplating how we should evaluate counterintuitive claims. For me that brings up the question: what do we mean when we say something is “counterintuitive?”

First, let me say what I think counterintuitive isn’t. The “intuitive” part points to the fact that when we label something counterintuitive, we are usually not talking about contradicting a formal, well-specified theory. For example, you probably wouldn’t say that the double-slit experiment was “counterintuitive;” you’d say it falsified classical mechanics.

In any science, though, you have areas of inquiry where there is not an existing theory that makes precise predictions. In social and personality psychology that is the majority of what we are studying. (But it’s true in other sciences too, probably more than we appreciate.) Beyond the reach of formal theory, scientists develop educated guesses, hunches, and speculations based on their knowledge and experience. So the “intuitive” in counterintuitive could refer to the intuitions of experts.

But in social and personality psychology we study phenomena that regular people reflect on and speculate about too. A connection to everyday lived experience is almost definitional to our field, whether you think it is something that we should actively pursue or just inevitably creeps in. Continue reading

What did Malcolm Gladwell actually say about the 10,000 hour rule? – Sanjay Srivastava (The Hardest Science)

A new paper out in Intelligence, from a group of authors led by David Hambrick, is getting a lot of press coverage for having “debunked” the 10,000-hour rule discussed in Malcolm Gladwell’s book Outliers. The 10,000-hour rule is — well, actually, that’s the point of this post: Just what, exactly, is the 10,000-hour rule?

The debate in Intelligence is between Hambrick et al. and researcher K. Anders Ericsson, who studies deliberate practice and expert performance (and wrote a rejoinder to Hambrick et al. in the journal). But Malcolm Gladwell interpreted Ericsson’s work in a popular book and popularized the phrase “the 10,000-hour rule.” And most of the press coverage mentions Gladwell.

Moreover, Gladwell has been the subject of a lot of discussion lately about how he interprets research and presents his conclusions. The 10,000-hour rule has become a runaway meme — there’s even a Macklemore song about it. And if you google it, you’ll find a lot of people talking about it and trying to apply it to their lives. The interpretations aren’t always the same, suggesting there’s been some interpretive drift in what people think the 10,000-hour rule really is. I read Outliers shortly after it came out, but my memory of it has probably been shaped by all of that conversation that has happened since. Continue reading

In which we admire a tiny p with complete seriousness – Sanjay Srivastava (The Hardest Science)

A while back a colleague forwarded me this quote from Stanley Schachter (yes that Stanley Schachter):

“This is a difference which is significant at considerably better than the p < .0001 level of confidence. If, in reeling off these zeroes, we manage to create the impression of stringing pearls on a necklace, we rather hope the reader will be patient and forbearing, for it has been the very number of zeros after this decimal point that has compelled us to treat these data with complete seriousness.”

The quote comes from a chapter on birth order in Schachter’s 1959 book The Psychology of Affiliation. The analysis was a chi-square test on 76 subjects. The subjects were selected from 3 different experiments for being “truly anxious” and combined for this analysis. A subject counted as “truly anxious” if they scored at either extreme endpoint of an anxiety scale (both complete denial and complete admission were taken to indicate true anxiety), and/or if they discontinued participation because the experiment made them feel too anxious.
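To get a feel for the scale of that analysis, here is a minimal sketch of a chi-square test on a 2×2 table of 76 subjects; the counts and labels are invented for illustration and are not Schachter’s actual data.

```python
# Hypothetical 2x2 table for 76 subjects (e.g., birth order by affiliation
# choice). The counts are invented for illustration, not Schachter's data.
from scipy.stats import chi2_contingency

table = [[30, 8],    # firstborns / only children
         [10, 28]]   # later-borns
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.2g}")
```

Even a fairly modest sample can produce a string of zeros after the decimal point if the cell counts are lopsided enough.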


Tagged: p-values

Let’s talk about diversity in personality psychology – Sanjay Srivastava (The Hardest Science)

In the latest issue of the ARP newsletter, Kelci Harris writes about diversity in ARP. You should read the whole thing. Here’s an excerpt:

Personality psychology should be intrinsically interesting to everyone, because, well, everyone has a personality. It’s accessible and that makes our research so fun and an easy thing to talk about with non-psychologists, that is, once we’ve explained to them what we actually do. However, despite what could be a universal appeal, our field is very homogenous. And that’s too bad, because diversity makes for better science. Good research comes from observations. You notice something about the world, and you wonder why that is. It’s probably reasonable to guess that most members of our field have experienced the world in a similar way due to their similar demographic backgrounds. This similarity in experience presents a problem for research because it makes us miss things. How can assumptions be challenged when no one realizes they are being made? What kind of questions will people from different backgrounds have that current researchers could never think of because they haven’t experienced the world in that way?

 In response, Laura Naumann posted a letter to the ARP Facebook wall. Continue reading

An interesting study of why unstructured interviews are so alluring – Sanjay Srivastava (The Hardest Science)

A while back I wrote about whether grad school admissions interviews are effective. Following up on that, Sam Gosling recently passed along an article by Dana, Dawes, and Peterson from the latest issue of Judgment and Decision Making:

Belief in the unstructured interview: The persistence of an illusion

Unstructured interviews are a ubiquitous tool for making screening decisions despite a vast literature suggesting that they have little validity. We sought to establish reasons why people might persist in the illusion that unstructured interviews are valid and what features about them actually lead to poor predictive accuracy. In three studies, we investigated the propensity for “sensemaking” – the ability for interviewers to make sense of virtually anything the interviewee says – and “dilution” – the tendency for available but non-diagnostic information to weaken the predictive value of quality information. In Study 1, participants predicted two fellow students’ semester GPAs from valid background information like prior GPA and, for one of them, an unstructured interview. In one condition, the interview was essentially nonsense in that the interviewee was actually answering questions using a random response system. Consistent with sensemaking, participants formed interview impressions just as confidently after getting random responses as they did after real responses. Consistent with dilution, interviews actually led participants to make worse predictions. Study 2 showed that watching a random interview, rather than personally conducting it, did little to mitigate sensemaking. Study 3 showed that participants believe unstructured interviews will help accuracy, so much so that they would rather have random interviews than no interview. People form confident impressions even when interviews are defined to be invalid, like our random interview, and these impressions can interfere with the use of valid information. Our simple recommendation for those making screening decisions is not to use them.

Continue reading

The hotness-IQ tradeoff in academia – Sanjay Srivastava (The Hardest Science)

The other day I came across a blog post ranking academic fields by hotness. Important data for sure. But something about it was gnawing on me for a while, some connection I wasn’t quite making.

And then it hit me. The rankings looked an awful lot like another list I’d once seen of academic fields ranked by intelligence. Only, you know, upside-down.

Sure enough, when I ran the correlation among the fields that appear on both lists, it came out at r = -.45.
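In case anyone wants to check me, here is a minimal sketch of how a correlation between two rankings could be computed, assuming a Spearman rank correlation; the field names and ranks are placeholders, not the data from either list.

```python
# Hypothetical sketch: correlating two rankings of academic fields.
# The fields and ranks are placeholders, not the actual hotness or IQ data.
from scipy.stats import spearmanr

fields       = ["anthropology", "economics", "philosophy", "psychology", "sociology"]
hotness_rank = [2, 5, 1, 3, 4]   # invented ranks
iq_rank      = [4, 1, 2, 5, 3]   # invented ranks
rho, p = spearmanr(hotness_rank, iq_rank)
print(f"r = {rho:.2f} across {len(fields)} fields")
```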

[Figure: hotness ranking vs. IQ ranking across academic fields]

I don’t know what this means, but it seems important. Maybe a mathematician or computer scientist can help me understand it.

Continue reading

The flawed logic of chasing large effects with small samples – Sanjay Srivastava (The Hardest Science)

“I don’t care about any effect that I need more than 20 subjects per cell to detect.”

I have heard statements to this effect a number of times over the years. Sometimes from the mouths of some pretty well-established researchers, and sometimes from people quoting the well-established researchers they trained under. The idea is that if an effect is big enough — perhaps because of its real-world importance, or because of the experimenter’s skill in isolating and amplifying the effect in the lab — then you don’t need a big sample to detect it.

When I have asked people why they think that, the reasoning behind it goes something like this. If the true effect is large, then even a small sample will have a reasonable chance of detecting it. (“Detecting” = rejecting the null in this context.) If the true effect is small, then a small sample is unlikely to reject the null. So if you only use small samples, you will limit yourself to detecting large effects. And if that’s all you care about detecting, then you’re fine with small samples.
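To put rough numbers on that reasoning, here is a minimal power-calculation sketch, assuming a two-sample t-test with 20 subjects per cell and Cohen’s conventional small, medium, and large effect sizes.

```python
# Power of a two-sample t-test with 20 subjects per cell at alpha = .05,
# for small (d = 0.2), medium (d = 0.5), and large (d = 0.8) true effects.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.2, 0.5, 0.8):
    power = analysis.power(effect_size=d, nobs1=20, alpha=0.05, ratio=1.0)
    print(f"d = {d}: power = {power:.2f}")
```

Power is far higher for large true effects than for small ones, which is exactly the intuition the argument above rests on.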

On first consideration, that might sound reasonable, and even admirably aware of issues of statistical power. Unfortunately it is completely wrong. Continue reading

Where is RDoC headed? A look at the eating disorders FOA – Sanjay Srivastava (The Hardest Science)

Thomas Insel, director of NIMH, made a splash recently with the announcement that NIMH funding will be less strictly tied to the DSM. That by itself would be good news, given all the problems with DSM. But the proposed replacement, the Research Domain Criteria (RDoC), has worried some people that NIMH is pursuing biology to the exclusion of other levels of analysis, as opposed to taking a more integrated approach.

We can try to divine NIMH’s future directions from the RDoC description and the director’s blog post, but it’s hard to tell whether mentions of behavior and phenomenology reflect real priorities or just lip service. Likewise for social and cultural factors. They come up in a discussion of "environmental aspects" that might interact with neural circuits, but they do not appear as focal units of analysis in the RDoC matrix, leaving them in a somewhat ambiguous state.

Another approach is to look at revealed preferences. Regardless of what anybody is saying, how is NIMH actually going to spend its money?

As an early indication, the NIMH RDoC overview page links to 2 funding opportunity announcements (FOAs) that are based on RDoC. Presumably these are examples of where RDoC-driven research is headed. One of the FOAs is for eating disorders. Here is the overview:

Continue reading

A null replication in press at Psych Science – anxious attachment and sensitivity to temperature cues – Sanjay Srivastava (The Hardest Science)

Etienne LeBel writes:

My colleague [Lorne Campbell] and I just got a paper accepted at Psych Science that reports on the outcome of two strict direct replications where we worked very closely with the original author to have all methodological design specifications as similar as possible to those in the original study (and unfortunately did not reproduce the original finding).

We believe this is an important achievement for the “replication movement” because it shows that (a) attitudes are changing at the journal level with regard to rewarding direct replication efforts (to our knowledge these are the first strictly direct replications to be published at a top journal like Psych Science [JPSP eventually published large-scale failed direct replications of Bem's ESP findings, but this was of course a special case]) and (b) that direct replication endeavors can contribute new knowledge concerning a theoretical idea while maintaining a cordial, non-adversarial atmosphere with the original author. We really want to emphasize this point the most to encourage other researchers to engage in similar direct replication efforts. Science should first and foremost be about the ideas rather than the people behind the ideas; we’re hoping that examples like ours will sensibilize people to a more functional research culture where it is OK and completely normal for ideas to be revised given new evidence.

An important achievement indeed. The original paper was published in Psychological Science too, so it is especially good to see the journal owning the replication attempt. And hats off to LeBel and Campbell for taking this on. Someday direct replications will hopefully be more normal, but in the world we currently live in, it takes some gumption to go out and try one.

I also appreciated the very fact-focused and evenhanded tone of the writeup. If I can quibble, I would ideally have liked to see a statistical test contrasting their effect against the original one, testing the hypothesis that the replication result is different from the original result. I am sure it would have been significant, and it would have been preferable to comparing the original paper’s significant rejection of the null against the replications’ non-significant tests of the null. Continue reading
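Here is a minimal sketch of the kind of contrast I have in mind, assuming both results can be expressed as correlations and compared with a Fisher r-to-z test; the effect sizes and sample sizes below are invented placeholders, not the numbers from the original study or the replications.

```python
# Testing whether a replication correlation differs from an original one.
# The r values and sample sizes are invented placeholders.
import math
from scipy.stats import norm

def compare_correlations(r1, n1, r2, n2):
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    return z, 2 * norm.sf(abs(z))                    # two-sided p-value

z, p = compare_correlations(r1=0.35, n1=60, r2=0.02, n2=180)
print(f"z = {z:.2f}, p = {p:.3f}")
```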

Pre-publication peer review can fall short anywhere – Sanjay Srivastava (The Hardest Science)

The other day I wrote about a recent experience participating in post-publication peer review. Short version: I picked up on some errors in a paper published in PLOS ONE, which led to a correction. In my post I made the following observation:

Is this a mark against pre-publication peer review? Obviously it’s hard to say from one case, but I don’t think it speaks well of PLOS ONE that these errors got through. Especially because PLOS ONE is supposed to emphasize “a high technical standard” and reporting of “sufficient detail” (the reason I noticed the issue with the SDs was because the article did not report effect sizes).

But this doesn’t necessarily make PLOS ONE worse than traditional journals like Psychological Science or JPSP, where similar errors get through all the time and then become almost impossible to correct.

My intention was to discuss pre- and post-publication peer review generally, and I went out of my way to cite evidence that mistakes can happen anywhere. But some comments I’ve seen online have characterized this as a mark against PLOS ONE (and my “I don’t think it speaks well of PLOS ONE” phrasing probably didn’t help). So I would like to note the following:

1. After my blog post went up yesterday, somebody alerted me that the first author of the PLOS ONE paper has posted corrections to 3 other papers on her personal website. The errors are similar to what happened at PLOS ONE. Continue reading