Category Archives: funderstorms

How High is the Sky? Well, Higher than the Ground – David Funder (funderstorms)

Challenged by some exchanges in my own personal emails and over in Brent Roberts’s “pigee” blog, I’ve found myself thinking more about what is surely the weakest point in my previous post about effect size: I failed to reach a clear conclusion about how “big” an effect has to be to matter. As others have pointed out, it’s not super-coherent to claim, on the one hand, that effect size is important and must always be reported, yet to acknowledge, on the other hand, that under at least some circumstances very “small” effects can matter for practical and/or theoretical purposes.

My attempt to restore coherence has two threads, so far. First, to say that small effect sizes are sometimes important does not mean that they always are. It depends. Is .034 (in terms of r) big enough? It is, if we are talking about aspirin’s effect on heart attacks, because wide prescription can save thousands of lives a year (notice, though, that you need effect size to do this calculation). Probably not, though, for other purposes.
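
To make that parenthetical concrete, here is a minimal sketch of the kind of calculation involved, using Rosenthal and Rubin’s binomial effect size display (BESD) as one rough translation of r into outcome rates. The BESD’s 50% base rate is a stylized assumption (it flatters rare outcomes like heart attacks), and the population figure below is a hypothetical round number, not a figure from the aspirin trial.

```python
# A rough sketch of the "lives saved" arithmetic, not data from the aspirin trial.
# The BESD recasts a correlation r as the difference between two outcome rates
# centered on 50%; the population below is a hypothetical round number.

def besd_rates(r: float) -> tuple[float, float]:
    """Return stylized (treated, control) outcome rates implied by the BESD."""
    return 0.5 - r / 2, 0.5 + r / 2   # r = .034 -> 48.3% vs. 51.7%

r = 0.034                     # aspirin/heart-attack correlation discussed above
treated, control = besd_rates(r)
population = 1_000_000        # hypothetical number of people taking the drug

events_avoided = (control - treated) * population
print(f"BESD outcome rates: treated {treated:.1%}, control {control:.1%}")
print(f"Adverse events avoided per {population:,} people: {events_avoided:,.0f}")
```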

But honestly, I don’t know how small an effect is too small. As I said, it depends. I suspect that if social psychologists, in particular, reported and emphasized their effect sizes more often, over time an experiential base would accrue that would make interpreting them easier. But, in the meantime, maybe there is another way to think about things. Continue reading

Does (effect) Size Matter? – David Funder (funderstorms)

Personality psychologists wallow in effect size; the ubiquitous correlation coefficient, Pearson’s r, is central to nearly every research finding they report.  As a consequence, discussions of relationships between personality variables and outcomes are routinely framed by assessments of their strength.  For example, a landmark paper reviewed predictors of divorce, mortality, and occupational achievement, and concluded that personality traits have associations with these life outcomes that are as strong as or stronger than traditional predictors such as socio-economic status or cognitive ability (Roberts et al., 2007).  This is just one example of how personality psychologists routinely calculate, care about, and even sometimes worry about the size of the relationships between their theoretical variables and their predicted outcomes.

Social psychologists, not so much.  The typical report in experimental social psychology focuses on the p-level: the probability of obtaining a difference between experimental groups as large as (or larger than) the one observed, if the null hypothesis of no difference were true.  If this probability is .05 or less, then: Success!  While effect sizes (usually Cohen’s d or, less often, Pearson’s r) are reported more often than they used to be – probably because the APA Publication Manual explicitly requires it (a requirement not always enforced) – the discussion of the theoretical or even the practical importance of the effect typically centers on whether it exists at all.  The size simply doesn’t matter.
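
To illustrate the distinction, here is a small simulation sketch of my own (assuming numpy and scipy are available; the sample sizes are arbitrary): the same “small” standardized difference yields wildly different p-values depending only on how many subjects were run, which is exactly why a p-level by itself says nothing about how big an effect is.

```python
# My own illustration, not from the post: hold the true effect (Cohen's d) fixed
# and watch p shrink as N grows -- p tracks detectability, not size.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d = 0.2                                   # a conventionally "small" effect

for n in (20, 200, 2000):                 # arbitrary per-group sample sizes
    a = rng.normal(loc=0.0, scale=1.0, size=n)
    b = rng.normal(loc=d, scale=1.0, size=n)
    t, p = stats.ttest_ind(a, b)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    observed_d = (b.mean() - a.mean()) / pooled_sd
    print(f"n per group = {n:5d}   observed d = {observed_d:+.2f}   p = {p:.4f}")
```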

Is this description an unfair caricature of social psychological research practice?  That’s what I thought until recently. Continue reading

Speaking of replication… – David Funder (funderstorms)

A small conference sponsored by the European Association of Personality Psychology, held in Trieste, Italy last summer, addressed the issue of replicability in psychological research.  Discussions led to an article describing recommended best practices, and the article is now “in press” at the European Journal of Personality.  You can see the article if you click here.

Update November 8: Courtesy of Brent Roberts, the contents of the special issue of Perspectives on Psychological Science on replicability are available.  To go to his blog post, with links, click here.

On inference (updated x 2) – David Funder (funderstorms)

At a conference I attended last month, I heard for the first time about an Oxford philosopher who, according to his fellow philosophers, has pretty much proved that we live inside of a computer simulation. I’ll take the philosophers’ word for it when they say the inferential logic appears to be impeccable.

Which brings me to draw the following lesson: Any system of rigid (or automatic) inferential rules, followed out on a long enough chain, will eventually lead to an absurd conclusion. (If someone else has already coined this principle as a proverb, or something, I’d love to hear about it.)

For example, consider rigid applications of constitutional law. The Second Amendment of the US Constitution says that the right to bear arms shall not be infringed. Therefore, as an American citizen, I cannot be prohibited from owning a pistol, an assault rifle or (why not?) a nuclear bomb. The logic is fine; the conclusion is ridiculous.

The vulnerability of automatic systems to absurd outcomes is one reason  I dislike the term “inferential statistics.” There is really no such thing. All statistics are descriptive. Continue reading

The perilous plight of the (non)-replicator – David Funder (funderstorms)

As I mentioned in my previous post, while I’m sympathetic to many of the ideas that have been suggested about how to improve the reliability of psychological knowledge and move towards “scientific utopia,” my own thoughts are less ambitious and keep returning to the basic issue of replication.  A scientific culture that consistently produced direct replications of important results would be one that eventually purged itself of many of the problems people have been worrying about lately, including questionable research practices, p-hacking, and even data fraud.

But, as I also mentioned in my previous post, this is obviously not happening.  Many observers have commented on the institutional factors that discourage the conduct and, even more, the publication of replication studies.  These include journal policies, hiring committee practices, tenure standards, and even the natural attractiveness of fun, cute, and counter-intuitive findings.  In this post, I want to focus on a factor that has received less attention: the perilous plight of the (non) replicator.

The situation of a researcher who has tried and failed to replicate a prominent research finding is an unenviable one.  My sense is that the typical non-replicator started out as a true believer, not a skeptic.  For example, a few years ago I spent sabbatical time at a large, well-staffed and well-equipped institute in which several researchers were interested in a very prominent finding in their field, and wished to test further hypotheses they had generated about its basis.  As good scientists, they began by making sure that they could reproduce the basic effect.  To their surprise and increasing frustration, they simply could not.  They followed the published protocol, contacted the original investigator for more details, tweaked this, tweaked that.  (As I said, they had lots of resources. Continue reading

Replication, period. – David Funder (funderstorms)

Can we believe everything (or anything) that social psychological research tells us?  Suddenly, the answer to this question seems to be in doubt.  The past few months have seen a shocking series of cases of fraud – researchers literally making their data up – by prominent psychologists at prestigious universities.  These revelations have catalyzed an increase in concern about a much broader issue: the replicability of results reported by social psychologists.  Numerous writers are questioning common research practices such as selectively reporting only studies that “work” and ignoring relevant negative findings that arise over the course of what is euphemistically called “pre-testing”; increasing N’s or deleting subjects from data sets until the desired findings are obtained; and, perhaps worst of all, being inhospitable or even hostile to replication research that could, in principle, cure all these ills.
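
To see why one of those practices is questionable, here is a toy simulation sketch of my own (not any particular study’s procedure; the batch sizes and stopping rule are arbitrary choices): if a researcher keeps adding subjects and re-testing until p < .05, the false-positive rate climbs well above the nominal 5% even when there is no effect at all.

```python
# A toy simulation of "increasing N's until the desired finding is obtained."
# Both groups come from the same distribution, so any "significant" result is
# a false positive; the batch sizes and peeking schedule are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def peeks_until_significant(start_n=20, step=10, max_n=100, alpha=0.05) -> bool:
    """Collect data in batches, re-test after each batch, stop at p < alpha."""
    a = list(rng.normal(size=start_n))
    b = list(rng.normal(size=start_n))
    while len(a) <= max_n:
        _, p = stats.ttest_ind(a, b)
        if p < alpha:
            return True                   # "success" obtained by peeking
        a.extend(rng.normal(size=step))
        b.extend(rng.normal(size=step))
    return False

sims = 2000
hits = sum(peeks_until_significant() for _ in range(sims))
print(f"False-positive rate with optional stopping: {hits / sims:.1%}")
```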

Reaction is visible.  The European Association of Personality Psychology recently held a special three-day meeting on the topic, which will result in a set of published recommendations for improved research practice; a well-financed conference in Santa Barbara in October will address the “decline effect” (the mysterious tendency of research findings to fade away over time); and the President of the Society for Personality and Social Psychology was recently motivated to post a message to the membership expressing official concern.  These are just three reactions that I personally happen to be familiar with; I’ve also heard that other scientific organizations and even agencies of the federal government are looking into this issue, one way or another.

This burst of concern and activity might seem to be unjustified.  After all, literally making your data up is a far cry from practices such as pre-testing, selective reporting, or running multiple statistical tests.  These practices are even, in many cases, useful and legitimate.  So why did they suddenly come under the microscope as a result of cases of data fraud?  The common thread seems to be the issue of replication. Continue reading