Does Personality Really Predict Job Performance, and How Can We Tell?
Wendell Williams
As a scientist/practitioner, I see organizations using personality tests every day to make hiring and promotion decisions. But do they really predict what they promise? Personality-performance research has produced widely varying results, from the era when personality instruments were limited to clinical applications and scientists hand-calculated statistics, to the mid-1950s, when findings were so inconsistent that some concluded using personality to make hiring decisions was akin to a parlor game. We don't do statistics the old-fashioned way anymore, but modern meta-analytic studies, the continued use of psychometrically unsound tests, and weak performance criteria still obscure the true predictive validity of personality traits.
I believe five persistent problems plague studies attempting to correlate personality scores with job performance: (1) flawed dependent criteria, (2) failure of an instrument to meet professional test standards, (3) measuring non-job related traits, (4) confusing differences between people with differences in predicting performance, and (5) self-reporting error.
Performance Fog
It is often difficult to convince human resource departments to validate tests used to make hiring and promotion decisions. Professional HR associations don't help matters by over-emphasizing the administrative side of HR and underestimating (or totally misunderstanding) the benefits of best-practice multi-trait/multi-method hiring and promotion systems.
When we enter this hodge-podge environment to conduct research, there are seldom any trustworthy performance criteria available. Instead, we are often limited to working with manager opinions, performance appraisal data, or some other indirect measure of productivity that forces us to statistically tease out confounding variance. Since final job performance is the result of many prior factors (planning, analysis, market forces, and so forth), it's easy to mask the contributions that personality traits, along with knowledge, skills, and abilities (KSAs), make to the job performance equation:
(KSAs + Associated Traits) + (Expected Results) + (Uncontrollable Factors) = Performance
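To make the masking effect concrete, here is a small simulation (in Python, with entirely made-up weights) in the spirit of the equation above. Even when a trait genuinely contributes to performance, the uncontrollable factors dilute the correlation an analyst will actually observe:

    # Minimal sketch with assumed weights: a trait that genuinely matters
    # still shows a diluted correlation once uncontrollable factors are added.
    import numpy as np

    rng = np.random.default_rng(42)
    n = 500
    trait = rng.normal(size=n)           # job-related personality trait
    ksas = rng.normal(size=n)            # knowledge, skills, and abilities
    noise = rng.normal(size=n)           # market forces, luck, and so forth

    core = 0.5 * trait + 0.7 * ksas      # what the worker actually brings
    performance = core + 1.0 * noise     # what the study actually measures

    print(np.corrcoef(trait, core)[0, 1])         # roughly 0.58
    print(np.corrcoef(trait, performance)[0, 1])  # shrinks to roughly 0.38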
Wrong Tests
The 1999 APA Standards for Educational and Psychological Testing was a joint effort among three well-respected professional associations to define professional test development criteria. One would think an investigator would ensure their test conformed to these recommendations before beginning research. But researchers and lay people often choose trait and preference tests that fall far short of established standards for sound theory, validity, and reliability.
For example, some researchers continue to use tests developed when personality psychology was in its infancy or built on theories proposed by long-deceased icons. Somehow tests like these are perceived as more robust and appropriate than the instruments available today. A quick review of such instruments using Tests in Print, the Mental Measurements Yearbook, the 1999 APA Standards for Educational and Psychological Testing, the 1978 Uniform Guidelines on Employee Selection Procedures, and the Americans with Disabilities Act makes it exceptionally clear why tests like these should not be part of employee selection or promotion validity research.
Not Job-Related
Validity studies using Big Five personality models often show less than enlightening results. Although Robert Hogan and others have highlighted the damaging effects of Dark Triad personality dimensions (narcissism, Machiavellianism, and psychopathy), especially among management, some researchers consistently default to broad-based tests of normal adult personality. Coupling scores from technologically dated instruments with fuzzy performance criteria and examining the results for statistically significant relationships, without any clear idea of job-specific relevance, does not advance the legitimacy of personality effects on job performance.
As if broad-based instruments were not sufficiently error-prone to warrant caution, I have encountered more than one test vendor who "validates" personality scores by dividing performers into "high" and "low" productivity groups, administering a test that does not meet the APA Standards, and comparing the group averages. The facts that correlation does not imply causation, that restriction of range minimizes statistical differences in performance, that the performance criteria are subjective, that group-level data cannot support conclusions about individuals, and that personality is a precursor (not a result) of performance do not seem to bother them. Sadly, vendors who believe this is scientific can sometimes convince clients to purchase their test licenses.
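Restriction of range alone is enough to sink this approach. A brief sketch, again with hypothetical numbers: once the sample contains only incumbents who cleared a hiring bar, the very relationship the vendor hopes to demonstrate shrinks:

    # Hypothetical illustration of restriction of range: correlations computed
    # on incumbents who already cleared a hiring cut understate the true value.
    import numpy as np

    rng = np.random.default_rng(7)
    n = 2000
    trait = rng.normal(size=n)
    performance = 0.5 * trait + rng.normal(size=n)  # assumed true relationship

    full_r = np.corrcoef(trait, performance)[0, 1]

    hired = trait > 0.5                  # only applicants above the cut remain
    restricted_r = np.corrcoef(trait[hired], performance[hired])[0, 1]

    print(f"Applicant pool r:   {full_r:.2f}")        # about 0.45
    print(f"Incumbents-only r:  {restricted_r:.2f}")  # about 0.25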
People vs. Performance
Beyond the first three problems, business people tend to borrow tests they have taken and enjoyed during a testing workshop. If, for example, they attend a test vendor's workshop, affirmatively answer several questions about being extraverted, and are then impressed by feedback reporting that they are highly extraverted, they may conclude this same test should be used for hiring or promotion. Without a thorough discussion, it's exceptionally difficult to explain how and why hiring many people with similar personality profiles can be disruptive, and why the traits discussed in testing workshops often differ from the traits that affect performance in a particular job.
For example, I once visited a truck assembly plant that had hired only highly agreeable people. Within a few months, this hiring policy had helped create an unproductive environment in which workers would not call a meeting unless everyone could attend, would not make a decision until everyone agreed, and would not confront others about quality problems. Rather than addressing the issue with a job analysis-based hiring test, HR decided the solution was to purchase yet another sub-standard test.
True Scores
There seems to be a tendency to assume that personality scores are perfectly valid and stable. Deniz Ones and others argue that their meta-analytic studies show personality tests are somewhat immune to faking; in my own work, however, applicants almost always present themselves more favorably than incumbent job holders do. Even with embedded social desirability scales, one can never be sure whether a candidate is job-matching, faking good, being realistic, or being delusional. Unstable independent variables gathered from self-descriptive instruments make personality scores a moving target.
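Classical test theory makes the cost of this instability easy to estimate. A short worked example, with assumed (but not unrealistic) reliabilities, shows how unreliable self-reports and ratings cap the correlation any study can observe:

    # Attenuation formula from classical test theory, with assumed values:
    # observed r = true r * sqrt(predictor reliability * criterion reliability)
    import math

    true_r = 0.40   # hypothetical true trait-performance correlation
    rxx = 0.70      # assumed reliability of the self-report scale
    ryy = 0.60      # assumed reliability of the performance ratings

    observed_r = true_r * math.sqrt(rxx * ryy)
    print(f"Best observable correlation: {observed_r:.2f}")  # about 0.26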
Overcoming Obstacles
How can we address these persistent problems with efforts to predict job performance from personality traits? While many researchers use broad-brush personality tests as casting-nets for correlates, using personality traits to predict objective job performance is considerably more complex than one might imagine. Some personality factors might only be observed on the job, or contribute in unexpected ways that only job content experts—such as current workers and their immediate managers—could identify or explain.
Therefore, future investigations might begin with thoroughly understanding the job using structured discussions with current workers and managers. When selecting content experts, one should pay particular attention to interviewing people who can clearly articulate what is expected and how it is accomplished. In my experience, the best sources of data are slightly above average performers, because top performers tend to operate on automatic pilot and take important steps for granted.
The key outcome of these discussions would be a list of measurable performance dimensions and their associated personality traits. Since it is not unusual for job-holders and managers to discuss what is produced on the job rather than how, the analyst must be able to tease out specific KSAs and pursue the discussion until the unique personality factors associated with each KSA are clearly identified and understood. This list is critical because it will later become the dependent criteria for the study. I suggest putting particular emphasis on measurable dimensions of job performance that are specific, actionable, realistic, and time-bound.
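As a concrete (and entirely hypothetical) illustration, the deliverable might look something like this sketch for a customer service role, with each measurable dimension tied to the KSAs and traits the content experts identified:

    # Hypothetical job-analysis output: measurable performance dimensions
    # mapped to the KSAs and personality traits the content experts identified.
    job_analysis = {
        "resolves escalated calls within 24 hours": {
            "ksas": ["product knowledge", "de-escalation technique"],
            "traits": ["emotional stability", "agreeableness"],
        },
        "documents cases accurately and on time": {
            "ksas": ["CRM proficiency", "written communication"],
            "traits": ["conscientiousness"],
        },
    }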
Next, a specific personality instrument should be carefully chosen based on conformance with the APA Standards, the Uniform Guidelines on Employee Selection Procedures, the provisions of the Americans with Disabilities Act, and comments published in respected academic-level test reviews. The analyst should also pay special attention to the inventory's sub-scales and items that seem relevant to the job performance domains identified during the investigation. While all of this might seem like overkill, it ensures robust measurement based on job requirements and business necessity.
After collecting personality data, the analyst should return to the client site and conduct performance reviews by again meeting with the job content experts. These meetings should cover definitions of the job performance aspects being rated, discussion of performance results to reach clarity and agreement, and determination of final performance scores for each employee participating in the study.
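Before averaging those ratings into final criteria, it is worth checking that the experts actually agree. A minimal sketch, using hypothetical ratings from three raters:

    # Quick inter-rater agreement check on hypothetical 1-7 ratings
    # (rows = employees, columns = three job content experts).
    import numpy as np

    ratings = np.array([
        [6, 5, 6],
        [3, 4, 3],
        [5, 5, 4],
        [2, 3, 2],
        [7, 6, 6],
    ])

    pairs = [(0, 1), (0, 2), (1, 2)]
    rs = [np.corrcoef(ratings[:, i], ratings[:, j])[0, 1] for i, j in pairs]
    print(f"Mean inter-rater r: {np.mean(rs):.2f}")  # high values support pooling

    final_scores = ratings.mean(axis=1)  # consensus criterion per employee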
Finally, the analyst can use the data to test their hypotheses about how specific personality traits predict specific aspects of job performance. Following this protocol should help ensure job relevance and business necessity, clearly identify the performance criteria, select the proper instrument for the investigation based on theory, reliability, and validity, ensure that job performance raters understand the dependent criteria, and ultimately advance our knowledge of how personality traits contribute to job performance.
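A closing sketch of that final step, assuming the trait sub-scale scores and the consensus performance scores are in hand (the names and numbers here are hypothetical):

    # Testing one specific, job-analysis-based hypothesis per trait-dimension
    # pairing, rather than trawling a broad inventory for any correlate.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(1)
    n = 60  # assumed number of study participants

    conscientiousness = rng.normal(size=n)                       # sub-scale scores
    work_quality = 0.4 * conscientiousness + rng.normal(size=n)  # consensus ratings

    r, p = pearsonr(conscientiousness, work_quality)
    print(f"r = {r:.2f}, p = {p:.4f}")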