When CATs are DOGs: The Limitations of Computer Adaptive Testing
Created by Rich Griffith
Computer Adaptive Testing (CAT) has been around for almost thirty years. It has many advantages, but, as with anything else, there are areas where it simply doesn’t make sense.
CAT is built on Item Response Theory (IRT). With IRT, test developers estimate stable parameters for each test item, including its difficulty level and its degree of discriminability. These parameters let us determine how likely people at a given ability level are to get the item correct. So, while 50% of everyone who takes a particular item might answer it correctly, with IRT we know that people at the 70th percentile in ability have, say, an 85% chance of getting it correct. Discriminability goes one step further because it describes the slope of the item’s response curve. For instance, two items may have exactly the same difficulty level, but one has a very steep slope, so that nearly everyone at the 70th percentile in ability gets it correct, while the other has a shallower slope, so that the percentage correct rises more gradually with ability. That’s the basis of IRT, and it can be applied to dichotomous items as well as polytomous personality scales.
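To make difficulty and discriminability concrete, here is a minimal sketch in Python. It assumes the common two-parameter logistic (2PL) form of the item response curve and an ability scale that is standard normal in the population. Neither is specified in this article, and the two items and their parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def p_correct(theta, a, b):
    """2PL item response curve: chance of a correct answer at ability theta.

    a is the discrimination (slope) and b is the difficulty (the ability
    level at which the chance of a correct answer is exactly 50%).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Assumption: ability is standard normal, so the 70th percentile is a z-score.
theta_70 = NormalDist().inv_cdf(0.70)

# Two hypothetical items with the same difficulty but different slopes.
steep_item = {"a": 2.5, "b": 0.0}
shallow_item = {"a": 0.7, "b": 0.0}

for label, item in (("steep", steep_item), ("shallow", shallow_item)):
    at_average = p_correct(0.0, **item)
    at_70th = p_correct(theta_70, **item)
    print(f"{label}: P(correct) at average ability = {at_average:.2f}, "
          f"at 70th percentile = {at_70th:.2f}")
```

Both items give an average-ability candidate a 50% chance, but the steep item separates the 70th-percentile candidate from the average one much more sharply than the shallow item does.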
The advantage CAT brings is that it can present different candidates with different items based on their general ability level and then home in on that level in less time than a traditional test. It doesn’t make a lot of sense to give average first graders advanced calculus problems, because they will get them all wrong. In the same vein, you don’t learn a lot from giving graduate-level mathematicians simple algebra questions, because they will get them all right. For those reasons, CAT really shines when you’re measuring things like logical reasoning, mathematical skills, or knowledge-based items. In these situations, CAT not only reduces test-taking time but also reduces the potential for cheating or item piracy by limiting exposure to the items.
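The adaptive loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular CAT product works. It assumes a 2PL response model, picks the unused item with the most information at the current ability estimate, and re-estimates ability as a posterior mean over a coarse grid with a standard normal prior. The item bank, the `run_cat` helper, and the two simulated candidates are all hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL chance of a correct answer (a = slope, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


GRID = [g / 10.0 for g in range(-40, 41)]  # ability grid from -4.0 to 4.0


def estimate_ability(responses):
    """Posterior-mean ability given [(a, b, correct), ...], normal prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for a, b, correct in responses:
            p = p_correct(theta, a, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def run_cat(bank, answer, n_items=5):
    """Administer n_items adaptively; answer(a, b) returns True if correct."""
    theta = 0.0  # start at average ability
    remaining = list(bank)
    responses, administered = [], []
    for _ in range(min(n_items, len(remaining))):
        # Choose the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda it: item_information(theta, it[0], it[1]))
        remaining.remove(item)
        a, b = item
        responses.append((a, b, answer(a, b)))
        administered.append(item)
        theta = estimate_ability(responses)
    return theta, administered


# Hypothetical bank: 13 items, equal slopes, difficulties from -3.0 to 3.0.
bank = [(1.5, b / 2.0) for b in range(-6, 7)]

# Simplified, deterministic candidates: each answers correctly only when the
# item difficulty is at or below a fixed threshold.
strong_theta, strong_items = run_cat(bank, lambda a, b: b <= 1.2)
weak_theta, weak_items = run_cat(bank, lambda a, b: b <= -1.2)

print("strong candidate saw difficulties:", [b for _, b in strong_items])
print("weak candidate saw difficulties:  ", [b for _, b in weak_items])
print("estimates:", round(strong_theta, 2), round(weak_theta, 2))
```

Both candidates start with the same middle-difficulty item, but after that the strong candidate is routed toward harder items and the weak candidate toward easier ones, which is why a short adaptive test can locate each of them without presenting the whole bank.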
So where does CAT fall short? There are a couple of applications where CAT has significant limitations. The first is with Situational Judgment Tests (SJTs), which have grown dramatically in popularity because they are flexible, face valid, offer a solid measure of ability, and provide an alternative measure of personality. The problem is that CATs can’t accurately measure SJT items. The IRT models behind CATs measure only unidimensional items, and SJT items are typically multidimensional. To work properly, IRT needs to measure a unidimensional construct, even if it’s a relatively messy construct like knowledge of American history. It doesn’t work at all if you’re measuring logical reasoning, conscientiousness, and customer service at the same time, as you often are in SJTs. Most, if not all, SJTs gather information on multiple competency areas, such as honesty and customer service, with a single robust item. It’s one of the SJT’s greatest strengths. Therefore, CAT, or any approach that mechanically decides whether an item is shown to a candidate based on an assumption of unidimensionality, is inherently flawed when applied to SJTs. Further, it will lead to skewed, if not downright incorrect, information about the candidate.
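A small sketch may help show why a single ability number breaks down here. It assumes a hypothetical SJT item whose chance of eliciting the keyed "best" response depends on two traits at once (honesty and customer service), combined in a simple compensatory way. The response function, weights, and candidate profiles are illustrative assumptions, not something taken from a real test.

```python
import math


def p_best_response(honesty, service, w_honesty=1.0, w_service=1.0):
    """Hypothetical two-trait item: both traits push toward the keyed answer."""
    z = w_honesty * honesty + w_service * service
    return 1.0 / (1.0 + math.exp(-z))


# Two hypothetical candidates with opposite trait profiles.
candidate_a = {"honesty": 1.5, "service": -0.5}
candidate_b = {"honesty": -0.5, "service": 1.5}

p_a = p_best_response(**candidate_a)
p_b = p_best_response(**candidate_b)

print("candidate A:", candidate_a, "P(best response) =", round(p_a, 3))
print("candidate B:", candidate_b, "P(best response) =", round(p_b, 3))
```

The two candidates are equally likely to pick the keyed answer, so a procedure that treats the item as measuring one ability would place them at the same level and route them to the same next item, even though their profiles on the two competencies are opposite.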
Another problem with CAT relates to cross-cultural applications. I’m a big fan of SJTs. I think they’re some of the best ways to measure a broad range of traits in a user-friendly manner. When you use them internationally, it’s not unusual for the content of a scenario to change from one culture to another. That’s not a big deal and is handled by a cultural review and sound translations. However, when you throw CAT into the mix, it becomes a mess. Imagine developing a CAT based on an SJT developed here in the U.S., which as I mentioned earlier is problematic in and of itself, and then trying to apply it to a new culture and language. There is simply no reasonable way for the underlying item parameters to remain invariant across cultures.
As it currently stands, CAT is a solid tool when your goal is to shorten a long test of specific content knowledge or cognitive ability and to make its items more relevant to each candidate. But CAT becomes counterproductive when applied to testing methods outside that realm, such as personality measures, SJTs, or even biodata, where there is not only inherent multidimensionality but where significant cross-cultural issues also come into play.
Copyright © 1999-2025 by HR.com - Maximizing Human Potential. All rights reserved.