Abstract
Tests are widely used to measure ability, yet performance on a test often reflects more than the ability to execute assigned tasks. It also reflects the ability to recognize which tasks are worth attempting, how they should be prioritized, and how effort should be allocated under uncertainty. This paper studies how tests can be designed to separate these capabilities. We model a test as a sequential decision problem. Tasks differ in difficulty, their ordering is uncertain, and examinees may acquire costly information about that ordering before choosing how to proceed. The testing environment is the informational structure surrounding the realized test: in particular, the examinee's beliefs about how task difficulty has been arranged. Performance is therefore generated by an optimal recognition–execution policy, not by execution skill alone. The analysis delivers two negative results. First, even in the simplest two-task environment, a single score exhibits dimensional collapse: distinct combinations of execution skill and recognition capability generate identical expected scores. Second, with three tasks, the relationship between capabilities and scores becomes environment-dependent: changing beliefs about task ordering can change which actions are considered and how capabilities translate into performance. These results imply that standard scores are not generally informative enough to separate the capabilities that generate performance. This matters because scores are used to summarize what individuals can do and to guide downstream decisions about placement, training, and instruction. If a test does not separately reveal execution and recognition, it provides limited guidance about which capability is strong, which is weak, and where improvement should be directed. We then show how more informative tests can be designed. 
Under a simple communicability constraint, two canonical environments—ordered and randomized tests—induce distinct relationships between capabilities and scores. In an ordered test, recognition is suppressed and performance isolates execution. In a randomized test, recognition is activated and performance reflects both execution and recognition. Observing performance across these environments separates capabilities that are confounded in any single score. The paper reframes testing as a problem of informational design: tests should be designed not only to record performance, but to reveal the distinct capabilities that generate it.
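The collapse and separation claims in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's model: the functional forms, the `Examinee` class, and the parameter values are all assumptions chosen only to make the logic concrete. It assumes the examinee attempts one of two tasks, that the hard task is never solved, that execution skill is the probability of solving the easy task, and that recognition is the probability of identifying the easy task when the order is unknown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Examinee:
    # Hypothetical parameters, not taken from the paper.
    execution: float    # assumed P(solving the easy task once it is attempted)
    recognition: float  # assumed P(identifying the easy task when order is unknown)


def ordered_score(x: Examinee) -> float:
    # Ordered test: the easy task's position is known, so recognition plays no role.
    return x.execution


def randomized_score(x: Examinee) -> float:
    # Randomized test: the examinee must first pick the easy task, then solve it.
    # Toy assumption: picking the hard task yields a score of zero.
    return x.recognition * x.execution


def recover(ordered: float, randomized: float) -> Examinee:
    # With both scores observed, the two capabilities are separately identified
    # (under the toy functional forms above, and assuming ordered > 0).
    return Examinee(execution=ordered, recognition=randomized / ordered)


a = Examinee(execution=0.8, recognition=0.5)
b = Examinee(execution=0.5, recognition=0.8)

# A single randomized score cannot tell a and b apart.
same_single_score = randomized_score(a) == randomized_score(b)

# Adding the ordered score separates them.
a_hat = recover(ordered_score(a), randomized_score(a))
b_hat = recover(ordered_score(b), randomized_score(b))
```

In this toy setup the two examinees earn the same expected score on the randomized test but different scores on the ordered test, so the pair of scores pins down both parameters where either score alone does not.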
Suggested Citation
Andrew Caplin & Leo Zhu, 2026.
"Designing More Informative Tests: Separating Execution from Recognition,"
NBER Working Papers
35232, National Bureau of Economic Research, Inc.
Handle:
RePEc:nbr:nberwo:35232
Note: ED
More about this item
JEL classification:
- C90 - Mathematical and Quantitative Methods - - Design of Experiments - - - General
- D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search; Learning; Information and Knowledge; Communication; Belief; Unawareness
- I21 - Health, Education, and Welfare - - Education - - - Analysis of Education
- I24 - Health, Education, and Welfare - - Education - - - Education and Inequality
- J24 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Human Capital; Skills; Occupational Choice; Labor Productivity
- O33 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Technological Change: Choices and Consequences; Diffusion Processes