Finding: Lepine, Kim, Mishkin & Beane, arXiv:2505.10742, 2025

Measuring and Managing Cognitive Load in Learning

How working-memory limits shape learning, why extraneous load is the enemy of good instruction, and what “just enough scaffolding” means for adaptive tutoring.


Cognitive Load Theory (CLT) starts from a hard constraint: working memory is small and brief, while long-term memory is effectively unlimited. Learning is the process of building schemas in long-term memory, but everything has to pass through the working-memory bottleneck first. When that bottleneck is overloaded, learning stalls. The whole field is an attempt to design instruction that respects this limit.

The kinds of load

CLT, formalized by John Sweller in his 1988 paper on cognitive load during problem solving, distinguishes types of load by their source. Intrinsic load is the inherent difficulty of the material, driven by element interactivity: how many pieces a learner must hold in mind at once because they depend on each other. Extraneous load is imposed by how material is presented rather than by the material itself. Poor layout, confusing wording, and forcing learners to mentally integrate separated sources all add extraneous load that does nothing for learning.

A third category, germane load, was proposed for the effort that goes directly into schema construction. In their 2019 review, “Cognitive Architecture and Instructional Design: 20 Years Later,” Sweller, van Merriënboer, and Paas revised its status, treating germane load not as a separate source but as the working-memory resources devoted to handling intrinsic load. The cleaner contemporary framing is: minimize extraneous load, manage intrinsic load, and the productive effort takes care of itself.

What this implies for instruction

Good instruction cuts extraneous load through clear organization and by avoiding the split-attention effect, where related information is physically or temporally separated and the learner wastes capacity stitching it back together. Intrinsic load is managed by sequencing, attending to prerequisites, and using worked examples that show a full solution instead of forcing novices to search the problem space blindly.

The catch is the expertise reversal effect. Guidance that helps a novice can hinder an expert: once a learner has the relevant schema, redundant scaffolding becomes extraneous load they must process and discard. This is why “just enough scaffolding” is not a slogan but a measurable design target. Support has to fade as competence grows.
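One way to make fading concrete is a rule that maps an estimate of the learner's mastery to a level of support. The sketch below is illustrative only: the function name `scaffold_level`, the 0-to-1 mastery scale, and the 0.4 and 0.8 cut-offs are all assumptions made for this example, not values from the CLT literature. The progression from worked example to completion problem to independent problem is a commonly described fading sequence.

```python
def scaffold_level(mastery: float) -> str:
    """Map an estimated mastery in [0, 1] to a support level.

    The thresholds are hypothetical placeholders; a real system
    would calibrate them against learner data.
    """
    if not 0.0 <= mastery <= 1.0:
        raise ValueError("mastery must be between 0 and 1")
    if mastery < 0.4:
        # Novice: show the full solution to avoid blind search.
        return "worked_example"
    if mastery < 0.8:
        # Intermediate: partially worked solution, learner fills the gaps.
        return "completion_problem"
    # Near-expert: extra guidance would now be redundant.
    return "independent_problem"


if __name__ == "__main__":
    for m in (0.1, 0.5, 0.9):
        print(m, scaffold_level(m))
```

The point of the design is that support is a function of the learner's state, so the same item can be presented three ways depending on who is looking at it.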

Measuring load

Because load is internal, it has to be inferred. The most common tools are subjective scales: Paas’s single-item mental-effort rating and the multidimensional NASA-TLX. These are cheap and validated but coarse and retrospective. Researchers also use physiological and behavioral signals, including pupillometry, eye-tracking, EEG, and interaction traces, to estimate load closer to real time.
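To show what a NASA-TLX score looks like in practice, here is a minimal sketch of the two usual scoring variants. The raw score is the unweighted mean of the six subscale ratings (each on a 0 to 100 scale); the weighted score multiplies each rating by how often that subscale was chosen across the 15 pairwise comparisons and divides by 15. The function names, dictionary keys, and sample ratings are made up for illustration.

```python
SUBSCALES = (
    "mental", "physical", "temporal",
    "performance", "effort", "frustration",
)


def _check_ratings(ratings: dict) -> None:
    """Validate that all six subscales are present and within 0..100."""
    for name in SUBSCALES:
        if name not in ratings:
            raise ValueError(f"missing subscale: {name}")
        if not 0 <= ratings[name] <= 100:
            raise ValueError(f"{name} rating must be in 0..100")


def raw_tlx(ratings: dict) -> float:
    """Raw TLX: unweighted mean of the six subscale ratings."""
    _check_ratings(ratings)
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


def weighted_tlx(ratings: dict, weights: dict) -> float:
    """Weighted TLX: weights are pairwise-comparison tallies summing to 15."""
    _check_ratings(ratings)
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


if __name__ == "__main__":
    # Hypothetical ratings from one participant.
    ratings = {"mental": 70, "physical": 10, "temporal": 55,
               "performance": 40, "effort": 65, "frustration": 30}
    weights = {"mental": 5, "physical": 0, "temporal": 3,
               "performance": 2, "effort": 4, "frustration": 1}
    print(raw_tlx(ratings))                # 45.0
    print(weighted_tlx(ratings, weights))  # 59.0
```

In this sample the weighted score is higher than the raw one because the participant ranked the subscales they rated highest (mental demand, effort) as the most important sources of workload.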

This is where adaptive systems come in. Lepine and colleagues’ 2025 study, “Precision Proactivity,” analyzed financial professionals working with GPT-4o, estimating intrinsic and extraneous load from their task transcripts. Extraneous load carried the largest negative association with output quality, roughly three times that of intrinsic load. Their proposal is that AI tutors and assistants should calibrate intervention to a user’s inferred cognitive state rather than flooding them with help. That is the expertise reversal effect restated for ed-tech: detect the load, then give exactly enough.
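The "detect the load, then give exactly enough" idea can be sketched as a small decision rule. This is not the method from the Lepine et al. paper; it is a hypothetical illustration. The `LoadEstimate` type, the 0-to-1 load scale, the 0.6 threshold, and the intervention labels are all assumptions. The only thing borrowed from the finding is the ordering: extraneous load is checked first because it carried the larger negative association with output quality.

```python
from dataclasses import dataclass


@dataclass
class LoadEstimate:
    """Inferred load on a 0..1 scale (hypothetical representation)."""
    intrinsic: float
    extraneous: float


def choose_intervention(load: LoadEstimate, threshold: float = 0.6) -> str:
    """Pick at most one intervention; the default is to stay out of the way."""
    if load.extraneous >= threshold:
        # Presentation is the problem: trim or reorganize, do not add more.
        return "simplify_presentation"
    if load.intrinsic >= threshold:
        # The material itself is hard: break it into a smaller next step.
        return "offer_next_step"
    # Load is manageable: extra help would itself become extraneous.
    return "stay_quiet"


if __name__ == "__main__":
    print(choose_intervention(LoadEstimate(intrinsic=0.3, extraneous=0.8)))
    print(choose_intervention(LoadEstimate(intrinsic=0.7, extraneous=0.2)))
    print(choose_intervention(LoadEstimate(intrinsic=0.2, extraneous=0.1)))
```

Returning a single action, with silence as the default, mirrors the argument above: help that is not needed is a cost, not a neutral extra.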