A Theory On Becoming an Expert
What I love most about learning and performance is the euphoria that comes with doing something you did not know you were capable of doing. I’ve come to appreciate that, when you live at the edge of your capability, you find out who you really are.
Over the past two years, I made the transition from building developer tools and infrastructure to doing applied ML research. My thesis was that if AI is the “last technology,” understanding these systems at a deep level was going to matter. I wanted to move from consuming to shaping, to influence the future of how humans interact with computers and raise the ceiling of what humans can do.
This post is a working theory of what becoming an expert means in the age of AI, written for someone trying to build real expertise in an environment that makes feeling like an expert easier than it has ever been.
I’ve had a number of transitions after finishing my career as an athlete and college football coach, but none as big as this one. I would equate the process of trying to speed-run deep learning and the full spectrum of machine learning to chewing glass. The cognitive ground I had to cover at the lower levels of understanding looked different from anything I had ever encountered.
The instinct to pull back is the brain’s natural response to the feeling of incompetence. The hardest discipline as a learner is to actively press into the feeling rather than away from it.
The brain is built to generalize. The lever you control is intentionality, what you choose to put yourself through.
The person who arrives at the right answer instantly is not necessarily processing faster, though raw speed matters. They have built, through accumulated experience and reflection, the kind of interconnected mental architecture that allows rapid retrieval. The quick, correct intuition we call talent is the visible surface of invisible preparation.
Networks are built. The richly interconnected mind does not arrive in that state. It develops through some combination of reading, experience, and reflection. If intelligence is partly a matter of how knowledge is organized, then it is partly something that can be learned. The architecture of the mind is constructed, not given. Intelligence is not fixed. Expertise is cultivated.
A learner’s prior knowledge in a given domain is the single most significant factor shaping their trajectory. The hard part of constructing this architecture is that we are bad at knowing how little we know. Three failure modes play out whenever we lack knowledge in a domain.
- We confuse our subjective knowledge with objective reality.
- We overestimate what we think we know.
- We assume we have all the information we need.
AI sharpens all three. The systems on the other side of the prompt raise the floor of what you can produce, which makes the gap between what you produce and what you understand harder to feel. Anthropic’s latest study on how AI impacts skill formation found that developers using AI to learn a new library rated the task easier than the control group did, then scored 17% lower on a concept quiz minutes after finishing (Shen and Tamkin, How AI Impacts Skill Formation, 2026).
Success in this journey is less about knowing an “answer” and more about being able to define what great looks like, and knowing which questions to ask to get there.
The framework is simple:
- Scaffolding gives you the first lens.
- Syntax lets you speak the language.
- Decomposition shows you the system.
- Tree expansion gives you sequence.
- Contact turns knowledge into retrieval.
- Calibration keeps you honest.
Throughput is how many real contact loops you execute per unit time.
Effectiveness is how much useful learning signal each loop extracts.
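One way to make the relationship between those two quantities concrete (this framing and the function below are my own illustration, not from any cited study) is as a product: total learning per unit time is throughput times effectiveness, so many shallow loops and a few deep ones can extract the same signal.

```python
def learning_per_week(loops_per_week: float, signal_per_loop: float) -> float:
    """Toy model: learning rate = throughput (loops/week) x effectiveness (signal/loop)."""
    return loops_per_week * signal_per_loop

# Ten shallow loops can match two deep ones; neither factor alone decides the rate.
assert learning_per_week(10, 0.2) == learning_per_week(2, 1.0) == 2.0
```

The point of the toy model is that optimizing either factor to zero kills the product: infinite reading with no contact, or rare contact with rich feedback, both stall.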
What an Expert Actually Is
What counts as an expert? In the cognitive science literature, the canonical definition is consistently superior performance on the representative tasks of a domain (Ericsson and Lehmann, 1996). What separates experts from novices in producing that performance is representational. Experts organize problems by their underlying principles.
This is hard to feel from the inside because of the illusion of information adequacy. Gehlbach and colleagues (2024) showed that people given only a portion of the relevant information about a decision still reported feeling adequately informed, and the illusion was not corrected by intelligence or education. Koch (2026) found that AI use accelerates this. Output rises while metacognitive accuracy falls.
So if a learner cannot reliably feel what they are missing, what do experts have that lets them not miss it? Experts see structure.
Adriaan de Groot’s 1946 chess research, formalized by Chase and Simon in 1973, showed chess masters and novices a real-game position for five seconds, then asked them to reconstruct it. Masters reproduced the position with high accuracy. Novices were near random. Then they ran the control. They placed the same pieces on the board in random configurations that had never occurred in actual play. The masters’ advantage almost entirely disappeared.
The masters had been seeing a handful of meaningful chunks, where each chunk encoded a familiar pattern such as a king-side pawn structure, a piece coordination, or an opening trace. Their advantage was in the organization of perception, not in the raw capacity to take in information.
Chi, Feltovich and Glaser (1981) extended the finding outside chess. They asked novice and expert physicists to sort physics problems into categories. Novices grouped by surface features (inclined planes, pulleys, springs). Experts grouped by underlying principle (conservation of energy, Newton’s second law). The expert’s primary representation was at the level of solution method, not at the level of problem appearance. Same problems, entirely different schema.
A schema is an organized structure of related knowledge that supports rapid pattern recognition and retrieval, the mental architecture that holds knowledge in long-term memory. Zooming out, expertise can be considered a structure on top of stored knowledge, enabling you to search and rank chunks of information into recognizable patterns. The schema is what makes problems solvable. Quick, correct intuition is the surface manifestation of a “file system” that has been built underneath.
Finzi et al. (2026) show how much sequencing matters from a curriculum standpoint. Effectively, what a computationally bounded learner can extract from data depends on how the data is ordered and presented, not only on what it contains. (Their notion of epiplexity quantifies how much usable information a computationally bounded observer can extract from a dataset.) Models and humans obviously aren’t the same thing, but we share properties of useful learning. Translation: how you sequence and structure learning matters as much as the raw scale of information consumption.
Scaffolding & Syntax
If you want a learner to think independently, start with structure. Scaffolding is the entry point to building expertise.
Wood, Bruner, and Ross (1976) defined scaffolding as a process enabling “a novice to solve a problem, carry out a task, or achieve a goal which would be beyond his unassisted efforts.”
The first thing scaffolding installs is the syntax of the domain. Vocabulary, conventions, the patterns by which experts compose their thinking. You cannot move within a field whose syntax you do not yet recognize.
Scaffolds give you a functional template or starter kit to review, understand the syntax, and mentally map the core components. The point is to make the structure of the domain visible and usable, so the learner can do work that would be impossible without the support.
Scaffolding is a lens. It changes what the learner can see, comprehend, and compound.
Fluency is learning which terms map to which mechanisms. Language is the mechanism of instruction. It is also the bridge to communicating with other experts, which is the channel through which most domain knowledge actually transfers. Without the syntax, the conversation with experienced practitioners has a hard time getting started.
An expert hands you the lens that took them years to build. You spend a small amount of time learning to look through it, and you start seeing the structure of the field rather than its surface. The lens is what makes the rest of the construction possible.
What scaffolding ultimately gives you is a rubric: a working definition of what “great” looks like along each dimension of the field. (A rubric is an external definition of quality: the criteria, dimensions, examples, and error signals an expert uses to judge whether work is actually good.) Start with the rubric and work backwards. You can measure progress against an external target instead of your own felt sense of how things are going. Eyeballing it can work, but more often it leaves you drifting, with varying levels of clarity about where you actually are.
Concept Decomposition
Once you have the syntax of a domain, the next step is to build a map.
This is where a lot of learners get stuck. They learn the vocabulary, read enough to follow the conversation, and mistake familiarity for structure. AI makes this easier to do because it can explain any concept on demand. You can keep asking for summaries, definitions, analogies, and historical context until the field starts to feel legible. But legibility is not the same thing as understanding.
A domain is a system of primitives, mechanisms, constraints, incentives, artifacts, and failure modes.
Start with the primitives and mechanisms: what the field manipulates and how those things interact. In machine learning, the primitives might be data, models, losses, gradients, evaluations, and deployment surfaces. In coaching, they might be players, assignments, leverage, spacing, technique, and decision rules. The mechanisms are the interactions underneath them: gradients update weights, incentives shape behavior, pressure changes decision quality, and bottlenecks move through a system as constraints are removed.
Then look at the constraints and incentives: what reality will not let you ignore, and what the system rewards. Compute budget, physics, regulation, attention, distribution shift, human trust, latency, capital, and scarcity all shape what remains possible. Incentives explain why the system behaves the way it does. What gets rewarded? What gets punished? Who benefits from the current arrangement? Where does money flow? Where does status flow? Where does truth enter the system, and where does it get distorted?
The artifacts are the things experts actually produce and judge. Papers, models, benchmarks, codebases, contracts, investment memos, experimental protocols, scouting reports, design docs, postmortems. If you want to understand a domain, study its artifacts. They reveal what the community considers worth preserving.
The failure modes show you where the structure breaks. Reward hacking, overfitting, brittle abstractions, bad incentives, false positives, proxy metrics. These are often the fastest path to understanding the real shape of the domain because they reveal which parts of the map are load-bearing.
The structural questions that have helped me most are:
- What are the core objects?
- What is the flow of the value chain? (A value chain is how work turns into value in a domain: who does what, what gets produced, who pays, who benefits, and where money, status, or leverage flows.)
- What are the constraints?
- What are the incentives?
- What are the failure modes?
This is how you make the domain searchable in your own head. Once you can decompose a field into primitives and mechanisms, new information has somewhere to attach. You are no longer collecting isolated facts. You are placing each new fact into a structure.
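As a toy illustration (my own construction, not from the cited literature, with invented example entries), the decomposition above can be written down literally as a structure you fill in per domain, which is what makes it searchable:

```python
from dataclasses import dataclass, field

@dataclass
class DomainMap:
    """A domain decomposed into the structural categories named above."""
    primitives: list[str] = field(default_factory=list)     # what the field manipulates
    mechanisms: list[str] = field(default_factory=list)     # how those things interact
    constraints: list[str] = field(default_factory=list)    # what reality won't let you ignore
    incentives: list[str] = field(default_factory=list)     # what the system rewards
    artifacts: list[str] = field(default_factory=list)      # what experts produce and judge
    failure_modes: list[str] = field(default_factory=list)  # where the structure breaks

# Hypothetical fill-in for machine learning, echoing the examples in the text.
ml = DomainMap(
    primitives=["data", "models", "losses", "gradients", "evaluations"],
    mechanisms=["gradients update weights", "benchmarks steer research attention"],
    constraints=["compute budget", "distribution shift", "latency"],
    incentives=["benchmark wins", "citations", "deployment impact"],
    artifacts=["papers", "models", "benchmarks", "codebases"],
    failure_modes=["overfitting", "reward hacking", "proxy metrics"],
)
```

A new fact then has somewhere to attach: reading about a new evaluation harness, you know it lands under artifacts, and its gaming under failure modes.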
This is where learning starts to compound. Scaffolding, syntax, and decomposition are the first three layers of the map.
Tree Expansion
Once you have a map, you need a sequence.
This is curriculum design: deciding the order of exposure that makes the structure usable later.
A learning tree is a way to narrow the search space. (This is loosely analogous to Monte Carlo tree search, where a system does not exhaustively search every possible path; it samples and expands promising branches, using feedback to decide where search should continue. See Browne et al. on Monte Carlo Tree Search.) You are not trying to learn everything at once. You are deciding which branch deserves attention now, which branches are prerequisites, and which branches should wait until the structure underneath them is stronger.
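A minimal sketch of the analogy (illustrative only; the branch names, scores, and scoring rule are invented, and real MCTS does much more): pick the branch with the best balance of observed payoff and unexplored potential, the same exploitation/exploration tradeoff a UCB-style tree search uses.

```python
import math

def pick_branch(stats: dict[str, tuple[float, int]], c: float = 1.4) -> str:
    """UCB1-style selection. stats maps branch -> (accumulated signal, visits).

    Branches that have paid off get revisited; a never-visited branch is
    always tried before any branch is exploited further.
    """
    total_visits = sum(visits for _, visits in stats.values())

    def ucb(branch: str) -> float:
        signal, visits = stats[branch]
        if visits == 0:
            return float("inf")  # unexplored branches win automatically
        return signal / visits + c * math.sqrt(math.log(total_visits) / visits)

    return max(stats, key=ucb)

# Hypothetical learning log: (signal extracted so far, number of study loops).
stats = {"transformers": (3.0, 4), "rl": (1.0, 2), "interpretability": (0.0, 0)}
print(pick_branch(stats))  # the unexplored branch is selected first
```

The mapping to the text: "signal" is how much each loop on that branch actually taught you, and the exploration term is the formal version of letting the trunk lead you to the next boundary condition.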
The trunk is the set of concepts that everything else depends on. The branches are the subdomains, techniques, tools, institutions, artifacts, and recurring arguments underneath the field. Start with the trunk, then let the trunk lead you to the next boundary condition. (A boundary condition is the constraint or edge case that tells you where your current understanding stops working; it marks the next place your map needs to expand.) This is where throughput gets decided: which real loops are available to you, and in what order you can run them.
In machine learning, the trunk might be tensors, optimization, loss functions, data, evaluation, and generalization. From there, the branches become transformers, reinforcement learning, inference systems, interpretability, multimodal models, or distributed training. Each branch has its own syntax, artifacts, constraints, and failure modes.
Research on human memory shows that the order of learning changes what can be learned from the same information. Bjork’s work on desirable difficulties argues that learning conditions that make performance feel worse in the short term can produce more durable retention and transfer later, especially when they force effortful retrieval, variation, and reconstruction rather than passive recognition. (Desirable difficulties are learning conditions that create productive friction: they can reduce short-term performance while improving long-term retention and transfer. See Bjork, Memory and Metamemory Considerations.)
Learning should feel hard. If it does not, ask whether you are actually learning or only recognizing. Productive difficulty creates better questions. Noise only creates confusion.
A good learning tree should answer a few questions:
- What concepts are prerequisite to everything else?
- Which branch gives me the most leverage if I understand it deeply?
- Which adjacent branches should I interleave so I learn the difference between them? (Interleaving means mixing related but distinct problem types instead of practicing one type in isolation; it can make practice feel harder while improving later discrimination and transfer. See Taylor and Rohrer, The Effects of Interleaved Practice.)
- Where is the difficulty productive, and where is it just noise?
- Where is the scaffold still helping, and where is it starting to slow me down? (The expertise reversal effect is the finding that instructional support useful for novices can become redundant or harmful as expertise increases. See Kalyuga, Expertise Reversal Effect.)
Success in exploring branches looks less like having the answer and more like knowing what question to ask next. Each branch should make the next branch easier to see.
Loop It & Make Contact
Two phrases I come back to when I feel stuck:
- Energy follows motion.
- Action produces information.
Both point the same way: you learn by moving. All of that motion takes mental energy.
Contact, through experiments, teaching, writing, building, publishing, or working with someone who can see what you cannot see yet, turns the map into something your mind can retrieve under pressure. (At the biological level, repeated practice is supported by neural plasticity, including changes in myelination. Myelin is the insulating layer around nerve fibers that helps signals transmit quickly and efficiently; myelin plasticity appears to help tune circuits involved in learning and skilled behavior. See MedlinePlus and Nature Reviews Neuroscience.)
Every time you act, observe the result, update the map, and try again, the loop gives you information you could not have gotten from passive study.
The key is that the loop has to touch something real.
Calibration
Calibration is the habit of comparing your internal map against external evidence. Did the thing work? Did an expert agree? Did the benchmark measure what you thought it measured? Did the artifact survive use? Did the market care? Did the system fail in the way your map predicted?
This is where effectiveness gets decided: whether the loop actually corrects the map. The point is to build loops that reveal where you are wrong quickly enough to update.
A real contact loop is one where you can name the artifact, name the judge, name the signal, name the failure that would change your mind, and name the part of your map that needs to be updated.
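That checklist can be sketched as a literal fill-in structure (the class, field names, and example entries below are my own illustration): a loop only counts as real if every component is concretely named.

```python
from dataclasses import dataclass

@dataclass
class ContactLoop:
    artifact: str     # the real thing you produce
    judge: str        # who or what evaluates it
    signal: str       # the feedback you actually receive
    falsifier: str    # the result that would prove your map wrong
    map_update: str   # the part of your map that result would revise

    def is_real(self) -> bool:
        """Real only if every component is named; a blank field means a fake loop."""
        return all(vars(self).values())

# Hypothetical example of a loop that touches something real.
loop = ContactLoop(
    artifact="a reproduction of a published training run",
    judge="the paper's reported benchmark numbers",
    signal="my run's score versus the reported score",
    falsifier="a gap too large to blame on seeds or hardware",
    map_update="my assumptions about which hyperparameters matter",
)
print(loop.is_real())  # True
```

A loop you cannot fill in completely, most often because the falsifier is missing, is the kind of activity that produces output without correcting the map.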
This is the difference between collecting information and being able to use it under pressure. Retrieval gets built when your mind has to reach for the right structure on a clock.
The expert is the person whose map has been corrected by reality enough times that the structure becomes visible quickly.
How To Make This Real
To pressure test and evaluate your own understanding, ask yourself:
Can you define what “great” looks like? Can you recognize it when you see it? Can you explain why one artifact is better than another? Do you know which questions move you closer? Do you know where your current map is weak?
Expertise is not the possession of information. It is the construction of a map. Scaffolding gives you the first lens. Syntax lets you speak the language. Decomposition shows you the system. Tree expansion gives you sequence. Contact turns knowledge into retrieval. Calibration keeps you honest.
AI can accelerate every stage, but it also hides the gap. It can make weak maps sound fluent and make summaries feel like understanding: the confidence of structure before any structure is built.
At the edge of your understanding, progress is invisible. Slowly, then suddenly, the structure shows up.