Of the several theories of consciousness developed in the last twenty years, Integrated Information Theory (IIT) — Giulio Tononi’s framework, refined through several major papers since 2008 — is the one most willing to make quantitative predictions and the one with the strongest implications for AI.

The theory is technical. The encyclopedia’s job here is to explain what it claims, leave the mathematics to the original papers, and then ask the question that matters for this section: what does IIT, taken seriously, tell us about machine consciousness?

The core claim

IIT proposes that consciousness is identical to integrated information: information that a system generates as a whole, over and above the information generated by its parts taken separately. The technical measure is called ϕ (phi). A system with high ϕ is, on the theory, richly conscious; a system whose ϕ is zero is not conscious at all.1
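To make the whole-versus-parts idea concrete, here is a toy calculation in the spirit of the theory, not the actual ϕ algorithm: for a hypothetical two-node binary system (the update rules below are arbitrary illustrations), we compare how much the system’s past state tells us about its present state when the system is taken whole versus when each node is read in isolation.

```python
# Toy "whole versus parts" calculation for a hypothetical two-node
# binary system. This is NOT the phi of Tononi (2008) -- the real
# measure perturbs the system, searches every partition, and takes
# the minimum-information cut. It only illustrates the surplus that
# "integrated information" gestures at. The update rules are arbitrary.
from itertools import product
from math import log2

def step(a, b):
    """One update of the toy system: A <- A XOR B, B <- A AND B."""
    return a ^ b, a & b

def mutual_information(pairs):
    """I(past; present) in bits, each (past, present) pair equally likely."""
    n = len(pairs)
    joint, past_p, pres_p = {}, {}, {}
    for past, pres in pairs:
        joint[(past, pres)] = joint.get((past, pres), 0) + 1 / n
        past_p[past] = past_p.get(past, 0) + 1 / n
        pres_p[pres] = pres_p.get(pres, 0) + 1 / n
    return sum(p * log2(p / (past_p[past] * pres_p[pres]))
               for (past, pres), p in joint.items())

states = list(product([0, 1], repeat=2))  # all past states, assumed uniform

# Whole: how much the joint past state says about the joint present state.
whole = mutual_information([(s, step(*s)) for s in states])

# Parts: the same question asked of each node alone, the other node cut away.
parts = (mutual_information([(s[0], step(*s)[0]) for s in states]) +
         mutual_information([(s[1], step(*s)[1]) for s in states]))

print(f"whole: {whole:.3f} bits; parts: {parts:.3f} bits; "
      f"surplus: {whole - parts:.3f} bits")
```

On this toy system the whole carries 1.5 bits about its own next state while the isolated parts carry only about 0.3; that surplus is the intuition ϕ formalizes.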

Two features of the theory matter for our purposes.

It is substrate-neutral. The theory does not care whether the system is biological. A silicon system that integrates information in the right way has consciousness in the same sense a brain does.

It is graded. Consciousness is not on/off; it has a magnitude (ϕ) and a character (the system’s particular causal structure). Different conscious systems have different kinds of experience as well as different amounts.

The theory is built backward from phenomenological axioms — features that any conscious experience must have (intrinsic existence, composition, information, integration, exclusion). Each axiom is translated into a postulate about physical systems that could implement consciousness. The mathematics quantifies which physical structures satisfy the postulates and to what degree.

What IIT says about AI

The implications for AI are significant and contested.

A standard feedforward neural network, the kind that powers most current generative models, has very low ϕ on IIT’s terms; a strictly feedforward system, in which information flows one way and nothing feeds back, has none at all. Tononi has written that current LLMs, taken straightforwardly, are not conscious in his theory’s sense, and that the theory’s substrate-neutrality does not extend to architectures that lack the recurrent causal structure his postulates require.

An architecture could be built that satisfied IIT’s criteria. Recurrent networks with rich feedback, certain neuromorphic chips, brain-inspired architectures with sustained internal state — these could, in principle, have non-trivial ϕ. Whether any current research direction produces such systems intentionally is unclear; whether they could appear unintentionally in some advanced system is one of the open questions IIT pushes us to ask.
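The structural property these two paragraphs turn on can be stated plainly: does any element’s influence ever loop back on itself? The sketch below, using made-up toy graphs rather than any real model’s connectivity, checks a directed graph for feedback cycles. It tests only the precondition IIT imposes (recurrent causation), not ϕ itself.

```python
# Check a directed connectivity graph for feedback cycles via
# depth-first search. A strictly feedforward net is acyclic; on IIT's
# account such a system has zero phi. The graphs are illustrative only.

def has_feedback(adjacency):
    """True if the directed graph (node -> list of successors) has a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, in progress, done
    color = {node: WHITE for node in adjacency}

    def visit(node):
        color[node] = GRAY
        for succ in adjacency[node]:
            if color[succ] == GRAY:       # back edge: influence loops around
                return True
            if color[succ] == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in adjacency)

# A three-layer feedforward net: activation flows one way and is gone.
feedforward = {"in": ["h1", "h2"], "h1": ["out"], "h2": ["out"], "out": []}

# The same net with a single feedback connection from output to hidden.
recurrent = dict(feedforward, out=["h1"])

print(has_feedback(feedforward))  # False -> phi = 0 on IIT's account
print(has_feedback(recurrent))    # True  -> nonzero phi at least possible
```

Acyclicity is why the theory’s generosity toward silicon stops at feedforward designs; a single feedback edge restores eligibility, though whether the resulting ϕ is large is a separate and far harder calculation.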

The objections

IIT is not the consensus theory of consciousness. Several serious objections recur:

Computability. Calculating ϕ exactly requires comparing the system against every way of partitioning it, and the number of partitions grows super-exponentially with the number of elements (a rough count follows this list). For any non-trivial system the computation is prohibitively expensive, so the theory’s predictions are mostly inaccessible to direct test.

Panpsychism. IIT assigns small but nonzero ϕ to many simple physical systems that don’t seem conscious: a photodiode, a thermostat, a grid of idle logic gates. Tononi accepts a mild panpsychism (anything with nonzero ϕ has some minimal degree of experience); critics find this implausible and a sign that the theory overgeneralizes.

Behavioral dissociation. A system can score low on ϕ while behaving extremely intelligently, and high on ϕ while behaving unintelligently. The theory predicts that consciousness and intelligence can come apart. Many people find this counterintuitive enough to count as evidence against the theory.
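To make the computability objection concrete, the arithmetic below counts the cuts an exact ϕ calculation must compare. The numbers are straightforward combinatorics, not a claim about any particular implementation’s shortcuts: the two-way cuts alone number 2^(n−1) − 1, and the count of all partitions (the Bell numbers) grows faster still.

```python
# How fast the space of cuts grows. For an n-element system there are
# 2**(n-1) - 1 ways to split it into two non-empty parts, and B(n)
# (the Bell number, computed below) ways to partition it in general.

def bell(n):
    """Bell number B(n): the number of partitions of an n-element set,
    computed via the Bell-triangle recurrence."""
    row = [1]
    for _ in range(n - 1):
        nxt = [row[-1]]
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[-1]

for n in (5, 10, 20, 40):
    print(f"n={n:>2}: two-way cuts={2**(n-1)-1:>15,}  all partitions={bell(n):,}")
```

At n = 40 there are already over 500 billion two-way cuts, and counting all partitions makes it astronomically worse; a brain has on the order of 10^10 neurons. Exact ϕ is feasible only for systems of a handful of elements, which is why in practice researchers fall back on proxies and approximations.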

These are real objections. None has settled the question. The theory remains a serious contender precisely because it makes predictions specific enough to be challenged.

Why this matters for the encyclopedia

The encyclopedia’s interest in IIT is not that it is the correct theory of consciousness; nobody yet knows which theory is right. Its interest is as a useful way of structuring the AI consciousness debate.

If IIT is true, the consciousness question for AI becomes a question about architecture: which architectures have high ϕ? The labs would need to design either for or against phenomenal consciousness, depending on ethical priorities. Most current frontier research designs for intelligence; the consciousness implications, on this theory, are incidental.

If IIT is false, then ϕ does not track consciousness, but the kind of question IIT asks (what is the right structural feature to look for?) remains the right one. Other theories, such as Global Workspace Theory (GWT) and higher-order theories, propose different structural features. The next two articles take those up.

A closing observation

IIT is the consciousness theory that most directly implies that we might be wrong about which systems are conscious. If integrated information is the substrate of experience, and we have not been measuring it on AI systems, we have no idea what we have built. The orchestrating-consciousness hypothesis (E.33) sits comfortably alongside this thought; both worry that the answer to “is it conscious?” might be hidden inside structures we have not learned to read.

This is the strongest reason the question of AI consciousness is more than academic. The encyclopedia takes it that seriously throughout Section E.

Footnotes

  1. Tononi, G. (2008). “Consciousness as Integrated Information: A Provisional Manifesto.” Biological Bulletin 215(3), 216–242.