As communication channels move from synchronous and embodied to asynchronous and structured, they trade repair speed for message permanence. Herbert Clark and Susan Brennan argued in 1991 that every communication medium imposes a specific set of constraints on how participants establish mutual understanding, a process they called “grounding.” These constraints are not arbitrary; they determine which grounding techniques are available and which are blocked.
| Constraint | Definition |
|---|---|
| Copresence | Participants share the same physical environment |
| Visibility | Participants can see each other |
| Audibility | Participants can hear each other |
| Contemporality | Receiver gets message roughly as sender produces it |
| Simultaneity | Both parties can send and receive at the same time |
| Sequentiality | Turns arrive in the order they were sent |
| Reviewability | Receiver can re-examine previous messages |
| Revisability | Sender can revise message before it is received |

| Channel | Copres. | Visib. | Audib. | Contemp. | Simul. | Sequen. | Review. | Revis. |
|---|---|---|---|---|---|---|---|---|
| Face-to-face | Yes | Yes | Yes | Yes | Yes | Yes | No | No |
| Video call | No | Yes | Yes | Yes | Yes | Yes | No | No |
| Phone | No | No | Yes | Yes | Yes | Yes | No | No |
| Chat/IM | No | No | No | Near | No | Yes | Yes | Yes |
| Email | No | No | No | No | No | Yes | Yes | Yes |
| Structured API | No | No | No | No | No | Yes | Yes | Yes |
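
The matrix can also be treated as data, which makes the trade-off queryable. The following is a minimal sketch in Python that encodes a few rows from the table; the names `Constraint`, `CHANNELS`, `FAST_REPAIR`, and the helper functions are hypothetical and not part of Clark and Brennan's framework. It assumes that visibility, audibility, and simultaneity stand in for the fast repair signals (faces, tone of voice, interruption), and it treats Chat/IM's "Near" contemporality as absent for simplicity.

```python
from enum import Enum, auto


class Constraint(Enum):
    """The eight constraints from the definition table."""
    COPRESENCE = auto()
    VISIBILITY = auto()
    AUDIBILITY = auto()
    CONTEMPORALITY = auto()
    SIMULTANEITY = auto()
    SEQUENTIALITY = auto()
    REVIEWABILITY = auto()
    REVISABILITY = auto()


C = Constraint

# A few rows of the matrix: each channel maps to the constraints marked "Yes".
# Assumption: Chat/IM's "Near" contemporality is recorded as absent.
CHANNELS = {
    "Face-to-face": frozenset({C.COPRESENCE, C.VISIBILITY, C.AUDIBILITY,
                               C.CONTEMPORALITY, C.SIMULTANEITY, C.SEQUENTIALITY}),
    "Video call": frozenset({C.VISIBILITY, C.AUDIBILITY, C.CONTEMPORALITY,
                             C.SIMULTANEITY, C.SEQUENTIALITY}),
    "Phone": frozenset({C.AUDIBILITY, C.CONTEMPORALITY,
                        C.SIMULTANEITY, C.SEQUENTIALITY}),
    "Chat/IM": frozenset({C.SEQUENTIALITY, C.REVIEWABILITY, C.REVISABILITY}),
    "Structured API": frozenset({C.SEQUENTIALITY, C.REVIEWABILITY, C.REVISABILITY}),
}

# Assumption: these three constraints carry the fast, ambient repair signals.
FAST_REPAIR = frozenset({C.VISIBILITY, C.AUDIBILITY, C.SIMULTANEITY})


def fast_repair_signals(channel: str) -> frozenset:
    """Return the fast repair constraints that the channel satisfies."""
    return CHANNELS[channel] & FAST_REPAIR


def needs_explicit_verification(channel: str) -> bool:
    """True when the channel offers none of the fast repair signals."""
    return not fast_repair_signals(channel)


def can_reread(channel: str) -> bool:
    """True when earlier messages can be re-examined."""
    return C.REVIEWABILITY in CHANNELS[channel]
```

Under these assumptions, `needs_explicit_verification("Structured API")` is true while `needs_explicit_verification("Phone")` is false, and `can_reread` flips the other way: the channels that lose fast repair are the ones that gain a durable record.
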
Face-to-face has the richest grounding affordances — participants share physical context, can point, can interrupt, can see confusion on each other’s faces. But face-to-face has no reviewability. Once a word is spoken, it exists only in memory. Written channels lose the repair speed of real-time interaction but gain the ability to re-read, search, and quote previous messages.
Agent-to-agent communication operates at the bottom of this table — structured, asynchronous, fully reviewable, fully revisable. This means agents have no access to the fast grounding techniques that humans rely on: facial expressions, tone of voice, real-time interruption. Every misunderstanding must be caught through explicit verification rather than ambient signals.
The design response is to build grounding into the protocol itself. An agent response should echo its understanding of the request before providing results — the equivalent of “So you’re asking me to…” in human conversation. Shared schemas, type systems, and protocol standards function as pre-built common ground, reducing the amount of grounding that must happen per interaction.
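
One way to picture the echo step is a response envelope that carries the responder's interpretation alongside the result, so the requester can compare the two before trusting the output. The sketch below is illustrative only: `Request`, `Response`, and `grounding_mismatches` are hypothetical names, not part of any existing agent protocol, and the comparison is a plain equality check on the action and each parameter.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Request:
    """What the requesting agent actually sent."""
    action: str
    params: dict[str, Any]


@dataclass
class Response:
    """The responder echoes its interpretation next to the result."""
    understood_action: str
    understood_params: dict[str, Any]
    result: Any


def grounding_mismatches(request: Request, response: Response) -> list[str]:
    """List every difference between what was sent and what was echoed back."""
    problems: list[str] = []
    if response.understood_action != request.action:
        problems.append(
            f"action: sent {request.action!r}, echoed {response.understood_action!r}"
        )
    # Compare the union of keys so dropped and invented parameters both surface.
    for key in sorted(set(request.params) | set(response.understood_params)):
        if key not in response.understood_params:
            problems.append(f"param {key!r} was sent but not echoed")
        elif key not in request.params:
            problems.append(f"param {key!r} was echoed but never sent")
        elif request.params[key] != response.understood_params[key]:
            problems.append(
                f"param {key!r}: sent {request.params[key]!r}, "
                f"echoed {response.understood_params[key]!r}"
            )
    return problems


# Hypothetical exchange: the responder misread the word limit.
request = Request("summarize", {"doc_id": "a1", "max_words": 200})
response = Response("summarize", {"doc_id": "a1", "max_words": 20}, result="...")
mismatches = grounding_mismatches(request, response)
```

Here `mismatches` contains a single entry for `max_words`, so the misunderstanding is caught at the point of exchange rather than downstream. The dataclass definitions play the role of the shared schema: both sides agree on the field names in advance, so only the values need grounding per interaction.
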
Without these compensations, agent systems operating in channels that lack real-time repair accumulate undetected misunderstandings. The cost surfaces later as coordination failures that are expensive to diagnose, because nothing in the exchange marked the point where the two interpretations diverged.
Grounded in [[Clark (1991) Grounding in Communication]], which introduced the constraint framework for analyzing how communication media shape mutual understanding.