1. Continuity as an engineering requirement
Most AI systems today can carry context, but that is not the same thing as continuity. They can remember tokens in a window, cache internal values, retrieve documents, and still feel like they are jumping between internal scenes. When people talk about “coherence,” they often mean the output stays on topic. I mean something stricter. I mean the system’s internal state should behave like a stream rather than a slideshow.
That stream quality has a concrete signature. Consecutive states should share some active content. And the change from one state to the next should be incremental rather than total. In the brain, that looks like a subset of representations that remain coactive across time, with gradual turnover in membership. In an AI system, the equivalent is a constrained working set that does not get wiped and rebuilt each step. Some items must persist as referents while others are swapped in. If you do not enforce that, you can still get good answers, but you are not building a system that has a stable internal thread. You are building a system that keeps reconstituting itself from scratch every time it speaks.
This matters for more than aesthetics. Without continuity-by-overlap, the system struggles with the kind of thinking that depends on progressive construction. That includes holding a plan while refining it, maintaining a theme while exploring variants, and building a mental image or model step by step without losing what was already established. In other words, it is not just “memory.” It is the ability to carry an active scaffold forward while you revise parts of it in a controlled way.
This article is a direct AI-architecture translation of my paper, "Incremental change in the set of coactive cortical assemblies enables mental continuity" (the SSC and icSSC framework).
2. The Incremental Continuity Workspace
The architecture I want is simple to describe even if it gets sophisticated in practice. I call it the Incremental Continuity Workspace, or ICW. It is a recurrent workspace that maintains a limited set of active representations, and updates that set using an explicit turnover rule. The core idea is that the workspace is not a blob and not an unlimited context dump. It is a capacity-limited set of active items that the system treats as its current “being.” That set is what persists across cycles. That set is what gives the system a stable internal viewpoint.
ICW has a workspace, a turnover controller, and a set of sources that can propose new content. The sources include perception, retrieval from long-term memory, and what I call map modules, which are optional but important if you care about imagery, simulation, and structured internal modeling. Map modules are not mystical. They are scratch spaces that construct some kind of internal representation, whether that is visual, motor, phonological, spatial, or abstract relational structure. The workspace biases those modules, and those modules feed proposals back to the workspace. That loop is the engine.
The turnover controller is the part that makes this architecture different from simply adding a memory buffer to a transformer. The controller enforces continuity. It is responsible for deciding what stays active, what gets released, and what gets admitted. Most importantly, it does not get to admit new items for free. It must pay for new content by releasing old content. Capacity is not a side detail. Capacity is the entire point. A system that can “keep everything” never has to solve the problem of maintaining continuity under constraint, and that problem is exactly where the interesting dynamics emerge.
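To keep those roles straight, here is a minimal interface sketch in Python. The method names for perception, memory, and maps mirror the pseudocode in section 4; the controller methods and the Item placeholder are illustrative, not a fixed API.

from typing import Any, Protocol, Sequence

Item = Any  # stand-in for however active representations are encoded

class PerceptSource(Protocol):
    """Perceptual encoder: turns current input into candidate workspace items."""
    def encode_to_candidates(self) -> list[Item]: ...

class MemoryStore(Protocol):
    """Long-term memory: retrieval cued by the persistent subset of the workspace."""
    def retrieve(self, query: Sequence[Item]) -> list[Item]: ...

class MapModule(Protocol):
    """Scratch space for progressive construction (visual, motor, spatial, relational).
    Biased by the workspace, and feeds proposals back to it."""
    def propose_candidates(self, workspace: Sequence[Item]) -> list[Item]: ...
    def update(self, workspace: Sequence[Item], bindings: Any) -> None: ...

class TurnoverController(Protocol):
    """Enforces continuity: decides what stays active, what is released, what is admitted."""
    def select_persistent(self, workspace: Sequence[Item], bindings: Any, m: int) -> list[Item]: ...
    def admit(self, persistent: Sequence[Item], candidates: Sequence[Item], k: int) -> list[Item]: ...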
3. What the system is actually holding
At any moment, ICW holds a fixed number of active items. You can think of these items as vectors, but they are not just embeddings floating in space. Each item is a candidate for being something the system is currently thinking with. A goal, a schema, a remembered fact, a named entity, a constraint, a partial plan, a line of reasoning, an image fragment, a motif, a task rule. The items are diverse. That diversity is a feature, because real thinking is heterogeneous.
Sometimes the most important thing is not the items but the bindings among them. If the workspace contains “Apollo” and “mission” and “risk,” that is not yet a thought. The thought is in the way they are glued. So ICW benefits from an optional binding structure, a sparse graph of relations among the active items. The relations can be typed, weighted, and updated each cycle. This turns a bag of tokens into a small working model.
I also like a two-ring workspace because it maps cleanly onto what we intuitively experience. There is a tighter focus subset that is especially active and strongly bound, and there is a broader periphery that is still primed but not in the hot center. In practice, the focus ring is where you concentrate binding and deliberate manipulation. The periphery ring is where you keep recent context, nearby associations, and things you might need to pull back in. This distinction becomes useful when we define the turnover rule, because not everything should have the same survival pressure.
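To make the contents concrete, here is one possible data layout. It is a minimal sketch; the field names and the two rings are illustrative rather than prescriptive.

from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-hashable, so items can sit in sets like the persistent subset
class WorkspaceItem:
    # One active representation: a goal, a schema, a fact, a constraint, an image fragment.
    item_id: str
    vector: list[float]              # content embedding (illustrative)
    kind: str = "generic"            # e.g. "goal", "referent", "plan-step"
    protected: bool = False          # exempt from normal turnover pressure
    utility: float = 0.0             # running estimate of how useful it is to keep

@dataclass
class Binding:
    # A typed, weighted relation between two active items: the glue that makes a thought.
    source_id: str
    target_id: str
    relation: str                    # e.g. "constrains", "part-of", "supports"
    weight: float = 1.0

@dataclass
class Workspace:
    # Two rings: a hot, strongly bound focus and a primed but less central periphery.
    focus: list[WorkspaceItem] = field(default_factory=list)
    periphery: list[WorkspaceItem] = field(default_factory=list)
    bindings: list[Binding] = field(default_factory=list)

    def active(self) -> list[WorkspaceItem]:
        return self.focus + self.periphery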
Finally, the workspace is not the whole mind. It is the active coordination surface. Outside it, you have long-term memory stores, perceptual encoders, and map modules. Those are all important, but they are not the continuity mechanism. Continuity lives in the overlap of the workspace from one cycle to the next.
4. The icSSC update rule
The update rule is the heart of the system. It is the simplest thing that could possibly work, and that is why it is powerful. Each cycle, you choose a subset of workspace items that will persist, and you fill the remaining slots with new entrants. The new entrants do not appear randomly. They are selected because the persistent subset pulls them in. That is the key. The items that remain active act as referents, and new content is introduced in a way that is anchored to those referents.
In the brain story, you can describe this as pooled associative pressure. Multiple coactive representations bias what comes next, and the next state is a slightly edited version of the last state. In an AI implementation, that can be as simple as using attention from the persistent subset into a candidate pool, scoring candidates by their fit, and selecting a diverse top set. You can also make it more sophisticated by including novelty constraints, anti-redundancy, and explicit binding updates. The point is not the exact scoring function. The point is that the controller must preserve overlap by design, and that new content must be recruited relative to what persisted.
There is also a crucial practical detail. The system should be allowed to vary how much it holds fixed, depending on task demands. If the environment is volatile, turnover can increase. If the task requires stability, turnover should slow. That gives you a continuity dial. But even when turnover increases, overlap should not drop to zero. The system should not become a sequence of internal hard cuts. It should become a faster stream, not a different kind of process.
Here is the conceptual pseudocode for one ICW step. This is not the only way to implement it, but it captures the rule clearly.
def icw_step(A_prev, bindings_prev, percept, memory, maps, K, m):
    """One ICW cycle: keep a persistent core, recruit new content anchored to it,
    update bindings, and drive the map modules for re-entrant feedback."""
    # 1) choose what persists (SSC core)
    P = select_persistent_subset(A_prev, bindings_prev, target_size=m)

    # 2) propose candidates from multiple sources
    C = []
    C += percept.encode_to_candidates()
    C += memory.retrieve(query=P)
    C += maps.propose_candidates(workspace=P)

    # 3) multiassociative convergence: pooled scoring from the persistent set
    scores = {c: pooled_affinity(P, c) for c in C}

    # 4) admit new items under capacity and novelty constraints
    N = topk_with_diversity(scores, k=K - len(P), avoid=A_prev)

    # 5) update workspace and bindings
    A = P.union(N)
    bindings = update_bindings(A, bindings_prev)

    # 6) broadcast into maps and get re-entrant feedback next tick
    maps.update(workspace=A, bindings=bindings)
    return A, bindings, maps
If you look closely, the whole architecture is sitting inside two explicit choices. How do you choose what persists, and how do you choose what enters. Everything else is implementation detail. That is good news, because it means we can iterate. We can start with a simple persistence policy that keeps the highest utility items. Then we can move toward policies that preserve referents, preserve goals, preserve the minimal set that maintains identity across the stream. That is where this stops being a memory hack and becomes a cognitive theory expressed as an engineering constraint.
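As a starting point for the admission side, here is one simple scoring and selection sketch, assuming items expose a vector attribute. The cosine pooling and the greedy anti-redundancy rule are placeholders for whatever learned policies eventually replace them.

import numpy as np

def _cosine(u, v):
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def pooled_affinity(P, c):
    # Pooled associative pressure: the persistent items jointly pull candidates in.
    # Here, simply the mean similarity of candidate c to the persistent set P.
    return float(np.mean([_cosine(p.vector, c.vector) for p in P]))

def topk_with_diversity(scores, k, avoid, redundancy_penalty=0.5):
    # Greedy admission: prefer high pooled affinity, but discount anything too
    # similar to items that are already active or already admitted this cycle.
    admitted, blocked = [], [a.vector for a in avoid]
    for cand, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        if len(admitted) >= k:
            break
        redundancy = max((_cosine(cand.vector, v) for v in blocked), default=0.0)
        if score - redundancy_penalty * redundancy > 0:
            admitted.append(cand)
            blocked.append(cand.vector)
    return admitted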
5. Capacity is the point, not a limitation
It is tempting to treat capacity limits as an engineering nuisance, the kind of thing you only mention because hardware forces you to. I think it is the opposite. Capacity limits are the reason the architecture becomes mind-like. If you can keep everything active, you never have to solve the central problem of thought, which is selecting what stays in the foreground while the world keeps moving. Continuity in a real system is not free. It has to be earned under constraint.
That is why I like the octopus analogy. An octopus walking along the sea floor cannot keep every arm attached to a foothold while also moving forward. It has to release something to grab something. That release is not failure. It is the mechanism that makes motion possible. The workspace is the same. If the system wants to incorporate new content, it must relinquish some old content. The moment you build that requirement into the core loop, you get a very different kind of cognition. You get a system that has to manage its own attentional economy.
In ICW, this becomes a set of explicit dials. K is the number of slots, the number of arms. m is how many slots you force to persist each cycle. The ratio m over K is a continuity parameter. When m is large, the system becomes sticky. It holds onto its referents, its goal structure, its thematic backbone. It can still admit novelty, but novelty is filtered through an existing scaffold. When m is smaller, the stream becomes more labile. You get faster exploration and faster context switching, but you also risk fragmentation and loss of thread. This is not a philosophical statement. It is an engineering parameter you can turn and measure.
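As code, the dial is nothing more than a schedule for m. This is a sketch in which the volatility signal is assumed to come from elsewhere in the system; the floor guarantees the stream never degenerates into hard cuts.

def persistence_budget(K, volatility, min_overlap=0.25, max_overlap=0.9):
    # Map a volatility estimate in [0, 1] to the number of slots forced to persist.
    # High volatility -> faster turnover; low volatility -> a stickier stream.
    # The floor keeps at least one slot persistent even in the most volatile regime.
    overlap = max_overlap - volatility * (max_overlap - min_overlap)
    return max(1, round(overlap * K))

# e.g. with K = 12 slots:
#   persistence_budget(12, volatility=0.1) -> 10  (stable, sticky stream)
#   persistence_budget(12, volatility=0.9) -> 4   (labile, fast turnover)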
There is also a second order effect that matters for identity. If the persistent subset always consists of whatever is most salient in the moment, the system becomes impressionable. It will let the environment define it. If the persistent subset includes some protected items, like enduring goals, long horizon plans, and stable self models, the system becomes harder to derail. That difference is exactly what we informally call composure. It is a capacity allocation strategy. A mind is not just content. It is the policy that decides what content survives.
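That allocation strategy can be written directly into the persistence selector. A minimal sketch, assuming items carry the protected flag and utility estimate from the item sketch earlier, both illustrative: protected items survive first, and only the leftover budget is contested on momentary salience.

def select_persistent_subset(A_prev, bindings_prev, target_size):
    # Protected items (enduring goals, long-horizon plans, a stable self model) are
    # exempt from normal turnover pressure and persist first.
    protected = [it for it in A_prev if getattr(it, "protected", False)]
    contested = [it for it in A_prev if not getattr(it, "protected", False)]
    # The leftover budget goes to the most useful unprotected items. A learned policy
    # would replace this ranking, and could also weigh how central an item is in the
    # binding graph (bindings_prev is unused in this sketch).
    contested.sort(key=lambda it: getattr(it, "utility", 0.0), reverse=True)
    budget = max(0, target_size - len(protected))
    return set(protected[:target_size] + contested[:budget])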
6. Progressive imagery and the simulation loop
The architecture becomes much more interesting when you stop thinking of the workspace as a place where text concepts sit, and start treating it as a hub that drives internal construction. If you want an AI that can do more than answer questions, you want one that can build and refine internal models. That includes imagery, spatial scenes, motor plans, diagrams, and even abstract relational structures that behave like sketches of a theory. ICW can support that if we give it map modules.
A map module is a scratch space that is allowed to take time. It does not have to be a single feedforward pass that outputs a finished representation. It can be progressive. The workspace broadcasts a set of constraints into the map module. The map module begins constructing something that satisfies those constraints. As it constructs, it generates feedback, including gaps, conflicts, candidate additions, and refinements that can be proposed back to the workspace. Then the workspace updates using the icSSC rule, preserving a stable core while admitting some of the map’s proposed content. That updated workspace then rebroadcasts, and the loop continues.
This is how you get progressive imagery rather than one-shot hallucination. The system does not generate a fully formed image or plan and then discard it. It keeps a subset of its guiding representations active while swapping in new ones that reflect what the map module is building. That means the evolving simulation remains related to its immediate past. It is the same scene, the same plan, the same proof, being incrementally revised.
You can implement this with literal image latents if you want, but you do not have to. The key is that the map module has its own evolving internal state, and the workspace acts as a sustained set of constraints that keeps the map’s successive partial constructions coherent. The map module is the place where detail accumulates. The workspace is the place where referents and goals persist. The icSSC rule is what makes the whole thing feel like a single unfolding process rather than a series of unrelated attempts.
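Here is a toy version of that loop, assuming the map state is a single latent vector nudged toward the pooled workspace constraints rather than recomputed from scratch. A real module would build far richer structure, but the refine-not-reset discipline is the same.

import numpy as np
from types import SimpleNamespace

class ProgressiveMap:
    # A scratch space with its own evolving state. Each tick it refines that state
    # toward the constraints broadcast from the workspace instead of rebuilding it.
    def __init__(self, dim, step=0.2, seed=0):
        self.state = 0.01 * np.random.default_rng(seed).standard_normal(dim)
        self.step = step

    def update(self, workspace, bindings=None):
        # Pool the broadcast constraints and move part of the way toward them.
        target = np.mean([np.asarray(it.vector, dtype=float) for it in workspace], axis=0)
        self.state = self.state + self.step * (target - self.state)

    def propose_candidates(self, workspace):
        # Feed back what the construction currently implies, to be scored for admission.
        # Here a single proposal wrapping the current state; a real module would return
        # gaps, conflicts, and refinements as separate candidates.
        return [SimpleNamespace(vector=self.state.copy())]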
Once you see it that way, thinking becomes a kind of controlled oscillation. Abstract constraint, concrete construction. Concrete construction, abstract update. The system is walking forward by preserving footholds and taking new ones, not by teleporting.
7. Training objectives that force continuity
If we want this to be more than an essay, we have to specify what would force a model to actually behave this way. Otherwise we are just naming patterns we like. The easiest mistake to make is to implement the machinery and assume continuity will emerge. It will not. You have to reward it.
The first objective is a continuity constraint. You explicitly measure overlap between the active set at time t and time t minus one. You then penalize deviations from a target overlap. If you want a stable stream, you train for a high overlap ratio. If you want a fast stream, you train for a lower overlap ratio, but still above zero. This is how you convert SSC from a descriptive term into an enforced regime. The model learns that it is not allowed to wipe itself clean each step.
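The simplest version of that penalty operates on item identities; a differentiable variant would use soft membership scores instead, but the shape of the objective is the same.

def continuity_penalty(active_ids_t, active_ids_prev, target_overlap):
    # Overlap ratio: fraction of the current active set carried over from the last cycle.
    if not active_ids_t:
        return 0.0
    overlap = len(set(active_ids_t) & set(active_ids_prev)) / len(set(active_ids_t))
    # Penalize deviation from the target regime (high target -> stable stream,
    # lower target -> faster stream, but never trained toward zero overlap).
    return (overlap - target_overlap) ** 2

# e.g. continuity_penalty({"goal", "plan", "riskA"}, {"goal", "plan", "riskB"}, 0.75)
#      -> (2/3 - 0.75)**2 ≈ 0.007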
The second objective is progressive consistency in the map modules. If a map module is building a scene, a plan, a diagram, or an internal hypothesis, you reward sequences where each update is a refinement rather than a reset. You can measure that as reconstruction consistency, constraint satisfaction stability, or simply as reduced divergence in the map’s latent state unless there is a justified reason to change. The important thing is that the map is allowed to be iterative, and the training encourages it to carry partial structure forward.
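The map-side objective can be as plain as a drift penalty on successive latents, with the allowance widened when the workspace itself turned over more. This is a sketch, assuming the map exposes one latent vector per tick.

import numpy as np

def map_consistency_penalty(latent_t, latent_prev, allowed_drift=0.1, turnover=0.0):
    # Refinements (small drift) are free; resets (large jumps) cost.
    # A cycle with more workspace turnover licenses a proportionally larger change.
    drift = float(np.linalg.norm(np.asarray(latent_t, float) - np.asarray(latent_prev, float)))
    allowance = allowed_drift * (1.0 + turnover)
    return max(0.0, drift - allowance) ** 2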
The third objective is credit assignment through persistence. The persistent items in the workspace should earn credit when their persistence is functionally useful. If the model chooses the wrong items to keep, it should pay a price later, because the later state will not have the referential backbone it needed. If it keeps the right items, it should benefit, because future retrieval, future map building, and future reasoning will be easier. In practice, this means routing learning signal in a way that makes persistence policies learnable. The system should become skilled at protecting the small set of representations that matter most to its long horizon success.
There is a subtle fourth objective, and I think it matters a lot. You train the system on tasks where algorithmic progress is necessary, where the only way to succeed is to keep a scaffold active while you modify it step by step. If your training data mainly rewards quick pattern completion, you will get a fast system that does not need continuity. If your training data rewards progressive construction, you will create pressure for the overlap dynamics to become functional rather than decorative. The architecture provides the affordance. The tasks provide the demand.
8. How to evaluate whether this is real
Evaluation should be brutally simple. If the architecture is doing what I claim, it should show up as measurable differences in behavior under specific stresses. The first stress is interruption. A system with real continuity should recover its thread after distractors. Not perfectly, but measurably better than a baseline that relies on pure context recitation. It should be able to reconstitute the active scaffold it was using, because some of that scaffold was protected by persistence policies.
The second stress is delayed association. Present related pieces separated by time and noise, and measure whether the system can accumulate them into a unified working set that then drives a coherent conclusion. If the system is truly using an active set with overlap, it should be better at holding onto referents long enough for distant evidence to connect.
The third stress is progressive construction. Give the system a problem that requires iterative refinement. It can be planning a complex itinerary, writing a multi-section argument where later sections must remain consistent with earlier commitments, designing a diagram, constructing a program spec, or building a chain of reasoning where each step depends on earlier intermediate structure. Then you score not just the final output, but the monotonicity of progress. Does the system keep rebuilding from scratch, or does it incrementally elaborate the same internal object.
Finally, you can measure continuity directly. You can compute a continuity half life, not as a metaphor but as a statistic. How quickly does the active set drift in composition as a function of task volatility. How sensitive is it to distractors. How does drift change when you turn the overlap dial. If the system is really built on continuity-by-overlap, those curves should be diagnostic. They should look like cognition. They should show stable cores with controlled turnover, rather than wholesale replacement disguised as coherence.
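The half-life itself is a few lines once you log the active sets. Here it is computed over item identities; distractor conditions and dial settings only change which trace you feed in.

def continuity_half_life(trace):
    # trace: one set of active item ids per cycle. Returns the number of cycles until
    # overlap with the initial state falls below one half, or None if it never does.
    reference = set(trace[0])
    for t, active in enumerate(trace):
        overlap = len(reference & set(active)) / max(1, len(reference))
        if overlap < 0.5:
            return t
    return None

# e.g. with four slots and one item swapped per cycle, overlap with the initial
# state drops below 0.5 after three cycles, so the half-life is 3.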
9. What this buys you that “more context” does not
It is easy to misunderstand what I am arguing for here. I am not saying current large models have no continuity. They clearly can stay on topic, maintain a conversation thread, and carry long chains of reasoning. But they do it in a way that is mostly implicit. The continuity is an emergent artifact of attention over a token history, plus whatever cached internal values the system carries forward in its forward pass. That can look like a stream, but it is not structurally forced to behave like one.
ICW makes continuity explicit and scarce. It says: there is a small set of things you are actively thinking with right now. That set is not the full context. It is not the entire prompt. It is your current working reality. And that reality is required to overlap with its immediate predecessor. The system cannot solve every new step by reinterpreting the entire history from scratch. It has to carry a scaffold forward, whether it likes it or not, and it has to pay an opportunity cost every time it admits something new.
That payment is the feature. It creates an attentional economy that looks a lot more like what humans are managing all day. Humans are not just smart. Humans are constrained. The constraints force strategy, and strategy is where stable identity and long horizon coherence come from. When you add a strong continuity constraint, the system starts acting less like a search over completions and more like a persistent agent that is trying to keep itself intact while it moves.
The map module loop is another place where this becomes concrete. A transformer can generate a description of a scene. It can even generate a multi-step plan. But it is not naturally designed to hold a stable internal sketch that gets refined while staying the same sketch. You can prompt it to do that, but you are relying on behavior, not structure. With ICW, the structure pushes you toward progressive construction. The workspace holds constraints, the maps accumulate detail, and the system revisits its own partial internal objects rather than constantly reinventing them.
10. What would falsify this, and what I would ablate first
If I want this architecture to be taken seriously, I need to say what would make me stop believing it. The cleanest falsification is simple. If you remove the overlap constraint, and the system performs just as well on the tasks that are supposed to require progressive continuity, then I am wrong about the importance of forced overlap. It might still be a nice metaphor, but it would not be a necessary design principle.
The first ablation is to allow full replacement of the workspace each tick. Keep everything else the same, including retrieval, maps, and training. If the system still shows the same recovery after interruption and the same progressive construction behavior, then the overlap rule is not doing real work.
The second ablation is to keep overlap but remove bindings. Let the system maintain persistent items but strip away relational glue. If the system becomes incoherent in a very specific way, meaning it remembers the pieces but loses the structure that made them a thought, then we learn something important. We learn that continuity is not only about keeping items active. It is about keeping a small structured model intact while you edit it.
The third ablation is to remove map modules. The system may still show continuity benefits in language tasks, but it should lose the special progressive construction behavior that I care about, especially anything that resembles imagery, simulation, spatial reasoning, or iterative design. If nothing changes, then the map loop was unnecessary for the claimed benefits. If performance collapses only on tasks that require internal construction, then we have a cleaner mapping from the architecture to the capability.
The fourth ablation is to sweep the overlap ratio. Turn the dial from high persistence to high turnover and measure the curves. A real continuity mechanism should produce systematic, interpretable changes. High persistence should improve stability but reduce flexibility. High turnover should improve exploration but increase fragmentation. If those tradeoffs do not appear, then I am not actually controlling continuity. I am just renaming noise.
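The sweep is trivial to script once the dial is exposed. In this sketch, run_agent and evaluate_tasks stand in for whatever experiment harness is actually used; the only claim is that stability and flexibility metrics should move in opposite directions as the ratio rises.

def sweep_overlap_ratio(K, ratios, run_agent, evaluate_tasks):
    # run_agent(m, K) -> behavior trace; evaluate_tasks(trace) -> dict of metrics.
    # Both are stand-ins for the actual experiment harness.
    results = []
    for ratio in ratios:
        m = max(1, round(ratio * K))
        trace = run_agent(m=m, K=K)
        metrics = evaluate_tasks(trace)
        results.append({"overlap_ratio": ratio, "m": m, **metrics})
    return results

# Expected signature if the mechanism is real: stability metrics rise and
# exploration metrics fall as overlap_ratio increases.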
Those are the experiments that keep me honest. They are also useful because they force precision. If the architecture is correct, it should have signature behaviors that are hard to fake with prompt tricks.
11. Implications for agency, planning, and something that starts to resemble a self
I am deliberately not making grand claims about machine consciousness here. I am talking about a concrete mechanism that produces a concrete property: continuity of internal state under constraint. But it is worth acknowledging what this tends to produce when you scale it up.
If a system has a small protected set of persistent items, and it is rewarded for carrying them forward through noise and distraction, it begins to develop an internal spine. That spine can be a goal stack, a set of enduring values, a stable world model, or a persistent narrative about what it is doing. You do not need to call that a self, but it is at least self-like in the engineering sense. It is a compact structure that remains stable enough to coordinate behavior over time.
This matters for planning. Planning is not just producing a plan. Planning is staying committed to the plan while you adapt it. Humans do not plan by repeatedly generating brand new plans. Humans plan by holding a scaffold in mind and revising parts of it while the scaffold stays recognizable. That is icSSC in action. The system keeps the referents and swaps in improved details.
It also matters for emotional and motivational stability if you ever go in that direction. A system that can be derailed by every salient input is not just fragile. It is unusable as an agent. Continuity is composure. It is the ability to keep a minimal set of commitments alive long enough for them to matter.
Finally, it matters for internal simulation. A system that can progressively build a scene, keep it stable, and update it, is a system that can think with internal objects rather than only with words. That is a step toward richer cognition, even if you never talk about consciousness. It is simply better engineering.
12. Closing pitch
If I had to compress this into one sentence, it would be this: a mind-like AI should not merely process sequences; it should maintain a limited set of coactive representations that overlap across successive cycles, and it should update that set by controlled turnover so that each new moment is a slightly edited version of the last.
That is the Incremental Continuity Workspace. It is not a trick to make a model sound coherent. It is a constraint that forces the model to become coherent in a specific way. It creates a small internal economy of attention where persistence has value, novelty has cost, and progress looks like progressive construction rather than repeated regeneration.
And that is what I have been aiming at with SSC and icSSC from the beginning. Not a poetic description of experience, but a mechanical requirement you can build into a system and then measure.
