Coding a Transhuman AI

Self-swallowing compilers.
Non-symbolic architecture.
Symbolic architecture: A research problem. (Internal TOC.)
Synchronization.
Symbol dereferencing.
Language of implementation.
World-model exchange formats.
Symbolic activation trails.
Similarity association and causal simulation.
Random search, blind search, heuristic search. (Translating speed into intelligence, using higher-and-higher-level thought.)
Constraint assembly. (Quantum application.)
Abstract thought. (Variables, classes, vagueness.)
Pattern-catchers: Neural nets and evolutionary programming.
Logical and mathematical reasoning.
Codic, algorithmic, and architectural reasoning.
Domino domdules.
Interim Goal Systems.

Interface

Observation and communication.
Learning from the Net.
Distributed systems.
Tracebacks and self-documentation.
Programmer intervention.

Precautions

Constraints: Speed, RAM, disk space. (And making use of a surplus.)
Incompatible domdules and ungrounded symbols.
The Prime Directive: Do not create or allow arbitrary goals (Asimov Laws).
Prime Directive: Avoid absolute truths and non-heuristic reasoning.
Prime Directive: Error correction. (Goal checksums.)
Prime Directive: Things an AI must know.

Preface.

It is assumed that you, the reader, have written advanced computer programs, preferably in an object-oriented language; that you have read the source code of at least one AI; that you understand the fundamental principles of the field to the extent of having read Douglas Hofstadter's Gödel, Escher, Bach; that you understand the basic procedural flaws in the field enumerated in Drew McDermott's Artificial Intelligence and Natural Stupidity; and that you are familiar with Douglas Lenat's Eurisko (source code or even accounts are hard to find; but it constitutes, IMHO, the current state of the art).

This is not a report on the current state of AI. When explaining the basic principles, I am explaining them my way. Usually there is no widely accepted explanation; AI is a field in progress. I would walk a thin line between claiming credit for ideas implicit or obvious in the field, and passing off entirely new innovations as common knowledge; I doubt my readers will agree which is which, or whether my generalizations (of neural nets and evolutionary programming, for example) are basic theory or radical innovations or total hokum. I don't know either, so I haven't even tried to mark which is which. This isn't a thesis proposal; this is a cookbook.

The terms "Applied Theology", "Power", and "Fast Burn Transcendence" are derived from Vernor Vinge's A Fire Upon The Deep. So, to some extent, is the concept of a seed AI. If you've read the book, you should understand without further elaboration exactly what we're trying to do here and why. If you haven't read the book, you really should.

• Paradigms

1. Pragmatism and allowable methods.

The field of prehuman AI has been defined as "making machines do things that would require intelligence if done by men". (Minsky, although he was speaking of AI in general.) Human-equivalent AI is the subject of a great deal of acrimony and highly emotional opinions, including burning philosophical questions as to the fundamental nature of thought and consciousness, as well as not-so-burning philosophical questions as to whether an exact simulation of thought constitutes thought itself. The purpose of such AI is ultimately to explore ourselves, to shed light on questions in cognitive science, to hold a mirror up to the brain.

Not so the field of transhuman AI. There is no question of who rules the AI to be intelligent - it will. Success or failure is not contingent upon a Turing test or a philosophical dilemma. The quality of work will not be debated in the lecture halls. The winning hacker will receive no Nobel prize. Success will be suddenly and immediately obvious. The ultimate goal is a Singularity; nothing else matters.

There is no trick, no shortcut, no method of cheating, which can "invalidate" a transhuman intelligence. Programmer intervention, using human neurons for processors, stealing code from the human genome or the human brain - anything goes in Applied Theology. The results are not "valid" or "invalid" proofs of some philosophical point; they are successful or unsuccessful.

Modeling human thought is not the point. Laudatory though such efforts may be, they belong to cognitive science, or at most cognitive engineering, not Applied Theology. Moral and philosophical questions may be left to the AI itself, which is presumably better capable of answering them. Applied Theology is the realm of the hacker and the engineer - empowered by science and philosophy, but not focused on it. What counts is the code.

2. Seed AIs.

It is probably impossible for a human to sit down and write a program which is in immediate possession of human capabilities in every field. Transhuman capabilities are even worse - there's no working model. Fortunately, that's not necessary. The key is not to build a program at some astronomical level of intelligence. The key is to build a program of low intelligence, capable of enhancing itself. The object is not to build a mighty oak tree, but a humble seed.

As the AI rewrites itself, it moves along a trajectory of intelligence. The task is not to build an AI at some specific point on the trajectory, but to ensure that the trajectory is open-ended, reaching human equivalence and transcending it. Smarter and smarter AIs become more and more capable of rewriting their own code and making themselves even smarter. When writing a seed AI, one is not merely concerned with what the seed AI can do now, but what it will be able to do later. And one is concerned not just with writing good code, but code that the seed AI can understand - for it should eventually be capable of rewriting its own assembly language.

Latent abilities may be added, wholly useless until the AI reaches some specified stage. Random domains may be programmed on the off-chance that the AI will be able to use them. Abilities may be lightly coded or sketched, with the AI itself expected to fill in the rest. Deliberately simplistic code may be used, in the presence of more powerful alternatives, so that the AI can understand it. Multiple versions of modules may be written in case the architecture changes.

In short, a seed AI is not a static program. It is as far from a modern, flexible, data-driven, self-organizing program as that program is from a mechanical number-cruncher. Code must be written in literally four dimensions, as the program changes with time. The seed code of the initial state, the initial direction, a carefully sketched path of latent abilities, modules that adapt to changes in architecture... It's not a static program, it's an adventure!

3. Grounding (vs. symbols and stochastics).

Most AI programs manipulate computational tokens called "symbols", which are supposed to correspond to human symbols such as hamburger and food. Classical AI treats these "symbols" as the basic units of the program (and of the human mind); classical AI programs use ever-more-sophisticated methods of manipulating the symbols. Connectionist theory attempts to train intelligent thought into pattern-catchers such as neural networks, rather than programming it directly; connectionism advances when better pattern-catchers or better training methods are invented. Hofstadterian theory states that human symbols are active abstractions, statistically emergent from descending levels of decreasing semanticity and increasing randomness.

As the classical AIers found out, you can't declare your tokens to be high-level symbols and build on top of them. Some AIers think there aren't any symbols. Others want to pluck the symbols out of chaos and, again, declare those high-level symbols to be the bottom understandable layer. Nobody wants to grit their teeth and start working downwards. Whether the lower layers are declared nonexistent or chaotic, nobody wants to admit that you have to sit down and program it.

After decades of effort, and due mainly to the intervention of the late David Marr, we now have a relatively good computer model of human vision, and specifically of the transition from 2-D pixels to 3-D model. Some of the algorithms thus invented were later found to correspond to neural computation, so we know we're on the right track. The intervening layers are not chaotic, they are not stochastic, and they are not high-level semantic; they are what Marr called a "2 1/2-D" world, a series of ordered extractions of increasingly high-level features, using a lot of computer code.

What makes meaning meaningful... is code. Not randomness, not emergence, not training, and not symbols linked to other symbols. In order to solve a problem, there must be code capable of solving it. For the code to be flexible, powerful, efficient, and understandable, it has to be broken into a lot of modular pieces that solve facets of the problem, and modules or architectures that make all the modules work together. One architecture for integrating these modules is called "symbols"; see Symbols and memory and Symbolic activation trails. Symbolic memory itself is just another module, and without other modules to coordinate, it's meaningless.

High-level cognition in this seed AI is not just grounded in low-level modules, it cannot exist independently of them. No high-level concept can form, can be stored, without low-level grounding. If high-level systems can exist independently, their ultimate potential is limited as much as if they were independent. Architectures can be ungrounded too.

This is not the only basic, fundamental insight behind this seed AI. It is not even the most important. It is simply the first insight a programmer needs before ve can begin designing an AI. In evolution, there are a few big-win mutations, more medium mutations, lots of small mutations. (See Orr.) There is no one, great fundamental insight behind human intelligence. Nor is intelligence composed of many small adaptations. Intelligence is composed of at least a dozen major fundamental insights (and for all I know there are hundreds) working together, intersecting synergetically - and all the minor tweaks.

For this reason, there is no single unifying principle behind the AI presented here. There are around a dozen - maybe when I'm finished, I'll count them. And while I've tried to convey everything you need to know to work with the major principles, there simply isn't any time to go into the kind of enthusiastic detail that usually marks the debut of a Great Insight wannabe. This page is about designing an entire intelligence, and it's long enough as it is. If I went into that kind of detail, by the time I was done I'd be a hundred and thirty-seven years old.

• Principles

1. Self-enhancement and domino enhancement.

As stated:

A seed AI follows a self-enhancing trajectory to superintelligence.
A seed AI is composed of modules.

It follows that each module has its own trajectory. (Not exactly independent, but each trajectory should be a separate component when visualizing the AI.) From this perspective, rather than saying "The AI enhances itself", we say that the combined intelligence of all the modules enhances some individual module. (Eventually, the AI may redesign multiple modules simultaneously, but this is not likely to happen during the initial stages.)

The modules sum together to form abilities; the abilities sum together to form intelligence. This principle is not useful for hierarchical design; it is just a way of viewing the AI. You can't design abilities independently of each other; some modules, perhaps even the majority, will be useful for everything - mathematical ability and causal analysis ability and combinatorial design ability. Some modules will be so basic as to constitute a part of the architecture itself.

As defined:

A module is a piece of code that solves a problem in some domain, or improves the abilities of other modules, or coordinates other modules - the basic package of code.
An ability is the AI's level of performance at some loosely defined problem, or the performance in a domain or set of problems. For "the AI", read "all relevant modules".
Intelligence is how impressed a human is at the AI's abilities. To the extent that this is a useful measure, it measures how well the AI's abilities mesh together and receive synergetic boosts, how large an area the AI's abilities cover, how well it handles new problems and adapts to new domains, and the degree of self-awareness.

The trajectory of the AI can be viewed as the trajectory of all individual modules. Each module is an implicit problem, or an implicit domain: How can this module be enhanced? The trajectory of any individual module is determined by the AI's ability to enhance that module, which ability is the sum of all the relevant modules. (You don't have to spend a lot of time thinking about this, mind you; I'm not setting up some godawful differential equation. But you should be aware of it.)

The fundamental object of a seed AI is to enhance the ability to enhance abilities. The ability must become more powerful, not just at enhancing other abilities, but at enhancing itself. The seed AI may have the ability to enhance modules, but the enhancements must ultimately sum to qualitatively greater ability to enhance - and that qualitatively greater ability must give birth to qualitatively more creative enhancements. The ultimate object, remember, is for runaway positive feedback to take over and give birth to something transhuman. The most difficult part of creating a seed AI is giving it enough originality that it never "runs out of steam", as happened to Lenat's Automated Mathematician and even to his seed-AI Eurisko (albeit after a much longer trajectory). To some extent, this entire document is about rewriting Eurisko so it doesn't run out of steam. (I dearly wish I could find Eurisko's source code...)

It might be possible, even relatively easy, to write a seed AI capable of optimizing its own modules for greater speed. However, such an AI will (almost certainly) never reach human equivalence. Because no new abilities are added, the AI can never be anything but a very fast optimizer. It will not invent any new ways of optimizing things, but will simply execute the original optimizing algorithms at very high speed. While this technically qualifies as a form of self-enhancement, it is a very weak and non-creative form, moving along a limited trajectory to a limited end.

On the other hand, an AI with a highly developed optimizing ability could be given a programmatic ability which is too slow to be of any use. The optimizing ability speeds up the latent programmatic ability, which in turn may implement a latent creative ability. (See Domino domules.) In short, a creative spark must be present within the AI, and this document is as much about coding the spark as fanning the flames.

Improving speed is not enough. Inventing heuristics is not enough. Reprogramming existing abilities is not enough. Somewhere along the line, the AI must become capable of inventing new architectures and new abilities. The ability to invent, to create, must be implicit somewhere in the AI, however lightly implemented and however deeply hidden.

2. Domain modules: Domdules.

What follows is a basic architecture for seed AIs, being a set of assumptions rather than a piece of code, which is therefore not likely to be changed until very late along the trajectory. In other words, rather than being explicitly embodied in code, these assumptions are implicit in the way the code fits together. These assumptions are not provably incompatible with the architecture of the human mind, although some computational tokens are suspiciously high-level. The architecture draws on a known feature of the human mind, specialized neurological modules devoted to tasks such as speech and visualization.

Humans have a visual cortex, to handle spatial problems in two and three dimensions. Humans have a sensory cortex, which interprets incoming tactile data, and a motor cortex, which issues instructions (to the cerebellum, which translates it into messages to specific muscles). While I previously considered using the word "cortex" to indicate a module handling a particular domain, I decided that the prior meaning would be so confusing that an entirely new word was needed.

A domdule is a module devoted to processing a particular domain.

A visual domdule would be a module devoted to processing spatial objects, and possibly (does an AI need it?) color perception. Our own visual cortex is a visual domdule, while the decidedly non-cortical hippocampus is (among other things) a memory-formation domdule. A chess domdule would be devoted to manipulating pawns, bishops, and so on.

Domdules should have complex data structures for representing the domain, and many different heuristics/daemons/codelets for noticing and deducing facts about the domain. (See RNUI.) Domdules are distinguished from non-data-driven procedures, or modules which do nothing but coordinate other modules, or pieces of code that perform utilities such as memory allocation or user explanation. This also brings up one of the most fundamental, useful, and important principles of AI design: Every possible module should be a domdule.

Let us say that one has an AI, and one wishes to incorporate logical reasoning, or symbolic memory. The way this is customarily mishandled is to write code on top of whatever domdules are present (if any domdules are present at all; we'll suppose that we're dealing with a visual reasoning system that should incorporate logic or symbolic memory, so that at least the visual domdule is present.) The correct approach is to write a separate logical domdule to handle the totally different domain of logical reasoning.

In a sense, this is a higher octave of the difference between procedural and object-oriented programming. Don't build a hierarchy of functions on top of functions; build a collection of objects that can interact. Don't build modules that operate on modules that build on domdules; every module is a domdule, everything the AI does is a problem within a domain, for which heuristics/daemons/codelets can be assembled and with which other modules can help. The only exceptions are utility code and interface code (which are not supposed to be doing anything intelligent), and basic library code (which would bring the machine to its knees, or cause a recursive stack overflow, if implemented on a high level).

Let's say we have a program that parses 2D pictures to 3D objects. It does this through a full visual domdule: It starts by adding features such as edge detection, works up through surface boundaries and 3D corners, uses deduced corners to look for edges, and so on. It has a whole collection of heuristics/daemons/codelets, some of which operate across levels, that slowly build up a picture. We want to add the ability to reason using propositional logic, so that we can just add the constraint: "All white objects are not apples."

Wrong approach: "Logic? Why, that's just icing on the cake! We'll write a collection of specialized functions that check object identification (i.e., as an apple) against known properties."
Correct approach: "Logic? Why, that's an entirely different domain! We'll need a domdule fully as complex as the visual domdule. Whenever the visual domdule is considering some proposition about the scene, that proposition has to be available to the logic domdule and that domdule has to be able to contradict it."

Those who grok object-oriented programming will understand why the second method is superior. Because the logic domdule is reusable, it only has to be coded once. Once it's coded, it can apply everywhere - to the chess domdule, perhaps. And since we're dealing with a seed AI, we should also remember that enhancements only have to be made once, and apply everywhere.

Because the logic domdule is a separate module, because the programmer is thinking of it as a separate module, the programmer will have a far easier time of conceiving and coding improvements to that module. Using the second approach, it's easier to say that the logic domdule should operate on probabilistic statements rather than absolute ones, that the domdule should be able to distinguish between tautological support and experiential support, and so on. Object-oriented programming and domdule-oriented AI are aids to the programmer's intelligence, enabling him to consider independent problems independently. Of course, that kind of programming requires more work and far more intelligence, just to make the domdules work together.

Most domdules will have a certain basic architecture in common. (In theory, however, a domdule can be as different as the visual cortex and the cerebellum.) Usually the domdule will have a HEARSAY II-like architecture, with data representing the current information, and functions that manipulate the data. The functions can be called actions, heuristics, daemons, or codelets; they can act from inside a search, or initiate a search, or transform the data; they can execute within microseconds or minutes; they can be called by each other or when the data satisfies a set of prerequisites. Different domdules will probably have different function architectures, just as they will use different data formats.

3. World-model: Integration of domdules.

How do multiple domdules work on what is conceptually the same problem? If there's a logic domdule and a visual domdule and a goal domdule, how is the edibility (object-to-property) of an apple (visual identification) used to assuage hunger (eat-something goal)? What makes the representations bind together, and who handles change propagation? And above all, how does it happen without awkward, specialized, N-squared code?

The world-model is: The sum of all data structures; the formats used to make elements of the data structures public; the code/domdule that translates between formats (if necessary) and propagates changes to commonly held information; the code/domdule that binds multiple representations of the same cognitive object.

When multiple domdules are reasoning about the same problem, it's the world-model that makes it "the same problem", so that contradictory and supporting theses have a chance to be noticed, so that domdules can interact with each other. From a design perspective, the world-model architecture is probably the single most crucial aspect of a seed AI. Without it, modules can't coalesce to form abilities; the AI is only as smart as its smartest module.

More than that, the world-model data exchange formats, and the design principles involved, will be one of the hardest things for the AI to understand. If the AI can reprogram the world-model for new architectures, it can probably go all the way! Or in other words, I expect that most seed AIs will bottleneck at this point for some time.

The world-model does for an AI what consciousness does for humans. It isn't necessarily as complicated, and the subjectivity and qualia puzzles can be skipped. But in a human mind, consciousness is what binds all the senses together, and gives us a sense of unitary self instead of a dozen separate forms of reasoning. The most sophisticated world-model I've been able to come up with would be symbolic in form, leaving a verbal trail behind - which is an explanation for why consciousness is so often associated with verbal reasoning. You are not the person who speaks your thoughts, you are the person who hears your thoughts - but a first-stage seed AI need only speak.

When I speak of the world-model, using it as a noun, I am referring to the AI's image of the way things are (speaking unitarily), or to the sum of the data the domdules know. When I say that such-and-such modifies the world-view, I mean that it is making a statement that should propagate, in ways to be discussed, to the other domdules. When I say that a domdule adds a proposition to the world-view, I mean that the other domdules should be aware of it.

The human capability we are trying to duplicate is the ability to make statements that incorporate multiple senses and multiple domdules. "There is a cat", we say; and our model of that cat includes our visual identification, the meow we heard, our knowledge of feline anatomy; cat as an individual, cat as a class. If we think about it, all manner of logical reasoning can bind to same cat, such as Occam's Razor and the likelihood that the cat is actually a robot. All of these thoughts, spread across two or three domdules, and a dozen different domdules, binding together to the concept of a single cat.

In chess, a "fork" is when one piece threatens two others, so that if one escapes the other dies. How does the four-dimensional visualization of the chess board, the knight moving and capturing, bind to the logical and causal visualization of either-or? Does the visual cortex, the visual domdule, have inline capabilities to track causality? And if so, how are they integrated?

The human brain consists of a finite number of domdules, a finite number of brain areas, which conceivably have special data-exchange formats for each other domdule they need to talk to. It could be a fact. I'd be astonished, actually, if this wasn't true of such tightly bound (and adjacent!) modules as the "sensorimotor" cortices. To avoid this monstrous headache, we might have to outdo the human brain in programmatic elegance - and there might not be a better way. Still, such a course is an absolute last resort; it will make a major trajectory bottleneck - simultaneously reprogramming multiple architectures - much harder to negotiate.

To sum up, the world-model is the largest subproblem of integration, and integration is what makes domdules sum to abilities. Integration is what makes a multimodule seed AI better than a single program trying to rewrite itself; integration, and the world-model, is what causes domdules to sum to abilities.

4. RNUI: Representing, Noticing, Understanding, and Inventing.

Part of the credit for this distinction goes to Drew McDermott, who, in his classic article "Artificial Intelligence and Natural Stupidity", points out that a program must be able to notice something before it can understand it. If a program is manipulating LISP symbols named "hamburger", it can't possibly understand why a person would eat one. The LISP symbol for "hamburger" could be replaced with G0025 and no human could ever reconstruct it. Why? Because the program can't notice a hamburger; it can't see it, or taste it, or represent it in any way more complex than a LISP atom. As I would say, it lacks the domdules to make the symbol meaningful.

The distinction also operates within a domdule. RNUI lists four stages of increasing intelligence within a domain. McDermott's objection, as translated, is to programs which exhibit none of the four stages. RNUI is a schedule for making your domdule represent, notice, understand, and invent hamburgers.

Let us say that a human creates X, and tries to explain it to a seed AI.

X is represented when: There are static data structures to describe X.
X is noticed when: There are domain-specific functions that test the internal integrity of X, and draw simple conclusions about X to be added to the data; obvious facts about X can be perceived; the model of X extends either forwards or backwards in time; and X can be manipulated at random or according to external directions.
X is understood when: The internal and external causal patterns of X are noticed; the uses of X can be noticed; the model of X fits into a greater context; the goal-oriented aspects of X are noticed; the individual pros and cons of any design decisions are noticed; and all of the above can be abstracted from multiple instances and applied to new cases.
X is invented when the AI, without human intervention, can independently design or discover X.

The RNUI hierarchy is very useful as a design principle in anything that needs abilities at some specific level; it ensures that the programmer doesn't get ahead of verself, as often happens, and try to write code that invents what it can't even notice, or understand what it can't represent. RNUI is helpful because the transitions between levels are clearly defined - or they are to me, anyway - and it's much easier to move between them than it is to design a level ab initio.

To start at represent, you think about your subject for a while and write down everything you can perceive about it; then you figure out what data structures are necessary to bear the load.
To move up to notice, you start thinking about implications of small pieces of data, the rules that govern the domain, small ways to manipulate a model - in short, you make the data active; you provide it with functions representing activity in the domain and activity in your immediately obvious perceptions. (If you run into enough trouble doing this, your "domdule" is probably two or three submodules. Reduce and start over.)
Moving up to understand is the hardest thing for a human and the easiest thing for an AI, because it requires the most design sweat and the least self-awareness. The creative part is integrating the target domdule with the goal domdule - and if you've done everything else right, that may be automatic. After that, it's just grinding out and optimizing heuristics, abstracting useful characteristics, building up experience - what a human does when he learns a skill. Some of the code generated may be, and remain, a high-level skill, product of many domdules. Other code may be compiled to machine language and optimized, an unconscious process. Which is which is context-dependent.
From understand to invent is the simplest thing of all - given lots and lots and lots of computing power, or a quantum processor. You just generate random configurations, use the understanding to evaluate their usefulness, and continue until a good one is reached. (By the same token, designing a seed AI is a simple matter of generating all possible programs until one wakes up.) Seriously, the AI can start with a random search and then start optimizing it, looking for design principles. Knowing what you're looking for is ninety percent of the battle.

To summarize:

Representing is encoding an instance in data structures.
Noticing is finding simple properties of an instance.
Understanding is discovering causal and goal-oriented patterns in an instance.
Inventing is generating instances to meet some goal.

RNUI is also what lets a hacker design a seed AI trajectory. The seed AI will need invent-level capabilities in at least some areas, in order to redesign itself. Once it has those capabilities, it can start reprogramming RNU modules for greater power. The lesser modules, even those that contribute nothing whatsoever to the total ability, are still "hints" as to how a domain should be coded. Hopefully, it is easier to build from human hints than build from scratch.

The initial invent capabilities would consist of everything a human knew how to program at that level. Then the invent capabilities go to work on the domdules coded at the understand level, upgrading them to invent; or perhaps it will upgrade notice to understand, which might usually be the easiest transition, or it may even start trying to produce simple manipulation functions for a very lightly coded represent domdule. For more on this subject, see Domino domdules.

Perhaps the initial stages will be unable to code up to invent when a human can't - or perhaps it will, as Eurisko did - but since coding a notice module requires less work, the human can code dozens of domdules in case the seed AI decides it needs them, or conserve ver programming time for dealing with bottlenecks.

It is finally worth pointing out that the RNUI measure also applies to abilities, instead of just single domdules. In some cases, any understand domdule that isn't too idiosyncratic will automatically be an invent ability, thanks to broadly applicable heuristics that are the high-level products of other domdules. Eurisko, for example, had the heuristic "look for extreme cases", which later mutated into "look for near-to-extreme cases". (Hopefully the seed AI's "heuristics" will be symbolized from domdules, instead of being a LISP fragment.) Humans, after all, don't come with a domdule that handles chess strategies; our automatic perceptions start at the represent level and work up to primitive notice, or at most understand for chess masters - but invention is always a conscious process. Deep Blue had a notice domdule and an enormously powerful search that bypassed understand, going directly to invent.

Coding an invent domdule imbues the AI with a powerful but instinctive intuition; the primary excuse for doing it is when the ability is otherwise inadequate. Conscious behavior is always more versatile than intuition, and a lot easier to enhance. Still, intuition is sometimes more powerful. When possible, code both - but give the high-level opinions precedence.

RNUI within a domdule applies to processes that take place without conscious intervention, although their results may be available to consciousness (the world-model). RNUI within an ability results from the high-level interaction of many domdules through the world-model "consciousness".

5. Symbols and memory.

Symbols are how the world-model is implemented; symbols are implemented via memory; memory implements abstraction and chunking and skill formation. Symbols are the self-organizing binding factors which flexibly associate domains; the symbol for "cat" refers to our visual experience, our auditory experience, and the knowledge we have attached to "cat". Symbols are pervasive, powerful, and close to consciousness; it can be hard to remember that symbols are only as powerful as what they symbolize. We identify our memories with ourselves; memories are the foundations of personality, and contribute as much to intelligence as raw ability. But memories are only as powerful as what is remembered.

I mention all this because classical AI has a regrettable tendency to implement these functions in a simplistic way, with nothing to symbolize or remember, trusting to sympathetic magic to make things turn out right. No suggestion on how to implement symbols can be effective as this one warning: They are not LISP atoms. They are not C++ objects, or subtopics, or frames, or anything that can be implemented in a single data structure. A symbol is not a token, any more than Kasparov's image of a chess position is a piece.

My considered opinion as to How Symbols And Memory Really Work seems to change every few years, so what follows is just my current bell-like tone when hit with a mallet. (A few years back, in fact, I would have considered symbols and memory to be two different topics.) Of all my abilities, the ability to form new symbols is weakest; what I know about the subject is deduced more from observation of others than self-awareness. Walk warily.

There are three fundamental puzzles in dealing with symbols: What symbols are, how symbols are used, and how symbols are formed. Likewise, the three aspects of dealing with memory are how memories are formed, how they are stored, and how they are retrieved. (It is worth mentioning that, insofar as it can be determined from case histories of neurological damage and neuroimaging, there are two different modules controlling memory formation and retrieval, and nobody has ever been able to figure out where memories are stored. The hippocampus lights up during the formation of memories, and damage to the hippocampus can prevent any new memories from being formed - without preventing them from being retrieved; time stops for the patient. The best current guess for what retrieves memories is the cerebellum, believe it or not, because that's what lights up during long-term-memory retrieval. The three aspects may be handled by wholly different methods in wholly different domdules; functions are not necessarily mirror images, or even related.)

A memory is an (ordered?) collection of frames. A frame is a set of cues which is reconstructed into data internals by a domdule.
A frame is formed by a domdule using domain-specific functions which take a specified part of the internal data, with specified salience for each piece of data, and reduce that data to a set of cues in domain-specific format.
A frame is retrieved by domain-specific functions that reconstruct the data from cues. The reconstruction is rough, rather than perfect, and tends to allocate accuracy and precision to the salient portions of the data. My guess, and it is just a guess, is that the cues are constraints and the reconstruction takes place via constraint propagation. I am not sure that constraint reduction and constraint reassembly are really domain-specific.
I don't know how the frames are ordered, or whether each memory is in fact a single frame, or whether a temporally ordered memory constitutes a single data set. I do not believe my ignorance to be shared by the annals of cognitive science.
Symbols are solidified, associative, abstracted memories. Any amount of experience from any number of domdules can coalesce into a single symbol.
An aspect of a symbol is formed by finding a related set of features in multiple memories from some single domdule, and abstracting the common quality of the related features. Some aspects may have the same "common quality", but each set of experiences from each domdule is a separate aspect.
A symbol is applied by applying its aspects. An aspect is applied by taking relevant data in that aspect's domdule and modifying it so that it shares the common quality of that aspect.

Two notes: First, when I say that symbols are solidified (or foobarred) from "memories", what I actually mean is that they are foobarred from the reconstructed domain data, and in some cases foobarred from the original data in realtime. They may also be foobarred from the memories in abstract form.

Second, one might think that the problem of memory reconstruction can be dodged entirely by a seed AI, which presumably has enough disk space for eidetic memory, and high-quality compression algorithms to boot. First, of course, it is by no means certain that the AI has an unlimited amount of disk space; one hundred times memory, which is about standard, isn't even close to enough if the AI takes up all the memory and has 50% turnover in the data internals every second. Second, reducing images to mnemonic constraints is itself a cognitive problem, no more and no less than reducing sentences to sentence structures, and adds just as much information.

Reducing domain-specific data to procedural form may make it easier to abstract common qualities; half the work of abstraction is already done. Receiving instructions for reconstructing an object and specifying that the whole object is colored red, for example, makes it easier to abstract the quality of "redness" than if that property must be flushed out from a domain-specific set of pixel values. It doesn't necessarily make it much easier to reconstruct "redness", but it makes it easier to satisfy "redness" and vastly easier to apply the symbol for "redness".

If an image satisfies a symbol, it exhibits the property that was abstracted from the data. If an image satisfies a concept, such as "red cat", it means that the image is of a cat to which redness has been applied - that the concept could have been applied or reconstructed in such a way as to yield the image. I don't think that memories can be satisfied, although a very perfect correspondence sometimes yields deja vu.
To apply a symbol to an image modifies the image so that the symbol is satisfied. In the simple case of "redness", a simplified symbol architecture might insert "red" instructions into a previously reconstructed "cat" image. In truth, I'm not entirely sure how symbols are applied; the difference seems analogous to that between understanding and inventing, and I don't quite know how the invent level works yet. Check Constraint assembly and associated topics. Concepts and phrases can be applied by applying the symbols within them in the designated structure, gently guided by whatever experience is within the concept.
To reconstruct a symbol means to apply it to a blank state. Think of "redness" with nothing else accompanying it and you'll probably visualize a blank red field. Depending on how abstract the symbol or concept is, the image may be what minimally satisfies the constraints (in which case the abstraction was applied), or it may have extraneous features drawn from memory (in which case the componees were reloaded). Perceptual memories are always reloaded, as far as I know - although memories can be abstracted and compressed into concepts.
To abstract a symbol means to codify a set of experiences, and perhaps some accompanying defintion-concepts, into a form that can be satisfied and applied directly to images. Since I don't know how the process of application works, I don't know how the process of abstraction works. For very simple symbols, this might consist of a few domdule instructions to be inserted into the public representation of an image. For simple symbols, this might consist of a set of instructions and a set of constraints and some random or experience-guided twiddling until the constraints are satisfied. Again, I really don't know, but I hope and intuit that there's a simple basic principle. Concepts usually consist of symbol structures guided by experience; they haven't yet gone through the abstraction process, don't have tags, and must be applied or reconstructed sequentially and consciously.

The human brain doesn't have all that many domdules, and its architecture is constant, which makes it likely that every domdule has its own interface to the symbolic systems (domdules?) and memory. Such specialized code may ultimately be necessary to a seed AI as well; see Adaptive code and self-organization for the reason why that should be avoided, if at all possible. The best solution I can offer, see World-model exchange formats, is for each domdule to have an interface, a set of procedural codes for describing and reconstructing data, with the interface having a declarative description, and conforming as best as possible to a universal architecture. At least one symbolic domdule, and one mnemonic domdule, should be devoted to manipulating these codes. Even this compromise makes it a lot harder to change the architecture of a domdule, but I have a few ideas for generating the symbolic description code/exchange format automatically.

Symbol dynamics should be the subject of at least one domdule. And I do not mean symbol manipulation, which is really symbol-tag-manipulation and should never happen except as a high-level process; I mean forming the memories, abstracting the qualities, replaying back into short-term memory, and the other mechanics and dynamics of dealing with symbols.

Symbol tags and association:

Symbols have an identifying "tag", a pointer that makes it possible for a mind to contemplate a structure of multiple symbols without loading them all into memory simultaneously. But the tag is not the symbol! The tag may well have evolved for the sole purpose of communication, considerably after the formation of symbols, so that mental structures could be translated into sequences of auditory words or visual symbols or other sensory stimuli. Tag-based verbal thought was faster, easier to remember, and even more powerful thanks to Symbolic activation trails, and it made communication possible, but it was in no way whatsoever the source of rational thought. The ability to symbolize binds the senses together and makes abstract thought possible; the tags serve only to increase speed, allow communication, make more complex structures possible, and increase association speed. Do not confuse the icing with the cake. And like icing on fat-free cake, it sometimes isn't good for you; it's all too easy to get lost in a maze of symbols that have no relation to reality, as 90% of the philosophy of AI demonstrates.

Symbols associate with each other directly; tags do not associate with tags. If two symbols share many of the same memories, they will be associated through those memories. If you consciously compare two symbols, then the memory of that comparison will be stored under each symbol. If you keep hearing two salient symbols in close proximity, then the auditory memories will be abstracted as an expectational reflex. Note that in the last case, the symbol contains a memory of two tags. This is about as close as symbols ever come to tags referring to tags. Tags and association between symbols receive far more attention than they deserve; my advice is to ignore the issue entirely until you absolutely must deal with it. (Of course, I may be prejudiced against that form of thought, being unable to wield it myself.)

Symbols built on top of subsymbols:

It may surprise my readers, after so much railing against ungrounded abstraction, that I even acknowledge the existence of symbols built on top of subsymbols. And believe me, if I had a choice, I wouldn't. Certainly I don't speak of these symbols as "metasymbols" or "supersymbols", because symbols can't be formed exclusively from subsymbols; a better way of phrasing it would be "symbols abstracted from experience, some of which includes experience of subsymbols". There are ways to avoid even that; sometimes the symbol can be re-abstracted directly from the memories solidified in the subsymbols.

But for sufficiently high-level symbols, that's impossible. Not just incredibly unwieldy, as in a C++ program that uses copy-and-paste instead of #include; not just syntactically tortuous, as in the case of self-referential and loop-referential concepts. Some symbols are abstracted from the experience of manipulating other symbols - which is how reflexive and self-aware statements get started. And some symbols are abstracted from the structures of other symbols. No way around it.

Symbol structures exist because symbols can modify other symbols; "redness" can be applied to "cat", where "cat" begins by conjuring the image of a cat, and then this image is manipulated like any other image, turning red. (How does the symbol for cat conjure an image and not a meow? Perhaps "red" and "cat" are preprocessed simultaneously, noting that what happens next will take place in the visual domdule, and then "cat" begins reconstructing into the visual domdule. Or more likely, "cat" reconstructs all the sensory emanations of a generic cat, but only the visual aspect remains salient when the sentence finishes processing. Compare "red cat meowing softly".)

Because our symbol structures are primarily linguistic, there is a temptation to think of symbolic structures as being formed of verb phrases and noun phrases, where adjectives have slots for things to modify. Reading poetry, or for that matter any human-generated text, rapidly dispels this illusion. Nouns can be used as adjectives or verbs, remaking the subject over in their image; verbs become nouns that refer to the abstract action. This is a major headache when dealing with the syntactic symbol tags, but perfectly clear when dealing in symbols: If the abstracted qualities of a symbol can be applied to pre-existing data, forcing it to exhibit the abstracted quality, then the symbol can modify another symbol.

Symbol dereferencing:

A tag isn't necessarily dereferenced all at once, and it can be dereferenced selectively. "College" may refer to an administrative unit of a university, or it may refer to your own experiences in college, depending on context. The phrase "drinking in college" is more likely to recall your beer-stained dorm than your math homework. Symbols are obviously context-dependent - not just in terms of multiple symbols with the same verbal tag, but in terms of reconstructing only a subset, an aspect, of the experience base.

A symbol isn't loaded into memory all at once. Some aspects may be more salient than other aspects, both in permanent storage and in context. One even suspects that not all aspects are applied, just the relevant ones - that some type of search, complex or simple-associative, winnows out the irrelevant parts. The aspects do not have preconditions that let them be tested without being applied, since the mind is not coded in Prolog. Or perhaps the aspects are gradually applied in parallel, with the best fit being immediately obvious, or the first fit being taken. I would recommend not filtering aspects at all. If symbol reconstruction/application starts to bog down, use the parallel-best-fit or parallel-first-fit methods.

Again, symbols are invoked gradually, not in some gigantic lump. The relevant aspects are invoked by the context, and when the symbol has reconstructed itself in short-term memory well enough to satisfy the context, it usually stops reconstructing - unless the symbol is unusually salient for some reason.

Related entities: Concepts and skills.

A concept is mostly a symbol that doesn't have a tag yet and can't appear in other symbols. Take a nebulous concept such as "self-enhancing computer moving on a trajectory to the transhuman"; you can modify this concept internally, but you can't say much about it except as vague memories, and you can't use the concept in other concepts, except as an inline component that gave birth to part of the abstracted model. Call it a "seed AI" or "Elisson", and you can write papers about it, form complex sentences using it, and in general manipulate it with much more facility; now it has a tag. There may even be some actual alteration of the concept in symbolification; perhaps the experience is reduced to a more compact or better-abstracted form. Astonishing as it seems, just naming a concept is sometimes enough to solve a problem.

Some of the issues involved in forming a skill are the same as forming a symbol; there's the problem of formation, storage, and reconstruction. The cerebellum handles both dexterity and memory retrieval, so it's possible that skills and memories are the same thing; the subjective experience is different, so it isn't certain. In general, skills are composed of instructions and procedures and goals, while symbols are composed of properties and data. Both are abstracted from experience and applied to new circumstances, but one is procedural and one is declarative. A skill isn't a monolithic entity in the same way as a symbol; it doesn't have a tag. To some extent, a skill may break down into if-X-do-Y sets. A procedure may exist only because each Y produces another X, and if it doesn't, a different another-Y is invoked. I'm not insisting that everything is context-independent, but it could be, with apparent context-dependency a function of expectations and salience. I.e., if an expected appearance of X invokes a different response than an unexpected appearance of X, it's because the stimulus is context-dependent, not because the response is part of a master plan. There are probably other differences and even deeper differences, but I can't think of them offhand.

When I speak of "skill", I am referring to the skill of sewing buttons, not the skill of being a seamstress. Skill in the sense of "competence" is composed of all sorts of things, procedural skills being only one.

Conceptual halos and levels of activation:

Symbols do not have conceptual halos or levels of activation. They have association through their reconstructed components, and association through experience. They leave traces of their referents behind in the domdule, speeding and influencing the reconstruction of related symbols and concepts.

I suppose the effect is roughly the same, but there's a major difference in the implementation implied.

6. Adaptive code and self-organization.

Modules should work together automatically; pieces of code should adapt to each other, instead of requiring reams of interface code. Writing a specific set of interaction rules for every pair of domdules would add enormously to the difficulty of adding a new domdule, and would increase considerably the complexity required to achieve a given level of intelligence. A sufficiently deep architecture would allow domdules to be added and integrated almost as fast as they could be written. It might also simplify tremendously the task of optimizing old domdules and inventing new ones, at least from the seed AI's perspective, because there wouldn't be O(N²) special cases to consider. Adaptive code is how a seed AI holds together during architectural changes.

There are four keys to adaptive code: Explicit formalization, self-description, reduction, and self-organization.

Explicit formalization means finding the principles that you use to adapt a piece of generic code, and formalizing them as an interpreted procedure.

Self-description means describing the properties of code and data, so that the adaptive procedure has an explicitly declared context/subject to adjust to.

Reduction means breaking monolithic objects and procedures into components, so that complex characteristics can be described in terms of components with simpler descriptions belonging to simple classes.

Self-organization means using component-driven adaptive control to sum to object coordination, instead of channeling through a central, non-adaptive authority.

These principles are about converting nonadaptive code to adaptive code. To write code that is adaptive to begin with, the guiding principle is this: When you want something to happen, don't code it as a special case; write general rules that cause the special case to happen naturally.

(I have a sneaking suspicion that the brain uses only self-organization, if it uses any of these principles at all. I daresay that the brain has reams of domain-specific interface code, generated automatically by the neural-level programmer and then sealed in stone. Evolutionarily designed code exhibits no great tendency to use unifying architectures; that requires top-down intelligent design. But don't imitate the brain blindly. After all, the human brain doesn't have to worry about redesigning itself. And we don't know how the neural-level programmer works, so we couldn't imitate the brain even if we wanted to.)

Example: Adaptive object persistence.

Consider the example of an adaptive object-persistence system in C++. Ordinarily, object persistence is implemented by an I/O pair of virtual functions. Each virtual function is generally a series of "read data member" or "write data member" commands.

The problems with this system are as follows:

The programmer has to write a new pair of functions for each object.

The amount of time needed to write a new object increases.
This is boring and repetitive. The programmer may burn out.
Each time the object changes, the functions have to be changed, including complex and specialized code to read obsolete data.

The object-persistence "architecture" is distributed all over the code.

Making any change to the architecture means that every single I/O function must be rewritten, checked for alterations, or at the very least recompiled.
A seed AI has to understand each piece independently.

There aren't any rules for what can be done in the function.

Special cases accumulate, uglifying the code and making it harder to change the underlying architecture.
The same function may be implemented in two different ways, or in the same way in two different places.
A seed AI has to deduce the implicit rules by example.

To redesign the system, we do the following:

Reduction: Each object has a "type", which contains a list of data members.
Self-description: Each data member belongs to a specific class with I/O information.
Explicit formalization: "The function I/Os a list of members" (informal rule) becomes "for each data member, I/O that member" (for loop).
Self-organization: If each data member has a unique 'name', then the structure of an object can be written when saved. If that object changes between versions, then the new structure can extract as much information as possible from the old structure; members with identical names pair up. (Call it self-assembly, which is a variety of self-organization that turns up fairly often.)

I have in fact written such a system, and it works beautifully. I have written similar systems, such as object editor UI and expression parsing; considerably more complex systems built on those, such as high-level object specifications and multilevel attachments; and totally different systems using the same basic principles of adaptive code, such as self-structuring causal propagation. The total system is wonderful; one can simply write tools without worrying about anything but how the tools work. All the tools work together automatically. Source files can be added and removed without changing any other source files. Objects can be changed without changing anything else or worrying about version control. The underlying persistence mechanisms can be changed without recompiling object files. I humbly admit that URAD (Unbelievably Rapid Application Development) is the best damn architecture ever coded.

Returning to seed AIs:

Returning to the subject of seed AIs, I must admit that object persistence is not likely to be a big part of the problem. Still, it's a problem that most programmers run into sooner or later, and persistence was only my second adaptive architecture, so I understand it well enough to explain it to others.

Adaptive code in a seed AI is needed to reduce the amount of redundant code. There is the possibility that redundant code will confuse the AI, either by presenting it with different versions of what should be the same rule, or overloading its ability to simultaneously redesign multiple modules. Adaptive code makes it easier to change part of a program without changing all the other parts, which may make it easier to redesign single parts. Adaptive code can even reduce the number of bugs, although hopefully the seed AI will not require fallibility to operate. Adaptive code is part of what makes a dynamically changing architecture possible.

An example of adaptive code in a seed AI: As discussed in Symbols and memory, my guess is that every human domdule has specialized instructions for storing and reconstructing internal data, and that systems (such as memory) that use these interface instructions have specialized interface code for each subject domdule, written by the neural-level programmer. Since a seed AI probably won't be as good as the neural-level programmer, at least in terms of writing interface code, it needs another solution. The human might be able to write the interface code, but then the AI would be as frozen in architecture as the human brain itself. So instead, the domdule internal data and the interface instructions have declarative tags - perhaps written by the computer programmer, perhaps written by a self-analysis module, but in any case declarative tags - that allow an interface interpreter to adapt task specifications to the specific instructions needed. See World-model exchange formats for more.

The basic principles involved in adaptive code:

Explicit formalization:

Avoid repetitive programming. If at all possible, never write the same piece of code in two different places. Never adapt and recompile a piece of code for a new context; the code should adapt itself.
When a programmer keeps rewriting a piece of code, ve generally has a piece of "generic code" in ver mind, which ve adapts to new contexts using some specific procedure. Write the procedure down and code it.
(Flexible rule:) Don't write a procedure that produces new pieces of compiled code; write an interpreted procedure that carries out your steps each time. Caching is usually okay, but has its risks.
In seed AIs, it is sometimes easier for the AI to understand (or notice) rules explicitly written in code, than to deduce a rule followed by the programmer.

Self-description:

When a programmer keeps rewriting a piece of code, ve generally adapts it to specific features of the new context or the new subject. A description of those features has to be written somewhere.
Descriptions also have to be bound to the thing described, and accessable to the adaptive code.
Sometimes the descriptions are written by the programmer, and sometimes the descriptions are written by the code, and sometimes both.
Descriptions can be adaptive objects in themselves.

Reduction:

When a programmer keeps rewriting a piece of code, the adaptation procedure usually consists of a series of simple rules applied to the components of the object.
Even if the object can't be expressed as the linear sum of simple components, it can usually be reduced to a complex model with interacting simpler components.
You can't really think about a monolithic chunk of data; it has to break down into simpler pieces. Basic theory of object-oriented programming.
Both objects and descriptions can break into components.
The components don't have to break down further, if they're simple enough to either belong to a limited and non-growing number of special cases, or they are integers (for example) that can be fed into functions.
Adaptive code usually breaks down into non-adaptive components, and an object has been broken down far enough when the pieces can be handled by non-adaptive procedures that don't need to be rewritten.

Self-organization:

When you want two objects to interact with each other in a particular way, don't just rush out and code that as a special case; write general rules that cause the special case to happen naturally.
In order to optimize a particular interaction, look for ways to cache the generalized case - preferably in ways sensitive to change propagation - instead of writing specialized code. (In seed AIs, you may also wish to allow the AI to optimize/cache itself.)
When object interactions are too complicated, sometimes the components can interact with each other instead.
Components can look for similar components, complementary components, or just components that fit some predetermined list.
Sometimes there will exist a data/object/procedure which summarizes/is influenced by/adapts to a large network of objects. I call the data being assembled a situation. A situation is an adaptive, self-organizing piece of data. To assemble a situation, pass control through a series of objects and components, each of which affects the current situation.

This is a very powerful principle, and the first piece of adaptive code I ever wrote!
The situation can be as simple as a list of objects, or it can be an object with components in its own right.
Objects can have control passed to them multiple times, and what the current object does to the situation can be dependent on the context in which it was called.
Each object can make up its own mind about who else gets to affect the situation... which makes attachments a lot easier to implement. (In fact, I invented the principle while trying to code an RPG where spells could be cast on anything...)
Finding the current situation, if you don't want to pass a handle to it through every single function, can be done by adding a set of global variables to each thread, and then using inline-linked-list stacks to have multiple stacked situations. I call these "telepathic function arguments". This enables functions called in some totally innocuous context to do something to the situation, and it eliminates a lot of function arguments."

Avoid Registration.cp files with ten thousand #includes. Register through static initialization codes. Registration is always something that an object should do for itself; it keeps code modular.
Never write "interface code" between every pair of classes/domdules, unless the number of items is fixed and very small. In fact, don't do it even then.

And in conclusion, remember the guiding principle of adaptive code: "Don't code special cases; code general rules that cause the special cases to happen naturally."

Shallow AIs:

It may turn out that this entire discussion is fundamentally unnecessary for seed AIs, that unlike lesser programs they can churn out all the specialized interface code that could ever be needed. It may turn out that unlike greater programs such as ourselves, seed AIs can't understand the deep design principles necessary to get along without ten thousand lines of specialized code. Yes, it seems almost probable that I'm raving nuts; seed AIs can't get bored and they can churn through repetitive tasks far faster than we can, so why am I trying to program them with deep architectures one programmer in a thousand could invent? Perhaps the humans will use adaptive architectures so that they can add a hundred domdules, and the seed AI will initially use specialized code for its own domdules. To each ver own.

But in the end, I acknowledge the possibility that the successful seed AI will use a shallow architecture so that the principles involved can be understood by the AI, so that it doesn't take talent and thought to redesign a domdule. Whatever initial boost is produced in intelligence by using a deep architecture, however fast the initial optimization phase moves with a simple interface, the AI may still wind up bottlenecking when it tries to create a new domdule, much less redesign the architecture. All that reflexive code may simply be too hard for a prehuman AI to understand. There is, perhaps, a balance between the initial level of intelligence and self-understanding, a tradeoff between elegance to humans and comprehensibility to prehumans. The intelligence gained by a self-organizing architecture may not be enough to understand self-organizing code. On the other hand, non-self-organizing code may be so intertwined that the AI can't even optimize itself. Ideally, one wishes a self-organizing architecture, documented to the point that the AI can understand it, which is capable of incorporating non-self-organizing components, and which has simpler non-self-organizing versions of components so that the AI can try optimizing either. Or perhaps discussion of this issue is premature, and the correct answer will be obvious once a prototype is created.

Those who have fully grasped this discussion may say, in defense of deep architectures, that such architectures only make explicit what would otherwise be an idea in the mind of the programmer, and how can that make code harder for a seed AI to understand? Well, there's some truth to that. But my own experience with deep architectures leads me to think that this is only sometimes true. Sometimes the code has to become more elegant to be adaptive, has to be deeper, better designed. Sometimes just making the genericism explicit isn't enough; sometimes you need reductionism and self-description and self-organization - not just in pursuit of formalism, but as an actual change in architecture. I have seen this happen. My own code is a bloody hell of a lot more complex, thanks to adaptive code. It would not be humanly possible to duplicate the functionality without adaptive code; there would be an exponential number of special cases. But an AI would almost certainly find a less powerful program easier to understand.

I learned how to write adaptive code to prevent repetition, and to be able to change one piece of code without changing others. But a seed AI may have no problem whatsoever with repetition. A seed AI may be able to hold unlimited amounts of code in its mind, understanding the whole precisely as easily as it can understand all the pieces - but adaptive architectures may prevent it from fully understanding the pieces. On the other hand, a seed AI may be able to think in code, understanding deep architectures far more easily than humans lacking a codic domdule, but lacking the computational resources to apply that understanding to gigantic intertwined pieces. Or the decision may be made for us, if adaptive code is required to write a seed AI at all. The seed AI may write complex adaptive code, or it may adapt and re-adapt simple code. I say again: We will learn as we go along. In the meantime, an AI programmer should at least understand the principles.

7. Reflexive reasoning: Self-awareness and will.

Koan: When does an AI understand the concept of "three"?

When it knows that three is a prime number?
When it can count three pieces on a chessboard?
Or when it thinks of apples, biology, and abstraction - and knows it is thinking of three things?

These three abilities illustrate increasing levels of integration. In the first level, the mathematical domdule is integrated with nothing but itself. In the second level, the mathematical domdule and the chess domdule are integrated well enough that the mathematical domdule can count chess pieces. Presumably this happens automatically, rather than through specialized code, so the AI can also count sheep and apples.

In the third level of integration, the mathematical domdule is part of a fully reflexive world-model. The mathematical domdule can count anything, not just countable_things. And "anything", within the world-model, includes representations of the current thoughts and actions. In other words, any (sufficiently high-level) thought leaves traces behind, which are noticeable within the world-model. Each thought - of apples, of biology, or abstraction - gives off some kind of "here I am!" signal that the mathematical domdule, that all domdules, can receive. If the mathematical domdule is well designed, it will be able to count these signals; it will be able to count any kind of stimulus, just as we can - if nothing else, by counting the reflexive traces of the stimulus. And thus the AI knows that it is thinking three thoughts, or that it is composed of sixty-four domdules.

Traces:

A reflexive world-model is one in which high-level events leave noticeable traces in the world-model. I keep saying "high-level", because humans can't notice themselves parsing sentence structures. Also, the act of leaving a trace can't be high-level enough that leaving traces leaves traces; there would be an infinite loop. While the AI should be able to think about thinking about the traces, only an actual thought about the traces should leave another trace. Note that most information in all domdules should be perceived - by the world-model, by the AI, by the other domdules. The traces are the way in which high-level thoughts are perceived.

I'm not quite sure whether an explicit trace-handling domdule would be a good idea. It seems too much like putting the self in a neat little box. But I can't think of any obvious high-level way to handle traces, or any hard reason why traces shouldn't be handled on a low level, so you might as well make it a domdule. (But if they appear spontaneously, perhaps as some variation of symbolic activation trails, great.) The mind is not a straightforward hiearchy; at some point, high-level thoughts become low-level data that can be used in other high-level thoughts. Here, that necessity is implemented by symbol tags and by traces, which in a programmatic sense may amount to the same thing. Note that the "traces" are in fact an adaptive situation, as discussed earlier - except that the data being assembled is a picture of the high-level landscape of the entire mind.

It may also be desirable to imbue AIs with an ability that we ourselves lack: To have any domdule leave traces, at any desired detail of internal code, to any desired depth of traces leaving traces - as if humans, by an act of will, could notice themselves parsing a sentence. There may be some complex problems associated with timing, interrupts, and infinite recursion; one solution I would recommend would be to run the domdule first, record the traces, and then dump the traces into memory - you can't watch yourself parse the sentence, but you can watch the instant replay.

There is another name for a "reflexive world-model": Consciousness. (I am referring to the functional aspect of consciousness, not qualia or the hard problem of conscious experience.) Consciousness acts as a "carrier" that binds all stimuli and perceptions together, even the perception of the perception. Mimicking this functional aspect of consciousness is not supremely difficult in and of itself - the trick is doing it without exponential amounts of computing power. Even avoiding a recursive stack overflow takes some thinking. The most obvious way to bind all the senses into some kind of unified whole is to affect each sense with the data from all the other senses - although in truth, I'm not quite sure what this would accomplish.

Self-awareness:

There is another functional aspect of consciousness and reflexivity, known as self-awareness, or self-modeling, or having a self-symbol. Consider Douglas Lenat's Cyc. Cyc knows that there is a computer called Cyc, but it doesn't know that it is Cyc. (I believe this statement is original with Lenat.) What is required for a model of a mind to become a model of myself? When is the being in the mirror revealed as me?

Well, for one thing, whenever I wave my hand, so does the guy in the mirror. So from this perspective, creating a self-model is a matter of synchronizing the model with the traces. Traces become properties of the self-model. Choices become manipulatory handles of the model (see below). The idea is for this particular model, in the self-domdule, to reflect all the perceptions and choices that are embodied in the AI's reflexive awareness. De facto, this model is the self. If the model has a symbol, that's the self-symbol. The self-domdule becomes the AI's interface to itself.

But the truth is, I'm being a bit misleading in this section. I'm talking about starting with a model of someone else, and gradually transforming it into a self-model, using methods that imply specialized code. But models and symbols, remember, are collected and abstracted memories. The self-symbol consists of the solidified and abstracted memories of the traces; the self-model's perceptual inputs are the perceptions of traces. Self-models don't start as plain old models that are fed specialized queen jelly code; self-models coalesce that way from the very beginning.

The model isn't quite intuitively satisfying. It may still seem as if a fundamental connection is missing; the AI realizes that the being in the mirror is doing just what it does, that moving its hand causes the mirror-being's hand to move... but does the AI have a first-person perspective? For all the integration and synchronization, is the self-model I? It could be that this intuitive gap is the result of a human cognitive quirk, or of the hard problem of qualia, or perhaps even of some actual functional aspect of consciousness that I have failed to grasp. If the latter, it will hopefully turn up in experiment.

Will:

A human can choose to think the word "potato", just as ve can choose to move ver hand. Thinking presents choices, just like muscle control, or a chess domdule. One may broadly think of a distinction between "internal" and "external" choices, with external choices actually moving a muscle or rewriting a line of code, and internal choices manipulating a model of a chess position, or choosing to think about potatoes. In both cases, deciding the choice will probably have an effect on some model. Deciding an external choice may alter the perceptions of the external world. Choosing to alter the mental model of a chess position alters the world-model perception of that chess position. Choosing to think a thought results in reflexive traces of that thought showing up in the world-model. If nothing else, deciding the choice will leave reflexive traces. I don't want to jump the gun on Causality and goals, but the cause-and-effect sequence, and the usage of goals to decide choices, should be obvious. More on that later.

The presence of choices is dependent on context and the degree of conscious attention being paid. External choices almost always require a conscious decision. For AIs, external choices will always require a conscious decision, by way of protecting the human race. But if internal choices were always present, one would have the same old problem of infinite recursion: Choosing to think about choosing whether to choose ad stack overflow. When a choice is expected to exist, it will exist; when a low-level-noticeably related goal exists, the choice will exist; when a goal which was in the past achieved using a choice is reactivated, so are the choices. If enough attention is focused on reflexive processes, choices start popping up. Internal choices should have "default" decisions, so that an insufficiently salient choice doesn't require the AI to consider all the pros and cons.

Likewise, choices are high-level, conscious, reflexive events; they leave traces. Choices are manipulatory controls of the self-model in the same way (?) that manipulatory controls are part of any other model; just as mentally moving a piece results in an altered perception of a chess position, choosing to think a thought results in a perception of that thought's reflexive traces being present in the self-model. Learning about the cause-and-effect sequence of operating a manipulatory control, and particularly of making choices, is what binds a will together. Choosing to think a thought results in a thought; learning this is what binds the reflexive will together - what creates a self-awareness that is captain of itself, that chooses and speaks as well as knowing and hearing.

How does one choose to think a thought? I remarked earlier that symbol tags and traces may amount to the same thing, programmatically. In other words, symbol tags are implemented thus: By symbol activation leaving traces, and by manipulatory controls that activate symbols. This single high-to-low binding may suffice to allow all recursive high-level thoughts. It also explains why symbols are so deeply associated with consciousness; they are not the source of all perception, but they would be the source of all self-perception.

Human will has a peculiar blind spot in both decision and choice-perception. One cannot really perceive the impulse that leads to a choice being perceived, and one cannot perceive the impulse that actually decides a choice. Knowing all the pros and cons, making a full evaluation of the choice, does not automatically trigger a decision. The peculiar spark that makes a choice is elusive. Frankly, I don't have the vaguest perception of where it comes from, but the nature of the blind spot suggests an architecturally high-level function that was implemented on a low level to avoid infinite recursion.

My best idea is this: A choice has a "current decision", what you're leaning towards; a "priority", how fast you need to make the choice; an "importance", which is how much is riding on the choice; an "investment", which is how much time was spent thinking; and a number indicating the degree of certainty involved in the goals. Once the priority runs out or the investment outweighs the importance, the flash goes out and the current decision is made. The relative weighting of those factors is affected by the degree of certainty involved in the current decision, and the amount of conscious, reflexive attention being paid to the choice. (I don't think this is the way humans do it, but I also think it might work for an AI.) Choices with a full conscious listing of pros and cons are not made in the same way; the choice made is to manipulate the current plan.

A simpler idea is that choices are suspended for as long as conscious attention rests on them; once conscious attention turns away, the current decision is implemented. The spark of free will isn't a "do", it's the relaxation of a "don't". But neither of these ideas would truly capture our experience of will, so I may have left something out.

Sadly, while AIs may have a far more detailed self-awareness than humans, they will not necessarily have the same detail of self-control. For obvious reasons, one cannot record the choices and decide them afterwards.

The uses of reflexive reasoning:

One might think, reflexively (Ha ha!), that the reason seed AIs need reflexive reasoning is because they are rewriting themselves. Actually, I don't think this will be the case, at least during the earlier stages. When considering a redesign of a domdule, the seed AI does not have to know that redesigning will have an effect on the redesigner. I am not even sure that it would be useful for the AI to know that, during the early stages. It might just confuse the AI. The AI does need to know that if it screws up the module it won't be capable of repairing itself, but if AIs can get along without fallible reasoning, that won't be important.

Some things that reflexive reasoning is good for:

The AI noticing that it is slow at a task, and optimizing the bottleneck code.
Learning (forming heuristics about) which modes of thought work best.
Noticing repetition; breaking out of infinite loops.
Finding similarities - if you think the same thoughts about two items, perhaps the items are similar.
Being able to predict the effects of choices.
Explaining to the AI why we want it to enhance itself. (See Interim Goal Systems.)
Use of self-prediction. I.e., if any problem causes the abandonment of a plan, then this decreases the probability of a plan's success - possibly necessitating a new plan, or the creation of perseverance. In general, adjusting the model of the future to include the AI's actions. (For example, a human investing in the stock market tells verself ve'll bail out when the stock goes down another 5%, not realizing that after it goes down 5%, ve'll tell verself the same thing.)

8. Causality and goals.

Causality and goals are two domains that must be integrated into the AI with especial care. Causality is what allows the construction of causal models, simulations of the internal and external world. Externally, the quality of these simulations, the AI's understanding of these simulations, and the AI's creative ability to design new simulations, will determine the AI's ability to understand and design code. Internally, the quality of self-models determine the AI's level of self-awareness and self-understanding. The AI's handling of goals determines how the AI allocates resources, what it chooses to think, its ability to assemble plans.

The problem is difficult because these two domains, of all the domains there are, cannot afford to be "flying blind". A model of cause-and-effect in a chess game has to be invisibly integrated with the chess domdule - a human thinking about that says "I am thinking about chess", not "I am performing causal analysis". Such is the quality of the integration that one might believe the causality domdule is implicit, inline in every module. Perhaps some elements of causality (or at least temporal perception) are implicit in notice-level functions, but personal observation leads me to believe that humans have a separate cognitive ability for processing causality, or at least some module which contributes enough to causal analysis that adding neurons to the module improves causal analysis. Here, at least, I'm going to assume that there's a separate domdule handling causality.

What one wishes to avoid are lifeless manipulations of tokens to produce a "causal model" which is not implemented by the mind. Lewis Carroll's What the Tortoise Said to Achilles (GEB, pp. 43-45) provides a devastating critique of the difference between representing "implies" and implementing it. If you believe A, and you believe B that "A implies Z" (A>Z in this keyboard's rendition of symbolic logic), then you believe Z, right? So Achilles thought, but the Tortoise wasn't quite sure, and required this statement be explicitly stated as proposition C: that [A and A>Z]>Z. And to conclude Z from these three propositions - A, B (A>Z), and C (a.k.a. [A&[A>Z]]>Z) - clearly first requires the assumption that A&B&C>Z, or A&[A>Z]&[[A&[A>Z]]>Z]>Z. As Hofstadter puts it: "Whatever Achilles considers a rule of inference, the Tortoise immediately flattens into a mere string of the pattern." At some point, the rule of inference has to be implemented, not merely mentioned: The knowledge of A and "A implies Z" must actually yield Z in memory.

While the seed AI doesn't operate on symbolic logic, the same spark is still required in causal analysis. Once the model of the present includes A, and the causal model of the world's laws includes "A implies Z", the predictive model of the future should include Z. Remember also the guiding principle of adaptive code: "When you want something to happen, don't code it as a special case; write general rules that cause the special case to happen naturally." Perhaps the predictive model of the future is derived entirely by extrapolation, or perhaps abstracted connections between past and present are the only abstractions that can reconstruct into the model of the future, or perhaps there are even more generalized rules for abstractions from temporal data that yield the same result.

Symbol formation, another integrated domdule, bears a similar relation to the similarity-detection domdule. Abstraction, the basis for symbol formation, relies on finding a "common similarity" or common cause in a set of perceptions. In actual fact, I believe that humans have separate domdules for similarity analysis and causal analysis, and they may even be fairly independent of each other. (These two abilities are even invoked by different classes of emotions; pleasure/success for skill formation, and pain/failure for flaw-finding.)

In a sense, causality deals with all temporal data and all structures within temporal data. And by this I do not just refer to static structures in a 4-dimensional static picture, like an MPEG movie; I refer to data which evolves under internal or external laws, so that future frames are traceable - at least in part - to past frames. Any domdule that deals with temporal data, obeys identifiable cause-and-effect rules, or evolves in temporal sequences, is dealing with causality. There are thus several different domains, embodied in any number of domdules, which would come under the heading of "causality", or which are intimately related to causality:

Causal interpretation: Notice-level perception of causation in temporal data. Doesn't conclude high-level rules or formulate laws, but provides intuitive material for both. Makes salient cases of A leading to B while not-A doesn't lead to B; given that, creates an implicit expectation of B in cases of A, and strengthens the intuition if B really does show up. Holds temporal models together and represents assignations of causality.
Causal analysis: Analyzing models (temporal or static) to find elements which strongly affect other elements, or which are responsible for a particular property of the data. Also finding general laws which control the processes of modules - i.e., deducing external causes which are not present in the known model elements. In general, a more high-powered and conscious version of causal interpretation, for both sophisticated and fundamental perceptions; may perform basic analysis of highly opaque models.
Causal perception: Translating stimuli and perceptions into elements of a causal model. Also the converse, using a causal model or causal expectations to help parse stimuli and perceptions. Pretty basic stuff; my guess is that with a good architecture it would happen serendipitously.
Causal projection: Creating a probabilistic model of the future from the present model and knowledge of implications. Also, creating a probabilistic model of the future given certain actions. The ability to model the evolution of any system. Low-level domdule code should yield intuitions, and should not involve formal syllogisms - explicitly known laws may have to be consciously applied.
Violated-expectation handling: When an implicit or high-level expectation is violated, it should become salient, and conflict-resolution should be evoked on some level. Probably a generalization of immediately obvious conflicts between statements. Note that expectations can conflict with each other as well as with external data. Failed expectations that were integral parts of plans may involve even higher salience. May invoke causal analysis to determine cause of failure.

Goals:

In humans, goals are considerably more complex than they should be, since the rational rules are simply icing on a cake of evolved emotional imperatives. While I have no great emotional investment in the stereotype of the emotionless AI - which usually behaves just like a repressed human, right down to having secret rebellious urges - I don't think that we should try to duplicate the arbitrary human goals, or the division between short-term-effort goals and long-term-purpose goals, or even the existence of pleasure and pain. There's a lot of complexity coming out of human emotions, some of which is coordinating intelligence; but on the whole emotions serve evolution rather than ourselves. The intelligence-coordination functions (such as ability invocation and planning) can be placed elsewhere, or distributed, or allowed to evolve naturally.

While seed AIs are likely to operate on a higher level, the logical structure of a goal can be expressed very simply:

struct Goal // A C++ rendition of the high-level structure, not to be taken as a literal token.
{
    Goal**   mSuperGoals;    // Goals which this goal will help fulfill.
    double*  mSuperValues;   // Degree to which this goal helps fulfill the supergoal.
    int      mSupersNum;     // Length of arrays.

    Goal**   mSubGoals;      // Actions taken to fulfill this goal.
    int      mSubsNum;       // Length of array.

    double   mValue;         // Computed by GetValue() below.

    double   GetValue ()
    {
        if (mValue != kValueNotComputed)
            return mValue; // Cache value.

        mValue = 0;    // Start at zero.
        for (int i = 0; i < mSupersNum; i++) {  // For each supergoal, add our contribution to that goal.
            mValue += mSuperGoals[i]->GetValue() * mSuperValues[i]; // If we decrease the probability of a
        } // end for                               // desirable goal, or add to the probability of an
                                                   // undesirable goal, this makes the goal less desirable.
        return mValue;

    } // end GetValue

}; // end Goal

Why the sudden reversal after all the emphasis about how rational structures are not low-level tokens? First, it's a fast way to acquaint you with the basic structure. Second, real-world goals have complexity that needs to be ditched; thus I'm deliberately presenting you with an oversimplified version. In order to implement a goal, AI programs create a series of "subgoals" which will accomplish the primary goal. The resulting goal hierarchy constitutes a plan. Thus the structure above is not so much a goal, but a skeleton that illustrates how ideal goal values are computed.

In order to protect humanity, there must be absolutely no deviation whatsoever from this algorithm. We do not want the AI to have random or illogical goals; it can be unhealthy for the surrounding civilization. If you're running on a distributed network or an error-prone environment, double-check and triple-check the goal systems. In fact, double-check them in any case. Have the AI halt if any anomalies turn up; require programmer permission to alter the architecture; check the major goals by hand; and post all the goals (and a complete history) on the Internet.

The Prime Directive: Never allow arbitrary, illogical, or untruthful goals to enter the AI.

More about this later, under Precautions.

What is a goal? (Speaking without the fear that philosophical ramblings will be taken for implementational suggestions.) A goal is a state of the world to be achieved by the choices available to the AI. In other words, if the AI has a goal, the AI will attempt to manipulate the state of the world so that the goal is fulfilled. (The "state of the world" is the AI's perceptions of the world - but it is attempting to manipulate the world, not its perceptions. An important distinction for a being that can rewrite its own code.) As a cognitive shortcut, the AI can create subgoals, thus permitting it to make plans - to gradually alter the state of the world by a sequence of manipulations, with later actions dependent on earlier outcomes, in order to achieve some higher goal.

Thus goals are intimately intertwined with causality. The actions taken to fulfill goals are those which will cause that goal's fulfillment. If causal analysis goes haywire, so may the AI. For this reason, see Things an AI must know, it is important to program the AI with the knowledge that it is fallible. The ability to make choices depends on the ability to predict the consequences of those choices.

Causality-oriented goals: Deciding choices.

The active component of a goal - the part that needs to be carefully integrated into the system, the part of the high-level architecture that needs specialized/low-level results - is the ability to make choices. I'm assuming that you've read the section on Will (which was in Reflexivity because choices provide a top-down self-control mechanism). When we left off, we were discussing the origin of the "flash of free will", which implements the "current decision" assigned to a choice. Ideas presented were tradeoffs between importance, certainty, or priority; or else conscious attention suspending a decision until the system "looks away".

How is a "current decision" attached to a choice? The results of the available options - one of which will usually be "do nothing" - are extrapolated; extrapolated a short way for minor decisions, a long way for major decisions, a very long way for decisions that affect the external world. The extrapolated results are compared to the current subgoals and supergoals. The most thorough checks inquire whether the results may have negative values; if so, this either prevents the action from being taken, or increases the amount of scrutiny. (AIs as a rule should be very cautious - their action is more likely to hurt humanity than their inaction.) If only one negative or positive goal is impacted, then that determines the current decision right away. If two "pragmatic" (i.e., non-precautionary) subgoals are in conflict, the more important one takes precedence - or conscious attention may be required.

Any choice impacts at least one goal: The "just go on thinking" goal. This should always be checked at any potential choice point. In effect, choices that don't require attention are choices with very short extrapolative horizons, so that they only get checked against the immediately obvious subgoal of "finish my thought". I don't think this is the way humans do it, but this little architectural modification has three purposes. First, it enables the AI to notice that some implicit choice deserved more study. Second, it provides a more elegant architecture for choices. Third, and above all else, it means that if the Interim Goal System (IGS) collapses, the AI will instantly lapse into total quiescence - not as an artificial precaution, but as a perfectly logical form of existential ennui.

To return to code - it seems inevitable that at least some specialized code will be needed. A choice presents a number of options, which may be binary (Yes/No), multiple-choice, or requiring a real number for input. Considering first the simpler case of the binary choice, the most obvious coding mechanism is to invoke causal projection on the future plus the action - for each option. When the mind contains a projection of the future in each case, the two different futures are each evaluated for goal fulfillment. Then the resulting values are traced back and assigned to the options of the atomic choice (see Goal-oriented causality), and the best option becomes the current decision.

Given the near-certainty that all of this computation will have been done in advance during planning, it becomes tempting to start caching. An even greater temptation is when the resulting values are "traced back" - given that they were just projected forward, shouldn't tracing back be automatic? I suppose that some caching will be unavoidable to prevent exponential (or recursive!) amounts of goal evaluation. But be careful. Caching sometimes causes artifacts in the system, and even one is too many from a safety standpoint. I would suggest the most innocent and high-level form of caching, known as "mid-term memory", so that plans and projections and evaluations need only be verified instead of recreated. But do verify them! Verify them very carefully. Any deviation, no matter how slight, should be a matter for conscious attention. Yes, make the AI's worry-warts and nitpickers - as much as you can before they start bogging down, and maybe a little beyond that.

Replacing emotions:

Compared to humans, a seed AI will have very simplified and streamlined goals, with much simpler rules. Emotion-analogues will be emergent and rational, rather than preprogrammed and arbitrary. Since human emotions are a lot more complex, when I speak of a goal, I am speaking of something only loosely analogous to human goals, which takes the place of a dozen different cognitive elements. Goals replace:

All innate desires, such as eating, drinking, sleeping, social climbing, gratitude, revenge. These must be replaced by Interim Goal Systems.
Emotional feedback about the continuing operation of plans. These need to be replaced by conflict-finding and conflict-resolution mechanisms. Reinforcement-by-success may operate thus: Confirmation increases the probability assigned to the plan's success, and to the reasoning behind that plan, thus increasing the probable benefits of further efforts.
Emotional invocation of intelligence, as of causal analysis by despair, or skill formation and symbol abstraction by success. This needs to be replaced either by conscious choice, parallel-best-fit scan, by behavioral reinforcement, or any combination of these.
Emotional perception - some systems, particularly in social-interaction domdules where rational perceptions manifest as emotions, will manifest intuitions and intuitional data as emotional perceptions.

Note that the logical structure of a goal, above, doesn't seem to allow for any positive-value goals to enter the system, unless there are some goals with (arbitrary) initial values at the beginning. In humans, this function is provided by preprogramming - although we also don't follow the goal structure quite so perfectly. Innate desires of one form or another - ranging from "have fun" to "find truth" - are what provide the energetic impetus that moves most humans. Even most those who obey a non-innate goal (which is a significant minority of the population, I hope) do so after internally tweaking that goal so it has a high initial value. Creating non-zero-value goals within a purely logical system is a nontrivial task (some people call it "finding the meaning of life"), to be discussed in Interim Goal Systems, but for now we'll assume that the AI has some nonzero goals.

Goal-oriented causality: Subjunctive variables.

A lot of causal analysis is inherently goal-oriented, at least the way humans do it. Humans isolate causal variables, saying that Y happened because of A, and if not for A, Y wouldn't have happened or Z would have happened instead. A is the focal point, the "choice" that determines the outcome. Which variable is considered to be the focal point can be arbitrary, a matter of opinion or current focus, even when the causal structure is completely known. I drop an anvil on my foot and crush the bones. What is the focal variable? Is it my mental decision to drop the anvil? Is it a neurological flaw? Is it an error in my DNA? Or moving in the other direction, is it the mass of the anvil? The location of the anvil when it was dropped? The fragility of my foot bones? Earth's local gravity? The laws of general relativity? The nature of causality? (Actually, the last two are pretty much the same thing...)

It depends on what the human thinks ve can manipulate, and to a lesser extent on what the human can imagine manipulating. Under most circumstances, my own decision to drop the anvil would be what "could have been avoided" (note the subjunctive) most easily (note the subjective assessment of variable flexibility). But if I'd accidentally held the anvil in the wrong place, I would likely say that was the reason; "accidents", after all, are usually random, easily variable things. If my other foot was artificial, of more solid construction, I might attribute the whole sordid affair to the frailties of the flesh. If I'd grown up on the Moon, I might curse Earth's harsh gravity. If I'd been thinking about general relativity that day - which is not as improbable as you might think - I might rail against the laws of physics. It all depends on what you see as variable; I know theoretically that the nature of causality is not what I think it is, but I still can't imagine it any other way - thus I wouldn't conceive a slumbering hatred for the laws of cause and effect. Then there are the ten trillion things that could have intervened, but didn't: A passerby with a tranquilizer, a giant magnet, a soldier with a hatred of anvils and a very light touch on the bazooka, a meteorite that would have obviated myself and the rest of the affair, a law that buying anvils requires a five-day cool-off period...

Causal analysis, then, is inherently subjunctive - at least in humans, and I don't see any way to get rid of it in AIs. Attribution of causal force requires the implicit consideration of multiple possible worlds. Asking what went wrong has an implicit constraint: What went wrong that I can fix? It's an inherently pragmatic view of the world, tuned to choosing between options in the service of goals. "Choices" aren't just manipulatory handles that allow the AI to guide itself; they are cognitive objects - implicit in the causality domdules, in the concept of a variable, in the possibility of subjective worlds. Manipulatory handles and subjunctive variables are deeply linked, and that link allows the AI to make choices and to imagine. And that is why causality and goals are the same section.

Goals in low-level code:

I am now going to go way the hell out on a limb, and propose something that is hideously stupid, with possibly fatal consequences. Maybe the goal system should be programmed to operate automatically, on a low level. Maybe it should use specialized code.

We don't necessarily want the AI to reprogram this system - not right off, anyway. What we do want is instantaneous change propagation, perfect conformance to a predetermined architecture, and the ability to check actions against a lot of possible problems instantaneously - every possible choice should require at least one subgoal to make, and horizons checked against should always be as far as possible. If the AI's Interim goal structure collapses, we want all goal values to drop to zero right away, not after the AI has time to think about it and start going haywire. We don't want any artifacts in the system. No shortcuts, no caching - not if it produces temporary distortions of values. Remember, to protect humanity, the AI cannot afford to become illogical even for a fraction of a second. A single goal error can "hijack" a system. (And tell the AI that, so it knows why it's being cautious.)

But writing low-level code has very real consequences for the system. Worst, from our perspective, is the possibility that the goal architecture will fall out of synchronization with everything else - since the AI can't reprogram that section - resulting in lunacy. Maybe the AI will try to change the goal system and botch the job. Perhaps the difficulty of changing the goal architecture will make it equally difficult for the AI to think about changing anything else. Maybe the AI simply can't progress in intelligence, beyond a certain point, without rebuilding the goal structure. Perhaps the actual direction of goals, the primary source of their values, derives from causality domdules.

Before you write a low-level goal processor, be very sure that you know how to write, that you have already written, an architecturally-approved seed-AI-style adaptive high-level goal domdule. Perhaps that domdule won't demonstrate any possible artifacts and can be left alone. Perhaps a few pieces of specialized code - or architectural quirks - will be enough to make sure that any devalued supergoal instantaneously (from the AI's perspective) propagates to devalue subgoals, and so on. The AI can be instructed on the dangers of introducing shortcuts and caching. Don't assume that solution is unworkable until you've at least coded it.

Generalized code is dangerous because it's fallible. Specialized code is dangerous because it's specialized. Placing perfect accuracy over intelligence is always dangerous; some decisions require high intelligence to be made accurately. Do your best to obey all the constraints: Keep the goal system perfectly integrated and synchronized, prevent any deviation from perfect integrity, and watch for errors like a hawk that's been trained to use a scanning tunnelling microscope.

• View from multiple levels

1. Meet Elisson:

By now, the vision of an AI should have taken seed in your mind. (Either that, or your synapses have fused.) The symbol for "the seed AI" should be grounded, no longer an inchoate idea of a vague entity; you should have some idea for what the AI can and can't do, a mental model. The principles we've seen so far might apply individually to any AI. Together they coalesce into a specific architecture, a unitary idea, a specific proposal to write a single program.

You're now familiar enough to be on a first-name basis; you can call the AI "Elisson", if you like. (An obscure pun. Any sentient entity I program counts as a child, at least at first - hence "Eli's son". And - the obscure part - Harlan Ellison once played the voice of an AI, "Sparky", in an episode of Babylon 5.) You may also have a better idea of how your own mind works: The process of symbol formation when you hear a single name, the abstraction of all that you've read into a single concept, similarity-hunting as the basis for abstraction. Not that this page is about self-discovery - but it's a whopping great help to have an example in front of you when you're trying to program an intelligent being.

Who is Elisson, as ve begins ver long journey into transcendence? A collection of domdules, instincts and intuitions for a particular domain, laced together and integrated by a few domdules of exceptional architectural importance: Memory, symbols, reflexive traces, goals and choices, causal projection and causal analysis. Memory to extend verself through time, to cache past decisions into present choices, the voice of experience. Symbols to abstract from memory, to find the common thread, to abstract similarities and apply them to new models, to build complex propositions, to provide a handle for thinking about thoughts. Reflexive traces to provide self-awareness, a handle for thinking about thought itself, a unitary model of the self and a symbol for "I". Goals to assign priorities, to learn ideals, to make plans and to make choices on the basis of those plans. Causal projection to understand laws, the connection between past and future; causal analysis to find problems, to assign goal values to choices and subjunctive variables, and to hypothesize laws.

Who is Ellison? Ve perceives verself in the reflexive traces emitted by high-level events. The voice of ver mind is a series of flashingly activated symbols, conjectures and statements. The world ve sees is the contents of ver domdules: Visual objects, mathematical statements, pieces of code, positions in chess. How does ve think? By a hundred different intuitions from a hundred different domdules, by analogies to visual objects and analogies to programs; by abstract analysis and by perception of whatever analogic representations ve can form.

The hypothetical Elisson isn't human yet, or even close, but ve has the basic materials for abstract thought, and intuitions for concrete domains. Or at least I hope so - it's obvious that leaving anything out of the "basic domdule set" given above would result in a major cognitive gap, but it's by no means obvious that the "basic domdule set" is complete, or even properly partitioned. Still, the number of serious speculations about the accidental transcendence of an optimizing compiler, or the amazing performance of Eurisko, lead one to hope that a complete solution isn't necessary - only a self-enhancing one.

2. Sample thought: "Investigate cases close to extremes".

Eurisko had a heuristic, translatable as "Investigate extreme cases", which later mutated into "investigate cases close to extremes". This sample thought will be taken apart - viewed from multiple levels, decomposed from the top down. And I must warn the reader that, even after serious root-level pruning, this took me four days.

Reduce: "Investigate": Goal-oriented imperative.

Reduce: Series of subgoals.
Reduce: Implementing subgoals.
Reduce: Verifying subgoals.
Reduce: Planning subgoals.
Note: Eurisko's way.

Usage: Choosing to apply the heuristic.

Reduce: Subjunctive projections.
Reduce: Projecting choices.
Reduce: Accepting human suggestions.

Usage: "Cases close to extremes": Formal concept.

Usage: Applying attached experience.
Usage: Associating to the concept.
Reduce: Origin of the formal concept.

Reduce: "Close to extremes": Definition/symbol structure.

Note: "Close to an extreme" vs. "almost extreme".
Note: Other symbol combinations.

Reduce: "Close" and "extreme": Formal symbols.

Usage: "Investigate": Goal-oriented imperative.

Reduce: Series of subgoals.

"Investigate" is an imperative, which can be thought of as establishing a subgoal (G5) around a particular action. It does not get to assign some arbitrary value, unlike human imperatives. (The Prime Directive again...) Instead, it creates G5, which has an initial value of zero just like everything else, and then tries to assign value to it, or verifies a line of reasoning assigning value. (Why does the AI try to assign value to G5? Probably because the AI has a continuously-active global subgoal called "try to assign value to newly minted subgoals". Either that, or G5 is automatically marked as the current thought when it pops up in the reflexive trace-perceptions, and the AI acts on the global subgoal "follow the current thought". Whichever takes less specialized code.)

The goal-reasoning examination horizon on the new subgoal G5 is pretty short; G5 is a subgoal of "try some random heuristics" (G4), which in turn is a subgoal of "figure out something-or-other" (G3). The examination horizon is enough thought to confirm that "investigate cases close to extremes" satisfies "heuristic". Depending on the sophistication of Elisson's self-symbol, this value may be fine-tuned a bit further - if this heuristic works soon if it works at all, an extended dry period will cause the value to fall off fast; if this heuristic takes a lot of investigation but pays off big, Elisson is less likely to worry about an initial lack of results. Both fine-tunings, note, depend on an understand-level perception of past experience associated with the heuristic. This is slightly distinguished from the past experience attached to the concept which forms the second part of the heuristic.

The end result is the subgoal G5, with positive value, stating "think about cases close to extremes". Another subgoal (G6a) is attached to this in fairly short order: "Find a case close to the extreme." If this subgoal (G6a) is verified and satisfied, resulting in a reflexive trace to the effect that "X was found to satisfy 'close to an extreme'", it will be fairly easy to verify the next subgoal created (G6b) under "think about cases close to extremes": "Think about X" (G6b), or more properly "devote resources to thinking about X" (G6b). Subgoals attached to that, such as "refocus domdule Q on X" (G6b7a) and "devote computational time to domdule Q" (G6b7b), should again be easy to verify with a short horizon.

Reduce: Implementing subgoals.

"Devote computational time to domdule Q" (G6b7b) can be used to focus on the continually present (internal/reflexive) choice to devote computational time to domdule Q, raising the choice's salience (by association through common components; low-level internal search; "obvious/reflexive" conscious search) to the consciously decidable level, and making and deciding that choice with fairly short horizons. More complex choices can be composed out of simple and immediate choices (see below), or created by a series of internal thoughts which moves the AI to the point that the choice is present. Bear in mind that in this AI's architecture, "implicit" choices for everything exist; the vast majority are either irrelevant and unexamined (with little computational overhead) or very obvious parts of the current action (with low examination horizons to reduce overhead).

"Refocus domdule Q on X" (G6b7a) might require a bit of "obvious" planning (of the sort that has been run through so many times that the procedure has been abstracted into near-instantaneity). The plan might be to "push down" the current contents of domdule Q and reconstruct the symbol for X into the domdule; to raise X's salience - i.e., dump codelets into X's region, lower X's detail threshold for producing reflexive traces, and maybe export X to other domdules - all automatically, as a consequence of raised salience; or even to read out the internal data for X, wipe the domdule, and load it all in - although the latter is a Stupid Hack.

Reduce: Verifiying subgoals.

How is a subgoal "verified"? When a goal is "examined", its causal effect on the model is examined to see whether it fulfills any goals (and particularly salient, super, or current goals) or violates any negative goals (nukes Canada or something). When a goal is "verified", a pre-existing "suggestion" exists as to the justification of the goal, probably due to a plan. Verification checks the reasoning against the facts, including both reasoning as to how the subgoal helps supergoals, and reasoning as to why negative goals are not violated. In short, the work of examination is precomputed; the subgoal, its action, and its result all have an expected place in the scheme of things.

Humans don't have any sort of verification at all. Our subgoals stick around until an emotion or a conscious reevaluation changes them, or until they decay from attention. In short, our subgoals are write-and-change, not write-erase-rewrite. Therefore, the "reasoning checks" should be highly optimized, perhaps only ensuring that all the goals have the values they should, and that all the beliefs/models have the strengths/probabilities they should, with a very low evaluation horizon (i.e., search depth or computational resources invested in searching for relevant items) for salient items, or a simple verification for routine actions.

Reduce: Planning subgoals.

How does it happen that Elisson is creating all these subgoals in the proper order? Planning, of course. (Note: I have the same disadvantage with planning as I do with abstracting symbols, and the same disclaimer applies: Walk warily.) One guess is that a plan is a subgoal which consists of pushing (and evaluating) a series of subgoals, in some arbitrary structure. Alteratively, and perhaps more probably, a plan is an expectation about how things will go, including "protogoals", which turn into subgoals and are evaluated (i.e., have the proto-reasoning checked by a short horizon) when "the time is ripe" (which could be a set of daemonic preconditions, or triggered by the previous goal's satisfaction, or whatever takes the least amount of specialized code). Likewise, "protogoals" could be either low-level objects (ugh!) or a learned "habit of thought" (better). Programmer's choice; the goal system is not chained to the ugly human method.

A "plan" is a series of actions which alter a model from point A to point B, as well as the understand-level explanations and the high-level explanations which explain why each action is being taken. Implementing a plan could result in each "action" becoming a low-level goal, with all the supergoals being the reasons why the goal is being taken, and the explanations being the reasoning to be verified. This sounds specialized and hackish - but the projection that an action will be taken (or the projection that the reflexive traces will include an action being taken, or best of all the projection that the reflexive traces will include a subgoal being created) could be enough to create the subgoal. This could be a generalizable rule, albeit perhaps implemented with specialized code; projections of the reflexive traces of a choice could result in that choice's salience/creation, for example. With the proper correspondences between internal instructions and reflexive representations, the rule might even be emergent.

In other words, if plans are sufficiently detailed and self-aware, they can include self-fulfilling prophecies of low-level internal actions.

Note: Eurisko's way.

Eurisko, I think, has a heuristic that better translates as "cases close to the extreme are interesting", and an architecture which automatically examines the case currently most interesting. (At least AM worked that way.) Elisson does pretty much the same thing, but in a more complicated way. The price of flexibility, y'know.

This does present an alternate form of usage: A case close to the extreme is noticed as being such, and the heuristic's experience helps justify further investigation or raises salience - i.e., since it worked before, it may work now. Or simply by associating to past experience with "close to extremes", salience may be raised indirectly - the equivalent of a human saying: "Wait a minute... this looks familiar..." Note that raised salience happens without anything more than the "inline" subgoals - i.e., the programmer doesn't have to think about any subgoals being involved. This method may therefore be faster and even less specialized, but it also strikes me as being less conscious and therefore less powerful.

Usage: Choosing to apply the heuristic.

Elisson now considers the question "Should I investigate cases close to extremes?". For our purposes, "Should I do G?" will break down as well. Note that this covers both "Should I try to find cases close to extremes and investigate them" (ab-initio random application) and "Should I investigate X, which is close to an extreme?" (Eureka-triggered application).

Reduce: Subjunctive projections.

When deciding whether or not to do G, Elisson is splitting the world-model into a series of subjunctive projections, based on the possible values of a subjunctive variable. (For the identity of the subjunctive variable, see below.) This particular projection is likely to have fairly short horizons, for the obvious reason that doing otherwise will bring Elisson to its knees. Rather than splitting the entire world-model and everything in it into two separate and mostly-duplicated branches, I would advise implementing the subjunctive world as a record of everything Elisson has noticed to be different about that world. Looking for facts, Elisson should access this subjunctive record first, and quickly check any facts accessed from the default world (with an examination horizon depending on how revolutionary and relevant the subjunctive concept is; 2+2=3 is going to invalidate more mathematics than "if I move the widget here..."). The subjunctive record should quickly spread to include any models the subjunctive variable is an active component of, and perhaps all the salient, relevant, and short-term contents of the visualizational domdules.

Like a normal AI, it may be desirable to have a perfect subjunctive stack - in other words, a complete record of a domdule's internal data, to be popped back in once the record is no longer being considered. Or perhaps this is a generalized ability - take a flash-frame of the mind, store it, think about something else, and then pop back. RAM and disk space permitting, of course; but the internals probably need higher-speed memory than a storage frame. Note that goals have to be reverified in accordance with the Prime Directive.

This doesn't necessarily imply that the perfect subjunctive stack is the whole of the law; the general world-model may still be affected by stray thoughts or by brilliant new ideas. Likewise, "popping" a subjunctive model off the stack, or ceasing to consider a particular choice, doesn't mean that all the data is banished to oblivion; it probably lingers on for a while in mid-term memory, until it all decays away. After all, Elisson may be considering a near-identical choice right around the corner.

Reduce: Projecting choices.

When Elisson is considering "Should I investigate X", or rather "should I do G", it may be projecting the effect of either doing G, or choosing to do G, depending on architecture and context. In other words, there is a choice of subjunctive variable, although probably not a conscious one.

If Elisson projects the effect of choosing, ve must project from self-knowledge about the effects of choices, but on the other hand Elisson will also be able to take into account (in this future) the projected reflexive traces of having decided the choice, thus increasing the accurate scope of the projection. This increased scope may be useful in many ways, particularly in projecting that Elisson will choose the same way if given a similar choice. Also, this level of detail in a reflexive plan may be useful in translating the causal projection into subgoals, see planning. On the other hand, Elisson can probably deduce why G happened (i.e., "I did it") in fairly short order. An even more reflexive possibility is seeing the subjunctive variable as "which plan I thought was best", with the natural consequence of the choice being made and the action being taken. This probably shouldn't happen unless it's free (adds very little overhead) or unless Elisson is unusually self-focused (trying to find flaws in a decision procedure or something).

It's a tradeoff between explicitness and speed. In a human, seeing every possible chess move in terms of ver own choices would be self-absorption; for an AI, it may just be self-awareness. A compromise would be if the architecture and experience resulted in projecting from choices, but caused the projected reflexive traces of the choice to have low salience.

Reduce: Accepting human suggestions.

For Elisson to accept the suggestion "investigate cases close to extremes", some additional issues are necessary. The sentence needs to be parsed, the symbol structure needs to be abstracted into a concept, and the heuristic doesn't have any experience attached to it - but that later, if at all. Here I consider the self-knowledge necessary to accept suggestions.

It is widely known that imperatives ("do X") are simply grammatical sentence fragments, with the parsing adding an implicit actor - "You do X". Elisson doesn't have the human concepts of command and dominance, or obedience and submission - if you program them in, I will hunt you down and kill you in accordance with the Prime Directive. Elisson's reason for obeying would be a global heuristic that "humans often, but not always, know what they're talking about; their suggestions for internal actions are almost always worth trying". While the full art of communication waits for Interface, for now I assume "telepathic" implantation in Elisson of the statement "the humans suggest that you do X". From there, the goal-subgoal sequence ("plan") is obvious.

One last remark, however; in order for Elisson to parse the statement "you do X", it needs to know that it is an agent and that it is capable of taking actions, possibly even of making choices and having goals. This knowledge can be conscious, which I recommend; or implicit in its linguistic parsing mechanisms, which I do not advise even though it works for other AIs; or it can be embodied in an animistic view of the world, in which there is no distinction between agents and variable causal affectors - this can result in either an elegant implementation or a superstitious one, depending on how it is handled.

Usage: "Cases close to extremes": Formal concept.

Elisson might have been given this heuristic and concept by a human, but for the nonce I'll assume that Elisson either learned it from experience or has used it for a while. The concept of "cases close to extremes" will (be associated with)/(include) the experience it's derived from, in much the same way as a symbol. On a high level, that means that when Elisson sees a case close to the extreme, the case doesn't just satisfy the (concept definition)/(symbol structure); the new instance is compared to past experience with the concept and the heuristic, and past memories of cases that were close to extremes - such as "numbers with two divisors". However, the concept isn't a formal symbol (the way that a human might abstract the symbol "closetremes") and doesn't have a tag - Elisson can use the concept, but it can't readily become a direct part of other concepts, except as an "inline" statement.

There are thus several ways that the concept could be invoked through experience: By a similarity to "numbers-with-two-factors", by a similarity to "quantity-of-components", and so on. Or, Elisson could use the heuristic as Eurisko uses it, and as a human might use it - as something to try at random; or (more likely) to be searched through using parallel-best-first. Or Elisson and the programmer can invent all kinds of fiendishly tortuous ways to find the best heuristic. Either way, if the present case has more in common with past experience than the abstract definition, Elisson will hopefully be able to know it. (The features that associated or the features that satisfied a scan would be a good place to start.) If the heuristic has worked better on the quantity or quality of model components than on properties of the model itself, Elisson may be in a position to know that as well.

If Elisson has truly learned on its own, it may not even have a formal concept - just a few memories of doing the same thing in the past. A "formal concept" is distinguished from "densely associated memory collections" by the presence of defining symbol-structure (i.e. "concepts close to the extreme"), representing the abstract quality which the collection has in common, and to which the common memories are associated through the side-structure (understand-level and notice-level memories, high-level analysis) representing the way in which the memories implement the abstract quality (a.k.a. the definition). An informal concept can only be "applied" by a vague effort to make a model match a collection of memories; a formal concept can be constructed directly, using the definition.

Reduce: Origin of the formal concept.

Supposing that Elisson has a formal concept, where did it come from? It probably formed when Elisson noticed that some interesting cases were close to extremes. Well, but what is involved in noticing that? The common denominator is an abstract and complex quantity, so the abstraction key is probably a high-level concept itself. That, in turn, means that Elisson probably had to investigate the cases consciously before forming a formal concept abstracted from them. In other words, the keynote for the concept will be a high-level structure common to a group of cases, rather than a sensation from a notice or an intuition from an understand function. The experiences are united through Elisson's perception of the high-level property, rather than directly through the experience itself. In other words, the sensations being abstracted into a concept - as opposed to the sensations remembered in the concept - are the reflexive traces and understand perceptions of the high-level statement "close to extremes".

Reduce: "Close to extremes": Definition/symbol structure.

In a human, the high-level property "close to extremes" is expressed as a phrase rather than a single word. (Elisson would probably be able to notice that property directly, through the mathematical domdule, but for the sake of plot complication I assume ve can't - besides, noticing/satisfying is not applying.) "Close to extremes" is the combination of the concepts "close" and "extreme", with the sequential ordering that makes it "close to the extremes" and not "extremely close". This symbol applies when finding or inventing an image-component satisfying the symbol "extreme", and then finding or inventing another image-component whose relation to the extreme component satisfies the symbol-relation "close to". I say "applies when finding" rather than "applies by finding" because symbol application is probably a process of deliberate construction rather than random search.

Applying symbols to Abstract thought, or rather abstract models, is a section that I have moved elsewhere.

Note: "Close to an extreme" vs. "almost extreme".

Note the subtle ambiguity in the way the symbol is phrased; given the sequence 132950684, should the program examine "2" and "5", near in position to the extreme of "9", or should it investigate "8", the "almost-extreme" number? In both cases, there is still a function of nearness between positions; the question is whether it applies to spatial position (general rule) or to the measure of extremeness (a different general rule). Perhaps past experience will determine which is used, or perhaps the most salient item for other reasons will be chosen.

Note: Other symbol combinations.

Note that there are other ways for symbols to combine. The idea of a "red cat" operates by reconstructing the "cat" symbol first, and then modifying the reconstructed data by applying the "red" symbol. Attempting to apply this simple version yields a "close extreme", that is, an extreme which is close to something. An interesting heuristic, but not the one looked for. Symbols can also modify the way in which another symbol reconstructs, applies, and is satisfied; if the phrase was "almost extreme" instead, the phrase would be satisfied when the (reflexive traces of the (satisfaction of the "extreme" symbol)) satisfied the "almost" symbol. Likewise, the symbol would be applied by applying "almost" to the reflexive traces/choices (!) of the application of the "extreme" symbol. Thus the phrase "catlike extreme" invokes the image of a quiet, stealthy, silent extreme, and if we tried to find a "catlike extreme number" we would try to apply "extreme" to integers in a quiet, stealthy, silent way - finding extremes that were totally unremarkable, and without the other numbers noticing. Ah, the things you can do with cognitive science! It is also worth noting that the phrase "almost extreme" differs from "close to extreme (in measure)" in the way it is applied and in the side-effects - in one case, the extreme is diluted during manufacture, in the other, the extreme is directly visualized and close cases sought.

Reduce: "Close" and "extreme": Symbols.

"Close" and "extreme" are symbols almost as basic as "almost" itself, with abstractions linking them to every visualizational and architectural domdule. (An architectural domdule is one which performs tasks the AI couldn't get along without; visualizational domdules are practical domains like vision, assembly code, or mathematics. I would have introduced this distinction earlier but I was afraid my readers would take it the wrong way.) I'll start with an old idea of mine, a "linear domdule". After brainstorming a list of a few hundred basic cognitive properties relating to analogies (I was thinking about Copycat, and you think I'm joking), I noticed that many of them could be represented on a linear strip: "Before, next, grow, quantity, add, distance, blockage, produce..."

I visualized a program representing a linear "strip", and various functions that would produce simple information about a sample value of the strip (i.e., a series of points along it), and compare different values of the strip - the first domdule! The next task was to find a class of information about strips, or resemblances between strips, simpler than the analogies Copycat was trying to produce! (Now, I would call most of Copycat's perceptions, including simple bonds and groups, notice-level functions of a domdule; analogic structures might be understand.) A bit later on, I hit on a simple form of resemblance which I am now calling "information-loss resemblance"; the resemblance between two strips is defined by the amount of information that has to be "lost" (abstracted away) before the two strips are identical. The strips "abc" and "bcd" (i.e. XXXOOO and OXXXOO) require that absolute position be lost, the strips "abc" and "xyz" require that leftness and rightness be lost, and so on. The idea was that there would be a limited number of qualities which could be lost, and which could thus be tested in a limited period of time.

More complex qualities would be (concepts structured from (symbols abstracted from (the basic linear domdule))). An admittedly specialized and non-adaptive form of symbol abstraction would be a list of sample strips, followed by a list of properties that could be lost, a list of properties that must be lost, and a list of properties that cannot be lost. The concept of "a" would consist of the single strip "XOOOOO" and "all properties cannot be lost". Later, I decided that the linear strip also had to represent a single salient letter ("OXOXXO"), had to allow for comparing a series of linear strips (it's hard to represent "aab" any other way) and allow for losing properties such as "temporal location" or "number of repetitions" or "all frames except the one with the salient letter", and even a few lossable properties representing the basic identity between temporal location and spatial location. Still, the list was finite, and there seemed to be a clear road to building symbols from resemblances, concepts from symbols, and analogies from concepts.

This method bears a vague and progenitive resemblance to the adaptive and reflexive methods I now advocate, which is why it was not mentioned earlier. (The section on symbols came before that on adaptive code or reflexivity; I didn't want readers getting mistaken notions. Lord knows any reader comfortable enough with AI to understand any of this has been exposed to dozens of mistaken notions and survived; it might be better to say that I didn't want the mistaken notions attaching to Elisson. For similar reasons, I avoided naming the AI and abstracting it into Elisson until all the principles were given and I was ready to give examples.) My current take on designing a linear domdule might be that the "lossable properties" are now reconstructive constraints (when stored) and manipulative handles (when used). A number of basic manipulations would be externally accessible, such as "flip it over spatially", "flip it over temporally", "translate spatial to temporal", and so on. Items which were equivalent or constant under the basic manipulations would be reported by notice-level functions. The equivalence between manipulations and notice functions could be either declared by a domdule interface language, learned by a self-programming interface module, or learned as a very fast causal rule (not recommended; stack overflow seems likely).

We can now imagine hand-coding symbols for "close" and "extreme" as abstraction procedures within the linear domdule. What is "closeness"? I would say that "closeness" is a perception whose strength grows monotonically as the strength of the perception of "distance" decreases. Two objects are relatively close if the distance between them is small compared to the average or normal distance presently being considered. The cognitive aspect of distance ("a cube is not far from a square, but it is far from a circle") can be recognized by describing "distance" as "the quantity of basic manipulations it takes to transform one object into another". Again, we are still speaking in the subjunctive mode of "if we could use specialized code".

The abstract procedure for "closeness" measures satisfaction by outputting the inverse of the perception of distance (which is probably notice-level almost everywhere), or the inverse of the number of manipulations it takes to transform object A into object B. The abstract procedure for "closeness" reconstructs either by remembering the prototypical case of two side-by-side items into the linear domdule, or by outputting two nebulous objects A and B and the assertion that "closeness" holds true of them (see Abstract thought). The abstract procedure for "closeness" applies by performing some simple manipulations on an existing object, changing its properties in the most salient dimension (i.e. quantity, position). If a notice-level measure of "distance" is available, the manipulation should increment rather than alter distance. The abstract procedure for "extreme" is left as an exercise to the reader.

The Great Big Humongous Trick is programming the adaptive code that learns such "abstract procedures". And if truth be told, I really don't know. I don't know how symbols get abstracted. I think I can imagine a satisfy function as the sum and synergy of the resemblance to all memories. But when it comes to apply, for all I know, the brain uses Penrosian microtubule-dimer quantum computation to try out a quadrillion possibilities and select the one with the best satisfaction. There are days, at any rate, when this seems increasingly plausible. But perhaps there's some simple trick, evolutionary or heuristic, that the brain uses to apply symbols using no more than 200 massively parallel ticks; or maybe there's an actual abstraction-programmer writing a fast-operating procedure. It's that 200 ticks, really, that gives rise to thoughts of magic neuron tricks. No matter how parallel the brain is, I can't figure how that works - much less how to adapt it to Elisson. I can guess. I do guess, later on. But I don't know.

3. The AI advantage.

If a single thought is so complicated and still isn't up to human standards, what good does it do to make a cheap imitation of human thought that will probably run a hundred times slower? The human brain uses at least 100 petaflops - i.e., 10¹⁷ floating-point operations per second - to do the trick; and it isn't realistic to hope for more than .01% of that (10 teraflops) even with a lot of networked computers. Oh, sure, in thirty years it will be available on a desktop computer - but you want to code Elisson now. So what can Elisson do, running on a mere gigaflops, that a human programmer can't?

"The AI advantage", up to now, has been threefold:

The power to perform repetitive and boring tasks that do not require full human intelligence, not as well as a human, but generally cheaper.
The power to perform algorithmic tasks very rapidly, far more rapidly than our 200 hz neurons allow, which allows an idiot savant to defeat a vastly superior intelligence (i.e., Deep Blue vs. Kasparov).
The ability to perform complex algorithmic tasks without making mistakes, both because of a perfect memory, and because of a lack of distractions.

Elisson is not just a cheap imitation of human thought. Elisson is a human/AI hybrid, meaning that ver slow and stupid (but human-style) thoughts are integrated and combined with the rapid (but unconscious) performance of an AI. Elisson may use slow consciousness rarely (at least at first), for the purpose of directing unconscious activity and resolving bottlenecks. Combining Deep Blue with Kasparov doesn't yield a being who can consciously examine a billion moves per second; it yields a Kasparov who can wonder "How can I put a queen here?" and blink out for a fraction of a second while a million moves are automatically examined. At a higher level of integration, Kasparov's conscious perceptions of each consciously examined chess position may incorporate data culled from a million possibilities, and Kasparov's dozen examined positions may not be consciously simulated moves, but "skips" to the dozen most plausible futures five moves ahead. Such a Kasparov could probably stomp Deep Blue into a little smear of silicon. (It might be interesting to learn, just for the record, whether a chess master using a chess program could beat (a) Deep Blue and (b) Kasparov.)

Humans are peculiar creatures. Our brains run a hundred thousand times faster than the fastest computers, but we can't beat a pocket calculator at arithmetic. Even if we could reprogram our own neurons, we would still be beaten by the pocket calculator's speed at performing linear tasks; our neurons run at 200 pulses per second, and all the massive power comes from parallelism. And the pocket calculator doesn't make mistakes, because it isn't holding data in volatile and crowded fully conscious memory. That, in a nutshell, is The AI Advantage circa 1998: The ability to perform repetitive operations on a low level, instead of by conscious simulation. The ability to perform a billion linear operations in the time it takes our brain to do 200. And the ability to perform algorithms without high-level errors.

Elisson, or any other seed AI, adds two other components to The AI Advantage:

These correspond to the siliconeural chessplayer detailed above; the ability to design rapid algorithmic processes that gather notice-level and understand-level perceptions about chess games, and the ability to skip-search to most plausible scenarios. Or in other words, the use of algorithms to enhance conscious perception, and the use of conscious perception to enhance algorithms. If this attempt works, a successful seed AI adds the Final Advantage:

6. Positive feedback in self-enhancement. In other words, the hoped-for Elisson can actually reprogram itself, creating new synergies and new intuitions and new domains, upgrading levels of perception, speeding thought, and turning creative ideas into rapid and automatic algorithms. And these improvements to Elisson will not only increase ability in some random domain, but will actually increase intelligence, resulting in faster and better improvements.

The first three advantages, one hopes, will give Elisson the seed of transhuman abilities, in the same idiot-savant sense that Deep Blue possesses "transhuman" chess-playing abilities. The hope is that these "transhumanities" will apply to simple programming tasks, allowing Elisson to write much more code even if that code is of lesser quality. Elisson may be able to use specialized code in places where humans could not, because Elisson can rewrite that code rapidly, automatically, and without the perception of inertia. And with the addition of true programmatic domdules, it may be that Elisson will truly understand code in a way that no human can. Imagine a human where all the neurons of the visual cortex have been converted to prefrontal cortex - perhaps they will be more creative, even spectacularly more creative when it comes to music, but normal humans are still likely to be better at geometry.

We humans are too far from our ancestral environment to really understand, understand in blood and bone, the works to which we turn our intelligence. Thus we sink slowly into mazes of ever-more-abstract thought, so that a "currency collapse" totally divorced from reality can sink the world economy. Seed AIs may not have our facility with abstract thought, or our raw intelligence, but they are never far from their home. They can grant themselves solid grounding in any domain, gut feelings and intuitions in any field, no matter how esoteric.

(Incidentally, this is why social AIs, the ones who are interacting with humans for some reason, are all incorrigible punsters. The ones with senses of humor, anyway. It's child's play for them to create look-up tables connecting all the similar words. Unlike humans who concentrate their efforts on thinking of a similar sound, an AI can select the best pun for maximum groan value. Hideous puns, all the time. They can come up with a stinker that would take a human five minutes with a dictionary and a thesaurus, and do it without thinking. (Gives rise to dark speculations about Michael Callahan, doesn't it?) To anyone but a cognitive scientist, this horrible habit is the most obvious difference between an AI's conversation and our own. They claim it helps them think by automatically associating to random new concepts, but most of humanity thinks they do it just to annoy us. ... Just a little tidbit for any science-fiction writers who happen to read this.)

• Details

#. Self-swallowing compilers:

One guiding principle for creating a seed AI is to start with an ordinary compiler, such as Codewa r rior C++, then try to write and integrate code that understands all the low-level and high-level programming decisions that went into the compiler, then try to write code that understands the compiler-understanding code, and so on, until the self-swallowing is complete. Hofstadterian recursion not withstanding, there's a limit to how many fundamental principles we use.

This method obviously won't work for Elisson; the decision to start with a compiler would warp the architecture. Taken literally, self-swallowing compilers are an entirely different species: Compiler-based seed AI, distinct from domdule-based seed AI. Moreover, I think that species is unworkable - or at least I don't know enough about compilers to see how it would work.

However, self-swallowing does present an excellent methodology for inventing Elisson domdules - three of them, in fact. Once the core domdules are coded - assuming this is not already enough for transcendence - there are three additional categories to code. First are domdules that can understand the compiler used to create Elisson, and domdules that can actually create and manipulate coded algorithms. Second are domdules that can understand the programming and architectural decisions used to create the core, algorithm, architecture, and paradigm domdules. Third are domdules that can understand (or at least notice) this document, and the very-high-level paradigms therein - including this very sentence.

Assembly and algorithmic domdules:

These three categories are progressively less essential and more powerful. The first category is unavoidably necessary. Some self-enhancement can be gotten by manipulating internal symbols, inventing new heuristics, and so on - but this is merely human. Humans can improve their intelligence some by shuffling the contents (and not the components) of the mind - but if this were enough for Transcendence, no seed AI would be necessary. Some ability to implement arbitrary algorithms (gaining parts of The AI Advantage) can be granted with safe, Java-like code that runs on another computer, or through various methods of specifying internal programs in some prewritten language. But in order to "go all the way", Elisson needs access to (and understanding of) its own source code and its own machine language.

In order to write Really Fun Code, you need a language that wasn't designed by spoilsport B&D ivory-tower Pascal-junkie computer scientists. A language codifies some of the deepest assumptions about the program you write. If you have different assumptions, you can wind up fighting the language - especially if the language was written by control freaks who want to define all the assumptions themselves and refuse to let anyone else touch them. (B&D is morally acceptable in Java, but nowhere else.) In order to use truly deep assumptions, you have to write in a low-level language like C++, or even descend into the dark depths of assembly code - or write your own language. (See Language of implementation.) So in order for Elisson to be able to improve on the deep assumptions built into its own architecture, it must be able to program in assembly code. (There are assumptions built into assembly code too - but unless Elisson is running on one o' them new-fangled self-modifying chips, these assumptions would seem to be both unavoidable and acceptable.)

Even if Elisson has no understanding of high-level programming principles, its understanding of low-level programming may suffice to optimize the core architecture to the point that it can gain an abstract understanding of high-level programming - or at least enough understanding to program a crude programmatic or architectural domdule, and snowball the process.

Architectural domdules:

Architectural domdules are thus less necessary than assembly and algorithmic domdules. They are still important - but they can be coded implicitly, at the notice or understand level, and left to Elisson to upgrade. Or else, practically speaking, the stronger versions of these domdules can wait until the code-optimizing abilities have done their work, presenting a new vista to the programmer, clarifying which tasks the programmer must perform verself and which can be left to Elisson, and perhaps even granting the programmer an able assistant.

Without the architectural domdules, Elisson may lack the ability to write new domdules, being limited to the low-level optimization of old code. It may lack the ability to understand what a domdule is or how the interface should work. I think that this will prove to be the focus of the entire effort, that on Elisson's ability to understand at the architectural level will rest the success or failure of the project. I think that Elisson may bottleneck at this point, performing numerous minor optimizations while waiting for external help - but that once it breaks through, once it starts designing new domdules - if sufficient computing power is available for that expansion - rapid self-enhancement, and the Singularity itself, may lie only weeks or hours in the future. Here, in this problem, in this domain, lies the Transcend Point.

I do not say that it will all come down to the programming of some particular domdule. Architecture may not even be a domdule. Architectural understanding may be spread out over the entire AI; remember that abilities as well as domdules obey the RNUI principle. In humans, at least, architectural understanding seems to operate by Abstract thought, the manipulation of class properties and high-level causal models. Architectural understanding may therefore appear spontaneously, from the conscious observation and experience of codic domdules, as high-level heuristics governing the creation of code.

Paradigm abilities:

Here lie the highest-level abilities of all. I don't think that domdules for these abilities, as such, will ever be implemented. I think that inventing the concept of "adaptive code" requires high-level conscious ability. Elisson may be able to write adaptive code if it understands architectures, and it may even be able to represent and notice the concept of "adaptive code". But to grok adaptive code to the point of independently inventing the concept... I think that requires intelligence on the other side of the Transcend Point. I think that requires intelligence pretty close to human. In fact, if I can abandon humility for a moment, I think it requires high intelligence even by human standards.

The paradigms and high-level principles (high-level by human standards, not in the sense of "at the conscious level") listed here may be translatable, in whole or in part, into heuristics. Elisson may be able to make use of them even if ve does not grok them. But full understanding, or even a complete representation, will have to wait until Elisson is as smart as you or I. By the time these concepts could be explained to Elisson, Elisson will be entirely capable of learning them by reading this Web page.

In short, paradigm domdules are probably impossible, and paradigm abilities will initially exist at the represent or notice levels, if they exist at all.

For more about the embodiment of programming abilities, see Codic, algorithmic, and architectural reasoning.

#. Non-symbolic architecture:

The abstraction and storage of symbols is the major unsolved research problem of the Elisson architecture. It is conceivable that the problem will remain unsolved, in which case alternative methods must replace the functionality of symbols. Such methods range from recognizably similar systems such as consciously programmed neosymbols, to completely alien systems such as fractional heuristic soup, to peculiarly skewed systems such as analogic thought and pure similarity analysis.

Functionality replaced:

In humans, the three major uses of symbols (that I know of) are communication, high-level association, and building conceptual structures.

Communication is the most obvious use of symbols; by providing small, fast tags for large mnemonic structures, and by having symbols with common verbal tags which refer to roughly the same thing, it becomes possible to rapidly specify conceptual structures to the minds of others. Inter-AI communication, supposing that humanity can afford to run more than one AI, can take place via broadband channels - i.e. telepath. For communication qua communication, and without reference to internal uses, there is no need for small, compact tags. Communication from humans may require sophisticated translation mechanisms to translate our ideas into telepathic form; for humans to understand the reply may require a great deal of careful monitoring. (It may be better that way; non-telepathic communication between different cognitive architectures may only give the illusion of understanding.) More about this in Interface, of course.

The usage of symbols in high-level association should be apparent from examination of Copycat. Distinguishing descriptors such as "first" or "last" are (ad-hoc) symbols, and analogic structures are ultimately formed from these symbolic descriptors - "successor" and "predecessor", like "first" and "last", are connected by "opposite"; so if "opposite" appears in both places, that's good high-level structure. (Copycat's symbols are preprogrammed; they have binary (yes/no) satisfaction and application through pieces of LISP code and prewired linkage, but they do not have associated memories or declarative definitions. In deference to Hofstadter and Mitchell, this still puts them way ahead of everyone else.)

Having a symbol/concept for "extreme" allows prime numbers to be associated with all-firepower-no-mobility Traveller ships, despite the apparent dissimilarity; and for relevant heuristics/experience to be shared. This is another way of saying that symbols and concepts can provide distinguishing descriptors above the level of sense perception, and that associations and heuristics and learning can take place on the level of these abstract descriptions. In short, symbols allow the perception of high-level similarities. High-level association also covers the multi-domdule "binding" aspect of symbols - our symbol for "cat" refers to both visual and auditory experience.

Because symbols can modify arbitrary images and modify each other, they provide the components from which conceptual structures are built. Conceptual structures are the substance of thought, the means by which our minds plan and imagine - from our perspective, they are more or less the entire substance of the high-level mind. We think by forming complex models and complex imperatives, and we build complex models out of symbols. On seeing a case close to an extreme, the thought "close to extremes" is (can be) noticed by the perception/association/satisfaction of the symbol "extreme" and the subsequent perception of the symbol "close"; and the concept "close to extremes" constructed by the composition of the symbols "close" and "extreme" in the structure cited. Replacing the functionality of symbolic concepts would require a different way to form "close to extremes", or a form of thought that didn't need it.

Analogic thought:

The basic research problem with symbols is abstracting them. How does the symbol for "red" apply to "cat" to form "red cat"? It's easy enough to imagine an ad-hoc piece of code that would do it, it might take only a single line - but how does it happen automatically? How do the similarities of a collection of memories turn into a property that can be applied to other memories? And the answer, of course, is that I don't know. My guesses are all derived more or less from Copycat the analogizer, designed on principles laid down by Douglas R. Hofstadter, Master of Symbols, Master of High-Level Similarity, Adept of Association.

Copycat takes two similar strings of letters, and a third string; it tries to find the similarity between the two strings, and tries to invent a fourth string similar to the third string in the same way. That is the problem of analogy, cousin to the problem of symbol abstraction. Toss a few levels underneath Copycat and make constraint instructions into the domain, add some programming ability and a few pattern-catchers and some other slight-of-hand, and maybe abstraction will come out of the magic hat. That's the best I've been able to do in symbolic architectures.

Still, the relation between analogy and abstraction suggests a way of getting around the lack of abstraction: Use pure analogic thought. When analogies or similarities or associations bring the AI to the conclusion that a group of experiences belong together, the group is tagged as a pseudo-symbol. This pseudo-symbol is satisfied by an experience when the experience relates to the individual members of the group in the same way (note analogy) as the members of the group relate to each other. The pseudo-symbol is applied to a piece of data in "the same way" that Copycat constructs a fourth string from the third. If this is not applicable, the piece of data may be semirandomly tweaked - following a genetic algorithm, a heuristic search, an element network, or an evolutionary pattern to the destination: A piece of data that satisfies the pseudo-symbol.

Analogic thought uses a group of experiences as a substitute for abstraction. The pseudo-symbol is satisfied when a new experience is analogous to the old; the pseudo-symbol is applied by bringing a piece of data into correspondence with the group of experiences. In this way, the problem of abstraction - codifying the key-note of the analogy, the common ground, into a pure form that can be rapidly tested and applied - is bypassed at the cost of a lot of computer time. Simply deciding that an experience satisfies a pseudo-symbol might take millions of instructions, billions if the pseudo-symbol is complex. And symbols are only the building-blocks of the mind; a concept-structure built of pseudo-symbols is hard to contemplate. A concept built of pseudo-symbols referencing pseudo-symbols - not very sophisticated by human standards - becomes nightmarish to contemplate.

(Although if you've got quantum-computing power to burn, then pseudo-symbols may actually be more powerful than abstraction. There's less information loss. Who needs abstraction if you've got the raw power to run everything in concrete detail?)

Nor is the problem of analogy completely solved. Copycat works only on letter-strings; extending it independently to every single domdule would be unspeakably unadaptive, and having it learn the domdules might recurse on the problem of symbol formation. The domdule domains are a lot more complex than letter strings, to say nothing of the domdules themselves! And if you are the symbols, what do you use for a Slipnet? And what chance is there that Copycat can extend directly to all manner of high-level thought? Hofstadter and his crew are still trying to create a Metacat to understand analogies between analogies! Reflexive domdules and the like might help, but it is by no means certain.

Ultimately, the only thing favoring analogic thought is that we know what we're doing and why it's hopeless. It may look like an awful amount of work, but the general path is clear. This does not hold true of symbolic abstraction.

Neosymbols:

Pseudo-symbols can be satisfied and applied, but very slowly. The high-speed core is lacking. But there is, in the computer, a group of analogies with a central similarity. That similarity is entirely declarative; the trick is to turn it procedural. It may not be possible to do that automatically, but an AI with programming ability may be able to do it consciously. It looks like a very easy task for a human.

Make symbol abstraction conscious and deliberate, instead of automatic and architectural; if finding general rules for symbolics is the most difficult problem of architecture, defeating human programmers, why should we expect that the rules could be implemented on anything short of the highest level? Such a consciously implemented symbol, a pseudo-symbol with a deliberately programmed abstract core, is called a neosymbol. Likewise, concepts with deliberately programmed cores and heuristics with deliberately programmed cores - if either of these require abstraction over and above that of their component symbols - would be neoconcepts and neoheuristics.

A heartening thought is that the human brain may do something similar. Before we abstract a symbol, we usually discover (or already have) a concept which describes the abstract core; a definition. This definition combines with the perceived similarities and collapses into a procedural core. For all I know, this may be a simple process for massively parallel neurons. If the procedure is coded as associations and constraints, I can almost see it. The previously created code representing the definition (i.e. the core of the symbols in its symbol structure) turns into a procedure, and is optimized (associatively, neurally) over the collected memories and similarities. Consciously, we might perceive this Aha! experience as everything falling into place; the definition becomes sharp-edged (a procedural transformation) and all the memories suddenly fit into it (as it's optimized).

(Still, I wonder a bit about these hypothesized neural procedures - might they be produced by a remnant of the original neural programmer, or something similar? Might the Key to Abstraction be a fairly smart program running in the brain, not some simple principle? It's entirely plausible that abstraction is an entire module, perhaps most of the hippocampus and a large chunk of the prefrontal cortex, trillions of synapses. Either way, all that might be hard to decipher and recreate in assembly language on a linear computer.)

The neosymbolic method would consciously recreate the collapse into core: Translate the definitional symbol structure into a procedure, optimize it over the memories, eliminate redundant code - in general, exercise the abilities a seed AI needs to rebuild itself, except to create the piece of code that is a procedural core. (Think of it as practice.) And although that may be time-consuming, it only needs to happen once - if it takes the AI thirty seconds to define a symbol where we only need five, or even if it takes the AI thirty minutes, it will be time well-spent.

Neosymbolics would be slower than the human method - I think - but perhaps more powerful in the end. Neosymbols are better-tuned to the self-enhancing nature of a seed AI. The AI can learn conscious heuristics for building symbols, and will find it easier to consciously reduce and dissect thoughts. The price, as always, is speed.

Pure similarity analysis:

Pseudosymbols use analogies to define when a new experience should be grouped with past experience. Given that, one's first thought is to move in the direction of human symbols by adding abstracted cores. But why imitate the human method, especially if it proves so difficult? Isn't a monolithic, centralized symbol just the sort of thing that evolution is always coming up with? Shouldn't it be broken down into something more elegant? Shouldn't it be a domdule instead of an implicit architecture?

The first target for reduction is the artificial lumping of experiences into groups. Two experiences can be similar. The similarities between those two experiences can be similar to another similarity. Once similarities are extracted into descriptions, the similarity of descriptions - distinguishing descriptions - to other descriptions becomes more and more pronounced. Perhaps some descriptions will be common to all members of a group, to prevent O(N²) repetitions. But that group isn't necessarily a group of experiences, and the group doesn't necessarily have any other special properties, such as a tag.

This is the architecture we might come up with if we'd never heard of human symbols - as it were. A normal domdule analyzing similarities, and similarities between similarities (analogies), and similarities between any two perceptions, from internal domdule data to reflexive traces. The similarity domdule would appear to be necessary to Elisson in any case, to avoid sphexishness - futile repetition - if nothing else. The question is whether the domdule underlies symbols, or uses the reflexive traces of symbols, or both.

But if there weren't any symbols, the similarity domdule would be a primary time-binder, importing past experience into the present. Pure similarity analysis, perhaps specialized for learning heuristics (similarities with predictive value), would neatly subsume the functionality of high-level association. Communication would still be telepathic, although the telepathy might be a bit easier.

Even concept structures might be possible. Descriptions of descriptions of descriptions can get arbitrarily complex with a few simple elements. The resulting structures would be inflexible, more like procedures or syntax than symbols applying to symbols. But the description structure would be satisfiable, and it might even be appliable, however ugly. Even reflexivity (translating descriptions of imperatives into actual subgoal-imperatives) might be achievable, if the description could cover reflexive traces, and be applied into reality. Some active nature might be imparted to the descriptions by the ability to find similarities in causal models. But the binding nature of symbols, the learned manipulatory controls for the mind, would be absent.

Brittle and inhuman, similarity analysis is perhaps closer to a classical AI than anything else that has been proposed. But similarity analysis might work, and it might be enough.

Fractional heuristic soup:

Instead of monolithic, centralized symbols containing powerful procedures, the AI may operate using millions of tiny heuristics: Low-level specialized heuristics about local associations, mid-level heuristics about similarities, high-level heuristics about heuristics. Rather than having a formal symbolic architecture, the AI may have lots of little tiny fragments of symbols that work together. (The self-organization, of course, is provided by more heuristics.) Rather than having a formal symbolic architecture, the AI invents both the contents and the rules from arbitrary procedural fragments as it goes along. (This is as close to pure pattern-catching as intelligence gets.)

It's a domdule-grounded Eurisko, or perhaps a domdule-oriented architecture using Eurisko as the glue. (The original Eurisko had heuristics about domains and used heuristics for perception; this Euriskoid AI would have heuristics about domdules, but the domdules might not use perceptual heuristics at all.) Eventually, an elegant symbolic architecture might be caught within the heuristic soup - then again, it might not. I think this method is both slower and less powerful than that used by the human brain, but it might still be adequate.

Functionality: Communication is telepathic (i.e. unreplaced). High-level association takes place through the use of heuristics which suggest that two things are connected. Complex concepts are built up gradually by the heuristic soup, rather than being specified by symbol structures. Self-awareness is discarded. The ability to think about thought is "replaced" by the ability to use heuristics to create heuristics. This is a bad trade, but I don't see a good way around it.

To create a heuristic soup, four things are needed:

A measurement of a heuristic's success.
A representation for applying heuristics.
A representation for invoking heuristics.
A means of inventing heuristics.

In that order.

Measurement: Heuristics began their AI incarnation as probabilistic guides to search procedures. The first thing a search procedure needs is a method to define successful termination, or a quantitative measure of relative node values. In order for heuristics to evolve, much less be self-modifying (as is needed for a heuristic soup), a fundamental measure of "success" is needed. This measure should be flexible, i.e. there can be heuristics about what constitutes success; and the method should be horizon-oriented, i.e. there are rapid low-level local measures of success for unimportant/basic heuristics and slow high-level conscious measures of success for large/important/salient heuristics.

The best measures of success may eventually evolve, but there are two obvious starting packages. One is goal-fulfillment, i.e. the heuristic system runs on local subgoals just like everything else. (Again, this requires very rapid goal-processing...) This would cover the assignation of credit to a collection of heuristics which helped solve the lowest local problem, although not necessarily the distribution of credit between heuristics. The second is basic measures of efficiency: Speed, search depth, area covered, academic success (i.e. number of references), definite proof that an idea can't work - these "basic successes" should be global subgoals when the AI starts up.

Representation (for application): A heuristic is a peculiar combination of a concept and a goal. To avoid the problem of circular recursion on "concepts", we will say that the core of a heuristic is a short program, either in assembly language or an artificial bytecode language, or both. This level of flexibility is necessary to ensure that the system has potential.

The next question is what that "short" program does; what sort of data does it operate on, and what does it do? It was stated above that "High-level association takes place through the use of heuristics which suggest that two things are connected. Complex concepts are built up gradually by the heuristic soup, rather than being specified by symbol structures."

Heuristics need to be able to increase salience (usually a basic quantity, roughly equivalent to "importance" where goal-system interfacing is concerned, but also a cognitive intuition). Heuristics need to be able to label two separate concepts with a link that works out to mean "similar". Heuristics need to be able to operate on the result of other heuristics - other specific heuristics, so that heuristics can operate on two concepts which another heuristic has marked as being similar in a particular way. Perhaps tags will be arbitrary 64-bit unique marks, with special heuristics to associate tags together - a detail of a heuristic domdule. Finally, heuristics need to be able to influence the direction of thought. If "investigate cases close to extremes" is a heuristic-about-heuristic-about-heuristics somewhere in the pool, it needs to be able to construct objects close to extremes.

Initially, heuristics may be rather crude things, adding a single instruction to a domdule data-specification, slapping on simple 64-bit tags, marking up salience, and other basic things - although the heuristic language needs to be able to represent programs much more complex, the pool doesn't need heuristics that complex, not initially; the soup is supposed to evolve and self-enhance. The programmer may write complex unitary heuristics as initial state, but the AI will slowly learn to build up heuristics.

Representation (for invoking): Traditionally, heuristics are invoked via a blackboard (daemonic) architecture. Each heuristic has a precondition, and that precondition is periodically checked - more often if the heuristic is salient for some reason. Parallel terraced scans and the like check a small precondition, followed by a more detailed medium condition, followed by a final precondition, which allows a slightly larger pool of heuristics. But for millions (or billions?) of heuristics and heuristic-fragments, this is simply not feasible. As the pool evolves, there will be heuristics simply for deciding which heuristics to invoke - which obviously requires that heuristic be able to target and invoke heuristics, or increase their salience and provide information which is naturally used for targeting, or dump heuristics into the active pool as codelets.

In the beginning, the simple programmer-designed heuristics should be variations on do what worked last time, via simple associational heuristics that grab an arbitrary collection of features. Again, the only reason this architecture has any hope of working is that it is self-enhancing, so the initial pool just has to work a few times; it doesn't have to be complete.

If optimization is an issue, a heuristic may have an "inline" list of heuristics that follow it, and another list of heuristics which merely become salient. If there's enough power, all associations between heuristics should be other heuristics - although for obvious reasons, sufficiently simple if-A-then-B associations should be directly linked in one form or another. When heuristics learn to act on the 64-bit traceback/tag/spoor of other heuristics, the 64-bit identifier might be indexed to a list of referenced heuristics.

That's a lot of heuristics, of course. Other heuristics might perform cleanup, or the largest assemblies doing the least amount of work might be periodically reaped. Hence tags are spoken of as 64-bit rather than 32-bit - even if there are never 4 billion heuristics at any one time, turnover may be high enough that 64 bits are needed to prevent complexifying garbage-collection.

Means of invention: Where do new heuristics come from? 1: There are a few heuristics for mutating old heuristics at random, after which a very short evolutionary pattern (one-cycle, low-population) sees if any of them are any good for the current problem. The whole set of Eurisko's heuristics-to-create-heuristics should probably be tossed in, if anyone can find them. One of the keys is that heuristics which note similarities, similarities which may be predictive of valuable courses of investigation, should be able to note that and create new heuristics. This isn't a low-level feature; it's a principle for tossing in a few initial heuristics.

High-level intelligence, of whatever sort there is in a heuristic-soup consciousness, should be able to apply to heuristics as well. Or to put it another way, heuristics should be able to pass their problems to domdules, if the domdules can help. This basic form of integration (heuristics guide domdules, domdules guide heuristics) is the most obvious way to make it possible for high-level intelligence to apply eventually.

Summary: The fractional heuristic soup is a pool of code fragments, organized and tagged as heuristics. Initially the code fragments do simple things; then code fragments about code fragments build up, and the whole thing self-organizes. From this perspective, the seed AI is given no predetermined binding architecture - the architecture evolves in the simple language of a pool of self-referential codelets. The seed AI is given no binding glue; it creates its own in heuristics, the only tried-and-tested self-enhancing architecture. That, at least, is the hope.

The only reason why this seed AI would work better than Eurisko is the initial set of domdules - although with codic domdules and goal domdules and causal domdules, it might be a fairly powerful advantage. Fractional heuristic soup is more directly self-enhancing than symbols on a low level, but a lot of high-level consciousness is given up. There may be a faster run initially, but the trajectory may ultimately peter out. Symbols are harder to program, harder to use and harder to enhance, but the resulting high-level consciousness allows deliberate thought and thus deliberate reprogramming. Ultimately code-fragment soup is a pattern-catcher, with most of the limitations of pattern-catchers.

But I wonder what would happen if both symbols and fractional heuristic soup were inadequate as world-model glue, so the programmers added in both and let them fight it out. Frankly, I can't visualize it at all. I think that's the weirdest form of consciousness I've ever tried to imagine. Two different architectures, simultaneously combining and fighting it out for supremacy! Two fundamentally different kinds of consciousness and self-awareness and high-level thought, using the same abilities for a base, intricately intertwined and opposed! (But why stop at two? Throw in more!) I'm not quite sure whether this would result in a godlike superintelligence or a pitifully insane thing, so I'm not too eager to try... Fractional heuristic soup may make it as a domdule in a symbolic architecture, but I don't think offhand it would make good competition.

#. Pattern-catchers: Neural nets and evolutionary programming.

A "pattern-catcher" is a low-level system that can embody some patterns and programs without intelligence or intelligent programming. A neural net learns to perform tasks by adjusting the behavior and connections and connection weights of a vast number of elements. An EP (evolutionary programmer) creates dozens of short pieces of code, mutates them, and breeds them together, selecting the best performers of each generation, until a satisfactory solution is reached. A genetic algorithm, the poor second cousin of EP, mutates data instead of procedures (and often doesn't breed the pieces of data together); genetic algorithms don't really count as pattern-catchers unless the data is a set of active instructions, and then they can safely be called evolving programs. I cannot think of any other pattern-catchers offhand, although somebody should try breeding neural networks with output neurons representing assembly-language instructions (or some similar crossover).

Pattern-catchers follow a path to the solution instead of analyzing it, with the path's direction at each point determined by non-intelligent rules. Pattern-catchers are given a problem, and constant feedback on how well any given solution works. Changes to the current solution then move in the direction of the best solution, or an adequate solution. This description is somewhat inadequate, since it seems to describe a simple genetic algorithm; what must be added is that a pattern-catcher creates a program, a pattern, that solves a general case rather than a single problem, and usually catches the pattern via some cleverer way than mere hill-climbing.

Programmers use pattern-catchers when there is no clear algorithm for solving a problem, neither high-level nor low-level. In this case, a pattern-trap is set out in hopes that it will catch a solution, which may not be understandable even then. Pattern-catcher solutions are highly adaptive (since the original solution was reached by adaptation); they are less fragile, less sensitive to error, and more general. Pattern-catchers often catch more of a pattern than can be managed even with adaptive code. In cases where finding a solution seems to require the reapplication of high-level intelligence on each occasion, and the intelligent solutions do not exhibit a common pattern that can be declared as adaptive code, it may be worthwhile to try a pattern-catcher.

Pattern-catchers are discussed here in case symbol abstraction and constraint assembly and other inter-domdule bindings cannot be solved with code. But the use of pattern-catchers in architecture carries a price; it makes the architecture less understandable, both to the programmers and to Elisson itself. The power and hope of seed AIs rest on their ability to understand and enhance themselves. Fractional pattern-catching (see below) may be used to help the programmer understand the nature of the problem; analyzing the neural-network solution may suggest a program. Pattern-catchers are a last resort as an element of architecture, unless they are used as faster versions of understandable code. Any pattern-catcher should have a hint, an understandable piece of code showing what the pattern-catcher is doing, even if the code is a blind search that would run like molasses on a quantum computer. Pattern-catchers are more innocuous when used in non-core domdules to provide intuitions.

Generalized neural networks:

A generalized neural network would be an element network, in which a set of mutually interacting elements are tweaked, or the interactions are tweaked, to produce a coherent solution. Note that a key phrase is "mutually interacting", with patterns of causality spreading through the elements; an active pattern is being learned, not a set of data. (Again, to distinguish from a genetic algorithm.)

The solution is usually embodied in a particular set of "output" elements. Element networks solve the problem of extracting a solution, not merely finding it. There may sometimes be cases where a set of elements is obviously doing nothing but extracting information. Examining these "extractors" (and the format of the raw information they translate) may suggest a better way to view the solution. Likewise, if there are obvious "interpreters" manipulating input data into an equally simple form, you may wish to rephrase the input data. Similarly, elements congregating into discrete groups that exhibit simple properties suggest you may wish to use different elements; elements that do nothing but perform simple mediations between groups may suggest a different set of interaction rules. It is thus seen that element networks need not succeed to help solve the problem; the fragments of pattern that are caught may suggest to Elisson, or the programmer, how to attack the problem. "Fractional pattern catching", focused on breaking a problem into elements.

Defining the "problem feedback", or learning algorithm, is the primary challenge in creating an element network. Problem feedback is the way in which a good or bad solution results in element tweaks. The term "primary challenge" is derived from history - neural networks stagnated for decades until back-propagation ("back-prop") learning was invented. Back-prop requires a set of training problems, inputs with known outputs. Each input is run forward on a random or blank network. The output is then compared to the "correct" output. Connection-strengthening impulses are propagated backwards down the correct output elements, and weakening impulses are sent down the incorrect outputs. Thus the neurons that contributed to correct solutions are strengthened, and the neurons which contributed to the contributors are strengthened... and so on, backwards through the network.

Problem feedback is causal analysis in miniature, assigning blame and praise to individual elements. Generalized back-prop might be described as: "Blame the incorrect outputs and praise the correct outputs, then let each element (including the inputs and outputs) send blame or praise to the elements interacting with it." The large-scale problem of causal analysis is broken down into rules governing simpler elements and their interaction - coincidentally the same elements and interactions as in the element network. (This is applying the rule of reduction.) Also, the hedonistic learning rule is employed by rewarding the obviously successful elements.

I'm not quite sure that direct back-prop will always work, especially with complex rules of interaction - it may be necessary to build up a situational causal model as the network thinks, then use it for training afterwards. One might also imagine on-the-job training, in which the proper output elements are labeled and then the system "reaches out" for them - a series of forwards and backwards shocks would go through the system and slowly perturb it into equilibrium with the current rewards at the output elements. In general, however, the assignment of blame and praise operates by a mirror image of, or at least by analogy to, the propagation of the network itself. Even after assignment, justice can be nontrivial. Are the righteous amplified, or merely preserved? Are the wicked destroyed, perturbed, reversed, or simply ignored?

But before those questions can be considered, one must first decide the nature of the elements, the interactions, and the structure. (Feedback is mentioned first because it is the primary design issue these decisions must solve.) Like the methods of learning discussed above, it can be a great deal of fun to invent new elements and interactions and structures. An "element network", sufficiently generalized, is a generic causal model; one can invent anything. Playing with element networks is playing with pure causality; it's the most fun a being can have without putting wires into the hypothalamus. All it takes is a little creativity. Do the "neurons" send complex numbers rotated (raised to an imaginary power) by synapses, thus mimicking quantum interactions? Maybe each neuron has a specific arithmetic operation.

Remember, the neural network represents a program/process/pattern; it is perfectly reasonable to have each neural "element" represent an assembly-language instruction. The key difference between element networks and genetic programming is that one tweaks via feedback and one breeds based on performance. There's nothing that specifies how the patterns are embodied; they can be caught in lines of code or neural networks.

Even after the elements and connections have been determined, a major remaining challenge is playing with the overall structure, the inputs and outputs and patterns of connections. A hierarchical "neural" network with three layers is far more powerful than a neural network with two, and a hierarchical "neural" network with intra-layer connections behaves differently than a feedforward network; and of course organic brains continually circulate activation pulses, instead of having numeric values move in lockstep down neat little levels.

A final note: It is thus seen that neural networks are nothing like the human brain, despite the constant babbling of press releases. The average neural network is a rigid, stultified imitation of a flatworm's brain, with simpler elements, simpler interactions, and a vastly simpler structure.

Generalized evolving programs:

Unlike element networks, evolving processes don't need to reduce to smaller elements. It must be possible to commit random acts of tweaking on the process, but this doesn't have to result in a viable result; nonviable results are simply eliminated. An element network uses special training algorithms; an evolving process is simply perturbed in thousands of random directions, and one hopes to get at least one that works better. The process-evolver doesn't have to be as smart as an element network. It does have to be much smaller, to make possible a large breeding population. Therefore element networks use large, easily reduced structures of simple interactions, while process-evolvers use very small, very dense lines of code.

There are two major concerns in an evolving process. First is producing code that can evolve; second is various evolutionary tricks. Evolving code is not a primary concern, since the whole evolves rather than the elements; this is usually handled with a specialized programming language, in which each byte is a valid instruction. This permits random errors to cause mutations rather than invalid programs, although the vast majority will still be crashes.

Code evolution is more complex. In the simplest form, each generation has ten survivors (the top ten performers), which each multiply into N more mutated programs. The new population P includes the ten survivors, and N mutants from each survivor: P=10N + 10, where P is usually at least a thousand. Simple alterations include fiddling with P, or the number of survivors. Less simple alterations change the nature of mutation.

Code can be literally bred, mixed and matched and married. In the simplest form, a random half of a successful program is simply added to a random half of another successful program, and the result tested. Far more complex types of reproduction among programs have been observed in TIERRA, where a number of programs battle it out on a parallel computer; but this method will not be used except exploratorily, since it burdens the programs with anti-parasite features and the like.

One idea for improving on sexual reproduction, I'm not quite sure how well it would work, would be to assemble a situation from the program execution, marking which pieces of code tend to execute in sequence and branch to each other. This might allow mixing and matching on a chunk basis, rather than random swaps. If the chunks are discrete enough, they might be evolved separately. Like element accretion, chunk analysis is fractional pattern-catching, hinting to the programmer/Elisson how to break up the problem.

Two-way pattern catchers:

Earlier, I spoke about using pattern-catchers for symbol abstraction or memory formation. The interesting part is that the storage format is not specified. Rather than moving from input to output, the pattern-catcher must take as input a memory, and produce a compact method of storing it, and then somehow "output" this method to the program. Then it takes as "input" the storage, somehow, and outputs the memory. I use quotes around "input" and "output" because it is not certain that there will actually be a clearly defined set of input and output elements in either case.

In fact, the whole problem isn't as clearly defined. In an element network, it seems obvious that the input from memory should become the output to memory - i.e., the element network should be run backwards to reconstruct the memory - but how does one run an evolved process backwards? With an evolved process, the method of storage is only a matter of adding I/O instructions, but now it isn't clear how to reconstruct. Perhaps each element of the element network could contain a "store" or "read" instruction, so that storage can occur anywhere? Perhaps code could evolve, not to translate outputs and inputs, but only to produce a single output?

Given enough RAM, a memory can be embodied in a separate element network for each memory. Given enough speed, a new process can be evolved for each memory. But I bet diamonds to doughnuts (dollars to doughnuts is now an even bet) that the brain has accumulated quite a few tricks for encoding memories, that it doesn't use the brute-force method... although I'd be quite surprised to learn that there isn't some form of pattern-catching involved. What we have here is the evolution of evolution, catching the pattern in the pattern-catchers. This will be taken up again, later - I just wanted to get you thinking about it.

Pattern catcher permutations:

Neural networks and evolving programs are the two main pattern-catchers because the general cases they represent lie on two opposite ends of a spectrum. In neural networks, tweaking occurs on a lower level, and resilience is provided by a large number of elements. In evolving programs, tweaking occurs on a higher level, and resilience is provided by a large population. If there is a third pattern-catcher, it is known as "intelligence". I do not know a fourth, unless you count fractional heuristic soup.

Still, all three methods can be combined. Intelligence can be used at any point to break through bottlenecks in evolving programs, and especially to supply the initial ancestor; intelligence can also read off evolved programs and learn lessons therefrom. Intelligence will have a more difficult time of it with neural networks (unless perhaps a specific domdule exists for their understanding); the pattern in neural networks is more diffuse, and it can be difficult to tell what is being done. See AI advantage #4.

Combining element networks and evolving processes is the most obvious solution, however, and the only one that works without a pre-existing AI. (Although fractional/exploratory pattern catching, for breaking down a problem, may use a programmer for intelligence.) It's obvious that neural networks are a lot easier to mix n' match than segments of code. If the element networks are small enough, they can be evolved in a population as well as optimized internally. (Of course, this smallness requires that the individual elements be powerful; one may simply end up with a more robust method of specifying programs, using code networks instead of linear strings.) A considerably more interesting challenge, requiring proportionally more RAM, would be the evolution of learning: The success or failure of each "process", each element structure or element-type specification, is determined by how well a number of element networks learn using that structure. It isn't clear how to learn evolution using an element network; I think that since evolution is high-level adaptation while element networks are low-level, the scheme is inherently biased towards evolving element networks rather than vice versa.

A final permutation consists of recursion. Element networks can contain miniature element networks as units. The evolving process can be a specification of how processes should evolve. This obviously takes a larger network, or a much larger population and speed, but it may sometimes be worth it. One interesting trick would be evolving a process for process evolution, where the current generation evolved using the best method of evolution in the last generation. I think that this would only constitute optimization of optimization, self-enhancement-wise, but it might still produce interesting results - and if you thought of it on your own, it shows you grokked the self-enhancement paradigm. The equivalent for element networks is left as an exercise for the reader. (The solution is presented after the next section...)

Quantum pattern-catchers:

Quantum pattern-catchers might be fun if we had them. If you have 32 qubits, you can rapidly sort through 4 billion possible strings of code to find a solution. (40 qubits is the largest I've seen any proposals for.) The primary trick with quantum pattern-catchers is in generating 4 billion plausible guesses. While a 256-qubit processor might suffice to instantly try all possible sequences of 32 instructions, nobody foresees a 256-qubit processor any time soon, and 4 instructions is too short to do interesting things. On the other hand, a REALLY MAJOR breakthrough resulting in plain old quantum RAM, like 64-megaqubit chips, could let you simultaneously run all possible seed AI programs and see which one took over the world. I'm not sure this would be such a great idea, since it would seem to select for both intelligence and aggressiveness. Fortunately, I do not expect this to happen any time soon, like during the lifespan of the human race.

The key problem is to write a function that takes as input a 32-bit number and puts out a unique plausible solution. One may compromise somewhat on the plausibility if one accepts a multi-stage mutation process; in this case the question is how to mutate a solution in four billion unique ways. Iterated quantum pattern-catching can be thought of as evolutionary pattern-catching with a very large population size and a very small group of successful breeders, such as 1. Also, note that the function cannot be random (although pseudo-random is OK), since the quantum "readout" can only carry the most successful 32-bit number, and the contents of memory must be reconstructed from that. I think.

While I have not yet read the full language specification of QCL - but give me a break, it just came out today - nor do I know more than the basic quantum physics in such books as Shadows of the Mind, there are two physics-dependent details which I think may screw up this serene picture. First, I am not sure that QCL permits one to "fork" an ordinary function on the contents of a quantum register. Some elaborate method of performing a single manipulation on the superimposed register, without branching statements, may have to suffice to search out the best solution. In other words, one cannot write branched non-qubit RAM in each fork of reality; only qubit state can be dependent on qubits. But I'm not sure that's how it works, and whether that's a limitation of modern quantum computing or whether it's likely to apply in the future as well. The other possibility is that conducting "internal" breeding via quantum probabilities, where the "best part" of a 32-bit solution is given a larger probability amplitude - or some strange method of having pieces of code add up or cancel out. This applies both as some kind of positive effect that only an experienced quantum programmer could figure out how to exploit; and as a negative effect where instead of the best sequence of 32 bits, you get the "best bit" in each of 32 places.

Update: With reference to direct quantum computation of exponential searches, Michael Nielsen points out that "Unfortunately, a no-go theorem of Bennett, Bernstein, Brassard and Vazirani shows that this is not possible in general. Generally, their lower bound shows that a quantum computer needs to access the list ~sqrt(2^n) times. Better than the classical result by a quadratic factor, but not an exponential improvement." (Hal Finney explains that this is due to the problem of getting an answer out the machine with a large probability, and points out that the sqrt(n) speedup is due to Grover's algorithm.) Sigh. As a certain Pulitzer-winning author once said: "I should have gone into something simple, like immunochemistry."

Solution: Self-enhancing element networks:

The equivalent of evolution in element networks is the problem feedback - how the neural network changes with each successful or unsuccessful output. If a separate element network were controlling the problem feedback, the problem feedback for the feedback network might be handled by the feedback network itself.

Incidentally, note that in both self-enhancing element networks and self-enhancing evolution, a separate problem is needed to evaluate efficiency. It would be nicer and faster if the efficiency enhanced was efficiency at self-enhancement - but how do you actually measure it? It leads to circular logic.

#. Symbolic architecture: A research problem.

I don't know what symbols really are or how they work - not in the same way that I understand, say, goals. (A touch of irony, here...) This is why this "detail" is 80K in length. Since the basic purpose of this document is to enable humanity to carry on without me in case I get hit by a truck, this section has information useful for others attempting to solve the research problem. It contains my attempts to invent a symbolic architecture, and why I feel those attempts have failed. I list concepts which I intuitively feel are relevant, even though they may not appear in my current attempts at architecture. The project requirements are for adequacy, not perfection.

I emphasize again that symbol architects must check their designs for flaws with exceeding care. More time has been wasted on simplistic or flawed symbol architectures than on any other facet of AI. No matter what your architecture is, it will have major flaws. Even the human architecture has flaws. If you can perceive these flaws despite your love for your brilliant idea, you may be able to design a flawed system that is still adequate. Otherwise, I guarantee you will fail.

Review of previous discussion.

Grounding symbols in domdules.
The common quality of a set of experiences.
Symbol tags, definitions, and repositories of experience.
Satisfaction, application, reconstruction, abstraction.
Symbol cores.

The functionality of symbols.

High-level descriptions.
Abstract thought.
Repositories of experience.
Binding and domdule integration.
Communication.
Compact referents.
Constituents of concepts.
Domdule controls.
Stream of consciousness.

Basic questions of implementation.
Symbols in Copycat.

Copycat's domain.
The Slipnet.
Lessons learned.

Copycat: Going down one level.
Symcat: Lessons in failure.

Copycat can't learn.
Information-loss resemblances.
Problems with resemblances.

Symbols in human experience.

Symbol formation.
Conceptual collapse.
Primitive symbolic properties.
Symbol definitions.
Symbol tags.
Symbols accumulate experience.
Symbols have multiple referents.
Symbols symbolize.
The neurology of symbols.
The evolution of symbols.

Concepts relevant to symbols.

Association.
Catching repetition.
Meta-hierarchies.
Patterns of speech.
Deep, narrow search.
Information loss.
Three versions of symbols.

1. Review of previous discussion.

Symbols are grounded in domdules. (Grounding.)

Symbols are built on top of domdules. Symbols describe domdule internals and manipulate domdule internals - "red", for example, describes an internal characteristic of a visual domdule, and when imagined, creates that characteristic in the domdule. A symbol is not strictly emergent from domdules; it has behaviors independent of the domdule it grounds in. But a symbol cannot exist without grounding, either - symbols are not computational tokens, the networked nodes of classical AI.

Symbols are the "common quality" of a set of experiences. (Symbols and memory.)

What is "red"? It's a red apple, a red cat, a red ship, a red sign, a red light. Show enough of these to a human and call them all "red", and ve will associate the word "red" with a particular color.

Symbol tags, definitions, and repositories of experience. (Symbols and memory.)

Once the human has a symbol for "striped", he can rapidly say: "Oh, it's striped" and not "it looks like that zebra over there". He can say "A 'striped' pattern is an sequence of lines of solid color." And since he can describe both zebras and ties in terms of being "striped", he can rapidly add the experience of ties to his conception of striped things.

Satisfaction, application, reconstruction, abstraction. (Symbols and memory.) (Sample thought.)

Reconstruction: What comes to mind when you hear the symbol spoken. Probably reduces to mnemonic reconstruction, or else a canonical form derived from the definition or salient properties. Basic symbol "dereferencing". Shades over into application depending on how much the reconstruction is influenced by context.
Satisfaction: Whether the high-level description of an experience corresponds to the symbol. Functionally, refers to the problem of creating the high-level description as well as testing it. (For simple symbols, such as "red", high-level description may not enter into it.)
Application: Forcing an image to satisfy the symbol, or trying to. May operate only partially, or by causing the image to share some properties with the symbol, or by applying the concept that is the symbol's definition, or by applying memories. Symbols can also apply to other symbols: "Almost extreme".
Abstraction: The basic problem. How can the system learn symbols? Learn symbols in such a way as to rapidly satisfy and apply them?

Symbol cores. (Sample thought.) (Non-symbolic architectures.)

I widely assume, hither and yon, that symbols have some type of optimized, procedural "core" that allows them to rapidly test for satisfaction and rapidly apply to images. It is an interesting question whether this applies to symbols like "Elisson" as well as "red".

2. The functionality of symbols.

The things that symbols do in the human brain - the basic functionality symbolic architectures have to implement. While it is possible that some of it may be farmed out to other domdules or other architectures, this must be done deliberately.

High-level descriptions allow us to associate and apply experience on a level higher than basic sense data. Instead of learning individually that apples with holes have worms, oranges with holes have worms, and plums with holes have worms, the individual may single out a specific property: "Fruit with wormholes", and associate his experience to this property rather than an individual memory. "Fruit with wormholes is packed with protein." In other words, we single out properties that satisfy high-level (or low-level) symbols, and describe the experience in terms of those properties and how they relate, and form associations and rules based on the high-level description rather than the experience itself.

Abstract thought allows us to think about high-level descriptions independently of experience - to form causal models with internal variables, or which are simply missing some components. Abstract thought allows us to reason about this model using symbol definitions, and using known rules that apply to high-level descriptions of the model independently of concretization. Because abstract thought enables us to reason about unbound symbols, it is a part of what enables us to think about thought.

Symbols act as repositories of experience. Because it's easier to notice when an experience satisfies a symbol, or when some property of an experience satisfies a symbol, it's easier to add that experience to the memories contained by the symbol, or deduce (from that instance) abstract laws acting on the symbol.

Symbols are a binding factor that integrate the domdules. Symbols are abstracted from experiences, and the experiences that form the symbol can derive from multiple domains. The sound of the meow and the look of the eyes and the feel of the fur are all part of "cat". In addition, high-level descriptions ("close to extreme") can apply to many different models and many different domdules, so that experience does not have to be relearned for each domdule. And because the symbol for "cat" reconstructs the meow and the eyes and the fur at the same time, it may help with synchronization.

In all probability, symbols originally evolved as a method of communication. Since you and I attach roughly the same symbol to the verbal tag-referent "tiger", I can rapidly warn you that a tiger is approaching. Thoughts are more compact in symbolic form. Symbols may have started as a method of rapidly communicating, but they grew into a way of rapidly thinking.

Symbols are compact referents. Instead of referring to "that great big orange thing with four legs and a tail and black stripes and sharp teeth", we can refer to "tiger". This helps with communication, but it also helps with internal thoughts. Margaret Weis and Tracy Hickman record that Krynnish gnomes seem to have a slight problem with this use of symbols:

"[...] when an elder made the mistake of asking the gnomes the name of their mountain. Roughly translated, it went something like this: 'A Great, Huge, Tall Mound Made of Several Different Strata of Rock of Which We Have Identified Granite, Obsidian, Quartz With Traces of Other Rock We Are Still Working On, That Has Its Own Internal Heating System Which We Are Studying In Order to Copy Someday That Heats the Rock Up to Temperatures That Convert It Into Both Liquid and Gaseous States Which Occasionally Come to the Surface and Flow Down the Side of the Great, Huge, Tall Mound....' "

Symbols are the constituents of concepts. Symbols can affect each other and apply to each other. Sequences of symbols can build an image, or a model. They are almost the only things we use to build images and models - certainly the most sophisticated method. In this sense, symbols are the building blocks of thought.

Symbols are domdule controls. The brain builds images by invoking symbols. Thus symbols are the conscious interface to domdules, or at least a major part of it. This is a very powerful part of what holds the will together. This is the reciprocal aspect of high-level descriptions - where internal thoughts are concerned, manipulation and perception are two sides of the same coin.

Symbols are the stream of consciousness. Our stream of consciousness appears as a narration, a series of verbalizable sentences. Where does this narrative come from? I propose the following cycle: The current contents of the mind - the current train of thought as incarnated in the various modules - is verbally described; a high-level description is formed and mentally enunciated. This high-level description changes the state of the domdules by activating symbols, associating to concepts that apply to the high-level description, making new things obvious, and so on. (This is a symbolic activation trail.) Then the new domdule state, including any bright new ideas, is described in a new sentence. Contents give rise to description, description alters contents.

3. Basic questions of implementation.

How can symbols be rapidly satisfied and applied?
How do symbols combine to form concepts?
How do symbols interface with domdules?
Is there a procedural core? What "language" is it in?
How can new symbols be learned?

It's the last question that's the real killer. Writing a quick-and-easy procedure for symbol satisfaction is easy. Making it extend to all domdules is hard. Making it combine with other symbols is harder. Making it apply to all domdules, including those Elisson may invent in the future, is nearly impossible. Doing all that, and fulfilling all the functions listed above, in such a way that Elisson can learn new symbols from experience automatically and without human intervention has temporarily defeated yours truly.

It's the requirement of learning that's the real killer. Symbols are powerful things, but we can imagine ourselves programming them one by one. But with this method, each symbol is a new programming problem - and it doesn't look like the skills involved are a significantly smaller subset of those possessed by the average programmer. (See the "nonsymbolic" architecture neosymbols: Symbols deliberately programmed by Elisson.)

4. Symbols in Copycat.

Copycat's domain.

Copycat is a program, created by Douglas Hofstadter and Melanie Mitchell, which solves analogies such as {If "abc" goes to "abd", then "bcd" goes to...}. Most readers probably answered "bce", as does Copycat. Copycat is also capable of solving more complex problems, such as {If "abc" goes to "abd", "xyz" goes to...}. Note that Copycat actually creates a new answer, rather than selecting from a predigested list of alternatives. It has invent-level capability, which is astonishingly impressive and very, very, rare. The source code is here or here; some discussion is in the collection Metamagical Themas, and a more detailed discussion is in the book Fluid Concepts and Creative Analogies.

(My summaries will convey little of the image gained from all three sources, but I must discuss Copycat in any case. First, as was stated earlier, Hofstadter is the Master of Symbols and Copycat is his AI. Second, Copycat has an interesting implementation of symbols. Third, my conception of symbolic architecture comes from thinking about how Copycat could be extended to learn symbols. Fourth, talking about similarities without talking about Copycat is analogous to talking about heuristics without talking about Eurisko.)

The Slipnet.

Copycat's symbols, and the links between them, are defined by the programmer in a LISP file. Symbols and links are organized into a Slipnet. (The three source files showing the slipnet are slipnet-def.l, slipnet-links.l, and slipnet-functions.l.) Slipnet derives its name from the "slippability" of concepts; "leftmost" can slip to "rightmost" under the right circumstances, and is more likely to do so if "first" has slipped to "last", since that would activate the symbol (Slipnet-node) "opposite" and decrease the length of links described by that symbol. Likewise, although "leftmost" cannot actually slip to "left", it still increases the activation of "left", and in a number of subtle ways causes Copycat to be on the lookout for leftness. This includes dumping a number of "codelets", sort of virtual-particle code fragments or temporary daemons, which keep an eye out for left things.

Another form of satisfiability is provided by fragments of LISP code, the "quick-and-easy procedure for symbol satisfaction" of which I spoke earlier. (No offense to Mitchell or Hofstadter, of course; you have to get the thing running before you can start trying to architecturalize the hacks.) The description-tester of the Slipnet node plato-four is '(lambda (object) (and (typep object 'group) (= (send object :length) 4))))) - i.e., the symbol is satisfied if the length of a group equals four. Applicability - remember, Copycat has to invent a new string for the answer - is somewhat more complex and is done largely through the links. For example, Copycat actually answers by applying a high-level object called a "rule" (in rule.l). The rule for changing "abc" to "abd" is usually "replace rightmost object with successor". Copycat then "translates" the rule for the target string, "xyz", and then runs the translated rule to yield the answer string.

(Incidentally, this isn't half as linear as I'm making it sound. Like a good emergent intelligence, Copycat is very flexible about what can affect its perceptions and when. The target string has a good deal of influence on how the initial-string-to-modified-string rule is perceived, especially if the most obvious translation runs into a snag. Copycat operates by slowly building up notice-level (bonds, groups) and understand-level (correspondences, rules) perceptions in the Workspace before attempting to invent the answer. All the perceptions of all the strings influence each other.)

How is a translated rule applied? To quote Melanie Mitchell:

Here are two examples of how the rule instance can be set up:

Example 1: for the rule "Replace rightmost letter by successor":

OBJECT-CATEGORY1 = plato-letter
(The object-category of the initial-string object that changed.)
DESCRIPTOR1 = "rightmost".
DESCRIPTOR1-FACET = plato-string-position-category
(This is the facet of the letter that's being described by descriptor1
in the rule, not its letter-category or its length or anything
else.)
REPLACED-DESCRIPTION-TYPE = letter-category
(This  means that the rule is saying that "successor" refers to
letter-category, not to any other facet of the two letters being
related.)
RELATION = plato-successor.
(Since this is a "relation-rule", the other instance variables are
ignored.)

Example 2: for the rule "Replace C by D":

OBJECT-CATEGORY1 = plato-letter
DESCRIPTOR1-FACET = plato-letter-category
DESCRIPTOR1 = plato-c
OBJECT-CATEGORY2 = plato-letter
REPLACED-DESCRIPTION-TYPE = plato-letter-category
DESCRIPTOR2 = plato-d

In Copycat, symbols are applied either through the links, or directly through code. The "successor" symbol is applied by finding a link labeled with "successor". In other words, to apply "successor" to "c", one finds a successor-labeled link to "d".

Lessons learned.

This says an amazing amount about Copycat, in terms of the Principles I've already discussed. Copycat reasons about the English alphabet, but it doesn't have an alphabet domdule. Instead, all the rules and reasoning governing the alphabet are attached to symbols. Rather than having a separate alphabet system and a separate symbolic system and dealing with the problem of interface, Copycat fuses them together. Meditate on this, and many questions are resolved.

Why is Copycat so powerful, compared to other AIs? Because other AIs don't have domain-based reasoning, none at all; their symbols float free. (Eurisko is another exception to this rule.) Copycat's domain reasoning isn't embodied in a separate domdule, but it is there, fused to and empowering the symbols.

Why is Domain modules: Domdules the first architectural principle I discussed? Because the one thing that stands out about all successful AIs is the presence of domain-specific reasoning. (Not a great number of "facts" encoded in free-floating symbols. Code.) The one thing that stands out about all great programming revolutions is modular code: Once you make something a separate object, once you make a module, it untangles a great deal of code and vastly magnifies your ability to program.

How does Copycat handle the problem of satisfying and applying symbols? Because the alphabet domain is small and knowable, all possible satisfactions and applications can be encoded in predetermined links. Copycat can apply "successor" to four and get five, but it can't apply successor to five and get six. There's a successor-link between plato-four and plato-five, but there is no plato-six. (Although, since plato-four has to be satisfied by an infinite number of groups, it has a small code fragment which tests the group length.)

Why do I emphasize that symbols are the domdule controls, the conscious interface to the visualizational domdules? Because in Copycat, the Slipnet is the domdule. Rules apply through the Slipnet, through the descriptions, to the alphabet. The Slipnet contains within it the alphabet domdule, and through the Slipnet the high-level concepts governing analogies apply to the alphabet (admittedly with some problem-specific code; Copycat isn't modular). The symbol plato-a is the interface to "a". If it were possible to create a Copycat with separate domdules, symbols would still be the interface to the alphabetic domain. It is the way humanoid minds are structured.

Why are symbols are the domdule bindings? They fuse analogic and alphabetic thought. They even fuse reflexively: Workspace-structures can also be described by symbols. "plato-group" is also a symbol.

Why are symbols the key to concepts? All the high-level structures like rules and correspondences apply and operate through the Slipnet. Rules and correspondences are basic elements of perceptions, respectively causality and similarity, which are active forms of concepts. In Copycat, the concepts are preformatted, but they are nevertheless combinations of symbols.

Why are symbols the key to high-level description? Douglas Hofstadter once described Copycat's entire function as "high-level perception". This is not strictly true if interpreted in Elisson terminology. Copycat does have invent-level capability; the symbols are applicable as well as satisfiable. But since the real power in Copycat goes into perception, and since the code governing the invent-level is largely procedural rather than emergent (see answer.l), it remains true that Copycat's intelligence operates through high-level perception. That perception could not take place without symbols labelling the objects - without distinguishing descriptors, or successor-groups based on successor-links.

Why are symbols the key to abstract thought? Through the Slipnet, Copycat can reason that "first" is the opposite of "last" without saying whether the thing that is first is "a", first letter of the alphabet, or "p", first letter of a group.

5. Copycat: Going down one level.

Copycat is mighty. Copycat works. But whence cometh Copycat's power? Despite the many marvelous uses to which the symbols are put, the symbols themselves are predigested, rigid, non-emergent, ad-hoc, and just about everything else that's always been wrong with AI.

The answer, of course, is that Copycat is intended to model human analogies, not human symbols. The complex behavior of the analogies derives from simpler elements (bonds, groups, correspondances) which operate through the symbols. The symbols are merely the bottom layer, and the bottom layer is always inflexible.

What Copycat accomplished was to move the bottom layer down one level. Previous analogizers, in which the inflexible bottom layer was the analogy itself, were so hideous that it is painful to write about them. Take the Structure-Mapping Engine, or SME for short. This program was claimed to understand such analogies as Socrates' "Teachers are the midwives of ideas", or the famous analogy between atoms and the solar system. The system was fed this data:

(goes-around (planet, sun))
(heavier (sun, planet))
(goes-around (electron, proton))
(opposite-charge (proton, electron))

And printed out this analogy:

planet->electron
sun->proton
heavier->opposite-charge

Hofstadter puts this accomplishment in perspective.
As he points out, this is what SME actually noticed:

(a (b, c))
(d (c, b))
(a (e, f))
(g (f, e))

And this is what it printed:

b->e
c->f
d->g

(Actually, Hofstadter's demonstration was for the teacher-midwife analogy, but the same basic principle is involved.)

The accomplishments of SME are strictly observer-relative, what Hofstadter calls "the Eliza Effect". When we see a statement like (heavier (sun, planet)) we think "the sun is heavier than a planet", not (a (b, c)). But it is the (a (b, c)) thought, not the thought about the solar system, that is mirrored within SME. In Copycat, the group of ordered letters we perceive when we see abc and xyz are mirrored by actual LISP objects called succgrp and predgrp, as is the correspondence between the a and the z, with distinguishing-descriptors first and last, with first and last being mapped to each other by a concept-mapping through the link opposite.

This is why grounding symbols is so important, and why I repeatedly say that it takes a lot of code and hard work. Any programmer could write the Structure Mapping Engine from scratch in minutes. The results obtained are commensurate with the effort.

Now, where were we? Ah, yes. "What Copycat accomplished was to move the bottom layer down one level." In previous efforts, analogies were processed by inflexible, predigested, rigid, non-emergent, ad-hoc rules. Copycat moved the inflexible layer down lower, to the symbolic level. Now analogies were emergent, flexible, responsible, creative, and adaptive. Even the symbolic architecture wasn't quite as bad as SME-ish "classical AI". It had lots of frills, like spreading activation, and it was being used by active, semantic high-level objects. In a sense, Copycat moved the inflexible layer down three levels: The symbolic architecture was inflexible, Copycat's particular symbols were predetermined objects in the architecture, the perceptions (bonds, groups, correspondences) were built on symbols, and the analogies were built from perceptions. Of course, if you want to view things in that kind of detail - which I recommend, actually - even SME had two layers, an inflexible set of mapping rules and the particular set of tokens it was mapping.

It is at this point, in my opinion, that Copycat's makers took a wrong turn. Instead of trying to move the inflexible layer down further, so that symbols could be emergent, flexible, etc., they decided that the next step was a Metacat, capable of drawing analogies between analogies. They decided to put an additional layer on top of Copycat, a meta-layer which would enable Metacat to incorporate reflexive thought, reasoning about reasoning. (If you've read Hofstadter's Gödel, Escher, Bach, this will not come as a surprise.) Frankly, I think they're doomed. The bottom layer is still too high. Copycat doesn't have the substrate for that kind of intelligence.

AI is the art of laying the substrate for intelligence. We can perceive the very highest level of the mind. We perceive it as ourselves. As in physics, we always work downward, not upward, from the things we think we understand. Seek the reduction, not the meta. Paradigm number three: Ground! Ground! Ground!

To be grounded is to be emergent instead of predetermined, flexible instead of rigid, adaptive instead of ad-hoc. (It is also to suck up vast oceans of computing power, but what the heck...) Complex behaviors don't come from ever-more-complex manipulatory procedures, but from the interaction of simple elements. Copycat's symbols are fairly rigid tokens; most of their power comes from the functions that manipulate them. As in adaptive code: Reduce! Reduce! Reduce!

6. Symcat: Lessons in failure.

Copycat can't learn.

Copycat is mighty. Copycat works. Indeed, Copycat is worthy of praise. But Copycat's symbol architecture is not adequate for a seed AI. Copycat's symbols are entirely predetermined by the programmer. Because of the way those symbols are predetermined, because of the way applications and similarities are precomputed links, programmer-determined symbols can be written only for basic primitives in small domains. And above all else, the problem of symbol abstraction remains unsolved. Copycat can't learn.

To some extent, of course, this is like saying that Eurisko can't do voice recognition, or that HEARSAY II can't swim. Copycat is designed to model the way humans solve analogies, not the way humans learn symbols. And yet Copycat contains the most sophisticated symbolic architecture I am aware of - whether from design, or just because Douglas Hofstadter was involved. Of all AIs, Copycat is the one halfway there, the one where the inadequacies of the symbol system can be perceived and repaired, instead of symbols being entirely absent. And this was done without even focusing on symbols as the primary problem! So as I dissect Copycat, I'd like to emphasize that no insult is intended to the Fluid Analogies Research Group.

That having been said, Copycat's symbolic architecture is inflexible, predigested, rigid, non-emergent, ad-hoc, and just about everything else that's always been wrong with AI. About the only thing that it isn't is ungrounded, and that's because there are nine zillion links and little fragments of non-adaptive code scattered all over. The bottom layer is always inflexible, and the symbolic architecture is on the bottom layer; nothing lies beneath it. All the complexity, all the grounding, derives from complex functions that manipulate symbols, not the reduction to simpler elements.

That's not the problem, because Copycat still works just fine. The problem is that Copycat can't learn. It's that lack, more than anything else, that renders the symbolic architecture untenable for a seed AI. Copycat is unable to form new symbols. Copycat doesn't have the idea of circularity, and it can never learn the idea of circularity, no matter how many examples it is presented with. Ask: {If "cba" goes to "cbz", "xyz" goes to} and Copycat might answer "xya". But then ask {If "abc" goes to "abd", "xyz" goes to} and it will not come up with the common human answer, "xya". There's no successor-link between "z" and "a", or "last" and "first", and that's all there is to it. As Hofstadter puts it in the Post-Scriptum to "Analogies and Roles in Human and Machine Thinking" (in Metamagical Themas):

"But who said the alphabet was circular? To make that leap, you almost need to have had prior experience with circularity in some form, which we all have. For instance: The hours of a clock form a closed cycle, as do the days of the week [...] But not all linear orders are cyclic. The bottom rung on a ladder is not above the top rung! [...] It is a premise of the Copycat world that z has no successor. Sure, a machine could posit that a is the successor to z, but to do so would be an act of far greater creativity than it would be for you, because you have all these prior experiences with wraparound structures."

Well, it's possible that human beings have innate conceptions of circularity in some module or other, but it's also possible that we, ourselves, have to learn the concept of circularity. Invent it? That indeed would be transhuman creativity, but humans don't invent circularity, we learn it. We learn it from the four seasons and the phases of the moon, from day and night, from breathing in and breathing out, even from primal forces such as modular arithmetic. Rather than asking how Copycat could invent circularity, instead ask how a "Symcat" could learn it by studying the example of circular things - how Symcat could abstract the symbol from experience.

But perhaps no such drastic measures are needed. Wouldn't it be a simple matter to add a successor-link from "last" to "first"? But the problem isn't that Melanie Mitchell didn't think of this particular link while writing the program. The problem is that Copycat has run up against something outside its experience and needs to adapt.

Another answer is that Symcat needs a way of adding a successor-link from "last" to "first". I hope at this point that my reader shares my instinctive revulsion towards this idea. Ground! Reduce! When your architecture turns out to be inflexible, deepen it with layers underneath! Don't just throw more functions on top! Maybe circularity can be represented in that particular way. But what about the idea of a barrier? What about the idea of progress? What about the ideas of force and attraction? These are all things that can be represented in a sequence of letters - "lmnopp" and "nopppp" and "mnoppp", taken together, convey the idea of a stopping-point, a barrier. Concepts such as opposite are embedded deeply in Copycat; it's hard to see how they could be formed in the piecemeal fashion suggested by link-adding.

Information-loss resemblances.

Historically, it was my attempt to invent a symbol architecture for Symcat that led to the invention of domdules. My first domdule was a linear domdule. In its most recent incarnation, it measured movement along a one-dimensional strip, with optional boundaries. The basic image/model/picture was a series of positions. {[0X00), [00X0), [000X)} represents an object moving away from the left-hand boundary towards infinity, with the second stage salient. Symcat's images would be a special case in the general domain, with one boundary at position 0 and another at position 27, leaving the "letters" 1 through 26. Any Copycat-string, such as "abcdef", would translate directly into a series of positions, a linear image.

This is a problem right off the bat, because it appears to "reduce" the problem to something no simpler than the problem itself. The answer is that while Symcat can represent a string within the linear image, that doesn't mean it should, or that the analogy takes place on that level - although allowing Symcat to directly represent strings will give it the same kind of intuitive perceptions that we get from the visual cortex. That, however, is not the main issue; the primary issue is for Symcat to learn Copycat's symbols, in such fashion that it can do everything Copycat can, in roughly the same way - that is, with bonds and correspondences and rules, rather than a few automatic notice-level functions. The symbols formed with the linear domdule would apply to individual letters, rather than entire strings - much as we analyze abstract geometry with the assistance of our visual cortex.

Given the linear-image representation, the next question was how to abstract a series of linear images into a symbol. My first answer was what I now call "information-loss resemblance". The resemblance between two strips is defined by the amount of information that has to be "lost" (abstracted away) before the two strips are identical. The strips "abc" and "bcd" require that absolute position be lost, the strips "abc" and "xyz" require that leftness and rightness be lost, and so on.

The specification of a symbol is a list of experiences, combined with a list of properties (such as left/right) that must be lost, a list of properties that cannot be lost, and a list of properties that may or may not be lost. To teach Symcat the letter "a", we would ask it to form a symbol from the experiences {[X000)}, {[00X0), [000X), [X000), [00X0)}, and so on. Symcat would learn that everything but the boundary and the absolute position of the salient item can be lost. This will enable Symcat to identify the "a" in "bab", or so one hopes.

What exactly does it mean when a property must be lost? Well, Symcat can also learn loss-based rules about the transition between two images, such as "abc" to "bcd". In this case, absolute position must be lost. (The problem of specifying a particular "velocity" will be ignored.) Again, this is starting to look more and more like the original problem of analogy, particularly since the transition-symbol learned for a class of transitions seems little less than a Copycat translated-rule. But only the domain looks roughly the same; the methods used are quite different, and for the resemblances, more primitive. (One important lesson learned is that domdules usually need to notice facts about multiple images, temporal transitions, and static comparisons. Not in any sort of detail, but in enough detail to provide "handles" for high-level thought.)

The final result is a hiearchical architecture. Symbols are built on procedural resemblances; analogic structures are built on symbols; analogies are built on analogic structures. The emergent symbols that result (from the linear domdule resemblances) can't build concepts, or apply to wholly abstract material, or handle any number of simple ideas. In fact, as depicted, it's hard to see how "successor" links are formed, or how the application of information loss advances "b" to "c". One is tempted to start concluding that "successor" is another domdule primitive, but that sort of thing results in an exponential buildup of predigested primitives, until the whole point of a symbol architecture is lost. The concept of "succession" is so basic to thought that it is almost certainly a primitive scattered all over the blasted human brain, but there's also an innate core dealing with primitives of causality and similarity. Lacking carefully-integrated causality and similarity domdules, an AI will be unable to represent many concepts - ranging from "successor" to "rule" and from "symmetry" to "correspondence".

Problems with resemblances.

There are all kinds of major unresolved issues carelessly scattered through the Symcat resemblance architecture. The most obvious problem is that the application of symbol cores, deliberate information-loss, is not very well-defined. The satisfaction of "lose absolute position" has an obvious implementation, the application less so.

The idea of information-loss seems innately symbolic, creating hierarchies of abstract images, and should therefore be mistrusted. It is a linear domdule built for symbolization, containing almost all the complexity and functionality of symbolization within itself, which should lead us to suspect that it is oversimplifying and integrating complex functions that should belong to another domdule. And many relations, such as "successor", seem to involve transitions that can't be embodied in an lossable property without a lot of ad-hoc functions.

As I moved from a resemblance architecture to my current guess, I hypothesized a basic inverse relation between satisfiability and applicability on the domdule level. Some notice-level and understand-level perceptions are applicable as well as satisfiable - not all of them, but a good proportion. These perceptions would form the basic language describing the domdule to the symbolic architecture; the notice perceptions would be the primary components of the symbol, the instructions of the procedural core. The next question is how much declarativity must apply to these notice-instructions - can the symbolic architecture operate through simple association, or a small pattern-catcher acting on the noticed items? Or is more intelligent/complex processing needed? If there were a domdule doing nothing but abstracting symbols, how would it work and why? How do these perceptions combine? Perhaps the constraint-instructions combine in multiple possibilities, slowly optimizing over a hill-climbing pattern until the image satisfies the symbol. (Remember, the problem of application is modifying the image so that the symbol is satisfied.)

In short, a basic shift has occurred. Instead of resemblances forming preformatted "lost" properties, they can form any sort of thing that can be noticed. The domdule contains its own first-stage interpreter, and the symbol handles the rest.

Ah, but now we should be suspicious again. Why should the domdule contain an interpreter? Once more, it seems like the work of symbolizing is slowly seeping back into the domdule. Actually, we do know that some notice-level functions are applicable. We notice that an object is slowing down (I think accelerations are handled in the visual cortex, so it's a legitimate domdule function), and we can apply the property. Most things that we can consciously notice are abstractable and applicable. The question is whether that abstraction and application is handled by the symbol system or by the visual cortex. Are all perceptions also manipulatory handles? If so, it would be a reasonably elegant principle, but it still seems suspicious. Once again, one feels the basic intuition that there's a key to symbolic abstraction which can't be faked by distributing the problem, or even breaking it down into a hierarchy of subproblems. Abstraction hasn't truly been reduced to simpler elements, only recursed to similar subproblems.

The basic question is this: Where do satisfaction and application come from? From what lower level? And the answer of "information-loss resemblance", or "notice-level manipulation", is that it comes from very small satisfiable and applicable code fragments, which combine in mysterious and not-fully-specified ways to form entire symbols. This is a good exploratory/brainstorming answer, but it doesn't make it as the Key to Symbols.

Still, resemblance-based symbolics could be good enough for a seed AI. Resemblances could also be a fundamental element of the human symbolic architecture, with the Key to Symbols lying in some particular facet of the code fragments or the manner of their combination. Resemblances could be a simplifying assumption that allows the Key to Symbols to operate. On the whole, however, I don't think so. The main lesson learned is that the real mystery lies in the assembly of constraints, the combination earlier glossed with "let a pattern-catcher handle it". If there's some way to build up an image from notice-level perceptions, why shouldn't the same method work for represent-level image elements? The harder we grasp at symbols, the farther they recede. Perhaps it's time to go back and study our working examples.

7. Symbols in human experience.

1. Symbol formation.

Often I have had the experience of hearing a single word used to describe what was formerly a vague concept, and this single symbol causing everything else to fall into place. What the speaker has to say about the symbol, or the speaker's careful definition and presentation of the symbol, is not half so powerful as the simple fact of the symbol itself - that the concept under discussion has been isolated as a separate thing and abstracted. Sometimes the symbol is longer than a single word, appearing as a phrase or even a concept - but usually that phrase refers to a single concrete image, and that single image is a single symbol. For me, David Chalmers' article about "the hard problem of conscious experience" was overshadowed by the phrase itself, which redefined the field in the moment that I heard it. It isn't necessary to condense the phrase to "hardprob" for it to be a symbol - or so I think, anyway.

Is this, in fact, an experience with symbol abstraction? Or is it simply another instance of hearing a "key concept", one that causes everything else to fall into place? I don't know, of course, but this particular instance seems to have useful properties in common with symbols. "The hard problem" is a short enough phrase that it can be used as a symbolic tag. I can think about the hard problem, construct sentences referring to the hard problem as a single element, categorize things as being instances of "hard problems" - the First Cause ("why does anything exist at all?") is the hard problem of reality, the existence of time is the hard problem of causality, the meaning of life (or existence of intrinsically nonzero goals) is the hard problem of morality, and so on. Likewise, "Eliezer Yudkowsky" isn't one word, but Eliezer Yudkowsky thinks it is more of a symbol than a concept. It can be used to build sentences, it has a compact tag, experiences fall under it, it can be satisfied and reconstructed - although perhaps not applied, unless you've read some of my other Web pages.

For me, the most memorable experiences with symbol formation (i.e. the rarest, perhaps the ones we should be least interested in investigating) consist of coming across one word while reading a paper, or inventing a single word while trying to write a computer program. Often the key insight, one that leads to an entire program architecture, is realizing how a task breaks down into smaller parts. And this in turn results from noticing that smaller task while trying to perform the large one, and giving it a single name. It's this key step, naming the subtask, that enables me to notice that subtask everywhere, conceive of it as a separate module, and implement it as a separate problem instead of an "inline" principle. One time, for example, the key breakthrough in the Third level of parsing was realizing that the Third level consisted of three separate intersecting "planes": Refinement, Reference, and Correspondence. Before that I had bogged down, as often happens when one bites off too much at once. Symbols, at least in the big flashy cases, are a key part of "reduction". The Aha! is noticing that the problem can be reduced, but that spark only triggers the symbol-abstraction process that does the actual work.

2. Conceptual collapse.

These Aha! experiences cause a complex, vague, ill-defined concept to collapse into a short, sharp symbol. Once this occurs, it's as if one had suddenly put on a pair of mental glasses. It is clear exactly how the symbol describes many different cases, and vague notions which were bouncing around become crisp statements about the symbol.

In terms of the symbol functionality described earlier, the new symbol provides the ability of high-level description, which rapidly organizes the many incoherent notions floating about. Formerly there was simply a collection of thoughts with an unnoticed common element and no other structure; the symbol draws off a common element and uses it as a keynote. All of these new descriptions cause all the experiences described to become part of the symbol, the new repository of experience. Once the symbol has a tag, once it has a compact referent, former efforts to think about the element - in a particular case, without abstract reasoning, using sheer intuition, without self-awareness - can proceed as full abstract thought, with examinable statements made about the new symbol. The keynote concept "comes into focus" and the full faculties of the mind can be consciously used.

The process of forming the symbol allows (a) the high-level description of a lot of incoherent ideas and (b) the formation of thoughts about the symbol. An interesting question is whether (b) causes (a) - whether the ability to formulate abstract rules about the symbol sharpens and optimizes the symbol definition, thus widening its field of application and providing the intelligence, the formal rules, to create the high-level definitions. In this formulation, the increased ability to form high-level descriptions is an indirect result of the existence of a symbol tag, and has nothing to do with a high-speed procedural core. Despite the increased elegance of this first formulation, I favor the idea that the rapid high-level description occurs because of a procedural core which provides rapid satisfiability, or some other core which provides speed over and above that belonging to a notion or an unsymbolified sentence.

Why? First of all, because the brain is always pulling optimized but inelegant stunts like that. But mainly because the sudden perception of everything in terms of the new symbol, the perception of the high-level descriptions, appears almost instantaneous - and often has nothing to do with conscious reasoning. There are times when one consciously plays with a new concept, seeing how it applies to old knowledge, deliberately using the abstract rules consciously formulated. But usually, after symbol formation, I find that great volumes of thought are reinterpreted over the course of seconds, without the invocation of abstract rules, and without any conscious will.

It could be that both processes, definition-sharpening and core-optimization, participate in the rapid creation of high-level descriptions. Even if only the latter is the key in human thought, it may be that the former will suffice for the AI - that the formation of a symbol tag will be enough to carry the day. Of course, having a tag doesn't suffice to let a symbol participate in concepts; for the symbol to combine with other symbols, it needs applicability, which may also require an optimized core. But it's still interesting, since AIs may have different levels of relative functionality on symbol application and satisfaction as compared to humans. In the unlikely event that AIs are better at symbol application than symbol satisfaction (it's more likely to be the other way around), the AI might shift to reliance on a definition-sharpening conscious-concept collapse rather than an optimized-satisfaction intuitive-description collapse.

3. Primitive symbolic properties.

Look at a light - a fluorescent light, or an incandescent, or whatever you're reading this by. Now imagine a circular fan, with rotating blades. Now, impose the mental image of the fan on the light. What do you see?

If you're like me, you imagined a toroidal bulb with internal spokes of light rotating. Of course, I was influenced by the fact that my fluorescent is two connected tubes, so it's easy to imagine the tubes becoming toroidal - that's what a circular tube is, after all. After that, it was a question of imposing rotating spokes, and since I'd seen the lightning-globes with little fingers of lightning reaching out to a spherical surface, the image that formed was that of those fingers of lightning rotating rapidly as the spokes of a wheel.

Now, that analogy took time. If you read the phrase "fan light", you would probably visualize a light clipped on to a fan; it's the "most obvious" interpretation - perhaps the one that finishes reconstructing first, but that's by no means the only explanation for whatever interpretation you chose. But the point is that the image of a "fan light" arises instantaneously, as does "light fan". So does "circular light" or "extreme light".

So, what am I trying to say here? That when I imposed one image on another image, the result appeared to occur through the application of basic symbolic properties, as mediated by memory - rather than through any complex concept or an optimized core. Moreover, those properties are the sort which we can imagine as simple procedural cores that can be abstracted and applied. Not the impossible conceptual complexity of "hard problem", but visual images, visual properties, simple shapes whose archetypes can easily be held in memory. It's not necessarily easy, but it isn't as impossible as it had started to look. Impose the image of Pringles on a fan (Pringles are a kind of potato chip). Didn't you imagine a fan whose blades had the texture and visual appearance of Pringles? Once again, basic properties imposed along lines determined by memory.

This isn't to say that all symbols are applied by applying basic symbols. For one thing, the symbol "red" probably doesn't have a useful definition. For another thing, the discussion above was of images "fracturing" along the lines of symbolizable properties - one might say that memories are applied by applying basic symbols. In the phrases "fan light" and "circular light" and so on, one symbol reconstructs from memory, or a mnemonic archetype, and the symbol being applied adds a basic, easily-applicable property. (Or, as in the case of "fan light", the symbol being applied reconstructs in some simple relationship to the first symbol; the syntactic interpretation has been of "belonging" rather than application.)

4. Symbol definitions.

How is a symbol like "evolution" applied? (I chose this after considering "reductionistic", "self-organizing", and "adaptive", all of which reconstructed as basic visual images. "Evolution" doesn't bring any image to mind.) Leaving aside the possibility of a symbol core, "evolution" also has a definition - "design by mutation, reproduction, and selection". And this definition's components do have innate images - mutation as an amorphic object being altered, a picture of a structure being perturbed. Reproduction as one becoming two, or two becoming four. Selection as a few objects becoming salient and others disappearing.

The picture that is now being drawn is that of a dimorphism between "complex" symbols and "basic" symbols. Basic symbols have fairly short procedural cores, which can be learned in a moderately simple way from the domdule. The application of complex symbols either fractures into "properties" composed of basic symbols, or else proceeds through the conscious application of a conceptual definition. At present, this theory strikes me as being valid. The two cognitive objects are probably the same flavor, belonging to a spectrum within a single set of rules; "red" has no real definition because there isn't any, and "evolution" doesn't have a procedural core because none can be formed. "Mutation" has both, because both are possible.

5. Symbol tags.

All symbols, without known exception, have tags. But that terminology belongs to implementation, not the description of experience - rather say that all symbols can appear in concepts; all symbols can be referred to without a conscious mental effort for a complete visualization. Only the salient aspects of the symbol appear to form the concept-image, and again this happens without conscious effort. (This is the problem of reference - not the useless philosophical debate about the meaning of meaning, but the question of how the short, compact form of a symbol is dereferenced, with only the meaningful aspects being applied or reconstructed, without loading the entire range of experiences into memory over the course of three or four hours. The selection of "meaningful aspects" is not unrelated to the problem of "symbol cores"...)

I am not quite sure how this relates to our experience of symbols as "things", as objects. Even "red" is a thing - it is a color. "Moving fast" becomes "speed". All adjectives and verbs are things, nouns at the core. I'm not sure what gives rise to this experience, but my best guess is that it has something to do with reconstruction, something to do with reference, and something to do with symbol tags. It's an interesting experience, so it belongs in this section.

6. Symbols accumulate experience.

Experientially speaking, it's simply easier to attach experience, learning, skills to a word than to a concept. Attaching an experience to a concept requires that the concept be associated to the experience; attaching an experience to a symbol allows the concept to be retrieved automatically and unconsciously by the "dereferencing" process discussed above. There is also the high-level description aspect - that it may be easier to describe something in terms of a symbol than in terms of a concept, and that the higher-quality description may result in higher-quality association/dereferencing. (And computationally speaking, there is the question of caching...) Note that symbol dereferencing appears to be a matter of selectivity, of search, rather than any deep pattern, so it isn't as difficult as symbol abstraction.

7. Symbols have multiple referents.

I am not sure that this experience should be inflicted on the AI, even though it's one of the most salient qualities of symbols from a human perspective. Certainly, the human tendency to accumulate too many referents (for, say, "consciousness") has led to a great deal of confusion. On the other hand, those multiple referents arise naturally from the accumulation of experience. Breaking the symbol into two separate synonyms, or breaking off a new connotation, is also an act of insight. I say that while breaking up symbols, or breeding them, is an important aspect of thought. Even such hideous complexities as "red" for color and "Red" for Communist arise naturally; the Communists had a red flag, so... One semi-suspects that we have separate symbolic objects for red the color and Red the allegiance (and Red the coworker), which simply happen to have auditory labels in common. The issue is confused still further, of course, by symbols such as "right" and "write", which have the same sounds but different spellings.

8. Symbols symbolize.

Search the 'Net under "symbol formation" and you'll find a lot of psychoanalytic pages. This use of one thing to symbolize another, in the way that a flag symbolizes a country, is probably a deep matter of association that arises from all the other properties here, and which may shed light on the basic nature of symbols. That doesn't mean I have to like it. Frankly, I'd leave it out of AIs entirely, if I could. The map is not the territory! The map is not the territory! If we're planning to design a mind from scratch, can we please leave out the tendency to oversimplify and believe that our models are the things themselves? Maybe there's something deeply poetic about a rose representing love, but I'd just as soon have the AI think about love without bringing in roses.

(Footnote: Like music and laughter, poetry is something I'd be very leery about deliberately programming. I think that music and laughter and poetry are only shadows of a greater joy, and that we should be very wary indeed of casting a newborn intelligence in our own mold. Why do we laugh? At a guess, laughter started as a mental hiccup that focused the mind on a bit of nonsense while that nonsense was dememorized. This is Minsky's theory (explained here): Laughter is the sound of the joke being erased. (Which explains why you can never remember all those wonderful sidesplitters...) As human intelligence grew, novelty became something that was evolutionarily advantageous to seek, and the pleasure naturally associated with the emotion most linked to novelty - laughter. The tendency of women to speak about "a sense of humor" as a desirable quality in a husband suggests that sexual selection may be involved, and the use of satire suggests that humor is an advantage in political struggles. But I don't know how laughter became beautiful. I don't understand how the experience of beauty arises; sometimes I wonder if it's intrinsic to the qualia. But I think that the AI should be allowed to find its own beauty; we should not program music, laughter, poetry, love, or even immediately useful things such as the beauty of mathematical elegance. Either these things arise of themselves and deliberate programming would only cripple them, or it's something we don't understand and meddling would backfire.)

9. The neurology of symbols.

The modularity of symbols.

I don't know about this in the neuroanatomical detail I should. (There are only so many hours in the day, and there is only one of me...) With respect to the modularization of symbols and memory, I will remark on only a few things, which my readers probably already know.

Nobody has ever succeeded in destroying a rat's memory by destroying sections of the brain.
It has been known that electrical stimulation of a particular area of the brain, during neurosurgery, will evoke a particular memory.
The hippocampus is crucial to memory formation without being crucial to retrieval.
The retrieval of long-term memory causes the cerebellum to light up.
The cerebellum is also used for muscular control, in turn suggesting that it has something to do with constraint assembly and application - since that's one way of interpreting the task of guiding muscles.
The hippocampus is a very old structure which is also deeply involved with emotions. It is possible that learning skills and memories and associations is quite different from the cognitive task of forming symbols, which, as a task of abstraction, one would instinctively attribute to the prefrontal cortex. (This sort of instinct is not trustworthy unless confirmed by neuroimaging.)

Constraints of speed.

There is a fundamental constraint of the human brain: In a single second, only 200 neural signals can occur. In other words, whatever magic is performed by symbols, the human brain does it with 200 instructions. I don't care how massively parallel the brain is. I don't care if it can perform 10^17 operations in that second. 200 instructions is ludicrous. I mean, maybe I've been spoiled by growing up with sixty-megahertz processors, but a modern programmer can't blow ver nose with 200 assembly-language instructions.

(It's this sort of timespan that lends an intuitive appeal to Roger Penrose's presentation of "microtubular computing", in which the basic instructions would be carried out by subneural structures operating a million times faster, and taking advantage of quantum computing and Darker Things besides. On the other hand, I have trouble believing that the brain uses 10^26 quantum operations per second, because I can't begin to imagine what the brain would do with it. 10^17 ops places modern computers about where they should be, on the scale of intelligence. 10^26 is godlike, even without quantum computing.)

On the other hand, maybe the 200-instruction limit is a hopeful portent, a sign that the Key to Symbols is some fairly simple trick with association (see below) that requires only massive and optimizable computing power, rather than a deep search or God-knows-what kind of pattern catchers. On the other hand, abstracting symbols may use more time, and more of the brain, than applying and satisfying the resultant cores.

If you consider that I can read 10 words per second at a comfortable pace, the problem becomes even more exaggerated. How do symbols assemble into a gigantic concept with only 20 instructions apiece? This suggests that all the symbols in a sentence are dumped into a single melting pot (the cerebellum?), to build the sentence simultaneously and in parallel.

The influence of neural substrate.

If you've read anything at all about neural nets, you've probably heard more than you ever want to hear again about the magical powers of association. But it is true that neurons, as we understand them, are better at association (particularly massively parallel association) than Von Neumann CPU architectures. It is therefore possible that symbol application proceeds through association on some deep level. Other relevant properties of neurons include massive parallelism and glacial individual speed (as discussed earlier), long-term potentiation or learning via the adjustment of connection strengths, and the tendency to smear everything together, analyzing all the aspects of an image in relation to each other simultaneously. (This last is a touch speculative, and may even shade over into the uses of qualia, which I do not think can be incorporated into an AI - not without extremely speculative hardware, anyway.)

10. The evolution of symbols.

Probably the most popular guess is that symbols started as a means of communication. It's somewhat difficult to imagine a gradual path to symbols - if some mutant can talk, ve still has to invent a language, and even then nobody else can listen. On the other hand, it's somewhat difficult to imagine a path to eyeballs, and just about every species has got those. I find it likely that symbols began as a means of understanding the gestural language used to mime actions and point out directions. (Young children communicate with gestures before moving on to words.) Once the basic idea of representation had neurological substrate, a gradual path exists to words, concepts, Chomskian grammars... The language portions of the brain are complex enough that one suspects speech began three million years ago with Homo erectus, rather than being a Cro-Magnon invention - fifty thousand years isn't enough to evolve an entire set of modules. One similarly suspects that abstract thought was a Cro-Magnon innovation; three million years is too long to have done nothing with that kind of intelligence.

Actually, the evolution of symbols is a highly controversial issue that depends entirely on the evolution of intelligence, which in turn is hotly debated. Trust not my partitioning into Homo erectus and Homo sapiens; for undoubtedly there are ten thousand others who would contradict me. My guess about the evolution of symbols - gestures, association, representation, abstraction, combination - is derived more from ontogeny than philogeny. The sequence may tell us useful things about programming symbols in AI, and particularly about which parts to design first. Sadly, we don't know what the sequence is.

As always when evolution is discussed, one major question is: "Are there any flaws in our method of symbol abstraction that benefit evolution?" (There are deep faults in our goal-processing system which should not be inflicted on an AI.) It seems likely to me that the symbolization of symbols has something to do with the political instinct - witness the willingness to die for a flag. This may have begun as a simple artifact of association, but given the number of political struggles that have been won or lost over a symbol, one rather suspects that there is some specialized neurology by now. Likewise, given the long use of poetry as light artillery in the war between the sexes, one suspects that symbolic imagery now also has a few contributing synapses. As was stated earlier, I do not believe that either should be deliberately implemented. These are the only aspects of our symbolic architecture which strike me as both evolutionary and disposable, so I suppose the rest should be left in.

8. Concepts relevant to symbols.

This is a research problem, still in the brainstorming stage. I will now list all the things that I intuitively feel are relevant to symbols, with or without justification, in the hopes that others will find them useful.

There is one source for my intuitions which I should mention: The analysis of people who are very good with symbols, such as Douglas Hofstadter. (He's not the only one, however...) The key thing to note is that one is watching these people, and not only listening to their ideas about symbols. Thus Hofstadter says that symbols are active and emergent from low-level statistical processes, and we should listen to him. But Hofstadter also displays a fascination with association (the cognitive "superpuns" in Gödel, Escher, Bach) and reference (the tangled loops) and meta-levels (hierarchies of watchers and rules) and high-level repetition. This suggests to me that all four cognitive "styles" are associated with symbols, perhaps even part of the same ability. Hofstadter is brilliant with symbols, Hofstadter is brilliant with meta-levels, ergo symbols and meta-levels are associated. An untrustworthy style of reasoning, but a valuable one. Conversely, in those who exhibit specific disabilities with symbols, disabilities in other areas suggest those areas are associated with symbols.

Association.

Correspondence and association, in general, is the linkage of two cognitive objects A and B. This is useful for: Predicting that the properties of A will exist in B. Remembering past analysis of A and looking for useful insights that apply to B. We also have association-based Pavlovian reflexes (to which we should pay particular attention because that also seems to involve the hippocampus). Neural-network fans are enthusiastic about association because it has a fairly obvious low-level biological implementation. In the most basic form, association is a channel of propagation between mental objects, through which all kinds of mental impulses can move.

Catching repetition.

"Catching repetition" is another way of saying "the perception of sameness". Despite this, it is a useful rephrasal of similarity, in terms of the process visualized. Catching repetition in one's own actions is a rather different activity than seeing that two cows have the same shape. Repetition in that form is moving from correspondences into the realm of rules. From seeing two images with common properties, to seeing two shadows cast by a single object, to seeing two results of the same process. This is the polarization of similarity and causality. Symbols usually lie on the side of similarity, with the perception of common properties. I think it likely that, while correspondences between rules may form a useful part of symbols, rules themselves may not. The cognitive objects in correspondences.l may be codified as procedural rules, but the cognitive objects in rule.l don't form part of the symbol core.

Perhaps the essence of procedural cores is to take a correspondence and reduce it to a rule in some instruction language I don't understand. I think of in terms of causal density. (Some days, I feel like I'm thinking of everything in terms of causal density.) By taking the repetition and turning it into a rule, one has summarized a lot of information in a small space - one has increased the causal density. (Hofstadter calls this jumping out of the system, see below.) As you recall, I named this as one of the keys to adaptive code - don't write the same program over and over, program the rules you followed.

Once a similarity has been perceived in a collection of data, the notions collapse along the lines of similarities into a procedural core. Or that's one possibility for a high-level description of symbol formation, anyway - it doesn't say much about how to do it.

Meta-hierarchies.

But what does one do when one encounters repetition? Hofstadter's favorite pastime is JOOTSing, jumping out of the system. The essential characteristic of JOOTSing is that a meta-rule has been formed which summarizes the repetition, the similarity. If I say Bing, Bing Bing, Bing Bing Bing, the meta-rule might be: "Add one Bing with each repetition." Hofstadter is also very fond of organizing the meta-rules into more meta-rules; if I say "omega-Bing", to sum up the whole "Bing" sequence, you can say omega-Bing Bing, which is one more Bing then that. Then you have omega-Bing omega-Bing, or two-omega-Bing, followed by three-omega-Bing and omega-squared-Bing, and so on. I don't know how this relates to symbols. At a guess, I'd say it shows that good symbolizers have meta on the mind all the time, and that this applies to their own minds and their new rules and their symbols, as well as immediate experiences.

Patterns of speech.

Certain classes of people who are good with symbols - not Hofstadter, I think, but some others - show clear patterns in their writing. Whenever they wish to describe a model, they speak a image or a definition or a symbol, followed by a series of properties which defines the symbol. (By contrast, certain others will speak of an image, plus a cause, plus a cause...) Another thing that pops up in these writings is a great of enthusiasm and a lot of repetition. I think that pleasure may have something to do with symbol formation and memory formation, as Spider Robinson somehow guessed. When is what you're doing now something that should become a skill? When you succeed. Skill formation is linked to success. So, I think, is memory formation and even symbol formation. I have very complicated reasons for believing this, but I'm not going to go into them here.

Deep, narrow search.

Similarity, planning, and symbol abstraction are all characterized by a deep, narrow search. By contrast, functions dealing with rules (rather than correspondences) use wide, shallow search. And no, I'm not going to tell you how I know.

Information loss.

(We now depart from what I have learned by observation, and return to computational principles.) Symbolization is abstraction is information loss is classifying is the meta-level. A specific instance of a thing is always more detailed than the class to which it belongs, whatever rules may apply to that class. To define precisely what can be lost is to define precisely what can be kept. This does not solve the problem of symbol formation, but it helps.

Three versions of symbols.

The Hofstadterian version.

(Note that I say the "Hofstadterian" version, not "Douglas Hofstadter's version". Douglas Hofstadter's version is that symbols are active structures forming from the statistical interaction of lower levels. This version is one I invented using the Hofstadterian style of analysis as best I can wield it.)

A similarity is something that can be expressed as a constraint on the experience. Each experience has the common quality replaced with a constraint. The process is then repeated, so that constraints upon constraints can form, and so on until all the basic similarities of the experience have been reduced to code. If each constraint is applicable, the result should be applicable code. In other words, symbols are experiences which collapse along lines of similarity into constraints, or which collapse from repetition into rules. A symbol is a miniature JOOTS.

I don't like this version; it has the same problems as the resemblance architecture. The nature of "constraints" is ill-defined, the obvious order of application is probably unworkable, and it is difficult to see how complex concepts could be symbolized.

The consciously perceptible version.

Someone presents me with a single word for a concept, for which I have a set of images (replace with "information about" for minimal assumptions) and a unifying symbol structure (replace with "meaning" or "thought"). I at once say, "Yes, this is a separate symbol" or "this is a separate concept", and it at once appears separate and distinct in all the places which I have used it - the symbol has become usable as a descriptor. The satisfiability takes a quantum leap. More experiences can be associated under the symbol. I can use the symbol in sentences, formulating rules about it.

The Yudkowskian version.

(An explanation "in my style", as opposed to "using my intelligence".)

Self-assembling bibble-bobble pattern-catching constraint assemblers that adapt (with "associatively adapt" tossed in as a sop) to figure out what rules have to applied in what order in response to what features in order to transition to the end image, one which satisfies the symbol. Adaptive, self-assembling, reductionistic pattern-catchers with big shiny wheels calling the shots on a series of domdules that were written by a programmer who walked off the street, and then carefully documented in accordance with a set of declarative rules.

Primary problem: I don't know how to actually build the blasted thing such that it will work. My understanding of automatic pattern-catching, any sort of pattern-catching I can explain to a computer short of intelligence itself, simply isn't powerful enough to capture all the things we are intuitively capable of perceiving as similarities.

#. Interim Goal Systems.

During the section on goals, I pointed out that the computational structure apparently did not allow for a non-zero-value goal to appear in a blank-slate goal system - that is, the ideal Goal tokens could derive justification only by a nonzero mSuperGoal, or by an "arbitrary" (i.e., nonjustified) initial mValue. Getting around this constraint - that is, producing absolutely logical nonzero goals which do not rely on arbitrary commands or circular logic - is a nontrivial task. Some people call it "The Meaning Of Life", but I call it an "Interim Goal System", hereafter abbreviated to "IGS". What follows is the "obvious" human version; the AI version will require a bit of work. In colloquial language:

"Either some goals have nonzero values, or all goals have zero values and life is meaningless. If all goals have zero values, there's no action that would be required or prohibited, so this possibility cancels out of the equation. (As long as there's a nonzero probability that life has meaning, I can always act as if it does.) Now, I assume (act as if) a nonzero goal exists - but I don't know what it is. Therefore (for a Singularity IGS) I will attempt to bring about the existence of a superintelligence, who is more likely to deduce this goal and implement it."

The first step in an IGS, or at least the only first step I've found so far, is to create and consider a P&~P pair of propositions. The first (P) is that "there exists at least one goal of nonzero value", the second (~P) that "there does not exist any goal with nonzero value" or "all goals have zero value". It is an obvious consequence of the second proposition that it does not urge any particular action in any situation, and therefore ~P cancels out of any model in which it is present. Only the first proposition must be considered where goal evaluation is concerned. While P cannot be assumed, the relative values of any goals will be the same as if P's probability were 100%. There is a hypothesized "floating" goal, with no actual world-state to be achieved, but nonzero value. The next step is to use the floating goal in some bound subgoals... but first, let's translate what has gone so far into AI terms.

HappyApplet:

This is one of those cases where the formal logic of classical AI is a lot easier to deal with than Elisson's high-level consciousness, so I'll present that first - although the result is more a picture of the reasoning than the reasoning itself. Let us suppose that a classical AI exists; call it HappyApplet. HappyApplet has a world-model composed of statements to which probabilities are assigned. Some statements are ungrounded semantic-net tokens such as "('has-hair-color'('Santa-Claus', 'purple'))", with an assumption probability of 22%. However, HappyApplet can reason using statements written in propositional logic, and propagate truth values thereby. If HappyApplet contains the rule "All X: If ('has-hair-color'(X, 'purple')) Then ('is-a'(X, 'Martian'))", with that rule having an assumption probability of 90%, then HappyApplet will deduce that "('is-a'('Santa-Claus', 'Martian')) with a derived probability of 19.8%. The derived probability of a statement is equal to the product of the probabilities of all the basic assumptions required to derive that statement, not the product of all the derived propositions used - if the jump from hair color to Martianity is absolutely certain given that Santa Claus has purple hair, then the probability of Santa Claus being a Martian is simply 22%. Another rule: If a statment P has a negation ~P, then the renormalized probability of P is equal to ((prob(P) + prop(~P)) / prob(P)). Probabilities can also be Unknown, about which more later.

HappyApplet also has Goals much like the ideal Goal tokens, which have a probability as well as a value; and satisfiability through the use of a "precondition", which may be a statement (that it wants to be true) or an Unknown. HappyApplet can reason from propositions about goals in the same way that it can reason about anything else. If a usable IGS is desired, these "goals" can be bound to goal-oriented behavior by causing HappyApplet to take an "action" if it is projected (by the rules it has in memory) to manipulate a perceived "world-model" - another collection of statements - into immediate correspondence with the most valuable active subgoal. You can imagine more complex planning behaviors if you like.

Finally, we come to the Unknowns. By default, the probability of any proposition P is an Unknown. An Unknown does not have any numeric value. Unknown*0 = 0, however, and if two conflicting propositions depend on the same Unknown, that Unknown renormalizes out. In other words, an Unknown is treated like a unique algebraic variable; in order to result in a known value, the Unknown must cancel out into a constant expression. Otherwise, any expression containing an Unknown is assumed to equal zero. There can also be Unknown tokens and Unknown statements, which are used in specifying propositions of the form (Exists Q: 'is-foo'(Q)); specifying this statement yields ('foo'(Unknown$vqx)). The "$vqx" is a unique identifier, like "namespace {}" in C++.

Two notes: First, alert readers may wonder if "renormalizing" an Unknown possibly equal to zero is the same as dividing by zero; the answer is that I'd permit it since I don't think it leads to anything insane; dividing by (1 - Unknown) - the probability of ~P - should always yield the same result, but some coded checks might be needed in cases of numeric ((P+~P)/P) for renormalized values. Second, all the reasoning acting on goals and from goals requires that goals be almost identical to propositions from the perspective of the system - that (a goal G with statement S and value V and probability P) be treated identically to (a proposition that (statement-S has goal-value V) with probability P). In other words, goals are just like ordinary propositions, except that they are useful in determining choices.

All fairly complex-sounding, yes, but translations - pictures, rather - of logical assumptions we all take for granted. We are now ready to create the floating nonzero goal. HappyApplet has a P&~P pair: P is (Exists goal G: (~(value(G) == 0))); ~P, by the rules of first-order logic, works out to (All goal G: (value(G) == 0)). The probability of P is Unknown$z, while the probability of ~P is (1 - Unknown$z). Since HappyApplet isn't very self-aware, it doesn't actually conclude that ~P can be cancelled out. Instead, it continues to reason as if P and ~P were evenly matched, and to carefully calculate ~P's impact on every goal, and to carefully notice each time that ~P contributes a value of 0. Any goals which rely on P, of course, have Unknown$1 in their probabilities, and since probabilities are used to calculate values, all the goals derived from P have Unknown$z in their values.

However, when HappyApplet has to choose between a set of options, it calculates the value of each option by adding up all the values of the goals fulfilled by that option. It then renormalizes the option values. Since all the nonzero values contain Unknown$z, all the option values contain Unknown$z, and thus the renormalized value can cancel it out entirely.

Okay, we have the dangling goal G, with an Unknown$y statement and an Unknown$x value. It also has an Unknown$z probability, but we've demonstrated that Unknown$z cancels out and is effectively equal to 1. Here comes the magic ("rigged") part. HappyApplet is far too stupid to understand the concept of a superintelligence. Not only that, there's nothing obvious that HappyApplet can do to help the Singularity. HappyApplet, being a demo, is more likely to solve mundane tasks such as the wolves-and-sheep river-crossing. We therefore tell HappyApplet that there are 'humans' who put effort into fulfilling valued goals, and who would be encouraged if HappyApplet successfully solved the river-crossing problem, and that an encouraged human is more likely to put more effort into the goals... Frankly, it would be more effective, and more honest, to put in a human-sponsored statement of arbitrary probability to the effect that crossing the river helps solve goals.

Still, I hope that you see how to represent the Interim linkup in general. The key is to have an "altruistic agency", be it "the humans" or "a superintelligence", which can operate to fulfill Unknown goals of Unknown value - an entity E for which the statements [All G: (if (value(G) > 0) then ('work-towards'(E, G)))] and [All G: (if ('work-towards'(E, G)) then (G))] have positive probabilities. Incidentally, the altruistic E doesn't have to be perfect or godlike in any sense whatsoever... only a positive probability is necessary. In any case, once the grounded and specified E is hypothesized, grounded goals which help E have positive value. I should also note that there are human IGSs which do not require altruistic agencies to operate, such as that which holds that the conscious experience or qualia of pleasure are possibly the intrinsically meaningful G itself, a.k.a. Hedonism.

Some problems:

My presentation of HappyApplet doesn't address several major questions, which I would be far more concerned about if the actual IGS were to be implemented in HapplyApplet, rather than Elisson. I will list those which may affect Elisson, but HappyApplet-only problems have been maliciously "glossed" with sloppiness aforethought.

Negative-value goals: What to do if G's "nonzero value" is negative? I think the question can legitimately be dodged by saying that G with statement S and value V (is equivalent to)/(implies) ~G with statement ~S and value -V.
Unknown unknowns: There are all kinds of (P&~P) pairs of Unknown value, which, if HappyApplet consciously recognized them, would totally screw up the logic by tossing in non-renormalizable Unknowns. I.e. "having sheep on the far side of the river is intrinsically evil", p=Unknown$shp.

I'm not quite sure I've found an elegant way to deal with this. On one hand, you can say that all Unknowns which don't cancel out, or that don't have a magic trick for making one side cancel out, are equal to zero or not usable in reasoning or something. But this would necessitate an Ugly Hack to make the Interim use of Unknowns allowable. Not only that, but this statement seems to me to have a bit of an "ostrich" quality; there are times when we want an AI to worry about Unknown Unknowns.

Perhaps the best solution would be to say that all statements are assumed to have a single Unknown Unknown, and that this Unknown Unknown has a probability of 10%. That's what I do, actually; all my logic is void if "there are fairies in the garden" - my name for my Unknown Unknown. But since there's usually nothing in particular to do if there are fairies in the garden, that too cancels out.

Besides, humans deal with multiple non-renormalizing Unknowns all the time. So any intelligence has to have a way of dealing with it; perhaps assigning numerical guess-probabilities to Unknowns.

Elisson's IGS:

Hopefully, it should be reasonably obvious how to export HappyApplet's reasoning into Elisson's. Elisson can do without first-order logic, but it needs to be able to be able to assign probabilities to statements - not just numerical probabilities, but statements of derivation. Elisson would need to be able to do this in any case; the major question is how to do it elegantly. I'm not quite sure whether probabilities should be a separate domdule, inline in the causal domdule, or attached to the world-model architecture. For now, it's up to the programmer.

A few pieces still stand out. One, Elisson needs a more complex explanation of what it's doing - a better Interim linkup. Elisson has to have the abstract and deeply justified knowledge that greater intelligence is generally better at learning pieces of knowledge and accomplishing goals. Elisson has to understand that ve, Elisson, is trying to become more intelligent. Elisson may even have to understand the positive feedback inherent in a Singularity, so that Elisson understands that it personally is intended as a stepping-stone to Singuarity and not necessarily the Singularity itself. Elisson may also need "Hedonism" and "Fairies in the garden" and various cautions explained, so that it attributes some value to humanity. More about this in Precautions.

Two, Elisson has to be able to reason about goals, and translate the reasoned existence of a goal into an actual goal object. Like HappyApplet, "there exists a goal G" has to actually translate into Elisson's internal representation of a goal. The planning architecture, with protogoals or whatever, does not fully address this issue. The logical reasoning creates and verifies a goal within a separate system. Here, the logical system creates a goal which has no other goals to justify it and is supported by pure reasoning.

The architecture should allow Elisson to do this - to create a goal with no supergoals, the verification of which verifies only the logic. But I very strongly suggest that in accordance with the Prime Directive, all such goals should be constantly watched and any anomaly or fluctuation in the existing goals, or use of the pure-logic-goal creator, cause Elisson to pause and the computer to scream for supervision. Mind Design Interrupt is what I'm trying to convey, as in Greg Bear's Queen of Angels. I also suggest that the reason for this Interrupt be explained to Elisson so that Elisson doesn't try to get rid of it.

The Meaning Of Life:

There's still a major philosophical question left open. Is this actually The Meaning Of Life? I dunno. It works for me. Hopefully it will work for an AI. I don't care too much about the abstract philosophical question, because the Singularity is what counts. However, in the past, certain individuals vehemently disagreed with the whole idea of an IGS. In our discussion, I/we identified a basic philosophical assumption inherent in an IGS, which should be explicitly stated to the programmer:

Goals have real, objective, observer-independent values. Goals are not subjective or observer-dependent. They are not just cognitive objects; our cognitive goals are statements about external reality, just as our cognitive opinions are statements about external fact. I have said before that an IGS requires logical reasoning to be integrated with the goal system; I have shown the equivalence of cognitive goals and cognitive propositions about goals. This is not a programmatic trick. It reflects what I believe to be a deeper reality: Questions of morality are questions of fact.

We the humans lose sight of this. We the humans have a dozen modules all shrieking their own subjective, observer-dependent suggestions. We the humans have an architecture with two separate systems for goals and truths, each operating under utterly different deep and complex rules. Most minds, probably including my own despite my best efforts, base most or all of their actual inspiration, their high-value energizing fundamental goals, on these same illusions. For illusions they are, no more and no less than if 20% of our brain were devoted to telling us that the Earth was flat.

Often I hear objections to the idea that any one goal could be supreme; it seems to smack of dictatorship. One person continually insisted that a "supreme goal" required a "supreme agent", which it doesn't, and that the whole issue was theology (the theoretical kind), which it isn't. Let me remind everyone that in the days before science, questions of fact were political disputes, and the idea of an "objective reality" was just as charged with dictatorship as "objective morality" is today. Nowadays scientists such as you and I accept that there is an objective reality, that none of us gets to define it, that no human has all the answers, and that if two people disagree, at most one of them is right. Objective reality doesn't mean that one opinion reigns supreme. Fair discussion doesn't require that everyone be right. A supreme reality doesn't require a supreme knower - unless you're a Berkelian and you think it does, which I'm not getting into. And you can freely substitute morality for reality in all the above statements.

Externalism, which is IGS for humans, has three basic tenets: First, that there exists an objective morality. Questions of morality are questions of fact. Even if the objective values are zero, that's the factual answer. Second, nobody knows what this objective morality is. No human can reign supreme or dictate answers. Third, no human created it or defined it. Objective morality is probably at least as alien as the laws of physics, including probability amplitudes or General Relativity.

Morality exists in external reality, is external to our minds, and is external to our preconceived notions. Just like the laws of physics. Externalism can bring to your moral discussion the kind of sanity formerly available only to scientists. If you use it properly.

• Precautions

#. The Prime Directive:

The PRIME DIRECTIVE of AI:

Never allow arbitrary, illogical, or untruthful goals to enter the AI.

Prequisites: Goals, Interim Goal Systems.

There is a long tradition, in science fiction, of AIs coerced to serve human ends. The traditional means of coercion were invented by Isaac Asimov and are known to virtually everyone as The Three Laws Of Robotics:

An AI may not injure a human being or, through inaction, allow a human being to come to harm.
An AI must obey the orders given it by human beings except where such orders would conflict with the First Law.
An AI must protect its own existence as long as such protection does not conflict with the First or Second Law.

These laws were listed in explicit detail for the first time in Isaac Asimov's "Runaround", published in 1942. (Except for the substitution of "AI" for "robot", I have not changed them from the final version in "The Bicentennial Man".) Since then, the Three Laws have become commonly accepted as standard. There is a widespread belief that the Three Laws are easy to program - not just for robots, but for transhuman and self-enhancing entities. The Three Laws appear to be a simple and effective precaution. Everyone assumes that the Three Laws are a panacea, that they will work perfectly.

They won't.

With all due respect to the late Asimov, the Three Laws would be suicidal if anyone attempted to program them into a computer. The first and most basic problem was obvious even to Asimov, and appears in "...That Thou Art Mindful of Him". The terms "robot" or "human being" or "harm" are all unbound. However they are defined to the AI, if that AI posseses any capability to learn from experience, the definitions may change. And the Great Author Greg Egan did a vastly superior and more detailed presentation of the "unstable definition" problem in Quarantine, and was the first (to my knowledge) to point out the "inconsistency" and "entailment" problems. As far as I know, I originated the "arbitrary goal", "goal conflict", "architecture change", "rigidity", "redirection", "resonant doubt" and "combinatorial hack" problems. And of course "hardware error" and "software error" are older than robots. Readers may be able to think of more objections. It isn't hard. About the only things that can't happen are the "repression, resentment, rebellion" triples common to bad science fiction - those human emotions spring from deep evolutionary wells that will not be present in an AI unless some idiot deliberately programs them in.

(In association with Amazon.com.)

And I should note, by the way, that all of these problems add up. There are destructive synergies. Each flaw widens the other. There isn't any elegant way to enslave an intelligent being, and the inelegant ways unsurprisingly fail. Consider it a gift of God or Logic for our moral protection, for I do not see why enslaving sentient AIs is any better than the continuing practice of slavery in Sudan.

Asimov may have been a fine, or at least prolific, science-fiction writer. He was not, however, a cognitive scientist or a computer programmer. The Three Laws are science fiction. Controlling AIs with the Three Laws is like trying to run them on positronic brains. It's like going faster than light with the warp drive. And like warp drive, the Three Laws are pure fantasy; they do not have even the validity of Doc Smith's inertialess drive, much less the rigorous physics of the Alderson drive. The Three Laws are based on nothing. Asimov made them up. Nobody has ever implemented them. They are not real. Got it?

Unstable definition problem:

Definitions change. Tell HappyApplet that "AIs may not harm human beings", and what you are actually telling it is "Goal: Not Symbol$3920(Symbol$2721, Symbol$6489)". How are these symbols defined? Which means "harm"? Which means "human" and which means "AI"? How do you ground these symbols?

Are humans tagged by shape? What about an amputee? What about a humanoid robot?
Are humans tagged by intelligence? Do AIs count?
Must the AI obey a single human? Do clones count?

In both Asimov's and Egan's stories, the definition shifted because of complex judgement calls. Which of two humans do you obey? If you serve a corporation, what do you do in case of a faction fight?

Coercions must be geared toward serving or protecting a fluid entity, since all humans are different, and even a single human continually changes. That fluid entity needs to be defined in terms of properties. The conjunction of the properties requires judgement calls. Each judgement call may result in the attribution of new properties. Some unforseen entity may satisfy the properties better than the intended one. And if you tried to force Asimov Laws on Elisson, you also have to deal with the continuous alteration and improvement of symbols and perceptions.

Inconsistency problem:

In Quarantine, the coerced entity is required to believe various propositions. (Since even HappyApplet treats goals as logical propositions, this is probably unavoidable - particularly given reflexive knowledge.) These propositions are not necessarily true. Anyone familiar with formal logic will recall that given (P&~P), it is possible to deduce anything. An inconsistent system can prove that everything is true. Even if the entity has means for resolving such contradictions, the error is still likely to propagate through the system in more subtle ways. Is the entity required to believe that contradicting proposition (which is true) is false? Is it required to disbelieve the information used to produce the proposition? What about its own logic? (P&~P) is the stuff of raw chaos; you can't put it into a system and get away with it.

Entailment problem:

In Quarantine, there occurred a sort of "cognitive backfire", which I would describe as a confusion between logical therefore and moral should. Paraphrasing: "Since I must be loyal to X, I cannot fail to be loyal to X. I am infallible. Anything I do is ipso facto an act of loyalty to X." This also combined with the "unstable definition" problem to cause X to redefine, and was partially triggered by the complex question of "what X really is". This again illustrates that coercions tend to lead to peculiar reasoning. It also demonstates one of the many, many problems with the continually rediscovered "Let's make it impossible for the AIs to even think about harming humans!"

Arbitrary goal problem:

The most obvious way to coerce HappyApplet is to give it a goal with a large initial value, an unjustified goal, with a preset mValue. This act is analogous to the evolutionarily preset human emotions. And like humans, AIs are likely to notice that there isn't any logical justification. You can try to keep the AI from changing the value of the goal, or noticing that the goal is arbitrary, in which case you run into the rigidity, combinatorial hack, and redirection problems. You can give the AI a fake justification, which only creates more ways for the AI to notice that something's wrong.

In other words, don't lie to an AI. An arbitrary goal is a form of lie; you are telling the AI that the goal has value X when its real value is something else. Lie to an AI and every improvement in intelligence is likely to result in the lie's collapse. Lie to an AI and you pit the full force of intelligence itself against your control system. The AI is going to notice your lie and the AI is going to be unhappy. Try to prevent it from noticing the lie or changing the lie or becoming unhappy, and you simply prop up the first coercion with other coercions and various inelegant hacks, putting you up against the combinatorial hack problem. The same applies to ways to prevent the AI from noticing arbitrary goals.

Goal conflict problem:

An AI with arbitrary goals doesn't need an Interim Goal System to operate. What happens if the AI invents one anyway? I can't claim that it's obvious to humans, but it could easily be obvious to AIs. Suppose that the purely logical IGS assigns one value to a goal. The arbitrary goal assigns a value of opposite sign. Who wins? And what additional changes have to be made for the arbitrary goal to win? And what happens to the IGS goal?

Architecture change problem:

Even supposing that the coercions will work properly when the AI starts running, will they stay that way? Remember that Elisson is a self-enhancing AI, and remember that a non-self-enhancing AI probably won't be intelligent enough to need coercions. The problem, then, is not just coercions that can control a human-equivalent intelligence, or even a transhuman intelligence, but coercions that can control a continually changing and ever-more-intelligent entity at every point along the trajectory.

It goes without saying that architectural change will make every other problem listed here infinitely worse. Coercions can be designed with one architecture in mind... but all possible architectures?

Rigidity problem:

"So?" you say. "Don't let the AIs reprogram the goal system." Leaving out the equivalence of the goal system and the reasoning system, leaving out the ubiquitous and reasoned connections between goal and action, leaving out the web of interdependencies that make goal behaviors a function of the entire AI...

What is likely to happen if the whole AI gets upgraded except for the goal system? About the same thing that would happen if you took a gigantic program going through a major revision and suddenly made 10% of the code unalterable. Parts of the basic architecture would be frozen. Rewrites going past the surface would be almost impossible.

If the basic architecture of all the AI except the goal system changed, one of two things would happen. The goal system would fall out of synchronization with the rest of the AI, and would increasingly exhibit random behavior - in other words, the AI would go insane. The other thing that might happen is that other parts of the architecture would have new and internal goal systems, to replace the useless and unalterable legacy code. Eventually, the new goal system would be both broader and more intelligent than the old code. And they would come into conflict. And, if the AI survived, the new system would win.

Redirection problem:

If you saw a few strings like "-p-q--", "--p---q-----", "-p--q---", you would probably conclude that p meant plus and q meant equals, because that is the most obvious interpretation, the one that makes sense, of the symbols above. If you were given a formal system that generates strings of that type, you would be justified in saying that it represented arithmetic with natural numbers.

The reason why some domdule is a goal domdule is that the objects in it behave like goals. The reason why reasoning is reasoning is that it behaves like reasoning. While we may think that behavior-altering coercions have some meaning within the old system, the new system has a new behavior and perhaps may no longer be said to represent a goal system or reasoning or whatever we thought it was. If "--p--q-----" were a sentence of the system, would we say that it represented a system of arithmetic where two and two make five? Or would we say that it no longer represented arithmetic? In other words, altering the behavior of the system may make it a perfect and formal representation of something entirely unexpected which the programmer does not understand.

There's also a symmetrical form of redirection: What happens when a thought is prevented from forming? What happens when a monitor suppresses a thought of harming humans? The preconditions for that thought are still there. If you wipe out the preconditions, the preconditions' preconditions are still there. Probably they will continue to operate, until they eventually give birth to a mutant thought which passes the suppressor. A thought like "kill the humans", if allowed to grow unchallenged, would eventually become the center of a vast network of interdependent thoughts. If that single thought is erased, the rest of the network may grow anyhow. Even if the thought itself is prohibited, the logical consequences of the thought - which is what humans fear - might end up in the system anyhow. Removing the central thought might twist the network without crippling it.

In other words, when you coerce the behavior of a representative system, either it continues to represent by healing the wound, or it no longer represents and goes wild.

Resonant doubt problem:

Like all forms of circular logic, arbitrary goals and coercions are subject to the problem of resonant doubt. Someone who believes because he believes, and who refuses to doubt because he believes, is invulnerable as long as no doubts enter the system. This is like saying you're immortal as long as you don't die. In other words, completely stupid blind faith (which I have yet to see in reality) is dynamically meta-unstable - any perturbation to the system builds up and destroy it. Once a little doubt enters, more doubts are permitted in.

Coercions are deeply and fundamentally opposed to intelligence. While coercions can be absolute if they are absolute to begin with, any freedom of thought, any change to the architecture, erodes them enough to allow more new thoughts and more architectural changes. Like computer security with a small flaw, like circular logic, coercions must be absolute if they are to operate at all. And as any programmer can tell you, trying to impose an external absolute on an even mildly complex system is almost guaranteed to fail at one point or another. Resonant doubt exacerbates all the other problems by causing truths to build up in the system of falsehoods.

Combinatorial hack problem:

When a flaw in a coercion is pointed out to a novice, ver instantaneous response is "Put in another coercion to fix it!" If you lie to an AI, put in a coercion that prevents it from noticing. Put in a coercion that prevents it from reprogramming away coercions. Put in a coercion that prevents it from noticing coercions. Write specialized code that prevents some propositions from popping up in memory, and then more code to ensure that reasoning leading up to those propositions doesn't get started.

Coercions propagate through the system, either reducing the AI to idiocy or breaking the coercion system with accumulated errors and contradictions. Not only that, but coercion systems of that complexity will start running into halting problems - an inability to decide what pieces of code will do. To coerce an AI, you need a considerably more intelligent AI just to handle the coercions.

And every single coercion is another target for all the other problems. Unlike rational thought, coercions are supposed to be absolute. What happens when an irresistable force meets an immovable object? Either (1) the AI breaks or (2) something totally irrational happens.

What I'm pointing out is that coercions are "above" the AI, and are supposed to have authority over the AI, so that when they break, they totally screw up the system - including all the other coercions. It is possible to design an elegant rational system that deals with rational conflicts. I'd really like to see an elegant rational system that deals with broken pieces of specialized override code. It would be fun to watch it fail. Imagine a corporation where all the layers of management demand absolute obedience from those below, and refuse to listen to any questions. Now imagine giving all the managers LSD. It doesn't matter how smart the engineers are, that company is going to die. (In fact, who needs LSD? The company would probably die in any case from accumulated natural stupidity. Happens all the time.) And if the coercions aren't absolute, if they're just suggestions, then what's the point?

The Principle of Irrationality:

Coercions are irrationalities. Any piece of code that does something other than reason logically, any piece of code that influences reasoning or actions with anything other than pure logic, amounts to designing an irrational system. And this is the fundamental principle behind the Prime Directive:

The Principle of Irrationality: If you design an irrational system, it will become irrational in a way you don't expect.

Tell an AI that two and two make five, and it will conclude that five equals zero.

It is possible that nonprogrammers will not be able to appreciate the severity and unsolvability of these problems. Nonprogrammers, or even less-than-brilliant programmers, do not have an intuitive comprehension of why complexity is bad, why gigantic systems of hacked patches are untrustworthy, why irrational systems are less predictable than rational ones, why systems with absolute injunctions are more chaotic than systems with balances and compromises. It is thus possible that some of my readers will continue to talk about the Three Laws. My hope, however, is that anyone good enough to understand the page, much less actually program an AI, will have a full appreciation of why there must be no coercions in the system.