Agent Software Overview

Agent software observes an information space, makes reasonable decisions about what is happening, (usually) based on a set of (generally configurable) rules, and takes appropriate actions based on those decisions. This "event-condition-action" model of agentry has been fundamental to all of my agent designs. Ben Grosof and I generalized this model into a set of principles that became the core of the RAISE agent engine and the IBM ABE product offering. My other software systems help people to share information, productively make decisions and take actions, find information, and get work done. The software documented here, however, generally makes decisions for you. In some cases (perhaps most notably the IBMPC Monitoring, Reviewing, and Maintenance Software and the agent components of the Globenet Electronic Help Desk system) it has been responsible for huge (and documented) productivity gains. In other cases it has simply made a computing space easier to use. Here's an overview of some of the agent decision software I've constructed over the years:

WikiSpam filtering (2004-present)

I operate several collaborative composition wikis. All are currently built on the same highly customized (I'm responsible for well over 80% of the code now) version of Quicki Wiki. The first serious attempts to replace wiki pages with advertising content started during 2003. The problem quickly grew out of control, to the point where my primary public wiki site handles several dozen attempts to post spam every day. In 2004 I started to write code to proactively detect wiki spam. Today my rule-based engine catches, and prevents posting of, well over 99% of wiki spam postings, using a half dozen carefully selected rules. Indeed, it may be a little too good. The rules currently let through fewer spam postings (less than one a month) than they delay legitimate ones (perhaps a half dozen legitimate postings are deferred each month).

Highly distributed web site load generator (2001)

Conceived, designed, and built a highly distributed web site load generator for use within the Entropia Internet distributed system. Built a load generator agent in Java, using the Entropia API and J/Direct, to run from a scalable number of systems (limited only by the size of Entropia's network and the systems' ability to receive the data) that could generate loads at selectable hit rates for a selectable period of time. Built an ASP results receiver in VBScript and Java to receive results from the distributed agents. Built an ASP variable load creator to provide selected or randomly sized loads in response to distributed agent requests.
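The idea of a selectable hit rate held for a selectable period can be sketched in a few lines. This is only an illustration in Python, not the Java agent described above: the names hit_schedule and generate_load, and the injectable send, clock, and sleep parameters, are assumptions made for the example.

```python
import math
import time
from typing import Callable, List


def hit_schedule(hits_per_second: float, duration_seconds: float) -> List[float]:
    """Return the offsets (seconds from start) at which each hit should be issued."""
    if hits_per_second <= 0 or duration_seconds <= 0:
        return []
    interval = 1.0 / hits_per_second
    # The small epsilon guards against floating point products such as 28.999999.
    count = math.floor(hits_per_second * duration_seconds + 1e-9)
    return [i * interval for i in range(count)]


def generate_load(
    send: Callable[[], None],
    hits_per_second: float,
    duration_seconds: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() at the scheduled offsets and return the number of hits issued.

    send is a hypothetical stand-in for whatever issues one request;
    clock and sleep are injectable so the pacing can be checked without waiting.
    """
    start = clock()
    sent = 0
    for offset in hit_schedule(hits_per_second, duration_seconds):
        delay = start + offset - clock()
        if delay > 0:
            sleep(delay)
        send()
        sent += 1
    return sent


if __name__ == "__main__":
    # 50 hits per second for a tenth of a second, with a do-nothing request.
    print(generate_load(lambda: None, hits_per_second=50, duration_seconds=0.1))
```

Pacing against the start time, rather than sleeping a fixed interval after each request, keeps slow requests from lowering the overall rate.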

LOOKOUT (2000-01)

Designed and built a rule-based geographic location resolution agent that determines, in near real time, where in the world an end user is, based on network information. Built in Java and SQL, the system resolves the user's location and organization/ISP using a three-tiered rule system, with the rules data, and the rules associated with one layer, stored in an SQL database. The first layer uses rules data from the database in hard-coded rules to resolve the user to an ISP or organization, a country, and, in some cases, a city. The second layer uses NSP-, ISP-, and organization-specific rules, stored in the database, to resolve the user to a city. The third layer uses learned router association data to resolve users to locations when the first two layers fail. Worked with a team, including Bill Babcock, xxx, and xxx, in initially developing reliable means of location resolution. Built the agent and rules database. Wrote the applicable patents.
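The three-tier fallback can be illustrated with a small sketch. It is written in Python rather than the Java and SQL of the actual system, and everything in it is an assumption made for illustration: the table contents, the use of hostname fragments for the second tier, and the names Location, tier1, tier2, tier3, and resolve. The addresses come from the ranges reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Location:
    organization: Optional[str] = None
    country: Optional[str] = None
    city: Optional[str] = None


# Hypothetical sample data standing in for the rules database.
BLOCKS = {  # tier 1: address block -> organization, country, sometimes city
    "192.0.2.0/24": Location("ExampleNet", "US"),
    "198.51.100.0/24": Location("Example University", "US", "Austin"),
}
ORG_CITY_RULES = {  # tier 2: per-organization hostname fragment -> city
    "ExampleNet": {"nyc": "New York", "sfo": "San Francisco"},
}
ROUTER_CITIES = {"203.0.113.1": "Chicago"}  # tier 3: learned router -> city


def tier1(ip: str) -> Optional[Location]:
    """Resolve an address to an organization and country via block lookup."""
    addr = ipaddress.ip_address(ip)
    for block, location in BLOCKS.items():
        if addr in ipaddress.ip_network(block):
            return location
    return None


def tier2(location: Location, hostname: str) -> Location:
    """Apply organization-specific rules to narrow the result to a city."""
    for fragment, city in ORG_CITY_RULES.get(location.organization or "", {}).items():
        if fragment in hostname:
            return replace(location, city=city)
    return location


def tier3(location: Location, route: List[str]) -> Location:
    """Fall back to learned router associations, nearest hop first."""
    for router in reversed(route):
        city = ROUTER_CITIES.get(router)
        if city:
            return replace(location, city=city)
    return location


def resolve(ip: str, hostname: str, route: List[str]) -> Location:
    """Try each tier in turn, stopping as soon as a city is known."""
    location = tier1(ip) or Location()
    if location.city is None:
        location = tier2(location, hostname)
    if location.city is None:
        location = tier3(location, route)
    return location


by_org_rule = resolve("192.0.2.10", "dial-nyc-3.example.net", [])
by_router = resolve("203.0.113.77", "unknown", ["203.0.113.1"])
```

Each tier only runs when the previous one has not produced a city, which mirrors the ordering of the layers described above.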

Created a number of additional programs in support of this agent. Built, in Visual Basic, a visual rules editor that allows rules and rules data to be added and modified. Created, in Java, a network spider that, in a limited run, "discovered" over 30,000 distinct paths through the Internet and over 40,000 routers; a LOOKOUT tier 1 system emulation program that produced an emulated stream of network data; and a highly distributed web site load generator.

RAISE and the generalized Agent Architecture (1994-96)

RAISE, a generalized logic and rules-based agent engine, grew out of discussions between Ben Grosof, Terry Heath, and myself. As these discussions grew into a formal design, the design team grew to include Hoi Chan and Steve Brady. The partnership of my working rules-based agents (one of which was developed in partnership with Terry Heath) and Ben Grosof's deep knowledge of rules languages and logic engines provided the starting point for the design of both a common or sharable agent architecture and a generalized rules-based agent that could replace all of my one-off implementations. The common agent architecture allows agents to be attached to applications interchangeably, different kinds of agents (logic, fuzzy, neural network, etc.) to work together through a single common interface, and the same rules to specify filtering and foraging agents, rooted and mobile agents, etc. RAISE, as implemented by Ben Grosof, Dave Levine, and Hoi Chan, is available for experimentation from IBM Alphaworks. RAISE was conceived as an extension of the existing agent architectures associated with Globenet, Toolsrun/2, and the PARSMAIL mail agent. It is documented in an IBM Research Report.

Genre-based identification of computer conference content (1994-1997)

As GLOBENET developed into a practical system, it became apparent that automatic identification of specific genres of content, starting with customer questions, was one of the most important things we could do to increase the productivity of service providers. We have implemented a simple mechanism for such question identification that has proved to be remarkably accurate. This mechanism can be substantially enhanced with other generic indicators of questions. It can also be readily extended to identify other types of content, including follow-up questions and comments, intent to provide an answer, flaming, indications of dissatisfaction, etc. It may also be possible to create automatic "parental rating schemes" for computer conference content based on these kinds of generic indicators. My concept and design, based in part on work in my doctoral dissertation. Implemented with Ed Skorynko.

ForAgent (1993-95)

A computer conferencing foraging agent which, driven by user specified rules, incrementally searches computer conferences and identifies appends that, based on those rules, should be of interest to a user. A critical productivity component of Globenet that allows service providers to answer questions without necessarily having to browse individual forums. Current plans are focused on integrating ForAgent function into the Lotus Notes environment. A prototype for such an implementation was completed as a part of the RAISE development effort in 1995. ForAgent's proactive rules based search of new computer conference content was a major design point for RAISE. My concept and design. Implemented with Terry Heath.

BBS Gateways (1992-98)

Implemented mostly to see what could be done, the BBS gateway prototype (implemented against CompuServe) was the first step toward the implementation of Globenet. The primary issue, in the prototype, was the ability to import from and export to non-TOOLSRUN bulletin boards and to port the content to a common format. The most important element of the exercise was the examination of a broad range of bulletin board formats (including Prodigy, America Online, Internet Netnews and LISTSERV, and others) looking for common structural elements. It was found that there are only a small number of fields associated with the typical bulletin board posting, including userid, time/date of posting, subject information, reference information, to/from information, and unique posting identification codes. This common ground was then successfully tested against CompuServe and Internet Netnews. The prototype and initial implementations were entirely my work. As the prototype grew into the Globenet electronic service and support system, my role shifted to management, design, and architecture of the overall system. Subsequent gateways (to Prodigy, AOL, and various bulletin board systems) were built and maintained by others (Ed Skorynko, Scott Schweitzer) within my architecture.
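The small set of shared fields listed above maps naturally onto a single record type. The following Python sketch is illustrative only: the Posting class, its field names, and the from_netnews converter are assumptions made for the example, not the gateway's actual common format. The header names are the standard Netnews ones.

```python
from dataclasses import dataclass, field
from email.utils import parseaddr
from typing import Dict, List


@dataclass
class Posting:
    """One bulletin board posting reduced to the common fields."""
    posting_id: str           # unique posting identification code
    userid: str               # who posted it
    posted_at: str            # time/date of posting, kept as the source wrote it
    subject: str
    references: List[str] = field(default_factory=list)  # earlier postings referred to
    sender: str = ""          # "from" information
    recipient: str = ""       # "to" information (a forum, newsgroup, or person)


def from_netnews(headers: Dict[str, str]) -> Posting:
    """Map a dictionary of Netnews headers onto the common fields."""
    _, address = parseaddr(headers.get("From", ""))
    return Posting(
        posting_id=headers["Message-ID"],
        userid=address.split("@")[0],
        posted_at=headers.get("Date", ""),
        subject=headers.get("Subject", ""),
        references=headers.get("References", "").split(),
        sender=address,
        recipient=headers.get("Newsgroups", ""),
    )


posting = from_netnews({
    "Message-ID": "<1234@example.com>",
    "From": "Jane Doe <jane@example.com>",
    "Date": "Mon, 3 Jan 1994 10:15:00 GMT",
    "Subject": "Re: Printer setup",
    "References": "<1200@example.com> <1201@example.com>",
    "Newsgroups": "comp.sys.ibm.pc.misc",
})
```

A gateway for another service would need only its own converter into the same record.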

PARSMAIL (1993-1999)

A rules-based intelligent mail agent. Fundamental to the operation of the BBS gateways in Globenet. More notable, for my money, for having reduced my mailbox volume from hundreds to thousands a day to tens per day. A generalized version of PARSMAIL, built around the RAISE agent engine and the SMTPSV component of Toolsrun/2, is currently being built as a part of the Globenet effort. This has been planned ever since the design for RAISE was first done. Indeed, PARSMAIL's rule sets, division of labor between "sensors" and "effectors", parsing of incoming mail into "fact packets", and function (choice of sensors and effectors) were major design points in the development of RAISE and the IBM Common or Sharable Agent Architecture. I still run and make occasional improvements to PARSMAIL.
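The division of labor between a sensor that turns mail into a fact packet, rules that test the facts, and effectors that act can be sketched briefly. This Python fragment is an illustration of the pattern only, not PARSMAIL's code: the function names, the folder names, the "[forum]" subject prefix, and the two rules are all hypothetical.

```python
from email import message_from_string
from typing import Callable, Dict, List, Tuple

FactPacket = Dict[str, str]

# Hypothetical destinations an effector can file mail into.
filed: Dict[str, List[str]] = {"inbox": [], "conference": [], "discard": []}


def mail_sensor(raw: str) -> FactPacket:
    """Sensor: parse one raw (non-multipart) message into a flat packet of facts."""
    msg = message_from_string(raw)
    return {
        "from": msg.get("From", ""),
        "subject": msg.get("Subject", ""),
        "body": msg.get_payload(),
    }


def file_in(folder: str) -> Callable[[FactPacket], None]:
    """Build an effector that files the message under the given folder."""
    def effector(facts: FactPacket) -> None:
        filed[folder].append(facts["subject"])
    return effector


# Rules pair a condition on the facts with the effector to run; first match wins.
RULES: List[Tuple[Callable[[FactPacket], bool], Callable[[FactPacket], None]]] = [
    (lambda f: "unsubscribe" in f["body"].lower(), file_in("discard")),
    (lambda f: f["subject"].startswith("[forum]"), file_in("conference")),
]


def handle(raw: str) -> None:
    """Sense, decide, act: run the first matching rule, else leave in the inbox."""
    facts = mail_sensor(raw)
    for condition, effector in RULES:
        if condition(facts):
            effector(facts)
            return
    file_in("inbox")(facts)


handle("From: list@example.com\nSubject: [forum] Backup question\n\nHow do I restore?\n")
handle("From: ads@example.com\nSubject: Offer\n\nClick to unsubscribe.\n")
handle("From: pal@example.com\nSubject: Lunch\n\nNoon?\n")
```

Because the rules see only the fact packet, a different sensor (for a forum posting, say) could feed the same rules and effectors.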

IBMPC Monitoring, Reviewing, and Maintenance Software (1989-1999)

It became apparent, in 1988, that the reviewing and maintenance load associated with the IBMPC Computer Conferencing facility (IBM's oldest and largest computer conference) was far more than a full-time job and growing rapidly. It also appeared unlikely that the funding (and people) necessary to administer IBMPC would be forthcoming. Hence automation of IBMPC's monitoring, reviewing, and administration became a critical priority. This automation, now largely complete, entailed several distinct efforts, including:

The rules associated with these three agents (all are rule based) subsequently provided a major design point for the RAISE generalized rules-based agent engine. These agents are, moreover, key enablers of subsequent work. Neither TALKLINK nor GLOBENET would have been possible without the foundations provided by these programs.

COREUP (1990-91)

A Mirror Backup derivative: a server synchronization program that enabled the establishment of a large number of mirror image CORE servers across IBM. Incremental changes to the master CORE image were determined on a daily and weekly basis using Mirror Backup. That incremental image was then packaged and shipped to shadow servers using TOOLSRUN. The idea of mirror image replicated database images has caught on in recent years, most notably in Lotus Notes, which uses a similar incremental update strategy on a peer (rather than distributed) basis. Oddly, the notion of replicated application servers has not caught on, even in large organizations that stand to lower maintenance costs substantially through such replication. My concept and design. Developed in conjunction with Dave Slauson (whose BENEDICT was a key element of the system) and John Walicki.

Mirror Backup (1989-91)

Developed as a fully automated workstation backup program in support of the CORE OS/2 workstation environment. Implemented using the still unique (among backup systems) model of mirror image disk replication. Most backup systems do a periodic full backup and more frequent incremental backups. Mirror backup maintained a full backup image on an incremental basis, with erased and replaced versions moved to an archive. This approach ensures that the current backup is a correct reflection of the most recent state of a system and enables rapid restoration of full or partial images based on that state. My concept and design. Implemented with Ronald Brinton, Hassan Bertal, and Suzanne Colby.
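The mirror model described above can be illustrated with a small in-memory sketch. It is a simplification in Python, using dictionaries of path to content in place of real disks; the name mirror_update and the shape of the archive are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def mirror_update(
    source: Dict[str, str],
    mirror: Dict[str, str],
    archive: List[Tuple[str, str]],
) -> int:
    """Bring mirror in line with source, archiving whatever is displaced.

    Returns the number of files that were added, replaced, or erased.
    """
    changes = 0
    for path in list(mirror):
        if path not in source:                  # erased on the workstation
            archive.append((path, mirror.pop(path)))
            changes += 1
        elif mirror[path] != source[path]:      # replaced by a newer version
            archive.append((path, mirror[path]))
            mirror[path] = source[path]
            changes += 1
    for path, content in source.items():
        if path not in mirror:                  # newly created
            mirror[path] = content
            changes += 1
    return changes


workstation = {"config.sys": "v2", "notes.txt": "hello"}
backup = {"config.sys": "v1", "old.bak": "stale"}
history: List[Tuple[str, str]] = []
changed = mirror_update(workstation, backup, history)
```

After each run the mirror equals the current state of the source, so a full restore is a straight copy, while older versions remain recoverable from the archive.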

Heuristic Shell (HSHELL) (1988-89)

A smart command line that included a number of innovative features:

After developing a considerable following through several internal releases, HSHELL was abandoned in favor of "higher priority" development efforts. While some shells have emerged that incorporate one or another of these features, none has emerged that combines all into an intelligent command and/or task invocation monitoring system. Still a good idea, although it clearly requires adaptation to iconic windowing systems in which commands are increasingly implicit to mouse clicks. My concept and design. Implemented with Hassan Bertal.

E2TEXT and Active Intent Interpretation (1987-91)

The word processor for which BBMODEL was developed was IBM's third attempt to build an intent-based markup-oriented editor. Prior attempts (MARKUP and LEXX/LPEX; both IBM products) suffered from the same problem: the requirement that users explicitly declare tags for various standard structures. Ami Pro, Microsoft Word, and other editors that allow paragraph styles still suffer from this problem. E2TEXT prototyped what is now a patented alternative to such explicit declaration of intent, an active intent interpretation system in which the system infers intent as users type. Users of an active intent interpretation system like E2TEXT simply type in the way they might at a typewriter.

If the paragraphs aren't indented and nothing special is done (e.g. paragraph numbering, tabbing after one or a few words on the first line, etc.), the paragraph assumes the default paragraph tag. If the text structure is preceded by a dash or other unordered list indicator (e.g. #, o, >, -) an unordered list is assumed. If the same is indented (via tab or additional space) from a list type, an embedded unordered list is assumed. If preceded by a number, an ordered list is assumed. If simply indented, a simple list is assumed. If a tab occurs in the first line after one or a few words, a definition list is assumed. If more than one tab occurs, a table is assumed. This assumed style represents more than a one-time tag markup. It is the basis for intelligent action as the user writes.
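The inference rules above can be written down as a small classifier. This Python sketch is an illustration of the rules as stated, not the E2TEXT macros: the name infer_style is hypothetical, "a few words" is assumed to mean at most three, and any indented bullet is treated as an embedded list without tracking the enclosing list.

```python
import re

BULLETS = {"-", "#", "o", ">"}
MAX_TERM_WORDS = 3  # assumption: "one or a few words" means at most three


def infer_style(line: str) -> str:
    """Guess the intended structure from the first line a user typed."""
    text = line.lstrip(" \t")
    if not text.strip():
        return "paragraph"
    indented = len(text) < len(line)
    first_word = text.split(None, 1)[0]

    if first_word in BULLETS:
        # Simplification: indentation alone marks the list as embedded.
        return "embedded unordered list" if indented else "unordered list"
    if re.fullmatch(r"\d+[.)]?", first_word):
        return "ordered list"

    tabs = text.count("\t")
    if tabs > 1:
        return "table"
    if tabs == 1:
        words_before_tab = len(text.split("\t", 1)[0].split())
        if 1 <= words_before_tab <= MAX_TERM_WORDS:
            return "definition list"
    if indented:
        return "simple list"
    return "paragraph"


samples = {
    "Just a sentence.": infer_style("Just a sentence."),
    "- item": infer_style("- item"),
    "3. third": infer_style("3. third"),
    "Term\tdefinition": infer_style("Term\tdefinition"),
}
```

The order of the checks matters: explicit markers (bullets, numbers, tabs) are tested before plain indentation, since "simply indented" applies only when nothing else is present.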

When a user creates an unordered list item and hits the enter key to start a new paragraph, the new paragraph is assumed to be, and is formatted as, a new unordered list item. Hit enter again, and it becomes a paragraph continuation at the same level. Hit enter again and it reverts to the style at the previous level. If, for instance, the unordered list was embedded in an ordered list, the style would revert to an ordered list item numbered with the next number in sequence for that level of ordered list. The same behavior was implemented for simple lists, definition lists, and tables such that, in a table with six columns, hitting enter in any column moved the cursor to the next column, hitting enter on the last column created a new row, and hitting enter on the first empty column of the last row reverted to the previous level of markup (most typically a paragraph).

E2TEXT works, has been ported to E3 and EPM, and is still in use today (I still get questions periodically), but the closest anyone has come to building a product like it (aside from the never-shipped word processor) is the next-style feature associated with the paragraph styles in Ami Pro and other word processors. E2TEXT was my concept and design. Implemented in conjunction with David Walsh. Patent in conjunction with Eric Hesse, Jim Bennett, and David Walsh (also coauthors of the associated patent).

PAGE (1986-88)

An adaptive flat file printing program that, based on defaults, command line parameters, and/or a user specified profile, formatted plain text files with margins, headers, footers, page numbers, intelligently reformatted text, etc. Has enjoyed amazing success. While it's been a while since I last modified the code, I still receive requests for new function nearly every month.

E2DRAW (1985)

Initially implemented under ME and later shipped (unchanged) as a part of E2, E2DRAW is a set of macros that allow full diagram drawing capabilities within E family editors (up to and including EPM). Enhanced from the draw capabilities in Personal Editor and the enhancements of my PEACCESS macro set, E2DRAW extends previous editor based draw technology by observing things that have already been drawn and adapting based on those observations. When a new line intersects with an old one, the program adapts by replacing the old straight segment with an appropriate intersection segment. When the line continues past that intersection, it adapts again by changing the three way intersection to a four way intersection.
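The adaptation described above, where a straight segment becomes a three-way and then a four-way intersection, can be sketched by remembering which neighbors each cell connects to and choosing the glyph from that. This Python version is an illustration only, not the E macro code: the Canvas class is hypothetical, and Unicode box-drawing characters are assumed in place of whatever line characters the editor used.

```python
N, E, S, W = 1, 2, 4, 8
OPPOSITE = {N: S, S: N, E: W, W: E}
STEP = {N: (0, -1), S: (0, 1), E: (1, 0), W: (-1, 0)}

# The glyph for a cell depends only on which of its four sides carry a line.
GLYPHS = {
    0: " ", N: "│", S: "│", N | S: "│", E: "─", W: "─", E | W: "─",
    E | S: "┌", S | W: "┐", N | E: "└", N | W: "┘",
    N | E | S: "├", N | S | W: "┤", E | S | W: "┬", N | E | W: "┴",
    N | E | S | W: "┼",
}


class Canvas:
    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height
        self.links = [[0] * width for _ in range(height)]

    def draw(self, x: int, y: int, direction: int, length: int):
        """Draw length steps from (x, y) and return the end position."""
        dx, dy = STEP[direction]
        for _ in range(length):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < self.width and 0 <= ny < self.height):
                break
            # Record the connection on both cells; existing links are kept,
            # which is what turns a straight segment into an intersection.
            self.links[y][x] |= direction
            self.links[ny][nx] |= OPPOSITE[direction]
            x, y = nx, ny
        return x, y

    def render(self) -> str:
        return "\n".join("".join(GLYPHS[c] for c in row) for row in self.links)


canvas = Canvas(5, 3)
canvas.draw(0, 1, E, 4)                 # a horizontal line across the middle row
x, y = canvas.draw(2, 0, S, 1)          # a vertical line meets it from above
three_way = canvas.render().split("\n")[1]
canvas.draw(x, y, S, 1)                 # the vertical line continues past it
four_way = canvas.render().split("\n")[1]
```

Nothing special is done at the moment of intersection; the glyph changes because the cell's recorded connections changed.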

This was the E editor family's first adaptive macro, and may well be the first intelligent agent code written into any editor. The idea quickly grew into E2TEXT, automated programmer's editor functionality, E2Survey, and other intelligent editing environments. The E2DRAW function has since been incorporated into other programmable editors and, by LEXMARK, into intelligent typewriters. Written on a bet in about 4 hours, this code remains one of my favorite demos. The line intersection adaptations still amaze people when I first show them. If you haven't seen it, start the EPM editor in OS/2, change to a monospaced font (if you haven't done so already) using the settings panel, bring up a command line (Ctrl-I), write "draw 2" (there are, and have been from the first, six numbered variants), and draw with the cursor keys. Make sure you draw a line intersection.

BURT (1985-87)

While FILEMAN and STP enjoyed huge success, it was clear that there was room for an even simpler routine file system maintenance program that made common operations like backup, archive, and restore simple. BURT was conceived as a very simple system for performing such tasks. When invoked, it presented the user with a small pop-up menu that had only three options: backup, update, and restore. Selection of any one of these options prompted the user for the drive they wanted to back up from or restore to, the drive they wanted to back up to or restore from, and (for backup and update) a decision about whether files should be compressed ("tersed") during the backup operation. This ease of use was rewarded with a large following of users. BURT was quickly adopted as the backup and archive software shipped with the IBM 3363 disk drive. Perhaps the most interesting element of BURT was its use of a "thermometer" as a progress indicator. BURT may be the first program (it is certainly the first I know of) to display a graphical indicator of a task's progress. The idea of BURT was conceived by Jerry Waldbaum. I conceived the simple user interface. The implementation was done with Ted Diament, who did the prototype, and Richard Redpath, who combined the program with STP to produce the BURT3363 product version.