Targeting Techniques

(by Gary Dubuque)

The Target Edit dialog is one way to create new AIML categories. AIMLpad offers several others.

Dr. Wallace describes "targeting" as:

"A style of AIML content creation is based on a backward-looking log file analysis. … More generally, every input that matches a pattern with a wildcard is an opportunity to create a new, more specific pattern and its associated template. … Targeting is a special case of the backward-looking strategy. … we rely on heuristics to select targets from the activated categories."

I propose that targets can come from more than the activated categories. If we save the log of chat, we should be able to generate simple categories from that transcript. Perhaps this is a variation on anticipating what a client might ask and as such isn't strictly the same sense as refining wildcards. It seems more and more AIML creation is adopting this style as a form of targeting.

For the record, log transcripts in the AIMLpad application are simply a single line for the input asked, followed by one or more non-blank lines of response. Each input/response exchange is separated from the next by a blank line. The transcripts do not contain prompts telling which participant wrote each line.
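
As a rough illustration of that layout, here is a minimal Python sketch that splits a transcript into input/response pairs. The function name parse_transcript and the sample text are made up for this example; AIMLpad's own reader may treat edge cases differently.

```python
from typing import List, Tuple

def parse_transcript(text: str) -> List[Tuple[str, str]]:
    """Split a transcript into (input, response) pairs.

    Assumes the layout described above: one input line, then one or
    more non-blank response lines, then a blank line.
    """
    pairs: List[Tuple[str, str]] = []
    block: List[str] = []
    # The extra empty string flushes the final block even when the
    # transcript does not end with a blank line.
    for line in text.splitlines() + [""]:
        if line.strip():
            block.append(line.strip())
        elif block:
            # First line is the input; the remaining lines are the response.
            pairs.append((block[0], "\n".join(block[1:])))
            block = []
    return pairs

sample = (
    "Hello\n"
    "Hi there!\n"
    "\n"
    "What is your name\n"
    "My name is ALICE.\n"
    "What is yours?\n"
)

for user_input, response in parse_transcript(sample):
    print(repr(user_input), "->", repr(response))
```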

AIMLpad has three options for pulling directly from the transcript.

Quick Build Categories... isn't really targeting. It reads a text file and composes categories into the document currently open in the editor. The input text file is in the same format as the log that a normal chat with the interpreter writes to the editor. So if you were to load the standard ALICE AIML set and then chat with it, you could use that transcript to start your own bot. Perhaps you'd want to edit some of the responses to emphasize the character you are creating. The advantage of doing it this way is the speed with which you can generate your base set of AIML categories. All the other techniques require you to review, one by one, the categories to keep.
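
To make the idea concrete, here is a small Python sketch of what a quick build amounts to: each input/response pair becomes one category. The helper names to_pattern and build_category are hypothetical, and the normalization (uppercase, punctuation replaced by spaces) is only an assumption about what a reasonable pattern looks like, not a description of AIMLpad's exact output.

```python
import re
from xml.sax.saxutils import escape

def to_pattern(user_input: str) -> str:
    """Turn raw input into a pattern: uppercase words, no punctuation."""
    # Assumption: anything other than letters, digits and spaces is
    # replaced by a space, then runs of spaces are collapsed.
    cleaned = re.sub(r"[^A-Za-z0-9 ]+", " ", user_input)
    return " ".join(cleaned.upper().split())

def build_category(user_input: str, response: str) -> str:
    """Compose one AIML category from an input/response pair."""
    return (
        "<category>\n"
        f"<pattern>{to_pattern(user_input)}</pattern>\n"
        # The response is treated as plain text, so &, < and > are escaped.
        f"<template>{escape(response)}</template>\n"
        "</category>"
    )

print(build_category("What is your name?", "My name is ALICE."))
```

Because every pair is converted without review, the result is only a starting point; the responses still need editing to fit your bot's character.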

Recently the "ultimate" ALICE, Program-Z, added a feature to do something similar to the quick build. You can review the log of conversations on-line and pick a particular input/response entry to convert into a category. The next option in AIMLpad is also close to this technique.

Load Dialog Text into Targets... is very similar to the quick build. It takes that same transcript and creates target entries instead of composing categories. Then you can use the Target Edit dialog to pick and choose the ones you like. I'll explain more about using the Target Edit stuff later.

Classify Dialog Text... is an attempt to directly implement Dr. Wallace's heuristics for targeting (it is incomplete and doesn't work yet). As he further states for the process:

"If the matched pattern ends with a wild card, the suggested new pattern is generated as follows. Suppose the pattern consists of [w_1,w_2,...,w_h,*], a sequence of h words followed by a wildcard. Let the input be [w_1,w_2,...,w_k] where k > h. The new pattern [w_1,...,w_h,w_{h+1},*] is formed by extending the original pattern by one word from the input. If the input is the same length as the original pattern, i.e. k = h+1, then the synthesized pattern [w_1,...,w_k] contains no wildcard."

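The heuristic quoted above can be sketched in a few lines of Python. This is only my reading of the quoted description, not AIMLpad's code: the function name suggest_pattern is invented, and the sketch assumes the input has already been normalized to plain words with no punctuation.

```python
from typing import Optional

def suggest_pattern(pattern: str, user_input: str) -> Optional[str]:
    """Extend a pattern ending in '*' by one word taken from the input.

    Returns None when the heuristic does not apply.
    """
    tokens = pattern.upper().split()
    words = user_input.upper().split()
    if not tokens or tokens[-1] != "*":
        return None  # only patterns with a trailing wildcard qualify
    head = tokens[:-1]            # the fixed words w_1..w_h
    h, k = len(head), len(words)
    if k <= h or words[:h] != head:
        return None  # the input must extend the fixed words
    extended = head + [words[h]]  # append w_{h+1} from the input
    if k == h + 1:
        # The wildcard matched exactly one word, so none is left over.
        return " ".join(extended)
    return " ".join(extended + ["*"])

print(suggest_pattern("WHAT IS *", "what is your name"))
print(suggest_pattern("WHAT IS *", "what is love"))
```

The first call yields a longer pattern that still ends in a wildcard; the second yields a pattern with no wildcard at all, because the input is only one word longer than the fixed part of the pattern.
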
All the techniques outlined so far require a transcript to process. Targeting can also work directly from target logs captured during the interpreter's operation. These target entries are detailed, capturing all the intermediate steps through <srai> elements. Target logging can be turned on and off using Ctrl-T or the menu option Tools -> Context -> Targeting.

The format of the targets log file is very similar to that used by interpreter Program-D. You need only add <targets> to the beginning of the file and </targets> to the end of the file for use in Program-D's tools. I (with some bias) suggest the editing tools in AIMLpad are better.
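
Adding the wrapper is easy to script. The Python sketch below does nothing more than surround the log text with the two tags; the function name wrap_targets is invented, and the sample entry is a placeholder rather than the real log format.

```python
def wrap_targets(log_text: str) -> str:
    """Surround raw targets-log text with a single <targets> element."""
    body = log_text.strip()
    if body.startswith("<targets>"):
        return body + "\n"  # already wrapped; leave it alone
    return "<targets>\n" + body + "\n</targets>\n"

# Placeholder text standing in for the real log entries.
raw_log = "(target entries exactly as AIMLpad logged them)\n"
print(wrap_targets(raw_log))
```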

When information is captured in the targets log file, it can be listed by activated pattern using the Tools -> AIML -> List Targets... menu option. The list is in alphabetical order of the patterns and is formatted as an outline. You can open up a pattern's entry to see all the inputs that activated the pattern. Or you can right-click on the pattern's entry to see what percentage of all the logged targets the pattern captured. Double-clicking on an input's entry will go to the Target Edit dialog.
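
The grouping and the percentage are simple to reproduce outside the tool. Here is a Python sketch, assuming the targets have already been read into (pattern, input) pairs; the function name list_targets and the sample data are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def list_targets(targets: List[Tuple[str, str]]) -> List[Tuple[str, List[str], float]]:
    """Group inputs by activated pattern, alphabetically, with each
    pattern's share of all logged targets as a percentage."""
    by_pattern: Dict[str, List[str]] = defaultdict(list)
    for pattern, user_input in targets:
        by_pattern[pattern].append(user_input)
    total = len(targets)
    return [
        (pattern, by_pattern[pattern], 100.0 * len(by_pattern[pattern]) / total)
        for pattern in sorted(by_pattern)
    ]

logged = [
    ("WHAT IS *", "WHAT IS LOVE"),
    ("*", "BANANA"),
    ("WHAT IS *", "WHAT IS AIML"),
    ("WHAT IS *", "WHAT IS A BOT"),
]

for pattern, inputs, share in list_targets(logged):
    print(f"{pattern}  ({share:.0f}% of targets)")
    for user_input in inputs:
        print("    " + user_input)
```

Sorting the same result by share instead of by pattern would put the patterns that catch the most inputs first.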

So when it comes down to editing a single category, AIMLpad almost always ends up at the Target Edit dialog. It would be good to understand this workhorse of an interface. In fact, it is such a special dialog that it has its own shortcut, Ctrl-W. Actually, it is not very pretty; I've seen better. It is modeled on the one found in Program-D. It also has one quirk to watch out for.

The top part of Target Edit dialog has read-only items indicating the input and what matched it. If you are editing a category found through a search, these items will be empty. A real target will have them filled in. Even when you loaded the dialog text into targets, AIMLpad found this information for you. If the match pattern does not have a wildcard, you probably don't have a good candidate for the target philosophy.

But who cares? This is the dialog where all good things take place. An author might just as well start AIMLpad.exe and press Ctrl-W to get started. If a target is in the way, press the Clear button! Let's get started writing AIML categories...

Your input is matched against the pattern. For your convenience, the pattern is shifted to uppercase automatically, and so are the <that> and <topic> filters. But you will have to be sure you don't use punctuation in these three data items. You should leave all that formatting for the template area, which is the big input box in the middle of the Target Edit dialog.

Most of the work is building the template or reply. On the left hand of the dialog there is a multi-choice list indicating Reply or Template. This is the tricky part which you have to watch out for! Often you won't use the complex AIML elements for formatting the response, so leaving the choice on Reply is ok. Often you won't see any difference in selecting either Template or Reply. The safe selection is Template! If you leave it on Template you won't go wrong.

So what is Reply for anyway (especially when it appears to be the default selection)? If a target has a template with a <random> AIML template element, one of several responses will be outputted. The Template view shows the list of responses as coded in the <random> tag. The Reply view shows which one was presented to the client by the activation of the category.

If you start editing the reply and then save the changes, the updated category will replace the template with just a single reply. Therefore, always edit with the Template selected to be safe. Then you will not lose any special formatting. When making new categories, this is not an issue.
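
The difference between the two views, and what gets lost, can be shown with a short Python sketch. It only imitates the behavior described above using a made-up template; the function names are invented and nothing here is taken from AIMLpad's code.

```python
import random
import xml.etree.ElementTree as ET
from typing import List

TEMPLATE = (
    "<template><random>"
    "<li>Hello!</li><li>Hi there.</li><li>Greetings.</li>"
    "</random></template>"
)

def random_options(template_xml: str) -> List[str]:
    """Template view: every choice coded inside <random>."""
    root = ET.fromstring(template_xml)
    return [li.text or "" for li in root.findall("./random/li")]

def one_reply(template_xml: str, rng=random) -> str:
    """Reply view: the single response one activation produced."""
    options = random_options(template_xml)
    if options:
        return rng.choice(options)
    return ET.fromstring(template_xml).text or ""

reply = one_reply(TEMPLATE)
# Saving from the Reply view keeps only that one response.
saved_from_reply = f"<template>{reply}</template>"

print(len(random_options(TEMPLATE)), "choices in the Template view")
print(len(random_options(saved_from_reply)), "choices after saving the reply")
```

After the save, the <random> list is gone and the category can only ever give the one response.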

There are many examples of writing AIML elsewhere, such as how to make a <random> element or how to use the <think> and <set> elements to save the context of the conversation. I'll leave this kind of tutorial for other sources.

The Target Edit dialog has some helpful buttons for composing templates. Actually they are not that helpful. Be sure to press them before entering some response since they will sometimes erase what was previously there. The <think> button adds the <think> tags around anything it finds in the template. The <random> button is similar except it also includes a <li> set for the first choice of the random options. The <sr/> button appends an <sr/> tag to the end of the current template. The <srai> button surrounds the current template with the <srai> tags. The Clear button does just that, it clears all the data inputted so far to start a new category.
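
In effect, each of these buttons is a small string operation on the template text. The Python sketch below mirrors the four descriptions above; the function names are invented, and the exact whitespace the dialog inserts is an assumption.

```python
def think(template: str) -> str:
    """<think> button: wrap whatever is in the template."""
    return f"<think>{template}</think>"

def random_block(template: str) -> str:
    """<random> button: wrap the template as the first <li> choice."""
    return f"<random><li>{template}</li></random>"

def append_sr(template: str) -> str:
    """<sr/> button: append the tag to the end of the template."""
    return f"{template}<sr/>"

def srai(template: str) -> str:
    """<srai> button: surround the template with <srai> tags."""
    return f"<srai>{template}</srai>"

print(think('<set name="topic">cats</set>'))
print(random_block("Hello!"))
print(append_sr("I see. "))
print(srai("HELLO"))
```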

That leaves the Reduce button as the last in the upper group of buttons. If you remember, the Classify Dialog Text... description referred to an algorithm Dr. Wallace has for refining inputs that matched wildcards in patterns. The Reduce button performs the steps he outlined. Therefore you need to have a target with an input and a pattern with a wildcard before the reduce will work.

I recommend you experiment with reducing to understand how it really works.

So a point can be made here. You can change the template all you want and nothing really happens. The lower group of buttons on the left makes the edits actually take effect (or go away). This gives you all the opportunity in the world to try out any of the buttons in the upper group. So play until you are comfortable with what they can do for editing AIML. I expect I'll be improving this part of the tool eventually.

If you are done with the targets, press the Drop All button and they all go away. Perhaps this is a good time to point out that the targets log file can be located or relocated through the Tools -> Context -> Set Target File As... menu option. So you don't have to destroy a targets log file if you want to do other editing like the Load Dialog Text into Targets... option explained above. There is no backup, so a Drop All action will get rid of the targets permanently. Or you can work the targets one at a time and Discard each as you go along. The Next button is good for that technique. By the way, you can't Drop All, Discard, or use the Next button when you are editing a category you found in a search of the interpreter's database.

The targets themselves never actually change. If you move to another target using the slider bar or the Next button, it will be displayed as it was logged, not with any updates you may have made earlier. So it is wise to press the Discard button after you are done with a target, lest you make the mistake of using it twice to update your AIML set. This avoids unwanted duplicate categories.

Before you are done, you may want to save any changes. You have options here. Your changes could go back into the interpreter's current database as a new category. Or your changes could be added to the end of an AIML file. Or you could replace a category in an AIML file. Or you could compose the category directly in the text editor.

A typical scenario would be to start a new document in the editor. Then use Load Dialog Text into Targets... on a transcript from some on-line chat bot. That is, of course, after you had loaded the corresponding AIML set and sorted it in the interpreter. Now you are ready for targeting. One by one, you would examine the entries in the target log, making changes where the bot's personality needs training. At each improvement, you would press the Save button to capture the new category in the AIMLpad editor. You could Discard each target as you go, but you would certainly use the Next button to step through the target log. When you have reached the end of the targets, the editor will be full of new categories to save as a new AIML file for your bot.

Instead of putting the new categories into the text editor, you could put them directly into the interpreter's database. Do this with the Add to DB button. When done, you could dump the new categories back out to a file. This would suggest a reload of the interpreter's database so the new categories could be identified as being in the new file that just got dumped.

Instead of putting the new categories into the text editor, you could put them into an AIML file. Do this with the Save In... button. You could keep adding them to a single file, or you could add them to various files depending upon your organizational scheme for AIML categories.

Another scenario could be to load the AIML into the interpreter and load the targets as in the previous setup. With each target that you make changes to, you would press the Replace In... button. The open file dialog will ask for the name of the AIML file where the old pattern can be found. You pick the right file and the category gets updated. (Note: I have an enhancement on my to-do list to default to the file the old pattern was loaded from when it was put into the interpreter. This would make improving AIML sets less painful.)
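
For readers who want to script the same kind of update, here is a rough Python sketch of replacing a category's template inside AIML text. It is an illustration only, not how Replace In... is implemented: the function name replace_category is invented, the match is on the pattern text alone (ignoring any <that> or <topic> context), and the sample AIML is made up.

```python
import xml.etree.ElementTree as ET

def replace_category(aiml_text: str, pattern: str, new_template: str) -> str:
    """Replace the template of the first category whose pattern matches."""
    root = ET.fromstring(aiml_text)
    # Parse the new body as XML so that embedded AIML tags survive.
    replacement = ET.fromstring(f"<template>{new_template}</template>")
    for category in root.iter("category"):
        if (category.findtext("pattern") or "").strip() == pattern:
            old = category.find("template")
            if old is not None:
                category.remove(old)
            category.append(replacement)
            break  # update only the first match
    return ET.tostring(root, encoding="unicode")

SAMPLE = (
    "<aiml>"
    "<category><pattern>HELLO</pattern><template>Hi.</template></category>"
    "<category><pattern>BYE</pattern><template>See you.</template></category>"
    "</aiml>"
)

updated = replace_category(
    SAMPLE, "HELLO", "<random><li>Hello!</li><li>Good day.</li></random>"
)
print(updated)
```

In a real workflow you would read the chosen file, apply the change, and write the text back, keeping a copy of the original since nothing here makes a backup.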

Targeting is a tedious operation that could be better done through automation. Perhaps by exploring some of the script language features, we can find a simpler way than reviewing and editing to train our bots. I hope, when I get the Classify Dialog Text... option finished, to have reduced some of this to the most likely candidates for good targeting. (Dr. Wallace sorted his results by the percentage of inputs matching a pattern with wildcards to get the biggest bang for the buck.)

(Edited for the SourceForge site by Stefan Zakarias)