Application of Expert Agents/Assistants in Library and Information Systems
by
James G. Williams and Ken Sochats
Department of Information Science and Telecommunications
University of Pittsburgh
Abstract
This article will present a brief overview of expert agents and assistants and then discuss activities in several areas of library and information centers where expert systems have been developed and where intelligent agents/assistants could be of practical use in such activities. A discussion of what an intelligent agent or assistant might perform in these areas is outlined.
1. INTRODUCTION
Although expert systems can and have been applied to many areas within Library and Information Science, the most popular areas have been on-line retrieval, reference and referral services. Other areas which have received a relatively significant amount of attention are cataloging and classification, indexing and abstracting, collection development, acquisitions and management information.
MCDONALD & WECKERT provide a set of articles from a conference and workshop held at Charles Sturt University in Riverina, Australia which discuss expert systems in many of these areas. CAVANAGH also provides an excellent overview of the application of expert systems in library and information centers. His article covers selection/acquisitions, cataloging, classification, indexing, reference, user models and interfaces for searching, post-search processing and intelligent tutoring. He provides a brief description of prototypical systems that exist for these areas.
Over the last several years, there has been considerable effort in developing intelligent agents and assistants which are experts in a particular domain such as office automation. These agents and assistants have been developed with varying levels of intelligence using expert system principles and techniques although they are not necessarily expert systems in the classical sense.
2. EXPERT SYSTEMS
Generally, expert systems are classified as a subset of the field of Artificial Intelligence (AI). Expert systems are systems that employ a codification of human expertise to solve problems. The goal of the expert system is to emulate the problem solving process of the expert(s) whose knowledge was used in the development of the system. Thus, the expert system's performance is highly dependent on the quality of the expert. Typically, expert systems are narrowly limited in their domain of operation.
The structure of an expert system is shown in Fig 1.
Figure 1. Expert System Structure
An expert system is comprised of three component areas. First is the knowledge base. The knowledge base contains the representation of the expert's knowledge in the problem domain. Next is the database that contains specific facts about the problem situation under study. Finally, the inference engine is a programme that applies the expertise of the knowledge base to the facts in the database to solve the problem at hand.
The knowledge base is the 'nerve centre' of the expert system. The knowledge base represents both the expertise and the reasoning strategies of the expert in the problem domain. Many different representational schemes have been used by expert systems researchers to structure the knowledge base. These range from network structures that interrelate the expert's domain knowledge to frame representations that represent knowledge as schematic data structures to rule based systems that codify the expert's knowledge as a set of rules or heuristics.
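The interplay of the three components can be illustrated with a minimal sketch of a rule based system. The rules, fact names and the function forward_chain below are hypothetical and chosen only for illustration; they are not taken from any of the systems cited in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One IF-THEN rule: when every premise is a known fact, add the conclusion."""
    premises: frozenset
    conclusion: str


def forward_chain(knowledge_base, facts):
    """Inference engine: apply the rules to the facts until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in knowledge_base:
            if rule.premises <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                changed = True
    return known


# Knowledge base: hypothetical rules standing in for codified expertise.
KNOWLEDGE_BASE = [
    Rule(frozenset({"item_requested", "item_not_in_collection"}), "consider_acquisition"),
    Rule(frozenset({"consider_acquisition", "within_budget"}), "recommend_purchase"),
]

# Database: facts about the specific problem situation under study.
FACTS = {"item_requested", "item_not_in_collection", "within_budget"}

if __name__ == "__main__":
    print(sorted(forward_chain(KNOWLEDGE_BASE, FACTS)))
```

Keeping the rules as data, separate from the loop that applies them, mirrors the separation between knowledge base and inference engine described above: the rules can be revised without touching the engine.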
Two major issues arise during the development of the knowledge base. The first is the source of the knowledge base content. Knowledge bases can be derived from the expertise of single experts in single domains as in the Caduceus system for medical diagnosis (Pople, Myers, and Miller). Many common systems rely on the use of knowledge from several experts in the same domain that is merged to form a single knowledge base.
The Prospector system for mineral exploration (Duda, Gaschnig, and Hart) has several sub knowledge bases derived from experts with differing areas of expertise. The Prospector system applies each of these knowledge bases to a problem and evaluates the outcomes of these to arrive at a single final recommendation.
The second issue in knowledge base development is the extraction of the expert's knowledge. Many expert system developers rely on interview techniques to get their expert(s) to explain the techniques and information that they bring to bear in solving the domain problems. Ericsson and Simon have introduced the formal methodology of protocol analysis to capture an expert's reasoning. In protocol analysis, the actions of an expert are completely recorded as the expert solves a series of problems. The expert is also encouraged to 'think out loud' or to talk through the steps he or she is taking during the solution of the problem. These recordings become a protocol which, when analysed, provides insight into the expert's problem solving skills.
Rulecons is an expert system to aid in the formulation of knowledge bases. Rulecons assists the expert system developer or knowledge engineer during interviews of the expert(s). It is a system for creating domain independent, rule based knowledge bases. Rulecons helps the designer to maintain consistency and achieve completeness in the knowledge base.
3. INTELLIGENT AGENTS
An agent is a software module that monitors a specified set of events, messages and information flow for a user or group of users and performs useful functions that reduce the need for the user(s) to be notified, involved, interrupted or otherwise bothered to deal with all aspects of the activity, task or related sub-tasks. Agents that have been developed in various application areas range from simple extensions to existing software such as a word processor or spreadsheet to complex workflow procedures. Such agents assist in tasks such as scheduling meetings, filtering e-mail messages, searching for specific types of task related information or making decisions about routing forms for authorization or performing anticipated additional work functions.
The two main features of intelligent agent software for assistance are that 1.) it must be able to learn and 2.) it must be able to negotiate. Negotiation is required because it must be able to interact with other agents (which may be humans) in order to both learn and complete its tasks. It must be able to learn since it must adapt to the user for whom it is an assistant and the changing environment in which it exists. Such agents have been developed for calendar management, optimized use of the UNIX operating system and "surfing" the Internet. The most popular application of intelligent agents has been in the office domain for handling information flow such as mail filters, news filters, and library browsers.
Intelligent agents must work autonomously in the background so that they can perform their work with as little user intervention as possible. This requires the agents to negotiate with other agents and reason about available resources and alternative solutions to task related problems in order to minimize user interaction.
Intelligent agents must be able to learn a user's (expert's) approaches to performing tasks. A number of approaches have been taken to provide learning mechanisms such as "learning-by-being-told" or "dialog learning" and "programming-by-demonstration." MAES discusses four methods for acquiring data for learning that are incorporated in intelligent agents:
In all cases, expert agents need some knowledge acquisition mechanism which may include the selection of various machine learning algorithms. Machine learning consists of tools and algorithms for learning classifiers, decision trees, rules for control knowledge, learning protocols, finite state machines and other operational structures for learning group support structures. Learning protocols permit an agent to acquire or improve strategies for negotiation with other agents and humans. It is possible to distribute the learning task to many agents thereby enabling reasoning types of systems. The most recent agent/assistance software is constructed so that it improves over time by learning dynamically during actual operation of its designated tasks. This type of software is called a learning apprentice or learning assistant.
Toolkits for intelligent agent development are beginning to appear. Lotus Notes has under development customizable intelligent agents for filter and search assistance for various databases. The SHADE system, implemented using the Knowledge Query and Manipulation Language (KQML), supports concurrent engineering by acting as a mediator between information consumers and providers. The Knowledgeable Agent-oriented System (KAoS) is a distributed artificial intelligence environment based upon the Common Object Request Broker Architecture (CORBA) standard that enables expert agent applications to be developed in distributed Smalltalk.
Agents which have been developed and are currently quite popular fall into two major classes, namely, filtering agents and search agents. The development of systems of agents is also being actively pursued. These are briefly discussed below.
3.1 Filtering Agents
Information filtering agents have been the focus of attention in the e-mail domain, where they pre-sort or pre-select e-mail messages for certain types of actions such as forwarding to colleagues. Lotus Notes, Inbox Assistant, Mailbox Assistant and Microsoft Exchange Server have such filters. Mailbox Assistant augments its user supplied rules with learned rules based on e-mail transmissions. Similar types of agents have been developed for filtering Internet News, WWW pages, library documents, video database browsing and other sources of information.
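A rule driven mail filter of the kind described above might be sketched as follows. The class names, fields and actions are assumptions made for illustration and do not reproduce the behaviour of any product named in this section.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    subject: str


@dataclass
class FilterRule:
    field: str     # which message attribute to inspect: "sender" or "subject"
    contains: str  # case-insensitive substring to look for
    action: str    # hypothetical action label, e.g. "forward" or "file"


def choose_action(message, rules, default="inbox"):
    """Return the action of the first rule that matches, else the default."""
    for rule in rules:
        value = getattr(message, rule.field)
        if rule.contains.lower() in value.lower():
            return rule.action
    return default


if __name__ == "__main__":
    rules = [FilterRule("subject", "budget", "forward")]
    print(choose_action(Message("dean@example.edu", "Budget meeting"), rules))
```

In this sketch the rules are supplied by the user. A learning filter could append rules it has inferred from observed behaviour to the same list, so that supplied and learned rules are applied by one mechanism.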
3.2 Search Agents
Filtering agents assist in interactive browsing while search agents automatically search for information. This can be of enormous help with millions of WWW pages and large library collections as well as corporate databases. The computer programs referred to as Web Crawlers or Spiders index WWW pages of interest for their user(s). WebWatcher is an example of a learning apprentice which acquires knowledge about useful HTML links that correspond to a user's interest. The learning mechanism is relevance feedback from the user. Frequently Asked Questions (FAQ) files at the various Internet sites are the target of Hammond's FAQ-Finder which applies parsing and classification techniques to find relevant keywords and the type of question in a request (How to, verify, when, etc.). The Commerce Business Agent reasons and plans for requests that may be asked of the Commerce Business Daily's on-line database. Many other such agents have been developed for use with business information as well as technical reports on the Internet.
3.3 Systems of Agents
While the agents discussed above have been developed for a specific type of task, agent systems will be developed for multiple tasks that are interrelated or at least performed by the user on a frequent basis. A system of agents which can share knowledge and assist users for complex information processes is the desired goal of expert agents. At the current time there are no fully operational, generally applicable agent systems but rather various categories of specialist agents each designed to perform a specific task. But experimentation with such systems is underway in several academic and commercial environments. Several critical issues are being investigated related to intelligent agents. These are:
These and other issues are a fertile ground for research and development over the next several years.
4. INFORMATION RETRIEVAL AND CATALOG SEARCHING
Studies of OPAC use have revealed that effective use of on-line retrieval is difficult to achieve for novices as well as experts. A major problem area is the match between a user's search terms and correct subject terms. The correspondence between the terms a user supplies and the terms actually used for indexing items is typically less than 50%. Additional studies have shown that a lack of understanding of how to use an on-line database or OPAC accounts for its ineffective and inefficient use.
BORGMAN found that simple typing errors and logical errors accounted for a significant amount of user dissatisfaction with on-line searching. An expert agent or assistant would need to understand the user's needs, intentions and goals and develop a plan to meet the goals. An expert agent to aid in searching would also need to deal with the problem of term selection. This might be done in a variety of ways such as spell checking, thesaurus lookup of antonyms and synonyms, rearranging words in a multiword search term, dealing with prefixes, suffixes and infixes, accounting for varying spacing and utilizing the results of previous searches to find additional terms or eliminate terms. An intelligent agent would perform the following tasks:
5. REFERENCE SERVICES
Reference questions are answered using a range of different types of sources such as atlases, dictionaries and encyclopedias. When to use a particular type of source and the specific tool to utilize within the source class is the domain of the expert reference librarian.
This requires a knowledge of what reference tools are available, general strategies of when to use which tools for a specific type of question and finally, a deeper knowledge of interviewing users and the structure of knowledge within a particular domain. This can be viewed as a decision-making process. An essential step in answering a reference question of the "ready reference" type is to map the question to a category of tools. For example, IF the question requires information about basic concepts THEN an encyclopedia is a good source. IF the question requires information about a word, THEN a dictionary is a reasonable choice.
Thus, the question must first be categorized into a type of reference tool category before a specific tool in the category can be selected. This can be thought of as a simple forward chaining technique which, after analysis of the question which was asked, is the initial expert system task to be performed. The next step or task can be considered a backward chaining task that attempts to map the elements of the question to the most appropriate reference source within the category determined in the first step. This is done by matching characteristics of the question to characteristics of the source. IF the question is initially mapped to the category of dictionaries because it dealt with the meaning of a word or phrase and it was a medical term for which the user wanted an illustration, THEN a specific illustrated medical dictionary such as Dorland's Illustrated Medical Dictionary would be an appropriate selection.
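The two steps described above can be sketched as follows. This is an illustration only: the question type labels, the feature names and the placeholder sources are hypothetical, and the second step is shown as a plain overlap of characteristics rather than a full backward chaining engine. Only the two IF-THEN rules and the Dorland's example come from the text.

```python
# Step 1: rules that map a question type to a category of reference tool.
CATEGORY_RULES = {
    "basic_concept": "encyclopedia",
    "word_meaning": "dictionary",
}

# Step 2: sources within each category, described by their characteristics.
SOURCES = {
    "dictionary": [
        {"title": "Dorland's Illustrated Medical Dictionary",
         "features": {"medical", "illustrated"}},
        {"title": "General dictionary (placeholder)",
         "features": {"general"}},
    ],
    "encyclopedia": [
        {"title": "General encyclopedia (placeholder)",
         "features": {"general"}},
    ],
}


def select_source(question_type, question_features):
    """Categorize the question, then pick the source sharing most characteristics."""
    category = CATEGORY_RULES.get(question_type)
    if category is None:
        return None  # no rule covers this type of question
    best = max(SOURCES[category],
               key=lambda source: len(source["features"] & question_features))
    return category, best["title"]


if __name__ == "__main__":
    print(select_source("word_meaning", {"medical", "illustrated"}))
```

A question type that no rule covers returns None, which is the point at which a human reference librarian would need to take over.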
The fact that it is possible to codify the rules for answering "Ready Reference" questions based on a relatively simple model does not mean that it can replace a reference librarian for all ready reference questions but it does mean that there is a level at which an expert agent can support answering ready reference types of questions. One difficulty is getting questions into a form that can be analyzed by an automated system so as to first categorize the question by class of tool appropriate and secondly by extracting features from the question so as to be able to choose a specific tool within a class.
As Forrester and Garner (FORRESTER & GARNER) point out, the world of answering reference questions is not as simple as that portrayed above. An expert system for reference must perform the role of a human intermediary which they state comprises the following stages:
FORRESTER & GARNER base the generation of a reference map upon the declarative semantics of a user's question in propositional form, which they categorize into seven proposition types:
7. How-to (activity-select)
This scheme requires an intelligent agent to have both lexical and semantic information of all referenced knowledge sources.
The capability to respond to a user's natural language request is a goal of some expert systems in the reference arena. Requests such as "I need a bibliography on distributed databases" and "I need the name of a distributed database system" represent two quite different types of questions. The first can be answered by bibliographic data but the second requires the content of a source such as a textbook or a reference source such as Datapro. What they have in common is that, to understand what the question is about, some level of natural language processing must be performed and the result mapped to a database which contains the answer to the question.
Several relatively successful attempts have been made to answer the first type of question (bibliography) but successful attempts at answering the second type of question at a general level have remained elusive. The National Agricultural Library of the United States Department of Agriculture (USDA) developed a system called ANSWERMAN in 1986 for assisting in answering questions related to a wide variety of reference books and other reference materials in the area of agriculture. This was referred to as an "Expert Advisory System."
This was followed by a system called "AquaRef" which acts as an expert advisory system in the narrow domain of aquaculture. A major concept in this system was to permit users to build "chains" of knowledge bases representing their own collections and expertise which could be chained to others' knowledge bases thereby extending the overall knowledge base using a form of distributed expertise.
Intelligent agents developed to be experts in specific domains for answering reference questions would need a database of domain sources, a database of frequently asked questions (FAQ) with answers, a thesaurus for vocabulary control, a concept or topical hierarchy or network of domain knowledge, a parser plus lexical and semantic analyzers for analyzing questions being asked, a powerful pattern matcher, a classifier for determining question and answer categories, a search engine for retrieving relevant answers, and a relevance feedback mechanism to learn how to better perform the reference task. The intelligent agent would continually update its databases, its concept hierarchy and its classification scheme based on relevance feedback.
If the intelligent agent maintained a profile of users who asked questions and relevance feedback information related to previous questions asked, it could perform its task in a very personalized manner.
6. REFERRAL SERVICE
McDonald (MCDONALD) describes a referral/reference prototype called DISTREF used by students involved in distance education at Charles Sturt University in Australia. This prototype system captures knowledge of academic disciplines as well as knowledge provided by lecturers and librarians who deal with inquiries from students. The system is composed of the following major components:
The system has a knowledge editor module for creating and maintaining concept hierarchies and a knowledge integrator module for building structural relationships among concepts.
DISTREF attempts to stimulate concept formation and exploration by users who do not have a reference librarian or instructor readily available for asking questions and receiving answers.
An intelligent agent for the student in a distance education program might observe the information (in the form of a course syllabus, lecture notes, exercises, problem sets, etc.) related to the current task (specific course related activities) a student is working on and could launch a search for potential referral items, experts, locations, organizations and other artifacts which are relevant to the current context in which the student is working. The agent might have learned a "pattern of interaction" capable of detecting when the student needs to be referred to sources of help and offer suggestions without being explicitly asked to provide such help. This is analogous to a teacher observing students in a class during a lecture or question-answer session and determining that they need help in understanding the concepts being presented. It is also analogous to a teacher observing a student attempting to solve a problem or perform an experiment and detecting that they need assistance. In some cases, the amount of time spent on a particular activity indicates a need for assistance whereas in other cases there is a pattern of interaction with the material, problem or experiment that indicates a need for assistance.
The difficult task for such an intelligent agent is not monitoring what the student is doing at the workstation since it is relatively simple to capture all keyboard, mouse, window and data related activities. The difficult task is determining how to classify and interpret the monitor data such that interaction patterns can be detected and utilized to provide assistance. Such an intelligent agent must be able to learn from past experience with the user or by having an expert such as a teacher enter patterns known to indicate when there is a need for assistance.
7. INDEXING AND ABSTRACTING
Traditional information storage and retrieval systems (ISR) use an inverted file approach to indexing. This causes problems with the post-coordination of uncontrolled vocabulary resulting in poor recall/precision and the storage of unused terms. The use of positional, truncation and substitution operators alleviates some of this problem for searching but syntactic and contextual information is lost in an inverted index. The Structured Information Management: Processing and Retrieval (SIMPR) project funded by ESPRIT attempts to process source documents and make them retrievable using expert system techniques rather than full inversion of the documents. SIMPR maintains a model of the information base into which documents are entered and from which retrieval is performed. The system uses lexical and syntactic processing modules for storage and search rather than simply character matching. The text processing system validates text words, performs a structural mark up of a document, performs a morphological analysis of each word via a lexicon lookup using 400 disambiguation rules and then performs a syntactic analysis of words relative to the role they play within a sentence and appends an appropriate role tag.
The task of automatic indexing or subject cataloging of documents has been approached from many angles but all approaches require selecting terms from the text of the document or document surrogate, identifying those terms most indicative of the content or "aboutness" of the document and mapping them to existing terms via a thesaurus or creating a new category of terms when no thesaurus match is found. Expert systems attempt to select anchor terms or phrases using a set of rules or heuristics such as:
The SIMPR indexing software extracts analytics (index terms) that represent the information content of the text and further classifies the index terms for subject searching. Each sentence in the text is passed through a series of Augmented Transition Networks (ATN) for extraction of candidate analytics using a set of processing rules. The candidate analytics are then passed through a series of normalization and transformational procedures to ensure standard structure and word forms. The words are also compared to a set of stop word lists for possible rejection as an index term. The results are presented to a human for rejection, modification or acceptance.
The SIMPR classification system uses an expert system to learn how a user classifies documents and suggests likely classification categories. This learning approach is seen as the most practical method for determining classification attributes. This function falls into the category of an intelligent agent since it must observe an expert user and learn how the user performs a task, e.g. classification. The suggestion of classification categories is an assistant function. Therefore, the SIMPR classification expert system falls into the fuzzy area between a classical expert system and an intelligent agent/assistant.
The Artificially Intelligent Document Analyzer (AIDA) project developed a range of document analysis techniques to indicate the content of a document. AIDA divides document analysis into three stages:
The document structure analysis breaks the document into words, sentences, paragraphs, sections, etc. and identifies structural entities such as headings, numbering conventions, and other document structural and countable facts. Content analysis is performed using sentence position, cue identification and section headings to identify key sentences which are then selected as representing the content of the document in much the same manner as an expert abstracter. The knowledge acquisition portion of AIDA used a professional librarian in the Australian Parliamentary Library in order to develop the knowledge base. The output is an abstract indicating the content or "aboutness" of the document using the "key" sentences found during the content analysis processing phase.
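The use of sentence position, cue identification and section headings to pick key sentences can be illustrated with a small scoring sketch. This is not the AIDA rule base: the cue phrases, the weights and the function names are assumptions chosen for illustration.

```python
import re

# Hypothetical cue phrases; a real system would derive these from an expert.
CUE_PHRASES = ("in summary", "in conclusion", "this paper", "we propose")


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, index, total, heading_words):
    """Score one sentence by position, cue phrases and overlap with the heading."""
    score = 0.0
    if index == 0 or index == total - 1:
        score += 1.0  # first and last sentences of a section
    lowered = sentence.lower()
    if any(cue in lowered for cue in CUE_PHRASES):
        score += 1.0  # sentence contains a cue phrase
    words = set(re.findall(r"[a-z]+", lowered))
    score += 0.5 * len(words & heading_words)  # words shared with the heading
    return score


def key_sentences(section_text, heading, top_n=2):
    """Return the top_n highest scoring sentences in their original order."""
    sentences = split_sentences(section_text)
    heading_words = set(re.findall(r"[a-z]+", heading.lower()))
    scored = [(score_sentence(s, i, len(sentences), heading_words), i, s)
              for i, s in enumerate(sentences)]
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:top_n]
    return [s for _, _, s in sorted(best, key=lambda item: item[1])]
```

The selected sentences are returned in document order so that they can be read as a draft indicative abstract.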
The AIDA system can be thought of as having many characteristics needed for an intelligent agent or assistant. If an abstracter, indexer or subject cataloger had such an agent available, it would present its suggestion for the content of the document which could then be utilized as a draft abstract, a surrogate for indexing purposes, or a statement useful for selecting subject terms and classification numbers. This agent might interact with an indexing agent, a classification agent and/or a subject cataloging agent to provide a basis for their specific tasks.
An expert agent might also observe professional abstracters and indexers as they perform their work tasks and learn the patterns used to create abstracts and select or generate index terms. The agent could then determine where the "key" sentences for abstracts and the index terms selected for documents come from, whether within the documents themselves or in related authorities such as a thesaurus or subject classification schedules. The agent could then determine what patterns exist in these selections for different document types as well as similarity measures among documents so as to use this knowledge to make suggestions to an abstracter or indexer.
A simple similarity measure between a new document to be abstracted or indexed and previously abstracted and indexed documents could provide prototypical abstracts and index terms for a new document. This will require maintaining information about the source of key sentences and index terms for documents already abstracted and indexed such that the intelligent agent could utilize this information in making suggestions to an indexer or abstracter about new documents to be processed. Structural information about documents will be necessary to perform this task. This would suggest that the structural markup of documents such as that provided by the Standard Generalized Markup Language, SGML, will be required for such an intelligent task. Thus, it will be possible to easily and quickly identify structural components of documents such as title, author, abstract, summary, conclusion, headings, tables, lists, formulas, diagrams, etc.
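One simple similarity measure of the kind suggested above is the Jaccard coefficient over the sets of words in two documents. The sketch below uses it to propose the index terms of the most similar previously indexed document. The choice of measure and all names are assumptions for illustration, and no use is made here of structural markup.

```python
import re


def terms(text):
    """Reduce a text to its set of lower-case words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard coefficient: shared terms divided by all distinct terms."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_index_terms(new_text, indexed_docs):
    """indexed_docs is a list of (text, index_terms) pairs already processed.

    Returns the best similarity score and the index terms of that document.
    """
    new_terms = terms(new_text)
    best_score, best_terms = 0.0, []
    for text, index_terms in indexed_docs:
        score = jaccard(new_terms, terms(text))
        if score > best_score:
            best_score, best_terms = score, index_terms
    return best_score, list(best_terms)
```

An agent with access to structural markup could apply the same comparison to titles, abstracts or headings separately and weight the results.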
It is hypothesized that expert abstracters and indexers utilize the structural components as cues for selecting key sentences, index terms, classification numbers, subject headings, etc. The use of a thesaurus, subject heading list or classification schedule as a verification mechanism for selecting terms, phrases and key sentences might be more useful if the entries in these devices had some indication or explanation of how they were derived, what the sources of derivation were and under what circumstances they were most useful. This is analogous to the roles and links concept associated with index terms and subject headings.
8. COLLECTION DEVELOPMENT AND ACQUISITION
The economic crisis related to the acquisition of library materials has increased the importance of selecting and acquiring the best and most relevant materials within budgetary constraints. Selection of materials has always been the domain of experts in collection development. Expert systems have been developed to select materials in very specific domains such as the Monograph Selection Advisor which supports the selection of materials in classical Latin literature and others such as that described by RADA which makes decisions on journal selection. An expert in any area including collection development must have three types of knowledge, namely:
where domain theory + problem solving = heuristic rules
In collection development, the kinds of knowledge required are:
The critical element is the selection criteria used in the expert system. These can be acquired via user defined criteria as an enumerated list with sub issues, range of values for certainty and how different criteria can be combined for weightings or establishing priorities. Another approach for establishing selection criteria is to use machine learning based on a decision theoretic model by observing how an expert collection developer classifies potential selections in terms of acceptance and rejection of specific items and the explanation for such decisions. The statistical model resulting from such observations and the features of the items selected and rejected become the criteria for future recommendations for acceptance or rejection.
It may also be possible to train a neural network to build patterns for determining the features or criteria for acceptance or rejection of specific items. Of course, selection criteria will change over time and therefore it becomes necessary to constantly monitor the selection patterns and determine when such changes are occurring. These changes must then be used to update the knowledge base by adding new rules, deleting old rules or modifying old rules.
The Librarian Assistant expert system has been built using the expertise of a professional technical librarian (Debrower and Jones). The domain of this system is the collection of the Applied Physics Laboratory at Johns Hopkins University. The subject areas covered include engineering, computer science, physics, applied mathematics, telecommunications and other technical areas relating to the Laboratory.
Johnston and Weckert describe an expert system called Selection Advisor. This system assists in collection development at academic libraries. The system has rules for the application of the following six selection criteria:
* Subject.
* Intellectual content.
* Potential use.
* Relation to existing collection.
* Bibliographic considerations.
* Language.
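One way in which ratings on these six criteria might be combined into a recommendation with an explanation is a weighted sum. The sketch below is an assumption made for illustration and does not describe how Selection Advisor applies its rules; the weights and the threshold are hypothetical.

```python
# Hypothetical weights for the six criteria listed above; they sum to 1.0.
CRITERIA_WEIGHTS = {
    "subject": 0.30,
    "intellectual_content": 0.20,
    "potential_use": 0.20,
    "relation_to_collection": 0.15,
    "bibliographic_considerations": 0.10,
    "language": 0.05,
}


def recommend(ratings, threshold=0.6):
    """Combine ratings in the range 0.0 to 1.0 into a recommendation.

    Returns (accept, score, criteria ordered by contribution to the score).
    """
    missing = set(CRITERIA_WEIGHTS) - set(ratings)
    if missing:
        raise ValueError("missing ratings for: " + ", ".join(sorted(missing)))
    contributions = {c: CRITERIA_WEIGHTS[c] * ratings[c] for c in CRITERIA_WEIGHTS}
    score = sum(contributions.values())
    explanation = sorted(contributions, key=contributions.get, reverse=True)
    return score >= threshold, score, explanation
```

Returning the criteria in order of contribution gives the selector a simple explanation of why an item was or was not recommended.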
If we had an intelligent agent that could observe sources of requests for materials (such as e-mail requests), changes in the environment served by the collection (such as new courses, research projects, services or products), the content of the existing collection (such as an OPAC or document database), and sources of new materials (such as publisher announcements and lists), then it would be possible for the agent to be given a set of initial rules matching the collection development policy of the library or information center and let it observe how an expert bibliographer makes decisions concerning requests and newly available materials. The intelligent agent would build a flexible decision model and a set of heuristics for assisting the collection development personnel. For example, when an e-mail request for materials arrives, the agent could parse the request, determine whether the item is already in the collection, determine whether it is in another collection from which it could be borrowed, verify the bibliographic entry, process it against the collection development policy and the heuristics provided or learned and prepare a recommendation with explanation as to whether the item should be acquired and from whom. Budgetary constraints could also be a part of the rule base and budgetary data could be used as part of the decision making process.
The collection development personnel's task would change from the data collection and analysis that currently characterize the job to evaluating the explanations and data used by the intelligent agent for its recommendations. An agent that observed multiple collection development personnel might impose a level of consistency that is difficult to achieve using manual systems.
9. CLASSIFICATION OF LIBRARY MATERIALS
The classification of library materials using the Universal Decimal Classification (UDC) system, Library of Congress Classification (LC) system, Dewey Decimal Classification (DDC) system or others is a fundamental function for libraries which has traditionally been performed by experts in classification. The N-Cube expert system reported by COSGROVE is an example of using expert system technology to assist in the classification of library materials using the UDC. This system combines shallow knowledge usually represented as production rules and deep knowledge usually represented in a frame structure by representing classification knowledge as an integration of object oriented structures with associated rules and hypotheses. The N-Cube architecture consists of a parent class, sub-classes, a rule set, class inheritance and a set of hypotheses. This is represented as a tree structure which is searched based upon information in the bibliographic elements of a title to be classified. When a rule (one or more facts) matches a node in the tree, it is verified via a user or a knowledge base of facts and its classification number is extracted. The node's sub-tree can then be searched to determine more specific extensions to the base number as desired.
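The tree search described above can be illustrated with a small sketch. It is not the N-Cube implementation: the `ClassNode` type and the `classify` function are hypothetical, each rule is reduced to a set of facts that must all be present, and the verification step (via a user or a knowledge base) is omitted. The class numbers and labels are illustrative values only.

```python
from dataclasses import dataclass, field


@dataclass
class ClassNode:
    number: str                 # classification number held by this node
    label: str
    rule: frozenset             # facts that must all be present for a match
    children: list = field(default_factory=list)


def classify(node, facts):
    """Return the most specific matching (number, label), or None."""
    if not node.rule <= facts:
        return None
    # The node matches; search its sub-tree for a more specific class.
    for child in node.children:
        deeper = classify(child, facts)
        if deeper is not None:
            return deeper
    return (node.number, node.label)


# Illustrative fragment of a hierarchy; not copied from any schedule.
root = ClassNode("5", "Mathematics and natural sciences", frozenset({"science"}), [
    ClassNode("53", "Physics", frozenset({"physics"}), [
        ClassNode("535", "Optics", frozenset({"optics"})),
    ]),
    ClassNode("51", "Mathematics", frozenset({"mathematics"})),
])

# Facts would be derived from the bibliographic elements of the title.
print(classify(root, {"science", "physics", "optics"}))
```

Because a child is consulted only after its parent matches, each level of the tree narrows the base number, which mirrors the idea of searching a node's sub-tree for more specific extensions.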
Classification schemes may be knowledge based, such as UDC and DDC, or they may be materials based, such as LC. Knowledge based schemes would appear to be easier to apply expert system technology to than materials based schemes. The basic concept of classification is to keep material about the same subset of knowledge together. Therefore, a set of rules could be formulated that depict the logic used by a classifier to select a classification category and then build the appropriate extensions to the number. Extensions to a classification number are formulated by rules that dictate how to generate those digits beyond the general category number to provide specificity for the location of the material. There are also rules for handling the form of the material (dictionaries, periodicals, associations, historical, collections, bibliographies, etc.). These rules are provided in the schedules of the classification schemes.
A particularly sticky problem is the classification of non-traditional materials. Schultz describes the design of an expert system for classifying music scores using the Dewey Decimal System. The MUSic Cataloguing (MUSCAT) system is also designed to assist in cataloguing musical pieces (Fulton).
An expert agent could assist a classifier by utilizing bibliographic elements in the cataloged material to suggest one or more general categories. For example, the author entry may be used to look up other materials written by the same author and use their classification numbers as potential starting points. This is based on the assumption that authors tend to write on the same general knowledge area for extended periods of time. Therefore, the most recent materials would be the most useful.
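A minimal sketch of the author-based lookup follows. The `candidate_classes` function, the record layout and the sample catalog entries are all hypothetical; the only idea taken from the text is that earlier works by the same author, most recent first, supply candidate starting points.

```python
def candidate_classes(author, catalog, limit=3):
    """Suggest classification numbers from the author's earlier works,
    ordered from the most recent work to the oldest."""
    works = sorted(
        (rec for rec in catalog if rec["author"] == author),
        key=lambda rec: rec["year"],
        reverse=True,
    )
    suggestions = []
    for rec in works:
        # Keep each number once, at the position of its most recent use.
        if rec["class_number"] not in suggestions:
            suggestions.append(rec["class_number"])
    return suggestions[:limit]


# Hypothetical catalog records, for illustration only.
catalog = [
    {"author": "Doe, J.", "year": 1988, "class_number": "621.38"},
    {"author": "Doe, J.", "year": 1993, "class_number": "535"},
    {"author": "Doe, J.", "year": 1991, "class_number": "535"},
    {"author": "Roe, A.", "year": 1990, "class_number": "510"},
]

print(candidate_classes("Doe, J.", catalog))
```

The returned numbers are starting points only; the classifier, or further rules based on subject headings and title words, would still confirm or refine them.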
A match between a previously written material's subject headings or index terms and those of the current material might be used to verify that the current material is in the same knowledge area. The intelligent agent might then examine other bibliographic elements such as subject headings, title words and phrases, etc. to generate a more specific classification number within the general category and to determine whether the material requires form sub-divisions. If the material requires form sub-divisions, the form sub-division rules of the classification schedule would be used to generate further extensions to the classification number. This would require that the form sub-division rules be expressed as a set of IF - THEN - ELSE production rules. The expert agent could then collect data on the classifier's acceptance, rejection and modification of its recommended classification numbers to refine its rule base for future processing of bibliographic entries. Obviously, such an agent could have a highly interactive component to assist a classifier in navigating a classification schedule and formulating form sub-divisions of classification numbers, e.g. country classes for national bibliographies.
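One way to picture form sub-division rules expressed as IF - THEN - ELSE production rules, together with the feedback data the agent could collect, is sketched below. Everything here is an assumption made for illustration: the rule conditions, the suffix values (loosely modelled on Dewey-style standard subdivisions), the simple way a suffix is attached to the base number, and the names `FORM_RULES`, `add_form_subdivision` and `record_feedback`. A real schedule has additional rules for how sub-divisions are attached.

```python
from collections import Counter

# Each production rule is a (condition, suffix) pair; rules are tried in order.
# Conditions and suffixes are illustrative placeholders, not a real schedule.
FORM_RULES = [
    (lambda rec: "dictionary" in rec["title"].lower(), "03"),
    (lambda rec: rec.get("serial", False), "05"),
    (lambda rec: "history" in rec["subjects"], "09"),
]


def add_form_subdivision(base_number, rec):
    """Extend a base number with the first form sub-division that applies."""
    for condition, suffix in FORM_RULES:
        if condition(rec):                      # IF the condition holds
            return f"{base_number}.{suffix}"    # THEN extend the number
    return base_number                          # ELSE leave it unchanged


feedback = Counter()


def record_feedback(suggested, final):
    """Tally whether the classifier kept or changed the suggested number."""
    feedback["accepted" if suggested == final else "modified"] += 1


record = {"title": "A Dictionary of Physics", "subjects": ["physics"]}
suggested = add_form_subdivision("530", record)
record_feedback(suggested, final=suggested)
print(suggested, dict(feedback))
```

The tallies kept by `record_feedback` stand in for the acceptance, rejection and modification data that the agent could later use to reorder or revise its rules.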
10. ONLINE DATABASE SELECTION
The number of on-line databases has grown from several hundred to several thousand over the last ten years, which poses a problem for information specialists or end-users who must select a database for answering a specific query. Even though database selection aids exist in printed and electronic form, they are not efficient or effective to use because of their volume. An expert system developed by TRAUTMAN and FLITTER attempts to provide database selection assistance. The system consists of database attributes with rankings, a user modeler, a question clarifier, a search module, an evaluator module and a ranker module to produce a recommended set of database files to use for a query.
A browser module also permits a free text search of text files and indexes of the database entries. Databases are classified using nine attributes: kind, period of coverage, language, geography, vocabulary, viewpoint, sources, audience, and updating. Each database attribute has category descriptions and ranks used by the expert system to determine which databases are appropriate for a query. The user modeler is used to determine where the end-user fits on a novice-expert scale as well as the type of results required for their ultimate use. Ambiguities and uncertainties in the query are resolved by the question clarifier module using expert knowledge analogous to that of a professional intermediary. This expert system is an aid to selecting the correct database from the thousands that exist.
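Attribute-based ranking of this kind can be sketched as a simple weighted match. This is not the Trautman and Flitter system: the `rank_databases` function, the numeric weights, the database names and the attribute values are all hypothetical, and only three of the nine attributes are used.

```python
def rank_databases(databases, needs, weights):
    """Order databases by the total weight of the attributes that match the
    needs of the query; databases with no matching attribute are dropped."""
    scored = []
    for name, attrs in databases.items():
        score = sum(
            weights.get(attr, 1)
            for attr, wanted in needs.items()
            if attrs.get(attr) == wanted
        )
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Hypothetical databases described by three of the nine attributes.
databases = {
    "PhysIndex": {"kind": "bibliographic", "language": "English", "audience": "researcher"},
    "TradeNews": {"kind": "full text", "language": "English", "audience": "general"},
    "ChemDaten": {"kind": "bibliographic", "language": "German", "audience": "researcher"},
}
needs = {"kind": "bibliographic", "language": "English", "audience": "researcher"}
weights = {"kind": 3, "language": 2, "audience": 1}

print(rank_databases(databases, needs, weights))
```

In a fuller system the `needs` dictionary would be produced by the user modeler and question clarifier rather than entered directly.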
An intelligent agent might be built that utilized the same data and module structure as Trautman and Flitter but with the added capability to store queries, user modeling information and feedback on databases that successfully responded to the query. When a new query is entered, the expert agent could match it with similar queries and user modeling data to determine whether it already had a database selection list from a previous query. The agent might still do a search and match the results with the stored selection list to determine whether its previous results were still valid. This is similar to the human expert who remembers previous searches that are similar to a current query and utilizes the same databases that provided success. An extension to this process might be to have a small but representative sub-set of each database's content such that a local search could be performed to show the user what the results are likely to be and get user feedback from this subset search.
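The idea of matching a new query against stored queries can be sketched as follows. The `QueryMemory` class, the use of Jaccard word overlap as the similarity measure, the threshold value and the database names are all assumptions chosen for brevity; user modeling data and feedback are left out.

```python
def jaccard(a, b):
    """Word-overlap similarity between two query strings (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


class QueryMemory:
    """Stores past queries with the databases that answered them well."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.entries = []  # list of (query, databases) pairs

    def remember(self, query, databases):
        self.entries.append((query, list(databases)))

    def recall(self, query):
        """Return the database list of the most similar stored query,
        or None if nothing is similar enough."""
        best = max(self.entries, key=lambda e: jaccard(query, e[0]), default=None)
        if best is not None and jaccard(query, best[0]) >= self.threshold:
            return best[1]
        return None


memory = QueryMemory()
# Hypothetical stored query and database names.
memory.remember("laser optics in fiber communication", ["PhysIndex", "EngAbstracts"])

print(memory.recall("fiber optics laser communication"))
print(memory.recall("medieval poetry"))
```

A recalled list would be treated as a suggestion; as noted above, the agent might still run a fresh search and compare the results with the stored list to check that it remains valid.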
11. LIBRARY EDUCATION
Another area of intelligent system application is the training of professional librarians and instruction in topics related to library science. Expert systems can provide a specialised, concentrated, and low cost method for obtaining practical library-related experience. Obensen et al. describe the Yes Sir expert system, which is designed to assist in retrieval tasks. Associated with Yes Sir is an intelligent tutoring system that teaches the user improved information retrieval skills. As expert and intelligent agent systems become more abundant in library science, modules for enhancing the knowledge and skills of the users will be added.
12. SUMMARY
In this article we have tried to provide an introduction to expert system technology and intelligent agents/assistants as well as how this technology has potential uses in work activities performed in libraries and information centers. The problem area we concentrated on was the use of such agents to reduce information overload and the clerical and repetitive tasks associated with many activities. In addition, we have tried to provide some indication of expert systems developed in these areas of activity which have proved useful and whose techniques could be incorporated into intelligent agents/assistants.
It should be clear that there is no generally applicable agent or agent system that combines all the ideas and results of the many different developments that have been explored. But there are various types of specialist expert systems and agents that have been used to explore specific problem domains within library and information centers. There are very few intelligent agents commercially available, but this should change over the next few years. There does not appear to be any standardized approach to developing intelligent agents.
Intelligent agents can be classified as belonging to categories such as:
Future agents will explore information spaces based on user activities and develop pathways to information sources which they will cache on the assumption that these sources will become valuable in the future. Future agents will be combined into systems of agents which will exchange information, execute tasks for each other and negotiate solutions to information related problems. Of course, the acceptance and usefulness of such agents will depend on user acceptance of a software product acting on his/her behalf and observing and collecting data on what he/she is doing.
Bibliography
Bocionek, S. Software Secretary Kernel: An extendible Architecture for Learning and Negotiating Personal Assistants. AI Communications, 7(3/4):147-160, September/December 1994.
Borgman, Christine. "Psychological Research in Human-Computer Interaction." in Williams, Martha (ed). Annual Review of Information Science and Technology. White Plains, NY. Knowledge Industry Publications, 19:33-64, 1984.
Boy, G.A. Intelligent Assistant Systems. Academic Press, London, 1991.
Cavanagh, Joseph. Library Applications of Knowledge-Based Systems in Roysdon, Christine and White, Howard (eds). Expert Systems in Reference Services. Haworth Press, New York, 1989, pp. 1-19.
Cosgrove, S.J. Item Classification Using the N-Cube's Hierarchical Knowledge Representation Schema in McDonald, Craig and Weckert, John (eds). Libraries and Expert Systems. Taylor Graham, London, 1991, pp 88-98.
Forrester, C. L. and Garner, B. J. Knowledge-Based Support for Library Users in McDonald, Craig and Weckert, John (eds). Libraries and Expert Systems. Taylor Graham, London, 1991, pp 17-27.
Genesereth, M. R. and Ketchpel, S. P. Software Agents. Communications of the ACM, 37(7):48-53, July 1994.
McGuire, J., et al. SHADE: Technology for Knowledge-based Collaborative Engineering. Concurrent Engineering: Research and Applications, 1(3), 1993.
Maes, P. Agents that Reduce Work and Information Overload. Communications of the ACM, 37(7):30-40, July 1994.
Maes, P. (ed.) Designing Autonomous Agents. MIT/Elsevier Press, London, 1994.
McDonald, Craig and Weckert, John. Libraries and Expert Systems. Taylor Graham, London, 1991.
Norman, D. A. How Might People Interact with Agents. Communications of the ACM, 37(7):68-71, July 1994.
Rada, R.; Bacus, J.; Giampa, T.; Gidds, C.; and Goel, S. Computerized Guides to Journal Selection. Information Technology and Libraries. No. 6, 1987, pp 173-184.
Riecken, D. Special Issue on Intelligent Agents. Communications of the ACM, July, 1994.
Trautman, R. and Flitter, S. An Expert System for Microcomputers to Aid Selection of Online Databases in Roysdon, Christine and White, Howard (eds). Expert Systems in Reference Services. Haworth Press, New York, 1989.
Wayner, P. and Joch, A. State of the Art - Agents of Change. Byte Magazine, March 1995.