This is a plain-text version of a dissertation. It should not be distributed or otherwise used without permission of the author. The author's current contact information is:

Gregory B. Newby, Assistant Professor in the School of Information and Library Science, University of North Carolina at Chapel Hill
CB# 3360 Manning Hall, Chapel Hill, NC, 27599-3360
E: gbnewby@ils.unc.edu
V: 919-962-8064 F: 919-962-8071
W: http://www.ils.unc.edu/~gbnewby/

Towards Navigation for Information Retrieval

by Gregory B. Newby
B.A. State University of New York at Albany, 1987
M.A. State University of New York at Albany, 1988

Abstract of Dissertation

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Information Transfer in the Graduate School of Syracuse University

May, 1993

Approved ____________________________ Professor Michael S. Nilan
Date ________________________________

This work proposes navigation as a fundamental concept for information retrieval. A conceptual framework for navigation is developed, after Mead's (1960) notions of the importance of modelling the other for effective communication. Navigation is defined as that behavior in which humans engage to make sense of an information space. Information space is defined as the set of concepts and relations among them stored by an information system. Unlike cognitive spaces, which humans possess, currently available information spaces are not generally subject to change in response to ongoing communication. During human communication, people send and receive messages, which have the effect of changing their cognitive spaces. For effective communication, each participant uses a model of the other to properly gauge the effects of her or his messages. For interaction with information systems, the necessity for a model is the same, but it is typically incumbent on the human user to conform to the generalized model which the system has of its users.
Navigation is not a metaphor; it is human behavior which occurs when humans interact with information space. This work treats navigation through physical domains in the same way as navigation through information space, in that each requires the creation and maintenance of a cognitive model of what is navigated. Brookes' (1975) "exosomatic memory" is presented as a long-term goal of information systems. Providing more navigable systems is one step towards that goal, by facilitating human model building and moving towards human-computer interaction which is more similar to human-human interaction.

An empirical investigation of navigation for information retrieval is completed. An information space is created which incorporates concept relations, intended as a step towards information spaces which match the cognitive spaces of users. A prototype information retrieval system is designed to navigate the information space, employing a visual interface and optional gesture-oriented input device. A user-based evaluation of the prototype environment is made. The outcome indicates that navigation, as conceptualized for this work, is a useful and fruitful outlook on information seeking behavior involving human-computer interaction.

Coke is a trademark of the Coca-Cola Company. Dialog is a trademark of Dialog Systems, Inc. HyperCard is a trademark of Apple Computers. Iris, The Graphics Library (GL), The Geometry Engine, and Irix are copyrights of Silicon Graphics, Inc. PowerGlove is a trademark of Mattel, Inc.
Unix is a trademark of AT&T. VM/XA is a copyright of IBM.

Copyright 1993 Gregory B. Newby

TABLE OF CONTENTS

LIST OF TABLES AND FIGURES
ACKNOWLEDGEMENTS
CHAPTER 1: INTRODUCTION AND CONCEPTUAL FRAMEWORK
    1.0 Introduction
    1.1 A Conceptual Framework for Understanding Navigation
        1.1.1 Human Communication, Cognitive Movement, and Cognitive Space
        1.1.2 Information Space and Navigation
        1.1.3 Models for Human Communication and Navigation
        1.1.4 Representation for Information Space
        1.1.5 Facilitating Navigation
        1.1.6 Communication and Navigation
    1.2 Goals for Information Retrieval Systems
        1.2.1 Relevance-Based Information Retrieval
        1.2.2 Navigation-Based Information Retrieval
        1.2.3 Criteria for Evaluation of Information Retrieval Systems
    1.3 Goals, Questions, and Definitions
    1.4 Criteria for Evaluating this Work
    1.5 Conclusion
CHAPTER 2: LITERATURE REVIEW
    2.0 Introduction
    2.1 Traditions in Information Retrieval
        2.1.1 Foundations
        2.1.2 Representation for IR
        2.1.3 Assigning Keyterms
    2.2 Non-Relevance-Based Approaches and the Retrieval Process
        2.2.1 Browsing
        2.2.2 Cognitive Approaches
            2.2.2.1 Dialog-Oriented Systems
            2.2.2.2 Anomalous States of Knowledge
            2.2.2.3 Cognitive User Models
            2.2.2.4 Sense Making
    2.3 Information Space
        2.3.1 Spatial Representation for IR
            2.3.1.1 Spatial Representation with Orthogonal Vectors
            2.3.1.2 Spatial Representation with Term Relations
        2.3.2 Information Space in the IR Literature
        2.3.3 Information Space in the Psychological Literature
    2.4 Wayfinding and Visualization
    2.5 Literature Central to the Current Work
    2.6 Outcome of the Literature Review
    2.7 Information Space Revisited
    2.8 Conclusion
CHAPTER 3: METHODS OF INVESTIGATION
    3.0 Introduction
    3.1 Building an Information Space
        3.1.1 Information Space via Multidimensional Scaling
        3.1.2 Building the Space
    3.2 Building the Interface
    3.3 User-Based Evaluation
        3.3.1 Overview of Evaluation
        3.3.2 Operationalization
    3.4 Specifications
        3.4.1 Respondents
        3.4.2 Setting
    3.5 Error
    3.6 Conclusion
CHAPTER 4: RESULTS
    4.0 Introduction
    4.1 Analytic Techniques
    4.2 Task-related criteria
        4.2.1 Well-Defined Information Needs
        4.2.2 Less Well-Defined Information Needs
        4.2.3 Learning and System Cues
        4.2.4 Training Time
        4.2.6 Summary of Task-Related Criteria
    4.3 Model-related criteria
        4.3.1 The Other
        4.3.2 The Relationship to the Other
        4.3.3 How to Change the Relationship to the Other
        4.3.4 Model of the Self
    4.4 Analysis of Closed-Ended Data
        4.4.1 Searches which did not Result in Selection of a Document Surrogate
        4.4.2 Successful Searches: Prism
        4.4.4 Successful Searches: Space
        4.4.5 Respondent Overall System Evaluation
        4.4.6 Demographics
    4.5 Conclusion
CHAPTER 5: DISCUSSION AND CONCLUSION
    5.0 Introduction
    5.1 Navigation: Derived Understanding
    5.2 Implications for Information Retrieval
    5.3 Limitations
        5.3.1 The Conceptual Framework
        5.3.2 The Space System
        5.3.3 The Empirical Study
    5.4 Future Work on Navigation
    5.5 Conclusion
APPENDIX A: TRAINING PACKET
APPENDIX B: QUESTIONNAIRE
APPENDIX C: INFORMATION NEED STATEMENTS
APPENDIX D: INSTRUCTIONS TO THE RESEARCH ASSISTANT
APPENDIX E: INFORMATION SPACE COORDINATES
    Section 1: Coordinates of 264 Eric "descriptors" and "major descriptors."
    Section 2: The coordinates of 272 documents
APPENDIX F: SAMPLE ERIC DOCUMENTS
APPENDIX G: CODEBOOK
APPENDIX H: OPEN-ENDED DATA BY ITEM
REFERENCES

LIST OF TABLES AND FIGURES

Figure 1.1: What do information retrieval systems match?
Table 1.1: Summary of Key Concepts
Table 3.1: Overview of Procedures Taken to Generate a Spatial Representation of a Bibliographic Database
Table 3.2: The ERIC Database Subset
Table 3.3: Statistics for Keyterms
Table 4.1: Success for Each System Across Tasks
Table 4.2: Time on Task
Table 4.3: Closed-ended scores for "satisfaction" item
Table 4.4: Relationship of Navigation and Satisfaction Scores
Table 4.5: Demographic Summary

ACKNOWLEDGEMENTS

Many people and organizations helped me during the completion of this work. My thanks go (in alphabetical order) to: Abrams/Gentile Entertainment, Inc.
and Chris Gentile for the PowerGlove converter box; The Advanced Graphics Research Laboratory at Syracuse University, with Dave Richers and Arnold Paul who helped set up early tests of my IR system; Roger Chen for chairing my defense; the database management group at Syracuse University for allowing me to develop a database subset of the ERIC database; Bruce Derr, Ronald Kalinowski, and the rest of the computer systems staff at Syracuse University for favors, advice, and indulgence; Judith Diamond for programming help; my fellow doctoral students at Syracuse University, who offered moral support and helpful advice; Wayne Fordyce, who gave me a startup account and advice for the Cornell supercomputer; the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign, for their indulgence and faith in my work; David Micko for recruiting respondents and administering the user-based study; the National Center for Supercomputing Applications, which supported production of a videotape of my system; Joseph Woelfel for planting a seed that started all this, and serving as outside reader; Kent Yates for help with Iris system administration at the University of Illinois; Robert Zeh for programming assistance.

Separate thanks go to my advisor and committee members: Michael S. Nilan (advisor), who offered constructive advice on how to present my work and made my career at Syracuse much easier; Jeffrey Katzer, who has the sharpest methodological mind on the block and offered keen criticism of my work; Sung Myaeng, who expressed strong interest in system design issues; Robert N. Oddy, who has knowledge of information retrieval that is astounding, and offered much practical advice on my space building and system design; and Michael B. Eisenberg, who brings exceptional writing talent and rigor to everything he does.

Extra special thanks go to my wonderful wife, Ilana, who gave me her support, encouragement, and inspiration.
CHAPTER 1: INTRODUCTION AND CONCEPTUAL FRAMEWORK

1.0 Introduction

The language of information retrieval is the language of navigation. Users look for information, they browse, they choose a new direction for a search, they find something that is close to what they are looking for. Navigation describes what information seekers do. This work introduces a conceptual framework with navigation as a fundamental concept about which information retrieval systems and other types of information systems might be built.

Information retrieval systems are special types of information systems for two reasons. First, the size of the database is typically very large. Databases with hundreds of thousands of items are common, and may include significant amounts of unstructured natural-language text. Second, the users and their purposes are heterogeneous. Designers of information retrieval (IR) systems have little power to specify the situations in which users will approach a system, or to predict the values, goals, experiences, or language they will bring with them (Nilan and Rosenbaum, 1991). These reasons make for large information spaces for IR (as defined and discussed in this chapter) which are difficult to structure for particular information use situations.

Other types of information systems might also benefit from a focus on navigation, as defined and discussed here. Train schedules, book indexes, thesauri, training manuals, expert systems, and computer operating systems are other types of systems to which navigation as a fundamental concept might be applied, but they typically focus on a more homogeneous range of users and uses than the bibliographic systems found in IR research. Defined generally as any mechanism for storing and retrieving information, information systems might include anything from office filing systems to artificial intelligences.
The current work is concerned exclusively with computerized information retrieval systems, but perhaps has implications for non-computerized systems.

A major redirection in the design of some computerized versions of the information systems listed above has been taking place in recent years, as graphical interfaces and window systems have become commonplace. These newer systems do focus on navigation, often explicitly, although without any sort of consistent conceptual framework such as that developed in this chapter. The X-Windows system, hypertext, computer browsing systems, and various other visual interfaces are all examples of information systems based on some aspects of navigation as discussed here.

Historically, Bush's (1945) "As we may think" laid out a goal which was easily within the reach of imagination, but has not yet been realized. I wonder if he could comprehend the vision of Gibson's (1984) NEUROMANCER, in which similarly futuristic incorporation of humans, machines, and information was predicted? Bush's "memex" was essentially an add-on to human memory -- it was able to identify information needed in response to ambiguous or incomplete queries, presumably based on what it knew about various users and processes. Input and output were via text (or perhaps voice). In Gibson's world, machines such as the "memex" have still not been realized. Instead, humans navigate through a worldwide multidimensional information space, called "Cyberspace." Input and output are accomplished via direct neural stimulation, and the entire scenario is augmented by computers.

This work may take small steps towards making the visions of Bush, Gibson, and others a reality, through potential enhancements to IR systems and by expanding the notions of representation and cognitive matching within such systems. Several difficulties lie on the way to the achievement of such visions, however. An important one has to do with the nature of the interaction between people and computers.
When a literature on information retrieval began to come into its own in the mid-1960s, computers did not provide highly interactive environments: they operated mostly in batch mode. In batch processing, a program is submitted to the computer and run at a later time. Results are then returned to the programmer. The programmer then has to determine the quality of the output, make any changes indicated, and resubmit the job as needed. (The computer, meanwhile, has no memory of the previous time the job was submitted.) The "memex" would not exist in such an environment. Bush's vision pointed to ongoing interaction with the "memex," more akin to human communication than batch-mode retrieval.

Unfortunately, the batch mode of processing is still the dominant model for computer interfaces today. Whether the user is creating a program, using an operating system, sending electronic mail, or using an IR system, the general process is one where the user submits and re-submits commands until he or she is satisfied with the result. Interaction in these cases is in real time, with immediate system feedback; however, the user still submits and re-submits commands, and is solely responsible for quality control. The addition of a visual interface based on a desktop metaphor, and a pointing device such as a mouse, might remove some of the onus from the user to remember specific command sequences, but the "hit or miss" nature of the interface is not removed by the simple addition of a different sort of target (a menu) and a new device for interaction (a mouse, instead of a keyboard). These systems do not have models of either users or their tasks. This work postulates that users must navigate such systems, as navigation will be formally defined in this chapter, but the systems were not designed to facilitate navigation.

This work presents navigation as a fundamental concept for information retrieval.
This is contrasted with relevance, which is the concept at the heart of most of today's IR systems. Navigation is not proposed as an end goal for information retrieval, but rather as a redirection for research and design which will move away from the assumptions and practice of batch-mode IR and towards systems which are more like human communication systems. Eventually, it is hoped, this path will lead towards realization of IR scenarios such as Bush dreamed of, or as proposed by Brookes (1975), where information systems operate as "exosomatic memory": an external database which is accessed by humans as an extension of their individual memories.

Users may be more empowered by navigable systems to select the information they want than by systems that do not focus on navigation. This is because a navigable system (as envisioned for the current work) would have two important qualities. The first is an organization of the database that closely matches user perceptions of its contents: the concepts the database holds, and the relations among them, are similar to the concepts and relations as perceived by the system's users. Second, navigable systems provide cues which help the user to understand his or her status relative to the system -- the relationship between user and system -- and how to change that status.

This chapter lays the framework for the rest of the work. The main component of the chapter consists of the development of a conceptual framework with which to consider navigation. Human information seeking behaviors are considered in the context of information seeking using computerized information retrieval systems. At the end of the chapter, specific criteria about which to build navigation-based IR systems are generated, and research questions are specified, both of which will drive the rest of the work. Chapter 2 reviews literatures related to this work which were not introduced as part of the conceptual framework.
In Chapter 3, the three-part investigation of navigation is described. The first part was the construction of an information space which was intended to have qualities in common with human cognitive space, for the purpose of facilitating navigation. Then, a prototype IR environment was created using a visual interface to the information space. Finally, user-based methodologies were employed to investigate the applicability of the information space and prototype system for information retrieval. The results are analyzed in Chapter 4, and Chapter 5 summarizes the findings of this work. As will be discussed, the conceptual framework generated in this chapter was given some empirical support. Specifications for future system design and directions for future research are drawn in Chapter 5.

1.1 A Conceptual Framework for Understanding Navigation

This section introduces a conceptual framework from which to understand navigation for information systems. The section starts with some considerations for theory building, and the framework building then begins with a consideration of human communication.

This work attempts to fit in several different areas of social science at once and play several of the traditional roles of information retrieval research found in the literature. This chapter is primarily concerned with theory building, in which concepts are defined and discussed and linked together, and laying the foundation for the rest of the work. Chapter 2 will give background, substantiation, and further direction for the conceptual, methodological, and system-building concerns of this work. Chapter 3 attempts to combine the conceptual framework from this chapter with practical methods for system design and evaluation in the literatures, so that a social scientific study of the framework can be carried out. This subsection addresses the current work in its role of laying the foundation for eventual theory.
1.1.1 Human Communication, Cognitive Movement, and Cognitive Space

Human communication may be described as ongoing, dynamic interaction, in which two or more parties exchange messages and meanings (e.g., Cushman and Cahn, 1985). Human communication might be dyadic or involve more actors; it can take place synchronously or asynchronously; feedback may be immediate or only occasional. Scholars of human communication have divided the field into various domains: interpersonal communication, mass media, political communication, intrapersonal communication, computer-mediated communication, etc. (Littlejohn, 1983). In all of the various types of human communication, intentional agents use a variety of media to share meaning.

The work of George Herbert Mead lies at the deep foundation of the current work (and is also central to several bodies of theory for human communication; see Littlejohn, 1983). Mead's work ranged across the entire realm of social existence, including functionalism and social behavior, language, ethics, time and space, and interpersonal and intrapersonal communication, from both a theoretical and philosophical standpoint (Aboulafia, 1991; Joas, 1980). He was more than a father figure to sociology and the symbolic interactionist perspective on human communication -- he was a seminal figure in 20th century Western thought (see, for example, Cronk, 1987). Mead's work did not receive all the attention it deserved from US scholars, but has seen the focused attention of such central figures in continental philosophy as Habermas, Tugendhat, and Joas, according to Aboulafia (1991). Aboulafia posits that Mead's work is at the core of much of 20th century social science, although he is often not credited. For the current work, Mead's notions of how humans communicate and the models that people have of each other and themselves are at the focus.
Mead (1960) stressed the importance of models of the other for communication: in order for messages to be effectively sent and understood, each communicator must have a model of the other. The model might be at a basic stereotypical level, or it may be a result of intimate knowledge. In the case of an intimate model, messages can be transmitted more easily (Newby, 1988) -- take the example of long-term marriage partners who are able to send many messages effectively without many words. If the model is not accurate, "breakdown" may occur (in the sense of Maturana and Varela, 1987). For instance, a diner might order a ham-and-cheese sandwich in a vegetarian restaurant, or a librarian might deliver a book about friendly household pets to someone interested in taxidermy. The models of the other which humans possess make them capable of tailoring a message so that it might have the desired effect on another (although such processes are largely unconscious, and do not imply any necessary attempt at coercion by communicators).

Mead did not go into detail about what the models might consist of, how they might be stored, or what processes integrate a given model with a desired meaning to form a particular message. He did propose that such models would include past experience with particular people and situations, and would involve integration of both general and specific knowledge (Mead, 1960). A model of the "generalized other," according to Mead, makes it possible to negotiate novel situations with reasonable success. The ability to model the self allows humans (and other beings) to relate a particular communication episode to their own existence.

A uniquely human ability, according to Mead, is to model how another perceives the self. This enables people to place themselves "in the other person's shoes," and perceive a particular interaction or experience as another might.
Taking someone else's point of view is known in common experience to help smooth interaction, and is at the basis of many types of communication instances (from, say, psychological therapy to political solicitation). The model of how another perceives the self might not be accurate, of course, and is probably not consciously constructed. It is the model of the other, the model of the self (or self-concept), and the model of how the (possibly generalized) other perceives the self which enable people to send and receive the communication messages to which they are exposed.

Communication scholars, psychologists, market researchers and others are interested in the effects of messages. In a simple model, the utterances and gestures made by someone in, say, a dyadic encounter, might be considered as messages sent to the other. In real-world situations, of course, it is difficult to identify unambiguously the nature of the messages being sent. The goal of mass media messages, such as advertisements, is cognitive movement. All communication results in some cognitive movement. The resultant cognitive movement may have been intended or not by the sender of the messages, just as the message may or may not have been sent intentionally.

Cognitive movement is a change in meaning: when understandings of concepts, or the concepts themselves, are subjected to change. Cognitive movement might alternatively be called "learning" (but without implying active information seeking by the learner), "attitude change" (but referring to all that is known, not just attitudes -- whatever they are), or simply "change in knowledge" (but without implying that the change is great or significant, or even noticed). In the simple case, the effects of a single exposure to an advertisement may be gauged relative to pre-measured meanings (Woelfel, Holmes, Cody and Fink, 1988).
In multi-party real-world interaction on an ongoing basis, messages are exchanged among actors so that meanings in the end are a combination of all messages sent with the pre-existing meanings (Kincaid, 1988; note that while some of the mathematics in Kincaid's Convergence Theory have been found faulty, the basic premise has empirical support).

A considerable body of empirical and theoretical work concerning cognitive movement has been generated by Dervin and Nilan (see Dervin, 1983; Nilan and Rosenbaum, 1991). Their "sense making" paradigm presents cognitive movement as a fundamental human condition, in which humans actively seek to make sense of their surroundings. Empirically, cognitive movement is measured by considering gaps in the knowledge of actors. Cognitive movement is accomplished when a gap is bridged. Methodologically, gaps are defined as questions or uncertainties actors experience. Cognitive movement through gap-bridging might result in increased understanding or new knowledge, but can also result in decreased understanding or identification of new gaps. Even if gaps are not successfully "bridged," cognitive movement might take place.

--------------------------------------------------------------------------
Definition 1.1 Cognitive Movement: A fundamental human condition in which new meanings are sought or obtained. A change in what a human agent knows.
--------------------------------------------------------------------------

Cognitive movement refers to changes in what is known and is limited in that it does not deal in how things come to be known, the mechanics of knowing, or thought processes (cognition). Human beings are considered to be active agents in the collection of information and maintenance of what they know. The concept is based on the premise that any information received (or created) by an individual will by definition result in some cognitive movement.
This is not so much in the tradition of Shannon and Weaver's (1949) information processing theory in which information is that which reduces uncertainty and can be measured in "bits." It is more akin to Woelfel and Fink's (1980) description of how what is known changes as information is received, but things which are already well-known are less subject to change. Cognitive movement is not a metaphor, it is a way of describing human behavior. (Although there is nothing which physically moves, there is a change in what is known. By analogy, "counting" can be thought of as "moving along a number line." The generation of a spatio-temporal continuum for the discussion of counting is not inconsistent with the behavior under study, and does not imply an overlap with any physical domain. For the current purpose, "movement" refers to change, but does not imply that some object needs to move through space in order for "cognitive movement" to take place. The term "cognitive movement" was chosen because it may have less a priori meaning than terms such as "learning," because it does not imply a massive or significant change in what is known as might a term such as "cognitive change," and because the term "cognitive movement" has the benefit, as will be seen later, of fitting well with previous theory and research related to this work, such as Salton, in Salton and McGill, 1983, Woelfel, in Woelfel and Fink, 1980, and Koll, 1979). The fit between cognitive movement as defined above with Dervin's and Nilan's sense making approach to cognitive movement is fair, but not perfect. They study the information sought and received, the situation surrounding the information need situation, and the related affect. The current work is less concerned with the perceived changes in what is known, which are accessed by interviews with respondents, than with what is known. (This certainly does not imply that what is known by an individual can be readily ascertained, if at all. 
Indeed, both "cognitive movement" and "cognitive space," as introduced later, will be only partially explicated for the current purpose, and partially left in a black box). The activities concerning the active search for information are external to the conceptual framework presented here, as are, to a large extent, human perceptions of changes in knowledge. Thus, the methodological concern of Dervin and Nilan with gathering data on perceptions of information need situations is not a component of the framework being built here, but there is definite overlap of interest in cognitive movement, as it results from exposure to information. Notice that the "gaps" discussed by Dervin and Nilan are not meant to signify physical phenomena. They are metaphors for mental phenomena, the mechanics of which are outside of the sense making paradigm. This use of physical terminology to refer to cognitive phenomena will be employed later in a discussion of navigation and information space, in which the terms used are often applied to physical domains, but are appropriate at a higher level of conceptualization, where the higher level of conceptualization remains powerful, perhaps more so, for the physical domain. It might be inferred from Dervin (1983) and other works in the paradigm that the distinction between a physical gap or barrier and a mental one is practically non-existent, from the standpoint of theoretical consideration of the cognitive processes involved in bridging that gap. This is not to say that mental gaps are like physical phenomena. To the contrary: it is to say that a physical gap, and the act and necessity of crossing it, are cognitive phenomena. Therefore, physical phenomena may be considered a subset of cognitive phenomena when considering the cognitive processes involved in interacting with them. It is suggested here that cognitive space is the necessary medium in which cognitive movement takes place.
In the current context, cognitive space is presented as a stepping stone to concepts more appropriate for consideration of information systems. For this purpose, cognitive space is made up of two things: concepts and relations among concepts. Such a cognitive space is sufficient, perhaps, to describe the substance of human communication and sense making behavior. Concepts might be simple referents to physical entities (boars, hogs, swine), more complicated entities (the constituency, females, the ERIC database), or vague ideas (democracy, freedom, philosophy). Relations might be temporal (as in the case of most of Dervin's and Nilan's works), based on perceived similarity (as in multidimensional scaling, a psychometric method adapted by Woelfel, described in Woelfel and Fink, 1980), typeless or simply directional (for most hypertext applications), or a variety of other types (hierarchical, causal, relational...). Changes to concepts or relations among concepts are the outcomes of cognitive movement.
--------------------------------------------------------------------------
Definition 1.2 Cognitive space: The necessary medium in which cognitive movement takes place. Consists of concepts and relations among concepts.
--------------------------------------------------------------------------
Cognitive space is sufficient, as explicated here, for this discussion of cognitive movement for human communication and sense making behavior, but does not provide understanding of human thought processes any more than the scholars cited here have, although it does provide a consistent base for consideration of various types of cognitive movement. Cognitive space is a construct employed to aid in consistency and understanding for cognitive movement. Cognitive space is, at best, a pale shadow or metaphor for the substrate of human thought.
This means that while we know that "cognitive movement" takes place and can to some extent measure the causes and outcomes of the movement, we are left with only a high-level understanding of what has changed. This is the same situation as educators might face when discussing "learning": they can measure outcomes of new knowledge, and postulate that some change in what is known has occurred (and would presumably agree with the statement that what is known consists of concepts and relations among them), but have little concrete understanding of how the concepts are "stored" in human memory, what the relations are, or how these things have changed. An additional component of cognitive space can be identified, but it is not a necessary part of the definition. "Dimensions," or "types" of relationships are a part of a cognitive space. For the current work, only a very low-level type of relationship, similarity, will be employed. Many other types of relationship can be identified. Hierarchical relationships, causal relationships, similarity on a particular quality (height, color, ease of use, etc.), and nominal relationships might exist, depending on the nature of the concepts. Identification of the different types of relationships, their associated situations, and their relative importance in both cognitive and information space are left for a future study. "Similarity," as used in this work, is proposed by Woelfel and Fink (1980) as a fundamental measure for cognitive research, in that all other measures, even nominal ones, incorporate notions of similarity. Their empirical measure for similarity, however, is "dissimilarity." This is because a ratio-level score of zero on a dissimilarity scale corresponds to identity (that is, a score of zero on a dissimilarity scale means the items measured are identical). For a corresponding scale measuring similarity, even an infinitely high number would not indicate identity.
The term "similarity" will be used in this discussion in favor of "dissimilarity," because it is more familiar and because the distinction is more useful at a methodological level than a conceptual one. Methodologically (in Chapter 3), this work will make use of a scale which provides for an absolute (bounded) score for both similarity and dissimilarity. Similarity is a low-level type of relation among concepts in that it does not specify the different relationships which might be perceived among concepts by humans. However, similarity has been shown empirically to have the benefit of allowing for more specific types of relationships to emerge from similarity data. Multidimensional scaling (MDS), a psychometric measure of similarity (or dissimilarity), has been used for such purposes. For example, Woelfel and Fink investigated whether Osgood's Semantic Differential scale, which postulated an orthogonal relationship between the dimensions of good-bad, strong-weak, and active-passive, could be validated using MDS (Woelfel and Fink, 1980). They found that "goodness," "strength," and "activeness" did emerge from dissimilarity data as bidirectional components, but the relations were neither orthogonal nor on a unit circle, as predicted by Osgood. This and other studies indicate that similarity-based measures have the utility of semantic differential or Likert-type measures (see Babbie, 1990) for identifying types of relationships. This discussion of human communication, cognitive movement, and cognitive space is not meant as an answer to the questions that communication scholars and others have asked over the ages. It is meant as a conceptual framework for understanding human communication and sense making behavior, derived from the theory and empirical evidence of several paradigms. More importantly, it is the basis for considering navigation as a fundamental concept for information systems.
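As an illustration only, the relationship discussed in this subsection between a bounded dissimilarity scale and a bounded similarity scale can be sketched in a few lines of Python. The scale actually used in this work is described in Chapter 3; the 0-to-100 range, the linear conversion, the function names, and the example scores below are all assumptions made for the sketch, not the instrument itself.

```python
# Hypothetical bounded scale: 0 means "identical" and MAX_DISSIMILARITY means
# "as different as the scale allows". The bound of 100 is an assumption.
MAX_DISSIMILARITY = 100.0


def to_similarity(dissimilarity, max_d=MAX_DISSIMILARITY):
    """Convert a bounded dissimilarity score to a similarity score in [0, 1]."""
    if not 0.0 <= dissimilarity <= max_d:
        raise ValueError("score lies outside the bounded scale")
    return 1.0 - dissimilarity / max_d


def to_dissimilarity(similarity, max_d=MAX_DISSIMILARITY):
    """Convert a similarity score in [0, 1] back to the dissimilarity scale."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("score lies outside the bounded scale")
    return (1.0 - similarity) * max_d


# Made-up pairwise judgments among a few concepts.
judgments = {
    ("hog", "hog"): 0.0,          # zero dissimilarity corresponds to identity
    ("hog", "swine"): 25.0,
    ("hog", "democracy"): 100.0,  # the upper bound of the assumed scale
}
similarities = {pair: to_similarity(d) for pair, d in judgments.items()}
```

Because both ends of the assumed scale are fixed, identity has a definite score on either scale, which is the property an unbounded similarity scale lacks.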
1.1.2 Information Space and Navigation

In this subsection, the conceptual framework proposed above for human communication systems will be applied in consideration of information systems. In human communication, active agents exchange messages and meanings. When human agents interact with information systems, however, the exchange of messages may be one-sided (for instance, when someone reads a train schedule). In the case of two-way exchange of messages, as is more common when computers are involved, there is still little potential in existing systems for interaction between cognitive spaces as described for human communication systems above. For instance, a user querying a database will get a response from the system (two-way communication), but the contents of the database will not be changed in response to ongoing user input. In a cognitive space, cognitive movement may take place. In almost all information systems, computerized or not, the concepts and relations among them are not subject to change. (That is, the concepts and relations among them are not subject to change in response to communication with users. The system designer or database manager may, of course, make changes.) In this case, no analog to cognitive movement of the system database may take place, and the space of the system is (by definition) not subject to change. The term introduced to refer to such a space is "information space."
--------------------------------------------------------------------------
Definition 1.3 Information Space: The concepts and relations among them stored by an information system; typically not subject to change through interaction with the system's users.
--------------------------------------------------------------------------
Information spaces may be book indices, databases, bibliographic collections, computer interfaces, and so forth. Information space, unlike cognitive space as described in the previous section, is more than a metaphor.
For cognitive space, there is no currently accepted method for knowing all of the different concepts, relations among concepts, and whatever else might make up human knowledge. However, we do have full access to the contents of an information space, inasmuch as those contents were explicitly made a part of the space. The words used as labels for concepts are what are known explicitly, not the concepts themselves -- there may still be ambiguity in the meaning of the concepts. This is not to say that the space is somehow "objective" and entirely knowable: indeed, it is a premise of this entire work that all information spaces are subject to interpretation and individual perspective. For an information space, we are able to know both the concepts which are part of the space (or at least the words which serve as labels for the concepts) and any relationships they have to each other, such as relational or hierarchical relationships. Consider a relational database: each field is known, as are the contents of each field. Further, any relations among fields are also known, as they were formally and explicitly specified. As defined here, "information space" is not a metaphor, but it is not quite so "real" as cognitive movement. It is a concept which exists in various literatures (which will be reviewed in Chapter 2) without formal definition. Information space cannot be a metaphor, since there is nothing we can point to as its referent. It is a way of talking about the contents of an information system. (The argument for information space as non-metaphorical follows the same lines as for a number line as non-metaphorical: numbers do not exist on a line, nor do numbers have anything you can point to as directly indicating their existence at all. Likewise, concepts and relations among them are not found in an information space and are similarly non-corporeal.
Both number lines and information spaces can be used to organize and talk about their respective attributes, but they are not similes, analogies, or metaphors, they are simply helpful ways of talking about and delimiting things). There must be a resolution to the dual quality of information space as something which is known explicitly and at the same time only knowable through interpretation. The resolution is based on ontology: without an observer, the information space cannot be said to have any meaning at all. However, since there is an artifact which was generated by a human, we can use the formal specifications which generated the artifact as a basis for discussing the information space. This means that the information space may be referred to as "objective," but there is no way to interpret the contents of the space without an observer, making any perception or description of the space necessarily "subjective" (an epistemological assumption which is revisited in the discussion of relevance in a following section). The key to this work lies in considering the outcome when a human agent's cognitive space interacts with the information space of an information system. If the information space is not subject to change, but the cognitive space (necessarily) is, then any convergence of meanings must be one way: the user must adjust to the meanings found in the information system. This is because the system, unlike a human communication partner, will not and cannot adjust for its user. Part of the long term goal of this work is to point out the necessity for making system information spaces more changeable, and more similar to human cognitive spaces. However, such information spaces are not currently the norm. This work will make use of non-dynamic information spaces as they currently exist, but with some enhancements which will be discussed later. 
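The one-way convergence just described can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the information space is reduced to a read-only table of dissimilarity scores between labelled concepts, the user's cognitive space to an ordinary mutable table, and "exposure" to a simple weighted average (the names make_information_space and expose, the rate of 0.5, and the scores are hypothetical, and no claim is made that human cognitive movement follows this rule).

```python
from types import MappingProxyType


def make_information_space(relations):
    """Return a read-only mapping of concept pairs to dissimilarity scores."""
    return MappingProxyType(dict(relations))


def expose(cognitive_space, information_space, rate=0.5):
    """Move the user's scores part of the way toward the system's scores.

    Only pairs the user already holds are adjusted. The information space
    is never written to, so any convergence of meanings is one way.
    """
    for pair, system_score in information_space.items():
        if pair in cognitive_space:
            own = cognitive_space[pair]
            cognitive_space[pair] = own + rate * (system_score - own)
    return cognitive_space


pair = ("navigation", "wayfinding")
information_space = make_information_space({pair: 20.0})
cognitive_space = {pair: 80.0}

expose(cognitive_space, information_space)
after_one = cognitive_space[pair]   # the user has moved toward the system
expose(cognitive_space, information_space)
after_two = cognitive_space[pair]   # and moves closer again
```

In this sketch an attempt to assign to information_space raises a TypeError, which stands in for the observation that the system cannot adjust for its user; only the designer, by building a new space, can change it.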
The active formation of models so that communication may take place between humans could be described as negotiation: agents make changes in their models of the other and to knowledge about their relationship to the other during the communication process. Negotiation, as cognitive movement behavior, involves both knowing the other and making the self known to the other. While overt personal management skills sometimes come into play, negotiation is for the most part ongoing and without much conscious supervision. For any given communication instance or relationship, there are only some things which must be known for effective communication to take place. For instance, you might not need to know someone's religious affiliation to sell them a car, but such knowledge could be useful to sell a cemetery plot. -------------------------------------------------------------------------- Definition 1.4 Negotiation: The process by which human agents form and revise models of each other during communication. -------------------------------------------------------------------------- Negotiation refers to the often unconscious give-and-take while people get to know each other in some context -- it is not proposed as a bargaining process, but more of a "definition of the situation" (in the sense of Goffman, 1974, in Littlejohn, 1983). A sense of the ongoing, interactive nature of negotiation, as the term is used here, can be had from a reading of Cushman (1977), who talks of how the "rules" of a communication interaction are not obtainable explicitly, even from the participants, and yet they both govern and limit the interaction, and are in a constant state of flux. Negotiation is a necessary component for effective communication to take place: it is the part of a communication interaction which lays down the groundwork (alternatively stated, it defines the situation or produces the rules). 
Rules and definition of the situation, at least syntactically, imply that their associated activities happen up front -- at the very start of the interaction. The term "negotiation" allows for such up-front activities, but also implies that changes in the rules of an interaction or redefinitions of the situation are constant. There is no negotiation when the information space is not subject to an equivalent of the cognitive movement behavior of humans -- when the "other" can neither form nor revise a model. Instead, the user must navigate through the information space. Navigation means that the user must come to understand and act within the model that the system has formed of him or her a priori. Navigation behavior occurs in humans when they must make sense of an information space, especially when they are just coming to know the space or a portion of it.
--------------------------------------------------------------------------
Definition 1.5 Navigation: Human behavior to form and revise a model of an information space. Involves coming to understand both the information space and the model that an information system has of its (generalized) users.
--------------------------------------------------------------------------
The difference between negotiation and navigation is that in negotiation all communicators actively form and reform models of each other. For navigation, model formation is one-sided: only the human component actively forms a model of the information space. The information space does not actively change in response to ongoing interaction with its human communication partners (see Table 1.1).
Table 1.1: Summary of Key Concepts
--------------------------------------------------------------------------
Concept        Nature of Concept      Description

negotiation    human behavior         occurs between human communicators

navigation     human behavior         between human(s) [which possess
                                      cognitive space(s)] and information
                                      systems [which possess information
                                      space(s)]

cognitive      possessed by           the dynamic knowledge* possessed
space          humans                 by an individual

information    stored by              non-dynamic knowledge*
space          information systems
--------------------------------------------------------------------------
* "knowledge" refers to known concepts and relations among them.

The term "navigation" allows a distinction between the interactive modes of communication for human agents and the one-sided communication typical of current information systems. It also fits well with the concept of "information space" as discussed above. For instance, navigation behavior might lead to only a partial understanding of an information space -- further navigation would be needed to understand the rest of the space. This is similar to when people know each other and can communicate well in a business setting, but find gaps in their models of the other in social settings. An analogy for an information system might be someone who can use electronic mail within a computer operating system (a type of information system), but cannot perform other functions. Navigation is not strictly limited to the early stages of a relationship with an information system, however. The model one has of a system, and the understanding of its contents, might be less subject to change as time goes on. This is the same thing that happens when people form models of each other: at the beginning of a relationship, what is known about the other changes rapidly. After some time though, there is less need for change in the model.
For understanding the information space, consider a word processor: a user could be adept at using a particular program, but then have problems adapting to a new version of the program if some menus or options were changed. Navigation behavior does not imply a physical domain in which to navigate, although some navigation behavior involves physical domains. The important part of navigation is not movement through a physical space, but the cognitive movement and model building on the part of the human navigator. Although the literatures on wayfinding and geographic systems (as will be discussed in Chapter 2) are mostly devoted to physical spaces such as cities or buildings, navigation behavior as described here and as described in those literatures is remarkably similar. People form a model of an information system, of another human agent or agents, or of a city or building in essentially the same way and for the same purpose. The purpose is to be able to understand one's surroundings and to be able to find what one wants to find. The model is formed through exploration, application of models from similar situations, and testing and refinement. "Wayfinding," as a field of study, seems more concerned with human cognitive behavior than locomotion in a physical environment. Navigation, as defined here, is not a metaphor, but is human behavior which occurs when people interact with information spaces. A philosophical assumption of this work is that a physical domain -- say, a building or city landscape -- cannot exist without a human to possess a model of the domain. Thus, a physical domain, from the point of view of a human agent, is simply another space in which to navigate. This assumption does not deny the existence of an arrangement of bricks or other building materials, but posits that everything that makes a building a building, rather than, say, a work of art, a tree, or a spaceship, exists only in the mind of a human agent.
Navigation, as the term is defined here, is raised to a level which goes beyond physical relocation in a physical domain. Navigation accounts for the cognitive behavior and model building which necessarily precedes physical movement. Physical movement, seen in the light of the definition for navigation offered here, is secondary. Current efforts in the field of virtual reality (e.g., Rheingold, 1991) point out the need for de-emphasis on physical movement for navigation behavior.

1.1.3 Models for Human Communication and Navigation

Four types of knowledge, or four components of the models necessary for human communication, can be identified. Three of these are especially important for a navigation-based approach to information retrieval. The models are: a model of the other, a model of the relationship to the other, a model of how to change that relationship, and a model of the self. Each will be described in additional detail in the following paragraphs. These models apply for any type of communication situation, including the situation under study here where one communicator, the information retrieval system, is not a human and has, at best, limited modelling capabilities. The model of the other tells us something about how the messages we send will be received. Any intentional communication has some goal, even if the goal is not always explicit. In order to be able to achieve some goal through communicating, we must know something about the other. According to Mead, we might have specific knowledge about a particular other, and specific shared experiences with it. In many cases, however, we do not. Therefore, we need to apply a model of the "generalized other." Mead stated that we do not generate our messages for what the other "is" (even if such an existential statement about the other were reasonable in Mead's theory), but for what we perceive the other to be.
Mead believed the model of the other includes something unique to humans: a model of how the other perceives the self. This component of the model of the other can be used to fine-tune the interaction -- to not only attempt to predict the reaction of the other to our messages, but to guess the likely response that the other might make (based on their model of us). The model of the other would include an idea about what the other knows and who they are (i.e., their values, goals, and experiences). The model of the relationship to the other really contains two components, but they are hopelessly entangled. First is how you perceive the relationship. Second is how the other perceives the relationship. This model would include identification of common goals, any purpose to the relationship (or to a particular interaction), and history of the relationship. The model of how to change the relationship goes furthest beyond a particular relationship. That is, all sorts of general background knowledge from other interactions might be applied to make changes to a relationship. In some cases, changes to the relationship overall might be desired, for example when a co-worker is promoted and wishes to be treated as a superior. Much more frequently, the change to the relationship is momentary -- a change in the topic of discussion, or the introduction of some new component to the relationship. A model of the self is a central factor for intelligent behavior and a fundamental characteristic of human existence (Newby, 1988). This model perhaps takes a back seat in human communication in that it is deeply ingrained in the background of all our thoughts. It is also less likely to change during a particular interaction than the other models (the self-concept, once it emerges during childhood, is, according to Mead, quite stable and usually subject only to incremental change over time).
This subsection has briefly introduced the four types of models which come into play during human communication. All four are important for interaction with information retrieval systems. Although the systems (as they currently exist) are not able to perform all of the modelling functions which humans do (or at least not nearly so dynamically), there is nonetheless an implicit model of each type possessed by the systems as implemented by their designers. A later subsection will generate model-based criteria which will be applied to the study of IR systems that is a component of this work.

1.1.4 Representation for Information Space

Information space, like cognitive space, consists of concepts and relations among them. The relations may be of various types. A representation of an information space consists simply of specifications of the types of relations present in that space and identification of the relations among some or all of the concepts in the space. Representation is of central concern for information retrieval systems. Whatever retrieval mechanisms are in place (say, Boolean set operators), they must perform their functions on a representation of the database. The internal machine representation of data is not of concern here (for instance, whether a particular machine treats a string of bits '01000010' as the character 'A' or not). Instead, the level and number of relations among items in the information space is of interest for information systems, for it is these relations which limit the type of searching mechanisms which may be employed. What follows is my brief analysis and comparison of some common representation schemes. Representation by keyword is the method used by most traditional information retrieval systems. In such a scheme, an index is created by which all document representations which contain a particular keyword may be quickly accessed.
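A minimal sketch of such a keyword index, offered only as an illustration, is the familiar inverted index: each keyword maps to the set of document representations containing it, and Boolean retrieval reduces to set operations. The function names (build_index, lookup), the document identifiers, and the keywords are hypothetical and are not drawn from any particular retrieval system.

```python
from collections import defaultdict


def build_index(documents):
    """Map each keyword to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, keywords in documents.items():
        for keyword in keywords:
            index[keyword.lower()].add(doc_id)
    return dict(index)


def lookup(index, keyword):
    """Return the set of documents for a keyword (empty if not indexed)."""
    return index.get(keyword.lower(), set())


# Made-up document representations: id -> assigned keywords.
documents = {
    "d1": ["navigation", "retrieval"],
    "d2": ["retrieval", "database"],
    "d3": ["navigation", "wayfinding"],
}
index = build_index(documents)

# Boolean set operators applied to the index.
both = lookup(index, "navigation") & lookup(index, "retrieval")      # AND
either = lookup(index, "navigation") | lookup(index, "retrieval")    # OR
only_nav = lookup(index, "navigation") - lookup(index, "retrieval")  # NOT
```

Every result is a set of documents that either are or are not associated with the keywords asked for; nothing in the index says how close an excluded document came to being included.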
The type of relationship in such a scheme is purely logical: either a document representation "is" or "is not" associated with a particular keyword. Similarly, the coincidence of keywords in documents provides for only an existential relationship among documents: either they are related (that is, they have keywords in common) or the relationship is unknown (they do not have coincident keywords). Search mechanisms for such representation schemes are limited to combinations of sets of document representations which do or do not contain a particular keyword. Enhancements to keyword representation schemes include the keyword in context. This provides for relations among terms which occur in the same context in a document. In this scheme, there are some relations among keywords not found in the simpler method described in the previous paragraph. As for the basic keyword representation, though, document or keyword relations are limited to coincidence in a particular index entry, which indicates a relationship exists. Otherwise, the relationship is unknown (or may less precisely be said not to exist). More advanced mechanisms using keyword representation schemes allow for specification of adjacency or proximity. The relational database is another way of representing information. In such a scheme, categories of data are specified. Searching mechanisms can combine different data categories in one search -- for instance, an author/title search. Within each category, however, searching mechanisms are usually limited to those of a traditional keyword index. Note that the 'relations' among categories are mostly up to the user to judge (there is some matching of the information space to his or her cognitive space, created through the specifications for the information space), such as knowing that the name, address, and city fields are grouped together, and occupation, education, and professional affiliations are in a different group. 
Relationships within a category (say, the 'address' or 'name' field in a mailing list database) are the same as for keyword representation schemes. Relationships across categories (that 'name,' 'address,' 'city,' and 'state' are used for an envelope address, while 'occupation' is not) are left to an application program or the human user. In terms of retrieving desired information, relational databases provide an additional set of sets for Boolean queries, and perhaps some keyword in context operations, but few additional features. Similarity-based representation schemes have been employed to provide for searching mechanisms which allow continuous navigation of an information space instead of the discontinuous navigation found in previously mentioned schemes. Salton and his colleagues (e.g., Salton and McGill, 1983) have created vector spaces in which the keyterms are mutually unrelated, exactly as for keyword representation schemes. However, documents are spatially located at the centroid of their associated keywords in a multidimensional vector space. This is an important advance over other schemes in that it permits searching mechanisms which allow incremental changes to the information need -- changes of degree, not of an entire category. In such a scheme, there are no categories or types of relationships, but there are explicit and known relationships between every document in the database. A note on terminology: the word "keyword" does not appear in my dictionary. Neither does the word "keyterm." Since "keyword" would seem to refer to a single word, and "keyterm" would indicate either a single or multiple-word term or phrase, I have chosen to use the latter word throughout this work. I will use "keyword" when referring to methods or workers which have defined themselves to be working with "key words," as opposed to "key terms." 
For my purposes, keyterms are more open to the notion that a term, as used, might be ambiguous or subject to personal interpretation, whereas keywords are simply referents to objects which might be looked up in dictionaries or listings of subject headings.

All the representation schemes considered in this subsection, and the others not mentioned here, result in information spaces. But the nature of the spaces, the purposes to which they might be put, and the degree to which they facilitate navigation are different. The discontinuous spaces, in which all relationships involve sets, are not highly navigable because so few relations exist. In a medium or large database, the majority of keywords and documents will have an unknown relationship. The trouble with using such a representation scheme for searching for information might be seen by considering a different domain: shooting a gun. If a blindfolded person were trying to hit a target, and another person were providing feedback, a keyword-only representation scheme would be limited to feedback of the form: "you missed," and "you hit the target."

Salton's vector representation provides an important advance over discontinuous schemes, in that (to continue our analogy), instructions to the would-be marksman may take the form, "you hit to the left of the target," and, "you hit to the right." Even a metric is available: "try aiming ten units to the right." Salton's information space is more navigable than a traditional keyword space because it incorporates relations among all documents in the space. This provides for an information space which more closely matches, at least in the qualities it possesses, the cognitive spaces of users, provided the assumption is accepted (after Woelfel and Fink, 1980, among others) that any concepts known to a person have some relation to each other.
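The contrast between set-based feedback ("you missed" / "you hit the target") and the graded feedback available in a vector space can be made concrete with a small sketch. It is an illustration only, not Salton's actual method. The keyterm list, the three toy documents, and the function names (boolean_match, centroid, distance, rank_by_distance) are all hypothetical, and the unweighted axes and Euclidean distance are simplifying assumptions. Each keyterm is given its own axis, so that keyterms remain mutually unrelated, and each document or query is placed at the centroid of its keyterms.

```python
import math

# Hypothetical toy vocabulary: one axis per keyterm, so keyterms are
# mutually unrelated (orthogonal), as in the scheme described in the text.
KEYTERMS = ["navigation", "retrieval", "memory", "interface"]

# Hypothetical document representations: each document is a set of keyterms.
DOCS = {
    "doc_a": {"navigation", "retrieval"},
    "doc_b": {"retrieval", "memory"},
    "doc_c": {"interface"},
}


def boolean_match(query_terms, doc_terms):
    """Set-based feedback: a document either contains every query term or not."""
    return query_terms <= doc_terms


def centroid(terms):
    """Place an item at the centroid of the unit axes of its keyterms."""
    if not terms:
        raise ValueError("an item needs at least one keyterm")
    weight = 1.0 / len(terms)
    return [weight if term in terms else 0.0 for term in KEYTERMS]


def distance(a, b):
    """Euclidean distance between two points in the keyterm space (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query_terms):
    """Graded feedback: every document gets a distance from the query, nearest first."""
    query_point = centroid(query_terms)
    return sorted(
        (distance(query_point, centroid(terms)), name)
        for name, terms in DOCS.items()
    )


if __name__ == "__main__":
    query = {"navigation", "memory"}
    for name, terms in DOCS.items():
        print(name, "hit" if boolean_match(query, terms) else "miss")
    for dist, name in rank_by_distance(query):
        print(f"{name}: distance {dist:.3f}")
```

With this query, the Boolean test reports a miss for every document, while the distance ranking still says which documents are nearer to the query than others. This is the sense in which incremental changes of degree become possible.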
Salton's scheme is still deficient in that keywords are mutually unrelated, which is inconsistent both with theory (e.g., Woelfel and Fink, 1980) and with empirical evidence (e.g., Harper and van Rijsbergen, 1978).

1.1.5 Facilitating Navigation

As Norman (1988) has pointed out, people do not need a user manual for a cup. How might information systems be built which do not require extensive learning time, or waste users' time by producing wrong, misleading, or incomplete results?

There are two general approaches which can be taken to information system design: system-based or user-based. The system-based approach is typical of most systems of all sorts. System-based design means that the designer generates specifications for the system's functionality and means for accessing the system functions. In contrast, user-based design starts with the processes the user goes through and language he or she employs to accomplish the system task, and builds the system specifications around them. Combinations of system-based and user-based design are possible, such as employing system-based design for the functional descriptions but user-based design for the interface (Nilan et al., 1989).

The advantage of user-based design is that a properly designed system will have minimal startup time: the functions and access methods will be organized as the user expects them to be. With system-based design, the user must first learn the secrets of the system and then translate his or her intentions into the language of the system.

The philosophy of user-based system design can be restated using the terminology introduced in the current work. The goal is to minimize startup time and empower users to accomplish desired tasks without translating their needs into language suitable for a particular system.
To meet this goal, the information space should match the cognitive space of the user, in the sense that the concepts and relations among them and the language used to reflect these are similar. I say again: The information space should match the cognitive space of the user. This means the system database would be as the user perceives the concepts and relations among them, not as the system designer perceives them. An information system designed in this way would be as a long-known friend, although a changeless and unresponsive friend: the user would be able to predict system responses in various situations. Such a system would be as a well-thumbed book, where desired passages could be found easily (even if they had never been seen before).

The next step, to make information retrieval more like human communication, will necessitate the provision of a cognitive space for IR systems (instead of an information space), so that it may be subject to negotiation through ongoing contact with its environment. The final step, as currently envisioned, will be to move from human communication-like systems to exosomatic memory systems.

1.1.6 Communication and Navigation

A long-term agenda for the type of information system envisioned in this chapter is to make the science fiction of today a reality -- to have systems which are able to respond to humans as other humans do. In short, to bring the (barely) navigable systems in use today up to the level of human communication systems. A purpose of this work is to consider similarities between human communication systems and information systems and to identify possible fruitful directions for information system design, which might lead to the ultimate goal of information systems as exosomatic memory.

The discussion of models earlier in this section is equally appropriate for human communication and information systems.
If people did not have a model of, say, a telephone directory (what it is, what it contains, how to use it, etc.), they would be hard-pressed to use it. Although the information space of an information system and the model it has of its users are not subject to change, there still is a definite model of the user which was instituted by the system designer. User-based design involves building a user model consistent with the model which the user expects the system to have of her- or himself.

There is no one definition of "success" for information systems and success is not a criterion which can be applied to many human communication encounters. We can, however, think about what makes interaction with either humans or information systems easier or more efficient (in terms of the intended understanding of messages sent or received). Given the existence of an acceptable model of the other, only two components of a communication system can be possessed by actors to facilitate the process of either communication or navigation:

1. A model of one's relationship to the other (in the sense of Mead, 1960)
2. A model of how to change that relationship

These components might be restated in Goffman's (1974, in Littlejohn, 1983) terms: people need a definition of the situation and a way of changing the situation. Stated again in Cushman's (1977) terms: people need to know the rules for an interaction and need to be empowered to change the rules.

The specific types of knowledge which make up each model depend on the type of relationship. A model of a relationship might include history, similarities to other relationships, communication norms, and so forth. A model of how to change a relationship might include a variety of perceived cause-and-effect occurrences (such as a set of known commands), the predictability of the other, and knowledge of available communication channels.
One example from human communication is when someone regularly purchases pizza for lunch at the same spot -- his or her model of the relationship is based on the roles of customer-salesperson, and includes a "script" (again borrowing Goffman's terminology) which the participants follow regularly. His or her model of how to change that relationship involves knowledge about other options for lunch (that is, non-pizza), and how to achieve these options. In order to break out of the "pizza script," the customer needs to change the model of the relationship which both parties have. In order to have a successful non-pizza interaction, in this trivial example, both participants need to adjust their understanding of what's going on by negotiating a new model. The extent to which this might be traumatic or difficult depends on how deeply ingrained the model is. Another example might be based on human-computer interaction. A regular user of word-processing software might receive a new version of the software. If for example, the old version ran under DOS and the new version runs under a windowing system, it might be that some of the command keys are different. In this case, the user, who was proficient in the old version, needs to learn to use the new version. This process, of changing both the (low-level) manner in which the user interacts with the word processor and of rendering the expert user to a less proficient status, is one of changing the relationship between the user and the system. However, the system's view of the system and the model it has of its generalized users (as created by the system designers) does not change in response to the user's changed model: In order to change the relationship, the user has to change, because the system will not. As Mead (1960) pointed out, a uniquely human ability is to model how another models the self -- to see oneself through another's eyes. 
As the second example demonstrates, the importance of such a model for information system use is paramount: it is incumbent on system users to know how the system expects the user to act. Few current information systems participate at all in helping the user to form a model of the system (at least through use -- training manuals, user guides, and help screens are all external to the system functionality). Such model-building activities might be analogous to the pizza-seller saying, "I am a pizza vendor. You may give me money, and I will sell you a pizza. In the event that you want a different sort of pizza than I have available, I will cook one for you."

The model of how to change the relationship has to do with predicting the impact of messages on the other. Again, this is a general but utterly central component of both human communication and information systems. Without a clear model of the other, you just muck about trying to form a message which generates the desired result -- like pushing buttons and turning knobs randomly in a control room, hoping to turn off a nuclear reactor.

Communication and navigation involve the same sorts of processes on the part of the actor -- both necessitate a model of the other and involve the exchange of messages. The main difference is the extent to which an information system is able to give the type of feedback typical of human communication systems: to dynamically change the model of the user, to understand the cognitive space of the user, and to incorporate the user's model of the system (and the data it contains) when responding to user input.

1.2 Goals for Information Retrieval Systems

Notions of navigation for IR systems are informed by consideration of what ideal IR systems might be. An historical image of ideal IR was introduced by Bush in 1945. His "memex" was a machine that provided personal information on demand.
With the advent of interactive computerized information systems in the 1960s, Bush's vision was pursued from two related angles. The first angle is what is here referred to as the "relevance-based" approach. The second has to do with "exosomatic memory." The following two subsections examine each of these angles in turn. A third subsection combines and extrapolates from the discussion thus far some specific criteria to apply to an evaluation of IR systems created in the relevance-based tradition, or in navigation, human communication, or exosomatic memory traditions.

1.2.1 Relevance-Based Information Retrieval

Relevance is not a simple concept and it is not at the heart of the current work. It is a topic of considerable history in the literature, and will not be fully treated here. Of current importance is the role of relevance in the design and evaluation of typical functional or experimental information retrieval systems.

Despite personal views or philosophies of their designers, most IR systems are designed in the "rationalist" tradition. "At its simplest," according to Winograd and Flores (1988, cited in Schamber, Eisenberg, and Nilan, 1990), "the rationalistic view accepts the existence of an objective reality, made up of things bearing properties and entering into relations." If this "simple" view of the rationalist tradition (as discussed further by Schamber, Eisenberg, and Nilan) is used to understand the design and evaluation of most IR systems in the literature, it is clear that these systems are well placed within this tradition. The following qualities of IR systems within the rationalist tradition demonstrate the assumptions which I believe underlie them:

- Systems are designed which assume that the language a user employs represents the user's information need (that is, queries are equated to information needs).
- Systems are typically evaluated with sets of pre-constructed queries and relevance judgements which are independent of each other and divorced from a user context.
- Relevance judgements are typically binary (yes/no).
- Documents are represented in only one way (that is, the keyterms chosen to represent a document are intended to be context free and static).
- Queries are independent of each other -- there is no allowance for ongoing or dynamic interaction between system and user.

The most common vision of a perfect IR system is one which provides all documents which are relevant to a particular information need and none of the non-relevant documents. As Figure 1.1 illustrates, even if the assumptions of relevance-based IR were accepted, there is a substantial problem having to do with the fact that these systems are not, in practice, matching information needs to information. Instead, they match document surrogates with queries.

For the current work, general notions of relevance for IR are not rejected. It is the rationalist tradition within which most relevance-based systems are designed and evaluated which is called into question. Relevance seems a fair evaluation measure for retrieval situations in which users have a very firm notion of what they seek and a sound basis for making judgements about the appropriateness of documents (or surrogates). Perhaps, as researchers move towards the "dynamic, situational" approach to relevance advocated in Schamber, Eisenberg, and Nilan (1990), the role of the rationalist tradition in the design and evaluation of IR systems will decrease.

Figure 1.1: What do information retrieval systems match?

1.2.2 Navigation-Based Information Retrieval

The more common vision of ideal IR, as mentioned above, has to do with retrieving all relevant documents and no non-relevant documents.
A less common vision of ideal IR, but one which is perhaps more closely associated with Bush's "memex," has to do with what Brookes (1975) called "exosomatic memory." Exosomatic memory, according to Brookes, is an IR system operating as an extension of human memory. Such a system might possess intimate knowledge of particular users or user groups and models of the types of processes the users would go through. Multiple placement of items in the database, multiple access methods, and serendipitous retrieval of items are assumed as parts of such a system because introspection reveals that they are in evidence in human memory processes. Exosomatic memory is an ostensible long term goal of relevance-based approaches to IR (discussed further in Chapter 2), but the qualities of exosomatic memory systems seem inconsistent with the assumptions usually inherent in relevance-based IR systems. An intermediate goal I see as being on the way to exosomatic memory is to develop information systems which are closer to human communication systems. A criterion about which to design IR systems to eventually meet that goal is that they facilitate navigation. Navigation is proposed as the most suitable alternative to relevance, and the only alternative I can think of, until there is a way to have information systems which are able to dynamically refine models of their users, systems which communicate as humans do. This work is not intended as a large step towards ideal notions of "exosomatic memory," but it is introduced as a change in direction for IR which facilitates the type of cognitive movement for model formation and change described in the preceding section, more so than systems which do not focus on navigation. Making IR systems more navigable is proposed as a step towards human communication systems for information retrieval, followed eventually by exosomatic memory. Navigation is not the only path which might be taken towards exosomatic memory. 
Researchers and theoreticians operating in the rationalistic model, in the pursuit of relevance, share the goal of exosomatic memory (at least as demonstrated by the ongoing citation of Bush's 1945 article). As stated earlier, I do not see a path from relevance-based IR to exosomatic memory -- the reliance on matching for relevance-based IR (see Figure 1.1) is inconsistent with the qualities described in the above paragraph. The practice of evaluating systems in terms of precision and recall, without consideration of the ability to, for example, serendipitously discover new links among items, does not indicate steps being taken towards exosomatic memory.

Other paths towards exosomatic memory might include a focus on the neurobiological aspects of information -- to study how human memory works, and develop ways of encoding thought patterns using external devices. Another is the study of how people, such as reference librarians, can retrieve information on demand for others -- trying to systemize the processes which people go through, possibly for incorporation in an expert system. These and other approaches have been bypassed for the current work in favor of navigation, because navigation has the dual benefits of being based (as described in this chapter) in a large body of theory of human communication and of being comparable (as described in Chapter 3) to relevance-based approaches to IR.

Four phases might be identified along the path towards the ultimate goal of exosomatic memory for IR. First is where we are now: systems which do not explicitly facilitate navigation, and do not attempt to match their information spaces to user cognitive spaces. These systems do possess information spaces, but the spaces are created largely without theoretically coherent modelling of the user.
Second is the phase which this work strives towards: making systems more navigable, by providing information spaces which better match users' cognitive spaces and through various other techniques which will be discussed in this chapter and Chapter 3. Navigation is provided in place of negotiation as found in human communication, and navigable information spaces are provided as alternatives to true cognitive spaces. As will be discussed in Chapters 2 and 3, navigation may be enhanced both by providing a more navigable information space (that is, one in which the information space is well matched to a user or user group's cognitive space, in that appropriate concepts have appropriate relations to one another), and by providing navigation cues.

The third proposed phase towards exosomatic memory is to make IR systems which perform as human communication systems. Such systems do not imply any sort of human-like intelligence on the part of IR systems, but do necessitate the ability of the systems to negotiate models of the other, as humans do. During ongoing communication, the systems would dynamically modify their database contents, organization, and presentation according to the situation at hand, making their databases, by my definition, cognitive spaces and no longer mere information spaces.

The final phase in this research program is to provide for exosomatic memory or a memex-like concept which might be more akin to intrapersonal communication than interpersonal communication. The path to the final phase is not clear at this time.

For the present effort, we return to consideration of what might constitute reasonable steps towards the goal of making navigable information systems, which will clear the way for providing systems which are more like human communication systems.
Navigation, as described here, can include a full range of information seeking activities, from goal-directed cognitive movement with a specific end-point, to movement towards a vaguely defined goal, to browsing. (These are three types of cognitive movement typically found in the IR literature. Dervin, 1983, lists other types of cognitive movement which are equally suitable, such as resolving uncertainty, getting clarification, solving a problem, overcoming an obstacle, etc.)

Navigable information systems are to be built by creating environments where a model of one's relationship to a system, and a model of how to change that relationship, are readily obtainable. Some possible characteristics of IR systems which are built with navigation as a fundamental concept might include: explicit cues as to one's status relative to the system and how to change that status; the use of document representation schemes which closely match how users perceive the documents, keyterms, and relations among them (that is, a correspondence between the user's cognitive space and the system's information space); an explicit model of the information space, in order for users to understand the system information space as it relates to their cognitive space; cues indicating where various items or concepts of importance to the user are located in the information space; the ability to take different views, build maps, retrace steps, and so forth -- to generate the type of interactive and dynamic environment for the exchange of messages typical of human communication. Empowerment of users through navigable systems could result from any of the specific qualities in this paragraph, which fall into one of two general categories. First is the provision of a system information space which more closely matches users' cognitive spaces.
Second, navigable systems provide cues which help the user to understand his or her status relative to the system -- the relationship between user and system -- and how to change that status.

The assumptions about information retrieval and navigation which underlie the current work are:

- We do not yet know enough about user needs, user situations, user goals, etc. to achieve the goal of presenting only documents that would be judged "relevant." Even in a human communication situation, the actors cannot predict fully the results of their messages.
- Information retrieval is an information seeking activity, the outcome of which is a user's cognitive movement.
- User cognitive space is not static, but is in a constant state of change. Therefore, information retrieval is a dynamic process in which a user's cognitive space interacts with a system's information space.
- Documents (and the information they contain) are not static: Their meaning, content, and importance exist only according to what they are being used for and who is using them. The notion of a document having some relevance value independent of a user is not accepted.
- Information spaces might have different meanings for different users, and can have no independent or "objective" meaning.
- Users may have very specific goals when approaching an IR system, or very undefined goals. Much use of IR systems is exploratory in nature. The concept of cognitive movement applies to all types of information seeking behavior.
- The facilitation of navigation is one step towards making IR systems more like human communication systems and eventually achieving the ultimate goal of IR systems as exosomatic memory.

Based on these assumptions, an immediate goal for IR systems is to capitalize on the dynamic, changing features of users and their information needs, and provide systems which facilitate navigation. The intermediate goal is to provide IR systems which approximate human communication systems.
The long-term goal is exosomatic memory.

This work starts towards providing more navigable systems by first building on assumptions which are consistent with navigation as described here (and therefore not part of traditional design approaches or the rationalist tradition). An effort is made to provide ways of making information spaces which are more consistent with system users' cognitive spaces. Methods of providing knowledge about one's status relative to the system and of how to change that status are developed.

1.2.3 Criteria for Evaluation of Information Retrieval Systems

At this point, most of the conceptual framework for this consideration of navigation has been created. An important task remaining is to consider how the definitions from the discussion thus far lead to criteria by which to evaluate information retrieval systems, including those which attempt to take steps towards eventual goals of human communication systems and exosomatic memory. This subsection presents some criteria for consideration in evaluating IR systems.

The specific criteria which follow can be divided according to the phase in IR system development at which they apply. For each phase, all the criteria from the previous phases would still be appropriate.

1. Relevance-based retrieval
   Does the system allow for identification of a particular document from a large collection?
   Can specific information needs be met ("specific" signifying that a relevance judgement can be made by a user with reasonable knowledge of the subject domain)?
   Is the nature of the system and its limitations clear?
   Are cues present to help the user to know how to proceed during a search?

2. Navigable retrieval systems
   Does the information space share characteristics of the cognitive space (of a user or user group)?
   Can a fruitful subset of the information space be identified (where fruitful means that the contents or organization can be used for a variety of purposes, not just the retrieval of particular documents to meet specific information needs)?
   Is the language of the system flexible (e.g., can you use different terms or follow different procedures and end up with a similar set of retrieved documents, or in a similar location in the information space)?
   Can different perspectives or points of view be applied to the information space?
   Can incremental changes be made in the system response set?

3. Human communication systems
   Can the system remember past interactions?
   Can "world knowledge" be employed to help meet information needs, such as by drawing on experience with other users or purposes, or applying knowledge from the information space?
   Is there an ability to spot differences in seemingly similar situations, or spot similarities in seemingly different situations?
   Does the system change somehow in response to messages from its users, on a continuing basis?

4. Exosomatic memory systems
   Can the system anticipate information needs?
   Can it search for information independently?
   Can the system learn from past experience?

Implied in the criteria for relevance-based systems are some of the components of Mead's models. A model of the other, a model of the relationship to the other, and a model of how to change that relationship, are all implied. In navigation-based systems, the nature of the desired model of the other changes, so that not just a "learnable" system is presented, but one which somehow corresponds to the user's expectations. That is, the user should be able to rapidly form a model of the other, the relationship to the other, and how to change that relationship, which is largely correct (if not fully detailed), but not explicitly communicated by the system. In other words, the system functionality and organization, not just the information space, should match the expectations of its users.
Models implied for human-communication systems include some basic model of the self possessed by the system, and an ability for the system to reconfigure its functionality and presentation to better match the perceptions of the users. With exosomatic memory systems, the model would necessarily incorporate a large portion of the user's views on the world -- something beyond our current understanding of the sorts of knowledge people use to get through the day, but not unimaginable.

Note that some of the criteria above, especially those dealing with model issues, are more appropriately considered from either the theoretician's or system designer's point of view. Others are appropriate for an empirical investigation. The set of criteria in this subsection will be revisited and refined in Chapters 2 and 3, and will form the basis of the analysis and comparison to be described later in this work.

1.3 Goals, Questions, and Definitions

The research goal of the current work is to explicate and investigate navigation as a fundamental concept for information retrieval and for information systems in general. "Fundamental concept" means that navigation should be at the heart of current IR systems and at the heart of information seeking behavior employing such systems. As a fundamental concept, navigation (as laid out in this chapter) describes what users of IR systems and other information systems do, even if navigation was not an explicit design specification for the systems.

Navigation and its role in the literatures relating to IR will be explored further in Chapter 2. As the literature review will show, navigation is consistent with the goals of various information systems, including existing traditional approaches. The conceptual framework provided in this chapter indicates that navigation is not a true end goal for information systems, but is a reasonable step on the way to systems which are akin to human communication and, eventually, human memory.
The approach this work takes is to first create a conceptual framework from which to understand navigation and related concepts, and then to create an environment in which navigation can be examined and to evaluate the environment for information retrieval. This chapter has started the necessary framework. Chapter 2 examines a wide range of literatures with implications for navigation in information retrieval and ends with suggestions for information system design for navigable systems. The rest of the work describes the creation and evaluation of the IR environment.

As navigation has not been formally addressed in the literature of information science and an IR system that focuses on navigation as a fundamental concept has not previously been built, this study is exploratory in nature. The research questions addressed by the current study are:

1. Is navigation a useful approach for operationalizing information retrieval?
2. What perceptions by users engaged in information seeking help to understand the use of navigation for conceptualization of information retrieval and information system design?

Let us examine the key concepts from the research questions. "Navigation" is as defined earlier in this chapter: human behavior to form and revise a model of an information space. Navigation as an approach for IR involves creating an information space and an information retrieval environment such that the two types of model building are facilitated: the ability of a human user to form and revise a model of his or her relationship with an IR system, and the ability of the user to form and revise a model of how to change that relationship.

"Useful" refers to the capability of a navigation-based IR system for accomplishing information seeking tasks as perceived by users of the system. It is a user-based criterion which will be evaluated in the context of users attempting to accomplish basic information seeking tasks with a navigation-based system.
Minimally, if a navigation-based system can be employed to meet information needs, it can be said to be useful. However, usefulness might be better considered relative to some other criterion. As Chapter 3 will describe, the usefulness of a navigation-based system will be evaluated relative to a system which was not based on navigation. Criteria will be employed to assess the relative usefulness of the navigation-based system. One set of criteria will include the time taken to accomplish similar tasks and the self-report of satisfaction on a Likert scale. A more important set of criteria is the written comments of users engaged in search tasks (again, Chapter 3 will provide details on the specific comments which will be solicited). Usefulness is not an absolute criterion -- the answer to the first research question cannot be a "yes" or "no." User comments, basic utility, and traditional time-on-task measures will all indicate the ways in which navigation is useful for operationalizing information retrieval. Chapter 2 will introduce some particular desirable qualities for navigable systems which are operationalized in Chapter 3. Thus, navigation will be evaluated for usefulness as a concept about which to build IR systems and the usefulness of particular aspects of a navigable system will be assessed separately. "Operationalizing information retrieval," in the first research question, is a way of saying that navigation will not be addressed from the armchair. An operational IR system, based on the philosophical and theoretical concepts from this chapter and including desired system qualities as will be described in Chapter 2, will be constructed and evaluated for its usefulness. Despite the arguable value of this empirical approach to validation of the claims for navigation introduced in this chapter, problems are introduced in the pursuit of knowledge through the creation and evaluation of a single system. 
Foremost is the difficulty of having a single system to evaluate: if negative or uncertain answers emerge to the research questions, then it will be uncertain whether the results are due to the inappropriateness of navigation or simply to the particular implementation of navigation. Another problem is that the creation of an evaluation system takes away from possible resources for a more detailed empirical study of navigation. Both of these problems must be endured due to the lack of a suitable navigation-based environment for the comparison. Let us turn to the second research question. "Perceptions by users" indicates that this is a user-based study, employing user-based methods. As such, I am interested in the perceptions of users engaged in information seeking activities with a navigation-based system. This is not simply a system-design experiment: as an effort towards making IR systems which are similar to human communication systems, such an outcome would produce a philosophical conflict. The perceptions to be elicited will be described fully in Chapter 3 and will include unstructured commentary on navigation concepts as found in the evaluation system, open-ended responses to items concerned with particular characteristics of IR systems and navigation environments, and responses to closed-ended items concerning satisfaction and the navigability of the information space. Both the evaluation system and its associated information space will be the subject of user commentary. The term "information seeking" is used less broadly here than in the works of Dervin and Nilan. It is here limited to the types of behaviors typically studied in information science research. Conceptually, information seeking might be limited in the current context to the use of IR systems to retrieve document surrogates in response to an information need. The information need might be for a known item, for information suitable for a particular query, or for general access (browsing). 
As Chapter 3 will show, information seeking will be operationalized in the narrow context of looking for document surrogates which meet information needs stated in sets of keyterms or sentences. This narrow application of "information seeking" may serve to limit the interpretation of the "usefulness" of navigation. In spite of this limitation, this application of information seeking is very similar to the types of tasks for which IR systems are evaluated in the literatures on information science, but with the benefit of retaining philosophical consistency with the assumptions laid out in this chapter. The understanding of the "use of navigation for conceptualization of information retrieval" refers to the insight gained throughout this work for the conceptual framework for navigation described in this chapter. Navigation, and the long-term research program in which it is a step, is proposed as a fundamental concept for information retrieval. Do Chapters 3, 4, and 5 bear this proposal out? What insight into navigation and navigation for information retrieval will be revealed during the process? The understanding of the use of navigation for "information system design" points not to the theory of navigation, but to the practice. Do the steps derived for the construction of a navigable IR environment work? Which steps seem to hold the most promise, and what perceptions by users help to indicate directions for future research? 1.4 Criteria for Evaluating this Work As an exploratory user-based study, it was difficult to anticipate whither the work would lead. Research questions were a starting point, but neither the evaluation environment nor the user perceptions could be predicted at the outset. This work on navigation was intended as a first step on a path towards exosomatic memory for information retrieval with a proposed route via human communication systems. Two uncertainties about these steps needed resolution. 
First was the path itself: the conceptual framework for navigation was laid here, but only part of the human communication phase and almost none of the exosomatic memory phase was given as much careful treatment. One criterion for this work was that it yielded insight into whether there is a path beyond navigation and whether navigation, as conceived here, was really its own path or simply a diversion from existing approaches to information retrieval. The second uncertainty was about whether the steps that are taken here were well-taken: were the methodological considerations derived from the conceptual framework described here well-suited for movement along the path to the ultimate goals of IR? This work did not end with the creation of a perfect navigation-based IR system and it did not wait for user perceptions to dictate all of the next steps for reconsideration of navigation for conceptual and methodological purposes. Instead, the work closed when it was clear that there were no further steps to be taken with the few stepping stones laid here. When there were sufficient data to indicate that, first, navigation as a fundamental concept for information retrieval could be looked at in a new light, and second, that alternatives for the design of functional IR systems could be considered with improved knowledge of the role of navigation, this work was drawn to an end. The internal criteria about which this work was organized seem fairly clear. From an outside point of view, the role of this work in the literature on IR and criteria which other scholars might apply to consider the relative merits of the work -- not only as a part of the literature, but as a document meant to serve the purpose of a demonstration of competence for a doctoral degree -- might not be clear. The lack of clarity is because this work does not fit the traditional mold and strives to do several things at one time. 
Not entirely conceptual, I attempted to lay a framework with immediate uses for this study and possible long-term applicability to examining the domain of IR. Not an empirical study, an empirical study was carried out in the context of evaluating a system designed for navigable IR. And not a system design exercise, this work did involve considerable effort in the area of system design. For an outside reader, the only criterion which made sense to apply to this work was that of contribution: how might this work contribute to the literature on IR and related literatures? Again, the contributions could come from several sources, including the conceptual framework, the empirical design and analysis, and the system-building. Inasmuch as the contribution of this work might not have been evident at the outset, and could be difficult to predict, the other criteria to be applied had to do with the thoughtfulness of the work, the skill with which the tools of system design and analysis were used, and the understanding expressed here of the strengths and weaknesses of the work. 1.5 Conclusion Navigation was a key skill for information seekers before the advent of computer technology (as described in Chapter 2). It was necessary for information seeking with existing information systems. The assumptions of relevance-based IR have resulted in systems which do not provide knowledge about the user-system relationship nor sufficient cues as to how to change that status. More importantly, there has not been a focus on creating information systems which are well-matched to users' cognitive spaces. This work investigated the usefulness of navigation for operationalizing information retrieval. It assessed both the conceptual framework as described in this chapter and the practical criteria derived from it as applied in an empirical information seeking setting, both within the context of a system designed for navigability. 
As an exploratory study, there was a limited extent to which firm answers concerning the applicability of navigation to information retrieval would be found. Instead, insight into the research questions and development of grounds for future research were sought. The work did not represent a deviation from the essence of information seeking behavior as was found in the literatures on human behavior and information retrieval, but was simply a redirection for research and development based on the start of a conceptual framework presented here. More navigable systems were proposed as a step towards IR which is more like human communication, towards the eventual goal of IR as exosomatic memory. This chapter has laid the groundwork for the rest of the study. The next chapter examines the role of navigation in various literatures related to IR, and leads towards refinement of the criteria and system-building steps which were only partially developed in this chapter. Chapter 3 introduces the system which was built, how the information space was constructed, and the methods employed to evaluate it. Chapter 4 presents the results of the system design and evaluation, and Chapter 5 draws the work to a close with a revisitation of all the qualities of this work, including the strengths, weaknesses, and implications for the future. CHAPTER 2: LITERATURE REVIEW 2.0 Introduction This chapter discusses several bodies of literature that inform the role of navigation for information systems. While most are not central to the conceptual framework from Chapter 1, they are important either for the framework's support, or because they are pertinent for methodology or design for the theory and practice of IR. The chapter starts with traditional issues for IR system design, including the general batch-oriented outlook of the relevance-based approach and the representation and indexing schemes associated with that approach. 
The following section considers some non-relevance-based approaches to IR, including a focus on browsing and several viewpoints on the cognitive aspects of users as they may help to understand the information seeking and use process. The next section deals with information space and introduces some of the groundwork which helped define the path for the current work (as discussed further in Chapter 3). The creation of information spaces which are more navigable than those produced by relevance-based approaches to IR has a long tradition, but such spaces have not been at the heart of common systems to date. The following section introduces wayfinding and visualization, which were drawn upon for the system design process described in Chapter 3. Finally, three sections summarize the literature, gather implications for information space and navigation, and give special treatment to researchers whose ideas were particularly influential on this work. 2.1 Traditions in Information Retrieval This section examines some of the more traditional approaches taken to information retrieval. Most of the qualities of current IR systems (including the system used for comparison in Chapter 3) may be found in the literature described here. The section starts with a discussion of some of the foundations of IR, including a brief (and not necessarily unbiased) history of the field. Two following sections introduce traditions in document representation and the assignment of keyterms for documents. 2.1.1 Foundations As do many fields, "Library and Information Science" claims its roots in antiquity. The transfer and storage of information have been critical components of survival throughout history. The history of the library shows that great stores of information sources were coveted by powerful rulers throughout the ages (Boorstin, 1983). Modern library and information science is not directed at withholding information resources from all but the very powerful. 
To the contrary, it is the charge of the modern information scientist to provide access to information resources to all people. In his Libraries of the Future (1965), J.C.R. Licklider envisioned that computers would be used to enable all people to access the information they need from the huge corpus which modern society has created. Information retrieval came into being as an area of study about the time the future role of computers in society was being identified. At the heart of the early literature on information retrieval are rationalist notions of relevance (as presented in Chapter 1). There have been many calls in the literature for different approaches to IR, some of which are included in this chapter. IR textbooks such as Pao's Concepts of Information Retrieval (1989) and Gerrie's Online Information Systems (1983) still offer few views of the field that are not relevance-based. The topics discussed in this chapter include the topics found in texts on information retrieval, such as the evaluation of IR systems, the interaction between system and user, the nature of information needs, and the representation of information. Unlike the (seemingly) related field of artificial intelligence, there is relatively little philosophy associated with the field of information retrieval. IR is a practice-oriented field in which the large number of installed computerized card catalogs, CD-ROM systems, and online systems tend to drive notions of what IR should look like. Workers and theoreticians in artificial intelligence realized long ago that their assumptions about what constituted intelligent behavior needed to be questioned and investigated (e.g., Hofstadter, 1979; see 'comp.ai,' a USENET newsgroup, for ongoing and current examples of such discussion). Questions about the "intelligent" role of the computer in IR and the necessary "understanding" that the computer must have of its users' needs have seldom been asked in the IR literature. 
While one might argue that it is the human intelligence that drives effective IR, not a computer intelligence, this potential man-machine symbiosis has seldom been the focus of IR work. 2.1.2 Representation for IR Document representation has a history and scope beyond the field of information retrieval. Representation has to do with the specification and maintenance of relations among some set of items. For our purposes, the relations are known by the formal specifications used to generate them, such as cataloging rules or database format statements. The items, for IR, are usually surrogates or labels for concepts. An important part of the history of document representation has to do with the practice of indexers and catalogers, who have the job of insuring that once a text is put somewhere it can be found again later (see Cleveland and Cleveland, 1983). Classification systems, such as the Dewey Decimal System and Library of Congress Subject Headings, were invented to help classify documents according to their subject area. This section considers the assignment of keyterms, or indexing, a foundation of representation for IR. A later section discusses variations on traditional indexing, specifically having to do with spatial representation. 2.1.3 Assigning Keyterms This subsection is concerned with the activities of assigning keyterms to documents for information retrieval. The goal of indexers and catalogers is to adequately describe the subject area of a given text (book, journal article, photograph, etc.) so that it may be retrieved later. (Catalogers also identify the publisher, author, title, and other information which will not be considered in the current work.) In practice, this involves assigning a set of keyterms to the text, usually chosen from a controlled vocabulary. In an ideal world, the vocabulary and criteria for selecting keyterms would be so well-defined that any trained indexer would choose the same terms for a given text. 
This ideal vision is not achieved in the real world (e.g., Sievert and Andrews, 1991). The whole practice of assigning keyterms to texts is firmly a part of the rationalist tradition. The job of indexers and catalogers is to assign keyterms which are context free -- independent of any particular user or use. The controlled vocabularies used for such purposes are carefully constructed so that there is minimal ambiguity in the meaning of any assigned term and a minimum number of terms may be assigned (Cleveland and Cleveland, 1983). (In human language, of course, there is considerable ambiguity in the meaning of terms; cf. Roget, 1977.) For practical reasons, keyterms are assumed for the most part to be independent of one another. A thesaurus may be used to look up related terms, but in all cases the number of terms and the number of relations among them are minimized for indexing and cataloging purposes. The historical practical reason for assigning a small number of keyterms is that only so many terms could fit on a catalog card, or in a computer record, before their numbers made storage and retrieval difficult. The reasons for minimizing the relations among the terms have to do first with reliance on trained indexers and catalogers to assign terms unambiguously, second with the difficulties in assigning term relations which are free of context, and finally with the representation schemes which are incapable of handling large-scale term relationships. That third reason for independence of keyterms for indexing is of particular importance for this work. Document representation for computerized information retrieval started and has largely remained at the independent keyterm stage. A small set of keyterms, perhaps six to twelve, are assigned to a document. This set and its associated bibliographic information is the "document surrogate" which is retrieved. 
The user provides a set of keyterms which represent his or her information need and the IR system matches the user's keyterms to document surrogates. (See almost any IR system, such as those offered by InfoTrac and Dialog, for an example of this type of retrieval.) Note that even in systems which provide access to the full text of an abstract, the same keyterm-based access applies. The main difference between searching a set of keyterms and an abstract is the number of keyterms: one hopes that some of the potential ambiguity in the words chosen to represent a document will be avoided by increasing the odds for overlap between system and user language. Thus, traditional indexing and representation for IR lessen the chance of an isomorphism between user cognitive space and system information space in that the concepts and especially the relations among them are almost definitely not as perceived by any particular user with a particular information need. A later section deals with spatial representation, which is of primary importance for this work. Other forms of representation exist in the literature, but are not considered by the author to have the capability of providing a system information space which matches an individual user's or user groups' cognitive spaces. The argument for this statement is a philosophical one: spatial representation allows for all concepts in an information space to be related to each other. The other representation schemes in the literature do not allow for fully relational representation. The assumption is made that all concepts are related to one another (while admitting that not all relations will be "rich" or particularly meaningful). This argument is given empirical support in the work of Woelfel and his colleagues (e.g., Woelfel and Fink, 1980). 
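The independent-keyterm matching described in this subsection can be illustrated with a minimal sketch. It is not drawn from any system named here; the function name, the surrogate records, and the ranking by count of shared keyterms are all assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Set, Tuple


def match_surrogates(
    query_terms: Iterable[str], surrogates: Dict[str, Set[str]]
) -> List[Tuple[str, int]]:
    """Rank document surrogates by the number of keyterms shared with the query.

    Keyterms are treated as independent labels: a surrogate is retrieved only
    when at least one of its assigned terms literally equals a query term.
    """
    query = {term.lower() for term in query_terms}
    scored = []
    for doc_id, keyterms in surrogates.items():
        overlap = query & {term.lower() for term in keyterms}
        if overlap:
            scored.append((doc_id, len(overlap)))
    # Most shared terms first; ties broken by document identifier.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical surrogates, each a small set of assigned keyterms.
SURROGATES = {
    "doc1": {"information retrieval", "indexing", "thesauri"},
    "doc2": {"navigation", "wayfinding", "maps"},
    "doc3": {"information retrieval", "browsing"},
}
```

Under these assumptions, a query for "information retrieval" and "browsing" retrieves doc3 ahead of doc1, while a query phrased as "navigating" retrieves nothing, because no relation between "navigating" and "navigation" exists for the system.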
For a review of document representation and retrieval methods not included here, such as probabilistic retrieval and various weighting schemes, see Vickery (1971), Richmond (1972), and Batten (1973). Perhaps it is indicative of the continued use of the same representation schemes that there has been no ARIST chapter on "document description and representation" since Batten's chapter. Nor has there been any real change in how things are represented, with the possible exception of Oddy's work (Oddy, 1977; Oddy and Balakrishnan, 1991). Note that Farradane, among others, has advocated what he calls a relational approach to document representation for traditional indexing, but his techniques have not been applied on a large scale (Farradane, 1952; Farradane and Thompson, 1980). Nor do Farradane's methods fit well within the framework for information space and cognitive movement developed for the current work. 2.2 Non-Relevance-Based Approaches and the Retrieval Process As mentioned in Chapter 1, the assumptions associated with relevance-based IR may be somewhat appropriate for informed users who have well-defined information needs. These users are familiar with their subject matter, know the terminology of the system, and may have information needs such that binary relevance judgements are reasonable and reasonably independent (for instance, they might be looking for articles suitable for a literature review). There is a whole spectrum of information users and information uses, however, that are not facilitated by the assumptions of the relevance-based approach. This section examines some of the work done on those types of users and uses. 2.2.1 Browsing Browsing has a long history in non-computerized approaches to information use (see Kent and Lancour, 1970). There have been a number of studies to identify the role of browsing and the range of browsing behavior in information seeking and use (e.g., Morse, 1973; Apted, 1971). 
Researchers such as Hildreth (1982) and Palay and Fox (1981) realized that users were browsing with IR systems and started to identify the types of behavior that indicated a less directed search than the designers of relevance-based IR systems had intended, and to think about how to design systems which enable those types of behaviors. There has been a recent emergence of computerized browsing systems for IR, usually with the specific audience of users who are not in a strongly goal-directed search mode (Burgess and Swigger, 1986; Larson, 1986; Shneiderman, Shafer, Simon, and Weldon, 1986; Marchionini and Shneiderman, 1988; Pejtersen, 1989). Some researchers who produce browsing systems, such as Canter, Rivers, and Storrs (1985), specifically mention "navigation" as an activity that users perform. Whether they mention navigation or not, most of the browsing systems in the literature provide a visual interface such as HyperCard. Only a few researchers, such as Noerr and Noerr (1985) and Cove and Walsh (1988) consider various representations for data and other more theoretical topics for browsing. Hypertext has the potential to allow user-based links between items in the database, but the literature does not show any work on matching such information spaces to particular cognitive spaces. The literature on browsing and the implementation of IR systems for browsing provides a firm theoretical and practical basis for the implementation of navigation systems for IR and does not fit with the relevance-based assumptions introduced in Chapter 1. Navigation is able to encompass a range of information seeking behavior that goes from that implicitly assumed by relevance-based systems to the type of non-directed and informal behavior which browsing systems attempt to provide for. 2.2.2 Cognitive Approaches Human communication is an interactive, ongoing process. 
In librarianship, recognition of the nature of human communication for the expression and understanding of information needs has manifested in the literature on the reference librarian (Dervin and Dewdney, 1986; Taylor, 1968). For other areas of study, such as expert systems, the eventual resolution of an information problem comes about from an ongoing interaction between system and user (Parsaye and Chignell, 1988). For IR, there is a body of literature directed at interactive systems. Much of this literature points towards the eventual goal outlined in Chapter 1 of having information systems which are more like human communication systems. 2.2.2.1 Dialog-Oriented Systems A small number of IR systems adapt to ongoing user input, attempting to identify a suitable set of documents to retrieve. This is an adaptation of "relevance feedback," and such systems might be called relevance feedback systems. Most of the literature on relevance feedback systems fits within the rationalist tradition (e.g., Salton and McGill, 1983). In contrast, dialog-oriented systems, such as Oddy's THOMAS (1977) and later PTHOMAS (Oddy and Balakrishnan, 1991), involve an ongoing communication process, in which the user helps direct the system towards the most useful set of documents in the database. In this process, the system eventually "learns" the user's information need (although systems found in the literature do not remember the interaction for future searches or searchers). The documents are represented in a richly connected network with keyterms. The dialog style of "querying" the system is closer to a human communication approach to IR than to the relevance-based approach. Except for PTHOMAS, there are few explicit links in the literature between relevance feedback of any type and connectionist models or neural networks (Rumelhart and McClelland, 1986). 
However, the document representations and retrieval processes for relevance feedback systems are closely related to connectionist models which, after all, produce information spaces. In neural networks, input layers (which receive data from the external environment) and output layers (which provide data to the environment) are connected (usually by a middle layer). Such networks may use either local representation in which one concept is associated with one input or output "neuron," or distributed representation in which some pattern on the layer is associated with a concept. Here, concepts might be as simple as the ASCII code for the letter 'A,' or at least as complex as a pattern which is associated with, say, 'maleness' in a photograph. The way a neural network is 'trained' to associate certain patterns in the activations for the input and middle layers with particular output sequences is through feedback -- the network is told the extent to which the output it generates is correct. Note that dialog-oriented IR is applicable to the full range of information seeking behavior, from that assumed by relevance-based approaches through browsing. Also note that real-world characteristics of information seeking, such as the order effects of a search's outcome (Eisenberg and Barry, 1988) and changing information needs during the search are allowed for. 2.2.2.2 Anomalous States of Knowledge One of the assumptions of the rationalist tradition for relevance-based IR is that users are able to express their information needs. The expression of information need is the basis for the query, which is what is matched to document surrogates by information retrieval systems (refer again to Figure 1.1). Oddy, Belkin, and Brooks have built the Anomalous States of Knowledge (ASK) hypothesis around the notion that users are often unable to express their information needs (Belkin, Oddy, and Brooks, 1982). 
In subsequent studies, such as Belkin and Kwasnik (1986), research based on the ASK hypothesis generated information need structures derived from natural-language expressions of information need. These structures were to be used to choose retrieval strategies. For the current work, the main significance of the ASK hypothesis is its portrayal of the information seeker as engaged in a dynamic and uncertain process, making incremental changes and reassessments of information needs. 2.2.2.3 Cognitive User Models In recognition of the importance of understanding users' information needs from their point of view, and as an attempt to better match an expression of information need to document representations, there has been some work in IR and related literatures on cognitive modeling of users. The general thrust of this work is to raise the level of matching for IR: instead of matching document surrogates to queries, information structures are to be matched to cognitive states (Belkin, 1975). One of the problems that becomes apparent through a reading of this fairly small literature is the lack of understanding of how to implement user models for IR. Just as various document representations might be employed for different situations, so might different types of cognitive models be employed. Characteristics of information seekers' cognitive states can be identified -- prior knowledge, importance of the search, intellectual arousal, etc. -- but how to make use of this information for IR is unclear. Daniels (1986) provides a detailed review of research on user models for information retrieval. Such models have focused on the users' questions (Saracevic and Kantor, 1988), on the relation between "external" information and the internal state of the user (Yovits, de Korvin, Kleyle, and Mascarenhas, 1987), on the role of the user's cognitive state in incorporating information (Belkin, 1984), and other areas. 
The work on cognitive models for IR indicates the importance of incorporating the user's view of information. Document relevance judgements cannot be made independently of a user and his or her information need situation (Schamber, Eisenberg, and Nilan, 1990). Understanding of the user's information need cannot be accomplished through the construction of a boolean query -- it requires something of the "model of the other," crucial to human communication. 2.2.2.4 Sense Making Information retrieval systems should facilitate information seeking. Information seeking is a form of human behavior in which people gather information in order to "make sense" of their world (Dervin and Nilan, 1986). The Sense Making paradigm describes "information seeking and use" behavior as that in which people actively attempt to bridge (cognitive) gaps in their knowledge. The gaps may be of many different types (Dervin, 1983), and may be well- or ill-defined. The Sense Making approach to understanding human behavior is well suited to IR in that "sense making" describes what people attempt to do with IR systems. "Sense making" also fits the wide range of perspectives on information system use, of different systems, and of different types of users. An important concept at the heart of the Sense Making paradigm is "cognitive movement" (Nilan and Rosenbaum, 1991). Cognitive movement is what happens during sense making: people "move" from one understanding to another. As information is received or created and gaps are bridged, what an individual knows will change. Whether cognitive movement is in the form of a new context, additional understanding, uncertainty, identification of a new gap, or whatever, some sort of change has necessarily taken place. Dervin and Nilan's work (c.f. Dervin, 1983; Newby, Nilan, and Duvall, 1991) has empirically demonstrated the utility of a mapping of the cognitive "space" of information seekers on the domain "space" for a particular task. 
The works of Dervin and Nilan (see especially Nilan and Rosenbaum, 1991) lead to recognition of cognitive movement as a fundamental human condition. Dervin's and Nilan's work has shown the desirability of a correspondence between the user's cognitive space (often organized in terms of situations, needs, and uses) and the system or task space. This mapping or correspondence between information spaces facilitates cognitive movement by placing the user in a familiar cognitive environment -- one in which everything is organized as expected, and very little learning must take place to use the system (c.f. Nilan et al., 1989).

2.3 Information Space

Previous sections have indicated the role of spatial representation for IR. This section addresses perspectives on information space which go beyond document representation and also considers some of the psychological literature on the role of space in human cognitive behavior.

2.3.1 Spatial Representation for IR

Spatial representation schemes have a long history in the literature of information retrieval. The term, "spatial representation," is used as an alternative to "vector representation" or "relational representation," although in practice, the terms are largely interchangeable. "Spatial representation" might be used to refer generally to any representation scheme which incorporates relations among items in the database, but this would not be useful since all databases incorporate some relations. Instead, the term "spatial representation" will be applied to those information spaces in which a large number of items in the database have specified relationships to each other, where the types of relationships are some sort of similarity measure (as discussed in Chapter 1). Salton's vector space is an example of such a representation scheme. This section is divided for separate consideration of orthogonal and non-orthogonal approaches.
2.3.1.1 Spatial Representation with Orthogonal Vectors

In the mid-1960's, Salton introduced a representation scheme which has formed the basis for more than 25 years of work on IR. In his scheme (c.f. Salton and McGill, 1983), keyterms are still independent. Terms are represented as orthogonal vectors in a space. Documents are a linear function of their associated term vectors. The "meaning" of the documents (in a limited sense) is taken from the keyterms associated with the documents. Although Salton's work falls very firmly within the rationalist tradition and relies on the same methods for assigning keyterms which are used for other representation schemes, his scheme does provide for relations among documents, which is an important improvement over typical (non-spatial) independent keyword schemes (discussed in Chapter 1). Other work in IR has advocated the implementation of keyterm relations for representation. Hierarchical structures (Schauble, 1989; Thompson, 1971) and clustering (Griffiths, Luckhurst, and Willett, 1986) are possible ways of representing relations among terms. Although this work is concerned mainly with spatial representation schemes, I agree with theoreticians such as Bobrow (1975) who advocate multiple representation schemes within a single system. Later incarnations of the current work might include a diverse selection of representation schemes such that the match between the system and user's spaces might be optimized. The advantage of spatial representation schemes for IR is that they allow for all terms and documents to be related to each other in some way. This is difficult or impossible without a spatial representation. The provision of sets of relationships among items in a database has theoretical palatability from the point of view of General Systems Theory (von Bertalanffy, 1961). In Salton's scheme, however, relations among documents which might be perceived by people would not exist for the system.
For example, take three document surrogates, A, B, and C: A = (TRAINS, COAL, TRANSPORTATION), B = (ENGINEERS, RAILROADS, POLLUTION), and C = (DOGS, CATS, MICE). In Salton's scheme, all of the keyterms are unrelated, and so each document would be equally unrelated to the others. However, a human judge would probably decide that A and B are more closely related to each other than either is to C.

2.3.1.2 Spatial Representation with Term Relations

Salton's vector space method, while pervasive in the literature, has not gone without criticism. There is a definite theme in the IR literature of attempting to incorporate term relations in vector schemes. This is opposed to the method of assuming term independence (orthogonality). Raghavan and his colleagues have been strong advocates of the notion that spatial representation schemes should incorporate relations among keyterms (Raghavan and Yu, 1979; Raghavan and Wong, 1986; Wong, Ziarko, Raghavan, and Wong, 1987). They proposed various methods, all based on statistical relations (such as the cosine measure), for evaluating the relations among terms. In all cases, they were able to increase performance as measured on the traditional relevance-based scale of recall versus precision. Salton made a foray into the area of term dependencies for IR (e.g., Yu, Buckley, Lam, and Salton, 1983) and rejected his findings as largely inconsequential (his further work in the area did not result in changes to his overall methods for representation and never left the rationalist tradition). Van Rijsbergen (1977) pointed out the theoretical advantages of term dependencies for IR and advocated the term co-occurrence method similar to that introduced later in this study (but he also pointed out that there might not always be enough data for co-occurrence to be used effectively). Doyle (1962) was an early advocate of such a method.
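The contrast between term independence and term relations can be made concrete with the three surrogates A, B, and C from the example above. What follows is a minimal sketch in Python, not a reconstruction of any cited system: the term-relation values are invented purely for illustration, and the generalized cosine used here is only one of several ways a term-by-term similarity matrix might enter the comparison.

```python
import numpy as np

# Keyterms from the three example surrogates, in a fixed order.
TERMS = ["TRAINS", "COAL", "TRANSPORTATION",
         "ENGINEERS", "RAILROADS", "POLLUTION",
         "DOGS", "CATS", "MICE"]

DOCS = {
    "A": ["TRAINS", "COAL", "TRANSPORTATION"],
    "B": ["ENGINEERS", "RAILROADS", "POLLUTION"],
    "C": ["DOGS", "CATS", "MICE"],
}


def doc_vector(keyterms):
    """Binary document vector over TERMS."""
    return np.array([1.0 if t in keyterms else 0.0 for t in TERMS])


def similarity(x, y, term_sim):
    """Cosine-style similarity under a term-by-term similarity matrix."""
    num = x @ term_sim @ y
    den = np.sqrt(x @ term_sim @ x) * np.sqrt(y @ term_sim @ y)
    return float(num / den)


# Orthogonal case: every term is related only to itself.
orthogonal = np.eye(len(TERMS))

# Assumed term relations (hypothetical values, chosen for illustration).
related = np.eye(len(TERMS))
for t1, t2, value in [("TRAINS", "RAILROADS", 0.6),
                      ("TRAINS", "ENGINEERS", 0.3),
                      ("COAL", "POLLUTION", 0.5),
                      ("TRANSPORTATION", "RAILROADS", 0.4)]:
    i, j = TERMS.index(t1), TERMS.index(t2)
    related[i, j] = related[j, i] = value

vec = {name: doc_vector(terms) for name, terms in DOCS.items()}

for label, matrix in [("orthogonal", orthogonal), ("related", related)]:
    ab = similarity(vec["A"], vec["B"], matrix)
    ac = similarity(vec["A"], vec["C"], matrix)
    bc = similarity(vec["B"], vec["C"], matrix)
    print(f"{label:>10}: A-B={ab:.2f}  A-C={ac:.2f}  B-C={bc:.2f}")
```

Under the identity matrix all three pairs score zero, which is the system's view in an orthogonal scheme. With the assumed relations, A and B score 0.6 while each remains at zero against C, which is closer to the ordering a human judge would be expected to give.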
Consistent with the relevance-based approach introduced in Chapter 1, the work on term-dependent spatial representation schemes has largely used canned queries and binary relevance judgements to assess effectiveness. The result of these workers' efforts has been a slight increase in the recall/precision curve which is found in most IR evaluation studies (for a review, see Belkin and Croft, 1987). This work draws heavily on prior research on spatial representation. The scheme to be introduced in Chapter 3 is not substantially different from schemes used for other work -- indeed, it does not use some of the findings that have been shown to increase performance in other systems (as measured using traditional relevance-based methods). Prior work has discussed both the conceptual and empirical justification for spatial representation. This work will take relational representation and go one step further, towards navigable information space with the intent of eventually better matching information spaces with cognitive spaces to make the information spaces more navigable, as discussed in Chapter 1.

2.3.2 Information Space in the IR Literature

McGill (1975, 1976) proposed a multidimensional space in which the user's knowledge could be represented as a vector, which would then be compared to document vectors in the same space. A given volume of the knowledge space would represent a given area of study. McGill's later work was clearly in accord with the assumptions typical of Salton's work. However, a concept of great importance for the current work was introduced in this early work: the spatial representation of documents, keyterms, areas of study, and users' knowledge TOGETHER is an important change from the representation schemes and retrieval processes of the day. Meincke and Atherton (1976) wrote of "knowledge space," in which fundamental concepts of a literature may be represented as orthogonal vectors.
As part of the same intellectual tradition as Salton and McGill, Meincke and Atherton also propose that documents be located at the center of their associated keyterms. An important suggestion is made about the role of representation in the tracking of intellectual growth: both the changes in a literature and changes in the knowledge of a user (or user population) may be tracked over time as the knowledge space changes. Meincke and Atherton's proposal did not result in a functional system, but it did serve to introduce the idea that document representation schemes might provide more than a set of retrieved documents. A spatial representation scheme might give insight into the overall "flavor" of a literature and high-level relationships among documents and/or different areas of study (see also Anderson and Shifrin, 1980; Bawden, 1986; Davies, 1989; and de Bono, 1978). Brookes (1980) provides the most theoretical treatment of the concept of "information space" found in the IR literature. He compares physical space and mental space, and suggests that the qualities of physical space should also apply to mental space. His use of the term "mental space" is consistent with the use of "information space" and "knowledge space" found elsewhere in the literature. Each person's individual mental space includes a location of self, from which perspective originates. The items in the mental space follow the rules of perspective of a physical space, in that items which are closer seem larger (and more important) and items in the distance appear to merge together or disappear over the horizon. The task before information scientists, according to Brookes, is to translate the "objective knowledge" held in computerized databases into subjective landscapes which correspond to users' own understandings of the database contents. The term "information space" is found frequently in the IR literature, but there is no coherent notion of what the term means nor of its role in IR systems.
2.3.3 Information Space in the Psychological Literature

Spatial perception and spatial representation are associated with large bodies of literature in psychology and related fields. This section only touches on a few works with the purpose of indicating that spatial perception and spatial representation are a firm and definite part of human information processing and behavior from a different point of view than taken by information theorists such as Nilan and Dervin (Chapter 1). James J. Gibson (1979) is a controversial figure in the psychological literature. One need not accept his theoretical orientation to see that the corpus of theory and research he has produced indicates the important role of spatial perception and orientation in human behavior. As intentional agents in their perceptions of the world, says Gibson, people use the physical world as a basis for a shared reality. They then create their own psychological universe based on the physical world they share with other people, but it is otherwise wholly their own. Philip Johnson-Laird (1988b) has advocated the importance of spatial representation for human memory, which he would like to apply to artificial intelligence (1988a). His experiments have empirically shown the importance of spatial encoding of knowledge and spatial cues for human understanding. While his case may seem overstated (he seldom seems to admit that there are items to be encoded in human memory which do not have analogues in physical space), he has generated evidence for spatial encoding of experience. People do tend to take verbal expressions and form spatial representations, he argues, and these spatial representations are both more informative and more efficient than verbal representations. Johnson-Laird's experiments support a cognitive space in which all concepts are related to all other concepts, in which people construct arbitrary relations among items when none are available.
Writers such as von Eckartsberg (1981) and Eliot (1987) have described the historical and cross-cultural aspects of psychological space. (Deregowski, 1989, is among the many who have performed empirical research on the cultural roles of spatial perception.) This body of work provides substantiation for the assumption that the concept of cognitive space is ahistorical and acultural, although not without variation. For instance, different cultures (and different groups or subcultures within a culture) might have different concepts and relations among the concepts as the basis for their information spaces. Both von Eckartsberg and Eliot give treatment to the philosophical aspects of cognitive space (which they refer to as "mental space") with von Eckartsberg favoring the Eastern perspectives and Eliot favoring the Western. In both cases, the role of physical and mental space, spatial perception, spatial organization, spatial memory, etc. are found to be central to human cognitive and social behavior. In addition, Eliot discusses the role of "psychological" space in measurement (psychometrics) and human development. With the possible exception of Johnson-Laird, there is little explicit guidance from the psychological literature on how to apply theory and research on spatial perception and representation to information retrieval. However, there is substantial justification for the notion that an effective representation scheme for IR would involve spatial cues and would provide an environment which allows human spatial perception and organization skills to be used.

2.4 Wayfinding and Visualization

This section examines literature with implications for both representation of data for IR and interface characteristics.
If one is to accept the proposition from Chapter 1 that navigation through information space is an activity associated with information seeking and use which leads to the fundamental human experience of cognitive movement, then consideration of the literature on navigation through physical space is called for. The literature on wayfinding, maps, and so forth will not be given lengthy treatment in this work. Gluck (1991) may be consulted for a review of the literature on wayfinding and geographic information systems. This literature is pertinent for the current work, but also speaks largely for itself: clearly, navigation through information space can be informed by the literature on navigation through physical space. Just as clearly, though, there are some differences between the objects of study by information scientists, on the one hand, and geographers, urban planners, and so forth on the other hand. An important overlap for all these areas of study is the choices made by system designers about how to best represent the database (or city, or landscape...) for a particular use. Lynch's (1960) classic The Image of the City is a founding work in the modern study of human views of cities. Much later work on wayfinding and urban planning followed from this work. Lynch proposed a five-class taxonomy of a city's geographic features: landmarks, paths, regions, nodes (intersection points) and edges (terminators or boundaries) (from Gluck, 1991). These features are used by people to find their way around, and recognize where they are. Another model for the wayfinding activities of people points to a three- part cognitive map: places, spatial relations between the places, and travel plans (Garling, Book, and Lindberg 1984, in Gluck, 1991). Literature on information retrieval has begun to use some of the language and literature from the geographic literature. Kerr (1990), for example, points out the importance of navigational cues for wayfinding in an electronic database. 
Lynch and others point to the development of an accurate model of a city or landscape as a necessary prerequisite for effective navigation. The rest of this section deals with visualization for IR. If navigation is to be implemented for IR, consideration must be given to aspects of the interface. In batch-mode, all interaction is typically via text: the user types a query and receives some text as output. For wayfinding activities to be effective, visual navigation cues and possibly a visual interface are indicated. J.C.R. Licklider (1969) was again proven a visionary in his prediction of the role of computer graphics for information systems. He foresaw times such as these when real-time high quality interactive graphics are an important part of the human-computer interface. The Macintosh desktop, the X window standard and virtual reality (BYTE, 1990) are all areas in which the graphical presentation and manipulation of data are becoming commonplace. A previous section mentioned the hypertext-based browsing environments which are emerging in the literature. In addition, work is being done on visualization for various representation schemes (Crouch, 1986), visualization for database management systems (Afsarmanesh and McLeod, 1989), and visual query languages (Chang, 1990). Scientific data visualization is another important topic which is coming into its own (Banchoff, 1990). For information retrieval, visual interfaces and the visualization of databases are part of an emerging trend which should lead towards more navigable systems, although the thrust of work on data visualization is not usually defined as related to information retrieval.
2.5 Literature Central to the Current Work

This section is of particular importance for the current work, in that the literatures it mentions offer the most direct support for the conceptual framework and methodological approach to investigating the research questions from Chapter 1 as opposed to offering general support, clarification, or substantiation, as have most other literatures reviewed in this chapter. Chapter 1 mentioned "exosomatic memory" as proposed by Brookes (1975). Miller (1968) proposed a similar concept: both Miller and Brookes envision the ultimate goal of information retrieval as a "group mind" in which the knowledge of a society is stored and made accessible to all. Miller strongly advocates spatial representation of data as he believes the ability to organize information spatially is important for human information processing. Brittain (1979) elaborated on the idea of information systems leading to extensions of human memory. Robertson (1979) is a strong advocate of spatial representation for information retrieval. He reviews several possible spatial representation schemes, including those of Salton (SMART), Oddy (THOMAS), and Meincke and Atherton. He states that spatial representation shows great promise, but there is not yet a consistent spatial model for IR. Koll (1978) performed a set of experiments to determine whether a spatial representation which conformed to the result of psychometric measurements of users' cognitive relations would improve retrieval. This work is the only test of the effectiveness of creating an information space based on a user group's cognitive space found in the literature (although Koll did not use the concepts introduced in Chapter 1). He applied multidimensional scaling (MDS) techniques to document and keyterm relations as perceived by users (the philosophical, theoretical, and methodological background for Koll's work may be found in Woelfel and Fink, 1980).
His study showed that a spatial representation which matches the cognitive state of a user is more effective for retrieval than an orthogonal vector space (with effectiveness measured in the relevance-based tradition). One of the problems of Koll's representation scheme is the expense of data collection to form a database -- the number of paired-comparison judgements by humans that must be made for even a modest database based on MDS (say, 100 documents with a total of 400 unique keyterms) is prohibitively high. The necessary sample size for each homogeneous user group makes the building of such user-based information spaces even more expensive. As a remedy to the creation of user-derived term-dependent information spaces, but at the expense of a more precise match to the cognitive structures of users, automatic methods for the creation of information spaces have been proposed. Noreault, McGill, and Koll (1979) performed a comparative analysis of 67 statistical similarity and dissimilarity measures, many of which could be represented spatially. They found differing abilities of the measures to distinguish among sets of relevant and non-relevant documents (as measured in the relevance-based tradition). Jones and Furnas (1987) performed a similar set of experiments with comparative analyses of six common similarity measures for IR. One of the methods analyzed by both Noreault, McGill and Koll (1979) and Jones and Furnas (1987) is the Pearson correlation. As Rodgers and Nicewander (1988) point out, there is a direct spatial representation corresponding to a group of Pearson correlation scores. The Pearson correlation (or "correlation coefficient") has been used by several researchers to form information spaces for retrieval, such as Trivison (1987) and Harper and van Rijsbergen (1978). The Pearson correlation is also closely related to the cosine measure which is very frequently used for representation in IR (refer again to Noreault, McGill and Koll, 1979).
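The relation between the Pearson correlation and the cosine measure can be stated exactly: the Pearson correlation of two vectors is the cosine of the angle between them after each has had its own mean subtracted. The short Python sketch below illustrates this with two invented term profiles (the frequencies are hypothetical and do not come from any study cited here).

```python
import numpy as np


def cosine(x, y):
    """Cosine of the angle between two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


def pearson(x, y):
    """Pearson correlation computed as the cosine of mean-centered vectors."""
    return cosine(x - x.mean(), y - y.mean())


# Hypothetical occurrence counts of two terms across six documents.
term_a = np.array([3.0, 0.0, 2.0, 5.0, 1.0, 0.0])
term_b = np.array([2.0, 1.0, 2.0, 4.0, 0.0, 1.0])

print("cosine :", round(cosine(term_a, term_b), 4))
print("pearson:", round(pearson(term_a, term_b), 4))
print("numpy  :", round(float(np.corrcoef(term_a, term_b)[0, 1]), 4))
```

The two measures differ only in the centering step, so they diverge most when term vectors have large means relative to their spread, and coincide when the vectors are already centered on zero.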
More complicated similarity measures which result in spatial representation are found in the work of Deerwester et al. (1988, 1990, also Furnas et al. 1988). They used a technique called singular value decomposition to extract a reasonably small (50-100) collection of orthogonal vectors from a large term-by-document matrix. The vector space represents the higher level concepts of a given literature. Unfortunately, these researchers evaluated their system in the relevance-based tradition, used canned queries and relevance judgements, and did not visualize the space. Even so, the technique showed promise for retrieval effectiveness. Natural language text has been used to create a multidimensional information space based on a term by term correlation matrix. Danowski (1988; personal communication 1990-91) performed analysis on electronic bulletin board systems, finding that high level relations among terms and concepts which were discussed emerged when the space was visualized. In this case, visualization of the space was accomplished by viewing the first three dimensions of the multidimensional analysis. Because relationships among all terms were computed in the space, and the first three dimensions account for the most variance in the space, a fair visual approximation of the multidimensional space was possible with only the first three dimensions extracted from a statistical measure of lexical distance. Freeman and Barnett (1991) performed a similar analysis on a corporate electronic mail system, finding that the concepts which emerged from the word co-occurrence analysis helped to understand a corporate culture.

2.6 Outcome of the Literature Review

There are several general statements that may be made based on the literature covered in this chapter. These findings will be applied in Chapters 3, 4, and 5 for formation of the methods for investigation, creation of the system, and analysis.
They supplement the criteria for navigable systems generated in Chapter 1.

- Older, foundational writings in the field of information science (e.g., Licklider, 1965, Robertson, 1979, and Miller, 1968) point more towards the assumptions of the current work (outlined in Chapter 1) than to the assumptions of relevance-based IR.
- Spatial perception, spatial movement, and spatial organization are central to human behavior and constitute one way of operationalizing information retrieval.
- Spatial representation for information retrieval has a long history, but most spatial representation schemes for experimental systems assume independence of terms in the space.
- Assumptions about the independence of terms and the independence of relevance judgements for most relevance-based IR work have not gained empirical support.
- The literature on browsing supports the notion that information seeking behavior is often undirected and the ability to visualize and explore a database is important.
- Psychometric approaches (such as multidimensional scaling) and statistical approaches (such as word co-occurrence) provide a basis for discerning relationships among terms and documents.
- Visualization of data and visual interfaces have shown great potential for increasing access to, and understanding of, a wide variety of application areas and fit in well with the evidence supporting spatial representation.

2.7 Information Space Revisited

Terms such as "information space," "mental maps," "spatial orientation," and so forth are found throughout the literature of library and information science (cited throughout this chapter). There has been no unified treatment of information space for information science. Perhaps the reason why "information space" has not received the attention given to terms such as "relevance" has to do with its pervasiveness, inside and outside of the literature on IR.
The various literatures considered in preceding sections confirm that "space," with all its meanings and connotations, is an important part of human behavior. As such, "information space," like "relevance," is a term for which a variety of definitions might be applied -- but for which the common sense, ill-defined understanding of the term will continue to serve most purposes. This work required a more explicit definition, though, at the possible expense of limiting the scope or creating artificial boundaries. This review of the literature, while not pointing towards it exclusively, provided justification for the conceptualizations of cognitive space, information space, and navigation made in the current work.

2.8 Conclusion

This chapter has touched on many literatures which offer general support for the conceptual framework and methodological approaches to the study of information retrieval, information space, and navigation. For IR, the literatures indicate a variety of approaches which do not fit well with the relevance-based literatures -- browsing, the incorporation of term relations, and relevance feedback are all left out in the cold by existing commercial IR systems. The foundations of IR and these literatures suggest strongly that relevance-based systems, while appropriate for some information needs, are not a panacea for all users. There is a long history of spatial approaches to cognition and information retrieval which have seldom been raised to levels which consider the psychological aspects of navigating through a non-physical space. The conceptual framework from Chapter 1 drew on prior work by Mead, Dervin and Nilan, Woelfel, Cushman, Goffman, and others (and surely took liberties which would churn the stomach of each).
This chapter has discussed the works of theoreticians and practitioners who contributed to an understanding of navigation for information retrieval, or whose work had implications for methods for translating notions about navigation into functional IR systems.

CHAPTER 3: METHODS OF INVESTIGATION

3.0 Introduction

The desired outcome of the system design and empirical component of this work was the creation of an environment in which navigation could be examined and the evaluation of that environment for information retrieval. This chapter describes the creation of the environment and the steps taken to evaluate it. The environment was created within the conceptual framework for navigation and related concepts as laid out in Chapter 1. The research questions which the system design and evaluation were intended to inform are:

1. Is navigation a useful approach for operationalizing information retrieval?
2. What perceptions by users engaged in information seeking help to understand the use of navigation for conceptualization of information retrieval and information system design?

The study was exploratory and thus intended to generate data on the role of navigation for optimizing cognitive movement during information seeking behavior as well as insight for development of future IR systems. As such, more than simple "yes" or "no" answers were sought: the research questions were designed to allow for the generation of a body of data with wide applicability to the general problem of information retrieval and navigation. There were three main components to this investigation of navigation. The first two components had to do with providing the necessary environment for answering the research questions and in meeting some of the criteria for system design generated in Chapter 1. The first was the creation of an information space suitable for navigation.
The information space was intended as a step towards matching user cognitive spaces and was also based on methods supported by the literature reviewed in Chapter 2. The information space did not reflect the cognitive spaces of particular users or user groups, as would be an ultimate goal. The step towards matching information and cognitive spaces was taken at a much rougher level, by the simple incorporation of concept relations. As will be described, the relations were simple statistical association measures and the concepts were keyterms taken from a bibliographic database. The second component of this investigation involved the actual creation of an IR system meant to facilitate navigation. The system included several desirable qualities for navigable systems extracted from the literature and conceptual framework and was built around the information space from the first component. The final component was a small-scale evaluation study of the system in actual use by a group of respondents. The literature did not reveal any information system suitable for an investigation of navigation (e.g., one with full concept relations which allows for iterative searching and a wide range of information seeking behaviors). This fact, combined with the strong mandate for the creation and evaluation of working systems for publication in the IR literature, necessitated the creation of an IR system specifically designed to facilitate navigation. Although the system designed for this study was not the only one that could have been chosen, the outcome was as desired: the creation of an environment in which navigation could be evaluated.

3.1 Building an Information Space

Two desirable qualities of the information space built for this investigation emerged from Chapters 1 and 2.
The first quality was for the space to be more in accord with the user group's cognitive space than that typical of other information systems -- specifically to incorporate term relations and to relate both concepts and documents with each other. This quality is at the crux of this entire work. The second quality was less important conceptually but is important methodologically relative to the qualities of today's computer-based information systems (as described in Chapter 2): that the space was visualizable or well suited to a visual interface. Visualizability was not proposed as a prerequisite to navigability but it offered a firm break with the batch-oriented interfaces of the past and present, and a move towards the types of visual interfaces which seem to be the wave of the future (and of the present, to some extent, although not for IR systems except as mentioned in Chapter 2). This section first introduces some concepts for the building of multidimensional information spaces, and then describes the specific steps taken to construct the space used in this investigation.

3.1.1 Information Space via Multidimensional Scaling

A multidimensional information space was built in which important terms from the documents and the documents themselves were located. The relative locations of objects in the space were taken from term co-occurrence measures within document collections. Terms which were more similar (alternatively stated, "terms which were lexically close to one another") were located closer to each other in the space. In this way, an approximate "meaning" for each object in the database was obtained based on its relation to the other objects in the database. The space built was very similar to those which are the outcome of the psychometric methods for multidimensional scaling (MDS).
The literature review indicated value for visualization, so, since we see in only three dimensions, three dimensions were extracted from the multidimensional space (as described in more detail later in this section).

In this information space, each axis (or "dimension") has no particular meaning. This is quite contrary to the type of space generated by Salton and his colleagues (see Salton and McGill, 1983), in which each axis or dimension is associated one-to-one with a particular index term. This is also distinct from "types of relationships," which were also labelled "dimensions" in Chapter 1. Dimensions, in the realm of MDS, refer simply to the Cartesian coordinate system in which objects may be placed relative to each other. The axes of the proposed space are akin to North and South on a map -- they help to align objects, but do not determine what a particular country would look like. Choosing any other coordinate system for a map would not change the relative locations of, say, New York, Miami, and Chicago.

Just as for MDS, far more than three dimensions could be extracted for the information space. To account for all the variance in the data by using principal components derived from co-occurrence data (as described later), it is typically necessary to have as many dimensions as there are terms used to build the space. Luckily, the computational methods employed ensure that the dimensions which are extracted first account for the most variance. This means that the overall trends and general locations of the objects in the space can be represented quite faithfully in three dimensions. (Note that this is the same procedure by which a three-dimensional object, such as North America, can be reasonably represented with a two-dimensional map.)

Based on previous use for IR, word co-occurrence was the basis for the information space (Trivison, 1987; Danowski, 1987; Deerwester et al., 1988; Deerwester et al., 1990).
Thus, the space was based on the premise that words which tend to co-occur frequently in text have something to do with each other. Previous work on word co-occurrence helped to identify potentially fruitful methods for the current study.

With a large sample (that is, when a large quantity of data are used to build the space), differences in language use by authors or indexers can be largely overcome. This means, for example, that words which are synonyms will have similar co-occurrence scores with other words, even if the frequency of occurrence for each word is not the same. So, the four words PIGS, SWINE, HOGS, and BOAR will be located in similar locations in the space, because they have similar co-occurrence scores with other words, even if the absolute number of occurrences for each word is different. The extent to which these words are located at somewhat different locations in the generated space is the extent to which the words are not really synonyms. That is, having the same referent does not mean that a pair of words has the same meaning for those who use the words (Barnett, 1988, includes an empirical study of the porcine words mentioned in this paragraph).

Generally, a large sample of documents also helps to protect against ambiguity. One word may be close to two groups of words, even though the groups of words are not close to each other (this may be difficult to visualize faithfully in three or fewer dimensions). For example, words such as THE, AND, OR, IF, BUT, WHEN, etc. might have high co-occurrence scores with many other words. However, this does not prevent other words from having "correct" relationships with each other. (In practice, of course, these common words would be eliminated through the use of a stop list.) In the same way, a word such as DOG can be closely related to such terms as HOUSEHOLD PETS and TAXIDERMY without requiring that HOUSEHOLD PETS and TAXIDERMY be closely related to each other.
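
As a rough illustration of the co-occurrence premise, the following minimal Python sketch counts, for each pair of terms, the number of documents in which both appear. The function names and the four toy "documents" are hypothetical and invented for this example; they are not taken from the study's data or programs. The sketch shows how DOG can co-occur with both HOUSEHOLD PETS and TAXIDERMY while those two terms never co-occur with each other.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for every unordered pair of terms, the documents containing both."""
    pair_counts = Counter()
    for terms in documents:
        # Each term counts once per document; pairs are stored in sorted order.
        for a, b in combinations(sorted(set(terms)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def cooccurrence(pair_counts, a, b):
    """Look up the count for a pair regardless of argument order."""
    return pair_counts[tuple(sorted((a, b)))]


# Hypothetical toy collection, invented for illustration only.
documents = [
    ["DOG", "HOUSEHOLD PETS", "CAT"],
    ["DOG", "HOUSEHOLD PETS"],
    ["DOG", "TAXIDERMY"],
    ["TAXIDERMY", "MUSEUMS", "DOG"],
]

counts = cooccurrence_counts(documents)
print(cooccurrence(counts, "DOG", "HOUSEHOLD PETS"))        # 2
print(cooccurrence(counts, "DOG", "TAXIDERMY"))             # 2
print(cooccurrence(counts, "HOUSEHOLD PETS", "TAXIDERMY"))  # 0
```

With a realistic collection the same counts would feed a measure of association rather than being used directly.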
The goal of the representation scheme used for this study was to build an information space where terms are located closer to or farther from each other based on their similarity, as measured by co-occurrence across a large sample of natural-language or controlled vocabulary text. This was proposed as a closer match to the qualities of users' cognitive spaces than existing representation schemes. A secondary goal, which helped to determine the specific methods taken to build the information space, was that the space be visualizable in 3D.

Documents were represented in the space at the center of the terms found in the documents. The decision to place documents at the center of their associated terms, instead of choosing some other method for placement (e.g., to place them dynamically based on what terms were in the vicinity of the user viewpoint, or to place them in multiple locations close to each associated term), was based on psychometric research, especially that of Woelfel and Fink (1980), in which the "meaning" of a concept has been shown empirically to be strongly associated with, and at least partially derived from, its cognitive relationship to other concepts. Inasmuch as the manifest "meaning" of a document surrogate can be thought to be indicated by the terms in the surrogate, the document location was placed at the location which minimized the total "distance" (in the Euclidean sense) from each associated term in the space. The idea was to use the statistical methods described here to approximate the type of results which might have been obtained using multidimensional scaling methods on psychometric measurements of a particular user population. This solution was perhaps the simplest which could have been employed, and seemed a good starting point. However, Chapter 4 will reveal deficiencies in this representation scheme which were not evident at the outset.
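
The document placement rule can be sketched in a few lines of Python. This is a minimal sketch, assuming that "center" means the arithmetic mean of the term coordinates (the point that minimizes the total squared Euclidean distance to the terms); the function names, keyterms, and coordinates below are hypothetical and are not the study's data.

```python
import math


def centroid(points):
    """Return the arithmetic mean of a list of equal-length coordinate tuples."""
    if not points:
        raise ValueError("a document needs at least one located keyterm")
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def place_documents(term_coords, doc_terms):
    """Map each document id to the centroid of its keyterms found in the space."""
    placed = {}
    for doc_id, terms in doc_terms.items():
        located = [term_coords[t] for t in terms if t in term_coords]
        if located:  # documents with no retained keyterms are skipped
            placed[doc_id] = centroid(located)
    return placed


# Hypothetical three-dimensional keyterm coordinates and records.
term_coords = {
    "BUDGETING": (1.0, 0.0, 0.0),
    "SCHOOL ADMINISTRATION": (0.0, 2.0, 0.0),
    "TIME MANAGEMENT": (-1.0, 1.0, 3.0),
}
doc_terms = {
    "doc1": ["BUDGETING", "SCHOOL ADMINISTRATION", "TIME MANAGEMENT"],
    "doc2": ["BUDGETING", "UNLISTED TERM"],  # second term is not in the space
}

docs = place_documents(term_coords, doc_terms)
print(docs["doc1"])  # (0.0, 1.0, 1.0)
print(docs["doc2"])  # (1.0, 0.0, 0.0)

# Distance from doc1 to its nearest keyterm: the document sits between its
# terms rather than next to any one of them.
nearest = min(math.dist(docs["doc1"], term_coords[t]) for t in doc_terms["doc1"])
print(round(nearest, 2))  # 1.41
```

In the toy data, doc1 lands at a point that is not adjacent to any of its three keyterms, which is the kind of separation between documents and their terms that a centroid rule can produce when a document's terms are spread widely through the space.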
3.1.2 Building the Space

This section describes the steps taken to build the information space used for this investigation (see Table 3.1 for a summary). A fair number of alternatives were tried before these particular steps were arrived at. Many of the alternatives are mentioned here. It is important to note that the apparent differences between the spaces -- their general look and feel, and the clusters among related keyterms -- were barely noticeable to the eye, even when radically different approaches were taken to building them. The specific locations and variance accounted for in each space were different, but the relations among particular items and the general clusters which emerged were quite similar.

Table 3.1: Overview of Procedures Taken to Generate a Spatial Representation of a Bibliographic Database
--------------------------------------------------------------------------
1. Select the database
2. Pre-process the data (extract keyterms from each record)
3. Create a co-occurrence matrix for keyterms between upper and lower cutoff values
4. Extract principal components from the co-occurrence matrix (which generates a spatial representation of the keyterm vocabulary)
5. Compute the location of each document in the space, at the center of all the keyterms for each document
--------------------------------------------------------------------------

Choosing the Database. The first step was to select an appropriate database. The database chosen for the user-based evaluation was taken from a subset of the ERIC bibliographic database. The database was available on a mainframe at Syracuse University and included full bibliographic citations and abstracts written by professional abstracters. Index terms in the ERIC database are selected from the carefully controlled ERIC thesaurus. Approximately two thousand ERIC documents in the Syracuse database contained the keyterm MANAGEMENT in the "descriptor" or "major descriptor" fields.
This subset was used as a basis for the information space. These ERIC documents contained between six and thirty keyterms with abstracts of about 100 words. The database used covered a time period from 1984 to 1991. See Appendix F for sample ERIC database records ("documents") and Table 3.2 for a summary of the ERIC database subset used.

Table 3.2: The ERIC Database Subset
--------------------------------------------------------------------------
* The ERIC database contains hundreds of thousands of bibliographic records and abstracts generally associated with the field of education.
* 2064 records with the keyterm "MANAGEMENT" were selected. These are all the records with the keyterm "MANAGEMENT" in the "Descriptor" or "Major Descriptor" field.
* The 2064 records contained between about 6 and 30 keyterms each.
* Abstracts of about 100 words were included with every record.
--------------------------------------------------------------------------

Three other types of data were tried, each yielding suitable results but not quite so good as for the ERIC database. Email from electronic mailing lists, the contents of various USENET newsgroups (an electronic bulletin board system), and several works of fiction available in electronic form (Down the Rabbit Hole by Lewis Carroll, chapters from the King James version of the Bible) were all used to create information spaces. With these data types, the variance accounted for was somewhat less (from about 10% to 14% in the first three dimensions), and the visual clustering of terms was not as appealing (as judged by the researcher and his colleagues).

Aside from the outcomes listed in the previous paragraph that led to selection of the ERIC database, the ERIC database was considered especially appropriate in that it clearly fits well within traditional IR evaluation studies.

Pre-Processing the Data. Relatively little pre-processing of the ERIC data was required.
The natural-language data types listed above required a stoplist and spelling checking, but since only the assigned keyterms in the "descriptor" and "major descriptor" fields of the ERIC collection were used for creation of the space, these steps were not necessary (because the terms were all spelled correctly and all terms were kept). Frequency distributions of the chosen keyterms were generated to choose a minimum and maximum occurrence for inclusion in the information space. Trimming at the high end of the frequency distribution was necessary because otherwise terms which occur very frequently would account for a large amount of variance in the space and not leave enough variance for other terms to be accurately separated in the first three dimensions of the space.

Table 3.3 presents summary statistics for the collection of keyterms used for the information space construction. The minimum and maximum number of occurrences per keyterm were constrained in order to (1) produce an approximately normal distribution, and (2) leave a number of keyterms in the several hundreds, which is manageable by the principal components analysis procedure. The minimum and maximum occurrence scores were arrived at by trial and error, by moving the top and bottom trimming values and inspecting the resulting frequency distribution and summary statistics. This is in accord with other similar research, such as that of Danowski (1988) and Law and Whittaker (1992).

The end product shown here has the benefit (from a computational and system design point of view) of a reasonable number of unique keyterms and a similar number of documents which contain the keyterms. There were no perceivable differences between ERIC documents with greater or fewer numbers of associated keyterms (additionally, the database uses another type of keyterm to index documents, called an "identifier," which shows similar variety in the numbers assigned to ERIC documents).
Personal interviews with ERIC indexers and abstracters did not reveal any bias in the number of descriptors or major descriptors which were assigned to documents. Thus, the trimming of documents with large or small numbers of keyterms should not have produced bias in the sample of documents used.

Table 3.3: Statistics for Keyterms
--------------------------------------------------------------------------
This table offers summary statistics for the number of keyterms used for the construction of an information space. The keyterms came from an exhaustive list of the contents of the "Descriptor" and "Major Descriptor" fields of the ERIC database subset (all records which contain the term MANAGEMENT in one of those fields).

A. Data before trimming
   General:
      Total number of records:           2064
      Total number of unique keyterms:   2856
   Central tendency:
      Mean number of terms per record:   12.16
      Median:                            3
      Mode:                              1
   Distribution:
      Standard deviation:                37.16
      Minimum:                           1
      Maximum:                           846
      Q1:                                1
      Q3:                                10

B. Data after trimming
   Criteria:
      Minimum occurrence for inclusion:  20
      Maximum occurrence for inclusion:  55
   General:
      Total number of records:           271
      Total number of keyterms:          264
   Central tendency:
      Mean number of terms per record:   31
      Mode:                              20
      Median:                            28
   Distribution:
      Standard deviation:                9.46
      Q1:                                23
      Q3:                                37
--------------------------------------------------------------------------

The data used to build the space were extracted from the ERIC database subset. Only the terms from the "descriptor" and "major descriptor" fields of the database subset were used. These fields were chosen because they were taken from a strictly controlled thesaurus -- in other words, these terms have less ambiguity. The relative lack of ambiguity in the keyterms selected, as compared to natural-language texts, may account for the lion's share of the increased variance accounted for over natural-language databases (about fifteen percent with ERIC data, but only about ten percent with natural-language data).
This was difficult to test empirically, though, because of the number of other differences between the ERIC database and, say, USENET data -- the size of the "document," number of authors, connections between documents, etc.

The "documents" used to build the space consisted simply of the set of keyterms from each bibliographic record in the ERIC database subset -- usually between ten and twenty terms from the "descriptor" and "major descriptor" fields. A file consisting of all sets of keyterms for each record separated by a delimiter was the input to the next step.

Building a Co-occurrence Matrix. The Pearson product moment correlation (or simply Pearson correlation) was chosen as the measure of association to build the space from the matrix of co-occurrence scores. Other statistics were available (see Noreault, McGill, and Koll, 1979), but Pearson has the benefit of being both well known and directly visualizable. Furthermore, there is considerable empirical evidence that, although the choice of a measure of association affects the specific outcome, the overall qualities of the outcomes from different measures are comparable (for comparisons of different statistical methods of association see Noreault, McGill and Koll, 1979).

A FORTRAN program was written to read the file created by the previous step and write a co-occurrence matrix. The matrix included all terms with total occurrences between specified upper and lower cutoff values. The information space for the user-based evaluation contained 264 unique terms occurring between 20 and 55 times across all records. Cutoff values were chosen so that the distribution of frequencies was approximately normal. Without trimming scores at 20 and 55, the very long-tailed frequency distribution ranged from 1 to about 800.
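
The trimming and matrix-building steps described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not a reconstruction of the FORTRAN program: the function names are hypothetical, the toy records and cutoff values (2 and 3) are invented, and computing the Pearson correlation between two terms' rows of co-occurrence scores is one plausible reading of how the measure was applied.

```python
from collections import Counter
from math import sqrt


def trim_vocabulary(documents, low, high):
    """Keep terms whose number of containing documents lies in [low, high]."""
    freq = Counter(t for terms in documents for t in set(terms))
    return sorted(t for t, n in freq.items() if low <= n <= high)


def cooccurrence_matrix(documents, vocab):
    """Cell [i][j] counts the documents containing both vocab[i] and vocab[j]."""
    index = {t: i for i, t in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for terms in documents:
        kept = [index[t] for t in set(terms) if t in index]
        for i in kept:
            for j in kept:
                if i != j:
                    matrix[i][j] += 1
    return matrix


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant row; 0.0 is a convention
    return sxy / sqrt(sxx * syy)


# Hypothetical keyterm sets, one list per record.
records = [
    ["MANAGEMENT", "BUDGETING", "ACCOUNTING"],
    ["MANAGEMENT", "BUDGETING", "ACCOUNTING"],
    ["MANAGEMENT", "BUDGETING", "LEADERSHIP"],
    ["MANAGEMENT", "LEADERSHIP", "SUPERVISION"],
    ["MANAGEMENT", "LEADERSHIP", "SUPERVISION", "OBSCURE TERM"],
]

vocab = trim_vocabulary(records, low=2, high=3)
matrix = cooccurrence_matrix(records, vocab)
print(vocab)  # ['ACCOUNTING', 'BUDGETING', 'LEADERSHIP', 'SUPERVISION']
for term, row in zip(vocab, matrix):
    print(term, row)
print(pearson(matrix[0], matrix[3]))  # association between two trimmed terms
```

In the toy data, MANAGEMENT (present in every record) and OBSCURE TERM (present in one) both fall outside the cutoffs and are dropped before the matrix is built.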
Trimming was necessary because both the Pearson correlation and principal components procedures (both derivatives of the general linear model), like most statistical measures of association, are sensitive to the number of occurrences for each class. This means that if a sample of keyterm frequencies with a long-tailed distribution were used, the terms which occurred most frequently would be those most separated from each other in the space. In addition, the variance accounted for by the first three dimensions of the space would be lessened, and the visual display would show high-frequency keyterms to be individual "islands," with almost all other terms clustered at the center of the space.

The program could also consider context at levels other than the single document. A sliding "window" would consider the co-occurrence of a word with its closest neighbors -- say, within five adjacent words. This method was used for Down the Rabbit Hole, The Bible, and other natural-language texts. The program could use a stoplist, if desired. The output matrix (in a plain text file) contained all keyterms retained and a co-occurrence score for each other keyterm in the file. These added functions were not needed for the ERIC database.

Principal Components Analysis. Principal components analysis (SAS Institute, 1985) is a technique for multivariate data analysis which results in the spatial location of units of analysis. This is the same sort of analysis used to locate concepts spatially from the outcome of human subject measurements for MDS. (There are many different approaches to MDS, all of which yield a space in which concepts are located.) The SAS package took the matrix from the previous step and located all concepts spatially based on their Pearson correlation scores. This step is the computational bottleneck in the information space building process, as principal components analysis is a computationally intense procedure.
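
To make the principal components step concrete, here is a minimal Python sketch that extracts the leading components of a small correlation matrix by power iteration with deflation and reports the share of variance each accounts for. It is an illustration only and not the SAS procedure used in the study: the function names are hypothetical, the 4 x 4 matrix of correlations is invented, and scaling each eigenvector by the square root of its eigenvalue to obtain coordinates is an assumption about how terms are located.

```python
from math import sqrt


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_components(sym, k, iterations=500):
    """Return the k largest (eigenvalue, unit eigenvector) pairs of a symmetric
    positive semi-definite matrix, by power iteration with deflation."""
    n = len(sym)
    work = [row[:] for row in sym]
    found = []
    for c in range(k):
        v = [1.0 / (i + c + 1) for i in range(n)]  # arbitrary non-zero start
        for _ in range(iterations):
            w = mat_vec(work, v)
            norm = sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        value = sum(a * b for a, b in zip(v, mat_vec(work, v)))  # Rayleigh quotient
        found.append((value, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                work[i][j] -= value * v[i] * v[j]
    return found


def variance_accounted_for(sym, components):
    """Share of total variance (the matrix trace) carried by each component."""
    total = sum(sym[i][i] for i in range(len(sym)))
    return [value / total for value, _ in components]


def coordinates(components):
    """One coordinate tuple per term: eigenvector entries scaled by sqrt(eigenvalue)."""
    n = len(components[0][1])
    return [tuple(sqrt(max(val, 0.0)) * vec[i] for val, vec in components)
            for i in range(n)]


# Hypothetical correlations among four terms forming two loose clusters.
terms = ["PIGS", "SWINE", "DOG", "HOUSEHOLD PETS"]
corr = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]

components = top_components(corr, k=2)
shares = variance_accounted_for(corr, components)
coords = coordinates(components)
print([round(s, 3) for s in shares])  # approximately [0.475, 0.425]
for term, xy in zip(terms, coords):
    print(term, tuple(round(c, 3) for c in xy))
```

For this toy matrix two dimensions carry about 90% of the variance, and the second dimension separates the two clusters. A real keyterm matrix with hundreds of terms spreads its variance over many more dimensions, so the first few account for a far smaller share.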
The supercomputing facility at Cornell University was employed for this stage of the analysis. For a moderate-size information space (in the several hundreds of items, as described above), over 750 megabytes of main storage were required on the IBM 3090 cluster at Cornell (running the VM/XA operating system). Spaces with more than 500 unique terms were not constructed, as computing time at Cornell was limited. More terms would be desired for use in a "real-world" setting. The ERIC Thesaurus, for example, contains thousands of terms. To process more terms, a custom-developed program for principal components analysis would be necessary and could be expected to offer a many-fold increase in efficiency.

The output from SAS was a list of coordinates in three dimensions for each of the keyterms. The variance accounted for by each of the dimensions was also listed. Due to the high overhead of computing additional dimensions, only three were computed (the full 264-dimensional space would account for 100% of the variance in the input co-occurrence matrix -- the two would be isomorphic). For the information space created for this evaluation, the first three dimensions accounted for approximately 15% of the variance in the co-occurrence matrix. Dimension 1 accounted for about 7%, and dimensions 2 and 3 accounted for about 4% each.

A simple FORTRAN program read in the coordinates of the terms which SAS generated and combined them with the original input file of keyterms used to construct the co-occurrence matrix. The location of each document in the space was computed to be at the center of its set of terms. In some cases, the associated keyterms were quite close to the document. In more cases, the document was not in the direct neighborhood of any of its keyterms. This was not immediately evident during pre-testing and instrument construction and unfortunately created confusion which evidently made the whole space less navigable, as discussed in Chapters 4 and 5.
In fact, hindsight shows that these procedures really created two spaces: the document space and the keyterm space. This may be an artifact of the selection of documents that had a minimum of 20 keyterms each. Appendix E contains the coordinates of all keyterms and documents in the space.

This section has provided details on the construction of the specific information space employed for the user-based evaluation phase of this investigation. Options for natural-language text, which were not used for this study, were more complicated, as there were more choices for how to process the text prior to constructing a co-occurrence matrix. Such investigations involved the creation of a variety of custom programs for stemming the text, creating sub-documents from larger texts, removing extraneous data (such as headers from USENET articles or email), and so forth, which are not discussed further here.

3.2 Building the Interface

This section describes the system built for the investigation. Further description may be found in Appendix A, the "Information System Training Packet," which serves as the system user guide.

The system was run on a Silicon Graphics Personal Iris workstation, a special-purpose graphics computer. The model used for the evaluation had a 20 megahertz CPU, a nineteen-inch color monitor, keyboard, and three-button optical mouse. As with other Iris models, it contained a "geometry engine," which sped processing for interactive graphics. The workstation had windowing capability which allowed multiple windows on the screen to each run independent processes (or one program to open several windows). The workstation ran Irix version 3.2.2, a flavor of the Unix operating system.

All programming on the Iris was done in C (except for a few short programs written in the Bourne shell script language, part of Unix). The geometry engine and associated graphics library allowed for relative ease in programming for complicated geometric transformations.
For example, a function to rotate a graphics image would normally require a matrix transformation of all vertices in the image. The graphics library (GL) reduces the programming overhead to a call to the "rotate()" function.

Various components for the interface were tried over a period of about six months with considerable formal and informal pretesting. This section describes only the final version of the system. Appendix A, the Training Manual, contains some graphic images of the system.

The main component of the system was the window that displayed the information space. Terms "floated" in the space with documents at the center of their associated terms. The mouse or optional gesture-oriented input device (the PowerGlove, described below) was used to move through the space. Three modes were available. In stationary mode, mouse or glove movements did not result in movement through the space. In translation mode, the point of view moved through the space along the X, Y, and Z axes (the middle and right mouse buttons were used to move along the Z axis). In rotation mode, the point of view stayed in the same location but the focus of attention moved through the space -- just as one looks around oneself by turning the head. Thus, one could move through the space with five degrees of freedom: forward/backward, left/right, up/down, pitch, and yaw. Roll (or twist about the Z axis) was not implemented, as the words in the space were always displayed right-side-up. Using translation and rotation modes, the user could navigate anywhere in the information space.

The PowerGlove is a three-dimensional gesture-oriented input device. Although the glove was originally intended as a game accessory, the addition of a converter box (supplied by AGE, Inc.) enabled communication with the glove via the Iris serial port.
The glove uses ultrasonic transmitters and an L-brace receiver (mounted on or next to the monitor) to give position data, while piezo elements in the fingers give finger-bend data with two bits of precision. The glove was considered an appropriate input device as it would enable the user to make a physical grabbing gesture to retrieve documents in the space and move through the space using finger gestures. Although the glove worked fairly well, the ultrasonic position sensor data were not as steady as those from the mouse. Whether using the mouse or the glove, the appearance of the windows and various modes for interaction were the same.

The information space window, as mentioned previously, displayed the keyterms and the documents together in a three-dimensional space. Depthcueing provided cues for the third dimension (this makes objects in the foreground appear brighter than objects in the background). A perspective view was chosen so that closer objects appeared to move faster than objects further away when moving through the space.

Three other windows in the interface remained visible at all times. The information space window occupied the leftmost 80% of the screen. On the upper right was the "Map Window," which displayed the user's position in the information space from a distance. In this way, the user could see where he or she was and move to potentially fruitful locations in the space. The map window did not display the actual terms but instead put a colored dot where every keyterm and document was located in the space relative to the user. The "Mode Window" indicated the mode. Stationary mode displayed a sitting stick figure on a red background, translation mode displayed perpendicular bi-directional arrows on a green background, and rotation mode displayed a circle with an arrow superimposed on a yellow background. The "Vocabulary Window" contained all of the keyterms in the space in alphabetical order.
The user could use the mouse cursor to scroll through the vocabulary window looking for terms of interest.

Other windows popped up after key commands were pressed. Instructions were given (and any input supplied) in a small text window that appeared at the bottom of the information space window. The window had different colors for different functions. To search the vocabulary window, the user pressed the "V" key. The "F" key produced a blue window instructing the user to type a keyterm from the space (an adept user could use the mouse to cut-and-paste the word from the vocabulary window). After typing the word, the user's point of view "jumped" instantly to the location of that word in the space. The "S" key was for searching for particular documents. Since only the document number was displayed in the space, users needed to retrieve documents to see whether they met their information needs. Users were prompted for the document number (in a white window, this time) and were entertained by a flashing blue-and-white window while the document was retrieved (usually only one or two seconds). The document was displayed in another pop-up window which covered most of the information space window. The user was instructed to press a mouse button when he or she wanted to move on.

Other functions included rudimentary help (via the "H" key) and the ability to reset the point of view back to the starting position at the center of the space (via the "R" key). The ability to look for combinations of keyterms in the space was not provided (only single-term searches were available), nor was the ability to search for documents using anything but the document number.

A description of a typical session follows:

1. Starting the program, providing the name of the coordinates file and the database file as arguments.
2. Specification of either the mouse or PowerGlove as an input device by the user.
3. Initial exploration of the information space. Moving through the space and keeping track of the position in the map window.
4. Scrolling through the vocabulary window to identify possible useful keyterms in the space. Jumping to the locations of those keyterms.
5. Retrieving potentially useful documents with their document number.
6. Exploration of the space until a useful location in the space is found. Then retrieving documents from that area until the information need is satisfied (or until it's time to move on to another location in the space).

3.3 User-Based Evaluation

This section describes the methods for the user-based evaluation of the navigation IR environment described in the previous sections. "User-based," as introduced in Chapter 1, refers to empirical research methods which attempt to solicit user perceptions of real-world situations. As such, user-based methods yield results which help to understand the situation as the user sees it, without necessarily imposing the world view of the researcher on the user. Twenty people were asked to learn about the system and perform some basic retrieval tasks. They also learned to use a more traditional IR system and compared the two. The variables studied and method in detail follow.

The purpose of the user-based evaluation phase of this work was to investigate the applicability of navigation as a fundamental concept for information systems. The investigation was carried out in the context of the navigation-based IR system environment developed specifically for this purpose. The usefulness of navigation and user perceptions which might help to better understand navigation for operationalization of information system design (as described in Chapter 1) will be assessed in Chapter 4.

3.3.1 Overview of Evaluation

Respondents were first briefed on the nature of the evaluation and asked to read a consent form. They were told to expect to spend about one hour total, for both learning to use the systems and accomplishing some basic tasks.
To learn, they were given a "Training Packet" (Appendix A) which provided step-by-step instructions for all functions of the evaluation system and an overview of the comparison system. Respondents were given a choice of whether to use the PowerGlove or not. Training included familiarity with the Iris workstation and its components. The Training Packet included a brief section on the use of Prism, a typical IR system available via Internet at Syracuse University. Prism, based on Spires Consortium software, was chosen because a similar subset of the ERIC database to the one used by the evaluation system could be accessed.

After respondents decided they were comfortable with their use of both the Prism and evaluation systems, they opened the self-administered data collection packet (note that all parts of the user-based evaluation were self-administered, although a research assistant was nearby to answer any questions). The packet (in Appendix B) took respondents through two sets of two tasks each. Each set was for either the Prism or evaluation system. The two tasks, which were repeated for each system, were to resolve the information needs associated with each. The tasks centered on either a set of keyterms or a fairly specific statement of information need. The specific information needs were different for each of the four tasks. After each task, a series of open- and closed-ended items were answered about the task. After completing all four tasks, respondents evaluated both systems, completed a demographic questionnaire, and were debriefed as needed.

3.3.2 Operationalization

First let us consider the terms in the research questions. The first research question was:

1. Is navigation a useful approach for operationalizing information retrieval?

"Useful" was operationalized as user perceptions that they had the ability to accomplish basic tasks with the system.
Relative usefulness could be gauged by comparing respondents' subjective evaluations of their satisfaction with task outcomes between different tasks or different systems, and by comparing the amount of time the tasks took to perform. Demographic factors affecting usefulness were investigated because of the ease of data collection for these items (and their expected presence in social science research), but the proposed navigation-enhancing qualities of the system were the focus.

"Navigation" is part of the conceptual framework developed in Chapter 1. It was there defined as "Human behavior to make sense of an information space. [It] involves coming to understand both the information space and the model that an information system has of its (generalized) users." The literature review in Chapter 2 touched on various other notions relating to navigation, most of which operated in a physical domain rather than the cognitive domain of Chapter 1.

For the development of the evaluation system described in previous sections, "navigation" had to be operationalized using a set of fairly straightforward qualities. Recall the two general types of knowledge to facilitate navigation (from Chapter 1): a model of one's status relative to the other and a model of how to change that status. The first type of model was provided by a visual "window" looking out from one's position in the information space. The system's information space and map windows both provided direct feedback as to location in the space. The second type of model was given first by the ability to directly manipulate the location in the information space, using the mouse or PowerGlove, and second by the ability either to browse the system vocabulary or to jump instantly to the location of any vocabulary word in the space. None of these qualities, which are proposed to facilitate navigation and provide a direct picture of an information space, fit with traditional batch-oriented retrieval systems.

"The other," as perceived by respondents, probably was made up of more than the system (consisting of database, interface, and assumptions of the designer). It may also have included the whole investigation scenario, questionnaire, and so forth. The nature of the perceived "other" was not specifically addressed.

"Information retrieval," in both the first and second research questions, referred both to the case of the retrieval of bibliographic citations in response to a fairly specific information need (as for the tasks performed for the evaluation), and to larger notions having to do with "information systems" as any set of steps for storing or accessing information.

Let us now turn to the second research question:

2. What perceptions by users engaged in information seeking help to understand the use of navigation for conceptualization of information retrieval and information system design?

This second question had to do with soliciting data from respondents who actually used the system to accomplish tasks. "Information seeking," in this context, did not refer to the full range of activities which would be desired for a multi-study approach to an investigation of navigation. Instead, information seeking was limited to the completion of several small tasks judged by the researcher to be typical of information seeking tasks in the literature of information science.

The "perceptions by users" collected were constrained to pen-and-paper measures (since the entire data collection process was self-administered, see Appendix B). Likert scales were used to collect data on satisfaction with task outcomes, satisfaction with the Prism and evaluation systems in general, model knowledge, and some demographic items (described below). Respondents were asked open-ended questions for additional details on their subjective evaluations and were asked to provide any comments they might have in writing.
Data collected to "help to understand the use of navigation for conceptualization of information retrieval and information system design" took two main forms. Understanding could come about by consideration of the open-ended data provided by respondents (that is, data which were not constrained to a numerical scale) or through statistical analysis of the closed-ended data collected. Of the two, the open-ended data had the greater potential to provide insight into the various aspects of the conceptual framework developed in Chapter 1. Chapters 4 and 5 describe how understanding of navigation was helped by the data collected. Operationalization of the status relative to the system, and of how to change that status, was through assessment of "where you were during the search" and knowledge of "where you wanted to go." Pretesting indicated that subjective evaluations of these types of knowledge were readily obtainable from respondents using the questionnaire items described in the next subsections. The remainder of this section describes the questionnaire. A copy of the questionnaire may be found in Appendix B. The questionnaire consisted of four task sheets, then a sheet for "impressions" of both systems, followed by a brief demographic sheet. Task Sheets. Respondents each completed four task sheets. Two types of tasks were completed for both the Prism and evaluation systems. The first task was to find at least one document citation which met an "information need." The need took the form of a sentence including two general topics. The second task was to find at least one document citation which had to do with a list of three keyterms. The keyterms were all closely related to the controlled vocabulary terms used by the IR systems (for instance, "managing" would be given instead of "management," an ERIC descriptor).
Respondents were asked to write the citation (for Prism) or document number (for the evaluation system) of whatever documents they found which they thought met the assigned task criteria. One or more documents were written for each information need (that is, all of the assigned information needs were "solved" by at least some documents). Appendix C contains all keyterms and statements of information need used for the investigation. By writing the time before and after searching for documents, elapsed time was self-reported in minutes. Respondents were usually stopped after about six minutes because pre-tests indicated that searches which were not "successful" after the first few minutes would not ever be successful, even after 20 minutes. The research assistant was instructed to not "look over the shoulder" of respondents, however, so some searches took longer. Also, some respondents requested additional time to search for documents. Respondents were not told of any time constraint before starting to search. After either finding at least one document citation or being told to stop searching by the research assistant, respondents were asked to evaluate their satisfaction with the outcome of their search on an 11-point Likert scale. 11-point scales were used throughout the questionnaire for consistency whenever scales were used. The scale of 0-10 was chosen as an appropriate number of points for four reasons. First is the existence of a natural zero. Second is the existence of a true midpoint (5). Third is the familiar decimal scale. Finally, a pretest comparing three different types of scales (5-point, 11-point, and 101-point using percentages, n=28) showed 5 points did not allow the potential variability in the evaluation to show through, and almost no values which were not divisible by 10 were selected on the 101-point scale.
After providing a numeric evaluation of satisfaction, respondents were asked to write, "What was it that gave you the degree of satisfaction that you had?" The satisfaction item was intended as a user-based performance measure. An additional performance measure, "confidence," was found to yield the same scores as "satisfaction" during pretesting and was dropped. "Satisfaction" was followed by two Likert scale items. The first asked, "On the scale below, indicate the extent to which you knew where you were during your search (as opposed to being lost)." The second read, "On the scale below, indicate the extent to which you knew where you wanted to go during your search (as opposed to not knowing where to go)." These items were followed by an open-ended question: "What was it that helped you to know where you wanted to go during your search (as opposed to not knowing where to go)?" An open-ended item asking about what helped to know where you were was dropped, as answers were found to be the same as for this latter item (or missing altogether). The items concerning knowledge of status relative to the system and knowledge of how to change that status were intended to address the navigability of the systems. Possible bias introduced by the wording chosen is addressed in the section on bias which follows. A final open-ended item on the task sheet asked for any "comments, feelings, or reflections you have about this task." Respondents were aware that a new type of IR system was being investigated and were given this opportunity to provide feedback on aspects of the task or systems which did not fit in other questionnaire items. Impressions Sheet. After all four tasks were completed, respondents answered items concerning general evaluation of the systems used. Four items were asked of each system. First, respondents described the systems in their own words. 
Then, two Likert scales were used to evaluate the overall degree of satisfaction and knowledge of "where you were and where you wanted to go." A final item asked respondents to "briefly describe the types of situations in which you might prefer to use XXX" (where XXX was either "Prism" or "Space"). Five additional open-ended items were asked concerning the evaluation system (which was referred to as "Space" on the questionnaire). Respondents were asked to separately assess the "usefulness of some of the Space system components:" the vocabulary window, finding keyterms with the F key, the map window, the spatial location of keyterms, and the spatial location of documents. A final item asked respondents to "write any suggestions or other comments you have for the Space system." Demographic Items. The last sheet in the questionnaire packet consisted of eight demographic items. Respondents were asked to circle their gender and write their age. Two Likert scales asked respondents to rate their "experience with computers," and "experience with computerized information systems, such as online card catalogs." Respondents were asked if they were familiar with the ERIC database, either in print or electronic form. They were asked if they had ever used the Prism or Spires database management systems. Lastly, respondents were asked to write their total number of years of education (12 being high school graduate, 16 being college graduate, etc.) and write their major or program of study.

3.4 Specifications

Pretesting of the investigation methods for the user-based evaluation took place from July to November, 1991. After a number of informal pretests, consultation with colleagues, etc., six respondents completed an early version of the questionnaire packet. After pretesting, a revised training packet was developed, along with some minor revisions to some questionnaire items. Data were collected in late November and early December, 1991.
The rest of this section describes the respondents and setting for the research.

3.4.1 Respondents

Twenty adults were recruited from a large mid-western university. Respondents were selected using a non-probabilistic purposive method (Kerlinger, 1986). A relatively homogeneous group of respondents was selected based on their ready availability and familiarity with information systems. Pretests indicated that a fairly high degree of sophistication with information systems and computer systems in general was desirable to use the evaluation system. This does not reflect directly on navigation, but on some of the design decisions made in building the system. Respondents were akin to test pilots, who had to endure some awkwardness and perhaps bumpy rides to test "advanced" features. To ensure the highest quality of "test pilots," the respondent group consisted mostly of students studying for an advanced degree in library and information science (n=18). Another respondent was studying for an advanced degree in English, and a final respondent was on the English faculty. Twenty respondents, completing a total of forty retrieval tasks, was not anticipated to be a sufficient number for statistical significance for most closed-ended measures. As an exploratory study, no particular population was identified for use of the system: instead, the group of respondents was solicited for their ability to easily complete the assigned tasks and provide feedback (based on pretests). Representativeness of the respondent group for some population was not sought. This investigation did not start with twenty as a fixed number of respondents, however. Recruitment of respondents stopped when coverage and redundancy in the responses was noticed: the comments by respondents in the open-ended questionnaire items had the same general content (cf. Dervin, 1983).
There are two main reasons for choosing to not attempt to obtain a sample size sufficient for statistical significance in most numerical items. First, and most importantly, the exploratory nature of the study made individual assessments and comments more valuable than a statistical summary. The number of questionnaires completed, or a few more, made for a collection of data which could be read and evaluated as separate elements. This technique is familiar to market researchers, who first listen to extended interviews with potential customers and only later collect large-scale closed-ended data for statistical analysis (Woelfel, personal communication, 1991). The second reason for the chosen number of respondents has to do with the expense of collecting data. Almost all respondents took more than one hour to learn the system, complete all tasks, and fill out the questionnaire. Respondents were not remunerated for their help except to be offered a summary of the results should they desire it.

3.4.2 Setting

All research took place in a private office containing the Iris workstation. This was necessary as no other workstations or locations were available. The room was quiet. The primary researcher was not present, but a research assistant was either present or nearby. Appendix D contains written instructions to the research assistant.

3.5 Error

There are several sources of potential error in the methods for investigation described in this chapter. While error, and especially quantification of error, is usually considered as part of the rationalist tradition, it is also a useful concept for investigators who reject notions of objective truths -- in any investigation, it is incumbent on the researcher to think about the outcomes of his or her research which might have been different, if different methods of investigation were used.
Two clear sources of error were discussed earlier in this chapter -- the methods for first constructing an information space and then building an IR system to navigate through it. There was no necessity in choosing a spatial representation, or a visual interface, although the researcher had a predisposition to do so and the literature supported it. Indeed, the creation of an IR system was only one possible method for investigating navigation. An alternative method might have been to measure the cognitive space of human respondents in some setting and then attempt to build information spaces which were isomorphic with the cognitive spaces. This type of approach would not serve the purpose of creating an environment in which navigation could be investigated, though. Some alternatives to the ways of building the information space and evaluation system were already mentioned in previous sections. The choices made were based on the researcher's predispositions, the literatures reviewed in Chapter 2, and informal experimentation during the design process. Different choices could have led to a different information space and a different system. A different questionnaire would have led to different answers. Chapter 5 discusses whether a different understanding of the conceptual framework for navigation, and of navigation as a fundamental concept for information systems, might have been obtained from alternative approaches to building the information space and information system. The third and most important component of the investigation into navigation is the user-based study. Chapter 4 presents some findings: means, standard deviations, and levels of significance are all ways of measuring error using statistics. First let us discuss some methodological sources of error, and then we will turn to mention of ways in which error was reduced.
The method for recruiting respondents produced bias as a necessary result of selection based on the researcher's judgement and convenience (Babbie, 1989). The types of bias anticipated have to do with the amount of experience with computers and computerized IR systems (estimated to be higher than the general population), educational level (high), and general level of sophistication with information resources such as libraries (also high). The role of bias is much less important for this exploratory study than for future studies in which inferences about navigation will be drawn. Computerized IR systems are not yet commonplace, so a representative population sample would not necessarily be desirable: it is likely that a representative population sample (say, of all US adults) would include respondents for whom data collected concerning navigable systems would be confounded with the lack of general computer and IR experience in those respondents. This is an acceptable compromise for this study, because this study is not about changing the processes which are already used successfully to interact with existing information systems (although such a goal could be envisioned as IR systems become more like human communication systems and, eventually, exosomatic memory). Starting with a respondent group which already possesses a general model of information seeking processes and task experience seems a worthwhile tradeoff resulting in non-generalizability of results, when generalizability is not sought. Whereas bias was introduced by the choice of this group of respondents, noise was reduced. The use of a relatively homogenous group should have resulted in less noisy data as differences associated with educational level and vocation were minimized. This respondent group might be seen as better qualified to assess the relative perceived usefulness of a system which stresses navigation to traditional systems which do not stress navigation (but they are more familiar with). 
The choice of such a group could be predicted to result in a less favorable evaluation of the navigation environment than would the group without knowledge of traditional systems, because they already interact regularly, with some success, with traditional systems and might find a "new" system difficult or threatening. Some group members, similarly, might provide a more favorable evaluation because they might find the new system intriguing or fun. Pretesting of the scales and open-ended questions used led to confidence in their internal validity. Respondents understood what was being asked and were able to express their experiences in the framework provided by the questionnaire items. The many previous uses of Likert scales and open-ended questions of the type used in, for example, sense making studies, lent support for the external validity of the questionnaire items. The general retrieval environment and the tasks completed are both philosophically consistent with the framework from Chapter 1 and in reasonable accord with the techniques of traditional IR research.

3.6 Conclusion

This chapter has described the methods for investigation employed in this work. There were three main components, the first two being related to system design and the third with a focus on user-based evaluation. As Chapters 4 and 5 will show, the investigation was not without difficulty. The user-based evaluation, in particular, was problematic due to the evolving nature of the system and the conceptual framework. The outcome of the evaluation, to some extent, only partially matched the final version of the types of criteria which could be derived from the conceptual framework from Chapter 1. In the context of an investigation which simultaneously attempted to develop a conceptual framework and empirically validate or extend the framework, such shortcomings might have been expected.
The methods described in this chapter were successful in generating insight into the research questions, but not as successful as they might have been were the conceptual framework in Chapter 1 fully developed at the outset of the research. However, the conceptual framework was developed partially on the basis of the outcome of the investigation! So, there was no way, within the context of a single study, for both the framework and methodology to have been optimal.

CHAPTER 4: RESULTS

4.0 Introduction

This chapter presents the results of the research described in Chapter 3. It is organized generally around the criteria for information systems as specified in Chapter 1 and provides answers to the research questions. The analysis starts with task-related criteria which are more in accord with relevance-based approaches to information retrieval. Then, model-related criteria are covered. These have more to do with navigation-based systems and are directed at the possible future of IR as human-communication or exosomatic memory systems. The perceptions relating directly to the different model components as introduced in Chapter 1 will be summarized. Finally, demographics and other numerical outcomes are covered. The research questions will be addressed throughout this chapter. For the analysis, data will be taken from the variety of sources available. These sources include open- and closed-ended responses to the various questionnaire items, system design characteristics, and known qualities of the information spaces of the systems under study. Note that Appendix H contains all responses to open-ended items sorted by item. Embedded in the open-ended responses are all coded responses to the closed-ended items. The codebook for these items is in Appendix G. A total of 78 search tasks were completed by 20 respondents, 40 using Space and 38 using Prism (one respondent was unable to complete the Prism tasks due to a network outage).
Appendix B contains the self-administered instrument used to collect data. Respondents wrote their answers by hand. All data were later coded and stored via word processor by the researcher. Throughout this chapter, an attempt will be made to gather answers to the research questions, restated here:

1. Is navigation a useful approach for operationalizing information retrieval?

2. What perceptions by users engaged in information seeking help to understand the use of navigation for conceptualization of information retrieval and information system design?

As discussed in Chapter 1, the answers to these questions were not intended to be "yes" or "no." The answers will be in the form of collected evidence from a variety of sources and will be combined with insight for future system design.

4.1 Analytic Techniques

Analytic techniques described in this chapter are appropriate for the types of data which were collected in that statistics were used for closed-ended data and quotes or summaries of open-ended responses are used to describe the outcomes of the user-based investigation. Inferences about relations among concepts under investigation are drawn from statistical measures of association and from explicit statements in the open-ended items. The combination of open- and closed-ended data, as used in this chapter, will be shown to be powerful for understanding the role of navigation for information systems. Statistical tests appropriate for the various sorts of closed-ended data are included in this chapter, along with a level of significance when appropriate. An alpha value of 0.10 was considered suitable for the exploratory nature of the user-based component of this investigation. At this point, a caution on using the Pearson correlation and other inferential statistics for analysis is in order. For data associated with tasks, a reduction in variance due to repeated measures across respondents can be expected.
This is because each respondent completed more than one task with each system. Although order effects were not found across respondents (as will be shown in this chapter), it could also be suspected that some within-respondent variance might have to do with task ordering. Pearson correlation (and chi-square) values are reported even though the assumptions these tests make of independence across responses are violated. The statistical effect of the within-respondent dependencies of the data may be expected to lower the variance (resulting in lower standard deviation scores) and increase measures of association (resulting in higher Pearson correlation or chi-square scores which are more likely to be significant). Verbal reports from respondents taken from open-ended questionnaire items will help to interpret the appropriateness of statistical outcomes. The analytic techniques for open-ended data are more akin to those of field study or case study methods (Babbie, 1989) than of large scale interviews or surveys. This is because the contexts of open-ended statements are known to a great extent -- the system used, the task criteria, closed-ended scores for satisfaction and navigation, and demographic items are all associated with each open-ended utterance. Thus, the presentation of open-ended data is in the form of quotes from specific respondents and occasional counts or summaries of types of responses rather than counts from categories designed to get at latent content. Formal content analytic methods (Krippendorff, 1980) were not employed, because the contents and contexts for open-ended data were found to be clearly interpretable without resorting to categorizing the respondents or their data.
Furthermore, statistical analysis of content analytic categories would not generally yield statistically significant conclusions given the moderately small sample size for the current work, because statistical tests for categories (e.g., the chi-square test) are less sensitive than those for continuous data (such as the t test or Pearson r).

4.2 Task-related criteria

This section examines the outcomes of the research in terms of the task-related criteria generated in Chapter 1. The essential issue to be addressed here is, could the systems under analysis be used to accomplish basic tasks of information retrieval? The resolution of this issue points to the answer for the first research question. Differences in the capabilities of each system, including performance for different tasks, are covered here. As such, task-related criteria do not include simply the ability to meet a clearly-defined information need involving explicit relevance judgements, but also purposes such as browsing or the meeting of vague information needs. Two types of tasks were assigned to each respondent for each system under study, completed in random order. One was to identify document surrogates which meet information needs expressed as a set of three keyterms, such as "managing, technology, [and] economic factors." The other had an information need expressed as a sentence, such as "I am interested in planning financing for my child's college education." Both tasks fall generally into the realm of relevance-based approaches to IR in that respondents were asked to meet a particular information need. The makeup of the respondent group, combined with some qualities of the Space system, yielded many comments about anticipated usefulness for browsing or for more vague information needs which were found useful for this analysis.
The analysis in this section focuses on the tasks themselves: whether respondents were able to identify document surrogates which were judged to meet the assigned information need, how long the tasks took, and how long respondents took to learn the basics of each system. The process by which the tasks were completed was also considered: cues present in the system to help to know what to do or the ability to get assistance or form plans based on system feedback. (Other process-related items were considered for model-based criteria analysis which follows this section.)

4.2.1 Well-Defined Information Needs

The most straightforward way to examine the relative usefulness of each system for task completion is to look at the frequency with which respondents were able to identify at least one document surrogate which they judged to meet the assigned information need. Table 4.1 shows the frequency with which each system "failed" where failure is defined simply as the inability of the respondent to identify a suitable document surrogate. No attempt was made to assess whether the documents chosen by respondents were "correct" because of the assumptions for this work about the situational nature of relevance (discussed in Chapter 1). Indeed, data on the documents selected were not analyzed at all, except to see if any documents were written down. While it may be argued that some documents would be judged by a set of independent observers to match the information needs better than others, for this study it was left to the respondent only to decide. A reading of the document citations and their associated surrogates revealed that respondents' choices were at least in the right neighborhood with considerable individual variation presumably based on individual interpretations of the information needs.
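Table 4.1, which follows, reports success counts for each system, and the discussion after it cites Fisher's exact test for 2x2 tables. As an illustration only, the Python sketch below shows one way such a test could be computed from the published cell counts. The function name fisher_exact_two_sided and the two-sided rule used here (summing the probabilities of all tables no more probable than the observed one) are assumptions introduced for this sketch; the text does not state which software or which sidedness was used in the original analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper: sums the hypergeometric probabilities of every
    table with the same margins that is no more probable than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability that the top-left cell equals x, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * tolerance
    )


# Cell counts from Table 4.1: rows are success yes/no, columns Space/Prism.
p_value = fisher_exact_two_sided(31, 34, 9, 4)
print(f"two-sided p = {p_value:.3f}")
```

Under these assumptions the sketch yields a p-value above the alpha of 0.10 adopted for this chapter, in line with the non-significant result reported. It does nothing to address the violation of independence across responses that the text notes for this test.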
Table 4.1: Success for Each System Across Tasks
--------------------------------------------------
    n     |        System
   col %  |   Space     Prism     Total
----------+-------------------------------
Success   |
   yes    |     31        34        65
          |   77.5      89.5
   no     |      9         4        13
          |   22.5      10.5
----------+-------------------------------
   Total  |     40        38        78
--------------------------------------------------

Fisher's exact test for 2x2 tables was non-significant and was also non-significant when controlling for which task type (keyword or general information need) was completed. (Note that the criterion for independence of measures was not met for this test.) In spite of the lack of statistical significance, it may be seen that the Space system had more than twice as many non-successful searches as Prism, giving support to thoughts that Prism was more "useful" overall for task completion for the types of tasks completed for this study. Within the realm of tasks typical of studies and systems designed in the relevance tradition, Prism is more favorably evaluated. Space, however, is not without utility for such tasks. One fairly simple change to Space for increased utility on searches for well-defined information needs would be to use the same information space but with a different interface (e.g., one which makes an analog of boolean searching or field searching possible).

4.2.2 Less Well-Defined Information Needs

As no tasks were completed concerning browsing or vague information needs, we are left to consider the assessments of respondents regarding the usefulness of the systems for such tasks. There were a large number of comments to this effect on both the open-ended item concerning satisfaction and comments about the purposes to which each system would be put (asked during the "Impressions" section of the questionnaire after task completion).
The trend throughout those comments is clear: respondents expressed faith in Prism-like systems for information needs which were well-defined, and an impression that the Space system would be most useful for browsing or situations where the information need is not well-defined. The primary source for data on the usefulness of each system for less well-defined needs is in the "impressions" section of the questionnaire, where respondents were asked to assess "in what situations would you prefer to use space/prism?" For the Prism system, 10 out of the 18 responses to this item specifically mentioned the usefulness of Prism for when the information need is well defined. Comments such as, "when information need is very clearly defined (103101)," and, "If you know the author or title of a book / person and want to find the exact article (120101)," were typical. Other respondents commented on the usefulness of Prism for "most situations (110402)" or "known item (112001)" searching. Two respondents (103103 and 110401) mentioned the usefulness of Prism for multiple subject access or searching across different elements of a document surrogate (e.g., Title and Author). No respondents mentioned browsing for the Prism system. For Space, many comments were directly opposed to those for Prism in that they focused on the usefulness of Space for browsing or for searches which did not have clear objectives. The comments of respondent 110601, "when one has no idea what he wants but maybe clued in on an area by browsing the space," and 120101, "a general, broad search," were representative. Some respondents suggested that Space was useful for subject searches (as opposed to Title or Author searches, although usefulness for full-text searches in general might also be indicated). For instance, respondent 110101 wrote, "a search where I felt that the main subject of a document would be my access point."
Comments about the usefulness of Space for less well-defined searches are tempered by comments of several respondents who stated they would not use the Space system for any purpose. Additionally, it should be remembered that respondents might have been trying to say something nice about Space, indicating its usefulness for some task, albeit not a task they completed for this study. Even with those caveats in mind, the outcome of the comparison of Prism and Space for tasks for which the information need is less well defined leans in favor of Space. Although it will remain for another study to empirically test this with an appropriate research design for such information needs, comments such as those of respondents 103101, "when I'm trying to see relationships between descriptors," and 110402, "if I were completely ignorant about a subject," indicate the possible favorableness of the outcome of such a study towards Space.

4.2.3 Learning and System Cues

In order to interact successfully with an information system, according to the argument in Chapter 1, it is necessary to have a model of the "other" -- a model of what the system does and how it works. This study did not directly address the nature of the model of the system possessed by respondents. However, some comments by respondents for open-ended items relate directly to the model which they possessed a priori or which was under formation. The implications for this study are twofold: first, that a well-defined model of the Prism system (or systems similar to it) led to increased feelings of comfort and higher evaluations than for the Space system (for which a less well-defined model was possessed). So, people were able to glean quickly what steps were necessary during their searches and make use of system responses to refine their search. The second implication is that respondents quickly formed a model of the Space system based on the user manual and their interaction with the system.
Here, the Space system did not correspond to a priori models of IR systems but was learnable. One goal of navigation-based approaches to information system design, assumed also for human-communication or exosomatic memory systems, is the intuitiveness of the system. This applies not only to the organization of the information space (so that it matches the cognitive space of individual users or user groups) but also to the methods for interacting with the information space. Some respondents indicated they had formed a more effective model of the Space system between their first and second searches. Respondent 103103 commented for the open-ended item concerned with satisfaction, "by this time I had finally figured out to [sic] really manipulate the space." Respondent 110402 wrote, after her second search task, "system is more familiar now, but I can't find much that is useful." Respondent 110401 commented, "previous search helped to know system better and where to go to find exact term." These comments indicate the increased detail in respondents' models of the Space system and the models' utility for interacting with the system. Other comments related to the match of the information space to users' cognitive space (or at least an ability for users to translate their concepts and relations easily to those found in the information space). Respondent 111901 wrote, "intuitive. A 'neat' 3D concept. Takes a little time to get used to," to describe the Space system. However, some respondents, such as 110702, had the opposite perception: "Not completely intuitive, spatial relationships interesting, but often confusing" (from the "impressions" sheet, where respondents were asked to assess the usefulness of the locations of keyterms and documents in the information space). This second perception, relating to confusion about the basis of the location of keyterms in the information space or difficulty in finding useful documents, was found more often.
It is not clear in many comments whether the information space is unclear or whether respondents just had difficulty finding documents close to their associated keyterms. There were no similar comments about Prism: nothing to indicate that people had formed more expertise with the Prism system during the short time of their participation in the study and nothing to indicate familiarity with Prism's information space. However, many comments about Prism indicated the existence of a pre-formed and complete model of systems similar to Prism, e.g., in the "impressions" section of the questionnaire where respondents were asked to describe Prism in their own words. Respondent 103102 had a representative comment: "traditional -- menu driven -- command driven search for information on various fields, similar to my prior experience with electronic searching." There were very few comments on the cues which the systems provided for how to interact with them -- components of the model of the system provided by the system explicitly. Some respondents provided answers to the items about "what was it that helped you to know where you wanted to go during your search" which indicated components of each system which helped with the next step. For Prism, on-screen instructions helped. For Space, the map window and the concept relations helped. No answers to this item, for either system, related to the use of online help or the user manual. No comments specified a loss of knowledge about how to proceed with a search either. The conclusion that may be drawn from this evidence, in light of the specific request for information about what led respondents to know "where to go," is that both systems provided sufficient feedback to the respondent (and/or fit well enough with respondents' prior knowledge about information retrieval systems) so that they could proceed without consciously seeking guidance. 
This sub-section has shown that Space was generally learnable, but intuitive for only some respondents. Respondents had a well-formed model of Prism-like systems which did not seem to change or be challenged during the retrieval tasks. Both systems gave sufficient feedback or were sufficiently intuitive in functionality so that respondents were able to interact successfully with them (this statement pertains to system functions, not to the information space organization).

4.2.4 Training Time

Training time points to the intuitiveness of the system, the ease of forming or incorporating a model of the system, and the degree to which the system meets expectations. Training time, the number of minutes respondents worked with the system for practice before the tasks, was measured for only 7 respondents. Missing data were a result of an error in training the research assistant and the non-interventionist data collection methods (that is, no one noticed when respondents finished training; respondents were not asked to time themselves for training, in an effort to allow them to take as much time as they needed to feel comfortable). Pairs of training times in minutes were as follows, with time for Space given first: 19,6; 14,3; 10,3; 20,20; 15,15; 20,20; 18,12. The means were 16.6 (SD 3.7) for Space, and 11.3 (SD 7.4) for the Prism system (r was 0.71, p < 0.10). As Appendix A shows, respondents were given more detailed written instruction for the Space system than for Prism. The Prism system includes on-screen instructions which seemed to occupy several respondents, while other respondents proceeded from the brief training manual introduction to the retrieval tasks. Respondents mostly followed the Space system training manual page-by-page, which might account for less variance in training time for that system. No definite conclusions about comparative ease of learning to navigate the systems are apparent from these data.
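The summary statistics for the seven training-time pairs can be recomputed from the values quoted above. The following Python sketch is an illustrative check and not part of the original analysis: the variable names and the pearson_r helper are assumptions introduced here, and the sample (n - 1) form of the standard deviation is assumed.

```python
from math import sqrt
from statistics import mean, stdev

# Training times in minutes as (Space, Prism) pairs, copied from the text.
pairs = [(19, 6), (14, 3), (10, 3), (20, 20), (15, 15), (20, 20), (18, 12)]
space = [s for s, _ in pairs]
prism = [p for _, p in pairs]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(space, prism)

# stdev() uses the sample (n - 1) denominator.
print(f"Space: mean {mean(space):.1f}, SD {stdev(space):.1f}")
print(f"Prism: mean {mean(prism):.1f}, SD {stdev(prism):.1f}")
print(f"r = {r:.2f}")
```

Under these assumptions the recomputed values agree with the reported means, standard deviations, and correlation to the precision given in the text.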
(Sample characteristics, detailed in a following section, lead to an expectation that training times for the Prism system should be lower, as all respondents had some experience with traditional electronic information systems.)

4.2.6 Summary of Task-Related Criteria

On task-related criteria, the Prism system was found generally to be more effective for information needs which were well-defined, such as the tasks completed by respondents for this study. Respondents indicated the potential usefulness of Space for poorly-defined information needs, but such tasks were not tested empirically. Prism was not judged to be useful for poorly-defined needs but again this was not tested empirically in this study. The Space and Prism systems were similar in terms of training time and the ability for the user to know how to proceed with a search. Simple enhancements to the Space interface (not the information space) might lead to a Space system which is better suited for tasks in the relevance tradition.

4.3 Model-related criteria

The formation and maintenance of the various models necessary for effective interaction with an information system (or human being) were introduced in Chapter 1. This section will analyze the extent to which model formation or the applicability of a priori models is facilitated by the Space and Prism systems. Unfortunately, due to the evolving nature of the current work, few items relating directly to model formation and use were asked of respondents. Thus, both respondent comments and knowledge about the design of each system will be incorporated for this analysis. The existence of a model of the other, a model of how to change the relationship to the other, a model of how the other sees the self, and a grasp of the situation in which the self and the other are found, is necessary for interaction with any information system or human being(s) (again, refer to Chapter 1 for the full discussion).
For human communication systems and exosomatic memory systems, these models are paramount. For relevance-based systems, the models are important but seldom a focus of system design (in that the users form the models, but the systems do nothing to facilitate the model formation process). For navigation-based systems such as the one under study here, there is an attempt to provide an information space which fits the user's perceptions or model of the space, by matching the information space to his or her cognitive space. There is also an attempt to facilitate model building of the various components of the interaction by providing an interface to system functions which requires little learning and is well-matched to the tasks to be performed. This section will examine the available data in turn for the model of the other, the model of the relationship to the other, the model of how to change that relationship, and the model of the self.

4.3.1 The Other

In a nutshell, the model of the Prism system was largely pre-existing from respondents' prior experience with similar systems. The model of the Space system was formed during the research process with the assistance of the user manual/tutorial and experience with the system. No data in the literature were found which spoke directly to the issue of how quickly complete novices are able to learn to use traditional relevance-based IR systems. Further, it is difficult to measure the prior experience of the respondents for this study, or any other study, in units such as "hours," or "number of searches completed." Respondents self-reported their experience with computer systems and information systems on the demographic portion of the response form, but this Likert-scale item does not yield any absolute measure of experience.
The significance of all this is that respondents had some amount of experience with systems designed in the same tradition as Prism, but necessarily had none with Space-like systems (since Space is the first system of its kind, from both an information space and visual interface standpoint, to be found in the literature). In spite of these differing levels of experience with the two types of systems, respondents were able to quickly form and use a model of the Space system, as evidenced by their rate of success in finding document surrogates and their comments concerning their knowledge of the system. Empirical measurement of the extent to which the Space system design proved to be actually intuitive, as opposed to non-intuitive but learnable, would be difficult. Nevertheless, a model of Space seemed forthcoming and was particularly visible in the "impressions" section of the questionnaire where respondents provided generally accurate descriptions of Space and evaluations of the various components of the Space system. The model of the other for Prism seems a good example of what Mead (1960) called the "generalized other." This is used when general knowledge about an other, possibly at a stereotype level, is applied when experiences with the specific other had not yet occurred. Breakdown occurred for some respondents when they had difficulty applying search techniques which they thought should work in a particular way, such as boolean operators or truncation. Some respondents reacted to this problem by employing search techniques which fit into an even more generalized model (e.g., single term searches), while others pursued expertise needed to form a more specific model of the Prism system (e.g., by using online help and exploration to discover the proper methods for boolean operators). The model of how the other sees the self, for Prism, falls into the same category as most traditional (batch-oriented) systems as discussed in Chapter 1. 
Respondents knew where the responsibility lay for the generation of queries and re-working of unsuccessful searches: on themselves. As a highly information-literate group, the respondents also provided comments indicating their expectation of the need to come up with search terms which meet someone else's descriptions for documents, not so much their own. The model of how Space models the self was harder to determine and perhaps subject to additional individual variability. Respondents understood they were to navigate through a visual representation of a database but many did not perceive the basic similarities in content between the Prism and Space databases. They noticed the relational nature of the space, but many had difficulty in employing the relations to their benefit. The similarities in function between Space and relevance-based systems included the ability to keyterm search, the nature of the keyterms (e.g., such things as "teacher effectiveness," a term from the ERIC thesaurus), and the basic goal of identifying useful document surrogates. Respondents seemed to grasp these similarities and reacted to them in their answers to open-ended items about satisfaction (and other open-ended items) by requesting additional types of functionality, such as boolean searching or multiple term searching. An implied component of how Space models its users comes from frequent assessments of Space as a browsing system as opposed to one for known items or narrow searches. Not surprisingly, respondents did not comment on the nature of Space as a system which is better matched in function and information space to their expected functionality and cognitive spaces. The interactive and ongoing nature of the search was evident, though, through such terms as "wandered around" and "nearby terms" in the open-ended comments. The model of the other and of how the other models the self was, as Mead would have us believe, based first on the "generalized other."
This applies for both Prism and Space. For each, many respondents attempted to use functions which they expected based on the generalized model. In the case where these functions did not work as expected or were not available, Prism users either went to a more general model or searched for the functionality which they expected was hidden but present. For Space, respondents tended to make use of what was obviously available. This makes sense because Prism offers many more functions than Space, and almost the entire functionality (but not the intellectual foundation) of Space is evident in the user manual/tutorial (reproduced as Appendix A). A better model of the Space system was formed during the search process -- across training and the two search tasks. The conclusion may be drawn that further refinement to the model, to bring it in accord with actual information space organization and system functionality, might facilitate future searching.

4.3.2 The Relationship to the Other

Both Prism and Space responded to input from users. Prism allowed input only from the keyboard, while Space allowed keyboard and mouse or glove input. Respondents were aware of each system as a fairly typical computer application in that the program would wait patiently for the human component to provide some sort of instruction and then rapidly carry out that instruction (or set of instructions). For Space and Prism, the relationship of the human user to the system ("other") was of a fixed set of responses which could be triggered by user input. We can distinguish between the overall relationship to the other and the relationship as it exists at any one time. At a particular time, the general model will apply and there will be some sort of context or situational model as well.
For Prism, situational models are limited to slightly different subsets of commands which are available at different points during a search, e.g., during keyword specification versus during document surrogate selection. For Space, situational models are really "locational" models, in that a situation, for the Space system, means simply a different location in the information space. Both systems are clearly quite limited in the extent to which they can change to meet different needs of their users at different times or for different searches. More detail on this matter is found in the next subsection, leaving the current subsection to continue with general models of the other possessed by both the systems and their users. From the system point of view, we can think about the model of the human user which was created (implicitly) by the system designers. For Prism, the model is of a user who has a well-defined information need and is able to express it in terms appropriate for the vocabulary of the particular database being used (in this case, ERIC). Prism system responses are in the form of output (lists of citations or individual full database entries) or error responses. There is no continuity from one search to the next except so far as searching may take multiple steps (e.g., to first generate a list of citations, and then to retrieve particular document surrogates). For Space, the model which it possesses of the other is of someone who is less familiar with the language used by the database designers. The search process here is one of iteration and exploration in that there is a continuous path between locating useful vocabulary in the information space, understanding the layout and contents of the database, and retrieving and evaluating document surrogates. The Prism model of its generalized users, at first look, might seem impoverished and inadequate next to the Space model.
Space would seem to incorporate many of the aspects of human information seeking and use behavior, while Prism involves a much smaller subset of such behavior. However, the Space system will tend to involve more work and thought on the part of the user, at least initially (as supported by numeric outcomes discussed in the next section). The problem might be akin to that of a reference librarian, who often must interact with library users who do not want to learn to find an answer for themselves, nor do they want to spend time expressing their needs in terms which can be easily understood (perhaps after some dialog) by the librarian. They want an answer, and quickly! So, the model of information seekers incorporated by Space might be more appropriate to actual information seeking behavior and could eventually be quite useful for information seekers who are interested in spending the time to get the benefit (as discussed further in Chapter 5). For typical information seekers though, such as the general public or undergraduates, both the model which they have of Prism-like systems and the Prism model of its users are more in accord with the desired nature of the outcome (quick and unambiguous), if not the outcome actually desired (a good answer, if not the best possible answer). In the current study, many respondents, as information professionals, were already capable of getting closer to a good answer speed-wise, accuracy-wise, and completeness-wise, from Prism-like systems. Respondents' models of Space, or its descendants, may include knowledge of Space as closer to a human communication system than a batch-oriented system. As such, answers to questions could be through dialog. In this study, the tasks completed by respondents did not lend themselves to attempts at dialog by respondents (the tasks fit well with the relevance-based tradition).
And, admittedly, the interface and database organization did not prove to be usable enough that any but the shortest conversations would be desired by respondents in the current incarnation of Space. This subsection has considered the nature of the model of the other from both the system's point of view (as created by the system designers) and from the user's point of view. Although Space has some characteristics which make it more akin to a human communication system or at least a system whereby answers are achieved through dialog, it is likely that Prism is better suited to the type of interaction which many users would prefer: short, even at the expense of accuracy or completeness. It should be pointed out that these issues are primarily directed at the interface, not the information space. The information space organization possessed by Space, with its attempt at increased similarity to cognitive space (as discussed in Chapters 1 and 3), could be just as easily introduced with a Prism-like interface. Measurement of relative effectiveness of such a system must be left for a further study.

4.3.3 How to Change the Relationship to the Other

The model of how to change the relationship to the other can be examined from two perspectives. First is the larger: how to modify the nature of the interaction and the general expectations that the participants (system and user) have of each other. Neither Space nor Prism possesses much in the way of this ability. However, Space has the potential to include models of both the information seeker in the relevance-based tradition and the browser, if somewhat different interfaces were employed. Prism seems stuck in the relevance-based tradition with a model of its user as knowledgeable and goal-directed. The second perspective on the model of the relationship to the other is related to the ability to redirect the "conversation" during a given interaction.
The analysis from this perspective is straightforward: Prism operates in the traditional batch mode of operation in that any redirection must be user-mandated and necessarily removes any previous components of the "conversation" from Prism's "memory." Space, on the other hand, is very much driven by micro-changes to the relationship. As a system more in the dialog-driven model than batch oriented, changes to the relationship to the other are what interaction with the Space system is all about. Within an interaction, Prism users provide queries which make quantum changes in the status of the relationship to the other. This is accomplished as part of the primary methods for interacting with the system. Similarly, Space users constantly change their relationship to the system through their keyboard input and mouse/glove gestures, except the changes are continuous, rather than discrete. As stated in the subsection above on the model of the other, respondents drew on prior knowledge of similar systems for insight into how to change the relationship to the Prism system within an interaction but relied less on such knowledge and more on readily visible or otherwise explicit (e.g., from the user manual) knowledge for Space. The model of how to change the relationship to the other, for both types of IR systems studied here, is a fundamental part of accomplishing search tasks within the system. As systems move towards human communication and incorporate more knowledge of various information need situations, the capability to change the very nature of the other -- what the system does and how it views respondents and their needs -- will be more important. The Space system takes some steps in that direction with the potential capability to provide different interfaces for the same information space, but this quality was not realized as part of this work.
4.3.4 Model of the Self

In the view of such theoreticians as Goffman (1974), the self-concept is central to our being and is largely quite slow to change. However, we are able to present different aspects of ourselves in different situations. This subsection briefly considers the model of the self as manifested through interaction with both the Space and Prism systems. Note that there is no consideration of the model of the self possessed by the systems. According to Mead (1960), the ability to model the self and to perceive the self as others see it, is a uniquely human capability. I have argued elsewhere for the necessity for a self-concept for true artificial intelligence (Newby, 1988). Presumably, effective human communication systems might necessitate such intelligence. Furthermore, exosomatic memory systems might actually involve the information system possessing a copy or analog of its user's self-concept. Presentation of self for interaction with Prism-like systems is analogous to a bright light (the self) being directed through a small opening (Prism's limited model of the self). Such traditional systems are extremely limited in the types of input that they can take and the scope and level of background information on the information need which they are capable of processing. Prism users must therefore channel their vast background knowledge, existing domain knowledge (or ignorance), goals, values, and experiences into a few simple query terms. For Space, users have similar limitations as for Prism, but with the added portal for their presentation of self through the partial match between their cognitive space and the information space. That is, there is some capability to express their perceptions of relationships among concepts through relocation in the information space. This quality, while important to the intellectual foundations of this work, is only realized at a very basic level: that of the incorporation of continuous relations among concepts.
This is further limited by the inability to search for multiple terms in the space. Presumably, that added functionality would enable the study respondents to express their needs using keyterms in the space and end up at the locus of those terms which would be a window into related keyterms and documents. Further development, of course, would also necessitate a better match between cognitive space and information space. The presentation of self and not the "true" or core self concept (to the extent to which one may be said to exist) is what allows us to interact in the daily world. The Space system has potential to allow for a fuller presentation of self which might ultimately include such components as the values, goals, and experiences which inform the information need. Although the Prism system does not allow for a broad presentation of self, other systems in the literature have attempted to incorporate some knowledge of the user. As the arguments in Chapter 1 indicate, it is my belief that the lack of success from those approaches results from the lack of consonance between information space and cognitive space, not from the inaccuracy of their basic ideas.

4.3.5 Summary of model-related criteria

If one is to accept the importance of models for human communication and realize the importance of aspects of human communication for effective information retrieval, then the necessity of considering the extent to which information retrieval systems assist in the creation and maintenance of such models will be understood. This section has compared the qualities of Space and Prism and the comments which respondents had about using each system, in light of the different types of models which were proposed in Chapter 1 as having utility for interaction with information retrieval systems. Because it was created with the importance of model building and non-relevance-based approaches to IR in mind, Space had a clear advantage over Prism for model-based criteria.
An important advantage which Prism had though, was the pre-existence of a well-defined model possessed by the respondents for this study. From the point of view of system design, basic goals of information retrieval (from Chapter 1) and the comments of respondents, the roles of the various types of models posited as having importance for information retrieval have been shown to have more longevity and applicability for long-term goals of IR in Space than in Prism.

4.4 Analysis of Closed-Ended Data

The user-based evaluation described in Chapter 3 composed the most traditional portion of this work. However, as the previous section indicates, the open- and closed-ended data collected from respondents compose only a portion of the entire study of navigation for information retrieval carried out here. This section will present summaries and interpretations of the data collected from respondents which have not already been incorporated in previous sections. Included are time on task, relation between satisfaction and other closed-ended measures, summaries of the closed-ended responses, etc. The values for time spent on each task showed a clear trend of less time spent during Prism searches than for Space searches (see Table 4.2). Learning effects for both systems may be gauged by considering the time-on-task for each system for each task. For both systems together, there was no significant difference between the time taken to complete the first task and the time to complete the second task (within-respondents t = -0.52, ns). Comparisons for each system alone were also non-significant.
Table 4.2: Time on Task (in minutes)
--------------------------------------------------
System       Mean    Min    Max     SD      N
--------------------------------------------------
Space        10.9      2     27    5.6     40
Prism         7.9      1     18    5.6     38
Combined      9.4      1     27    5.2     78
--------------------------------------------------

Respondents were asked to rate their satisfaction with the task outcome ("On the scale below, indicate how satisfied you are with the outcome of your search."). They were also asked to provide explanation in their own words ("What was it that gave you the degree of satisfaction that you had?"). The Likert item yielded approximately uniform distributions for successful searches across systems with a negative skew (where, as above, "successful searches" were those which resulted in at least one document number or citation being written on the data collection sheet). The mean satisfaction score across all searches was 4.6 (SD 3.3). With non-successful searches eliminated, the mean was 5.5. Of the 13 non-successful searches, all but one search was rated as "0" on the Likert scale. The 13th search was rated as "2." Generally, the Space system yielded lower scores on the Likert scale. Only the Prism system was rated as "10" at all on this scale. Table 4.3 summarizes scores on this item for successful searches.
Table 4.3: Closed-ended scores for "satisfaction" item (successful searches only)
------------------------------------------------------------------
           |  frequency of score on "satisfaction" scale     |
system     |   0   1   2   3   4   5   6   7   8   9  10     |  n
------------------------------------------------------------------
Space      |   2   2   4   7   2   4   1   2   4   3   0     | 31
Prism      |   0   1   1   4   3   4   6   2   3   4   6     | 34
Combined   |   2   3   5  11   5   8   7   4   7   7   6     | 65
------------------------------------------------------------------

Likert items for navigation aspects of the systems during task completion indicate a strong association between satisfaction and "knowledge of where you are," and "knowledge of where you want to go." Satisfaction was found to be positively related to both measures of system navigability, as shown in Table 4.4.

Table 4.4: Relationship of Navigation and Satisfaction Scores
------------------------------------------------------------------
Pearson correlation scores on the "where you are" and "where you
want to go" closed-ended items with the scores on "satisfaction."

system     |        |  Where you are  |  Want to go
------------------------------------------------------------------
Space      |  r     |      0.44       |     0.38
           |  p <   |      .005       |     .05
           |  n     |       38        |      38
------------------------------------------------------------------
Prism      |  r     |      0.53       |     0.50
           |  p <   |      .001       |     .050
           |  n     |       40        |      40
------------------------------------------------------------------
Overall    |  r     |      0.53       |     0.47
           |  p <   |      .001       |     .001
           |  n     |       78        |      78
------------------------------------------------------------------

Each task analysis included three open-ended items. The first asked respondents to write what gave them the degree of satisfaction they had on the closed-ended scale. The second asked about what gave them the degree of knowledge of where they were and where they wanted to go which they reported in the associated closed-ended item.
A final item for each task asked respondents to provide any additional comments they might have (see Appendix B for the exact wording of all questionnaire items). Four general classes of response to the open-ended task items were found. One had to do with qualities of the system. Another class was directed at the task itself. A third had to do with qualities of the documents found during the task. The last general class of response referenced respondents' prior experience, either with similar tasks or similar systems. Open-ended items were highly structured, inasmuch as all open-ended data were responses to explicit questionnaire items. The last questionnaire item for task analysis, however, was potentially problematic in that few respondents indicated whether their comment had to do with satisfaction, navigation cues, or something else. This potential problem was not realized because both closed- and open-ended data indicate a strong relationship between the perceived presence and usefulness of navigation cues and perceived satisfaction.

4.4.1 Searches which did not Result in Selection of a Document Surrogate

Of the four non-successful searches using Prism, three responses to the open-ended satisfaction item were given. One respondent simply commented she "did not know the descriptors and could not retrieve any documents under their given titles...(120101)." Another respondent was unhappy about having to search on a sentence describing an information need, saying, "very irritating to play guessing games. Give me a title, damn it (112601)." Another respondent, during a keyterm task, said, "the lack of an online thesaurus made it impossible to narrow the number of hits that technology brought up, or to find alternate ways to search (120201)" ("technology" was one of the terms in the assigned set of keyterms).
The small number of non-successful searches for Prism makes it difficult to assess whether respondents identified any general deficiencies of the system; the small number may itself indicate a lack of such deficiencies. Respondents for these non-successful searches and later successful searches requested a particular navigation tool -- an online thesaurus or list of descriptor terms.

Somewhat more open-ended data were available for non-successful searches using the Space system, with 9 responses for the satisfaction item. A total of 6 respondents had a non-successful search with the Space system, meaning that 3 respondents did not write any document numbers at all for either of their two searches. Three respondents were able to identify appropriate locations in the space but then found no documents which met the task criteria (each with a different information need). These responses included, "found terms without document numbers (103102)," "not finding document numbers near my term (110901)," and "tried to locate myself in the space via keyword and [...] concluded there are none in the space (120201)."

Respondent 103102 also stated she "didn't understand the relation between terms and numbers, light and dark." The first part of this statement is a frequent comment for the Space system which was also expressed in the three statements from the last paragraph. The statement indicates confusion about the location of documents in the space (which were represented by a document ID number). Many respondents seemed to appreciate the conceptual locations of keyterms in the space, but were surprised at the topics of documents found near them. This phenomenon is discussed in Chapter 5 in more detail. Surprisingly, no respondents commented on the appropriateness or intuitiveness of the conceptual relations among documents (or lack thereof) at all -- either few respondents made an investigation, few noticed relations among documents, or simply none wrote about them.
The second part of this respondent's statement refers to the depthcueing feature of the information space window, which made items in the space which were further away appear fainter (or disappear entirely). No other respondent commented on this aspect of the Space system (it was mentioned in the Training Manual).

Two responses for non-successful Space tasks indicated that an inability to meet the perceived task criteria led to the satisfaction scores on the closed-ended item. One respondent said, "couldn't find anything that fit more than one of the criteria (110101)," and another said, "the citations I found would not answer my information need (110601)." Other respondents either did not know what gave them the degree of satisfaction they had or did not respond to this item.

Only one respondent with a non-successful search indicated "previous experience (112001)" as a navigation aid -- his prior Space search was successful although it only yielded a satisfaction score of 2. Very little other data from navigation-related items were given for non-successful searches. Three respondents (103102, 110901, and 110601) had difficulty finding appropriate vocabulary terms in the information space.

4.4.2 Successful Searches: Prism

For Prism, 34 responses to the open-ended item about satisfaction were given. Closed-ended responses ranged from 1 to 10, with modes at 10 and 6 and a negative skew (mean 6.4, SD 3.2). Regardless of the satisfaction score and whether the comment had a positive or negative tone, most responses had to do with whether the document citation found "matched" the assigned information need. Respondents who were apparently less happy with their search commented, "doesn't meet all the criteria in the information need (103101, satisfaction scale value of 5)" and "couldn't relate subjects, couldn't find any matching more than a few at a time (103103, satisfaction 1)."
Respondents who provided higher satisfaction scores "found exact words in the title and abstract information verified it (110901, satisfaction 10)," and said "the document abstracts indicated that it [sic] matched perfectly with the stated information need (120102, satisfaction 10)." As occurred for the non-successful searches, four Prism users included criteria which were not part of the explicit task (external criteria were not commented on explicitly for any successful Space search). Respondents mentioned the currency of the item found (103101, 110601, 112001), the quality of the source (110701), and level of the intended audience for the document retrieved (103101). The satisfaction with one successful Prism search was attributed partially to "the search behind it (110401, satisfaction 9)," indicating some learning had taken place. Another respondent alluded to serendipity saying, "I found something somewhat accidentally (110402, satisfaction 5)." Only one respondent commented directly on aspects of the Prism system itself as leading to satisfaction saying, "I really wanted to combine the descriptors and identifiers in one search but I ended up getting an OK document (111801, satisfaction 4)." The same respondent indicated for the next search that she had learned to use the Prism system better saying she had improved her technique by using a multi-step search. Two closed-ended items having to do with the conceptual framework for navigation (as described in Chapter 1 and operationalized in Chapter 3) were part of the task questionnaire. The first item, "to what extent did you know where you were during your search," resulted in a range of scores on Prism tasks from 0 to 10, but no scores of 1 or 2 (mean 6.7, SD 2.6, statistics include 4 non-successful searches, with scores of 0, 0, 4, and 5). So, the effective range for successful searches was from 4 to 10. 
The second item, "to what extent did you know where you wanted to go," also had a range for Prism searches of 0 to 10, with all scores represented for successful searches (mean 6.6, SD 2.8, statistics include 4 non-successful searches, with scores of 0, 1, 10, and 5).

Navigation aspects of the Prism system were referred to in open-ended items, for the most part, in two ways. The item about what gave the degree of knowledge of "where you were and where you wanted to go" yielded several responses having to do with either understanding of the appropriate actions to take with the system or understanding of the topic area for the assigned search task. Comments of the first type were associated with both high and low scores on the closed-ended satisfaction item, but high scores on the navigation items. For example, the comment, "I have searched Eric before using Dialog (and on CD-ROM) so I knew what the fields meant and how likely I was to find a term I wanted there (111801)," was associated with a score of 4 for satisfaction, but scores of 9 on the location item ("where you were") and 8 on the movement ("where you wanted to go") item.

Comments about understanding of the topic area for the assigned task search included, "clear search objective (110901, satisfaction 10, location 9, movement 10)," indicating a positive impact and "subject descriptors. But not really as familiar with the topic as I would have liked (103103; satisfaction 5, location 5, movement 7)," indicating a less positive impact.

Other aspects having to do with navigation ability include qualities of the Prism system (respondent 103101 wrote, "I could see the list of articles in a vertical order"), serendipity (respondent 110101 wrote, "the phrases in the descriptors of the document I found were not what I had searched for.
Thus 'teacher effectiveness' found 'program effectiveness' and 'teacher burnout,' so I ended up where I wanted to go, but by accident"), and comparison to other systems (respondent 110101 also wrote, "this interface is not nearly as effective or helpful as the Dialog style interface that I have worked with before...").

The role of previous experience with the Eric database and traditional IR systems (such as Dialog) was important for the navigability of the Prism system. Respondents were reminded they were using the Eric database when they started the Prism sessions by the menu description. Many respondents seemed to associate the Prism interface with the Eric database, providing comments about Eric. For example, respondent 110701 wrote, "the topic seemed to be appropriate for Eric" in response to the open-ended item about navigation. Satisfaction, location, and movement scores were 9, 8, and 8, respectively. In the next search, which was also successful, the scores were 3, 9, and 9. The navigation comment here read, "the search did not seem appropriate for Eric."

There was no statistically significant difference in the means of satisfaction, location, or movement scores on the closed-ended scales when the first and second Prism searches were compared using a t-test (satisfaction t=1.49, not significant; location t=0.77, not significant; movement t=1.19, not significant).

4.4.4 Successful Searches: Space

Twenty-eight responses were given to the open-ended item concerning satisfaction with successful searches using the Space system. Somewhat more variety was present in the types of responses than for the Prism system. Fourteen responses only mentioned the quality of the match to the assigned information need as the source of satisfaction, with similar types of statements as quoted above for the Prism system. The other comments were directed at the Space system itself. No comments had to do with criteria which were not included in the task specifications.
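The comparisons of first and second searches reported in this chapter rest on a t statistic. As a minimal sketch, and assuming a paired (within-respondent) form of the test, which the text does not state explicitly, the statistic could be computed as shown below. The function name `paired_t` and the sample scores are hypothetical illustrations and are not data from this study.

```python
import math
import statistics


def paired_t(first, second):
    """Return (mean difference, t statistic) for paired scores.

    Assumes each position in `first` and `second` belongs to the same
    respondent (hypothetical helper, not part of the original analysis).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    # Difference for each respondent: second search minus first search.
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    # t = mean difference divided by its standard error, with n - 1 df.
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat


# Hypothetical 0-10 scale scores for four respondents.
first_search = [3, 4, 5, 6]
second_search = [5, 5, 7, 9]
mean_diff, t_stat = paired_t(first_search, second_search)
print(f"mean increase = {mean_diff:.2f}, t = {t_stat:.2f}, df = {len(first_search) - 1}")
```

The resulting t value would then be compared against a critical value for n - 1 degrees of freedom to judge significance.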
The range of scores on the closed-ended item for satisfaction for Space searches was from 0 to 9, with a positive skew (mean 3.5, SD 3.0; eight scores for non-successful searches were 0 and one was 2; two scores for successful searches were 0).

Of the comments directed at satisfaction for the Space system in successful searches, one respondent (111501) had some positive things to say about the system characteristics: "visual search reset" (the ability to go "home" to the center of the space by hitting the "R" key) and "view the spatial relations," were her comments (satisfaction was 5). Another respondent seemed to indicate that exploration, and not a well-developed search procedure, led to success, saying, "playing with the computer (satisfaction was 2)." Respondent 110402 did not have luck with her explorations, but seemed to place some of the blame on herself saying, "it seemed a relevant document perhaps existed on the system but I couldn't find it (satisfaction was 3)."

The experimental nature of the Space system must have been apparent to respondent 110401, who repeatedly typed the number from the sample screen to retrieve a document. The pop-up window which asks for the respondent to type the desired document number to search for included an example (it said, "Please type in the document number you want to retrieve, then press the key. (example: 202111)"), but the example was not a number linked to a document in the prototype system. The respondent wrote, "system kept telling me to put in document number but then would tell me document not found." For the next Space search, she wrote, "I reset the system at least twice to be sure had exited from previous search but still had same document number that was still not found." The research assistant made a note of this "bug" and a real document number was used in the example for subsequent respondents.
Another Space system "feature" which was changed in response to respondent comments was the effect of mouse movement on the user perspective in the space. Several respondents commented that the mouse was too sensitive to movement (103101, 110702). The Space program was adjusted so movements of the mouse resulted in only 1/3 as much movement through the space. Respondents could pick up and move the mouse, if they needed to move further in the space than the confines of the mousepad would allow with one continuous mouse movement.

One respondent cited his "inability to keyword search (111101)" as his source of dissatisfaction (score of "1" on the Likert item). Since it was possible to jump to the location of any single keyterm in the space, this might have been a reference to the inability to perform boolean searches or multiple-word searches with Space. His response to item #11 on the task sheet, which asked for comments on the task, adds, "one term search doesn't put you in the rest of the space."

Respondent 112601 wrote, "I don't like wandering around aimlessly with strings of numbers on the screen -- I want to see words connected to my search." This comment is directed at the visual representation of documents in the space using simply the document identification number. Keyterms in the space were the words themselves. This respondent was extremely critical of the Space system in general and was the only respondent whose general response to the research could be described as hostile. When later asked to "Briefly describe the types of situations in which you might prefer to use Space," she wrote, "at gunpoint." Her comments about the Prism tasks were equally critical.

Distributions for closed-ended scores for location ("the extent to which you knew where you were") and movement ("the extent to which you knew where you wanted to go") were moderately flat, showing fairly full ranges of scores.
For location, the range was from 1 to 10 with only one score of 10 (mean 5.1, SD 2.8, statistics are for both successful and non-successful searches). Scores for the 9 non-successful searches also had a flat distribution, with a range from 2 to 9. For movement, the range was from 1 to 10 (mean 5.7, SD 2.6, statistics are for both successful and non-successful searches). Again, a flat distribution of movement scores for non-successful searches was found, ranging from 2 to 10.

Several comments about the navigability of the system (for item #10 on the questionnaire) or for item #11 were directed at the effectiveness of specific features of the Space system. Respondent 103101 responded to the open-ended item about movement and location with, "the galaxy diagram in the upper right hand corner; also noting the terms that appear around the keyterm that I was looking at (satisfaction 3, movement 4, location 3)."

Respondents seemed to understand the role of conceptual relations among keyterms in the space. Respondent 103103 provided closed-ended scores of 8, 8, and 7 for satisfaction, location, and movement, giving this response to describe his navigation scores: "search terms nearby in the space -- once I figured out what I was doing I was able to move easier." For respondent 111801, serendipity combined with an understanding of the spatial relations: "knowing that I was near good terms helped me to know where I wanted to be. But I wanted documents (there were none there!) I wandered about a bit and saw a bunch of documents nearby. I tried one. It worked!"

Not all comments about the spatial location of terms indicated an understanding of the information space. "I knew the terms I was looking for, but they did not seem to be close together (110101)," was a comment associated with scores of 9, 3, and 5 for satisfaction, location, and movement.

Some respondents indicated they would be able to use the Space system more effectively with more practice or training.
Respondent 103103 commented for item 11, "by this time I had finally figured out to [sic] really manipulate the space." Respondent 110402 wrote after her second search task, "system is more familiar now, but I can't find much that is useful." Respondent 110401 went from scores of 3, 3, and 3 to scores of 0, 4, and 5 for the satisfaction, location, and movement items, commenting after the second search, "previous search helped to know system better and where to go to find exact term." Thus an increase in the perceived navigability of the system did not correspond with increased satisfaction with the search outcome.

As occurred for the Prism tasks, some respondents commented on the search criteria as leading to their navigation scores on the closed-ended items. Respondent 11050 for instance wrote, "I knew what the information need was." Respondent 110901 had trouble with the search criteria, writing, "the information needs [sic] wasn't that explicit -- couldn't find vocabulary words."

There were no statistically significant differences on the closed-ended scale dealing with satisfaction between respondents' first and second searches using Space. However, scores on the movement and location scales were significantly higher for the second task than the first task, across respondents. For the location scale ("To what extent did you know where you were?") t was 2.56 (p < 0.05), with an average increase of 1.44 between the first and second task. For the movement scale ("To what extent did you know where you wanted to go?") t was 2.48 (p < 0.05), with an average increase of 1.25 between the first and second task. All statistics reported in this paragraph are only for the 11 respondents who were successful in both Space tasks.

4.4.5 Respondent Overall System Evaluation

After all four tasks were completed, respondents were asked to evaluate the Prism and Space systems overall.
Open-ended items asked respondents to describe each system in their own words and to briefly describe the types of situations in which they might prefer to use each system. Closed-ended items asked respondents to indicate their overall degree of satisfaction with each system and to indicate the extent to which they knew where they were and where they wanted to go. For the Space system only, respondents were asked to comment on the usefulness of some of the system components in their own words: the vocabulary window, finding keywords with the "F" key, the map window, the spatial location of keywords, and the spatial location of documents. Finally, respondents were asked for any other comments or suggestions for the Space system they might have.

The Likert item for satisfaction with the Prism system ranged from 0 to 9, with a strong negative skew (mean 6.2, SD 2.6). The navigation Likert item scores were from 2 to 9, with a more moderate negative skew (mean 6.4, SD 2.1). For Space, satisfaction scores ranged from 1 to 9, with a moderate positive skew (mean 4.7, SD 2.3). The navigation item ranged from 1 to 10, also with a moderate positive skew (mean 5.0, SD 2.8).

Pearson correlation scores for the relationship of satisfaction to navigation scores were significant for each system. For Space the correlation was 0.75, p < 0.0001. For Prism, the correlation was 0.59, p < 0.001. The Pearson correlations for scores across systems were not significant (correlation for satisfaction between Space and Prism was -0.18, ns; for navigation, r was -009, ns).

Further comments about the usefulness of Prism and Space for tasks, and models which respondents had of each system are found in the previous sections on task-related evaluation and model-related evaluation. This subsection continues with the comments which respondents had about specific qualities of the Space system.
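For readers who wish to see how correlations of the kind reported in this subsection are obtained, the following is a minimal sketch of the Pearson coefficient and of the t statistic commonly used to test it. The function names, the sample scores, and the value of n are hypothetical; they are assumptions for illustration and not data or procedures taken from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equally long lists with at least 3 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a list with no variation has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical satisfaction and navigation scores on the 0-10 scales.
satisfaction = [2, 5, 7, 3, 9, 6]
navigation = [3, 4, 8, 2, 10, 5]
r = pearson_r(satisfaction, navigation)
print(f"r = {r:.2f}, t = {t_for_r(r, len(satisfaction)):.2f}")
```

A coefficient is judged significant when its t value exceeds the critical value for n - 2 degrees of freedom, so the same r can be significant or not depending on the number of paired scores.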
Five questionnaire items for the general evaluation of Space were directed at specific features of the Space system. Three tools for navigation were assessed: the vocabulary window, finding keywords with the "F" key, and the map window. The information space was assessed with items about the spatial location of keyterms and the spatial location of documents. These items indicate which tools were used and with what effects. The features addressed are indigenous to the system constructed for the investigation of navigation: such features are not part of text-based systems (with the exception of the vocabulary window which is commonly seen as an online thesaurus). These items were added to the questionnaire after some respondents had already participated in the investigation. Thus, only about 14 responses are available for each item. The vocabulary window was commented on first. This was the window in the lower right corner of the screen which contains all the keyterms found in the space. Users could hit the "V" key and then scroll up and down in the window. Comments about this window indicated its usefulness to navigating the space. Seven respondents simply wrote a variation on "useful" or "helpful" (120201, 110701, 110901, 111501, 110402, 111901, and 120102). Two respondents spoke of limitations of this feature, one saying simply, "limited (112601)," and another giving more details with, "too limited; some terms listed were not found (110702)." The second comment is difficult to interpret since all terms in the vocabulary window were assuredly located somewhere in the space. Perhaps this respondent had the same trouble respondent 110501 reported by saying, "useful, but it should be noted that the user must type in the space _ between words exactly as listed." This respondent did not notice the instructions in the screen asking her to type in a keyterm, which used "educational_improvement" as an example. 
Another respondent pointed out that the vocabulary window would work better if you could "simply click on term rather than go through 'F' (110401)." Generally, the vocabulary window was perceived as useful and frequently used by respondents (as gauged by both written reports and looking at what was going on when respondents asked the research assistant for some help, as almost all respondents did at one time or another during the process).

A closely associated navigation function was the ability to "jump" to the location of any keyterm in the space. Users would use the vocabulary window to identify potentially useful words (or synonyms for terms in their assigned search task) then go to that location in the space. Evaluations were similar to those of the vocabulary window, with several comments of "useful" and the like. Respondent 111101 said the F key was the "best part for me, most like previous computer usages." Respondent 112601 also found this function familiar, saying "unoriginal." Some respondents pointed out the relationship between the "F" key function and the vocabulary window, such as 111801, who wrote, "I think of the two as the same (as the vocabulary window) I didn't think about the vocabulary window until you asked. Easy to use." Respondent 120102 suggested this function "would be better if a 'spatial' AND were added," referencing the lack of capability for both multi-word searches and boolean combinations in the Space system.

The map window got very mixed reviews from respondents. This was the upper right hand window which showed the point of view in the space from a distance so that one's location relative to all keyterms and document indicators could be viewed. Respondent 120101 seemed to grasp its significance for navigation fully, commenting it was "used to find my way around the space. Looked for blue dots (document numbers) and yellow (keywords) to make a match.
Knew that if I was near a dot but the space was blank, I should zoom in and out to find the document word." Respondent 110601 was also able to utilize this feature writing, "shows where the majority of documents are found." Other comments which indicated an understanding of the function of the map window included "saving grace (112601)" (which was almost her only positive comment during the entire research process!), "very useful (120102)," and "when lost (111901)."

Other respondents did not seem to understand the role of the map window at all or perhaps were overwhelmed by its presence. Comments included: "I couldn't really get into it. I had trouble focusing on my place in the map (110501)," "I looked at it quite a bit, but I couldn't understand it all of the time. I really only looked at it when I though I was way out in space (too far out) (111801)," "a beautiful distraction (111101)," "didn't look at it (110402)," and, "moderate (110201)."

Two respondents had suggestions for the map window. Respondent 110701 wrote, "very useful but I wish I could tell what's in front of me and what's behind me. Maybe using a different diacritic (ie, * v. D_ for those citations that are in front in comparison to those that are behind)." Respondent 110901 indicated more depthcues or scale information was needed writing, "scale is too small so wasn't useful until I saw that all the docs were far away from the crosshairs."

Perhaps the most important items for considering the effectiveness of the information space that was built for the evaluation system are the items which asked about the usefulness of the spatial location of keyterms and documents. As for previous items, there were both positive and negative assessments. As Appendix A shows, respondents were not fully briefed on the construction of the information space or the methods for locating documents relative to keyterms. The term-term and term-document relationships were not clear to many respondents.
For instance, respondent 110402 wrote, "I found some keywords with nothing near them. Neither other keywords nor documents. How can this be?" Respondent 110702 wrote, "couldn't figure out the basis of locations." One respondent clearly had a misunderstanding of the basis for the information space writing, "keywords bunched together, not enough separation. If the keywords are bunched and documents are bunched how can you tell which document goes with which keyword? (111801, emphasis added)." Even with such a misunderstanding, this respondent was able to use the system successfully for the search tasks!

Some respondents had positive things to say about the keyterm and document locations. A statement such as, "they hung together pretty well (120102)," while not a rave, indicates hope for the methods used to build the information space. "Wonderful and creative," was the comment of respondent 111501, who, as other respondents, was not remunerated for his participation in this investigation.

4.4.6 Demographics

The data in this subsection are reported for their value in describing the respondent group. As discussed earlier, a tradeoff was made which resulted in respondents with a high level of sophistication with information systems and computer systems, relative to the population at large.

Table 4.5: Demographic Summary
-------------------------------------------------------------------------
This summarizes the responses to the demographic questions. Appendix B contains the questionnaire.
Gender:                 7 Male, 13 Female
Age:                    Minimum: 22   Maximum: 44   Mean: 29.5   S.D.: 6.6
Education (years):      Minimum: 16   Maximum: 34   Mean: 17.2   S.D.: 1.6*
Familiar with ERIC:     18 yes, 1 no**
Computer experience (0 = low, 10 = high):
                        Minimum: 4    Maximum: 10   Mean: 7.16   S.D.: 1.9
Information system experience (0 = low, 10 = high):
                        Minimum: 3    Maximum: 10   Mean: 7.47   S.D.: 2.2
Relationship between computer experience and information system experience:
                        Pearson correlation: 0.35, non-significant
                        Chi-square: 55.05, p < 0.05***
-------------------------------------------------------------------------
* mean and standard deviation were computed without the 34 year response, resulting in an effective range of 16-20 years.
** 1 missing value
*** Chi-square does not assume that the data are on a ratio scale, as the Pearson correlation statistic does. The scores on the "computer experience" and "information system experience" scales had a positive linear relationship, with only three respondents whose scores for information system and computer system experience differed by more than 2 (the three respondents' information and computer system paired scores were 10,6; 10,7; and 3,10).

4.5 Conclusion

The answer obtained to the first research question ("Is navigation a useful approach for operationalizing information retrieval?") indicates possible ineffectiveness of some components of the particular methods for constructing the system evaluated, but offers support for the conceptual framework for navigation presented in Chapter 1. That is, the description of model-based communication and knowledge of the other necessary for effective navigation through information spaces was given support from the outcomes of the user-based evaluation.
Support came from the many examples of respondents citing familiarity with the (Prism) system, familiarity with the ERIC database, intuitiveness of the (visual) presentation of the data, and statements of increased ease of use due to ongoing experience with the (Space) system. The outcomes seem to indicate the utility of the framework for navigation from Chapter 1, a utility which is not necessarily affected by the utility of the particular methods chosen here for operationalizing an information system based on that framework.

For the second research question ("What perceptions by users engaged in information seeking help to understand the use of navigation for conceptualization of information retrieval and information system design?"), the most important data gathered from users engaged in using the systems have to do with the importance of a model of the system being used. Knowledge of one's status relative to the system and knowledge of how to change that status were strongly related to perceived satisfaction with both the Space and Prism systems as indicated by both open- and closed-ended data. Specific derived qualities for IR systems are discussed in a previous section.

Insight into specific qualities of the Space and Prism systems which were proposed in Chapter 1 (and operationalized in Chapter 3) to facilitate navigation came from the many comments from respondents in the various open-ended questionnaire items. They included knowledge of vocabulary items, the ability to combine vocabulary items for a search, understanding of the structure of the interface, feedback as to what to do next, immediate action from the system when commands are given, and other qualities.

For the evaluation system, navigation cues were given explicitly: the information window and map window gave definite feedback as to one's status relative to the system.
For Prism, navigation cues were either part of the respondent's model of the system (for instance, several respondents used boolean logic, even though no Prism screen indicated such search techniques were available), or were text-based cues as to one's present status, without cues as to how to incrementally change that status. That is, the Prism system was based on the batch model where queries were given and evaluated and then other queries could be given. This is contrasted with the ongoing dynamic search process of the Space system.

CHAPTER 5: DISCUSSION AND CONCLUSION

5.0 Introduction

This work has attempted to accomplish several tasks. The first task was the creation of a conceptual framework within which an understanding of navigation as a fundamental concept for information systems could be investigated within the context of information retrieval (IR) systems. The second task was the evaluation of literature related to navigation for information systems. These first tasks constituted the bulk of Chapters 1 and 2.

From the groundwork laid in the first two chapters, the next two tasks were to construct and evaluate an IR system based on the concepts of navigation for IR in an attempt to answer the research questions from Chapter 1. General specifications for navigable systems were extracted from both the conceptual framework from Chapter 1 and the literature described in Chapter 2. A user-based evaluation was carried out. Chapter 3 describes the steps taken for the construction and evaluation of the navigation-based IR system, and Chapter 4 presents the results of the investigation. This chapter brings the results of all four tasks together to summarize the findings and their implications for IR systems and to identify areas for future work.

The greatest contribution of the work as a whole fits in two areas. First is the generation of a conceptual framework from which to understand human interaction with IR systems.
Navigation, as a fundamental concept for IR, describes the relationship between IR systems and human communication systems, drawing similarities intended to enable IR systems to move towards the capabilities of human communication systems for the purposes of information seeking and use. Navigation was proposed as a step which may be useful in reaching beyond human communication for IR towards the long-term goal of exosomatic memory for IR. The second main area of contribution is the collection of steps for the generation of IR systems. Taken together, the methods described in Chapter 3 for generating an information space and interface for navigating in the space formed an important component of the study. These methods for IR system design, concerned with both representation and interfaces, may now be considered for application in other contexts. This chapter will first summarize the findings of the work and discuss the understanding of navigation which has resulted from the process of this investigation. The chapter will then derive implications for information retrieval systems from the results described in Chapter 4 (considered in the framework from Chapter 1), including some specific design criteria about which IR systems might be built. Then, limitations of the work will be re-stated. The last section looks towards the future of work in navigation for information retrieval, with specific directions for further research.

5.1 Navigation: Derived Understanding

The analysis in Chapter 4 resulted in answers to the study research questions, plus additional illumination on the concept of navigation. The first research question had to do with the usefulness of navigation for operationalizing information retrieval. Navigation was found to be a viable alternative to relevance-based design, but the relative merits of systems based on navigation and systems based on relevance were only partially identified.
The second research question was to look in more detail at user perceptions as they inform the concept of navigation. Chapter 4 summarized many comments, numeric outcomes, and interpretations of the data. This chapter will turn the specific comments and responses to questionnaire items into reflections on the nature of navigation and design criteria for future IR systems. The three main components of this work worked together synergistically. The result of the synergy between the conceptual framework, system design, and evaluation as it related to navigation was an understanding and evaluation of navigation for information retrieval. Overall, navigation was shown to be a viable approach to information retrieval in that the conceptual framework and the system design based on the framework were found to be functional in a working IR environment (as indicated in the analysis of Chapter 4). The definitions for navigation from Chapter 1 were not contradicted. Empirical support for the definitions and the theoretic framework was gained through the language respondents employed to discuss the Prism and Space system components and their experiences with the systems. The concepts and relations among them as operationalized in Chapter 3 and the language of navigation as employed by the users themselves gave support for the idea of navigation through a conceptual information space as a fundamental process of IR. A bonus from the investigation was a greatly enhanced understanding of the importance of models for navigation: it was discovered that new users would employ a generalized model of the system and that they usually focused on the interface qualities in advance of getting to know the database. This has profound implications for system designers and information service providers in that new users can be expected to fuss with a new interface and never understand much at all about the data which it conceals.
The data about the role of the model for successful use of IR systems also indicate an important pitfall for relevance-based approaches to IR, which typically assume a knowledgeable and well-informed user. The visual interface was never proposed here as a necessity for navigable systems, yet there is something which makes window-based interfaces more intuitive to use for many computer users across a wide variety of applications. The interface added to Space's navigability, from a conceptual point of view, by providing information on the data organization. From a practical point of view, respondents were accepting and sometimes enthusiastic about the visual interface and mentioned its intuitiveness. The reliance on the match between information space and cognitive space was only partially supported by this investigation. Some respondents mentioned that database items were located intuitively in the information space but others disagreed. None mentioned that things were "where they were expected to be." The match between information space and cognitive space, while seeming prerequisite for exosomatic memory systems, has an uncertain role to play in navigable IR systems. This section has described some of the specific outcomes of this work which have produced further understanding of navigation for information retrieval. The conceptual framework in Chapter 1 reflects a much larger understanding of navigation than was possessed by the author at the outset of this work. The work completed for this investigation is a good start towards an understanding of navigation for IR.

5.2 Implications for Information Retrieval

This section discusses various implications for information retrieval drawn from the conceptual framework and supported by the system design and evaluation process described in previous chapters.
Although this work alone cannot dictate the direction of information retrieval research and practice, direct consequences of this work may be identified and can play an immediate role for IR system design. Models for information system use were given empirical and conceptual support. The provision of a useful system model was important for system use. Sections in Chapter 1 (especially 1.1.3) drew upon well-established theoretical views on the importance of models for human interaction in ways which could be applied to the domain of information retrieval. During interaction with both Space and Prism, respondents would apply whichever generalized model they possessed to the situation at hand when no other model was supported. For system designers, the goal of providing at the outset a mechanism for model formation for new system users is important. It is also important for a given system to provide clear cues as to how it might diverge from the generalized model users might approach the system with. Given that respondents in this study formed models of the Space system (without being instructed to) which sometimes had inaccuracies (see section 4.3.1), system designers and documentation writers should ensure that any potentially harmful inaccuracies in the models which might be created by users are avoided. To apply this conceptual and empirical knowledge about models to IR system design, designers may do any or all of several things. A first possibility is to provide a useful system model to users before they start to use the system. Second is to make the system conform to the extent possible with pre-existing models which users might possess of the interface, the data, and the process which the system facilitates. Third is to provide specific mechanisms to allow users to know where their generalized model (perhaps derived through interaction with other systems) may be inaccurate.
Fourth is to realize that the interface may be perceived as "the system," requiring either that there be a clear match between the interface and both functionality and data, or that information about the functionality which the interface provides be readily available. Learning the Space system took place from the interface (see sections 4.2.3 and 4.3.1). That is, respondents learned about and described the interface qualities but did not gain much understanding of the database qualities. No respondents commented on the nature of the database or appropriateness of the assigned queries to the ERIC database with the Space system. The Space system, in spite of the differences between the system functionality and the generalized model users possessed of information retrieval systems, was assuredly learnable and usable. This means that people can learn to use new IR systems with fundamental differences from systems they have used in the past. A remaining question is whether completely inexperienced users will be able to use a system based on navigation better than one based on relevance. The relevance-based tradition in information retrieval, which appeared to be the basis for Prism's design, did not account for all the types of uses of IR systems which might have applied, from the point of view of the respondents. Many respondents commented about the appropriateness of Prism for typical relevance-based tasks: when the topic is well known, query formation is not difficult. Many also commented that Prism was not optimal for browsing and hypothesized that Space would be better for browsing (sections 4.3.1 and 4.4.5). Although this has far-reaching implications for IR system design, this work is certainly not the first to support a redefinition or moving away from relevance as a sole basis for IR system design. Other types of systems, some of which were discussed in Chapter 2 (section 2.2), do not focus on relevance-based criteria for system design.
The approach taken here was one possible beginning of a solution for the types of information needs which do not fit well with the relevance tradition, with hope that it could also have implications for information needs which do fit in the relevance tradition. The visual interface for IR was usable, as indicated by the success of the majority of attempted searches. Some respondents expressed keen interest or pleasure in using the visual interface (see section 4.4.5). Previous work in visual interfaces for IR centered on hypertext applications for small databases, often with children as targeted users (discussed in Chapter 2). This work has demonstrated the potential for visualization of information for typical IR system users and purposes. As discussed in Chapter 2, there is a current focus on visual interfaces for all sorts of computer applications. As the need for quickly learnable and intuitive systems increases, it will be more important for IR system interfaces to fit with users' expectations of modern computer system interfaces. The term relations incorporated into the test information space were found to be intuitive to some users but confusing to others. For Prism, though, many respondents commented on the difficulty in coming up with appropriate search terms (note the many requests for a thesaurus) but none mentioned ease in selecting search terms. I interpret this as a definite call for a better match between information space and cognitive space and an indication from at least some users that the notion of "similarity" as a fundamental relationship among keyterms might be appropriate. The data indicate a need for a better understanding of how document representation schemes can be made to relate to information needs and for systems which facilitate the selection of query terms which reflect an information need.
The computational intensity of the Space system, both for construction of the information space and running the prototype system itself, was in excess of what is currently reasonable for a commercial application but acceptable for experimental application. For innovation in system design, this study has demonstrated an important role for the use of advanced computing tools for investigation of the types of systems which may (or may not!) become the standard for tomorrow. Human communication was indicated and supported by this work as an eventual goal for IR systems. This was a surprise to some extent because the conceptual framework and the evaluation methods did not focus heavily on human communication aspects of IR system use. However, Chapter 4 revealed the importance of qualities of human communication systems for interaction with information systems, especially in section 4.3. Some of the empirical outcomes were employed directly to further the conceptual development of Chapter 1. Although this study did not seek to gather direct empirical validation for the notion presented in Chapter 1 that human communication is a reasonable medium-range goal for IR systems, empirical results provided indirect support. This section has described briefly some implications for system designers and evaluators which may be drawn directly or inferred from this work. At the outset, this work was "high risk" because of the experimental nature of all three main components. In this section may be seen the fruit of the seeds which were planted on uncertain ground. Future research and design for IR systems may benefit from the findings of this work.

5.3 Limitations

This section summarizes the most important limitations and shortcomings of this work. An important general limitation is that the overall effort of the work was split between the user-based evaluative component of the study, development of the conceptual framework, and the system design phase.
This was not simply a user-based evaluation study. The conceptual framework could have been further developed to achieve more completeness. The system, too, was not as complete as it might have been, both as perceived by users and in terms of functionality which might have been included. The three main areas of the work, with their associated limitations, entered into a synergistic relationship in this work. To turn the shortcomings of the previous paragraph on their heads: the conceptual framework was tested and furthered through empirical system design and evaluation; the user-based evaluation benefitted from a deeper (and earlier) involvement in the system design process than is typical of IR system design efforts; and the system designed was conceptually-based and empirically tested -- something that some IR systems never see, let alone early in the design process. The remainder of this section addresses specific limitations of each of the three main areas of the work.

5.3.1 The Conceptual Framework

Within the domain of information retrieval, the conceptual framework was reasonably consistent. It had the benefit of being partially derived from well-established bodies of theory and research in information science, communication, and social psychology. The full range of information seeking activities and information uses can be considered from within the framework. To complete the framework will require at least an expansion beyond standard bibliographic retrieval, and will also require further consideration of the progression from relevance-based systems to exosomatic memory. Although the framework might be expandable through conjecture alone, additional empirical work should be done to gather evidence for future directions in the conceptual development. The conceptual framework has the limitation of being incomplete.
It addressed sufficient issues for information retrieval to serve as guidance for the rest of this work, but brought up some concepts which deserve further consideration. The nature of human communication-driven information systems and the qualities of exosomatic memory could be topics of further exposition. So could the nature of cognitive space and the process of cognitive change. The framework was only partially tested within the context of this work. Conclusions were drawn about the usefulness of approximating the qualities of human communication systems and were the basis for most of the framework, yet the conclusions were based on theory from other fields, not from empirical evidence directly related to information retrieval.

5.3.2 The Space System

The methods chosen to construct the information space for the user-based evaluation were not likely to produce the most navigable information space that could be constructed, but instead were chosen to be a do-able project which took a few steps in a direction derived from the conceptual framework rather than a giant leap forward in a more questionable direction. Two important limitations were the lack of ability for concepts and relations to change during interaction and the relative paucity of relations among items in the space. The space was not as dynamic as possible, in that the space did not change at all during the search process -- an information space that matched the cognitive space of its individual users would change as the cognitive space changed during the dynamic communication process. This would have made the system somewhat beyond navigation, and more towards human communication systems for IR. An important dynamic aspect that the Space system did have was the ability of users to take different perspectives on the information space contents like the changing foci of a conversation between humans. The contents themselves, though, did not change their relationships.
Chapter 1 focused on the desirability of isomorphism between an information space and the cognitive space of a particular user or user group (with a particular information need). The only way the system implemented brought the information space closer to a cognitive space, though, was to incorporate relations among all keyterms and documents. While this was argued in Chapter 1 as an important advantage over typical systems which employ independent keyterms, it cannot be claimed to be the same as might be found in human memory. Furthermore, there was no actual test of the extent to which the spatial representation would correspond to a psychometric study of the same concepts. That project, and the associated project of discovering (if they exist) more precise methods for approximating cognitive spaces using statistical methods, was left for the future. The relations in the information space were not as rich as possible in that only "similarity," as measured statistically, was used as the basis for spatial location. Other types of relations to be included in future spaces may not be so easily or automatically gauged. Hypertext applications have begun to investigate different sorts of relations which may be solicited from users; many potential relational types can be identified. The relations, like the information space as a whole, would be subject to change while the cognitive space changes in an ideal scenario. Note that concepts and relations among them remain the fundamental stuff of information spaces, even when the spaces become more complex. The troublesome qualities of spatial locations of documents relative to keyterms were discussed in Chapter 4 (sections 4.4.4 and 4.4.5). Respondents expected that documents would be close to associated terms instead of at the center of some group of terms.
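The placement problem noted above can be illustrated with a minimal sketch. It is not the procedure used to build the Space system: the two-dimensional coordinates, the keyterm names, and the functions single_placement and multiple_placement are all hypothetical, and the sketch simply assumes that a document is placed at the centroid of the keyterms assigned to it. Under that assumption a document can end up several units away from every one of its own keyterms, whereas placing one copy of the document near each keyterm keeps every copy close to a term a user might start from.

```python
import math

# Hypothetical 2-D keyterm coordinates (illustrative values only,
# not taken from the information space built for this study).
KEYTERMS = {
    "navigation": (0.0, 0.0),
    "hypertext": (10.0, 0.0),
    "interface": (5.0, 9.0),
}


def centroid(points):
    """Return the mean position of a collection of (x, y) points."""
    points = list(points)
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def single_placement(doc_terms, space=KEYTERMS):
    """Assumed scheme: one document location at the center of its keyterms."""
    return centroid(space[term] for term in doc_terms)


def multiple_placement(doc_terms, space=KEYTERMS, fraction=0.1):
    """Alternative scheme: one copy of the document per keyterm, moved only
    a small fraction of the way from the keyterm towards the centroid."""
    cx, cy = single_placement(doc_terms, space)
    copies = {}
    for term in doc_terms:
        tx, ty = space[term]
        copies[term] = (tx + fraction * (cx - tx), ty + fraction * (cy - ty))
    return copies


doc_terms = ["navigation", "hypertext", "interface"]
center = single_placement(doc_terms)
copies = multiple_placement(doc_terms)

for term in doc_terms:
    to_center = math.dist(KEYTERMS[term], center)
    to_copy = math.dist(KEYTERMS[term], copies[term])
    print(f"{term}: {to_center:.2f} from the single location, "
          f"{to_copy:.2f} from its nearby copy")
```

The fraction parameter is an arbitrary illustrative choice; any value between 0 and 1 keeps each copy closer to its keyterm than the single centroid location would be.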
Future system design might attempt to locate documents closer to their terms (this could be accomplished by either multiple placement of the documents in the information space or by allowing a space with non-Euclidean properties). For the current study, it became apparent that there was more than one information space -- or perhaps two views on the same space. The information space which was constructed for the user-based evaluation contained documents and keyterms. The keyterms were the foundation of the space located according to statistical similarity measures as described in Chapter 3. Respondents commented that the keyterm locations seemed fairly intuitive: an indication that this component of the information space, at least, had some resonance with respondents' cognitive spaces. The document locations, however, were less well understood by respondents, mainly because documents were not located near associated keyterms in most cases but were some distance away. Documents did tend to cluster conceptually together just as the keyterms did. However, there was no mechanism to go to the locus of a conceptual group of documents -- such a mechanism was present for keyterms only. The information space created for this study consisted of two spaces merged together. The keyterm space was fairly well received by respondents, and by all indications was navigable. The requests of respondents for a thesaurus function for Prism may indicate that the keyterm component of the evaluation space was more navigable than the Prism information space. There is anecdotal evidence (for instance, respondents who said that the Space system's map window was "intuitive") to suggest that the keyterm space was more closely matched to users' cognitive spaces than the Prism space but no explicit evidence (as would be possible from, say, a follow-up psychometric evaluation). 
For the document space (that is, the document component of the information space created for the evaluation), user assessments were inconclusive at best. No one commented on the suitability of the relations among documents and several respondents mentioned difficulty in finding appropriate documents near appropriate keyterms. Further user-based evaluation of the spatial representation methods used for this investigation might yield insight into the document component. The outcome of the evaluation completed here indicates breakdown when moving from the keyterm component of the information space to the document component. In this way, the overall information space may have been less well matched to users' cognitive spaces than would be desired, in that keyterms were not as close as expected to desired documents. Different sorts of relationships to be incorporated in the future might make up for some of the deficiencies of the information space. More likely, the solution will be at the interface level: systems which use the same sort of representation methods for keyterms could retrieve associated documents on demand. For instance, keyterms could be located first, and then related documents would be incorporated according to either the user perspective on the information space or a user query. This method would ensure that documents related to the keyterms in a particular area of the information space would appear there.

5.3.3 The Empirical Study

An important shortcoming of the empirical portion of this work was that some of the perceived drawbacks of the system might have been avoided, were a user-based study of the conceptual framework carried out in advance of system design. Pre-testing of the survey instrument pointed to some of the less desirable system "features," which were then fixed. Remaining drawbacks existed in the form of omissions of expected functionality for Space, such as the lack of Boolean-type searching.
In the end, it seems that some of the comments about basic system functionality and expectations might have pre-empted more informative comments about navigation, model building, etc. User-based evaluation of the prototype system was found to be viable but problematic. Respondents could use the system and were understanding of its shortcomings. However, the lack of expected functions in the system was prone to comment by users, possibly at the expense of comments dealing with the overall system functionality or on the functions which were available (in this case, many users expected to be able to point-and-click on desired documents, or wanted to use multi-term queries to re-locate in the space). One method to get around this would have been a small-scale user-study in the beginning stages of system design with the intent of understanding user expectations and then adding them to the system design. This approach might have led to a more favorable evaluation of the Space system but it would also have led to (presumably) a reification of what current features IR systems possess, and resulted in few or none of the non-traditional features demonstrated by Space. User-based system design and user-based evaluation have a role for IR systems but need to be used cautiously with prototype systems which differ significantly from the generalized model of such systems possessed by a respondent group. User-based methodologies may be employed for circumstances where an information need exists, but are more difficult to apply when the respondent is in a situation where he or she does not recognize the process involved. As stated previously (section 3.5) there was potential for the prototype to be less favorably evaluated because it did not meet expectations and also to be more favorably evaluated simply due to its unusualness. Several outcomes discussed in Chapter 4 lent empirical support for both of these types of bias.
Neither the research questions nor the general approach to data collection changed during the progression from the proposal to the completed work, although perhaps they should have. It was not possible, after the data were collected, to go back and ask additional questions about, say, models which respondents had of the Space system before and after the research, in spite of the value that such data might have added. Methods more in accord with the final conceptual framework would have yielded results which would have enabled more development of the framework than is found in this work. Since the final framework was derived partially from the results of the methods as they were applied, though, this would be impossible with only one data collection phase. Other ways to improve the method include drawing respondents from a more diverse population (which was foregone because prior experience with Prism-like systems proved more important than was previously thought to be the case during pre-testing), and a move towards more specific criteria for IR systems and away from such generic measures as "satisfaction" (this, also, was not anticipated, as generic measures seemed more in accord with typical IR evaluations than the criteria discussed in Chapter 1). In spite of the problems with the evaluation, the data collected formed a critical component to this work. Without the empirical study, the conceptual framework would have remained conjecture and the prototype system a simple showpiece. The data collected enabled much of the analysis completed in Chapter 4 and added value to the conclusions which have been drawn for navigation and IR system design.

5.4 Future Work on Navigation

This work has answered the research questions it sought to answer but, as anticipated in Chapter 1, not with answers of "yes" or "no." The answers were both less decisive than might be desired and more far-reaching.
The answers to the research questions leave open the specific methods to be taken for building navigable information systems. At the same time, support and insight into the conceptual framework for navigation developed in Chapter 1 has been built, especially for the role of models of the other for successful navigation of information systems. This support is not taken from individual questionnaire items, but across items, both open- and closed-ended. Future work will go beyond the questions asked in this work, drawing on the experiences gained. There are three main directions for future research based on this work. The first has to do with system design, the second with the conceptual framework, and the third involves more refined information spaces. Continued development of new information systems based on the principles of navigation is a major direction for further work. Actual system building is the surest test of the concepts of navigation. Insight into the importance of the role of models for effective navigation will lead to evaluation of the effectiveness of various types of models and testing of various methods for providing needed models. Perhaps a more refined model of the Space system would have led to a very different evaluation of the system. The role of navigation cues should also be investigated further. Moving beyond navigation, towards human communication and exosomatic memory systems, can be started with steps towards navigation. Derived research questions of immediate interest include:

- What environmental cues do people use during interaction with an IR system?
- Are navigation cues situation-dependent?
- How is the definition and/or operationalization of navigation different for browsing versus known-item or more directed information needs?

The refinement and testing of the conceptual framework for navigation developed in Chapter 1 could possibly lead to the development of a formal theory of navigation for information systems.
It is too early to tell whether the steps taken on the road to theory in this work will prove to have been in the right direction or even on a path with a sound foundation. However, as Chapters 1 and 2 show, navigation is clearly an important concept for information systems and is in need of additional understanding. Refinement can come from further empirical evaluation including the derivation of hypotheses generated from ongoing insight into navigation processes. Refinement can also come from further consideration of the epistemological and ontological bases of possible theories of navigation. Incorporation of existing paradigms for social, behavioral, and computational behavior, too, can lead to a richer theoretical basis for navigation. The conceptual framework may be investigated and developed further through investigations driven by research questions such as these:

- Users of information systems employ words related to movement through physical space. To what extent is interaction with an information space perceived to have the same qualities as movement through physical space for information system users?
- Can people find their way more easily around an information space which they have arranged themselves than one designed by someone else?
- Are IR systems perceived as being "better" than other systems (as described by users) found to be better for reasons consistent with the assumptions about navigation for IR made here?

Finally, more work is needed on the development of information spaces which more closely match the cognitive spaces of users. If the eventual goal of exosomatic memory is to be achieved, a firm understanding of the cognitive structures and cognitive processes of users during information seeking behavior is needed. There are many facets to cognitive movement and many possible representation schemes for information spaces which may better match dynamic cognitive spaces.
Future investigations might be based on psychometric methods for the measurement of cognitive spaces, on studies of system use which yield user protocols during actual use, or on the continued refinement of relational models of database construction based on theorized cognitive processes. Research questions in this area include:

- To what extent does "similarity" play a role in the perceived relationships among concepts in an information space?
- What are the most important qualities relating concepts in an information space, and how does the relative importance of each quality change in different situations?
- What statistical or other automatic methods can be used to generate relations among concepts which are more in accord with relations as perceived by users or user groups?

This section has looked towards some possible futures for the role of navigation for IR. The research questions sketched out here are beginnings only. Further work will produce more questions and new directions for investigation. At this time, there are some clear directions to be explored on the way to more navigable systems and eventually to human-communication based systems. As answers are obtained, and the more fruitful paths are identified, further research questions will be derived.

5.5 Conclusion

This work has laid the foundation of an area of research for a theoretic approach to navigation for information retrieval which has not previously been cogently stated in the literature on IR or elsewhere. The claim for navigation as a fundamental concept for information retrieval and information system design was supported by this work. Future theory and research may reject or maintain the framework presented here -- regardless of which, this work can serve as a stepping stone to better understanding of navigation as a fundamental concept for information retrieval and information system design.
It appears that navigation may be the path towards human communication systems for information retrieval and from there perhaps towards exosomatic memory. This work has taken some small steps along that path.

APPENDIX A: TRAINING PACKET

APPENDIX B: QUESTIONNAIRE

Task Sheet 1A

For this task, you will try to find some citations to match a query using the "Prism" system.

1. Please write the time: _____:_____

2. The attached set of words or phrases describe an "information need." Use them to guide you in a search for document citations which pertain to the information need. Try to find at least one document which would meet this information need:

    *** Do not go on until you have found at      ***
    *** least one citation, or until the research ***
    *** assistant indicates you should go on to   ***
    *** the next task.                            ***

3. Write the year, author, and title for the document you found which best meets the information need. Write any additional document citations on the back of this form:

    Year: __________ Author: _____________________________
    Title: _________________________________________________

4. Write the time: _____:_____

5. On the scale below, indicate how SATISFIED you are with the outcome of your search:

    very dissatisfied  0 1 2 3 4 5 6 7 8 9 10  very satisfied

6. What was it that gave you the degree of SATISFACTION that you had?

    ____________________________________________________________
    ____________________________________________________________

7. On the scale below, indicate the extent to which you knew WHERE YOU WERE during your search (as opposed to being LOST):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

9. On the scale below, indicate the extent to which you know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

10. What was it that helped you to know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    ____________________________________________________________
    ____________________________________________________________

11. If you want, use the space below to write any comments, feelings, or reflections you have about this task.

    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________

Task Sheet 2A

For this task, you will try to find some citations to match a query using the "Space" system.

1. Please write the time: _____:_____

2. The attached set of words or phrases describe an "information need." Use them to guide you in a search for document citations which pertain to the information need. Try to find at least one document which would meet this information need:

    *** Do not go on until you have found at      ***
    *** least one citation, or until the research ***
    *** assistant indicates you should go on to   ***
    *** the next task.                            ***

3. Write the document number for the document you found below. Write any additional document numbers on the back of this form.

    ____________________________________________________________

4. Write the time: _____:_____

5. On the scale below, indicate how SATISFIED you are with the outcome of your search:

    very dissatisfied  0 1 2 3 4 5 6 7 8 9 10  very satisfied

6. What was it that gave you the degree of SATISFACTION that you had?

    ____________________________________________________________
    ____________________________________________________________

7. On the scale below, indicate the extent to which you knew WHERE YOU WERE during your search (as opposed to being LOST):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

9. On the scale below, indicate the extent to which you know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

10. What was it that helped you to know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    ____________________________________________________________
    ____________________________________________________________

11. If you want, use the space below to write any comments, feelings, or reflections you have about this task.

Task Sheet 1B

For this task, you will try to find some citations to match a general information need using the "Prism" system.

1. Please write the time: _____:_____

2. The attached paragraph describes an "information need." Try to select some document citations which might meet the need.

    *** Do not go on until you have found at least one ***
    *** document citation or the research assistant    ***
    *** indicates you should go on to the next task    ***

3. Write the year, author, and title for the document you found which best meets the information need above. Write any additional pertinent document citations on the back of this form.

    Year: __________ Author: _____________________________
    Title: _________________________________________________

4. Write the time: _____:_____

5. On the scale below, indicate how SATISFIED you are with the outcome of your search:

    very dissatisfied  0 1 2 3 4 5 6 7 8 9 10  very satisfied

6. What was it that gave you the degree of SATISFACTION that you had?

    ____________________________________________________________
    ____________________________________________________________

8. On the scale below, indicate the extent to which you knew WHERE YOU WERE during your search (as opposed to being LOST):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

9. On the scale below, indicate the extent to which you know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

10. What was it that helped you to know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    ____________________________________________________________
    ____________________________________________________________

11. If you want, use the space below to write any comments, feelings, or reflections you have about this task.

    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________

Task Sheet 2B

For this task, you will try to find some citations to match a general information need using the "Space" system.

1. Please write the time: _____:_____

1a. Will you use the _____ glove or _____ mouse?

2. The attached paragraph describes an "information need." Try to select some document citations which might meet the need.

    *** Do not go on until you have found at least one ***
    *** document citation or the research assistant    ***
    *** indicates you should go on to the next task    ***

3. Write the document number for the document you found. Write any additional document numbers on the back of this sheet.

    ____________________________________________________________

4. Write the time: _____:_____

5. On the scale below, indicate how SATISFIED you are with the outcome of your search:

    very dissatisfied  0 1 2 3 4 5 6 7 8 9 10  very satisfied

6. What was it that gave you the degree of SATISFACTION that you had?

    ____________________________________________________________
    ____________________________________________________________

8. On the scale below, indicate the extent to which you knew WHERE YOU WERE during your search (as opposed to being LOST):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

9. On the scale below, indicate the extent to which you know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

10. What was it that helped you to know WHERE YOU WANTED TO GO during your search (as opposed to NOT KNOWING where to go):

    ____________________________________________________________
    ____________________________________________________________

11. If you want, use the space below to write any comments, feelings, or reflections you have about this task.

    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________

Impressions

This sheet is to provide additional feedback concerning the Information Systems you used.

A. Focus on the "Prism" system.

1. Please describe the "Prism" system in your own words

    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________

2. On the scale below, indicate your degree of SATISFACTION with the "Prism" system overall:

    a very small degree  0 1 2 3 4 5 6 7 8 9 10  a very large degree

3. On the scale below, indicate the extent to which you knew WHERE YOU WERE and WHERE YOU WANTED TO GO while using "Prism."

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

4. Briefly describe the types of situations in which you might prefer to use "Prism."

    ____________________________________________________________
    ____________________________________________________________

B. Focus on the "Space" system

1. Please describe the "Space" system in your own words

    ____________________________________________________________
    ____________________________________________________________
    ____________________________________________________________

2. On the scale below, indicate your degree of SATISFACTION with the "Space" system overall:

    a very small degree  0 1 2 3 4 5 6 7 8 9 10  a very large degree

3. On the scale below, indicate the extent to which you knew WHERE YOU WERE and WHERE YOU WANTED TO GO while using "Space:"

    a very small extent  0 1 2 3 4 5 6 7 8 9 10  a very large extent

4. Briefly describe the types of situations in which you might prefer to use "Space:"

    ____________________________________________________________
    ____________________________________________________________

5. Use the spaces below to assess the usefulness of some of the "Space" system components:

    A. The "Vocabulary" window: ______________________________
       _______________________________________________________
       _______________________________________________________

    B. Finding keywords with the "F" key: ___________________
       _______________________________________________________
       _______________________________________________________

    C. The "Map" window (upper right corner): _______________
       _______________________________________________________
       _______________________________________________________

    D. Spatial location of keywords: ________________________
       _______________________________________________________
       _______________________________________________________

    E. Spatial location of documents: _______________________
       _______________________________________________________
       _______________________________________________________

6. Finally, please use the space below to write any suggestions or other comments you have for the "Space" system.

Post-Study Information

In order to more fully assess the role of demographic factors in information system use, please complete the following brief personal data sheet. Do not write your name on this form.

1. Please circle your gender: Male Female

2. Write your age: ________

3. Circle one number on the scale below to reflect your experience with computers:

    none  0 1 2 3 4 5 6 7 8 9 10  lots

4. Circle one number on the scale below to reflect your experience with computerized information systems, such as online card catalogs (LCS, Carl) or CD-ROM systems (ERIC, WilsonDisc):

    none  0 1 2 3 4 5 6 7 8 9 10  lots

5. Are you familiar with the ERIC database (if you do not know what ERIC is, check "no")? Check one:

    _____ yes, primarily in print form
    _____ yes, primarily in electronic form (online or CD-ROM)
    _____ no

6. Have you ever used the Prism or Spires database management systems (if you do not know what these are, check "no")? Check one:

    _____ yes
    _____ no

7. Years of education completed: ________ (examples: 12 = high school, 16 = college grad)

8. Profession, major, or field of study ________________________

APPENDIX C: INFORMATION NEED STATEMENTS

12/01/91 gbn

Codebook for Query ID

These are the queries which were given to respondents. They were attached to the task sheets (in random order).
QueryID   Query
--------- ------------------------------------------------------

For Form 2:

01        managing technology economic factors
02        managing management style communication
03        I am interested in planning financing for my child's college education.
04        I am developing a training program to increase the effectiveness of teachers.

For Form 1:

05        managing replacement of technology economic factors strategy
06        managing happiness in the workplace activities outside of work youngsters
07        I am interested in planning financing for my child's education, probably in a public university setting.
08        I am developing a training program to increase the effectiveness of teachers, mostly in university settings.

Pretest queries:

          administration recreation facilities planning
          I am investigating some different approaches to financial planning programs for beginning college students
          agriculture industry forecasting
          What are some approaches to evaluating the performance of public employees?
          government roles unions outdoor education
          What are some of the qualities that differentiate between different student groups?
          teaching methods practical skills supervision
          What government agencies play a role in the creation and maintenance of standards for public employees?

APPENDIX D: INSTRUCTIONS TO THE RESEARCH ASSISTANT

October 25, 1991
Greg Newby

Instructions to Research Assistants for Evaluation of the "Information Space" System

1. Overview

You are in charge of interacting with respondents throughout the research process. The intent is for the instrument to be self-administered, but there are some points at which you will need to provide guidance or offer help or instructions.

2.0 General

2.1 Formality

Even for respondents whom you know, it is important to keep a serious atmosphere during the research.

2.2 Scheduling

When possible, schedule respondents in advance. This will increase the perceived credibility of the research and the researchers.
Give people a phone number (or email address) to contact in case they will be late or absent.

3.0 Before the research

3.1 Prepare the surroundings

Make sure the room is neat and clean. Arrive a few minutes before the expected appointment. Have pens & pencils and a clean scratch pad near the Iris.

3.2 Prepare the equipment

Plug in the PowerGlove, log into the Iris (username 'navigate') and run the 'setup' command. Also open a clock window for timing (choose a clock from the Tools menu at the top left of the screen). Open the Prism window (type 'prism') and log in to suvm.acs.syr.edu as gbnewby (use the 'tn3270' command). Then, click in the upper-left corner of the window to stow it for later use. Test the system by running a demo program or the space program -- do this even if you just used the system.

3.3 Prepare the instrument

Make sure all pages are in order and intact. Write the respondent number at the top of each page in the data packet. There are two tasks for each system (four tasks in total). There are also four sets of queries/keywords which respondents will attempt to satisfy. You need to tape or otherwise affix the correct query or set of keywords to each of the four blanks in the data packet. The tasks should alternate in two ways:

1. Some users do the Prism tasks first, others do the Space tasks first (in all cases, both tasks are accomplished for one system before going on to the next).

2. Alternate which query is matched with which system. That is, give one set of keywords with Prism for one respondent, and with Space for the next respondent.

4.0 During the research

4.1 Pre-briefing

Introduce yourself and briefly describe your role. Explain that you will be in the room (or perhaps up the hallway) to answer questions and time some of the steps. Also explain that the instrument is meant to be self-administered -- that is, that you are not supposed to give much help. Give the respondent their research packet and ask them to read through the Consent Form section.
If they have no questions after the Consent Form, respondents can proceed through the rest of the research.

4.2 Measurement

You are responsible for ensuring that people do not take more than 5 minutes on each of the four tasks. Be firm about this, please. A stopwatch is available. In addition, measure how long people practice with each system before moving on. Write a time for both the Prism and the Space systems on the first page of their data packet.

4.3 In between

Help respondents with cleaning up windows and restarting the system, if needed. Note that the space command to start the research is not in the packet -- you have to tell them.

4.4 Giving assistance

People may need help or clarification. Offer whatever you know, and make a note of the question asked. Make sure respondents indicate on their data sheets any uncertainties or problems they have (under the Comments section).

5.0 After the research

5.1 Thanks

Thank the respondent. Ask if they have any questions. If necessary, offer to have me call them to answer anything you don't know. Make sure they have provided all the information and opinions they have to offer on the research on the forms (or, write down additional comments yourself). Respondents can keep the first part of the Research Packet. Make sure you have all the data.

5.2 Clean-up

Make sure all data sheets are in order, with respondent numbers. Unplug the PowerGlove and close any extra windows on the Iris.

6.0 Specific instructions

6.1 Initializing

The Iris should be ready to log in (or already be logged in) as user 'navigate' (all in lower case). Passwords for 'navigate' (and also for 'root') are hidden in the slide-out table ends of my desks. Kent Yates can help if strange system errors occur.

The PowerGlove just needs to be plugged in (the blue plug goes in the power box). It beeps twice, then the lights should start to flash. If the lights stay out, try waving the glove in front of the monitor. If there are still no lights, unplug and replug the glove.
In the 'navigate' username, there are several special commands to get around.

setup
    will open the window to be used for the 'space' system.

space
    will run the navspace program, loading in ERIC data which include all descriptor terms.

demo, demo2, and demo3
    are programs which respondents use in learning the navspace system. These are all stripped-down versions of the full program.

prism
    opens a window which can be used to connect to suvm.acs.syr.edu (log in as gbnewby; the password is written in the same place as the 'navigate' password).

Important: Only windows opened with the special 'wsh -v' command will work correctly with tn3270!

Use the Tools window to open a clock (I like the square clock). This will be for self-timing by respondents.

6.2 Respondent numbers

Respondent numbers will be based on the date. Each six-digit number will include the month and day, and also the respondent number for that day. The first respondent on October 30 would be 103001. The fourth respondent on November 2 would be 110204, and so on.

7.0 Troubleshooting

Occasionally, the pop-up windows in the space system will not accept input from the keyboard. If this happens, try clicking (left mouse button) on the upper menu bar for the window. If that doesn't work, quit the system (right mouse button) and restart.

If a PowerGlove button is pressed, the glove may leave the special raw-data mode. In this case, reboot the glove (all Iris space programs must first be stopped).

If the respondent moves windows, they may still work correctly. If they are moved a lot, quit the system and restart.

Hopefully the screen won't freeze. If it does, log in to gpx via alexia and type 'gclear'. That should put the cursor at the center of the screen. If that doesn't work, use 'ps -elf | grep navigate' to identify the process numbers which are causing trouble, then use 'kill -9 nnn' to kill those processes (where 'nnn' is the process number from the 'ps' output).

The network to SUVM seems to be unreliable.
You could get logged out at any time. If this happens, try to re-establish contact by giving the 'tn3270' command again. If it still fails, have the respondent first accomplish all the Space system tasks, then, if necessary, re-schedule the second half of the research.

11/3/91 Additional Instructions

1. Check the data packet before the respondent arrives. Write respondent numbers on all data sheets at this time. In particular, ensure that the ORDER of the tasks is correct (all Prism tasks together, and all Space tasks together -- not mixed up).

2. When you attach the "Query" to the task form, the query number to the right of the page should also be included. This will make the task of entering data much easier. The 10a, 11b, 21a and 22b numbers should appear on the data forms.

3. Check all packets for completeness BEFORE the respondent leaves. If there are missing data, find out why (there were some empty pages in the data collected thus far).

4. I have created TWO new pages for each data packet. The first is a cover page, which gives instructions for using each system in "real life." The second new page is to be added as a third page to the "Impressions" section of the data packet. It will solicit additional information about the Space system.

APPENDIX E: INFORMATION SPACE COORDINATES

Section 1: Coordinates of 264 ERIC "descriptors" and "major descriptors." Each concept label is followed by its associated X, Y, and Z coordinates.
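As a hedged illustration of the row layout just described, the short Python sketch below reads rows of the form "label X Y Z" into a dictionary and measures how far apart two concepts lie. The function names parse_coordinates and distance are hypothetical, and the use of straight-line (Euclidean) distance as a measure of nearness is an assumption made here for illustration only; this appendix does not state how the coordinates are to be compared. The three sample rows are copied from the coordinates listed in this section.

```python
import math

# Three rows copied from this appendix, in the "label X Y Z" layout.
SAMPLE = """\
information_retrieval 3.549459074 -3.610387557 -0.075972946
information_services 4.492265527 -3.580995163 0.507705648
parent_role -2.730860070 0.641941590 1.271855326
"""


def parse_coordinates(text):
    """Return {label: (x, y, z)} for whitespace-separated 'label X Y Z' rows.

    Hypothetical helper: rows that do not have exactly four fields are skipped.
    """
    space = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # blank or malformed row
        label, x, y, z = fields
        space[label] = (float(x), float(y), float(z))
    return space


def distance(space, a, b):
    """Euclidean distance between two concept labels (an assumed measure)."""
    return math.dist(space[a], space[b])


if __name__ == "__main__":
    space = parse_coordinates(SAMPLE)
    for other in ("information_services", "parent_role"):
        d = distance(space, "information_retrieval", other)
        print(f"information_retrieval -> {other}: {d:.3f}")
```

Under this assumed measure, information_retrieval lies much closer to information_services than to parent_role, which is the kind of relationship a spatial layout of concepts is meant to make visible.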
academic_achievement -2.580954377 0.414155247 5.796693606
academic_libraries 5.731512264 2.016532036 -4.117546701
accountability -0.030831234 2.225506250 0.376203341
accounting 0.454225056 -2.792287499 -0.701991926
administration -1.146144023 -0.638811788 -0.161766949
administrative_organization 5.820468727 5.550367376 0.644107130
administrator_attitudes 0.754446022 4.321078278 -0.072769178
administrator_characteristics -1.152811367 3.675644703 -2.300916897
administrator_effectiveness -0.173926798 5.613046751 -2.563297738
administrator_evaluation -1.105588428 3.296259784 -1.270847064
administrator_qualifications -0.540672161 2.101663601 -1.692637910
administrator_responsibility -1.016610397 1.627842392 -1.401748442
administrator_selection -1.368574562 2.232072512 -0.695469960
adolescents -2.449667349 -1.159133979 0.921676948
adult_education -0.505819415 -1.046159033 0.223771329
agricultural_education -1.142290994 -1.609420110 -1.154923600
agricultural_production -1.506549780 -2.110796577 -2.047233780
anxiety -2.832174577 -1.294764988 2.555819957
automation 5.283444349 -3.601417410 0.417142416
banking -2.047690443 -1.899772295 -1.038409480
behavior_change -3.100272566 -0.983185622 4.209408193
behavior_disorders -2.632858578 -1.125405984 2.654058779
behavior_modification -4.067907394 -1.559779705 9.232134594
behavior_patterns -1.924745077 -0.618319234 0.040690727
behavior_problems -3.567790155 -1.160095036 5.180087845
budgeting 0.188867819 -2.552884910 0.126821298
burnout -1.291296519 -0.423163444 -0.160052830
business_administration 3.763320366 -0.159353079 -3.559127794
business_administration_education -0.885175818 -1.896088867 -3.628686954
business_education -0.261181375 -2.263384917 -0.508042852
career_development -1.283286155 -0.357541097 -0.913337262
career_education -1.655777873 -0.667572130 -0.679444721
change_strategies 6.165981784 8.255250823 0.310691925
child_rearing -2.958515022 -1.340896596 1.177053959
children -2.222426416 -1.047520620 0.673940908
classroom_techniques -4.651826540 -0.604075488 15.506520494
cognitive_style -0.991644598 -1.417940565 -0.665245685
collective_bargaining -0.585282451 0.095048602 -2.142935673
college_faculty 1.631896716 0.993299291 -0.288617948
college_planning 1.323894358 -0.739701395 1.186975179
college_students -0.687402341 -2.205530097 7.110936330
communication_skills -1.308483572 -0.357102183 -1.259323651
community_colleges -1.691775359 -1.176264233 -0.168248788
competence -1.120639292 -0.862238604 -1.730921725
competency_based_education -0.752852787 -1.074289730 -1.385306901
computer_assisted_instruction -0.015194385 -2.442957189 0.312035677
computer_literacy -0.364120422 -2.541008086 0.261778774
computer_managed_instruction -0.454765022 -1.780624262 1.570771389
computer_networks 4.102022926 -2.661144536 1.532027518
computer_simulation 0.244174861 -3.281220625 0.349834977
computer_uses_in_education 13.385928949 -7.085129356 11.823103923
computers 3.764065758 -3.735010762 0.938941175
construction_management -1.263700379 -0.359802792 -0.340772350
consumer_economics -3.265109839 -2.313737659 -1.354366644
consumer_education -2.901721404 -2.619298493 -1.139797179
cooperation -0.587939369 1.335064852 0.099549410
corporate_education -0.911433777 0.712520035 -4.610712106
cost_effectiveness 0.005283199 -1.424770192 -0.377166545
counseling_techniques -2.392275301 -1.293876845 0.202062315
curriculum_development 1.671877586 1.096921984 1.932771197
daily_living_skills -2.309382913 -1.603866116 -0.268796052
data_collection -0.337975428 -1.687850887 1.180447607
data_processing 1.976733723 -2.194986373 0.552401543
databases 6.486162643 -6.491017832 3.105428756
day_care -1.266541828 0.229230229 -1.102029924
decentralization 6.548764373 14.734738954 10.320447564
developing_nations -1.702508796 -0.784027303 -1.418691172
disabilities -2.852835232 -0.819785548 10.795801329
discipline -1.548579612 0.591473442 3.062519184
dual_career_family -2.568057557 -1.288235126 -0.989761023
economics_education -1.722370739 -1.765091654 -0.385083010
educational_change 3.018733310 14.263501279 6.101690356
educational_environment -0.283441507 2.926802958 2.594202749
educational_finance 0.953414339 -5.461051170 -0.387101263
educational_games -1.519028978 -1.712780570 -0.114850045
educational_improvement 1.985819551 14.340643604 4.261368409
educational_needs 0.011212912 0.530248447 -3.169738989
educational_planning 1.589973516 4.453137645 2.315532019
educational_quality -1.450333762 -0.225643081 0.198586966
educational_technology -0.432938299 -1.242384755 -0.725808703
efficiency 0.168903140 0.626446454 -1.189827339
employed_parents -3.718887885 -1.035801594 -2.332046917
employed_women -2.131874703 -0.655071699 -1.513653151
employee_attitudes 0.126153753 0.671127573 -5.009844288
employer_employee_relationship 3.661277975 4.636325084 -9.822440528
employment_practices 1.359507791 2.124382941 -4.912390985
endowment_funds -1.436922805 -2.016415010 -0.915162491
energy_conservation -1.448277057 -0.766938823 -0.089573731
energy_management -1.086996073 -0.634723146 -0.171665801
entrepreneurship -1.553529417 -1.967027274 -3.106278133
environmental_education -1.881378121 -0.502565261 -0.137980359
evaluation_criteria 0.942680812 0.437661644 -1.375235886
evaluation_methods 0.390176300 2.754131859 -0.282646733
experiential_learning -1.324599309 -0.616494677 0.016141944
extension_education -1.467983583 -1.049432238 -0.767914810
faculty_development -0.404839474 0.501838348 0.193636004
family_financial_resources -1.819823194 -1.846945707 -0.388447552
family_life_education -2.984310652 -1.694400016 -1.253971686
family_problems -2.325605999 -1.127691634 -0.213751274
farmers -2.024681767 -1.055539928 -2.043109939
federal_government 2.477413679 -1.551453244 -0.396331515
feedback -2.459936077 0.093822639 4.058235662
females -2.474931093 -0.873673267 -0.734213917
financial_policy 0.039075403 -3.640260312 -0.155247638
financial_problems -0.728505386 -1.072804992 -0.808180785
financial_services -1.773696076 -1.788099309 -1.123304337
goal_orientation 0.608339021 0.265280175 -1.384697762
governance -0.739905929 0.680642391 0.560223041
group_dynamics -0.736645038 1.531501889 -1.372037261
higher_education 1.528271012 -2.445156212 0.589494483
home_economics -2.077428674 -1.254833743 -0.718197615
human_resources 2.353120224 -0.868370867 -2.272457345
income -1.896202000 -1.844645269 -1.293520407
industrial_training -1.311627024 0.155469699 -3.698864715
industry -0.575569309 -0.229492883 -0.799431821
information_dissemination 1.845898415 -1.654377843 -0.589614455
information_management 34.045002552 -5.948207633 1.898958044
information_needs 0.513817416 -2.691171186 0.858045132
information_networks -0.711429875 -0.888433275 -0.467459443
information_processing 1.394882418 -3.586710512 -0.360802462
information_retrieval 3.549459074 -3.610387557 -0.075972946
information_services 4.492265527 -3.580995163 0.507705648
information_storage 1.462552927 -2.210387132 -0.292637168
information_systems 6.134355461 -2.520524635 -0.916721926
information_technology 22.675474267 -5.676857122 4.794135919
information_utilization -0.870925374 -1.563073655 -0.124906902
inservice_education 1.721952515 6.288992564 2.974171105
inservice_teacher_education -0.795099900 0.753599861 -0.074099154
institutional_autonomy -0.201681454 3.780775728 2.269972874
institutional_research 0.911138957 -1.136189008 0.921855025
instructional_effectiveness -1.701669387 -0.694567389 2.950063465
instructional_improvement -0.941070795 -0.169443764 1.845864108
instructional_leadership 0.059387409 8.073415625 -1.045917756
interpersonal_communication -1.121835426 -0.035737634 -1.843353412
interpersonal_competence -1.133090566 -1.298585318 -0.251731816
intervention -2.290418842 -0.452110410 0.404807572
investment -0.851795249 -2.715621978 -1.209892364
job_performance 0.743069892 2.259822115 -4.639118416
job_satisfaction -1.251421224 0.165863017 -1.585475095
job_skills -0.603666664 1.794370488 -2.437428565
job_training -0.326574725 -1.109226179 -1.153572315
labor_relations -0.182133403 1.295200061 -5.306506048
leadership -0.250485346 1.042262367 -0.398861442
leadership_qualities -0.545034013 4.947700100 -2.004200841
leadership_responsibility -0.612636856 5.747138003 -1.440032646
leadership_styles -0.134196512 3.521533531 -2.309562228
learning_activities -1.589579443 -1.156231080 0.177305914
learning_disabilities -3.481094960 -2.227692387 3.337396947
learning_strategies -2.164107348 -1.441302448 0.284820942
librarians 0.346814813 -0.598496614 -1.401012672
library_administration 7.071757965 2.139212797 -6.230443176
library_automation 2.893555836 -1.025739855 -1.696838197
library_personnel 1.269653379 1.030597146 -3.963397961
life_style -1.897884851 -1.005027786 -0.527276111
loan_repayment -1.514077066 -1.690393574 0.254749277
long_range_planning 2.559203339 0.426261020 0.296409681
management_by_objectives 5.754562728 6.621345675 -2.673769391
management_games -0.120835609 -2.089789831 -0.594130302
marketing -1.998751288 -1.634631039 -0.749024852
mental_health -1.449372001 -0.575306984 -0.768541763
mental_retardation -2.237421866 -0.926044007 0.954721836
middle_management -0.067858880 3.607177224 -4.939291047
models -0.059546061 -0.838863783 -0.046801136
mothers -2.986684916 -1.250769924 -0.996945415
needs_assessment 0.896374188 3.172491333 -2.355940984
occupational_information -0.821422064 -1.146509664 -0.205063536
office_occupations_education 1.377332758 -3.668353760 -0.366441763
older_adults -1.914160581 -1.185869559 -1.336553278
online_systems 4.495963056 -3.340963806 0.733022838
organizational_change 5.821837485 1.829685078 -1.783213347
organizational_climate 3.021191511 2.508730328 -2.236390133
organizational_communication 1.992276017 0.934270791 -1.092267180
organizational_development 1.233692174 2.718825753 -2.293828868
organizational_effectiveness 3.567260503 2.540399507 -3.123322069
organizational_objectives 2.353813614 -0.311837370 -1.349173153
organizational_theories -0.579998952 0.468775857 -0.960766162 parent_child_relationship -2.696347947 -1.405079262 0.852506408 parent_education -3.047709746 -0.718605325 1.723164525 parent_financial_contribution -0.665114294 -2.831086661 -0.168515588 parent_role -2.730860070 0.641941590 1.271855326 personnel_evaluation 0.025126351 2.344492626 -3.006553623 personnel_policy -0.280662916 0.692020046 -1.627439470 personnel_selection -1.673119593 -0.619292533 -0.980599435 planning 0.091536379 0.766439852 0.047384577 policy_formation 4.734838811 -1.230278805 0.828020243 positive_reinforcement -3.757209389 -1.015636098 6.657316430 power_structure 1.833897499 0.509317808 -1.070605997 prevention -1.392078462 0.297219425 -0.066496051 problem_solving -0.963921289 0.260337730 -1.774952710 productivity 1.511935229 2.202585782 -5.690112849 professional_continuing_education 0.080644778 0.813023760 -4.097361275 professional_training -0.514281376 -0.788360163 -0.209302646 program_administration 0.098700441 -1.668840215 -0.174141066 program_design -0.637107499 0.282453666 0.171994491 program_development 0.794091822 1.670285097 -0.907221264 program_effectiveness -1.260505886 4.078402656 -1.398803225 program_evaluation 0.822182222 1.971289388 2.513786595 program_implementation -0.550348239 2.629351136 2.941364127 purchasing -1.742474045 -1.306632957 -0.858950437 recordkeeping 0.732176720 -2.376860218 0.654891350 records_management 0.467172500 -2.222240506 0.454603929 relaxation_training -3.698218797 -0.536893578 4.910256401 research_libraries 1.119769422 -0.130302818 -2.979517060 research_methodology -1.072346652 0.734273000 -1.147884964 research_needs -1.954926224 0.536844836 -0.793050660 resource_allocation -1.272727135 -1.349194456 -0.090955444 retrenchment -1.781558919 -0.973816286 -0.510124368 role_conflict -3.080371052 -0.627523777 -1.808414065 scheduling -0.087884675 -0.205292499 0.166111034 school_accounting -0.840154156 -1.991561610 -0.463664750 school_administration 
2.670912375 5.377617717 1.014228679 school_business_officials -1.468597217 -0.489695940 -0.594149904 school_business_relationship -0.901877918 0.765981504 -0.946941649 school_community_relationship -0.614488777 -0.285179846 0.386143497 school_districts 0.230619176 -0.367306298 2.028795085 school_effectiveness 1.993600485 11.832553894 6.080447172 school_maintenance -1.808855505 -0.915141269 -0.555407332 school_organization 1.643465496 9.865825329 3.881909660 school_restructuring -1.561758889 3.727235279 2.334851924 secretaries -0.180890157 -2.283384927 -0.648439930 sex_differences -2.196907020 -0.493332963 -1.417976461 simulation -1.168086858 -2.070403022 -0.866339193 skill_development -1.897417467 -2.100793135 2.587713809 small_businesses -0.821650987 -1.450809466 -3.386074290 social_support_groups -2.409096835 0.102889839 -0.690868749 special_education -1.273428999 -0.547524849 1.899281076 standards 1.357905736 -3.239518122 -0.235922316 state_programs -1.173848299 -0.746511588 -0.417004715 statewide_planning -0.849536459 -2.133355446 0.527650727 student_attitudes -2.593262361 -0.815728898 2.455966654 student_behavior -2.872112579 -0.553842267 4.204860088 student_characteristics -0.930368791 -2.299383376 2.035348660 student_costs -1.681964708 -2.550166614 -0.487540300 student_financial_aid -0.551961857 -2.784210883 1.235656492 student_loan_programs -1.362231953 -2.385706466 0.128265150 student_records 0.219017841 -1.580162983 0.055329537 superintendents -1.181127202 4.138886183 -1.023121241 supervisory_methods -0.079647407 1.351342625 -3.454352632 supervisory_training 1.141650785 1.790000432 -4.694987929 systems_approach -1.369501891 -0.582052345 -0.299200388 systems_development 2.078119052 -2.363537549 -0.189831164 teacher_administrator_relationship -0.531724182 5.924059667 2.019962393 teacher_attitudes -2.177963534 3.105560778 3.535125212 teacher_burnout -1.752676237 0.600989012 0.158365578 teacher_effectiveness -2.667703734 1.211056725 7.065044684 
teacher_participation -0.503107334 3.232515789 1.830048248 teacher_role -1.198780246 0.329325097 2.053903749 teachers -1.701876705 0.237204187 0.466416864 teaching_methods -2.886859563 -2.617743754 3.589189415 teamwork 0.663581619 4.945652459 1.803269736 technological_advancement 8.280528062 -3.157868363 -0.196822487 telecommunications 2.458470012 -2.227430246 -0.054807865 test_anxiety -2.076189779 -0.597269818 1.748796641 time_on_task -2.292760807 -0.333079568 2.616822574 training 1.141748999 2.339661418 -1.639432793 training_methods 0.136089381 -2.067086388 0.424571772 unions -0.003810603 1.504129405 -2.901690441 vocational_education -1.688170726 -0.092863359 0.555232840 volunteers -0.578225360 -1.573195887 -0.349920091 wildlife_management -1.991320318 -0.862685399 -0.181173904 word_processing 0.794779413 -3.355007898 0.214589104 work_attitudes -0.379899583 -0.377212841 -2.532713250 work_environment 4.521711700 6.070783022 -5.961156131 workshops -2.102527019 -0.611520565 1.875978417

Section 2: The coordinates of 272 documents as they were located at the center of their associated keyterms. The number in each document label (D_nnnnnn) is the document's slot number in the Syracuse University installation of Prism. These are all documents in the Eric electronic database which used the word "management" as a descriptor or major descriptor as of Summer, 1991.
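To illustrate what placing a document "at the center of its associated keyterms" can mean, the sketch below averages keyterm coordinates. It assumes the center is the unweighted arithmetic mean on each of the three axes, which this appendix does not state explicitly. The function name `centroid` and the pairing of these two keyterms are hypothetical and do not describe any particular document; the coordinate values themselves are copied from Section 1.

```python
from statistics import fmean

# Keyterm coordinates copied from Section 1 of this appendix.
KEYTERMS = {
    "institutional_research": (0.911138957, -1.136189008, 0.921855025),
    "teamwork": (0.663581619, 4.945652459, 1.803269736),
}


def centroid(terms, space=KEYTERMS):
    """Return the mean (x, y, z) position of the named keyterms.

    Assumption: the center is an unweighted mean of the keyterm points.
    """
    points = [space[t] for t in terms]
    # zip(*points) regroups the points into one sequence per axis.
    return tuple(fmean(axis) for axis in zip(*points))


if __name__ == "__main__":
    print(centroid(["institutional_research", "teamwork"]))
```

A weighted variant (for example, giving major descriptors more weight) would only change how the per-axis mean is computed.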
D_207899 -0.31497908 -0.98614317 0.94140178 D_206160 0.86298031 -0.32891166 0.20019701 D_205493 -0.75137436 -0.77847612 -0.17364074 D_202024 4.18884420 -0.70779616 1.19852030 D_201665 0.41361880 0.52922225 0.24444737 D_201637 0.35462111 3.88881397 -0.61920643 D_201316 -0.04860522 -0.58561981 -1.60033345 D_201294 2.87893343 0.17808722 -1.66972458 D_200625 0.91736287 -1.68029952 -0.10440885 D_200597 1.42780602 3.50693870 0.28065711 D_200347 -0.66371989 0.06158816 0.48504531 D_195062 4.90446472 -2.52577186 1.28351176 D_195061 4.06864548 -2.06724358 0.84724647 D_195056 2.25085354 -1.24941766 -0.07289592 D_194406 -1.52549911 -1.50505447 -0.90125573 D_194405 -1.91376936 -1.60593545 -0.63017315 D_194404 -1.76493514 -1.66535163 -0.85316622 D_194403 -1.81834149 -1.47504652 -0.80806887 D_194402 -1.12874436 -1.66784012 -0.72831035 D_193388 0.18804349 0.72424203 -3.55978823 D_193387 0.47135836 0.90269089 -3.42744827 D_193386 0.35173100 0.89816469 -3.54305816 D_193384 -0.70814532 -1.56962955 -1.64394128 D_192723 1.13136899 -0.20251203 0.09212200 D_192535 0.31992728 6.46673107 -0.32281283 D_185466 -1.52146840 -1.55093074 -1.06814313 D_185389 1.46928811 -0.95120680 0.99725342 D_178724 1.61826420 5.71098661 1.33985615 D_178577 1.04118931 2.65048409 -2.35308051 D_178465 -0.54955900 -1.77223134 -1.66510165 D_178451 2.57098985 7.38683224 2.54017043 D_178450 2.02673125 7.27381611 3.62081933 D_177965 1.81876004 0.36098298 -2.60700488 D_177902 0.35322857 -1.03527045 0.26824027 D_177417 -0.08187178 -1.66923475 0.16023287 D_176409 1.37129092 0.27877703 -2.17647862 D_176408 1.37129092 0.27877703 -2.17647862 D_176407 1.37129092 0.27877703 -2.17647862 D_174435 1.10914934 2.14195681 0.13974674 D_174244 1.74592257 6.21660519 3.44165635 D_171185 1.73618639 -2.73091984 -0.26568747 D_171171 -0.68615371 -1.83030272 -0.49212867 D_170690 6.53712225 -0.85338098 2.69408679 D_170224 0.62397152 0.64417946 -0.21358545 D_168657 2.12905955 0.57431281 -2.24320197 D_168483 6.67775440 -2.98979497 3.17954350 
D_168481 6.09074306 -3.88529468 2.61236143 D_163348 -1.00555396 -1.43195546 0.55487013 D_162266 -2.46669459 -0.86967915 3.46639371 D_162193 2.88812470 6.59258366 3.09008360 D_162156 1.76091766 2.40057349 -2.46170282 D_161927 -2.20746732 -1.85123789 2.01137924 D_161294 0.28117892 1.67031884 -1.32249999 D_157808 1.36379480 1.37965608 -2.03254104 D_188446 5.12110472 -2.78146958 0.51502049 D_156234 -0.02909787 -0.61655045 1.84423757 D_155919 1.89785302 -2.70406175 -0.05529077 D_170114 4.83000898 -2.20557642 3.64819956 D_154224 3.65494347 1.87713492 -2.42194414 D_154003 -0.89837402 -1.28675389 3.00633287 D_153957 -0.39350241 2.78231025 -1.52784491 D_153908 0.63858175 5.43186712 0.30395836 D_153283 2.26488352 -2.78041530 1.72167790 D_150181 -0.49314204 0.56221014 0.71876830 D_150088 -1.66975081 -0.44811341 4.36486483 D_148393 1.49453723 1.17063224 -2.21972513 D_148381 -0.71292758 -0.82358503 -0.74218583 D_148369 -1.01672244 -1.53330147 0.28388745 D_147529 0.90698117 4.59008360 0.77001071 D_146738 6.46247721 -1.06560409 3.79969072 D_146736 7.21477079 -3.51719260 2.95808578 D_146657 0.52123123 1.99918664 -4.88732147 D_146529 1.05539000 6.95983791 0.93481928 D_145419 1.45665407 7.64595366 2.76407456 D_143595 2.14396930 0.80322707 -2.39110494 D_181953 0.05053121 -2.09741807 -0.26254365 D_140023 -1.13909304 -0.66273636 0.03785577 D_181829 5.25353289 -0.74598527 2.35380745 D_139246 5.26248646 -1.71883559 0.69092131 D_134853 0.76140797 -0.73844045 0.02774068 D_133809 0.42050034 -1.09451497 -1.52192557 D_133719 -1.93753684 -1.32459474 1.70365798 D_133718 -2.20914698 -1.40368021 2.26815224 D_131402 -0.54378176 -0.78030211 0.12108952 D_130078 5.00467491 1.45250118 -1.55386019 D_126904 0.26164141 -1.37276816 1.41480815 D_126582 1.19763076 -1.66116822 -1.26809311 D_125582 3.29838419 -0.91891825 -2.63974190 D_189843 -0.38256946 -1.08981633 -0.29002905 D_189971 2.74448609 -0.80727065 -0.19967967 D_124796 -1.64173102 -0.54166973 2.64979696 D_124516 0.91673976 -1.69400251 -0.50569975 
D_124505 1.33330715 -2.89699936 0.12178367 D_124504 1.33330715 -2.89699936 0.12178367 D_122528 1.12032783 -1.53274786 -1.12864304 D_122461 -0.13055414 0.77899104 -0.98303479 D_121705 0.30100483 0.57860357 -2.81752825 D_121469 0.45667326 1.07831454 -3.52134752 D_119702 0.09934289 -1.30735111 -1.19092214 D_119701 -0.26086324 -1.22700560 -1.06739497 D_118470 4.11035681 -1.71404374 -0.43266556 D_118456 3.39476228 -1.89674950 -0.47549289 D_116982 5.02493095 -2.17726707 0.65809119 D_116930 3.06561518 -3.59458137 2.11772990 D_115482 2.12017870 1.72203255 -3.63894987 D_113842 6.22086000 -1.72533357 0.03951422 D_192169 -1.87110257 -0.67377383 -0.81652063 D_189203 0.41399601 0.89632756 -3.24517894 D_112547 0.27608624 -0.56837744 0.21998996 D_112494 1.81734145 0.95816755 -3.61971879 D_111363 -0.23788436 -1.46152627 -0.36785582 D_191483 -0.22999971 -0.99054784 -0.85107350 D_111147 1.09650075 -0.61857992 0.21921124 D_110936 0.41039988 1.86401176 -3.96358681 D_110052 0.81971747 2.94348717 -0.17744866 D_110023 1.34760976 3.27144217 0.27587464 D_109822 0.22600365 -1.98228347 -0.90716064 D_109733 0.52230901 1.43723059 -4.18587971 D_109732 1.33086419 0.38136977 -3.52167010 D_109731 -0.01986924 0.48177013 -2.69272923 D_109044 4.48791456 -2.83709931 0.61101204 D_107470 3.61674309 -0.65970081 1.64938509 D_107468 3.14414525 0.19566584 1.44604003 D_107014 0.04175413 0.04468677 1.21476007 D_104727 2.70103073 4.66472578 1.06717873 D_104167 2.40924048 3.35192823 -0.27196673 D_104166 2.40924048 3.35192800 -0.27196673 D_103608 -0.25567973 3.65448427 -0.48708847 D_103336 -0.06442333 -0.30728835 -2.14564323 D_102540 -0.35363919 3.14091325 -2.10296798 D_102304 -0.90042174 -0.15052843 -1.35951412 D_196424 2.03499889 1.19632161 -0.04112380 D_113706 4.49636555 -3.02369642 1.40306926 D_190548 0.42640546 2.00670552 -2.95233297 D_189874 0.43667671 -0.95132977 -1.01298630 D_110796 0.21702802 -1.79157448 -0.86230934 D_200896 2.55700207 3.06979442 -3.45467973 D_194374 0.03971402 6.85752630 3.74092555 
D_100918 -0.15954283 -0.73782784 -0.94107944 D_100517 1.23862696 2.86457253 1.09215200 D_100508 -0.35365316 2.30505395 -1.43356109 D_100463 0.95188820 0.82824284 -1.10162318 D_99496 1.29462147 7.09915686 1.60898197 D_99467 0.48561692 3.41024256 1.70517540 D_99281 0.59741277 0.51640737 -2.30530381 D_98683 0.24055810 -0.81020802 0.24645104 D_97435 4.12354517 -1.27363837 0.73041046 D_97097 2.34642363 6.49813890 2.55900216 D_94478 0.14730446 -1.57261026 -0.70102185 D_93524 -0.86329472 -2.26418042 1.10117948 D_92353 1.22350109 1.20868480 -0.03147799 D_91564 3.03187537 -2.12535977 -0.14727199 D_91374 1.42356253 -1.42545092 0.29975304 D_91373 1.42356253 -1.42545092 0.29975304 D_91372 1.30874193 -0.96105301 0.23504193 D_90357 -0.17881207 0.92682862 -1.96896219 D_89737 -0.41274595 0.37543038 -0.89882863 D_89733 -0.32774395 -1.39800560 -0.82317245 D_89732 -0.21248172 0.38520563 -2.26070833 D_89321 -0.39131433 -0.90608114 -0.32499859 D_88384 -0.61779940 -1.74924111 -0.43521255 D_87790 0.20492561 0.99536514 -2.45378447 D_87351 0.12377373 -0.31799763 0.54035163 D_87244 -0.05282724 -0.59118730 -1.06375086 D_87243 -0.65513903 0.94754845 -1.62423253 D_86882 0.27797130 0.19209324 -0.03611757 D_86242 0.61780584 -1.56968868 1.56538665 D_85372 3.39622045 -0.76314986 -0.23357360 D_85163 2.18828106 4.33546877 2.01778817 D_85029 0.40638155 1.74721718 -2.44226146 D_84249 1.72100425 -1.88880670 0.34962043 D_84225 2.42781639 -1.92114341 0.09474815 D_84224 2.94910264 -1.99795747 0.21396048 D_84155 -0.12005337 -0.30058813 -0.52076453 D_81725 -0.77453852 -1.04634333 -0.53693926 D_81724 -1.37349939 -1.15925658 1.48001969 D_81699 0.59601575 0.12367406 -1.66022050 D_79836 1.27682197 2.12605619 0.13279329 D_76951 -0.43743339 -2.54617834 0.26444793 D_75763 1.27677476 4.29267216 -0.41492546 D_72675 1.84487116 -0.17040904 -1.24930394 D_72159 -0.65506679 2.06921124 -1.29764235 D_67099 -0.20058121 -0.62920523 -0.02577782 D_63387 0.33719161 -1.91468716 0.71863908 D_63380 0.40116018 -1.21036065 
-0.05197359 D_63368 1.15711987 -0.35461193 -0.25089729 D_63340 -0.49188483 0.97361350 -0.59335548 D_63071 -0.18646023 1.05228949 -1.35555577 D_62899 -0.52552384 0.62941819 1.42108369 D_62855 -2.38936305 -1.24468982 3.80410838 D_62512 1.48361135 0.05924882 -0.06657171 D_61691 0.02478336 1.66659284 0.08700436 D_61654 2.06327868 0.44156247 -3.42773771 D_60637 -1.60917473 -1.09397781 0.10858089 D_58354 0.77957696 -0.29186758 -0.92257446 D_57865 0.82620823 -0.56511432 -0.59511423 D_57639 0.53347695 1.98935378 -1.18275082 D_57425 -2.12532663 -1.11447012 1.59584439 D_57397 -2.14905238 -1.61907005 2.72428894 D_57395 -0.26570517 0.54296076 -1.23023975 D_57393 -2.04095697 -1.33394849 2.34586096 D_56382 -1.12182713 -1.48953700 1.70720363 D_56221 -1.10128570 3.52477717 5.42050505 D_55857 -0.51814932 -0.25222781 -1.02897537 D_54800 1.04086149 0.16966538 -0.41298527 D_54782 1.81734145 0.95816755 -3.61971879 D_54771 1.28410530 1.82667494 -1.17790079 D_54219 -0.70771676 -1.89545393 -0.76731330 D_54163 -1.04203558 -1.68644810 1.94446325 D_54005 -0.60366541 -0.08647406 0.77029073 D_53124 -0.54707962 -1.12509549 1.48691285 D_53082 -0.82942855 -0.07985075 -1.28676414 D_52716 -0.77052528 -0.59593153 -0.17485626 D_52283 0.45808518 1.87987137 -2.09032106 D_51151 -0.99000216 -1.00421989 1.01308548 D_51122 0.45349926 0.63586938 -0.80180633 D_49300 0.19592725 0.28557584 0.04668580 D_49286 1.69109285 -1.86264932 0.26239738 D_48718 -1.16456950 -0.80047947 -0.49296185 D_48230 -0.91565341 -1.98474753 -0.41438586 D_48123 0.79375905 1.81443346 1.24342024 D_46652 1.30220926 0.18736649 -0.52096045 D_45932 -0.57781637 0.80369151 0.22328788 D_42087 1.14005697 2.64111781 0.17674279 D_41223 2.01290607 1.58398914 -1.82839870 D_39745 1.29270279 6.64958286 2.17771626 D_38300 1.86956155 -2.13444781 0.16754714 D_36784 3.24258471 0.90550572 -0.21180141 D_36579 1.73567307 -3.22553420 0.06301577 D_33058 0.89090055 1.66611922 1.44891262 D_31397 1.20345914 -1.63157296 -0.10899563 D_31322 0.06077248 -0.73162311 
0.49209404 D_30277 1.26127326 -1.93557644 0.40915853 D_30091 0.68488353 -2.24081230 0.31760240 D_29845 -0.33263534 -1.05998743 -0.87547988 D_29844 -0.33263540 -1.05998743 -0.87547994 D_29843 -0.33263537 -1.05998743 -0.87547988 D_29842 -0.33263540 -1.05998743 -0.87547994 D_29840 -0.33263534 -1.05998743 -0.87547988 D_29837 -1.88129330 -0.96038359 1.14132142 D_29824 -0.73125368 -0.48964813 -0.58974093 D_28962 0.36109218 2.23170137 -2.77180696 D_28659 0.07718769 -0.12755989 -1.05919230 D_27820 -0.28541517 1.91317236 -0.80992728 D_27701 1.52814507 5.12832499 0.64062697 D_27541 -2.03475118 -2.07557225 0.41479754 D_27449 0.07199531 -0.63121885 -1.31359279 D_27448 0.07199532 -0.63121885 -1.31359279 D_27402 -0.75780582 -0.56608593 0.48709711 D_26810 0.59573680 -0.94254059 -0.11424030 D_26668 -0.09559894 -1.52749670 -0.98648942 D_25280 -0.44489247 -0.30289358 -1.82627046 D_25086 0.32545295 2.40777898 -1.07100785 D_25075 0.31269592 2.00805616 -2.48145318 D_24756 1.46924305 -1.09795833 -1.20870829 D_24049 0.59647435 -2.60776281 0.52360296 D_23920 -0.44506469 2.16992211 -1.32074189 D_23906 1.72652936 2.06460118 -1.10748458 D_23890 -0.23786065 -1.03454196 0.14744782 D_23877 0.02001044 -0.96213228 0.11296951 D_23617 1.04004264 2.26768184 -1.89877045 D_23395 -0.43646854 0.86354595 0.01614761 D_22688 1.89878249 -2.06428480 0.24357677 D_22525 -0.44506469 2.16992211 -1.32074189 D_22099 -0.41680416 0.32657948 -0.03168180 D_21953 -0.29806277 -1.57668090 1.48052609 D_20419 0.06077247 -0.73162311 0.49209404 D_20233 1.41246068 1.34744477 -1.12938035 D_19970 -0.78937709 -0.94808972 0.03742633 D_19452 3.20757627 1.08767784 -2.06101060 D_19210 -0.52601248 -0.47912735 0.26340953

APPENDIX F: SAMPLE ERIC DOCUMENTS

Three sample documents from the Eric database, retrieved using Prism. Only the Title, Personal.Author, Descriptors, Major.Desc, Identifiers, Major.Ident, Abstract, and Journal.Citation fields were displayed to users of the Space system.

1.
SLOTNUM = 207899;
EDRS.AVAIL = Yes;
ADD.DATE = 1077952576;
CHANGE.DATE = 1077952576;
ACCESSION.NUM = EJ416925;
ERIC.NUM = HE527473;
PUBLICATION.TYPE = "080; 141";
PUBLICATION.DATE = 1990;
TITLE = Seven Years' Experience with the Zero Cash Plan.;
PERSONAL.AUTHOR = Schneeweiss, Stephen M.;
DESCRIPTORS = College Administration;
DESCRIPTORS = College Students;
DESCRIPTORS = Higher Education;
DESCRIPTORS = Program Descriptions;
DESCRIPTORS = Program Development;
DESCRIPTORS = Program Effectiveness;
MAJOR.DESC = Money Management;
MAJOR.DESC = Parent Financial Contribution;
MAJOR.DESC = Student Financial Aid;
MAJOR.DESC = Student Loan Programs;
MAJ.IDENT = Cazenovia College NY;
MAJ.IDENT = Zero Cash Plan;
ISSUE = CIJMAR91;
ABSTRACT = Essentially an application of installment purchase concepts to higher education, the Zero Cash Plan allows students without cash from family resources for room, board, and tuition to attend Cazenovia College (New York) by deferring parent contributions until after graduation. The program has evolved to meet changing student and institution needs. (MSE);
REPORT.NUM = ISSN-0147-877X;
JOURNAL.CITATION = "Business Officer; v24 n4 p37-38 Oct 1990";
LANGUAGE = English;
DATE.ADDED = 05/21/91;
DATE.UPDATED = 05/21/91;

2.
SLOTNUM = 206160;
EDRS.AVAIL = Yes;
ADD.DATE = 1077952576;
CHANGE.DATE = 1077952576;
ACCESSION.NUM = EJ415186;
ERIC.NUM = HE527226;
PUBLICATION.TYPE = "080; 120";
PUBLICATION.DATE = 1990;
TITLE = Strategies for the 1990s.;
PERSONAL.AUTHOR = Chaffee, Ellen Earle;
DESCRIPTORS = College Students;
DESCRIPTORS = Educational Finance;
DESCRIPTORS = Employer Employee Relationship;
DESCRIPTORS = Higher Education;
DESCRIPTORS = Teamwork;
MAJOR.DESC = College Planning;
MAJOR.DESC = Communication (Thought Transfer);
MAJOR.DESC = Educational Quality;
MAJOR.DESC = Institutional Research;
MAJOR.DESC = Personnel Management;
MAJ.IDENT = Strategic Planning;
ISSUE = CIJFEB91;
ABSTRACT = "Eight strategies are recommended to higher education in the 1990s including committing to improving quality; stressing customer service; increasing use of data and analysis in management; developing data on outcomes issues, processes, and behaviors; and working cooperatively with elementary and secondary education. (MLW)";
REPORT.NUM = ISSN-0271-0560;
AVAILABILITY = UMI;
JOURNAL.CITATION = "New Directions for Higher Education; (No. 70 An Agenda for the New Decade) v18 n2 p59-66 Sum 1990";
LANGUAGE = English;
DATE.ADDED = 05/20/91;
DATE.UPDATED = 05/20/91;

3.
SLOTNUM = 205493;
EDRS.AVAIL = Yes;
ADD.DATE = 1077952576;
CHANGE.DATE = 1077952576;
ACCESSION.NUM = EJ414519;
ERIC.NUM = CE521834;
PUBLICATION.TYPE = "080; 143";
PUBLICATION.DATE = 1990;
TITLE = Developing Management Meta-Competence: Can Distance Learning Help?;
PERSONAL.AUTHOR = Linstead, Stephen;
DESCRIPTORS = Adult Education;
DESCRIPTORS = Competence;
DESCRIPTORS = Competency Based Education;
DESCRIPTORS = Educational Environment;
DESCRIPTORS = Experiential Learning;
DESCRIPTORS = Individual Development;
DESCRIPTORS = Learning Processes;
MAJOR.DESC = Cognitive Style;
MAJOR.DESC = Discovery Learning;
MAJOR.DESC = Distance Education;
MAJOR.DESC = Management Development;
MAJOR.DESC = Simulation;
MAJOR.DESC = Training Methods;
ISSUE = CIJFEB91;
ABSTRACT = "Using distance education to develop management metacompetencies such as creativity, flexibility, and resourcefulness encounters problems in terms of learner vulnerability, learning style, structural dysfunctions, and ""reality"" methods. A combination of distance learning, discovery learning, and competency-based simulation may address these problems. (SK)";
REPORT.NUM = ISSN-0309-0590;
JOURNAL.CITATION = "Journal of European Industrial Training; v14 n6 p17-27 1990";
LANGUAGE = English;
DATE.ADDED = 05/20/91;
DATE.UPDATED = 05/20/91;

APPENDIX G: CODEBOOK

12/01/91 gbn

Codebook for Data Entry

Variable   Columns   Description
---------  --------- --------------------------------------------
ID         1-6       respondent identification number
RecNum     7-8       record number

Demographics:
Form       9         0=pretest, 1=form one, 2=form two
Gender     10        0=male, 1=female
Age        11-12     in years
CompExp    13-14     computer experience (Likert scale: 0=none to 10=lots)
IRExp      15-16     information system experience (Likert scale: 0=none to 10=lots)
ERICExp    17        ERIC experience (0=none, 1=in print, 2=electronic form)
PrismExp   18        Prism experience (0=no, 1=yes)
Educat     19-20     years of education completed (16=college grad...)
Major      21-22     major or field of study (1=LIS 2=Ed. 99=missing 98=other)
[Blank]    23

Practice:
PracSpac   24-25     minutes of training with Space
PracPris   26-27     minutes of training with Prism
[Blank]    28

Impressions:
SatiPris   29-30     satisfaction with Prism
WherPris   31-32     where you are/want to go with Prism
SatiSpac   33-34     satisfaction with Space
WherSpac   35-36     where you are/want to go with Space
                     (all Likert scales: 0=none to 10=lots)
[Blank]    37

Tasks:
TaskID     38-39     task identifier (first column: 1=Prism, 2=Space;
                     second column: 1=keyterm search, 2=information need statement)
Time       40-41     minutes to complete task
Glove      42        use of PowerGlove or mouse? (1=glove 2=mouse 9=Prism)
QueryID    43-44     which query? (from Query Codebook)
SatiTask   45-46     satisfaction (Likert scale: 0=none to 10=lots)
AreWhere   47-48     extent to which you know where you are (Likert scale: 0=none to 10=lots)
GoWhere    49-50     extent to which you know where you want to go (Likert scale: 0=none to 10=lots)
[Blank]    51
Success    52        Was a document found? (0=no, 1=yes, 9=record is not for task data)
[Blank]    53
ItemNum    54-55     item number for qualitative data (task data for questionnaire items 6, 10, or 11 if Success is 0 or 1. Otherwise, data for questionnaire items on "Impressions" section)
[Blank]    56
Text       57-end    text from qualitative questionnaire items (variable length field)

APPENDIX H: OPEN-ENDED DATA BY ITEM

All open-ended data are sorted by item. A description of the item precedes each section. Associated closed-ended data are included (the first six columns are the respondent ID number). Appendix G contains the codebook for the closed-ended data.

*** Task sheet data, item 6: What was it that gave you the degree of satisfaction that you had?
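The closed-ended prefix of each Appendix H record follows the fixed-column layout of the Appendix G codebook, so a record can be decoded by slicing on column positions. The following is a minimal sketch, not the procedure used for the original analysis. The `LAYOUT` table and the `decode` function are hypothetical names, the field names `SatiSpac` and `WherSpac` are chosen here for columns 33-36, and all values are kept as strings so that missing-value codes such as 9 and 99 remain visible. The sample is the first item 6 record from Appendix H.

```python
# Column layout from the Appendix G codebook:
# field name -> (first column, last column), 1-indexed and inclusive.
# SatiSpac and WherSpac are names chosen for this sketch (columns 33-36).
LAYOUT = {
    "ID": (1, 6), "RecNum": (7, 8), "Form": (9, 9), "Gender": (10, 10),
    "Age": (11, 12), "CompExp": (13, 14), "IRExp": (15, 16),
    "ERICExp": (17, 17), "PrismExp": (18, 18), "Educat": (19, 20),
    "Major": (21, 22), "PracSpac": (24, 25), "PracPris": (26, 27),
    "SatiPris": (29, 30), "WherPris": (31, 32), "SatiSpac": (33, 34),
    "WherSpac": (35, 36), "TaskID": (38, 39), "Time": (40, 41),
    "Glove": (42, 42), "QueryID": (43, 44), "SatiTask": (45, 46),
    "AreWhere": (47, 48), "GoWhere": (49, 50), "Success": (52, 52),
    "ItemNum": (54, 55),
}


def decode(record):
    """Split one fixed-column record into a dict of string fields."""
    # Convert 1-indexed inclusive columns to Python's 0-indexed slices.
    fields = {name: record[first - 1:last]
              for name, (first, last) in LAYOUT.items()}
    fields["Text"] = record[56:].strip()  # columns 57 to end of record
    return fields


if __name__ == "__main__":
    sample = ("1031010311290607202001 9999 06070303 1111905020604 1 06 "
              "doesn't meet all four criteria in the information need")
    row = decode(sample)
    print(row["ID"], row["TaskID"], row["Time"], row["SatiTask"], row["Text"])
```

Keeping the layout in one table means a corrected column range needs to be changed in only one place.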
1031010311290607202001 9999 06070303 1111905020604 1 06 doesn't meet all four criteria in the information need
1031010411290607202001 9999 06070303 1207908030804 1 06 this article is relatively old and it doesn't indicate what level of education it is aimed at
1031020111250806201601 1906 07080301 1205907050907 1 06 only found five records -- didn't know if they were about public university costs
1031020211250806201601 1906 07080301 1107906041004 1 06 from the description of the need it was hard to decide just what was requested, and how to translate the many concepts into one search
1031030110260909202001 9999 02020809 1207908040507 1 06 subject: professional training / faculty development but not really as specific as I think I should be
1031030210260909202001 9999 02020809 1113905010301 1 06 couldn't relate subjects, couldn't find any matching more than a few at a time -- not real grasp of what was wanted
1101010319999999999999 9999 05050406 1216908070503 1 06 I found a document that looks useful -- but not exactly on the subject key I wanted
1101010419999999999999 9999 05050406 1108905060804 1 06 I found several documents that looked interesting and probably relevant, but the search request is so vague that I'm not sure if they are really appropriate
1104010121270403201601 1403 08070303 1205904070706 1 06 looking at a few other records to find most significant
1104010221270403201601 1403 08070303 1115901090302 1 06 the search behind it and one record was found
1104020321250508201601 2020 08070505 1209904050605 1 06 I found something somewhat accidentally
1104020421250508201601 2020 08070505 1105901060808 1 06 I found something fairly close to the topic
1105010121279910211601 9999 08080302 1205904050808 1 06 the full abstract explains that the article deals with training teachers in time-management in order to improve their effectiveness
1105010221279910211601 9999 08080302 1110901060608 1 06 article found seems to fulfill the information need (based on
abstract)
1106010321230610001701 9999 07090205 1205904050810 1 06 abstract and descriptors appear to be related to topic
1106010421230610001701 9999 07090205 1107902080810 1 06 abstracts contents and document date 1970
1107010121270607201901 9999 05070705 1205904090808 1 06 it's a good reference source
1107010221270607201901 9999 05070705 1102902030909 1 06 it's an article about library related things and that was not part of the information need
1107020120270710201601 9999 07080605 1210903060408 1 06 Trying to figure out the appropriate descriptors
1107020220270710201601 9999 07080605 1102902080909 1 06 citation pulled up on first try
1109010121991003201601 9999 08030309 1207903100910 1 06 found exact words in the title and abstract information verified it
1109010221991003201601 9999 08030309 1110901100810 1 06 got a match and a few to select from
1115010120300607201601 9999 09090909 1115901080809 1 06 from the abstract of the document which close to what I wanted
1115010220300607201601 9999 09090909 1212904091009 1 06 search result matched my request
1118010121221010201601 2020 03060503 1108902040908 1 06 I really wanted to combine the descriptors and identifiers in one search but I ended up getting an OK document
1118010221221010201601 2020 03060503 1207904090909 1 06 I think I spent a little more time on using the descriptors of a document I found and starting a new search, and by using a multi-step search to limit (better search technique on my part)
1119010320320808201801 1812 09090707 1203903100909 1 06 I found something very "on-the-mark"
1119010420320808201801 1812 09090707 1106901100908 1 06 a seemingly perfect match in comparatively little time
1120010120420807201601 1515 04040202 1107901030304 1 06 couldn't be sure I found all possible hits
1120010220420807201601 1515 04040202 1201903060606 1 06 the citation indicates a current book covering the subject
1126010321340607201601 9999 07070101 1299903000010 0 06 very irritating to play guessing
games. give me a title damn it
1126010421340607201601 9999 07070101 1118901030707 1 06 not really on the topic yet
1201010121240909211801 1003 00040810 1113901000000 0 06 nothing
1201010221240909211801 1003 00040810 1212904000405 0 06 99
1201020120261010212001 9999 09050708 1204903100405 1 06 the document abstracts indicated that it matched perfectly with the stated information need
1201020220261010212001 9999 09050708 1104902060505 1 06 the descriptors fit, I'm not sure about the content
1202010120410606203498 9999 99990404 1204903101009 1 06 a dead-on hit, and speed
1202010220410606203498 9999 99990404 1107901000501 0 06 the lack of an online thesaurus made it impossible to narrow the number of hits that technology brought up, or to find alternate ways to search

*** Task sheet data for Prism searches, item 10: What was it that helped you to know where you wanted to go during your search (as opposed to not knowing where to go)?

1031010311290607202001 9999 06070303 1111905020604 1 10 seeing the list of brief cites
1031010411290607202001 9999 06070303 1207908030804 1 10 I could see the list of articles in a vertical order
1031020111250806201601 1906 07080301 1205907050907 1 10 prompts menus make clear what options are but I wasn't sure whether to use abstract, descriptor, etc.
1031020211250806201601 1906 07080301 1107906041004 1 10 I could see and understand the progress of my search but again finding the right terms to search was hard
1031030110260909202001 9999 02020809 1207908040507 1 10 subject descriptors. but not really as familiar with the topic as I would have liked
1031030210260909202001 9999 02020809 1113905010301 1 10 not much
1101010319999999999999 9999 05050406 1216908070503 1 10 the phrases in the descriptors of the document I found were not what I had searched for.
Thus "teacher effectiveness" found "program effectiveness" and "teacher burnout," so I ended up where I wanted to go, but by accident
1101010419999999999999 9999 05050406 1108905060804 1 10 I had a general idea of the main concepts involved -- I felt that what I needed to do was to combine those appropriately
1104010121270403201601 1403 08070303 1205904070706 1 10 following commands on screen
1104010221270403201601 1403 08070303 1115901090302 1 10 research assistant -- we didn't know how to search several terms in abstracts or other search
1104020321250508201601 2020 08070505 1209904050605 1 10 I had used a system like this before
1104020421250508201601 2020 08070505 1105901060808 1 10 from the previous, I knew enough to look at the descriptors used
1105010121279910211601 9999 08080302 1205904050808 1 10 I'm somewhat familiar with the Eric online and ondisc programs, so I tried to use the same kind of logic. I searched subject descriptors found in articles that weren't quite what I wanted
1105010221279910211601 9999 08080302 1110901060608 1 10 boolean operators concept
1106010321230610001701 9999 07090205 1205904050810 1 10 having an idea of how an abstract citation is formed
1106010421230610001701 9999 07090205 1107902080810 1 10 familiarity in how a database is organized
1107010121270607201901 9999 05070705 1205904090808 1 10 the topic seemed to be appropriate for Eric
1107010221270607201901 9999 05070705 1102902030909 1 10 the search did not seem appropriate for Eric
1107020120270710201601 9999 07080605 1210903060408 1 10 List of descriptors and identifiers
1107020220270710201601 9999 07080605 1102902080909 1 10 list of descriptors
1109010121991003201601 9999 08030309 1207903100910 1 10 clear search objective
1109010221991003201601 9999 08030309 1110901100810 1 10 clear statement of search
1115010120300607201601 9999 09090909 1115901080809 1 10 menu instructions
1115010220300607201601 9999 09090909 1212904091009 1 10 my understanding of the subject
1118010121221010201601 2020 03060503 1108902040908 1 10 I have searched Eric before using Dialog (and on CD-ROM) so I knew what the fields meant and how likely I was to find a term I wanted there
1118010221221010201601 2020 03060503 1207904090909 1 10 I knew where I wanted to go because I found a document (using a bad search term) that was a little relevant, so I wanted to go and find more with similar descriptors
1119010320320808201801 1812 09090707 1203903100909 1 10 having used Eric
1119010420320808201801 1812 09090707 1106901100908 1 10 the list of qualifications for the search
1120010120420807201601 1515 04040202 1107901030304 1 10 apparent lack of descriptor index as in Dialog and apparent relevance on title keyword found through experimenting
1120010220420807201601 1515 04040202 1201903060606 1 10 experience from last search
1126010321340607201601 9999 07070101 1299903000010 0 10 99
1126010421340607201601 9999 07070101 1118901030707 1 10 keywords
1201010121240909211801 1003 00040810 1113901000000 0 10 99
1201010221240909211801 1003 00040810 1212904000405 0 10 99
1201020120261010212001 9999 09050708 1204903100405 1 10 there wasn't much specific help. I was reduced to doing free text searches in abstract and title
1201020220261010212001 9999 09050708 1104902060505 1 10 again, there were not many indicators
1202010120410606203498 9999 99990404 1204903101009 1 10 my own ability to find alternate terms for the search when others failed
1202010220410606203498 9999 99990404 1107901000501 0 10 the absence of a thesaurus meant I was just grabbing at terms, this time I was unlucky

*** Task sheet data for Prism searches, item 11: If you want, use the space below to write any comments, feelings, or reflections you have about this task.
1031010311290607202001 9999 06070303 1111905020604 1 11 it would be useful to see a thesaurus of descriptors on the screen 1031010411290607202001 9999 06070303 1207908030804 1 11 I would like to have been able to see the list of descriptors from which to choose 1031020111250806201601 1906 07080301 1205907050907 1 11 99 1031020211250806201601 1906 07080301 1107906041004 1 11 99 1031030110260909202001 9999 02020809 1207908040507 1 11 typeface somewhat hard to read for me -- possibly my glasses 1031030210260909202001 9999 02020809 1113905010301 1 11 I like to know where I want to be, what I am supposed to do 1101010319999999999999 9999 05050406 1216908070503 1 11 99 1101010419999999999999 9999 05050406 1108905060804 1 11 This in- terface is not nearly as effective or helpful as the Dialog style interface that I have worked with before -- much more inflexible. It was nice to be able to use boolean logic again, though 1104010121270403201601 1403 08070303 1205904070706 1 11 titles in display sometimes were not as clear as descriptors and had to read the abstract itself 1104010221270403201601 1403 08070303 1115901090302 1 11 until I learned that boolean logic must be typed along and then worked through search strategies again for new terms, had problems with limiting search between three terms 1104020321250508201601 2020 08070505 1209904050605 1 11 99 1104020421250508201601 2020 08070505 1105901060808 1 11 99 1105010121279910211601 9999 08080302 1205904050808 1 11 this sys- tem is much easier to use than the ondisc online systems. It asks you to speak in plain English, not to adapt to its language or format 1105010221279910211601 9999 08080302 1110901060608 1 11 wasn't sure if I should use subject descriptors or key abstract words. 
Key abstract words worked better 1106010321230610001701 9999 07090205 1205904050810 1 11 the term is difficult to read (font) 1106010421230610001701 9999 07090205 1107902080810 1 11 difficul- ty in deciding between major and other descriptors and identif- iers 1107010121270607201901 9999 05070705 1205904090808 1 11 99 1107010221270607201901 9999 05070705 1102902030909 1 11 99 1107020120270710201601 9999 07080605 1210903060408 1 11 About as frustrating as searching and other database 1107020220270710201601 9999 07080605 1102902080909 1 11 99 1109010121991003201601 9999 08030309 1207903100910 1 11 99 1109010221991003201601 9999 08030309 1110901100810 1 11 99 1115010120300607201601 9999 09090909 1115901080809 1 11 a user has to have some basic train to use Eric efficiently 1115010220300607201601 9999 09090909 1212904091009 1 11 99 1118010121221010201601 2020 03060503 1108902040908 1 11 99 1118010221221010201601 2020 03060503 1207904090909 1 11 I had a little easier time narrowing on this search, and now that I am more familiar with the system it doesn't bother me any more that I have limited searching capabilities. I can get by 1119010320320808201801 1812 09090707 1203903100909 1 11 with a list of titles, selection is much easier 1119010420320808201801 1812 09090707 1106901100908 1 11 99 1120010120420807201601 1515 04040202 1107901030304 1 11 could not discover how booleans worked. 
I would ask for title "and" another word in title or other field and get a message about "index" even if I was looking for keywords 1120010220420807201601 1515 04040202 1201903060606 1 11 99 1126010321340607201601 9999 07070101 1299903000010 0 11 cutesy -- new age, but a bit premature please do not inflict this on anyone until it is a real improvement on current systems 1126010421340607201601 9999 07070101 1118901030707 1 11 I don't think this exercise reflects how librarians do business 1201010121240909211801 1003 00040810 1113901000000 0 11 I did not know the descriptors and could not retrieve any documents under their given titles ie managing management economics, technology etc. 1201010221240909211801 1003 00040810 1212904000405 0 11 99 1201020120261010212001 9999 09050708 1204903100405 1 11 I felt it was kind of hit and miss 1201020220261010212001 9999 09050708 1104902060505 1 11 the tasks were relatively simple 1202010120410606203498 9999 99990404 1204903101009 1 11 an online thesaurus would move my 9 above to 10 1202010220410606203498 9999 99990404 1107901000501 0 11 I knew where I had to go, but the tool I needed was absent. A real contrast here with the space version.
Part 2: Search task data for the evaluation (Space) system
*** Task sheet data for item 6: What was it that gave you the degree of satisfaction that you had?
1031010111290607202001 9999 06070303 2213207030403 1 06 I thought I would find more relevant documents 1031010211290607202001 9999 06070303 2111206020404 1 06 I didn't feel the document I found is relevant to all the information needs above. It seems that it would be difficult to find a single source that covers "happiness in the workplace" and "activities outside of work" 1031020311250806201601 1906 07080301 2213208000202 0 06 no articles found.
found terms without document numbers 1031020411250806201601 1906 07080301 2109205000301 0 06 didn't find anything -- not sure why 1031030310260909202001 9999 02020809 2214207080807 1 06 subject terms were close 1031030410260909202001 9999 02020809 2199906090907 1 06 more descriptors matched more closely the above criteria 1101010119999999999999 9999 05050406 2199206000402 0 06 couldn't find anything that fit more than one of the criteria 1101010219999999999999 9999 05050406 2205207090305 1 06 was able to find document quickly using one key word 1104010321270403201601 1403 08070303 2105902030303 1 06 system kept telling me to put in document number, but then would tell me document not found 1104010421270403201601 1403 08070303 2207203000405 1 06 none -- I reset the system at least twice to be sure had exited from previ- ous search, but still had same document number that was still not found 1104020121250508201601 2020 08070505 2216203030705 1 06 it seemed a relevant document perhaps existed on the system but I couldn't find it 1104020221250508201601 2020 08070505 2106202050709 1 06 the docu- ment I find was merely tangentially related 1105010321279910211601 9999 08080302 2105202030101 1 06 it was the only one that I found in fifteen minutes 1105010421279910211601 9999 08080302 2215203000204 1 06 99 1106010121230610001701 9999 07090205 2209203000507 0 06 the cita- tions I found would not answer my information need 1106010221230610001701 9999 07090205 2113901000207 0 06 no relevant citations 1107010321270607201901 9999 05070705 2204203040908 1 06 it didn't provide any information on planning (at least the abstract didn't say anything about planning) 1107010421270607201901 9999 05070705 2104201050805 1 06 it does not include any discussion on economic factors 1107020320270710201601 9999 07080605 2113201080205 1 06 appropri- ate document found 1107020420270710201601 9999 07080605 2210204030505 1 06 couldn't find document under the more appropriate keywords 
1109010321991003201601 9999 08030309 2117902050206 1 06 found something, although not sure if this is useful 1109010421991003201601 9999 08030309 2220204000910 0 06 not find- ing anything close to what I was looking for. Not finding docu- ment numbers near my term 1111010321440505201702 9999 99990303 2219204010101 1 06 99 1111010421440505201702 9999 99990303 2111201020505 1 06 99 1115010320300607201601 9999 09090909 2120202070508 1 06 visual search result (2D) 1115010420300607201601 9999 09090909 2206203080909 1 06 view the spatial relations 1118010321221010201601 2020 03060503 2117201040204 1 06 that I took a stab and it happened to be close, and I finally found one so I could go on 1118010421221010201601 2020 03060503 2206203070507 1 06 I am very satisfied that I found something very quickly and I feel pretty confident now that I could find something else if asked 1119010120320808201801 1812 09090707 2227204020808 0 06 almost nothing at all 1119010220320808201801 1812 09090707 2111202060907 1 06 the abstract seemed somewhat related 1120010320420807201601 1515 04040202 2116201020303 1 06 inability to keyword search 1120010420420807201601 1515 04040202 2216204000404 0 06 99 1126010121340607201601 9999 07070101 2106202030307 1 06 99 1126010221340607201601 9999 07070101 2105204010109 1 06 I don't like wandering around aimlessly with strings of numbers on the screen - I want to see words connected to my search 1201010321240909211801 1003 00040810 2208103020703 1 06 playing with the computer 1201010421240909211801 1003 00040810 2110202081009 1 06 found a relevant document with the system 1201020320261010212001 9999 09050708 2111201030606 1 06 it has little to do with the information need fulfilling only two of the three requirements 1201020420261010212001 9999 09050708 2202204090909 1 06 this matches very well with the set of needs 1202010320410606203498 9999 99990404 2111201000609 0 06 having tried to locate myself in the space via keyword and checking out two 
possibilities in that region, I conclude there are none in the space 1202010420410606203498 9999 99990404 2205204050909 1 06 the abstract indicates that the document may be at least tangentially relevant to the topic and that's better than nothing
*** Task sheet data for Space searches, item 10: What was it that helped you to know where you wanted to go during your search (as opposed to not knowing where to go)?
1031010111290607202001 9999 06070303 2213207030403 1 10 the galaxy diagram in the upper right hand corner; also noting the terms that appear around the keyword that I was looking at 1031010211290607202001 9999 06070303 2111206020404 1 10 galaxy diagram 1031020311250806201601 1906 07080301 2213208000202 0 10 99 1031020411250806201601 1906 07080301 2109205000301 0 10 I had very little idea what vocabulary terms to apply for the search. The terms I found were not relevant (the documents) 1031030310260909202001 9999 02020809 2214207080807 1 10 search terms nearby in the space -- once I figured out what I was doing I was able to move easier 1031030410260909202001 9999 02020809 2199906090907 1 10 found related terms 1101010119999999999999 9999 05050406 2199206000402 0 10 many of the possibly relevant terms were grounded near each other -- but without many documents 1101010219999999999999 9999 05050406 2205207090305 1 10 I knew the terms I was looking for, but they did not seem to be close together 1104010321270403201601 1403 08070303 2105902030303 1 10 window would close automatically after doing a search and did not define any additional materials or place to go 1104010421270403201601 1403 08070303 2207203000405 1 10 previous search helped to know system better and where to go to find exact term 1104020121250508201601 2020 08070505 2216203030705 1 10 vocabulary window 1104020221250508201601 2020 08070505 2106202050709 1 10 vocabulary window 1105010321279910211601 9999 08080302 2105202030101 1 10 I understood the objective, which was to find a document
which pertained to the information need 1105010421279910211601 9999 08080302 2215203000204 1 10 I knew what the "information need" was 1106010121230610001701 9999 07090205 2209203000507 0 10 keywords of similar topics 1106010221230610001701 9999 07090205 2113901000207 0 10 synonyms for "managing technology economic factors" 1107010321270607201901 9999 05070705 2204203040908 1 10 the term 1107010421270607201901 9999 05070705 2104201050805 1 10 the com- bination of terms I was looking for did not seem very appropriate for the Eric database 1107020320270710201601 9999 07080605 2113201080205 1 10 figuring out the spatial/informational connection 1107020420270710201601 9999 07080605 2210204030505 1 10 small amount of experience with the system from previous question 1109010321991003201601 9999 08030309 2117902050206 1 10 the in- formation needs wasn't that explicit -- couldn't find vocabulary words 1109010421991003201601 9999 08030309 2220204000910 0 10 clear terms and couldn't find in vocabulary 1111010321440505201702 9999 99990303 2219204010101 1 10 99 1111010421440505201702 9999 99990303 2111201020505 1 10 99 1115010320300607201601 9999 09090909 2120202070508 1 10 the spa- cial relation while I rotated around the screen 1115010420300607201601 9999 09090909 2206203080909 1 10 visual and multi-D relation among the key words 1118010321221010201601 2020 03060503 2117201040204 1 10 knowing that I was near good terms helped me to know where I wanted to be. But I wanted documents (there were none there!) I wandered about a bit and saw a bunch of documents nearby. I tried one. It worked! 1118010421221010201601 2020 03060503 2206203070507 1 10 I knew that the relevant documents were not really near the term. I also tried to get to know where subject area were in the space. 
I got the impression finance was close to the middle than college type things 1119010120320808201801 1812 09090707 2227204020808 0 10 vocabulary list 1119010220320808201801 1812 09090707 2111202060907 1 10 vocabulary 1120010320420807201601 1515 04040202 2116201020303 1 10 spatial relation once vocabulary list was checked for words 1120010420420807201601 1515 04040202 2216204000404 0 10 previous experience 1126010121340607201601 9999 07070101 2106202030307 1 10 the star map 1126010221340607201601 9999 07070101 2105204010109 1 10 the star map 1201010321240909211801 1003 00040810 2208103020703 1 10 vocabulary words 1201010421240909211801 1003 00040810 2110202081009 1 10 the map -- I used the yellow dots (keywords) and blue dots (documents) to navigate. Used the vocabulary keywords at first to get me to a relevant place in the space 1201020320261010212001 9999 09050708 2111201030606 1 10 the vocabulary window combined with the position window 1201020420261010212001 9999 09050708 2202204090909 1 10 the position window showed me there was more 1202010320410606203498 9999 99990404 2111201000609 0 10 keywords 1202010420410606203498 9999 99990404 2205204050909 1 10 keyword
*** Task sheet data for Space searches, item 11: If you want, use the space below to write any comments, feelings, or reflections you have about this task.
1031010111290607202001 9999 06070303 2213207030403 1 11 the mouse is very sensitive and it's easy to move too rapidly past terms one wants to look at. This global appearance also looks confusing (the layout of it) 1031010211290607202001 9999 06070303 2111206020404 1 11 it would be helpful if there were a space that records the documents already examined.
Tell subjects that there are four tasks to be completed before beginning 1031020311250806201601 1906 07080301 2213208000202 0 11 I didn't understand the relation between terms and numbers, light and dark 1031020411250806201601 1906 07080301 2109205000301 0 11 99 1031030310260909202001 9999 02020809 2214207080807 1 11 99 1031030410260909202001 9999 02020809 2199906090907 1 11 by this time I finally figured out to really manipulate the space 1101010119999999999999 9999 05050406 2199206000402 0 11 99 1101010219999999999999 9999 05050406 2205207090305 1 11 99 1104010321270403201601 1403 08070303 2105902030303 1 11 since kept getting document number but wouldn't / couldn't find it, it was confusing to user 1104010421270403201601 1403 08070303 2207203000405 1 11 vocabu- lary words used must be identical (with punctuation) and this should be written somewhere for new users 1104020121250508201601 2020 08070505 2216203030705 1 11 strange concept which I would probably if I had found something, but since I did not it doesn't seem any more helpful than the other search systems 1104020221250508201601 2020 08070505 2106202050709 1 11 system is more familiar now, but I can't find much that is useful 1105010321279910211601 9999 08080302 2105202030101 1 11 I'd have to spend a lot more time experimenting with this system. The concept is interesting, but intimidating. 
The mouse is extremely sensitive, which can be frustrating 1105010421279910211601 9999 08080302 2215203000204 1 11 Once you find the vocabulary which is relevant to your search, it's not so clear what is the best way to locate the documents which may re- late 1106010121230610001701 9999 07090205 2209203000507 0 11 difficul- ty in staying focused on the screen keywords would move off the screen too fast with little movement of the mouse 1106010221230610001701 9999 07090205 2113901000207 0 11 frustra- tion not being able to find keywords to lead me to relevant cita- tions "lost in space" 1107010321270607201901 9999 05070705 2204203040908 1 11 interest- ing 1107010421270607201901 9999 05070705 2104201050805 1 11 99 1107020320270710201601 9999 07080605 2113201080205 1 11 interest- ing prototype of "cyberspace;" needs refinement, especially con- trols; screen image jerked around too much even in response to subtle touches 1107020420270710201601 9999 07080605 2210204030505 1 11 spatial relationships between terms and citations interesting. What basis/system used for assigning coordinates to citations? 1109010321991003201601 9999 08030309 2117902050206 1 11 a little frustrating -- would have liked to have seen the document it displayed somewhere on the output results 1109010421991003201601 9999 08030309 2220204000910 0 11 this was very frustrating! the terms were there but no docs 1111010321440505201702 9999 99990303 2219204010101 1 11 99 1111010421440505201702 9999 99990303 2111201020505 1 11 99 1115010320300607201601 9999 09090909 2120202070508 1 11 it's good idea to view the search process 1115010420300607201601 9999 09090909 2206203080909 1 11 99 1118010321221010201601 2020 03060503 2117201040204 1 11 There was one document I found that I thought would be like networks and management and turned to be about economics. I don't know where that came from. Once I find a document, I can't remember the number! It's not on the screen anymore. 
I found a document longer than the screen. Can I go further? (scroll?) 1118010421221010201601 2020 03060503 2206203070507 1 11 I wish the document was a little more tell-tale in space. I want to know what it is related to before I look at it. I also think the terms are too clustered and the documents are too clustered (pic- ture of separate clusters) 1119010120320808201801 1812 09090707 2227204020808 0 11 99 1119010220320808201801 1812 09090707 2111202060907 1 11 the second time around is easier 1120010320420807201601 1515 04040202 2116201020303 1 11 one term search doesn't put you in the rest of the space 1120010420420807201601 1515 04040202 2216204000404 0 11 99 1126010121340607201601 9999 07070101 2106202030307 1 11 I don't have a feel for what isn't there 1126010221340607201601 9999 07070101 2105204010109 1 11 the thesaurus is too limited and inflexible. It would be nice to "tailor" the search interactively, giving WORDS to expand / nar- row / qualify concepts 1201010321240909211801 1003 00040810 2208103020703 1 11 the voca- bulary words need to be more closely related to the document numbers -- what's the purpose otherwise? glove is too jumpy 1201010421240909211801 1003 00040810 2110202081009 1 11 using the map this time was more helpful than not using it the first time. Keep a list of key commands handy 1201020320261010212001 9999 09050708 2111201030606 1 11 the space system has more clues as to where you want to go 1201020420261010212001 9999 09050708 2202204090909 1 11 the posi- tion window was helpful 1202010320410606203498 9999 99990404 2111201000609 0 11 I don't think of me moving thru the space at all. I think of me as mani- pulating the space as if it were an object. Three dimensions are translated into two that way. I did not find it helpful to think of things being behind me -- they were just not in front of me 1202010420410606203498 9999 99990404 2205204050909 1 11 I still lose track of which button is being what. 
The location window (X) also helps to remind one of the direction that the elements on the screen are moving
Part 3: Impressions data. Note that closed-ended data for queries have been removed. Items for which no data were given are denoted by '99' in the last columns.
*** Impressions of Prism, Item 1: Please describe the "Prism" system in your own words.
1031010411290607202001 9999 06070303 ------------- - 11 after entering descriptors, a list of articles in brief form is shown and one can see the more complete description of each article 1031020411250806201601 1906 07080301 ------------- - 11 traditional -- menu driven -- command driven search for information on various fields, similar to my prior experience with electronic searching 1031030410260909202001 9999 02020809 ------------- - 11 highly verbose. suitable for one who is familiar with the subject area 1101010419999999999999 9999 05050406 ------------- - 11 A traditional database system that uses boolean logic and allows one to choose from various access points. Actually, this interface is fairly simplistic and intuitive 1104010421270403201601 1403 08070303 ------------- - 11 fairly clear-cut as to the instructions on the screen and what to do, except with boolean logic which should be noted I kept putting in word with command rather than command then word. This may be confusing to user, but would make sense 1104020421250508201601 2020 08070505 ------------- - 11 textual 1105010421279910211601 9999 08080302 ------------- - 11 free text searching. Very easy to use.
documents retrieved as soon as the system finds vocabulary matches 1106010421230610001701 9999 07090205 ------------- - 11 system for locating citations for a topic by knowing key terms (author) or entering my words that apply to topic 1107010421270607201901 9999 05070705 ------------- - 11 it is an online information retrieval system like most databases 1107020420270710201601 9999 07080605 ------------- - 11 standard search system I'm more familiar with this type of system -- extensive training with similar systems 1109010421991003201601 9999 08030309 ------------- - 11 plus: could combine terms and narrow searches minus: was under vocabulary control and no thesaurus available. Did not always select the right terms 1111010121440505201702 9999 99990303 ------------- - 11 I'm not a very spatially oriented person. found the task physically frustrating as well as intellectually frustrating 1111010221440505201702 9999 99990303 ------------- - 11 not quite as frustrated physically still not happy with intellectual findings 1111010421440505201702 9999 99990303 ------------- - 11 99 1115010420300607201601 9999 09090909 ------------- - 11 it is a menu-command combined traditional index database 1118010421221010201601 2020 03060503 ------------- - 11 it looks like a home-made search system. Something that came out of a group programming project, not a highly developed business (no frills, no options). But it works. 1119010420320808201801 1812 09090707 ------------- - 11 reminds me of the Eric CD-ROM. Familiar, in that sense.
"Traditional," electronically 1120010420420807201601 1515 04040202 ------------- - 11 inflexible 1126010421340607201601 9999 07070101 ------------- - 11 old fashioned but workable 1201010421240909211801 1003 00040810 ------------- - 11 I did not like the prism system because I did not know searchable key words and descriptors 1201020420261010212001 9999 09050708 ------------- - 11 conventional, word based, information retrieval system 1202010420410606203498 9999 99990404 ------------- - 11 99
*** Impressions of Prism, Item 4: Briefly describe the types of situations in which you might prefer to use "Prism."
1031010411290607202001 9999 06070303 ------------- - 14 when information need is very clearly defined 1031020411250806201601 1906 07080301 ------------- - 14 very specific need such as an author, or title search 1031030410260909202001 9999 02020809 ------------- - 14 when I know the subject area and have an idea what descriptors to use more specific questions 1101010419999999999999 9999 05050406 ------------- - 14 At the moment, anything when I'm in a hurry and anything in which I wanted to use boolean logic connecting disparate ideas or go in through access points that were not major descriptors 1104010421270403201601 1403 08070303 ------------- - 14 when having several terms to search under in different areas: abstracts, descriptors, etc.
1104020421250508201601 2020 08070505 ------------- - 14 most situations -- if I knew anything about the subject (just heard of it) 1105010421279910211601 9999 08080302 ------------- - 14 when you have more of a general idea of what you want / need, not if you know exactly what you need 1106010421230610001701 9999 07090205 ------------- - 14 have specific information and a good idea of what is needed 1107010421270607201901 9999 05070705 ------------- - 14 when I have a specific and well defined need 1107020420270710201601 9999 07080605 ------------- - 14 when I'm in a hurry 1109010421991003201601 9999 08030309 ------------- - 14 narrow searches and knew current terminology 1111010421440505201702 9999 99990303 ------------- - 14 99 1115010420300607201601 9999 09090909 ------------- - 14 I was trained to do it 1118010421221010201601 2020 03060503 ------------- - 14 when I want information about more than one topic, and I know my topic(s) well 1119010420320808201801 1812 09090707 ------------- - 14 trying to find articles to cite for assigned papers 1120010420420807201601 1515 04040202 ------------- - 14 known items -- no subject searching 1126010421340607201601 9999 07070101 ------------- - 14 I use something like it all the time and feel comfortable with it 1201010421240909211801 1003 00040810 ------------- - 14 If you know the author or title of a book / person and want to find the exact article 1201020420261010212001 9999 09050708 ------------- - 14 a very specific information need 1202010420410606203498 9999 99990404 ------------- - 14 99
*** Impressions of Space, Item 1: Please describe the "Space" system in your own words.
1031010411290607202001 9999 06070303 ------------- - 21 this is a more visual system that looks like a galaxy and roughly indicated relationships between terms and overlap 1031020411250806201601 1906 07080301 ------------- - 21 lack of structure - randomness -- relational.
I guess relational but I had a hard time seeing the relations 1031030410260909202001 9999 02020809 ------------- - 21 relates subject terms in 3D space locating documents nearby, depending on relative closeness to those terms 1101010419999999999999 9999 05050406 ------------- - 21 a data- base system where access is limited to a fairly short list of subject terms. The relationship of the documents and terms to each other is represented by a map on the screen (works well on the planar level, not so well as 3D). Apart from this map, there is no way to connect terms 1104010421270403201601 1403 08070303 ------------- - 21 difficult to understand until knew basic command letters. Window would close soon after command, user sometimes felt like I did some- thing wrong or not complete enough 1104020421250508201601 2020 08070505 ------------- - 21 spatial 1105010421279910211601 9999 08080302 ------------- - 21 conceptu- al text searching. must have a good sense of your place in the space in order to use effectively 1106010421230610001701 9999 07090205 ------------- - 21 system to locate relevant citations by finding a closely related term and moving all around in space to find relevant documents 1107010421270607201901 9999 05070705 ------------- - 21 it is very interesting and I enjoyed it. I wish I was more familiar with it and had a little more training before using it 1107020420270710201601 9999 07080605 ------------- - 21 clever! "Position" window not much use and reminiscent of a video game. Not completely intuitive, spatial relationships interesting, but often confusing. 1109010421991003201601 9999 08030309 ------------- - 21 liked seeing the vocabulary displayed at all times rotation was great and so was backwards and forwards commands worked great. if no document numbers near my term, did not know what to do. 
when selected a document (search) document number was not displayed -- felt a little lost here 1111010421440505201702 9999 99990303 ------------- - 21 a field of terms float around waiting for capture. when captured they indicate a document number you could use 1115010420300607201601 9999 09090909 ------------- - 21 put a search into a multi-d (time, space) relation frame 1118010421221010201601 2020 03060503 ------------- - 21 a document retrieval system that puts terms and documents together in space and the searcher knows about trying to decide which documents are near which terms 1119010420320808201801 1812 09090707 ------------- - 21 intuitive. A "neat" 3D concept. Takes a little time to get used to 1120010420420807201601 1515 04040202 ------------- - 21 nice conceptual model of how ideas are linked, not helpful to me 1126010421340607201601 9999 07070101 ------------- - 21 a nice try. I know what you're after. I too believe in non-hierarchical search strategies. I'm sure someday you'll succeed. Keep working at it 1201010421240909211801 1003 00040810 ------------- - 21 I liked the space much better than prism. It is visual and interactive. I could use the map to navigate where I wanted to go 1201020420261010212001 9999 09050708 ------------- - 21 a spatial interface 1202010420410606203498 9999 99990404 ------------- - 21 an attempt to translate conceptual closeness into spatial closeness
*** Impressions of Space, Item 4: Briefly describe the types of situations in which you might prefer to use "Space."
1031010411290607202001 9999 06070303 ------------- - 24 when I'm trying to see relationships between descriptors 1031020411250806201601 1906 07080301 ------------- - 24 when you are not sure how to get at a topic -- to browse, sort of 1031030410260909202001 9999 02020809 ------------- - 24 when less familiar with subject area -- if question is vaguer 1101010419999999999999 9999 05050406 ------------- - 24 a search where I felt that the main subject of a document would be my ac- cess point 1104010421270403201601 1403 08070303 ------------- - 24 limited main word searching, with not many boolean or further limiting searches 1104020421250508201601 2020 08070505 ------------- - 24 if I were completely ignorant about a subject 1105010421279910211601 9999 08080302 ------------- - 24 when I have a lot of time. Also when (as with prism) I'm working with a general area as opposed to an exact idea 1106010421230610001701 9999 07090205 ------------- - 24 when one has no idea what he wants but maybe clued in on an area by brows- ing the space 1107010421270607201901 9999 05070705 ------------- - 24 for browsing purposes 1107020420270710201601 9999 07080605 ------------- - 24 browsing 1109010421991003201601 9999 08030309 ------------- - 24 unfocused search and not sure what related terms might be relevant. Is great to browse around the terms 1111010421440505201702 9999 99990303 ------------- - 24 I need more practice -- might be fun at home 1115010420300607201601 9999 09090909 ------------- - 24 search a unfamiliar subject 1118010421221010201601 2020 03060503 ------------- - 24 when I only want to find something on one topic. It is way too hard to combine terms 1119010420320808201801 1812 09090707 ------------- - 24 I had really rather use something else. 
Something more literal 1120010420420807201601 1515 04040202 ------------- - 24 can't think of any 1126010421340607201601 9999 07070101 ------------- - 24 at gunpoint 1201010421240909211801 1003 00040810 ------------- - 24 when looking for several articles on a topic with nothing specific in mind 1201020420261010212001 9999 09050708 ------------- - 24 a general, broad search 1202010420410606203498 9999 99990404 ------------- - 24 I doubt that I would. It seems like making the translation from verbal / conceptual space to a 2D representation of a 3D space is entirely redundant
*** Impressions of Space, Item 5A: Assess the usefulness of the vocabulary window
1104010421270403201601 1403 08070303 ------------- - 25 good to get ideas to look under, needed to explicitly state how to key in words from. may be helpful to simply click on term rather than to go through 'f' 1104020421250508201601 2020 08070505 ------------- - 25 very useful 1105010421279910211601 9999 08080302 ------------- - 25 useful, but it should be noted that the user must type in the space _ between words exactly as listed 1106010421230610001701 9999 07090205 ------------- - 25 99 1107010421270607201901 9999 05070705 ------------- - 25 extremely useful -- without it I would have been completely lost 1107020420270710201601 9999 07080605 ------------- - 25 too limited; some terms listed were not found 1109010421991003201601 9999 08030309 ------------- - 25 great! 1111010421440505201702 9999 99990303 ------------- - 25 OK, once you grab it 1115010420300607201601 9999 09090909 ------------- - 25 helpful 1118010421221010201601 2020 03060503 ------------- - 25 it got me to a subject area (as a rule) but it pushed me there, didn't tell me where I was in the big space. Also, some terms I got to were completely unsurrounded by anything else (a document in the middle of nowhere) 1119010420320808201801 1812 09090707 ------------- - 25 very useful.
would have been lost without it
1120010420420807201601 1515 04040202 ------------- - 25 very useful, almost necessary
1126010421340607201601 9999 07070101 ------------- - 25 limited
1201010421240909211801 1003 00040810 ------------- - 25 would be very useful if were very related to document numbers got me to the general areas of where to look
1201020420261010212001 9999 09050708 ------------- - 25 very helpful
1202010420410606203498 9999 99990404 ------------- - 25 crucial
*** Impressions of Space, Item 5B: Assess the usefulness of finding keywords with the "F" key
1104010421270403201601 1403 08070303 ------------- - 26 "pretty" did not use much
1104020421250508201601 2020 08070505 ------------- - 26 not useful without the vocabulary window
1105010421279910211601 9999 08080302 ------------- - 26 useful
1106010421230610001701 9999 07090205 ------------- - 26 did not utilize their function, search would have been ?
1107010421270607201901 9999 05070705 ------------- - 26 extremely useful
1107020420270710201601 9999 07080605 ------------- - 26 very fast and efficient
1109010421991003201601 9999 08030309 ------------- - 26 also great
1111010421440505201702 9999 99990303 ------------- - 26 best part for me -- most like previous computer usages
1115010420300607201601 9999 09090909 ------------- - 26 easy
1118010421221010201601 2020 03060503 ------------- - 26 I think of the two as the same [as the vocabulary window] I didn't think about the vocabulary window until you asked.
Easy to use
1119010420320808201801 1812 09090707 ------------- - 26 the only way to fly
1120010420420807201601 1515 04040202 ------------- - 26 needed vocabulary function
1126010421340607201601 9999 07070101 ------------- - 26 unoriginal
1201010421240909211801 1003 00040810 ------------- - 26 very useful with the vocabulary window -- once found the word in the vocabulary window used "f" to take me to the location of the vocabulary word
1201020420261010212001 9999 09050708 ------------- - 26 would be better if a "spatial" AND were added
1202010420410606203498 9999 99990404 ------------- - 26 crucial
*** Impressions of Space, Item 5C: Assess the usefulness of the map window
1104010421270403201601 1403 08070303 ------------- - 27 would be written over each other -- difficult to read
1104020421250508201601 2020 08070505 ------------- - 27 didn't look at it
1105010421279910211601 9999 08080302 ------------- - 27 I couldn't really get into it. I had trouble focusing on my place in the map
1106010421230610001701 9999 07090205 ------------- - 27 shows where the majority of documents are found
1107010421270607201901 9999 05070705 ------------- - 27 very useful but I wish I cold tell what's in front of me and what's behind me. maybe using a different diacritic (ie, * vs. D_ for those citations that are in front on comparison to those that are behind
1107020420270710201601 9999 07080605 ------------- - 27 of what use is this?
1109010421991003201601 9999 08030309 ------------- - 27 scale is too small so wasn't too useful until I saw that all the docs were far away from the crosshairs
1111010421440505201702 9999 99990303 ------------- - 27 a beautiful distraction
1115010420300607201601 9999 09090909 ------------- - 27 not very helpful
1118010421221010201601 2020 03060503 ------------- - 27 I looked at it quite a bit, but I couldn't understand it all of the time.
I really only looked at it when I thought I was way out in space (too far out)
1119010420320808201801 1812 09090707 ------------- - 27 when lost
1120010420420807201601 1515 04040202 ------------- - 27 no use - I didn't notice it changing
1126010421340607201601 9999 07070101 ------------- - 27 saving grace
1201010421240909211801 1003 00040810 ------------- - 27 used to find my way around the space. looked for blue dots (document numbers) and yellow (keywords) to make a match. Knew that if I was near a dot, but the space was blank, I should zoom in and out to find the document word
1201020420261010212001 9999 09050708 ------------- - 27 very useful
1202010420410606203498 9999 99990404 ------------- - 27 moderate
*** Impressions of Space, Item 5D: Assess the usefulness of the spatial location of keywords
1104010421270403201601 1403 08070303 ------------- - 28 appears to not have man documents within system by simply looking at numbers
1104020421250508201601 2020 08070505 ------------- - 28 I found some keywords with nothing near them. Neither other keywords nor documents. How can this be?
1105010421279910211601 9999 08080302 ------------- - 28 fine, as long as you remember to move forward and backward as well as back and forth
1106010421230610001701 9999 07090205 ------------- - 28 in combination with document numbers, made choice easier
1107010421270607201901 9999 05070705 ------------- - 28 quite useful
1107020420270710201601 9999 07080605 ------------- - 28 couldn't figure out basis of locations
1109010421991003201601 9999 08030309 ------------- - 28 would like a tighter space. very useful
1111010421440505201702 9999 99990303 ------------- - 28 weird to adjust to
1115010420300607201601 9999 09090909 ------------- - 28 wonderful and creative
1118010421221010201601 2020 03060503 ------------- - 28 keywords bunched together, not enough separation. If the keywords are bunched and documents are bunched how can you tell which document goes with which keyword?
1119010420320808201801 1812 09090707 ------------- - 28 seemed to lack a really clear relation (word-to-word)
1120010420420807201601 1515 04040202 ------------- - 28 slightly helpful
1126010421340607201601 9999 07070101 ------------- - 28 it would be fine if it helped you find articles you wanted to read
1201010421240909211801 1003 00040810 ------------- - 28 good for finding related keywords
1201020420261010212001 9999 09050708 ------------- - 28 they hung together pretty well
1202010420410606203498 9999 99990404 ------------- - 28 very little
*** Impressions of Space, Item 5E: Assess the usefulness of the spatial location of documents
1104010421270403201601 1403 08070303 ------------- - 29 would lime more "prompts" for new users or screen to prompt where to go next
1104020421250508201601 2020 08070505 ------------- - 29 I found some keywords with nothing near them. Neither other keywords nor documents. How can this be?
1105010421279910211601 9999 08080302 ------------- - 29 fine
1106010421230610001701 9999 07090205 ------------- - 29 99
1107010421270607201901 9999 05070705 ------------- - 29 quite useful
1107020420270710201601 9999 07080605 ------------- - 29 couldn't figure out basis of locations
1109010421991003201601 9999 08030309 ------------- - 29 moderately helpful -- if couldn't find docs in a near space, then selected any docs that appeared at all
1111010421440505201702 9999 99990303 ------------- - 29 even weirder -- taught to read on one plane -- moving to another representation seems unnecessary
1115010420300607201601 9999 09090909 ------------- - 29 easy and fast
1118010421221010201601 2020 03060503 ------------- - 29 sometimes I had a screen of documents, nothing else. Perhaps if there were lines drawn to the relevant keyword, I could follow that
1119010420320808201801 1812 09090707 ------------- - 29 easy enough.
Brightness differences helps
1120010420420807201601 1515 04040202 ------------- - 29 slightly helpful
1126010421340607201601 9999 07070101 ------------- - 29 irrelevant -- I don't think in terms of document numbers
1201010421240909211801 1003 00040810 ------------- - 29 not very related to key words but ok for browsing though
1201020420261010212001 9999 09050708 ------------- - 29 they also have a semblance of hanging together
1202010420410606203498 9999 99990404 ------------- - 29 moderate -- if I was way far away from the concept, I wouldn't even bother to look
*** Impressions of Space, Item 6: Write any suggestions or other comments you have for the "Space" system
1105010421279910211601 9999 08080302 ------------- - 30 I found myself lost or frozen a few times due to my misuse of the mouse. It's too easy to get stuck, and not easy to get back to where you think you should be
1106010421230610001701 9999 07090205 ------------- - 30 screen moved very quickly, difficult to focus on one area
1107010421270607201901 9999 05070705 ------------- - 30 as I mentioned above: knowing what's in front of you and what's behind you would be a desirable feature
1109010421991003201601 9999 08030309 ------------- - 30 ability to join terms; in search show document number when displaying document text; show larger scale for map and label map as MAP on screen instead of Position
1115010420300607201601 9999 09090909 ------------- - 30 I like to idea to search a document in a frame combine time, space
1118010421221010201601 2020 03060503 ------------- - 30 a person must have abstract mind to use this. The data must be abstracted in their minds, because everything is not on the screen (the person has to keep it inside). Some people may have a hard time with that.
I think I have an abstract mind, and I had trouble at first
1119010420320808201801 1812 09090707 ------------- - 30 it seems as though the space should move the "other way," ie, when turning right, examples should move to the left, etc.
1120010420420807201601 1515 04040202 ------------- - 30 feels a bit like the difference between Macs and IBM. The spatial linking doesn't help me as I think in intellectual keyword ways
1126010421340607201601 9999 07070101 ------------- - 30 I'm sure it will provide you with job security for a long time to come. I really hope it works someday, and that everyone can have it, not just the well heeled few
1201010421240909211801 1003 00040810 ------------- - 30 make glove less jumpy -- if possible use 3D glasses. make words and documents more closely related somehow (perhaps by color coding)
1201020420261010212001 9999 09050708 ------------- - 30 space seems a bit haphazard
1202010420410606203498 9999 99990404 ------------- - 30 I don't see the need for a rotational mode. Left to right combined with forward / backward was all I used

REFERENCES

Aboulafia, M. (1991). Philosophy, Social Theory, and the Thought of George Herbert Mead. Albany, NY: State University of New York Press.
Afsarmanesh, H. & McLeod, D. (1989). "The 3DIS: An extensible object-oriented information management environment." ACM Transactions on Database Systems 7(4):339-377.
Anderson, R.C. & Shifrin, Z. (1980). "The meaning of words in context." in R.J. Spiro, B.C. Bruce and W.F. Brewer, Eds. Theoretical Issues in Reading Comprehension. Hillsdale, NJ: Lawrence Erlbaum Associates.
Apted, S.M. (1971). "General purposive browsing." Library Association Record 73(12):228-230.
Babbie, Earl. (1989). The Practice of Social Research. 5th ed. Belmont, California: Wadsworth.
Banchoff, T.F. (1990). Beyond the Third Dimension: Geometry, Computer Graphics, and Higher Dimensions. New York: Scientific American Library.
Barnett, G.A. (1988). "Pigs in space." in G.A.
Barnett and J.D. Woelfel, eds. Readings in the Galileo System: Theory, Method, and Application. Dubuque, Iowa: Kendall-Hunt.
Bawden, D. (1986). "Information systems and the stimulation of creativity." J. Information Science 12:203-216.
Belkin, N.J. (1975). "Towards a definition of information for informatics." in V. Horsnell, ed. Informatics 2. London: Aslib.
Belkin, N.J. (1984). "Cognitive models and information transfer." Social Science Information Studies 4:111-129.
Belkin, N.J. & Croft, W.B. (1987). "Retrieval techniques." in Martha E. Williams, ed. Annual Review of Information Science and Technology 22. New York: Elsevier.
Belkin, N.J. & Kwasnik, B. (1986). "Using structural representations of anomalous states of knowledge for choosing document retrieval strategies." in Fausto Rabitti, ed. 1986 ACM conference on research and development in information retrieval September 8-10, 1986. New York: ACM.
Belkin, N.J., Oddy, R.N., & Brooks, H.M. (1982). "ASK for information retrieval: Parts I & II." J. Documentation 38:61-71, 145-164.
Bobrow, D.G. (1975). "Dimensions of representation." in D.G. Bobrow and A. Collins, eds. Representation and understanding: Studies in cognitive science. New York: Academic Press.
Boorstin, Daniel. (1983). The Discoverers. New York: Random House.
Borgman, C.L. (1989). "All users of information retrieval systems are not created equal: An exploration into individual differences." Information processing and management 25(3):237-251.
Brittain, M. (1979). "Information systems as extensions of human memory." in T. Henriksen, ed. Proceedings of the 3rd International Research Forum in Information Science. Oslo: Publikasjoner.
Brookes, B.C. (1975). "The fundamental problem of information science." in V. Horsnell, ed. Informatics 2. London: Aslib.
Brookes, B.C. (1980). "The foundations of information science. Part III. Quantitative aspects: Objective maps and subjective landscapes." J. Information Science 2:269-275.
Burgess, C. & Swigger, K.
(1986). "A graphical database interface for casual, naive users." Information Processing and Management 22(6):511-521.
Bush, V. (1945). "As we may think." Atlantic Monthly 176:101-108.
BYTE (1990). "Computing without keyboards [Special issue]." July.
Canter, D., Rivers, R. & Storrs, G. (1985). "Characterizing user navigation through complex data structures." Behaviour and Information Technology 4(2):93-102.
Chang, S. (1990). "Visual reasoning for information retrieval from very large databases." J. Visual Languages and Computing 1:41-58.
Cleveland, D.B. & Cleveland, A.D. (1983). Introduction to Indexing and Abstracting. Littleton, CO: Libraries Unlimited.
Cove, J.F. & Walsh, B.C. (1988). "Online text retrieval via browsing." Information Processing and Management 24(1):31-37.
Cronk, G. (1987). The Philosophical Anthropology of George Herbert Mead. New York: Peter Lang.
Crouch, D.B. (1986). "The visual display of information in an information retrieval environment." in F. Rabitti, ed. ACM Conference on Research and Development in Information Retrieval.
Cushman, D.P. (1977). "The rules perspective as a theoretical basis for the study of human communication." Communication Quarterly 25:30-45.
Cushman, D.P. & Cahn, D.D. (1985). Communication in Interpersonal Relationships. Albany: State University of New York Press.
Daniels, P.J. (1986). "Cognitive models in information retrieval - an evaluative review." J. Documentation 42(4):272-304.
Danowski, J.A. (1988). "Organizational infographics and automated auditing: Using computers to unobtrusively gather as well as analyze communication." in G.M. Goldhaber and G.A. Barnett, eds. Handbook of Organizational Communication. Norwood, NJ: Ablex.
Davies, R. (1989). "The creation of new knowledge by information retrieval and classification." J. Documentation 45(4):273-301.
de Bono, E. (1978). "Lateral thinking and indexing." The Indexer 11(2):61-63.
Deerwester, S., Dumais, S.T., Landauer, T., Furnas, G. & Beck, L. (1988). "Improving information retrieval with latent semantic indexing." Proceedings of the American Society for Information Science Annual Meeting. Medford, NJ: Learned Information.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., & Harshman, R. (1990). "Indexing by latent semantic analysis." J. American Society for Information Science.
Deregowski, J.B. (1989). "Real space and represented space: Cross-cultural perspectives." Behavioral and Brain Sciences 12:51-118.
Dervin, B. (1983). "An overview of Sense-Making research: Concepts, methods and results to date." Paper presented at the annual meeting of the International Communication Association, Dallas.
Dervin, B. & Dewdney, P. (1986). "Neutral questioning: A new approach to the reference interview." RQ 25(4):97-104.
Dervin, B. & Nilan, M.S. (1986). "Information needs and uses." in M. Williams, ed. Annual Review of Information Science and Technology 21:3-33.
Doyle, L.B. (1962). "Indexing and abstracting by association." American Documentation 108:120.
Eisenberg, M.E. & Barry, C. (1988). "Order effects: A study of the possible influence of presentation order on user judgements of document relevance." J. American Society for Information Science 39(5):293-300.
Eliot, J. (1987). Models of Psychological Space. New York: Springer-Verlag.
Farradane, J.E.L. (1952). "A scientific theory of classification and indexing: Further considerations." J. Documentation 6(2):73-92.
Farradane, J. & Thompson, D. (1980). "The testing of relational indexing procedures by diagnostic computer programs." J. Information Science 2:285-297.
Freeman, C.A. & Barnett, G.A. (1991). "An alternative approach to using interpretive theory to examine corporate messages and organizational culture." Paper submitted to the International Communication Association, Chicago, Ill.
Furnas, G.W., Deerwester, S., Dumais, S.T., Landauer, T.K., Harshman, R., Streeter, L.A. & Lochbaum, K.E. (1988). "Information retrieval using a singular value decomposition model of latent semantic structure." Paper presented at the 1988 SIG/IR meeting in Grenoble, France.
Gerrie, B. (1983). Online Information Systems. Arlington: Information Resources Press.
Gibson, J.J. (1979). The Ecological Approach to Visual Perception. Boston: Houghton Mifflin Company.
Gibson, W. (1984). Neuromancer. New York: Ace.
Gluck, M. (1991). "Making sense of human wayfinding: Review of cognitive and linguistic knowledge for personal navigation with a new research direction." in D. Mark & A. Frank (eds) Cognitive and Linguistic Aspects of Geographic Space Proceedings of NATO ASI. New York: Kluwer Academic.
Goldstine, H.H. (1972). The computer from Pascal to von Neumann. Princeton, NJ: Princeton University Press.
Griffiths, A.H., Luckhurst, C. & Willett, P. (1986). "Using interdocument similarity information in document retrieval systems." J. American Society for Information Science 37(1):3-11.
Harper, D.J. & van Rijsbergen, C.J. (1978). "An evaluation of feedback in document retrieval using co-occurrence data." J. Documentation 34(3):189-216.
Hildreth, C. (1982). "The concept and mechanics of browsing in an online catalog." Proceedings of the National Online Meeting 181-193.
Hofstadter, Douglas R. (1979). Godel, Escher, Bach: An Eternal Golden Braid. New York: Random House.
Joas, H. (1980). G.H. Mead: A Contemporary Re-examination of his Thought. Frankfurt: Suhrkamp-Verlag.
Johnson-Laird, P.N. (1988a). The Computer and the Mind: An Introduction to Cognitive Science. Cambridge: Harvard University Press.
Johnson-Laird, P.N. (1988b). "How is meaning mentally represented?" International Social Science J. 2:45-62.
Jones, W.P. & Furnas, G.W. (1987). "Pictures of relevance: A geometric analysis of similarity measures." J. American Society for Information Science 38(6):420-442.
Kaplan, A. (1964). The Conduct of Inquiry. San Francisco: Chandler.
Kent, A. & Lancour, H. (Eds.). (1970).
Encyclopedia of Library and Information Science. New York: Marcel Dekker.
Kerlinger, F.N. (1986). Foundations of Behavioral Research. New York: Holt, Rinehart, and Winston.
Kerr, S.T. (1990). "Wayfinding in an electronic database: The relative importance of navigational cues vs. mental models." Information Processing and Management 26(4):511-523.
Kincaid, D.L. (1988). "The convergence theory of communication and cultural change." in J.D. Woelfel, and E.L. Fink, eds. Readings in the Galileo System. Dubuque, Iowa: Kendall-Hunt.
Krippendorff, K. (1980). Content Analysis. Beverly Hills: Sage.
Koll, M.B. (1978). The Concept Space in Information Retrieval Systems as a Model of Human Concept Relations. PhD Thesis. Syracuse: School of Information Studies.
Kuhn, T.S. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Larson, J.A. (1986). "A visual approach to browsing in a database environment." IEEE Computer. June: 62-71.
Law, J. & Whittaker, J. (1992). "Mapping acidification research: A test of the co-word method." Scientometrics 23(3):417-461.
Licklider, J.C.R. (1965). Libraries of the Future. Cambridge: MIT Press.
Licklider, J.C.R. (1969). "A picture is worth a thousand words -- and it costs." in AFIPS Conference Proceedings 617-621.
Littlejohn, S.W. (1983). Theories of Human Communication. Belmont, California: Wadsworth.
Lynch, K. (1960). The Image of the City. Cambridge, Massachusetts: MIT Press.
Mantei, M.M. (1982). Disorientation Behavior in Person-Computer Interaction. PhD Thesis. Los Angeles: Department of Communication, University of Southern California.
Marchionini, G. & Shneiderman, B. (1988). "Finding facts vs. browsing knowledge in hypertext systems." IEEE Computer. January: 70-80.
Maturana, H.R. & Varela, F.J. (1987). The Tree of Knowledge. Boston: New Science Library.
McGill, M.J. (1975). "Projections within knowledge spaces: The implications for information storage and retrieval systems."
Proceedings of the American Society for Information Science Annual Meeting. Washington, DC: The American Society for Information Science.
McGill, M.J. (1976). "Knowledge and information spaces: Implications for retrieval systems." J. American Society for Information Science 205-210.
Mead, G.H. (1960). On Social Psychology. Chicago: University of Chicago Press.
Meincke, P.P.M. & Atherton, P. (1976). "Knowledge space: A conceptual basis for the organization of knowledge." J. American Society for Information Science 18-24.
Miller, G.A. (1968). "Psychology and information." American Documentation 286-289.
Morse, P.M. (1973). "Browsing and search theory." in C.H. Rawski, ed. Toward a Theory of Librarianship. Metuchen, NJ: The Scarecrow Press.
Newby, G.B. (1988). "A self-concept based approach to artificial intelligence, with a case study of the Galileo computer system." Unpublished M.A. Thesis. Albany: State University of New York.
Newby, G.B. (1989). "User models for information retrieval: Applying knowledge about human communication to computer interface design." Proceedings of the American Society for Information Science Annual Conference. Medford, NJ: Learned Information.
Newby, G.B., Nilan, M.S., & Duvall, L.M. (1991). "Towards a reassessment of individual differences for information systems: The power of user-based situational predictors." in Proceedings of the American Society for Information Science Annual Meeting. Medford, New Jersey: Learned Information.
Nilan, M.S., Newby, G.B., Paik, W., & Lopatin, K. (1989). "User-oriented interfaces for computer systems: A user-defined online help system for desktop publishing." in Proceedings of the American Society for Information Science Annual Meeting. Medford, New Jersey: Learned Information.
Nilan, M.S. and Rosenbaum, H. (1991). "An epistemology for Sense-Making research." Unpublished manuscript.
Noerr, P.L. & Noerr, K.T. (1985). "Browse and navigate: An advance in database access methods."
Information Processing and Management 21(3):205-213.
Noreault, T., McGill, M.J., & Koll, M.B. (1979). An Evaluation of Factors Affecting Document Ranking by Information Retrieval Systems. Syracuse: School of Information Studies, Syracuse University. Research supported by the National Science Foundation, Grant NSF-IST-78-10454.
Norman, D.A. (1988). The Psychology of Everyday Things. New York: Basic Books.
Oddy, R.N. (1977). "Information retrieval through man-machine dialogue." J. Documentation 33:1-14.
Oddy, R.N. & Balakrishnan, B. (1991). "PTHOMAS: An adaptive information retrieval system on The Connection Machine." Information Processing and Management 27(4):255-389.
Palay, A.J. & Fox, M.S. (1981). "Browsing through databases." in R.N. Oddy, S.E. Robertson, C.J. van Rijsbergen, and P.W. Williams, eds. Information Retrieval Research. London: Butterworths.
Pao, M.L. (1989). Concepts of Information Retrieval. Englewood, Colorado: Libraries Unlimited.
Parsaye, K. & Chignell, M. (1988). Expert Systems for Experts. New York: John Wiley and Sons.
Pejtersen, A.M. (1989). "A library system for information retrieval based on a cognitive task analysis and supported by an icon-based interface." in N.J. Belkin and C.J. van Rijsbergen, eds. Proceedings of the Twelfth Annual ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM.
Raghavan, V.V. & Wong, S.K.M. (1986). "A critical analysis of vector space model [sic] for information retrieval." J. American Society for Information Science 37():279-287.
Raghavan, V.V. & Yu, C.T. (1979). "Experiments on the determination of the relationships between terms." ACM Transactions on Database Systems 4(2):240-260.
Reynolds, P.D. (1971). A Primer on Theory Construction. New York: MacMillan.
Rheingold, H. (1991). Virtual Reality. New York: Summit Books.
Robertson, S.E. (1979). "Relevance, retrieval, and document spaces." in T. Henriksen, ed.
Proceedings of the 3rd International Research Forum in Information Science. Oslo: Publikasjoner.
Rodgers, J.L. & Nicewander, W.A. (1988). "Thirteen ways to look at the correlation coefficient." The American Statistician 42(1):59-66.
Roget, P.M. (1977). Roget's International Thesaurus. 4th ed., revised by Chapman, R.L. New York: Harper and Row.
Rumelhart, D.E. and McClelland, J.L. (Eds.). (1986). Parallel Distributed Processing. Cambridge, MA: MIT Press.
Salton, G., Fox, E.A. & Voorhees, E. (1985). "Advanced feedback methods in information retrieval." J. American Society for Information Science 36(3):200-210.
Salton, G. & McGill, M.J. (1983). Introduction to Modern Information Retrieval. New York: McGraw Hill.
Saracevic, T. & Kantor, P. (1988). "A study of information seeking and retrieving. II. Users, questions, and effectiveness." J. American Society for Information Science 39(3):177-196.
SAS Institute. (1985). SAS User's Guide: Statistics. Cary, North Carolina: SAS Institute.
Schamber, L., Eisenberg, M.E., & Nilan, M.S. (1991). "Towards a dynamic, situational definition of relevance." Information Processing and Management 26(2):755-776.
Schauble, P. (1989). "Improving the effectiveness of retrieval systems by information structures." Information Processing and Management 25(4):363-376.
Shannon, C. & Weaver, W. (1949). The Mathematical Theory of Communication. Urbana, IL: University of Illinois Press.
Shneiderman, B., Shafer, P., Simon, R. & Weldon, L. (1986). "Display strategies for program browsing: Concepts and experiment." IEEE Software May, 7-14.
Sievert, M.C. & Andrews, M.J. (1991). "Indexing consistency in Information Science Abstracts." J. American Society for Information Science 42(1):1-6.
Taylor, R.S. (1968). "Question-negotiation and information seeking in libraries." College and Research Libraries 29(1):178-191.
Thompson, D.A. (1971). "Interface design for an interactive information retrieval system: A literature survey and a research system description." J. American Society for Information Science 361-373.
Trivison, D. (1987). "Term co-occurrence in cited/citing journal articles as a measure of document similarity." Information Processing and Management 23(3):183-194.
van Rijsbergen, C.J. (1977). "A theoretical basis for the use of co-occurrence data in information retrieval." J. Documentation 33(2):106-119.
von Bertalanffy, L. (1961). General Systems Theory. New York: George Braziller.
von Eckartsberg, R. (1981). "Maps of the mind: The cartography of consciousness." in R.S. Valle and R. von Eckartsberg, eds. The Metaphors of Consciousness. New York: Plenum Press.
Winograd, T. & Flores, F. (1987). Understanding Computers and Cognition. Reading, Massachusetts: Addison-Wesley.
Woelfel, J.D. & Fink, E.L. (1980). The Measurement of Communication Processes: Galileo Theory and Method. New York: Academic Press.
Woelfel, J.D., Holmes, R., Cody, M.J., & Fink, E.L. (1988). "A multidimensional scaling method for designing persuasive messages and measuring their effects." in J.D. Woelfel, and E.L. Fink, eds. Readings in the Galileo System. Dubuque, Iowa: Kendall-Hunt.
Wong, S.K.M., Ziarko, W., Raghavan, V.V., & Wong, P.C.N. (1987). "On modeling of information retrieval concepts in vector spaces." ACM Transactions on Database Systems 12(2):299-321.
Yovits, M.C., de Korvin, A., Kleyle, R., & Mascarenhas, M. (1987). "External documentation and its quantitative relationship to the internal information state of a decision maker: The information profile." J. American Society for Information Science 38(6):405-419.
Yu, C.T., Buckley, C., Lam, K., & Salton, G. (1983). "A generalized term dependence model in information retrieval." Information Technology, Research and Development 2:129-154.

BIOGRAPHICAL DATA

Name: Gregory Barton Newby
Date and place of birth: February 9, 1965, Montreal, Canada
Elementary School: Main Street School, Port Washington, New York
High School: Paul D. Schreiber High School, Port Washington, New York. Graduated June, 1983
College: State University of New York, University Center at Albany, Albany, New York. Dual major in Communication and Psychology. Bachelor of Arts awarded May, 1987
Graduate Work: State University of New York, University Center at Albany, Albany, New York. Study in Department of Communication. Master of Arts awarded December, 1988
Syracuse University, Syracuse, New York. Study in The School of Information Studies. Doctor of Philosophy to be awarded in 1993