Evaluating Cognitive Models and Architectures
Abstract
Cognitive modeling is a promising research area combining the fields of psychology and artificial intelligence. Cognitive models have been successfully applied to topics like theory generation in psychology, usability testing in human-computer interface research, and cognitive tutoring in algebra teaching. This paper focuses on two major problems concerning the evaluation of cognitive models and architectures. First, it is shown that current approaches to quantifying model complexity are not sufficient; second, the claim is reinforced that the flexibility of cognitive architectures makes them unfalsifiable and therefore too weak to be considered theories.

Cognitive architectures

From a psychologist’s point of view, the main goal of cognitive modeling is to simulate human cognition. A cognitive model is seen as an instantiation of a theory in computational form, which can then be used to test that theory empirically. Cognitive architectures like SOAR (Laird, Newell, & Rosenbloom 1987), EPIC (Kieras & Meyer 1997), and ACT-R (Anderson et al. 2004) are useful frameworks for the creation of cognitive models. In addition, they are often regarded as useful steps towards a unified theory of cognition (Newell 1990). When a computational model is used to evaluate a theory, the main goal is (generalizable) concordance with human data. But this is not a sufficient proof of the theory. The following problems are often stated in the literature:

• Compliance with the theory. The underlying theory and its implementation (in a cognitive architecture) may diverge.

• Irrelevant specification (Reitman 1965). Most models need to make additional assumptions. It is very hard to tell which parts of a model are responsible for its overall performance and which parts may be obsolete. Newell (1990) demanded to “listen to the architecture” when creating new models in order to get around this problem. While this is a recommendable request, it is hard to verify that the modeler has complied with it.

• Underlying theories. A special case of the irrelevant specification problem can occur when the architecture incorporates knowledge from previous research, for example Fitts’ law (Fitts 1954) about the time needed to perform a rapid movement of the forearm (a sketch follows this list). These underlying empirical laws may provide a large part of the fit of the cognitive model in question.

• Interpretability. Most of the time, one needs to look at the code of a model in order to understand what it really does. As many psychologists do not hold a degree in computer science, this hinders them from participating in the cognitive modeling research field.
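To make the “underlying theories” point concrete, the following is a minimal sketch of Fitts’ law as a stand-alone movement-time prediction, using the original Fitts (1954) formulation MT = a + b * log2(2D/W). The function name is hypothetical, and the intercept a and slope b are illustrative placeholders, not the calibrated values of any particular architecture.

    import math

    def fitts_movement_time(distance, width, a=0.1, b=0.15):
        # Predicted movement time in seconds under Fitts' law (Fitts 1954):
        #   MT = a + b * log2(2 * D / W)
        # distance (D) and width (W) must be given in the same unit; a and b
        # are illustrative placeholder constants, not values taken from any
        # particular cognitive architecture.
        index_of_difficulty = math.log2(2 * distance / width)
        return a + b * index_of_difficulty

    # Example: a 200 mm reach to a 20 mm wide target
    print(fitts_movement_time(distance=200, width=20))

Because a regularity of this kind is typically built into an architecture’s motor machinery, a close fit on movement-time data may reflect Fitts’ law itself rather than the model under evaluation.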
The usage of a cognitive architecture (as opposed to modeling everything from scratch) does not solve these problems completely, but it weakens them noticeably. Cognitive architectures are subject to social control by the research community. This way, theory compliance, interpretability, and the rejection of irrelevant parts are enforced at least on the level of the architecture.

An informative evaluation of cognitive models has been done by Gluck & Pew (2005). They applied several cognitive architectures to the same task and compared the resulting models not only with regard to goodness-of-fit but also with regard to modeling methodology, the features and implied assumptions of the particular architecture, and the role of parameter tuning during the modeling process. This approach of Gluck & Pew is well suited to addressing the problems of model evaluation beyond goodness-of-fit, and I therefore recommend it as standard practice.

Evaluation methodology for cognitive models

I stated above that, from a psychologist’s point of view, concordance with human data is the main goal of cognitive modeling. This concordance is typically assessed by computing goodness-of-fit indices. This approach has been criticized sharply during the last years (Roberts & Pashler 2000), initiating a discussion that concluded with the joint statement that a good fit is necessary, but not sufficient, for the validity of a model (Rodgers & Rowe 2002; Roberts & Pashler 2002).
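As an illustration of how such indices are typically computed, the sketch below reports the root-mean-square deviation (RMSD) and the squared Pearson correlation (r²) between a model’s predictions and mean human data. The function name and the data points are invented placeholders, not results from any actual model or experiment.

    import numpy as np

    def goodness_of_fit(model, human):
        # Two commonly reported goodness-of-fit indices: RMSD measures absolute
        # deviation from the data, r^2 measures agreement in relative trend.
        model = np.asarray(model, dtype=float)
        human = np.asarray(human, dtype=float)
        rmsd = np.sqrt(np.mean((model - human) ** 2))
        r = np.corrcoef(model, human)[0, 1]
        return rmsd, r ** 2

    # Invented per-condition mean reaction times in seconds (placeholders only)
    model_pred = [0.52, 0.61, 0.74, 0.88]
    human_mean = [0.50, 0.65, 0.70, 0.90]
    print(goodness_of_fit(model_pred, human_mean))

A low RMSD and a high r² by themselves say nothing about how flexible the model is, which is exactly the point of the Roberts & Pashler critique.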