The Right Threshold Value: What Is the Right Threshold of Cosine Measure When Using Latent Semantic Analysis for Evaluating Student Answers?

Authors

Phanni Penumatsa

Matthew Ventura

Brent A. Olde

Donald R. Franceschetti

Arthur C. Graesser

Abstract

Auto Tutor is an intelligent tutoring system that holds conversations with learners in natural language. Auto Tutor uses Latent Semantic Analysis (LSA) to match the sentences a student generates in response to essay-type questions against a set of sentences: those that would appear in a complete and correct response (expectations) and those that reflect common but incorrect understandings of the material (bads). The correctness of student contributions is decided using a threshold value of the LSA cosine between the student answer and the expectations. Our results indicate that the best agreement between LSA matches and the evaluations of subject matter experts can be obtained if the cosine threshold is allowed to be a function of the lengths of both the student answer and the expectation being considered.

FLAIRS 2003. Copyright © 2003, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

Auto Tutor is a computer tutor that simulates natural discourse while executing pedagogically appropriate turns. Auto Tutor engages students in a natural language dialog (Graesser, Person, Harter, & TRG, 2001; Graesser, VanLehn, Rose, Jordan, & Harter, 2001) built around a series of questions in the subject being tutored. Auto Tutor understands student expressions by means of Latent Semantic Analysis (LSA), one of its major components. LSA is a statistical corpus-based technique for understanding natural language that represents words, sentences, or paragraphs (generically termed "documents") as vectors in a high-dimensional vector space derived from the corpus. The most commonly employed measure of agreement between documents is the cosine of the angle between the corresponding vectors (Kintsch, 1998; Landauer and Dumais, 1997; Landauer, Foltz and Laham, 1998). The present study focuses on the appropriate cosine threshold value for declaring agreement in a tutoring situation.
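The cosine-and-threshold decision described above can be sketched in a few lines. This is only an illustration, not Auto Tutor's code: the function names `cosine` and `is_match` are hypothetical, the vectors are assumed to be plain lists of numbers already produced by LSA, and treating an all-zero vector as a non-match is an assumption made here.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: an all-zero vector is treated as matching nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def is_match(answer_vec: Sequence[float],
             expectation_vec: Sequence[float],
             threshold: float) -> bool:
    """Declare agreement when the cosine exceeds the given threshold."""
    return cosine(answer_vec, expectation_vec) > threshold
```

The threshold is left as an explicit parameter because the study's point is that a single fixed value may not be appropriate for documents of every length.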
In Auto Tutor 2.0 for conceptual physics we declared a match whenever a cosine greater than 0.65 was found between the student's answer and the expectation with which it was compared. Had a higher value been used, fewer matches would have been found and students would have been prodded to revise their answers more often. This would lead to student frustration if the answer were in fact correct but merely phrased differently from the expectation. Further, a length effect might be expected: the probability of two longer documents scoring a high cosine match by accident would not necessarily be the same as that for two shorter documents. To determine whether such a length effect exists, we compared the consistency of LSA cosine-based ratings with those of human experts for short, medium, and longer documents. The following sections review LSA and Auto Tutor separately. We then report the results of our document length study.

Latent Semantic Analysis

LSA is a statistical corpus-based text comparison technique that was originally developed for text retrieval. Nowadays it is more often used to capture the content of large bodies of text (Kintsch, 1998; Landauer & Dumais, 1997; Landauer, Foltz and Laham, 1998). LSA has been tested in the grading of essays (Foltz, Gilliam & Kendall, 2000) and found to assign grades consistent with the judgment of experts in composition. LSA begins with a corpus, a body of documents, generally derived from published texts or reference works. The documents can be individual sentences or paragraphs, or some other convenient unit. From the corpus a rectangular matrix is constructed, with one row for each distinct word in the text and one column for each document. The matrix elements may then be subjected to a mathematical weighting process based on the frequency of occurrence of the words in the document or in the English language as a whole (Berry, Dumais & O'Brien, 1995).
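As a rough illustration of the word-by-document matrix just described, the sketch below builds one from a list of document strings. The helper names, the lowercase whitespace tokenization, and the log(1 + count) weighting are all assumptions chosen for brevity; the text does not specify which weighting scheme was used.

```python
import math
from collections import Counter
from typing import List, Tuple


def term_document_matrix(docs: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Return (vocabulary, matrix): one row per distinct word, one column per document."""
    # Assumption: lowercase whitespace tokenization; real systems tokenize more carefully.
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    # Entry [i][j] is how often word i occurs in document j.
    matrix = [[c[word] for c in counts] for word in vocab]
    return vocab, matrix


def log_weight(matrix: List[List[int]]) -> List[List[float]]:
    """Apply log(1 + count) to every entry (one simple local weighting, assumed here)."""
    return [[math.log(1 + x) for x in row] for row in matrix]


# Hypothetical two-document corpus, for illustration only.
vocab, matrix = term_document_matrix(["force equals mass", "mass times acceleration"])
weighted = log_weight(matrix)
```

Each column of the result is the raw vector for one document; LSA then reduces the dimensionality of this matrix before any cosines are computed.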
The resulting matrix is then subjected to a singular value decomposition (SVD), by which it is expressed as the product of three matrices, the second of which is diagonal with the singular values appearing in decreasing order. For the purposes of latent semantic analysis, all but the N largest diagonal elements are then set equal to zero and the matrices are re-multiplied. The N chosen is typically of the order of a few hundred. This process is thought to eliminate aspects of word use in the text that are incidental to the expression of meaning while preserving correlations between words that capture meanings expressed in the text. Once a corpus has been constructed and the corresponding word-document matrix has been transformed as outlined above, one can represent any combination of words in the corpus as a vector by forming a linear combination of the rows representing the component words. For any pair of word combinations, then, there will be two vectors in the abstract N-dimensional space defined by the SVD that meet at an angle, the cosine of which is readily calculated from the vector "dot product," the sum of the pair-wise products of the N components. LSA cosine values successfully predict the coherence of successive sentences in a text (Foltz, Kintsch and Landauer, 1998), the similarity between student answers and ideal answers to questions (Graesser, P. Wiemer-Hastings et al., 2000), and the structural distance between nodes in conceptual graph structures (Graesser, Karnavat, Pomeroy, P. Wiemer-Hastings & TRG, 2000). At this point researchers are exploring the strengths and limitations of LSA in representing world knowledge.

LSA Use in Auto Tutor

A thorough description of Auto Tutor is provided in Graesser et al. (1999) and Graesser, Person, Harter and TRG (2001). We provide only a general overview here. Auto Tutor's style of tutoring is modeled after actual human tutoring strategies (Graesser, Person and Magliano, 1995).
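For reference, the SVD truncation and the cosine described in the LSA overview above can be written compactly as follows. The symbols A, U, Σ, V, x, and y are notation assumed here for illustration; only N comes from the text.

```latex
% Word-by-document matrix A factored by the SVD,
% with singular values in decreasing order.
A = U \Sigma V^{\mathsf{T}}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \dots), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge 0

% Keep the N largest singular values, zero the rest, and re-multiply.
A_N = U \Sigma_N V^{\mathsf{T}}, \qquad
\Sigma_N = \operatorname{diag}(\sigma_1, \dots, \sigma_N, 0, \dots, 0)

% Cosine between two document vectors x and y in the N-dimensional space.
\cos\theta
  = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}
  = \frac{\sum_{i=1}^{N} x_i y_i}
         {\sqrt{\sum_{i=1}^{N} x_i^2}\,\sqrt{\sum_{i=1}^{N} y_i^2}}
```

The numerator is the dot product the text refers to; dividing by the two vector lengths makes the measure independent of the vectors' magnitudes.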
The tutor starts out by asking a question or posing a problem that requires a paragraph-length answer. The tutor then works with the student to revise the paragraph until it covers the essential points (expectations) that the tutor deems to constitute a correct and complete answer (Olde et al., 2002). Once a question has been satisfactorily answered, the tutor poses the next question. Auto Tutor's general knowledge of its tutoring domain resides in the corpus of texts from which the LSA vector space has been constructed, while the expectations, probable bad answers, and repertoire of dialog moves for each question are contained in separate curriculum scripts. The main dialog moves available to Auto Tutor are hints, pumps, and assertions. There are a variety of additional dialog moves in the curriculum script that need not be addressed in the present study (Olde et al., 2002; Graesser, Person, Harter, 2001). Auto Tutor matches student responses to the expectations and probable bad answers for each question by calculating the LSA cosine between them. Based on the computed cosines, Auto Tutor selects its next dialog move, which might include positive, negative, or neutral feedback, pumps for additional information, a prompt for specific words, a hint, assertion, summary, correction, or a follow-up question. The smoothness of the mixed-initiative dialog in Auto Tutor critically depends on the

Similar resources

The Right Stuff: Do You Need to Sanitize Your Corpus When Using Latent Semantic Analysis?

Student responses to conceptual physics questions were analyzed with latent semantic analysis (LSA), using different text corpora. Expert evaluations of student answers to questions were correlated with LSA metrics of the similarity between student responses and ideal answers. We compared the adequacy of several text corpora in LSA performance evaluation, including the inclusion of written inco...

Meaning of “the Right Imam” based upon the Holy Quran’s Verses

The concept of “the Right Imam” is one of the most significant Quranic concepts and has attracted the attention of various jurisprudential, theological, mystical, interpretative, narrative and historical schools. However, it has not been dealt with by a semantic approach yet. Although the word “Imam” with the meaning of right leader has been used in 5 ranks in the Holy Quran, it could be said t...

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

oIQa: An Opinion Influence Oriented Question Answering Framework with Applications to Marketing Domain

Understanding questions and answers in Question Answering (QA) system is a major challenge in the domain of natural language processing. In this paper, we present a question answering system that influences the human opinions in a conversation. The opinion words are quantified by using a lexicon-based method. We apply Latent Semantic Analysis and the cosine similarity measure between candidate ...

Publication date: 2003
