Search results for: evaluation metrics

Number of results: 878773

2007
Pradipta Biswas

Evaluation is an unavoidable feature of any teaching or learning scenario. Strategies for evaluating students differ widely throughout the world. Further, most institutes do not use any objective technique to assess the teaching performance of a teacher. The present paper defines performance metrics for both student and teacher evaluation and also discusses the methodology for calculat...

2011
Annika Silvervarg Arne Jönsson

In this paper we present results from a study of subjective and objective evaluation metrics used to assess a conversational agent. Our study has been conducted in a school setting with students aged 12 to 14, who used a virtual learning environment that incorporates social conversation with a pedagogical agent. The subjective evaluation metrics capture the students’ experiences of di...

1998
R Harrison

Various object-oriented metrics have been proposed as a way of capturing features of object-oriented software such as encapsulation (information hiding), abstraction and inheritance. A major criticism of past object-oriented metrics is that little attention has been paid to theoretical validation or empirical evaluation of those metrics. By theoretical validation we refer to the process of ensu...
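
To make one of these ideas concrete, the sketch below computes a single inheritance-related measure, depth of inheritance tree, for some hypothetical Python classes; it is a generic illustration, not a metric taken from this paper.

```python
# Minimal sketch: depth of inheritance tree (DIT), one classic OO metric.
# The example classes are hypothetical; the measure itself is generic.

def dit(cls) -> int:
    """Longest path from cls up to the root of the inheritance hierarchy."""
    if not cls.__bases__ or cls is object:
        return 0
    return 1 + max(dit(base) for base in cls.__bases__)

class Shape: ...
class Polygon(Shape): ...
class Rectangle(Polygon): ...

if __name__ == "__main__":
    # Rectangle -> Polygon -> Shape -> object gives a depth of 3
    print(dit(Rectangle))
```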

1995
Jianjun Zhao Jingde Cheng Kazuo Ushijima

Software metrics have many applications in software engineering activities including analysis, testing, debugging, and maintenance of programs, and project management. Until now a number of complexity metrics have been proposed and used for measuring sequential programs, but few can be used for measuring concurrent and distributed programs. Cheng proposed a group of dependence-based comple...
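
The abstract does not spell out the metrics themselves; purely as a hypothetical illustration of a dependence-based count, the sketch below tallies control-, data-, and synchronization-dependence edges in a hand-built dependence graph. The graph, the edge kinds, and the counting scheme are assumptions, not Cheng's definitions.

```python
# Hypothetical sketch of a dependence-based complexity count.
# Each edge is (source_stmt, target_stmt, kind), where kind is one of
# "control", "data", or "sync"; the per-kind counts serve as crude metrics.
from collections import Counter

edges = [
    ("s1", "s2", "control"),
    ("s2", "s3", "data"),
    ("s2", "s4", "data"),
    ("s3", "s5", "sync"),   # inter-process synchronization dependence
    ("s4", "s5", "control"),
]

def dependence_counts(edges):
    """Return the number of dependence edges of each kind."""
    return Counter(kind for _, _, kind in edges)

if __name__ == "__main__":
    counts = dependence_counts(edges)
    print(counts)                # Counter({'control': 2, 'data': 2, 'sync': 1})
    print(sum(counts.values()))  # total dependence edges: 5
```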

2005
Enrique Amigó Julio Gonzalo Anselmo Peñas M. Felisa Verdejo

This paper presents a probabilistic framework, QARLA, for the evaluation of text summarisation systems. The input of the framework is a set of manual (reference) summaries, a set of baseline (automatic) summaries and a set of similarity metrics between summaries. It provides i) a measure to evaluate the quality of any set of similarity metrics, ii) a measure to evaluate the quality of a summary...
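
As a rough sketch of the kind of probabilistic measure such a framework can define (a paraphrase under assumptions, not the paper's exact formulation), the code below estimates how often an automatic summary is at least as similar to a reference summary as two references are to each other, under every metric in a set; the toy similarity metric and the example summaries are hypothetical.

```python
# Hedged sketch: fraction of reference triples (m, m', m'') for which,
# under every metric in the set, the automatic summary is at least as
# similar to m as m' is to m''.  Not the paper's exact definition.
from itertools import product

def unigram_overlap(a: str, b: str) -> float:
    """Toy similarity metric (stand-in for real summary-similarity metrics)."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(len(ta | tb), 1)

def queen_like(auto, models, metrics):
    triples = list(product(models, repeat=3))
    hits = sum(
        all(x(auto, m) >= x(m1, m2) for x in metrics)
        for m, m1, m2 in triples
    )
    return hits / len(triples)

if __name__ == "__main__":
    models = [
        "the cat sat on the mat",
        "a cat was sitting on the mat",
        "the cat is on the mat",
    ]
    auto = "the cat sat on a mat"
    print(queen_like(auto, models, [unigram_overlap]))
```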

2008
M. Chambah S. Ouni M. Herbin E. Zagrouba

In the field of image quality assessment, the terms “automatic” and “subjective” are often incompatible. In fact, when it comes to image quality assessment, there are mainly two kinds of evaluation techniques: subjective evaluation and objective evaluation. Only objective evaluation techniques are automatable, while subjective evaluation techniques are performed by a series of visual a...
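
As a generic example of an automatable objective technique (not the method this paper proposes), the sketch below computes PSNR between a reference image and a distorted copy; the 8-bit peak value and the random test images are assumptions.

```python
# Generic objective image-quality metric: peak signal-to-noise ratio (PSNR).
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """PSNR in dB between two images of identical shape."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    noisy = np.clip(ref + rng.normal(0, 5, size=ref.shape), 0, 255).astype(np.uint8)
    print(f"PSNR: {psnr(ref, noisy):.2f} dB")
```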

2010
Hideki Isozaki Tsutomu Hirao Kevin Duh Katsuhito Sudoh Hajime Tsukada

Automatic evaluation of Machine Translation (MT) quality is essential to developing high-quality MT systems. Various evaluation metrics have been proposed, and BLEU is now used as the de facto standard metric. However, when we consider translation between distant language pairs such as Japanese and English, most popular metrics (e.g., BLEU, NIST, PER, and TER) do not work well. It is well known ...
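
For orientation only, the sketch below shows the core of BLEU-style evaluation, clipped (modified) n-gram precision for a single hypothesis/reference pair; full BLEU also aggregates over a corpus, combines several n-gram orders, and applies a brevity penalty, and this is not the metric the paper itself proposes.

```python
# Simplified BLEU-style modified n-gram precision for one sentence pair.
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(hypothesis: str, reference: str, n: int = 2) -> float:
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    # Clip each hypothesis n-gram count by its count in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return clipped / total if total else 0.0

if __name__ == "__main__":
    hyp = "the cat is on the mat"
    ref = "there is a cat on the mat"
    print(modified_precision(hyp, ref, n=2))  # 2 of 5 bigrams match: 0.4
```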

2007
Dror G. Feitelson

Metrics ought to be objective, as they are the judge of performance. Workloads ought to be representative, so that evaluations will lead to applicable results. But sometimes metrics and workloads collude to taint the performance evaluation process, leading to results of dubious merit. We use a case study dealing with parallel job scheduling to exemplify these issues. An analysis of interactions...
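
One concrete example of how metric choice matters in this setting (a sketch using the commonly cited bounded-slowdown formulation; the 10-second threshold and the job figures are assumptions) is shown below: a very short job with a modest wait dominates plain slowdown but not its bounded variant.

```python
# Sketch: response time, slowdown, and bounded slowdown for parallel jobs.
# The 10-second threshold is a common choice in the literature, used here
# as an assumption; conclusions can change noticeably with the metric chosen.

def bounded_slowdown(wait: float, run: float, threshold: float = 10.0) -> float:
    """Like slowdown, but very short jobs cannot inflate it arbitrarily."""
    return max((wait + run) / max(run, threshold), 1.0)

jobs = [
    {"wait": 300.0, "run": 3600.0},  # long job, moderate wait
    {"wait": 300.0, "run": 2.0},     # very short job, same wait
]

for job in jobs:
    resp = job["wait"] + job["run"]
    slowdown = resp / job["run"]
    print(f"response={resp:.0f}s  slowdown={slowdown:.1f}  "
          f"bounded_slowdown={bounded_slowdown(**job):.1f}")
```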

2015
Alessandro Canossa Gillian Smith

Existing approaches to characterizing and evaluating level designs involve either the application of theory-based language to qualitatively describe the level’s structure, or empirical evaluation of how players experience the levels. In this paper, we propose a method for evaluation that bridges these two approaches: theory-based, quantitative metrics that measure the qualities of levels. The me...

2007
Tetsuya Sakai

Large-scale information retrieval evaluation efforts such as TREC and NTCIR have tended to adhere to binary-relevance evaluation metrics, even when graded relevance data were available. However, the NTCIR-6 Crosslingual Task has finally started adopting graded-relevance metrics, though only as additional metrics. This paper compares three existing graded-relevance metrics that were mentioned in...
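
For reference, the sketch below computes one widely used graded-relevance metric, normalized discounted cumulative gain (nDCG), over a hypothetical ranked list; it is a generic illustration and not necessarily one of the three metrics compared in the paper.

```python
# Generic sketch of one graded-relevance metric: nDCG.
# The gain values and cutoff are hypothetical.
import math

def dcg(gains, cutoff=None):
    gains = gains[:cutoff] if cutoff else gains
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(ranked_gains, cutoff=None):
    ideal = sorted(ranked_gains, reverse=True)
    denom = dcg(ideal, cutoff)
    return dcg(ranked_gains, cutoff) / denom if denom else 0.0

if __name__ == "__main__":
    # Graded relevance of the top 5 retrieved documents (3 = highly relevant, 0 = not relevant).
    print(round(ndcg([3, 0, 2, 1, 0], cutoff=5), 3))
```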

[Chart: number of search results per year]