evaluation metrics

Search results for: evaluation metrics

Number of results: 878773

How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

2016

Chia-Wei Liu Ryan Lowe Iulian Serban Michael Noseworthy Laurent Charlin Joelle Pineau

We investigate evaluation metrics for end-to-end dialogue systems where supervised labels, such as task completion, are not available. Recent works in end-to-end dialogue systems have adopted metrics from machine translation and text summarization to compare a model’s generated response to a single target response. We show that these metrics correlate very weakly or not at all with human judgeme...

Maximum Correlation Training for Machine Translation Evaluation

2007

Ding Liu Daniel Gildea

We propose three new features for MT evaluation: source-sentence constrained n-gram precision, source-sentence reordering metrics, and discriminative unigram precision, as well as a method of learning linear feature weights to directly maximize correlation with human judgments. Our source-sentence constrained n-gram precision achieves, among all the testing metrics including BLEU, NIST, ROUGE, ...

Metrics for Cognitive Architecture Evaluation

2007

Robert Wray

The problem of evaluating general architectures is a difficult one (Newell, 1990). Comparative evaluations that focus on performance alone are especially problematic. It is usually feasible to develop a specialized solution for any particular problem that will outperform a general solution, such as one developed within a cognitive architecture. Thus, an evaluation of the architectural approach ...

Representation Based Translation Evaluation Metrics

2015

Boxing Chen Hongyu Guo

Precisely evaluating the quality of a translation against human references is a challenging task due to the flexible word ordering of a sentence and the existence of a large number of synonyms for words. This paper proposes to evaluate translations with distributed representations of words and sentences. We study several metrics based on word and sentence representations and their combination. ...

Evaluation Challenges for a Federation of Heterogeneous Information Providers: The Case of NASA's Earth Science Information Partnerships

2000

Catherine Plaisant Anita Komlodi Francis Lindsay

NASA’s Earth Science Information Partnership Federation is an experiment funded to assess the ability of a group of widely heterogeneous earth science data or service providers to self organize and provide improved and cheaper access to an expanding earth science user community. As it is organizing itself, the federation is mandated to set in place an evaluation methodology and collect metrics ...

QoS Metrics for Cloud Computing Services Evaluation

Journal: International Journal of Intelligent Systems and Applications, 2014

Towards Robust Metrics for Concept Representation Evaluation

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2023

Recent work on interpretability has focused on concept-based explanations, where deep learning models are explained in terms of high-level units of information, referred to as concepts. Concept learning models, however, have been shown to be prone to encoding impurities in their representations, failing to fully capture meaningful features of their inputs. While concept learning lacks metrics to measure such phenomena, the field of disentanglem...

Methodologies to Develop Quantitative Risk Evaluation Metrics

Journal: International Journal of Computer Applications, 2012

An experiment in comparative evaluation: humans vs. computers

2003

Andrei Popescu-Belis

This paper reports results from an experiment that was aimed at comparing evaluation metrics for machine translation. Implemented as a workshop at a major conference in 2002, the experiment defined an evaluation task, description of the metrics, as well as test data consisting of human and machine translations of two texts. Several metrics, either applicable by human judges or automated, were u...

Performance Evaluation of Procedural Metrics and Object Oriented Metrics

2015

P. Ashok Reddy

Software metrics are widely accepted tools to control and assure software quality. A large number of software metrics with a variety of content can be found in the literature. In this paper, different software complexity ...
