rater reliability

Search results for: rater reliability

Number of results: 145715

Human Similarity Judgments: Implications for the Design of Formal Evaluations

2007

M. Cameron Jones, J. Stephen Downie, Andreas F. Ehmann

This paper presents findings from a series of analyses of human similarity judgments from the Symbolic Melodic Similarity and Audio Music Similarity tasks of the Music Information Retrieval Evaluation Exchange (MIREX) 2006. The categorical judgment data generated by the evaluators are analyzed with regard to judgment stability, inter-grader reliability, and patterns of disagreement, both within...


A Political News Corpus in Chinese for Opinion Analysis

2008

Benjamin Ka-Yin T'sou, Bin Lu

In this paper, we present an annotated corpus of political election news in Chinese for opinion analysis, and discuss some issues in the manual annotation process. The annotation scheme is described with examples, and inter-annotator agreement is explored for different levels of annotation: expression, sentence and document.


Creating a Test Corpus of Clinical Notes Manually Tagged for Part-of-Speech Information

2004

Serguei V. S. Pakhomov, Anni Coden, Christopher G. Chute

This paper presents a project whose main goal is to construct a corpus of clinical text manually annotated for part-of-speech information. We describe and discuss the process of training three domain experts to perform linguistic annotation. We list some of the challenges as well as encouraging results pertaining to inter-rater agreement and consistency of annotation. We also present preliminar...


Relating PhonePass overall scores to the Council of Europe Framework level descriptors

2002

John de Jong, Jared Bernstein

This study is a preliminary report on an experiment relating PhonePass SET-10 scores to the scale of level descriptors in the Council of Europe Framework. This scale describes the content and level of second language proficiency from a functional communicative perspective. Speech samples from 121 non-native speakers of English were: (1) scored in SET-10, the automatic test of spoken English, an...


Intra-rater and inter-rater reliability of hemoglobin color scale method

Journal: Indian Journal of Community Medicine 2009


Inter-Rater and Intra-Rater Reliability of the Occupational Therapy Diagnosis

Journal: The Occupational Therapy Journal of Research 1995


Reliability of capturing foot parameters using digital scanning and the neutral suspension casting technique

2011

Matthew Carroll, Mary-Ellen Annabell, Keith Rome

BACKGROUND: A clinical study was conducted to determine the intra- and inter-rater reliability of digital scanning and the neutral suspension casting technique to measure six foot parameters. The neutral suspension casting technique is a commonly utilised method for obtaining a negative impression of the foot prior to orthotic fabrication. Digital scanning offers an alternative to the traditional...


Finding your "Inner-Annotator": An Experiment in Annotator Independence for Rating Discourse Coherence Quality in Essays

2014

Jill Burstein, Swapna Somasundaran, Martin Chodorow

An experimental annotation method is described, showing promise for a subjective labeling task – discourse coherence quality of essays. Annotators developed personal protocols, reducing front-end resources: protocol development and annotator training. Substantial inter-annotator agreement was achieved for a 4-point scale. Correlational analyses revealed how unique linguistic phenomena were cons...


Can training improve the quality of inferences made by raters in competency modeling? A quasi-experiment.

Journal: The Journal of Applied Psychology 2007

Filip Lievens, Juan I Sanchez

A quasi-experiment was conducted to investigate the effects of frame-of-reference training on the quality of competency modeling ratings made by consultants. Human resources consultants from a large consulting firm were randomly assigned to either a training or a control condition. The discriminant validity, interrater reliability, and accuracy of the competency ratings were significantly highe...


Automatic large-scale oral language proficiency assessment

2007

Febe de Wet, Christa van der Walt, Thomas Niesler

We describe first results obtained during the development of an automatic system for the assessment of spoken English proficiency of university students. The ultimate aim of this system is to allow fast, consistent and objective assessment of oral proficiency for the purpose of placing students in courses appropriate to their language skills. Rate of speech (ROS) was chosen as an indicator of f...

