نتایج جستجو برای: rater reliability
تعداد نتایج: 145715 فیلتر نتایج به سال:
This paper presents findings of a series of analyses of human similarity judgments from the Symbolic Melodic Similarity, and Audio Music Similarity tasks from the Music Information Retrieval Evaluation Exchange (MIREX) 2006. The categorical judgment data generated by the evaluators is analyzed with regard to judgment stability, inter-grader reliability, and patterns of disagreement, both within...
In this paper, we present an annotated corpus of political election news in Chinese for opinion analysis, and discuss some issues in the manual annotation process. The annotation scheme is described with examples, and inter-annotator agreement is explored for different levels of annotation: expression, sentence and document.
This paper presents a project whose main goal is to construct a corpus of clinical text manually annotated for part-of-speech information. We describe and discuss the process of training three domain experts to perform linguistic annotation. We list some of the challenges as well as encouraging results pertaining to inter-rater agreement and consistency of annotation. We also present preliminar...
This study is a preliminary report on an experiment relating PhonePass SET-10 scores to the scale of level descriptors in the Council of Europe Framework. This scale describes the content and level of second language proficiency from a functional communicative perspective. Speech samples from 121 non-native speakers of English were: (1) scored in SET-10, the automatic test of spoken English, an...
BACKGROUND A clinical study was conducted to determine the intra and inter-rater reliability of digital scanning and the neutral suspension casting technique to measure six foot parameters. The neutral suspension casting technique is a commonly utilised method for obtaining a negative impression of the foot prior to orthotic fabrication. Digital scanning offers an alternative to the traditional...
An experimental annotation method is described, showing promise for a subjective labeling task – discourse coherence quality of essays. Annotators developed personal protocols, reducing front-end resources: protocol development and annotator training. Substantial inter-annotator agreement was achieved for a 4-point scale. Correlational analyses revealed how unique linguistic phenomena were cons...
A quasi-experiment was conducted to investigate the effects of frame-of-reference training on the quality of competency modeling ratings made by consultants. Human resources consultants from a large consulting firm were randomly assigned to either a training or a control condition. The discriminant validity, interrater reliability, and accuracy of the competency ratings were significantly highe...
We describe first results obtained during the development of an automatic system for the assessment of spoken English proficiency of university students. The ultimate aim of this system is to allow fast, consistent and objective assessment of oral proficiency for the purpose of placing students in courses appropriate to their language skills. Rate of speech (ROS) was chosen as an indicator of f...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید