Search results for: stip
Number of results: 127
1. The runs differ in the types of visual features used. All runs feed several bag-of-words representations to separate linear SVMs, whose outputs are fused by logistic regression. *F_A_Brno_resource_4: only the single best visual feature on the training set is used, namely dense image sampling with rgb-SIFT. * F_A_Brno_basic_3: this run uses dense sampling and the Harris-Laplace detector in combination...
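The fusion scheme described in this run summary can be sketched as late fusion: each feature representation trains its own linear SVM, and a logistic regression combines the per-SVM decision scores. This is a minimal illustrative sketch with synthetic data; the feature names, dimensions, and labels are assumptions, not the actual Brno pipeline or TRECVID data.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_train, n_test = 200, 50
y_train = rng.integers(0, 2, n_train)

# Two toy bag-of-words representations per video (names are illustrative).
feats_train = {"dense_rgb_sift": rng.random((n_train, 100)),
               "harris_laplace_sift": rng.random((n_train, 80))}
feats_test = {"dense_rgb_sift": rng.random((n_test, 100)),
              "harris_laplace_sift": rng.random((n_test, 80))}

# One linear SVM per representation.
svms = {name: LinearSVC().fit(X, y_train) for name, X in feats_train.items()}

# The SVM decision scores become inputs ("meta-features") to the fuser.
meta_train = np.column_stack([svms[n].decision_function(feats_train[n]) for n in svms])
meta_test = np.column_stack([svms[n].decision_function(feats_test[n]) for n in svms])

# Logistic regression fuses the per-feature scores into one probability.
fuser = LogisticRegression().fit(meta_train, y_train)
fused_scores = fuser.predict_proba(meta_test)[:, 1]
```

A practical advantage of this design is that a weak feature channel gets a small logistic-regression weight automatically, rather than degrading a single concatenated feature vector.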
Automatically inferring ongoing activities enables the early recognition of unfinished activities, which is valuable for applications such as online human-machine interaction and security monitoring. State-of-the-art methods use spatio-temporal interest point (STIP) based features as the low-level video description to handle complex scenes [1, 2, 3]. While the existing problem ...
To investigate whether the Personality Disorder (PD) severity concept (Criterion A) of ICD-11 and DSM-5 AMPD is applicable to children and adolescents, following the lifespan perspective on mental disorders, age-specific and informant-adapted assessment tools are needed. The LoPF-Q 6-18 PR (Levels of Personality Functioning Questionnaire - Parent Rating) was developed to assess Impaired Personality Functioning (IPF) in children aged 6-18 in a parent-reported form. It bas...
We report on our system used in the TRECVID 2012 Multimedia Event Detection (MED) and Multimedia Event Recounting (MER) tasks. For MED, the system generally consists of three main steps: extracting features, training detectors, and fusion. In the feature extraction part, we extract many low-level, high-level, and text features. Those features are then represented in three different ways which a...
Locality Sensitive Hashing (LSH) based algorithms have already shown their promise in finding approximate nearest neighbors in high-dimensional data spaces. However, there are certain scenarios, as in sequential data, where the proximity of a pair of points cannot be captured without considering their surroundings or context. In videos, for example, a particular frame is meaningful only when ...
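The idea of hashing a point together with its surroundings can be sketched with standard random-hyperplane LSH (sign of random projections, a common scheme for cosine similarity), where each frame descriptor is concatenated with its temporal neighbors before hashing. The context-window construction here is our illustrative reading of the abstract, not the paper's actual method; all dimensions are toy values.

```python
import numpy as np

rng = np.random.default_rng(42)
d, n_bits, w = 64, 16, 1  # descriptor dim, hash bits, context radius
# One random hyperplane per output bit, over the stacked window.
planes = rng.standard_normal((n_bits, d * (2 * w + 1)))

def lsh_hash(vec):
    """Sign-of-projection hash: bit i says which side of hyperplane i."""
    bits = (planes @ vec) > 0
    return sum(int(b) << i for i, b in enumerate(bits))

def context_signature(frames, t):
    """Hash frame t concatenated with its +-w temporal neighbors."""
    padded = np.pad(frames, ((w, w), (0, 0)))  # zero-pad sequence ends
    window = padded[t:t + 2 * w + 1].reshape(-1)
    return lsh_hash(window)

frames = rng.standard_normal((100, d))  # toy per-frame descriptors
sig = context_signature(frames, 10)
```

Because nearby hyperplane signs change slowly under small perturbations, two frames whose whole windows are similar tend to collide, while frames that look alike only in isolation need not.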
The Informedia group participated in three tasks this year: Multimedia Event Detection (MED), Semantic Indexing (SIN), and Surveillance Event Detection. Generally, all of these tasks consist of three main steps: extracting features, training detectors, and fusion. In the feature extraction part, we extracted many low-level features, high-level features, and text features. Especially, ...
We report on our results in the TRECVID 2011 Multimedia Event Detection (MED) and Semantic Indexing (SIN) tasks. Generally, both of these tasks consist of three main steps: extracting features, training detectors, and fusion. In the feature extraction part, we extracted many low-level features, high-level features, and text features. We used the Spatial-Pyramid Matching technique to represent the...
Motivation. Automatic recognition of human activities (or events) from video is important to many potential applications of computer vision. One of the most common approach is the bag-of-visual-features, which aggregate space-time features globally, from the entire video clip containing complete execution of a single activity. The bag-of-visual-features does not encode the spatio-temporal struc...