supervised learning

The MultiRank Bootstrap Algorithm: Self-Supervised Political Blog Classification and Ranking Using Semi-Supervised Link Classification

2008

Frank Lin, William W. Cohen

We present a new semi-supervised learning algorithm for classifying political blogs in a blog network and ranking them within predicted classes. We test our algorithm on two datasets and achieve classification accuracy of 81.9% and 84.6% using only 2 seed blogs.

Analysis of Spectral Kernel Design based Semi-supervised Learning

2005

Tong Zhang, Rie Kubota Ando

We consider a framework for semi-supervised learning using spectral decomposition based unsupervised kernel design. This approach subsumes a class of previously proposed semi-supervised learning methods on data graphs. We examine various theoretical properties of such methods. In particular, we derive a generalization performance bound, and obtain the optimal kernel design by minimizing the bo...

On the characterization of noise filters for self-training semi-supervised in nearest neighbor classification

Journal: Neurocomputing, 2014

Isaac Triguero, José A. Sáez, Julián Luengo, Salvador García, Francisco Herrera

Semi-supervised classification methods have received much attention as suitable tools to tackle training sets with large amounts of unlabeled data and a small quantity of labeled data. Several semi-supervised learning models have been proposed with different assumptions about the characteristics of the input data. Among them, the self-training process has emerged as a simple and effective techn...

An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis

2014

Eshrag Refaee, Verena Rieser

We present a newly collected data set of 8,868 gold-standard annotated Arabic Twitter feeds. The corpus is manually labelled for subjectivity and sentiment analysis (SSA) (κ = 0.816). In addition, the corpus is annotated with a variety of linguistically motivated feature-sets that have previously shown positive impact on classification performance. The paper highlights issues posed by Twitter a...

Constrained Semi-Supervised Learning Using Attributes and Comparative Attributes

2012

Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta

We consider the problem of semi-supervised bootstrap learning for scene categorization. Existing semi-supervised approaches are typically unreliable and face semantic drift because the learning task is under-constrained. This is primarily because they ignore the strong interactions that often exist between scene categories, such as the common attributes shared across categories as well as the a...

Efficient and Robust Semi-supervised Learning Over a Sparse-Regularized Graph

2016

Hang Su, Jun Zhu, Zhaozheng Yin, Yinpeng Dong, Bo Zhang

Graph-based Semi-Supervised Learning (GSSL) has limitations in widespread applicability due to its computationally prohibitive large-scale inference, sensitivity to data incompleteness, and inability to handle time-evolving characteristics in an open set. To address these issues, we propose a novel GSSL based on a batch of informative beacons with sparsity appropriately harnessed, rather t...

Semi-Supervised Learning with Trees

2003

Charles Kemp, Thomas L. Griffiths, Sean Stromsten, Joshua B. Tenenbaum

We describe a nonparametric Bayesian approach to generalizing from few labeled examples, guided by a larger set of unlabeled objects and the assumption of a latent tree-structure to the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal...

A Semi-Supervised Approach for Gender Identification

2016

Juan Soler, Leo Wanner

In most of the research studies on Author Profiling, large quantities of correctly labeled data are used to train the models. However, this does not reflect the reality in forensic scenarios: in practical linguistic forensic investigations, the resources that are available to profile the author of a text are usually scarce. To pay tribute to this fact, we implemented a Semi-Supervised Learning ...

Exponential Family Hybrid Semi-Supervised Learning

2009

Arvind Agarwal, Hal Daumé

We present an approach to semi-supervised learning based on an exponential family characterization. Our approach generalizes previous work on coupled priors for hybrid generative/discriminative models. Our model is more flexible and natural than previous approaches. Experimental results on several data sets show that our approach also performs better in practice.

Text Classification Based On Manifold Semi-Supervised Support Vector Machine

2014

Vo Duy Thanh, Vo Trung Hung, Pham Minh Tuan, Ho Khac Hung

This article presents a solution, along with experimental results, for applying semi-supervised machine learning techniques and an improvement on the SVM (Support Vector Machine) based on a geodesic model to build text classification applications for the Vietnamese language. The objective here is to improve the semi-supervised machine learning by replacing the kernel function of SVM using geodesi...
