unsupervised-learning 12
[1112.6209] Building high-level features using large scale unsupervised learning
january 2012 by Vaguery
We consider the problem of building detectors for high-level concepts using only unsupervised feature learning. For example, we would like to understand if it is possible to learn a face detector using only unlabeled images downloaded from the internet. To answer this question, we trained a simple feature learning algorithm on a large dataset of images (10 million images, each image is 200x200). The simulation is performed on a cluster of 1000 machines with fast network hardware for one week. Extensive experimental results reveal surprising evidence that such high-level concepts can indeed be learned using only unlabeled data and a simple learning algorithm.
image-analysis
image-segmentation
unsupervised-learning
learning-by-doing
feature-extraction
nudge-targets
Stefano Melacci, Mikhail Belkin (2011). "Laplacian Support Vector Machines Trained in the Primal". Journal of Machine Learning Research 12: 1149−1184
april 2011 by quant18
In the last few years, due to the growing ubiquity of unlabeled data, much effort has been spent by the machine learning community to develop better understanding and improve the quality of classifiers exploiting unlabeled data. Following the manifold regularization approach, Laplacian Support Vector Machines (LapSVMs) have shown the state of the art performance in semi-supervised classification. In this paper we present two strategies to solve the primal LapSVM problem, in order to overcome some issues of the original dual formulation. In particular, training a LapSVM in the primal can be efficiently performed with preconditioned conjugate gradient.
We speed up training by using an early stopping strategy based on the prediction on unlabeled data or, if available, on labeled validation examples. This allows the algorithm to quickly compute approximate solutions with roughly the same classification accuracy as the optimal ones, considerably reducing the training time. The computational complexity of the training algorithm is reduced from O(n^3) to O(kn^2), where n is the combined number of labeled and unlabeled examples and k is empirically evaluated to be significantly smaller than n. Due to its simplicity, training LapSVM in the primal can be the starting point for additional enhancements of the original LapSVM formulation, such as those for dealing with large data sets. We present an extensive experimental evaluation on real world data showing the benefits of the proposed approach.
SVM
Unsupervised-learning
peekaboo: nips 2010 - single layer networks in unsupervised feature learning: the deep learning killer [edit: now available online!]
january 2011 by chl
deep architectures considered not necessarily necessary.
inquire
deep-learning
nn
unsupervised-learning
classification
from delicious
[1003.0470] Unsupervised Supervised Learning II: Training Margin Based Classifiers without Labels
august 2010 by Vaguery
"On a more philosophical level, our approach points at novel questions that go beyond supervised and semi-supervised learning. What benefit do labels provide over unsupervised training? Can our framework be extended to semi-supervised learning where a few labels do exist? Can it be extended to non-classification scenarios such as margin based regression or margin based structured prediction? When are the assumptions likely to hold and how can we make our framework even more resistant to deviations from them? These questions and others form new and exciting open research directions."
unsupervised-learning
supervised-learning
learning-from-data
machine-learning
regression
modeling
Unsupervised Name Disambiguation via Social Network Similarity
july 2009 by quant18
[We] investigate unsupervised methods which simultaneously learn 1) the number of entities represented by a particular name and 2) which observations correspond to the same entity. The disambiguation methods leverage the fact that an entity’s name can be listed in multiple sources, each with a number of related entities' names, which permits the construction of name-based relational networks. The methods studied in this paper differ based on the type of network similarity exploited for disambiguation. Method 1 relies upon exact name similarity and employs hierarchical clustering of sources, where each source is considered a local network. Method 2 employs a less strict similarity requirement by using random walks between ambiguous observations on a global social network constructed from all sources, or a community similarity ... findings suggest methods which measure similarity based on community, rather than exact, similarity provide more robust disambiguation capability.
Social-network-analysis
Word-sense-disambiguation
Unsupervised-learning