The Variational Nyström Method for Large-Scale Spectral Problems
Authors
Abstract
Spectral methods for dimensionality reduction and clustering require solving an eigenproblem defined by a sparse affinity matrix. When this matrix is large, one seeks an approximate solution. The standard way to do this is the Nyström method, which first solves a small eigenproblem considering only a subset of landmark points, and then applies an out-of-sample formula to extrapolate the solution to the entire dataset. We show that by constraining the original problem to satisfy the Nyström formula, we obtain an approximation that is computationally simple and efficient, yet achieves a lower approximation error using fewer landmarks and less runtime. We also study the role of normalization in the computational cost and quality of the resulting solution.

Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 48. Copyright 2016 by the author(s).

Spectral problems involve finding eigenvectors of an affinity matrix and have become a standard technique in machine learning problems such as manifold learning (Cox & Cox, 1994; Schölkopf et al., 1998; Tenenbaum et al., 2000; Roweis & Saul, 2000; Belkin & Niyogi, 2003) or spectral clustering (Shi & Malik, 2000; Ng et al., 2002). Their success is due to the power of neighborhood graphs (via an affinity matrix or graph Laplacian) to express similarity between pairs of points, and to the existence of well-developed linear algebra routines to solve the numerical problem. We consider a spectral problem of the type

    min_X tr(X M Xᵀ)  s.t.  X Xᵀ = I        (P)

where M is an N × N symmetric matrix (usually a graph Laplacian) constructed on a high-dimensional dataset Y = (y_1, ..., y_N) of D × N, and X = (x_1, ..., x_N) of d × N are coordinates in R^d for the N data points (often called the embedding), where d < D. Constraints of the form X B Xᵀ = I with positive definite B can be used with a suitable transformation of M.
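As a concrete illustration of problem (P) and its exact solution, the following minimal sketch builds a small graph Laplacian from synthetic data and solves (P) via an eigendecomposition. All names and the Gaussian-affinity construction are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

# Minimal sketch (assumed synthetic setup): solve (P) exactly for a small
# graph Laplacian M. The d trailing (smallest-eigenvalue) eigenvectors of M
# minimize tr(X M X^T) subject to X X^T = I.
rng = np.random.default_rng(0)
Y = rng.standard_normal((2, 50))                    # D x N data matrix
sq = ((Y[:, :, None] - Y[:, None, :]) ** 2).sum(0)  # pairwise squared distances
W = np.exp(-sq)                                     # Gaussian affinities
np.fill_diagonal(W, 0.0)
M = np.diag(W.sum(1)) - W                           # graph Laplacian (symmetric PSD)

d = 2
evals, evecs = np.linalg.eigh(M)                    # eigenvalues in ascending order
X = evecs[:, :d].T                                  # d x N embedding (trailing eigenvectors)

# Constraint holds and the objective equals the sum of the d smallest eigenvalues.
assert np.allclose(X @ X.T, np.eye(d), atol=1e-8)
assert np.isclose(np.trace(X @ M @ X.T), evals[:d].sum())
```

For a graph Laplacian the smallest eigenvalue is zero with a constant eigenvector, which is why practical embeddings often discard that trivial direction; the sketch keeps it for simplicity.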
The solution of (P) is given by the d trailing eigenvectors of M (note that M need not be positive semidefinite, though it often is). When the number of points N is very large, an exact solution becomes computationally impractical or undesirable, even if M is sparse. Our goal is to solve problems of the type (P) approximately. We focus on approximate methods that use sampling, i.e., they solve an eigenproblem on a subset of L ≪ N points from Y ("landmarks") and then use this to extrapolate the solution to all N points. The prototype of these is the Nyström method (Williams & Seeger, 2001; Fowlkes et al., 2004), based on an out-of-sample formula that predicts x ∈ R^d for a given point y ∈ R^D as a linear combination of the landmarks' solution, using as weights the affinity values between y and the landmarks. This has the advantage of interpolating the landmarks and being convenient: the weights are simply affinity matrix entries. It gives a good approximation if sufficiently many landmarks are used. Its fundamental disadvantage is that the reduced eigenproblem on the landmarks, to which the Nyström formula applies, uses only the landmark-landmark affinity values. If too few landmarks are used, this eigenproblem gives a bad approximation, and so does the Nyström extrapolation. A different approach is that of Locally Linear Landmarks (LLL) (Vladymyrov & Carreira-Perpiñán, 2013b), which seeks to define a reduced eigenproblem containing more information than just landmark-landmark affinities. LLL defines a different out-of-sample formula: a linear combination of the projections of the landmarks nearest to y, using weights that reconstruct y locally linearly in input space. The crucial idea in LLL is that problem (P) is solved subject to the constraint of using these weights, resulting in a reduced eigenproblem that does use the entire affinity matrix. Hence, LLL obtains a better landmark embedding than the Nyström method for the same number of landmarks.
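The standard Nyström extrapolation described above can be sketched as follows. This is a hedged illustration of the textbook procedure for an affinity matrix W (the variable names W_LL, W_NL and the landmark selection are illustrative): solve the small eigenproblem on the landmark-landmark block only, then extend each eigenvector to all N points as an affinity-weighted combination of the landmark values.

```python
import numpy as np

# Hedged sketch of the standard Nystrom out-of-sample step for an affinity
# matrix W. Only the L x L landmark block enters the reduced eigenproblem;
# this is the limitation the paper's method addresses.
rng = np.random.default_rng(1)
Y = rng.standard_normal((2, 100))                   # D x N data
sq = ((Y[:, :, None] - Y[:, None, :]) ** 2).sum(0)
W = np.exp(-sq)                                     # full N x N affinities

L, d = 20, 3
landmarks = rng.choice(100, size=L, replace=False)
W_LL = W[np.ix_(landmarks, landmarks)]              # landmark-landmark block
W_NL = W[:, landmarks]                              # all points vs. landmarks

# Reduced eigenproblem on the landmarks only (d leading eigenpairs of W_LL)...
evals, evecs = np.linalg.eigh(W_LL)                 # ascending order
U_L, lam = evecs[:, -d:], evals[-d:]

# ...then extrapolate: each point's coordinates are a linear combination of
# the landmark eigenvectors, weighted by its affinities to the landmarks.
U_approx = W_NL @ U_L / lam                         # N x d Nystrom estimate

# The formula interpolates the landmarks exactly.
assert np.allclose(U_approx[landmarks], U_L, atol=1e-8)
```

The final assertion checks the interpolation property mentioned in the text: at the landmark rows, W_NL coincides with W_LL, so the extrapolated values reduce to the landmark solution itself.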
This reasoning naturally leads to our first contribution, the Variational Nyström (VN) method, where we incorporate the Nyström formula as a constraint in (P). As in LLL, we obtain a reduced eigenproblem that uses the entire affinity matrix and thus better represents the manifold structure of the landmarks. This reduced eigenproblem is then "optimal" for the Nyström formula (unlike the one based only on the landmark-landmark affinities). We also save the expensive computation of the LLL weights. We call it "variational" Nyström to refer to its optimality motivation. Our second contribution addresses an issue that has so far been overlooked in Nyström-type methods: how to use subsampling approximations with data-dependent kernels (e.g., the graph Laplacian)? There, each kernel element is generated not only by the corresponding points from the original data, but by the other points as well. In this case, applying the approximations directly to the kernel gives bad results. We investigate ways to normalize the data-dependent kernel in order to get the best performance with VN and other methods.

Notation. Ã indicates that A is approximated. Ŷ indicates a landmark subset of Y. A subscript shows to which matrix we apply a certain transformation, e.g., U_P is a column matrix of eigenvectors of P and D_W = diag(W1) is a degree matrix for W (where 1 is a vector of ones). For degree matrices computed for rectangular matrices, an arrow indicates whether the sum is taken row- or column-wise, e.g., for an N × L matrix C, D_C→ = diag(C1) is an N × N matrix of row-wise sums and D_C↓ = diag(1ᵀC) is an L × L matrix of column-wise sums.
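The core variational idea, constraining (P) with a linear out-of-sample map, can be sketched generically. This is a minimal illustration under stated assumptions, not the paper's exact construction: assume the embedding of every point is a fixed linear combination of the landmark embeddings, Xᵀ = C X̂ᵀ for some N × L weight matrix C (here a random stand-in). Substituting into (P) yields a small L × L generalized eigenproblem that nevertheless uses the full matrix M.

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative sketch (assumed linear out-of-sample map, random stand-in
# matrices): constrain X^T = C Xhat^T in problem (P). This gives
#   min tr(Xhat (C^T M C) Xhat^T)  s.t.  Xhat (C^T C) Xhat^T = I,
# an L x L generalized eigenproblem built from the ENTIRE matrix M.
rng = np.random.default_rng(2)
N, L, d = 100, 15, 2
M = rng.standard_normal((N, N))
M = (M + M.T) / 2                      # symmetric stand-in for a graph Laplacian
C = rng.standard_normal((N, L))        # stand-in N x L out-of-sample weights

A = C.T @ M @ C                        # L x L reduced matrix, uses all of M
B = C.T @ C                            # L x L constraint matrix (positive definite)
evals, V = eigh(A, B)                  # generalized eigenproblem, ascending order
Xhat = V[:, :d].T                      # d trailing eigenvectors: landmark embedding
X = Xhat @ C.T                         # extrapolate to all N points

# The original constraint X X^T = I holds by construction.
assert np.allclose(X @ X.T, np.eye(d), atol=1e-8)
```

Because `scipy.linalg.eigh(A, B)` returns B-orthonormal eigenvectors, the full-size constraint X Xᵀ = X̂ CᵀC X̂ᵀ = I is satisfied automatically; the reduced problem costs O(L³) to solve instead of working with the N × N eigenproblem directly.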
Similar resources
A numerical technique for solving a class of 2D variational problems using Legendre spectral method
An effective numerical method based on Legendre polynomials is proposed for the solution of a class of variational problems with suitable boundary conditions. The Ritz spectral method is used for finding the approximate solution of the problem. By utilizing the Ritz method, the given nonlinear variational problem reduces to the problem of solving a system of algebraic equations. The advantage o...
Large Amplitude Vibration Analysis of Graphene Sheets as Resonant Mass Sensors Using Mixed Pseudo-Spectral and Integral Quadrature Methods
The present paper investigates the potential application of graphene sheets with attached nanoparticles as resonant sensors by introducing a nonlocal shear deformation plate model. To take into account an elastic connection between the nanoplate and the attached nanoparticle, the nanoparticle is considered as a mass-spring system. Then, a combination of pseudo-spectral and integral quadrature m...
Non-Fourier heat conduction equation in a sphere; comparison of variational method and inverse Laplace transformation with exact solution
Small scale thermal devices, such as micro heater, have led researchers to consider more accurate models of heat in thermal systems. Moreover, biological applications of heat transfer such as simulation of temperature field in laser surgery is another pathway which urges us to re-examine thermal systems with modern ones. Non-Fourier heat transfer overcomes some shortcomings of Fourier heat tran...
Numerical resolution of large deflections in cantilever beams by Bernstein spectral method and a convolution quadrature.
The mathematical modeling of the large deflections for the cantilever beams leads to a nonlinear differential equation with the mixed boundary conditions. Different numerical methods have been implemented by various authors for such problems. In this paper, two novel numerical techniques are investigated for the numerical simulation of the problem. The first is based on a spectral method utiliz...
Sampling with Minimum Sum of Squared Similarities for Nystrom-Based Large Scale Spectral Clustering
The Nyström sampling provides an efficient approach for large scale clustering problems, by generating a low-rank matrix approximation. However, existing sampling methods are limited by their accuracies and computing times. This paper proposes a scalable Nyström-based clustering algorithm with a new sampling procedure, Minimum Sum of Squared Similarities (MSSS). Here we provide a theoretical an...
Hartley Series Direct Method for Variational Problems
The computational method based on using the operational matrix of an orthogonal function for solving variational problems is computer oriented. In this approach, a truncated Hartley series together with the operational matrix of integration and integration of the cross product of two cas vectors are used for finding the solution of variational problems. Two illustrative...
Publication date: 2016