Manifold Learning, part 3

Part 1 here
Part 2 here

Non-linear dimensionality reduction

In an earlier post, we saw that we can approximate higher-dimensional data using dimensionality reduction techniques. However, PCA (and similar methods) have a weakness: they require the data to lie along, or near, straight lines. What about data that lie along curves?
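To make the weakness concrete, here is a rough sketch (the spiral shape, sample count, and noise level are my own illustrative choices, not anything from this series): running PCA on points scattered along a spiral shows that no single straight direction captures most of the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy spiral: data that lies along a curve rather than a line
# (the spiral parameters and noise level are illustrative).
t = rng.uniform(1.0, 4.0 * np.pi, size=500)
X = np.column_stack([t * np.cos(t), t * np.sin(t)])
X += rng.normal(scale=0.05, size=X.shape)

# PCA via SVD of the centred data.
Xc = X - X.mean(axis=0)
_, sing, _ = np.linalg.svd(Xc, full_matrices=False)
explained = sing**2 / (sing**2).sum()
print(explained)  # both components carry substantial variance
```

Because the spiral winds around the origin, both principal directions carry substantial variance, so keeping only one of them discards a lot of information.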

Another Illustrated Example

Let us pretend again we have 2-dimensional data and our tools can only analyse one-dimensional data. We cannot use the trick we used earlier because the data no longer lies close to a straight line. However, we can cheat. Looking at it closely, we can see that our data (almost) lies along a spiral.
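As a sketch of what such data might look like, we can sample noisy points near an Archimedean spiral r = a·t (the spiral and the noise level here are assumptions for illustration, not the post's actual figure):

```python
import numpy as np

rng = np.random.default_rng(0)

# Archimedean spiral r = a * t, sampled at n parameter values
# (a and the noise scale are illustrative assumptions).
n, a = 200, 0.5
t = rng.uniform(1.0, 4.0 * np.pi, size=n)
noise = rng.normal(scale=0.05, size=(n, 2))

# 2-D points that lie "very close" to the spiral.
X = np.column_stack([a * t * np.cos(t), a * t * np.sin(t)]) + noise
```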

Just like before, we are going to use the projections of our data (red circles) onto the spiral as approximations of our data. We can do this as long as our data is "very close" to the spiral.
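One simple way to compute such projections numerically is to sample the spiral densely and snap each data point to its nearest sample. This is a brute-force sketch, with an assumed spiral r = 0.5·t standing in for the one in the figure:

```python
import numpy as np

def project_onto_spiral(X, a=0.5, t_max=4 * np.pi, resolution=10_000):
    """Approximate each point's projection as the nearest point on a
    densely sampled Archimedean spiral r = a * t (a sketch, not an
    exact closest-point computation)."""
    t = np.linspace(0.0, t_max, resolution)
    spiral = np.column_stack([a * t * np.cos(t), a * t * np.sin(t)])
    # Index of the closest spiral sample for every data point.
    d = np.linalg.norm(X[:, None, :] - spiral[None, :, :], axis=2)
    idx = d.argmin(axis=1)
    return spiral[idx], t[idx]
```

The returned spiral parameter records where along the curve each projected point lands.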

The final trick is to "unroll" the spiral and pretend it is a straight line. We can now use our tools for analysis. Just as in our previous approximation, errors are introduced. As long as the data is "very close" to the spiral, the error introduced is small.
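For a spiral we know explicitly, unrolling can be approximated numerically by measuring arc length: each point's 1-D coordinate is the distance travelled along the curve from its start to the point's projection. A sketch, again assuming the spiral r = 0.5·t:

```python
import numpy as np

def unroll(t_proj, a=0.5, t_max=4 * np.pi, resolution=10_000):
    """Map each projected point's spiral parameter to its arc length
    along the spiral r = a * t, giving a 1-D coordinate
    (a numerical sketch using chord lengths)."""
    t = np.linspace(0.0, t_max, resolution)
    spiral = np.column_stack([a * t * np.cos(t), a * t * np.sin(t)])
    # Cumulative arc length along the sampled spiral.
    seg = np.linalg.norm(np.diff(spiral, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    return np.interp(t_proj, t, s)
```

Arc length preserves distances measured along the curve, which is exactly what "pretending the spiral is a straight line" asks for.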

You may also notice that I skipped over how we actually unroll the spiral. We'll discuss one way to do it in the next post.