Improve the speed of t-sne implementation in python for huge data



I would like to do dimensionality reduction on nearly 1 million vectors, each with 200 dimensions (doc2vec). I am using the TSNE implementation from the sklearn.manifold module for it, and the major problem is time complexity. Even with method=barnes_hut, the computation is still slow. Sometimes it even runs out of memory.

I am running it on a 48-core processor with 130G of RAM. Is there a way to run it in parallel, or otherwise make use of these plentiful resources to speed up the process?


Posted 2016-02-06T14:19:10.243

Reputation: 1 834

Did you try map-reduc'ing in a framework like Spark? – Dawny33 – 2016-02-06T14:23:05.187

Nope.. how does it work? Can you please point me to it? – chmodsss – 2016-02-06T14:46:32.263

Please go through Spark's documentation for understanding it :)

– Dawny33 – 2016-02-06T14:48:23.540


See if this Spark implementation works.

– Emre – 2016-02-11T17:55:31.833

@Emre: Which language is used in that implementation? It seems there is a bit of R and Scala.. I haven't worked in either of these.. I was looking for a Python implementation. – chmodsss – 2016-02-11T19:02:59.953

It's Scala for Spark. If you want a Python implementation you might be able to translate it; Spark runs on Python too. – Emre – 2016-02-11T19:07:20.390

@Emre So does that mean I should install Spark, compile this spark-tsne, and then import it as a module in Python? – chmodsss – 2016-02-11T19:20:53.810

No, it means you should read the Scala package and write one in Python based on it. Personally, I would advise trying to use the Scala package as is. Although I have no personal experience with it, Beaker might help you use Scala and Python concurrently. As an alternative to t-SNE, you could use one of the many Python neural network libraries to find an autoencoded 2D embedding.

– Emre – 2016-02-11T20:45:01.863



You should look at this Multicore implementation of t-SNE.

I actually tried it and can vouch for its superior performance.

Nilav Baran Ghosh

Posted 2016-02-06T14:19:10.243

Reputation: 256


Check out FFT-accelerated Interpolation-based t-SNE (paper, code, and Python package).

From the abstract:

We present Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE), which dramatically accelerates the computation of t-SNE. The most time-consuming step of t-SNE is a convolution that we accelerate by interpolating onto an equispaced grid and subsequently using the fast Fourier transform to perform the convolution. We also optimize the computation of input similarities in high dimensions using multi-threaded approximate nearest neighbors.

The paper also includes an example of a dataset with a million points and 100 dimensions (similar to OP's setting), and it seems to take ~1 hour.
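A rough usage sketch. It assumes you have cloned the FIt-SNE repository, compiled its binary, and can import the `fast_tsne.py` wrapper that ships with the repo; the exact function signature (including the `nthreads` parameter shown here) may differ across versions:

```python
import numpy as np
from fast_tsne import fast_tsne  # wrapper bundled with the FIt-SNE repository

# Stand-in for a large document-embedding matrix
X = np.random.rand(1000, 100)

# nthreads controls the multi-threaded approximate nearest-neighbor
# and FFT steps described in the abstract above
embedding = fast_tsne(X, perplexity=30, nthreads=4)
print(embedding.shape)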


Posted 2016-02-06T14:19:10.243

Reputation: 171



It's significantly faster than t-SNE.

patel ashutosh

Posted 2016-02-06T14:19:10.243

Reputation: 71


Since there are no answers on SO, I asked on the GitHub page myself, and the issue was closed with the following reply by GaelVaroquaux:

If you only want to parallelise vector operations, then you should use a build of numpy compiled with MKL (don't attempt to do it yourself; it's challenging).

There could be approaches to high-level parallelism in the algorithm itself, which would probably lead to larger gains. However, after a quick look at the code, I didn't see any clear way of doing that.

I am going to go ahead and close this issue, as it is more of a blue-sky wish list. I completely agree: I would like TSNE to go faster, and it would be great if parallelism were easy. But in the current state of affairs, more work is required before we are in a state where we can tackle such a wish list.


Posted 2016-02-06T14:19:10.243

Reputation: 1 834


Since version 0.22, there is a new parameter called n_jobs in the scikit-learn t-SNE implementation. This parameter specifies the number of parallel jobs to run for the neighbors search.
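A minimal example of the parameter (requires scikit-learn >= 0.22); the random matrix stands in for the doc2vec vectors:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(0)
X = rng.rand(200, 50)  # stand-in for the real 1M x 200 matrix

# n_jobs=-1 uses all available cores for the neighbors search
tsne = TSNE(n_components=2, perplexity=30, n_jobs=-1, random_state=0)
embedding = tsne.fit_transform(X)
print(embedding.shape)  # (200, 2)
```

Note that only the neighbors search is parallelised, so the gradient-descent phase still runs on a single core.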

The Multicore-TSNE project mentioned in another answer seems to be dead.


Posted 2016-02-06T14:19:10.243

Reputation: 394