## Variances of the principal components in Ewin Tang's PCA algorithm


In *Quantum-inspired classical algorithms for principal component analysis and supervised clustering*, the PCA algorithm requires that the variances of the principal components differ by at least a constant fraction of the squared Frobenius norm ($$\sigma_i^2 - \sigma_{i+1}^2 \ge \eta ||A||_{F}^2$$, with the singular values in decreasing order) and that each variance is at least a certain constant $$\sigma^2$$. Can the assumption that the principal components have well-separated variances be dropped, with the output changed to just the hypervolume created by all the (unordered) principal vectors with variance at least $$\sigma^2$$? And, if so, what is the modified algorithm's computational complexity? From the remark,
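For concreteness, here is a small numpy sketch of what the two assumptions demand. The matrix and the constants $$\eta$$, $$k$$, $$\sigma$$ below are made up purely for illustration and are not taken from the paper:

```python
import numpy as np

# Toy diagonal matrix, so its singular values are just the diagonal entries.
A = np.diag([3.0, 2.0, 0.4, 0.3])
s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
fro2 = np.sum(s**2)                      # ||A||_F^2 = sum of squared singular values

# Hypothetical constants (chosen so the checks pass for this toy A).
eta, k, sigma_min = 0.25, 2, 1.0

# Gap assumption: consecutive squared singular values around the top k
# must be separated by at least eta * ||A||_F^2.
gaps_ok = all(s[i]**2 - s[i + 1]**2 >= eta * fro2 for i in range(k))

# Size assumption: the k-th singular value must be at least sigma.
size_ok = s[k - 1] >= sigma_min

print(gaps_ok, size_ok)  # both hold for these toy numbers
```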

> As we assume our eigenvalues have an $$\eta ||A||_F^2$$ gap, the precise eigenvector $$|v_j\rangle$$ sampled can be identified by the eigenvalue estimate. Then, by computing enough samples, we can learn all of the eigenvalues of at least $$\sigma^2$$ and get the corresponding states

It seems like this should be fine, but later Tang adds,

> Note that we crucially use the assumptions in Problem 7 for our QML algorithm: without guarantee on the gap or that $$\sigma_i \ge \sigma$$, finding the top k singular vectors would be intractable, even with samples of $$|v_i\rangle$$’s.

which leaves open the possibility that it wouldn't work.
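One piece of standard linear-algebra intuition in favor of the question's reformulation: when squared singular values are (nearly) degenerate, the individual principal vectors are not stable objects at all, but the subspace they span is. The numpy sketch below (toy numbers, not from the paper) makes a matrix whose top two squared singular values nearly coincide, perturbs it slightly, and compares how much a single top singular vector moves versus how much the projector onto the top-2 subspace moves:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random orthonormal bases, then a matrix with a near-degenerate top pair
# of singular values (2.0 and 2.0 - 1e-9) and a clear gap to the rest.
U, _ = np.linalg.qr(rng.standard_normal((6, 6)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
s = np.array([2.0, 2.0 - 1e-9, 0.5, 0.1, 0.05, 0.01])
A = U @ np.diag(s) @ V.T

# A perturbation far larger than the top gap but far smaller than the
# gap separating the top-2 block from the remaining singular values.
Ap = A + 1e-6 * rng.standard_normal((6, 6))

_, _, Vt = np.linalg.svd(A)
_, _, Vtp = np.linalg.svd(Ap)

# An individual top right singular vector can rotate freely inside the
# near-degenerate eigenspace (min over the sign ambiguity of SVD output).
vec_change = min(np.linalg.norm(Vt[0] - Vtp[0]),
                 np.linalg.norm(Vt[0] + Vtp[0]))

# The projector onto the span of the top two singular vectors stays stable.
P = Vt[:2].T @ Vt[:2]
Pp = Vtp[:2].T @ Vtp[:2]
proj_change = np.linalg.norm(P - Pp)

print(vec_change, proj_change)
```

If this intuition carries over to the sampling setting, the subspace (equivalently, the hypervolume it determines) is the natural gap-free output; what the perturbation argument does not settle is whether Tang's sample-and-identify step can still be carried out efficiently, which is exactly what the quoted caveat questions.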