37 What are some standard ways of computing the distance between documents? 2014-07-05T16:10:21.580

35 Applications and differences for Jaccard similarity and Cosine Similarity 2015-02-12T07:08:16.537

23 Sentence similarity prediction 2017-10-22T07:36:15.920

22 Best practical algorithm for sentence similarity 2017-11-23T14:40:25.603

19 Clustering based on similarity scores 2014-05-16T14:26:12.270

18 Adaboost vs Gradient Boosting 2018-10-04T14:25:48.957

13 Alternatives to TF-IDF and Cosine Similarity when comparing documents of differing formats 2017-01-02T20:41:13.493

12 MinHashing vs SimHashing 2015-06-11T21:21:55.473

10 Extract canonical string from a list of noisy strings 2014-08-22T15:59:07.097

10 Vector space model cosine tf-idf for finding similar documents 2015-10-09T16:31:44.320

8 Score matrix string similarity 2014-06-22T21:45:35.800

8 Similarity measure based on multiple classes from a hierarchical taxonomy? 2015-01-08T10:09:32.010

8 How to find similarity between different factors in a dataset 2015-06-26T20:48:12.417

8 Fixing data inconsistencies 2016-04-07T22:38:43.297

8 Which supervised learning algorithms are available for matching? 2016-06-21T15:43:25.803

7 Create most "average" cosine similarity observation 2014-07-01T13:44:23.290

7 Best way to search for a similar document given the ngram 2015-11-17T03:06:51.357

7 Is there a way to measure correlation between two similar datasets? 2017-02-28T15:25:38.020

6 Computing Image Similarity based on Color Distribution 2014-07-27T21:54:05.003

6 Similarity measure for ordered binary vectors 2014-11-06T07:38:33.490

6 Similarity measure for multivariate time series with heterogeneous length and content 2016-08-16T08:04:51.537

6 How to measure the similarity between two images? 2019-04-05T00:36:49.563

6 How to compare different similarity measurements in text clustering? 2019-07-30T08:56:00.420

5 How to compute the Jaccard Similarity in this example? (Jaccard vs. Cosine) 2016-12-21T23:50:51.620

5 Text similarity using RNN 2017-01-24T11:13:22.463

5 Unsupervised Anomaly Detection in Images 2018-03-21T18:17:54.100

5 How to apply a similarity algorithm (or comparison) to over one million vectors against another one million vectors? 2019-01-17T05:00:14.210

5 Text similarity with sentence embeddings 2019-09-19T20:04:05.063

5 Cluster elements that appear in the same lists 2019-12-26T14:44:40.587

4 What is the difference between Latent and Explicit Semantic Analysis 2015-05-14T06:46:54.377

4 How can I group texts with similar content together? 2016-05-03T14:33:40.117

4 If I have got different routes - a series of (lat, lng) points, how to get the similarity of different routes? 2016-06-20T02:21:14.963

4 When does it make sense to use Dot-Product as a similarity measure instead of Cosine? 2016-11-08T22:47:43.667

4 How to create vectors from text for address matching using binary classification? 2016-12-20T13:41:54.690

4 Check similarity of table/csv of Product Names 2017-02-11T12:37:50.200

4 Doc2vec to calculate cosine similarity - absolutely inaccurate 2017-11-06T11:03:51.883

4 How do you set sigma for the Gaussian similarity kernel? 2017-12-12T16:10:19.710

4 Which dissimilarity/similarity measure to use after a dimension reduction (PCA / AutoEncoder / ...)? 2018-02-03T15:18:33.053

4 Are there any good NLP APIs for comparing strings in terms of semantic similarity? 2018-04-19T09:07:00.963

4 Similar students using Machine Learning 2019-01-24T07:25:46.937

4 Calculating similarity where order matters 2019-03-05T21:43:54.110

4 Random-Forest-based Similarity Matrix for clustering: how does it behave? 2019-04-17T12:59:06.607

4 Distance between users 2019-06-17T15:23:08.753

4 Text classification based on n-grams and similarity 2020-05-21T07:57:29.167

3 Data sets for evaluating text retrieval quality 2014-09-05T14:47:52.127

3 Semantic relation or semantic relatedness between terms or phrases 2015-07-23T17:15:10.287

3 Finding the top K most similar sets 2015-11-17T18:56:02.027

3 How to find similarity/distance matrix with mixed Continuous and Categorical data? 2015-12-07T15:40:19.207

3 Ranking skills depending on similarity 2016-04-20T16:02:20.687

3 Document similarity: Vector embedding versus BoW performance? 2017-03-07T21:24:26.520

3 Item Similarity with Location Feature 2017-04-07T08:14:29.147

3 Is there an algorithm or NN to match two documents, basically not closely similar? 2017-08-20T11:16:17.610

3 Collection Of Variable Length Sequences and Descriptions: A Search Problem 2018-02-07T20:59:28.297

3 How to find similar time series? 2018-03-19T20:16:25.640

3 Are there any very good APIs for matching similar images? 2018-05-10T12:14:17.083

3 Clustering 2-dimensional Euclidean vectors - appropriate dissimilarity measure 2018-07-09T13:50:22.060

3 Creating similarity metric with Doc2Vec and additional features 2018-09-27T20:52:25.110

3 Cluster Analysis - Comparing Same Individuals Clustered Across Different Datasets with different features 2019-03-09T20:26:13.007

3 How to measure the similarity between two text documents? 2019-04-14T17:20:32.650

3 Jaccard Similarity with Binary Data 2019-05-14T14:19:08.467

3 Best way to combine two similar documents 2019-05-20T12:52:56.100

3 Similarity Measure of Simulated Time Series vs Observed time Series 2019-08-20T22:19:30.433

3 Similarity of words using BERTMODEL 2019-11-15T17:06:44.160

3 Comparing one small dataset with a big dataset for similar records 2020-02-20T11:28:05.843

3 Similarity Measure between two feature vectors 2020-07-24T06:03:10.047

3 What is the correct formula for Jaccard coefficient with integer vectors? 2020-10-17T05:55:30.117

3 Pearson correlation with data sets that have values on different scales 2020-12-20T11:33:32.297

3 Dot product for similarity in word to vector computation in NLP 2021-01-14T15:55:34.657

2 How to fix similarity matrix in pandas returning all NaNs? 2014-07-25T17:18:21.393

2 Probability of similarity of two clusters 2015-03-13T22:43:42.760

2 Selecting the number of hashes for minhash? Working with extremely sparse data and want more collisions 2016-02-25T14:12:30.200

2 Add extra term weight when grouping strings by similarity? 2016-03-11T07:35:28.530

2 Gower Distance Formula for KNN 2016-05-20T20:20:02.007

2 How to interpret upper-triangular matrix of cosine similarities 2016-06-20T14:02:37.407

2 Jaccard similarity between two items 2016-10-03T18:27:34.087

2 Is it possible to use Jaccard similarity instead of Cosine similarity in gensim document similarity? 2016-12-20T19:22:03.547

2 Recommendation matrix as a product of User Similarity and Ratings 2017-02-26T15:54:07.707

2 What are some functions/packages in R to find similarity of individual words not in the context of sentences? 2017-04-09T16:33:00.020

2 Calculate similarity on boolean data 2017-12-04T15:12:58.917

2 How can I measure the similarity between 2 IP addresses? Is there any code to re-use? 2018-05-01T18:27:02.020

2 Finding the most phonetically similar word from WordNet 2018-07-11T07:22:51.247

2 Can cosine similarity be applied to multidimensional matrices? 2018-07-12T14:43:24.623

2 Alternate of TF-IDF 2018-09-10T16:23:09.443

2 Identifying documents similar to specific clusters 2018-10-22T23:03:02.897

2 When would I use a specific similarity coefficient over another? 2019-02-03T12:10:15.303

2 Measure of variety within list/cluster 2019-02-12T16:01:24.990

2 How can I find similarities in two graphs? 2019-02-24T08:58:39.307

2 Match two items from two different receipts 2019-02-28T06:55:48.700

2 Are there algorithms for clustering objects with pairwise distances, without computing all pairwise distances? 2019-03-08T16:34:00.070

2 Similarity score: Can Sklearn SVR predict values greater than 1 and less than 0? 2019-06-18T15:07:01.403

2 How can I find correlation between features? 2019-07-05T08:54:51.873

2 Similarity Measure Time Series 2019-08-08T22:45:18.853

2 How to build a symmetric similarity model on top of embeddings? 2019-10-16T19:25:05.977

2 Approach to semantic similarity between documents 2020-01-08T13:39:33.053

2 How is an ASR's output compared to ground truth for validation? 2020-10-20T22:16:58.083

2 What are the Most Dissimilar MNIST Digits? 2020-10-21T15:32:11.647

1 Subgraph isomorphism and Anti-monotone property 2014-08-13T20:30:18.143

1 Mimic a Mahout like system 2015-01-06T14:08:35.533

1 Assigning new items to existing similarity based clustering 2015-02-16T12:16:26.923