How to calculate the percentage similarity between two strings in Python

3 Answers

# difflib - Quick, built‑in similarity - Character‑based

from difflib import SequenceMatcher

def similarity(a, b):
    # ratio() returns 2 * matched_chars / (len(a) + len(b)), a float in 0..1
    return SequenceMatcher(None, a, b).ratio() * 100

s1 = "The cat sat on the sofa"
s2 = "The dog sat on the carpet"

print(similarity(s1, s2))



'''
run:

70.83333333333334

'''
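
To see where the 70.83 comes from, the sketch below lists the character runs that SequenceMatcher matched and recomputes the percentage by hand from the documented formula 2 * M / T (M = matched characters, T = combined length of both strings). The helper name similarity_breakdown is made up for this illustration and is not part of difflib.

```python
from difflib import SequenceMatcher

def similarity_breakdown(a, b):
    """Return the matched character runs and the percentage derived from them.

    Hypothetical helper for illustration only; it mirrors what ratio() reports.
    """
    matcher = SequenceMatcher(None, a, b)
    # get_matching_blocks() ends with a zero-size sentinel block, so skip it
    blocks = [a[m.a:m.a + m.size] for m in matcher.get_matching_blocks() if m.size]
    matched = sum(len(block) for block in blocks)
    total = len(a) + len(b)
    # ratio() reports 1.0 for two empty strings, so mirror that here
    percent = 100.0 if total == 0 else 2.0 * matched / total * 100
    return blocks, percent

s1 = "The cat sat on the sofa"
s2 = "The dog sat on the carpet"

blocks, percent = similarity_breakdown(s1, s2)
print(blocks)   # ['The ', ' sat on the ', 'a']
print(percent)  # same value as SequenceMatcher(None, s1, s2).ratio() * 100
```

Here 17 characters match out of 23 + 25 = 48, so the score is 2 * 17 / 48 = 70.83%. The stray 'a' (from "sofa" and "carpet") counts toward the score, which shows the comparison works on characters and not on words.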

answered 3 hours ago by avibootz
edited 3 hours ago by avibootz

# Cosine TF‑IDF - Word‑overlap (lexical) similarity, not true semantic similarity - Best for sentences

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

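def similarity_percent(s1, s2):
    # Fit TF-IDF on just these two strings: one row of word weights per string
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform([s1, s2])
    # Cosine of the angle between the two rows (0 = no shared words, 1 = same direction)
    sim = cosine_similarity(tfidf[0:1], tfidf[1:2])[0][0]
    return sim * 100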

print(similarity_percent("The cat sat on the sofa",
                         "The dog sat on the carpet"))



'''
run:

60.297481603805714

'''
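
For readers who want to see what this score measures, here is a rough standard-library sketch of the same calculation. It assumes scikit-learn's default TfidfVectorizer settings (lowercasing, tokens of two or more word characters, smoothed IDF, L2 normalization); the function name tfidf_cosine_percent is invented for this example. Unlike scikit-learn, which raises an error when no tokens are found, this sketch returns 0.0 in that case.

```python
import math
import re
from collections import Counter

# Assumption: mimics TfidfVectorizer's default token pattern (2+ word characters)
TOKEN = re.compile(r"\b\w\w+\b")

def tfidf_cosine_percent(s1, s2):
    """Hand-rolled TF-IDF cosine similarity for exactly two strings (illustrative only)."""
    docs = [Counter(TOKEN.findall(s.lower())) for s in (s1, s2)]
    n_docs = len(docs)
    vocab = set(docs[0]) | set(docs[1])
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {
        term: math.log((1 + n_docs) / (1 + sum(term in d for d in docs))) + 1
        for term in vocab
    }
    # Weight each term count by its IDF
    vecs = [{term: count * idf[term] for term, count in d.items()} for d in docs]
    dot = sum(w * vecs[1].get(term, 0.0) for term, w in vecs[0].items())
    norms = [math.sqrt(sum(w * w for w in v.values())) for v in vecs]
    if norms[0] == 0 or norms[1] == 0:
        return 0.0  # no usable tokens in at least one string
    return dot / (norms[0] * norms[1]) * 100

print(tfidf_cosine_percent("The cat sat on the sofa",
                           "The dog sat on the carpet"))

# Synonyms share no tokens, so the score is zero even though the meaning is close
print(tfidf_cosine_percent("big dog", "large hound"))  # 0.0
```

The first call should agree with the scikit-learn result above, up to floating-point rounding. The second call shows the limit of the approach: only shared words contribute, so paraphrases with different vocabulary score 0.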

answered 3 hours ago by avibootz
