Text similarity checker (percentage)
Get a similarity percentage between two texts. Adjust the normalization options depending on whether you want to compare content or exact strings.
What are shingles and char grams?
Word shingles are consecutive word sequences (e.g., w-3 = 3-word phrases). They catch copy-with-small-edits.
Char grams are consecutive character sequences (e.g., c-5 = 5-char chunks). They help detect small formatting/typo changes.
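To make the two ideas concrete, here is a minimal Python sketch of how word shingles and char grams could be extracted. The function names, the lowercasing, and the whitespace collapsing are assumptions for illustration; the tool's actual normalization is not documented here.

```python
def word_shingles(text, size=3):
    """Return the set of consecutive word sequences of length `size`."""
    # Assumption: lowercase and split on whitespace before shingling.
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def char_grams(text, size=5):
    """Return the set of consecutive character chunks of length `size`."""
    # Assumption: lowercase and collapse runs of whitespace to one space.
    s = " ".join(text.lower().split())
    if len(s) < size:
        return {s} if s else set()
    return {s[i:i + size] for i in range(len(s) - size + 1)}
```

For example, with w-3 the phrase "The quick brown fox jumps" yields three shingles, and with c-5 the string "Hello!" yields the two chunks "hello" and "ello!".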
Use cases
Useful for spotting near-duplicates, lightly edited copies, repeated content, or “same meaning with small edits” comparisons. For exact differences, use Text compare.
This isn’t a full plagiarism checker (no source search). It compares only the two texts you paste and produces a best-effort similarity score.
FAQs
How is the similarity calculated?
The score blends word overlap (cosine similarity), word shingles (phrase overlap), and character n-grams (small edit robustness).
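As a rough illustration of such a blend, the sketch below combines the three signals in Python. The equal weights, the use of Jaccard overlap for shingles and char grams, and all function names are assumptions; the tool's real weighting and formulas are not published on this page.

```python
import math
from collections import Counter


def ngrams(items, size):
    """Set of consecutive length-`size` windows over a list or string."""
    if len(items) < size:
        # Shorter than one window: use the whole sequence as a single gram.
        return {tuple(items)} if len(items) else set()
    return {tuple(items[i:i + size]) for i in range(len(items) - size + 1)}


def jaccard(a, b):
    """Shared items divided by all distinct items (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(words_a, words_b):
    """Cosine similarity between two word-count vectors."""
    ca, cb = Counter(words_a), Counter(words_b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def similarity_percent(text_a, text_b, w=3, c=5, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Blend word, shingle, and char-gram overlap into a 0-100 score."""
    # Assumed normalization: lowercase and collapse whitespace.
    a = " ".join(text_a.lower().split())
    b = " ".join(text_b.lower().split())
    wa, wb = a.split(), b.split()
    parts = (
        cosine(wa, wb),                          # word overlap
        jaccard(ngrams(wa, w), ngrams(wb, w)),   # phrase overlap
        jaccard(ngrams(a, c), ngrams(b, c)),     # small-edit robustness
    )
    return 100 * sum(wt * p for wt, p in zip(weights, parts))
```

Under these assumptions, identical texts score 100, texts with no shared words or character chunks score 0, and lightly edited copies land in between.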
What are “word shingles”?
Shingles are consecutive word sequences (e.g., w-3 compares 3‑word phrases). They’re great at catching copy-with-small-edits.
What are “character grams”?
Character grams are consecutive character chunks (e.g., c-5). They help detect small formatting or typo changes.
What should I set for word shingle and char gram?
Start with w-3 and c-5. Increase the shingle size to be stricter on phrase matching; increase the char gram size to be stricter on formatting-level overlap.
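The effect of the shingle size can be seen in a small Python sketch. It measures phrase overlap as the share of common shingles (Jaccard overlap), which is an assumption about how the tool scores phrases; the helper names are hypothetical.

```python
def word_shingles(text, size):
    """Set of consecutive word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def shingle_overlap(a, b, size):
    """Shared shingles divided by all distinct shingles (assumed metric)."""
    sa, sb = word_shingles(a, size), word_shingles(b, size)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


original = "the quick brown fox jumps over the lazy dog"
edited = "the quick brown fox leaps over the lazy dog"

# One changed word out of nine; larger shingles penalize it more.
for size in (2, 3, 5):
    print(f"w-{size}: {shingle_overlap(original, edited, size):.2f}")
```

With this sample pair, the overlap falls from 0.60 at w-2 to 0.40 at w-3 and to 0.00 at w-5, because every 5-word phrase in a nine-word sentence contains the changed middle word. Char gram size behaves the same way at the character level.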
Is my data stored?
This tool sends the text to the server to compute the score. It’s designed not to persist your pasted content.