How to Compare Text and Code Files Online
Learn how diff works, how the Myers algorithm finds minimal edits, and how to compare text, code, configs, and JSON files in seconds online.
File comparison is one of the oldest problems in software development, and one of the most practical. The Unix diff command has been solving it since 1974 (Hunt & McIlroy, Bell Laboratories Computing Science Technical Report #41, 1976). Whether you’re reviewing a config change, auditing a contract revision, or checking why an API response changed between deploys, understanding how diff works makes you faster and less error-prone.
This guide covers the history, the algorithm, the format, and the fastest way to compare any two text files right now.
Key Takeaways
- The Unix diff utility was created in 1974 by Douglas McIlroy at Bell Labs, establishing the line-level comparison model still used today.
- The Myers diff algorithm (1986) finds the minimum number of edits between two files. Git uses a variant of it internally.
- Unified diff format (+/- lines, @@ hunks) is the standard output for git diff, patches, and code review tools.
- Online diff checkers handle arbitrary text: configs, JSON, contracts, API responses. Git diff is built around repository content, although git diff --no-index can compare two arbitrary files.
- Word-level diff catches small changes inside long lines that line-level diff can miss.
Compare Two Files Right Now
Paste any two blocks of text below. The diff checker highlights every addition, deletion, and changed line. No login required. Everything runs in your browser.
Diff Checker runs entirely in your browser and keeps both inputs local.
Compare pasted text, config files, Markdown, CSV, JSON, and source code without uploading anything. Split view is best for inspection, while unified view is ready to copy or download as a patch.
What Is a Diff?
A diff is the minimal description of the changes between two versions of a text. Given two files, “original” and “modified,” a diff tells you exactly which lines were added, which were removed, and which stayed the same. The output is compact: it only records what changed, not the entire content of both files.
Think of it as the delta between two states. You can reconstruct the modified file by applying the diff to the original. That’s why diffs are the foundation of version control systems, patch distribution, and code review workflows.
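To make the "apply the diff to the original" idea concrete, here is a minimal sketch using Python's standard difflib module. It is an illustration only: difflib's ndiff delta keeps unchanged lines as well (unlike a compact unified diff), and the sample config lines are made up for the example.

```python
import difflib

# Hypothetical sample data: two versions of a small config file
original = ["host: 0.0.0.0\n", "port: 3000\n", "workers: 4\n"]
modified = ["host: 0.0.0.0\n", "port: 8080\n", "workers: 4\n"]

# ndiff tags every line: "  " unchanged, "- " only in original, "+ " only in modified
delta = list(difflib.ndiff(original, modified))

# restore() rebuilds either side from the delta alone (1 = original, 2 = modified)
rebuilt_original = list(difflib.restore(delta, 1))
rebuilt_modified = list(difflib.restore(delta, 2))

print("".join(delta))
print(rebuilt_modified == modified)
```

Because both versions can be rebuilt from the delta, a version control system only needs one full copy plus the changes.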
Citation capsule: The Unix diff utility, developed at Bell Labs in 1974 by Douglas McIlroy, introduced line-level text comparison as a tool for software development (Hunt & McIlroy, Bell Laboratories Computing Science Technical Report #41, 1976). Its line-based model carried over into the unified diff format used by Git and most other modern version control tools.
Where Did Diff Come From?
The diff utility was created in 1974 at Bell Labs by Douglas McIlroy, with key contributions from James Hunt (Hunt & McIlroy, Bell Laboratories Computing Science Technical Report #41, 1976). It was part of the 5th Edition of Unix and quickly became one of the most-used tools in the operating system. The original algorithm compared files line by line, finding the longest common subsequence (LCS) to determine the minimal set of changes.
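The longest common subsequence idea can be sketched with the textbook dynamic-programming table. This is not the Hunt-McIlroy implementation, which is considerably more optimized; the function name lcs_lines and the sample data are assumptions made for illustration.

```python
def lcs_lines(a, b):
    """Return one longest common subsequence of two lists of lines."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to collect the common lines
    common = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # a[i] is not part of the LCS: it was deleted
        else:
            j += 1  # b[j] is not part of the LCS: it was inserted
    return common


old = ["a", "b", "c", "d"]
new = ["a", "c", "d", "e"]
common = lcs_lines(old, new)
print(common)
print(len(old) - len(common), "deleted,", len(new) - len(common), "inserted")
```

Every line outside the common subsequence is either a deletion (only in the old file) or an insertion (only in the new one), which is why a longer LCS means a smaller diff.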
The unified diff format, which adds context lines around changes and uses @@ ... @@ headers to mark chunks, was introduced later and standardized as part of POSIX. It replaced the older “normal” and “context” diff formats because it’s more compact and easier for humans to read.
How Does the Diff Algorithm Work?
Eugene Myers published his landmark diff algorithm in 1986 (Myers, Algorithmica, 1986). It finds the shortest edit script (SES) between two sequences: the minimum number of insertions and deletions needed to transform one file into the other. Myers proved this is equivalent to finding the longest common subsequence, but his algorithm solves it faster in practice.
The algorithm models the problem as finding the shortest path through an edit graph. Each position in the graph represents a point in comparing file A at line i against file B at line j. Horizontal moves mean “delete a line from A,” vertical moves mean “insert a line from B,” and diagonal moves (free moves) mean the lines match. Myers finds the shortest path from top-left to bottom-right.
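The edit-graph search can be sketched in a few lines. The version below follows the greedy forward search described in Myers' paper but only returns the length of the shortest edit script, not the script itself; the function name is an assumption, and production implementations (including Git's) add refinements such as a linear-space variant and heuristics.

```python
def shortest_edit_distance(a, b):
    """Number of insertions plus deletions needed to turn sequence a into b."""
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # vertical move: insert a line from b
            else:
                x = v[k - 1] + 1    # horizontal move: delete a line from a
            y = x - k
            # Diagonal moves are free: follow matching lines as far as possible
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d            # reached the bottom-right corner
    return max_d


# The worked example from Myers' paper
print(shortest_edit_distance(list("ABCABBA"), list("CBABAC")))  # 5
```

The outer loop tries edit scripts of length 0, 1, 2, and so on, so the first one that reaches the bottom-right corner is the shortest.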
Why “minimal edit distance” matters
A minimal edit script keeps the diff small and focused on what actually changed. Naive comparison algorithms can produce diffs that look large and confusing even when only a few lines changed. Myers finds the smallest possible set of line differences, which usually makes the output readable and useful for code review.
Git uses a variant of Myers diff by default for git diff and git log -p (Git documentation, kernel.org). The default applies heuristics for speed; git diff --minimal spends extra time to make sure the smallest possible diff is produced. You can also select the “patience” or “histogram” diff algorithms in Git for cases where Myers produces noisy output on repetitive code.
# Use histogram diff (often cleaner on refactored code)
git diff --diff-algorithm=histogram HEAD~1
# Word-level diff instead of line-level
git diff --word-diff HEAD~1
Citation capsule: Eugene Myers’ 1986 algorithm finds the shortest edit script between two sequences by modeling the problem as shortest-path search through an edit graph (Myers, Algorithmica, 1986). Git uses a variant of this algorithm as its default diff engine, producing the minimal set of line changes needed to transform one file into another.
What Does Unified Diff Format Mean?
Unified diff is the standard output format for git diff, patch files, and most code review tools. It combines both the original and modified content into a single block with markers, rather than showing them side by side. Once you can read it, you can understand any patch file you encounter.
Here’s a minimal example:
--- a/config.yaml
+++ b/config.yaml
@@ -3,5 +3,5 @@
 server:
   host: 0.0.0.0
-  port: 3000
+  port: 8080
   workers: 4
   timeout: 30
How to read the @@ header
The @@ -3,5 +3,5 @@ line is called a “hunk header.” It tells you exactly where in the file this change appears. This example is trimmed to two context lines on each side of the change.
- -3,5 means the original file (---): the hunk starts at line 3 and covers 5 lines (4 context lines plus the removed line).
- +3,5 means the modified file (+++): the hunk starts at line 3 and covers 5 lines (4 context lines plus the added line).
Lines starting with - were removed. Lines starting with + were added. Lines with no prefix are context lines: they appear in both files and help you understand where the change sits.
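If you ever need to read hunk headers programmatically, a small regular expression is enough. This is a sketch: the function name and the returned dictionary keys are assumptions, and it relies on the unified format's rule that an omitted line count means 1.

```python
import re

# @@ -old_start[,old_count] +new_start[,new_count] @@
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Split a unified-diff hunk header into start lines and line counts."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count = match.groups()
    return {
        "old_start": int(old_start),
        # A missing count means the range covers exactly one line
        "old_count": int(old_count) if old_count is not None else 1,
        "new_start": int(new_start),
        "new_count": int(new_count) if new_count is not None else 1,
    }


print(parse_hunk_header("@@ -10,6 +10,8 @@"))
```

Git often appends the enclosing function name after the second @@; the pattern above ignores anything after it.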
Context lines are key for readability
By default, unified diff shows 3 context lines above and below each change. This gives enough surrounding code to understand what changed without reading the entire file. You can change this with git diff -U5 (5 context lines) or -U0 for no context at all.
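The effect of the context setting is easy to see with Python's difflib, whose n parameter plays the same role as -U in git diff. This is a sketch with made-up file contents, and difflib is not the engine Git uses.

```python
import difflib

# Hypothetical ten-line file with a single change on line 5
old = [f"line {i}\n" for i in range(1, 11)]
new = list(old)
new[4] = "line five\n"


def hunk_headers(context):
    """Return the @@ headers produced with the given number of context lines."""
    diff = difflib.unified_diff(
        old, new, fromfile="a/file.txt", tofile="b/file.txt", n=context
    )
    return [line.strip() for line in diff if line.startswith("@@")]


for context in (3, 1, 0):
    print(context, hunk_headers(context))
```

With three context lines the hunk covers lines 2 through 8; with zero it shrinks to the changed line alone.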
How to Compare Files With the Online Diff Checker
The diff checker at /tools/developer/diff-checker handles any text: code, config files, markdown, JSON, legal documents. Here’s the fastest way to get useful results.
Step 1: Paste the original version
Copy the entire original text into the left panel. For config files, paste the version before your change. For code, paste the old function or file. For documents, paste the first draft.
Step 2: Paste the modified version
Copy the new version into the right panel. The diff runs automatically as you type. You don’t need to click a button.
Step 3: Read the highlighted output
Added lines appear in green. Removed lines appear in red. Unchanged context lines appear in neutral. Scroll through to verify every change matches your intention.
Step 4: Check word-level diff for dense lines
If you changed a single value inside a long line, switch to word-level diff mode. Line-level diff marks the entire line as changed. Word-level diff highlights the exact word or character that moved.
When Should You Use Git Diff vs an Online Tool?
Both tools solve the same underlying problem, but they’re not interchangeable. The right choice depends on whether your content is version-controlled and what kind of comparison you need.
| Tool | Best For | Setup Required | Works Without Git |
|---|---|---|---|
| Online diff checker | Arbitrary text, one-off comparisons, configs, documents | None | Yes |
| git diff | Version-controlled source code, staged changes, commit history | Git repo | No |
| VS Code built-in | File-by-file comparison inside a project, visual side-by-side | VS Code installed | Yes |
| Command-line diff | Scripted comparisons, CI pipelines, directory trees | Unix/Linux/macOS | Yes |
| GitHub code review | Pull request review, inline comments, team collaboration | GitHub account + repo | No |
Use git diff when your files are in a repository and you want to compare against a specific commit, branch, or staged state. Use an online tool for everything else: snippets from Stack Overflow, config files shared over Slack, API responses from two different environments, or any two text blocks that don’t live in a repo.
# Compare working tree against last commit
git diff HEAD
# Compare two branches
git diff main..feature/my-branch
# Compare two specific commits
git diff a1b2c3d e4f5a6b
# Word-level diff (highlights individual changed words)
git diff --word-diff HEAD~1
git diff --stat for a quick summary
Before reading a full diff, run git diff --stat to see which files changed and how many lines were added or removed in each. It’s faster than scrolling through a large patch when you just need a high-level overview.
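The same kind of summary can be computed from any unified diff text by counting + and - lines inside hunks. The sketch below handles a single-file diff only; the function name and the sample patch are assumptions for illustration.

```python
def diff_stat(diff_text):
    """Count added and removed lines in a single-file unified diff."""
    added = removed = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            in_hunk = True   # everything before the first hunk is file header
        elif in_hunk and line.startswith("+"):
            added += 1
        elif in_hunk and line.startswith("-"):
            removed += 1
    return added, removed


sample = """--- a/config.yaml
+++ b/config.yaml
@@ -3,5 +3,5 @@
 server:
   host: 0.0.0.0
-  port: 3000
+  port: 8080
   workers: 4
   timeout: 30
"""

added, removed = diff_stat(sample)
print(f"{added} insertion(s), {removed} deletion(s)")
```

Skipping everything before the first @@ keeps the --- and +++ file headers from being miscounted as changes.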
Why Does Word-Level Diff Matter?
Line-level diff is the default because files are naturally organized into lines. But line-level diff has a blind spot: when a single word or value changes inside a long line, the entire line is marked as deleted and re-added. The actual change is invisible unless you read both versions carefully.
Word-level diff solves this by tokenizing each line into words and running the diff algorithm on the tokens instead of the lines. Changed words are highlighted directly within the surrounding text. This is especially useful for prose documents, long JSON values, SQL queries, and minified code.
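Here is a minimal sketch of that tokenize-then-diff approach. It splits on whitespace and uses difflib.SequenceMatcher for the matching, which is a simplification: real tools use smarter tokenizers and a different algorithm. The [-old-] and {+new+} markers imitate the plain output of git diff --word-diff.

```python
import difflib


def word_diff(old_line, new_line):
    """Mark removed words as [-word-] and added words as {+word+}."""
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(parts)


old = "SELECT id, name FROM users WHERE age > 18"
new = "SELECT id, name FROM users WHERE age > 21"
print(word_diff(old, new))
```

A line-level diff would report the whole query as removed and re-added; here only the changed value is marked.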
Most online diff tools offer a word-diff toggle. In Git, use git diff --word-diff or git diff --word-diff=color for terminal output.
What Text Types Benefit Most From Diff Comparison?
The diff algorithm works on any text. The most valuable use cases outside of source code are ones where manual comparison is unreliable.
Configuration files
YAML, TOML, JSON, and .env files change frequently in deployment workflows. A diff between two versions of a Kubernetes manifest or Docker Compose file reveals exactly which resource limits, environment variables, or image tags changed. Missing a single value change can cause a failed deploy or a security regression.
API responses
When an API behaves differently between environments (staging vs. production, v1 vs. v2), comparing the raw JSON responses in a diff checker shows exactly which fields changed, were added, or were removed. This is faster than reading through both responses manually.
Legal and contractual documents
Contract revisions between parties often involve tracked changes in Word, but when you receive a plain-text or copy-pasted version, that history is gone. A diff checker recovers the differences between the two versions you have: added clauses appear in green, removed language in red. Small wording changes that alter the meaning of a clause are easy to miss with the naked eye.
Database migration scripts
SQL migration files need careful review before they run against production. Comparing a migration script against the last approved version catches unintended changes to table names, column types, or constraint definitions.
Citation capsule: Git’s default diff engine uses a variant of Eugene Myers’ 1986 shortest-edit-script algorithm, which searches for the minimum number of line insertions and deletions between two text files (Git documentation, kernel.org). Small, focused diffs are what make code review practical: you read the change, not the whole file.
Tips for Cleaner, More Useful Diffs
Raw diffs can be noisy. Small formatting changes, inconsistent line endings, or trailing whitespace can flood the output with false positives that obscure real changes. These habits make diffs cleaner before you run them.
Normalize line endings first
Windows uses \r\n (CRLF), while Unix and macOS use \n (LF). A file converted between the two conventions shows every line as changed, even if the content is identical. Most editors can convert line endings. In Git, git config core.autocrlf input converts CRLF to LF when you commit, which is the usual setting on macOS and Linux.
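When the text comes from a clipboard or an API rather than a Git checkout, you can normalize it yourself before comparing. A minimal sketch, with a hypothetical helper name:

```python
def normalize_newlines(text):
    """Convert CRLF and bare CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


windows_text = "host: 0.0.0.0\r\nport: 3000\r\n"
unix_text = "host: 0.0.0.0\nport: 3000\n"

# Before normalizing, every line differs; afterwards the texts are identical
print(windows_text.splitlines(keepends=True) == unix_text.splitlines(keepends=True))
print(normalize_newlines(windows_text) == unix_text)
```

Replacing \r\n first matters: handling bare \r first would turn each CRLF into two line breaks.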
Strip trailing whitespace
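Editors sometimes add or remove trailing spaces invisibly. In Git, git diff --ignore-space-at-eol ignores whitespace changes at the end of a line. The broader git diff -w (--ignore-all-space) ignores all whitespace when comparing lines, so use it with care in whitespace-sensitive formats such as YAML or Python.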
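Outside Git, the same cleanup is a one-liner you can run on both inputs before diffing. A small sketch with a hypothetical helper name; note that rstrip() also drops a trailing \r, so it hides CRLF differences too.

```python
def strip_trailing_whitespace(text):
    """Return the text's lines with trailing spaces, tabs, and CRs removed."""
    return [line.rstrip() for line in text.splitlines()]


edited = "server:  \n  port: 8080\t\n"
clean = "server:\n  port: 8080\n"

print(strip_trailing_whitespace(edited) == strip_trailing_whitespace(clean))
```

Leading whitespace is left alone on purpose, because indentation is meaningful in formats like YAML.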
Compare logic, not formatting
When comparing two versions of a minified file or a reformatted config, beautify both versions into consistent formatting first. A JSON file with consistent 2-space indentation produces a clean diff. The same content with mixed indentation produces noise on every line.
Use --ignore-blank-lines in Git
git diff --ignore-blank-lines skips diffs caused only by blank line additions or removals. Useful when auto-formatters or linters insert blank lines between functions.
How to Compare Two JSON Files Specifically
JSON comparison has a catch: the order of keys within an object carries no meaning in JSON, but line-level diff treats reordered keys as changes. A JSON object with keys in a different order produces a large diff even though the data is identical. Array order, by contrast, is significant, so reordered array elements are real changes.
For meaningful JSON comparison, sort the keys before diffing. Most JSON formatters (including the one at /tools/developer/json-formatter) can sort keys as part of the formatting step. Paste both JSON blobs into the formatter, sort keys, then paste the results into the diff checker.
# Sort JSON keys with jq before comparing
jq --sort-keys . original.json > original-sorted.json
jq --sort-keys . modified.json > modified-sorted.json
diff original-sorted.json modified-sorted.json
This also applies to YAML files: consistent key ordering and formatting before diffing eliminates false positives from tooling differences.
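The same sort-then-diff step can be done without jq, using only Python's standard library. This is a sketch: the helper name and the two sample documents are made up, and the diff is produced by difflib rather than the diff command.

```python
import difflib
import json


def canonical_json_lines(text):
    """Parse JSON and re-serialize it with sorted keys and 2-space indentation."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()


# Same keys in a different order; only the port value really changed
original = '{"name": "api", "port": 3000, "debug": false}'
modified = '{"debug": false, "port": 8080, "name": "api"}'

diff = list(difflib.unified_diff(
    canonical_json_lines(original),
    canonical_json_lines(modified),
    fromfile="original.json",
    tofile="modified.json",
    lineterm="",
))
print("\n".join(diff))
```

After canonicalizing, the reordered keys disappear from the diff and only the port line is reported.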
Frequently Asked Questions
What is a diff file?
A diff file (also called a patch file) contains the output of the diff command, usually in unified format. It records the changed lines plus a few lines of surrounding context, not the full content of either file. Patch files can be applied to the original file with patch -p1 < file.patch to reproduce the modified version. Some open-source projects, the Linux kernel among them, still exchange changes as patches.
What does @@ mean in a diff?
The @@ markers in a unified diff header define a “hunk”: a contiguous section of changes. The numbers between the @@ signs tell you the starting line and line count in the original file (prefixed with -) and the modified file (prefixed with +). For example, @@ -10,6 +10,8 @@ means the hunk starts at line 10 in the original (6 lines shown) and line 10 in the modified file (8 lines shown, because 2 lines were added).
How do I compare two JSON files online?
Paste both JSON blobs into a JSON formatter, enable key sorting, then copy the formatted output into both panels of the diff checker. This removes false positives from field reordering and indentation differences, leaving only real data changes visible.
Can diff compare binary files?
Standard text diff tools work on plain text only. Binary files (images, PDFs, compiled executables) need specialized comparison tools. Git can detect that binary files differ but won’t show a meaningful line-level diff by default. For PDFs and documents, convert to text first or use a dedicated document comparison tool.
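A common way to decide whether a file is safe to feed to a text diff is to look for NUL bytes near the start. The sketch below is a heuristic, not a guarantee; the function name and the 8000-byte window are assumptions chosen for illustration, in the spirit of the check Git performs.

```python
def looks_binary(data, sample_size=8000):
    """Heuristic: treat data as binary if a NUL byte appears near the start."""
    return b"\x00" in data[:sample_size]


png_header = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"
config_text = "server:\n  port: 8080\n".encode("utf-8")

print(looks_binary(png_header))
print(looks_binary(config_text))
```

Text encoded as UTF-16 contains NUL bytes and would be flagged as binary by this check, so convert such files to UTF-8 before comparing.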
How is git diff different from the Unix diff command?
git diff uses the same family of line-based diff algorithms but integrates with Git’s object model. Its usual job is comparing commits, branches, the index (staged changes), and the working tree; to compare two arbitrary files on disk, even outside a repository, use git diff --no-index. It also supports language-aware hunk headers that show the function or class name where a change appears (via .gitattributes diff drivers). The output is unified diff with a few extra Git-specific header lines, such as diff --git and index.
File comparison has been a fundamental developer tool for fifty years because it solves a fundamental problem: how do you know exactly what changed? Whether you’re reviewing a pull request, auditing a config change, or checking whether a document revision introduced unintended edits, diff gives you a precise, reproducible answer.
The Myers algorithm’s guarantee of minimal edits is what makes that answer trustworthy. You’re not seeing an arbitrary view of the differences. You’re seeing the smallest true difference. Combined with unified diff’s readable format and the instant feedback of an online tool, you have everything needed to compare any two versions of any text accurately.