
patch-id Git Command Guide

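The git patch-id command reads a patch from standard input and computes a patch ID: the sum of the SHA-1 hashes of the file diffs, with line numbers ignored and, by default, whitespace stripped. Patch IDs are reasonably stable and reasonably unique, so the same change usually keeps its ID when it is rebased, cherry-picked, or mailed. This makes the command useful for finding likely duplicate patches.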

git patch-id [--stable | --unstable | --verbatim]

Option        Description
--stable      Sum per-file hashes so that reordering the file diffs does not change the ID (default when patchid.stable is true)
--unstable    Produce IDs compatible with Git 1.9 and older (default)
--verbatim    Hash the input as given, without stripping whitespace (default when patchid.verbatim is true)

Patch ID = sum of SHA-1 hashes of the file diffs with:
├── Hunk line numbers ignored
├── "index" header lines ignored
├── Whitespace ignored (unless --verbatim)
└── File names, context lines, and added/removed lines all hashed

Stable: Reordering the file diffs does not change the ID
Unstable: Default; compatible with IDs from Git 1.9 and older
Verbatim: Whitespace is hashed as given

Stable across:
├── Line number changes
├── Whitespace-only differences (except with --verbatim)
├── Commit metadata (author, date, message)
└── File diff order (with --stable)

Changes with:
├── Added/removed lines
├── Different context lines (for example, a different -U value)
└── Different file paths (for example, a rename)
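
The stability across line number changes can be checked in a throwaway repository. The following is a minimal sketch rather than part of the original guide: the function names patch_id_of and demo_offset_stability, the file numbers.txt, and the branch names are all made up for illustration. It commits a one-line change on one branch, then cherry-picks it onto a branch where three extra lines push the change further down the file.

```shell
#!/bin/bash

# patch_id_of COMMIT: print only the patch ID of a commit's diff.
patch_id_of() {
    git show "$1" | git patch-id | cut -d' ' -f1
}

# Hypothetical demo; everything happens inside a temporary repository.
demo_offset_stability() (
    repo=$(mktemp -d) && cd "$repo" || exit 1
    git init -q .
    git config user.name "Demo"
    git config user.email "demo@example.com"
    git config commit.gpgsign false

    seq 1 20 > numbers.txt
    git add numbers.txt
    git commit -q -m "base"
    base=$(git rev-parse HEAD)

    # Branch "plain": change line 15.
    git checkout -q -b plain "$base"
    seq 1 20 | sed 's/^15$/fifteen/' > numbers.txt
    git commit -q -am "spell out fifteen"
    original=$(git rev-parse HEAD)

    # Branch "shifted": three extra lines at the top, then the same change.
    git checkout -q -b shifted "$base"
    { seq 101 103; seq 1 20; } > numbers.txt
    git commit -q -am "add three lines at the top"
    git cherry-pick "$original" >/dev/null
    copy=$(git rev-parse HEAD)

    # The commits differ (different parent), but the change is the same.
    [ "$original" != "$copy" ] && echo "commit IDs differ"
    [ "$(patch_id_of "$original")" = "$(patch_id_of "$copy")" ] && echo "patch IDs match"
    cd / && rm -rf "$repo"
)
```

Calling demo_offset_stability should print both messages: the hunk header moves from line 12 to line 15, but hunk headers are not hashed.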

# Generate patch ID from git diff output
git diff HEAD~1 | git patch-id
# Output: <patch-id> <commit-id> (the commit ID is all zeros for a bare diff)

# Compute ID for specific commit
git show <commit> | git patch-id

# Get patch ID for staged changes
git diff --cached | git patch-id

# Compare two commits for duplicate patches
# Keep only the first field: the second field is the commit ID, which always differs
id1=$(git show commit1 | git patch-id | cut -d' ' -f1)
id2=$(git show commit2 | git patch-id | cut -d' ' -f1)
if [ -n "$id1" ] && [ "$id1" = "$id2" ]; then
    echo "Patches are identical"
else
    echo "Patches are different"
fi

# Process multiple patches from mailbox
# patch-id splits the input at each "From <commit-id>" line written by
# git format-patch and prints one "<patch-id> <commit-id>" line per patch
git patch-id < patches.mbox

# Batch process patch files
for patch in *.patch; do
    echo "Processing $patch:"
    git patch-id < "$patch"
done

#!/bin/bash
# Find duplicate commits in repository
find_duplicate_commits() {
    echo "Searching for duplicate commits..."
    # Create mapping of patch IDs to commits
    declare -A patch_map
    # git log -p prints a "commit <id>" header before each diff, so patch-id
    # emits one "<patch-id> <commit-id>" pair per commit. Merges are skipped
    # because they have no single-parent diff; --reverse lists oldest first.
    while read -r patch_id commit; do
        if [ -n "${patch_map[$patch_id]}" ]; then
            echo "Duplicate found:"
            echo "  Original: ${patch_map[$patch_id]}"
            echo "  Duplicate: $commit"
            echo "  Patch ID: $patch_id"
            echo ""
        else
            patch_map[$patch_id]="$commit"
        fi
    done < <(git log --all --no-merges --reverse -p | git patch-id)
}
find_duplicate_commits

# Analyze patch series for duplicates
analyze_patch_series() {
    local patch_dir="$1"
    local patch_file patch_id patch_name
    echo "Analyzing patch series in $patch_dir"
    declare -A seen_patches
    for patch_file in "$patch_dir"/*.patch; do
        [ -f "$patch_file" ] || continue
        patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
        patch_name=$(basename "$patch_file")
        # Empty output means git found no diff in the file
        [ -n "$patch_id" ] || { echo "Skipping $patch_name: no patch found"; continue; }
        if [ -n "${seen_patches[$patch_id]}" ]; then
            echo "Warning: Duplicate patch content"
            echo "  Original: ${seen_patches[$patch_id]}"
            echo "  Duplicate: $patch_name"
            echo "  Patch ID: $patch_id"
        else
            seen_patches[$patch_id]="$patch_name"
            echo "$patch_name (ID: ${patch_id:0:8})"
        fi
    done
}
analyze_patch_series "/path/to/patches"

# Remove duplicate patches from email threads
deduplicate_email_patches() {
    local mbox_file="$1"
    local split_dir temp_file msg_file patch_id
    echo "Deduplicating patches in $mbox_file"
    declare -A processed_patches
    temp_file=$(mktemp)
    # Split into a scratch directory so nothing in the working tree is touched
    split_dir=$(mktemp -d)
    git mailsplit -o"$split_dir" "$mbox_file" >/dev/null
    for msg_file in "$split_dir"/[0-9]*; do
        [ -f "$msg_file" ] || continue
        # patch-id skips the mail headers and message text by itself;
        # empty output means the message carries no patch
        patch_id=$(git patch-id < "$msg_file" | cut -d' ' -f1)
        if [ -n "$patch_id" ] && [ -n "${processed_patches[$patch_id]}" ]; then
            echo "Skipping duplicate patch: $(basename "$msg_file") (matches ${processed_patches[$patch_id]})"
            continue
        fi
        [ -n "$patch_id" ] && processed_patches[$patch_id]=$(basename "$msg_file")
        # Keep first copies and non-patch messages as-is
        cat "$msg_file" >> "$temp_file"
        echo "" >> "$temp_file" # Message separator
    done
    # Clean up
    rm -rf "$split_dir"
    # Save the deduplicated version next to the original
    mv "$temp_file" "$mbox_file.deduplicated"
    echo "Deduplicated mbox saved as: $mbox_file.deduplicated"
}
deduplicate_email_patches "patches.mbox"

#!/bin/bash
# Patch review workflow with duplicate detection
review_patches() {
    local patch_dir="$1"
    local patch_file patch_id
    echo "Reviewing patches in $patch_dir"
    declare -A reviewed_patches
    for patch_file in "$patch_dir"/*.patch; do
        [ -f "$patch_file" ] || continue
        patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
        # Empty output means git found no diff in the file
        [ -n "$patch_id" ] || continue
        if [ -n "${reviewed_patches[$patch_id]}" ]; then
            echo "⚠ Duplicate patch detected:"
            echo "  Original: ${reviewed_patches[$patch_id]}"
            echo "  Current: $(basename "$patch_file")"
            echo "  Consider rejecting duplicate"
        else
            echo "✓ New patch: $(basename "$patch_file")"
            reviewed_patches[$patch_id]="$(basename "$patch_file")"
            # Check whether the patch would apply (nothing is modified)
            if git apply --check "$patch_file" 2>/dev/null; then
                echo "  ✓ Patch applies cleanly"
            else
                echo "  ✗ Patch does not apply cleanly"
            fi
        fi
    done
}
review_patches "/path/to/review/patches"

# Automated patch management system
manage_patches() {
    local incoming_dir="$1"
    local processed_dir="$2"
    local duplicate_dir="$3"
    local patch_file patch_id filename
    mkdir -p "$processed_dir" "$duplicate_dir"
    declare -A known_patches
    # Load existing patch database
    if [ -f patch-database.txt ]; then
        while IFS='|' read -r patch_id filename; do
            [ -n "$patch_id" ] && known_patches[$patch_id]="$filename"
        done < patch-database.txt
    fi
    # Process incoming patches
    for patch_file in "$incoming_dir"/*.patch; do
        [ -f "$patch_file" ] || continue
        patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
        filename=$(basename "$patch_file")
        # Empty output means git found no diff in the file
        [ -n "$patch_id" ] || { echo "Skipping $filename: no patch found"; continue; }
        if [ -n "${known_patches[$patch_id]}" ]; then
            echo "Duplicate patch: $filename (matches ${known_patches[$patch_id]})"
            mv "$patch_file" "$duplicate_dir/"
        else
            echo "New patch: $filename"
            mv "$patch_file" "$processed_dir/"
            known_patches[$patch_id]="$filename"
            # Apply patch if it applies cleanly
            if git apply --check "$processed_dir/$filename" 2>/dev/null; then
                git am "$processed_dir/$filename"
                echo "✓ Patch applied successfully"
            else
                echo "⚠ Patch needs manual review"
            fi
        fi
    done
    # Save updated database
    : > patch-database.txt
    for patch_id in "${!known_patches[@]}"; do
        echo "$patch_id|${known_patches[$patch_id]}" >> patch-database.txt
    done
}
manage_patches "incoming" "processed" "duplicates"

# Validate patches in CI/CD pipeline
validate_patches_ci() {
    local patch_file patch_id
    echo "CI/CD patch validation"
    # Check for duplicate patches in PR
    declare -A pr_patches
    for patch_file in *.patch; do
        [ -f "$patch_file" ] || continue
        patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
        if [ -z "$patch_id" ]; then
            echo "❌ No patch found in: $patch_file"
            return 1
        fi
        if [ -n "${pr_patches[$patch_id]}" ]; then
            echo "❌ Duplicate patches detected in PR:"
            echo "  ${pr_patches[$patch_id]}"
            echo "  $patch_file"
            return 1
        else
            pr_patches[$patch_id]="$patch_file"
        fi
        # Validate patch applies
        if ! git apply --check "$patch_file" 2>/dev/null; then
            echo "❌ Patch does not apply cleanly: $patch_file"
            return 1
        fi
    done
    echo "✅ All patches validated successfully"
}
# Fail the CI job when validation fails
validate_patches_ci || exit 1

# Choose appropriate algorithm based on use case
# For deduplication across reordered file diffs: default to --stable
git config patchid.stable true
# For compatibility with IDs stored by Git 1.9 and older: default to --unstable
git config patchid.stable false
# For exact content matching: default to --verbatim
git config patchid.verbatim true

# Cache patch IDs for repeated operations
cache_patch_ids() {
    local cache_file=".patch-id-cache"
    local patch_file patch_id
    # Rebuild when the cache is missing or any patch file is newer than it
    if [ ! -f "$cache_file" ] || [ -n "$(find . -maxdepth 1 -name "*.patch" -newer "$cache_file" 2>/dev/null | head -1)" ]; then
        echo "Building patch ID cache..."
        : > "$cache_file"
        for patch_file in *.patch; do
            [ -f "$patch_file" ] || continue
            patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
            echo "$patch_id|$patch_file" >> "$cache_file"
        done
        echo "Cache built with $(wc -l < "$cache_file") entries"
    fi
}
cache_patch_ids

# Debug patch ID inconsistencies
debug_patch_id() {
    local patch_file="$1"
    echo "Debugging patch ID for: $patch_file"
    # Show patch content summary (approximate counts)
    echo "Patch statistics:"
    grep -c '^@@' "$patch_file" | xargs echo "Hunks:"
    # Exclude the "+++" and "---" file header lines from the counts
    grep -Ec '^\+($|[^+])' "$patch_file" | xargs echo "Additions:"
    grep -Ec '^-($|[^-])' "$patch_file" | xargs echo "Deletions:"
    # Compute with different algorithms
    echo "Patch IDs:"
    echo "  Stable: $(git patch-id --stable < "$patch_file" | cut -d' ' -f1)"
    echo "  Unstable: $(git patch-id --unstable < "$patch_file" | cut -d' ' -f1)"
    echo "  Verbatim: $(git patch-id --verbatim < "$patch_file" | cut -d' ' -f1)"
}
debug_patch_id "problematic.patch"

# Handle large patches efficiently
process_large_patches() {
    local patch_dir="$1"
    echo "Processing large patches..."
    # Process in parallel for performance
    find "$patch_dir" -name "*.patch" -print0 | \
        xargs -0 -n 1 -P "$(nproc)" bash -c '
            patch_file="$1"
            patch_id=$(git patch-id < "$patch_file" | cut -d" " -f1)
            echo "$patch_id|$patch_file"
        ' _ | sort > patch-ids.txt
    echo "Processed $(wc -l < patch-ids.txt) patches"
}
process_large_patches "/large/patch/collection"

# Handle different patch encodings
normalize_patch_encoding() {
    local patch_file="$1"
    local encoding
    # Detect encoding
    encoding=$(file -b --mime-encoding "$patch_file")
    case "$encoding" in
        # us-ascii is already valid UTF-8; "binary" cannot be converted
        utf-8|us-ascii|binary) ;;
        *)
            echo "Converting $patch_file from $encoding to UTF-8"
            iconv -f "$encoding" -t utf-8 "$patch_file" > "${patch_file}.utf8" &&
                mv "${patch_file}.utf8" "$patch_file"
            ;;
    esac
    # Normalize line endings (GNU sed syntax; BSD sed needs -i '')
    sed -i 's/\r$//' "$patch_file"
}
normalize_patch_encoding "encoded.patch"

#!/bin/bash
# Manage open source contributions with patch deduplication
manage_contributions() {
    local contribution_dir="$1"
    local patch_file patch_id contributor
    echo "Managing contributions in $contribution_dir"
    declare -A contribution_map
    declare -A duplicate_map
    # Process all contribution patches. The loop reads from process
    # substitution, not a pipe, so the maps survive for the report below.
    while IFS= read -r patch_file; do
        patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
        [ -n "$patch_id" ] || continue
        contributor=$(basename "$(dirname "$patch_file")")
        if [ -n "${contribution_map[$patch_id]}" ]; then
            echo "Duplicate contribution detected:"
            echo "  Original: ${contribution_map[$patch_id]}"
            echo "  Duplicate: $contributor/$(basename "$patch_file")"
            duplicate_map[$patch_id]="${duplicate_map[$patch_id]} $contributor"
        else
            contribution_map[$patch_id]="$contributor/$(basename "$patch_file")"
            echo "✓ New contribution: $contributor/$(basename "$patch_file")"
        fi
    done < <(find "$contribution_dir" -name "*.patch")
    # Report duplicates
    if [ ${#duplicate_map[@]} -gt 0 ]; then
        echo ""
        echo "Duplicate summary:"
        for patch_id in "${!duplicate_map[@]}"; do
            echo "Patch ID $patch_id: ${duplicate_map[$patch_id]}"
        done
    fi
}
manage_contributions "/contributions"

# Code review workflow with patch analysis
review_with_patch_analysis() {
    local pr_number="$1"
    local work_dir known_ids patch_file patch_id hunks additions deletions
    echo "Reviewing PR #$pr_number with patch analysis"
    work_dir=$(mktemp -d)
    # Get PR patches. patch_url serves git format-patch output (an mbox),
    # which mailsplit can split; diff_url is a single plain diff.
    curl -s "https://api.github.com/repos/owner/repo/pulls/$pr_number" |
        jq -r '.patch_url' | xargs curl -sL > "$work_dir/pr.patch"
    # Split into individual patches
    git mailsplit -o"$work_dir" "$work_dir/pr.patch" >/dev/null
    # Compute the patch IDs of existing commits once, not once per patch
    known_ids=$(git log --all --no-merges -p | git patch-id | cut -d' ' -f1)
    # Analyze each patch
    for patch_file in "$work_dir"/[0-9]*; do
        [ -f "$patch_file" ] || continue
        patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
        [ -n "$patch_id" ] || continue
        echo "Analyzing patch: $(basename "$patch_file") (ID: ${patch_id:0:8})"
        # Check for duplicates in codebase
        if grep -qx "$patch_id" <<< "$known_ids"; then
            echo "  ⚠ The same change already exists in the codebase"
        fi
        # Check patch size (approximate; file header lines are excluded)
        hunks=$(grep -c '^@@' "$patch_file")
        additions=$(grep -Ec '^\+($|[^+])' "$patch_file")
        deletions=$(grep -Ec '^-($|[^-])' "$patch_file")
        echo "  Stats: $hunks hunks, +$additions -$deletions lines"
        # Check whether the patch would apply (nothing is modified)
        if git apply --check "$patch_file" 2>/dev/null; then
            echo "  ✓ Applies cleanly"
        else
            echo "  ✗ Does not apply cleanly"
        fi
    done
    # Cleanup
    rm -rf "$work_dir"
}
review_with_patch_analysis "123"

# Automated patch testing and validation
# Requires a clean working tree: every run ends with git reset --hard
automated_patch_testing() {
    local patch_dir="$1"
    local test_script="$2"
    local patch_file patch_id
    echo "Automated patch testing for $patch_dir"
    for patch_file in "$patch_dir"/*.patch; do
        [ -f "$patch_file" ] || continue
        echo "Testing: $(basename "$patch_file")"
        # Apply to the index as well, so that files the patch creates are
        # tracked and the reset below removes them again
        if git apply --index "$patch_file"; then
            echo "  ✓ Patch applied successfully"
            # Run tests
            if [ -x "$test_script" ]; then
                if "$test_script"; then
                    echo "  ✓ Tests passed"
                else
                    echo "  ✗ Tests failed"
                    patch_id=$(git patch-id < "$patch_file" | cut -d' ' -f1)
                    echo "  Patch ID: $patch_id (for investigation)"
                fi
            fi
            # Revert changes
            git reset --hard -q HEAD
        else
            echo "  ✗ Patch failed to apply"
        fi
    done
}
automated_patch_testing "/patches" "./run-tests.sh"

What’s the difference between patch-id algorithms?

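--stable sums per-file hashes, so reordering the file diffs within a patch does not change the ID; --unstable is the default and yields IDs compatible with Git 1.9 and older; --verbatim hashes the input as given, without stripping whitespace.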

How stable are patch IDs across different Git versions?

IDs from --unstable are compatible with those produced by Git 1.9 and older. IDs from --stable differ from --unstable IDs for the same patch, so only compare IDs that were computed with the same algorithm.

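patch-id is designed for text patches. In a default diff, a binary change appears only as a "Binary files ... differ" line, so the file content is not part of the input, and the resulting ID should not be relied on to identify binary changes.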

What’s the performance impact of patch-id on large patches?

Linear with patch size. The algorithm options control compatibility and ordering behavior, not speed.

How do I use patch-id with git format-patch?

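Write the patch to standard output and pipe it to patch-id: git format-patch -1 --stdout | git patch-id. Without --stdout, format-patch writes a file and prints only its name.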

The order of file diffs within a patch matters only with --unstable. With --stable, two patches that contain the same file diffs in a different order have the same ID; with --unstable, their IDs differ.
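
One way to see the effect of file diff order is to hash two patches with each algorithm and compare the results. This is a minimal sketch; compare_algorithms and its arguments are hypothetical names, and the two patch files are assumed to contain the same file diffs in a different order (for example, diffs generated with different -O order files).

```shell
# compare_algorithms A B: for each algorithm, report whether the two
# patch files hash to the same patch ID.
compare_algorithms() {
    local algo id_a id_b
    for algo in --stable --unstable; do
        id_a=$(git patch-id "$algo" < "$1" | cut -d' ' -f1)
        id_b=$(git patch-id "$algo" < "$2" | cut -d' ' -f1)
        # Empty IDs mean git found no patch; never report those as "same".
        if [ -n "$id_a" ] && [ "$id_a" = "$id_b" ]; then
            printf '%s\n' "$algo: same"
        else
            printf '%s\n' "$algo: different"
        fi
    done
}
```

For two patches that differ only in file diff order, compare_algorithms ab.patch ba.patch should report "same" for --stable and "different" for --unstable.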

What’s the relationship between patch-id and commit SHA?

Different concepts: patch-id identifies patch content; commit SHA includes author, date, and parent information.

Use cut -d' ' -f1 to extract just the patch ID. patch-id prints nothing when it finds no patch in the input, so check for empty output as well as exit codes, and handle errors gracefully.
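
In a script, a small wrapper can make the "no patch found" case explicit. The following is a sketch under one assumption worth stating: it treats empty output from git patch-id as the failure signal, since the command prints nothing when its input contains no diff. The function name patch_id_of_file is made up for illustration.

```shell
# patch_id_of_file FILE: print the patch ID of FILE on stdout, or print a
# message on stderr and return 1 when git finds no patch in it.
patch_id_of_file() {
    local id
    id=$(git patch-id < "$1" | cut -d' ' -f1)
    if [ -z "$id" ]; then
        echo "no patch found in $1" >&2
        return 1
    fi
    printf '%s\n' "$id"
}
```

A caller can then write id=$(patch_id_of_file fix.patch) || exit 1 and be sure that $id is never empty.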

patch-id computes an ID from whatever diff it is given, so it also works on partial changes such as the output of git diff -- <path>.

What’s the collision rate for patch IDs?

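Very low. Patch IDs are built from SHA-1 hashes, so two different changes are very unlikely to share an ID. Patches that differ only in line numbers or whitespace share an ID by design; that is equivalence, not a collision.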

How do I compare patches from different sources?

Compute patch IDs for both with the same algorithm and compare. The same ID means the patches almost certainly make the same change, apart from line numbers and whitespace.
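
As a minimal sketch of that comparison, the helper below takes two patch files and succeeds only when both yield the same non-empty ID. The name same_change is hypothetical, and --stable is chosen here only so that both sides are guaranteed to use the same algorithm regardless of configuration.

```shell
# same_change A B: succeed when patch files A and B have the same patch ID.
same_change() {
    local id_a id_b
    id_a=$(git patch-id --stable < "$1" | cut -d' ' -f1)
    id_b=$(git patch-id --stable < "$2" | cut -d' ' -f1)
    # Empty output means git found no patch; treat that as "not the same".
    [ -n "$id_a" ] && [ "$id_a" = "$id_b" ]
}
```

For example: if same_change upstream.patch local.patch; then echo "already applied"; fi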

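For a merge commit, git show prints a combined diff, which is not the plain unified diff that patch-id expects. To get an ID for a merge, diff it against one parent, for example git diff <merge>^1 <merge> | git patch-id.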

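Each output line has the form <patch-id> <commit-id>. When the input names the commit, as git diff-tree, git log -p, and git format-patch output do, the second field is that commit ID; for a bare diff it is all zeros.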
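
Because each line pairs a patch ID with a commit ID, the output can be grouped with standard text tools. The sketch below is one possible approach, not part of git itself; report_duplicates is a hypothetical helper that reads such lines on standard input and prints every patch ID that occurs with more than one commit.

```shell
# report_duplicates: read "<patch-id> <commit-id>" lines on stdin and print
# "<patch-id>: <commit-id> <commit-id> ..." for each ID seen more than once,
# in order of first appearance.
report_duplicates() {
    awk '
        NF >= 2 {
            if (!($1 in count)) order[++n] = $1
            count[$1]++
            commits[$1] = (commits[$1] == "" ? $2 : commits[$1] " " $2)
        }
        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                if (count[id] > 1) print id ": " commits[id]
            }
        }
    '
}
```

A typical invocation would be git log --all --no-merges -p | git patch-id | report_duplicates.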

Use find + xargs for parallel processing, or write a loop that caches results to avoid recomputation.

  1. Duplicate Detection: Identify duplicate patches and commits across repositories and patch series
  2. Patch Management: Organize and deduplicate large collections of patches
  3. Code Review: Detect when similar changes are proposed multiple times
  4. Automated Testing: Validate patch uniqueness in CI/CD pipelines
  5. Contribution Tracking: Manage open source contributions and avoid duplicate work
  6. Patch Series Analysis: Analyze relationships between patches in complex patch sets