mailsplit Git Command Guide
The git mailsplit command is a simple UNIX mbox splitter program that splits mbox files or Maildir directories into individual email message files for further processing. It creates sequentially numbered files (0001, 0002, etc.) in a specified output directory.
git mailsplit Syntax:

    git mailsplit [-b] [-f<nn>] [-d<prec>] [--keep-cr] [--mboxrd] -o<directory> [--] [(<mbox>|<Maildir>)...]

Email Processing Options:
| Option | Description |
|---|---|
| -b | If a file does not begin with a "From " line, treat it as a single mail message instead of reporting an error |
| -f<nn> | Skip the first nn numbers; with -f3, numbering starts at 0004 |
| -d<prec> | Use prec digits for the file names instead of the default 4 |
| --keep-cr | Do not remove the CR from lines ending in CRLF |
| --mboxrd | Treat the input as mboxrd and reverse the ">From " escaping |
| -o<directory> | Directory in which to place the split files (required) |
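The interaction between -f and -d is easiest to see by predicting the first file name by hand. The sketch below is a hypothetical helper (`first_split_name` is not part of Git). It assumes the behaviour described in git-mailsplit(1): -f<nn> skips the first nn numbers and -d<prec> zero-pads names to prec digits.

```bash
#!/bin/sh
# Hypothetical helper, not part of Git: predicts the name of the first file
# that `git mailsplit -f<skip> -d<prec>` would create.
first_split_name() {
    skip=${1:-0}   # value passed to -f (how many numbers to skip)
    prec=${2:-4}   # value passed to -d (number of digits, default 4)
    printf "%0${prec}d\n" "$((skip + 1))"
}

first_split_name          # no -f, default precision: prints 0001
first_split_name 3        # -f3: prints 0004
first_split_name 1000 6   # -f1000 -d6: prints 001001
```

This is only arithmetic on the option values; it does not inspect an existing output directory.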
Parameters:
| Parameter | Description |
|---|---|
| <mbox> | Mbox file to split (read from stdin if omitted) |
| <Maildir> | Root of a Maildir to split; it should contain the cur, tmp and new subdirectories |
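Before running the command in a script, it can help to check how each argument is likely to be interpreted. The following is a rough sketch, not Git's own logic: `classify_input` is a hypothetical helper that assumes a directory is meant as a Maildir root (with cur/ and new/ present), a regular file as an mbox, and no argument as standard input.

```bash
#!/bin/sh
# Hypothetical helper: reports how an argument would probably be treated.
classify_input() {
    if [ $# -eq 0 ]; then
        echo "stdin"                       # no argument: mbox read from stdin
    elif [ -d "$1" ]; then
        # A Maildir root is expected to contain cur/ and new/.
        if [ -d "$1/cur" ] && [ -d "$1/new" ]; then
            echo "maildir"
        else
            echo "directory-without-cur-new"
        fi
    elif [ -f "$1" ]; then
        echo "mbox"
    else
        echo "missing"
    fi
}

# Small demonstration in a throwaway directory
demo=$(mktemp -d)
mkdir -p "$demo/Maildir/cur" "$demo/Maildir/new" "$demo/Maildir/tmp"
: > "$demo/patches.mbox"
classify_input "$demo/Maildir"        # prints maildir
classify_input "$demo/patches.mbox"   # prints mbox
classify_input                        # prints stdin
rm -rf "$demo"
```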
Basic Usage Examples:
Split mbox file into email messages:

    # The output directory must already exist, and its name is attached
    # directly to -o, as in the synopsis
    mkdir -p emails

    # Split mbox file into individual email files (emails/0001, emails/0002, ...)
    git mailsplit -oemails/ patches.mbox

    # Skip the first 1000 numbers, so the first file is 1001
    git mailsplit -f1000 -oemails/ patches.mbox

    # Use 6-digit file names instead of the default 4 (000001, 000002, ...)
    git mailsplit -d6 -oemails/ patches.mbox

Process Maildir directory:

    # Process Maildir into numbered email files
    git mailsplit -ooutput/ ~/Maildir/

    # Combine multiple Maildirs into one numbered sequence
    git mailsplit -oemails/ ~/Maildir ~/another-maildir/

Read from standard input:

    # Read from stdin, split into directory
    cat big-mbox.eml | git mailsplit -osplit-emails/

    # Process email from pipe
    wget -qO- https://example.com/mail.mbox | git mailsplit -odownloads/

Advanced Email Processing:
Large patch set processing:

    # Also accept input that lacks a leading "From " line
    # (such a file is treated as a single message)
    git mailsplit -b -opatches/ large-patches.mbox

    # Keep CRLF line endings instead of stripping the CR
    git mailsplit --keep-cr -oformatted/ special-encoding.mbox

    # Input is in mboxrd format: undo the ">From " escaping while splitting
    git mailsplit --mboxrd -omboxrd/ legacy-mail.mbox

Automated batch processing:

    # Process multiple mbox files (3-digit names, numbering starts at 001)
    for mbox in *.mbox; do
        base=$(basename "$mbox" .mbox)
        mkdir -p "patches/$base"
        git mailsplit -d3 -o"patches/$base/" "$mbox"
    done

    # Process with date-based naming
    timestamp=$(date +%Y%m%d_%H%M%S)
    mkdir -p "mail_$timestamp"
    git mailsplit -o"mail_$timestamp/" inbox.mbox

Email filtering and selection:

    # mailsplit always splits every message, so select a range afterwards
    # (here: messages 100 through 119)
    mkdir -p emails
    git mailsplit -oemails/ large-mailbox.mbox
    ls emails/ | sed -n '100,119p'

    # Split only the mbox files that mention "PATCH"
    grep -l "PATCH" *.mbox | while IFS= read -r mbox; do
        out="patches/$(basename "$mbox" .mbox)"
        mkdir -p "$out"
        git mailsplit -o"$out/" "$mbox"
    done

Integration with Git Workflows:
Patch application workflow:

    #!/bin/bash
    # Complete patch processing workflow

    # Step 1: Split mbox into individual emails
    mkdir -p temp-patches
    git mailsplit -otemp-patches/ patch-series.mbox

    # Step 2: Process each patch
    for patch in temp-patches/*; do
        echo "Processing $patch"

        # Extract commit message and patch for inspection
        # (author, subject and date are printed to stdout)
        git mailinfo msg.txt patch.txt < "$patch"

        # Apply the email itself; git am expects a message, not a bare diff
        git am "$patch"

        # Clean up
        rm msg.txt patch.txt
    done

    # Clean up temporary directory
    rm -rf temp-patches

Concurrent patch processing:

    #!/bin/bash
    # Process multiple patch series in parallel

    process_patch_series() {
        local mbox_file="$1"
        local output_dir="$2"

        # Create output directory
        mkdir -p "$output_dir"

        # Split the mbox
        git mailsplit -o"$output_dir/" "$mbox_file"

        echo "Split $(ls "$output_dir" | wc -l) emails from $mbox_file"
    }

    # Process multiple series concurrently
    process_patch_series "series1.mbox" "out1" &
    process_patch_series "series2.mbox" "out2" &
    process_patch_series "series3.mbox" "out3" &
    wait

    echo "All patch series processed"

Quality control and validation:

    # Validate mbox formatting before processing
    validate_mbox() {
        local mbox_file="$1"

        # Check for mbox "From " separator lines. Do not require an "@":
        # git format-patch writes "From <commit-id> <date>".
        if grep -q '^From ' "$mbox_file"; then
            echo "Valid mbox format detected"

            # Count emails
            count=$(grep -c '^From ' "$mbox_file")
            echo "Found $count emails"

            return 0
        else
            echo "Invalid mbox format"
            return 1
        fi
    }

    # Process only valid mbox files
    for mbox in *.mbox; do
        if validate_mbox "$mbox"; then
            out="validated/$(basename "$mbox" .mbox)"
            mkdir -p "$out"
            git mailsplit -o"$out/" "$mbox"
        fi
    done

Configuration and Optimization:
Directory structure creation:

    # Create organized directory structure
    output_base="processed-patches"
    timestamp=$(date +%Y%m%d_%H%M%S)

    output_dir="$output_base/$timestamp"
    mkdir -p "$output_dir"

    # Process with organized output
    for series in *-patches.mbox; do
        series_name=$(basename "$series" -patches.mbox)
        series_dir="$output_dir/$series_name"
        mkdir -p "$series_dir"

        git mailsplit -o"$series_dir/" "$series"
        echo "Processed $series_name: $(ls "$series_dir" | wc -l) patches"
    done

Large mailbox handling:

    # Handle very large mbox files
    big_mbox="large-archive.mbox"
    total_emails=$(grep -c '^From ' "$big_mbox")

    # Split in batches
    batch_size=1000
    for ((start=1; start<=total_emails; start+=batch_size)); do
        end=$((start + batch_size - 1))
        if [ "$end" -gt "$total_emails" ]; then end=$total_emails; fi

        echo "Processing emails $start to $end"

        # Extract messages $start..$end by counting "From " separator lines
        awk -v s="$start" -v e="$end" '/^From /{n++} n>=s && n<=e' "$big_mbox" > "batch_$start.mbox"

        # Process batch; -f skips numbers, so -f$((start - 1)) makes the first file $start
        mkdir -p "batches/batch_$start"
        git mailsplit -f$((start - 1)) -o"batches/batch_$start/" "batch_$start.mbox"

        # Clean up
        rm "batch_$start.mbox"
    done

Troubleshooting Common Issues:
Malformed mbox format:

    # Check mbox format validity
    grep -n "^From " mailbox.mbox | head -5

    # Fix mbox format issues:
    # add any missing "From " lines, then normalize line endings to LF
    dos2unix mailbox.mbox

    # Convert to standard mbox format
    formail -ds < mailbox.mbox | git mailsplit -ofixed/

Empty or invalid patches:

    # Verify patch content
    ls -la output-dir/
    head -10 output-dir/0001

    # Count zero-byte files first, then remove them
    empty_count=$(find output-dir/ -type f -size 0 | wc -l)
    find output-dir/ -type f -size 0 -delete
    echo "Removed $empty_count empty files"

    # Report files in which git mailinfo finds no author
    for f in output-dir/*; do
        git mailinfo /dev/null /dev/null < "$f" | grep -q '^Author:' || echo "No author found: $f"
    done

Maildir processing issues:

    # Check Maildir structure
    ls -la ~/Maildir/cur/ ~/Maildir/new/

    # Pass the Maildir root, not individual files: mailsplit reads cur/ and new/
    # and orders the messages by file name on its own
    mkdir -p processed
    git mailsplit -oprocessed/ ~/Maildir/

Encoding and character issues:

    # mailsplit does not convert character sets; --mboxrd only reverses ">From " escaping
    git mailsplit --mboxrd -oprocessed/ international-mails.mbox

    # Convert encoding before processing (only safe if the whole file uses one charset)
    iconv -f iso-8859-1 -t utf-8 foreign-mails.mbox | git mailsplit -oconverted/

    # Keep CRLF line endings
    git mailsplit --keep-cr -opreserved/ special-chars.mbox

Real-World Usage Examples:
Open-source project patch management:

    #!/bin/bash
    # Process patch submissions from mailing list

    # Download latest patches from list
    wget -q https://lists.kernel.org/archive/patches.mbox

    # Split into individual patches
    mkdir -p kernel-patches
    git mailsplit -okernel-patches/ patches.mbox

    # Validate each patch: mailinfo must succeed and extract a non-empty diff
    for patch in kernel-patches/*; do
        if git mailinfo /dev/null patch-file.txt < "$patch" > /dev/null 2>&1 && [ -s patch-file.txt ]; then
            echo "✓ Valid patch: $patch"
        else
            echo "✗ Invalid patch: $patch"
        fi
    done

    # Apply the emails in order; git am takes the message, not the extracted diff
    for patch in kernel-patches/*; do
        git am "$patch"
    done

    # Clean up
    rm -rf kernel-patches/ patch-file.txt patches.mbox

Corporate patch review workflow:

    # Automated patch processing for review
    patch_dir="review-patches"
    processed_dir="processed"

    # Create directories
    mkdir -p "$patch_dir" "$processed_dir"

    # Receive and split patches
    # (Assuming patches.mbox is received via email)
    git mailsplit -o"$patch_dir/" patches.mbox

    # Process each patch for review
    for patch_file in "$patch_dir"/*; do
        patch_num=$(basename "$patch_file")

        # Extract patch information
        git mailinfo "$processed_dir/${patch_num}-msg.txt" \
            "$processed_dir/${patch_num}-patch.txt" \
            < "$patch_file"

        # Create review directory
        review_base="$processed_dir/review-$patch_num"
        mkdir -p "$review_base"

        # Generate review information
        echo "Original email: $patch_file" > "$review_base/review-info.txt"
        echo "Commit message:" >> "$review_base/review-info.txt"
        cat "$processed_dir/${patch_num}-msg.txt" >> "$review_base/review-info.txt"

        # Check patch applicability
        if git apply --check "$processed_dir/${patch_num}-patch.txt" 2>/dev/null; then
            echo "Status: Can apply cleanly" >> "$review_base/review-info.txt"
        else
            echo "Status: Conflicts detected" >> "$review_base/review-info.txt"
        fi

        # Assign reviewer based on file changes
        if grep -q "src/security/" "$processed_dir/${patch_num}-patch.txt"; then
            echo "Assigned Reviewer: security-team" >> "$review_base/review-info.txt"
        fi

        echo "Prepared review for patch $patch_num"
    done

    # Notify review team
    echo "Prepared $(ls "$patch_dir" | wc -l) patches for review in $processed_dir"

Continuous integration integration:

    #!/bin/bash
    # CI pipeline for patch validation

    # Split incoming patches
    patches_received="incoming-patches.mbox"
    temp_dir=$(mktemp -d "/tmp/patch-validation-XXXXXX")

    # Remove the temporary directory on every exit path
    trap 'rm -rf "$temp_dir"' EXIT

    git mailsplit -o"$temp_dir/" "$patches_received"

    failed_patches=0
    successful_patches=0

    # Validate each patch
    for patch_file in "$temp_dir"/*; do
        echo "Validating patch: $(basename "$patch_file")"

        # Test patch application
        if git mailinfo "/dev/null" "$temp_dir/test.patch" < "$patch_file" > /dev/null 2>&1; then
            # Check against the current checkout without modifying it
            if git apply --check "$temp_dir/test.patch" > /dev/null 2>&1; then
                echo "✓ Patch applies cleanly"
                successful_patches=$((successful_patches + 1))
            else
                echo "✗ Patch has conflicts"
                failed_patches=$((failed_patches + 1))
            fi
        else
            echo "✗ Invalid patch format"
            failed_patches=$((failed_patches + 1))
        fi

        # Clean up test patch
        rm -f "$temp_dir/test.patch"
    done

    # Summary report
    total_patches=$((successful_patches + failed_patches))
    echo "Patch validation complete:"
    echo "Total patches: $total_patches"
    echo "Successful: $successful_patches"
    echo "Failed: $failed_patches"

    # Exit with appropriate code (the EXIT trap removes the temporary directory)
    if [ "$failed_patches" -gt 0 ]; then
        echo "Some patches failed validation"
        exit 1
    else
        echo "All patches passed validation"
        exit 0
    fi

How does mailsplit handle different mbox formats?
By default mailsplit treats the input as a traditional mbox and copies each message out unchanged. With --mboxrd it treats the input as mboxrd and reverses the ">From " escaping, removing one leading ">" from lines that consist of one or more ">" followed by "From ".
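To make the mboxrd escaping rule concrete, here is a minimal sketch of the un-escaping step written with sed. It is an illustration of the rule only, not Git's implementation, and `unescape_mboxrd` is a hypothetical name.

```bash
#!/bin/sh
# Sketch of mboxrd un-escaping: strip one leading ">" from lines that
# consist of one or more ">" followed by "From ".
unescape_mboxrd() {
    sed 's/^>\(>*From \)/\1/'
}

printf '%s\n' '>From the start' '>>From a quoted reply' '> From untouched' |
    unescape_mboxrd
# prints:
#   From the start
#   >From a quoted reply
#   > From untouched
```

Only lines where the ">" run is immediately followed by "From " are changed; ordinary quoted text is left alone.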
What’s the difference between mbox and Maildir processing?
An mbox file contains all messages concatenated and separated by "From " lines; a Maildir stores each message as a separate file under cur/ and new/. mailsplit writes the same numbered output files for both.
How do I handle mal-ordered Maildir files?
mailsplit orders Maildir messages by file name before numbering them, so no manual sorting is needed. If file-name order does not match patch order, rename the split output files or apply them in the order you need.
Can mailsplit process compressed mbox files?
No. Decompress to standard output and pipe the result: gunzip -c archive.mbox.gz | git mailsplit -ooutput/ or zcat archive.mbox.gz | git mailsplit -ooutput/
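When archives arrive in several compression formats, a small wrapper can choose the decompressor from the file extension. This is a sketch under stated assumptions: `mbox_cat` is a hypothetical helper, and it assumes gzip, bzip2 and xz are installed for the formats you actually use.

```bash
#!/bin/sh
# Hypothetical helper: write an mbox to stdout, decompressing by extension.
mbox_cat() {
    case "$1" in
        *.gz)  gzip -dc "$1" ;;
        *.bz2) bzip2 -dc "$1" ;;
        *.xz)  xz -dc "$1" ;;
        *)     cat "$1" ;;
    esac
}

# Usage (the output directory must already exist):
#   mkdir -p output
#   mbox_cat archive.mbox.gz | git mailsplit -ooutput/
```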
What’s the impact of -b option on patch processing?
-b does not truncate messages. If a file does not begin with a "From " line, mailsplit normally reports an error; with -b it assumes the file is a single mail message instead.
How do I handle emails with attachments?
mailsplit copies each message whole, attachments included. Use an email processing tool such as munpack if you need to separate attachments before Git processing.
What’s the performance overhead for large mbox files?
Memory use is minimal because messages are processed sequentially. Run time grows with the size of the mbox and the number of messages.
Can mailsplit work with multipart messages?
mailsplit works on the raw message text and does not decode MIME multipart. Use email parsing tools if you need to handle multipart content.
How do I reconstruct original mbox from split files?
cat output-dir/* > reconstructed.mbox concatenates the messages again, but changes made during splitting (stripped CRs, reversed mboxrd escaping) are not undone. Not recommended for archival purposes.
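A plain glob sorts names as text, which can misorder files once some names have more digits than others. The sketch below concatenates in numeric order instead. `rebuild_mbox` is a hypothetical helper; it assumes the directory contains only mailsplit output (purely numeric names) and that the output file lives outside that directory.

```bash
#!/bin/sh
# Hypothetical helper: concatenate split messages back into one file.
rebuild_mbox() {
    src_dir=$1
    out_file=$2   # must not be inside src_dir
    # Names are purely numeric, so a numeric sort restores message order
    # even when some names have more digits than others.
    ls "$src_dir" | sort -n | while IFS= read -r name; do
        cat "$src_dir/$name"
    done > "$out_file"
}

# Usage:
#   rebuild_mbox output-dir reconstructed.mbox
```

As noted above, this does not restore anything that splitting changed, so the result may differ from the original file.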
What’s the relationship between mailsplit and git am?
mailsplit splits an mbox or Maildir into individual emails; git am applies patches from emails and runs mailsplit itself when given an mbox. Calling mailsplit directly is mainly useful when you want to inspect or filter messages before applying them.
Can mailsplit handle international email encodings?
mailsplit copies message bytes as they are and performs no charset conversion. Use --keep-cr to preserve CRLF line endings, and convert encodings with iconv before or after splitting if needed.
How do I handle emails with multiple patches?
Each patch should be a separate email. If one email contains several patches, split it manually before running mailsplit.
What’s the output format naming convention?
Files are numbered sequentially with four digits by default: 0001, 0002, and so on. -f<nn> skips the first nn numbers (with -f3 the first file is 0004), and -d<prec> sets the number of digits.
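Because the names are zero-padded numbers, a numeric range maps directly onto file names. The sketch below lists the files for a given range. `files_in_range` is a hypothetical helper, and the precision argument is assumed to match the -d value used when splitting.

```bash
#!/bin/sh
# Hypothetical helper: print the paths of messages FROM..TO that exist in DIR.
files_in_range() {
    dir=$1; from=$2; to=$3; prec=${4:-4}
    n=$from
    while [ "$n" -le "$to" ]; do
        path=$(printf "%s/%0${prec}d" "$dir" "$n")
        # Skip numbers that have no file
        [ -f "$path" ] && printf '%s\n' "$path"
        n=$((n + 1))
    done
}

# Usage: show the subject lines of messages 100 through 119
#   files_in_range emails 100 119 | xargs grep -m1 '^Subject:'
```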
Can mailsplit process email threads?
Treats each email individually. Thread context is lost unless processed through email client or tools that understand threading.
How do I troubleshoot empty output files?
This usually indicates mbox format problems. Check the format of the “From ” separator lines and make sure the messages are properly separated in the mbox file.
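One quick way to check the separators is to count lines that look like mbox "From " lines. The pattern below is an approximation chosen for illustration, assuming a separator ends in a time (HH:MM:SS) followed by a four-digit year; Git's own check may accept or reject slightly different lines. `count_separators` is a hypothetical helper.

```bash
#!/bin/sh
# Approximate check: "From " followed later by HH:MM:SS and a 4-digit year.
count_separators() {
    grep -cE '^From .*[0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}' "$1"
}

# Demonstration on a throwaway sample with two messages
sample=$(mktemp)
cat > "$sample" <<'EOF'
From 1234abcd Mon Sep 17 00:00:00 2001
Subject: [PATCH 1/2] first

From here on, this body line is not a separator.

From dev@example.com Tue Sep 18 09:30:15 2001
Subject: [PATCH 2/2] second
EOF
count_separators "$sample"   # prints 2
rm -f "$sample"
```

If the count is lower than the number of messages you expect, the missing separators are a likely cause of empty or merged output files.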
What’s the relationship between mailsplit and git-mailinfo?
mailsplit splits an mbox into individual emails; mailinfo extracts the commit message and the patch from a single email. The two are used together in patch workflows.
Applications of the git mailsplit command
- Patch Series Processing: Split mbox archives into individual patches for git-am application
- Email Archive Management: Organize large email archives into manageable individual files
- Automated Patch Management: Process incoming patch emails in CI/CD pipelines
- Mailing List Integration: Handle patch submissions from development mailing lists
- Code Review Workflows: Prepare email-based patch reviews for individual assessment
- Backup and Recovery: Extract individual emails from corrupted mail archives