Refactor received header parsing: replace regex list with RFC 5321 to… #150

Open
fedelemantuano wants to merge 1 commit into develop from claude/improve-header-regex-vCFLu

@fedelemantuano
Contributor

…kenizer

Replace the 10 separate regex patterns (each duplicating boundary lookaheads) with a keyword-based tokenizer aligned with RFC 5321 §4.4 grammar. Key improvements:

  • Tokenize on clause keywords (from/by/via/with/id/for) in a single pass instead of running 10 independent regex searches
  • Handle IBM "for <addr> from <sender>" by accepting only the first 'from' clause per header
  • Extract envelope-from/sender from parenthesized comments in clause values
  • Validate IPv4 octets (0-255) instead of matching any digits; add IPv6 support via REGXIP6
  • Simplify JUNK_PATTERN to only collapse tabs/newlines, preserving parenthesized comments and bracketed IPs
  • Add 27-test corpus covering Postfix, Exim, Exchange, Gmail, SendGrid, IBM/Domino, AWS SES, and edge cases
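
As a rough illustration of the single-pass, keyword-based approach described above, here is a minimal sketch. It is not the code from this PR: `CLAUSE_RE`, `_paren_depths`, and `tokenize_received` are hypothetical names, and skipping keywords inside parenthesized comments is an assumption about how comments are kept intact.

```python
import re

# Clause keywords listed in the description; the pattern itself is an assumption.
CLAUSE_RE = re.compile(r"(?:^|\s)(from|by|via|with|id|for)\s+", re.IGNORECASE)


def _paren_depths(text):
    """Return the parenthesis nesting depth for every character of text."""
    depth, depths = 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        depths.append(depth)
        if ch == ")" and depth:
            depth -= 1
    return depths


def tokenize_received(header):
    """Split a Received header into clauses keyed by keyword (hypothetical helper)."""
    # The date-time follows the last ';' in the header.
    body, sep, date = header.rpartition(";")
    if not sep:
        body, date = header, ""
    # Collapse tabs/newlines only, so comments and bracketed IPs survive.
    body = re.sub(r"[\t\r\n]+", " ", body).strip()
    depths = _paren_depths(body)
    # Keep only keywords that sit outside parenthesized comments.
    matches = [m for m in CLAUSE_RE.finditer(body) if depths[m.start(1)] == 0]
    clauses = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        # setdefault keeps the first occurrence, so a second 'from' is ignored.
        clauses.setdefault(m.group(1).lower(), body[m.end():end].strip())
    if date.strip():
        clauses["date"] = date.strip()
    return clauses


ibm = ("from a.example.com by b.example.com (using TLS with cipher X) "
       "with ESMTP id 4ABC for <rcpt@example.org> from <sender@example.org>; "
       "Wed, 18 Mar 2026 10:00:00 +0000")
parsed = tokenize_received(ibm)
```

In this sketch the trailing `from <sender@example.org>` ends the `for` clause but does not overwrite the first `from` value, which mirrors the "first 'from' clause only" rule for the IBM format.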

https://claude.ai/code/session_01CwmwWkvZGLpTBY6ApKFi79
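
To make the IPv4 point from the list above concrete, the following sketch contrasts a loose "any digits" pattern with one that restricts each octet to 0-255. The names `OCTET`, `IPV4_RE`, `LOOSE_RE`, and `find_ipv4` are illustrative assumptions; the actual patterns in this PR (including `REGXIP6`) are not shown in the description.

```python
import re

# One octet in the range 0-255, without leading zeros (assumed form).
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# Four octets, not embedded in a longer run of digits or dots.
IPV4_RE = re.compile(rf"(?<![\d.]){OCTET}(?:\.{OCTET}){{3}}(?![\d.])")

# The looser style of pattern: any 1-3 digits per group.
LOOSE_RE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def find_ipv4(text):
    """Return every validated IPv4 address found in text (hypothetical helper)."""
    return [m.group(0) for m in IPV4_RE.finditer(text)]


valid = find_ipv4("from mail.example.com (mail.example.com [192.0.2.1])")
invalid = find_ipv4("from bogus (bogus [999.1.1.1])")
loose_hit = LOOSE_RE.search("from bogus (bogus [999.1.1.1])")
```

Here `valid` holds `192.0.2.1`, `invalid` is empty, and the loose pattern still matches `999.1.1.1`. For stricter validation, the standard library's `ipaddress.ip_address` could be applied to each candidate instead of a regex.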

@fedelemantuano self-assigned this Mar 18, 2026