MESSAGE
DATE | 2007-11-18
FROM | Ron Guerin
SUBJECT | Re: [NYLXS - HANGOUT] Website Updates
Ron Guerin wrote:
> Ruben Safir wrote:
>> From: ruben-at-mrbrklyn Sun Nov 18 01:22:56 2007
>>
>> I think that literally written by sendmail when the mail is received
>> and entered on the first line of the mail, which is why the only
>> thing you can trust in your headers with regard to spam is the
>> very first line.
>
> I've gotta say, that Sendmail's choice of delimiter is almost grossly
> inappropriate since it makes it much harder than necessary to sort out
> message delimiters from From: headers. They could have and should have
> used any string that wasn't the letters "f","r","o","m" and a colon.

You sure there's a colon there?
The best description of the mbox format that I was able to turn up on Google suggests the delimiter you're showing above is invalid:
A message encoded in mbox format begins with a From_ line, continues with a series of non-From_ lines, and ends with a blank line. A From_ line means any line that begins with the characters F, r, o, m, space.
The final line is a completely blank line (no spaces or tabs). Notice that blank lines may also appear elsewhere in the message. If the last line of the message was a partial line, the delivering program writes two newlines; otherwise it writes one.
The From_ line always looks like From envsender date moreinfo. envsender is one word, without spaces or tabs; it is usually the envelope sender of the message. date is the delivery date of the message. It always contains exactly 24 characters in asctime format. moreinfo is optional; it may contain arbitrary information.
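To make that layout concrete, here is a rough Python sketch of a check for the From_ line as described above. The names `FromLine` and `parse_from_line` are made up for illustration, and the sketch assumes the date is the English-language asctime form (as in "Sun Nov 18 01:22:56 2007") and that a single space separates envsender from the date.

```python
import time
from typing import NamedTuple, Optional


class FromLine(NamedTuple):
    """Hypothetical container for the three From_ line fields."""
    envsender: str
    date: time.struct_time
    moreinfo: str


def parse_from_line(line: str) -> Optional[FromLine]:
    """Return the parsed fields, or None if this is not a valid From_ line."""
    line = line.rstrip("\n")
    # A From_ line starts with F, r, o, m, space. "From:" does not qualify.
    if not line.startswith("From "):
        return None
    envsender, sep, tail = line[5:].partition(" ")
    # envsender must be one word with no tabs, followed by a 24-character date.
    if not envsender or "\t" in envsender or not sep or len(tail) < 24:
        return None
    try:
        date = time.strptime(tail[:24], "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None
    # Anything after the date is the optional moreinfo field.
    return FromLine(envsender, date, tail[24:].strip())
```

Fed the line quoted at the top of this message (with the colon), this returns None; with the colon removed it parses cleanly.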
Between the From_ line and the blank line is a message in RFC 822 format.
>From quoting ensures that the resulting lines are not From_ lines: the program prepends a > to any From_ line, >From_ line, >>From_ line, >>>From_ line, etc.
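A minimal sketch of that quoting rule, together with the blank-line rule described earlier, might look like the following. The function names `quote_from_lines` and `mbox_entry` are hypothetical, and the sketch assumes the message is already a string with "\n" line endings and that the caller supplies the date in asctime form.

```python
import re

# Matches the start of any From_ line, >From_ line, >>From_ line, and so on.
_FROM_RE = re.compile(r"^(>*From )", re.MULTILINE)


def quote_from_lines(message: str) -> str:
    """Prepend one '>' to every line that would otherwise look like a From_ line."""
    return _FROM_RE.sub(r">\1", message)


def mbox_entry(envsender: str, date: str, message: str) -> str:
    """Build the text to append to an mbox file for one message."""
    body = quote_from_lines(message)
    # If the last line is partial, finish it first; that is the first of the
    # "two newlines". The second newline below is the final blank line.
    if body and not body.endswith("\n"):
        body += "\n"
    return f"From {envsender} {date}\n{body}\n"
```

Note that a "From:" header is left alone, since it is not a From_ line in the first place.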
HOW A MESSAGE IS READ
A reader scans through an mbox file looking for From_ lines. Any From_ line marks the beginning of a message. The reader should not attempt to take advantage of the fact that every From_ line (past the beginning of the file) is preceded by a blank line.
Once the reader finds a message, it extracts a (possibly corrupted) envelope sender and delivery date out of the From_ line. It then reads until the next From_ line or end of file, whichever comes first. It strips off the final blank line and deletes the quoting of >From_ lines and >>From_ lines and so on. The result is an RFC 822 message.
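The reading procedure above could be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the whole mbox is already in memory as a string with "\n" line endings, and `read_mbox` and `_finish` are made-up names, not part of any mail tool.

```python
import io
import re
from typing import Iterator, List, Tuple

# One level of quoting: ">From " becomes "From ", ">>From " becomes ">From ".
_UNQUOTE_RE = re.compile(r"^>(>*From )", re.MULTILINE)


def _finish(lines: List[str]) -> str:
    """Strip the final blank line and undo one level of >From quoting."""
    if lines and lines[-1] == "\n":
        lines = lines[:-1]
    return _UNQUOTE_RE.sub(r"\1", "".join(lines))


def read_mbox(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (from_line, message) pairs from mbox-formatted text."""
    from_line = None
    body: List[str] = []
    for line in io.StringIO(text):
        # Any From_ line starts a new message; no reliance on a preceding blank line.
        if line.startswith("From "):
            if from_line is not None:
                yield from_line, _finish(body)
            from_line, body = line.rstrip("\n"), []
        elif from_line is not None:
            body.append(line)
    if from_line is not None:
        yield from_line, _finish(body)
```

Each yielded message should then be ordinary RFC 822 text, with the envelope sender and delivery date still available in the returned From_ line.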
- Ron