
This module is BETA code, which means that the interfaces are fairly stable BUT it has not been out in the community long enough to guarantee much testing. Use with caution! Please report any errors back to eryq@zeegee.com as soon as you can.
 NAME
MIME::Head - MIME message header (a subclass of Mail::Header)
 SYNOPSIS
Before reading further, you should see MIME::Tools to make sure that you understand where this module fits into the grand scheme of things. Go on, do it now. I'll wait.
Ready? Ok...
 Construction
    ### Create a new, empty header, and populate it manually:    
    $head = MIME::Head->new;
    $head->replace('content-type', 'text/plain; charset=US-ASCII');
    $head->replace('content-length', $len);
    
    ### Parse a new header from a filehandle:
    $head = MIME::Head->read(\*STDIN);
    
    ### Parse a new header from a file, or a readable pipe:
    $testhead = MIME::Head->from_file("/tmp/test.hdr");
    $a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");
 Output
    ### Output to filehandle:
    $head->print(\*STDOUT);  
    
    ### Output as string:
    print STDOUT $head->as_string;
    print STDOUT $head->stringify;
 Getting field contents
    ### Is this a reply?
    $is_reply = 1 if ($head->get('Subject') =~ /^Re: /);
    
    ### Get receipt information:
    print "Last received from: ", $head->get('Received', 0), "\n";
    @all_received = $head->get('Received');
    
    ### Print the subject, or the empty string if none:
    print "Subject: ", $head->get('Subject',0), "\n";
     
    ### Too many hops?  Count 'em and see!
    if ($head->count('Received') > 5) { ... }
    
    ### Test whether a given field exists
    warn "missing subject!" if (! $head->count('subject'));
 Setting field contents
    ### Declare this to be an HTML header:
    $head->replace('Content-type', 'text/html');
 Manipulating field contents
    ### Get rid of internal newlines in fields:
    $head->unfold;
    
    ### Decode any Q- or B-encoded-text in fields (DEPRECATED):
    $head->decode;
     
 Getting high-level MIME information
    ### Get/set a given MIME attribute:
    unless ($charset = $head->mime_attr('content-type.charset')) {
        $head->mime_attr("content-type.charset" => "US-ASCII");
    }
    ### The content type (e.g., "text/html"):
    $mime_type     = $head->mime_type;
    
    ### The content transfer encoding (e.g., "quoted-printable"):
    $mime_encoding = $head->mime_encoding;
    
    ### The recommended name when extracted:
    $file_name     = $head->recommended_filename;
    
    ### The boundary text, for multipart messages:
    $boundary      = $head->multipart_boundary;
 DESCRIPTION
A class for parsing in and manipulating RFC-822 message headers, with some methods geared towards standard (and not so standard) MIME fields as specified in RFC-1521, Multipurpose Internet Mail Extensions.
 PUBLIC INTERFACE
 Creation, input, and output
    ### Create a new header by parsing in a file:
    my $head = MIME::Head->from_file("/tmp/test.hdr");
Since this method can function as either a class constructor or 
an instance initializer, the above is exactly equivalent to:
    ### Create a new header by parsing in a file:
    my $head = MIME::Head->new->from_file("/tmp/test.hdr");
On success, the object will be returned; on failure, the undefined value.
The OPTIONS are the same as in new(), and are passed into new() if this is invoked as a class method.
Note: This is really just a convenience front-end onto read(),
provided mostly for backwards-compatibility with MIME-parser 1.0.
Supply this routine with a reference to a filehandle glob; e.g., \*STDIN:
    ### Create a new header by parsing in STDIN:
    $head->read(\*STDIN);
On success, the self object will be returned; on failure, a false value.
Note: in the MIME world, it is perfectly legal for a header to be empty, consisting of nothing but the terminating blank line. Thus, we can't just use the formula that "no tags equals error".
Warning: as of the time of this writing, Mail::Header::read did not flag either syntax errors or unexpected end-of-file conditions (an EOF before the terminating blank line). MIME::ParserBase takes this into account.
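The two cases above (a legally empty header versus a header cut short by EOF) can be told apart by tracking whether the terminating blank line was actually seen. The following is only a minimal sketch in plain Perl: read_header_lines() is a hypothetical helper, not part of MIME::Head or Mail::Header, and it ignores folded continuation lines.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIME::Head): read raw header lines from a
# filehandle up to the terminating blank line.  Returns (\@lines, $terminated),
# where $terminated is false if EOF arrived before the blank line.
sub read_header_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\r?\n\z//;                 # strip LF or CRLF
        return (\@lines, 1) if $line eq '';   # blank line ends the header
        push @lines, $line;
    }
    return (\@lines, 0);                      # EOF before the blank line
}

# An empty header (just the blank line) is legal ...
open(my $empty, '<', \"\nbody text\n") or die "open: $!";
my ($empty_lines, $empty_ok) = read_header_lines($empty);

# ... whereas a header cut off by EOF is not properly terminated.
open(my $cut, '<', \"Subject: hi\nTo: you") or die "open: $!";
my ($cut_lines, $cut_ok) = read_header_lines($cut);

printf "empty: %d line(s), terminated=%d\n", scalar @$empty_lines, $empty_ok;
printf "cut:   %d line(s), terminated=%d\n", scalar @$cut_lines,   $cut_ok;
```

Returning the "terminated" flag separately from the line list is what keeps "no tags" from being mistaken for an error.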
 Getting/setting fields
The following are methods related to retrieving and modifying the header fields. Some are inherited from Mail::Header, but I've kept the documentation around for convenience.
    ### Add the trace information:    
    $head->add('Received', 
               'from eryq.pr.mcs.net by gonzo.net with smtp');
Normally, the new occurrence will be appended to the existing occurrences. However, if the optional INDEX argument is 0, then the new occurrence will be prepended. If you want to be explicit about appending, specify an INDEX of -1.
Warning: this method always adds new occurrences; it doesn't overwrite
any existing occurrences... so if you just want to change the value
of a field (creating it if necessary), then you probably don't want to use 
this method: consider using replace() instead.
    ### Was a "Subject:" field given?
    $subject_was_given = $head->count('subject');
The TAG is treated in a case-insensitive manner. This method returns some false value if the field doesn't exist, and some true value if it does.
:-). 
This method has been deprecated.
See decode_headers for the full reasons.
If you absolutely must use it and don't like the warning, then
provide a FORCE:
   "I_NEED_TO_FIX_THIS"
          Just shut up and do it.  Not recommended.
          Provided only for those who need to keep old scripts functioning.
   "I_KNOW_WHAT_I_AM_DOING"
          Just shut up and do it.  Not recommended.
          Provided for those who REALLY know what they are doing.
What this method does.
For an example, let's consider a valid email header you might get:
    From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
    To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
    CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
    Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
     =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
     =?US-ASCII?Q?.._cool!?=
That basically decodes to (sorry, I can only approximate the
Latin characters with 7 bit sequences /o and 'e):
    From: Keith Moore <moore@cs.utk.edu>
    To: Keld J/orn Simonsen <keld@dkuug.dk>
    CC: Andr'e  Pirard <PIRARD@vm1.ulg.ac.be>
    Subject: If you can read this you understand the example... cool!
Note: currently, the decodings are done without regard to the
character set: thus, the Q-encoding =F8 is simply translated to the
octet (hexadecimal F8), period.  For piece-by-piece decoding
of a given field, you want the array context of 
MIME::Word::decode_mimewords().
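To make the "without regard to the character set" behaviour concrete, here is a rough sketch of octet-level decoding using only the core MIME::Base64 module. The helpers decode_word() and decode_field() are hypothetical names invented for this illustration; this is not the code that decode() itself runs.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: decode one RFC-1522 encoded-word payload to raw octets,
# ignoring the declared character set (as the note above describes).
sub decode_word {
    my ($charset, $encoding, $payload) = @_;
    if (uc($encoding) eq 'Q') {
        $payload =~ tr/_/ /;                              # "_" stands for a space
        $payload =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # =F8 -> octet 0xF8
        return $payload;
    }
    return decode_base64($payload) if uc($encoding) eq 'B';
    return $payload;                                      # unknown encoding: leave as-is
}

# Hypothetical helper: decode every encoded word in a field body.
sub decode_field {
    my ($text) = @_;
    # White space (including a folding newline) between two adjacent encoded
    # words is only a separator, so drop it before decoding.
    $text =~ s/(\?=)\s+(?==\?)/$1/g;
    $text =~ s/=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=/decode_word($1, $2, $3)/ge;
    return $text;
}

my $from    = decode_field('=?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>');
my $to      = decode_field('=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>');
my $subject = decode_field(
    "=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=\n"
  . " =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=\n"
  . " =?US-ASCII?Q?.._cool!?=");
```

In this sketch $to holds the single octet 0xF8 where the Latin-1 character was encoded; nothing maps it to a Perl character string, which mirrors the limitation described in the note.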
Warning: the CRLF+SPACE separator that splits up long encoded words into shorter sequences (see the Subject: example above) gets lost when the field is unfolded, and so decoding after unfolding causes a spurious space to be left in the field. THEREFORE: if you're going to decode, do so BEFORE unfolding!
This method returns the self object.
Thanks to Kent Boortz for providing the idea, and the baseline RFC-1522-decoding code.
    ### Remove some MIME information:
    $head->delete('MIME-Version');
    $head->delete('Content-type');
If a numeric INDEX is given, returns the occurrence at that index, 
or undef if not present:
    ### Print the first and last 'Received:' entries (explicitly):
    print "First, or most recent: ", $head->get('received', 0), "\n";
    print "Last, or least recent: ", $head->get('received',-1), "\n"; 
If no INDEX is given, but invoked in a scalar context, then
INDEX simply defaults to 0:
    ### Get the first 'Received:' entry (implicitly):
    my $most_recent = $head->get('received');
If no INDEX is given, and invoked in an array context, then
all occurrences of the field are returned:
    ### Get all 'Received:' entries:
    my @all_received = $head->get('received');
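The three calling conventions above hinge on Perl's wantarray. The sketch below imitates those rules with a stand-in hash; get_field() and %fields are hypothetical and only illustrate the dispatch, not how get() is implemented.

```perl
use strict;
use warnings;

# Stand-in storage for this sketch: lowercased tag => list of occurrences.
my %fields = (
    received => [ 'from gsfc.nasa.gov by eryq.pr.mcs.net',
                  'from rhine.gsfc.nasa.gov by gsfc.nasa.gov' ],
);

# Hypothetical accessor mimicking the get() rules described above.
sub get_field {
    my ($tag, $index) = @_;
    my $occurrences = $fields{lc($tag)} || [];
    return $occurrences->[$index] if defined $index;  # explicit INDEX (may be negative)
    return @$occurrences if wantarray;                # list context: everything
    return $occurrences->[0];                         # scalar context: INDEX defaults to 0
}

my $first = get_field('Received');        # scalar context
my $last  = get_field('received', -1);    # explicit index
my @all   = get_field('Received');        # list context
```

Note that the argument list of print is a list context, so a call with no INDEX placed there would return every occurrence; passing an explicit INDEX avoids that surprise.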
    ### How did it get here?
    @history = $head->get_all('Received');
Note: I had originally experimented with having get() return all 
occurrences when invoked in an array context... but that causes a lot of 
accidents when you get careless and do stuff like this:
    print "\u$field: ", $head->get($field), "\n";
It also made the intuitive behaviour unclear if the INDEX argument was given in an array context. So I opted for an explicit approach to asking for all occurrences.
The override actually lets you print to any object that responds to a print() method. This is vital for outputting MIME entities to scalars.
Also, it defaults to the currently-selected filehandle if none is given (not STDOUT!), so please supply a filehandle to prevent confusion.
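As an illustration of "any object that responds to a print() method", here is a small sketch of a collector that gathers output into a scalar. ScalarCollector and print_fields() are hypothetical names made up for this example; print_fields() merely stands in for a header writer.

```perl
use strict;
use warnings;

# Hypothetical collector class: anything with a print() method can act as
# an output target.
package ScalarCollector;
sub new   { my ($class) = @_; my $buf = ''; return bless \$buf, $class }
sub print { my $self = shift; $$self .= join('', @_); return 1 }
sub text  { my $self = shift; return $$self }

package main;

# Hypothetical writer: emits "Tag: value" lines to whatever $out is.
sub print_fields {
    my ($out, @pairs) = @_;
    while (my ($tag, $value) = splice(@pairs, 0, 2)) {
        $out->print("$tag: $value\n");
    }
    return;
}

my $collector = ScalarCollector->new;
print_fields($collector, 'Content-type' => 'text/html', 'Subject' => 'Hello');
my $header_text = $collector->text;
```

Because the writer only ever calls $out->print(...), the same code path can serve a real filehandle object or an in-memory buffer.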
as_string.
 MIME-specific methods
All of the following methods extract information from the following fields:
    Content-type
    Content-transfer-encoding
    Content-disposition
Be aware that they do not just return the raw contents of those fields,
and in some cases they will fill in sensible (I hope) default values.
Use get() or mime_attr() if you need to grab and process the 
raw field text.
Note: some of these methods are provided both as a convenience and for backwards-compatibility only, while others (like recommended_filename()) really do have to be in MIME::Head to work properly, since they look for their value in more than one field. However, if you know that a value is restricted to a single field, you should really use the Mail::Field interface to get it.
    $head->mime_attr("content-type"         => "text/html");
    $head->mime_attr("content-type.charset" => "US-ASCII");
    $head->mime_attr("content-type.name"    => "homepage.html");
This would cause the final output to look something like this:
    Content-type: text/html; charset=US-ASCII; name="homepage.html"
Note that the special empty sub-field tag indicates the anonymous first sub-field.
Giving VALUE as undefined will cause the contents of the named subfield 
to be deleted:
    $head->mime_attr("content-type.charset" => undef);
Supplying no VALUE argument just returns the attribute's value,
or undefined if it isn't there:
    $type = $head->mime_attr("content-type");      ### text/html
    $name = $head->mime_attr("content-type.name"); ### homepage.html
In all cases, the new/current value is returned.
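The "field.subfield" naming can be pictured as a two-step lookup: find the field, then find the sub-field inside it. The sketch below assumes a naive split on semicolons (so it would mishandle quoted values containing ";"); parse_subfields() and attr_for() are hypothetical helpers, not the parser that mime_attr() uses.

```perl
use strict;
use warnings;

# Hypothetical parser: split a structured field body into sub-fields.
# The anonymous first sub-field is stored under the empty key ''.
sub parse_subfields {
    my ($body) = @_;
    my ($first, @rest) = split /\s*;\s*/, $body;
    my %attr = ('' => $first);
    for my $param (@rest) {
        my ($name, $value) = split /\s*=\s*/, $param, 2;
        next unless defined $value;
        $value =~ s/\A"(.*)"\z/$1/s;      # strip surrounding double quotes
        $attr{lc($name)} = $value;
    }
    return \%attr;
}

# Hypothetical lookup using the "field.subfield" naming from the text.
sub attr_for {
    my ($headers, $path) = @_;
    my ($field, $sub) = split /\./, lc($path), 2;
    $sub = '' unless defined $sub;        # no ".subfield" means the first sub-field
    my $body = $headers->{$field};
    return unless defined $body;
    return parse_subfields($body)->{$sub};
}

my %headers = ('content-type' => 'text/html; charset=US-ASCII; name="homepage.html"');
my $type    = attr_for(\%headers, 'content-type');
my $charset = attr_for(\%headers, 'Content-Type.charset');
my $name    = attr_for(\%headers, 'content-type.name');
```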
"base64", "binary"), which is returned in all-lowercase.
If no encoding could be found, the default of "7bit" is returned.  
I quote from RFC-1521 section 5:
    This is the default value -- that is, "Content-Transfer-Encoding: 7BIT" 
    is assumed if the Content-Transfer-Encoding header field is not present.
Try real hard to determine the content type (e.g., "text/plain",
"image/gif", "x-weird-type"), which is returned in all-lowercase.  
"Real hard" means that if no content type could be found, the default 
(usually "text/plain") is returned.  From RFC-1521 section 7.1:
    The default Content-Type for Internet mail is 
    "text/plain; charset=us-ascii".
Unless this is a part of a "multipart/digest", in which case "message/rfc822" is the default. Note that you can also set the default, but you shouldn't: normally only the MIME parser uses this feature.
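The defaulting rules quoted above for the encoding and the content type boil down to a few lines. This is a sketch under the assumption that the raw values have already been pulled out of the header; effective_encoding() and effective_type() are hypothetical helpers, not MIME::Head methods.

```perl
use strict;
use warnings;

# Hypothetical helper: effective transfer encoding, lowercased,
# falling back to "7bit" when the field is absent.
sub effective_encoding {
    my ($raw) = @_;
    return (defined($raw) && length($raw)) ? lc($raw) : '7bit';
}

# Hypothetical helper: effective content type, lowercased.
# $raw is the type from the Content-type field (undef if absent);
# $parent is the type of the enclosing entity, if any.
sub effective_type {
    my ($raw, $parent) = @_;
    return lc($raw) if defined($raw) && length($raw);
    return 'message/rfc822'
        if defined($parent) && lc($parent) eq 'multipart/digest';
    return 'text/plain';
}

print effective_type('Image/GIF'), "\n";
print effective_type(undef, 'multipart/digest'), "\n";
print effective_encoding(undef), "\n";
```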
Content-type: field; that
is, the leading double-hyphen (--) is not prepended.
Well, almost exactly... this passage from RFC-1521 dictates
that we remove any trailing spaces:
   If a boundary appears to end with white space, the white space 
   must be presumed to have been added by a gateway, and must be deleted.
Returns undef (not the empty string) if either the message is not multipart, if there is no specified boundary, or if the boundary is illegal (e.g., if it is empty after all trailing whitespace has been removed).
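The three undef cases and the trailing-white-space rule can be sketched as follows. clean_boundary() is a hypothetical helper that takes the already-extracted content type and raw boundary parameter; it is not the actual multipart_boundary() code.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the rules above to a raw boundary parameter.
sub clean_boundary {
    my ($type, $boundary) = @_;
    return undef unless defined($type) && $type =~ m{\Amultipart/}i;  # not multipart
    return undef unless defined $boundary;                            # none specified
    $boundary =~ s/\s+\z//;                                           # drop trailing white space
    return length($boundary) ? $boundary : undef;                     # empty is illegal
}

my $ok      = clean_boundary('multipart/mixed', 'simple boundary  ');
my $not_mp  = clean_boundary('text/plain',      'simple boundary');
my $illegal = clean_boundary('multipart/mixed', '   ');
```

The explicit "return undef" (rather than returning an empty string) follows the text: callers can test with defined() to tell "no usable boundary" apart from any real value.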
Returns undef if no filename could be suggested.
 NOTES
    There is also IMHO no requirement [for] MIME::Heads to look 
    like [email] headers; so to speak, the MIME::Head [simply stores] 
    the attributes of a complex object, e.g.:
        new MIME::Head type => "text/plain",
                       charset => ...,
                       disposition => ..., ... ;
I agree in principle, but (alas and dammit) RFC-1521 says otherwise. RFC-1521 [MIME] headers are a syntactic subset of RFC-822 [email] headers. Perhaps a better name for these modules would be RFC1521:: instead of MIME::, but we're a little beyond that stage now.
In my mind's eye, I see an abstract class, call it MIME::Attrs, which does
what Achim suggests... so you could say:
     my $attrs = new MIME::Attrs type => "text/plain",
                                 charset => ...,
                                 disposition => ..., ... ;
We could even make it a superclass of MIME::Head: that way, MIME::Head would have to implement its interface, and allow itself to be initialized from a MIME::Attrs object.
However, when you read RFC-1521, you begin to see how much MIME information is organized by its presence in particular fields. I imagine that we'd begin to mirror the structure of RFC-1521 fields and subfields to such a degree that this might not give us a tremendous gain over just having MIME::Head.
Looking at a typical mail message header, it is sooooooo tempting to just
store the fields as a hash of strings, one string per hash entry.  
Unfortunately, there's the little matter of the Received: field, 
which (unlike From:, To:, etc.) will often have multiple 
occurrences; e.g.:
    Received: from gsfc.nasa.gov by eryq.pr.mcs.net  with smtp
        (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C; 
	 Thu, 21 Dec 95 16:34 CST
    Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov 
	 (5.65/Ultrix3.0-C) id AA13596; 
	 Thu, 21 Dec 95 17:20:38 -0500
    Received: (from eryq@localhost) by rhine.gsfc.nasa.gov 
	 (8.6.12/8.6.12) id RAA28069; 
	 Thu, 21 Dec 1995 17:27:54 -0500
    Date: Thu, 21 Dec 1995 17:27:54 -0500
    From: Eryq <eryq@rhine.gsfc.nasa.gov>
    Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
    To: eryq@eryq.pr.mcs.net
    Subject: Stuff and things
The Received: field is used for tracing message routes, and although
it's not generally used for anything other than human debugging, I
didn't want to inconvenience anyone who actually wanted to get at that
information.  
I also didn't want to make this a special case; after all, who knows what other fields could have multiple occurrences in the future? So, clearly, multiple entries had to somehow be stored multiple times... and the different occurrences had to be retrievable.
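One way to picture storage that treats every field uniformly is a hash of arrays, one array of occurrences per tag. The following is only an illustrative sketch; %header, add_field() and count_field() are hypothetical, and nothing here is claimed about how MIME::Head or Mail::Header store fields internally.

```perl
use strict;
use warnings;

# Hypothetical storage: lowercased tag => array of occurrences, in order.
my %header;

# Append by default; an index of 0 prepends instead.
sub add_field {
    my ($tag, $text, $index) = @_;
    my $list = $header{lc($tag)} ||= [];
    if (defined($index) && $index == 0) { unshift @$list, $text }
    else                                { push    @$list, $text }
    return $text;
}

# Number of occurrences of a tag (0 if the field is absent).
sub count_field {
    my ($tag) = @_;
    return scalar @{ $header{lc($tag)} || [] };
}

add_field('Received', 'from rhine.gsfc.nasa.gov by gsfc.nasa.gov');
add_field('Received', 'from gsfc.nasa.gov by eryq.pr.mcs.net', 0);   # newest hop first
add_field('Subject',  'Stuff and things');

printf "%d Received, %d Subject\n", count_field('received'), count_field('SUBJECT');
```

With this layout a single-valued field such as Subject: is just the one-element case, so Received: needs no special handling.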
 AUTHOR
Eryq (
All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The more-comprehensive filename extraction is courtesy of Lee E. Brotzman, Advanced Data Solutions.
 VERSION
$Revision: 5.403 $ $Date: 2000/11/04 19:54:46 $