The script at the bottom of the page offers what looks like a very nice solution; I wish it had been available when I ran across this problem. Unfortunately I had to roll my own. The result, after a fair amount of tweaking on a production system, is presented below for those interested in an alternative solution. It also uses procmail, along with the "html2text" utility, which is pretty widely available.
:0
* <599000
* ^Content-Type: text/html
{
:0 fbw
| /usr/local/bin/html2text -nobs -style pretty -width 62
:0 fbw
| /usr/bin/sed 's/ />/g'
:0 Afw
| /usr/bin/sed 's/\(Content-Type: text\/\)html/\1plain/'
}
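For anyone adapting the last step of the recipe, here is a minimal sketch of the same Content-Type rewrite, tried on a made-up message outside procmail. The function names and the sample message are hypothetical. The one change from the recipe is an assumption on my part: the sed address range `1,/^$/` restricts the substitution to the header block, so a body line that happens to mention the header is left alone.

```shell
#!/bin/sh
# Hypothetical helper: rewrite "Content-Type: text/html" to "text/plain",
# but only between line 1 and the first blank line (the header block).
rewrite_content_type() {
    sed '1,/^$/ s/\(Content-Type: text\/\)html/\1plain/'
}

# Made-up sample message, for illustration only.
sample() {
    printf '%s\n' \
        'From: user@example.com' \
        'Subject: test' \
        'Content-Type: text/html; charset=utf-8' \
        '' \
        'A body line mentioning Content-Type: text/html stays untouched.'
}

sample | rewrite_content_type
```

Note that the sed pattern is case-sensitive, whereas procmail conditions match case-insensitively by default, so a header spelled "Content-type:" would pass the condition but not be rewritten. Inside the recipe itself, adding the h flag to the last filter (so that only the header is piped through sed) should have a similar header-only effect.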
I searched for a while for a way to avoid the /This transaction appears to have no content/ message when users send pure HTML mail (they shouldn't, but they do). I did not find any out-of-the-box solution, so here is how we solved it: use the small Perl script below and replace the rt entry in the aliases file with the following:
rt: "|/home/rt/bin/html2mime | /usr/local/rt3/bin/rt-mailgate --queue General --action correspond --url http://whatever/"
The script is so small that I put it directly here:
#!/usr/bin/perl
# 1.0 - XXXX - original version
use strict;
use Mail::Field;
use MIME::Parser;
use MIME::Entity;
use HTML::TreeBuilder;
use HTML::FormatText;
# MIME::Parser silently writes decoded parts to files in the current
# directory, so work in /tmp
chdir("/tmp/") or die "cannot chdir to /tmp: $!\n";
my $parser = MIME::Parser->new;
my $entity = $parser->parse(\*STDIN) or die "parse failed\n";
# Convert only non-multipart text/html messages; pass everything else through unchanged
if ( ($entity->is_multipart) ||
     ($entity->mime_type ne "text/html") ) {
    $entity->print(\*STDOUT);
    exit 0;
}
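# Decode the body and render the HTML as plain text
my $bh = $entity->bodyhandle;
my $tree = HTML::TreeBuilder->new();
$tree->utf8_mode(1);    # without an argument this call only reads the setting
$tree->parse($bh->as_string);
$tree->eof;             # tell the parser that the input is complete
my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
my $txt = $formatter->format($tree);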
# Build the text/plain part
my $txtEntity = MIME::Entity->build(
    Data     => $txt,
    Type     => "text/plain",
    #Encoding => "quoted-printable"
    Encoding => "8bit"
);
# Turn the message into a multipart and put the text part first
$entity->make_multipart;
$entity->add_part($txtEntity, 0);
$entity->print(\*STDOUT);
The purpose is to create a text/plain transcription of HTML mails, keeping both formats to avoid losing information. Hope this helps.
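To check the filter by hand before touching the aliases file, something like the following sketch may help. It writes a made-up HTML-only message to a temporary file and, if the filter is installed, pipes the message through it and lists the Content-Type lines of the result. The addresses and the message are hypothetical; the filter path is the one from the alias line above and is assumed to be where the script was saved.

```shell
#!/bin/sh
# Write a minimal HTML-only test message (contents made up for illustration).
msg=$(mktemp)
cat > "$msg" <<'EOF'
From: user@example.com
To: rt@example.com
Subject: HTML-only test
MIME-Version: 1.0
Content-Type: text/html; charset=us-ascii

<html><body><p>Hello from an <b>HTML-only</b> mail.</p></body></html>
EOF

# Assumed install location, taken from the alias line above.
filter=/home/rt/bin/html2mime

if [ -x "$filter" ]; then
    # List the Content-Type lines of the converted message.
    "$filter" < "$msg" | grep -i '^Content-Type:'
else
    echo "filter not found at $filter; sample message left in $msg"
fi
```

If the script works as intended, the listing should show a multipart type for the message as a whole, followed by a text/plain part and the original text/html part.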