Can anyone point out what's wrong with this subroutine? I am trying to read in an HTML file and split it into two sections: the head section and the body section. I am not sure why, but the head section ends up repeated each time this subroutine is run. For example, if the head section looks something like this:
<head>
<title>my title</title>
</head>
Then what ends up in $head is:
<head>
<title>my title</title>
</head>
<head>
<title>my title</title>
</head>
Here's the code for the subroutine:
sub load_file {
    my $self = shift;
    my $filename = shift;
    my $productdir = shift;
    if(!open(PRODFH,$productdir."/".$filename)) {
        $self{'error'} = "cannot open $filename $!";
        return -1;
    }
    $text = join '',<PRODFH>;
    close(PRODFH);
    $text =~ /(<\s*HEAD\s*>.*<\s*\/\s*HEAD\s*>)/ism;
    $head = $1;
    $text =~ /<\s*\/\s*HEAD\s*>(.*<\s*\/\s*HTML\s*>)/ism;
    $body = $1;
    $body =~ s/<\s*\/html\s*>//ismg;
    $self{'init'} = 1;
    $self{'head'} = $head;
    $self{'body'} = $body;
    if($body =~ /<!--PROD_DESC-->(.*?)<!--END_PROD_DESC-->/isgm) {
        $self{'proddesc'} = $1;
    }
    return 1;
}
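
For comparison, here is a minimal sketch of the same head/body split written more defensively. I have not confirmed that any of this is the actual cause of the duplication. It only rules out three candidates visible in the subroutine: the greedy `.*` in the head match (which runs to the last `</head>` in the text), the undeclared package-level `$text`, `$head` and `$body`, and the use of the hash `%self` (`$self{...}`) instead of the object reference (`$self->{...}`). The helper name `split_html` and the sample input are made up for illustration, and it works on a string rather than a file so it can be run on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): splits an HTML
# string into its head block and everything after it.
sub split_html {
    my ($text) = @_;

    # Non-greedy .*? stops at the FIRST </head>, so a second head block
    # later in the text is not swallowed into the capture.
    my ($head) = $text =~ m{(<\s*head\b[^>]*>.*?<\s*/\s*head\s*>)}is;

    # Everything after the first </head>, with any </html> tag stripped.
    my ($body) = $text =~ m{<\s*/\s*head\s*>(.*)}is;
    $body =~ s{<\s*/\s*html\s*>}{}ig if defined $body;

    # Both values are undef if the text has no head block.
    return ($head, $body);
}

# Sample input, assumed for illustration only.
my $html = "<html>\n<head>\n<title>my title</title>\n</head>\n"
         . "<body>hello</body>\n</html>\n";

my ($head, $body) = split_html($html);
print "$head\n";
```

Inside a method the results would then be stored with `$self->{'head'} = $head;` and `$self->{'body'} = $body;`. If the file itself contains the head block twice, this version returns only the first copy in `$head`, and the second copy shows up at the start of `$body`, which at least makes it easier to see where the repetition is coming from.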
