Perl gurus: a text-parsing question

DivideBYZero

Lifer
May 18, 2001
24,117
2
0
I want to parse a message stream and put it into multiple variables or a compound var.

What I want to do is match a multi-line message if it contains a certain string, regardless of how many newlines the message has in it. The string I am using to make the match is never going to be split across more than one line.



I was hoping that [.\n]* would match any number and combination of characters including newline, but it doesn't seem to.



Input I want to match on is XML, something like



<Message>

<OtherTag> asddsfj </OtherTag> (repeated a variable number of times)

</Message>


Cheers!

/0
 

DivideBYZero

Here is a sample message:

<AlertMessage>
<Identification>
<systemCode>QRS</systemCode>
<messageCode>0003</messageCode>
</Identification>
<Source>
<componentName>nameremoved.co.uk/application/subset/name</componentName>
<serverName>NT002: 813.153.131.231</serverName>
</Source>
<Parameters>
<UserDefNmValPair>
<name>$_ATTEMPTS_$</name>
<value>5</value>
</UserDefNmValPair>
<UserDefNmValPair>
<name>$_DATABASE_$</name>
<value> ADPORA0J </value>
</UserDefNmValPair>
<UserDefNmValPair>
<name>$_COMMENT_$</name>
<value> stopped and </value>
</UserDefNmValPair>
</Parameters>
</AlertMessage>


I can get the <whatever>TEXT</whatever> out using:

$complete equals /<.*>(.*)<\/.*>/

This puts <whatever> into complete[1], TEXT into complete[2] and so on.

What I need now is for the .* to carry on past the newline.

Anyone?
 

Celeryman

Senior member
Oct 9, 1999
310
0
76
Well, I think you have two options. You can go through and remove the newline characters and then process the string from there, or you could try the /s modifier, which lets "." match \n. I tried, almost successfully, to strip an Excel Office HTM file and only remove the table elements, so this would be similar to your problem.
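
For anyone following along in stock Perl 5 (an assumption; the tool discussed in this thread is not a full Perl), a minimal sketch of the /s suggestion might look like the following. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample, modelled on the <Message> example in the question.
my $text = "<Message>\n<OtherTag> asddsfj </OtherTag>\n</Message>";

# Without /s, "." stops at a newline, so the pattern cannot span lines.
my $without_s = ($text =~ /<Message>(.*)<\/Message>/) ? 1 : 0;

# With /s, "." also matches "\n", so the capture covers the whole body.
my ($body) = $text =~ /<Message>(.*)<\/Message>/s;

print "matched without /s: $without_s\n";
print "body with /s: [$body]\n";
```

The other option mentioned, stripping the newlines first, would let the unmodified pattern work as well, at the cost of losing the original line structure.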

 

DivideBYZero

FYI: this did it:-

$complete equals /<.*>([\s\S]*)<\/.*>/

So what's up? Do we have any Perl guys on here? I know this forum is slow, but I thought we had a bunch of hacking nerds. I even thought about putting this in OT, as I bet I would have got an answer amid all the flames ;)

Anyhow, if anyone is interested in capturing multi-line XML into a variable, here it is. Enjoy!
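
A rough sketch of why [.\n]* failed while [\s\S]* worked, assuming stock Perl 5 regex semantics rather than the tool's dialect. The sample strings and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample; the tag names are placeholders.
my $xml = "<Message>\n<OtherTag> asddsfj </OtherTag>\n</Message>";

# Inside [...] the dot is a literal ".", so [.\n]* only matches runs of
# dots and newlines and gives up at the first other character.
my $dot_class = ($xml =~ /<Message>([.\n]*)<\/Message>/) ? 1 : 0;

# [\s\S] means "whitespace or non-whitespace", i.e. any character at all,
# newline included, with no modifier needed.
my ($any_class) = $xml =~ /<Message>([\s\S]*)<\/Message>/;

# The non-greedy form [\s\S]*? stops at the first closing tag, which
# matters if several messages arrive in one stream.
my $stream = "<Message>\none\n</Message>\n<Message>\ntwo\n</Message>";
my @bodies = $stream =~ /<Message>([\s\S]*?)<\/Message>/g;

print "dot class matched: $dot_class\n";
print "bodies found: ", scalar(@bodies), "\n";
```

Note that the pattern in the post uses the greedy [\s\S]*, which is fine for a single message but would swallow everything up to the last closing tag in a longer stream.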
 

DivideBYZero

Originally posted by: Celeryman
Well, I think you have two options. You can go through and remove the newline characters and then process the string from there, or you could try the /s modifier, which lets "." match \n. I tried, almost successfully, to strip an Excel Office HTM file and only remove the table elements, so this would be similar to your problem.

Thanks for the reply, m8. I was posting my answer/rant whilst you had put up a great post. I appreciate your help!
 

DivideBYZero

I'm still trying to strip the tags. FWIW I'm using a tool that is not a true Perl implementation, so this is hampering my efforts somewhat.

I can't use =~, it barfs. :|

Someone cry me a river, will you?

;)
 

DivideBYZero

So here is my final solution:

This is used to drag in the full XML stream into one var:
$complete equals /<.*>([\s\S]*)<\/.*>/

Each line of the XML in the stream is put into varlogX for me by the tool so I parse out the data
$varlog12 equals /<.*>(.*)<\/.*>/
$varlog13 equals /<.*>(.*)<\/.*>/
$varlog14 equals /<.*>(.*)<\/.*>/

Then I assign the stripped data to the vars I want, and I am a very happy camper.
xyz_systemCode = $varlog12[1]
xyz_messageCode = $varlog13[1]
xyz_severity = $varlog14[1]
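
For comparison, here is a sketch of the same per-line extraction in stock Perl 5, which is an assumption since the tool above does not accept =~. The @lines array and %field hash are hypothetical stand-ins for the tool's varlogX variables; the sample lines are copied from the message earlier in the thread.

```perl
use strict;
use warnings;

# A few lines taken from the sample AlertMessage.
my @lines = (
    '<systemCode>QRS</systemCode>',
    '<messageCode>0003</messageCode>',
    '<value> ADPORA0J </value>',
);

my %field;
for my $line (@lines) {
    # Capture the tag name, then the text up to the matching closing tag.
    # \1 refers back to the tag name, so <a>text</b> is not accepted.
    # The \s* on either side trims padding such as " ADPORA0J ".
    if ($line =~ m{<(\w+)>\s*(.*?)\s*</\1>}) {
        $field{$1} = $2;
    }
}

print "$_ = $field{$_}\n" for sort keys %field;
```

Keying the hash on the tag name avoids depending on which line number a given element happens to land on. Repeated tags such as <name> and <value> would overwrite each other in this form, so they would need a list or a name/value pairing step.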
 

jman19

Lifer
Nov 3, 2000
11,225
664
126
Originally posted by: DivideBYZero
FYI: this did it:-

$complete equals /<.*>([\s\S]*)<\/.*>/

So what's up? Do we have any Perl guys on here? I know this forum is slow, but I thought we had a bunch of hacking nerds. I even thought about putting this in OT, as I bet I would have got an answer amid all the flames ;)

Anyhow, if anyone is interested in capturing multi-line XML into a variable, here it is. Enjoy!

I've done some Perl scripting and I've been studying XML processing recently; I guess I was a bit slow getting to this... I'm surprised notfred didn't reply in this thread :p

Also, I bet programming talk would pick up if there was a dedicated forum for it... :) (*hint hint*)
 

DivideBYZero

Originally posted by: jamesave
I don't understand why you can't use =~

Btw, the answer to your question might be here:
http://perlmonks.org/?node_id=2989

For any Perl enlightenment, perlmonks.org is the place to go.

I stumbled upon the monks today and found most of what I did there. Thanks for the recommendation.

I can't use =~ as it's Perl used inside another tool, and the tool just hated =~ :)

I can really see why a separate programming forum is a good idea....now.
 

notfred

Lifer
Feb 12, 2001
38,241
4
0
Originally posted by: DivideBYZero
So here is my final solution:

This is used to drag in the full XML stream into one var:
$complete equals /<.*>([\s\S]*)<\/.*>/

Each line of the XML in the stream is put into varlogX for me by the tool so I parse out the data
$varlog12 equals /<.*>(.*)<\/.*>/
$varlog13 equals /<.*>(.*)<\/.*>/
$varlog14 equals /<.*>(.*)<\/.*>/

Then I assign the stripped data to the vars I want, and I am a very happy camper.
xyz_systemCode = $varlog12[1]
xyz_messageCode = $varlog13[1]
xyz_severity = $varlog14[1]

That's.... really not very good.

First off, you should be using XML::Parser, or XML::LibXML (or a similar module), but you seem dead set against that, so I don't see why you don't just do something like the attached code.

But then I guess you're not really using Perl...
 

DivideBYZero

Yeah, notfred. That's much cleaner, but like I said, these are Perl constructs inside another tool (BMC Impact Manager logfile adapter). I tried to use XML::Parser, but again, it barfed. :(

As you can see, I couldn't even use =~. Anyhow, I have my code, it does what I need and it is for a proof of concept. Plenty of time for optimizing it when they drop the cash on it.... :D

Thanks again. This has been an education. I have never used Perl before.