Is there a method to extract specific text from an MS Word document?

edprush

Platinum Member
Sep 18, 2000
2,541
0
0
Is it possible to set parameters for the text/numbers that I want to extract from an MS Word document?

Has anyone done this?

Thanks.
 

mundane

Diamond Member
Jun 7, 2002
5,603
8
81
You should be able to use VBA to programmatically access/modify the document, although I'm not sure exactly what you're trying to do.
 

edprush

I don't know a thing about VBA.


What I'm trying to do is extract all matching text strings from a large MS Word document. This would be similar to the "Find" feature in MS Word, but I need to find all the text strings that match my search criteria and extract them to another text file.
 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
Tell it to Bill Gates and he'll have an excuse to release a new version of MS Office. :)

I don't think what you're trying to do is possible, at least not in Word directly. It sounds like something that database software might be able to deal with.
 

HN

Diamond Member
Jan 19, 2001
8,186
4
0
Edit-->Find-->check the box for "Highlight all items found in:" (I'm guessing you want "Main Document")-->Find All

This will find and highlight all instances of your string. Now just copy and paste to wherever.
 

Smilin

Diamond Member
Mar 4, 2002
7,357
0
0
Originally posted by: SickBeast
Tell it to Bill Gates and he'll have an excuse to release a new version of MS Office. :)

I don't think what you're trying to do is possible, at least not in Word directly. It sounds like something that database software might be able to deal with.

He already did. It's called Office 2007 and it saves in XML.
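
For what it's worth, a .docx file is a ZIP archive, and the body text lives in the member word/document.xml, so the text can be pulled out without opening Word. A rough sketch in Python using only the standard library; the function name docx_paragraphs is made up for illustration, and this ignores headers, footers, footnotes, and text boxes:

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_paragraphs(path):
    """Return the text of each paragraph in the main body of a .docx file.

    Hypothetical helper for illustration; only the main document part is read.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; each w:t holds one piece.
        pieces = [t.text or "" for t in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Once the paragraphs are plain strings, any text-searching tool can be run over them.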
 

edprush

Originally posted by: HN
Edit-->Find-->check the box for "Highlight all items found in:" (I'm guessing you want "Main Document")-->Find All

This will find and highlight all instances of your string. Now just copy and paste to wherever.

That's very close, but I want to define specific strings of text, such as 'text, without spaces, that contains at least 4 characters, where one of the characters has to be a "g".'
 

HN

Originally posted by: edprush
Originally posted by: HN
Edit-->Find-->check the box for "Highlight all items found in:" (I'm guessing you want "Main Document")-->Find All

This will find and highlight all instances of your string. Now just copy and paste to wherever.

That's very close, but I want to define specific strings of text, such as 'text, without spaces, that contains at least 4 characters, where one of the characters has to be a "g".'

Well, if you don't mind doing it by hand, there are a bunch of wildcards you can use (check the "Use wildcards" box in the Find dialog). See here for reference: http://word.mvps.org/FAQS/General/UsingWildcards.htm
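
If doing it by hand gets old, the example criterion (no spaces, at least 4 characters, at least one "g") can also be written as a regular expression and run outside Word. A minimal sketch in Python, assuming the document text is already available as a plain string (for instance after saving the document as a .txt file); the names extract_matches and save_matches are made up for illustration:

```python
import re

# A run of non-space characters that:
#   (?<!\S)    starts at the beginning of the run,
#   (?=\S*g)   contains a lowercase "g" somewhere in the run,
#   \S{4,}     is at least 4 characters long.
PATTERN = re.compile(r"(?<!\S)(?=\S*g)\S{4,}")

def extract_matches(text):
    """Return every whitespace-free string that meets the criteria."""
    return PATTERN.findall(text)

def save_matches(matches, out_path):
    """Write one match per line to another text file."""
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("\n".join(matches))

matches = extract_matches("The big dog was digging a large hole")
print(matches)  # ['digging', 'large']
```

Note that punctuation counts as a character here, so "going," would be returned with its comma, and an uppercase "G" does not satisfy the check. Tighten the pattern if that is not what you want.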