Originally posted by: Descartes
Well, there are a number of ways to do it, but here's what I would do:
I would record a simple macro that saves the doc as a text file, then later process those text files. You could then issue a simple command like the following at the CLI:
for %i in (*.doc) do winword "%i" /mYOURMACRONAME
The /m switch tells winword to run the named macro at startup (inside a batch file, write %%i instead of %i). You would then have text files that you could process as normal. I would personally use Perl to extract the email addresses using something like the following:
cat *.txt | perl -ne "print \"$1\n\" while /([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)/g" > emails.txt
I didn't run this myself obviously, so don't berate me for my ad-hoc syntax. I think at least the first part should be useful to you.
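If you'd rather drive the conversion step from a script than from the for loop, here is a rough Python sketch of the same idea. It only builds the command lines (one per .doc file) so you can inspect them first. The function name build_winword_commands is made up for illustration, and I'm assuming winword can be launched by that name on your machine and that YOURMACRONAME is a macro you have already recorded.

```python
from pathlib import Path


def build_winword_commands(folder, macro_name, winword="winword"):
    """Return one argument list per .doc file: winword <file> /m<macro>.

    Nothing is executed here; the caller decides whether to run the commands.
    """
    commands = []
    for doc in sorted(Path(folder).glob("*.doc")):
        # /m<name> asks Word to run the named macro after it starts.
        commands.append([winword, str(doc), f"/m{macro_name}"])
    return commands
```

To actually run them you could pass each list to subprocess.run(cmd, check=True) on a machine with Word installed. The macro itself still has to do the save-as-text and close Word, same as with the for loop.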
Originally posted by: KLin
wtf does that do?
Originally posted by: Descartes
It concatenates all the txt files in the directory and pipes the result to perl, which uses the regular expression to extract the email address from each line (if one is present). The output is then redirected into the file emails.txt. Like I said, it was just an example of what you could do, and I acknowledge that not everyone would want to do it this way. The benefit is the time saved, but it comes with a learning curve for anyone not familiar with Perl and regular expressions.
[edit]Oh, and it does work.[/edit]
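For anyone who finds the one-liner hard to read, here is roughly the same pipeline spelled out in Python. Treat it as a sketch: the names extract_emails and collect_emails are made up for illustration, and the pattern is a simplified one of my own, not a full email-address grammar. It reads every .txt file in a folder, pulls out anything that looks like an address, and writes the unique results to an output file.

```python
import re
from pathlib import Path

# Simplified pattern (an assumption, not a complete address grammar):
# local part, "@", then at least two dot-separated domain labels.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_RE.findall(text)


def collect_emails(folder, out_path):
    """Scan all .txt files in folder and write the sorted unique addresses to out_path."""
    out_path = Path(out_path)
    found = set()
    for txt in sorted(Path(folder).glob("*.txt")):
        # Skip the output file itself in case it lives in the same folder.
        if txt.resolve() == out_path.resolve():
            continue
        found.update(extract_emails(txt.read_text(errors="ignore")))
    result = sorted(found)
    out_path.write_text("".join(addr + "\n" for addr in result))
    return result
```

Calling collect_emails(".", "emails.txt") would mirror the original command. Unlike the one-liner, it drops duplicate addresses and picks up more than one address per line.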
