Anyone around here good with awk or sed?

Entity

Lifer
Oct 11, 1999
10,090
0
0
I'm trying to do some file conversion/manipulation/whatever you want to call it. I've got a file that looks like this right now:
Jim Wornast CENTURY 21 Smith/Ring Inc.
Toll Free: 1-888-842-1475, Direct: 425-221-2311, Email:
[17]jgw@merlinchess.com
Real Estate, Buyer, Fisrt time buyer, Buyers Agent
[18]Add to Scratch Pad
[19]Contact me now
[20]Go to my site
[21]"Ask for" Fay Ainsworth - Real Estate Consultant -
JOHN L. SCOTT REAL ESTATE - FEDERAL WAY
Office: (253)927-4000, Mobile: (253)222-4329, Email:
[22]fayainsworth@msn.com
Puget Sound Lifestyles, View Property, Waterfront, Investment, First
Home, Retirement, *
[23]Add to Scratch Pad
[24]Contact me now
[25]Go to my site
<snip>
I've managed, using sed, to get it pared down:
1,/15/d
s/site/site\n/g
s/[[][0-9]*[]]//g
s/\&quot;//g
s/\*//g
s/Email\://g
s/^\n//g
s/^[(][0-9]*[)].*[0-9]//g
s/^ *//
/^<</,/^END/d
/^Add/d
/^Contact/d
s/^Go to my site//g
s/^\n//g
s/\n$//g
Now it looks better:
Jim Wornast CENTURY 21 Smith/Ring Inc.
Toll Free: 1-888-842-1475, Direct: 425-221-2311,
jgw@merlinchess.com
Real Estate, Buyer, Fisrt time buyer, Buyers Agent

Ask for Fay Ainsworth - Real Estate Consultant -
JOHN L. SCOTT REAL ESTATE - FEDERAL WAY
Office: (253)927-4000, Mobile: (253)222-4329,
fayainsworth@msn.com
Puget Sound Lifestyles, View Property, Waterfront, Investment, First
Home, Retirement,

Tim Altman
Fred Sands Desert Realty
Toll Free: (877)867-3166, Office: (760)341-9710,
agenttim@hotmail.com
Buy or Sell, Let Tim Altman Be Your Next HouseSold Agent...
I'm trying to get the last line(s) (everything after the email address until the next group of names starts) to be deleted, so that I get something that looks like this:

Name
Company
Phone Numbers
Email Address

...anyone know what I might be able to do in sed or awk to get that line deleted? I can't figure this one out.

Bleh.

Rob
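
One line-by-line way to drop everything after the email address, sketched in Python rather than sed or awk. This is only an illustration under two assumptions that are not guaranteed by the data: records are separated by blank lines, and the email line is the first line in a record containing "@". The function name `keep_through_email` is made up for this example. The same "skip flag" idea could probably be expressed in awk as well.

```python
def keep_through_email(lines):
    """Yield each record's lines up to and including its email line.

    Assumes blank lines separate records and that the first line
    containing "@" in a record is the email address.
    """
    skipping = False
    for line in lines:
        if not line.strip():      # blank line: the next record starts after it
            skipping = False
            yield line
        elif skipping:            # description lines after the email are dropped
            continue
        else:
            yield line
            if "@" in line:       # email line reached: drop the rest of the record
                skipping = True


sample = """\
Jim Wornast CENTURY 21 Smith/Ring Inc.
Toll Free: 1-888-842-1475, Direct: 425-221-2311,
jgw@merlinchess.com
Real Estate, Buyer, Fisrt time buyer, Buyers Agent

Ask for Fay Ainsworth - Real Estate Consultant -
JOHN L. SCOTT REAL ESTATE - FEDERAL WAY
Office: (253)927-4000, Mobile: (253)222-4329,
fayainsworth@msn.com
Puget Sound Lifestyles, View Property, Waterfront, Investment, First
Home, Retirement,
"""

print("\n".join(keep_through_email(sample.splitlines())))
```

Because it only ever looks at one line at a time, this approach does not need a multi-line regular expression.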
 

Entity

Originally posted by: notfred
I'll figure this out and post it in a bit.

Thanks. :D

The end hope is that I'll be able to put this either in a text format or formulate it into a mysql query string. I just started learning sed and awk last night. :D

Rob
 

notfred

Lifer
Feb 12, 2001
38,241
4
0
I had to go to a class at 1:00 (4 minutes after my last post :)), but here's what I did. It's in Perl, which may differ slightly from awk or sed (neither of which I'm very familiar with), but the regular expression syntax should be pretty similar.

s/([\@]*\@[^\n]*).*?(\n\n|$)/$1$2/sg;

When operating on this input:
Jim Wornast CENTURY 21 Smith/Ring Inc.
Toll Free: 1-888-842-1475, Direct: 425-221-2311,
jgw@merlinchess.com
Real Estate, Buyer, Fisrt time buyer, Buyers Agent

Ask for Fay Ainsworth - Real Estate Consultant -
JOHN L. SCOTT REAL ESTATE - FEDERAL WAY
Office: (253)927-4000, Mobile: (253)222-4329,
fayainsworth@msn.com
Puget Sound Lifestyles, View Property, Waterfront, Investment, First
Home, Retirement,

Tim Altman
Fred Sands Desert Realty
Toll Free: (877)867-3166, Office: (760)341-9710,
agenttim@hotmail.com
Buy or Sell, Let Tim Altman Be Your Next HouseSold Agent...

this is the output you get:
Jim Wornast CENTURY 21 Smith/Ring Inc.
Toll Free: 1-888-842-1475, Direct: 425-221-2311,
jgw@merlinchess.com

Ask for Fay Ainsworth - Real Estate Consultant -
JOHN L. SCOTT REAL ESTATE - FEDERAL WAY
Office: (253)927-4000, Mobile: (253)222-4329,
fayainsworth@msn.com

Tim Altman
Fred Sands Desert Realty
Toll Free: (877)867-3166, Office: (760)341-9710,
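
For anyone who wants to try the substitution outside Perl, here is a minimal Python sketch of the same expression. Assumptions: `re.DOTALL` stands in for Perl's `/s` flag, `re.sub` replaces every match like `/g`, and the two-record input is just a shortened copy of the sample above. Python's regex engine is close to Perl's for a pattern like this, but the two are not guaranteed to behave identically.

```python
import re

# The Perl substitution s/([\@]*\@[^\n]*).*?(\n\n|$)/$1$2/sg as a Python pattern.
# The backslashes before @ are dropped because @ needs no escaping here.
PATTERN = re.compile(r"([@]*@[^\n]*).*?(\n\n|$)", re.DOTALL)

records = "\n".join([
    "Jim Wornast CENTURY 21 Smith/Ring Inc.",
    "Toll Free: 1-888-842-1475, Direct: 425-221-2311,",
    "jgw@merlinchess.com",
    "Real Estate, Buyer, Fisrt time buyer, Buyers Agent",
    "",
    "Tim Altman",
    "Fred Sands Desert Realty",
    "Toll Free: (877)867-3166, Office: (760)341-9710,",
    "agenttim@hotmail.com",
    "Buy or Sell, Let Tim Altman Be Your Next HouseSold Agent...",
])

# \1 is the email line from the @ onward, \2 is the record separator (or nothing at the end).
trimmed = PATTERN.sub(r"\1\2", records)
print(trimmed)
```

In this sketch the last record keeps its email line, because `$` matches at the end of the input and everything between the email and that point is dropped.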
 

Entity

Hmm. That didn't work in sed, but I'm running late meeting up with my g/f right now. I'll post more later and see if I can figure out why it didn't work.

Also, was this part:
Tim Altman
Fred Sands Desert Realty
Toll Free: (877)867-3166, Office: (760)341-9710,
A misprint? If not, it deleted his email. :p

I appreciate the help. I'll be back to try to understand it later tonight. ;)

Rob
 

Entity

Hmm. I can't get that to work, and I'm not sure why. I've done some tweaks but still can't figure it out:

Here's what it looks like now:
Jim Wornast CENTURY 21 Smith/Ring Inc.
Toll Free: 1-888-842-1475, Direct: 425-221-2311, Email:
jgw@merlinchess.com
Real Estate, Buyer, Fisrt time buyer, Buyers Agent
;
Ask for Fay Ainsworth - Real Estate Consultant -
JOHN L. SCOTT REAL ESTATE - FEDERAL WAY
Office: (253)927-4000, Mobile: (253)222-4329, Email:
fayainsworth@msn.com
Puget Sound Lifestyles, View Property, Waterfront, Investment, First
Home, Retirement,
;
Tim Altman
Fred Sands Desert Realty
Toll Free: (877)867-3166, Office: (760)341-9710, Email:
agenttim@hotmail.com
Buy or Sell, Let Tim Altman Be Your Next HouseSold Agent...
;
I put the semicolon in there as a marker to make it easier for myself to identify, but still haven't had any luck. :p

I'll work on it more tomorrow. Any ideas?

Rob
 

notfred

Lifer
Feb 12, 2001
38,241
4
0
It won't work with a semicolon there; the regular expression expects to find two line breaks separating records.

Translate the regular expression from Perl's syntax to sed. Ask if you're unsure what any of the Perl syntax is.
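
To illustrate the point about separators, here is a small Python sketch (Python's `re` syntax is close to Perl's for this pattern). It runs the blank-line version of the expression against semicolon-marked input, then a hypothetical adaptation that treats a line holding only ";" as the end of a record. The `semicolon_sep` pattern is an assumption made for this example, not something posted in the thread, and both patterns use the `[^@]` form of the character class.

```python
import re

# Semicolon-marked format: a line holding only ";" ends each record.
marked = "\n".join([
    "Jim Wornast CENTURY 21 Smith/Ring Inc.",
    "Toll Free: 1-888-842-1475, Direct: 425-221-2311, Email:",
    "jgw@merlinchess.com",
    "Real Estate, Buyer, Fisrt time buyer, Buyers Agent",
    ";",
    "Tim Altman",
    "Fred Sands Desert Realty",
    "Toll Free: (877)867-3166, Office: (760)341-9710, Email:",
    "agenttim@hotmail.com",
    "Buy or Sell, Let Tim Altman Be Your Next HouseSold Agent...",
    ";",
])

# Records are expected to end at a blank line (or at the end of the input).
blank_line_sep = re.compile(r"([^@]*@[^\n]*).*?(\n\n|$)", re.DOTALL)

# Hypothetical adaptation: records end at a line containing only ";".
semicolon_sep = re.compile(r"([^@]*@[^\n]*).*?(\n;(?=\n|$))", re.DOTALL)

with_blank = blank_line_sep.sub(r"\1\2", marked)
with_semicolon = semicolon_sep.sub(r"\1\2", marked)

print(with_blank)
print("-----")
print(with_semicolon)
```

With no blank line to stop at, the first pattern runs on to the end of the input after the first email address, so every later record is swallowed. The adapted pattern stops at each ";" line and keeps it as the separator.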
 

Entity

I should have stated that I modified the expression (or at least tried) to work with the semicolon; I wasn't using it as you put it, since the expression didn't work for me at first. Let me give a shot at translating this here:

s/([\@]*\@[^\n]*).*?(\n\n|$)/$1$2/sg;

s = substitute

I tried translating it like this:

s/([\@]*\@[^\n]*).*?(\n\n|$)/$1$2/g

...but that didn't work. It doesn't look like this part:

([\@]*\@[^\n]*).*?(\n\n|$)

Is actually matching up with anything. I tried to understand it, but got a bit confused:

Questions: I'm not sure why you put escape characters before the @ sign; is there a reason for that? @ doesn't require them, does it?

From what I could understand (and like I said, I'm new to this):

The beginning looks like you were trying to define a group (everything between the parentheses). The [\@]* part confused me, because I thought it just denoted any number of instances of the @ character, which doesn't seem useful here.

I'm guessing I'm misunderstanding this. Are you around on AIM or anything? I'm browsing through O'Reilly's Understanding Perl right now to see if I can figure this out.

Rob
 

notfred

@ doesn't have to be escaped. I just did it 'cause I'm dumb ;)

[\@] at the beginning was actually supposed to be [^\@]. By coincidence it works either way.

my aim is TylerKaraszewski
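
A piece-by-piece view of the expression may help here. The following is a Python sketch, not the original Perl: `re.VERBOSE` allows comments inside the pattern, `re.DOTALL` stands in for Perl's `/s`, and the names `INTENDED` and `AS_POSTED` are made up for the example. It also checks, on one small sample, that the posted `[@]*` form and the intended `[^@]*` form produce the same output.

```python
import re

# Annotated version of the pattern with the intended [^@] class.
INTENDED = re.compile(
    r"""
    (            # group 1: the text to keep
      [^@]*      #   any run of characters other than the at-sign (may span lines)
      @          #   the at-sign of the email address
      [^\n]*     #   the rest of the email line
    )
    .*?          # as little as possible: the description lines, which get dropped
    (\n\n|$)     # group 2: the blank line between records, or the end of the input
    """,
    re.DOTALL | re.VERBOSE,
)

# The pattern as first posted, with [@]* (zero or more at-signs) in front.
AS_POSTED = re.compile(r"([@]*@[^\n]*).*?(\n\n|$)", re.DOTALL)

text = "\n".join([
    "Ask for Fay Ainsworth - Real Estate Consultant -",
    "JOHN L. SCOTT REAL ESTATE - FEDERAL WAY",
    "Office: (253)927-4000, Mobile: (253)222-4329,",
    "fayainsworth@msn.com",
    "Puget Sound Lifestyles, View Property, Waterfront, Investment, First",
    "Home, Retirement,",
    "",
    "Tim Altman",
    "Fred Sands Desert Realty",
    "Toll Free: (877)867-3166, Office: (760)341-9710,",
    "agenttim@hotmail.com",
    "Buy or Sell, Let Tim Altman Be Your Next HouseSold Agent...",
])

intended_out = INTENDED.sub(r"\1\2", text)
posted_out = AS_POSTED.sub(r"\1\2", text)
print(intended_out == posted_out)
print(intended_out)
```

The two agree on this sample because the posted form simply starts matching at the @ itself and leaves everything before it untouched, while the intended form matches that leading text and then puts it back through group 1.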