• Guest, The rules for the P & N subforum have been updated to prohibit "ad hominem" or personal attacks against other posters. See the full details in the post "Politics and News Rules & Guidelines."

SQL query question - finding the longest part of a string

Homerboy

Lifer
Mar 1, 2000
29,563
3,335
126
I'm trying to figure out a way to extract the street name from a full street address
While I know there's no 100% fool-proof way of doing this (unless the data is uniformly formatted, which is far from the case), I'm hoping to be able to extract the the street name on most of the entries in my column.
My thought is I want to pull the value of the LONGEST parsed substring in a value

So if the entry is "123 BROADWAY ST # 456"
The longest substring, based on " " being the delimiter of that string is "BROADWAY" so I want my returning value to pull "BROADWAY"

Again, I know it won't work 100% of the time "1234 ELM St." is going to end up returning "1234" but due to the way this data is arranged, I should be able to at least highlight those for the end user who is cleaning up this data.

Does that make sense? Or am I nuts in this line of thinking?
Thanks for any suggestions/help/advice you can provide.
 

purbeast0

No Lifer
Sep 13, 2001
50,945
3,644
126
I don't think that is a good way to do that. Not only does it break in your one example, but if you have a street name that is 2 words, you are screwed. Then if you have Boulevard spelled out instead of Blvd. you are also screwed.

"123 Martin Luther King Boulevard" breaks on multiple levels.

What I would do is do a delimeter on space like you were doing, then just scrap the first and last entry. Then just add all the remaining strings together with a space between and you have the street name.

Most addresses are <number> <street_name> <street_type> so that would handle multiple word addresses.

This isn't fool proof either. It breaks on a P.O. Box address, if that matters. Probably more cases too if I thought more about it.

Also, this isn't really SQL related. You'll want to just pull the entire column of street numbers then run this logic on it outside the context of SQL.

EDIT:

This is super simple logic.

JavaScript:
getStreet(addressString) {
    const split = addressString.split(' ');
    const streetNameArray = split.slice(1, split.length - 2);
    const streetName = streetNameArray.join(' ');
    return streetName;
}
There is no error checking in case split didn't have an array of at least 3, but I'm not sure how you would want to handle those cases anyways but it would be easy to implement whatever you choose.

Could probably do it via regex too but I'm far from a regex expert.
 
Last edited:

mike8675309

Senior member
Jul 17, 2013
435
87
101
Look at your data and try to see how it's formatted? USA Addresses or also Canada or Mexico?

Most USA addresses are
building number street name [street ext...apt#...suite#]
City, State, postal code

If this is a static set of data (not going to change, so you don't need to keep doing this) then review the data for the rules to apply.
Maybe start with doing a first pass where you grab the first section of the string by cutting off the first space, that gets rid of the building number.
Then look at common usages for apartments or suites. From there you could go through and eliminate those from the data.
With those gone the rules for just the street should be easier.

If the data isn't static, fix the system capturing this so it breaks up the data in the format you need, or at least applies a delimiter you can use to break it up into the component parts.
 
Thread starter Similar threads Forum Replies Date
R Programming 2
J Programming 4

ASK THE COMMUNITY