String Split Function in Python

themolyhaec

Junior Member
Oct 7, 2021
2
0
11
I have a problem here

Code:
name = "Mohith Jain is my name."
Desired output:

Code:
['Mohith Jain', 'is', 'my name.']

My tried solution: I have tried using the split function in Python.
Code:
result = name.split()
# result is ['Mohith', 'Jain', 'is', 'my', 'name.']
I am stuck here and unable to proceed further. I have tried using a for loop, but I am not getting the required result.

Am I missing something or is there any other way of solving this problem?

Please let me know.

I have tried following [a proprietary site describing str.split], which gives a real-life example in MS Excel where the string entered in a cell is split based on some delimiter. When I followed the same steps there, it worked.

Edited to remove potential spam link - moderator Ken g6
 

themolyhaec

Junior Member
Oct 7, 2021
2
0
11
Post removed.



If you have an issue with moderator actions, there is one place to put it - Moderator Discussions,

and his username is, as you are well aware, Ken g6.

Administrator allisolm
 

purbeast0

No Lifer
Sep 13, 2001
52,855
5,727
126
I'm not sure exactly what you are asking. The split function is working as designed. It splits a string on the delimiter you pass in, and if you leave the argument out it splits on runs of whitespace.
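To make the difference concrete, here is a minimal sketch comparing the two call styles. The sample string with a double space and a tab is my own, chosen only to show the behavior.

```python
# Sample text with two spaces and a tab on purpose (illustrative only).
text = "Mohith  Jain\tis my name."

# No argument: split on runs of any whitespace and drop empty strings.
default_parts = text.split()

# Explicit delimiter: split on every single space character, so the
# double space produces an empty string and the tab is left untouched.
space_parts = text.split(" ")

print(default_parts)
print(space_parts)
```

So the no-argument form is usually what you want for plain word splitting, but neither form knows anything about which words belong together.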

You haven't said what your use case logic is either so we don't know what your end goal is aside from that one use case.
 

Aikouka

Lifer
Nov 27, 2001
30,383
912
126
@purbeast0 touched on the problem, but essentially, the issue is that you're not presenting your set of requirements for this method. It appears that you want it to group names even if they're split by whitespace, so what we're dealing with is more of an extended or intelligent String split. The problem here is that this request can become quite technical as it is dealing with recognition logic, or... you could just take a simplistic approach.

So, if a simple approach is acceptable, I would likely add a rule to my splitting (or another way, which I'll get to) which is... a whitespace split is ignored if the previous group started with a capital letter and the first character found after the whitespace is also a capital letter. To be clear, this is not a perfect approach. A good example is that it would have a hard time with titles, which feature plenty of capitalization. For example, "Back to the Future" would split properly into ["Back", "to", "the", "Future"], but "The Dark Knight" would split into ["The Dark Knight"].

As for implementation, you could either write your own splitter, which includes the aforementioned rule, or write a second-pass that performs a standard split and then implements the rule.
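A rough sketch of the second-pass option, assuming the rule exactly as stated above. The function name merge_capitalized is made up for illustration, and it rejoins words with a single space, which is a simplification.

```python
def merge_capitalized(words):
    """Second pass over a standard split: merge a word into the previous
    group when both the group and the word start with a capital letter."""
    merged = []
    for word in words:
        if merged and merged[-1][:1].isupper() and word[:1].isupper():
            # Ignore the whitespace split and glue the word onto the group.
            merged[-1] = merged[-1] + " " + word
        else:
            merged.append(word)
    return merged


print(merge_capitalized("Mohith Jain is my name.".split()))
print(merge_capitalized("Back to the Future".split()))
print(merge_capitalized("The Dark Knight".split()))
```

The last call shows the weakness with titles: every word is capitalized, so everything collapses into one group.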
 

purbeast0

No Lifer
Sep 13, 2001
52,855
5,727
126
That doesn't even fit his use case though.

It wouldn't split into 'my name.' like he wants; it would split it into 'my', 'name.' like the example he doesn't want.
 

Aikouka

Lifer
Nov 27, 2001
30,383
912
126
Yes, it does. If you ran "Mohith Jain is my name." through the criteria that I set, it would split into ["Mohith Jain", "is", "my", "name."]. The point is that it ignores a whitespace split when the groups on both sides start with a capital letter (e.g. "Mohith" and "Jain"). The only other problem with doing this as post-processing is that it could not properly rebuild the original whitespace, as it would have to guess how many characters (one space vs. two spaces) or which characters (space vs. tab) were there.

Thinking about it... I wonder if you could use a regex-based split?
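One possible regex-based take, as a sketch only. It uses re.findall to pull out the groups rather than re.split, and the pattern and the name split_keeping_names are my own assumptions, not something from the original post. Because it matches against the original string, it keeps whatever whitespace sat between the capitalized words.

```python
import re

# Hypothetical pattern: a capitalized word followed by any further
# capitalized words, or else any single run of non-whitespace characters.
GROUP_PATTERN = re.compile(r"[A-Z]\S*(?:\s+[A-Z]\S*)*|\S+")


def split_keeping_names(text):
    """Return the matched groups in order, leaving inner whitespace intact."""
    return GROUP_PATTERN.findall(text)


print(split_keeping_names("Mohith Jain is my name."))
print(split_keeping_names("Mohith  Jain is my name."))  # double space is kept
```

It has the same blind spot with titles as the capital-letter rule, since it is the same rule written as a pattern.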
 

Ajay

Lifer
Jan 8, 2001
15,451
7,861
136
Code:
name = "Mohith Jain is my name."
result = name.split()

# Join the first two words, then append the remaining words one at a time.
newStr = [result[0] + " " + result[1]]
newStr.append(result[2])
newStr.append(result[3])
newStr.append(result[4])

Or a loop with
newStr = [result[0] + " " + result[1]]

for y in range(2, len(result)):
    newStr.append(result[y])

# the loop starts at index 2 to account for the concat on the first two elements
# I'm not a python guy and don't want to think anymore...
 

Aikouka

Lifer
Nov 27, 2001
30,383
912
126
That is not his desired output though...

Looks like I had a bit of a derp moment and completely missed the last two segments being combined as well. Apologies for that! Got a little too headstrong on assuming it was the combination of the name that was the point of the exercise! :p So... I guess I'm not entirely sure what's the point? Is it just always combining the first two and last two?
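If the rule really were "always combine the first two and the last two words", which is only a guess at the requirement, a sketch could look like this. The name combine_ends and the fallback for short inputs are assumptions of mine.

```python
def combine_ends(text):
    """Split on whitespace, then join the first two and the last two words.
    Inputs with fewer than four words are returned as a plain split."""
    words = text.split()
    if len(words) < 4:
        # Not enough words to form two separate pairs.
        return words
    return [" ".join(words[:2]), *words[2:-2], " ".join(words[-2:])]


print(combine_ends("Mohith Jain is my name."))
```

That reproduces the one example given, but it says nothing about whether it is the right rule for any other input.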
 

purbeast0

No Lifer
Sep 13, 2001
52,855
5,727
126
Yeah that is why I was saying there just is not enough information in his post to really figure out what he's trying to do. We can't really help without more info at this point though.

Was probably just a spam post anyways lol.
 

Fallen Kell

Diamond Member
Oct 9, 1999
6,037
431
126
Personally, for this use case, pattern matching is really a more appropriate method. The big reason is that names may have multiple words/spaces, not just a single space, like "Jamie Lee Curtis". Pattern matching for cutting up the string is a better method as a result.

While I am not a python expert, a perl regex would be the following:
my $name = "Mohith Jain is my name.";
my @array = ($name =~ /^(.+) (is) (my name.)$/);
print join(", ", @array) . "\n";

Output:
Mohith Jain, is, my name.

Given the above, you can easily make an array for the "is" and "my name." pieces, although you might as well just hard-code them, since you pattern-matched for them already and know they are there in the proper sequence. In perl, the "( )" within a regex like the one above defines which piece to match out of the string and return as an element of an array, so the "^(.+)" is the first element's string, the "(is)" is the second element of the array, and the "(my name.)" is the third element of the array. Again, this uses some of the advantages of perl's regex functionality, but you can do similar with python. I believe the captured pieces are exposed through the match object's group() and groups() methods (groups() returns them as a tuple, much like the array in the perl example above), so you can use the exact same regex, just in your python code.
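For reference, a minimal Python sketch of the same idea using the standard re module. The only change to the pattern is that the final dot is escaped so it matches a literal period; the variable names are just for illustration.

```python
import re

name = "Mohith Jain is my name."

# Same capture groups as the perl example; "\." matches a literal period.
match = re.match(r"^(.+) (is) (my name\.)$", name)

# groups() returns the captured pieces as a tuple; fall back to an empty
# list when the string does not fit the pattern at all.
parts = list(match.groups()) if match else []

print(parts)
```

Because the first group is "(.+)", a longer name such as "Jamie Lee Curtis is my name." still lands in a single element.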
 