1

I have a problem with a function that is supposed to return first the lowercase letters, "_" and ".", and then the uppercase letters, " " and "|", in that order. My version also returns digits and special characters such as <>@, which I don't want. It is also only supposed to read through the input string once, and I don't know whether my code achieves that.

My code is:

def split_iterative(n):
    splitted_first = ""
    splitted_second = ""
    for i in n:
        if i == i.lower() or i == "_" or i == ".":
            splitted_first = splitted_first + i
        elif i == i.upper() or i == " " or i == "|":
            splitted_second = splitted_second + i
    return splitted_first + splitted_second

If I call split_iterative("'lMiED)teD5E,_hLAe;Nm,0@Dli&Eg ,#4aI?rN@T§&e7#4E #<(S0A?<)NT8<0'") it returns "'li)te5,_he;m,0@li&g ,#4a?r@§&e7#4 #<(0?<)8<0'MEDDELANDEINTESANT", which is incorrect, as it should eliminate all those special characters and numbers. How do I fix this? It should return ('lite_hemligare', 'MEDDELANDE INTE SANT').

  • The problem is that i.lower() and i.upper() both return the character unchanged when given a digit or a special character, so both comparisons succeed for those characters. Commented Sep 24, 2017 at 11:19
  • It might be a case for a regular expression character class, something like [\w. |], then a custom sort to get the required order. Commented Sep 24, 2017 at 11:21
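
A rough sketch of the regular-expression idea from the comment above, under a few assumptions: the function name split_regex is made up for illustration, only ASCII letters are expected, and the character class is written out explicitly because \w would also match digits. It relies on sorted() being stable, so characters keep their original order within each group.

```python
import re

def split_regex(text):
    """Hypothetical helper: filter with a character class, then stable-sort."""
    # Keep only ASCII letters, "_", ".", " " and "|"; everything else is dropped.
    kept = re.findall(r"[A-Za-z_. |]", text)
    # The key is False for lowercase letters, "_" and ".", and True for
    # uppercase letters, " " and "|". False sorts before True, and the
    # sort is stable, so the original order inside each group is preserved.
    ordered = sorted(kept, key=lambda ch: ch.isupper() or ch in " |")
    return "".join(ordered)

print(split_regex("aB_c D|e."))
```

Note that this returns one concatenated string rather than a tuple, and that sorting is no longer a plain single pass over the input.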

5 Answers

1

You could try this:

def f(input_string):
    str1 = str2 = ""
    for character in input_string:
        if character.isalpha():
            if character.islower():
                str1 += character
            else:
                str2 += character
        elif character in "_.":
            str1 += character
        elif character in " |":
            str2 += character
    return str1, str2

Output:

>>> input_string = "'lMiED)teD5E,_hLAe;Nm,0@Dli&Eg ,#4aI?rN@T§&e7#4E #<(S0A?<)NT8<0'"
>>> print(f(input_string))
('lite_hemligare', 'MEDDELANDE INTE SANT')

1 Comment

Sorry about that. The description isn't 100% clear, I thought _ should come after lower case letters. Your method looks correct.
1

This happens because lower() leaves non-letter characters unchanged, so for a special character the lowercased value equals the character itself, e.g. '#'.lower() == '#'. The first condition is therefore true for '#' and for every other special character, and they all end up in the output. You should explicitly check for letters using the isalpha() method on strings: (i.isalpha() and i.lower() == i) or i == '_' or i == '.'
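
As a minimal sketch of that suggestion (the parameter and variable names are illustrative, and returning a tuple is an assumption based on the expected output in the question), the isalpha() check can be added to both branches:

```python
def split_iterative(text):
    """Sketch: the question's function with an explicit isalpha() check."""
    first = ""
    second = ""
    for ch in text:
        # Without isalpha(), '#' == '#'.lower() would also be true.
        if (ch.isalpha() and ch == ch.lower()) or ch == "_" or ch == ".":
            first += ch
        elif (ch.isalpha() and ch == ch.upper()) or ch == " " or ch == "|":
            second += ch
    return first, second

# Non-letters are unchanged by lower(), which is what caused the bug.
print('#'.lower() == '#')  # True
print(split_iterative("aB#_1 c|D."))
```

The loop still visits each character of the input exactly once.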

0

First, to return the two parts separately, return a list instead of the concatenated string.

Second, you are not filtering out the unwanted characters. One way to do that is to check whether the character is a letter using the isalpha() method.

Something like this:

def split_iterative(n):
    splitted_first = ""
    splitted_second = ""
    for i in n:
        if (i.isalpha() and i == i.lower()) or i == "_" or i == ".":
            splitted_first = splitted_first + i
        elif (i.isalpha() and i == i.upper()) or i == " " or i == "|":
            splitted_second = splitted_second + i
    # returns a list; assign it to a variable if you need to
    return [splitted_first, splitted_second]

0

You can use ASCII values for the filtering of characters:

def split_iterative(n):
    splitted_first = ""
    splitted_second = ""
    for i in n:
        if ord(i) in range(97, 123) or i == "_" or i == ".":  # 97..122 is 'a'..'z'; range() excludes its end
            splitted_first = splitted_first + i
        elif ord(i) in range(65, 91) or i == " " or i == "|":  # 65..90 is 'A'..'Z'
            splitted_second = splitted_second + i
    return (splitted_first , splitted_second)

1 Comment

change the last line to return (splitted_first , splitted_second) if you want a tuple instead of a string
0

You can make use of two lists while walking through characters of your text.

You can append lowercase letters, underscores, and full stops to one list, and uppercase letters, spaces, and pipe characters to the other.

Finally return a tuple of each list joined as strings.

def splittext(txt):
  slug, uppercase_letters = [], []
  slug_symbols = {'_', '.'}
  uppercase_symbols = {' ', '|'}

  for letter in txt:
    if letter.islower() or letter in slug_symbols:
      slug.append(letter)
    if letter.isupper() or letter in uppercase_symbols:
      uppercase_letters.append(letter)

  return ''.join(slug), ''.join(uppercase_letters)


txt="'lMiED)teD5E,_hLAe;Nm,0@Dli&Eg ,#4aI?rN@T§&e7#4E #<(S0A?<)NT8<0'"
assert splittext(txt) == ("lite_hemligare", "MEDDELANDE INTE SANT")

12 Comments

I think that may be happening because those sets are created every time in the loop.
Aha, if you replace the sets with strings, your method becomes ~8 seconds faster. 14.916232824325562 vs. 22.767919063568115.
Python makes the string once. So every time it re-uses the same string. s = "jDo"; g = "jDo"; assert id(s) == id(g)
@jDo Try import dis; dis.dis(compile("a = {'_', '.'}; b = '_.'", '', 'exec')).
@StefanPochmann You're right. Sets aren't immutable data types.
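
For anyone who wants to reproduce the comparison discussed in these comments, here is a small timing sketch. The function names and the repeat count are made up for illustration, and the numbers will differ from the ones quoted above depending on machine and Python version, so treat it as a way to measure rather than as a result.

```python
import timeit

TXT = "'lMiED)teD5E,_hLAe;Nm,0@Dli&Eg ,#4aI?rN@T§&e7#4E #<(S0A?<)NT8<0'"

def split_with_sets(txt):
    # Set displays are built anew each time the function is called.
    slug_symbols = {'_', '.'}
    upper_symbols = {' ', '|'}
    slug, upper = [], []
    for ch in txt:
        if ch.islower() or ch in slug_symbols:
            slug.append(ch)
        elif ch.isupper() or ch in upper_symbols:
            upper.append(ch)
    return ''.join(slug), ''.join(upper)

def split_with_strings(txt):
    # String literals are constants that are simply loaded on each call.
    slug_symbols = '_.'
    upper_symbols = ' |'
    slug, upper = [], []
    for ch in txt:
        if ch.islower() or ch in slug_symbols:
            slug.append(ch)
        elif ch.isupper() or ch in upper_symbols:
            upper.append(ch)
    return ''.join(slug), ''.join(upper)

# Time a modest number of calls of each variant and print the totals.
for func in (split_with_sets, split_with_strings):
    seconds = timeit.timeit(lambda: func(TXT), number=2000)
    print(func.__name__, seconds)
```

Both variants return the same tuple; only the cost of the membership containers differs.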