unable to check if string is substring of another in python

Question

I'm a beginner in Python and I have two string variables:

user_comment = "Hobbit 2013:Bad Movie"
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

I am trying to check whether user_comment is inside the second variable using:

if user_comment in comment_in_movie:
    print("found")

In more detail, I am trying to check whether all of the words in user_comment exist in the second string, but I get no result. I think the problem is that the user string does not appear in the same form in the second string, since there are more words between "2013" and ":Bad Movie". I would appreciate your help in solving this simple task. Thank you in advance.

The in operator checks whether the entirety of string 1 is in string 2. In your case there is some overlap, but clearly user_comment contains parts that are not in comment_in_movie. What are you trying to achieve? If there is always going to be a colon, you can always slice the string.
– Nullman, Commented Jun 22, 2020 at 7:51

felipe · Accepted Answer · 2020-06-22 07:56:29Z

1

You are indeed correct in your assumption. The in operator matches only if the exact string is found as one contiguous piece. You could do something like this:

user_comment = "Hobbit 2013:Bad Movie"
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

for string in user_comment.split(":"):
    if string in comment_in_movie:
        print(f"Found '{string}' in comment_in_movie.")

Which will output:

Found 'Hobbit 2013' in comment_in_movie.
Found 'Bad Movie' in comment_in_movie.

If you are trying to check for individual words, you can replace the : delimiter with a space and then split the string on spaces:

user_comment = "Hobbit 2013:Bad Movie"
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

for string in user_comment.replace(":", " ").split(" "):
    if string in comment_in_movie:
        print(f"Found '{string}' in comment_in_movie.")

Will output:

Found 'Hobbit' in comment_in_movie.
Found '2013' in comment_in_movie.
Found 'Bad' in comment_in_movie.
Found 'Movie' in comment_in_movie.

You may also use all() to get a single True or False that tells you whether all strings are present. This can be done in one line like so:

user_comment = "Hobbit 2013:Bad Movie"
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

in_str = all(x in comment_in_movie for x in user_comment.replace(":", " ").split(" "))
print(in_str)

The above will output True. You will notice that if you change user_comment to say Dark Knight in the movie name section you will get False as the output.
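One caveat with the checks above: x in comment_in_movie is a substring test, so a fragment of a word also counts as present. The sketch below contrasts that with a whole-word comparison using sets. The helper names tokens, all_substrings_present and all_words_present are hypothetical and only serve the illustration.

```python
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

def tokens(text):
    # Same splitting idea as above: treat ":" like a space, then split on whitespace.
    return text.replace(":", " ").split()

def all_substrings_present(user_comment, text):
    # Substring test: "Hob" counts because it occurs inside "Hobbit".
    return all(x in text for x in tokens(user_comment))

def all_words_present(user_comment, text):
    # Whole-word test: every token must equal one of the tokens of text.
    return set(tokens(user_comment)) <= set(tokens(text))

print(all_substrings_present("Hob 2013:Bad Mov", comment_in_movie))  # True
print(all_words_present("Hob 2013:Bad Mov", comment_in_movie))       # False
print(all_words_present("Hobbit 2013:Bad Movie", comment_in_movie))  # True
```

Which of the two behaviours is wanted depends on whether partial words should count as a match.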

edited Jun 22, 2020 at 7:56

answered Jun 22, 2020 at 7:51

felipe


2 Comments

Rolf of Saxony Over a year ago

This makes a big, specific assumption that the colon is key. The next check will almost certainly fail, as it won't contain a :. Better to replace all punctuation characters in both strings with spaces before testing.

felipe Over a year ago

str.replace(old, new) does not raise an error if the given old is not in the str. While, yes, there was an assumption that : would be there, its absence won't necessarily be an issue. OP should be able to generalize as he/she sees fit.
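The suggestion in the first comment, replacing every punctuation character in both strings with a space before testing, could be sketched roughly as follows. This is only one possible reading of that comment; the helper names words and contains_all_words are made up for the example, and "punctuation" is assumed to mean the ASCII characters in string.punctuation.

```python
import string

# Translation table mapping each ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def words(text):
    # Replace punctuation with spaces, then split on any run of whitespace.
    return text.translate(PUNCT_TO_SPACE).split()

def contains_all_words(user_comment, comment_in_movie):
    # Whole-word comparison; word order is not checked here.
    return set(words(user_comment)) <= set(words(comment_in_movie))

comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"
print(contains_all_words("Hobbit 2013:Bad Movie", comment_in_movie))       # True
print(contains_all_words("Hobbit 2013 Bad Movie", comment_in_movie))       # True
print(contains_all_words("Dark Knight 2013:Bad Movie", comment_in_movie))  # False
```

Because both strings are normalised the same way, the check no longer depends on which delimiter, if any, the user typed.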

kamal · Accepted Answer · 2020-06-22 07:54:58Z

1

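Sure, user_comment as a whole is not in comment_in_movie, so you need to split user_comment into words and then search for each word. Because the colon joins 2013 and Bad into a single token, replace it with a space before splitting. Here is the solution:

# Treat ":" as a word separator; otherwise "2013:Bad" is searched as one token and is not found.
if all(x in comment_in_movie for x in user_comment.replace(":", " ").split()):
    print("found")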

answered Jun 22, 2020 at 7:54

kamal


Irfan wani · Accepted Answer · 2020-06-22 10:00:11Z

1

You have already answered this correctly yourself. You can solve the problem in a few steps:

Store the first string as a list whose elements are the words separated by spaces, and use a loop to check whether each element is present in the second string. The problem with this method is that it returns true even if the words of the first string appear in the second string in the wrong order. I hope that answers your question. If you don't know about loops, you can learn them from the many tutorials available on different platforms, or just ask here.
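The ordering problem described above can be illustrated with a short sketch. The function name words_in_order is hypothetical, and the order-aware variant shown here (searching each word only after the end of the previous match with str.find) is just one way to approach it.

```python
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

def words_in_order(words, text):
    # Search each word only in the part of text after the previous match.
    pos = 0
    for word in words:
        idx = text.find(word, pos)
        if idx == -1:
            return False
        pos = idx + len(word)
    return True

shuffled = ["Movie", "Bad", "2013", "Hobbit"]

# The plain loop/all() check ignores order, so the shuffled words still pass.
print(all(w in comment_in_movie for w in shuffled))                      # True
# The order-aware check rejects them and accepts the original order.
print(words_in_order(shuffled, comment_in_movie))                        # False
print(words_in_order(["Hobbit", "2013", "Bad", "Movie"], comment_in_movie))  # True
```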

edited Jun 22, 2020 at 10:00

answered Jun 22, 2020 at 7:58

Irfan wani


Povilas Kirna · Accepted Answer · 2020-06-22 07:53:59Z

0

You can split your

user_comment = "Hobbit 2013:Bad Movie"

into two separate strings by using:

user_comment = "Hobbit 2013:Bad Movie"
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"
split_string = user_comment.split(":")
if split_string[0] in comment_in_movie and split_string[1] in comment_in_movie:
    print("found")

The if statement then compares both parts to:

comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

Notice how this answer differs from the first one: here you access the split text by list index.
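Accessing split_string[1] assumes that user_comment always contains a colon; without one, split(":") returns a single-element list and the index lookup raises IndexError. A minimal sketch that tolerates a missing colon, using str.partition, might look like this (the function name parts_found is made up for the example):

```python
def parts_found(user_comment, comment_in_movie):
    # partition never raises: without a colon it returns (user_comment, "", "").
    title, _sep, review = user_comment.partition(":")
    # "" in any_string is True, so a missing review part does not block a match.
    return title in comment_in_movie and review in comment_in_movie

comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"
print(parts_found("Hobbit 2013:Bad Movie", comment_in_movie))  # True
print(parts_found("Hobbit 2013", comment_in_movie))            # True
print(parts_found("Dark Knight:Bad Movie", comment_in_movie))  # False
```

Note that partition splits only at the first colon, whereas split(":") splits at every colon.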

answered Jun 22, 2020 at 7:53

Povilas Kirna


alani · Accepted Answer · 2020-06-22 08:50:32Z

You can do this, in order to test whether comment_in_movie contains all characters from user_comment in the correct order. Additional characters that are missing from user_comment are permitted anywhere inside comment_in_movie, and the loop over the characters of comment_in_movie will just carry on until it finds matching characters again. Provided that the end of user_comment is reached before running out of characters in comment_in_movie, that is considered a match.

user_comment = "Hobbit 2013:Bad Movie"
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

i = 0

for c in comment_in_movie:
    if user_comment[i] != c:
        continue
    i += 1
    if i == len(user_comment):
        found = True
        break
else:
    found = False

if found:
    print("found")

The above code makes no assumptions whatsoever regarding permitted places in which missing characters may occur. In some sense, this is more flexible because it checks that all characters are present, while not needing any advance knowledge of what delimiter(s) to use to split the string. However, it might be for example that gaps in the middle of a word should not be acceptable (e.g. if user_comment contained Hit instead of Hobbit).
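For comparison, the same in-order character check can be written more compactly with an iterator. This is an alternative sketch rather than the code above; the function name is_subsequence is hypothetical. Unlike the explicit loop, it returns True for an empty user_comment instead of raising IndexError.

```python
def is_subsequence(needle, haystack):
    """Return True if every item of needle occurs in haystack, in order."""
    it = iter(haystack)
    # `item in it` advances the iterator until it finds item (or runs out),
    # so each later search resumes where the previous one stopped.
    return all(item in it for item in needle)

comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"
print(is_subsequence("Hobbit 2013:Bad Movie", comment_in_movie))  # True
print(is_subsequence("Hit 2013:Bad Movie", comment_in_movie))     # True
print(is_subsequence("Movie Bad:2013 Hobbit", comment_in_movie))  # False
```

The second call shows the caveat mentioned above: at character level, Hit matches because H, i and t all occur in that order inside Hobbit.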

For this reason, an alternative version is given below which is based on looking for whole words. All the words in user_comment now have to appear in comment_in_movie, again in the correct order, and non-word characters e.g. punctuation are simply ignored. The logic is exactly the same, except that we loop over words in lists, instead of characters in strings. So for example, "Hobbit 2013,Bad Movie" would be "found", without requiring that the comma is contained in comment_in_movie, but "Hit 2013:Bad Movie" would not be "found".

import re

user_comment = "Hobbit 2013:Bad Movie"
comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"

# Raw strings keep "\w" from being treated as an (invalid) string escape.
user_comment_words = re.findall(r"\w+", user_comment)
comment_in_movie_words = re.findall(r"\w+", comment_in_movie)

i = 0
for w in comment_in_movie_words:
    if user_comment_words[i] != w:
        continue
    i += 1
    if i == len(user_comment_words):
        found = True
        break
else:
    found = False

if found:
    print("found")
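To try the examples mentioned above, the word-based loop can be wrapped in a function. The sketch below keeps the same logic; the function name contains_words_in_order is hypothetical, and treating an empty user_comment as a match is an assumption added here to avoid indexing an empty list.

```python
import re

def contains_words_in_order(user_comment, comment_in_movie):
    wanted = re.findall(r"\w+", user_comment)
    available = re.findall(r"\w+", comment_in_movie)
    if not wanted:
        return True  # assumption: nothing to look for counts as a match
    i = 0
    for w in available:
        if wanted[i] != w:
            continue  # skip words that are not the next one we need
        i += 1
        if i == len(wanted):
            return True  # every wanted word was seen, in order
    return False

comment_in_movie = "Hobbit 2013 [email protected]:Bad Movie"
print(contains_words_in_order("Hobbit 2013:Bad Movie", comment_in_movie))  # True
print(contains_words_in_order("Hobbit 2013,Bad Movie", comment_in_movie))  # True
print(contains_words_in_order("Hit 2013:Bad Movie", comment_in_movie))     # False
```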
