
I am working with regular expressions in Python. I want to match the lines, from a CSV file inserted into a database, that start and end with an underscore.

I used a regular expression in my Python script to do this, but it prints the result as None. Here is my code; please tell me what mistake I am making:

import re

reg = re.compile(r'^_.*_$', re.I)
imatch = reg.match(unicode(row[4], "utf8"))

Here r'^_.*_$' with re.I is my regular expression for matching lines that start and end with _, and unicode(row[4], "utf8") is the value taken from the CSV file that is inserted into the database.

Any help would be appreciated.

  • It's not possible to answer this question without knowing the contents of row[4], and what you're trying to match. Do you know that there are cases that begin and end with a _ that are not being matched? Commented Feb 17, 2013 at 16:17
  • unicode(row[4], "utf8") = ( aaaaa bbbb ccccc 5635! fgsfrq. ). Assume this is my string. I want to match strings like this that start and end with _, and they should match that regular expression. Commented Feb 17, 2013 at 16:19
  • Why would you expect that to match this regular expression? It doesn't start and end with an _. Commented Feb 17, 2013 at 16:20
  • Can you give me the proper re syntax for matching, if mine is wrong? Commented Feb 17, 2013 at 16:22
  • What are you trying to match? You said you wanted to match only lines starting and ending with _. Is that not what you want to do? Commented Feb 17, 2013 at 16:23
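The point made in these comments can be checked directly. This is only a minimal sketch, assuming Python 3 (so plain str is used instead of unicode()); the underscore-wrapped variant is a made-up string for contrast and is not from the question.

```python
import re

# Pattern from the question: must start and end with an underscore.
reg = re.compile(r'^_.*_$', re.I)

# Sample value quoted in the comments; it has no leading or trailing underscore.
plain = "aaaaa bbbb ccccc 5635! fgsfrq."
# Hypothetical variant wrapped in underscores, for contrast.
wrapped = "_aaaaa bbbb ccccc 5635! fgsfrq._"

print(reg.match(plain))          # None: the string does not start with '_'
print(bool(reg.match(wrapped)))  # True
```

So a None result here means the pattern is doing what it says; the input simply does not begin and end with _.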

2 Answers

import re

with open('file.csv') as f:
    lines = [line.strip() for line in f]
for x in lines:
    match = re.search(r'^_.*_$', x)
    if match:
        print(x)

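We strip each line first so that no trailing whitespace (for example a '\r' left over from Windows line endings, or stray spaces) sits between the final '_' and the end of the string; if it did, the regex would not match. In Python, $ on its own tolerates a single trailing '\n', but stripping makes the check robust against the other cases.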

file.csv

_abdlfla_
sldjlfds_
_adlfdls
_132jdlfjflds_

output

_abdlfla_
_132jdlfjflds_
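
Why stripping matters can be sketched as follows. This assumes Python 3, and the list below is a hypothetical in-memory stand-in for file.csv with Windows-style line endings; the function name matching_lines is also made up for illustration.

```python
import re

# Same pattern as in the answer above: starts and ends with an underscore.
PATTERN = re.compile(r'^_.*_$')

def matching_lines(raw_lines):
    """Return the stripped lines that start and end with '_'."""
    result = []
    for raw in raw_lines:
        line = raw.strip()  # drops '\r', '\n' and surrounding spaces
        if PATTERN.search(line):
            result.append(line)
    return result

# Hypothetical stand-in for the contents of file.csv (assumed '\r\n' endings).
sample = ["_abdlfla_\r\n", "sldjlfds_\r\n", "_adlfdls\r\n", "_132jdlfjflds_\r\n"]

print(matching_lines(sample))
# Without strip(), the '\r' after the last '_' prevents a match:
print(PATTERN.search("_abdlfla_\r\n"))
```

The first print shows the same two lines as the output above; the second prints None because the unstripped line no longer ends with an underscore.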

1 Comment

If you included a sentence about why using strip solves the problem I would be inclined to upvote.

You could use the startswith and endswith string methods instead of re. Is there a specific reason for using re?

with open('test.csv') as f:
    for line in f:
        line = line.strip()
        if line.startswith('_') and line.endswith('_'):
            print(line)
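
For comparison, here is a rough sketch of the two approaches side by side, assuming Python 3; the helper names is_wrapped and is_wrapped_re are made up for illustration. One difference worth knowing: a line consisting of a single '_' passes the startswith/endswith check, while the regex from the question needs two underscores.

```python
import re

def is_wrapped(line):
    """True if the stripped line starts and ends with '_' (string methods only)."""
    line = line.strip()
    return line.startswith('_') and line.endswith('_')

def is_wrapped_re(line):
    """Same check using the regex from the question."""
    return re.search(r'^_.*_$', line.strip()) is not None

# The two checks agree on ordinary lines and differ only on a lone "_".
for text in ["_abdlfla_", "sldjlfds_", "_adlfdls", "_"]:
    print(repr(text), is_wrapped(text), is_wrapped_re(text))
```

If a lone underscore should not count, adding a length check such as len(line) >= 2 to the string-method version would bring the two in line.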
