Sentiment analysis Python TypeError: expected string or bytes-like object

Question

I am doing sentiment analysis and want to add a NOT marker (the NEG_ prefix in the code below) to every word between a negation word and the following punctuation mark. I am running the following code:

import re


fin=open("aboveE1.txt",'r', encoding='UTF-8')

transformed = re.sub(r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b[\w\s]+[^\w\s]', 
   lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)), 
   fin,
   flags=re.IGNORECASE)

Traceback (most recent call last):
  line 14, in
    flags=re.IGNORECASE)
  line 182, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

I don't know how to fix this error. Can you help me?

oxalorg · Accepted Answer · 2016-06-25 04:51:37Z

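re.sub expects a string (or bytes-like object) as its third argument, not a file object. You are passing the file object fin, which is why you get the TypeError. See the re.sub documentation. Iterate over the file and pass each line instead: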

import re

fin = open("aboveE1.txt", 'r', encoding='UTF-8')
transformed = ''

for line in fin:
    # Pass each line (a str) to re.sub, not the file object itself
    transformed += re.sub(
        r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b[\w\s]+[^\w\s]',
        lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)),
        line,
        flags=re.IGNORECASE)
    # No need to append '\n' to 'transformed'
    # because the line returned via the iterator includes the '\n'

fin.close()

Also, remember to always close any file you open.
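
As a possible variation on the approach above, here is a minimal sketch that reads the whole file into one string and lets a with block close the file automatically. The names NEGATION_SCOPE, mark_negation and mark_negation_in_file are made up for this illustration and are not part of the original code. Reading the full text at once is an assumption about what is wanted: it lets a negation scope continue across a line break, which the line-by-line loop does not.

```python
import re

# Hypothetical name: the question's pattern, compiled once for reuse.
NEGATION_SCOPE = re.compile(
    r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|'
    r'couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b'
    r'[\w\s]+[^\w\s]',
    flags=re.IGNORECASE,
)


def mark_negation(text):
    """Prefix NEG_ to each word between a negation word and the next punctuation."""
    return NEGATION_SCOPE.sub(
        # Within each matched span, tag every word that follows whitespace.
        lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)),
        text,
    )


def mark_negation_in_file(path):
    """Read the whole file as a single str and transform it."""
    # The with block closes the file even if an exception is raised.
    with open(path, 'r', encoding='UTF-8') as fin:
        return mark_negation(fin.read())


if __name__ == '__main__':
    print(mark_negation("I do not like this movie, but the cast is fine."))
```

Note that the pattern only matches when some punctuation character follows the negation word, so a sentence with no closing punctuation is left unchanged.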

edited Jun 25, 2016 at 4:51

answered Jun 25, 2016 at 4:46