I am doing sentiment analysis and I want to add NOT to every word between a negation word and the following punctuation mark. I am running the following code:

import re


fin=open("aboveE1.txt",'r', encoding='UTF-8')

transformed = re.sub(r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b[\w\s]+[^\w\s]', 
   lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)), 
   fin,
   flags=re.IGNORECASE)

It fails with this traceback:

Traceback (most recent call last):
  line 14, in
    flags=re.IGNORECASE)
  line 182, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

I don't know how to fix the error. Can you help me?

1 Answer

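re.sub expects a string (or bytes-like object) as its third argument, but fin is a file object, which is why the TypeError is raised. Read the text out of the file first, for example by iterating over its lines, and pass each line to re.sub: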

import re

fin = open("aboveE1.txt", 'r', encoding='UTF-8')
transformed = ''

for line in fin:
    transformed += re.sub(
        r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)\b[\w\s]+[^\w\s]',
        lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)),
        line,
        flags=re.IGNORECASE)
    # No need to append '\n' to 'transformed'
    # because the line returned via the iterator includes the '\n'

fin.close()

Also remember to always close the file you open.
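
One way to make sure the file is always closed is a with block. The following is a minimal sketch rather than the only way to do it: the names NEGATION_SCOPE, mark_negation and transform_file are made up for illustration, and it reads the whole file with fin.read() instead of looping over lines.

```python
import re

# Same negation pattern as in the question, compiled once for reuse.
NEGATION_SCOPE = re.compile(
    r'\b(?:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|'
    r'couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint)'
    r'\b[\w\s]+[^\w\s]',
    flags=re.IGNORECASE)


def mark_negation(text):
    """Prefix NEG_ to each word between a negation word and the next punctuation."""
    return NEGATION_SCOPE.sub(
        lambda match: re.sub(r'(\s+)(\w+)', r'\1NEG_\2', match.group(0)),
        text)


def transform_file(path):
    """Read the whole file as one string and tag it (hypothetical helper)."""
    # The with block closes the file even if an exception is raised.
    with open(path, 'r', encoding='UTF-8') as fin:
        return mark_negation(fin.read())

# Example call (assumes the file from the question exists):
# transformed = transform_file("aboveE1.txt")
```

Note one behavioural difference from the line-by-line loop: because \s also matches '\n', passing the whole file as one string lets a negation scope continue across a line break until the next punctuation mark, whereas processing line by line never looks past the end of the current line.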
