I know there have probably been a million questions on this, but I'm wondering how to remove HTML tags from a string without importing HTMLParser or using regex. I've tried a bunch of different replace calls to remove the parts of the string enclosed in < and >, to no avail.
Basically what I'm working with is:
    from urllib.request import urlopen

    response = urlopen(url)
    html = response.read()
    html = html.decode()
From here I am just trying to manipulate the string variable html to strip the tags. Is there any way to do it as I specified, or must I use the methods I have seen in previous answers?
I also tried a for loop that went through every character to check whether it was inside a tag, but for some reason it wouldn't give me a proper printout. This was the loop:
    for i in html:
        if i == '<':
            html.replace(i, '')
            delete = True
        if i == '>':
            html.replace(i, '')
            delete = False
        if delete == True:
            html.replace(i, '')
Would appreciate any input.
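
For reference, one likely reason the loop above prints the original text is that Python strings are immutable: html.replace(i, '') returns a new string and the result is discarded, and even if it were kept it would remove every occurrence of that character, not just the current one. A minimal sketch of the same flag idea that builds a new string instead is below. The function name strip_tags and the sample input are made up for illustration, and it assumes no '>' appears inside attribute values; it does not handle comments, script contents, or entities like &amp;.

```python
def strip_tags(html):
    """Return html with everything between '<' and '>' removed."""
    kept = []            # characters that fall outside any tag
    inside_tag = False   # True while we are between '<' and '>'
    for ch in html:
        if ch == '<':
            inside_tag = True
        elif ch == '>' and inside_tag:
            inside_tag = False
        elif not inside_tag:
            kept.append(ch)
    return ''.join(kept)


print(strip_tags('<p>Hello, <b>world</b>!</p>'))  # Hello, world!
```

Collecting characters in a list and joining once at the end avoids repeatedly copying the string, which matters for a full page of HTML.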