
When I run the script, the output is empty. Why? The script connects to a site and parses the HTML <a> tags:

#!/usr/bin/python3

import re
import socket
import urllib, urllib.error
import http.client
import sys

conn = http.client.HTTPConnection('www.guardaserie.online');
headers = { "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
                "Content-type": "application/x-www-form-urlencoded; charset=UTF-8" }
params = urllib.parse.urlencode({"s":"hannibal"})
conn.request('GET', '/',params, headers)
response = conn.getresponse();

site = re.search('<a href="(.*)" class="box-link-serie">', str(response.read()), re.M|re.I)
if(site):
  print(site.group())

1 Answer


Most likely the pattern you are searching for does not occur in the response you read, or the match fails partway through the HTML. Try a looser pattern that drops the leading <a:

re.search( 'href="(.*)" class="box-link-serie"', str(response.read()), re.M | re.I )
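A minimal sketch of two things that can affect this pattern, run against made-up markup (the HTML and the example.com URLs are assumptions for illustration, not the site's real response). Calling str() on the bytes returned by response.read() produces the b'...' repr with newlines escaped, so the page becomes one long line and a greedy (.*) can run from the first href to the last matching class attribute. Decoding the bytes and using [^"]* keeps each match inside one attribute value.

```python
import re

# Hypothetical response body; the real page's markup may differ.
body = (
    b'<ul>\n'
    b'<li><a href="http://example.com/show-a/" class="box-link-serie">A</a></li>\n'
    b'<li><a href="http://example.com/show-b/" class="box-link-serie">B</a></li>\n'
    b'</ul>\n'
)

# str() on bytes gives the repr ("b'...'"): every newline becomes the two
# characters backslash + n, so the text contains no real line breaks.
as_repr = str(body)

# Decoding gives ordinary text with real newlines.
as_text = body.decode("utf-8")

# Greedy group on the repr: spans from the first href to the last class.
greedy = re.search(r'href="(.*)" class="box-link-serie"', as_repr, re.I)

# [^"]* cannot cross a closing quote, so each match is a single URL.
links = re.findall(r'href="([^"]*)" class="box-link-serie"', as_text, re.I)

print(greedy.group(1))
print(links)
```

Here links holds one URL per anchor, while greedy.group(1) swallows everything between the first and the last anchor.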

A more generic pattern like this one, or a real HTML parser instead of a regex, will likely get you the result you want.
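As one possible reading of "another parser method", here is a sketch using the standard library's html.parser. The class name SeriesLinkParser and the sample markup are hypothetical; only the class value box-link-serie comes from the question. Unlike the regex, this does not depend on the order of the href and class attributes.

```python
from html.parser import HTMLParser


class SeriesLinkParser(HTMLParser):
    """Collect href values of <a> tags that carry a given CSS class."""

    def __init__(self, wanted_class="box-link-serie"):
        super().__init__()
        self.wanted_class = wanted_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # class may hold several space-separated names, or be missing.
        classes = (attr_map.get("class") or "").split()
        if self.wanted_class in classes and attr_map.get("href"):
            self.links.append(attr_map["href"])


# Hypothetical markup; attribute order varies on purpose.
sample_html = """
<div>
  <a class="box-link-serie" href="http://example.com/show-a/">A</a>
  <a href="http://example.com/other/" class="nav">Other</a>
  <a href="http://example.com/show-b/" class="box-link-serie big">B</a>
</div>
"""

parser = SeriesLinkParser()
parser.feed(sample_html)
print(parser.links)
```

With a live response you would feed it response.read().decode(...) rather than str(response.read()).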

8 Comments

If you tried the pattern above, it should return a result. I would recommend trying these imports: import re, httplib, socket, urllib, sys, and changing params = urllib.urlencode, as well as conn = httplib.HTTPConnection ... (these are the Python 2 names; on Python 3 the equivalents are urllib.parse.urlencode and http.client.HTTPConnection)
The pattern returns the entire HTML page.
The result is always the same.
I get href="http://www.guardaserie.online/ray-donovan-a/" class="box-link-serie" when using print(site.group()). Python code here: gist.github.com/anonymous/43026f7262b2fddfb7643169f0d558b2
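One more thing worth checking, offered as an assumption rather than a confirmed cause: in the question's script the urlencoded parameters are passed as the third argument to conn.request, which http.client treats as the request body. A GET search usually carries its parameters in the query string instead. The sketch below shows that variant; build_search_path and fetch_search_page are hypothetical helper names, and whether the site reads the s parameter from the query string is also an assumption.

```python
import http.client
import urllib.parse


def build_search_path(term):
    """Return a request path with the search term in the query string."""
    return "/?" + urllib.parse.urlencode({"s": term})


def fetch_search_page(host, term):
    """Fetch the search page and return its body decoded as text.

    Not called in this example, so it runs without network access.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        # No body argument: the parameters travel in the path itself.
        conn.request("GET", build_search_path(term),
                     headers={"Accept": "text/html"})
        response = conn.getresponse()
        return response.read().decode("utf-8", errors="replace")
    finally:
        conn.close()


path = build_search_path("hannibal")
print(path)
```

The decoded text returned by fetch_search_page can then be searched or parsed directly, without going through str().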