Purely for educational purposes, I wanted to write a script that makes simple HTTP requests and prints the response to the console as plain text. I got it working with this code:

import socket
import sys

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

server_address = ('localhost', 8080)
print >>sys.stderr, 'connecting to %s port %s' % server_address
sock.connect(server_address)

message = 'GET /php.php HTTP/1.1\r\n'
message += 'Host: localhost:8080\r\n\r\n'
print >>sys.stderr, 'sending "%s"' % message
sock.sendall(message)

data = sock.recv(10000000)
print >>sys.stderr, 'received "%s"' % data

sock.close()

I just build the HTTP request, send it to the server, and wait for an answer.

Now the question: I do not know how to read the whole response. I know there is a Content-Length header (let's assume it will always be present). How can I read the entire response without resorting to something like sock.recv(1000000000000000000)?

1 Answer

Typically you would read a fixed number of bytes (e.g. 1024) in a loop. If recv returns any bytes, append them to your data; an empty result means the server has closed the connection, so break out of the loop and close the socket.

import socket

server_address = ('httpbin.org', 80)
message  = b'GET / HTTP/1.1\r\n'
message += b'Host: httpbin.org:80\r\n'
message += b'Connection: close\r\n'
message += b'\r\n'

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(server_address)
sock.sendall(message)

data = b''
while True:
    buf = sock.recv(1024)
    if not buf:
        break
    data += buf

sock.close()
print(data.decode())

Note that you'll have to set the Connection header to 'close' (or use HTTP/1.0). Otherwise the loop will hang, because HTTP/1.1 connections are persistent by default and the server keeps the socket open after sending the response.
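
To make the two options concrete, here is a small sketch of a request builder. The name build_request and its parameters are made up for illustration; it only assembles the bytes to pass to sock.sendall.

```python
def build_request(host, path="/", port=80, http_version="1.1"):
    """Build a GET request whose response ends when the server closes the socket."""
    lines = [
        "GET %s HTTP/%s" % (path, http_version),
        "Host: %s:%d" % (host, port),
    ]
    if http_version == "1.1":
        # HTTP/1.1 connections are persistent by default, so ask for a close.
        lines.append("Connection: close")
    # Every header line ends with CRLF; an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

With http_version="1.0" no Connection header is added, since an HTTP/1.0 server closes the connection after the response unless keep-alive is requested.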


Alternatively, you could read the first bytes and parse them to get the HTTP headers. If there is a Content-Length header, you can use it to work out how many bytes remain to be read.

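...
data = b''
while b'\r\n\r\n' not in data:
    chunk = sock.recv(1)
    if not chunk:  # connection closed before the headers were complete
        break
    data += chunk

header = data[:-4].decode()
# Header names are case-insensitive, so store them in lower case
headers = {}
for line in header.splitlines()[1:]:
    name, _, value = line.partition(':')
    headers[name.strip().lower()] = value.strip()
content_length = int(headers.get('content-length', 0))

# recv may return fewer bytes than requested, so loop until the body is complete
remaining = content_length
while remaining > 0:
    chunk = sock.recv(remaining)
    if not chunk:
        break
    data += chunk
    remaining -= len(chunk)
...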
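
For reference, the same idea can be packaged as a self-contained function. This is only a sketch: read_response and bufsize are names I made up, and it assumes the server sends a Content-Length header and does not use chunked transfer encoding. It reads the headers in larger chunks instead of one byte at a time, and keeps calling recv until the whole body has arrived.

```python
import socket


def read_response(sock, bufsize=4096):
    """Read one HTTP response from sock and return (header_text, body_bytes).

    Assumes a Content-Length header is present (no chunked encoding).
    """
    data = b""
    # Read until the blank line that ends the headers has been seen.
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before headers were complete")
        data += chunk

    # Anything received after the blank line already belongs to the body.
    head, _, body = data.partition(b"\r\n\r\n")
    header_text = head.decode("iso-8859-1")

    # Header names are case-insensitive, so compare them in lower case.
    content_length = 0
    for line in header_text.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())

    # recv() may return fewer bytes than requested, so keep reading.
    while len(body) < content_length:
        chunk = sock.recv(min(bufsize, content_length - len(body)))
        if not chunk:
            raise ConnectionError("connection closed before body was complete")
        body += chunk

    return header_text, body[:content_length]
```

Because the function stops once Content-Length bytes have been read, it does not depend on the server closing the connection. Note that anything read past the declared length is discarded, so it is not suitable for pipelined requests.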

Because send and recv are given bytes, this should work in Python 3 as well. However, this is a very basic example and will fail in many cases (HTTPS, cookies, redirects, etc.), so it's best to use a library designed for HTTP requests.
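
As one possible illustration of the library route, here is a minimal sketch using http.client from the Python 3 standard library. The fetch helper and its parameters are hypothetical; it covers plain HTTP only and does not follow redirects or handle cookies.

```python
import http.client


def fetch(host, port=80, path="/", timeout=10):
    """Return (status, headers, body) for a plain-HTTP GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        # read() returns the complete body; the library works out where
        # the response ends (Content-Length, chunked encoding, or close).
        body = resp.read()
        return resp.status, dict(resp.getheaders()), body
    finally:
        conn.close()
```

For the request in the example above, the call would be fetch('httpbin.org', 80, '/'). For HTTPS, http.client.HTTPSConnection takes the place of HTTPConnection.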
