22

I am experiencing some performance problems when creating a very simple Python HTTP server. The key issue is that performance varies depending on which client I use to access it, even though the server and all clients are running on the local machine. For instance, a GET request issued from a Python script (urllib2.urlopen('http://localhost/').read()) takes just over a second to complete, which seems slow considering that the server is under no load. Running the GET request from Excel using MSXML2.ServerXMLHTTP also feels slow. However, requesting the data from Google Chrome or from RCurl, the curl add-in for R, yields an essentially instantaneous response, which is what I would expect.

Adding further to my confusion is that I do not experience any performance problems for any client when I am on my computer at work (the performance problems are on my home computer). Both systems run Python 2.6, although the work computer runs Windows XP instead of 7.

Below is my very simple server example, which simply returns 'Hello world' for any GET request.

from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        print("Just received a GET request")
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()

        self.wfile.write('Hello world')

        return

    def log_request(self, code=None, size=None):
        print('Request')

    def log_message(self, format, *args):
        print('Message')

if __name__ == "__main__":
    try:
        server = HTTPServer(('localhost', 80), MyHandler)
        print('Started http server')
        server.serve_forever()
    except KeyboardInterrupt:
        print('^C received, shutting down server')
        server.socket.close()

Note that in MyHandler I override the log_request() and log_message() methods. The reason is that I read that a fully-qualified domain name lookup performed by one of these functions might be the cause of a slow server. Unfortunately, setting them to just print a static message did not solve my problem.

Also, notice that I have put a print() statement as the first line of the do_GET() routine in MyHandler. The slowness occurs prior to this message being printed, so nothing that comes after it can be causing the delay.

3 Answers

33

The request handler issues an inverse name lookup in order to display the client name in the log. On my Windows 7 machine, a first DNS lookup fails with no delay, followed by two successive NetBIOS name queries to the HTTP client, and each one runs into a 2-second timeout, for a total delay of 4 seconds!

Have a look at https://bugs.python.org/issue6085
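You can confirm this outside the server by timing the lookup that the logging code performs. A minimal sketch (works on Python 2 and 3); on an affected machine socket.getfqdn() blocks for several seconds waiting on the failed NetBIOS/DNS queries, while on a healthy one it returns almost instantly:

```python
import socket
import time

# Time the reverse lookup that BaseHTTPRequestHandler.address_string()
# performs for every request. A multi-second result here reproduces
# the per-request delay described above.
start = time.time()
name = socket.getfqdn('127.0.0.1')
elapsed = time.time() - start
print('getfqdn returned %r in %.3f seconds' % (name, elapsed))
```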

Another fix that worked for me is to override BaseHTTPRequestHandler.address_string() in my request handler with a version that does not perform the name lookup:

def address_string(self):
    host, port = self.client_address[:2]
    #return socket.getfqdn(host)
    return host

Philippe


4 Comments

Great find! You have saved orders of magnitude of my life in which I was otherwise waiting wondering why my server blew under Win7 and 8. Thanks!
This was my problem using IronPython 2.7.4, had a 100x speedup by removing this line
Old question but definitely made my day! Thank you for this!
This solved a long-standing issue I had with some Python HTTP servers, such as stackp/Droopy (a mini web server that lets others upload files to your computer) — Droopy is slow due to this DNS lookup bug in Python's BaseHTTPServer module. Many thanks.
19

This does not sound like a problem with the code. A nifty way of troubleshooting an HTTP server is to connect to it with telnet on port 80. Then you can type something like:

GET /index.html HTTP/1.1
host: www.blah.com
<enter> <enter>

and observe the server's response. See if you get a delay using this approach.
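The same check can be scripted instead of typed into telnet. Here is a self-contained sketch, assuming Python 3's http.server module (the Python 2 equivalent was BaseHTTPServer); it starts a minimal server on an unused port in a background thread, then sends the raw request bytes over a plain socket and prints the response:

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()
        self.wfile.write(b"Hello world")

    def log_message(self, format, *args):
        pass  # suppress per-request logging for this test

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Send the same raw request you would type into the telnet session above.
s = socket.create_connection(("127.0.0.1", server.server_port))
s.sendall(b"GET /index.html HTTP/1.1\r\n"
          b"Host: localhost\r\n"
          b"Connection: close\r\n\r\n")
response = b""
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    response += chunk
s.close()
server.shutdown()
print(response.decode())
```

If the scripted request is fast while urllib2.urlopen('http://localhost/') is slow, the delay is happening in name resolution rather than in the server itself.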

You may also want to turn off any firewalls to see if they are responsible for the slowdown.

Try replacing localhost with 127.0.0.1. If that solves the problem, it is a clue that the FQDN lookup may indeed be the cause.

4 Comments

john - Thanks for the tip, having the request go to 127.0.0.1 solved the speed issue. If anyone can point me to more info about controlling the FQDN lookup that would be useful.
Ok... this could mean that urllib2.urlopen() and Excel are doing some form of FQDN lookup on "localhost" that is somehow taking forever to resolve. The question to ask is how/why Chrome and curl are doing the lookup differently. You could try other Python functions that access http:// (socket calls, perhaps?) to see whether this is a Python-interpreter-specific thing or urllib-specific. If it is interpreter-specific, then the OS/firewall might somehow be treating processes such as python.exe and Excel differently from Chrome and curl.
Just wanted to indicate that this old answer worked for me in 2020. Both on Windows and Linux. Not sure if it was the same cause, but the solution worked.
Absolutely insane. But this really was the issue.
-1

Replacing localhost with 127.0.0.1 can solve the problem. :)

