How To Capture Output of Curl from Python script

Question

I want to get information about a webpage using curl, but from Python. So far I have this:

os.system("curl --head www.google.com")

If I run that, it prints out:

HTTP/1.1 200 OK
Date: Sun, 15 Apr 2012 00:50:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=3e39ad65c9fa03f3:FF=0:TM=1334451013:LM=1334451013:S=IyFnmKZh0Ck4xfJ4; expires=Tue, 15-Apr-2014 00:50:13 GMT; path=/; domain=.google.com
Set-Cookie: NID=58=Giz8e5-6p4cDNmx9j9QLwCbqhRksc907LDDO6WYeeV-hRbugTLTLvyjswf6Vk1xd6FPAGi8VOPaJVXm14TBm-0Seu1_331zS6gPHfFp4u4rRkXtSR9Un0hg-smEqByZO; expires=Mon, 15-Oct-2012 00:50:13 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked

What I want to do is match the 200 in it using a regex (I don't need help with that), but I can't find a way to convert all the text above into a string. How do I do that? I tried info = os.system("curl --head www.google.com"), but info was just 0.

"The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function. See the Replacing Older Functions with the subprocess Module section in the subprocess documentation for some helpful recipes." -docs.python.org/library/os.html#os.system
– Jesus is Lord, Commented Apr 15, 2012 at 1:02

Raúl Martín · Accepted Answer · 2015-08-20 13:16:33Z

48

For some reason I need to use curl (no pycurl, httplib2...), so maybe this can help somebody:

import os
result = os.popen("curl http://google.es").read()
print(result)

answered Aug 20, 2015 at 13:16

Raúl Martín

3 Comments

Melvin Roest Over a year ago

Thanks this is more intuitive than other answers, handy for dirty / quickly created scripts :)

stats con chris Over a year ago

Thanks, but it also prints: % Total % Received % Xferd Average Speed Time Time Time Current... How can I remove this extra info from the log file?

Raúl Martín Over a year ago

The info that you see is curl's progress meter, which curl writes to stderr, so it won't be captured in the result variable. If you run python t.py >> hola.txt, only the output of result ends up in the file. But if you don't want to see it at all, or your log is capturing everything, add --silent to the curl command: result = os.popen("curl http://google.es --silent").read() Hope this helps.
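
To illustrate the comment above: a child process has two separate output streams, and curl puts the response on stdout and its progress meter on stderr. Here is a minimal Python 3 sketch of capturing both separately. It uses a stand-in child process (a hypothetical one-liner run through `sys.executable`, not curl itself) so that it runs without network access; the names `CHILD` and `capture_streams` are made up for this example.

```python
import subprocess
import sys

# Hypothetical stand-in for curl: writes "body" to stdout and "progress" to stderr.
CHILD = "import sys; sys.stdout.write('body'); sys.stderr.write('progress')"

def capture_streams(cmd):
    """Run cmd and return (stdout, stderr) as two separate strings."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
    )
    return proc.stdout, proc.stderr

out, err = capture_streams([sys.executable, "-c", CHILD])
print(repr(out))  # only what the child wrote to stdout
print(repr(err))  # only what the child wrote to stderr
```

With curl in place of the stand-in, `out` would hold the response and `err` the progress meter, unless `--silent` is passed.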

Óscar López · Accepted Answer · 2012-04-15 01:08:27Z

27

Try this, using subprocess.Popen():

import subprocess
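proc = subprocess.Popen(["curl", "--head", "www.google.com"], stdout=subprocess.PIPE, universal_newlines=True)  # universal_newlines=True returns text rather than bytes on Python 3
(out, err) = proc.communicate()  # err is None because stderr is not piped
print(out)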

As stated in the documentation:

The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several other, older modules and functions, such as:

os.system
os.spawn*
os.popen*
popen2.*
commands.*
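
For the original goal (getting the 200 out of the headers), here is a rough Python 3 sketch that combines the capture with the regex step. It assumes curl is on the PATH; the helper names `curl_head` and `status_code` are made up for this example, and the regex is just one plausible way to read the status line.

```python
import re
import subprocess

def curl_head(url):
    """Return the response headers curl prints for url (assumes curl is on PATH)."""
    proc = subprocess.run(
        ["curl", "--silent", "--head", url],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
        check=True,  # raise CalledProcessError if curl exits non-zero
    )
    return proc.stdout

def status_code(headers):
    """Extract the numeric status from the first line, e.g. 'HTTP/1.1 200 OK'."""
    match = re.match(r"HTTP/\S+\s+(\d{3})", headers)
    return int(match.group(1)) if match else None
```

Calling `status_code(curl_head("www.google.com"))` should then give the status as an integer. The two steps are kept separate so the parsing can be checked without touching the network.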

edited Apr 15, 2012 at 1:08

answered Apr 15, 2012 at 1:02

Óscar López

3 Comments

Ignacio Vazquez-Abrams Over a year ago

@user1333973: Because subprocess works and os.system() doesn't.

Óscar López Over a year ago

@user1333973 added link to the documentation

JJ Roman Over a year ago

In order to also get err, we need to call Popen as: proc = subprocess.Popen(fullCommand, stderr=subprocess.PIPE, stdout=subprocess.PIPE)

Community · Accepted Answer · 2020-06-20 09:12:55Z

import os
cmd = 'curl https://randomuser.me/api/'
os.system(cmd)

Result

{"results":[{"gender":"male","name":{"title":"mr","first":"çetin","last":"nebioğlu"},"location":{"street":"5919 abanoz sk","city":"adana","state":"kayseri","postcode":53537},"email":"çetin.nebioğ[email protected]","login":{"username":"heavyleopard188","password":"forgot","salt":"91TJOXWX","md5":"2b1124732ed2716af7d87ff3b140d178","sha1":"cb13fddef0e2ce14fa08a1731b66f5a603e32abe","sha256":"cbc252db886cc20e13f1fe000af1762be9f05e4f6372c289f993b89f1013a68c"},"dob":"1977-05-10 18:26:56","registered":"2009-09-08 15:57:32","phone":"(518)-816-4122","cell":"(605)-165-1900","id":{"name":"","value":null},"picture":{"large":"https://randomuser.me/api/portraits/men/38.jpg","medium":"https://randomuser.me/api/portraits/med/men/38.jpg","thumbnail":"https://randomuser.me/api/portraits/thumb/men/38.jpg"},"nat":"TR"}],"info":{"seed":"0b38b702ef718e83","results":1,"page":1,"version":"1.1"}}

Michael Dillon · Accepted Answer · 2012-04-15 01:08:39Z

1

You could use an HTTP client library in Python instead of calling the curl command. In fact, there is a curl library that you can install (as long as you have a compiler on your OS).

Other choices are httplib2 (recommended), which is a fairly complete HTTP protocol client that also supports caching, or just plain httplib, or a library named Requests.

If you really, really want to just run the curl command and capture its output, then you can do this with Popen in the builtin subprocess module documented here: http://docs.python.org/library/subprocess.html

answered Apr 15, 2012 at 1:08

Michael Dillon

IT Ninja · Accepted Answer · 2012-04-15 02:02:41Z

1

Well, there is an easier to read, but messier way to do it. Here it is:

import os
outfile=''  #put your file path there
os.system("curl --head www.google.com>>{x}".format(x=outfile))  #Appends curl's output to the log file (and creates it if it doesn't exist).
readOut=open(outfile,"r")  #Opens file in reading mode.
for line in readOut:
    print(line)  #Prints lines in file
readOut.close()  #Closes file
os.remove(outfile)  #This is optional, as it just deletes the log file after use.

This should work properly for your needs. :)
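
If you do want to go through a file, a possible variation is to let the tempfile module pick the path and clean up afterwards. This is only a sketch in Python 3; `capture_via_tempfile` is a made-up helper name, and the command run here is a stand-in Python one-liner so the example works offline.

```python
import subprocess
import sys
import tempfile

def capture_via_tempfile(cmd):
    """Run cmd with stdout redirected to a temporary file, then read it back."""
    # TemporaryFile is removed automatically when the with-block closes it.
    with tempfile.TemporaryFile(mode="w+") as tmp:
        subprocess.run(cmd, stdout=tmp, check=True)
        tmp.seek(0)  # rewind before reading what the child wrote
        return tmp.read()

# Stand-in command (hypothetical) so the sketch runs without network access;
# for the question it would be ["curl", "--silent", "--head", "www.google.com"].
text = capture_via_tempfile([sys.executable, "-c", "print('HTTP/1.1 200 OK')"])
print(text)
```

This avoids choosing a file name and deleting it by hand, though piping straight into Python is usually simpler when the output fits in memory.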

answered Apr 15, 2012 at 2:02

IT Ninja

Adam · Accepted Answer · 2012-04-15 01:31:56Z

0

Try this:

import httplib
conn = httplib.HTTPConnection("www.python.org")
conn.request("GET", "/index.html")
r1 = conn.getresponse()
print r1.status, r1.reason
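
On Python 3 the httplib module is named http.client. A rough equivalent of the snippet above might look like this; it sends a HEAD request (similar to curl --head) instead of GET, and the function name `head_status` and its default arguments are assumptions made for this sketch.

```python
import http.client

def head_status(host, path="/", port=80, timeout=10):
    """Send a HEAD request and return (status, reason), e.g. (200, 'OK')."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.reason
    finally:
        conn.close()  # always release the socket
```

For example, `head_status("www.python.org", "/index.html")` returns the status and reason as a tuple. For https sites, http.client.HTTPSConnection would be used instead.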

answered Apr 15, 2012 at 1:31

Adam

1 Comment

576i Over a year ago

This does not really answer the question on how to capture output from curl. Often you need curl to send specific cookies and other parameters.

Adrian Mole · Accepted Answer · 2024-04-12 07:36:02Z

-1

Try this:

import subprocess as sp

cmd = "curl --head www.google.com"
p1 = sp.Popen(cmd, 
              stdin=sp.PIPE,
              stdout=sp.PIPE,
              stderr=sp.PIPE,
              text=True,
              shell=True)  
(output, err) = p1.communicate()
print('output: ', output)
print('err: ', err)
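
If you would rather not use shell=True (for instance when part of the command comes from user input), one option is to split the string into an argument list with shlex. A small sketch in Python 3; `run_without_shell` is a made-up helper name and is defined here but not called, since calling it with curl needs network access.

```python
import shlex
import subprocess

cmd = "curl --head www.google.com"
args = shlex.split(cmd)  # split the way a POSIX shell would, without running a shell
print(args)

def run_without_shell(command_line):
    """Run a command given as one string, without shell=True."""
    proc = subprocess.run(
        shlex.split(command_line),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
    )
    return proc.stdout, proc.stderr
```

`run_without_shell(cmd)` would then return output and err as strings, the same as the communicate() call above.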

edited Apr 12, 2024 at 7:36

Adrian Mole

answered Apr 9, 2024 at 16:27

Just ForWork
