Two small Python snippets that fetch a page, one using urllib and the other using urllib2.
The difference I've just encountered is that urllib will return the content of the page even if the page is not found (404), while urllib2 will raise an exception (urllib2.HTTPError).
urllib
from __future__ import print_function
import urllib, sys

def fetch():
    if len(sys.argv) != 2:
        print("Usage: {} URL".format(sys.argv[0]))
        return
    url = sys.argv[1]
    # urlopen returns a file-like object even when the server responds with an error page
    f = urllib.urlopen(url)
    html = f.read()
    print(html)

fetch()
Running python try_urllib.py https://www.python.org/xyz
will print a big HTML page, even though that URL does not exist: python.org answers with a 404 status, but its error page is itself a big HTML document, and urllib simply returns its content.
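Because urllib does not raise an exception here, detecting the 404 is up to the caller. A minimal sketch of one way to do it, using the getcode() method of the object returned by urllib.urlopen (available in Python 2.6 and later); the hard-coded URL is just the example from above:

from __future__ import print_function
import urllib

url = "https://www.python.org/xyz"
f = urllib.urlopen(url)
# getcode() returns the HTTP status of the response, e.g. 200 or 404
if f.getcode() == 200:
    print(f.read())
else:
    print("Failed to fetch {}: HTTP status {}".format(url, f.getcode()))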
urllib2
examples/python/try_urllib2.py
from __future__ import print_function
import urllib2, sys

def fetch():
    if len(sys.argv) != 2:
        print("Usage: {} URL".format(sys.argv[0]))
        return
    url = sys.argv[1]
    try:
        f = urllib2.urlopen(url)
        html = f.read()
        print(html)
    except urllib2.HTTPError as e:
        # urllib2 raises HTTPError for 4xx and 5xx responses
        print(e)

fetch()
Running python try_urllib2.py https://www.python.org/xyz
will print
HTTP Error 404: OK
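The HTTPError exception is more than a printable message: it also behaves like the response object, so its status code and the body of the error page can still be read. A minimal sketch using the code attribute and the read() method of urllib2.HTTPError; the URL is again the example from above:

from __future__ import print_function
import urllib2

url = "https://www.python.org/xyz"
try:
    f = urllib2.urlopen(url)
    print(f.read())
except urllib2.HTTPError as e:
    # HTTPError doubles as a file-like response object
    print("Status code:", e.code)   # e.g. 404
    print(e.read()[:200])           # beginning of the error page body

In Python 3 the two modules were merged: urllib.request.urlopen behaves like urllib2 here and raises urllib.error.HTTPError for such responses.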