Python Programming/Internet
The urllib package, which is bundled with Python, can be used for web interaction. Its urllib.request module provides a file-like interface for web URLs.
Getting page text as a string
An example of reading the contents of a webpage:
import urllib.request as urllib
# In Python 3, read() returns the page content as a bytes object
pageText = urllib.urlopen("http://www.spam.org/eggs.html").read()
print(pageText)
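Because read() returns bytes in Python 3, the result usually has to be decoded before it can be treated as text. The following is a minimal sketch of one way to do that, assuming the server declares its character set in the Content-Type header and falling back to UTF-8 otherwise. The helper name fetch_text and its default_charset parameter are hypothetical and not part of urllib.

```python
import urllib.request


def fetch_text(url, default_charset="utf-8"):
    """Return the resource at `url` decoded to a str (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes
        # Charset declared in the Content-Type header, or None if absent
        charset = response.headers.get_content_charset()
    # Assumption: fall back to default_charset when no charset is declared
    return raw.decode(charset or default_charset)
```

Calling fetch_text("http://www.spam.org/eggs.html") would then return a str instead of bytes.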
Processing page text line by line:
import urllib.request as urllib
for line in urllib.urlopen("https://en.wikibooks.org/wiki/Python_Programming/Internet"):
    print(line)
The GET and POST methods can be used, too.
import urllib.parse
import urllib.request
params = urllib.parse.urlencode({"plato":1, "socrates":10, "sophokles":4, "arkhimedes":11})
# Using GET method: the parameters are appended to the URL as a query string
pageText = urllib.request.urlopen("http://international-philosophy.com/greece?%s" % params).read()
print(pageText)
# Using POST method: the parameters are sent in the request body, which must be bytes
pageText = urllib.request.urlopen("http://international-philosophy.com/greece", params.encode("ascii")).read()
print(pageText)
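The choice between GET and POST depends only on whether a data argument is supplied. The sketch below makes that visible by building urllib.request.Request objects without sending them. The endpoint www.example.com and the two form fields are placeholders chosen for illustration.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and form fields, used only for illustration
base_url = "http://www.example.com/greece"
params = urllib.parse.urlencode({"plato": 1, "socrates": 10})

# No data argument: the request will be sent as GET
get_request = urllib.request.Request(base_url + "?" + params)

# A bytes data argument: the request will be sent as POST
post_request = urllib.request.Request(base_url, data=params.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

Either object can be passed to urllib.request.urlopen() in place of a URL string to actually send the request.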
Downloading files
To save the content of a page on the internet directly to a file, you can read() it and write the resulting bytes to a file object:
import urllib.request
# Not recommended for large files: read() holds the entire download in RAM,
# which is a problem for files of 1 GB or more.
data = urllib.request.urlopen("http://upload.wikimedia.org/wikibooks/en/9/91/Python_Programming.pdf").read()
file = open('pythonbook.pdf','wb')
file.write(data)
file.close()
This downloads the file from the URL above and saves it to a file "pythonbook.pdf" on your hard drive.
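For large files, the response can be copied to disk in chunks instead of being read into memory all at once. The following is a minimal sketch of that approach using shutil.copyfileobj from the standard library. The helper name download_to_file and the 64 KiB chunk size are assumptions made for illustration.

```python
import shutil
import urllib.request


def download_to_file(url, path, chunk_size=64 * 1024):
    """Stream the resource at `url` into the file at `path`.

    Hypothetical helper: the response is copied in chunks of
    `chunk_size` bytes, so only one chunk is held in memory at a time.
    """
    with urllib.request.urlopen(url) as response, open(path, "wb") as out_file:
        shutil.copyfileobj(response, out_file, chunk_size)
```

A call such as download_to_file("http://upload.wikimedia.org/wikibooks/en/9/91/Python_Programming.pdf", "pythonbook.pdf") would save the same file as above. The with statement closes both the connection and the output file even if the transfer fails partway.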
Other functions
The urllib.parse module includes other functions that may be helpful when writing programs that use the internet:
>>> import urllib.parse
>>> plain_text = "This isn't suitable for putting in a URL"
>>> print(urllib.parse.quote(plain_text))
This%20isn%27t%20suitable%20for%20putting%20in%20a%20URL
>>> print(urllib.parse.quote_plus(plain_text))
This+isn%27t+suitable+for+putting+in+a+URL
The urlencode function, described above, converts a dictionary of key-value pairs into a query string to pass to a URL. The quote and quote_plus functions encode ordinary strings; quote_plus uses plus signs for spaces, which is the form used when submitting data for form fields. The unquote and unquote_plus functions do the reverse, converting URL-encoded text back to plain text.
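The round trip between the encoding and decoding functions can be sketched as follows. The example reuses the sample string from above, and the variable names are illustrative only.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

plain_text = "This isn't suitable for putting in a URL"

encoded = quote(plain_text)            # spaces become %20
encoded_plus = quote_plus(plain_text)  # spaces become +

# Each decoder reverses its matching encoder
decoded = unquote(encoded)
decoded_plus = unquote_plus(encoded_plus)

print(decoded == plain_text)
print(decoded_plus == plain_text)

# unquote() leaves plus signs alone, so it is not the inverse of quote_plus()
print(unquote(encoded_plus))
```

The last line shows why the decoder should match the encoder: text produced by quote_plus keeps its plus signs when passed through unquote.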
Email
With Python, MIME-compatible emails can be sent. This requires access to an SMTP server; the example below connects to one running on the local machine.
import smtplib
from email.mime.text import MIMEText
msg = MIMEText(
"""Hi there,
This is a test email message.
Greetings""")
me = 'sender@example.com'
you = 'receiver@example.com'
msg['Subject'] = 'Hello!'
msg['From'] = me
msg['To'] = you
# With no arguments, connect() uses an SMTP server on localhost, port 25
s = smtplib.SMTP()
s.connect()
s.sendmail(me, [you], msg.as_string())
s.quit()
This sends the sample message from 'sender@example.com' to 'receiver@example.com'.
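The same message can also be built with the newer email.message.EmailMessage class and sent with SMTP.send_message, which reads the sender and recipients from the message headers. This is a minimal sketch rather than a replacement for the example above. The helper names build_message and send_message are hypothetical, and an SMTP server on localhost, port 25 is assumed.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email (hypothetical helper)."""
    msg = EmailMessage()
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = recipient
    msg.set_content(body)
    return msg


def send_message(msg, host='localhost', port=25):
    """Send msg through the SMTP server at host:port (assumed to be running)."""
    # The with block closes the connection even if sending fails
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


msg = build_message('sender@example.com', 'receiver@example.com',
                    'Hello!', 'Hi there,\nThis is a test email message.\nGreetings')
```

Calling send_message(msg) would then deliver the message, provided the assumed local SMTP server accepts it.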