Modules are reusable libraries of code in Python. Python comes with many standard library modules.
A module is imported using the
>>> import time >>> print(time.asctime()) 'Fri Mar 30 12:59:21 2012'
In this example, we’ve imported the time module and called the
asctime function from that module, which returns current time as a string.
There is also another way to use the import statement.
>>> from time import asctime >>> asctime() 'WED May 06 16:50:37 2020'
Here we just imported the
asctime function from the time module.
pydoc command provides help on any module or a function.
$ pydoc time Help on module time: NAME time - This module provides various functions to manipulate time values. ... $ pydoc time.asctime Help on built-in function asctime in time: time.asctime = asctime(...) asctime([tuple]) -> string ...
On Windows, the
pydoc command is not available. The work-around is to use, the built-in help function.
>>> help('time') Help on module time: NAME time - This module provides various functions to manipulate time values. ...
Writing our own modules is very simple.
For example, create a file called num.py with the following content.
def square(x): return x * x def cube(x): return x * x * x
Now open Python interpreter:
>>> import num >>> num.square(3) 9 >>> num.cube(3) 27
That's all we’ve written a python library.
Try pydoc num (pydoc.bat numbers on Windows) to see documentation for this numbers modules. It won’t have any documentation as we haven’t provided anything yet.
In Python, it is possible to associate documentation for each module, function using docstrings. Docstrings are strings written at the top of the module or at the beginning of a function.
Lets try to document our
num module by changing the contents of num.py
"""The num module provides utilties to work on numbers. Current it provides square and cube. """ def square(x): """Computes square of a number.""" return x * x def cube(x): """Computes cube of a number.""" return x * x
The pydoc command will now show us the doumentation nicely formatted.
Help on module num: NAME num - The num module provides utilties to work on numbers. FILE /Users/anand/num.py DESCRIPTION Current it provides square and cube. FUNCTIONS cube(x) Computes cube of a number. square(x) Computes square of a number.
Under the hood, python stores the documentation as a special field called doc.
>>> import os >>> print(os.getcwd.__doc__) getcwd() -> path Return a string representing the current working directory.
3.1. Standard Library
Python comes with many standard library modules. Lets look at some of the most commonly used ones.
3.1.1. os module
os.path modules provides functionality to work with files, directories etc.
Problem 1: Write a program to list all files in the given directory.
Problem 2: Write a program
extcount.py to count number of files for each extension in the given directory. The program should take a directory name as argument and print count and extension for each available file extension.
$ python extcount.py src/ 14 py 4 txt 1 csv
Problem 3: Write a program to list all the files in the given directory along with their length and last modification time. The output should contain one line for each file containing filename, length and modification date separated by tabs.
Hint: see help for
Problem 4: Write a program to print directory tree. The program should take path of a directory as argument and print all the files in it recursively as a tree.
$ python dirtree.py foo foo |-- a.txt |-- b.txt |-- code | |-- a.py | |-- b.py | |-- docs | | |-- a.txt | | \-- b.txt | \-- x.py \-- z.txt
3.1.2. urllib module
The urllib module provides functionality to download webpages.
>>> from urllib.request import urlopen >>> response = urlopen("http://python.org/") >>> print(response.headers) Date: Fri, 30 Mar 2012 09:24:55 GMT Server: Apache/2.2.16 (Debian) Last-Modified: Fri, 30 Mar 2012 08:42:25 GMT ETag: "105800d-4b7b-4bc71d1db9e40" Accept-Ranges: bytes Content-Length: 19323 Connection: close Content-Type: text/html X-Pad: avoid browser bug >>> response.header['Content-Type'] 'text/html' >>> content = request.read()
Problem 5: Write a program wget.py to download a given URL. The program should accept a URL as argument, download it and save it with the basename of the URL. If the URL ends with a /, consider the basename as index.html.
$ python wget.py http://docs.python.org/tutorial/interpreter.html saving http://docs.python.org/tutorial/interpreter.html as interpreter.html. $ python wget.py http://docs.python.org/tutorial/ saving http://docs.python.org/tutorial/ as index.html.
3.1.3. re module
Problem 6: Write a program antihtml.py that takes a URL as argument, downloads the html from web and print it after stripping html tags.
$ python antihtml.py index.html ... The Python interpreter is usually installed as /usr/local/bin/python on those machines where it is available; putting /usr/local/bin in your ...
Problem 7: Write a function make_slug that takes a name converts it into a slug. A
slug is a string where spaces and special characters are replaced by a hyphen, typically used to create blog post URL from post title. It should also make sure there are no more than one hyphen in any place and there are no hyphens at the beginning and end of the slug.
>>> make_slug("hello world") 'hello-world' >>> make_slug("hello world!") 'hello-world' >>> make_slug(" --hello- world--") 'hello-world'
Problem 8: Write a program links.py that takes URL of a webpage as argument and prints all the URLs linked from that webpage.
Problem 9: Write a regular expression to validate a phone number.
3.1.4. json module
Problem 10: Write a program myip.py to print the external IP address of the machine. Use the response from
http://httpbin.org/ get and read the IP address from there. The program should print only the IP address and nothing else.
3.1.5. zipfile module
The zipfile module provides interface to read and write zip files.
Here are some examples to demonstate the power of zipfile module.
The following example prints names of all the files in a zip archive.
import zipfile z = zipfile.ZipFile("a.zip") for name in z.namelist(): print(name)
The following example prints each file in the zip archive.
import zipfile z = zipfile.ZipFile("a.zip") for name in z.namelist(): print() print("FILE:", name) print() print(z.read(name))
Problem 11: Write a python program zip.py to create a zip file. The program should take name of zip file as first argument and files to add as rest of the arguments.
$ python zip.py foo.zip file1.txt file2.txt
Problem 12: Write a program mydoc.py to implement the functionality of pydoc. The program should take the module name as argument and print documentation for the module and each of the functions defined in that module.
$ python mydoc.py os Help on module os: DESCRIPTION os - OS routines for Mac, NT, or Posix depending on what system we're on. ... FUNCTIONS getcwd() ...
-The dir function to get all entries of a module
-The inspect.isfunction function can be used to test if given object is a function
-x.doc gives the docstring for x.
-The import function can be used to import a module by name
3.2. Installing third-party modules
PyPI, The Python Package Index maintains the list of Python packages available. The third-party module developers usually register at PyPI and uploads their packages there.
The standard way to installing a python module is using
easy_install. Pip is more modern and preferred.
Lets start with installing
Download the easy_install install script ez_setup.py.
Run it using Python.
That will install
easy_install, the script used to install third-party python packages.
Before installing new packages, lets understand how to manage virtual environments for installing python packages.
Earlier the only way of installing python packages was system wide. When used this way, packages installed for one project can conflict with other and create trouble. So people invented a way to create isolated Python environment to install packages. This tool is called
$ easy_install virtualenv
Installing virtualenv also installs the pip command, a better replace for easy_install.
Once it is installed, create a new virtual env by running the
$ virtualenv testenv
Now to switch to that env.
On UNIX/Mac OS X:
$ source testenv/bin/activate
Now the virtualenv testenv is activated.
Now all the packages installed will be limited to this virtualenv. Lets try to install a third-party package.
$ pip install tablib
This installs a third-party library called
tablib library is a small little library to work with tabular data and write csv and Excel files.
Here is a simple example.
# create a dataset data = tablib.Dataset() # Add rows data.append(["A", 1]) data.append(["B", 2]) data.append(["C", 3]) # save as csv with open('test.csv', 'wb') as f: f.write(data.csv) # save as Excel with open('test.xls', 'wb') as f: f.write(data.xls) # save as Excel 07+ with open('test.xlsx', 'wb') as f: f.write(data.xlsx)
It is even possible to create multi-sheet excel files.
sheet1 = tablib.Dataset() sheet1.append(["A1", 1]) sheet1.append(["A2", 2]) sheet2 = tablib.Dataset() sheet2.append(["B1", 1]) sheet2.append(["B2", 2]) book = tablib.Databook([data1, data2]) with open('book.xlsx', 'wb') as f: f.write(book.xlsx)
Problem 13: Write a program
csv2xls.py that reads a csv file and exports it as Excel file. The program should take two arguments. The name of the csv file to read as first argument and the name of the Excel file to write as the second argument.
Problem 14: Create a new virtualenv and install BeautifulSoup. BeautifulSoup is very good library for parsing HTML. Try using it to extract all HTML links from a webpage.
Read the BeautifulSoup documentation to get started.
Top comments (2)
Great stuff Py-lady
Thanks.. Just trying