Python: OS, Shutil, Glob & Unipath

This is part of my Python & Django Series which can be found here including information on how to download all the source code.

This article looks at ways of interacting with the operating and file system.

OS

The OS module is a collection of miscellaneous operating system interfaces. If you want to interact with the operating system or call operating system functionality then you can probably do it with the OS module.

The following outlines some of the most useful functionality.

We can get the name and details of the OS with the name attribute and the uname function. Uname is only available on Unix-like systems and returns a tuple-like object with the following attributes.

  • sysname: operating system name
  • nodename: name of machine on network (implementation-defined)
  • release: operating system release
  • version: operating system version
  • machine: hardware identifier

Code:

import os

print(os.name)
print(os.uname())

Output:

posix
posix.uname_result(sysname='Linux', nodename='Minty', release='3.13.0-37-generic', version='#64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014', machine='x86_64')

We can determine the current working directory and also change it with the getcwd and chdir functions.

Code:

print(os.getcwd())
os.chdir('/')
print(os.getcwd())

Output:

/data/data/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/StandardLibrary
/

We can access the real group id and the real user id of the current process, along with a list of the supplementary group ids associated with that process.

Code:

print(os.getgid())
print(os.getuid())
print(os.getgroups()) 

Output:

1000
1000
[4, 24, 27, 30, 46, 108, 110, 1000]

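The getenv and putenv functions can be used to get and set environment variables respectively. Note that putenv does not update os.environ, so a value set this way is not visible to a later getenv call; assigning to os.environ is usually preferable.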

Code:

print(os.getenv("HOME"))
os.putenv("Fluffy", "Yes")

Output:

/home/luke
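As a minimal sketch of the os.environ approach, the following sets, reads and removes a variable. The variable names FLUFFY and NOT_SET_ANYWHERE_12345 are hypothetical and used only for illustration.

```python
import os

# "FLUFFY" is a hypothetical variable name used only for this illustration.
os.environ["FLUFFY"] = "Yes"              # Updates os.environ and calls putenv for us
value_when_set = os.getenv("FLUFFY")
print(value_when_set)                     # Yes

# The second argument is returned when the variable does not exist.
fallback = os.getenv("NOT_SET_ANYWHERE_12345", "fallback")
print(fallback)                           # fallback

del os.environ["FLUFFY"]                  # Removes the variable again
value_when_removed = os.getenv("FLUFFY")
print(value_when_removed)                 # None
```

Changes made this way are seen by the current process and by any child processes it starts afterwards.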

We can create directories and remove them with the mkdir and rmdir functions. We can also run shell commands with the system function.

Code:

os.mkdir("foo")
os.rmdir("foo")
os.system("mkdir XXX")

Shutil

The shutil module provides a high level interface when working with a file or a collection of files.

The following outlines some of the most useful functionality.

We can move and copy files with the move and copy functions.

The copytree function can be used to copy a directory and all its contents while the rmtree function can be used to delete a directory and all its contents.

Code:

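from shutil import move, copy, copytree, rmtree

copy("source_file.txt", "target_file.txt")     # Copy a file to a new file
move("source_file.txt", "target_dir")          # Move a file into an existing directory
copytree("source_dir", "copy_of_source_dir")   # Copy a directory and all its contents
rmtree("source_dir")                           # Delete a directory and all its contents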
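The functions above can be tried safely inside a scratch directory. The following is a small sketch under that assumption; every file and directory name in it is hypothetical.

```python
import os
import tempfile
from shutil import copy, copytree, move, rmtree

# All file and directory names below are hypothetical.
base = tempfile.mkdtemp()                          # Scratch directory for the demo
source_dir = os.path.join(base, "source_dir")
os.mkdir(source_dir)

source_file = os.path.join(source_dir, "source_file.txt")
with open(source_file, "w") as handle:
    handle.write("hello")

copied_file = copy(source_file, os.path.join(base, "copy.txt"))    # Copy one file
copied_dir = copytree(source_dir, os.path.join(base, "backup"))    # Copy a whole tree
moved_file = move(copied_file, os.path.join(base, "moved.txt"))    # Move (rename) a file

listing = sorted(os.listdir(base))
print(listing)                     # ['backup', 'moved.txt', 'source_dir']

rmtree(base)                       # Remove the scratch directory and its contents
print(os.path.exists(base))        # False
```

Each of copy, copytree and move returns the destination path, which is convenient when chaining operations.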

We can use shutil to compress and uncompress files and directories with the make_archive and unpack_archive functions respectively. The format must be one that is registered with Python, which in turn depends on the compression modules available on the system. We can determine which formats are available with the get_archive_formats function. It returns a list of tuples; each tuple contains the format name and a description.

Code:

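from shutil import get_archive_formats, make_archive, unpack_archive

print(get_archive_formats())
# make_archive takes the archive name without an extension and returns the full archive path
archive_path = make_archive("archive_name", "gztar", "source_dir")
unpack_archive(archive_path, "extract_dir", "gztar")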

Output:

[('bztar', "bzip2'ed tar-file"), ('gztar', "gzip'ed tar-file"), ('tar', 'uncompressed tar file'), ('zip', 'ZIP file')]
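To see the archive functions working end to end, here is a rough round trip in a scratch directory. The names backup, notes.txt and restored are hypothetical.

```python
import os
import tempfile
from shutil import make_archive, rmtree, unpack_archive

# All names below are hypothetical and live inside a scratch directory.
base = tempfile.mkdtemp()
source_dir = os.path.join(base, "source_dir")
os.mkdir(source_dir)
with open(os.path.join(source_dir, "notes.txt"), "w") as handle:
    handle.write("archived text")

# The base name is given WITHOUT an extension; the full path is returned.
archive_path = make_archive(os.path.join(base, "backup"), "gztar", root_dir=source_dir)
archive_name = os.path.basename(archive_path)
print(archive_name)                          # backup.tar.gz

extract_dir = os.path.join(base, "restored")
unpack_archive(archive_path, extract_dir, "gztar")
restored_files = os.listdir(extract_dir)
print(restored_files)                        # ['notes.txt']

rmtree(base)                                 # Clean up the scratch directory
```

Writing the archive outside of the directory being archived avoids the archive trying to include itself.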

The disk_usage function can be used to determine the disk usage statistics for a partition or hard disk. It returns a tuple of the total size, the used size and the free size, all in bytes.

Code:

from shutil import disk_usage

print(disk_usage("/"))

Output:

usage(total=20507914240, used=7373537280, free=12069023744)

The which function can be used to find the location of an executable which is locatable within the path variable. Here we use it to find the location of the python and python3 executables.

Code:

from shutil import which

print(which("python"))
print(which("python3"))

Output:

/usr/bin/python
/usr/bin/python3

Glob

Glob provides functionality for getting a list of files on a hard disk from a search pattern. The pattern rules for glob are not actually regular expressions but standard Unix path expansion rules.

  • * matches zero or more characters as wild card characters
  • ? matches a single character as a wild card character
  • [] matches a single character from a list of possibilities. The character '-' defines a range and the character '!' negates the list.
    • [0123456789] matches any digit
    • [abc] matches letters a, b or c as lowercase letters
    • [0-9] matches any digit
    • [a-zA-Z] matches any letter upper or lowercase
    • [!abc] matches anything except letters a, b or c

Glob can be used with relative or absolute paths.

By default glob is not recursive, i.e. it only searches the given directory and not its subdirectories. You can use os.walk to search recursively; from Python 3.5 onwards glob also accepts the ** pattern together with recursive=True.

The following matches any file which has an extension of 'txt'.

Code:

from glob import glob

for name in glob('*.txt'):
    print(name)

The following will match any file which ends in og.txt and has one first character which can be anything.

Code:

from glob import glob

for name in glob('?og.txt'):
    print(name)

The following will match any file which ends in og.txt and has one first character which can be any lowercase letter.

Code:

from glob import glob

for name in glob('[a-z]og.txt'):
    print(name)

The following will match any file which ends in og.txt and has one first character which is anything except for lowercase a, b, c, d or e.

Code:

from glob import glob

for name in glob('[!abcde]og.txt'):
    print(name)
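For searching subdirectories as well, the sketch below compares two approaches: os.walk combined with fnmatch (which applies the same pattern rules as glob), and glob with the ** pattern. The directory layout and file names are hypothetical and are created in a scratch directory.

```python
import fnmatch
import os
import tempfile
from glob import glob
from shutil import rmtree

# Build a small hypothetical tree: dog.txt, sub/log.txt and sub/deeper/fog.txt
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "sub", "deeper"))
for relative in ("dog.txt",
                 os.path.join("sub", "log.txt"),
                 os.path.join("sub", "deeper", "fog.txt")):
    open(os.path.join(base, relative), "w").close()

# Option 1: os.walk visits every directory; fnmatch filters the file names.
walked = []
for directory, _subdirs, files in os.walk(base):
    for name in fnmatch.filter(files, "?og.txt"):
        walked.append(os.path.relpath(os.path.join(directory, name), base))

# Option 2 (Python 3.5+): "**" matches any number of directories when recursive=True.
globbed = [os.path.relpath(path, base)
           for path in glob(os.path.join(base, "**", "?og.txt"), recursive=True)]

print(sorted(walked) == sorted(globbed))    # True
rmtree(base)
```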

Unipath

Unipath is an object-oriented alternative to os, os.path and shutil.

Path

Everything works around the Path class which can take any number of path name components which are concatenated to use the correct pathname separator. It supports glob style syntax as well as relative and absolute paths.

A Path instance can be created from any of the following.

Code:

from unipath import Path

Path("/", "home", "lukey")        # An absolute path of /home/lukey
Path("foo", "log.txt")            # A relative path of foo/log.txt
Path(__file__)                    # The current running file
Path()                            # Path(os.curdir)
Path("")                          # An empty path

Path Properties

The Path class has a number of properties which can be used to return a Path instance of the parent directory, the file name, the file extension and the file name without the extension.

The components function can be used to get a list of directories which define the Path instance, each as an instance of Path.

Code:

here = Path(__file__)
print(here)

print(here.components())            # A list of all the directories and the file as Path instances.
print(here.parent)                  # The path without the file name
print(here.name)                    # The file name
print(here.ext)                     # The file extension
print(here.stem)                    # The  file name without the extension

Output:

[Path('/'), Path('data'), Path('data'), Path('Dropbox'), Path('Development'), Path('SandBox'), Path('Git'), Path('ThePythonPit'), Path('PythonSandBox'), Path('StandardLibrary'), Path('unipath_example.py')]

/data/data/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/StandardLibrary

unipath_example.py

.py

unipath_example
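For comparison only, the standard library's pathlib module (a different library from Unipath, swapped in here because it ships with Python) exposes similar properties. This is a sketch using a hypothetical path; PurePosixPath is used so that nothing touches the file system.

```python
from pathlib import PurePosixPath

# Hypothetical path; a pure path never accesses the file system.
here = PurePosixPath("/data/project/unipath_example.py")

print(here.parts)     # ('/', 'data', 'project', 'unipath_example.py')
print(here.parent)    # /data/project
print(here.name)      # unipath_example.py
print(here.suffix)    # .py
print(here.stem)      # unipath_example
```

Note the naming differences: pathlib uses parts and suffix where Unipath uses components() and ext.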

Child & Parent Methods

We can access the parent directory of a path instance with the parent property as shown above.

We can jump up the ancestry tree x times with the ancestor method; this is the same as calling parent x times.

We can walk down the ancestry tree with the child method passing all components of the path to the required directory or file.

Code:

print(here.parent)                  # The containing directory
print(here.ancestor(5))             # Up x entities ( same as calling parent x times).
print(here.ancestor(3).child("PythonSandBox", "StandardLibrary")) # Returns the child as defined by the components.

Output:

/data/data/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/StandardLibrary

/data/data/Dropbox/Development/SandBox

/data/data/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/StandardLibrary

Expand, Expand User and Expand Vars

Path instances can be defined with the ~, system variables and also the .. notation.

Note: ~ represents the user's home directory on Linux/Unix.
Note: .. is a notation for up one directory when defining relative paths.

We can expand these relative path notations to absolute paths. The function expand_user will expand ~ while expand_vars will expand system variables. The norm function will expand the .. and . notations.

Alternatively the expand function will expand ~, system variables as well as the .. and . notations.

Code:

print(Path("~").expand_user() )     # Expands ~ to an absolute path name
print(Path("$HOME").expand_vars())  # Expands system variables
print(Path("/home/luke/..").norm()) # Expands .. and . notation
print(Path("$HOME/..").expand())    # Expands system variables, ~ and also ..

Output:

/home/luke
/home/luke
/home
/home

File Attributes and permissions

The path class also has a number of attributes which can be used to return information about the file or directory. Most of them are self explanatory.

Note that the atime and ctime functions return time as seconds past the epoch which for unix is the first second of 1970. You can find out the epoch with gmtime(0).

Code:

here = Path(__file__)

print(here.atime())                     # Last access time
print(here.ctime())                     # Last permission or ownership modification; windows is creation time
print(here.isfile())                    # Is this a file? Symbolic links are followed
print(here.isdir())                     # Is this a directory? Symbolic links are followed
print(here.islink())                    # Is a symbolic link?
print(here.ismount())                   # Is a mount point; i.e. is the parent on a different device?
print(here.exists())                    # File or directory actually exists? Symbolic links are followed.
print(here.lexists())                   # Same as exists but symbolic links are not followed
print(here.size())                      # File size in bytes
print(Path("/foo").isabsolute())        # Is an absolute and not a relative path

The function gmtime can be used to determine the epoch.

Code:

from time import gmtime

print(gmtime(0))

Output:

time.struct_time(tm_year=1970, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=1, tm_isdst=0)

The stat and lstat functions can be used to get statistics for a file. Stat will follow symbolic links while lstat will use the file the path instance is looking at, regardless of whether it is a symbolic link or not.

Code:

here = Path(__file__)
print(here.stat())                      # File stat object for size, permissions etc. Symbolic links are followed.

Output:

os.stat_result(st_mode=33188, st_ino=2753975, st_dev=2052, st_nlink=1, st_uid=1000, st_gid=1000, st_size=3054, st_atime=1434042724, st_mtime=1434042724, st_ctime=1434042724)

Python: Logging

The logging module provides the ability to add conditional logging into any code.

Levels

Each log message is associated with a level of seriousness, which starts at debug information and ends at critical errors.

Level     Description
Debug     Debug information
Info      Information
Warning   Warning
Error     Error
Critical  Critical

When we log information we provide the level along with the message.

Code:

import logging

logging.debug('Debug Info')
logging.info('Info')
logging.warning('Warning')
logging.error('Error')
logging.critical('Critical Error')

We configure the active level with the level parameter of the basicConfig function. Only messages of an equal or higher seriousness are reported. By default the level is set to WARNING.

Code:

logging.basicConfig(level=logging.WARNING)   # Set report level

Output:

WARNING:root:Warning
ERROR:root:Error
CRITICAL:root:Critical Error

If we changed the level to debug we would output the following.

Output:

DEBUG:root:Debug Info
INFO:root:Info
WARNING:root:Warning
ERROR:root:Error
CRITICAL:root:Critical Error
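The effect of the level can be checked in one script. The sketch below is an illustration rather than the approach used above: it configures a named logger (the name level_demo is hypothetical) instead of calling basicConfig, because basicConfig does nothing once the root logger already has a handler, and it writes to an in-memory stream so that the output can be inspected.

```python
import io
import logging

stream = io.StringIO()                          # Captures output instead of the terminal
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("level_demo")        # Hypothetical logger name
logger.addHandler(handler)
logger.propagate = False                        # Keep the demo away from the root logger
logger.setLevel(logging.WARNING)

logger.info("Info")                             # Below WARNING, so it is dropped
logger.warning("Warning")                       # Reported

logger.setLevel(logging.DEBUG)                  # Lower the threshold
logger.info("Info")                             # Now reported

captured = stream.getvalue()
print(captured)
# WARNING:level_demo:Warning
# INFO:level_demo:Info
```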

Logging To A File

By default the messages are logged to a terminal. We can log to a file with the filename parameter.

Code:

logging.basicConfig(filename='log.txt',level=logging.DEBUG)

By default messages are appended to the log file between application runs. We can overwrite the file each time by setting the filemode parameter to 'w'.

Code:

logging.basicConfig(filename='log.txt',level=logging.DEBUG, filemode="w")

Format

We can define the format of the messages in a number of ways.

First let's add the time to the output message.

Code:

logging.basicConfig(format='%(asctime)s %(message)s')

Output:

2015-06-10 19:19:03,578 Warning
2015-06-10 19:19:03,579 Error
2015-06-10 19:19:03,579 Critical Error

We can also add new templates into the format and pass the data as key value pairs into the logging message.

Below we add in the templates ip and user into the format string, these are then populated by the parameter named extra which should be a dictionary.

Code:

logging.basicConfig(format='%(asctime)-15s %(ip)s %(user)-8s %(message)s')
logging.critical('Critical %s', 'Error', extra = {'ip': '192.168.0.1', 'user': 'luke'})

Output:

2015-06-10 19:55:01,698 192.168.0.1 luke Critical Error
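When the same extra values are needed on every message, passing the dictionary each time becomes repetitive. One possible approach, sketched here under the assumption that a named logger is acceptable, is logging.LoggerAdapter, which attaches a fixed dictionary to each call. The logger name extra_demo is hypothetical.

```python
import io
import logging

stream = io.StringIO()                       # In-memory target so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(ip)s %(user)-8s %(message)s"))

logger = logging.getLogger("extra_demo")     # Hypothetical logger name
logger.addHandler(handler)
logger.propagate = False

# The adapter attaches the same extra dictionary to every message it logs.
adapter = logging.LoggerAdapter(logger, {"ip": "192.168.0.1", "user": "luke"})
adapter.critical("Critical %s", "Error")

captured = stream.getvalue()
print(captured)
```

The -8 in %(user)-8s pads the user name to eight characters, so the printed line contains extra spaces after luke.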

Python: Numerical Types, Mathematical Functions & The Random Module

Numbers

Python and its dynamic type system means that in most cases you don’t need to worry about the data type you are working on.

Before you get too excited there are still inherent inaccuracies when working with floating point numbers!

One thing I really like about Python is that from version 3 the result of a division is not truncated to an integer when both operands are ints, as such 3 / 2 = 1.5 and not 1!

Python provides the following numerical types which are all immutable.

Type      Description                                   Example
int       Integral value                                x = 10
float     Floating point                                x = 1.1
Decimal   Decimal floating point                        x = Decimal("1.1")
bool      Subclass of int                               x = True
Fraction  Holds a separate numerator and denominator    x = Fraction(1, 2)
complex   Complex number with real and imaginary parts  x = complex(-1.0, 0.0)

They all respond to the basic mathematical and assignment operators.

Integers

Before Python version 3 there were two integer types; int and long. Ints had a defined range while longs were in theory infinite but in practice depended upon the size of the available memory. The Python runtime determined which type was required and for longs handled the amount of memory required.

For version 3 and greater the int and long types have been merged into the int type. The runtime handles determining the memory size based upon the size of the integer being represented.

In theory ints can grow as big as they are required; the only limit is the amount of available memory.

Assignment is made with the = operator and an integral value. The int constructor is implicitly called, though it can also be called explicitly.

Code:

x = 1
y = int(1)

We can determine the type with the type function.

Code:

print(x, type(x))

Output:

1 <class 'int'>

We can parse strings to an int via the constructor.

Code:

x = int("1")

We can check that a string only contains digits with the isdigit method.

Code:

print("1".isdigit())
print("a".isdigit())

Output:

True
False

We can round by passing in a negative number of decimal places.

Below we round a value to -2 decimal places; the first two digits to the left of the decimal point become zeros, the third is rounded accordingly.

Code:

print(round(11111, -2))

Output:

11100

We can use the usual maths operators; below are some examples. See ../Operators/*.py for more details.

Code:

print(1+1)
print(1-1)
print(1/1)
print(1*1)

Output:

2
0
1.0
1

We can get physical storage information about the int class with the int_info attribute of the sys module.

Code:

import sys
print(sys.int_info)

The output will vary depending on whether you are using a 64 bit or 32 bit build of Python.

Output:

sys.int_info(bits_per_digit=30, sizeof_digit=4)
sys.int_info(bits_per_digit=15, sizeof_digit=2)

64 bit builds store each internal digit in 30 bits (base 2**30) and 32 bit builds store each digit in 15 bits (base 2**15). Both allow an int to grow to any size, limited only by the available memory.

The memory used is dependent upon what is required. We can use the getsizeof function to get the size of the memory allocated for an instance of an int.

Code:

from sys import getsizeof

print(getsizeof(0))
print(getsizeof(100**100))
print(getsizeof(100**100100))

Output:

24
116
88700
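As a small illustration of ints growing as required, the following works with a value far beyond the range of a 64 bit machine word without any loss of precision.

```python
big = 2 ** 100                      # Far larger than a 64 bit machine word
print(big)                          # 1267650600228229401496703205376
print(big.bit_length())             # 101 bits are needed to represent it
print(type(big))                    # <class 'int'>

difference = (big + 1) - big        # Arithmetic stays exact at any size
print(difference)                   # 1
```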

Floating Point

The float type represents a floating point number.

A float is a fixed size representation of a fractional number; it contains digits to the left and right of the decimal point.

1/3 gives us an infinite number of digits after the decimal point which is impossible to store exactly.

Float has a fixed memory size; its precision is limited to a number of significant figures, while the exponent determines how large or small a value can be.

For example 1.0e10 = 10000000000.0 and only has one significant digit.

Format            Total bits  Significant bits  Exponent bits
Single precision  32          23 + 1 sign       8
Double precision  64          52 + 1 sign       11

Assignment is made with the = operator and any real number literal, i.e. one with a decimal point or an exponent. It is a short cut for the float constructor, which can also be called explicitly.

Code:

x = 1.1
y = float(1.1)

We can determine the type with the type function.

Code:

print(x, type(x))

Output:

1.1 <class 'float'>

We can parse strings to a float including exponential representations.

Code:

print(float("1"))
print(float("1.0e10"))

Output:

1.0
10000000000.0

We can round to x d.p with the round function.

Code:

print(round(1.11111, 2))

Output:

1.11

We can use the usual maths operators. See ../Operators/*.py for more details.

Code:

print(1.1 + 1.1)
print(1.1 - 1.1)
print(1.1 / 1.1)
print(1.1 * 1.1)

Output:

2.2
0.0
1.0
1.2100000000000002

We can get physical storage information about the float class with the float_info attribute of the sys module.

Code:

import sys

print(sys.float_info)
print(sys.float_info.min)
print(sys.float_info.max)

Output:

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
2.2250738585072014e-308
1.7976931348623157e+308

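Python floats are normally IEEE 754 double precision, which gives 53 bits of precision as noted by the mant_dig property: 52 stored bits plus one implicit leading bit. The sign is held in a separate bit.

The memory appears to be always fixed at 24 bytes. A result which exceeds the largest representable value either becomes inf or raises an OverflowError exception, depending on the operation.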

Code:

print(getsizeof(float(0)))
print(getsizeof(1.1))
print(getsizeof(float(9999999.9)))

Output:

24
24
24

As in all languages, floats are approximations and as such suffer from inaccuracy.

Code:

print(1 / 3)
print(.1 + .1 + .1 == .3)
print(float(.1) + float(.1) + float(.1))

Output:

0.3333333333333333
False
0.30000000000000004
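Because of this, floats are usually compared with a tolerance rather than with ==. A minimal sketch using math.isclose (available from Python 3.5) and a manual tolerance check:

```python
import math

total = 0.1 + 0.1 + 0.1

exactly_equal = total == 0.3
close_enough = math.isclose(total, 0.3)      # Uses a relative tolerance of 1e-09 by default
within_manual = abs(total - 0.3) < 1e-9      # The same idea written by hand

print(exactly_equal)    # False
print(close_enough)     # True
print(within_manual)    # True
```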

Decimal

The Decimal type represents a real number (integral and fractional parts) in base 10, so decimal values such as 1.1 can be stored exactly, unlike with float.

Decimal arithmetic is not implemented directly by the hardware and as such carries a memory and performance cost when compared to floats, though it is more accurate for decimal values.

In Python the precision of decimal arithmetic is configurable; the default is 28 significant digits.

Assignment is made with the = operator and the decimal constructor.

Code:

from decimal import Decimal

x = Decimal(1.1)

We can determine the type with the type function.

Code:

print(x, type(x))

Output:

1.100000000000000088817841970012523233890533447265625 <class 'decimal.Decimal'>

We can parse strings to a Decimal including exponential numbers.

Code:

print(Decimal("1"))
print(Decimal("1.0e10"))

Output:

1
1.0E+10

We can round to x d.p with the round function.

Code:

print(round(Decimal(1.111111), 2))

Output:

1.11

We can use the usual maths operators. See ../Operators/*.py for more details.

Code:

x = Decimal(1.1)
print(x + x)
print(x - x)
print(x / x)
print(x * x)

Output:

2.200000000000000177635683940
0E-51
1
1.210000000000000195399252334

We can use the getcontext function to get the current arithmetic context, which controls the precision, rounding mode and error handling of Decimal operations.

Code:

from decimal import Decimal, getcontext

print(getcontext())

Output:

Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, FloatOperation, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])

The memory allocation seems to always be fixed at 104 bytes.

Code:

print(getsizeof(Decimal(0)))
print(getsizeof(Decimal(1024)))
print(getsizeof(Decimal(999999999999)))

Output:

104
104
104

Decimals provide better accuracy than floats.

Code:

print(.1 + .1 + .1 == .3)
print(Decimal(".1") + Decimal(".1") + Decimal(".1") == Decimal(".3"))

Output:

False
True
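A few details are worth seeing side by side: constructing a Decimal from a float carries over the float's error, while constructing from a string does not; quantize rounds to a fixed number of places; and the precision can be changed temporarily with localcontext. The values below are arbitrary and chosen only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext

from_float = Decimal(1.1)        # Inherits the float's binary rounding error
from_string = Decimal("1.1")     # Exactly 1.1
print(from_float == from_string)                 # False

# Round to two decimal places, with halves rounded away from zero.
price = Decimal("2.675")
rounded = price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)                                   # 2.68

with localcontext() as context:
    context.prec = 5                             # Five significant digits in this block only
    third = Decimal(1) / Decimal(3)
print(third)                                     # 0.33333
```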

Fractions

Represents a fraction, i.e. it has a numerator and a denominator.

A Fraction instance can be constructed from a pair of integers, from another rational number or from a string.

Code:

from fractions import Fraction
from decimal import Decimal

print(Fraction(1, 2))
print(Fraction(Fraction(1, 2)))
print(Fraction(1.1))
print(Fraction(Decimal(1.1)))

Output:

1/2
1/2
2476979795053773/2251799813685248
2476979795053773/2251799813685248

We can determine the type with the type function.

Code:

x = Fraction(1, 2)
print(x, type(x))

Output:

1/2 <class 'fractions.Fraction'>

We can parse strings of rational numbers into a fraction.

Code:

print(Fraction("1.1"))

Output:

11/10

We can use the usual maths operators. See ../Operators/*.py for more details.

Code:

x = Fraction(1, 2)
print(x + x)
print(x - x)
print(x / x)
print(x * x)

Output:

1
0
1
1/4
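A Fraction built from a float reproduces the float's exact binary value, which is why Fraction(1.1) printed such a large numerator and denominator earlier. As a short sketch, the limit_denominator method finds the closest fraction with a bounded denominator, and the numerator and denominator properties expose the parts.

```python
from fractions import Fraction

exact = Fraction(1.1)                      # The float's exact binary value
print(exact)                               # 2476979795053773/2251799813685248

approx = exact.limit_denominator(10)       # Closest fraction with denominator <= 10
print(approx)                              # 11/10
print(approx.numerator, approx.denominator)   # 11 10

total = Fraction(1, 3) + Fraction(1, 6)    # Results are reduced automatically
print(total)                               # 1/2
```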

Complex Numbers

Complex numbers have a real and an imaginary part, both of which are floating point numbers.

Assignment is made with the = operator and the complex constructor.

Code:

x = complex(1.1, 2.2)

We can determine the type with the type function.

Code:

print(x, type(x))

Output:

(1.1+2.2j) <class 'complex'>

We can use the usual maths operators. See ../Operators/*.py for more details.

Code:

x = complex(1.1, 2.2)
print(x + x)
print(x - x)
print(x / x)
print(x * x)

Output:

(2.2+4.4j)
0j
(1+0j)
(-3.630000000000001+4.840000000000001j)

Math

The math module contains various useful basic maths functions.

Code:

import math

The ceil and floor functions can be used to round up and down to the nearest integer respectively. For negative numbers ceil still rounds up (towards zero) and floor still rounds down (away from zero).

Code:

print(math.ceil(11.11))    
print(math.floor(11.11))  
print(math.ceil(-11.11))    
print(math.floor(-11.11))  

Output:

12
11
-11
-12

The built-in min and max functions can be used to return the smallest and largest value respectively from any number of arguments passed in. They can both also work on a collection.

The fsum function can be used to accurately sum all elements within a collection; it always returns a float.

Code:

print(min(1, 2, 3))
print(max(1, 2, 3))                 
print(math.fsum([1, 2, 3]))    

Output:

1
3
6.0

Modf returns a tuple containing the fractional and integral parts of a number, in that order; both are floats.

Trunc removes any digits after the decimal point leaving an integral value.

Fabs returns a positive value of a number or the number itself if it is not negative.

Code:

print(math.modf(1.1))        
print(math.trunc(1.11))   
print(math.trunc(-1.11))   
print(math.fabs(-999))    

Output:

(0.10000000000000009, 1.0)
1
-1
999.0
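The different rounding functions are easiest to tell apart on a negative value that sits exactly halfway between two integers. The following sketch compares them; note that the built-in round sends halves to the nearest even integer.

```python
import math

value = -1.5

floored = math.floor(value)     # Towards negative infinity
ceiled = math.ceil(value)       # Towards positive infinity
truncated = math.trunc(value)   # Towards zero
rounded = round(value)          # Nearest integer, halves go to the even neighbour

print(floored)      # -2
print(ceiled)       # -1
print(truncated)    # -1
print(rounded)      # -2
print(round(2.5))   # 2
```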

Isfinite can be used to ensure a number is not NaN or infinite. Isinf can determine if a number is infinite while isnan can determine if a value is assigned nan.

Code:

print(math.isfinite(1))                    
print(math.isinf(float("inf")))    
print(math.isnan(float("inf")))   

Output:

True
True
False

The function pow calculates one number to the power of another, while the sqrt function can be used to calculate the square root of a number.

Code:

print(math.pow(3, 3))    
print(math.sqrt(9))        

Output:

27.0
3.0

The pi and e properties can be used to get the pi and e constants respectively.

Code:

print(math.pi)      # Pi
print(math.e)       # E

Output:

3.141592653589793
2.718281828459045

The radians and degrees functions can be used to convert degrees to radians and radians to degrees respectively.

Code:

print(math.radians(360))                  # Degrees to radians
print(math.degrees(6.283185307179586))    # Radians to degrees

Output:

6.283185307179586
360.0

Python provides many other trigonometry functions.

Function     Description
acos(x)      Return the arc cosine of x, in radians.
asin(x)      Return the arc sine of x, in radians.
atan(x)      Return the arc tangent of x, in radians.
cos(x)       Return the cosine of x radians.
hypot(x, y)  Return the Euclidean norm, sqrt(x*x + y*y).
sin(x)       Return the sine of x radians.
tan(x)       Return the tangent of x radians.

Random

The random module provides functions for generating random numbers and making random selections.

Random generates a random float in the range 0.0 (inclusive) to 1.0 (exclusive).

Code:

from random import random

print(random())

Output:

0.07197929300003614

Uniform generates a random float which has a lower and upper limit as defined by the first and second arguments respectively.

Code:

from random import uniform

print(uniform(1, 10))

Output:

3.8361968102149504

Randint can be used to generate a random int between a lower and upper limit, both inclusive.

Code:

from random import randint

print(randint(1, 99))

Output:

20

Randrange can be used to select an element from a range which has a lower and upper limit but also respects an increment; as with range, the upper limit is excluded. Here we use it to get a random odd number between 1 and 97.

Code:

from random import randrange

print(randrange(1, 99, 2))

Output:

71

Choice can be used to randomly select an element from a sequence.

Code:

from random import choice

print(choice('abcdefghi'))

Output:

h

Sample can be used to randomly select any number of unique elements from a sequence. Here we select any two random elements from a collection.

Code:

from random import sample

print(sample([1, 2, 3, 4, 5], 2))       

Output:

[4, 2]

Shuffle can be used to randomly reorder a mutable sequence in place.

Code:

from random import shuffle

letters = "a,b,c,d".split(',')
shuffle(letters)
print("Shuffled Letters:", letters)

Output:

Shuffled Letters: ['b', 'c', 'a', 'd']
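The outputs above will differ on every run. When repeatable results are needed, for example in tests, the generator can be seeded. A minimal sketch using two independent Random instances with the same seed (the seed value 42 is arbitrary):

```python
from random import Random

first = Random(42)       # Two independent generators with the same seed
second = Random(42)

a = [first.randint(1, 99) for _ in range(5)]
b = [second.randint(1, 99) for _ in range(5)]

print(a == b)            # True: the same seed gives the same sequence
print(all(1 <= n <= 99 for n in a))   # True: both limits are inclusive
```

The module-level functions share one hidden generator, which can be seeded in the same way with random.seed.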

Python: Dates, Times & TimeIt

This article looks at dates and times along with functionality which will be useful when working with them.

Date

The date class provides a representation of a date. It is located within the datetime module.

Code:

from datetime import date

We can create an instance of the date class populated with today’s day, month and year with the today function.

Code:

from datetime import date

today = date.today()
print(today)

Output:

2015-06-09

A date instance responds to day, month and year.

Code:

print("{0}-{1}-{2}".format(today.day, today.month, today.year))

Output:

9-6-2015

We can create an instance of a date with the constructor which takes the year, month and day as integers.

Code:

print(date(1978, 10, 25))

Output:

1978-10-25

The date class is limited to a range of 0001-01-01 to 9999-12-31 with increments of 1 day. The min, max and resolution properties can be used to return this information.

Code:

print("Min = {0}, Max = {1}, Resolution = {2}".format(date.min, date.max, date.resolution))

Output:

Min = 0001-01-01, Max = 9999-12-31, Resolution = 1 day, 0:00:00

The date class is immutable but we can use the replace function to create a new instance replacing any of the year, month and day fields; all parameters are optional.

Code:

print(today.replace(1, 2, 3))

Output:

0001-02-03

We can determine the day of the week as an integer with the weekday and isoweekday functions. Weekday returns 0-6 for Monday to Sunday while isoweekday returns 1-7 for Monday to Sunday.

Code:

print(today.weekday())
print(today.isoweekday())

Output:

1
2

Date objects support basic operators, the most useful being the minus operator, which allows us to determine the difference between two dates. It returns a timedelta instance.

Code:

print(today.replace(2016) - today )

Output:

366 days, 0:00:00

Date Time

The datetime class is similar to the date class but it also holds state for time. It is located within the datetime module.

Code:

from datetime import datetime

We can get an instance of datetime populated as the current date and time with the today, now and utcnow functions.

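The functions today and now return the current local time, including any daylight saving offset. The utcnow function returns the current UTC time as a naive datetime (without time zone information); it is deprecated from Python 3.12 in favour of datetime.now(timezone.utc).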

Code:

print(datetime.today())
print(datetime.now())
print(datetime.utcnow())

Output:

2015-06-09 16:48:55.378254
2015-06-09 16:48:55.378304
2015-06-09 15:48:55.378328
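The datetimes printed above carry no time zone information, so nothing in the value itself says whether it is local or UTC. As a short sketch, a time zone aware value can be requested by passing timezone.utc to now, and converted to local time with astimezone.

```python
from datetime import datetime, timezone

utc_now = datetime.now(timezone.utc)     # Aware datetime: carries its UTC offset
print(utc_now.tzinfo)                    # UTC
print(utc_now.utcoffset())               # 0:00:00

local_now = utc_now.astimezone()         # Same instant expressed in the local time zone
print(local_now == utc_now)              # True: aware datetimes compare by instant
```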

Datetime can represent a date and time between the range of 0001-01-01 00:00:00, and 9999-12-31 23:59:59.999999. The minimal increment is 0.000001 seconds. This data can be retrieved with the min, max and resolution properties.

Code:

print("Min = {0}, Max = {1}, Resolution = {2}".format(datetime.min, datetime.max, datetime.resolution))

The datetime class has properties for the day, month, year as well as hour, minutes and seconds.

Code:

now = datetime.now()
print("{0}-{1}-{2} {3}:{4}:{5}".format(now.day, now.month, now.year, now.hour, now.minute, now.second))

Output:

9-6-2015 16:48:55

We can create an instance of a datetime by passing in any of the required components of state into the constructor.

Code:

print(datetime(2001, 2, 3, 4, 5, 6))

Output:

2001-02-03 04:05:06

A datetime instance is immutable though we can use the replace function to create a new instance based on another while swapping over any of the state components; all parameters are optional.

Code:

# replace(year, month, day, hour, minute, second, microsecond)
print(datetime.today().replace(1, 2, 3, 4, 5, 6, 7))

Output:

0001-02-03 04:05:06.000007

We can use basic operators between two instances of datetime, the most useful being the minus operator. This can be used to determine the time period between two dates, it returns a timedelta.

Code:

print(datetime.now().replace(year=2016) - datetime.now())

Output:

365 days, 23:59:59.999989

Formatting DateTime

Python provides the following templates which can be used with date, time and datetime where applicable when formatting them to strings.

Code  Example              Description
%a    Mon                  Name of day short
%A    Monday               Name of day
%w    0                    Day of week as integral. Sunday - Saturday = 0 - 6
%d    25                   Day of the month
%b    Jan                  Name of month short
%B    January              Name of month
%m    01                   Month (01-12)
%y    79                   Short year (last two digits)
%Y    1978                 Year (as 4 digits)
%H    18                   Hour as integral of 24 hour clock
%I    06                   Hour as integral of 12 hour clock
%p    AM                   AM/PM
%M    30                   Minute as integral
%S    30                   Second as integral
%f    989898               Microsecond as integral
%z                         UTC offset (form +HHMM or -HHMM)
%Z                         Time zone name
%j    213                  Day of the year
%U    10                   Week number of the year (Sunday as the first day of the week)
%W    10                   Week number of the year (Monday as the first day of the week)
%c    01/02/2014 12:30:55  Locale formatted date time
%x    01/02/2014           Locale formatted date
%X    12:30:55             Locale formatted time

We can use any number of these templates along with the strftime function.

Code:

print(date.today().strftime("%m-%d-%y"))
print(datetime.now().strftime("%d %b %Y %X"))

Output:

06-09-15
09 Jun 2015 17:01:34

We can also use the isoformat and ctime functions for predefined formatted representations.

Code:

print(datetime.today().isoformat())
print(date.today().isoformat())
print(datetime.today().ctime())
print(date.today().ctime())

Output:

2015-06-09T17:01:34.880119
2015-06-09
Tue Jun 9 17:01:34 2015
Tue Jun 9 00:00:00 2015
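The same templates work in the other direction with strptime, which parses a string back into a datetime. A small round trip sketch using a fixed example value:

```python
from datetime import datetime

moment = datetime(2015, 6, 9, 17, 1, 34)
layout = "%Y-%m-%d %H:%M:%S"

text = moment.strftime(layout)               # datetime -> string
print(text)                                  # 2015-06-09 17:01:34

parsed = datetime.strptime(text, layout)     # string -> datetime
print(parsed == moment)                      # True
```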

Time Delta

The timedelta class represents a duration. It is returned when subtracting or working out the difference between two dates or datetimes.

Code:

from datetime import timedelta

We can create an instance with the constructor; all parameters are optional.

Internally only days, seconds and microseconds are stored; all other arguments are converted into these units.

Code:

a_timedelta = timedelta(days=1, seconds=2, microseconds=3, milliseconds=0, minutes=0, hours=0, weeks=0)
print(a_timedelta)

Output:

1 day, 0:00:02.000003

We can access the days, seconds and microseconds by similarly named properties.

Code:

print(a_timedelta.days)
print(a_timedelta.seconds)
print(a_timedelta.microseconds)

Output:

1
2
3
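The conversion of the other arguments into days, seconds and microseconds can be sketched as follows. The values are arbitrary and chosen only to force a carry into each stored unit.

```python
from datetime import timedelta

# 25 hours carries into days, 1500 milliseconds carries into seconds
delta = timedelta(hours=25, minutes=1, milliseconds=1500)

print(delta.days)          # 1
print(delta.seconds)       # 3661 (1 hour + 1 minute + 1 second)
print(delta.microseconds)  # 500000
```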

A time period is made up of the microseconds, seconds and days all together. We can use the total_seconds function to get the entire time range within seconds.

Code:

print(a_timedelta.total_seconds()) # Seconds contained in days, seconds and microseconds

Output:

86402.000003

A timedelta can hold data within a range from -999999999 days, 0:00:00 seconds to 999999999 days, 23:59:59.999999 seconds in increments of 0.000001 seconds. The min, max and resolution properties can be used to return this information.

Code:

print("Min = {0}, Max = {1}, Resolution = {2}".format(timedelta.min, timedelta.max, timedelta.resolution))

Output:

Min = -999999999 days, 0:00:00, Max = 999999999 days, 23:59:59.999999, Resolution = 0:00:00.000001

The print, str and repr functions can be used to convert an instance into a string.

Code:

print(a_timedelta)
print(str(a_timedelta))
print(repr(a_timedelta))

Output:

1 day, 0:00:02.000003
1 day, 0:00:02.000003
datetime.timedelta(1, 2, 3)

We can use a timedelta to add or subtract a time period onto a date or datetime.

Code:

today = datetime.now()
yesterday = datetime.now() - timedelta(days=1)
print(today - yesterday)
print((today - yesterday).total_seconds())

Output:

23:59:59.999537
86399.999537
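A deterministic sketch of the same arithmetic, using a fixed date (an illustrative value) rather than now():

```python
from datetime import date, timedelta

start = date(2015, 6, 9)
one_week = timedelta(weeks=1)

later = start + one_week    # add a period onto a date
earlier = start - one_week  # subtract a period from a date

print(later)
print(earlier)
```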

Time

The time module provides lower level functions for working with a time along with its date.

Code:

import time

We can grab the current time and date with the time function. It returns the number of seconds since the epoch (00:00:00 UTC on January 1, 1970) as a float.

Code:

print(time.time())

Output:

1433866491.5349488

We can convert this to something a little more human readable with the localtime function.

Code:

print(time.localtime(time.time()))

It returns a struct_time which is a named tuple.

Output:

time.struct_time(tm_year=2015, tm_mon=6, tm_mday=9, tm_hour=17, tm_min=14, tm_sec=51, tm_wday=1, tm_yday=160, tm_isdst=1)

The following defines the struct_time tuple.

Index  Attribute  Description      Range
0      tm_year    Year             Any int
1      tm_mon     Month            1 to 12
2      tm_mday    Day of month     1 to 31
3      tm_hour    Hour             0 to 23
4      tm_min     Minutes          0 to 59
5      tm_sec     Seconds          0 to 61 where 60 is a leap second and 61 is kept for historical reasons
6      tm_wday    Day of week      0 to 6 where 0 is Monday
7      tm_yday    Day of year      1 to 366 (Julian day)
8      tm_isdst   Daylight saving  1=y, 0=n, -1=library determines DST

We can use the asctime function to format a struct_time into a string.

Code:

print(time.asctime(time.localtime(time.time())))

Output:

Tue Jun 9 17:14:51 2015
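As a struct_time is a named tuple, its fields can be read by attribute or by index. The sketch below uses time.gmtime(0), which is not used elsewhere in this article, to build a struct_time for the epoch in UTC so the values are predictable.

```python
import time

# gmtime converts seconds since the epoch into a struct_time in UTC
epoch = time.gmtime(0)

print(epoch.tm_year, epoch.tm_mon, epoch.tm_mday)
print(epoch[0] == epoch.tm_year)   # index 0 is the year
print(epoch.tm_wday)               # 0 is Monday, so Thursday is 3
print(time.asctime(epoch))
```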

TimeIt

The timeit module provides stopwatch style functionality for timing the running of code.

Let's take a simple function which performs a loop and does some multiplication.

Code:

import timeit

def function_to_time(max_value):

    start = 0

    for count in range(max_value):
        start = start ** max_value

We can use the Timer class in timeit to run the function a set number of times and then return the time it required to run it.

In the following we run our function 100, 200 and then 300 times with a value of 100.

Code:

t = timeit.Timer(lambda: function_to_time(100))

for number in [100, 200, 300]:
    print("{0}: {1}".format(number, t.timeit(number=number)))

Output:

100: 0.004333864999352954
200: 0.009334164000392775
300: 0.013926845000241883

The example above used a lambda expression, though timeit also allows the code to be run to be passed in as a string. Below we loop through 0 to 99 and join all the numbers with a hyphen.

Code:

for number in [100, 200, 300]:
    print("{0}: {1}".format(number, timeit.timeit('"-".join(str(n) for n in range(100))', number=number)))

Output:

100: 0.0035422290002316004
200: 0.006914595000125701
300: 0.009374900999318925
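When the code under test needs some preparation that should not be timed, the Timer class accepts a setup string, and its repeat function runs the whole measurement several times. A minimal sketch follows; the statement and the repeat counts are arbitrary choices for illustration.

```python
import timeit

timer = timeit.Timer(
    stmt='"-".join(numbers)',                        # the code being timed
    setup='numbers = [str(n) for n in range(100)]',  # run once per measurement, not timed
)

# Three measurements of 1000 runs each
timings = timer.repeat(repeat=3, number=1000)

# The minimum is usually the least disturbed by other processes
print(min(timings))
```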

Python: Unit Testing

Unit Testing

The ability for software to test and diagnose itself is a powerful feature.

A Simple Example

Let's take a simple function which adds two numbers together.

Code:

def add_two_numbers(a, b):
    """
    A simple method to test
    """

    return a + b

We can create a test to ensure that add_two_numbers works as expected by comparing the result of a call to the function with our expected result.

Code:

from unittest import TestCase, main

class MyTestClass(TestCase):
    """
    A simple unit test example
    """

    def test_add_two_numbers(self):
        self.assertEqual(add_two_numbers(1, 2), 3)

A test class inherits from unittest.TestCase. All functions whose names start with test will be treated as tests which are required to be run.

Above we call add_two_numbers with parameters 1 and 2. We then use the returned value as a parameter to the assertEqual function along with our expected result of 3.

If the assertion holds, the call returns and control carries on; otherwise an error is raised and the test is marked as failed.

A test function can have any number of assertions called.

We can run our test function with the main function from unittest.

Code:

if __name__ == '__main__':
    main()

Output:

.py::MyTestClass true
Testing started at 12:42 …

Process finished with exit code 0

If a bug appeared in our code we would see a result similar to the following.

Output:

.py::MyTestClass true
Testing started at 12:44 …

Process finished with exit code 0

Failure
Traceback (most recent call last):
  File "/data/data/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/Testing/unittest_examples/simple_example.py", line 26, in test_add_two_numbers
    self.assertEqual(add_two_numbers(1, 2), 4)
AssertionError: 3 != 4

Assertions

In the previous section we saw the assertEqual assertion. The unittest module provides many assertion functions to cater for a range of possible test criteria.

Equals Assertions

Equality can be asserted with the assertEqual function and inequality with the assertNotEqual function. Both functions take two parameters: the result and the expected result.

Code:

self.assertEqual(1, 1)
self.assertNotEqual(1, 2)

For numerical results the assertAlmostEqual and assertNotAlmostEqual functions allow equality assertion within a tolerance of error. The tolerance is passed in as the third parameter and represents the number of decimal places to be used when determining equality.

The call to assertAlmostEqual takes 1.1 and 1.11 with a tolerance of 1 d.p. This would fail if we used assertEqual, but the difference between the two values rounds to zero at 1 d.p. and therefore the assertion passes. At 2 d.p. the difference rounds to 0.01, so assertNotAlmostEqual passes.

Code:

self.assertAlmostEqual(1.1, 1.11, 1)  # 3rd argument is the precision
self.assertNotAlmostEqual(1.1, 1.11, 2)  # 3rd argument is the precision
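The rounding check can be sketched outside of a test class. Here a bare TestCase instance is created only to borrow its assertion functions; the last call uses the delta parameter, an alternative way of stating the tolerance which is not covered in the text above.

```python
from unittest import TestCase

case = TestCase()  # used only for its assertion functions

# The difference is rounded to the given number of places and compared with zero
same_at_one_place = round(1.1 - 1.11, 1) == 0
same_at_two_places = round(1.1 - 1.11, 2) == 0
print(same_at_one_place, same_at_two_places)

case.assertAlmostEqual(1.1, 1.11, 1)
case.assertNotAlmostEqual(1.1, 1.11, 2)

# delta states the tolerance as an absolute difference instead of decimal places
case.assertAlmostEqual(1.1, 1.11, delta=0.05)
```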

The assertEqual function can take most types. All of the following asserts for lists, tuples, sets, dictionaries and multi-line strings pass assertion.

When being called for collections, the test requires both collections to be of the same type, contain the same number of elements and the elements at the same ordinal position to be equal.

Code:

self.assertEqual([1, 2, 3], [1, 2, 3]) # list
self.assertEqual((1, 2, 3), (1, 2, 3)) # tuple
self.assertEqual({1, 2, 3}, {1, 2, 3}) # set
self.assertEqual({'a': 1}, {'a': 1})  # dictionary
self.assertEqual("one\ntwo", "one\ntwo") # multi-line string

Unittest does provide specific assert equal functions for each type, though these are called automatically by the assertEqual function when both arguments are of the same type. You should favour using the assertEqual function.

Code:

self.assertListEqual([1, 2, 3], [1, 2, 3])
self.assertTupleEqual((1, 2, 3), (1, 2, 3))
self.assertSetEqual({1, 2, 3}, {1, 2, 3})
self.assertDictEqual({'a': 1}, {'a': 1})
self.assertMultiLineEqual("one\ntwo", "one\ntwo")

The assertEqual function works upon equality; as such an integer of value 1 and a float of value 1.0 will pass an assertion check together.

Code:

self.assertEqual(1.0, 1)

Booleans Assertions

The assertFalse and assertTrue functions ensure that a value evaluates to false or true respectively.

Code:

self.assertFalse(False)
self.assertTrue(True)

Collections Assertions

A number of assertions specifically for collections are provided.

We have already seen the assertEqual function which determines if two parameters are equal.

When working with collections this performs the following checks:

  • The collection types are equal
  • The collections contain the same number of elements
  • Each element at the same ordinal position equals that in the other collection.

The elements can be of another type as long as their values are equal. In the example below one list contains integers and the other floats but the assertion passes as the elements are equal.

Code:

self.assertEqual([1.0, 2.0, 3.0], [1, 2, 3]) 

The assertSequenceEqual function works the same as assertEqual though it will not fail if the collections are of different types. Below we ensure that the contents of a list and a tuple are equal.

Code:

self.assertSequenceEqual((1, 2, 3), [1, 2, 3])  # Checks only the sequence

The assertIn and assertNotIn functions allow checks to see if an element is contained or not contained within a collection. The check is based upon equality.

Here we check that 1 is in 1,2,3 and that 4 is not in 1, 2, 3.

Code:

self.assertIn(1, (1, 2, 3))
self.assertNotIn(4, (1, 2, 3))

The assertCountEqual function has to be a contender for the worst named function in history. This function ensures that two collections contain exactly the same elements though their order is not important.

Code:

self.assertCountEqual((1, 2, 3), (3, 2, 1))  # Badly named. This checks the elements and not their order
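The name makes more sense once duplicates are involved: each element must appear the same number of times in both collections. A small sketch, again borrowing the assertion from a bare TestCase instance:

```python
from unittest import TestCase

case = TestCase()  # used only for its assertion functions

# Same elements with the same counts, in a different order: passes
case.assertCountEqual([1, 1, 2], [2, 1, 1])

# Same distinct elements but different counts: fails
try:
    case.assertCountEqual([1, 1, 2], [1, 2, 2])
    counts_matched = True
except AssertionError:
    counts_matched = False

print(counts_matched)
```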

Comparison Assertions

The unittest module provides comparison checks in the form of less than, less than or equal to, greater than and greater than or equal to.

Code:

self.assertLess(1, 10)
self.assertLessEqual(1, 1)
self.assertGreater(10, 1)
self.assertGreaterEqual(1, 1)

Identity Assertions

Identity ensures that two parameters point to the same object instance.

In Python each object is assigned its own object id upon creation. More information can be found here.

The assertIs and assertIsNot can ensure that two objects are and are not the same instance respectively.

Code:

self.assertIs(1, 1)
self.assertIsNot(1, 2)

Variables which do not reference any data point to None. We can check whether a parameter is or is not None with the assertIsNone and assertIsNotNone functions.

Code:

self.assertIsNone(None)
self.assertIsNotNone(1)

The assertIsInstance and assertNotIsInstance functions can be used to see if a parameter holds a specific type. Here we pass a parameter holding an instance of a type along with the class that we want to ensure it is or is not an instance of.

Code:

self.assertIsInstance((), tuple)
self.assertNotIsInstance((), set)

Regular Expressions Assertions

We can use regular expressions to ensure the format of a string is as expected with the assertRegex and assertNotRegex functions.

Code:

self.assertRegex('Luke', "^[a-zA-Z]{3,4}$")
self.assertNotRegex('Lukey', "^[a-zA-Z]{3,4}$")

Exceptions Assertions

Code should throw exceptions when we want it to or when it is called incorrectly. We can use the assertRaises function to assert that not only an exception is raised but it is of a certain type.

Below we ensure that a ZeroDivisionError is raised.

Code:

with self.assertRaises(ZeroDivisionError) as ex:
    result = 1 / 0

self.assertEqual(str(ex.exception), "division by zero")

In the above example we assign the context of the raised exception to a variable ex; we can then run assertions upon ex.exception to make sure it is as expected. Here we check that the string representation of the exception is as expected. The latter check can be enforced with the assertRaisesRegex function.

Code:

with self.assertRaisesRegex(ZeroDivisionError, "^division by [a-zA-Z]{4}$"):
    result = 1 / 0
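The assertRaises function can also be given the callable and its arguments directly, which avoids the with block when only the exception type matters. A minimal sketch, where divide is a hypothetical function written for this example:

```python
from unittest import TestCase


def divide(a, b):
    """Hypothetical function under test."""
    return a / b


case = TestCase()  # used only for its assertion functions

# Callable form: assertRaises calls divide(1, 0) for us
case.assertRaises(ZeroDivisionError, divide, 1, 0)

# Context manager form: keeps the exception for further assertions
with case.assertRaises(ZeroDivisionError) as ctx:
    divide(1, 0)

print(type(ctx.exception).__name__)
```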

We can also annotate a test with the @expectedFailure decorator. Here a failure or error in the test is reported as an expected failure, while the test is reported as an unexpected success if it passes.

Code:

    @expectedFailure
    def test_expectedFailure(self):
        self.fail("This is an expected failure")

Warnings Assertions

The unittest module provides the same functions for warnings as it does for exceptions; they work in exactly the same way.

Code:

from warnings import warn

with self.assertWarns(DeprecationWarning) as wn:
    warn("deprecated", DeprecationWarning)

self.assertEqual(str(wn.warning), "deprecated")

with self.assertWarnsRegex(DeprecationWarning, "^deprecate[a-z]$"):
    warn("deprecated", DeprecationWarning)

Assertions Messages

Each assertion can optionally take a string to be used as an error message when the test fails.

Code:

self.assertFalse(True, "False is not false!")

Would report as the following:

Output:

AssertionError: True is not false : False is not false!

The following would be reported if the error message had not been provided.

Output:

AssertionError: True is not false

Failing Tests

We can fail a test in code with the fail method.

Code:

self.fail("Fail!!!")

Test Fixture

If a test class has a function called setUp, it will be run before every test function within it. If an error is raised within the setUp function then that test function will not be run and the test is reported as an error.

If a test class has a function called tearDown, it will be run after every test function within it. Provided setUp succeeded, this function will always be run after each test function regardless of whether the test passes or fails.

Code:

from unittest import TestCase


class TestFixtureExample(TestCase):

    def setUp(self):
        # Set up / initialise before a test
        # If this raises then the test function is not run
        print("In the setUp")

    def tearDown(self):
        # Destroy any resources required during the test
        # Will always be run if setUp runs regardless of tests successes
        print("In the tearDown")

    def test_fixture_one(self):
        self.assertTrue(True)

    def test_fixture_two(self):
        self.assertTrue(True)

    def test_fixture_three(self):
        self.assertTrue(True)

Output:

.py::TestFixtureExample true
Testing started at 14:17 …
In the setUp
In the tearDown
In the setUp
In the tearDown
In the setUp
In the tearDown

Test Suite

The TestSuite class can be used to register tests which can then be run with the TextTestRunner.

The addTest function can be used to add an individual test method into a TestSuite instance.

The TestLoader().loadTestsFromTestCase() function can be used to create a TestSuite with all test functions of a test class.

The TextTestRunner().run() function can then run the TestSuite passed in.

Code:

from unittest import TestSuite, TextTestRunner, TestLoader

# Test Suite
def my_test_suite():
    suite_one = TestSuite()
    suite_one.addTest(MyTestClass('test_add_two_numbers')) # Adds MyTestClass.test_add_two_numbers()

    suite_two = TestLoader().loadTestsFromTestCase(TestAssertsExample)

    return TestSuite([suite_one, suite_two])

# Run the test suite
if __name__ == '__main__':
    TextTestRunner().run(my_test_suite())

Skipping Tests

Test functions can be annotated with specific unittest decorators.

skip can be used to stop a test from running. This can also be done in code by calling the skipTest function of the test case or by raising the SkipTest exception.

skipIf can be used to stop a test from running if a boolean statement evaluates to true.

skipUnless can be used to stop a test from running unless a boolean statement evaluates to true.

Code:

from unittest import TestCase, skip, skipIf, skipUnless


class TestAttributes(TestCase):

    @skip("Test is not run")
    def test_skip(self):
        self.fail("This should not be run")

    @skipIf(True, "This is not run")
    def test_skipIf(self):
        self.fail("This should not be run")

    @skipUnless(False, "This is not run")
    def test_skipUnless(self):
        self.fail("This should not be run")

    def test_skipTest(self):
        self.skipTest("This should not be run")
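Skipped tests are recorded on the result object rather than counted as failures. A short sketch, with SkipExample as a hypothetical test class run quietly from code:

```python
import io
from unittest import TestCase, TestLoader, TextTestRunner, skip


class SkipExample(TestCase):

    @skip("Skipped by the decorator")
    def test_decorated(self):
        self.fail("Never runs")

    def test_in_body(self):
        self.skipTest("Skipped at run time")
        self.fail("Never reached")


suite = TestLoader().loadTestsFromTestCase(SkipExample)
result = TextTestRunner(stream=io.StringIO()).run(suite)

# Each entry in result.skipped is a (test, reason) pair
print(len(result.skipped), result.wasSuccessful())
```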

Python: File I/O

This article runs through reading and writing to files.

User Input

When running terminal applications we can write to the terminal stream with the print function and also collect information from the user with the input function.

The input function takes a string to display to the user prompting them to input some data.

In the following example we ask the user their name with the input function and assign it to a variable called name.

It is important to note that the runtime environment pauses the application while it is waiting for the user to enter their input.

Code:

name = input('What is your name?: ')
print("Hello", name)

File Class

The remainder of the article looks at how to read and write to physical files in various formats of data.

All file access in Python uses a file object which is created with the open function. The following shows the parameters of the open function.

Code:

a_file = open("FileName.txt", "r")  # mode is r, w, a or r+, optionally followed by b

The first parameter is the file path of the file to read or write, the second parameter is a string representing the access type and file type required.

File access can be defined as permutations of read or write.

Access Description
r Readonly, which is the default
w Write, overwrite all existing data
a Write Append, opened at the end of file for appendage
r+ Allow read and write access

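By default the file is opened for text access. Unless the encoding parameter is passed to open, the text encoding is taken from the platform, which is commonly UTF-8. The usage of b along with the access type defines binary access.

Type Description
[none] Text access (the default)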
b Binary access

With Statement

As a file handle is an expensive resource you should always call close upon the file after you have finished working with it.

Code:

a_file = open("afile.txt")
# Actions upon the file.
a_file.close()

As you can never guarantee that your code will run all the way through to the calling of the close function without an error, it is best practice to place protection around your code to ensure it is always called. You could place the close() within the finally statement of a try block, though Python provides the with statement which is easier and more elegant to use.

Code:

with open(text_file, "w") as f:
    # Code within with scope
    pass

# Code outside of with scope.

Here the close function of the file is automatically called upon leaving the with scope regardless of whether an error is thrown.
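This behaviour can be checked with the closed property of the file. The sketch below writes to a file in a temporary directory (a hypothetical location chosen so the example does not touch real data) and raises an error part way through the with scope.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

try:
    with open(path, "w") as f:
        f.write("partial")
        raise RuntimeError("Simulated failure inside the with scope")
except RuntimeError:
    pass

# The file was closed on the way out, even though an error was raised
print(f.closed)
```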

Path Module

The path module from within the os name space provides a handy function called join. Here we can provide a starting directory along with any number of directories and a file name to join.

The advantage of using the join method is that it will always use the correct path separator character regardless of which operating system your code is running on.

In the following example we join a directory called output to a file named output.txt to create a relative file path.

Code:

from os import path

text_file = path.join('output', 'output.txt')

Output:

output/output.txt
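The separator used by join is available as os.sep, so the result of join can be compared with joining the parts by hand. A small sketch, where the reports directory is a hypothetical extra level:

```python
import os
from os import path

parts = ["output", "reports", "output.txt"]

joined = path.join(*parts)       # any number of parts can be passed
by_hand = os.sep.join(parts)     # os.sep is '/' on Linux and '\\' on Windows

print(joined)
print(joined == by_hand)
```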

Using this relative file path will create a file relative to the current working directory, which is normally the directory the program was started from; ideal for us to test stream usage.

Writing Text Files

To write to a file as text we simply need to create a file handle with the open function, as mentioned above, along with the file path and the ‘w’ access mode. The file will be written as text using the default encoding.

Below we loop through a list of strings and write them to the file with the write function. We also add a new line character after each string by writing “\n” to the stream.

Code:

from os import path

text_file = path.join('output', 'output.txt')

data = ["This is a list of strings", "which need to be saved", "into a file"]

with open(text_file, "w") as f:
    for a_line in data:
        f.write(a_line)
        f.write("\n")

File Contents:

This is a list of strings
which need to be saved
into a file

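We could have used the writelines function to write all the strings contained in a collection. Note that writelines does not add a newline character after each string, so here the strings would be run together on one line.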

Code:

with open(text_file, "w") as f:
    f.writelines(data)
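As writelines writes each string exactly as given, the newline characters have to be supplied by the caller. A minimal sketch using a generator expression follows; the temporary directory is an assumption made so the example is self-contained.

```python
import os
import tempfile

data = ["This is a list of strings", "which need to be saved", "into a file"]
target = os.path.join(tempfile.mkdtemp(), "output.txt")

with open(target, "w") as f:
    # Append the newline to each string as it is written
    f.writelines(line + "\n" for line in data)

with open(target, "r") as f:
    lines = f.readlines()

print(lines)
```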

Reading Text Files

We can read the contents of the file above by providing the same file path but changing the file access to readable by changing the ‘w’ to ‘r’.

We can then read all lines into a list with the readlines function.

Code:

with open(text_file, "r") as f:
    print(f.readlines())

Output:

['This is a list of strings\n', 'which need to be saved\n', 'into a file\n']

Notice above that we have a newline character after each line in the file. Python does not automatically strip the newline character off when reading each line.

We can also read one line at a time with the readline function. This would require writing an infinite loop and manually breaking when the readline function stops returning data.

This is very long winded for Python; instead we can iterate over the file handle! Each iteration is a line in the file and the loop stops automatically when we reach the end of the file.

Code:

with open(text_file, "r") as f:
    for a_line in f:
        print(a_line, end='')  # The file has new line chars also print adds one on by default

Output:

This is a list of strings
which need to be saved
into a file

We use end='' to prevent the print function automatically adding on a newline character after each output.

In the following example we use a list comprehension to iterate through the file, strip off the newline character with the rstrip function and add each line into a list which we assign to a variable called lines.

Code:

with open(text_file) as f:
    lines = [line.rstrip('\n') for line in f]
print(lines)

Output:

['This is a list of strings', 'which need to be saved', 'into a file']

The list constructor can take any iterable; as such we can pass the file handle into the constructor to read each line into an instance of a list.

Code:

print("\nRead with list():", text_file)
with open(text_file, "r") as f:
    print(list(f))

Seek & Tell

Python maintains the current location of the file handler with a marker.

When a file is opened the marker is normally initially set at the very start of the file. Writing a file as append actually sets the marker to the very end of the file.

Calling readline on a file handle will read all the contents from the current marker position up to the next newline character. It will then move the marker to the character after the newline it has read.

The position of the marker can be read and set with the tell and seek functions respectively.

Tell returns the position of the marker from the start of the file as bytes.

Seek sets the position of the marker by defining the number of bytes from the start of the file.

The following example opens a file, reads all the content and then resets the marker to the start of the file. Tell is used to show the position throughout the example.

Code:

with open(text_file, "r") as f:
    print("Tell: ", f.tell())
    contents = list(f)
    print("Tell: ", f.tell())
    f.seek(0)
    print("Tell: ", f.tell())

Output:

Tell: 0
Tell: 61
Tell: 0
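The byte positions are easiest to follow with binary access, where tell always returns a plain count of bytes. A small sketch, writing a hypothetical two line file to a temporary directory:

```python
import os
import tempfile

target = os.path.join(tempfile.mkdtemp(), "seek.txt")

with open(target, "wb") as f:
    f.write(b"first\nsecond\n")

with open(target, "rb") as f:
    f.readline()                 # reads b"first\n" and moves the marker past it
    position = f.tell()          # 5 characters plus the newline
    f.seek(0)                    # back to the start of the file
    first_again = f.readline()

print(position)
print(first_again)
```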

JSON

JavaScript Object Notation is an open standard format that uses human-readable text to pass data or objects. It aims to be easier to read than formats such as XML. It uses attribute value pairs to represent the data.

Its main use is to pass data between a server and web application; web services and AJAX calls.

Python provides a json module as a JSON parser for both serialisation of a Python type to a JSON string and de-serialisation back to the Python type.

Let's take a dictionary and populate it.

Code:

data = {"numbers": [1, 2, 3, ],
        "written numbers": ['one', 'two', 'three'],
        "characters": ['a', 'b', 'c']}

print(data)
print(type(data))

Output:

{'written numbers': ['one', 'two', 'three'], 'numbers': [1, 2, 3], 'characters': ['a', 'b', 'c']}
<class 'dict'>

We can serialise the dictionary to JSON with the dumps function in the json module.

Code:

import json

json_string = json.dumps(data)
print(json_string)
print(type(json_string))

Output:

{"written numbers": ["one", "two", "three"], "numbers": [1, 2, 3], "characters": ["a", "b", "c"]}
<class 'str'>

The type has now changed to a string, though the output looks almost the same as when printing the dictionary. This is because the way Python prints a dictionary closely resembles JSON; the visible difference here is that JSON always uses double quotes around strings.

We can de-serialise the JSON string back to a dictionary with the loads function in the json module.

Code:

decoded_data = json.loads(json_string)
print(decoded_data)
print(type(decoded_data))

Output:

{'written numbers': ['one', 'two', 'three'], 'numbers': [1, 2, 3], 'characters': ['a', 'b', 'c']}
<class 'dict'>
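A round trip through dumps and loads gives back equal data for dictionaries, lists, strings and numbers. One thing to be aware of is that JSON has no tuple type, so a tuple comes back as a list. A short sketch with made up data:

```python
import json

data = {"numbers": [1, 2, 3], "pair": (1, 2)}

json_string = json.dumps(data)
round_tripped = json.loads(json_string)

print(json_string)
print(round_tripped)
print(round_tripped == data)   # False, the tuple has become a list
```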

The json module also provides the ability to serialise to a text file and de-serialise from a text file with the dump and load functions which both take a file instance.

Code:

print("Dumping:", json_file_path)
with open(json_file_path, "w") as f:
    json.dump(data, f)

print("Loading:", json_file_path)
with open(json_file_path, "r") as f:
    loaded_json = json.load(f)
    print("\t", loaded_json)
    print("\t", type(loaded_json))

Output:

Dumping: output/json.dump
Loading: output/json.dump
{'written numbers': ['one', 'two', 'three'], 'numbers': [1, 2, 3], 'characters': ['a', 'b', 'c']}
<class 'dict'>

The dumps and dump functions allow us some configuration points for formatting the JSON string.

The sort_keys parameter, defaulting to false, allows the dictionary to be sorted based upon the key value.

The indent parameter defines the number of characters to be used as indentation between nested elements within a collection.

The separators parameter takes a tuple of two strings: the separator placed between elements and the separator placed between a key and its value.

Code:

print(json.dumps(data, sort_keys=True, indent=2, separators=(',', ':')))

Output:

{
  "characters":[
    "a",
    "b",
    "c"
  ],
  "numbers":[
    1,
    2,
    3
  ],
  "written numbers":[
    "one",
    "two",
    "three"
  ]
}

Pickle

JSON is great where compatibility or human readability is required. However if you simply want to persist the state of an object to read it later Python provides pickle; an inbuilt binary format. This will often be more efficient than text formats.

The dump function is used to serialise an object while the load function is used to de-serialise back to an object.

The file parameter takes a handle to a file instance representing the destination or source file.

Code:

import pickle

# Serialisation
try:
    with open(pickle_file, "wb") as output_file:
        pickle.dump(data, file=output_file)
except IOError as err:
    print('File error: ' + str(err))
except pickle.PickleError as pickle_error:
    print('Pickling error: ' + str(pickle_error))

Code:

# Deserialization
try:
    with open(pickle_file, "rb") as input_file:
        loaded_data = pickle.load(input_file)
        print("\t", loaded_data)
        print("\t", type(loaded_data))
except IOError as err:
    print('File error: ' + str(err))
except pickle.PickleError as pickle_error:
    print('Pickling error: ' + str(pickle_error))

Output:

*** The raw data:
{'characters': ['a', 'b', 'c'], 'written numbers': ['one', 'two', 'three'], 'numbers': [1, 2, 3]}

Dumping to: output/pickle.data
Loading from: output/pickle.data
{'characters': ['a', 'b', 'c'], 'written numbers': ['one', 'two', 'three'], 'numbers': [1, 2, 3]}
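The pickle module also has dumps and loads functions which work with bytes in memory rather than a file. The sketch below uses them to show that the round trip produces an equal but separate object.

```python
import pickle

data = {"numbers": [1, 2, 3],
        "written numbers": ['one', 'two', 'three'],
        "characters": ['a', 'b', 'c']}

payload = pickle.dumps(data)      # serialise to bytes
restored = pickle.loads(payload)  # de-serialise back to a dictionary

print(type(payload).__name__)
print(restored == data)   # same contents
print(restored is data)   # but a new object
```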

Python: Exceptions

Code does not always work as intended. Even if the perfect system could exist, it is not possible to guard against every possible situation, from badly formed data files and user input to the network going down.

Expecting errors is an integral part of coding, and error handling provides the developer with the ability to respond when an error has occurred, whether that is reversing a database transaction, releasing expensive resources or simply logging and reporting the error.

Try Catch

Like most languages, the try catch statement is the basic building block of error handling. In Python the catch part is spelt except.

The try statement defines an area where we would like the run time environment to allow us the opportunity to respond to errors being raised.

The except statement contains the code which will run when an error is raised.

The basic syntax looks like this.

Code:

try:
    # Code
    pass
except:
    # Error handling code
    pass

If any code after the try statement and before the except statement causes an error, execution stops immediately and resumes at the top of the except statement.

After the except statement has run, control will be passed to the first line outside of the try catch block, allowing the program to carry on as if no error has been raised.

Code:

def convert_to_int(input_value):
    try:
        x = int(input_value)
        print("{0} can be converted into an int of {1}".format(input_value, x))
    except:
        print("An error was caught!")

for an_input in ["1", "a", "b"]:
    print("\nTrying with:", an_input)
    convert_to_int(an_input)

Output:

Trying with: 1
1 can be converted into an int of 1

Trying with: a
An error was caught!

Trying with: b
An error was caught!

Catch An Exception

If we want to find out more about the error raised we can explicitly catch the exception and assign it to a variable. This will allow us to report the exception to the user or the log system as required.

The example above has been modified by catching the exception into the variable called ex in the except statement. In the error handling code we print the exception to the terminal.

Code:

def convert_to_int(input_value):
    try:
        x = int(input_value)
        print("{0} can be converted into an int of {1}".format(input_value, x))
    except Exception as ex:
        print("The following exception was caught:")
        print(ex)

for an_input in ["1", "a", "b"]:
    print("\nTrying with:", an_input)
    convert_to_int(an_input)

Output:

Trying with: 1
1 can be converted into an int of 1

Trying with: a
The following exception was caught:
invalid literal for int() with base 10: 'a'

Trying with: b
The following exception was caught:
invalid literal for int() with base 10: 'b'

Exception Granularity

The Exception type is a class which can be inherited from. Python ships with many subclasses of Exception so that code can raise an exception which is specific to the error that occurred.

In code we might want to act differently based upon the error being raised. For example if the network is down we might want to retry but if we have bad data from the user we might want to allow the user to re-input the data.

Python allows catching exceptions by their type as well as the general catch statement which we have seen above.

We can provide multiple except statements all catching a different exception type which in turn allows us to respond differently based upon the error being raised.

Code:

try:
    f = open('foo.txt')
    s = f.readline()
    i = int(s.strip())
except IOError as err:
    print("IOError: {0}".format(err))
except ValueError as err:
    print("ValueError: {0}".format(err))
except Exception as err:
    print("Exception: {0}".format(err))
except:
    print("Only reached for errors which do not inherit from Exception, such as KeyboardInterrupt")

Output:

IOError: [Errno 2] No such file or directory: 'foo.txt'

An except statement will run if the exception types are the same or the error being raised has the defined exception type in its ancestry; i.e it inherits directly or indirectly from the defined exception type.

Only one exception statement will run so you should be careful to ensure your exceptions are placed from the most specific to the least specific.

The example above catches the Exception type last which will catch all errors being raised as long as they have not already been caught.
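The ordering rule can be sketched with a small helper. The function name classify is hypothetical and only exists for this illustration; it raises whichever error it is given and reports which handler ran. FileNotFoundError inherits from OSError, so it has to be listed first to be reachable.

```python
def classify(error):
    """Hypothetical helper: raise the error and report which handler ran."""
    try:
        raise error
    except FileNotFoundError:      # most specific first
        return "FileNotFoundError handler"
    except OSError:                # parent of FileNotFoundError
        return "OSError handler"
    except Exception:              # everything else
        return "Exception handler"


print(issubclass(FileNotFoundError, OSError))
print(classify(FileNotFoundError("missing")))
print(classify(PermissionError("denied")))
print(classify(ValueError("bad data")))
```

If the OSError handler were placed first it would also swallow every FileNotFoundError and the more specific handler would never run.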

Exception Details

Like all types in Python, the exception is a class and as such contains state and behaviour.

We can write the exception summary to a string with the __str__ method which is called by string format or the print function.

The __traceback__ attribute holds a traceback object which can be used to read the call stack at the time when the exception was raised.

The args attribute holds the arguments which were passed to the exception when it was created.

Code:

try:
    1 / 0
except Exception as err:
    print("Exception: {0}".format(err))
    print(err)
    print(err.__traceback__)
    print(err.args)

Output:

Exception: division by zero
division by zero
<traceback object at 0x...>
('division by zero',)

Alternatively the sys.exc_info function returns a tuple of the type, the value and the traceback of the exception currently being handled.

Code:

import sys

try:
    f = open('foo.txt')
    s = f.readline()
    i = int(s.strip())
except:
    print("Catch!!")
    for a_msg in sys.exc_info():
        print(a_msg)

Output:

Catch!!
<class 'FileNotFoundError'>
[Errno 2] No such file or directory: 'foo.txt'
<traceback object at 0x...>
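As the tuple always has three parts it can be unpacked into separate variables. The following is a minimal sketch which also uses the standard traceback module to build the one line summary; the variable names are only chosen for this illustration.

```python
import sys
import traceback

try:
    int("a")
except ValueError:
    # exc_info returns (type, value, traceback) for the active exception
    exc_type, exc_value, exc_tb = sys.exc_info()
    summary = traceback.format_exception_only(exc_type, exc_value)
    print(exc_type.__name__)
    print(summary[-1].strip())
    print("Raised on line", exc_tb.tb_lineno)
```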

Try Catch Finally

Python also allows a finally statement with a try catch block. The code in the finally statement is run whether or not an exception is raised and whether or not a raised exception is caught.

  • Iteration 1 raises no error.
  • Iteration 2 raises an error which is caught
  • Iteration 3 raises an error which is not caught.

After the finally statement has run for iteration three the program is terminated because the exception is not caught.

Code:

def raise_if_true(arg_input):
    try:
        if arg_input == 2:
            raise ValueError("Input was 2")
        elif arg_input == 3:
            raise Exception("Input was 3")
    except ValueError as exception:
        print("Caught:", exception)
    finally:
        print('This is the finally!!!!')

for number in [1, 2, 3]:
    raise_if_true(number)

Output:

This is the finally!!!!
Caught: Input was 2
This is the finally!!!!
This is the finally!!!!
Traceback (most recent call last):
  File "/home/lukey/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/Exceptions/try_finally_example.py", line 18, in <module>
    raise_if_true(number)
  File "/home/lukey/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/Exceptions/try_finally_example.py", line 11, in raise_if_true
    raise Exception("Input was 3")
Exception: Input was 3
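The finally statement also runs when the try block leaves early with a return. A minimal sketch, using a hypothetical function called read_value and a list to record the order of events:

```python
events = []


def read_value():
    """Hypothetical function used to show when finally runs."""
    try:
        events.append("try")
        return "value from try"
    finally:
        # Runs after the return value is computed but before the caller receives it
        events.append("finally")


result = read_value()
print(result)
print(events)
```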

Try Catch Finally Else

The else statement can be added onto a try statement to allow an area of code which will be run if no error is raised.

All variables created in the try statement are available in the else statement. Here we assign the result to a variable called result in the try block and then access it within the else statement.

Code:

def divide(x, y):
    try:
        print("\nPerforming: {0} / {1}".format(x, y))
        result = x / y
    except ZeroDivisionError:
        print("division by zero!")
    else:
        print("Result =", result)
    finally:
        print("Executing the finally clause")

divide(1, 2)
divide(1, 0)

Output:

Performing: 1 / 2
Result = 0.5
Executing the finally clause

Performing: 1 / 0
division by zero!
Executing the finally clause

Re-Throwing An Exception

We can re-throw an exception after we have finished handling it. This can be useful if we want the program to finish executing or we would like an outer try catch block to also catch and respond to the error.

We re-throw an error with the raise keyword.

In the following example we re-throw an exception which has been caught. As the try statement is not nested the program will terminate immediately.

Code:

try:
    f = open('foo.txt')
    s = f.readline()
    i = int(s.strip())
except:
    print("Caught!!")
    raise

print("This won't print!!!")

Output:

Caught!!
Traceback (most recent call last):
  File "/home/lukey/Dropbox/Development/SandBox/Git/ThePythonPit/PythonSandBox/Exceptions/rethrowing_an_exception.py", line 6, in <module>
    f = open('foo.txt')
FileNotFoundError: [Errno 2] No such file or directory: 'foo.txt'
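When the try statement is nested, the re-thrown exception travels to the outer handler instead of terminating the program. A minimal sketch, assuming a hypothetical function called parse_number:

```python
def parse_number(text):
    """Hypothetical function: handle the error locally, then pass it on."""
    try:
        return int(text)
    except ValueError:
        print("Inner handler: noting the error and re-raising")
        raise  # re-raise the exception currently being handled


try:
    parse_number("a")
except ValueError as err:
    outcome = "Outer handler caught: {0}".format(err)
    print(outcome)
```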

Raising An Exception

There might be times in your code where you want to raise an exception to trigger common error handling code which exists higher up in the method stack.

An error can be raised by simply creating an instance of the Exception class or any class which inherits from Exception along with the raise command.

The Exception class gathers all constructor arguments and places them into the args collection.

Code:

try:
    raise Exception('spam', 'eggs')
except Exception as inst:
    print(inst)
    print(inst.args)

Output:

('spam', 'eggs')
('spam', 'eggs')

Subclassing Exceptions

Any class which has the Exception type within its ancestry can be raised and caught in Python.

Inheriting from Exception allows catching to be granular as we have seen previously but it also allows us to add state and behaviour onto an exception.

In the following example we subclass Exception to allow a field called value to be set when the error is raised and read when the error is handled.

Code:

class MyError(Exception):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return repr(self.value)

try:
    raise MyError(2 * 2)
except MyError as e:
    print(e.value)
    print(e)

Output:

4
4
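A subclass can also pass its message up to the Exception constructor so that args and the default string summary keep working. The following is a sketch under that assumption; ValidationError and its field attribute are hypothetical names for this illustration.

```python
class ValidationError(Exception):
    """Hypothetical error type which records the offending field name."""

    def __init__(self, field, message):
        # Passing the message up populates args and the default __str__
        super().__init__(message)
        self.field = field


try:
    raise ValidationError("age", "must be a positive number")
except ValidationError as err:
    caught = (err.field, str(err), err.args)
    print(caught)
```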

Python: Strings

Basics

Strings in Python can be created with either single or double quotes.

Code:

print('single quotes')
print("double quotes")

Output:

single quotes
double quotes

What is great about having the choice is that you can swap between the two when you need a string which contains the other without having to escape any characters.

Code:

print(' " double quotes" ')
print(" ' single quotes' ")

Output:

 " double quotes" 
 ' single quotes' 

The traditional backslash "\" can still be used to escape special characters.

Code:

print(' \' single escaped quotes \' ')
print(" \" double escaped quotes \" ")

Output:

 ' single escaped quotes ' 
 " double escaped quotes " 

When there are many special characters which require escaping, a raw string can be created by simply prefixing the string with an r. The contents of the string are then taken verbatim.

Code:

print(r"This\is\a\raw\string\@ in c#")

Output:

This\is\a\raw\string\@ in c#

Strings which are very long can be split across multiple lines.

Code:

print(("This can help"
       " separate Strings"))

print("This can also help \
separate Strings")

Output:

This can help separate Strings
This can also help separate Strings

Collections Of Characters

Strings are nothing more than an immutable sequence of characters. A lot of the functionality applicable to collections is also applicable to strings.

Code:

a_string = 'HelloWorld!'
print(type(a_string))       # The string class
print(len(a_string))        # The length
print(a_string[0])          # Char at index 0
print(a_string[2:5])        # Chars at index 2 up to, but not including, 5

Output:

<class 'str'>
11
H
llo
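As strings are immutable, an element cannot be replaced in place; a new string has to be built instead. A minimal sketch, where new_string is just a name chosen for this illustration:

```python
a_string = 'HelloWorld!'

try:
    a_string[0] = 'J'          # strings do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# Build a new string from a literal and a slice of the old one
new_string = 'J' + a_string[1:]
print(new_string)
print(a_string)                # the original is unchanged
```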

Concatenation

Strings can be concatenated with the + operator or replicated with the * operator.

Code:

a_string = 'HelloWorld!'
print(a_string + "TEST")    
print(a_string * 2)

Output:

HelloWorld!TEST
HelloWorld!HelloWorld!

The format function can also be used when concatenating strings. Here parameters are placed into template holders of a string.

The templates are defined as ordinal positions {0} or as names {name}. They are then filled in by the parameter with the same ordinal position or name. They can even be mixed in the same function call.

Code:

print('{0} can have {1} templates or placeholders'.format('Strings', 'ordinal'))
print('{first} can have {second} templates or placeholders'.format(first='Strings', second='named'))
print('{0} and {named} placeholders can be mixed!'.format('Ordinal', named='named'))

Output:

Strings can have ordinal templates or placeholders
Strings can have named templates or placeholders
Ordinal and named placeholders can be mixed!

The format function also allows key value pair references to be displayed with template notation.

Code:

data_table = {'One': 'And a one!', 'Two': 'And a two!'}
print(type(data_table), data_table)
print('One: {0[One]:s}; Two: {0[Two]:s}'.format(data_table))
print('One: {One:s}; Two: {Two:s}'.format(**data_table))        # With Unpacking

Output:

<class 'dict'> {'Two': 'And a two!', 'One': 'And a one!'}
One: And a one!; Two: And a two!
One: And a one!; Two: And a two!

The older versions of Python used implicit ordinal positioning along with the output type. This functionality is also still allowed.

Code:

print("Hello there %s %s %s" % ("Mr", "Luke", "Wickstead"))

Output:

Hello there Mr Luke Wickstead

The %s is a string format option; there are many parameters available for formatting types to strings. See the section "Format Specification Mini-Language" below for more options.

Padding

Strings can be padded with the rjust, ljust and center functions. These functions add spaces to the left, right or both sides respectively, until the string contains the defined number of characters. The zfill function is the same as rjust though 0's are used to pad the string instead of spaces.

Code:

print("1".rjust(3))
print("1".ljust(3))
print("1".center(3))
print("1".zfill(3))  # zeros to left

Output:

"  1"
"1  "
" 1 "
"001"
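The rjust, ljust and center functions also accept an optional second argument which is the character to pad with. A short sketch; the variable names are only for illustration:

```python
right = "1".rjust(3, "*")     # pad on the left
left = "1".ljust(3, "*")      # pad on the right
middle = "1".center(3, "*")   # pad on both sides
signed = "-1".zfill(4)        # zfill keeps the sign in front of the zeros

print(right, left, middle, signed)
```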

Format Specification Mini-Language

Overview

We have already looked at the format function for concatenating strings. Along with the ordinal or named element of the data we can also provide other criteria which will affect how the string is formatted. The full possible format for a template parameter is as follows:

Spec:

[[fill]align][sign][#][0][width][,][.precision][type]

Possible values of the spec elements:

Name        Possible Values     Description
Fill        Any character       Character used to pad the value out to the width
Align       <, >, =, ^          Left, right, padding placed after the sign (numbers only) and centre
Sign        +, -, space         + always shows a sign; - shows a sign only for negatives; space shows a leading space for positives and - for negatives
#           #                   Prefixes binary, octal or hex numbers with '0b', '0o' or '0x'
0           0                   Turns on sign-aware 0 padding up to the width
Width       Integer             Minimum total width of the formatted value in characters
,           ,                   Use commas as the thousands separator
Precision   Integer             Number of decimal places for f and e; number of significant digits for g
Type        b, c, d, e, E, f, F, g, G, n, o, s, x, X, %     Defines how the data is to be represented

The type reference equates to the following:

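Type    Name          Description
s       String        The default for strings
b       Binary        Integer in base 2
c       Character     Integer converted to the matching Unicode character
d       Decimal       Integer in base 10
o       Octal         Integer in base 8
x, X    Hex           Integer in base 16 with lower or upper case digits
n       Number        General number which uses the computer's localisation settings to format the number
None    Default       Same as d for integers and similar to g for floats
e, E    Exponential   Exponential notation with a lower or upper case exponent marker
f, F    Fixed point   Fixed number of decimal places
g, G    General       Rounds to the given precision and switches to exponential notation for very large or very small numbers
%       Percentage    Multiplies by 100 and displays in fixed point followed by a % sign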

We now look at some examples for common situations. More information can be found in the Python documentation under "Format Specification Mini-Language".

Formatting Floats

[+- ][0][width][,][.precision]
[Sign Setting][Use 0 Instead Of Space For Padding][Final Width In Characters][Use Comma As Separator][Decimal Places]

Code:

print("{0:.0f}".format(123.456))      # 0 dp
print("{0:.2f}".format(123.456))      # 2 dp
print("{0: 08,.3f}".format(123.456))  # 3 dp with padding to 8 chars
print("{0:+.3f}".format(123.456))         # 3 dp with +/-
print("{0:-.3f}".format(-123.456))        # 3 dp with - if -ve.
print("{0:,}".format(-1121212123.456))  # , (comma) separator

Output:

123
123.46
" 123.456"
+123.456
-123.456
-1,121,212,123.456

If the entity is a percentage we can round it and append a % sign as follows.

Code:

print("{0:.3%}".format(0.25555))    # Format percentage to 3dp

Output:

25.555%

Space Padding

[*][<>=^][int]
[Pad Char][Alignment Type][Final Width In Chars]

Code:

print("{0:0>5d}".format(1))     # Left pad with 0 to 5 chars
print("{0:0<5d}".format(1))     # Right pad with 0 to 5 chars
print("{0:0^5d}".format(1))     # Centre pad with 0 to 5 chars
print("{0:5d}".format(1))       # Left pad with space to 5 chars
print("{0:<5d}".format(1))      # Right pad with space to 5 chars
print("{0:^5d}".format(1))      # Centre pad with space to 5 chars

Output:

"00001"
"10000"
"00100"
"    1"
"1    "
"  1  "

Types

The last parameter is always the type. Each type can then be configured using the above settings.

Here are some examples of how to output various types in various representations.

Code:

print("{0:g}".format(1111123.456))      # As general
print("{0:5.2n}".format(123.456))       # As number
print("{0:b}".format(123))              # As Binary
print("{0:x}".format(123))              # As hex
print("{0:e}".format(123))              # As exponential

Output:

1.11112e+06
1.2e+02
1111011
7b
1.230000e+02
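The same specification strings can also be passed to the built-in format function, which takes the value and the spec without the surrounding {0:...} template. The following sketch combines several of the elements described above; the variable names are only for illustration.

```python
as_binary = format(123, "b")               # type only
as_hex = format(255, "#x")                 # '#' adds the 0x prefix
as_money = format(1234567.891, ",.2f")     # comma separator and 2 dp
as_percent = format(0.5, ".0%")            # percentage with 0 dp
zero_padded = format(42, "08.2f")          # 0 padding to a width of 8
centred = format("hi", "*^6")              # fill char, centre align, width 6

print(as_binary, as_hex, as_money, as_percent, zero_padded, centred)
```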

The possibilities are endless. It is not really possible to be exhaustive but hopefully this is enough to get people started.

Python: Collections

Collections

Collections are types which hold references to 0, 1 or more elements.

In Python collections can hold elements which are of different types.

In general, collections are mutable; i.e. their state can change by adding, removing and replacing elements.

There are, as per most languages, many types of collections, all of which exist to solve various scenarios.

List

The list type is probably the most used collection in python. It represents a mutable collection of elements which are referenced by their ordinal position.

To create an instance of a list you use the [] notation.

We can use the type function to determine the class as well as the print function to see its contents.

Code:

numbers = []
print(type(numbers), numbers)

Output:

<class 'list'> []

We can initialise a list with elements by declaring them inside the [] separated by commas.

Code:

numbers = [1, 2, "three", 4]
print(type(numbers), numbers)

Output:

<class 'list'> [1, 2, 'three', 4]

Elements can be accessed via their ordinal position, which starts at 0 and ends at size - 1, via [index]. If we try to access an element at an index which does not exist an IndexError exception is raised.

Code:

numbers = [1, 2, "three", 4]
print(numbers[0])
print(numbers[10])

Output:

1
IndexError: list index out of range

Alternatively we can ask the index of an element with the index method. If the element does not exist we get a ValueError exception raised.

Code:

numbers = [1, 2, "three", 4]
print(numbers.index('three'))
print(numbers.index('nine'))

Output:

2
ValueError: 'nine' is not in list

We can use the in operator and the not in operators to determine if a list contains an element.

Code:

numbers = [1, 2, "three", 4]
print('three' in numbers)
print('nine' not in numbers)

Output:

True
True

The len function can be used to return the length or number of elements in a collection.

Code:

numbers = [1, 2, "three", 4]
print(len(numbers))

Output:

4

Collections are dynamic and can contain any class instance, including other collections. Here we create list of lists.

Code:

print([[1], [1, 2], [1, 2, 3]])

Output:

[[1], [1, 2], [1, 2, 3]]

The append function can add an element to the end of a list.

The extend function can add all elements within another collection to the end of a list.

The insert function can be used to add an element into a specific ordinal position. All elements with an ordinal position greater than or equal to the index will have their ordinal position incremented by 1.

An element can be removed via the remove function, which takes the value to remove rather than its ordinal position and removes the first matching element. All elements after the removed element will have their ordinal position decremented by 1.

Code:

numbers = [1, 2, "three", 4]

# Append an element
numbers.append(9)
print(numbers)

# Append multiple elements
numbers.extend([10, 11])
print(numbers)

# Insert an element at an index; an index past the end adds it to the end
numbers.insert(12, 99999)
print(numbers)

# Remove the first element with the specified value
numbers.remove(10)
print(numbers)

Output:

[1, 2, 'three', 4, 9]
[1, 2, 'three', 4, 9, 10, 11]
[1, 2, 'three', 4, 9, 10, 11, 99999]
[1, 2, 'three', 4, 9, 11, 99999]
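Removing by value and removing by ordinal position are easy to mix up, so here is a short sketch of the three options side by side. The list contents are only chosen for illustration.

```python
numbers = [1, 2, "three", 4, 2]

numbers.remove(2)          # removes the first element equal to 2
after_remove = list(numbers)
print(after_remove)

popped = numbers.pop(1)    # removes and returns the element at index 1
print(popped, numbers)

del numbers[0]             # removes the element at index 0 and returns nothing
print(numbers)
```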

The + operator can be used to create a new list containing all elements in the first list followed by all elements in the second list.

Code:

print([1, 2, "three", 4] + [5, 'six', 7, 8])

Output:

[1, 2, 'three', 4, 5, 'six', 7, 8]

The * operator can be used to repeat the contents of a list x times.

Code:

print([1] * 5)

Output:

[1, 1, 1, 1, 1]

We can iterate through the elements in a collection with a for in command.

Code:

for number in [1, 2, 3]:
    print(number)

Output:

1
2
3

We can count the number of occurrences of an element within the collection with the count function.

Code:

print( [1, 1, 1, 1, 1].count(1))

Output:

5

The clear function can be used to remove all elements.

Code:

numbers = [1, 2, 3]
numbers.clear()

It is important to note that many of the list functions are available to other collection types.

Tuple

A tuple is an immutable list; i.e. once declared it is read-only and cannot have any elements added, removed or replaced.

A tuple can contain mutable objects which can themselves be changed.

All list functionality which does not affect state is applicable to tuples; slicing, indexing etc.

They are generally slightly cheaper to create and store than lists and should be used when the collection is modelling constant data.

They are an ideal contender for dictionary keys where multiple elements determine the unique key. They can only be used in dictionaries if they do not contain any mutable elements.

They are created with ().

Code:

empty = ()
a_tuple = ('abcd', 786, 2.23, 'john', 70.2)
print(empty)
print(a_tuple)
print(type(a_tuple))

Output:

()
('abcd', 786, 2.23, 'john', 70.2)
<class 'tuple'>

They are implicitly created via packing; multiple comma separated arguments being assigned to a single variable. A trailing comma is required where only one element is to be added.

Code:

tuple_one = 1, 2, 3
tuple_two = 1,

print(tuple_one)
print(tuple_two)

Output:

(1, 2, 3)
(1,)

We can also unpack a tuple's elements into separate variables; in fact this works for most collections. If the number of elements and variables do not match exactly a ValueError is raised.

Code:

one, two, three, four = (1, 2, 3, 4)
print(one, two, three, four)

Output:

1 2 3 4
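The following sketch shows the error raised on a mismatch, along with the starred form of unpacking which collects the remaining elements into a list. The variable names are only for illustration.

```python
first, *rest = (1, 2, 3, 4)    # the starred name collects the remainder as a list
print(first, rest)

try:
    one, two = (1, 2, 3)       # three elements but only two variables
except ValueError as err:
    message = str(err)
    print("ValueError:", message)
```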

Tuples support most read-only style functionality of lists.

Code:

print(a_tuple) # Prints the complete tuple
print(a_tuple[0]) # Prints the first element of the tuple
print(a_tuple[1:3]) # Prints elements from the 2nd up to and including the 3rd
print(a_tuple[2:]) # Prints all elements starting from the 3rd element
print(a_tuple * 2) # Prints the tuple two times
print(a_tuple + a_tuple) # Prints concatenated tuples

Output:

('abcd', 786, 2.23, 'john', 70.2)
abcd
(786, 2.23)
(2.23, 'john', 70.2)
('abcd', 786, 2.23, 'john', 70.2, 'abcd', 786, 2.23, 'john', 70.2)
('abcd', 786, 2.23, 'john', 70.2, 'abcd', 786, 2.23, 'john', 70.2)

One important last feature of tuples is that the elements can be named upon creation with namedtuple and then referenced as if they were attributes. The data is always read-only and trying to change it will result in an AttributeError exception being raised.

Code:

from collections import namedtuple

Person = namedtuple("Person", ["name", "age"])
a_person = Person(name="Luke", age=36)

print(a_person)
print(type(a_person))
print(a_person.name)
print(a_person.age)

Output:

Person(name='Luke', age=36)
<class '__main__.Person'>
Luke
36
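The read-only behaviour can be sketched as follows. As the fields cannot be changed, the _replace method is used to create a modified copy; the variable older is just a name chosen for this illustration.

```python
from collections import namedtuple

Person = namedtuple("Person", ["name", "age"])
a_person = Person(name="Luke", age=36)

try:
    a_person.age = 37                  # fields cannot be assigned to
except AttributeError:
    print("Fields are read-only")

older = a_person._replace(age=37)      # returns a new tuple, the original is unchanged
print(a_person, older)
```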

Dictionary

The dictionary class represents a collection of elements which are not accessed by ordinal position but by a key, which must be unique within the dictionary.

The {} is used to create a dictionary. We can use the type and len functions to determine the class type and the number of elements respectively.

The key can be of any type as long as it is immutable; numbers and strings for example. Tuples are immutable and can be used as long as they only contain elements which are immutable.

Code:

dictionary_one = {}
print(type(dictionary_one))
print(dictionary_one)
print("Length: ", len(dictionary_one))

Output:

<class 'dict'>
{}
Length: 0

In fact {} is just a shortcut to the dict class which can be invoked by its constructor.

Code:

dictionary_one = dict()
print(type(dictionary_one))

Output:

<class 'dict'>

We can create a dictionary with elements and their keys already assigned with the notation {key1: element1, key2: element2}.

Code:

dictionary_two = {'three': 3, 4: "four"}
print(dictionary_two)

Output:

{'three': 3, 4: 'four'}

We can set or edit an element in the dictionary by the key and the [] access method.

If the key does not exist it is considered a new element. If the key exists it is considered as a replacement to an existing element.

Code:

dictionary_one = {}
dictionary_one[1] = "one"
dictionary_one["two"] = 2
print(dictionary_one)

Output:

{'two': 2, 1: 'one'}

We can use the same [key] notation to access the element. If the key does not exist a KeyError exception is raised. You can use the in operator and the not in operator to determine if a key exists.

Code:

dictionary_one = { 1: 'one', 'two': 2}
print(dictionary_one[1])
print(dictionary_one["two"])
print(1 in dictionary_one )

Output:

one
2
True
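Where a missing key is expected, the get method can be used instead of catching the KeyError; it returns None or a supplied default. A minimal sketch, where missing and fallback are just names chosen for this illustration:

```python
dictionary_one = {1: 'one', 'two': 2}

try:
    dictionary_one['three']                  # key does not exist
except KeyError as err:
    print("KeyError:", err)

missing = dictionary_one.get('three')        # None when the key is absent
fallback = dictionary_one.get('three', 3)    # or the default we supply
print(missing, fallback)
```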

Python provides the keys and values methods which return dict_keys and dict_values instances; these are iterable views of the keys and elements respectively.

Code:

dictionary_one = { 1: 'one', 'two': 2}
print(dictionary_one.keys())
print(dictionary_one.values())

Output:

dict_keys([1, 'two'])
dict_values(['one', 2])

We can loop through the keys of a dictionary with a for in loop directly without having to use the keys method.

Code:

for k in { 1: 'one', 'two': 2}:
    print(k)

Output:

1
two

We can use the items function to loop through a dictionary with access to the key and the element at the same time.

Code:

for k, v in { 1: 'one', 'two': 2}.items():
    print(k, v)

Output:

1 one
two 2

Tuples can be used as keys as long as they contain no mutable objects.

You can even use the in and not in operators to determine if the dictionary contains a key.

Code:

dict_with_tuples = {('a', 'b'): 'ab', ('a', 'c'): 'ac', ('a', 'c'): 'ac'}
print(dict_with_tuples)
print(('a', 'b') in dict_with_tuples.keys())

Output:

{('a', 'b'): 'ab', ('a', 'c'): 'ac'}
True
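The restriction on mutable elements can be sketched as follows. The dictionary name lookup is hypothetical; a tuple which contains a list cannot be hashed and so is rejected as a key.

```python
lookup = {}
lookup[('a', 'b')] = 'ab'              # fine: the tuple contains only strings

try:
    lookup[('a', ['b'])] = 'ab'        # contains a list, so it cannot be hashed
except TypeError as err:
    error_text = str(err)
    print("TypeError:", error_text)

print(lookup)
```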

Queue

A queue is a special collection which has semantics for first in first out functionality. Python provides the deque class.

Before we can use the deque class we need to import it from the collections namespace.

We initiate a queue by passing in an iterable; below we pass in a list of numbers from 0-9.

Code:

from collections import deque

numbers = list(range(10))
queue = deque(numbers)
print(queue)
print(type(queue))

Output:

deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
<class 'collections.deque'>

We can then remove elements from the start with the popleft function or the end with the pop function.

Code:

print(queue.popleft())
print(queue)
print(queue.pop())
print(queue)

Output:

0
deque([1, 2, 3, 4, 5, 6, 7, 8, 9])
9
deque([1, 2, 3, 4, 5, 6, 7, 8])

We can append elements to the end or the start of a queue with the append and appendleft functions respectively.

Code:

queue.append(10)
print(queue)

queue.appendleft(11)
print(queue)

Output:

deque([1, 2, 3, 4, 5, 6, 7, 8, 10])
deque([11, 1, 2, 3, 4, 5, 6, 7, 8, 10])
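Putting the two ends together gives the first in first out behaviour: elements join at the back with append and leave from the front with popleft. A minimal sketch, with hypothetical names tasks and served:

```python
from collections import deque

tasks = deque()
for name in ["first", "second", "third"]:
    tasks.append(name)                # join the back of the queue

served = []
while tasks:
    served.append(tasks.popleft())    # leave from the front

print(served)
```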

Stack

A stack is a collection which has the semantics of last in first out functionality, also described as first in last out.

Python does not have a dedicated stack class though a list can be used with the pop command to remove the last element.

Pop removes the last element or it can take an index of the element to be removed.

Code:

numbers = list(range(10))

print(numbers)
print(numbers.pop()) # Remove last element
print(numbers.pop(3)) # Remove element at index x
print(numbers)

Output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
9
3
[0, 1, 2, 4, 5, 6, 7, 8]
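Used purely as a stack, a list only needs append to push and pop with no argument to take the last element back off. A short sketch, with names chosen only for this illustration:

```python
stack = []
for item in ["a", "b", "c"]:
    stack.append(item)           # push onto the top of the stack

popped = []
while stack:
    popped.append(stack.pop())   # pop the most recently added element

print(popped)
```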

Set

A set is a collection which contains only unique elements. Adding an element which already exists will leave the collection untouched.

In Python a set is created by passing an iterable into the set constructor. Where duplicate elements are found in the iterable they are reduced to a unique set of values in the set.

Code:

numbers_bag = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
numbers_set = set(numbers_bag)

print(numbers_bag)
print(numbers_set)
print(type(numbers_set))

Output:

[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
{1, 2, 3, 4}
<class 'set'>

The set class provides useful operators which create a new set, or a boolean result, from an operation between two sets.

Operator   Name                   Description
-          Difference             All elements in the first set which are not in the second.
|          Union                  All elements from both sets.
&          Intersection           All elements which exist in both sets.
^          Symmetric difference   All elements in the LHS or the RHS but not both.
<=         Is subset              True if all elements in the LHS are found within the RHS.
>=         Is superset            True if all elements in the RHS are found within the LHS.

Code:

set_one = set({1, 2, 3, 4, 5, 6})
set_two = set({5, 6, 7, 8})

print(set_one - set_two) # Difference
print(set_one | set_two) # Union
print(set_one & set_two) # Intersection
print(set_one ^ set_two) # Symmetric Difference.

print(set({ 1, 2 }) <= set({1, 2, 3})) # Subset
print(set({ 1, 2, 3}) >= set({1, 2})) # Superset

Output:

{1, 2, 3, 4}
{1, 2, 3, 4, 5, 6, 7, 8}
{5, 6}
{1, 2, 3, 4, 7, 8}
True
True
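Each operator also has a method form on the set class. The methods accept any iterable as their argument, so the second operand does not have to be a set. A short sketch using the same values as above:

```python
set_one = {1, 2, 3, 4, 5, 6}
other = [5, 6, 7, 8]                          # a list rather than a set

difference = set_one.difference(other)
union = set_one.union(other)
intersection = set_one.intersection(other)
symmetric = set_one.symmetric_difference(other)

print(difference, union, intersection, symmetric)
print({1, 2}.issubset([1, 2, 3]))
print(set_one.issuperset([1, 2]))
```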

List Ordering & Iterating

Iterating and ordering lists has already been covered in the article on page flow.

Conditional Operators

Conditional operators can be used to determine if two collections are equal.

Two sequences are considered equal if they have the same number of elements and each element is equal to the element at the same ordinal position in the other collection.

The less than, less than or equal to, greater than and greater than or equal to operators can also be used on sequences such as lists and tuples. They compare the elements pair by pair from the start and the first pair which differs decides the result.

For example >= on two tuples returns True if the tuples are equal or if, at the first position where they differ, the element in the first tuple is greater than its counterpart in the other tuple.

Code:

# Equal tuples satisfy >=
print("(1, 2, 5) >= (1, 2, 5):", (1, 2, 5) >= (1, 2, 5))

# Checks each element for equality
print("(1,2) == (1.0, 2.0):", (1, 2) == (1.0, 2.0))

# Works for any elements which can be compared with each other
print('("a","b") < ("e", "f"):', ("a", "b") < ("e", "f"))

Output:

(1, 2, 5) >= (1, 2, 5): True
(1,2) == (1.0, 2.0): True
("a","b") < ("e", "f"): True
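A few more comparisons make the pair by pair rule easier to see. In this sketch the variable names are only for illustration; note that a later element is never looked at once an earlier pair has decided the result.

```python
decided_at_second = (1, 2, 3) < (1, 3, 0)    # 2 < 3 decides; 3 and 0 are ignored
decided_at_first = (2, 0) >= (1, 99)         # 2 > 1 decides; 0 and 99 are ignored
prefix_is_smaller = (1, 2) < (1, 2, 0)       # a shorter prefix sorts first
lists_work_too = [1, 2] == [1, 2]

print(decided_at_second, decided_at_first, prefix_is_smaller, lists_work_too)
```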

Slicing

Slicing allows cloning of a collection with criteria data such as a start index, end index and increment value. Any collection which has access to its data via an ordinal position or index can be used.

The basic format is [start_index: end_index : increment]. The end index is used as a less than predicate; i.e. it is the first index not considered.

The values for start index, end index and increment are all optional and will default to [0 : size : 1].

Using either default options or explicit values we can shallow clone an entire collection by either of the following commands:

Code:

new_numbers = numbers[:]
new_numbers = numbers[0:len(numbers):1]
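A shallow clone copies the outer collection only; any nested collections are shared between the original and the clone. A minimal sketch, with hypothetical names matrix and clone:

```python
matrix = [[1, 2], [3, 4]]
clone = matrix[:]             # new outer list, same inner lists

clone.append([5, 6])          # only the clone gains a new row
clone[0].append(99)           # the inner list is shared, so both see the change

print(matrix)
print(clone)
```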

We can mix and match which elements we enter; they are all independently optional.

Code:

numbers[:3] # Elements from index 0 until < 3
numbers[8:] # Element at index 8 until the end

We can use a negative increment though this would require the start index to be higher than the end index. The result would be the elements in the reverse order of their source collection.

We can also have an increment which is greater than 1. 2 would be every other element.

Code:

numbers = list(range(1, 10))
print(numbers[::2])
print(numbers[::-2])
print(numbers[len(numbers):0:-2])  # stops before index 0

Output:

[1, 3, 5, 7, 9]
[9, 7, 5, 3, 1]
[9, 7, 5, 3]

Slicing can also be used for setting elements though only for mutable types. This would not include tuples.

Here we assign 99 and 100 to index 0 and 1 in the same line of code.

Code:

numbers[0:2] = [99, 100]

We can use slicing to remove a range of elements by assigning an empty collection.

Code:

numbers[0:2] = []

As the start index and end index are defaulted to the first and last elements respectively, we can clear a collection with the following syntax.

Code:

numbers[:] = []
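The three assignments above can be followed step by step in one sketch. It also shows that the replacement does not need to be the same length as the slice, and that clearing through a slice changes the existing list rather than creating a new one. The name alias is only for illustration.

```python
numbers = list(range(10))
alias = numbers                    # second reference to the same list

numbers[0:2] = [99, 100]           # replace index 0 and 1
step_one = list(numbers)

numbers[0:2] = []                  # remove index 0 and 1
step_two = list(numbers)

numbers[2:4] = ['a', 'b', 'c']     # two elements replaced by three
step_three = list(numbers)

numbers[:] = []                    # clear in place
print(step_one, step_two, step_three, alias)
```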

List Comprehensions

List comprehensions allow the generation of a list of elements from another list of elements while providing criteria data. They extend the list slicing functionality.

In short it is a condensed form of a for loop iterating a collection along with an expression to generate a new element from the old element and a predicate to determine which elements to use.

They are best explored by examples.

Imagine we want a list of the square numbers from 0 to 9.

Code:

squares = []
for x in range(10):
    squares.append(x ** 2)
print(squares)

Output:

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

We could condense this with a list comprehension as follows.

Code:

print([x ** 2 for x in range(10)])

  • We create the numbers 0-9 with the range function.
  • We loop through it assigning each element to a variable named x.
  • We convert the element value using the expression x ** 2; this squares the value.

We could have used a lambda and the map function as follows which in short does the same thing.

Code:

print(list(map(lambda x: x ** 2, range(10))))

Where multiple collections or iterables are required to generate the new collection we can use a nested loop where every combination of elements is used to generate the list.

We map each pair of x and y into a tuple.

Code:

print([(x, y) for x in [1, 2, 3] for y in [10, 11, 12]])

Output:

[(1, 10), (1, 11), (1, 12), (2, 10), (2, 11), (2, 12), (3, 10), (3, 11), (3, 12)]

If we did not want every combination of elements but each pair of elements at the same ordinal position we can use the zip function.

Zip can take any number of iterators though it must be stressed that it only iterates through the number of times of the smallest collection. Below the second collection will have data which is missed from the new collection as it contains more elements than the first collection.

Code:

print([(x, y) for x, y in zip([1, 2, 3], [10, 11, 12, 13, 14, 15])])

Output:

[(1, 10), (2, 11), (3, 12)]

Here we append a predicate to restrict x and y to being even. Only combinations when both x and y are even will be used.

Code:

print([(x, y) for x in [1, 2, 3] for y in [10, 11, 12] if x % 2 == 0 and y % 2 == 0])

Output:

[(2, 10), (2, 12)]

Here we use a set comprehension with a predicate to keep only the letters in abcdef which are not in the word cab. As the result is a set, the order of the output may vary.

Code:

print({x for x in 'abcdef' if x not in 'cab'})

Output:

{'e', 'd', 'f'}

Lists & Linq Functionality

Linq is a collection of set based functionality which come as part of .NET. Python also provides some useful linq style functionality.

The min, max and sum functions can be used to determine the minimum and maximum valued entity along with the sum of all entities without having to iterate through the collection.

Code:

numbers = list(range(0, 10))
print("Numbers:", numbers)
print("Min:", min(numbers))
print("Max:", max(numbers))
print("Sum:", sum(numbers))

Output:

Numbers: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Min: 0
Max: 9
Sum: 45

We can use the in and not in operators to determine if an element is contained within a collection. For immutable and mutable types this will work based upon equality.

Code:

numbers = list(range(0, 10))
print("Numbers:", numbers)
print("1 in:", 1 in numbers)
print("10 not in", 10 not in numbers)
print("[1,2] in [[1,2],[1,1]]", [1,2] in [[1,2],[1,1]])

Output:

Numbers: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 in: True
10 not in True
[1,2] in [[1,2],[1,1]] True

The filter class is an iterator which takes a predicate function, much like a where clause; only elements satisfying the predicate will be iterated over.

Here we provide a function which returns a boolean indicating if the parameter is even. We use this to loop through only the even numbers of a collection.

Code:

def is_even(x):
    return x % 2 == 0

for even_number in filter(is_even, range(0, 5)):
    print(even_number)

Output:

0
2
4

Any function which takes a function as an argument can also take a lambda expression. The above example could be rewritten using a lambda.

Code:

for even_number in filter(lambda x: x % 2 == 0, range(0, 5)):
    print(even_number)

Python also provides the map function which allows looping through one or more iterators along with a conversion function. The return value from the conversion function will define the elements in the new collection.

Here we loop through 0-3 and create a new collection of the squares.

Code:

for n in map(lambda x: x**2, range(0, 4)):
    print(n)

Output:

0
1
4
9

Map can take any number of iterators. All elements at the same ordinal position will form the parameters of the iteration. Note the number of iterations will be that of the smallest collection. Any elements at a higher ordinal position will not be iterated through.

Here we loop through two collections, both containing 0-3, and we create a new collection of the sum of the elements at the same ordinal position.

Code:

for n in map(lambda x, y: x + y, range(0, 4), range(0, 4)):
    print(n)

Output:

0
2
4
6

Don’t forget that a list instance can be created from an iterator. This is useful if you want the results from map or filter to be within a list. In fact this works for most collections; a dictionary is the exception in that each element must be a key and value pair.

Code:

new_list = list(map((lambda x: x**2), range(0, 4)))
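As a small sketch of the same idea with other collection types, the results of map can be passed straight to tuple, set or dict. The variable names are illustrative only; note that dict needs each element to be a (key, value) pair.

```python
as_tuple = tuple(map(lambda x: x**2, range(0, 4)))
as_set = set(map(lambda x: x % 2, range(0, 4)))

# dict expects an iterator of (key, value) pairs.
as_dict = dict(map(lambda x: (x, x**2), range(0, 4)))

print(as_tuple)
print(as_set)
print(as_dict)
```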

The reduce function loops through all elements along with a reduce mapping function. The input parameters of the function will be the result of the last call to the reduce mapping function followed by the next element.

Here we can sum all the elements of a collection by adding the ongoing result to the current element and then returning the result.

The function can also take a third parameter which is a starting value. If it is not provided, the first element of the collection is used as the starting value.

Code:

import functools

print(functools.reduce(lambda x, y: x + y, range(4))) # No starting value; the first element seeds the result. 0 + 1 + 2 + 3
print(functools.reduce(lambda x, y: x + y, range(4), 10)) # Starting value 10. 10 + 0 + 1 + 2 + 3

Output:

6
16
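To make the mechanics concrete, here is a rough sketch of what reduce does internally. The function reduce_sketch is a hypothetical re-implementation written for illustration, not the library's actual source; it shows that when no starting value is given the first element is used to seed the result.

```python
import functools


def reduce_sketch(function, iterable, *initial):
    """Hypothetical stand-in for functools.reduce, for illustration only."""
    iterator = iter(iterable)
    if initial:
        result = initial[0]
    else:
        # No starting value supplied: the first element seeds the result.
        try:
            result = next(iterator)
        except StopIteration:
            raise TypeError("reduce of empty iterable with no initial value")
    for element in iterator:
        # The running result is always the first argument.
        result = function(result, element)
    return result


print(reduce_sketch(lambda x, y: x + y, range(4)))
print(reduce_sketch(lambda x, y: x + y, range(4), 10))
print(reduce_sketch(lambda x, y: x + y, ["a", "b", "c"]))
```

The last call joins strings, which would not be possible if the starting value were always the number 0.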

Python: Object Orientation

This is part of my Python & Django Series which can be found here including information on how to download all the source code.

Like many modern day languages, Python is object orientated or at least has object orientated features.

The Pillars Of Object Orientation

Object orientation has three defined pillars: abstraction, encapsulation and polymorphism.

Abstraction

Abstraction is the process of taking a complex implementation and providing a simplified interface for consuming it. Consumers are shielded from the complexities of the internals by being presented with an abstracted interface.

Abstraction is implemented by creating classes with state (member fields) and behaviour (methods).

Encapsulation

Encapsulation is the process of restricting access to an object’s state and behaviour from consumers to prevent any incorrect usage.

Another advantage of encapsulation is that by defining an interface for consuming an object, any changes to the internals of that object can safely occur without breaking any of the consumers, as long as the existing interface remains intact.

Encapsulation is implemented by adding access modifiers onto types and their defined members; i.e. by making them private.

Python does not directly support access modifiers, though convention states that state or behaviour named with a leading underscore, such as _state, is to be treated as private. A name with two leading underscores, such as __state, is rewritten by Python to _classname__state, which can be used to ensure name uniqueness.

However the private state is only a convention and it is up to the developer to respect this.
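The following is a minimal sketch of how these naming conventions behave in practice. The Account class and its members are hypothetical names used only for illustration. A single leading underscore is purely a signal to other developers, while two leading underscores cause Python to rename the member to include the class name.

```python
class Account():

    def __init__(self, owner, balance):
        self.owner = owner        # public
        self._balance = balance   # single underscore: private by convention only
        self.__pin = "1234"       # double underscore: renamed to _Account__pin

    def balance(self):
        return self._balance


account = Account("Luke", 100)

print(account._balance)           # still reachable; nothing enforces privacy
print(hasattr(account, "__pin"))  # False, the member was renamed
print(account._Account__pin)      # reachable under the rewritten name
```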

Polymorphism

Polymorphism is the ability of a group of heterogeneous objects to be treated as a homogeneous group by exposing the same interface.

In more traditional OO languages there are two types of polymorphism; interface and inheritance.

Python with its dynamic functionality allows for all objects to be treated the same as long as they respond to the state or behaviour being called; otherwise it goes off with a big bang in the form of an AttributeError at run time.
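A short sketch of this behaviour follows. The classes Duck, Robot and Rock are hypothetical and share no common ancestor or interface; the loop treats them all the same and only fails for the one which does not respond to speak.

```python
class Duck():
    def speak(self):
        return "Quack"


class Robot():
    def speak(self):
        return "Beep"


class Rock():
    pass


results = []
for thing in [Duck(), Robot(), Rock()]:
    try:
        results.append(thing.speak())
    except AttributeError:
        # Rock has no speak method, so the call fails at run time.
        results.append("no speak method")

print(results)
```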

Class Definition

Classes in Python feel little more than giving functions and variables scope. Their simple syntax and their ability to be dynamic strengthen this thought. To create a class we simply need the class keyword and the name of the class. Below we create a class called Person.

Code:

class Person():
    pass

An instance of the class can be created by calling the class as if it were a function.

Code:

foo = Person()

Methods

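Methods are any functions which are in the scope of the class; i.e. they follow the class definition and are indented. The first parameter, named self by convention, receives the instance the method was called on.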

Code:

class Person():

    def DoSomething(self):
        return 'This is a person!'

To access a member method we use the name of the method along with the dot notation and either self or the class instance: self when we are inside the scope of the class and the class instance when we are outside.

Methods are all public by default; marking one as private is purely a naming convention.

Code:

foo = Person()
print(foo.DoSomething())

Output:

This is a person!

State

In object orientation, state is represented as data. As in other OO languages, data can belong to an instance of a class or be shared between all instances.

Instance Variables

Instance or member variables are simply variables which are assigned to an instance of a class within its scope. In its simplest form we can use self.variable_name = x to assign a value to a member variable. This can be done without any prior declaration of the member variable.

Code:

self.Name = name
classinstance.Name = name

The preferred way of creating member variables is within the __init__ method, which is Python’s constructor method for a class.

To access a member variable we use the name of the variable along with the dot notation and either self or the class instance: self when we are inside the scope of the class and the class instance when we are outside.

In the following code we create two member variables, Name and Age, on the Person class within the constructor.

The __str__ method is a special method which reports the class instance as a string. It is called by the print() function.

Code:

class Person():

    def __init__(self, name, age):
        self.Name = name
        self.Age = age

    def __str__(self):
        return '{0} is {1} years old.'.format(self.Name, self.Age)

Code:

a_person = Person("Luke", 36)
print(a_person)
print(a_person.Age)

Output:

Luke is 36 years old.
36

Member variables are all public by default; marking one as private is purely a naming convention.

Class Variables

Class variables are similar to static variables in other languages. They are state which is shared between class instances; however, they do not behave as you might expect!

Immutable objects appear to behave as instance variables, while mutable objects are shared until they are reassigned. Take the following class.

Code:

class ClassVariable():

    AnInt = 1
    AList = []

    def __str__(self):
        return "AnInt = {0}, AList = {1}".format(self.AnInt, self.AList)

If we manipulate the state of two separate instances:

Code:

one = ClassVariable()
two = ClassVariable()
one.AnInt += 1
one.AList.append("Item-1")

print(one)
print(two)

two.AList = []
print(one)
print(two)

We get the following at the terminal.

Output:

AnInt = 2, AList = ['Item-1']
AnInt = 1, AList = ['Item-1']
AnInt = 2, AList = ['Item-1']
AnInt = 1, AList = []

AnInt, being an integer and immutable, appears to work as an instance variable, i.e. changes made through an instance only affect that instance. This is because one.AnInt += 1 assigns a new value onto the instance, which then hides the class variable of the same name.

AList, being mutable, acts as a shared variable when the object is modified in place, i.e. changes made through one instance are reflected by all class instances. However, when we assign a new list onto one of the instances, the two instances now point to two separate list objects; changes to the new list do not affect other instances.

Using class variables is strongly advised against.
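One way to see what is going on is to inspect each instance's __dict__, which holds only the variables stored on that instance. The sketch below repeats a cut-down version of the ClassVariable class so that it runs on its own.

```python
class ClassVariable():
    AnInt = 1
    AList = []


one = ClassVariable()
two = ClassVariable()

print(one.__dict__)             # {} : nothing is stored on the instance yet

# Equivalent to one.AnInt = one.AnInt + 1: the class value is read, and the
# result is assigned onto the instance, hiding the class variable.
one.AnInt += 1
print(one.__dict__)             # {'AnInt': 2}
print(ClassVariable.AnInt)      # 1 : the class variable itself is unchanged

# append modifies the shared list in place; no assignment takes place.
one.AList.append("Item-1")
print(one.AList is two.AList)   # True : both names refer to the same list
print(one.__dict__)             # still only AnInt is stored on the instance
```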

Python Classes Are Dynamic

Extending upon the above, Python classes are actually dynamic; during run time we can assign a member variable or even a method onto an instance, and it will only be available to that instance.

Code:

a_person.IsCool = "In a quirky way"

print("Is {0} cool?: {1}".format(a_person.Name, a_person.IsCool))

Output:

Is Luke cool?: In a quirky way

Subsequent class instances of Person won’t respond to the IsCool member variable as it was added onto a single instance of the class Person.

In short two class instances which are of the same class type can literally have a different set of member fields and functions!

We can use the hasattr function to determine if an instance responds to a variable or method name.

Code:

a_person = Person("Luke", 36)

if hasattr(a_person, 'IsCool'):
    print("Is {0} a cool?: {1}".format(a_person.Name, a_person.IsCool))
else:
    print("{0} does not respond to IsCool".format(a_person.Name))

Output:

Luke does not respond to IsCool
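The same applies to methods. The text above mentions that a method can be added onto a single instance; one way of doing this, sketched below, is with types.MethodType from the standard library, which binds a plain function to an instance so that self is passed automatically. The function say_hello and the member name SayHello are hypothetical names chosen for illustration, and the Person class is repeated in cut-down form so the example runs on its own.

```python
import types


class Person():

    def __init__(self, name, age):
        self.Name = name
        self.Age = age


def say_hello(self):
    return "Hello from {0}".format(self.Name)


luke = Person("Luke", 36)
other = Person("Other", 40)

# Bind the function to this one instance only.
luke.SayHello = types.MethodType(say_hello, luke)

print(luke.SayHello())
print(hasattr(other, "SayHello"))  # False, other instances are unaffected
```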

IsInstance

Due to Python’s dynamic nature you don’t know what class or type a variable holds until run time. Fortunately the isinstance function can be used. Taking a variable and a type, it returns True if the variable is an instance of the type or of one of its subclasses.

Code:

isinstance(a_person, Person)
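A brief sketch using built-in types follows; the variable names are illustrative only. It also shows that isinstance accepts a tuple of types, in which case it returns True if the variable matches any one of them.

```python
value = 36

is_int = isinstance(value, int)
is_str = isinstance(value, str)
is_either = isinstance(value, (str, int))  # matches any type in the tuple

print(is_int, is_str, is_either)
```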

Inheritance

Inheritance promotes code reuse by implementing common ancestors; state and behaviour can be inherited from another class.

The following class Boy extends Person by inheriting from it.

Code:

class Boy(Person):
    def __init__(self, name, age):
        super().__init__(name, age)
        self.Sex = "Boy"

    def __str__(self):
        return "{0} and is a {1}".format(super().__str__(), self.Sex)

We can access the state of the Person class or its behaviour directly from the Boy subclass.

The super function can be used to ensure the parent class state or behaviour is called. In the example above both Boy and Person have a constructor called __init__. We use super().__init__ to call Person.__init__ from inside the Boy.__init__ function.

Code:

a_boy = Boy("LukeyBoy", 36)
print(a_boy.Name)

Output:

LukeyBoy

All member variables and methods are virtual by default; they can be overridden by any inheriting classes. This is shown by the two implementations of the __str__ method: Boy overrides the version provided by Person. Calling str() on a Boy instance calls the version local to that class, however we can still access the overridden behaviour by replacing self with super().

IsSubClass

The issubclass function can be used to determine if a type is another type or has it in its ancestry. This means it inherits from it, but not necessarily directly.

The function works directly on the types and not instances of the types. The type function can be used to get a type from an instance.

Code:

print("Is Boy A SubClass of Boy?:", issubclass(Boy, Boy))
print("Is Boy A SubClass of Person?:", issubclass(Boy, Person))
print("Is Person A SubClass of Boy?:", issubclass(Person, Boy))

Output:

Is Boy A SubClass of Boy?: True
Is Boy A SubClass of Person?: True
Is Person A SubClass of Boy?: False
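As a small sketch of combining type with issubclass when only an instance is to hand, the example below uses empty stand-ins for the Person and Boy classes so that it runs on its own.

```python
class Person():
    pass


class Boy(Person):
    pass


a_boy = Boy()

# type() returns the exact class the instance was created from.
boy_type = type(a_boy)

print(boy_type is Boy)
print(issubclass(boy_type, Person))
print(issubclass(boy_type, Boy))
```

For a check against an instance, isinstance(a_boy, Person) gives the same answer as issubclass(type(a_boy), Person).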

Multiple Inheritance

Python allows multiple direct ancestors or multiple inheritance, by simply separating the classes being extended by commas.

Code:

    class ExtendingClass(Base1, Base2, Base3.....):

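Any calls to state or behaviour on an instance of ExtendingClass will simply look in the order of ExtendingClass, Base1, Base2, Base3, stopping when a match is made. For more complex hierarchies, where the base classes share ancestors, Python works out the lookup order using its method resolution order rules.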
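The lookup order can be inspected through the __mro__ attribute of a class. The sketch below uses two hypothetical base classes, both of which define a method called who, to show that the first match in the order wins.

```python
class Base1():
    def who(self):
        return "Base1"


class Base2():
    def who(self):
        return "Base2"

    def only_in_base2(self):
        return "Base2 only"


class ExtendingClass(Base1, Base2):
    pass


instance = ExtendingClass()

print(instance.who())            # Base1 is listed first, so its version is used
print(instance.only_in_base2())  # not found in Base1, so Base2 is searched next

lookup_order = [cls.__name__ for cls in ExtendingClass.__mro__]
print(lookup_order)              # ['ExtendingClass', 'Base1', 'Base2', 'object']
```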