This is part of my Python & Django Series which can be found here including information on how to download all the source code.
This article looks at ways of interacting with the operating and file system.
The os module is a collection of miscellaneous operating system interfaces. If you want to interact with the operating system or call operating system functionality, you can probably do it with the os module.
The following outlines some of the most useful functionality.
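We can get the name and details of the OS with the name attribute and the uname function. The name attribute identifies the platform family (for example 'posix' or 'nt'), while uname, which is available on Unix-like systems, returns a tuple-like object with the following fields.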
- sysname: operating system name
- nodename: name of machine on network (implementation-defined)
- release: operating system release
- version: operating system version
- machine: hardware identifier
posix.uname_result(sysname='Linux', nodename='Minty', release='3.13.0-37-generic', version='#64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014', machine='x86_64')
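The listing that produced this output is not shown, so the following is a minimal sketch of how it could be obtained. The fallback to platform.uname() is an assumption added here for systems where os.uname does not exist; it is not part of the original example.

```python
import os
import platform

# os.name is a coarse platform family such as 'posix' or 'nt'.
print(os.name)

# os.uname() only exists on Unix-like systems, so guard the call.
if hasattr(os, "uname"):
    info = os.uname()
else:
    # platform.uname() is a portable stand-in; it also exposes
    # release, version and machine fields.
    info = platform.uname()

print(info)
print(info.release, info.machine)
```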
We can determine the current working directory and also change it with the getcwd and chdir functions.
import os

print(os.getcwd())
os.chdir('/')
print(os.getcwd())
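Because chdir changes the working directory for the whole process, it can be worth restoring the original directory afterwards. The following is one possible sketch of that pattern; the variable names are purely illustrative.

```python
import os

original = os.getcwd()
try:
    os.chdir(os.sep)        # move to the filesystem root
    inside = os.getcwd()
    print(inside)
finally:
    os.chdir(original)      # always restore the starting directory

print(os.getcwd() == original)
```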
We can access the real group id and the real user id of the current process, along with a list of the supplemental group ids associated with that process. These functions are only available on Unix-like systems.
import os

print(os.getgid())
print(os.getuid())
print(os.getgroups())
[4, 24, 27, 30, 46, 108, 110, 1000]
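The getenv and putenv functions can be used to get and set environment variables respectively. Note that a value set with putenv is passed on to child processes but is not reflected in os.environ, so a later getenv call in the same process will not see it.
import os

print(os.getenv("HOME"))
os.putenv("Fluffy", "Yes")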
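As a hedged illustration of the difference between putenv and os.environ, the sketch below sets one variable each way and then reads both back, in this process and in a child process. The variable names FLUFFY and FLUFFY_RAW are made up for the example.

```python
import os
import subprocess
import sys

# Assigning to os.environ updates the mapping that getenv reads
# and also calls putenv behind the scenes.
os.environ["FLUFFY"] = "Yes"
via_environ = os.getenv("FLUFFY")

# putenv alone changes the real process environment only;
# the os.environ mapping (and therefore getenv) does not see it.
os.putenv("FLUFFY_RAW", "Yes")
via_putenv = os.getenv("FLUFFY_RAW")

# A child process inherits the variable set through os.environ.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['FLUFFY'])"],
    capture_output=True, text=True, check=True,
)
child_value = child.stdout.strip()

print(via_environ, via_putenv, child_value)
```

For this reason assigning to os.environ is usually the more convenient choice.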
We can create directories and remove them with the mkdir and rmdir functions. We can also run shell commands with the system function, which executes the command in a subshell.
import os

os.mkdir("foo")
os.rmdir("foo")
os.system("mkdir XXX")
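The sketch below shows two related calls that are not in the original listing: os.makedirs, which creates intermediate directories, and subprocess.run, which runs a command without going through a shell. It works inside a temporary directory so nothing is left behind; the directory names are illustrative.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, "foo", "bar")
    os.makedirs(nested, exist_ok=True)   # creates foo and foo/bar in one call
    created = os.path.isdir(nested)

    # Run an external command (here the Python interpreter itself)
    # and capture what it prints.
    result = subprocess.run(
        [sys.executable, "-c", "print('hello from a child process')"],
        capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

    os.rmdir(nested)                     # rmdir only removes empty directories
    removed = not os.path.exists(nested)

print(created, output, removed)
```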
The shutil module provides a high level interface when working with a file or a collection of files.
The following outlines some of the most useful functionality.
We can move and copy files with the move and copy functions.
The copytree function can be used to copy a directory and all its contents, while the rmtree function can be used to delete a directory and all its contents.
from shutil import move, copy, copytree, rmtree

# Each call is an independent illustration
copy("source_file.txt", "target_file.txt")
move("source_file.txt", "target_dir")
copytree("source_dir", "target_dir")
rmtree("source")
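To try these functions without touching real files, the following self-contained sketch builds a small directory inside a temporary location and applies each call in turn. All file and directory names are placeholders chosen for the example.

```python
import os
import tempfile
from shutil import copy, copytree, move, rmtree

with tempfile.TemporaryDirectory() as base:
    source_dir = os.path.join(base, "source_dir")
    os.mkdir(source_dir)
    source_file = os.path.join(source_dir, "source_file.txt")
    with open(source_file, "w") as handle:
        handle.write("hello")

    # copy: duplicate a single file under a new name
    copy(source_file, os.path.join(source_dir, "copy.txt"))

    # copytree: duplicate the directory and everything in it
    # (by default the destination must not exist yet)
    target_dir = os.path.join(base, "target_dir")
    copytree(source_dir, target_dir)
    copied_names = sorted(os.listdir(target_dir))

    # move: relocate a file and get the new path back
    moved = move(source_file, os.path.join(base, "moved.txt"))
    moved_exists = os.path.isfile(moved)

    # rmtree: delete a directory together with its contents
    rmtree(source_dir)
    source_gone = not os.path.exists(source_dir)

print(copied_names, moved_exists, source_gone)
```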
We can use shutil to compress and uncompress files and directories with the make_archive and unpack_archive functions respectively. The format has to be supported by the Python installation; some formats depend on optional modules such as zlib or bz2. We can determine which formats are available with the get_archive_formats function. It returns a list of tuples, each containing the format name and the format description.
from shutil import get_archive_formats, make_archive, unpack_archive

print(get_archive_formats())
# archive_name (given without an extension) and root_dir stand for your own values
archive_path = make_archive(archive_name, 'gztar', root_dir)  # returns the full archive file name
unpack_archive(archive_path, "source_dir", "gztar")
[('bztar', "bzip2'ed tar-file"), ('gztar', "gzip'ed tar-file"), ('tar', 'uncompressed tar file'), ('zip', 'ZIP file')]
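A round trip makes the archive functions easier to follow. The sketch below, which assumes the gztar format is available, archives a throwaway directory and unpacks it again; the names project, backup and restored are invented for the example.

```python
import os
import tempfile
from shutil import make_archive, unpack_archive

with tempfile.TemporaryDirectory() as base:
    root_dir = os.path.join(base, "project")
    os.mkdir(root_dir)
    with open(os.path.join(root_dir, "notes.txt"), "w") as handle:
        handle.write("archive me")

    # The first argument is the archive name WITHOUT an extension;
    # make_archive appends one and returns the full path.
    archive_path = make_archive(os.path.join(base, "backup"), "gztar", root_dir)
    archive_file = os.path.basename(archive_path)

    # Unpack into a fresh directory and list what came out.
    extract_dir = os.path.join(base, "restored")
    unpack_archive(archive_path, extract_dir, "gztar")
    restored = sorted(os.listdir(extract_dir))

print(archive_file, restored)
```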
The disk_usage function can be used to determine the disk usage statistics for the partition or disk that contains a given path. It returns a named tuple of the total size, the used size and the free size, all in bytes.
from shutil import disk_usage

print(disk_usage("/"))
usage(total=20507914240, used=7373537280, free=12069023744)
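Raw byte counts are hard to read, so here is a small sketch that converts the fields to GiB and a percentage. Note that used and free need not add up to total (they do not in the sample output above), since some space may be unavailable to the current user.

```python
from shutil import disk_usage

usage = disk_usage("/")
gib = 1024 ** 3  # bytes per GiB

print("total: {:.1f} GiB".format(usage.total / gib))
print("used:  {:.1f} GiB".format(usage.used / gib))
print("free:  {:.1f} GiB".format(usage.free / gib))

percent_used = usage.used / usage.total * 100
print("{:.1f}% used".format(percent_used))
```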
The which function can be used to find the location of an executable that is locatable via the PATH variable. Here we use it to find the location of the python and python3 executables.
from shutil import which

print(which("python"))
print(which("python3"))
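When nothing matches, which returns None rather than raising an error, so callers need to check the result. The helper below is a hypothetical convenience wrapper (find_tool is not part of shutil) that returns the first of several candidates it can locate.

```python
from shutil import which


def find_tool(*candidates):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        location = which(name)
        if location is not None:
            return location
    return None


print(find_tool("python3", "python"))
print(find_tool("definitely-not-a-real-command-xyz123"))
```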
The glob module provides functionality for getting a list of files on disk that match a search pattern. The pattern rules for glob are not regular expressions but standard Unix path expansion rules.
- * matches zero or more characters as wild card characters
- ? matches a single character as a wild card character
- [ ] matches a single character from a list of possibilities. Allows the character '-' to determine a range and the character '!' to negate.
- [abc] matches letters a, b or c as lowercase letters
- [0-9] matches any single digit
- [a-zA-Z] matches any letter upper or lowercase
- [!abc] matches any single character except letters a, b or c
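These rules can be tried out without any files on disk, because glob relies on the fnmatch module for filename matching. The sketch below applies a few patterns to a made-up list of names using fnmatchcase, which compares case-sensitively on every platform.

```python
from fnmatch import fnmatchcase

names = ["dog.txt", "Dog.txt", "fog.txt", "log.txt",
         "9og.txt", "frog.txt", "dog.csv"]


def matching(pattern):
    """Return the names that match the given shell-style pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(matching("*.txt"))         # everything except dog.csv
print(matching("?og.txt"))       # exactly one character before og.txt
print(matching("[a-e]og.txt"))   # ['dog.txt']
print(matching("[0-9]og.txt"))   # ['9og.txt']
print(matching("[!a-e]og.txt"))  # ['Dog.txt', 'fog.txt', 'log.txt', '9og.txt']
```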
Glob can be used with relative or absolute paths.
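By default glob is not recursive, i.e. it only searches the directory named in the pattern and not its subdirectories. You can use os.walk to search recursively; from Python 3.5 onwards glob also accepts recursive=True together with the ** pattern.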
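The following sketch compares a plain glob with two recursive searches on a small temporary tree: os.walk, and the recursive=True option that glob gained in Python 3.5. The file names are invented for the example.

```python
import os
import tempfile
from glob import glob

with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "sub", "deeper"))
    for relative in ("top.txt",
                     os.path.join("sub", "mid.txt"),
                     os.path.join("sub", "deeper", "low.txt")):
        open(os.path.join(base, relative), "w").close()

    # Plain pattern: only the top-level directory is searched.
    flat = [os.path.relpath(path, base)
            for path in glob(os.path.join(base, "*.txt"))]

    # '**' with recursive=True also descends into subdirectories.
    deep = sorted(os.path.relpath(path, base)
                  for path in glob(os.path.join(base, "**", "*.txt"),
                                   recursive=True))

    # os.walk visits every directory; filter the file names ourselves.
    walked = sorted(os.path.relpath(os.path.join(folder, name), base)
                    for folder, _dirs, files in os.walk(base)
                    for name in files
                    if name.endswith(".txt"))

print(flat)
print(deep)
print(walked)
```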
The following matches any file which has an extension of ‘txt’.
from glob import glob

for name in glob('*.txt'):
    print(name)
The following will match any file whose name is a single character, which can be anything, followed by og.txt.
from glob import glob

for name in glob('?og.txt'):
    print(name)
The following will match any file whose name is a single lowercase letter followed by og.txt.
from glob import glob

for name in glob('[a-z]og.txt'):
    print(name)
The following will match any file whose name is a single character, other than lowercase a, b, c, d or e, followed by og.txt.
from glob import glob

for name in glob('[!abcde]og.txt'):
    print(name)
Unipath is an object-oriented alternative to os, os.path and shutil. It is a third-party package rather than part of the standard library.
Everything works around the Path class, which takes any number of path name components and joins them using the correct path name separator. It supports glob style syntax as well as relative and absolute paths.
A Path instance can be created from any of the following.
from unipath import Path

Path("/", "home", "lukey")  # An absolute path of /home/lukey
Path("foo", "log.txt")  # A relative path of foo/log.txt
Path(__file__)  # The current running file
Path()  # Path(os.curdir)
p = Path("")  # An empty path
The Path class has a number of properties which can be used to return a Path instance of the parent directory, the file name, the file extension and the file name without the extension.
The components function can be used to get a list of directories which define the Path instance, each as an instance of Path.
here = Path(__file__)
print(here)
print(here.components())  # A list of all the directories and the file as Path instances
print(here.parent)  # The path without the file name
print(here.name)  # The file name
print(here.ext)  # The file extension
print(here.stem)  # The file name without the extension
[Path('/'), Path('data'), Path('data'), Path('Dropbox'), Path('Development'), Path('SandBox'), Path('Git'), Path('ThePythonPit'), Path('PythonSandBox'), Path('StandardLibrary'), Path('unipath_example.py')]
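For comparison, the standard library's pathlib module (available since Python 3.4) offers similar properties. The sketch below uses pathlib, not Unipath, and the path shown is a made-up example; PurePosixPath is used so the result is the same on every platform.

```python
from pathlib import PurePosixPath

# Hypothetical path, used only for illustration.
here = PurePosixPath("/home/lukey/project/unipath_example.py")

print(here.parts)    # ('/', 'home', 'lukey', 'project', 'unipath_example.py')
print(here.parent)   # /home/lukey/project
print(here.name)     # unipath_example.py
print(here.suffix)   # .py
print(here.stem)     # unipath_example
```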
Child & Parent Methods
We can access the parent directory of a path instance with the parent property as shown above.
We can jump up the ancestry tree x times with the ancestor method; this is the same as calling parent x times.
We can walk down the ancestry tree with the child method passing all components of the path to the required directory or file.
print(here.parent)  # The containing directory
print(here.ancestor(5))  # Up x entities (same as calling parent x times)
print(here.ancestor(3).child("PythonSandBox", "StandardLibrary"))  # Returns the child as defined by the components
Expand, Expand User and Expand Vars
Path instances can be defined using ~, system variables and the .. notation.
Note: ~ represents the user's home directory on Linux/Unix.
Note: .. is a notation for up one directory when defining relative paths.
We can expand these relative path notations to absolute paths. The function expand_user will expand ~ while expand_vars will expand system variables. The norm function will expand the .. and . notations.
Alternatively the expand function will expand ~, system variables as well as the .. and . notations.
print(Path("~").expand_user())  # Expands ~ to an absolute path name
print(Path("$HOME").expand_vars())  # Expands system variables
print(Path("/home/luke/..").norm())  # Expands .. and . notation
print(Path("$HOME/..").expand())  # Expands system variables, ~ and also ..
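The same expansions can be sketched with the standard os.path functions, on the assumption that the Unipath methods above behave like their os.path counterparts. DEMO_ROOT is a hypothetical variable set only for this example, and the results in the comments are for a POSIX system.

```python
import os

# Hypothetical variable, defined here so the example is repeatable.
os.environ["DEMO_ROOT"] = "/srv/data"

home = os.path.expanduser("~")                    # the user's home directory
expanded = os.path.expandvars("$DEMO_ROOT/logs")  # /srv/data/logs
normalised = os.path.normpath("/home/luke/../luke/./docs")  # /home/luke/docs

# Chaining all three mirrors what a combined "expand" step would do.
combined = os.path.normpath(
    os.path.expandvars(os.path.expanduser("$DEMO_ROOT/../backup"))
)                                                 # /srv/backup

print(home)
print(expanded)
print(normalised)
print(combined)
```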
File Attributes and permissions
The Path class also has a number of methods which can be used to return information about the file or directory. Most of them are self-explanatory.
Note that the atime and ctime functions return time as seconds past the epoch, which for Unix is the first second of 1970. You can find out the epoch with gmtime(0).
here = Path(__file__)
print(here.atime())  # Last access time
print(here.ctime())  # Last permission or ownership modification; on Windows this is the creation time
print(here.isfile())  # Is this a file? Symbolic links are followed
print(here.isdir())  # Is this a directory? Symbolic links are followed
print(here.islink())  # Is a symbolic link?
print(here.ismount())  # Is a mount point; i.e. is the parent on a different device?
print(here.exists())  # File or directory actually exists? Symbolic links are followed
print(here.lexists())  # Same as exists but symbolic links are not followed
print(here.size())  # File size in bytes
print(Path("/foo").isabsolute())  # Is an absolute and not a relative path
Calling the gmtime function from the time module with an argument of 0 returns the epoch.
time.struct_time(tm_year=1970, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=1, tm_isdst=0)
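A minimal sketch of how the epoch output above can be produced, and how a raw timestamp can be turned into something readable. It uses os.path.getatime from the standard library rather than Unipath, and reads the access time of the current directory purely as an example.

```python
import os
import time
from datetime import datetime, timezone

# The epoch as a struct_time in UTC.
epoch = time.gmtime(0)
print(epoch)

# Access time of the current directory, in seconds since the epoch.
accessed = os.path.getatime(os.curdir)

# Convert the raw number into a timezone-aware datetime.
as_utc = datetime.fromtimestamp(accessed, tz=timezone.utc)
print(as_utc.isoformat())
```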
The stat and lstat functions can be used to get statistics for a file. stat follows symbolic links, while lstat reports on the file the Path instance points at, regardless of whether it is a symbolic link or not.
here = Path(__file__)
print(here.stat())  # File stat object for size, permissions etc. Symbolic links are
os.stat_result(st_mode=33188, st_ino=2753975, st_dev=2052, st_nlink=1, st_uid=1000, st_gid=1000, st_size=3054, st_atime=1434042724, st_mtime=1434042724, st_ctime=1434042724)
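The st_mode value in this output packs the file type and permission bits into one integer. As a sketch, the standard stat module can decode it; the value 33188 is taken from the output above, while the os.stat and os.lstat calls on the current directory are just an illustration using the standard library rather than Unipath.

```python
import os
import stat

mode = 33188                   # st_mode from the sample output

print(oct(mode))               # 0o100644
print(stat.S_ISREG(mode))      # True: a regular file
print(stat.filemode(mode))     # -rw-r--r--

# os.stat follows symbolic links, os.lstat does not.
info = os.stat(os.curdir)
link_info = os.lstat(os.curdir)
print(stat.S_ISDIR(info.st_mode), stat.S_ISLNK(link_info.st_mode))
```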