Django Models Advance

Django Models Advance

This post is part of my Django series. You can see an overview of the series along with instruction on how to get all the source code here.

This article assumes you are comfortable creating a Django project, can create apps and register them into the INSTALLED_APPS list of the settings file. If not please read my Django HelloWorld article.

This article assumes you have a project called DjangoSandBox in an application called modelsadvanced.

JOINS / Relationships

Django provides three relationship types between models; these translate to joins and foreign keys at the SQL and schema level.

  • One to One (1:1)
  • One to Many (1:n)
  • Many to Many (n:m)

They are all applied as fields to the models; Django then provides an API for walking, querying, setting and deleting the related model instances.

One To One

A 1:1 relationship is where only one model instance is allowed to be associated on either end of the join field.

In this example we create a Man and a Dog model and associate them with the field owner which is a one to one join field set upon the Dog model. The Man is considered the parent and the Dog is considered the child model.

The parent model will have a field automatically added onto it named after the child model; in this case dog.

class Man(Model):
    name = models.CharField(max_length=20)

class Dog(Model):
    name = models.CharField(max_length=20)
    owner = models.OneToOneField(Man, on_delete=models.CASCADE, blank=True, null=True)

The on_delete option defines what happens when the parent record is deleted; here we state CASCADE which means the dog record will also be deleted at the same time as its associated Man record.

The join is optional (not mandatory) in the database as we set null=True. If we were to use Django forms or class based views the relationship would also be considered optional as we set blank=True.

We can create the Man record first then the Dog record passing in the man record to the owner field.

Note: here we use the get_or_create function; to prevent multiple records being generated we need to pass in the man instance into the defaults parameter.

a_man = Man.objects.get_or_create(name="John")[0]
a_dog = Dog.objects.get_or_create(name="Fido", defaults={'owner': a_man})[0]

We could have use the owner field to associate the man onto the dog.

a_man = Man(name="John")
a_man.save()

a_dog = Dog(name="Fido")
a_dog.owner = a_man

We could also have used the auto generated dog field on man to associate the dog onto the man.

a_dog = Dog(name="Fido")
a_dog.save()

a_man = Man(name="John")
a_man.dog = a_man
a_man.save()

If we have a mandatory join then we would have to create the parent record first, and associate the parent (Man) onto the child (Dog) before saving. Trying to save a dog without an owner, when we have a mandatory join, will cause an error to be raised.

We can read data via the dog and owner fields, this traverses the join and causes a database read.

for aMan in Man.objects.all():
    print("{0} --> {1}".format(aMan, aMan.dog))

for a_dog in Dog.objects.all():
    print("{0} --> {1}".format(a_dog, a_dog.owner))

Note: Calling a joined model field will cause a trip to the database upon the first call. Subsequent calls will used a cached version of the data. We can prevent the initial database trip by using the select_related function when we initially get the data.

We can query using the join field using an instance of the associated record and the join field name. Below we find the Dog record who has the owner set to our instance of the Man model found within the variable a_man.

a_dog = Dog.objects.get(name="Fido")
a_man = Man.objects.get(dog=a_dog)

a_man = Man.objects.get(name="Luke")
Dog.objects.get(owner=a_man)

Alternatively we can query using the joined table field by a double underscore between the join field name, the joined table field name and the operator required.

Dog.objects.get(owner__name="Luke")
Dog.objects.get(owner__name__icontains="L")

If no joined record exists accessing the join field will raise a ObjectDoesNotExist error.

One To Many

A one to many (1:n) relationship is where one parent record is associated to multiple children record.

In this example we create a one to many relationship between a model called Parent and a model called Child.

class Parent(Model):
    name = models.CharField(max_length=10, unique=True, null=True)

class Child(Model):
    name = models.CharField(max_length=10, unique=True)
    parent = models.ForeignKey(Parent, null=True)

The parent record can be accessed from the child record via the field called parent.

The children can be accessed from the parent record via a field called child_set. A QuerySet containing all children is returned; all QuerySet functions including filter and order_by can be called.

for aParent in Parent.objects.all():
    print("{0} --> {1}".format(aParent, aParent.child_set.all()))

A QuerySet is always returned, all be it empty, regardless if any children records have been saved. However calling parent upon a child record will raise a ObjectDoesNotExist error if no association has been made.

Django automatically names the field ChildModelName_set though this can be overridden with the related_name option. The following would rename the automatically generated field child_set to children upon the model Parent.

parent = models.ForeignKey(Parent, null=True, related_name=children)

We can create the parent record and then each child passing the parent into the parent field.

a_parent = Parent.objects.get_or_create(name="Bob")[0]
a_child = Child.objects.get_or_create(name="John", defaults={'parent': a_parent})[0]

We can also call add upon the child_set field to associate a child onto a parent. You don’t need to call save afterwards as add saves the data to the database.

a_parent = Parent(name="Bob")
a_parent.save()

a_child = Child(name="John")
a_child.save()
a_parent.child_set.add(a_child)

The add function of the child field can take any number of children.

a_parent.child_set.add(a_child, a_child_two)

Alternatively we can simply assign a list of children onto the parent field. Any existing associated children would be removed. This does not save to the database; a call to save is required afterwards.

a_parent.child_set = [a_child, a_child_two]
a_parent.save()

We can also save a parent onto a child; here we would need to call save.

a_parent = Parent(name="Bob")
a_parent.save()

a_child = Child(name="John")
a_child.parent = a_parent
a_child.save()

We can navigate to the parent record from the child with the join field:

parent_name = Child.objects.get(name="Sophie").parent.name

We can search by the actual joined records. The first line finds the parent who has the child called Sophie associated to it. The later finds the child record who has the parent named bob associated to it.

Parent.objects.get(child=Child.objects.get(name="Sophie"))
Child.objects.get(parent=Parent.objects.get(name="Bob"))

We can search against the parent fields in the format ParentTableName__ParentFieldName__operator. Here we find the child of Bob by using the parent’s name field.

child_of_bob = Child.objects.get(parent__name="Bob")
parent_of_sophie = Parent.objects.get(child__name="Bob")

We can use all the normal filter operators:

Parent.objects.filter(child__name="Sport")
Parent.objects.filter(child__name__contains="S")
Parent.objects.filter(child__name__contains="S").filter(child__name__contains="p")

Where a the join is optional and an association has not been made to a record Django fills the field with null. We can use the isnull operator to query records if they have an association or not.

parent_with_children = Parent.objects.filter(child__name__isnull=True)
parent_with_no_children - Parent.objects.filter(child__name__isnull=False)

The child_set returns a QuerySet which we can predicate, order and iterate through etc.

Parent.objects.get(name="Dave").child_set.filter(name__contains="S").count()

Many operations upon child records will cause subsequent database hits as we traverse the relational model. The select_related function pre-populates the join objects so we don’t have to hit the database again

Parent.objects.select_related().get(name="Dave").child_set.count()
Child.objects.select_related().get(name="Sophie").parent.name

We can annotate the models with aggregate statistics of the associated child records.

parents_with_name = Parent.objects.annotate(children_count=Count("child"))
parents_with_name.get(name="Luke").children_count

Many To Many

A many to many (m:n) join allows any parent record to be associated to any number of children as well as any child record to be associated to any number of parents.

The following example creates the tables Author and Books and then creates a many to many relationship between them.

class Author(Model):
    name = models.CharField(max_length=10, unique=True)

class Book(Model):
    authors = models.ManyToManyField(Author)
    name = models.CharField(max_length=10, unique=True)

Books can access their authors via the authors field, an author can access their books with the auto generated book_set field unless the related_name option is used to define the field name.

for anAuthor in Author.objects.all():
    for a_book in anAuthor.book_set.all():
        print("{0} --> {1}".format(anAuthor, a_book))

for a_book in Book.objects.all():
    for anAuthor in a_book.author.all():
        print("{0} --> {1}".format(a_book, anAuthor))

We can assign records with the add function upon either end of the join field; we only need to associate the records once. Calling add saves the change to the database; there is no need to call save afterwards.

an_author = Author.objects.get_or_create(name="Author 1)[0]
a_book = Book.objects.get_or_create(name=name)[0]
a_book.authors.add(author)

As long as the join is non mandatory we can associate the records from either end via the authors or book_set field just the same way as associating children to a parent in the example of 1:n joins above.

a_parent.book_set.add(a_child)
a_parent.book_set.add(a_child, a_child_two)
a_parent.book_set = [a_child, a_child_two]

Calling either side’s join field returns a QuerySet which supports the the same functionality as with calling the child field on a 1:n relationship; i.e calling parent.child_set as shown above.

Book.objects.get(name="1984").authors.filter(name__contains="o")

Author.objects.get(name__contains="George").book_set
Author.objects.get(name__contains="George").book_set.filter(name__contains="b")

Additional Join Notes

We can place restrictions upon the the the valid join records via the limit_choices_\to option.

fieldname = models.ForeignKey(MyModel, limit_choices_to={'field_name': 'value'})

By default the foreign key and joins for relationships are made with the primary key field; we can use to_field to define another field.

fieldname = models.ForeignKey(MyModel, to_field='field_name')

As mentioned previous the on_delete=models.CASCADE flag means that when we delete a record all assigned entities are also deleted. Other options include:

  • PROTECT: Raises an error if any joined records are found
  • SET_NULL: Joined records are assigned null instead of the record which is being deleted
  • SET_DEFAULT: Joined records are assigned the default value for the field; you must set default option on the field.

Model Inheritance

As per inheritance in Object Orientated programming we can inherit models to reuse model definitions.

Duplicating Fields

The simplest option is to simply copy and paste the fields; though this should be kept to a minimum to prevent abusing the DRY principal. For simply a duplicating a field or two between a couple models this might be the simplest option.

Abstract Parent

An abstract parent allows inheriting fields form a parent model. When we apply our migrations a table named after the child model is found with all the fields from the child and parent. No table is made for the parent model.

class AbstractBaseParent(Model):
    name = models.CharField(max_length=10, unique=True)

    def __str__(self):
        return self.name

    class Meta:
        abstract = True

class AbstractBaseChild(AbstractBaseParent):
    age = models.IntegerField()

    def __str__(self):
        return "Name: {0}, Age: {1}".format(self.name, self.age)

Note: Marking the parent table as abstract within the class meta ensures that no table is generated for the parent table.

We save a child record with the fields of both the parent and child.

a_child = AbstractBaseChild(age=1, name"foo")
a_child.save()

We have access to read the parent models fields from the child model instance.

for child in AbstractBaseChild.objects.all():
    print(child.name, child.age)

A child model can inherit from any number of parent models.

class AChild(ParentOne, ParentTwo, ParentThree):
    pass

Multi-Parent base Child

A multi parent base child creates table definitions for all parent models as well as the child model. Django creates foreign keys within the schema and automatically generates model fields and the required SQL for traversing the joins

Data access is always made via the child which has access to all parent fields as if they are local. Django magically saves and reads to and from the parent tables.

class MultiTableBaseParent(Model):
    name = models.CharField(max_length=10, unique=True)
class MultiTableBaseChild(MultiTableBaseParent):
    age = models.IntegerField()

    def __str__(self):
        return "Name: {0}, Age: {1}".format(self.name, self.age)

Note: This more normalised approached will have the overhead of extra data reads and writes. Use with care.

We access child and parent fields through an instance of the child model. This includes when we create an instance of a child record. We pass in all parent fields at the same time; Django handles saving the parent record.

a_child = MultiTableBaseChild(age=age, name=name)
a_child.save()

We have access to the parent fields as if they are local to the child.

for child in MultiTableBaseChild.objects.all():
    print(child.name, child.age)

Proxy

A Proxy child allows us to wrap another model with extra class meta information. Data is read and written to the true model table.

In the following example the real table is ProxyParent and the proxy model is called ProxyChild.

The proxy child inherits from the ProxyParent model and sets proxy=True in the class meta.

class ProxyParent(Model):
    name = models.CharField(max_length=10, unique=True)

class ProxyChild(ProxyParent):
    class Meta:
        proxy = True

We save an instance of ProxyChild as per normal however the record is saved into the ProxyParent table.

ProxyChild.objects.get_or_create(name="Luke")

We can even access the data via the ProxyChild.

for child in ProxyChild.objects.all():
    print(child)

So why bother? The main point here is that we can have a model class with different meta information from it’s parent. One possible use is to set a different Data Manager. Here we can place additional predicates upon what is considered all of the data. For example we would return only records which are validated or considered live data rather than having to continuously adding in the predicate when required.

Validation

When using Django forms you call is_valid() to ensure all validation is called. When you are working at the model level you can call full_clean().

The functions is_valid() and full_clean() calls the following functions upon the model:

  1. Model.clean_fields() which validates the model’s fields
  2. Model.clean() which validates the model as a whole entity
  3. Model.validate_unique() which validate the field uniqueness against the database and any field constraints.

Simply calling save() upon a model does not raise validation; only database schema constraints will raise errors.

All the functions above allow include/exclude fields to control which fields are validated against.

Django comes with the following validators:

  • MinValueValidator and MaxValueValidator for validating against min and max numerical values.
  • MinLengthValidator and MaxLengthValidator for validating against min and max string fields length.
  • RegexValidator for more complex string validation.

Validators are initiated and assigned to the validators property when defining the model field; it takes a list.

We can also define our own custom validator; simply write a function which takes a value and raises a ValidationError if required.

In the following example we add the following validation onto our ContactDetails model.

  • clean() validates to ensure that the model does not have the name Luke. This is a poor example; clean should be used to validate between fields or where applicable to touch the database.
    • The code sets the error type identifier; this will default to invalid
    • The error raised takes a dictionary taking the field name and then the error.
  • age integer field has a min and max value of 10 and 100
  • name has a min and length of 3 and 10. We also provide a regular expression to ensure it starts with an upper case letter (A-Z) and then contains at least one lower letter.
  • contactDate takes our custom validator to ensure that the date is in the future.
def is_future_date_validator(value):
    if value <= date.today():
        raise ValidationError("{0} is not a future date.".format(value))


class ContactDetails(Model):
    def clean(self):
        if self.name == 'Luke':
            raise ValidationError({'name': "Luke is barred"}, code="Invalid")

    age = models.IntegerField(
        validators=[
            MinValueValidator(10),
            MaxValueValidator(100)])

    name = models.CharField(max_length=20,
                            validators=[
                                MinLengthValidator(3),
                                MaxLengthValidator(10),
                                RegexValidator("^[A-Z][a-z]{1,}$")])

    contactDate = models.DateField(
        validators=[
            is_future_date_validator])

Calling full_clean raises a ValidationError which contains a dictionary of validation error messages within the message_dict property. We can see the error messages raised with the following test.

from unittest import TestCase
from datetime import date

from django.core.exceptions import ValidationError

from ..models import ContactDetails

class TestCreateAuthorBook(TestCase):

    def test_create_author_book(self):

        try:
            a_contact = ContactDetails(name="Luke", age=101, contactDate=date(year=2015, month=7, day=2))
            a_contact.full_clean()
        except ValidationError as e:

            self.assertEqual({'name': ['Ensure this value has at least 10 characters (it has 4).', 'Luke is barred'],
                              'age': ['Ensure this value is less than or equal to 100.'],
                              'contactDate': ['2015-07-02 is not a future date.']}, e.message_dict)

Overriding Error Messages

Django provides a number of default error messages for various types of invalid data; for example when we try to save a model without setting a mandatory field or if we invalidate a maximum value of the field. We can override the default messages by providing a dictionary of error types and validation messages to the error_messages argument when defining the model field.

 name = models.CharField(max_length=20, 
     error_messages={'blank': 'Please provide a value'
                     'invalid': "Names must start with an upper case letter and contain only letters"})

Valid keys include null, blank, invalid, invalid_choice, unique, and unique_for_date. We can create our own when manually raising ValidationError; see the example above in the clean function.

Error Scope

Error messages are associated to where they have been validated against. Where this is a field it is associated to the field via it’s name. The error messages are found within a dictionary upon the exception raised via the message_dict property. When we raise a ValidationError manually we can specify the origin of the error; in the example above we associate it to the name field.

 def clean(self):
        if self.name == 'Luke':
            raise ValidationError({'name': "Luke is barred"}, code="Invalid")

When we hook in the UI the association is important as it affects where the error message is displayed. Associating it to a field displays it next to the field, otherwise it is displayed at the top above all fields.

Transactions

Django’s default behaviour is autocommit mode; each touch of the database is immediately committed.

We can change this to a transaction to ensure all database touches either pass or fail together; use the with statement along with the atomic class to achieve this.

Leaving the scope of the with statement by passing naturally outside will cause the database to be committed. If an error is thrown inside the with scope which is handled anywhere outside of the with statement the database is rolled back losing all changes made since the start of the with statement.

Note: It is important to ensure that the error handling code is outside of the with scope otherwise the database will be committed if we don’t re-throw the exception.

from django.db import transaction, IntegrityError

def another_function():
    try:
        with transaction.atomic():
            # Touch db
    except IntegrityError:
        # Handle error. 

Alternatively we can decorate a function with the atomic decorator; any touching of the database is wrapped in a transaction. The database is committed when we leave the function without error.

We can manually roll back the database by simply raising an exception which is not caught or handled inside the function decorated by the @transaction.atomic decorator.

from django.db import transaction

@transaction.atomic
def a_function():
    pass

Transactions can be nested; failure only rolls back database touches which are within scope. Take the following example.

with transaction.atomic():       
    MyModel.object.all().first().delete()
    try:
        with transaction.atomic():
            MyModel.object.all().first().delete()
            raise Exception()
    except IntegrityError:
    print("Inner")

We have a two with satement; one nested inside the other.

The outer with statement deletes a record, we then enter the inner with statement where we delete another record.

Immediately after the second delete, still in scope of both with statements, we raise an exception which is caught outside of the scope of the inner with statement. The deletion of the second record is undone.

The exception handler catches the exception and allows code execution to continue naturally outside of the outer with statement causing the database to be committed and saving the initial delete.

The result is only the initial delete is persisted to the database.

Additional Information

  • Model Meta Options can be used for various configs from ordering, table names etc.
    Reference:
  • Managers can be used provident different perspectives of the data.

References

Relationships
Inheritance
Many to Many
Many to One
One to One
Transactions

Advertisements

Django Models Introduction

Models Introduction

This post is part of my Django series. You can see an overview of the series along with instruction on how to get all the source code here.

This article assumes you are comfortable creating a Django project, can create apps and register them into the INSTALLED_APPS list of the settings file. If not please read my Django HelloWorld article.

It also assumes that you have a database created and configured for connection. By default Django comes preconfigured for SQLite3. The database and default schema for the required Django default applications can be created with the following command.

$ python manage.py migrate

This article assumes you have a project called DjangoSandBox and an app called modelsintrodcution.

ORM

Django comes with an inbuilt ORM with drivers for many database vendors. By default it is configured to use SQLite3. The ORM removes, for the most part, interaction directly with the database. Django will create and manage schema along with generating any SQL which is required for selection, inserts, updates and deletes of data.

Any class which is contained within models.py and inherits from django.db.models.Model will have a table created within the configured database. We also get a rich framework for querying, creating, updating and deleting records.

A simple model would look like:

# models.py
from django.db import models

class Sample(models.Model):
    name = models.CharField(max_length=20)

To apply the model where modelsintroduction is the application you have placed the model in.

$ python manage.py makemigrations modelsintroduction
$ python manage.py migrate

We now have a table called modelsintroduction_sample within the database. It has one field called name which is a string field and can contain up to 20 characters.

By default Django will create an id field for any table where the model does not explicity mark any field as the primary key. The auto generated primary key field will be of an integer type and have an auto incrementing identity index.

If we want to change the schema of the table we simply change the model definition and rerun the migration; Django takes care of the rest.

$ python manage.py makemigrations modelsintroduction
$ python manage.py migrate

Note: You need to have consideration when modifying models; under the hood Django makes schema changes via SQL; check the SQL to ensure that the changes are as expected and are possible. Don’t expect to be able to add a mandatory column with no default to a table which is already populated.

Migration scripts are added into the migrations directory of the application directory. You can view the file in an editor or use the sqlmigrate command to view them on the terminal formatted. The following will view the migration named 0001 in the myapp application.

$ python manage.py sqlmigrate myapp 0001

Django also provides the check function to validate that the current outstanding model changes can be applied.

$ python manage.py check myapp

Model Creation

Django has many field types all specialising for a certain situation, each field is also then configurable.

Lets create our model:

#modelsintroduction/models.py
from django.db import models

OPTION_SEX_MALE = 'M'
OPTION_SEX_FEMALE = 'F'
OPTION_SEX_MALE_DESCRIPTION = 'MALE'
OPTION_SEX_FEMALE_DESCRIPTION = 'FEMALE'

OPTIONS_SEX = (
    (OPTION_SEX_MALE, OPTION_SEX_MALE_DESCRIPTION),
    (OPTION_SEX_FEMALE, OPTION_SEX_FEMALE_DESCRIPTION)
)


class Person(models.Model):
    name = models.CharField(max_length=20, unique=True)
    height = models.FloatField()
    date_of_birth = models.DateField()
    sex = models.CharField(max_length=1, choices=OPTIONS_SEX)  
    # get_sex_display() to get the display of MALE/FEMALE
    validated = models.BooleanField(default=False)

    def __str__(self):
        return "{0} - {1} - {2}- {3}- {4}".format(self.name, self.sex, self.date_of_birth, self.height, self.validated)

    class Meta:
        get_latest_by = 'date_of_birth'
  • The name field is a string field of up to 20 characters with a unique index placed on the schema.
  • The height field is a float field.
  • The sex field is a string field with one character only. We provide a choice list of M/F with mapping to Male and Female strings. When we ask our person class it’s sex we return M/F though we can translate this to Male/Femail with get_sex_display(); this function is automatically created by Django upon our model class.
  • The validated field is a boolean field with a default option of False.

Models can contain an inner class called Meta. Here we can set various meta options which are used throughout the Django framework.

Field Types ## {#Field Types}

Django provides a host of field types which you can read about more here. Below is a selection of the most common entities and their settings.

String Fields

  • max_length is the maximum number of characters permitted.
models.CharField(max_length=10)      # Displays as a text box
models.TextField(max_length=100)     # Displays as a text input area; multi lines
models.EmailField(max_length=100)    # Provides regex validation for a email address
models.CommaSeparatedIntegerField(max_length=100)   # List of csv integers

Boolean Fields

models.BooleanField()           # true/false
models.NullBooleanField()       #null/true/false

Integer Fields

models.BigIntegerField() # 64 bit: -9223372036854775808 to 9223372036854775807
models.IntegerField() # 32 bit: -2147483648 to 2147483647
models.PositiveIntegerField() # 0 to 2147483647
models.SmallIntegerField() # -32768 to 32767
models.PositiveSmallIntegerField # 0 to 3276

Real Number Fields

  • max_digits is the maximum number of digits allowed
  • decimal_places is the maximum number of digits allowed to the right hand side of the decimal point
models.DecimalField(max_digits = 8, decimal_places = 2)     # Python Decimal type.
models.FloatField()       # Python Float type. Does not take max_digits or decimal_places

Date Fields

  • auto_now set as true will update the field value to now upon every save
  • auto_now_add set as true will update the field value to now upon record creation
models.DateField()       # Date
models.TimeField()       # Time
models.DateTimeField()   # Date and time 
models.DurationField()   # modelled by Python time delta

Common Field Type Attributes

A list of the common field attributes which we pass into the field when we define it.

Attributes Possible Value Description
null True/False If true sets the nullable constrains on the field in the db
blank True/False Whether Django validation allows no data for the model field. This allows extra control over the null option. We can enforce collection upon certain forms and not on others
choices A list of possible strings A list of possible values for string fields. Acts like an enumerator
db_column String Override the field name created in the table. By default it is set as the name of the field in the model
db_index True/False If true an index is added onto the table field
default Any value of the field type Sets a default value. If no field value provided a record takes this value for this field
editable True/False When Django automatically creates the UI for creating and updating this table, determines if this field will be displayed or not.
error_messages Dictionary of strings Custom error messages
help_text String Defines a help message to be displayed next tot he field upon forms
primary_key True/False Makes the field a primary key
unique True/False Adds uniqueness constraints to the db for the field value
unique_for_date True/False Adds a uniqueness constraints to the db for the date value
unique_for_month True/False Adds uniqueness constraints to the db but only for the month value of a date
unique_for_year True/False Adds uniqueness constraints to the db but only for the year value of a date
verbose_name adds String Provides a custom external name for the table. Used when displaying to the user on the UI. This value defaults to the field name with underscores replaced with spaces and the words capitalised

CRUD

Create, Update and Delete; how we create and maintain our data.

Creation

Djago provides us with a number of ways of creating records.

The first option is via the constructor function, we pass in the fieldnanes and values as **kwargs. Arguments marked with ** collect all named arguments and places them within a dictionary.

In the following example the record is not created until we call save upon the model instance.

from mypass.models import Person

a_person = Person(name="Luke, sex="M", height=1.82, date_of_birth=date(1975, 9, 17))
a_person.save()

We can access all records within the modelsintroduction_person table via the Person model class and the all function. Here we use print to output a string returned from __str__ function of the model to the terminal.

for a_model in Person.objects.all():
    print(a_model)

We can access individual fields via their name.

for a_model in Person.objects.all():
    print(a_model.name)

We can use an empty constructor call and update record via the fields we defined upon the model.

a_person = Person()
a_person.name = "Luke"
a_person.sex = "M"
a_person.height = 1.82,
a_person.date_of_birth = date(1975, 9, 17)
a_person.save()

We can use the get_or_create function to only create our record once, even after subsequent calls. All fields passed in as named/valued arguments are used to predicate the table. If the record already exists it is simply returned.

The function returns a tuple of the record and a boolean indicating if the record was created.

a_person = Person.objects.get_or_create(name="Luke", sex="M", height=1.82, date_of_birth=date(1875, 9, 17))[0]

If we only wanted a subset of the fields to be used to determine if the record exists we can pass the other fields into the parameter named defaults which is a dictionary. The following would only use the name field to determine if the record exists. The second call would simply return the record created in the first call despite having all fields different except name .

a_person = Person.objects.get_or_create(name="Luke", defaults={'sex': "M", 'height': 1.82, 'date_of_birth' : date(1875, 9, 17))

a_person = Person.objects.get_or_create(name="Luke", defaults={'sex': "F", 'height': 2.82, 'date_of_birth' : date(1885, 9, 11))

The update_or_create funcion works similar to the get_or_create though subsequent fields will update the record. Here only one record is created though the state of the record resembles the second function call.

a_person = Person.objects.update_or_create(name="Luke", defaults={'sex': "M", 'height': 1.82, 'date_of_birth' : date(1875, 9, 17))

a_person = Person.objects.update_or_create(name="Luke", defaults={'sex': "F", 'height': 2.82, 'date_of_birth' : date(1885, 9, 11))

The create function is similar to the constructor though no call to save is required.

Person.objects.create(name="Luke", sex="M", height=1.82, date_of_birth=date(1875, 9, 17))

The bulk_create can create a number of records within the same transaction. By default Django runs in a mode called autocommit; all data is committed upon every touch of the database. This means that nothing runs in a transaction unless especially asked for.

people = []
people.append(Person(name="Luke", sex="M", height=1.82, date_of_birth=date(1875, 9, 17)))
people.append(Person(name="Lukey, sex="M", height=1.82, date_of_birth=date(1875, 9, 17)))

Person.objects.bulk_create(people)

Updating

Updating is made by modifying a model instance and then calling save.

tallest_man.validated = True
tallest_man.save()

The save function() can be passed a collection of fields to save; omitted fields are not saved even if they are modified.

a_person.save(update_fields=["name", "sex"])

We can update multiple records with the update command. Here we update all records to be validated.

Person.objects.all().update(validated=True)

Deleting

Deleting is made by calling delete upon a model instance.

a_person.delete()

Cloning Records

There is no specific clone function though we can reset the id to 0 and call save upon the record. Django will generate a SQL insert statement if the id/pk is None, “” or 0, otherwise it will generate a SQL update statement.

a_person.id = 0
a_person.save()

Predicates

Predicates allow the ability to conditionally select data from the database.

All

Whenever we deal with a collection of records or a handle upon the table we deal with a QuerySet. This provides use with the ability to query, order, update and many other things.

The easiest way to get a handle on a QuerySet is to call object.all() upon a model. We can then iterate through the QuerySet.

for a_model in Person.objects.all():
    print(a_model)

Filter

The filter function allows predicating or searching upon the data; a SQL where clause.

Each field on the model can be searched upon by passing key value pairs into the filter function in the format; [field-name]__[operator]=value.

Note: between the field and the operator there are two underscores.

Querying a field directly for equality is made by the [field-name]__[exact] keyword.

Person.objects.filter(id__exact=1)

There is a shortcut to the exact operator; we can simply use the field name.

Person.objects.filter(id=1)

For the primary key field we can also use the ‘pk’ alias which in our case would equate to id.

Person.objects.filter(pk=1)

IsNull

The isnull operator can be used to see if a column has a null value.

Person.objects.filter(name__isnull=False)

Strings

Django provides a number of operators for string fields.

By default exact, when used upon string fields, uses case sensitive searching though ultimately this would depend upon the database and the configuration used.

We can explicitly ask for case insensitive searching with the iexact option.

Person.objects.filter(name__iexact="sophie")

The contains argument will search for a string or characters anywhere within the stored field value. The icontains is a case insensitive version.

Person.objects.filter(name__contains="Sop")
Person.objects.filter(name__icontains="sop")

The startswith and endswith operators allows searching for a string of characters which are found at the start or end of the stored field value respectively. Once more there are case insensitive versions which are prefixed with i.

Person.objects.filter(name__startswith="Sop")
Person.objects.filter(name__endswith="hie")
Person.objects.filter(name__istartswith="sop")
Person.objects.filter(name__iendswith="hie")

If the database supports full-text searching then we can improve performance of string searching by using the search operator.

# SQLite3 does not support full-text search
Person.objects.filter(name__search="sop")

Note: SQLite3 does not allow this functionality.

We can search for complex string pattern with regular expressions and the regex and iregex operators.

Person.objects.filter(name__regex="^[SJ]")
Person.objects.filter(name__iregex="^[sj]")

Dates

Date fields can have their individual components queried with the year, month, day, hour, min, second and week_day operators.

Person.objects.filter(date_of_birth__year=1977)
Person.objects.filter(date_of_birth__month=8)
Person.objects.filter(date_of_birth__day=25)

Choices

For fields which are enumerated with the choice setting we can search upon their field value directly with the exact operator. We can also use the in operator where we are interested in several of the choice values.

Person.objects.filter(sex=OPTION_SEX_FEMALE)
Person.objects.filter(name__in=["Sophie", "Claire", "Jim"])

Comparisons

Comparison operators consist of the greater than (gt), greater than or equal to (gte), less than (lt) and less than or equal to (lte). We can also search for all entries between a min and max value with the range operator. This translates to a SQL BETWEEN clause.

Comparisons operators can be used with all field types; strings, numericals and dates.

Person.objects.filter(height__gt=1.75)
Person.objects.filter(height__gte=1.75)
Person.objects.filter(height__lte=1.75)
Person.objects.filter(height__lt=1.75)
Person.objects.filter(height__range=(1.75, 2.00))

AND & OR

The filter function can take multiple operators separated by commas, by default these use an AND notation. The following would search for people named Sophie and are greater than 1.75 in height.

Person.objects.filter(name="Sophie", height__gt=1.75)

We can use the Q function along with either I or & operators to use a OR or AND notation between predicate entries.

The following searches for People who are called Sophie or Emma. The latter searches for people called Sophie who are 1.7 hight.

from django.db.models import Q

Person.objects.filter(Q(name="Sophie") | Q(name="Emma"))
Person.objects.filter(Q(name="Sophie") & Q(height=1.7))

We can even match the comma (AND) along with Q and I or &. The following searches for women ( SEX=”F”) and who are called Sophie or Emma.

Person.objects.filter(Q(sex="F"), Q(name="Sophie") | Q(name="Emma"))

F

The F function allows us to use the value of one field to search upon another. Here we find anyone who has the letter of their sex (F/M) in their name.

from django.db.models import F

Person.objects.filter(name__contains=F("sex"))

Exists

The exists function returns true/false depending upon if the current QuerySet will yield any records. Calling this upon all() will return true/false depending upon if the table is empty or not.

Person.objects.filter(name="Jimmy").exists()
Person.objects.all().exists()

Chaining

We can chain many QuerySet functions together and even multiple filter calls; this can aid code readability. Below we search for women called Sophie; i.e. AND notation is used.

Person.objects.filter(sex=OPTION_SEX_FEMALE).filter(name="Sophie")

The interesting thing about chaining is that we don’t actually hit the database until we use the data; i.e. looping through, calling first or last upon the QuerySet.

Each QuerySet also has a cache against the query; so even if we were to loop through a QuerySet twice it would only hit the database once. This does not hold true for the all() function upon the model.objects as this returns a new QuerySet each time.

Exclude

The exclude function can be used to provide a predicate which all returned records will not match. It can take any of the operators as defined with the filter function. We can also chain this to the filter function. Here we find people who are female but are not called Sophie.

Person.objects.filter(sex=OPTION_SEX_FEMALE).exclude(name="Sophie")

Ordering

We can order the data with the order_by function along with the field name. This will order the data ascending though we can call reverse to order the data descending.

people_by_height = Person.objects.all().order_by("height")
people_by_height_desc = Person.objects.all().order_by("height").reverse()

We can order upon multiple columns by providing multiple field names.

people_by_height = Person.objects.all().order_by("height", "name")

Calling order_by subsequent times will overwrite the ordering; the following will order the data by the name column, the order upon the height column will be discarded.

people_by_height = Person.objects.all().order_by("height").order_by("name")

First, Last, Latest, Earliest

We can ask for the first or last record within the database or QuerySet. If there is no order criteria upon the QuerySet then the data will be returned ordered by the primary key. If the QuerySet is empty then None will be returned.

Person.objects.all().order_by("name").first()
Person.objects.all().order_by("name").last()

In fact as QuerySet is a collection and as such we can access the elements by their ordinal position; a call to first is equivalent to [0].

Person.objects.all().order_by("name")[0]

For date fields we can use the latest and earliest functions along with a field name or names. This will return the first or last entry once the data has been ordered. If no fields are defined then the field set upon the model’s class meta entry of get_latest_by is used.

Person.objects.latest("date_of_birth")
Person.objects.earliest("date_of_birth")

If the QuerySet is empty then calls to these functions will raise a DoesNotExist error.

Get

Get works exactly the same as filter though it assumes one record will be returned. If no records are returned then DoesNotExist is raised. If multiple records are returned then MultipleObjectsReturned is raised.

Get can take any operators as defined with the filter function.

a_person = Person.objects.get(name="Emma")

Limiting Returned Records

QuerySets respond to collection slicing.

two_elements = Person.objects.all().order_by("name")[1:3]
first = Person.objects.all().order_by("name")[0]

Aggregates

Django provides a few aggregate functions.

The count function can be used to return the number of elements in the QuerySet; it is equivalent to the len function.

all_people = Person.objects.all().count()

Django provides aggregates functions for count, average, minimum, maximum and sum. They are passed into the aggregate function along with the field name.

The aggregate function returns a dictionary of in the format:

{ [field]__[aggregate], value }

For example asking for the average aggregate upon the height column would yield something similar to this:

{ "height__avg", 1.11 }

Below are examples of how to use aggregates:

all_people = Person.objects.all()

sex_count = all_people.aggregate(Count('sex'))["sex__count"]
avg_height = all_people.aggregate(Avg('height'))["height__avg"]
max_height = all_people.aggregate(Max('height'))["height__max"]
min_height = all_people.aggregate(Min('height'))["height__min"]
total_height = all_people.aggregate(Sum('height'))["height__sum"]

The count function has a switch which can be used to return a count of the distinct values.

distinct_sex_count = all_people.aggregate(Count('sex', True))["sex__count"]

We can also call the distinct function which will return a collection of distinct field values. This will generate SQL containing SELECT DISTINCT.

Note: SQLite3 does not support distinct

# Won't work with SQLite3
distinct_sex_count = Person.objects.all().distinct("sex")

Adding Database Indexes

We can add indexes onto the table by either defining the field with db_index=True or via the index_together class Meta option if we require an index with multiple fields.

Here we create an index on field called ‘one’ and another on fields called ‘two’ and ‘three’.

class DbIndexedModelExample(models.Model):
    one = models.CharField(db_index=True)
    two = models.CharField()
    three = models.CharField()

    class Meta:
        index_together = ["two", "three"]

Refresh From DB

A model can have its data refreshed or re-synced from the database with the refresh_from_db function.

a_person.refresh_from_db()

We can limit the fields which are to be reloaded from the database by using the fields parameter.

a_person.refresh_from_db(fields={"name", "height"})

Equality

Model equality in Django uses the python equality or (__eq__ ) function.

By default equality is based upon the types of the model being the same and the equality of the primary key values.

If the model class is a proxy (see the next post) then the mode class is defined as the first non proxy class in the ancestry of the model.

References