Django Models Advance

This post is part of my Django series. You can see an overview of the series along with instructions on how to get all the source code here.

This article assumes you are comfortable creating a Django project, and can create apps and register them in the INSTALLED_APPS list of the settings file. If not, please read my Django HelloWorld article.

This article assumes you have an application called modelsadvanced in a project called DjangoSandBox.

JOINS / Relationships

Django provides three relationship types between models; these translate to joins and foreign keys at the SQL and schema level.

  • One to One (1:1)
  • One to Many (1:n)
  • Many to Many (n:m)

They are all applied as fields to the models; Django then provides an API for walking, querying, setting and deleting the related model instances.
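
To make the schema side concrete, the following is a rough sketch, using Python's built-in sqlite3 module, of the kind of tables each relationship type maps to. The table and column names are illustrative assumptions; the schema Django actually generates prefixes table names with the app label and adds its own constraints and indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE man (id INTEGER PRIMARY KEY, name TEXT);

    -- One to one: a foreign key column that must also be unique.
    CREATE TABLE dog (
        id INTEGER PRIMARY KEY,
        name TEXT,
        owner_id INTEGER UNIQUE REFERENCES man (id)
    );

    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);

    -- One to many: a plain foreign key; many rows may share one parent.
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        name TEXT,
        parent_id INTEGER REFERENCES parent (id)
    );

    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);

    -- Many to many: a separate join table holding pairs of keys.
    CREATE TABLE book_authors (
        book_id INTEGER REFERENCES book (id),
        author_id INTEGER REFERENCES author (id),
        UNIQUE (book_id, author_id)
    );

    INSERT INTO man (id, name) VALUES (1, 'John');
    INSERT INTO dog (name, owner_id) VALUES ('Fido', 1);
    INSERT INTO parent (id, name) VALUES (1, 'Bob');
    INSERT INTO child (name, parent_id) VALUES ('John', 1), ('Sophie', 1);
""")


def second_dog_is_rejected():
    """Return True when the UNIQUE constraint blocks a second dog for man 1."""
    try:
        conn.execute("INSERT INTO dog (name, owner_id) VALUES ('Rex', 1)")
    except sqlite3.IntegrityError:
        return True
    return False


# Two children sharing one parent is fine for a one to many join.
children_of_bob = conn.execute(
    "SELECT COUNT(*) FROM child WHERE parent_id = 1"
).fetchone()[0]
```

The only schema difference between the one to one and one to many cases is the UNIQUE constraint on the foreign key column.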

One To One

A 1:1 relationship is where only one model instance is allowed to be associated on either end of the join field.

In this example we create a Man and a Dog model and associate them with the field owner which is a one to one join field set upon the Dog model. The Man is considered the parent and the Dog is considered the child model.

The parent model will have a field automatically added onto it named after the child model; in this case dog.

from django.db import models
from django.db.models import Model

class Man(Model):
    name = models.CharField(max_length=20)

class Dog(Model):
    name = models.CharField(max_length=20)
    owner = models.OneToOneField(Man, on_delete=models.CASCADE, blank=True, null=True)

The on_delete option defines what happens when the parent record is deleted; here we state CASCADE which means the dog record will also be deleted at the same time as its associated Man record.

The join is optional (not mandatory) in the database as we set null=True. If we were to use Django forms or class based views the relationship would also be considered optional as we set blank=True.

We can create the Man record first then the Dog record passing in the man record to the owner field.

Note: here we use the get_or_create function; to prevent multiple records being generated we need to pass in the man instance into the defaults parameter.

a_man = Man.objects.get_or_create(name="John")[0]
a_dog = Dog.objects.get_or_create(name="Fido", defaults={'owner': a_man})[0]

We could have used the owner field to associate the man onto the dog.

a_man = Man(name="John")
a_man.save()

a_dog = Dog(name="Fido")
a_dog.owner = a_man
a_dog.save()

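We could also have used the auto generated dog field on man to associate the dog onto the man. The foreign key lives on the Dog table, so the man must be saved first and it is the dog which we save afterwards.

a_man = Man(name="John")
a_man.save()

a_dog = Dog(name="Fido")
a_man.dog = a_dog  # also sets a_dog.owner to a_man
a_dog.save()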

If we have a mandatory join then we would have to create the parent record first, and associate the parent (Man) onto the child (Dog) before saving. Trying to save a dog without an owner, when we have a mandatory join, will cause an error to be raised.

We can read data via the dog and owner fields; this traverses the join and causes a database read.

for aMan in Man.objects.all():
    print("{0} --> {1}".format(aMan, aMan.dog))

for a_dog in Dog.objects.all():
    print("{0} --> {1}".format(a_dog, a_dog.owner))

Note: Calling a joined model field will cause a trip to the database upon the first call. Subsequent calls will use a cached version of the data. We can prevent the initial database trip by using the select_related function when we initially get the data.

We can query across the join by passing an instance of the associated record to the join field name. Below we first find the Man who owns a given Dog, then the Dog whose owner is the Man held in the variable a_man.

a_dog = Dog.objects.get(name="Fido")
a_man = Man.objects.get(dog=a_dog)

a_man = Man.objects.get(name="Luke")
Dog.objects.get(owner=a_man)

Alternatively we can query using the joined table field by a double underscore between the join field name, the joined table field name and the operator required.

Dog.objects.get(owner__name="Luke")
Dog.objects.get(owner__name__icontains="L")

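If no joined record exists, accessing the auto generated field on the parent (a_man.dog) will raise an ObjectDoesNotExist error. Accessing a nullable join field on the child (a_dog.owner) simply returns None.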

One To Many

A one to many (1:n) relationship is where one parent record is associated to multiple child records.

In this example we create a one to many relationship between a model called Parent and a model called Child.

class Parent(Model):
    name = models.CharField(max_length=10, unique=True, null=True)

class Child(Model):
    name = models.CharField(max_length=10, unique=True)
    # on_delete is a required argument from Django 2.0 onwards
    parent = models.ForeignKey(Parent, null=True, on_delete=models.CASCADE)

The parent record can be accessed from the child record via the field called parent.

The children can be accessed from the parent record via a field called child_set. A QuerySet containing all children is returned; all QuerySet functions including filter and order_by can be called.

for aParent in Parent.objects.all():
    print("{0} --> {1}".format(aParent, aParent.child_set.all()))

A QuerySet is always returned, albeit an empty one, regardless of whether any child records have been saved. Calling parent upon a child record where no association has been made returns None, as the foreign key is nullable.

Django automatically names the field after the lower case child model name followed by _set (here child_set), though this can be overridden with the related_name option. The following would rename the automatically generated field child_set to children upon the model Parent.

parent = models.ForeignKey(Parent, null=True, on_delete=models.CASCADE, related_name='children')

We can create the parent record and then each child passing the parent into the parent field.

a_parent = Parent.objects.get_or_create(name="Bob")[0]
a_child = Child.objects.get_or_create(name="John", defaults={'parent': a_parent})[0]

We can also call add upon the child_set field to associate a child onto a parent. You don’t need to call save afterwards as add saves the data to the database.

a_parent = Parent(name="Bob")
a_parent.save()

a_child = Child(name="John")
a_child.save()
a_parent.child_set.add(a_child)

The add function of the child field can take any number of children.

a_parent.child_set.add(a_child, a_child_two)

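Alternatively we can replace the whole collection by passing a list of children to the set function of the child_set field. Any existing associated children which are not in the list are removed; this requires null=True on the foreign key. As with add, set saves to the database immediately; no call to save is required. Older Django versions also allowed assigning the list directly (a_parent.child_set = [...]); that form was removed in Django 2.0.

a_parent.child_set.set([a_child, a_child_two])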

We can also save a parent onto a child; here we would need to call save.

a_parent = Parent(name="Bob")
a_parent.save()

a_child = Child(name="John")
a_child.parent = a_parent
a_child.save()

We can navigate to the parent record from the child with the join field:

parent_name = Child.objects.get(name="Sophie").parent.name

We can search by the actual joined records. The first line finds the parent who has the child called Sophie associated to it. The latter finds the child record who has the parent named Bob associated to it.

Parent.objects.get(child=Child.objects.get(name="Sophie"))
Child.objects.get(parent=Parent.objects.get(name="Bob"))

We can search against the joined model’s fields in the format joinfieldname__fieldname__operator. Here we find the child of Bob by using the parent’s name field, and the parent of Sophie by using the child’s name field.

child_of_bob = Child.objects.get(parent__name="Bob")
parent_of_sophie = Parent.objects.get(child__name="Sophie")

We can use all the normal filter operators:

Parent.objects.filter(child__name="Sport")
Parent.objects.filter(child__name__contains="S")
Parent.objects.filter(child__name__contains="S").filter(child__name__contains="p")

Where the join is optional and an association has not been made to a record, Django fills the field with null. We can use the isnull operator to query records by whether they have an association or not.

parents_with_children = Parent.objects.filter(child__name__isnull=False)
parents_with_no_children = Parent.objects.filter(child__name__isnull=True)

The child_set returns a QuerySet which we can predicate, order and iterate through etc.

Parent.objects.get(name="Dave").child_set.filter(name__contains="S").count()

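Many operations upon related records will cause subsequent database hits as we traverse the relational model. The select_related function pre-populates forward joins (foreign key and one to one fields) within the same query so we don’t have to hit the database again. When called without arguments it skips nullable foreign keys such as parent, so here we name the field explicitly. For the reverse side of the join (child_set) we use prefetch_related instead, which loads the children with one extra query.

Parent.objects.prefetch_related("child_set").get(name="Dave").child_set.count()
Child.objects.select_related("parent").get(name="Sophie").parent.name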

We can annotate the models with aggregate statistics of the associated child records.

from django.db.models import Count

parents_with_counts = Parent.objects.annotate(children_count=Count("child"))
parents_with_counts.get(name="Luke").children_count

Many To Many

A many to many (m:n) join allows any parent record to be associated to any number of children as well as any child record to be associated to any number of parents.

The following example creates the models Author and Book and then creates a many to many relationship between them.

class Author(Model):
    name = models.CharField(max_length=10, unique=True)

class Book(Model):
    authors = models.ManyToManyField(Author)
    name = models.CharField(max_length=10, unique=True)

Books can access their authors via the authors field. An author can access their books with the auto generated book_set field, unless the related_name option is used to define the field name.

for anAuthor in Author.objects.all():
    for a_book in anAuthor.book_set.all():
        print("{0} --> {1}".format(anAuthor, a_book))

for a_book in Book.objects.all():
    for anAuthor in a_book.authors.all():
        print("{0} --> {1}".format(a_book, anAuthor))

We can assign records with the add function upon either end of the join field; we only need to associate the records once. Calling add saves the change to the database; there is no need to call save afterwards.

an_author = Author.objects.get_or_create(name="Author 1")[0]
a_book = Book.objects.get_or_create(name="Book 1")[0]
a_book.authors.add(an_author)

As long as the join is non mandatory we can associate the records from either end via the authors or book_set field just the same way as associating children to a parent in the example of 1:n joins above.

an_author.book_set.add(a_book)
an_author.book_set.add(a_book, a_book_two)
an_author.book_set.set([a_book, a_book_two])

Calling either side’s join field returns a QuerySet which supports the same functionality as calling the child field on a 1:n relationship; i.e. calling parent.child_set as shown above.

Book.objects.get(name="1984").authors.filter(name__contains="o")

Author.objects.get(name__contains="George").book_set.all()
Author.objects.get(name__contains="George").book_set.filter(name__contains="b")

Additional Join Notes

We can place restrictions upon the valid join records via the limit_choices_to option.

fieldname = models.ForeignKey(MyModel, on_delete=models.CASCADE, limit_choices_to={'field_name': 'value'})

By default the foreign key and joins for relationships are made with the primary key field; we can use to_field to define another field, which must be unique.

fieldname = models.ForeignKey(MyModel, on_delete=models.CASCADE, to_field='field_name')

As mentioned previously, the on_delete=models.CASCADE option means that when we delete a record all records joined to it are also deleted. Other options include:

  • PROTECT: Raises a ProtectedError if any joined records are found, preventing the delete.
  • SET_NULL: Joined records are assigned null instead of the record which is being deleted; the field must be defined with null=True.
  • SET_DEFAULT: Joined records are assigned the default value for the field; you must set the default option on the field.
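
Django applies these on_delete rules itself when it deletes a record, emulating the behaviour of the equivalent SQL constraints. As a rough analogy only, the sqlite3 sketch below shows the SQL behaviours being emulated: CASCADE, SET NULL, and RESTRICT, which is the closest SQL counterpart to PROTECT. The table names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless we ask for it.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE owner (id INTEGER PRIMARY KEY, name TEXT);

    CREATE TABLE cascade_pet (
        id INTEGER PRIMARY KEY,
        owner_id INTEGER REFERENCES owner (id) ON DELETE CASCADE
    );
    CREATE TABLE nullable_pet (
        id INTEGER PRIMARY KEY,
        owner_id INTEGER REFERENCES owner (id) ON DELETE SET NULL
    );
    CREATE TABLE protected_pet (
        id INTEGER PRIMARY KEY,
        owner_id INTEGER REFERENCES owner (id) ON DELETE RESTRICT
    );

    INSERT INTO owner (id, name) VALUES (1, 'John'), (2, 'Luke');
    INSERT INTO cascade_pet (owner_id) VALUES (1);
    INSERT INTO nullable_pet (owner_id) VALUES (1);
    INSERT INTO protected_pet (owner_id) VALUES (2);
""")

# Owner 1 has a cascading pet and a nullable pet.
conn.execute("DELETE FROM owner WHERE id = 1")
cascade_rows = conn.execute("SELECT COUNT(*) FROM cascade_pet").fetchone()[0]
nullable_owner = conn.execute("SELECT owner_id FROM nullable_pet").fetchone()[0]

# Owner 2 has a protected pet, so the delete is refused.
try:
    conn.execute("DELETE FROM owner WHERE id = 2")
    protected_delete_blocked = False
except sqlite3.IntegrityError:
    protected_delete_blocked = True
```

After the first delete the cascading pet is gone and the nullable pet is kept with an empty owner; the second delete is blocked outright.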

Model Inheritance

As per inheritance in Object Orientated programming we can inherit models to reuse model definitions.

Duplicating Fields

The simplest option is to copy and paste the fields, though this should be kept to a minimum to avoid violating the DRY principle. For duplicating a field or two between a couple of models this might be the simplest option.

Abstract Parent

An abstract parent allows inheriting fields from a parent model. When we apply our migrations a table named after the child model is created with all the fields from the child and parent. No table is made for the parent model.

class AbstractBaseParent(Model):
    name = models.CharField(max_length=10, unique=True)

    def __str__(self):
        return self.name

    class Meta:
        abstract = True

class AbstractBaseChild(AbstractBaseParent):
    age = models.IntegerField()

    def __str__(self):
        return "Name: {0}, Age: {1}".format(self.name, self.age)

Note: Marking the parent table as abstract within the class meta ensures that no table is generated for the parent table.

We save a child record with the fields of both the parent and child.

a_child = AbstractBaseChild(age=1, name="foo")
a_child.save()

We can read the parent model’s fields from the child model instance.

for child in AbstractBaseChild.objects.all():
    print(child.name, child.age)

A child model can inherit from any number of parent models.

class AChild(ParentOne, ParentTwo, ParentThree):
    pass

Multi-Table Base Child

A multi table base child creates tables for the parent model as well as the child model. Django links them with an automatically created one to one field (a foreign key within the schema) and generates the model fields and SQL required for traversing the join.

Data access is normally made via the child, which has access to all parent fields as if they are local. Django saves and reads to and from the parent table for us.

class MultiTableBaseParent(Model):
    name = models.CharField(max_length=10, unique=True)

class MultiTableBaseChild(MultiTableBaseParent):
    age = models.IntegerField()

    def __str__(self):
        return "Name: {0}, Age: {1}".format(self.name, self.age)

Note: This more normalised approach has the overhead of extra data reads and writes, as each child record is spread across two tables. Use with care.

We access child and parent fields through an instance of the child model. This includes when we create an instance of a child record. We pass in all parent fields at the same time; Django handles saving the parent record.

a_child = MultiTableBaseChild(age=1, name="foo")
a_child.save()

We have access to the parent fields as if they are local to the child.

for child in MultiTableBaseChild.objects.all():
    print(child.name, child.age)

Proxy

A Proxy child allows us to wrap another model with extra class meta information. Data is read and written to the true model table.

In the following example the real table is ProxyParent and the proxy model is called ProxyChild.

The proxy child inherits from the ProxyParent model and sets proxy=True in the class meta.

class ProxyParent(Model):
    name = models.CharField(max_length=10, unique=True)

class ProxyChild(ProxyParent):
    class Meta:
        proxy = True

We save an instance of ProxyChild as per normal however the record is saved into the ProxyParent table.

ProxyChild.objects.get_or_create(name="Luke")

We can even access the data via the ProxyChild.

for child in ProxyChild.objects.all():
    print(child)

So why bother? The main point here is that we can have a model class with different meta information from its parent. One possible use is to set a different manager. Here we can place additional predicates upon what is considered all of the data. For example we could return only records which are validated or considered live data, rather than having to add the predicate every time it is required.

Validation

When using Django forms you call is_valid() to ensure all validation is called. When you are working at the model level you can call full_clean().

Both is_valid() and full_clean() call the following functions upon the model:

  1. Model.clean_fields() which validates the model’s fields
  2. Model.clean() which validates the model as a whole entity
  3. Model.validate_unique() which validates field uniqueness against the database.

Simply calling save() upon a model does not run validation; only database schema constraints will raise errors.

The functions full_clean(), clean_fields() and validate_unique() accept an exclude argument listing the field names to skip; clean() takes no arguments.

Django comes with the following validators:

  • MinValueValidator and MaxValueValidator for validating against min and max numerical values.
  • MinLengthValidator and MaxLengthValidator for validating against min and max string lengths.
  • RegexValidator for more complex string validation.

Validators are instantiated and passed as a list to the validators argument when defining the model field.

We can also define our own custom validator; simply write a function which takes a value and raises a ValidationError if required.

In the following example we add the following validation onto our ContactDetails model.

  • clean() validates to ensure that the model does not have the name Luke. This is a poor example; clean should be used to validate between fields or where applicable to touch the database.
    • The code sets the error type identifier; this will default to invalid
    • The error raised takes a dictionary mapping the field name to the error message.
  • age integer field has a min and max value of 10 and 100
  • name has a min and max length of 3 and 10. We also provide a regular expression to ensure it starts with an upper case letter (A-Z) followed by one or more lower case letters.
  • contactDate takes our custom validator to ensure that the date is in the future.

from datetime import date

from django.core.exceptions import ValidationError
from django.core.validators import (MinValueValidator, MaxValueValidator,
                                    MinLengthValidator, MaxLengthValidator,
                                    RegexValidator)
from django.db import models
from django.db.models import Model


def is_future_date_validator(value):
    if value <= date.today():
        raise ValidationError("{0} is not a future date.".format(value))


class ContactDetails(Model):
    def clean(self):
        if self.name == 'Luke':
            raise ValidationError({'name': "Luke is barred"}, code="Invalid")

    age = models.IntegerField(
        validators=[
            MinValueValidator(10),
            MaxValueValidator(100)])

    name = models.CharField(max_length=20,
                            validators=[
                                MinLengthValidator(3),
                                MaxLengthValidator(10),
                                RegexValidator("^[A-Z][a-z]{1,}$")])

    contactDate = models.DateField(
        validators=[
            is_future_date_validator])
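
To see what the regular expression on the name field accepts, here is a small sketch using only Python's re module with the same pattern. The helper name is_valid_name is made up for this illustration; it uses search, which is how RegexValidator applies its pattern.

```python
import re

# Same pattern as passed to RegexValidator on the name field.
NAME_PATTERN = re.compile(r"^[A-Z][a-z]{1,}$")


def is_valid_name(value):
    """Return True when value is one upper case letter then lower case letters."""
    return NAME_PATTERN.search(value) is not None


results = {
    "Luke": is_valid_name("Luke"),    # upper case then lower case letters
    "luke": is_valid_name("luke"),    # does not start with upper case
    "L": is_valid_name("L"),          # no lower case letter follows
    "LuKe": is_valid_name("LuKe"),    # upper case letter in the middle
    "Luke1": is_valid_name("Luke1"),  # digits are not allowed
}
```

Only the first entry passes; note that the pattern also rejects names containing spaces, hyphens or apostrophes.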

Calling full_clean raises a ValidationError which contains a dictionary of validation error messages within the message_dict property. We can see the error messages raised with the following test.

from unittest import TestCase
from datetime import date

from django.core.exceptions import ValidationError

from ..models import ContactDetails


class TestContactDetailsValidation(TestCase):

    def test_full_clean_reports_all_errors(self):
        a_contact = ContactDetails(name="Luke", age=101, contactDate=date(year=2015, month=7, day=2))

        # assertRaises fails the test if full_clean() does not raise at all
        with self.assertRaises(ValidationError) as context:
            a_contact.full_clean()

        # "Luke" passes the length and regex validators, so only clean() complains
        self.assertEqual({'name': ['Luke is barred'],
                          'age': ['Ensure this value is less than or equal to 100.'],
                          'contactDate': ['2015-07-02 is not a future date.']},
                         context.exception.message_dict)

Overriding Error Messages

Django provides a number of default error messages for various types of invalid data; for example when we try to save a model without setting a mandatory field or if we exceed the maximum value of a field. We can override the default messages by providing a dictionary of error types and validation messages to the error_messages argument when defining the model field.

name = models.CharField(max_length=20,
    error_messages={'blank': 'Please provide a value',
                    'invalid': "Names must start with an upper case letter and contain only letters"})

Valid keys include null, blank, invalid, invalid_choice, unique, and unique_for_date. We can create our own when manually raising ValidationError; see the example above in the clean function.

Error Scope

Error messages are associated to where they have been validated against. Where this is a field, the message is associated to the field via its name. The error messages are found within a dictionary upon the exception raised, via the message_dict property. When we raise a ValidationError manually we can specify the origin of the error; in the example above we associate it to the name field.

def clean(self):
    if self.name == 'Luke':
        raise ValidationError({'name': "Luke is barred"}, code="Invalid")

When we hook in the UI the association is important as it affects where the error message is displayed. Associating it to a field displays it next to the field, otherwise it is displayed at the top above all fields.

Transactions

Django’s default behaviour is autocommit mode; each touch of the database is immediately committed.

We can change this to a transaction to ensure all database touches either pass or fail together; use the with statement along with the atomic class to achieve this.

Leaving the with block normally causes the transaction to be committed. If an exception is raised inside the block and propagates out of it, the database is rolled back, losing all changes made since the start of the with statement.

Note: It is important to ensure that the error handling code is outside of the with scope otherwise the database will be committed if we don’t re-throw the exception.

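from django.db import transaction, IntegrityError

def another_function():
    try:
        with transaction.atomic():
            # Touch the database here; everything inside this block
            # is committed or rolled back together.
            pass
    except IntegrityError:
        # Handle the error; the transaction has already been rolled back.
        pass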

Alternatively we can decorate a function with the atomic decorator; any touching of the database is wrapped in a transaction. The database is committed when we leave the function without error.

We can manually roll back the database by simply raising an exception which is not caught or handled inside the function decorated by the @transaction.atomic decorator.

from django.db import transaction

@transaction.atomic
def a_function():
    pass

Transactions can be nested; failure only rolls back database touches which are within scope. Take the following example.

with transaction.atomic():
    MyModel.objects.first().delete()
    try:
        with transaction.atomic():
            MyModel.objects.first().delete()
            raise Exception()
    except Exception:
        print("Inner")

We have two with statements; one nested inside the other.

The outer with statement deletes a record; we then enter the inner with statement where we delete another record.

Immediately after the second delete, still in scope of both with statements, we raise an exception which is caught outside of the scope of the inner with statement. The deletion of the second record is undone.

The exception handler catches the exception and allows code execution to continue naturally outside of the outer with statement causing the database to be committed and saving the initial delete.

The result is only the initial delete is persisted to the database.
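
Django implements nested atomic blocks with database savepoints. The following sketch replays the same scenario by hand using Python's built-in sqlite3 module; the table name my_model and the savepoint name inner_block are made up for the illustration, and the explicit SQL statements stand in for what Django issues on our behalf.

```python
import sqlite3

# isolation_level=None lets us control transactions with explicit SQL.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE my_model (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO my_model (id) VALUES (1), (2), (3)")

conn.execute("BEGIN")                          # outer atomic block starts
conn.execute("DELETE FROM my_model WHERE id = 1")

conn.execute("SAVEPOINT inner_block")          # inner atomic block starts
try:
    conn.execute("DELETE FROM my_model WHERE id = 2")
    raise RuntimeError("something went wrong")
except RuntimeError:
    # Django performs this rollback as the exception leaves the inner
    # block, before our handler runs; only the second delete is undone.
    conn.execute("ROLLBACK TO SAVEPOINT inner_block")
    conn.execute("RELEASE SAVEPOINT inner_block")

conn.execute("COMMIT")                         # outer block ends normally

remaining_ids = [row[0] for row in
                 conn.execute("SELECT id FROM my_model ORDER BY id")]
```

Record 1 stays deleted because the outer transaction committed, while record 2 survives because its delete was rolled back to the savepoint.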

Additional Information

  • Model Meta Options can be used for various configuration such as ordering and table names.
  • Managers can be used to provide different perspectives of the data.
