Using Django with Multiple Databases

dboost.me
4 min readDec 12, 2022

--

If your backend is in full swing and the database load is getting hot, you may consider to offload part of the work to another database. You can shard or move parts of the database entirely to an additional database. Which can be chosen specifically for your needs: you may store most of your data in Postgres while keeping your analytics in ClickHouse . In this short article we are going to cover how to connect second database to your Django application and use it.

Django Setup

Let’s create Book and Author models and add books database (in models.py and settings.py respectively):

# file: models.py

class Author(Model):
first_name = models.TextField()
last_name = models.TextField()

class Book(Model):
content = models.TextField()



# file: settings.py

DATABASES = {
"default": {
"ENGINE": "django.db.backends.sqlite3",
"NAME": BASE_DIR / "db.sqlite3",
},
"books": {
"ENGINE": "django.db.backends.sqlite3",
"NAME": BASE_DIR / "books.sqlite3",
}
}

It’s time to tell django to store Book model in books database. That is accomplished by creating a database router and registering it in the settings:

# file: router.py

class BookRouter:
books_db = "books"
default_db = "default"

def db_for_read(self, model, **hints):
model_name = model._meta.model_name
if model_name == 'book':
return self.books_db
else:
return None

def db_for_write(self, model, **hints):
model_name = model._meta.model_name
if model_name == 'book':
return 'books'
else:
return None

def allow_migrate(self, db, app_label, model_name=None, **hints):
if model_name == 'book':
return db == 'books'
else:
return db == 'default'


# file: settings.py

DATABASE_ROUTERS = [
"router.BookRouter"
]

In the configured router Book model will be read and written to the books database. For any other model (Author in our case) django will iterate over all configured routers to get the database for the model. If all routers returns None default database will be used.

Running Migrations

Turned out that migrate command accepts database CLI parameter, which by default is set to default , so it’s easy to run migrations on another database:

python manage.py migrate --database=books

In our case we need to run migrations on both databases. Luckily we can create a script to run them in parallel and wait until it’s done:

# file: migrate.sh

python manage.py migrate &
python manage.py migrate --database=books &

wait

Migration file can contain multiple types of operations: model operations, python operations and sql operations:

# file: 0002_mix.py

from django.db import migrations, models


def migration_code(apps, schema_editor):
...


sql_code = "..."



class Migration(migrations.Migration):

dependencies = [
('books', '0001_initial'),
]

operations = [
migrations.AddField(model_name='book', name='short_description', field=models.TextField(null=True)),
migrations.AddField(model_name='author', name='biography', field=models.TextField(null=True)),
migrations.RunPython(code=migration_code, reverse_code=migrations.RunPython.noop),
migrations.RunSQL(sql=sql_code, reverse_sql=migrations.RunSQL.noop),
]

For each operation in a migration file Django will figure out operation target database, if it matches the database on which migration is executed (python manage.py --database=?) operation is executed otherwise operation is skipped.

Thus each operation should be associated with a database. In router.py we defined that Author model is stored in the default database, while Books model is stored in thebooks database. However RunPython and RunSQL operations are routed to the default database and can fail if they touch data from another database.

Thankfully there is a way to route both RunPython and RunSQL to the database of our choice — they accept hints dictionary which is later passed to allow_migrate function:

# file: 0002_mix.py

operations = [
migrations.AddField(model_name='book', name='short_description', field=models.TextField(null=True)),
migrations.AddField(model_name='author', name='biography', field=models.TextField(null=True)),
migrations.RunPython(hints={"db": "books"}, code=migration_code, reverse_code=migrations.RunPython.noop),
migrations.RunSQL(hints={"db": "books"}, sql=sql_code, reverse_sql=migrations.RunSQL.noop)
]


# file: router.py

class BookRouter:

def allow_migrate(self, db, app_label, model_name=None, **hints):
if hints.get("db") is not None:
return db == hints.get("db")
if model_name == 'book':
return db == 'books'
else:
return db == 'default'

So now only AddField operation on Author model will be routed to the default database, all the others will be routed to the books database.

Transactions

Using atomic will start a transaction in the default database unless you explicitly pass the database name in using parameter:

with transaction.atomic(using="books"):
# transaction on books database is started
pass

Transaction commit hook works the same: by default it’s executed after commit to the default database. It’s overriden with using parameter:

def send_email()
pass

transaction.on_commit(send_email, using="books")

Conclusion

In this article we briefly covered the main aspects of using multiple databases:

  • routing models to the database
  • routing SQL and Python queries to the database
  • migrations
  • working with transactions

It looks like a lot of work to setup, but if it’s done right it may bring a lot of benefits:

  • leverage the most suitable database for the particular task
  • share the data between multiple machines with different CPU, RAM and storage types
  • shard the data (if database doesn't support it)

--

--

No responses yet