Blogi 24.8.2018

Are you sure you need an ORM for that?

Kiva kun löysit tämän artikkelin! Se sisältää varmasti hyvää tietoa, mutta pidäthän mielessä, että se on kirjoitettu 6 vuotta sitten.

Using an ORM library is typically the default choice when using relational databases. Yet what if you don’t want to use an ORM? Despite making life easier in a lot of ways, ORM libraries have their limitations and quirks, and inevitably obscure your view into the resulting SQL, which often results in inefficient queries.
So let’s say you want to avoid using an ORM, but would rather not write raw SQL everywhere either. One option is to use a query builder library such as jOOQSquirrel or Knex. Query builder libraries provide a relatively thin abstraction over SQL, but make writing queries more convenient and – crucially – composable.
In the project I’ve been working on for the past year, we use Knex on top of PostgreSQL, which provides first-class JSON support. In this blogpost, I’d like to illuminate how we utilize some of PostgreSQL’s JSON functions to help with managing relations without an ORM.
First, let’s consider the following very simple data model:

Imagine that we want to fetch a person with id 1, along with the person’s hobbies, including the category of each hobby. The desired result should look something like this:

{
    id: 1,
    name: 'John Doe',
    hobbies: [
        {
            id: 1,
            name: 'Tennis',
            category: { id: 1, name: 'Sports' }
        },
        {
            id: 2,
            name: 'Minesweeper',
            category: { id: 2, name: 'Video games' }
        }
    ]
}

Using an ORM library such as Bookshelf (which internally uses Knex), one needs to only define the models and their relations, after which actually fetching the desired result is as easy as this:

Person.where('id'1).fetch({ withRelated: ['hobbies.category'] });

Knex has no concept of models, so accomplishing the same result would involve fetching the desired Persons, their Hobbies and Hobby Categories, and mapping them together manually in the service layer to provide the desired result. That’s a fair bit of work.
What we would like to do is to write one query that returns the entire desired object graph in the correct format. As it happens, PostgreSQL provides JSON building functions that help accomplish exactly this: row_to_json, which turns a single row into a JSON object with column names as keys, and json_agg which does the same for multiple rows and returns the results neatly in an array. Similar functionality can be achieved in MariaDB/MySQL, but not as conveniently.
Let’s first consider the case of fetching a Hobby and its associated Category. We need a query for each:

SELECT id, name FROM hobby_category WHERE hobby_category.id = hobby.hobby_category;
SELECT id, name FROM hobby;

Using the row_to_json-function, we can now get the Category as a subquery within the Hobby query:

SELECT name, id,
    (SELECT row_to_json(sq) FROM
       (SELECT name, id FROM hobby_category
            WHERE hobby_category.id = hobby.hobby_category) sq
    as category
FROM hobby;

The single row returned from the subquery is turned into an object and embedded within the main result, and we get the following result, exactly as we wanted:

{ id: 1, name: 'Tennis', category: { id: 1, name: 'Sports' } }

Now let’s consider the second case of fetching a Person and their Hobbies. This is a many to many relation, so we need to a use a join to a junction table in the subquery:

SELECT name, id FROM person;
SELECT name, id FROM hobby INNER JOIN person_hobby ON person_hobby.hobby = hobby.id WHERE person_hobby.person = person.id;

Now we embed the subquery like before, but using the json_agg-function instead:

SELECT person.name, person.id,
    (SELECT json_agg(sq) FROM
        (SELECT hobby.name, hobby.id FROM hobby
            INNER JOIN person_hobby ON hobby.id = person_hobby.hobby
            WHERE person_hobby.person = person.id) sq
    as hobbies
FROM person

Again, the results look exactly as we wanted:

{ id: 1, name: 'John Doe', hobbies: [{ id: 1, name: 'Tennis' }, { id: 2, name: 'Minesweeper' }] }

So we are now getting the results in the form we want with just SQL – all we need to do is to combine the two queries. Since we’ve seen the raw SQL, let’s translate it into Knex. First, let’s define a function for each table that returns the basic contents of the table:

// In the real world, these would be more complex queries, but the basic principle is the same: functions that return queries which can be composed and embedded in other queries
const getCategories = () => knex('hobby_category').select(['hobby_category.id''hobby_category.name']);
const getHobbies = () => knex('hobby').select(['hobby.id''hobby.name']);
const getPersons = () => knex('person').select(['person.id''person.name']);

We can then compose them together:

const category = getCategories()
    .whereRaw('hobby_category.id = hobby.hobby_category');
const hobbies = getHobbies()
    .select(knex.raw('(SELECT row_to_json(sq) FROM ? sq) as category', [ category ]))
    .innerJoin('person_hobby''person_hobby.hobby''hobby.id')
    .whereRaw('person_hobby.person = person.id');
const query = getPersons()
    .select(knex.raw('(SELECT json_agg(sq) FROM ? sq) as hobbies', [hobbies]))
    .where('person.id'1);

Running the query will return the following results:

{
    id: 1,
    name: 'John Doe',
    hobbies: [
        {
            id: 1,
            name: 'Tennis',
            category: { id: 1, name: 'Sports' }
        },
        {
            id: 2,
            name: 'Minesweeper',
            category: { id: 2, name: 'Video games' }
        }
    ]
}

Which is the exact result we originally wanted.
With a bit of work (the exact details of which are outside the scope of this blog), you can extend Knex’s query builder function to get the final query to look something like this:

const category = getCategories()
    .whereRaw('hobby_category.id = hobby.hobby_category');
const hobbies = getHobbies()
    .includeOne('category', category)
    .innerJoin('person_hobby''person_hobby.hobby''hobby.id')
    .whereRaw('person_hobby.person = person.id');
const query = getPersons()
    .includeMany('hobbies', hobbies)
    .where('person.id'1);

For anyone proficient in SQL, this is quite readable and fairly easy to pick up.
If you really wanted to, you could even define some specialized classes named after your tables that contain query logic as well as knowledge of relations that you’d commonly want to include, so you could eventually end up with something even more readable like this:

Person.where('id'1).fetch({ withRelated: ['hobbies.category'] });

Jokes aside, this basic approach has worked very well in my current project. Although using an ORM would undoubtedly have made basic queries simper to write, with PostgreSQL basic queries are not a problem regardless, and there’s a tendency for an ORM to become an obstacle whenever trying to achieve something more complicated. Although you can generally fall back to raw SQL or a query builder even with an ORM, sticking close to SQL by default makes you think of how to solve your problems with SQL in general, resulting in better queries and improved SQL skills for solving those complicated cases. If your ORM is causing you headaches, I’d gladly recommend sticking to a query builder and utilizing the capabilities of your DBMS.

English

Risto Ollila

Risto is a full-stack software developer with an interest in clean software architecture and efficient, maintainable code. Currently a MSc student in Computer Science at Uni. Helsinki, his work of late has utilized a lot of Node, PostgreSQL and Vue.

Takaisin ylös