How to hydrate trees in Doctrine

How to hydrate trees in Doctrine

This tree was stored in the table with parentId field, nothing special, and the most obvious way to get the tree is to fetch root element first. The best for rendering is ORM, it allows to render tree recursive from parent to children. And here I could stop writing this article, because this will work.

The only thing I don’t like here is that it will cause additional SQL request for every tree node children. At SQL level it will look like like this:

SELECT * FROM table_name WHERE parent_id=0; 
-- for example return a record with ID=1 

SELECT * FROM table_name WHERE parent_id=1;
-- here we get a bunch of records with different IDs

SELECT * FROM table_name WHERE parent_id=X;
-- and so on for every tree node
...

Is it a problem? Probably not, if you have a small tree. And if you’re sure it always will be small. Otherwise executing unpredictable count of SQL request from PHP could be a big deal.

So probably we have to find some way to hydrate whole tree by a single SQL request.

At first sight we could recursively join the table to itself on parent_id field. But we can’t predict join count. That’s a problem, and it could be better to fetch nodes by one.

So what can we do?

How Doctrine hydrates objects

First of all let’s look how hydration works in Doctrine and how objects are hydrated. This is a quote from the
[Object Hydration
](https://www.doctrine-project.org/projects/doctrine-orm/en/2.16/reference/dql-doctrine-query-language.html#object-hydration
) section of Doctrine manual

Objects fetched in a FROM clause are returned as a Set, that means every object is only ever included in the resulting array once. This is the case even when using JOIN or GROUP BY in ways that return the same row for an object multiple times. If the hydrator sees the same object multiple times, then it makes sure it is only returned once.

What does it mean? Let’s fetch the same post from Blog table twice

/** @var Post $post1 */
$post1 = $postRepository->find(1);
/** @var Post $post2 */
$post2 = $postRepository->find(1);

if($post1 === $post2) {
    dump('It is the same object');
} else {
    dump('It is NOT the same object');
}
// Output: "It is the same object"

Wait, wait!

We could expect that variables $post1 and $post2 contain the same post data, but it’s not just a copy!
Actually these variables are links to the same object in the memory!

Let’s play with it.

Experiment #1

If you change title of $post1, what will be with the title of $post2?

Spoiler: It will be changed too 🙂

$post1->setTitle('Some title');
if($post2->getTitle() === 'Some title') {
    dump('Title is the same: "Some title"');
};
// Output: 'Title is the same: "Some title"'

$post1->setTitle('Some title 2');
if($post2->getTitle() === 'Some title 2') {
    dump('Title is changed to: "Some title 2"');
};
// Output: 'Title is changed to: "Some title 2"'

Experiment #2

What will do this time:

  • fetch $post1
  • change title to some new title
  • fetch $post2

Question: What title will it have?
Spoiler: the changed title.

/** @var Post $post1 */
$post1 = $postRepository->find(1);
$post1->setTitle('Some new title');

/** @var Post $post2 */
$post2 = $postRepository->find(1);
if($post2->getTitle() === 'Some new title') {
    dump('Title is: "Some new title"');
};
// Output: 'Title is the same: "Some new title"'

Wow! It means, Doctrine thinks the title is already fetched and doesn’t overwrite it! It doesn’t overwrite defined properties at all!

Ok. What about undefined properties?

Experiment #3

We know usually ORM uses lazy fetch mode and that means, that related entities will not be fetched before use. Let’s check our $post1’s tags

/** @var Post $post1 */
$post1 = $postRepository->find(1);
dump($post1->getTags());
// Output:
// #collection: Doctrine\Common\Collections\ArrayCollection {#577 ▼
//   -elements: []
// }

So, as expected, the tags collection is empty because of lazy fetch mode.

Now let’s fetch $post2 with joined tags

$post2 = $postRepository
    ->createQueryBuilder('post')
    ->addSelect('tag')
    ->innerJoin('post.tags', 'tag')
    ->where("post.id = :id")
    ->setParameter('id', 1)
    ->getQuery()
    ->getSingleResult();

dump($post2->getTags());
// Output
//   #collection: Doctrine\Common\Collections\ArrayCollection {#577 ▼
//    -elements: array:3 [▶]
//  }

Here we can see, tags collection is NOT empty. Ok. This is expected too.

And finally go back to $post1 and check tag count:

dump($post1->getTags());
// Output:
//  #collection: Doctrine\Common\Collections\ArrayCollection {#577 ▼
//    -elements: array:3 [▶]
//  }

Surprise! It’s changed! Amazing!

We did literally nothing with $post1 but it’s changed because of $post1 and $post2 are the same object in memory!

Ok. This was funny, but could we find a usage for this knowledge?

Back to our trees

So here we need some example table. Suppose it’s categories in our blog application, suppose it has parent_id field, something like this:

  #[ORM\Entity]
  #[ORM\Table(name: 'categories')]
  class Category
  {
      ...
      #[ORM\ManyToOne(targetEntity: Category::class, inversedBy: 'children')]
      private readonly Category $parent;

      #[ORM\OneToMany(targetEntity: Category::class, mappedBy: 'parent')]
      private Collection $children;
      ...

And let’s try fetch and hydrate all categories in the most stupid way:

  $categories = $categoriesRepository
    ->createQueryBuilder('category')
    ->getQuery()
    ->getResult();

Ok. Using this code we fetch and hydrate all the categories. But because of lazy fetch mode we will not get children

So what next? What if we join and fetch children in the most stupid way?

$categories = $categoryRepository
     ->createQueryBuilder('category')
     ->addSelect('children')
     ->leftJoin('category.children', 'children')
     ->getQuery()
     ->getResult();

So now we have hydrated children for each category. But could we use this query result for categories tree recursive render?

We don’t need just every category children been hydrated. We need every child children been hydrated too!

And the core question is: will doctrine hydrate every child’s children too?

Yes, of course!

  • We hydrate ALL the categories
  • Children are categories too
  • As we know the hydrator will not create new brand new empty child category instance, just return previously fetched category

Instead of summary

What I have to say – it blows my mind!

This problem in real world demands

  • fetching root first, then children of root, then children of children and so on
  • or fetching root with joined children of root, then joined children of children and so on …

But thanks for the known feature of Doctrine hydrator we did it in a simple request with ONE join!

Amazing!

Have a nice day!

Udostępnij

Dodaj komentarz

Twój adres e-mail nie zostanie opublikowany. Wymagane pola są oznaczone *

© copyright 2024. All Rights Reserved.