Unwanted query triggered when using a composite key on a self-referencing entity

118 Views Asked by At

I am encountering a strange issue and cannot understand why this is happening.

I have an entity that is using a composed primary key on the instance and rid fields.

I also have parent as a self referencing relation to the same entity.

class Subscription
{
    /**
     * @ORM\Column(name="id", type="uuid", unique=true)
     * @JMS\Type("string")
     */
    protected $id;

    /**
     * @ORM\Id
     * @ORM\Column(name="instance_id", type="integer")
     * @JMS\Type("int")
     */
    protected $instance;

    /**
     * @ORM\Id
     * @ORM\Column(name="rid", type="bigint")
     * @ORM\SequenceGenerator(sequenceName="subscriptions_id_seq", initialValue=1, allocationSize=1)
     * @JMS\Type("int")
     */
    protected $rid;

    /**
     * @ORM\ManyToOne(targetEntity="Subscription")
     * @ORM\JoinColumns({
     *     @ORM\JoinColumn(name="parent_int_id", referencedColumnName="rid"),
     *     @ORM\JoinColumn(name="instance_id", referencedColumnName="instance_id")
     * })
     */
    protected $parent;

    ...
}

The problem I am facing:

For the following data:

id rid parent_int_id instance_id
11000000-0000-0000-0000-0000000f37d2 147605 890467 1
11000000-0000-0000-0000-0000000e065f 890467 null 1
$subscription = $subscriptionService->findOneBy([
    'rid' => 147605
]);

When retrieving a subscription using findOneBy() I see 3 queries being triggered by Doctrine:

This is the initial query, works as expected:

SELECT t0.id AS id_1, t0.instance_id AS instance_id_2, t0.rid AS rid_3, ...
FROM subscriptions t0 WHERE t0.rid = 147605 LIMIT 1;

This is the query for the parent field, also working as expected:

SELECT t0.id AS id_1, t0.instance_id AS instance_id_2, t0.rid AS rid_3, ... 
FROM subscriptions t0 WHERE t0.rid = '890467' AND t0.instance_id = 1 LIMIT 1;

This query is unexpected and cannot find a way to disable it or understand why it is triggered:

SELECT t0.id AS id_1, t0.instance_id AS instance_id_2, t0.rid AS rid_3, ...
FROM subscriptions t0 WHERE t0.instance_id = 1

As you can see, the 3rd query tries to select all rows for that instance_id which is a problem as our table is very large (millions of rows) and it would cause our database to crash. The retrieved object is built correctly and the parent property is populated as expected, but I would expect that query to not exist at all.

Removing the instance as a composite key stops this behaviour, however we need it to work in this format, so it's not an option.

Any idea what might be causing this additional query and how to get rid of it, so it will work as intended?

1

There are 1 best solutions below

0
On

Okay, what appears to happen is this. When the subscription is loaded, during the hydration of that subscription (turning data into an object) the many-to-one relation is encountered, which - unless specified otherwise - will be eagerly loaded, i.e. immediately. This causes the second query, which might very much be useful to some degree. Here, the same thing repeats.

This coincides with another problem, namely, that the associated relation is defined on the parent_int_id and the always non-null instance_id. This is enough to trigger the search for the parent entity, with the reduced condition (only instance_id) (doctrine source which is probably relevant - There is a part with a TODO in it, so maybe it's not working as intended yet). Which obviously is not what you would want.

Since all parent entities are fetched one by one anyway, there might be good reason to just not fetch it eagerly, but lazily. This would stop the deadly query from happening if you fetch an existing entity. However, as soon as you would access the parent object on a subscription without parent, the deadly query would occur again.

There might be ways to prevent this from happening in general, however, it's unclear to me if this can be achieved in doctrine (or it's annotations/attributes) without digging deep into it. The only "correct" way would probably be to somehow teach doctrine that it should not attempt to fetch a parent if one part of the parent composite key is null (see linked code, it's in there somewhere), which probably involves overwriting the default implementation of UnitOfWork ... which is very central to doctrine ORMs workings and not intended to be overwritten, apparently. You probably would have to write your own UnitOfWork, create your own EntityManager which uses that new UnitOfWork. You would have to adapt your implementation to stay somewhat synchronous to doctrine. Since this kind of customization is rarely desired, the correct way might be too much hassle.

I can imagine a few options, depending on your software development standards and requirements, some or all of these options may be unsuitable. However, here are a few:

  1. for the parent_id you could reference the uuid column instead of the composite key (this obviously requires an index on the uuid column, which it should already have, since it's a unique column anyway) - this one is probably the safest method, that will work.
  2. you could lazy load the parent entity and in your parent getter ensure, that you never try to access non-id columns or it would try to fetch the object (which might trigger a deadly query). You would have to check, if the parent has an rid (besides the instance_id) and return null otherwise. - I sadly don't know, if this would actually work, the lazy-loaded proxy objects usually provide working getters for the id column(s), but I never tried this with composite keys.
  3. Add a parent_instance_id column which is technically redundant and pose its own problems, but it will work, since it can be null when no parent exists (yet).
  4. Not using doctrines association mapping at all for this specific parent relation. Instead keeping it as a property and fetching the parent (via id) from a repository when requested (which is technically a downgrade from option 2, but it's easier to control but separation of concerns is dirty, since an entity is not supposed to make queries or fetch stuff from a repository)