Background
I have an application with a Team model that includes a field backed by the PostgreSQL ltree extension. An ltree field stores the entire ancestral chain of an object, including the object itself, in a single value (the representation I see is dot-separated id values). Django does not support this field type out of the box, so custom extensions were added to support the ancestor/descendant checks; I think they are based on existing packages that do much the same thing. The ltree field is used as the path column in our Team table to store team hierarchies.
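To make the path representation concrete, here is a minimal plain-Python sketch of the ancestor relationship on dot-separated paths. It does not touch Django or PostgreSQL; the helper names `ancestors_of` and `is_ancestor` are hypothetical and only mimic what the ltree `@>` operator checks (left side is an ancestor of, or equal to, the right side).

```python
def ancestors_of(path):
    """Return every prefix of a dot-separated path, root first, including the path itself."""
    labels = path.split(".")
    return [".".join(labels[:i]) for i in range(1, len(labels) + 1)]


def is_ancestor(candidate, path):
    """Return True if candidate is an ancestor of path, or equal to it."""
    candidate_labels = candidate.split(".")
    path_labels = path.split(".")
    # candidate must match the leading labels of path exactly.
    return path_labels[:len(candidate_labels)] == candidate_labels


print(ancestors_of("1.2.3.4.5"))
print(is_ancestor("1.2", "1.2.3.4.5"))
```

For the five-team chain described in this question, `ancestors_of("1.2.3.4.5")` lists the five paths from the root team down to the team itself, which is the order the export is expected to show.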
The Problem
There is an export option that generates a CSV copy of the team structure. As part of that, each team's ancestors are displayed in their own columns next to other information about the team. When some larger teams are exported, the order of the ancestors is scrambled. Some digging shows that the subquery used to build the list annotation, which explodes the path into individual teams, is not respecting the ordering specified on the subquery.
The basic query that is being used is as follows:
from django.db.models import Func, OuterRef, Subquery

ancestors_qs = Team.objects.filter(path__ancestorsof=OuterRef("path")).order_by("path")
teams = Team.objects.all()
teams = teams.annotate(ancestors=Func(Subquery(ancestors_qs.values("name")), function="array"))
teams.get(id=5).ancestors
where the ancestorsof lookup is defined as follows:
@LtreeField.register_lookup
class Ancestor(models.Lookup):
    """Lookup to find ancestors of current node using GiST @> operator."""

    lookup_name = "ancestorsof"

    def as_sql(self, compiler, connection):
        """Return the SQL generated for ancestorsof lookup on ltree field."""
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params
        return f"{lhs} operator(public.@>) {rhs}", params
This query returns the ancestor teams for that team, but not in the expected order (the team is the last in a five-team chain, 1 through 5). The ordering I get back is [4, 1, 2, 3, 5] instead of the expected [1, 2, 3, 4, 5]. When I force Django to print the query it generates from that ORM call, I get the following:
SELECT
    "core_team"."id",
    "core_team"."name",
    "core_team"."path",
    array(
        (SELECT
            U0."name"
        FROM
            "core_team" U0
        WHERE
            U0."path" operator(public.@>) "core_team"."path")
    ) AS "ancestors"
FROM "core_team"
Building the SQL by hand, I was able to get the proper output order with the following:
SELECT * FROM core_team WHERE path @> '1.2.3.4.5' ORDER BY path;
Looking at the Django output, I believe an ORDER BY is missing inside the array function call. I don't know why it is not included, because order_by is defined on the original ancestors_qs subquery.
Because of the way Django handles subqueries, you're having trouble with the ordering of the ancestors in your query. Django's annotate method with a subquery does not ensure that the subquery's results come back in the requested order. Even though ancestors_qs calls order_by("path"), the SQL that Django generated for your annotation has no ORDER BY inside array(...), so PostgreSQL is free to return the rows in any order, and the result is not what you expected.
You can change your query to guarantee that the ancestors are ordered properly. First, build the subquery with an explicit order_by clause so that the ancestors are fetched in the right order. Next, assign a row number based on the desired order and use Django's Window expressions to reorder the ancestors. Finally, filter for the particular team you want and read the sorted ancestors from the ancestors_ordered field. This makes sure the ancestors appear in the correct order in the output. Hope this helps!
The code you asked for is attached below. Keep in mind that it is only example code for what you want to do. Do let me know if you need any help.
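As an illustration of the row-number idea only, here is a minimal sketch in plain Python rather than the ORM. It assumes each ancestor row is available as a (name, path) pair; the function name `order_ancestors` and the sample team names are hypothetical. Sorting by the tuple of path labels puts every ancestor before its descendants, which is enough for a single chain, although it is not claimed to match PostgreSQL's ltree sort order in general.

```python
def order_ancestors(rows):
    """Return ancestor names root-first from unordered (name, path) pairs."""
    # A path's labels form a tuple; a tuple that is a prefix of another
    # compares as smaller, so each ancestor sorts before its descendants.
    ranked = sorted(rows, key=lambda row: tuple(row[1].split(".")))
    # Number the rows in the desired order, as a window row number would.
    numbered = [(number, name) for number, (name, _path) in enumerate(ranked, start=1)]
    # Emit the names by ascending row number.
    return [name for _number, name in sorted(numbered)]


# Hypothetical rows in the scrambled order reported in the question.
scrambled = [
    ("Team 4", "1.2.3.4"),
    ("Team 1", "1"),
    ("Team 2", "1.2"),
    ("Team 3", "1.2.3"),
    ("Team 5", "1.2.3.4.5"),
]
print(order_ancestors(scrambled))
```

The same post-processing could be applied to the export rows if the ordering cannot be pushed into the SQL, at the cost of also selecting each ancestor's path.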