Join two models on geometry predicate

I’d like to know if anyone already succeeded in joining two models that relate only on their geometry fields, without raw SQL. My use case is to annotate a parent model containing a polygon field with the number of points (PointField of another model) contained in each parent model instance.

I found this StackOverflow post (python - GeoDjango query: all point that are contained into a multi polygon - Stack Overflow) which is close to what I’m trying to achieve, but I’m unfortunately getting errors. I can develop my trials/errors, but I’d like first to know if someone found a solution for a similar use case.

Any success story here?

If that SO answer is on-point and still relevant, it may be helpful if you posted what you tried and the errors you received when trying it.

First try (like suggested on SO post):

PolyModel.objects.annotate(
    contained_points=Subquery(
        PointModel.objects.filter(
            point__within=OuterRef('polygon')
        )
    )
)

Result: FieldError: Cannot resolve expression type, unknown output_field

Second try:

PolyModel.objects.annotate(
    contained_points=Subquery(
        PointModel.objects.filter(
            point__within=OuterRef('polygon')
        ), output_field=IntegerField(),
    )
)

Result: ProgrammingError: ERROR: Subquery must return only one column

Another try:

PolyModel.objects.annotate(
    contained_points=Subquery(
        PointModel.objects.filter(
            point__within=OuterRef('polygon')
        ).aggregate(num=Count('id'))['num'],
        output_field=IntegerField(),
    )
)

Result: ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.

I think this is closest:

(I’m winging this)

My first attempt would be to try:

PolyModel.objects.annotate(
    contained_points=Subquery(
        PointModel.objects.filter(
            point__within=OuterRef('polygon')
        )
    ).count()
)

or possibly:

PolyModel.objects.annotate(
    contained_points=Subquery(
        PointModel.objects.filter(
            point__within=OuterRef('polygon')
        )
    ).count(), output_field=IntegerField()
)

Yea, … that’s not going to work.

I’m going to play with some ideas to see what I can get to work.

@claudep here’s how I would approach it

Ideally the ORM would allow you define relationships between models that don’t involve foreign keys so you could do

class PointField(models.Model):
    point = models.PointField()

class PolygonModel(models.Model):
    polygon = models.PolygonField()
    contained_points = Relation(PointField, Q(
        contained_points__point__within=F("polygon"),
    ))

And then simply do

PolygonModel.objects.annotate(Count("contained_points))

Which would result in the following SQL

SELECT
    polygonmodel.*,
    COUNT(pointmodel.id)
FROM polygonmodel
LEFT JOIN polygonpoint ON (
    ST_Within(polygonpoint.point, polygonmodel.polygon)
)
GROUP BY polygonmodel.id

Not possible today unless you rely on one of the linked third-party solution though so you have to get creative.

In your case I don’t see how you can avoid a RawSQL annotation or a Subquery subclass unfortunately as even if there are ways to do aggregation through annotations in subqueries you need to be able to group by a particular column for the workaround to work and in your case you want an aggregation without grouping by the child table.

Proper support for passing subqueries to aggregation function would address your issue as you could simply do

PolygonModel.objects.annotate(Count(
    PointField.objects.filter(point__within=OuterRef("polygon"))
))

I’m afraid until this lands you’ll have to create your own Subquery subclass that wraps a the SQL into a different template.

Wow, thanks a lot @charettes for this awesomely documented answer. It’s so precious to know that something is not possible, otherwise you can spent hours trying without hope! Thanks also for the pointer to the Subquery subclass.

Thanks also @KenWhitesell for trying to help on this!