Finding items NOT in a table.

I have a list of URLs in memory and a table of URLs that I have visited.
Is there a way to use the Django ORM to find which URLs in memory are not in the table using a single query?

In SQL I would use some sort of temp table or equivalent to join against the model table.
Any good options with the ORM?
Thanks!

Foo.objects.exclude(url__in=urls)

maybe?

Model.objects.exclude(url__in=url_list)

Thanks!

I think these return records that are in the table but not in the list.
I want the items in the list that aren’t in the table.

Assuming the list in memory is mem_list and the field holding the visited URLs in the Urls table is visited:

# grab the urls in the table that are in the list
qs = Urls.objects.filter(visited__in=mem_list)
# flatten the resulting queryset from above into a list and use set() to extract the difference
set(mem_list).difference(set(qs.values_list("visited", flat=True)))

You can turn all the above into a one-liner:
set(mem_list).difference(set(Urls.objects.filter(visited__in=mem_list).values_list("visited", flat=True)))
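One thing to keep in mind: a set difference drops the order of mem_list and collapses duplicates. If that matters, something like the sketch below might help. The helper name missing_urls and the sample data are hypothetical, and the visited values are passed in as a plain iterable so the snippet runs without a database; in a Django project you would presumably pass the values_list queryset from above instead.

```python
def missing_urls(mem_list, visited_values):
    """Return the items of mem_list that are absent from visited_values.

    Keeps the original order of mem_list and reports each URL only once.
    """
    visited = set(visited_values)
    seen = set()
    missing = []
    for url in mem_list:
        if url not in visited and url not in seen:
            seen.add(url)
            missing.append(url)
    return missing


# Hypothetical sample data standing in for the in-memory list and the table rows.
mem_list = [
    "https://a.example",
    "https://b.example",
    "https://c.example",
    "https://b.example",
]
visited_rows = ["https://a.example"]

# With Django this would be, for example:
# visited_rows = Urls.objects.filter(visited__in=mem_list).values_list("visited", flat=True)
result = missing_urls(mem_list, visited_rows)
print(result)
```

This is still one query against the table; only the comparison in Python changes.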


Have you tried using a FilteredRelation and filtering where it is null? It performs a left join, which allows null values, so filtering on those nulls identifies efficiently where relationships don’t exist.

Using a set difference against a __in=urls lookup as suggested by @onyeibo is definitely the best way to do it.
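If the list is very long, a single __in lookup can run into the database's limit on query parameters (SQLite has one, for instance). One possible workaround, at the cost of no longer being a single query, is to look up the list in chunks and subtract each batch. This is only a sketch under assumptions: the names chunked, find_unvisited and fake_fetch and the chunk size are hypothetical, and fake_fetch stands in for the ORM call so the snippet runs on its own.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def find_unvisited(mem_list, fetch_visited, chunk_size=500):
    """Return the set of URLs in mem_list that fetch_visited never reports.

    fetch_visited(batch) must return the members of `batch` already stored.
    """
    remaining = set(mem_list)
    # Iterate over a copy so the set can be shrunk while looping.
    for batch in chunked(list(remaining), chunk_size):
        remaining.difference_update(fetch_visited(batch))
    return remaining


# Stand-in for the table so the sketch runs without a database.
table = {"https://a.example", "https://c.example"}


def fake_fetch(batch):
    return [url for url in batch if url in table]


# With Django, fetch_visited might instead be something like:
# lambda batch: Urls.objects.filter(visited__in=batch).values_list("visited", flat=True)
unvisited = find_unvisited(
    ["https://a.example", "https://b.example", "https://d.example"],
    fake_fetch,
    chunk_size=2,
)
print(unvisited)
```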