I have a list of URLs in memory and a table of URLs that I have visited.
Is there a way to use the Django ORM to find which URLs in memory are not in the table using a single query?
In SQL I would use some sort of temp table or equivalent to join against the model table.
Any good options with the ORM?
Thanks!
Assuming the in-memory list is mem_list and the field holding the visited URLs in the Urls table is visited:
# grab the urls in the table that are in the list
qs = Urls.objects.filter(visited__in=mem_list)
# flatten the resulting queryset from above into a list and use set() to extract the difference
set(mem_list).difference(set(qs.values_list("visited", flat=True)))
You can turn all the above into a one-liner: set(mem_list).difference(set(Urls.objects.filter(visited__in=mem_list).values_list("visited", flat=True)))
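One thing to keep in mind: the set difference drops the original order of mem_list and collapses duplicates. If order matters, something along these lines may work. This is only a sketch: unvisited_urls is a hypothetical helper name, and visited_values stands in for the result of the values_list() call above (a plain list is used here so the snippet runs without Django).

```python
def unvisited_urls(mem_list, visited_values):
    """Return the URLs in mem_list that are absent from visited_values.

    Keeps the order of mem_list and drops repeated entries.
    """
    visited = set(visited_values)  # single pass over the query result
    seen = set()
    result = []
    for url in mem_list:
        if url not in visited and url not in seen:
            seen.add(url)
            result.append(url)
    return result


# Example data; in a Django project visited_values would be
# Urls.objects.filter(visited__in=mem_list).values_list("visited", flat=True)
mem_list = [
    "https://a.example",
    "https://b.example",
    "https://c.example",
    "https://a.example",
]
visited_values = ["https://b.example"]
print(unvisited_urls(mem_list, visited_values))
```

The database is still hit once; only the comparison step changes.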
Have you tried using a FilteredRelation and filtering where it is null? It performs a left join, which keeps rows that have no match as nulls. Filtering on those nulls lets you efficiently identify where a relationship doesn't exist.
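To illustrate the left-join-and-filter-on-null idea outside the ORM, here is a rough sketch in raw SQL using the standard-library sqlite3 module. It is not Django code. The table names urls and candidates, the column names, and the find_unvisited helper are all assumptions made for the example. The in-memory list is loaded into a temp table first, as the question describes, because a join needs both sides to live in the database.

```python
import sqlite3

# Stand-in for the table of visited URLs (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (visited TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO urls (visited) VALUES (?)",
    [("https://b.example",)],
)


def find_unvisited(conn, mem_list):
    """Return the URLs in mem_list that have no matching row in urls."""
    # Load the in-memory list into a temp table so it can be joined.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS candidates (url TEXT)")
    conn.execute("DELETE FROM candidates")
    conn.executemany(
        "INSERT INTO candidates (url) VALUES (?)",
        [(url,) for url in mem_list],
    )
    # The left join keeps every candidate; unmatched rows get NULL for
    # u.visited, so filtering on NULL leaves only the unvisited URLs.
    rows = conn.execute(
        "SELECT c.url FROM candidates AS c "
        "LEFT JOIN urls AS u ON u.visited = c.url "
        "WHERE u.visited IS NULL "
        "ORDER BY c.rowid"
    ).fetchall()
    return [row[0] for row in rows]


mem_list = ["https://a.example", "https://b.example", "https://c.example"]
print(find_unvisited(conn, mem_list))
```

The final SELECT is a single query, but filling the temp table costs extra statements, so whether this beats the visited__in approach probably depends on how large the list is.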