Inefficient Query Execution

My function make_single_dict_from_model suffers from inefficient query execution when it accesses attributes of the models in the list_plot parameter: it currently triggers a separate query for each item in list_plot, which can produce a large number of unnecessary database queries and hurt performance.

Reproduction Steps:

  1. Execute the function make_single_dict_from_model with a large list_plot parameter containing Django models.
  2. Observe that the function executes a separate query for each item in the list_plot, causing potential performance degradation.
from typing import List

from django.db import models


def make_single_dict_from_model(list_plot: List[models.Model], value_key: str, time_key: str):
    """
    [{model: 1, 2}, {model: 3, 4}] -> {value_key: [1, 3], time_key: [2, 4]}
    """

    if list_plot:
        return {
            key: [getattr(item, key) for item in list_plot]
            for key in (value_key, time_key)
        }

    return {}
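
For clarity, here is the intended transformation on plain objects (a hypothetical example; SimpleNamespace stands in for model instances):

from types import SimpleNamespace

rows = [SimpleNamespace(v=1, t=2), SimpleNamespace(v=3, t=4)]
make_single_dict_from_model(rows, "v", "t")
# -> {'v': [1, 3], 't': [2, 4]}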

Query
SELECT id, value_unnest, updateat FROM "collection_basic_arrays", LATERAL unnest(value) AS value_unnest WHERE ("collection_basic_arrays"."itemid" = 'test' AND "collection_basic_arrays"."lineid" = '1' AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:03:00+00:00') ORDER BY "collection_basic_arrays"."updateat" ASC

Postgres Logs

Execution

data = self.obj.objects.raw(query)
data = make_single_dict_from_model(list(data), value_key, time_key)

Do any of your models contain foreign key relationships? Look up select_related and prefetch_related in the docs.

Also, be careful with your usage (if any) of defer() and only().

I would also look at whether casting the queryset to a list (list(data) in make_single_dict_from_model) has any impact on how far the queryset gets evaluated…

and last edit (sorry! :)), wouldn’t a simple:

<snip>
    if list_plot:
        return list_plot.values()
</snip>

work?

Asking all those questions as:

  • we do not know the details of your models
  • maybe we are missing a little bit of context around the execution of those methods.
list_plot.values()
'RawQuerySet' object has no attribute 'values'

No, the collection_basic_arrays model does not contain foreign keys. I think the extra queries are related to the unnest: it runs the query SELECT … WHERE id = … LIMIT 21 250 times, once per position of my array.

What do you mean by

Also, be careful with your usage (if any) of defer() and only()

Actually, the real code is this:

data = self.obj.objects.raw(query)
p = Paginator(data, settings.LIMIT_RECORDS)
if p.num_pages > settings.LIMIT_PAGE_NUM:
    raise ValidationError()

for page in p.page_range:
    data = list(p.page(page).object_list)
    if not data:
        continue

    data = self.make_single_dict_from_model(data, value_key, datetime_key)

My model is below.

I use GitHub - schlunsen/django-timescaledb: A Django database backend and tooling for Timescaledb. as the manager, but in this case I didn't run a query that uses TimescaleManager functionality, only the base manager.

from django.conf import settings
from django.contrib.postgres.fields import ArrayField
from django.db import models
from timescale.db.models.managers import TimescaleManager
# (project-specific imports for DateTimeWithoutTZField and `var` omitted)

class BasicArray(models.Model):
    id = models.BigAutoField(primary_key=True)

    updateat = DateTimeWithoutTZField(auto_now=True, db_index=True)

    lineid = models.TextField(db_index=True)

    machineid = models.TextField(db_index=True)

    itemid = models.TextField(db_index=True)

    objects = TimescaleManager()

    value = ArrayField(models.TextField())

    type = models.TextField(default=var.VariableType.BasicArray)

    def __str__(self):
        return self.itemid

    class Meta:
        get_latest_by = 'updateat'
        managed = settings.TEST_UNMANAGED_MODELS
        db_table = settings.COLLECTION_TABLE_PREFIX + 'basic_arrays'
        constraints = [
            models.UniqueConstraint(
                fields=['itemid', 'machineid', 'lineid', 'updateat'], name='unique_basic_array'),
            models.UniqueConstraint(
                fields=['id', 'updateat'], name='unique_id_updateat_basic_array')
        ]

Thanks, not sure, but what about this:

<snip>
for page in p.page_range:
    data = list(p.page(page).object_list)
</snip>

For each page, that is likely to execute a query, all the more since each page’s queryset is cast to a list (which I think has the effect of “unlazying” it).

Can you tell us whether the number of queries you see in your SQL log is the same as the number of pages?

Also, why the need to run a raw query?

I use a custom implementation of this: Pagination | Django documentation | Django;
I reimplement count(), like this: Efficient Pagination: PostgreSQL and Django.
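
For reference, a minimal sketch of that pattern (RawPaginator is a hypothetical name; the linked article avoids a full COUNT(*) by estimating the count, while this sketch simply materializes the raw results once):

from django.core.paginator import Paginator
from django.utils.functional import cached_property

class RawPaginator(Paginator):
    @cached_property
    def count(self):
        # RawQuerySet has no .count(); materialize the rows once and
        # reuse their length instead of issuing a separate COUNT(*).
        return len(list(self.object_list))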

Can you tell us whether the number of queries you see in your SQL log is the same as the number of pages?

This executes 1 query for everything, but returns the objects in chunks of n. In the Postgres logs I ran 3 page loops, and the main query still appears only once.

2023-06-13 09:48:00.932 GMT [76] LOG:  statement: SELECT (CAST(value_unnest AS double precision)) AS "value_conv", "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" FROM "collection_basic_arrays" , LATERAL unnest(value) AS value_unnest WHERE ("collection_basic_arrays"."itemid" = 'test"' AND "collection_basic_arrays"."lineid" = '1' AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:02:00+00:00') ORDER BY "collection_basic_arrays"."updateat" ASC
2023-06-13 09:48:02.510 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.708 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.723 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.738 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.753 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.770 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.787 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.805 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.821 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:02.838 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:09.436 GMT [40] FATAL:  terminating connection due to idle-session timeout
2023-06-13 09:48:11.581 GMT [47] FATAL:  terminating connection due to idle-session timeout
2023-06-13 09:48:15.311 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.336 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.356 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.378 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.398 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.418 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.439 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.461 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.483 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:15.505 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.211 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.235 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.253 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.272 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.290 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.307 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.326 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.340 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.356 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 09:48:19.373 GMT [76] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."itemid" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21

Also, why the need to run a raw query?
This is why:
"Unexpected ORM Query Translation: CAST and Unnest Functions in Django"

I need to perform unnest in a LATERAL query.

This optimizes the function, but it's still too slow, and the repeated queries still show up:

def make_single_dict_from_model(self, list_plot: List[models.Model], value_key: str, time_key: str):
    """
    [{model: 1, 2}, {model: 3, 4}] -> {value_key: [1, 3], time_key: [2, 4]}
    """

    if not list_plot:
        return {}

    return {
        value_key: [item.value_conv for item in list_plot],
        time_key: [item.time for item in list_plot]
    }

In general, getattr() is more flexible than direct attribute access with the dot operator, but in terms of execution speed, direct access is generally faster than getattr().
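
A quick way to verify that claim in plain Python (no Django involved; the names are illustrative):

import timeit

class Row:
    def __init__(self):
        self.value_conv = 1.0

row = Row()
# Both run in nanoseconds per access, so neither explains extra SQL queries;
# the difference only matters inside very tight loops.
print(timeit.timeit(lambda: row.value_conv, number=1_000_000))
print(timeit.timeit(lambda: getattr(row, "value_conv"), number=1_000_000))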

Honestly not sure if this can help but:

SELECT
  (CAST(value_unnest AS double precision)) AS "value_conv",
  "collection_basic_arrays"."id",
  "collection_basic_arrays"."itemid",
  ("collection_basic_arrays"."updateat" + '2:00:00') AS "time"
FROM
  "collection_basic_arrays",
  LATERAL unnest(value) AS value_unnest
WHERE
  (
    "collection_basic_arrays"."itemid" = 'test"'
    AND "collection_basic_arrays"."lineid" = '1'
    AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e'
    AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00'
    AND '2023-04-05 06:02:00+00:00'
  )
ORDER BY
  "collection_basic_arrays"."updateat" ASC

where "collection_basic_arrays"."itemid", is added in the SELECT.

Proposing this as this happens when, for example someone does this (to optimise the queryset’s “size” and performance):

persons = Person.objects.only('pk', 'first_name').all()

and in the template or in a loop elsewhere:

{% for person in persons %}
- {{ person.first_name }} {{ person.last_name }}
{% endfor %}

which will cause a query (afaik) for each item in the queryset, something like:

SELECT last_name FROM person_person WHERE id = 123;

maybe that’s what’s happening to you?
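
One way to verify the deferred-field theory (get_deferred_fields() is a standard Django model method; `data` here stands for the materialized raw-query results from above):

# Columns the raw SELECT did not include are marked deferred; Django then
# re-fetches each one lazily on access, one SELECT ... LIMIT 21 per field.
first = data[0]
print(first.get_deferred_fields())  # e.g. {'updateat', 'itemid', ...}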

This looks like you are missing the field "collection_basic_arrays"."itemid", as already pointed out by @plopidou.

I still think you are chasing the wrong rabbit to make this more performant. Are the times in the logs from consecutive Python calls, i.e. does each statement run right after the previous one? If so, then your raw query alone already takes ~2s (at how many table entries?). A few remarks:

  • Run the query with EXPLAIN ... and EXPLAIN ANALYZE .... Check for bottlenecks, like big interim allocations (adjust Postgres settings accordingly) or sequential scans.
  • Move the +2h from the field eval into the WHERE clause and re-run EXPLAIN. This should have a huge impact if updateat is properly indexed, shifting from a sequential to an indexed scan.
  • Do a dry run without the unnesting and see the difference in the query planner. This most likely has the biggest impact; see my answers in your other post…
  • Do a dry run without the double precision cast. It should also have some impact; if it's big enough, save the values with the correct type in the first place.

where "collection_basic_arrays"."itemid", is added in the SELECT.

That happened because I was in VS Code debug mode and expanding the object under “Variables”: the fields I clicked on were fetched and displayed on demand. When I start it outside of debug mode, this is the output:

2023-06-13 11:27:00.433 GMT [58] LOG:  statement: SELECT (CAST(value_unnest AS double precision)) AS "value_conv", "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" FROM "collection_basic_arrays" , LATERAL unnest(value) AS value_unnest WHERE ("collection_basic_arrays"."itemid" = 'ns=3;s="Motor"."11A1"."Out_X"' AND "collection_basic_arrays"."lineid" = '1' AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:02:00+00:00') ORDER BY "collection_basic_arrays"."updateat" ASC
2023-06-13 11:27:00.937 GMT [58] LOG:  statement: DELETE FROM "cms_caches"
2023-06-13 11:27:00.941 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.053 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.069 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.084 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.101 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.118 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.133 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.150 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.166 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.181 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.197 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.213 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.228 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.244 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.259 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.275 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.291 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.307 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.323 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.339 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.356 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.373 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.390 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.408 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.424 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.441 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.458 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21
2023-06-13 11:27:01.474 GMT [58] LOG:  statement: SELECT "collection_basic_arrays"."id", "collection_basic_arrays"."updateat" FROM "collection_basic_arrays" WHERE "collection_basic_arrays"."id" = 5929741 LIMIT 21

Are the times in the logs from consecutive Python calls, i.e. does each statement run right after the previous one?

Yes

  • Move the +2h from the field eval into the WHERE clause and re-run EXPLAIN. This should have a huge impact if updateat is properly indexed, shifting from a sequential to an indexed scan.

Do I need .order_by(updateat) or .order_by(updateat + 2h)?

Ah, sorry, I misread the query - I was under the impression that you filter by the updateat + 2h field. But you already filter correctly on updateat only, so this will not change anything.

What you can try is a multicolumn index on the fields with the lowest selectivity (try different column orders, maybe with itemid/lineid/machineid). It should make a big difference in getting to the rows of interest very fast, but it comes at a price (multicolumn indexes get very big rather soon and are more expensive during inserts).
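
In Django terms, such an index could be declared roughly like this (the index name and column order are illustrative; note the model's existing UniqueConstraint on these fields already backs a similar index):

from django.db import models

class BasicArray(models.Model):
    # ... fields as in the model above ...

    class Meta:
        indexes = [
            models.Index(
                fields=["itemid", "machineid", "lineid", "updateat"],
                name="ix_basic_arrays_filter",  # hypothetical name
            ),
        ]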

I found the first error: in time_key I passed the field without the +2h conversion, so that field had to be processed again on every access.

  • save values with correct type already.

I can’t, this field has bool, float, int,…

multicolumn index

These are my table's indexes:

(screenshot of the table's indexes)

Oh well, your second index already joins all the filter fields into one big index (just a guess from the name). Does the planner use it correctly? Can you post the query planner output here (the output of EXPLAIN ANALYZE <your raw query>)?

EXPLAIN ANALYZE SELECT (CAST(value_unnest AS double precision)) AS "value_conv", "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" 
FROM "collection_basic_arrays" , LATERAL unnest(value) AS value_unnest 
WHERE ("collection_basic_arrays"."itemid" = 'ns=3;s="Motor"."11A1"."Out_Z"' 
	   AND "collection_basic_arrays"."lineid" = '1' 
	   AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' 
	   AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:02:00+00:00') 
	   ORDER BY "collection_basic_arrays"."updateat" ASC


"Nested Loop  (cost=0.45..4684.76 rows=220000 width=28) (actual time=10.021..33.881 rows=14750 loops=1)"
"  ->  Custom Scan (DecompressChunk) on collection_basic_arrays_70_chunk  (cost=0.44..9.58 rows=22000 width=1174) (actual time=9.976..10.106 rows=59 loops=1)"
"        Filter: ((updateat >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (updateat <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"        Rows Removed by Filter: 941"
"        ->  Index Scan using compress_hyper_4_146_chunk__compressed_hypertable_4_itemid_mach on compress_hyper_4_146_chunk  (cost=0.27..9.58 rows=22 width=166) (actual time=0.047..0.067 rows=1 loops=1)"
"              Index Cond: ((itemid = 'ns=3;s=""Motor"".""11A1"".""Out_Z""'::text) AND (machineid = '641c0b97d3c5a19affa77b7e'::text) AND (lineid = '1'::text))"
"              Filter: ((_ts_meta_max_1 >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (_ts_meta_min_1 <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"              Rows Removed by Filter: 89"
"  ->  Memoize  (cost=0.01..0.11 rows=10 width=32) (actual time=0.025..0.200 rows=250 loops=59)"
"        Cache Key: collection_basic_arrays_70_chunk.value"
"        Cache Mode: binary"
"        Hits: 0  Misses: 59  Evictions: 0  Overflows: 0  Memory Usage: 853kB"
"        ->  Function Scan on unnest value_unnest  (cost=0.00..0.10 rows=10 width=32) (actual time=0.020..0.073 rows=250 loops=59)"
"Planning Time: 5.759 ms"
"Execution Time: 37.401 ms"
EXPLAIN SELECT (CAST(value_unnest AS double precision)) AS "value_conv", "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" 
FROM "collection_basic_arrays" , LATERAL unnest(value) AS value_unnest 
WHERE ("collection_basic_arrays"."itemid" = 'ns=3;s="Motor"."11A1"."Out_Z"' 
	   AND "collection_basic_arrays"."lineid" = '1' 
	   AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' 
	   AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:02:00+00:00') 
	   ORDER BY "collection_basic_arrays"."updateat" ASC


"Nested Loop  (cost=0.45..4684.76 rows=220000 width=28)"
"  ->  Custom Scan (DecompressChunk) on collection_basic_arrays_70_chunk  (cost=0.44..9.58 rows=22000 width=1174)"
"        Filter: ((updateat >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (updateat <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"        ->  Index Scan using compress_hyper_4_146_chunk__compressed_hypertable_4_itemid_mach on compress_hyper_4_146_chunk  (cost=0.27..9.58 rows=22 width=166)"
"              Index Cond: ((itemid = 'ns=3;s=""Motor"".""11A1"".""Out_Z""'::text) AND (machineid = '641c0b97d3c5a19affa77b7e'::text) AND (lineid = '1'::text))"
"              Filter: ((_ts_meta_max_1 >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (_ts_meta_min_1 <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"  ->  Memoize  (cost=0.01..0.11 rows=10 width=32)"
"        Cache Key: collection_basic_arrays_70_chunk.value"
"        Cache Mode: binary"
"        ->  Function Scan on unnest value_unnest  (cost=0.00..0.10 rows=10 width=32)"
EXPLAIN ANALYZE SELECT (CAST(value AS double precision[])) AS "value_conv", "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" 
FROM "collection_basic_arrays" 
WHERE ("collection_basic_arrays"."itemid" = 'ns=3;s="Motor"."11A1"."Out_Z"' 
	   AND "collection_basic_arrays"."lineid" = '1' 
	   AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' 
	   AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:02:00+00:00') 
	   ORDER BY "collection_basic_arrays"."updateat" ASC


"Result  (cost=0.44..1494.58 rows=22000 width=52) (actual time=9.606..15.238 rows=59 loops=1)"
"  ->  Custom Scan (DecompressChunk) on collection_basic_arrays_70_chunk  (cost=0.44..9.58 rows=22000 width=1174) (actual time=9.519..9.628 rows=59 loops=1)"
"        Filter: ((updateat >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (updateat <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"        Rows Removed by Filter: 941"
"        ->  Index Scan using compress_hyper_4_146_chunk__compressed_hypertable_4_itemid_mach on compress_hyper_4_146_chunk  (cost=0.27..9.58 rows=22 width=166) (actual time=0.038..0.061 rows=1 loops=1)"
"              Index Cond: ((itemid = 'ns=3;s=""Motor"".""11A1"".""Out_Z""'::text) AND (machineid = '641c0b97d3c5a19affa77b7e'::text) AND (lineid = '1'::text))"
"              Filter: ((_ts_meta_max_1 >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (_ts_meta_min_1 <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"              Rows Removed by Filter: 89"
"Planning Time: 5.642 ms"
"Execution Time: 15.431 ms"
EXPLAIN ANALYZE SELECT value_unnest, "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" 
FROM "collection_basic_arrays" , LATERAL unnest(value) AS value_unnest 
WHERE ("collection_basic_arrays"."itemid" = 'ns=3;s="Motor"."11A1"."Out_Z"' 
	   AND "collection_basic_arrays"."lineid" = '1' 
	   AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' 
	   AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:02:00+00:00') 
	   ORDER BY "collection_basic_arrays"."updateat" ASC

"Nested Loop  (cost=0.45..3584.76 rows=220000 width=52) (actual time=10.239..28.340 rows=14750 loops=1)"
"  ->  Custom Scan (DecompressChunk) on collection_basic_arrays_70_chunk  (cost=0.44..9.58 rows=22000 width=1174) (actual time=10.195..10.317 rows=59 loops=1)"
"        Filter: ((updateat >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (updateat <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"        Rows Removed by Filter: 941"
"        ->  Index Scan using compress_hyper_4_146_chunk__compressed_hypertable_4_itemid_mach on compress_hyper_4_146_chunk  (cost=0.27..9.58 rows=22 width=166) (actual time=0.037..0.057 rows=1 loops=1)"
"              Index Cond: ((itemid = 'ns=3;s=""Motor"".""11A1"".""Out_Z""'::text) AND (machineid = '641c0b97d3c5a19affa77b7e'::text) AND (lineid = '1'::text))"
"              Filter: ((_ts_meta_max_1 >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (_ts_meta_min_1 <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"              Rows Removed by Filter: 89"
"  ->  Memoize  (cost=0.01..0.11 rows=10 width=32) (actual time=0.024..0.196 rows=250 loops=59)"
"        Cache Key: collection_basic_arrays_70_chunk.value"
"        Cache Mode: binary"
"        Hits: 0  Misses: 59  Evictions: 0  Overflows: 0  Memory Usage: 853kB"
"        ->  Function Scan on unnest value_unnest  (cost=0.00..0.10 rows=10 width=32) (actual time=0.019..0.072 rows=250 loops=59)"
"Planning Time: 5.422 ms"
"Execution Time: 31.493 ms"

Hmm, that's already very expensive in the planning, and execution is slowed down by the LATERAL unpacking. A runtime of >20s is really not good anymore for only ~220k rows.

Can you try the following:

  • remove the LATERAL unnesting from the query and return just the row ids (thus prefiltering for the rows in question)
  • do a second query with WHERE ... IN on those ids, now with the unnesting
  • measure both in the planner again
EXPLAIN ANALYZE SELECT "collection_basic_arrays"."id"
FROM "collection_basic_arrays"
WHERE ("collection_basic_arrays"."itemid" = 'ns=3;s="Motor"."11A1"."Out_Z"' 
	   AND "collection_basic_arrays"."lineid" = '1' 
	   AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' 
	   AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 06:02:00+00:00') 
	   ORDER BY "collection_basic_arrays"."updateat" ASC


"Custom Scan (DecompressChunk) on collection_basic_arrays_70_chunk  (cost=0.44..9.58 rows=22000 width=12) (actual time=0.273..0.313 rows=59 loops=1)"
"  Filter: ((updateat >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (updateat <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"  Rows Removed by Filter: 941"
"  ->  Index Scan using compress_hyper_4_146_chunk__compressed_hypertable_4_itemid_mach on compress_hyper_4_146_chunk  (cost=0.27..9.58 rows=22 width=134) (actual time=0.071..0.082 rows=1 loops=1)"
"        Index Cond: ((itemid = 'ns=3;s=""Motor"".""11A1"".""Out_Z""'::text) AND (machineid = '641c0b97d3c5a19affa77b7e'::text) AND (lineid = '1'::text))"
"        Filter: ((_ts_meta_max_1 >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (_ts_meta_min_1 <= '2023-04-05 06:02:00+00'::timestamp with time zone))"
"        Rows Removed by Filter: 89"
"Planning Time: 6.628 ms"
"Execution Time: 0.357 ms"

The second query, with the unnesting, takes far too long:

EXPLAIN ANALYZE SELECT (CAST(value_unnest AS double precision)) AS "value_conv", "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" 
FROM "collection_basic_arrays" , LATERAL unnest(value) AS value_unnest
WHERE "collection_basic_arrays"."id" IN (6108348, 6108349, 6108350, 6108351, 6108352, 6108353, 6108354, 6108355, 6108356, 6108357, 6108358, 6108359, 6108360, 6108361, 6108362, 6108363, 6108364, 6108365, 6108366, 6108367, 6108368, 6108369, 6108370, 6108371, 6108372, 6108373, 6108374, 6108375, 6108376, 6108377, 6108378, 6108379, 6108380, 6108381, 6108382, 6108383, 6108384, 6108385, 6108386, 6108387, 6108388, 6108389, 6108390, 6108391, 6108392, 6108393, 6108394, 6108395, 6108396, 6108397, 6108398, 6108399, 6108400, 6108401, 6108402, 6108403, 6108404, 6108405, 6108406)

"Nested Loop  (cost=0.05..1562557.70 rows=71820000 width=20) (actual time=91677.035..105215.907 rows=14750 loops=1)"
"  ->  Append  (cost=0.03..36152.82 rows=7182000 width=1190) (actual time=91671.356..105184.525 rows=59 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_5_chunk  (cost=0.03..7.16 rows=216000 width=1177) (actual time=2608.488..2610.109 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 214328"
"              ->  Seq Scan on compress_hyper_4_77_chunk  (cost=0.00..7.16 rows=216 width=124) (actual time=0.013..1.453 rows=216 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_14_chunk  (cost=0.03..5.62 rows=162000 width=1201) (actual time=1777.660..1778.643 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_116_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.007..0.748 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_21_chunk  (cost=0.03..5.62 rows=162000 width=1198) (actual time=1836.501..1837.560 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_117_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.007..0.376 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_29_chunk  (cost=0.03..8.70 rows=270000 width=1197) (actual time=3043.945..3044.666 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 267910"
"              ->  Seq Scan on compress_hyper_4_118_chunk  (cost=0.00..8.70 rows=270 width=124) (actual time=0.008..0.654 rows=270 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_40_chunk  (cost=0.03..5.62 rows=162000 width=1203) (actual time=1794.260..1794.791 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_119_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.016..0.400 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_44_chunk  (cost=0.03..5.62 rows=162000 width=1168) (actual time=2022.847..2023.312 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_120_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.008..0.597 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_45_chunk  (cost=0.03..7.16 rows=216000 width=1183) (actual time=2605.615..2606.372 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 214328"
"              ->  Seq Scan on compress_hyper_4_121_chunk  (cost=0.00..7.16 rows=216 width=124) (actual time=0.094..0.981 rows=216 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_46_chunk  (cost=0.03..7.16 rows=216000 width=1186) (actual time=2631.352..2631.777 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 214328"
"              ->  Seq Scan on compress_hyper_4_122_chunk  (cost=0.00..7.16 rows=216 width=124) (actual time=0.097..0.980 rows=216 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_47_chunk  (cost=0.03..5.62 rows=162000 width=1199) (actual time=2275.769..2276.187 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_123_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.087..0.779 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_48_chunk  (cost=0.03..5.62 rows=162000 width=1173) (actual time=2129.270..2129.708 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_124_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.122..0.808 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_49_chunk  (cost=0.03..8.70 rows=270000 width=1200) (actual time=3209.418..3209.777 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 267910"
"              ->  Seq Scan on compress_hyper_4_125_chunk  (cost=0.00..8.70 rows=270 width=124) (actual time=0.108..1.202 rows=270 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_50_chunk  (cost=0.03..5.62 rows=162000 width=1196) (actual time=2159.726..2159.936 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_126_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.129..0.846 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_51_chunk  (cost=0.03..5.62 rows=162000 width=1189) (actual time=2179.259..2179.265 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_127_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.094..0.808 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_52_chunk  (cost=0.03..7.16 rows=216000 width=1183) (actual time=2975.483..2975.774 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 214328"
"              ->  Seq Scan on compress_hyper_4_128_chunk  (cost=0.00..7.16 rows=216 width=124) (actual time=0.128..1.358 rows=216 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_53_chunk  (cost=0.03..7.16 rows=216000 width=1196) (actual time=2982.037..2982.307 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 214328"
"              ->  Seq Scan on compress_hyper_4_129_chunk  (cost=0.00..7.16 rows=216 width=124) (actual time=0.174..1.127 rows=216 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_54_chunk  (cost=0.03..5.62 rows=162000 width=1174) (actual time=2165.433..2165.438 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_130_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.107..0.744 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_55_chunk  (cost=0.03..5.62 rows=162000 width=1168) (actual time=2236.158..2236.448 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_131_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=0.108..0.742 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_56_chunk  (cost=0.03..8.70 rows=270000 width=1203) (actual time=3381.932..3382.206 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"

[... repeated lines elided ...]


"              Rows Removed by Filter: 214328"
"              ->  Seq Scan on compress_hyper_4_150_chunk  (cost=0.00..7.16 rows=216 width=124) (actual time=0.097..1.014 rows=216 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_75_chunk  (cost=0.03..5.62 rows=162000 width=1195) (actual time=2036.813..2036.814 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 160746"
"              ->  Seq Scan on compress_hyper_4_151_chunk  (cost=0.00..5.62 rows=162 width=124) (actual time=1.674..2.358 rows=162 loops=1)"
"        ->  Custom Scan (DecompressChunk) on collection_basic_arrays_76_chunk  (cost=0.05..2.54 rows=54000 width=1198) (actual time=717.785..717.786 rows=0 loops=1)"
"              Filter: (id = ANY ('{6108348,6108349,6108350,6108351,6108352,6108353,6108354,6108355,6108356,6108357,6108358,6108359,6108360,6108361,6108362,6108363,6108364,6108365,6108366,6108367,6108368,6108369,6108370,6108371,6108372,6108373,6108374,6108375,6108376,6108377,6108378,6108379,6108380,6108381,6108382,6108383,6108384,6108385,6108386,6108387,6108388,6108389,6108390,6108391,6108392,6108393,6108394,6108395,6108396,6108397,6108398,6108399,6108400,6108401,6108402,6108403,6108404,6108405,6108406}'::integer[]))"
"              Rows Removed by Filter: 53582"
"              ->  Seq Scan on compress_hyper_4_152_chunk  (cost=0.00..2.54 rows=54 width=124) (actual time=0.824..1.093 rows=54 loops=1)"
"  ->  Memoize  (cost=0.01..0.11 rows=10 width=32) (actual time=0.117..0.307 rows=250 loops=59)"
"        Cache Key: collection_basic_arrays_5_chunk.value"
"        Cache Mode: binary"
"        Hits: 0  Misses: 59  Evictions: 0  Overflows: 0  Memory Usage: 853kB"
"        ->  Function Scan on unnest value_unnest  (cost=0.00..0.10 rows=10 width=32) (actual time=0.035..0.087 rows=250 loops=59)"
"Planning Time: 18.938 ms"
"JIT:"
"  Functions: 82"
"  Options: Inlining true, Optimization true, Expressions true, Deforming true"
"  Timing: Generation 3.072 ms, Inlining 16.733 ms, Optimization 98.150 ms, Emission 116.508 ms, Total 234.463 ms"
"Execution Time: 105245.496 ms"

Yepp, as I thought - the unnesting gets you into trouble here. Can you reshape the code so that you just grab the array as a single value, as-is, from the DB, and maybe unpack it on the Python side? Then you should be in the time range of the first query here (plus the runtime in Python to unpack the values).
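
A minimal sketch of that client-side unpacking, assuming the raw query returns value untouched (Django/psycopg deliver a text[] column as a Python list of strings) and that the target type for this itemid is float:

def expand_rows(rows, value_key, time_key):
    out = {value_key: [], time_key: []}
    for row in rows:
        t = row.time  # the "updateat + 2h" alias from the raw query
        for v in row.value:  # replaces the SQL-side LATERAL unnest
            out[value_key].append(float(v))  # replaces the SQL-side CAST
            out[time_key].append(t)
    return out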

Should I also do the conversion in code? Can I use Pandas, NumPy, DataFrames?

This would be the clean query; the analyses of the other 2 queries (with unnest and with cast) are above: Inefficient Query Execution - #13 by zN3utr4l

EXPLAIN ANALYZE SELECT value, "collection_basic_arrays"."id", ("collection_basic_arrays"."updateat" + '2:00:00') AS "time" 
FROM "collection_basic_arrays" 
WHERE ("collection_basic_arrays"."itemid" = 'ns=3;s="Motor"."11A1"."Out_Z"' 
	   AND "collection_basic_arrays"."lineid" = '1' 
	   AND "collection_basic_arrays"."machineid" = '641c0b97d3c5a19affa77b7e' 
	   AND "collection_basic_arrays"."updateat" BETWEEN '2023-04-04 22:00:00+00:00' AND '2023-04-05 22:02:00+00:00') 
	   ORDER BY "collection_basic_arrays"."updateat" ASC

"Result  (cost=0.29..2237.08 rows=33000 width=52) (actual time=10.589..1690.493 rows=17861 loops=1)"
"  ->  Custom Scan (DecompressChunk) on collection_basic_arrays_70_chunk  (cost=0.29..9.58 rows=33000 width=1174) (actual time=10.497..204.217 rows=17861 loops=1)"
"        Filter: ((updateat >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (updateat <= '2023-04-05 22:02:00+00'::timestamp with time zone))"
"        Rows Removed by Filter: 1139"
"        ->  Index Scan using compress_hyper_4_146_chunk__compressed_hypertable_4_itemid_mach on compress_hyper_4_146_chunk  (cost=0.27..9.58 rows=33 width=166) (actual time=0.034..0.197 rows=19 loops=1)"
"              Index Cond: ((itemid = 'ns=3;s=""Motor"".""11A1"".""Out_Z""'::text) AND (machineid = '641c0b97d3c5a19affa77b7e'::text) AND (lineid = '1'::text))"
"              Filter: ((_ts_meta_max_1 >= '2023-04-04 22:00:00+00'::timestamp with time zone) AND (_ts_meta_min_1 <= '2023-04-05 22:02:00+00'::timestamp with time zone))"
"              Rows Removed by Filter: 71"
"Planning Time: 4.854 ms"
"Execution Time: 1694.827 ms"

Ah, this looks much more reasonable. The query itself still seems rather slow to me, but that might come from value alone, if it is quite big (I think you wrote something about 250 values per row in the other thread).

I cannot tell you how to go on from here - you write above that the values might be bool, float, int,… I am not even sure how you managed to get that into an array type in Postgres, as arrays are still type-bound (are they string-encoded somehow?). Isn't such a mixup of different values/types more JSON-like than array-like?
Well, with a JSON field you could get it converted fast with a proper fast JSON implementation. You could also mimic the old cartesian-product behavior by shimming db-like objects with a single value from the array in a loop. It really depends on how you further process these values.

The array itself contains only values of the same type, but within the same table each itemid can have a different value type, i.e. itemid x has bool[], itemid y has int[], so bool[] and int[] are sent as strings, simply via str().

Just to make sure I understand correctly - all the different array types end up in the same column, with each value string-converted beforehand? Thus the real column type is text[]?

If that's the case, and you don't need any array functionality at the DB level at all, then you might gain more performance by doing 3 more things:

  • remove the array column and instead save the values in a single string (implement to/from converters; see the sketch at the end of this post)
  • compress the array value in the string as far as possible, e.g. save "101" instead of "[true, false, true]"
  • try to fit the values into Postgres' page memory by adjusting the TOAST compression settings (for this to work it is important not to go over 8 KB in a single value; the default threshold is 2 KB)

Avoiding the DB TOAST and the array abstraction will remove a lot of indirection in the query execution (lots of internal pointers and decompression), but it also comes at the price of faster page turnover (updates or deletions will fragment the table faster, and you will need to vacuum more often).

If you cannot get well below 8 KB, you can also try switching the column compression to LZ4 on Postgres 14+ (see What is the new LZ4 TOAST compression in PostgreSQL 14, and how fast is it?).

With these changes the query should get much faster than the current ~1.6s. Hope this helps :slight_smile:
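
A minimal sketch of the to/from converters and the "101"-style packing suggested above, assuming a bool array (other element types would need their own packing):

def pack_bools(values):
    # [True, False, True] -> "101"
    return "".join("1" if v else "0" for v in values)

def unpack_bools(packed):
    # "101" -> [True, False, True]
    return [c == "1" for c in packed]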

all the different array types end up in the same column, with each value string-converted beforehand? Thus the real column type is text[]?

Yes

  • remove the array column, instead save the values in a single string (implement to/from converters)

Can this implementation be done at the DB level or in Python?

  • try to fit the values into Postgres' page memory by adjusting the TOAST compression settings (for this to work it is important not to go over 8 KB in a single value; the default threshold is 2 KB)

At the moment there seems to be no compression. I don't know if using TimescaleDB, the PostgreSQL extension, changes anything; the table in question is a hypertable, so the data is not saved in the actual table but in chunks of it.

DataAnalyser=# \d+
                                                          List of relations
 Schema |                     Name                     |   Type   |  Owner   | Persistence | Access method |    Size    | Description 
--------+----------------------------------------------+----------+----------+-------------+---------------+------------+-------------
 public | collection_basic_arrays                     | table    | postgres | permanent   | heap          | 8192 bytes | 
 public | collection_basic_arrays_id_seq              | sequence | postgres | permanent   |               | 8192 bytes |

_____________________________________________________________________

DataAnalyser=# \d+ collection_basic_arrays
                                                        Table "public.collection_basic_arrays"
  Column   |           Type           | Collation | Nullable |             Default              | Storage  | Compression | Stats target | Description 
-----------+--------------------------+-----------+----------+----------------------------------+----------+-------------+--------------+-------------
 id        | integer                  |           | not null | generated by default as identity | plain    |             |              | 
 updateat  | timestamp with time zone |           | not null |                                  | plain    |             |              | 
 lineid    | text                     |           |          |                                  | extended |             |              | 
 machineid | text                     |           |          |                                  | extended |             |              | 
 itemid    | text                     |           |          |                                  | extended |             |              | 
 value     | text[]                   |           |          |                                  | extended |             |              | 
 type      | text                     |           | not null | 'BasicArray'::text               | extended |             |              | 
Indexes:
    "PK_collection_basic_arrays" PRIMARY KEY, btree (id, updateat)
    "IX_collection_basic_arrays_itemid" btree (itemid)
    "IX_collection_basic_arrays_itemid_machineid_lineid_updateat" UNIQUE, btree (itemid, machineid, lineid, updateat)
    "IX_collection_basic_arrays_itemid_updateat" UNIQUE, btree (itemid, updateat)
    "IX_collection_basic_arrays_lineid" btree (lineid)
    "IX_collection_basic_arrays_machineid" btree (machineid)
    "IX_collection_basic_arrays_updateat" btree (updateat)
Triggers:
    ts_insert_blocker BEFORE INSERT ON collection_basic_arrays FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Child tables: _timescaledb_internal.collection_basic_arrays_14_chunk,
              _timescaledb_internal.collection_basic_arrays_21_chunk,
              _timescaledb_internal.collection_basic_arrays_29_chunk,
              _timescaledb_internal.collection_basic_arrays_40_chunk,
              _timescaledb_internal.collection_basic_arrays_44_chunk,
              _timescaledb_internal.collection_basic_arrays_45_chunk,
              _timescaledb_internal.collection_basic_arrays_46_chunk,
              _timescaledb_internal.collection_basic_arrays_47_chunk,
              _timescaledb_internal.collection_basic_arrays_48_chunk,
              _timescaledb_internal.collection_basic_arrays_49_chunk,
              _timescaledb_internal.collection_basic_arrays_50_chunk,
              _timescaledb_internal.collection_basic_arrays_51_chunk,
              _timescaledb_internal.collection_basic_arrays_52_chunk,
              _timescaledb_internal.collection_basic_arrays_53_chunk,
              _timescaledb_internal.collection_basic_arrays_54_chunk,
              _timescaledb_internal.collection_basic_arrays_55_chunk,
              _timescaledb_internal.collection_basic_arrays_56_chunk,
              _timescaledb_internal.collection_basic_arrays_57_chunk,
              _timescaledb_internal.collection_basic_arrays_58_chunk,
              _timescaledb_internal.collection_basic_arrays_59_chunk,
              _timescaledb_internal.collection_basic_arrays_5_chunk,
              _timescaledb_internal.collection_basic_arrays_60_chunk,
              _timescaledb_internal.collection_basic_arrays_61_chunk,
              _timescaledb_internal.collection_basic_arrays_62_chunk,
              _timescaledb_internal.collection_basic_arrays_63_chunk,
              _timescaledb_internal.collection_basic_arrays_64_chunk,
              _timescaledb_internal.collection_basic_arrays_65_chunk,
              _timescaledb_internal.collection_basic_arrays_66_chunk,
              _timescaledb_internal.collection_basic_arrays_67_chunk,
              _timescaledb_internal.collection_basic_arrays_68_chunk,
              _timescaledb_internal.collection_basic_arrays_69_chunk,
              _timescaledb_internal.collection_basic_arrays_70_chunk,
              _timescaledb_internal.collection_basic_arrays_71_chunk,
              _timescaledb_internal.collection_basic_arrays_72_chunk,
              _timescaledb_internal.collection_basic_arrays_73_chunk,
              _timescaledb_internal.collection_basic_arrays_74_chunk,
              _timescaledb_internal.collection_basic_arrays_75_chunk,
              _timescaledb_internal.collection_basic_arrays_76_chunk
Access method: heap