ListView on Timeseries data

Hi all,

I am using timeseries data and would like to display a filtered selection based on a user input. At the moment I am using an existing mySQL database which I know from reading other blog posts may not be ideal for timeseries. I am looking at moving to Postgres and timescale, but one thing at a time (just over a years’ experience so head is about to explode as it is). For the models below, I ran inspectdb. Also I know some of the column names are not good and are based on the API calls they are populated from, I will be changing.

I have a devices (modems) model with associated ListView and DetailView. As there is only one device/modem per row and this was straightforward to achieve and working well.

class Devices(models.Model):
device_id = models.SmallAutoField(primary_key=True)
serialno = models.CharField(db_column='serialNo', unique=True, max_length=255)
deviceid = models.CharField(db_column='deviceId', max_length=255, blank=True, null=True)
machine = models.ForeignKey('Machines', models.DO_NOTHING, blank=True, null=True)
imei = models.CharField(max_length=255, blank=True, null=True)
machinetypeid = models.CharField(db_column='machineTypeId', max_length=255, blank=True, null=True)
hardware = models.CharField(max_length=255, blank=True, null=True)
firmware = models.CharField(max_length=255, blank=True, null=True)
apicountrycode = models.CharField(db_column='apiCountryCode', max_length=255, blank=True, null=True)
datasource = models.CharField(db_column='dataSource', max_length=255, blank=True, null=True)

class Meta:
    managed = False
    db_table = 'devices'
    verbose_name = 'Device'
    verbose_name_plural = 'Devices'

def __str__(self):
    return self.serialno

def get_absolute_url(self):
    return reverse('device-detail', args=[str(self.device_id)])

However the sensor data model is different. The device refers to the devices model. The api_param is referenced to a separate parameter model. So each sensor data point has a value, timestamp and an associated parameter (e.g. engine RPM) and device/modem (e.g. modem1). There is another model which links each modem to a machine serial number, as a modem can be fitted to any machine.

class sensordata(models.Model):
row_id = models.AutoField(primary_key=True)
device = models.ForeignKey(Devices, models.DO_NOTHING)
api_param = models.ForeignKey(Apiparams, models.DO_NOTHING)
timestamp = models.DateTimeField(db_column='TimeStamp')
value = models.CharField(db_column='Value', max_length=255, blank=True, null=True)

class Meta:
    managed = False
    db_table = 'sensordata'
    verbose_name = 'Sensor Data'
    verbose_name_plural = 'Sensor Data'

def __str__(self):
    return self.device

def get_absolute_url(self):
    return reverse('sensordata-detail', args=[str(self.device)])

I was able to create a ListView with the following.

class SensorDataListView(ListView):

model = Sensordata
context_object_name = 'sensordata_list'
template_name = 'modem_api/sensordata_list.html'

def get_queryset(self):
     
    return Sensordata.objects.select_related("device").select_related("device__machine").values("device__serialno", "device__machine__machineserialno").distinct().all()

url:

path('sensordata/<int:device>', SensorDataDetailView.as_view(), name='sensordata-detail'),

Template:

          {% for modem in sensordata_list %}
      <tr>
          <td><a href="{{ modem.get_absolute_url }}">{{ modem.device__serialno }}</td>
          <td>{{ modem.device__machine__machineserialno }}</td>
      </tr>
      {% empty %}
          <li>No Sensor Data yet.</li>
      {% endfor %}

There are a few problems, if I add “device” to values() on the queryset, everything slows down to a crawl. At the moment, querying the database is surprisingly fast (>300M rows). Even if I add device to the queryset, the following still does not work.

The listview is displayed with the correct list of devices/modems in the table, however there is no associated device for each link (i.e. the URL does not have the device_id), so I cannot select individual devices/modems (how I then show a detailed view based on that selection is another problem for another day!).

I think it has something to do with the get_absolute_url in the sensordata model as in this model there is multiple rows with the same device_id. I was thinking about trying to apply a uniue function to this, however I am not convinced this is even being called as if I have a print(something) in the sensdata get_absolute_url(), I get no output, or errors.

Does get_absolute_url() get passed if I perform a distinct filter on the queryset?

Or am I going down completely the wrong route and trying to wedge something into list and detail view that they were not designed for?

MySQL is fine for timeseries. See these posts from MySQL-using legend Baron Schwartz:

https://www.xaprb.com/blog/2014/06/08/time-series-database-requirements/
https://www.slideshare.net/vividcortex/how-vividcortexs-mysqlbased-time-series-database-works

If you’re using MySQL though I suggest at least using my library Django-MySQL, at least for its system checks.

Also if you can “upgrade” to MariaDB, it’s generally more performant, plus it has a bunch of extra features.

Try select_related() / prefetch_related(). See the django docs on database access optimization.

Change to reverse('sensordata-detail', args=[self.device_id]). This avoids fetching the device row if it’s not fetched. Again see db optimization page.

You can combine all the select_related()s into one call. The .all() is redundant. The distinct() is likely the cause of your performance problems. Try moving to prefetch_related to use separate queries and stop the query optimizer from guessing.

This won’t work, and similarly elswhere. It’s modem.device.serialno. The __ syntax is only for querying the DB. Templates don’t throw errors for missing variarbles by default.

Hi Adam, Thanks for the resources on MySQL, I will read and add the Django-MySQL library. I will also have a look at MariaDB. As an aside I just listened to your postcast on Django chat from last year so puts some of what you are saying into context!

I changed the reverse and combined the select_related() and removed the all().
I will have a look at changing to select_related/prefetch_related, however I am using the distinct function to get a list of unique devices in the model for the list view and cannot seem to get it to work without this.

The __ names on the template are a result of using the values() function in the view. This names the variables as they are called in the values function:

filtered_data = Sensordata.objects.select_related("device", "device__machine").values("device__serialno", "device__machine__machineserialno").distinct()

The template is not actually using the __ as a search, it is just the way the variable is named. The template seems to be working with the __ names and displays the devices and machine serial numbers correctly.

The more I look at it, the more I believe that “modem.get_absolute_url” is simply not passed through to the template. In the template, I can change modem.get_absolute_url to “foo.bar” and get exactly the same result (table displaying correctly, with hyerlinks to the same page (i.e. no /)). As you say the template does not throw errors for missing variables by default.

I could not get it working in an efficient way, so I have changed tack slightly and moved away from list/detail views, and started using Django-filter. Using this, I can populate a dropdown menu from my existing devices table, then filter the data displayed on the same page using a very similar filter query as above:

queryset = Firstlaststats.objects.select_related("device__machine", "api_param__param")
f = FirstLastStatsDevicesFilter(request.GET, queryset)

Seems to be working ok and I get a dynamic page filtered on the machine I want to see. Still trying to figure out what select_related() and values() actually do, but getting there…

Thanks for your help and useful resources, they pointed me in the right direction.

If that’s what the distinct() is for, using prefetch_related() is probably what you need then.

Apologies I missed this.

Add django-debug-toolbar to inspect your context, and try setting the ‘string_if_invalid’ setting in development.

Looking again, not it’s not. values() returns tuples not model instances. To only pull back certain fields, instead use only().

Glad to help

I have the toolbar installed but only really using it for the SQL queries. Really useful tip about the context, will spend some time exploring the toolbar more.