slow queries

Hi,

Just after a bit of advice when it comes running complex queries across multiple tables. Although im not sure my queries are too complex i seem to get bad performance.

I think i will look at caching to solve some of the performance problems, but i have many views that are based upon users selecting time frames so can’t see how the cache will work for these?

I’ve also noticed that if i restore the live database to my dev platform the queries are faster, but running on less spec’d infra which i’m assuming is because the restore maybe does some form of DB tidy up?

So what is the recommended way to handle situations when complex calculations are on the fly.

Thanks

Tom

There is no pat answer to issues like this.

You’ll want to use the EXPLAIN feature of your database to help identify where the bottlenecks may be within the query. Generally, improvements are going to be made by either adding indexes where appropriate, or possibly even restructuring the query if there are a lot of joins involved. It may even end up being necessary to restructure the tables if there’s no other avenue for improvement.

I’m not sure I understand what you mean by:

If you’re talking about query expressions calculating values being the time-consuming operations, then optimizing these may take a fair bit of work.

But it all depends on the specifics. There really isn’t a way to generalize this.

Thank you, Ken.

I mean like when the user selects a new date range all the calculations are re-calculated for that time period and presented in tables and charts.

I was thinking of running some overnight celery tasks to do all of calculations and adding to a kind of key metrics table - would that be sensible? I know its hard to answer without understanding the metrics that are being generated, but seemed like maybe a solution to at least reduce some of the queries against the database

That’s actually not a bad idea.

We do something like that. Our date ranges are rather arbitrary - we don’t know in advance what date range a set of reports will cover. So when an individual defines a date range, we allow them to save the range and the aggregated data and store it for later retrieval. (Should some data change within that range, they can reselect the range to update the values.)

Ok. That’s good. I will give this a go.

Thanks for your help again, Ken.

Tom