Django ORM queries take too much time with MySQL on the server

We have a Django script that loops over the rows of a CSV file and adds them to the database. Locally, with SQLite, the script runs fast, but on the server, with MySQL, it takes far too long.
I debugged the script and found that after some number of rows, the line that fetches the first matching object takes a very long time to respond.
code:

filtered = Stock_Csv.objects.filter(ticker=name)
filtered_by_date = filtered.filter(date=date_object)
current_obj = filtered_by_date.first()

When it reaches this line: current_obj = filtered_by_date.first(), it takes too much time to respond.

But this only happens with MySQL; locally, with SQLite, we don’t have this problem.

First, keep in mind that the first two statements do not actually access the database.

Querysets are lazy, which means they aren’t passed to the database server for execution until it’s necessary to do so. So it’s only the last statement that is hitting the database. The first two statements only serve to help define the query.

Now, what is meant by “too much time”?
How large is the Stock_Csv table (how many rows)?
What is your primary key for the table?
Do you have a default ordering defined on the model?
(You’re not specifying an order_by clause in the query; how do you know which row you’re going to get as “first”?)

It might be worth creating an index for ticker and another index for date.

You might also look at your MySQL configuration. Look at the query being generated, and run that individually, directly on the database. Then run it from your server. (What I would be trying to do at this point is isolate where the time is being spent.)

While debugging I noticed that the first 50 or 100 rows (the number varied between runs) are processed quickly, but after that, each call to .first() takes 2 or 3 seconds to respond. I don’t use order_by because there is only one matching object in filtered_by_date.
For example, this is my query:

SELECT "csvtojson_stock_csv"."id", "csvtojson_stock_csv"."ticker", "csvtojson_stock_csv"."date", "csvtojson_stock_csv"."open", "csvtojson_stock_csv"."high", "csvtojson_stock_csv"."low", "csvtojson_stock_csv"."close", "csvtojson_stock_csv"."vol", "csvtojson_stock_csv"."arzesh", "csvtojson_stock_csv"."tedad", "csvtojson_stock_csv"."diroz", "csvtojson_stock_csv"."shamsi", "csvtojson_stock_csv"."akharin", "csvtojson_stock_csv"."office" FROM "csvtojson_stock_csv" WHERE ("csvtojson_stock_csv"."ticker" = کرماشا AND "csvtojson_stock_csv"."date" = 2020-02-23)

and when I run this query in phpMyAdmin (MySQL), I get this error:

SQL query:

SELECT "csvtojson_stock_csv"."id", "csvtojson_stock_csv"."ticker", "csvtojson_stock_csv"."date", "csvtojson_stock_csv"."open", "csvtojson_stock_csv"."high", "csvtojson_stock_csv"."low", "csvtojson_stock_csv"."close", "csvtojson_stock_csv"."vol", "csvtojson_stock_csv"."arzesh", "csvtojson_stock_csv"."tedad", "csvtojson_stock_csv"."diroz", "csvtojson_stock_csv"."shamsi", "csvtojson_stock_csv"."akharin", "csvtojson_stock_csv"."office" FROM "csvtojson_stock_csv" WHERE ("csvtojson_stock_csv"."ticker" = کرماشا AND "csvtojson_stock_csv"."date" = 2020-02-23) LIMIT 0, 25

MySQL said:
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '."id", "csvtojson_stock_csv"."ticker", "csvtojson_stock_csv"."date", "csvtojs...' at line 1

Each CSV file has roughly 2000 rows, give or take.
The primary key in the Django settings is set to django.db.models.BigAutoField.
We don’t have a default ordering on the model.
All of these problems happen in MySQL, not in SQLite.

The query attribute generated by something like print(queryset.query) is not valid SQL as-is; it’s not the true query being issued to the database. You need to enclose the string constants in single quotes.

Your WHERE clause should look more like this:
WHERE ("csvtojson_stock_csv"."ticker" = 'کرماشا' AND "csvtojson_stock_csv"."date" = '2020-02-23')

Note also that the double-quoted identifiers suggest this output was generated against SQLite: MySQL/MariaDB quotes identifiers with backticks (double quotes only work when the ANSI_QUOTES SQL mode is enabled), which is why phpMyAdmin reported the #1064 syntax error.

You should define indexes in your model for both the ticker and date fields.

Thank you. I added the indexes and I think it got better, but my code still hangs; I uploaded a video for you.
link of video: Upload files for free - execcsvtojson.mp4 - ufile.io
And I changed the code to this:

current_obj = Stock_Csv.objects.filter(Q(ticker__exact=split_ticker(row1[0])) & Q(date__exact=date_object)).first()

Ok, so there’s more going on here than just that query. It looks like you might be running this query multiple times in a loop?

Please post the complete view in which this query is executing. There may be other factors involved beyond just the query itself.


Yes, I run it in a loop.
I’ve changed the code a lot, but this is what I ran in the video:

from csvtojson.models import Stock_Csv
import csv
from django.db.models import Q
from datetime import datetime
import os,glob
from django.db import connection
from farsi_tools import standardize_persian_text
import logging

logging.basicConfig(filename='csvtojson.log', filemode='w',format='%(asctime)s - %(message)s', level=logging.INFO)


def split_ticker(ticker):
    if "-" not in ticker:
        return ticker
    split_word=ticker.split("-")[0:-1]
    word="".join(split_word)
    return word



def add_csv_to_database():
    print("csvtojson ejra shod")
    #csv folder path
    path = r'/home/chartban/chartban/tsefile'
    extension = 'csv'
    os.chdir(path)
    #all csv file in folder
    all_csv_file = glob.glob('*.{}'.format(extension))
    for csvfile in all_csv_file:
        fullpath=f"{path}/{csvfile}"
        with open(fullpath,encoding='utf-16le',errors='ignore') as csvfile:
            try:
                csvreader = csv.reader(x.replace('\0', '') for x in csvfile)#replace null byte
                header = next(csvreader, None)  # skip the header row
                should_update = []
                should_create = []

                row1 = next(csvreader, None)  # first data row (None/[] if the file is empty)
                if not row1:
                    continue  # skip empty files instead of aborting the whole run
                print(split_ticker(row1[0]))
                logging.info(split_ticker(row1[0]))

                for row in csvreader:
                    #agar row khali nabod
                    if len(row) !=0:
                        myrow=[]
                        column_count=1
                        for column in row:
                            if column_count == 1 or column_count == 13:
                                #split the word untill last -

                                myrow.append(split_ticker(column))
                            else:
                                myrow.append(column)
                            column_count+=1
                        date_object=datetime.strptime(myrow[1][2:],"%y%m%d")
                        # find exist object
                        print("before")
                        current_obj=Stock_Csv.objects.filter(Q(ticker__exact=standardize_persian_text(split_ticker(row1[0]))) & Q(date__exact=date_object)).first()
                        print("after")


                        # update object
                        if current_obj:

                            current_obj.open=myrow[2]
                            current_obj.high=myrow[3]
                            current_obj.low=myrow[4]
                            current_obj.close=myrow[5]
                            current_obj.vol=myrow[6]
                            current_obj.arzesh=myrow[7]
                            current_obj.tedad=myrow[8]
                            current_obj.diroz=myrow[9]
                            current_obj.shamsi=myrow[10]
                            current_obj.akharin=myrow[11]
                            current_obj.office=myrow[12]

                            should_update.append(current_obj)
                        else:
                            # make new object of Stock_Csv
                            stock=Stock_Csv(ticker=standardize_persian_text(myrow[0]),
                                            date=date_object,
                                            open=myrow[2],
                                            high=myrow[3],
                                            low=myrow[4],
                                            close=myrow[5],
                                            vol=myrow[6],
                                            arzesh=myrow[7],
                                            tedad=myrow[8],
                                            diroz=myrow[9],
                                            shamsi=myrow[10],
                                            akharin=myrow[11],
                                            office=myrow[12])

                            should_create.append(stock)
                if should_create:
                    Stock_Csv.objects.bulk_create(should_create)
                if should_update:
                    Stock_Csv.objects.bulk_update(should_update,fields=["open","high","low","close","vol","arzesh","tedad","diroz","shamsi","akharin","office",])

            except Exception as e:
                print("error!!! : ",e)
                pass

Ok, one of the major issues here is that what you’re seeing is misleading you.

Python output from the print function is, by default, buffered. The pauses you are seeing are not necessarily caused by a delay in that one statement, but caused by the buffer being emptied and waiting for the next buffer to be filled.

If you’re going to correctly and accurately evaluate this, you need to track the actual time spent in each operation, and not rely upon when you see the buffer being sent to the screen.
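For example, a small helper like the following (a sketch; the timed function is a name of my own, not something from the code above) logs the time actually spent in each call instead of relying on when print output appears:

```python
import logging
import time

def timed(label, func, *args, **kwargs):
    """Call func, log how long it took, and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    logging.info("%s took %.4f s", label, elapsed)
    return elapsed, result

# Hypothetical usage inside the loop (filtered_by_date as in the thread):
#   elapsed, current_obj = timed("first()", filtered_by_date.first)
```

If you do keep print statements, print(..., flush=True) forces the buffer to flush immediately, so the output timing reflects when the statement actually ran.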

Timing for each file with MySQL on the server:
link photo: Pasteboard - Uploaded Image
Timing for each file with SQLite locally:
link photo: Pasteboard - Uploaded Image