I have a strange problem that I can reproduce roughly once every 4 to 6 tries, and so far I have failed to understand how to avoid it.
Somewhere in my Django code there is a function that imports a CSV file of products into the database. From these products I create images with the price on them, and sometimes an old price is used:
def batch_update_products_in_db(working_file, store, delimiter = ";"):
    with open(working_file, encoding = "utf-8") as inputfile:
        reader = csv.reader(inputfile, delimiter = delimiter)
        for row in reader:
            try:
                productdata = {header[i]: row[i] for i in range(1, len(row))}
                product = Product.objects.get(store = store, number = row[0])
                if product.data == productdata:
                    continue
                else:
                    product.data = productdata
                    print("TEST 1:", product.data)
                    product.save()
                    update_product_in_db(product)
            except Product.DoesNotExist:
                Product.objects.create(store = store, number = row[0], data = productdata)
            except Exception as E:
                log(...)
This function calls another function that generates the images:
def update_product_in_db(product):
    def do_update(task):
        links = Link.objects.filter(products = product)
        ## get all link objects for this product:
        for link in links:
            product_pks = [p.pk for p in link.products.all()]
            product_qs = Product.objects.filter(pk__in = product_pks)
            if product_qs.exists():
                img = generate_img_from_layout(link.layout, product_qs)["object"]
                ...
        task.timestamp_end = timezone.now()
        task.is_done = True
        task.save()
    task = BigTask.objects.create(name = "update products in db")
    t = threading.Thread(target = do_update, args = [task])
    t.setDaemon(True)
    t.start()
    return {'id': task.id}
For this question I stripped the generate_img_from_layout() function down to just print the product:
def generate_img_from_layout(layout, product_qs):
    for product in Product.objects.filter(pk__in = product_qs):
        print("Test 2:", product.data)
When I upload multiple CSV files one after another, I get this output in the console:
Test 1: {'name': 'hellotest', 'price': '16.90'}
Test 2: {'name': 'hellotest', 'price': '16.90'}
Test 1: {'name': 'hellotest', 'price': '16.91'}
Test 2: {'name': 'hellotest', 'price': '16.91'}
Test 1: {'name': 'hellotest', 'price': '16.92'}
Test 2: {'name': 'hellotest', 'price': '16.92'}
Test 1: {'name': 'hellotest', 'price': '16.93'}
Test 2: {'name': 'hellotest', 'price': '16.92'} #######
Test 1: {'name': 'hellotest', 'price': '16.94'}
Test 2: {'name': 'hellotest', 'price': '16.94'}
So at some point the product.save() apparently had not finished before the next Product.objects.filter() read the product again.
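One mechanism that would produce exactly this kind of stale read is a write that is still sitting in an open transaction while a second database connection reads the same row. As far as I understand, Django gives each thread its own connection, so the worker thread would not see uncommitted changes made by the request thread. Whether my code actually runs inside a transaction is something I have not verified, so the sketch below only illustrates the effect. It uses the standard-library sqlite3 module instead of Django, and the table and column names are made up for the demo.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.sqlite3")

    # Two independent connections, standing in for the request thread
    # and the worker thread (assumption: one connection per thread).
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)

    # Hypothetical table, not the real Product model.
    writer.execute("CREATE TABLE product (number TEXT PRIMARY KEY, price TEXT)")
    writer.execute("INSERT INTO product VALUES ('1', '16.92')")
    writer.commit()

    # The UPDATE implicitly opens a transaction that is not committed yet.
    writer.execute("UPDATE product SET price = '16.93' WHERE number = '1'")

    # The other connection still sees the previously committed price.
    seen_before_commit = reader.execute("SELECT price FROM product").fetchall()[0][0]

    writer.commit()

    # Only after the commit does the new price become visible.
    seen_after_commit = reader.execute("SELECT price FROM product").fetchall()[0][0]

    writer.close()
    reader.close()

print("before commit:", seen_before_commit)
print("after commit: ", seen_after_commit)
```

If something like this is happening, the thread would have to be started only after the surrounding transaction has committed.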
After posting a similar example on Stack Overflow, I changed the code of batch_update_products_in_db() to:
...
print("TEST 1:", product.data)
product.save()
transaction.on_commit(lambda: update_product_in_db(product))
...
but as a result the update_product_in_db() function was not called at all. Where did I take a wrong turn?
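A side note on the lambda inside the loop, which may or may not be related: Python closures look up `product` when the callback runs, not when the lambda is created. If the callbacks are deferred until the end of a transaction, every one of them would see the last product of the loop. The sketch below shows this in plain Python, with a list standing in for the on_commit queue. The `describe` function and the product names are hypothetical.

```python
from functools import partial


def describe(product):
    # Hypothetical stand-in for update_product_in_db(product).
    return f"updating {product}"


deferred_lambda = []   # callbacks built with a lambda
deferred_partial = []  # callbacks built with functools.partial

for product in ["A", "B", "C"]:
    # The lambda captures the variable `product`, not its current value.
    deferred_lambda.append(lambda: describe(product))
    # partial binds the current value at creation time.
    deferred_partial.append(partial(describe, product))

# Run the callbacks later, the way a commit hook would.
lambda_results = [callback() for callback in deferred_lambda]
partial_results = [callback() for callback in deferred_partial]

print(lambda_results)
print(partial_results)
```

This does not explain why the callback would never run. As far as I know, on_commit callbacks run only once the surrounding transaction commits, and they are dropped if it rolls back, so that is the next thing I would check.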