Loading data from excel file into models.

Based on this question https://stackoverflow.com/questions/37688054/saving-a-pandas-dataframe-to-a-django-model I’m trying to add data from excel to my Model.
The reason why I prefer using Pandas is the to_sql() method which seems to be way faster than iterating over each row by itself. The thing is my excel file migth be big (100k+ rows), therefore injecting the data using some other method might take quite a while.

So my model looks like this:

class Company(models.Model):
    KIND = (
    (0, "Corporate"),
    (1, "Start Up"),
    name = models.CharField(verbose_name="full company", max_length=50)
    address = models.CharField(max_length=100)
    kind = models.IntegerField(default=0)

My excel that is loaded into dataframe looks like this(df):

      name    address     kind
1   Ubisoft    Dowtown       0
2    Exxon  Tuas Link        1

Now I followed the suggestion in that question and ended up creating the following code in my views.py:

from django.shortcuts import render
from django.shortcuts import redirect
from sqlalchemy import create_engine
# Create your views here.
from django.http import HttpResponse
from django.conf import settings
import pandas as pd
from importer.models import Company

def importer(request):
    df.columns=["id", "name", "address", "kind"]
    df.set_index("id", inplace=True)
    user = settings.DATABASES['default']['USER']
    password = settings.DATABASES['default']['PASSWORD']
    database_name = settings.DATABASES['default']['NAME']

    database_url = 'postgresql://{user}:{password}@localhost:5433/{database_name}'.format(
    engine = create_engine(database_url, echo=False)
    df.to_sql(Company, con=engine,  if_exists='append')

So it seems that it loads dataframe correctly but then I get error when it is trying to append data to the database in postgresSQL, because I get the following error when I access route corresponding to importer view:

Traceback (most recent call last):
  File "C:\Users\Admin\Anaconda3\lib\site-packages\django\core\handlers\exception.py", line 34, in inner
    response = get_response(request)
  File "C:\Users\Admin\Anaconda3\lib\site-packages\django\core\handlers\base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "C:\Users\Admin\Anaconda3\lib\site-packages\django\core\handlers\base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "C:\Users\Admin\Documents\DjangoApps\excel_reader\importer\views.py", line 29, in importer
    df.to_sql(Company, con=engine,  if_exists='append')
  File "C:\Users\Admin\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2713, in to_sql
  File "C:\Users\Admin\Anaconda3\lib\site-packages\pandas\io\sql.py", line 518, in to_sql
  File "C:\Users\Admin\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1321, in to_sql
    if not name.isdigit() and not name.islower():
AttributeError: type object 'Company' has no attribute 'isdigit'

Anyone has idea what can be causing this problem?

df.to_sql doc suggests to pass the name of the SQL table as string but you’re passing the django object(Company) instead of the table name. Check what is the name of the table that has been created in Postgres for Company model and try passing that.

1 Like

Thanks it worked! I looked up table name in pgAdmin4 and just used is as string.

please how did you run am trying it now to load the exel file data to django model

How can we add some logic to avoid duplicate entries for the incoming data. For example I am able to save the excel file in my database using to_sql. I have to validate that no two duplicate entries can be created for a column in this case I have title column and validate if two products have same title or name, the entry does not get added to the database.

any updates on the duplicate entry?