Loading data from excel file into models.

datainvestor · April 7, 2020, 7:33am

Based on this question python - Saving a Pandas DataFrame to a Django Model - Stack Overflow I’m trying to add data from excel to my Model.
The reason why I prefer using Pandas is the to_sql() method which seems to be way faster than iterating over each row by itself. The thing is my excel file migth be big (100k+ rows), therefore injecting the data using some other method might take quite a while.

So my model looks like this:

class Company(models.Model):
    KIND = (
    (0, "Corporate"),
    (1, "Start Up"),
    )
    name = models.CharField(verbose_name="full company", max_length=50)
    address = models.CharField(max_length=100)
    kind = models.IntegerField(default=0)

My excel that is loaded into dataframe looks like this(df):

      name    address     kind
id
1   Ubisoft    Dowtown       0
2    Exxon  Tuas Link        1

Now I followed the suggestion in that question and ended up creating the following code in my views.py:

from django.shortcuts import render
from django.shortcuts import redirect
from sqlalchemy import create_engine
# Create your views here.
from django.http import HttpResponse
from django.conf import settings
import pandas as pd
from importer.models import Company


def importer(request):
    df=pd.read_excel("datafile.xlsx")
    df.columns=["id", "name", "address", "kind"]
    df.set_index("id", inplace=True)
    print(df.dtypes)
    user = settings.DATABASES['default']['USER']
    password = settings.DATABASES['default']['PASSWORD']
    database_name = settings.DATABASES['default']['NAME']

    database_url = 'postgresql://{user}:{password}@localhost:5433/{database_name}'.format(
        user=user,
        password=password,
        database_name=database_name,
    )
    engine = create_engine(database_url, echo=False)
    df.to_sql(Company, con=engine,  if_exists='append')

So it seems that it loads dataframe correctly but then I get error when it is trying to append data to the database in postgresSQL, because I get the following error when I access route corresponding to importer view:

Traceback (most recent call last):
  File "C:\Users\Admin\Anaconda3\lib\site-packages\django\core\handlers\exception.py", line 34, in inner
    response = get_response(request)
  File "C:\Users\Admin\Anaconda3\lib\site-packages\django\core\handlers\base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "C:\Users\Admin\Anaconda3\lib\site-packages\django\core\handlers\base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "C:\Users\Admin\Documents\DjangoApps\excel_reader\importer\views.py", line 29, in importer
    df.to_sql(Company, con=engine,  if_exists='append')
  File "C:\Users\Admin\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2713, in to_sql
    method=method,
  File "C:\Users\Admin\Anaconda3\lib\site-packages\pandas\io\sql.py", line 518, in to_sql
    method=method,
  File "C:\Users\Admin\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1321, in to_sql
    if not name.isdigit() and not name.islower():
AttributeError: type object 'Company' has no attribute 'isdigit'

Anyone has idea what can be causing this problem?

sks444 · April 7, 2020, 11:16am

df.to_sql doc suggests to pass the name of the SQL table as string but you’re passing the django object(Company) instead of the table name. Check what is the name of the table that has been created in Postgres for Company model and try passing that.

datainvestor · April 7, 2020, 2:52pm

Thanks it worked! I looked up table name in pgAdmin4 and just used is as string.

Bintysaed · February 20, 2021, 6:19am

please how did you run am trying it now to load the exel file data to django model

venqics · January 31, 2022, 12:18pm

How can we add some logic to avoid duplicate entries for the incoming data. For example I am able to save the excel file in my database using to_sql. I have to validate that no two duplicate entries can be created for a column in this case I have title column and validate if two products have same title or name, the entry does not get added to the database.

muneermohd9690 · February 21, 2022, 6:34pm

any updates on the duplicate entry?

Topic		Replies	Views
Inserting data into database using files on filesystem Using Django	6	2543	September 20, 2020
SQLAlchemy engine issue - data not transferred to db using Pandas Using the ORM	1	446	February 6, 2023
insert pandas dataframe from API into model without a form Forms & APIs	3	2126	March 11, 2022
How to save processed excel file (by pandas) in django model? Forms & APIs	4	1250	February 14, 2023
Recommended Method to Process Files in Pandas Using Django	6	2799	December 26, 2021

Loading data from excel file into models.

Related topics