Best database for my new app prototype: some required PLUS many optional fields for users

Hi,

What’s the best database model for my new app prototype: it’ll have some 10-20 required PLUS many (30-40) optional fields for user profiles. And the app is expected to have many active users (at least that’s what I hope for).

  • Is it best to just put all in SQLite table? That would end up with table, each row with like columns with many empty fields.

  • Or should I use SQLite with JSON extension, where I put optional data in the JSON table? (which I can later convert to PostgreSQL with JSONB)

  • What about using MongoDB? Or a hybrid of MongoDB with SQL?

In this 2 years old Reddit thread most users suggest to not use MongoDB with Django as it’s not well integrated with it, and Django doesn’t have admin dashboard for MongoDB databases as it has for SQL databases.

How should I best structure my database when building with Django for my app prototype?

We would need a lot more details about your intended model structure, specifically the optional profile fields.

Are there any logical groupings of fields? (For example, you may provide fields for a mailing address. Individually, that could be 5 or 6 fields, but constitute a group that could be called “mailing address”.)

Are there any repeated instances? (For example, you may provide fields for multiple email addresses. You may want to save three email addresses, so you say you need three fields, but it’s actually three instances of an “email” field.)

<opinion>
Never for related data. That’s not what it’s designed for.

Nor do I recommend the use of a JSON field for anything other than a “document” storage. It’s not a substitute for a proper set of related tables.
</opinion>

1 Like

It’s a dating app, so there are not many groupings or repeats.

There will be one big logical group, which is “what I’m looking for” or “my match”, describing what user will look for in potential partner. It will repeat most of the optional profile fields for matching.

If you’re looking for assistance here, we’re going to need the specifics.

General descriptions of this situation does not provide enough information to make a recommendation of a structure for a model, beyond the general answer that you should create a normalized set of tables representing the data entities you’re working with.

Below is description of user profile data:

Required fields marked with *

User profile data:

  • Email*
  • First name and last name*
  • Password*
  • Gender*
  • Birth of date*
  • Country*
  • City
  • Profile ID number (generated automatically)*
  • Up to 10 pictures
  • Profile name*
  • Gender*
  • Religion
  • Online or offline
  • Last active*
  • About me: free text
  • What type of partner I’m looking for: free text
  • Looks:
    • Height
    • Weight
    • Eye color
    • Hair color
    • Ethnicity
  • Drinks
  • Smokes
  • Marital status
  • Nationality (adding like tags from drop down list)
  • Languages spoken (adding like tags drop down list)
  • Children:
    • Have children
    • Nr of children
    • Children age and genders (adding children with age and gender)
  • Education
  • Income
  • Employment status
  • Hobbies, interests and sports (choosing multiple from list and a text area)
  • Favorited profiles (key to another table)
  • “What I’m looking” for fields:
    • Age
    • Country
    • Religion
    • Language (can add multiple)
    • Nationality
    • Looks:
      • Height
      • Weight
      • Eye color
      • Hair color
      • Ethnicity
    • Drinks
    • Smokes
    • Nationality (adding like tags from drop down list)
    • Languages spoken (adding like tags from drop down list)
    • Children:
      • Have children
      • Nr of children
      • Children age and genders (adding children ages and genders)
    • Marital status
    • Education
    • Income (just a number in USD or EUR)
    • Employment status
    • Hobbies, interests and sports (choosing multiple from list and a text area)

There will be other technical fields too, like email verification code. More fields can be added depending on additional features, like a “is verified” field for some profile information like age, country, picture, income etc.

Other tables / entities will be about messages and conversations, likes etc.

And we’ll have a separate table / entity for “admin users”.

But these other entities won’t need many fields.

If I were designing for this, there would be a number of tables involved.

There are a couple of subcategories here that I would use to organize my structures.

Yes, you have the “personal” information, that’s one grouping. The “what I’m looking for” is a separate grouping. You could also possibly divide these into the subgroupings “demographics” (physical characteristics such as age, height, weight, etc) and “personality” (characteristics such as “Hobbies”, “drinks”, etc).

I’d also create a separate model for all “free text” fields consisting of a “field_type” identifier (e.g. “About me”, “Partner type”, etc) and the free text field.

I’d also create a number of “code table” - the ones that you’re talking about using as select lists. (One code table per field, such as “Language”, “Nationality”)

Then there would be other tables in a many-to-one relationship with the base profile such as “Children age and genders”, or “Pictures”.

Just by very rough count, I can easily see there being 8 - 10 models for this, possibly more for additional code tables that you may wish to create, such as for “Education”, “Drinks” and/or “Smoke”.

You could do it in fewer, and I wouldn’t say you would be wrong in doing that, either. Other than the “select multiple” categories that would be a Many-to-many relationship with their respective tables and the Many-to-one relationships, there’s nothing fundamentally wrong with putting all the columns in two or three tables. (At a minimum I would have a separation between the “About me” and the “What I’m looking for” columns.)

So for the sake of simplicity, let’s say instead of having one table like this:
Person(ID, email, name, country, height, weight, eyecolor).

We divide it into two tables:
Person(ID, email, name, country)
PhysicalFeatures(ID_FK (FK to ID of the Person), height, weight, eyecolor)

Since each person has single physical feature, there’ll be one-on-one relationship between these two tables. If the user didn’t enter any detail on physical features, there’ll be no row for the person, saving memory space.

Typically there’ll be a lot of reading from the rows, as users open and close multiple profile pages quickly. Wouldn’t dividing the table like this make such reading procedure slower, as the app always needs a “join” operation to retrieve the data?

Whatever time is required for the indexed join is going to be far overshadowed by the amount of time for the data to be rendered and presented to the user. Worrying about that join is focusing on an effectively irrelevent issue.

What is more important is that your data design best facilitates usage - queries, adds, updates, and deletes as well as browsing, and the security, integrity and reliability of the data. It’s the totality of operations that should be looked at.

1 Like

Ok, so I’ll start with a more unified model, and divide it later if I figure out it’s better to divide.