Opinions on a few choices I made with my app

[If this thread is too generic, let me know and I’ll make sure to ask more specific questions: I would immensely appreciate it even if you guys gave me an answer even to a single point]

I’ve been using Django for approximately 1 year, and I’m currently working on my biggest project so far. It’s a web app for my university, for teachers to create online exams and students to take on them, and I made it with Django REST fw + vue.js.

It’s become a big project and I’m proud of the fact I’ve been able to make it work all by myself, but now I’m taking a look at the code I wrote so far and I’m wondering what could’ve been done better. This is mainly a huge learning experience for me. I’ll give here a brief rundown of what the app does, and then I’ll ask some specific questions.

  • Teachers and students can log in using their university account (this is done via Google social login with oauth2)
  • Teachers have an editor to create Exams, which are made up of Questions and programming Exercises. All exam items belong to a Category, and there are several randomization options to have students be presented with different questions or in a different order (the words beginning with a capital letter are models).
  • Students log in and type in the exam code, given to them by teachers, and are then presented sequentially all the questions and exercises. The sequence is dynanically generated as students answer or skip items. All GivenAnswers to questions and Submissions to programming exercises are recorded and stored in a db.
  • At the end of an exam, teachers can download either a CSV report or a zip archive containing a PDF file for each student, to see the answers (ExamReport model).
  • There are other features that aren’t relevant here, like: a dashboard for teachers to monitor progress during an exam; a websocket-based lock that prevents two teachers from editing an exam at the same time, and a node backend that runs student-submitted code for programming exercises and automatically tests it against teacher-provided test cases.

Here’s a link to the repository for the backend: GitHub - samul-1/js-exercise-platform
(The name js-exercise-platform is misleading: this started out as a platform for programming exercises in JavaScript, but then expanded as a general-purpose exam platform that any course or subject can use)

Here are some doubts I have about my code and the general sw engineering choices I made:

  • I never used model inheritance, and I found myself with some repeated code. Take this model:
class Question(models.Model):
    """
    A question shown in exams
    """
    # ...
    text = models.TextField()
    rendered_text = models.TextField(null=True, blank=True, default="")
    # ...

    def save(self, render_tex=True, *args, **kwargs):
        # ...
        super(Question, self).save(*args, **kwargs)
        if render_tex:
            self.rendered_text = tex_to_svg(self.text)
            self.save(render_tex=False)

The function tex_to_svg runs Mathjax in a node subprocess to convert LaTeX formulas to svg’s, preventing the users’ client from having to do the rendering. We also keep the original text around to allow teachers to update the text of the items.

I have those two fields and that logic in the save method in 3 other models. Is this considered bad practice? What the instinct tells me is I would have to create an abstract class LaTeXRenderable or whatever, and use it as an interface to inherit those two fields. I am unsure what would happen to the save method though. What if I have to inherit from another abstract model too? Moreover, would it be better to have an abstract ExamItem model and inherit from it for both Question and Exercise?

  • when it comes to nested serializers, I’m having a hard time generalizing or keeping the code concise and DRY. If you look at my ExamSerializer’s update method, you’ll concur it’s pretty ugly. I have duplicate code all around. One thing I thought of trying is the following: in my ExamSerializer I would keep a field children with the name of the models I need to nested-update, and then I could do something like this:
for child in chidren:
  child_model = get_model("jsplatform", child)
  child_serializer = getattr(sys.modules[__name__], f"{child}Serializer")
  # update each item
  for item_data in items_data:
      if item_data.get("id") is not None:  # item has an id: object already exists
          item = child_model.objects.get(pk=item_data["id"])
          save_id = item_data.pop("id")
      else:  # item does not have an id: create new object
          item = item(exam=instance)
          item.save()
          save_id = item.pk

      serializer = child_serializer(
          item, data=item_data, context=self.context
      )

      # pop item category as it's not handled by the serializer
      item_category = item_data.pop("category", None)
      # item belongs to a new category we just created; get it
      if item_category is None:
          item_category = Category.objects.get(
              tmp_uuid=item_data["category_uuid"]
          )

      serializer.is_valid(raise_exception=True)

      # update item
      updated_item = serializer.update(
          instance=item, validated_data=item_data
      )
      # update item category
      updated_item.category = item_category
      updated_item.save()

      # remove item from the list of those still to process
      items = items.exclude(pk=save_id)

  # remove any items for which data wasn't sent (i.e. user deleted them)
  for item in items:
      item.delete()

Would this be any better?

  • I have a method that I use for Questions that goes like this:
    def format_for_pdf(self):
        return {
            "text": preprocess_html_for_pdf(self.rendered_text),
            "introduction_text": preprocess_html_for_pdf(self.introduction_text),
            "type": self.question_type,
            "answers": [
                {
                    "text": preprocess_html_for_pdf(a.rendered_text),
                    "is_right_answer": a.is_right_answer,
                }
                for a in self.answers.all()
            ],
        }

When having to do this kind of processing, is there a better/more elegant way? I thought I could probably do this with a serializer rather than a model method. How would I achieve this?

In addition to the points above:

  • do you see any no-no’s in my code, just glancing through the repo?
  • are there queries that could be optimized? I have recently started using prefetch_related, but I’m learning there are more ways to optimize db access
  • what are the best ways to keep your Django code DRY and just compliant with the SOLID principles in general when you have several parts of your application that behave similarly, but also slightly differently?

Thank you so much to anyone who made it through the post, and I’m looking forward to hearing your feedback.

Hi Samul-1,

Thanks for asking these questions. They are difficult questions to answer because they depend on your situation, funding, availability, and goals. I might suggest reading some other materials to see how others reason about things. Some resources are:

Looking through your project, I would consider the question, how do I onboard another engineer? I see some tests, but you’re definitely lacking in that area. If this were to expand as an actual product and you bring others on, tests go a long way in preventing bugs as new devs get acquainted with the project. It may also reveal areas that you should refactor to make things easier to test (and potentially reason about).

Some other things to check out:

  • Deployment checklist, specifically the HTTPS section.
  • Try creating large quantities of fake data to see if your application or admin breaks anywhere. This can help you find areas that may need optimization as your userbase and content grows. You can use the Django Debug Toolbar to identify problematic SQL queries, though we’re still working on the ASGI compatibility.

Best of luck and good work so far!

-Tim

2 Likes

Thank you so much for your quick answer! I’ll make sure to read the material you posted.

As far as the testing goes: I wish I had written more tests, but the truth is–up to a few weeks ago, I kept getting asked to change and implement new stuff. Most of those tests don’t even pass anymore, because after a while I just temporarily gave up on testing. I found myself, being the only developer working on this, having to make the decision between implementing all the requested features and writing the tests.

Now the situation is definitely more stable and I’m not getting any more new feature requests, so I can start working on adding test cases. As far as generating fake data, what would be your suggested method? I have heard the name “fixtures” being thrown around, but I haven’t read about it in details yet.

One last, more specific question about business logic: yesterday I was reading the first thread you linked and I got to an article which was about the disadvantages of the fat model pattern, and offered a few guidelines for how to decide when a model method is to be moved out to a separate module. I’ll retrieve the link to the article later.

In my case, my ExamReport model serves to hold CSV and (a zip archive of) PDF reports of an exam. Now, let’s disregard the fact that for the csv part I’m actually generating a string and keeping it in the model (I’ll soon make a method to generate an actual file like I do for the pdf). What I’m wondering is, is it fine to keep those methods in the model? You’ll notice the model methods make calls to methods in pdf.py, so I was wondering if it could be better to have a class like ExamReportMaker in a separate module encapsulate that feature, and have something like a method that takes in an exam and returns (and saves to the db) an ExamReport.

Thank you once again for your input, it’s very much appreciated.

A lot of developers/companies have this issue when trying to write automated tests. The reality is that your development speed slows down and you have to accept it. Only writing tests when things are slow will result in tests never being written. You don’t have to write tests for everything, but you should make sure the tests you do have are always passing.

Fixtures are one way. Recently I’ve started writing a management command per project that will generate the test data programmatically. That way it can be run on any environment and it’s pretty easy to change once it’s written. The downside is that it won’t run as fast as a fixture.

That’s a good question. I’ve heard solid arguments both ways. I personally am not a fan of the fat models philosophy. I prefer my models to represent the data model. That said, I haven’t made use of QuerySet inheritance as much as I should. So perhaps that’s why I feel this way. Given all of that, I think your ExamReportMaker approach would make more sense to me. However, that’s in the smallest windows of perspective. If your project has a different set of patterns it’s using, switching it for this one case could be problematic as it would break consistency (which I value a lot).

The added benefit of having a separate class such as ExamReportMaker is that it’s easier to test that functionality. In your tests you could pass it some mock object that would assert that its being operated upon as expected. Where as if it’s all one model, the tests get a bit larger and difficult to maintain.

1 Like

Thank you once again for the valuable advice :slight_smile:

You could say I’m moving my very first steps into building bigger projects, so while I did study design patterns and principles like SOLID in uni, I haven’t had a chance to apply many of them in actual projects yet.

The only pattern I would say I did consciously try to follow in this project was using fat models. However, that does seem to negate principles like the single responsibility one, or high cohesion. So I guess it’ll take me some time to learn the balance.

On the other hand though, I want to say something completely opposite to what’s been said in another thread recently–thank you to all the creators and maintainers of Django! This framework is helping me learn a ton of stuff and is what has gotten me to the point where I can develop apps and transform ideas into something that people can see and use. :muscle:t3:

Hey everybody,

Since the question I’m about to ask is closely related to the other things we discussed here, I’m just going to ask here as opposed to starting another thread.

I recently reworked some of the features of my school app (if you need a refresher, just take a peek at my first post in this thread). Prior to the new version, answers during a test were sent to the server when the user clicked on “next question” in the UI. With the new version I’m working on, a user can also go back during a test, therefore I thought a safer, more robust way of recording answers would be to send a request to the server each and every time a user selects an answer, i.e. when a checkbox/radio button is clicked.

I created two new entry points in my ExamViewSet: give_answer, withdraw_answer.
This is the behavior:

  • for questions that accept a single answer (radio button in the UI), give_answer will either create a GivenAnswer object with the supplied answer, or it will fetch the existing one and update it with the new answer (equivalent to the user selecting a different radio button)
  • for questions that accept multiple answers, give_answer will be reached whenever a user checks a new checkbox; a new GivenAnswer object will be created
  • wheb a user deselects a checkbox, withdraw_answer will be reached and the corresponding GivenAnswer object will be deleted

I’ll post the code for these two actions below. My question is, do you think this is too much business logic to be put inside of a view function? There is slightly more code than I’d like, but I’m not sure it’s enough to warrant trying to factor it out into a helper function or something.

# we're inside ExamViewSet

@action(detail=True, methods=["post"], permission_classes=[~TeachersOnly])
def give_answer(self, request, **kwargs):
    exam = get_object_or_404(self.get_queryset(), pk=kwargs.pop("pk"))
    user = request.user
    if exam.closed:
        return Response(
            status=status.HTTP_410_GONE,
            data={"message": constants.MSG_EXAM_OVER},
        )

    try:
        exam_progress = ExamProgress.objects.get(user=user, exam=exam)
    except ExamProgress.DoesNotExist:
        return Response(status=status.HTTP_403_FORBIDDEN)

    current_question = exam_progress.current_item
    
    # exam items can be Question or Exercise, and this view doesn't work for the latter
    if not isinstance(current_question, Question):
        return Response(status=status.HTTP_400_BAD_REQUEST)

    try:
        answer_id = request.data["answer"]
        answer = Answer.objects.get(pk=answer_id)
    except (KeyError, Answer.DoesNotExist):
        return Response(status=status.HTTP_400_BAD_REQUEST)

    if current_question.accepts_multiple_answers:
        try:
            GivenAnswer.objects.create(
                user=user, question=current_question, answer=answer
            )
        except (InvalidAnswerException, IntegrityError):
            return Response(status=status.HTTP_400_BAD_REQUEST)
    else:
        try:
            GivenAnswer.objects.update_or_create(
                user=user,
                question=current_question,
                defaults={"answer": answer},
            )
        except InvalidAnswerException:
            return Response(status=status.HTTP_400_BAD_REQUEST)

    return Response(status=status.HTTP_200_OK)

@action(detail=True, methods=["post"], permission_classes=[~TeachersOnly])
def withdraw_answer(self, request, **kwargs):
    exam = get_object_or_404(self.get_queryset(), pk=kwargs.pop("pk"))
    if exam.closed:
        return Response(
            status=status.HTTP_410_GONE,
            data={"message": constants.MSG_EXAM_OVER},
        )

    user = request.user

    try:
        exam_progress = ExamProgress.objects.get(user=user, exam=exam)
    except ExamProgress.DoesNotExist:
        return Response(status=status.HTTP_403_FORBIDDEN)

    current_question = exam_progress.current_item

    if not isinstance(current_question, Question):
        return Response(status=status.HTTP_400_BAD_REQUEST)

    try:
        answer_id = request.data["answer"]
        answer = Answer.objects.get(pk=answer_id)
    except (KeyError, Answer.DoesNotExist):
        return Response(status=status.HTTP_400_BAD_REQUEST)

    if not current_question.accepts_multiple_answers:
        return Response(status=status.HTTP_403_FORBIDDEN)

    try:
        withdrawn_answer = GivenAnswer.objects.get(
            user=user, question=current_question, answer=answer
        )
        withdrawn_answer.delete()
    except GivenAnswer.DoesNotExist:
        return Response(status=status.HTTP_400_BAD_REQUEST)

    return Response(status=status.HTTP_200_OK)


Personally I would refactor the fetching logic of the exam, question and answer out into some other function. That logic seems like it’s pretty standard for the whole app. If that were pulled out, the views won’t look so large.

1 Like

That’s what I was thinking too, but I’m kind of having a hard time imagining what it would look like.

Unlike with Http404, which can be raised and automatically cause the view to return a 404 response, my understanding is there is nothing like that for the other error codes. Therefore, I can’t just define something like a get_exam_progress_or_403 function, can I?

So the only thing I can think of, taking for example the fetching of the Answer object, would be to have a function like get_answer_from_request_body, but I would still have to try.. except its execution inside the view to return the correct http code.

Is there anything I can do to abstract the flow a little more, or is this exactly what you meant by extracting the fetching logic?

I’m not sure if there’s an automatic handler like the 404 handler. I’m not sure how else you use this logic, but for the two views above, a decorator seems like the cleanest choice.

There are “error handlers” for other response codes, see - Error views, Customizing error views, and the ‘handler’ settings.

True, there aren’t any shortcut functions like get_exam_progress_or_403 - but a 403 is a permissions/authorization response, which is an exception raised by the permission_required decorator or the PermissionRequiredMixin for CBVs. If you’ve got special handling within your view, there’s nothing prohibiting you from raising the PermissionDenied error for whatever reason you wish.

interesting–I have never written a custom decorator. I looked for resources on how to write one for django views, but I couldn’t really find anything. I don’t have experience working with custom decorators in general, either. are here any resources you could point me to to get started? I found some methods that have to do with decorators in the django official repo, but I couldn’t get to an actual one to inspect the code (searching for, say, login_required mostly yields results in test files)

EDIT: never mind, I found a good primer. I’ll see what I come up with and post it here