Discussion on Django Benchmarking project

Hi everyone, I want to contribute to this project as part of GSoC 2022, and I’ve created this discussion to get some more details about the project.

  1. The project mentions that top-level benchmarks should be run against each pull request, so Actions should be set up to run djangobench whenever a pull request is made to the django repo, right? This could be done either by creating a repository dispatch event in djangobench (a rough sketch of that approach is after this list) or by including benchmarking in the django repo itself. Which way should I go?

  2. The project points to the lack of benchmarks in the HTTP-serving-speed area. Does it mean that the difference in response time between two different versions of Django should be benchmarked, or should the time taken by the server to fulfill the request be benchmarked?

  3. Storing the results - The results can be stored either as artifacts or in a managed database. While both have their pros and cons, storing them in a DB might be better, as they can be queried easily instead of first having to download and extract a zip file. I would love to hear someone’s thoughts on this.

  4. I started contributing to the djangobench repo: I created a pull request, and there are several changes that I want to push, but I did not get any response from the maintainers. Does it usually take long?
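
Regarding question 1, the repository dispatch approach would look roughly like this from the django side: a CI step calls the GitHub REST API’s dispatches endpoint to kick off a workflow in djangobench. The token handling, event name, and payload fields below are just placeholders I made up.

```python
# Rough sketch only: send a repository_dispatch event to djangobench so a
# workflow there can run the benchmarks. Requires a token with access to the
# target repo; "run-benchmarks" and the payload keys are made-up placeholders.
import os

import requests

response = requests.post(
    "https://api.github.com/repos/django/djangobench/dispatches",
    headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
    },
    json={
        "event_type": "run-benchmarks",
        "client_payload": {"django_ref": "main"},  # ref/PR to benchmark
    },
    timeout=30,
)
response.raise_for_status()  # GitHub returns 204 No Content on success
```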

Hi @deepakdinesh1123 — Welcome.

I see your PR here: ORM Overhead Compared to cursor.execute() Benchmark Added by deepakdinesh1123 · Pull Request #42 · django/djangobench — Sometimes it can take a while, yes. The benchmarks aren’t super active… picking that up is part of the project.

I see four main areas:

  1. Docs and bootstrapping — so that it’s as easy as possible for folks to get going with the benchmarks. This may be no more than a refresh to the README but…

  2. Getting the benchmarks running regularly. Maybe GHAs could be used for that. (That would be great, no? But the GHA minutes… :thinking:) Maybe we need to host the runner ourselves. We have the capacity to do that server-wise, no problem, but we don’t necessarily have the bandwidth to say we need X, Y, and Z exactly. We could either do it in a VM directly if that’s better, or via a docker-compose file, or …

  3. It would be good to pick up @smithdc1’s work on ASV and get that running all the time. (So following on from 2.)

  4. Then a test project that we could configure with different servers (Gunicorn/Daphne/…) to test WSGI vs ASGI, and different Django versions (going forward — we don’t need to test super old versions, which can be hard these days) with basic load tests to see how they perform; that would be very helpful, and there’s a rough sketch of what I mean by a basic load test just after this list. (We’re in the middle of extending async support, and being able to see the effect of different setups and changes is needed to make progress on some of these points.)
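
By “basic load tests” I mean nothing fancier than something along these lines to start with. The URL, request count, and concurrency are placeholders, and a dedicated tool (Locust, k6, …) would likely do this job better, but it shows the shape of it:

```python
# A very rough sketch of a "basic load test" against the test project; the
# URL, request count and concurrency are placeholders. A dedicated load-test
# tool would probably do this job better.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URL = "http://127.0.0.1:8000/"  # wherever Gunicorn/Daphne serves the test project
REQUESTS = 200
CONCURRENCY = 10


def timed_get(_):
    start = time.perf_counter()
    with urlopen(URL) as response:
        response.read()
    return time.perf_counter() - start


with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    latencies = sorted(pool.map(timed_get, range(REQUESTS)))

print(f"median: {statistics.median(latencies) * 1000:.1f} ms")
print(f"p95:    {latencies[int(len(latencies) * 0.95)] * 1000:.1f} ms")
```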

Your 3: storing results — We can certainly find some DB space for this, yes. Even if we just start gathering data going forward, we’ll soon be able to visualise patterns (and improvements/regressions).
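
As a sketch of the kind of thing I mean, and nothing more, something as small as this would do to start with. SQLite is just for illustration here, and the table and columns are made up rather than an agreed schema:

```python
# Illustration only: a tiny results table so runs can be queried and charted
# over time. The schema is made up; the real one would need some thought.
import sqlite3

conn = sqlite3.connect("benchmarks.db")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS benchmark_results (
        id INTEGER PRIMARY KEY,
        benchmark TEXT NOT NULL,
        django_commit TEXT NOT NULL,
        run_at TEXT NOT NULL DEFAULT (datetime('now')),
        mean_seconds REAL NOT NULL
    )
    """
)
conn.execute(
    "INSERT INTO benchmark_results (benchmark, django_commit, mean_seconds) "
    "VALUES (?, ?, ?)",
    ("query_all", "abc1234", 0.0123),
)
conn.commit()
```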

We can put things into django/django if we need to, but the preference would be to keep benchmarking-related bits in djangobench — so I’d say try to do it that way, and if that’s not reasonable we can go the other way (with a “why this has to be here” in hand).

Does that give you enough to be going on with?

Kind Regards,

Carlton

@carltongibson Thank you very much for replying.

I looked in the issue but I could not find @smithdc1’s work on ASV. I found the GitHub Action that ran on Heroku, but it seems like it does not use ASV. He mentioned in a comment that he had the benchmarks working with ASV; should I continue his work by asking him to share it, or do it from scratch?

I’ll upload a draft proposal tomorrow with all the details and the schedule outlined in it; please review it.

Hi @deepakdinesh1123 — I think this is the repo @smithdc1 was working in:

Earlier I looked into this repo, but I could not find the asv.conf.json file, which is a requirement for running benchmarks with ASV according to their documentation. Is there some other way to run it?

I’d imagine you’d need to define that for your setup no? :thinking:

Your best bet might be to open a ticket on the repo asking how to get up and running. (Show how far you’ve got, as better questions with more details are easier to respond to.)

Hi @deepakdinesh1123

Thank you for the interest here.

GitHub - django/django-asv: Benchmarks for Django using asv is the repo I think you are looking for. There are some notes in the README there on how to get up and running locally. Time has passed, though, and I suspect there may be some features in recent versions of Django which mean that some of the benchmarks now fail.
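
In case it helps while you’re getting set up: the benchmarks are essentially classes whose time_* methods asv times, roughly along these lines. This example is illustrative rather than lifted from the repo, and the standalone settings bit is an assumption.

```python
# Rough shape of an asv benchmark: asv calls setup() before timing any
# method whose name starts with time_. This template-rendering example is
# illustrative and not taken from django-asv itself.
import django
from django.conf import settings

if not settings.configured:
    settings.configure()  # minimal standalone settings, for illustration only
    django.setup()

from django.template import Context, Engine


class TimeTemplateRendering:
    def setup(self):
        self.template = Engine().from_string(
            "{% for item in items %}{{ item }}{% endfor %}"
        )
        self.context = Context({"items": list(range(100))})

    def time_render(self):
        self.template.render(self.context)
```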

My latest detailed thoughts on benchmarking Django, which you’ve likely come across, are here: Tracker App · Issue #38 · django/djangobench · GitHub

A couple of additional comments on Carlton’s post above.

Getting the benchmarks running regularly. Maybe GHAs could be used for that.

There was a blog post here (Is GitHub Actions suitable for running benchmarks? | Quansight Labs) whose author seemed to have more joy than I did in getting reliable runs from GHA. It was very noisy when I tried it.

Also, SciPy has an airspeed velocity suite which they appear to be running very regularly and which seems to give repeatable results. I’m not sure how they are doing it :thinking:

@smithdc1 Thank you very much for pointing me to your work on ASV.

I looked into the repo and I was able to run it locally. Should I add all the benchmarks that are in djangobench but not in django-asv, or should I add only a subset of them along with some new benchmarks?

As for running the benchmarks with GHA or some other way, I will look into other options and compare results to see which way would be ideal.

It would be useful to perform an audit to check that I’ve moved all of the djangobench benchmarks across. I thought I had, but… well, it’s been a while.
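
If it helps, that audit could start as a quick diff of benchmark names between local checkouts of the two repos. The paths below are assumptions about the directory layouts, so adjust as needed:

```python
# Quick-and-dirty sketch of the audit: which djangobench benchmarks have no
# counterpart in django-asv? Both paths are assumed layouts of local
# checkouts and will likely need adjusting.
from pathlib import Path

DJANGOBENCH = Path("djangobench/djangobench/benchmarks")
DJANGO_ASV = Path("django-asv/benchmarks")


def benchmark_names(root):
    return {p.name for p in root.iterdir() if p.is_dir() and not p.name.startswith("_")}


missing = benchmark_names(DJANGOBENCH) - benchmark_names(DJANGO_ASV)
print("Possibly not yet ported to django-asv:")
for name in sorted(missing):
    print(f"  {name}")
```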

Happy to try and comment on any questions you have. It may be best in the form of issues/PRs on GitHub, especially if it’s code related.
