Template Lexer Tokenisation

Currently, the DTL Lexer uses RegEx for tokenisation. There are a couple of consequences of this.

  • In some scenarios it can result in excessive backtracking; see PR #19360.
  • The current RegEx limits what the verbatim tag can contain in certain situations; see ticket 23424.
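For context, the current approach boils down to splitting the source on one alternation regex. A minimal sketch of the idea (the pattern here is illustrative, not the exact one in django/template/base.py):

```python
import re

# Rough approximation of the kind of pattern a regex-based template lexer
# uses to split a source string into text/var/block/comment tokens.
# Illustrative only — not the exact regex Django uses.
tag_re = re.compile(r"({%.*?%}|{{.*?}}|{#.*?#})")

# re.split with a capturing group keeps the delimiters, so text and
# tag tokens alternate in the result.
bits = tag_re.split("Hello {{ name }}! {% if ok %}yes{% endif %}")
print(bits)
```

The non-greedy `.*?` in each alternative is what can be forced to retry at many positions on pathological inputs such as long runs of unclosed openers, which is the backtracking issue the bullet above refers to.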

Also related is the ticket to allow multi-line template tags; see PR #18805.

I have investigated writing a lexer that avoids using RegEx (is ‘recursive descent’ the correct phrase?). Here’s the branch, see link.
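To illustrate the general shape of the approach, here is a toy scan-based lexer — a hypothetical sketch, not the code in the branch. It walks the string once looking for the two-character open markers, so there is no pattern engine and no backtracking:

```python
# Map each opener to (token kind, matching closer).
OPENERS = {"{{": ("var", "}}"), "{%": ("block", "%}"), "{#": ("comment", "#}")}

def tokenize(source):
    """Single left-to-right pass; unclosed markers degrade to plain text."""
    tokens = []
    i, n = 0, len(source)
    text_start = 0
    while i < n - 1:
        pair = source[i:i + 2]
        if pair in OPENERS:
            kind, closer = OPENERS[pair]
            end = source.find(closer, i + 2)
            if end == -1:
                # Unclosed marker: treat the rest as text (cheap, no retries).
                break
            if i > text_start:
                tokens.append(("text", source[text_start:i]))
            tokens.append((kind, source[i + 2:end].strip()))
            i = text_start = end + 2
        else:
            i += 1
    if text_start < n:
        tokens.append(("text", source[text_start:]))
    return tokens
```

For example, `tokenize("Hello {{ name }}!")` yields `[("text", "Hello "), ("var", "name"), ("text", "!")]`, and a run of unclosed `{{ ` openers falls straight through as one text token instead of triggering quadratic retries.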

It passes the current test suite with one failure: it allows newlines inside tags, which is the change proposed in PR #18805.

Performance-wise, the difference varies. In normal usage it is slower than the current RegEx implementation (about 35% slower in the var example), but it can be significantly quicker in pathological scenarios (the var_nested_open case is roughly 500x faster). See the full benchmark comparison below the line.
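The per-case numbers below come from a simple timing loop; a hypothetical reconstruction of the harness shape (not the actual benchmark script) might look like this:

```python
import timeit

def report(case_name, func, arg, number=100):
    """Print one benchmark case in the format used below (hypothetical harness)."""
    total = timeit.timeit(lambda: func(arg), number=number)
    print("----------------")
    print(f"CASE: {case_name}")
    print(f"Total time taken: {total:.6f} seconds")
    print(f"Time per execution: {total / number:.6f} seconds")
    return total

# Stand-in workload for demonstration; the real cases exercise the lexer.
report("demo", str.split, "a b c " * 100)
```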

Appreciate opinions and thoughts on whether this is something worth investigating further.


Benchmark examples:

Ticket 35675

----------------
CASE: var
Total time taken: 0.000174 seconds
Time per execution: 0.000002 seconds
----------------
CASE: tag
Total time taken: 0.000169 seconds
Time per execution: 0.000002 seconds
----------------
CASE: comment
Total time taken: 0.000127 seconds
Time per execution: 0.000001 seconds
----------------
CASE: mismatched
Total time taken: 0.000084 seconds
Time per execution: 0.000001 seconds
----------------
CASE: var_nested_open
Total time taken: 1.661711 seconds
Time per execution: 0.016617 seconds
----------------
CASE: tag_nested_open
Total time taken: 0.683057 seconds
Time per execution: 0.006831 seconds
----------------
CASE: comment_nested_open
Total time taken: 1.653287 seconds
Time per execution: 0.016533 seconds
----------------
CASE: var_nested_closed
Total time taken: 0.004123 seconds
Time per execution: 0.000041 seconds
----------------
CASE: tag_nested_closed
Total time taken: 0.003450 seconds
Time per execution: 0.000034 seconds
----------------
CASE: comment_nested_closed
Total time taken: 0.003978 seconds
Time per execution: 0.000040 seconds
----------------
CASE: mismatched_nested
Total time taken: 0.709095 seconds
Time per execution: 0.007091 seconds

New:

----------------
CASE: var
Total time taken: 0.000234 seconds
Time per execution: 0.000002 seconds
----------------
CASE: tag
Total time taken: 0.000246 seconds
Time per execution: 0.000002 seconds
----------------
CASE: comment
Total time taken: 0.000334 seconds
Time per execution: 0.000003 seconds
----------------
CASE: mismatched
Total time taken: 0.000229 seconds
Time per execution: 0.000002 seconds
----------------
CASE: var_nested_open
Total time taken: 0.003380 seconds
Time per execution: 0.000034 seconds
----------------
CASE: tag_nested_open
Total time taken: 0.003386 seconds
Time per execution: 0.000034 seconds
----------------
CASE: comment_nested_open
Total time taken: 0.003444 seconds
Time per execution: 0.000034 seconds
----------------
CASE: var_nested_closed
Total time taken: 0.000777 seconds
Time per execution: 0.000008 seconds
----------------
CASE: tag_nested_closed
Total time taken: 0.000377 seconds
Time per execution: 0.000004 seconds
----------------
CASE: comment_nested_closed
Total time taken: 0.000732 seconds
Time per execution: 0.000007 seconds
----------------
CASE: mismatched_nested
Total time taken: 0.003309 seconds
Time per execution: 0.000033 seconds

Hey David,

Thanks for looking at this! That’s a really cool idea that solves some longstanding issues.

Where are the benchmark files? I’m interested to see what each name refers to.

I take it this means your benchmarks compare with PR #19360. Is that because it’s faster than main?


I ran a quick analysis on your benchmark results and produced this table:

| Case | Before (s) | After (s) | Speedup (>1 = faster) |
|---|---|---|---|
| comment | 0.000127 | 0.000334 | 0.38x |
| comment_nested_closed | 0.003978 | 0.000732 | 5.43x |
| comment_nested_open | 1.653287 | 0.003444 | 480.05x |
| mismatched | 0.000084 | 0.000229 | 0.37x |
| mismatched_nested | 0.709095 | 0.003309 | 214.29x |
| tag | 0.000169 | 0.000246 | 0.69x |
| tag_nested_closed | 0.003450 | 0.000377 | 9.15x |
| tag_nested_open | 0.683057 | 0.003386 | 201.73x |
| var | 0.000174 | 0.000234 | 0.74x |
| var_nested_closed | 0.004123 | 0.000777 | 5.31x |
| var_nested_open | 1.661711 | 0.003380 | 491.63x |

Overall geometric mean speedup: 10.61x
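For transparency, this is how I derived the table and the overall figure from your raw totals (geometric mean rather than arithmetic, so a 0.5x and a 2x cancel out):

```python
from math import exp, log

# Raw per-case totals copied from the two benchmark dumps above
# (before = current regex lexer, after = the scanning branch).
before = {
    "comment": 0.000127, "comment_nested_closed": 0.003978,
    "comment_nested_open": 1.653287, "mismatched": 0.000084,
    "mismatched_nested": 0.709095, "tag": 0.000169,
    "tag_nested_closed": 0.003450, "tag_nested_open": 0.683057,
    "var": 0.000174, "var_nested_closed": 0.004123,
    "var_nested_open": 1.661711,
}
after = {
    "comment": 0.000334, "comment_nested_closed": 0.000732,
    "comment_nested_open": 0.003444, "mismatched": 0.000229,
    "mismatched_nested": 0.003309, "tag": 0.000246,
    "tag_nested_closed": 0.000377, "tag_nested_open": 0.003386,
    "var": 0.000234, "var_nested_closed": 0.000777,
    "var_nested_open": 0.003380,
}
speedups = {case: before[case] / after[case] for case in before}
# Geometric mean = exp of the mean of the logs.
geomean = exp(sum(log(s) for s in speedups.values()) / len(speedups))
print(f"{geomean:.2f}x")  # ~10.6x
```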

The previously terrible nested-open cases have massively improved, but the common simple cases (var, tag, comment, mismatched) all show a slowdown.

I’d hope we can reduce some of that impact by optimizing your code somewhat. I think there are some opportunities around avoiding extra string slices, which are the most expensive part here, because Python copies the data into a new string every time you take a slice.
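To make that concrete, here is a hypothetical micro-example (not code from the branch) contrasting the two styles: slicing each token out eagerly versus carrying `(start, end)` index pairs and deferring the copy until a token's text is actually needed:

```python
source = "x" * 10_000

def with_slices(s):
    """Eager: each slice allocates and copies a new string."""
    out = []
    i = 0
    while i < len(s):
        out.append(s[i:i + 64])   # copies up to 64 characters per token
        i += 64
    return out

def with_indices(s):
    """Lazy: store two ints per token; slice later only if needed."""
    out = []
    i = 0
    while i < len(s):
        out.append((i, min(i + 64, len(s))))  # no copy here
        i += 64
    return out
```

The index form does strictly less allocation in the hot loop; whether it pays off in practice would need measuring, since the deferred slices still happen eventually for tokens that are used.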

Great work!


That’s correct. As there’s an open PR to improve the current RegEx implementation, I thought it best to compare to that rather than to main.

I’ve created a variation of the benchmark in PR #19360. While those benchmarks are great, I think they were likely designed with specific examples in mind, since that PR targets a specific issue, and so they don’t cover text tokens. I’ve created a benchmark that tokenizes a couple of full templates: one from crispy-forms and one from the django-admin. See gist.

Based on my work at the time I posted the above, my whole-template benchmarks also showed a general slowdown.

I’ve since done some further work and have managed to improve it somewhat. Testing with Python 3.13.4 at commit:

| Case | Before (s) | After (s) | Speedup (>1 = faster) |
|---|---|---|---|
| Crispy | 0.012432 | 0.012763 | 0.97 |
| admin_index | 0.005583 | 0.005857 | 0.95 |

I’m always slightly nervous about benchmarks, as I’m just one person on my one machine; others may see different things. Hopefully the above isn’t too far out.

> reduce some of that impact by optimizing your code somewhat

100% there’ll be optimisations that can be made to my code. Comments from all would be welcome. :smiley:
