Security of Django application running untrusted code via Celery

I have an LMS based on Django REST Framework which, among other features, allows users to run JavaScript code remotely. It offers an in-browser IDE with a Run button that runs the code on the backend against a set of test cases and returns the results.

A critical vulnerability has recently been discovered in the library I use to sandbox the JS code. Since I don't have time to migrate to another library right now, I want to double-check that the rest of the infrastructure is secure enough to contain a possible escape from that sandbox.

Here's an outline of how the process works. I want to determine whether an attacker who gains elevated privileges, or even arbitrary code execution, in the sandbox environment would also be able to access the user-facing part of the application and, most importantly, the database.

  1. In order to run user code, the application exposes an API action:
    @action(detail=True, methods=["post"])
    def run(self, request, **kwargs):
        code = ...  # get code from the request
        # schedule code execution
        run_code.delay(code)
        # ... return some identifier to the user that can be polled in order to get results
  2. A Celery task is thus called, which looks like this (I'm omitting all details about how code runs are related to and saved to models; just assume that we always know which model instance to save the results to):
import logging

from celery.exceptions import MaxRetriesExceededError

logger = logging.getLogger(__name__)

@app.task(bind=True, retry_backoff=True, max_retries=5)
def run_code(self, code):
    try:
        testcases = ...  # retrieve from db
        run_code_and_save_results(code, testcases)
    except Exception as e:
        logger.critical("Run code exception: %s", e, exc_info=True)
        try:
            self.retry(countdown=1)
        except MaxRetriesExceededError:
            pass  # ... save failure to model
  3. Finally, the function that actually calls the node sandbox looks like this:
import json
import os
import subprocess

def run_js_code_in_vm(code, testcases):
    node_vm_path = os.environ.get("NODE_VM_PATH", "coding/ts/runJs.js")

    testcases_json = [{"id": t.id, "assertion": t.code} for t in testcases]

    # call node subprocess and run user code against test cases
    try:
        res = subprocess.check_output(
            [
                "node",
                node_vm_path,
                code,
                json.dumps(testcases_json),
            ]
        )
        return {**json.loads(res), "state": "completed"}
    except subprocess.CalledProcessError as e:
        # ... log the error
        logger.error("Node sandbox run failed: %s", e, exc_info=True)

Essentially, the Celery worker spawns a node subprocess that runs the user code through the sandbox script.
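
For reference, `subprocess.check_output` also accepts a `timeout` argument, so the same call can at least be bounded in wall-clock time. This is not part of my current code, just a sketch; the function name and the 10-second limit are placeholders:

import json
import subprocess

def run_js_code_in_vm_bounded(code, testcases_json, node_vm_path="coding/ts/runJs.js"):
    # Same invocation as above, but the node process is killed if it exceeds the limit.
    try:
        res = subprocess.check_output(
            ["node", node_vm_path, code, json.dumps(testcases_json)],
            timeout=10,  # placeholder wall-clock limit
        )
        return {**json.loads(res), "state": "completed"}
    except subprocess.TimeoutExpired:
        return {"state": "timed_out"}
    except subprocess.CalledProcessError:
        return {"state": "failed"}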

Assuming the sandbox script is insecure and vulnerable, my reasoning is the following: since I use a Docker-based deployment, the Celery worker runs in a different container. Even if malicious code manages to take over that container, it shouldn't be able to affect the Django application.

My main worry is that the Celery container can somehow access the database, which it clearly can to some extent, since it already saves some models during normal operation.
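
To make that concrete, here is a minimal sketch of why the worker has that access in my current setup, assuming the Celery worker bootstraps the same Django settings module as the web app (the settings module and model names below are placeholders, not my real ones):

import os

import django

# The worker loads the same settings module as the web app...
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "lms.settings")  # placeholder module
django.setup()

from coding.models import Submission  # hypothetical model

def save_results(submission_id, results):
    # ...so it has full read/write ORM access, with the same DATABASES credentials.
    Submission.objects.filter(pk=submission_id).update(results=results)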

Assuming the node script is vulnerable (which we know is the case right now, and which remains a reasonable assumption even if I migrate to a library with no known vulnerabilities), what can I do to protect my application, and especially the database, with the given setup?

How Celery is running is almost irrelevant. What matters is how Django is communicating with Celery, and whether your Celery tasks have access to the Django database.

You write:

> My main worry is that the Celery container can somehow access the database, which it clearly can to some extent, since it already saves some models during normal operation.
If your Celery container is a “Django project” with the DATABASES setting referring to the regular Django database, you’ve got no protection. What you would need to do is create an API-type layer between Celery and Django and enforce that as the only communication between the two, where you can define what is (and isn’t) allowed to be submitted.
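
A minimal sketch of that idea, assuming the worker posts results over HTTP to a single narrowly-scoped internal endpoint instead of using the ORM (all URLs, names and the token scheme below are hypothetical):

import os

import requests

# Hypothetical internal endpoint exposed by the Django app; the token is only
# valid for submitting run results, nothing else.
INTERNAL_RESULTS_URL = "http://web:8000/internal/code-runs/{run_id}/results/"
INTERNAL_TOKEN = os.environ["RESULTS_SUBMIT_TOKEN"]

def submit_results(run_id, results):
    # The worker container holds no DATABASES credentials at all; this call is
    # its only way to write anything back to the application.
    resp = requests.post(
        INTERNAL_RESULTS_URL.format(run_id=run_id),
        json=results,
        headers={"Authorization": f"Bearer {INTERNAL_TOKEN}"},
        timeout=5,
    )
    resp.raise_for_status()

The test cases would likewise have to be handed to the worker through the task arguments or a similar read-only endpoint, rather than read from the database, and the Django view behind that URL decides what a results payload is allowed to contain.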

Note: you're also going to want to configure your Celery worker so that it restarts after every execution. You don't want someone who messes up the global environment of that sandbox to affect the next person who submits something.
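
One way to get that behaviour with the prefork pool is Celery's `worker_max_tasks_per_child` setting; a value of 1 recycles the worker process after every task (whether that is acceptable for your throughput is for you to judge):

# Replace each worker process after a single task, so nothing a submission
# leaves behind in the process survives into the next run.
app.conf.worker_max_tasks_per_child = 1

The same thing can be set on the command line with `celery worker --max-tasks-per-child=1`.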

Hi,

I have a similar system. What I actually do is allow the Celery container to access the Docker daemon and run a separate container with the user code.
There are other alternatives too: for example, you can run the code in an AWS Lambda function, or you can use AWS Batch jobs.
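
Roughly like this from inside the Celery task, using the docker CLI (the image name and the resource limits are placeholders):

import subprocess

def run_in_throwaway_container(code, testcases_json):
    # Each submission gets a fresh, network-less, resource-limited container
    # that is removed as soon as the run finishes.
    return subprocess.check_output(
        [
            "docker", "run", "--rm",
            "--network", "none",     # no network access from inside the run
            "--memory", "256m",      # placeholder memory cap
            "--cpus", "0.5",         # placeholder CPU cap
            "--pids-limit", "64",    # placeholder guard against fork bombs
            "js-runner:latest",      # hypothetical image containing the sandbox script
            "node", "runJs.js", code, testcases_json,
        ],
        timeout=30,  # placeholder wall-clock limit
    )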
