Security of Django application running untrusted code via Celery

I have an LMS based on Django REST Framework which, among other features, allows users to run JavaScript code remotely. It offers an in-browser IDE with a Run button that runs the code on the backend against a set of test cases and returns the results.

A critical vulnerability has recently been discovered in the library I use to sandbox the JS code. Since I don't have time to migrate to another library right now, I want to double-check that the rest of the infrastructure is secure enough to contain a possible escape from that sandbox.

Here's an outline of how the process works. I want to determine whether an attacker who gains elevated privileges, or even arbitrary code execution, in the sandbox environment would also be able to access the user-facing part of the application and, most importantly, the database.

  1. In order to run user code, the application exposes an API action:
    @action(detail=True, methods=["post"])
    def run(self, request, **kwargs):
        code = ...  # get code from the request
        # schedule code execution
        run_code.delay(code)
        # ... return some identifier to the user that can be polled in order to get results
  2. A Celery task is thus called, which looks like this (I'm omitting all details about how code runs are related to and saved to models; just assume that we always know which model instance to save the results to):
import logging

from celery.exceptions import MaxRetriesExceededError

logger = logging.getLogger(__name__)

@app.task(bind=True, retry_backoff=True, max_retries=5)
def run_code(self, code):
    try:
        testcases = ...  # retrieve from db
        run_code_and_save_results(code, testcases)
    except Exception as e:
        logger.critical("Run code exception: %s", e, exc_info=True)
        try:
            self.retry(countdown=1)
        except MaxRetriesExceededError:
            pass  # ... save failure to model
  3. Finally, the function that actually calls the node sandbox looks like this:
import json
import os
import subprocess

def run_js_code_in_vm(code, testcases):
    node_vm_path = os.environ.get("NODE_VM_PATH", "coding/ts/runJs.js")

    testcases_json = [{"id": t.id, "assertion": t.code} for t in testcases]

    # call node subprocess and run user code against test cases
    try:
        res = subprocess.check_output(
            [
                "node",
                node_vm_path,
                code,
                json.dumps(testcases_json),
            ]
        )
        return {**json.loads(res), "state": "completed"}
    except subprocess.CalledProcessError as e:
        # ... log the error
        logger.error("Node sandbox run failed: %s", e, exc_info=True)

Essentially, the Celery worker spawns a node subprocess that runs the user code through the sandbox script.
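
For reference, `subprocess.check_output` also accepts a `timeout` argument, so the same call can at least be bounded in wall-clock time. This is not part of my current code, just a sketch; the function name and the 10-second limit are placeholders:

import json
import subprocess

def run_js_code_in_vm_bounded(code, testcases_json, node_vm_path="coding/ts/runJs.js"):
    # Same invocation as above, but the node process is killed if it exceeds the limit.
    try:
        res = subprocess.check_output(
            ["node", node_vm_path, code, json.dumps(testcases_json)],
            timeout=10,  # placeholder wall-clock limit
        )
        return {**json.loads(res), "state": "completed"}
    except subprocess.TimeoutExpired:
        return {"state": "timed_out"}
    except subprocess.CalledProcessError:
        return {"state": "failed"}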

Assuming the sandbox script is insecure and vulnerable, my reasoning is the following: since I use a Docker-based deployment, the Celery worker runs in a different container. Even if malicious code manages to take over that container, it shouldn't be able to affect the Django application.

My main worry is that the Celery container can somehow access the database, which it clearly can to some extent, since it already saves some models during normal operation.
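
To make that concrete, here is a minimal sketch of why the worker has that access in my current setup, assuming the Celery worker bootstraps the same Django settings module as the web app (the settings module and model names below are placeholders, not my real ones):

import os

import django

# The worker loads the same settings module as the web app...
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "lms.settings")  # placeholder module
django.setup()

from coding.models import Submission  # hypothetical model

def save_results(submission_id, results):
    # ...so it has full read/write ORM access, with the same DATABASES credentials.
    Submission.objects.filter(pk=submission_id).update(results=results)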

Assuming the node script is vulnerable (which we know is the case right now, and which remains a reasonable assumption even if I migrate to a library with no known vulnerabilities), what can I do to protect my application, and especially the database, with the given setup?

How Celery is running is almost irrelevant. What matters is how Django is communicating with Celery, and whether your Celery tasks have access to the Django database.

You write:

> My main worry is that the Celery container can somehow access the database, which it clearly can to some extent, since it already saves some models during normal operation.
If your Celery container is a “Django project” with the DATABASES setting referring to the regular Django database, you’ve got no protection. What you would need to do is create an API-type layer between Celery and Django and enforce that as the only communication between the two, where you can define what is (and isn’t) allowed to be submitted.
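
A minimal sketch of that idea, assuming the worker posts results over HTTP to a single narrowly-scoped internal endpoint instead of using the ORM (all URLs, names and the token scheme below are hypothetical):

import os

import requests

# Hypothetical internal endpoint exposed by the Django app; the token is only
# valid for submitting run results, nothing else.
INTERNAL_RESULTS_URL = "http://web:8000/internal/code-runs/{run_id}/results/"
INTERNAL_TOKEN = os.environ["RESULTS_SUBMIT_TOKEN"]

def submit_results(run_id, results):
    # The worker container holds no DATABASES credentials at all; this call is
    # its only way to write anything back to the application.
    resp = requests.post(
        INTERNAL_RESULTS_URL.format(run_id=run_id),
        json=results,
        headers={"Authorization": f"Bearer {INTERNAL_TOKEN}"},
        timeout=5,
    )
    resp.raise_for_status()

The test cases would likewise have to be handed to the worker through the task arguments or a similar read-only endpoint, rather than read from the database, and the Django view behind that URL decides what a results payload is allowed to contain.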

Note: you're also going to want to configure your Celery worker so that it restarts after every execution. You don't want someone who messes up the global environment of that sandbox to affect the next person who submits something.
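
One way to get that behaviour with the prefork pool is Celery's `worker_max_tasks_per_child` setting; a value of 1 recycles the worker process after every task (whether that is acceptable for your throughput is for you to judge):

# Replace each worker process after a single task, so nothing a submission
# leaves behind in the process survives into the next run.
app.conf.worker_max_tasks_per_child = 1

The same thing can be set on the command line with `celery worker --max-tasks-per-child=1`.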

Hi,

I have a similar system. What I actually do is allow the Celery container to access the Docker daemon and run a separate container with the user code.
There are other alternatives too: for example, you can run the code in an AWS Lambda function, or you can use AWS Batch jobs.
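
Roughly like this from inside the Celery task, using the docker CLI (the image name and the resource limits are placeholders):

import subprocess

def run_in_throwaway_container(code, testcases_json):
    # Each submission gets a fresh, network-less, resource-limited container
    # that is removed as soon as the run finishes.
    return subprocess.check_output(
        [
            "docker", "run", "--rm",
            "--network", "none",     # no network access from inside the run
            "--memory", "256m",      # placeholder memory cap
            "--cpus", "0.5",         # placeholder CPU cap
            "--pids-limit", "64",    # placeholder guard against fork bombs
            "js-runner:latest",      # hypothetical image containing the sandbox script
            "node", "runJs.js", code, testcases_json,
        ],
        timeout=30,  # placeholder wall-clock limit
    )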
