Websocket connection failed in production. (Nginx, Gunicorn, Daphne) (Django/React)

I am trying to deploy my Django Rest Framework application to production. I have my own server running Debian. I am not new to deploying DRF and React applications and the WSGI part of the application works fine with Gunicorn. The problem I can’t solve is I cannot connect to my Websocket from Django Channels no matter what I do.

For further information: running python manage.py runserver and running everything locally works, and I can connect to my websocket normally.

My routing.py file:

from channels.routing import ProtocolTypeRouter, URLRouter
from django.urls import path, re_path

from apps.chat_app.consumers import ChatConsumer

websocket_urlpatterns = [
    path('ws/chat/<int:id>/<int:curr>/', ChatConsumer.as_asgi()),
]

application = ProtocolTypeRouter({
    'websocket':
        URLRouter(
            websocket_urlpatterns
        )
    ,
})

My consumers file:

import json

from channels.db import database_sync_to_async
from channels.generic.websocket import AsyncWebsocketConsumer
from django.contrib.auth import get_user_model

from apps.chat_app.models import Message


class ChatConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        current_user_id = self.scope['url_route']['kwargs']['curr']
        other_user_id = self.scope['url_route']['kwargs']['id']

        self.room_name = (
            f'{current_user_id}_{other_user_id}'
            if int(current_user_id) > int(other_user_id)
            else f'{other_user_id}_{current_user_id}'
        )

        self.room_group_name = f'chat_{self.room_name}'

        await self.channel_layer.group_add(self.room_group_name, self.channel_name)
        await self.accept()

    async def disconnect(self, close_code):
        # Leave the room group. group_discard expects this consumer's channel
        # name, and disconnect() must not call itself again.
        await self.channel_layer.group_discard(self.room_group_name, self.channel_name)

    async def receive(self, text_data=None, bytes_data=None):
        data = json.loads(text_data)
        message = data.get('message', '')
        sender_username = data['sender'].replace('"', '')
        sender = await self.get_user(username=sender_username)

        typing = data.get('typing', False)
        delete = data.get('delete', '')

        if typing:
            await self.channel_layer.group_send(
                self.room_group_name,
                {
                    'type': 'user_typing',
                    'sender': sender_username,
                    'msg': f'{sender.first_name.capitalize()} {sender.last_name.capitalize()} is typing...',
                }
            )
        elif delete:
            await self.delete_message(msg_id=data['delete'])

            await self.channel_layer.group_send(
                self.room_group_name,
                {
                    'type': 'message_delete',
                    'msg_id': data['delete'],
                }
            )
        else:
            await self.channel_layer.group_send(
                self.room_group_name,
                {
                    'type': 'user_typing',
                    'sender': sender_username,
                    'msg': '',
                }
            )

            if message:
                msg = await self.save_message(sender=sender, message=message, thread_name=self.room_group_name)

                await self.channel_layer.group_send(
                    self.room_group_name,
                    {
                        'type': 'chat_message',
                        'msg_id': msg.id,
                        'message': message,
                        'sender': sender_username,
                        'timestamp': msg.timestamp.strftime('%d/%m/%Y %H:%M'),
                        'full_name': f'{sender.first_name.capitalize()} {sender.last_name.capitalize()}',
                    },
                )

    async def message_delete(self, event):
        msg_id = event['msg_id']

        await self.send(
            text_data=json.dumps(
                {
                    'delete': msg_id,
                }
            )
        )

    async def user_typing(self, event):
        username = event['sender']
        msg = event['msg']

        await self.send(
            text_data=json.dumps(
                {
                    'is_typing': True,
                    'sender': username,
                    'msg': msg,
                }
            )
        )

    async def chat_message(self, event):
        message = event['message']
        username = event['sender']
        full_name = event['full_name']
        msg_id = event['msg_id']
        timestamp = event['timestamp']
        typing = event.get('typing', False)
        delete = event.get('delete', '')

        if typing:
            await self.send(
                text_data=json.dumps(
                    {
                        'sender': username,
                        'typing': typing,
                    }
                )
            )
        elif delete:
            await self.send(
                text_data=json.dumps(
                    {
                        'delete': delete,
                    }
                )
            )
        else:
            if message:
                await self.send(
                    text_data=json.dumps(
                        {
                            'msg_id': msg_id,
                            'message': message,
                            'timestamp': timestamp,
                            'sender': username,
                            'full_name': full_name,
                        }
                    )
                )

    @database_sync_to_async
    def get_user(self, username):
        return get_user_model().objects.filter(username=username).first()

    @database_sync_to_async
    def save_message(self, sender, message, thread_name):
        return Message.objects.create(sender=sender, message=message, thread_name=thread_name)

    @database_sync_to_async
    def delete_message(self, msg_id):
        Message.objects.filter(id=msg_id).delete()

My asgi.py file:

import os
from django.core.asgi import get_asgi_application

from channels.auth import AuthMiddlewareStack
from channels.routing import ProtocolTypeRouter, URLRouter
from channels.security.websocket import AllowedHostsOriginValidator


os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'inp_proj.settings')
django_asgi_app = get_asgi_application()

import apps.chat_app.routing

application = ProtocolTypeRouter(
    {
        'http': django_asgi_app,
        'websocket': AllowedHostsOriginValidator(
            AuthMiddlewareStack(URLRouter(apps.chat_app.routing.websocket_urlpatterns))),
    }
)

My daphne.service file:

[Unit]
Description=WebSocket Daphne Service
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/www/projectdir
ExecStart=/www/projectdir/venv/bin/python /www/projectdir/venv/bin/daphne -b 0.0.0.0 -p 8001 proj.asgi:application
Restart=on-failure

[Install]
WantedBy=multi-user.target

My gunicorn.service file:

[Unit]
Description=gunicorn daemon
Requires=gunicorn.socket
After=network.target

[Service]
User=jan
Group=www-data
WorkingDirectory=/www/projectdir
ExecStart=/www/projectdir/venv/bin/gunicorn \
          --access-logfile - \
          --workers 3 \
          --bind unix:/run/gunicorn.sock \
          proj.wsgi:application

[Install]
WantedBy=multi-user.target

My gunicorn.socket file:

[Unit]
Description=gunicorn socket

[Socket]
ListenStream=/run/gunicorn.sock

[Install]
WantedBy=sockets.target

And finally, my nginx configuration file:

upstream websocket {
    server 127.0.0.1:8001;
}

server {
    server_name 127.0.0.1 mydomain;

    location = /favicon.ico { access_log off; log_not_found off; }
    location /static/ {
        root /www/projdir;
    }
    
    location / {
        include proxy_params;
        proxy_pass http://unix:/run/gunicorn.sock;
    }

    location /ws/ {
        proxy_pass http://websocket;

        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Host $server_name;    
    }
}

All of the services (gunicorn socket, gunicorn service, daphne, nginx) work normally and are up and running. The gunicorn WSGI part works fine and the whole application works normally, everything works except I cannot connect to my websocket. This is how I connect to the websocket in my client code:

    const client = useMemo(() => {
        return new w3cwebsocket(`ws://mydomain:8001/ws/chat/${id}/${userId}/`);
    }, [id, userId]);

Also, instead of mydomain:8001 I tried putting in [serveripv4address]:8001, I tried it without port 8001, and I tried both wss and ws even though the site is served over HTTP. In my allowed hosts I also allowed the domains and even the server’s IPv4 address.

I tried literally everything I can think of and every post I saw. Neither Nginx, Gunicorn nor Daphne shows any errors.

What is this “w3cwebsocket” function? (I’ve never used it, I’ve only ever used the native “WebSocket” object.)

For something like this, my first step is to try and determine at what point in the connection process that things stop working.

What are you seeing in your logs for nginx? Are you seeing the connection being made?

It might be worth trying to diagnose this by specifying a separate log file for your /ws/ location to make sure you can tell which location section is handling the request.

You might also want to look at the details of the websocket connection attempt in the network tab of your browser’s developer tools.

I don’t see any differences in your nginx configuration from mine, aside from a couple of trivial differences that I don’t think matter at all. (e.g. I don’t use an external “upstream” definition, my proxy_pass directive has the uri in the same manner as what you have for your / location. I also use a unix domain socket for the websocket connection in the same way you’re using one for gunicorn.)

However, there is one benefit to using a tcp socket to connect to daphne, and that is you can use something like tcpdump to determine whether nginx is passing the request on to daphne.

I’m sorry for not clarifying that part. I am using an npm package for a ‘better experience’ working with websockets and that’s where the w3cwebsocket is coming from. It should in theory work like the native “WebSocket” object. However, I have tried using another client-side library and the native “WebSocket” object, and I am still not getting the connection.

However, now that I have tried the native object, my Nginx error log returns this:

*132 SSL_do_handshake() failed (SSL: error:141CF06C:SSL routines:tls_parse_ctos_key_share:bad key share) while SSL handshaking, client: [clientip], server: 0.0.0.0:443

That is the only interesting thing in the log, apart from some entries for the WSGI part of my application.

TCP is allowed through my firewall, UDP isn’t, do you think that could be giving me trouble?

The network tab also doesn’t show anything worthwhile, I think:

[Screenshot: browser network tab for the websocket request]

Can you connect to the Daphne instance locally (using, say, python-websockets)?

$ python -m websockets ws://localhost:8001/

If so then it’s an issue connecting from Nginx. If not it’s your app…
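If the websockets package is not installed on the server, roughly the same check can be sketched with only the standard library. The helper names below (build_handshake, probe) are made up for illustration, and the path and Origin values are assumptions taken from this thread. It only performs the opening HTTP Upgrade handshake and returns the status line, so a 101 would mean Daphne accepted the connection, while a 403 would suggest it was rejected (for example by the origin validator).

```python
import base64
import os
import socket


def build_handshake(host, port, path, origin=None):
    """Build the raw HTTP/1.1 Upgrade request that opens a WebSocket."""
    # Random 16-byte nonce, base64-encoded, as the handshake requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    # Browsers always send Origin; leaving it out shows the "no Origin" case.
    if origin is not None:
        lines.append(f"Origin: {origin}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def probe(host, port, path, origin=None, timeout=5.0):
    """Send the handshake and return the status line of the response."""
    request = build_handshake(host, port, path, origin)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        response = sock.recv(4096)
    # Only the first line matters here, e.g. "HTTP/1.1 101 Switching Protocols".
    return response.split(b"\r\n", 1)[0].decode("latin-1")
```

Run on the server itself, a call such as probe("127.0.0.1", 8001, "/ws/chat/1/2/", origin="http://localhost:3000") talks to Daphne directly and bypasses Nginx. Repeating it with a different origin value shows whether the Origin header alone changes the outcome.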

We figured out the solution in the meantime. To be honest, I do not know why this fixed it, so if anyone has an explanation, I would love to hear it.

We had to add

proxy_set_header Origin "ws://domain:8001";

in the /ws/ location of the Nginx configuration. After we added that, the websocket connected.

See the Channels security docs (“Security”, Channels 4.0.0 documentation).

I have the AllowedHostsOriginValidator wrapped around the router, as shown in my original post. My allowed hosts contained all the necessary hosts, but it wasn’t working until I overrode the Origin header manually. Is there a reason for that?

What specifically do you have in your ALLOWED_HOSTS setting?

Notice what was supplied in your Origin header from your network tab screenshot above - is that entry in ALLOWED_HOSTS? (Are you loading this page and JavaScript from a different site rather than from this one?)

Well, yes, but you’ll have to dig into it :sweat_smile: (It’s not something that jumps out from here!)

Check that the Origin is actually reaching your app: is Nginx forwarding it, as Ken says, and with what value? And so on.

By hard-coding the Origin header, you’re effectively switching off the origin check and opening yourself up to CSRF, which you likely don’t want to do. (Non-browser clients can set any Origin, but they don’t also send the cookies for your logged-in user…)

All the hosts needed for the rest of my WSGI app + the WebSocket URL that I specified manually in my Nginx configuration. I am loading this page and JS from a standalone FE app.

What origin should Nginx actually forward? I had set the $host variable as the “Origin” header, but I guess I’m not supposed to send that? What should Nginx actually forward?

Loading the page and JS from a standalone front-end app is the root cause of the issue here. This is specifically the type of situation that the various CORS and CSRF protection systems are designed to manage. (In other words, they protect you from a malicious site trying to access yours. You need to explicitly configure your app to permit these connections.)

You need to ensure that your Django app is allowed to accept connections from code loaded by your standalone front-end application.
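As a rough model of why the browser’s Origin was rejected while the hard-coded one passed, here is a simplified sketch of an origin check against an allowed-hosts list. It is not Channels’ actual code: the function names are invented, the host list is a hypothetical production value, and the real validator handles more cases (schemes, ports, DEBUG defaults).

```python
from urllib.parse import urlparse


def host_matches(hostname, pattern):
    """Match in the spirit of ALLOWED_HOSTS: exact name, '*', or a leading-dot suffix."""
    pattern = pattern.lower()
    if pattern == "*":
        return True
    if pattern.startswith("."):
        # ".example.com" matches example.com and any subdomain of it.
        return hostname == pattern[1:] or hostname.endswith(pattern)
    return hostname == pattern


def origin_allowed(origin, allowed_hosts):
    """Return True if the hostname in the Origin header is in allowed_hosts."""
    if not origin:
        # A missing Origin only passes when everything is allowed.
        return "*" in allowed_hosts
    hostname = urlparse(origin).hostname
    if hostname is None:
        return False
    return any(host_matches(hostname, pattern) for pattern in allowed_hosts)


# Hypothetical production value; the thread does not show the real setting.
allowed_hosts = ["mydomain", "127.0.0.1"]

# The Origin the browser sent for the standalone front end.
print(origin_allowed("http://localhost:3000", allowed_hosts))  # False
# The Origin that Nginx was told to hard-code.
print(origin_allowed("ws://mydomain:8001", allowed_hosts))  # True
# Allowing the front end's host makes the real Origin pass.
print(origin_allowed("http://localhost:3000", allowed_hosts + ["localhost"]))  # True
```

In a real deployment the front end would normally be served from its own domain rather than localhost. Channels also ships an OriginValidator that takes an explicit list of allowed origins, which may be a better fit than widening ALLOWED_HOSTS.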

Note: Understanding the technical reasons behind all this goes well beyond Django itself; these are more fundamental issues managed by browsers and web applications in general. You’ll want to gain an understanding of what each of these headers (Host and Origin) is used for. Your research into this, if you’re truly interested, should take you well beyond the Django docs.

Django itself (and DRF) touch upon this in a couple of areas, which are good places to start.

To come back to your previous question then (what Nginx should actually forward):

It should forward what the browser is sending. That’s the only way that Django is going to be able to verify that the request is coming from an approved source.

Yes, I am aware of these protections and, like many others, I have had my fair share of issues with them (while configuring the WSGI part of my app). Thank you and @carltongibson for the help and the useful resources.

Could you just explain a bit further what this last part means?

It should forward what the browser is sending. That’s the only way that Django is going to be able to verify that the request is coming from an approved source.

Your browser is sending both a Host and Origin header - you’re showing it in the network tab screenshot you posted above. Those are the values that nginx should be forwarding to your application.

By hard-coding those values in your nginx config:

proxy_set_header Origin "ws://domain:8001";

you’re overwriting what the browser is sending. This tells Django that the Origin is ws://domain:8001 instead of http://localhost:3000 (what you show above). This means that any site (yours or someone else’s) with JavaScript code that tries to open a websocket on your site is going to present an Origin of ws://domain:8001.

(Side note: The origin check doesn’t prevent a malicious user from writing their own code to connect. What it prevents is, for example, me going to “Site A” and having it load code in my browser that tries to connect to your site using my credentials.)
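The effect of the override can be illustrated with a small simulation. This is only a sketch: forward_headers is a made-up stand-in for what the proxy does to the request headers, and the second origin is a hypothetical unrelated site.

```python
def forward_headers(browser_headers, proxy_overrides=None):
    """Return the headers the backend sees after the proxy applies its overrides."""
    forwarded = dict(browser_headers)
    forwarded.update(proxy_overrides or {})
    return forwarded


# Hypothetical requests: one from the real front end, one from an unrelated site.
from_frontend = {"Host": "mydomain", "Origin": "http://localhost:3000"}
from_other_site = {"Host": "mydomain", "Origin": "https://other-site.example"}

# Pass-through: the backend can still tell the two pages apart.
passed_a = forward_headers(from_frontend)
passed_b = forward_headers(from_other_site)
print(passed_a["Origin"] != passed_b["Origin"])  # True

# Hard-coded Origin, as in the nginx line quoted in this thread.
override = {"Origin": "ws://domain:8001"}
forced_a = forward_headers(from_frontend, override)
forced_b = forward_headers(from_other_site, override)
print(forced_a["Origin"] == forced_b["Origin"])  # True
```

With the override in place both requests look identical to Django, so the origin validator has nothing left to check.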
