Daphne: websocket works locally with ws protocol, but it connects and quickly disconnects in production

Hello,

Thanks for taking the time to read the issue below.

I have been struggling with it for two weeks.

Issue:
We have a Django project that uses Daphne and Django Channels. Everything works well locally (ws protocol), but in production only the HTTP requests work; the wss requests do not. In production, we use NGINX and Docker.

The JS console shows that the connection to ‘wss://domain.tld/ws/inbox/…/’ failed, with error code 1006.

Below is what the log shows on the server:

"WSCONNECTING /ws/inbox/948252324-169626-QvepuQVR2z9odML8joreJT/" - -
DEBUG    Upgraded connection ['192.168.16.1', 60258] to WebSocket
INFO     test:  948252324-169626-QvepuQVR2z9odML8joreJT
"WSCONNECT /ws/inbox/948252324-169626-QvepuQVR2z9odML8joreJT/" - -
DEBUG    WebSocket ['192.168.16.1', 60258] open and established
DEBUG    WebSocket ['192.168.16.1', 60258] accepted by application
DEBUG    Get address info redis:6379, type=<SocketKind.SOCK_STREAM: 1>
DEBUG    Getting address info redis:6379, type=<SocketKind.SOCK_STREAM: 1> took 0.430ms: [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('192.168.16.2', 6379))]
DEBUG    <asyncio.TransportSocket fd=15, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.16.6', 56826), raddr=('192.168.16.2', 6379)> connected to redis:6379: (<_SelectorSocketTransport fd=15 read=polling write=<idle, bufsize=0>>, <asyncio.streams.StreamReaderProtocol object at 0xffff88452670>)
DEBUG    Sent WebSocket packet to client for ['192.168.16.1', 60258]
DEBUG    WebSocket closed for ['192.168.16.1', 60258]
"WSDISCONNECT /ws/inbox/948252324-169626-QvepuQVR2z9odML8joreJT/" - -
INFO     close code:  1006
DEBUG    Get address info redis:6379, type=<SocketKind.SOCK_STREAM: 1>
DEBUG    Getting address info redis:6379, type=<SocketKind.SOCK_STREAM: 1> took 0.523ms: [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('192.168.16.2', 6379))]
DEBUG    <asyncio.TransportSocket fd=12, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.16.6', 56830), raddr=('192.168.16.2', 6379)> connected to redis:6379: (<_SelectorSocketTransport fd=12 read=polling write=<idle, bufsize=0>>, <asyncio.streams.StreamReaderProtocol object at 0xffff88452f70>)

We originally used Daphne for both HTTP and WebSocket requests, but to make this issue easier to troubleshoot we moved the HTTP requests to Gunicorn.

Below is our NGINX conf:

server {
        listen      ***:443 ssl http2;
        server_name domain.tld ;
        error_log   /var/log/apache2/domains/domain.error.log error;

        ssl_certificate     ***.pem;
        ssl_certificate_key ***.key;
        ssl_stapling        on;
        ssl_stapling_verify on;

        # TLS 1.3 0-RTT anti-replay
        if ($anti_replay = 307) { return 307 https://$host$request_uri; }
        if ($anti_replay = 425) { return 425; }

        location ~ /\.(?!well-known\/|file) {
                deny all;
                return 404;
        }



        location /ws/ {
                proxy_pass http://localhost:6002/ws/;
                proxy_http_version 1.1;
                proxy_set_header Upgrade $http_upgrade;
                proxy_set_header Connection "upgrade";
                proxy_read_timeout 86400s;
                proxy_send_timeout 86400s;
                proxy_redirect off;
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Host $server_name;
                proxy_set_header Accept-Encoding gzip;
        }

        location / {
                proxy_pass http://localhost:4000/;
        }
}

We can add more information if requested.

Thanks again for the help.

I think we will need to see the asgi.py and routing.py files. (We might also end up needing to see the consumer, but let’s see these other files first.)

Thanks @KenWhitesell for replying.

Below is my asgi.py:

"""
ASGI config for festishareBackend project.

It exposes the ASGI callable as a module-level variable named ``application``.

For more information on this file, see
https://docs.djangoproject.com/en/4.2/howto/deployment/asgi/

https://channels.readthedocs.io/en/stable/tutorial/part_1.html
"""

import os
import django

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'festishareBackend.settings')
django.setup()

from django.core.asgi import get_asgi_application
from channels.auth import AuthMiddlewareStack
from channels.security.websocket import AllowedHostsOriginValidator
from channels.routing import ProtocolTypeRouter, URLRouter
from main.routing import websocket_urlpatterns
from .ChannelsJWTAuthMiddleware import JWTAuthMiddleware


# os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'festishareBackend.settings')

# application = get_asgi_application()

application = ProtocolTypeRouter(
    {
        "http": get_asgi_application(),
        "websocket": AllowedHostsOriginValidator(
            AuthMiddlewareStack(
                JWTAuthMiddleware(
                    URLRouter(websocket_urlpatterns)
                )
            )
        ),
    }
)

Below is my routing.py:

from django.urls import path, re_path
from main import consumers


websocket_urlpatterns = [
    path("ws/inbox/<inbox_id>/", consumers.InboxConsumer.as_asgi()),
] 

Below is my consumer.py:

import json

from asgiref.sync import async_to_sync

from channels.db import database_sync_to_async
from channels.auth import login
from channels.generic.websocket import AsyncWebsocketConsumer, WebsocketConsumer

from .signals import joined_chat_signal

from .models import Message, Inbox, User
from .serializers import MessageSerializer, InboxSerializer, UserSerializer

from .shared import getInboxMessages


class InboxConsumer(AsyncWebsocketConsumer):
    
    def __init__(self, *args, **kwargs):
        # Unpack the arguments so the parent receives them as given
        super().__init__(*args, **kwargs)
        self.inbox_id = None
        self.inbox_group_name = None
        self.inbox = None
        self.user = None

    @database_sync_to_async
    def get_inbox(self, inbox_id):
        return Inbox.objects.filter(pk=inbox_id).first()
    
    @database_sync_to_async
    def get_inbox_messages(self):
        return getInboxMessages(user=self.user, inbox_id=self.inbox_id)
    

    @database_sync_to_async
    def get_inbox_online_users(self, inbox_id):
        inbox =  Inbox.objects.filter(pk=inbox_id).first()
        # print('users: ', [user.username for user in inbox.online.all()])
        return [user.username for user in inbox.online.all()]
    
    @database_sync_to_async
    def add_inbox_online_user(self, inbox_id, user):
        inbox =  Inbox.objects.filter(pk=inbox_id).first()
        inbox.online.add(user)
        # inbox.online.clear()
        return 
    
    @database_sync_to_async
    def remove_inbox_online_user(self, inbox_id, user):
        inbox =  Inbox.objects.filter(pk=inbox_id).first()
        inbox.online.remove(user)
        return 

    async def connect(self):
        # connection has to be accepted
        # print('connecting')
        self.inbox_id = self.scope["url_route"]["kwargs"]["inbox_id"]
        self.inbox_group_name = f'chat_{self.inbox_id}'
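        # get_inbox is wrapped in database_sync_to_async, so it must be awaited;
        # without the await, self.inbox would hold a coroutine, not an Inbox
        self.inbox = await self.get_inbox(inbox_id=self.inbox_id)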
        self.user = self.scope['user']

        # print('user0: ', self.user)
        print('test: ', self.inbox_id)

        await self.accept()
        # print('accepted')
        # send the user list to the newly joined user
        await self.send(json.dumps({
            'type': 'user_list',
            'users': await self.get_inbox_online_users(self.inbox_id),
        }))
        # # print('done')

        
    async def disconnect(self, close_code):
        # print('user1: ', self.user)
        print("close code: ", close_code)

        # Leave room group
        await self.channel_layer.group_discard(self.inbox_group_name, self.channel_name)
        
        if self.user.is_authenticated:
            # send the leave event to the room
            await self.channel_layer.group_send(
                self.inbox_group_name, {"type": "user_leave", 'user': self.user.username,}
            )
            # self.inbox.online.remove(self.user)
            await self.remove_inbox_online_user(self.inbox_id, self.user)


    async def receive(self, text_data=None, bytes_data=None):
        text_data_json = json.loads(text_data)
        # print('data: ', text_data_json)
        type = text_data_json['type']
        token = self.scope.get('token', '')
        refreshToken = self.scope.get('refreshToken', '')
        

        if token:
            await self.send(json.dumps({
                'type': 'jwt_tokens',
                'token': token,
                'refreshToken': refreshToken,
            }))


        if not self.user.is_authenticated: 
            print('not authenticated: ', self.user) 
            return
        
        if type == 'join_room':
            # print('group: ', self.inbox_group_name)
            print('joining room ...')
            # join the inbox group
            await self.channel_layer.group_add( self.inbox_group_name, self.channel_name, )

            await self.channel_layer.group_send(
                self.inbox_group_name,
                {
                    'type': 'user_join',
                    'user': self.user.username,
                }
            )
            await self.add_inbox_online_user(self.inbox_id, self.user)
        
        elif type == 'get_messages':
            print('getting messages ...')
            await self.send(json.dumps({
                'type': 'messages',
                'messages': await self.get_inbox_messages(),
            }))
        

        elif type == 'message':
            message = text_data_json['message']
            # send chat message event to the inbox
            await self.channel_layer.group_send(
                self.inbox_group_name, {
                    'type': 'chat_message',
                    'user': self.user.username, 
                    'message': message,
                }
            )
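            # The ORM is synchronous, so run the insert through database_sync_to_async
            await database_sync_to_async(Message.objects.create)(
                user=self.user, inbox=self.inbox, content=message
            )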

    
    async def chat_message(self, event):
        await self.send(text_data=json.dumps(event))

    async def user_join(self, event):
        await self.send(text_data=json.dumps(event))

    async def user_leave(self, event):
        await  self.send(text_data=json.dumps(event))

    async def send_jwt_tokens(self, event):
        await self.send(text_data=json.dumps(event))

    

Thanks!

How are you running Daphne in your container? (What exec or cmd are you using to run Daphne itself?)

Are you using docker compose, or running the containers individually? If you’re using docker compose, please post the section of your compose file that runs the Daphne container. If you’re running Daphne separately, please post the command you’re using to start it.

Thanks @KenWhitesell ,

I’m using docker compose. Below is the Daphne section:

 asgibackendserver:
    image: image
    hostname: hostname
    env_file:
      - ./.env

    build:
      context: .
      dockerfile: ./Dockerfile

    command: bash -c "daphne -b 0.0.0.0 -p 6002 myapp.asgi:application -v 3"


    deploy:
       mode: replicated
       replicas: 1


    restart: always

    ports:
      - 6002

    volumes:
      - app_static:/app/static

    depends_on:
      - redis

    environment:
      - REDIS_HOST=redis

Kindly note that it properly handles http requests. The only problem is with websockets. It disconnects as soon as it connects. The client does not receive any packet while the log in the server states that a packet is sent to client. On the client, the connection fails with a 1006 error code.

You’re showing http2 support in the server, but the portion of the log you’re showing doesn’t include Daphne’s acknowledgement of that.

For testing purposes, I would try removing the http2 parameter to see if doing so changes the behavior. (You could also try verifying that you’ve got the http2 support for Daphne installed, but I’m not sure that http2 is valid without https.)

Disclaimer: We’re getting well outside my area of knowledge here. I know some about http2, but I’m far from what I would consider knowledgeable about that protocol. This is a situation where I’m looking to simplify the situation to get back to a known/good baseline.

I removed the http2 support from the server to see if that would fix the issue, but it did not. I will set it back to http2 on the server and add http2 support to Daphne to see if that fixes the issue.

Thanks @KenWhitesell

Are the symptoms the same? Same set of messages in the Daphne log? Same error code in the browser?

Also, check the nginx error log to see if anything is recorded there.

(Side note: I’ve never needed or used http2. I don’t see where adding it is going to help. If it’s not working with http/1.1, there’s something else here I’m not seeing.)

Below is what appears on the client console:

[Image: screenshot of the client console]

It shows a code 1006. My intuition is that the server receives the request from the client and accepts the connection, but the client does not receive the response. (The server shows connect, which means the connection is made, but the client is not receiving anything.) It may be an issue with NGINX, because everything works well locally with the runserver command.
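For context on code 1006: in RFC 6455 it is a reserved value that is never carried in a Close frame. An endpoint reports it locally when the connection went away without a closing handshake, so it says that the socket dropped, not why. A minimal Python sketch of a lookup for a few of the standard codes follows; the helper name `describe_close_code` is hypothetical and the table is deliberately partial.

```python
# Partial table of WebSocket close codes from RFC 6455, section 7.4.1.
CLOSE_CODES = {
    1000: "normal closure",
    1001: "going away",
    1002: "protocol error",
    1003: "unsupported data",
    1005: "no status code present",
    1006: "abnormal closure",
    1009: "message too big",
    1011: "internal server error",
    1015: "TLS handshake failure",
}

# Reserved codes: reported locally by an endpoint, never sent in a Close frame.
LOCAL_ONLY = {1005, 1006, 1015}


def describe_close_code(code: int) -> str:
    """Return a short description of a close code (hypothetical helper)."""
    text = CLOSE_CODES.get(code, "unknown or application-defined")
    if code in LOCAL_ONLY:
        text += "; generated locally, so the peer did not send it"
    return text
```

Calling `describe_close_code(1006)` makes the point explicit: both the browser and the consumer's `disconnect` are reporting a dropped connection, not a code chosen by the other side.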

Nginx has very little to do with this. It’s a tunnel between the outside and Daphne. Once you get the “WSCONNECTING” message from Daphne, you know that nginx is doing what it needs to do.

The issue is most likely either with Daphne or the application.

Please post the INSTALLED_APPS section of your settings.

Thank you so much for the time spent on this. I have been troubleshooting for two weeks. Below is the INSTALLED_APPS:

INSTALLED_APPS = [
    'daphne',
    'channels',
    'admin_interface',
    'colorfield',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'rest_framework_simplejwt',
    'rest_framework_simplejwt.token_blacklist',
    'main',
    'drf_yasg', # https://drf-yasg.readthedocs.io/en/stable/readme.html#installation
    "encrypted_fields",
    # 'easyaudit'
]

The other thing I’d suggest trying would be to remove these two settings from the nginx /ws/ location:

proxy_read_timeout 86400s;
proxy_send_timeout 86400s;

I don’t use either of these, and I’m seeing information in the docs and in blog posts saying that these values should be below 75s.

Thanks, I did not have them before; I added them while trying to fix this issue. I just removed them.

Can you post a pip list of what’s installed in the Daphne container?

Can you post the Daphne log from startup? (The first 20 lines or so.)

Can you post the JavaScript code that is attempting to connect to the web socket? (I only need enough to see how you’re trying to connect, with a few extra lines to help me understand the context. I don’t need to see all the JavaScript.)

Ok, below is the requested information.

Pip list:

pip == 24.0
gunicorn
daphne == 4.0.0
# daphne == 3.0.2
channels[daphne] == 4.0.0
# channels[daphne] 
django #== 4.0.9
django-admin-interface == 0.26.1
djangorestframework == 3.14.0
# djangochannelsrestframework
channels_redis == 4.1.0
djangorestframework-simplejwt == 5.3.0
celery == 5.3.4
django-celery-beat == 2.5.0
django-easy-audit == 1.3.5
sqlalchemy == 2.0.22 # required for celery
# https://stackoverflow.com/questions/41636273/celery-tasks-received-but-not-executing
eventlet == 0.33.3 # for celery tasks
django-dotenv == 1.4.2
#psycopg2  # For postgresql windows
psycopg2-binary == 2.9.9 # For postgresql linux or mac
requests == 2.31.0
# httplib2
pyqrcode == 1.2.1
pypng == 0.20220715.0
# whitenoise == 6.1.0
#bcrypt
django[bcrypt] #== 4.0.1
shortuuid == 1.0.11
Pillow == 10.1.0
sib-api-v3-sdk == 7.6.0
django-redis == 5.4.0
django-cors-headers == 4.3.0
django-searchable-encrypted-fields == 0.2.1 # https://pypi.org/project/django-searchable-encrypted-fields/

firebase-admin == 6.2.0
protobuf == 4.24.4
requests-toolbelt == 0.10.1
urllib3 == 1.26.15 #https://stackoverflow.com/questions/76175487/sudden-importerror-cannot-import-name-appengine-from-requests-packages-urlli
pyrebase5 == 5.0.1

paypal-checkout-serversdk == 1.0.3
paypal-payouts-sdk == 1.0.0
braintree == 4.23.0
# forex-python == 1.6 
plaid-python == 17.0.0
phonenumbers == 8.13.23
drf-yasg == 1.21.7
cryptography == 41.0.5

geopy == 2.4.0
pyzipcode == 3.0.1
pyparsing == 2.4.7
#keyring

Daphne log from startup:

INFO     Starting server at tcp:port=6002:interface=0.0.0.0
INFO     HTTP/2 support not enabled (install the http2 and tls Twisted extras)
INFO     Configuring endpoint tcp:port=6002:interface=0.0.0.0
INFO     HTTPFactory starting on 6002
INFO     Starting factory <daphne.http_protocol.HTTPFactory object at 0xffffbbad6730>
INFO     Listening on TCP address 0.0.0.0:6002

JavaScript:

useEffect(()=>{
        if (!loaded){
            setLoaded(true)
            const chatSocket = getChatSocket(inboxId)
            console.log('first step')
            console.log(chatSocket)
            
            chatSocket.onopen = function(){
                console.log('connection made ...')

                chatSocket.send(JSON.stringify({
                    'type': 'get_messages'
                }));

                chatSocket.send(JSON.stringify({
                    'type': 'join_room'
                }));

                ws.current = chatSocket
            }
        
            chatSocket.onmessage = function (e:any) {
                const data = JSON.parse(e.data);
                console.log(data);
        
                switch (data.type) {
                    case "jwt_tokens":
                        setTokens(data.token, data.refreshToken)
                        break;

                    case "chat.message":
                        setMessages(messages => (messages.some(message => message.id === data.message.id))? 
                            (messages.map((message:any)=>{return (message.id === data.message.id)?data.message : message})) : [...messages, data.message]
                        )
                        if (profile.username === data.message.username){
                            scrollToBottom()
                        }
                        break;
        
                    case "messages":
                        setMessages(data.messages)
                        setCurrentSubView('default')
                        scrollToBottom()
                        break;

                    case "user_list":
                        for (let i = 0; i < data.users.length; i++) {
                            
                            // onlineUsersSelectorAdd(data.users[i]);
                        }
                        break;

                    case "user_join":
                        // chatLog.value += data.user + " joined the room.\n";
                        // onlineUsersSelectorAdd(data.user);
                        break;

                    case "user_leave":
                        // chatLog.value += data.user + " left the room.\n";
                        // onlineUsersSelectorRemove(data.user);
                        break;
                    default:
                        console.error("Unknown message type!");
                        break;
                }
        
                // scroll 'chatLog' to the bottom
                // chatLog.scrollTop = chatLog.scrollHeight;
            };
        
            chatSocket.onclose = function(e:any) {
                console.error('Chat socket closed unexpectedly: ', e);
                backButton()
            };

            chatSocket.onerror = function(e:any) {
                console.error('Error: ', e);
            };

        }

    },[loaded, getChatSocket, inboxItem.id, inboxId])

The getChatSocket call:

import config from "../config";
import ReconnectingWebSocket from 'reconnecting-websocket';


export const useMessagesWs = () => {
    const path = config.baseWs + '/inbox/'

    const getChatSocket = (inboxId?:string)=>{
        return new WebSocket(path + inboxId + '/')
        // return new WebSocket(path)
        // return new ReconnectingWebSocket(path + inboxId + '/')
    }

    return {
        getChatSocket
    }

}

Thanks!

Ok, I think this helps get us closer. I’ve got two thoughts in mind as to what it might be.

Idea 1 - It is an nginx configuration issue, in that it’s not forwarding a header that it’s supposed to forward.

Idea 2 - It’s an app configuration issue in that what is being forwarded through by nginx is not what the app is configured to expect.
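To illustrate what Idea 2 could look like in practice: an origin validator compares the host in the browser's `Origin` header with the hosts the app is configured to allow. The sketch below is a simplified stand-in written for illustration, not the actual `AllowedHostsOriginValidator` implementation; `origin_allowed` and its parameters are hypothetical names.

```python
from urllib.parse import urlparse


def origin_allowed(origin_header: bytes, allowed_hosts: list[str]) -> bool:
    """Return True if the host in the Origin header matches an allowed host.

    Simplified rules: exact host names, a leading-dot suffix such as
    ".example.com", and the "*" wildcard.
    """
    host = urlparse(origin_header.decode("latin-1")).hostname
    if host is None:
        # Missing or unparsable Origin header
        return False
    for pattern in allowed_hosts:
        if pattern == "*" or pattern == host:
            return True
        if pattern.startswith(".") and (host == pattern[1:] or host.endswith(pattern)):
            return True
    return False
```

With a check of this kind, an `Origin` of `https://domain.tld` passes only if `domain.tld` (or a matching pattern) is in the allowed list, which is one way a setup that works on localhost could behave differently behind a production domain.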

I can think of a couple ways to help diagnose this. The first is to remove one or more of the AllowedHostsOriginValidator, AuthMiddlewareStack, and JWTAuthMiddleware from the router. Again, the idea is to make changes to help isolate the issue.

You can remove them all to see if the sockets can be established without them involved. If you can, then finding out which one is causing the problem can help identify the solution.
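One way to make that elimination less error-prone is to build the wrapper chain from a list, so a layer can be switched off without re-nesting the code by hand. This is only a sketch: `build_stack` and the layer labels are hypothetical, and the commented usage assumes the names from the asgi.py posted above.

```python
from typing import Callable, Iterable

Layer = Callable[[object], object]


def build_stack(inner, layers: Iterable[tuple[str, Layer]], disabled: set[str]):
    """Wrap `inner` with every enabled layer.

    Layers are listed outermost first, so the first enabled entry
    ends up as the outermost wrapper.
    """
    app = inner
    for name, layer in reversed(list(layers)):
        if name not in disabled:
            app = layer(app)
    return app


# Hypothetical use in asgi.py (not executed here):
# layers = [
#     ("origin", AllowedHostsOriginValidator),
#     ("auth", AuthMiddlewareStack),
#     ("jwt", JWTAuthMiddleware),
# ]
# websocket_app = build_stack(URLRouter(websocket_urlpatterns), layers, disabled={"jwt"})
```

Starting with all three labels in `disabled` and re-enabling them one at a time shows which layer, if any, coincides with the disconnect.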

The other way to try and diagnose this would be to use wireshark or tcpdump to perform a network capture between nginx and Daphne in your production environment, and compare it to a capture done in front of your development environment. It would also be instructive to do a capture in front of nginx, but for that to be useful, you would need to enable http so that the packets being captured aren’t encrypted.
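As a lighter-weight complement to a packet capture, the WebSocket upgrade request can be sent straight to Daphne's port from the server itself, bypassing nginx, to see what comes back. The sketch below uses only the Python standard library; the function names are hypothetical, and the address, host, path and origin passed to `probe` are placeholders to be replaced with real values.

```python
import base64
import hashlib
import os
import socket

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value a server should return for `key`."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake(host: str, path: str, origin: str, key: str) -> bytes:
    """Build a minimal HTTP/1.1 WebSocket upgrade request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"Origin: {origin}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(address: tuple[str, int], host: str, path: str, origin: str) -> str:
    """Send the upgrade request over plain TCP and return the raw response head."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(build_handshake(host, path, origin, key))
        return sock.recv(4096).decode("latin-1")
```

For example, `print(probe(("localhost", 6002), "domain.tld", "/ws/inbox/<inbox_id>/", "https://domain.tld"))` run on the host would show whether Daphne answers with `101 Switching Protocols` when nginx is not in the path. Middleware that expects cookies or tokens may reject such a bare request, so it is most useful with the middleware temporarily removed.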

Thanks @KenWhitesell,

I will definitely try that and get back if it works or not.

Thanks!

Hi @KenWhitesell ,

I hope you are doing well.

I have done some troubleshooting and captured what Nginx was sending to the app:

{
    'type': 'websocket', 
    'path': '/ws/inbox/948252324-169626-QvepuQVR2z9odML8joreJT/', 
    'raw_path': b'/ws/inbox/948252324-169626-QvepuQVR2z9odML8joreJT/', 
    'headers': [
        (b'upgrade', b'websocket'), 
        (b'connection', b'Upgrade'), 
        (b'host', b'domain.tld'), 
        (b'x-real-ip', b'2601::'), 
        (b'x-forwarded-for', b'2601::, 2601::'), 
        (b'x-forwarded-host', b'domain.tld'), 
        (b'accept-encoding', b'gzip'), 
        (b'cf-ray', b'xxx'), 
        (b'x-forwarded-proto', b'https'), 
        (b'cf-visitor', b'{"scheme":"https"}'), 
        (b'pragma', b'no-cache'), 
        (b'cache-control', b'no-cache'), 
        (b'user-agent', b'Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1'), 
        (b'accept-language', b'en-US,en;q=0.6'), 
        (b'origin', b'https://domain.tld'), 
        (b'sec-websocket-version', b'13'), 
        (
            b'cookie', 
            b'frontSessionId=bSvRqnio4E1czNmVSm1uGE; 
            deviceId=; 
            token=xxx; 
            refreshToken=xxx'
        ), 
        (b'sec-websocket-key', b'6fvSmel4+8zrJ1JtuI3Wpw=='), 
        (b'sec-websocket-extensions', b'permessage-deflate; client_max_window_bits'), 
        (b'cf-connecting-ip', b'2601::'), 
        (b'cdn-loop', b'cloudflare'), 
        (b'cf-ipcountry', b'US')
    ], 
    'query_string': b'', 
    'client': ['192.168.112.1', 41550], 
    'server': ['192.168.112.5', 8003], 
    'subprotocols': [], 
    'asgi': {'version': '3.0'}, 
    'cookies': 
        {
            'frontSessionId': 'bSvRqnio4E1czNmVSm1uGE', 
            'deviceId': '', 
            'token': 'xxx', 
            'refreshToken': 'xxx'
    }, 
    'session': <django.utils.functional.LazyObject object at 0xffffa0ed83d0>, 
    'user': <channels.auth.UserLazyObject object at 0xffffa0ed8520>
}

The client and server’s IPs are different. Could that be an issue?

What do you mean by “different”? I would hope that the client and server had different addresses.
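For reference, in an ASGI scope `client` is the peer that opened the connection to the server process and `server` is the address the server itself is listening on. Behind a reverse proxy, `client` is therefore normally the proxy's (or the Docker network gateway's) address, and the browser's address only appears in forwarded headers. A minimal sketch of reading it from a scope shaped like the one above follows; `forwarded_client_ip` is a hypothetical helper, and it assumes the proxy in front is trusted to set `X-Forwarded-For` honestly.

```python
from typing import Optional


def forwarded_client_ip(scope: dict) -> Optional[str]:
    """Return the first X-Forwarded-For entry, falling back to scope['client']."""
    headers = dict(scope.get("headers", []))
    forwarded = headers.get(b"x-forwarded-for")
    if forwarded:
        # The left-most entry is the address the first proxy saw.
        first = forwarded.decode("latin-1").split(",")[0].strip()
        if first:
            return first
    client = scope.get("client")
    return client[0] if client else None
```

Applied to the scope posted above, this returns the (redacted) `2601::` value from `x-forwarded-for` instead of the `192.168.112.1` address in `client`.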