Why does the Django 3.2 StreamingHttpResponse not stream?

Hi everybody,
Here is my view (Django 3.2 version):

    def sse_view(request):
        def event_stream():
            robot_response = []
            for chunk in stream:
                robot_response.append(chunk['message']['content'].encode('utf-8').decode('utf-8'))

                yield f"data: {json.dumps({'msg': chunk['message']['content']}, ensure_ascii=False)}\n\n"

                sys.stdout.flush()
                # time.sleep(1)

            full_response = ''.join(robot_response)

            store_message(conversation_id, 'assistant', full_response)

            yield f"data: {json.dumps({'msg': '@@!@@#@@'}, ensure_ascii=False)}\n\n"

        response = StreamingHttpResponse(event_stream(), content_type='text/event-stream')

        response['Cache-Control'] = 'no-cache'
        response['X-Accel-Buffering'] = 'no'
        return response

It cannot stream the response; the frontend HTML does not render the words one by one from the big model (llama3.2).
Any advice on this issue would be appreciated.
Best,
hanhuihong

Welcome @honghh2018 !

First, a Side Note: When posting code here, enclose the code between lines of three
backtick - ` characters. This means you’ll have a line of ```, then your code,
then another line of ```. This forces the forum software to keep your code
properly formatted. (I have taken the liberty of correcting your original posts.
Please remember to do this in the future.)

You’ve identified that this isn’t working; please describe what’s actually happening. Are you getting any error messages, either in the browser or on the server console? If you are, please post them.

If you’re not, use the network tab in your browser’s developer tools to see what is being returned by the server and describe what is being sent.

(Also, it looks like there may be some code missing from this view, and that the indentation of this isn’t correct. Please post the complete view.)
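
If it helps, a sketch like the one below (assuming the streaming view is routed at a placeholder URL such as /sse_stream/) uses Django's test client to check whether the view itself yields multiple chunks before any web server, proxy, or browser is involved:

    from django.test import Client

    # Call the streaming view directly through Django's request handling.
    # '/sse_stream/' is a placeholder; use the project's real route.
    client = Client()
    response = client.get('/sse_stream/')

    # For a StreamingHttpResponse, streaming_content is an iterator of byte
    # chunks. Seeing many small chunks here means the view streams fine and
    # any buffering happens further down the stack.
    for chunk in response.streaming_content:
        print(chunk.decode('utf-8'), end='')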

Thanks for the reminder.

    def sse_stream(request):
        print('1', request.GET)
        print('2', request.body)
        user_id = 1
        sender = f'user{user_id}'
        # store_message(conversation_id, sender, 'assistent')
        conn = get_redis_connection('default')
        talk_info = json.loads(conn.get(sender))

        print("talk_info->", talk_info)
        content = talk_info.get('content', "")
        role_content = talk_info.get('roleContent', "")
        model_select = talk_info.get('modelSelect', "")
        role = 'user'
        conversation_id = f'user_history{user_id}'
        store_message(conversation_id, role, content)

        history_prompt = get_history_message(conversation_id)

        print('history_prompt->', history_prompt)
        if role_content != "":
            # history_prompt.append({'role': 'system', 'content': role_content})
            history_prompt.append({
                'role': 'user',
                'content': content
            })
        else:
            history_prompt.append({
                'role': 'user',
                'content': content
            })

        print("history_prompt-->", history_prompt)

        stream = llama_chat_stream_tool5(history_prompt, model_select)

        def event_stream():
            robot_response = []
            for chunk in stream:
                robot_response.append(chunk['message']['content'].encode('utf-8').decode('utf-8'))

                # print('111', chunk)
                yield f"data: {json.dumps({'msg': chunk['message']['content']}, ensure_ascii=False)}\n\n"
                # time.sleep(1)

            full_response = ''.join(robot_response)

            store_message(conversation_id, 'assistant', full_response)
            # if role_content != "":
            #     store_message(conversation_id, 'system', role_content)

            yield f"data: {json.dumps({'msg': '@@!@@#@@'}, ensure_ascii=False)}\n\n"

        response = StreamingHttpResponse(event_stream(), content_type='text/event-stream')

        response['Cache-Control'] = 'no-cache'
        response['X-Accel-Buffering'] = 'no'
        return response

The llama tool is defined below:
    def llama_chat_stream_tool5(content, model_name):
        if model_name == "":
            model_name = 'llama3:8b'

        stream = ollama.chat(
            model=model_name,
            messages=content,
            stream=True,
        )

        return stream
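
As a rough standalone check (run outside Django, with a made-up prompt), iterating the ollama stream directly shows whether the model side already yields incremental chunks:

    import ollama

    # Stream a short reply and print each chunk as it arrives. If this prints
    # word by word in a terminal, the model side is streaming correctly and
    # the problem sits in the Django / HTTP / frontend layers instead.
    stream = ollama.chat(
        model='llama3:8b',
        messages=[{'role': 'user', 'content': 'Say hello in five words.'}],
        stream=True,
    )
    for chunk in stream:
        print(chunk['message']['content'], end='', flush=True)
    print()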


I can see the generator printing the streamed messages in the Django backend console, but the HTML frontend only shows the output all at once, after the backend has finished emitting the complete message.

Chrome doesn't show any errors, and the frontend page cannot display the model's response in a streaming manner.
Best,
hanhuihong
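
One way to narrow this down is a minimal sanity-check view with no Redis or LLM dependency (the view name and timing below are made up); if its ticks also arrive all at once in the browser, the buffering is happening in the server, a proxy, or the frontend code rather than in the LLM view:

    import json
    import time

    from django.http import StreamingHttpResponse

    def sse_ping(request):
        # Hypothetical sanity-check view: emits one SSE message per second.
        def event_stream():
            for i in range(5):
                payload = json.dumps({'msg': f'tick {i}'})
                yield f"data: {payload}\n\n"
                time.sleep(1)

        response = StreamingHttpResponse(event_stream(), content_type='text/event-stream')
        response['Cache-Control'] = 'no-cache'
        response['X-Accel-Buffering'] = 'no'
        return response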

Has anyone seen this issue?
The response headers shown in the frontend are below:
access-control-allow-headers: *
access-control-allow-methods: *
access-control-allow-origin: *
cache-control: no-cache
content-type: text/event-stream
referrer-policy: same-origin
transfer-encoding: chunked
transfer-encoding: None
vary: Cookie
x-accel-buffering: no
x-content-type-options: nosniff
Does transfer-encoding: chunked trigger the non-streaming problem?
And how can I fix this issue in Django 3.2?
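
One way to tell a server-side or proxy buffering problem apart from a frontend rendering problem is to consume the endpoint from a small script and watch whether the lines arrive one at a time; the URL below is a placeholder for the real route:

    import requests

    # Stream the SSE endpoint directly, bypassing the browser. If these lines
    # print incrementally, the server is streaming and the issue is in how
    # the frontend reads and renders the response.
    url = 'http://127.0.0.1:8000/sse_stream/'
    with requests.get(url, stream=True) as resp:
        for line in resp.iter_lines(decode_unicode=True):
            if line:
                print(line)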