Real-time streaming (typing effect)


Here is a deeper dive into why the Server-Sent Events (SSE) approach is highly recommended for WeChat Mini-Programs, along with how to implement it both on the CrestApp (ASP.NET Core) backend and the WeChat frontend.

Why SSE is Better Than SignalR for WeChat Mini-Programs

While SignalR is fantastic for standard web browsers, it introduces friction in WeChat:

  1. The Sandbox Environment: The official @microsoft/signalr JavaScript client expects a standard browser environment (it looks for window, document, and the standard browser WebSocket object). WeChat Mini-Programs run in a custom JS runtime (JSCore) where these global objects do not exist.
  2. Adapter Dependency: To make SignalR work in WeChat, you have to rely on third-party polyfills or modified SignalR clients to map the SignalR transport to WeChat's wx.connectSocket. These adapters can become outdated or introduce hard-to-debug connection drops.
  3. The Chat Pattern: AI chat is typically asymmetric. The user sends a short burst of text (perfect for a standard HTTP POST), and the server responds with a long, slow stream of text (the "typing" effect). You don't actually need the persistent, bidirectional, full-duplex nature of WebSockets just to stream text in one direction.

SSE uses standard HTTP/1.1 (or HTTP/2) with chunked transfer encoding. Because it's just HTTP, WeChat handles it natively and robustly.
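On the wire, an SSE response body is just text: each event is a `data:` line terminated by a blank line. A minimal sketch in plain JavaScript (the sample payloads are illustrative) shows how the frames are delimited:

```javascript
// What the body of a text/event-stream response looks like on the wire:
// each event is a "data: ..." line followed by a blank line.
const rawStream =
  "data: Hello\n\n" +
  "data: , world\n\n" +
  "data: [DONE]\n\n";

// Splitting on the blank-line delimiter recovers the individual events.
const events = rawStream
  .split("\n\n")
  .filter((frame) => frame.length > 0)
  .map((frame) => frame.replace(/^data: /, ""));

console.log(events); // ["Hello", ", world", "[DONE]"]
```

Because this framing is plain text over plain HTTP, no special protocol support is needed on either side.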


Step 1: The CrestApp Backend (ASP.NET Core)

You need to create an API endpoint in your Orchard Core module that holds the HTTP request open and writes chunks to the response stream as the AI generates them.

Here is an example of what that controller looks like:

using Microsoft.AspNetCore.Mvc;
using System.Text;
using System.Threading.Tasks;
// Include your CrestApp AI abstractions here

[ApiController]
[Route("api/wechat/chat")]
public class WeChatChatController : ControllerBase
{
    private readonly IOrchestrator _orchestrator; // CrestApp's AI processor

    public WeChatChatController(IOrchestrator orchestrator)
    {
        _orchestrator = orchestrator;
    }

    [HttpPost("stream")]
    public async Task StreamMessage([FromBody] WeChatChatRequest request)
    {
        // 1. Set the correct headers for Server-Sent Events
        // Use the typed ContentType property and the header indexer:
        // IHeaderDictionary.Add throws if the key already exists.
        Response.ContentType = "text/event-stream";
        Response.Headers["Cache-Control"] = "no-cache";
        Response.Headers["Connection"] = "keep-alive";

        // 2. Build your Orchestration Context (based on CrestApp's models)
        // This includes the user's prompt, session ID, and profile settings
        var context = BuildOrchestratorContext(request);

        // 3. Process the stream
        // Assuming _orchestrator.ProcessStreamAsync returns an IAsyncEnumerable of chunks
        await foreach (var chunk in _orchestrator.ProcessStreamAsync(context, HttpContext.RequestAborted))
        {
            // Format the chunk per the SSE standard: "data: {content}\n\n".
            // Note: if chunk.Text can itself contain newlines, each line
            // must get its own "data: " prefix to stay spec-compliant.
            var sseMessage = $"data: {chunk.Text}\n\n";
            var bytes = Encoding.UTF8.GetBytes(sseMessage);

            await Response.Body.WriteAsync(bytes, 0, bytes.Length);
            await Response.Body.FlushAsync(); // Force the chunk to be sent immediately
        }

        // 4. Send a termination signal so the client knows it's done
        var endMessage = Encoding.UTF8.GetBytes("data: [DONE]\n\n");
        await Response.Body.WriteAsync(endMessage, 0, endMessage.Length);
        await Response.Body.FlushAsync();
    }
}

Step 2: The WeChat Mini-Program Frontend

On the WeChat side, you use the standard wx.request API, but you turn on enableChunked: true. This prevents WeChat from waiting for the entire request to finish before triggering callbacks; instead, it fires an event every time your backend flushes a new chunk of data.

Page({
  data: {
    messages: [], // Array of chat messages
    currentAssistantMessage: "" // The message currently being typed
  },

  sendMessage(userText) {
    // 1. Add user message to UI immediately
    this.setData({
      messages: [...this.data.messages, { role: 'user', text: userText }],
      currentAssistantMessage: ""
    });

    // 2. We need a TextDecoder to convert ArrayBuffer chunks to strings
    // Note: WeChat provides a TextDecoder in newer base libraries
    const decoder = new TextDecoder('utf-8');

    // 3. Initiate the chunked request
    const requestTask = wx.request({
      url: 'https://your-crestapp-domain.com/api/wechat/chat/stream',
      method: 'POST',
      enableChunked: true, // IMPORTANT: Enables streaming
      header: {
        'content-type': 'application/json'
      },
      data: {
        text: userText,
        sessionId: 'user-session-123'
      },
      success: (res) => {
        // The request fully completed
        console.log("Stream finished", res);
        
        // Move the typed message into the main message history
        this.setData({
          messages: [...this.data.messages, { role: 'assistant', text: this.data.currentAssistantMessage }],
          currentAssistantMessage: ""
        });
      },
      fail: (err) => {
        console.error("Stream failed", err);
      }
    });

    // 4. Listen for incoming chunks
    requestTask.onChunkReceived((response) => {
      // response.data is an ArrayBuffer containing the chunk bytes.
      // { stream: true } keeps any partial multi-byte UTF-8 sequence
      // buffered until the next chunk completes it.
      const chunkText = decoder.decode(new Uint8Array(response.data), { stream: true });
      
      // Parse the SSE format (strip out "data: " and "\n\n")
      // Note: In a real app, you should use a proper regex or split logic here
      // to handle cases where a chunk contains multiple "data:" lines.
      if (chunkText.includes("[DONE]")) {
        return; 
      }

      let newContent = chunkText.replace(/^data: /gm, '').replace(/\n\n/g, '');

      // Append the new text to the UI variable to create the typing effect
      this.setData({
        currentAssistantMessage: this.data.currentAssistantMessage + newContent
      });
      
      // Optionally scroll to the bottom of the chat view here
    });
  }
})
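The naive string replace above breaks down when a chunk boundary lands in the middle of a `data:` frame. A stateful parser that buffers incomplete frames is a safer approach. This is an illustrative sketch (the names `createSseParser`, `onData`, and `onDone` are assumptions, not part of any WeChat or CrestApp API):

```javascript
// Buffers raw text across chunks and emits one callback per complete
// SSE frame, so a frame split across two chunks is handled correctly.
function createSseParser(onData, onDone) {
  let buffer = "";
  return function feed(chunkText) {
    buffer += chunkText;
    let idx;
    // A complete SSE frame ends with a blank line ("\n\n").
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      // Collect the payload of every "data:" line in the frame.
      const payload = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (payload === "[DONE]") {
        if (onDone) onDone();
      } else if (payload) {
        onData(payload);
      }
    }
  };
}
```

Inside `onChunkReceived` you would call `feed(decoder.decode(new Uint8Array(response.data), { stream: true }))` instead of the inline replace logic, and append each `onData` payload to `currentAssistantMessage`.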

Summary of the Flow

  1. User taps "Send".
  2. wx.request fires an HTTP POST.
  3. The CrestApp controller accepts it, contacts the AI model (e.g., Azure OpenAI), and begins receiving tokens.
  4. As each token arrives from the AI, the controller writes it to the response body and calls Response.Body.FlushAsync(), pushing the bytes to the client immediately.
  5. The WeChat client immediately fires onChunkReceived, decodes the bytes, and appends them to currentAssistantMessage, so the screen reflects the new text as it arrives.
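One practical caveat for step 5: calling setData once per token can get expensive, because every call crosses the Mini-Program's logic/render layer boundary. A simple mitigation is to buffer tokens and flush them in batches. A minimal sketch, where `createTypingBuffer` and its character-count flush policy are illustrative assumptions rather than WeChat API:

```javascript
// Buffers incoming tokens and flushes them in batches so the UI update
// function (e.g. a setData wrapper) is called less often. Flushes once
// at least `maxPending` characters have accumulated; call flush()
// manually at the end of the stream to drain the remainder.
function createTypingBuffer(flush, maxPending = 8) {
  let pending = "";
  return {
    push(text) {
      pending += text;
      if (pending.length >= maxPending) this.flush();
    },
    flush() {
      if (pending.length > 0) {
        flush(pending);
        pending = "";
      }
    }
  };
}
```

In the chat page you would push decoded text into the buffer from onChunkReceived, point `flush` at a setData call that appends to currentAssistantMessage, and call `flush()` one final time in the success callback.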