
I am working on integrating a VoIP call (Asterisk setup) with a real-time WebSocket bot. The goal is to send incoming voice data to the bot over WebSocket and play back the bot’s response to the caller. I have attempted two approaches but am facing significant audio quality issues in both:

Approach 1: Using ARI

  • I connected to the ARI server and created a bridge.

  • A local channel and an external channel were created and added to the bridge.

  • A snoop channel was created on the local channel with spy="in" to capture incoming audio.

  • Another snoop channel was created with whisper="out" to inject the bot’s response.

  • I created an external media connection on the external channel and also started a UDP server to receive its RTP stream.

Issue:

The audio sent over WebSocket is extremely noisy, making it incomprehensible to the bot. Additionally, the bot's response audio, when sent back to the caller, is not audible.
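For reference, here is a minimal sketch of how the UDP side could unpack what the external media channel sends. It assumes the channel was created with the ulaw format (payload type 0) and that each datagram is a standard RTP packet as defined in RFC 3550. The function names `rtp_payload` and `ulaw_to_pcm16` are hypothetical, not part of ARI. Forwarding the whole datagram (header included) or treating mu-law bytes as linear PCM would both sound like heavy noise, so this may be worth ruling out.

```python
import struct


def rtp_payload(packet: bytes) -> tuple[int, bytes]:
    """Return (payload_type, payload) from one RTP datagram (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second = packet[0], packet[1]
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    csrc_count = first & 0x0F
    has_extension = bool(first & 0x10)
    has_padding = bool(first & 0x20)
    payload_type = second & 0x7F
    offset = 12 + 4 * csrc_count  # fixed header plus CSRC list
    if has_extension:
        # Extension header: 16-bit profile id, 16-bit length in 32-bit words.
        ext_words = struct.unpack_from("!H", packet, offset + 2)[0]
        offset += 4 + 4 * ext_words
    end = len(packet)
    if has_padding:
        end -= packet[-1]  # last byte holds the padding length
    return payload_type, packet[offset:end]


def ulaw_to_pcm16(payload: bytes) -> bytes:
    """Decode G.711 mu-law bytes to 16-bit signed little-endian PCM."""
    samples = []
    for byte in payload:
        u = ~byte & 0xFF
        exponent = (u >> 4) & 0x07
        mantissa = u & 0x0F
        magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
        samples.append(-magnitude if u & 0x80 else magnitude)
    return struct.pack("<%dh" % len(samples), *samples)
```

If the channel uses a different format (for example slin16), the header handling stays the same but the payload decoding and byte order would need to be checked separately.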

Approach 2: Using AudioSocket

  • I set up an AudioSocket server to handle the call’s audio.

  • The incoming audio is successfully sent to the bot, but similar noise issues persist.

Issue:

While the bot’s response audio is at least audible to the caller, it is still not understandable due to excessive noise.
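For context, a rough sketch of the AudioSocket framing as I understand it: each message is a 1-byte type, a 2-byte big-endian payload length, then the payload, and audio messages (type 0x10) carry 16-bit signed little-endian mono PCM at 8 kHz. The helper names `split_frames` and `audio_frame` are hypothetical. Reading the TCP stream in fixed-size chunks without honoring the length field, or writing bot audio back without the 3-byte header, would corrupt the audio in the way described.

```python
import struct

KIND_HANGUP = 0x00  # caller hung up
KIND_UUID = 0x01    # 16-byte call identifier, sent first
KIND_AUDIO = 0x10   # signed 16-bit PCM payload
KIND_ERROR = 0xFF


def split_frames(buffer: bytes):
    """Split buffered TCP data into (kind, payload) frames.

    Returns the complete frames plus any trailing partial frame, which
    should be prepended to the next chunk read from the socket.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= 3:
        kind, length = struct.unpack_from("!BH", buffer, offset)
        if len(buffer) - offset - 3 < length:
            break  # payload not fully received yet
        start = offset + 3
        frames.append((kind, buffer[start:start + length]))
        offset = start + length
    return frames, buffer[offset:]


def audio_frame(pcm: bytes) -> bytes:
    """Wrap PCM bytes in an AudioSocket audio message for playback."""
    return struct.pack("!BH", KIND_AUDIO, len(pcm)) + pcm
```

A typical loop would append each `recv()` result to the leftover bytes, call `split_frames`, and forward only the payloads of `KIND_AUDIO` frames to the bot.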

Troubleshooting Attempts

I have tried resampling the audio before sending it to the bot, but this did not improve the quality.
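To make the resampling step concrete, here is a minimal sketch of upsampling 8 kHz telephony audio by a factor of 3 to 24 kHz with linear interpolation. It assumes the bot expects 16-bit little-endian PCM at 24 kHz, which is an assumption about the Realtime API's pcm16 format and should be checked against its documentation. The function name `upsample_pcm16` is hypothetical, and linear interpolation is a crude stand-in for a proper filtered resampler.

```python
import struct


def upsample_pcm16(pcm: bytes, factor: int = 3) -> bytes:
    """Upsample 16-bit little-endian mono PCM by an integer factor."""
    count = len(pcm) // 2
    samples = struct.unpack("<%dh" % count, pcm[:count * 2])
    out = []
    for i, cur in enumerate(samples):
        # Interpolate toward the next sample; hold the last one.
        nxt = samples[i + 1] if i + 1 < count else cur
        for k in range(factor):
            out.append(cur + (nxt - cur) * k // factor)
    return struct.pack("<%dh" % len(out), *out)
```

The bot's replies would need the reverse conversion (24 kHz down to 8 kHz) before being written back to Asterisk. If the API accepts G.711 mu-law input and output directly, configuring it that way might avoid resampling altogether.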

I have spent several days troubleshooting this issue without success.

Request for Help

I am unsure what I might be doing wrong in my setup. Is there a better way to handle the audio streams, or am I missing any critical configuration? Any guidance would be greatly appreciated.

I am using Asterisk 18.

The bot is the GPT-4o Realtime preview model.
