I am building a Q&A app using Streamlit, OpenAI Whisper, and SoundDevice. The goal of the app is to:

- Listen to real-time audio input from the microphone.
- Transcribe the audio into text using OpenAI Whisper.
- Generate responses using OpenAI's language model.
- Display both the transcribed question and the response on the Streamlit interface in real time.

The app fails to listen to audio and return answers as expected. The Streamlit UI starts and shows the message "Listening for interview questions...", but nothing happens when I speak into the microphone. Occasionally, the app crashes and shows a "Connection Error" message.
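To rule out the capture and transcription pieces on their own, the standalone check I have in mind looks like this (a minimal sketch outside Streamlit; the 5-second test clip and the 16 kHz mono float32 format are my assumptions based on what Whisper expects, not values taken from my app):

import numpy as np
import sounddevice as sd
import whisper

SAMPLE_RATE = 16000  # Whisper expects 16 kHz mono audio
DURATION = 5         # seconds to record for the test

model = whisper.load_model("tiny")

# Record a short clip from the default input device
recording = sd.rec(int(DURATION * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
sd.wait()  # block until the recording is finished

# Whisper wants a 1-D float32 array, so drop the channel dimension
audio = recording.flatten()
result = model.transcribe(audio)
print("Transcription:", result["text"])

If a check like this prints sensible text, the microphone and Whisper side should be fine and the problem would be in how the Streamlit app wires them together.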

I have tried the following without success:

- Changing the Whisper model from base to tiny to reduce memory usage.
- Using separate threads for audio capture and transcription to avoid blocking the UI.
- Running on macOS Monterey with an Apple M2 Pro chip.
- Restarting the Streamlit app and checking microphone permissions (a quick device check with sounddevice is shown right after this list).
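For completeness, the device check is just sounddevice's built-in query; it only confirms which device index is currently the default input:

import sounddevice as sd

# Print every audio device PortAudio can see, plus the default (input, output) indices
print(sd.query_devices())
print("Default (input, output) devices:", sd.default.device)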

Environment:

- OS: macOS Monterey 12.6
- Python: 3.10
- Streamlit: 1.25.0
- Whisper: latest
- SoundDevice: latest
- Hardware: Apple M2 Pro chip (2024 MacBook Pro)

Code:

import os
import openai
import whisper
import streamlit as st
import sounddevice as sd
import numpy as np
import queue
import threading
import time

# Initialize Whisper model
whisper_model = whisper.load_model('tiny')

# Streamlit setup
st.title("AI/ML Interview Assistant")
st.markdown("Listening for interview questions...")

# Real-time audio queue
audio_queue = queue.Queue()

# Audio callback to capture microphone input
def audio_callback(indata, frames, time, status):
    audio_queue.put(indata.copy())

# Transcribe audio and generate responses
def transcribe_and_respond():
    audio_data = []
    while True:
        try:
            if not audio_queue.empty():
                audio_data.append(audio_queue.get())
                if len(audio_data) > 20:
                    audio_segment = np.concatenate(audio_data, axis=0)
                    audio_data.clear()
                    transcription = whisper_model.transcribe(audio_segment)
                    question = transcription['text']
                    st.text(f"You: {question}")
                    response = generate_response(question)
                    st.text(f"Assistant: {response}")
                    time.sleep(1)
        except Exception as e:
            st.error(f"Error during transcription: {str(e)}")

# Generate response using OpenAI API
def generate_response(question):
    try:
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=f"Q: {question}\nA:",
            max_tokens=150
        )
        return response['choices'][0]['text'].strip()
    except Exception as e:
        return f"Error generating response: {str(e)}"

# Start audio stream in a separate thread
def start_audio_stream():
    try:
        stream = sd.InputStream(callback=audio_callback)
        with stream:
            threading.Thread(target=transcribe_and_respond, daemon=True).start()
            while True:
                time.sleep(0.1)
    except Exception as e:
        st.error(f"Audio stream error: {str(e)}")

# Start the audio stream
start_audio_stream()
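To clarify what I mean by "separate threads": the structure I am aiming for is a background worker that only fills a queue, while the Streamlit script itself polls that queue and makes every st.* call. A stripped-down sketch of that pattern is below (hypothetical names such as fake_worker and result_queue; this is not my working code, just the shape I am trying to reach):

import queue
import threading
import time

import streamlit as st

def fake_worker(out_queue):
    # Stand-in for the real capture/transcribe/respond loop.
    # It only fills the queue and never calls any st.* function.
    n = 0
    while True:
        time.sleep(2)
        n += 1
        out_queue.put(f"Question #{n} -> answer #{n}")

# Create the queue and start the worker once per browser session
if "result_queue" not in st.session_state:
    st.session_state.result_queue = queue.Queue()
    threading.Thread(
        target=fake_worker,
        args=(st.session_state.result_queue,),
        daemon=True,
    ).start()

st.title("Thread-to-UI sketch")
placeholder = st.empty()  # a single slot the script keeps rewriting

lines = []
while True:  # the Streamlit script thread owns every UI update
    try:
        lines.append(st.session_state.result_queue.get(timeout=0.5))
        placeholder.text("\n".join(lines))
    except queue.Empty:
        pass

The point of the sketch is that only the script thread touches Streamlit; the worker communicates exclusively through the queue.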
