admin管理员组文章数量:1347181
I’m working on a project where I need to use the speech_recognition module to process audio in real-time. However, from my research, it seems that speech_recognition (which works with pyaudio) only supports input sources like microphones and doesn’t seem to support capturing system output (the audio played through speakers). I need to find a way to capture the system’s output audio so that it can be processed by the recognizer.
I’ve tried using speech_recognition with pyaudio and sounddevice to record audio from the microphone, but neither captures system audio. I've researched into using both loopback methods and directly using output sources with speech_recognition, but most of them seem to be paid services? Any guidance/help would be appreciated.
Here's the code.
import numpy as np
import speech_recognition as sr
from time import sleep
def select_microphone():
mics = sr.Microphone.list_microphone_names()
input_mics = []
for index, name in enumerate(sr.Microphone.list_microphone_names()):
input_mics.append([index,name])
if not input_mics:
print("No available microphones.")
exit(1)
print("Available Microphones:")
for i, mic in input_mics:
print(f"{i}: {mic}")
while True:
try:
choice = int(input("Select microphone index: "))
if 0 <= choice < len(input_mics):
return sr.Microphone(sample_rate=16000, device_index=input_mics[choice][0])
except ValueError:
pass
print("Invalid selection. Try again.")
# Load model quickly
model = whisper.load_model("small")
data_queue = Queue()
transcription = ['']
recorder = sr.Recognizer()
# Select microphone
source = select_microphone()
def record_callback(_, audio: sr.AudioData):
data_queue.put(audio.get_raw_data())
with source:
recorder.adjust_for_ambient_noise(source)
recorder.listen_in_background(source, record_callback)
while True:
try:
if not data_queue.empty():
audio_data = b''.join(data_queue.queue)
data_queue.queue.clear()
audio_np = np.frombuffer(audio_data, dtype=np.int16).astype(np.float32) / 32768.0
else:
sleep(0.25)
except KeyboardInterrupt:
break
print("\nFinal Transcription:")
print("\n".join(transcription))
本文标签: pythonHow to Capture Loopback Audio with SpeechRecognition (PyAudio)Stack Overflow
版权声明:本文标题:python - How to Capture Loopback Audio with SpeechRecognition (PyAudio)? - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1743832648a2546807.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论