What are the ways to implement speech recognition in Electron?

So I have an Electron app that uses the Web Speech API (SpeechRecognition) to capture the user's voice; however, it's not working. The code:

if ("webkitSpeechRecognition" in window) {
  let SpeechRecognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  let recognition = new SpeechRecognition();

  recognition.onstart = () => {
    console.log("We are listening. Try speaking into the microphone.");
  };

  recognition.onspeechend = () => {
    recognition.stop();
  };

  recognition.onresult = (event) => {
    let transcript = event.results[0][0].transcript;
    console.log(transcript);
  };

  recognition.start();
} else {
  alert("Browser not supported.");
}

It says "We are listening. Try speaking into the microphone." in the console, but no matter what you say, there is no output. Running the exact same code in Google Chrome works: whatever I say gets logged by the console.log(transcript); line. I did some more research, and it turns out that Google has recently stopped supporting the Web Speech API in shell-based Chromium windows (to my knowledge, everything that is not Google Chrome or MS Edge), so that seems to be why it is not working in my Electron app.

See: the electron-speech library's end, an Artyom.js issue, and another Stack Overflow question regarding this.

So is there any way I can get it to work in Electron?


  • Hey, and if possible, maybe this question could gain enough traction to reach the companies managing these APIs, and perhaps they could do something about native support in shell-based browsers. I understand the reasons they might've disabled it, but I think those should be solved in a way other than completely removing support. – XYBOX

2 Answers

I ended up writing an implementation that uses the Media Devices API to capture the user's speech through their microphone, streams it over WebSockets to a Python server, and has the server run the audio through the SpeechRecognition pip package and return the transcribed text to the client (the Electron app).

This is what I implemented. It is way too long for something this simple, but if someone has a better suggestion, please do let me know by writing an answer.
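Roughly, the renderer side looks like the sketch below. This is a minimal illustration of the approach rather than the exact code: the server URL (ws://localhost:8765) and the one-second chunk interval are assumptions, and the Python side still has to decode the compressed chunks (MediaRecorder emits WebM/Opus by default) before handing them to the recognizer.

// Renderer process: capture the microphone and stream audio chunks to the server.
// Assumed values: the WebSocket URL and the 1000 ms chunk interval.
const socket = new WebSocket("ws://localhost:8765");

socket.addEventListener("open", async () => {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);

  // Forward each recorded chunk to the server as soon as it is available.
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
      socket.send(event.data);
    }
  };

  recorder.start(1000); // emit a chunk roughly every second
});

// The server replies with the transcribed text.
socket.addEventListener("message", (event) => {
  console.log("transcript:", event.data);
});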

I used Rust, Neon, cpal, and Vosk to build a Node.js module that can start and stop independent OS threads which listen to the microphone and recognize text from it in real time. From Node you can select the input device, plug in different language recognizers, hand it trigger words to call back on, and so on. It works for what I built it for, but I can probably put up a repo for it and make it a little more flexible if anyone's interested.

const { app, BrowserWindow } = require('electron');
const voiceModule = require('./index.node');


// in this demo I will stop after two rounds of recognizing target words:
let called = 0;
function onWordsFound(words) {
  console.log('words found:', words);
  called++;
  if (called > 1) {
    console.log('stopping listener');
    voiceModule.stopListener();
    return;
  }
  // I use setTimeout here since the Rust function calling this JS function must return
  // before the next call to lookForWords, but you can use voiceModule.lookForWords anywhere else in your JS code
  setTimeout(() => {
    console.log('calling lookForWords');
    voiceModule.lookForWords(["second", "words"], true); 
  }, 1000);
}

const f = async () => {
  voiceModule.setPathToModel('./models/large'); // this is the English large model, but you can use any Vosk-compatible model you want
  const r = voiceModule.listDevices();
  // just use the default microphone for now but you can use listDevices and setMicName to make a selection UI
  voiceModule.setMicName(r[0]); 
  // after selecting the mic you can call startListener
  voiceModule.startListener(onWordsFound); // pass your callback
  voiceModule.lookForWords(['hello', 'world'], false); // false means match ANY word, true means they must match ALL words in the list
};

// run the demo once Electron is ready
app.whenReady().then(f);
