RecognizeAsync() doesn't work but Recognize() does, why? - c#

I'm trying to have my Windows Forms program continuously listen to my microphone, detect speech, and then display the recognized text on the GUI.
Here is my SpeechListener class:
using System.Speech.Recognition;

public class SpeechListener
{
    GUI gui;

    public SpeechListener(GUI gui) { this.gui = gui; }

    public void StartListening()
    {
        gui.setLabel("speech activated");

        // Create an in-process speech recognizer for the en-US locale.
        using (SpeechRecognitionEngine recognizer =
            new SpeechRecognitionEngine(
                new System.Globalization.CultureInfo("en-US")))
        {
            // Create and load a dictation grammar.
            recognizer.LoadGrammar(new DictationGrammar());

            // Add a handler for the speech recognized event.
            recognizer.SpeechRecognized +=
                new EventHandler<SpeechRecognizedEventArgs>(Recognizer_SpeechRecognized);

            // Configure input to the speech recognizer.
            recognizer.SetInputToDefaultAudioDevice();

            // Start asynchronous, continuous speech recognition.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);
        }
    }

    // Handle the SpeechRecognized event.
    public void Recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // This is where I want to get.
        gui.setLabel("Recognized text: " + e.Result.Text);
    }
}
This class is instantiated and StartListening() is called from a form class (my GUI).
I never reach the method that handles the SpeechRecognized event. However, when I change
recognizer.RecognizeAsync(RecognizeMode.Multiple);
to
recognizer.Recognize();
the speech detection works (but only once, and it freezes my GUI). Why doesn't the async method work?
I've used this same code in a console program and it works perfectly.
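The likely culprit, judging from the code shown: RecognizeAsync() returns immediately, so the using block disposes the SpeechRecognitionEngine as soon as StartListening() returns, before it ever raises an event. Recognize(), by contrast, blocks until a single phrase is recognized, which is why it works exactly once and freezes the GUI while it waits (the console version presumably kept the using block alive, e.g. with a Console.ReadLine() inside it). A minimal sketch of the usual fix, keeping the engine as a field so it outlives the method; StopListening is a helper name added here for illustration:

public class SpeechListener
{
    GUI gui;
    // Keep the engine alive as a field instead of disposing it on the way out.
    SpeechRecognitionEngine recognizer;

    public SpeechListener(GUI gui) { this.gui = gui; }

    public void StartListening()
    {
        gui.setLabel("speech activated");
        recognizer = new SpeechRecognitionEngine(
            new System.Globalization.CultureInfo("en-US"));
        recognizer.LoadGrammar(new DictationGrammar());
        recognizer.SpeechRecognized += Recognizer_SpeechRecognized;
        recognizer.SetInputToDefaultAudioDevice();
        // Returns immediately; results arrive via events.
        recognizer.RecognizeAsync(RecognizeMode.Multiple);
    }

    public void Recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // If this fires on a non-UI thread, GUI.setLabel may need
        // to marshal the update with Invoke/BeginInvoke.
        gui.setLabel("Recognized text: " + e.Result.Text);
    }

    // Call this when the form closes, so the engine is still disposed.
    public void StopListening()
    {
        if (recognizer != null)
        {
            recognizer.RecognizeAsyncCancel();
            recognizer.Dispose();
            recognizer = null;
        }
    }
}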

Related

How to handle multiple speech recognition events in C#?

First off, I must admit that I'm fairly new to C#. I am using C# speech recognition to write an application that will serve as the interface between a human and a robot. An example dialogue is as follows:
Human: Ok Robot, Drill.
Robot: Where?
Human shows where.
Robot: I am ready to drill.
Human: Ok Robot, start.
My approach is to have two speech recognizers. The first one is for higher-level commands such as "Drill", "Cut Square", "Cut Rectangle", "Play Game", etc. The second one is for start/stop commands for each of the higher-level tasks.
This is my current code:
using System.IO;
using System.Speech.Recognition;
using System.Speech.Synthesis;

namespace SpeechTest
{
    class RobotSpeech
    {
        public SpeechRecognitionEngine MainRec = new SpeechRecognitionEngine();
        public SpeechRecognitionEngine SideRec = new SpeechRecognitionEngine();
        public SpeechSynthesizer Synth = new SpeechSynthesizer();

        private Grammar mainGrammar;
        private Grammar sideGrammar;

        private const string MainCorpusFile = @"../../../MainRobotCommands.txt";
        private const string SideCorpusFile = @"../../../SideRobotCommands.txt";

        public RobotSpeech()
        {
            Synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)");
            MainRec.SetInputToDefaultAudioDevice();
            SideRec.SetInputToDefaultAudioDevice();
            BuildGrammar('M');
            BuildGrammar('S');
        }

        private void BuildGrammar(char w)
        {
            var gBuilder = new GrammarBuilder("Ok Robot");
            switch (w)
            {
                case 'M':
                    gBuilder.Append(new Choices(File.ReadAllLines(MainCorpusFile)));
                    mainGrammar = new Grammar(gBuilder) { Name = "Main Robot Speech Recognizer" };
                    break;
                case 'S':
                    gBuilder.Append(new Choices(File.ReadAllLines(SideCorpusFile)));
                    sideGrammar = new Grammar(gBuilder) { Name = "Side Robot Speech Recognizer" };
                    break;
            }
        }

        public void Say(string msg)
        {
            Synth.Speak(msg);
        }

        public void MainSpeechOn()
        {
            Say("Speech recognition enabled");
            MainRec.LoadGrammarAsync(mainGrammar);
            MainRec.RecognizeAsync(RecognizeMode.Multiple);
        }

        public void SideSpeechOn()
        {
            SideRec.LoadGrammarAsync(sideGrammar);
            SideRec.RecognizeAsync();
        }

        public void MainSpeechOff()
        {
            Say("Speech recognition disabled");
            MainRec.UnloadAllGrammars();
            MainRec.RecognizeAsyncStop();
        }

        public void SideSpeechOff()
        {
            SideRec.UnloadAllGrammars();
            SideRec.RecognizeAsyncStop();
        }
    }
}
In my main program I handle the speech recognized event as follows:
private RobotSpeech voiceIntr;

// Subscribed during initialization (e.g. in the constructor):
voiceIntr.MainRec.SpeechRecognized += MainSpeechRecognized;

private void MainSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    if (!e.Result.Text.Contains("Ok Bishop")) return;
    switch (e.Result.Text.Substring(10))
    {
        case "Go Home":
            voiceIntr.Say("Going to home position.");
            UR10MoveRobot.GoHome();
            break;
        case "Say Hello":
            voiceIntr.Say("Hello. My name is Bishop.");
            break;
        case "Drill":
            voiceIntr.Say("Show me where you want me to drill.");
            // Actual code will observe the gestured points and
            // return the number of points observed.
            var msg = "I am ready to drill those " + new Random().Next(2, 5) + " holes.";
            voiceIntr.Say(msg);
            voiceIntr.SideSpeechOn();
            voiceIntr.SideSpeechOff();
            break;
        case "Cut Outlet":
            voiceIntr.Say("Show me where you want me to cut the outlet.");
            // Launch gesture recognition to get the point for cutting the outlet.
            break;
        case "Stop Program":
            voiceIntr.Say("Exiting Application");
            Thread.Sleep(2200);
            Application.Exit();
            break;
    }
}
The problem I am having is that when one of the MainRec events is triggered, I am inside one of these cases. At that point I want to listen only for "Ok Robot Start" and nothing else, which is handled by SideRec. If I subscribe to that event here, control will go to another event handler with its own switch statement, and from there I wouldn't know how to get back to the main thread.
Also, after telling the human that the robot is ready for drilling, I would like it to block until it receives an answer from the user, which requires a synchronous speech recognizer. However, after a particular task I want to switch the recognizer off, which I can't do if it's synchronous.
Here are the files for the grammars:
MainRobotCommands.txt:
Go Home
Say Hello
Stop Program
Drill
Start Drilling
Cut Outlet
Cut Shap
Play Tic-Tac-Toe
Ready To Play
You First
SideRobotCommands.txt:
Start
Stop
The speech recognition is only part of a bigger application, hence it has to be async unless I want it to block everything. I am sure there is a better way to design this code, but I'm not sure my knowledge of C# is enough for that. Any help is greatly appreciated!
Thanks.
My approach is to have two speech recognizers.
There is no need for two recognizers; you can have just one recognizer and load/unload grammars as you need them.
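For example, a sketch that reuses the engine and grammar fields from the question (while recognition is running, the *Async grammar calls are the safe ones; if changes must land mid-recognition, RequestRecognizerUpdate() is the documented way to queue them):

// One engine for the whole dialogue; swap grammars as the state changes.
MainRec.UnloadAllGrammars();
MainRec.LoadGrammarAsync(sideGrammar);   // now only "Ok Robot Start/Stop" is heard
// ... and when the sub-task is finished:
MainRec.UnloadAllGrammars();
MainRec.LoadGrammarAsync(mainGrammar);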
private void BuildGrammar(char w)
Using a switch statement to invoke the same function twice with different arguments is not a straightforward programming style. Just create the two grammars sequentially.
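A sketch of what that could look like, using the same fields and file constants as the question:

private void BuildGrammars()
{
    // Both grammars share the "Ok Robot" prefix; build them one after the other.
    var mainBuilder = new GrammarBuilder("Ok Robot");
    mainBuilder.Append(new Choices(File.ReadAllLines(MainCorpusFile)));
    mainGrammar = new Grammar(mainBuilder) { Name = "Main Robot Speech Recognizer" };

    var sideBuilder = new GrammarBuilder("Ok Robot");
    sideBuilder.Append(new Choices(File.ReadAllLines(SideCorpusFile)));
    sideGrammar = new Grammar(sideBuilder) { Name = "Side Robot Speech Recognizer" };
}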
The problem I am having is that when one of the MainRec events is triggered, I am inside one of these cases. At that point I want to listen only for "Ok Robot Start" and nothing else, which is handled by SideRec. If I subscribe to that event here, control will go to another event handler with its own switch statement, and from there I wouldn't know how to get back to the main thread.
If you have one recognizer, it's enough to have a single handler and do all the work there.
Also, after telling the human that the robot is ready for drilling, I would like it to block until it receives an answer from the user, which requires a synchronous speech recognizer. However, after a particular task I want to switch the recognizer off, which I can't do if it's synchronous.
It is not easy to mix async and sync styles in the same design; you need to use either sync programming or async programming throughout. For event-based software there is actually no need to mix: you can work in a strictly async paradigm without waiting. Just start the new drilling action inside the MainSpeechRecognized event handler when "ok drill" is recognized. You can also synthesize audio in async mode (SpeakAsync) without waiting for the result, and continue processing in the SpeakCompleted handler.
To track the state of your software you can create a state variable and check it in the event handlers to understand what state you are in and choose the next action.
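A minimal sketch of that idea; the enum and its states are invented here for illustration:

private enum RobotState { Idle, AwaitingDrillStart }
private RobotState state = RobotState.Idle;

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    string text = e.Result.Text;
    switch (state)
    {
        case RobotState.Idle:
            if (text == "Ok Robot Drill")
            {
                voiceIntr.Say("Show me where you want me to drill.");
                // From here on, only Start/Stop is meaningful.
                state = RobotState.AwaitingDrillStart;
            }
            break;

        case RobotState.AwaitingDrillStart:
            if (text == "Ok Robot Start")
            {
                // start drilling ...
                state = RobotState.Idle;
            }
            else if (text == "Ok Robot Stop")
            {
                // abort ...
                state = RobotState.Idle;
            }
            break;
    }
}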
This programming paradigm is called "event-driven programming"; you can read a lot about it online. Check the Wikipedia page and start with a basic tutorial.

SpeechRecognitionEngine RecognizeCompleted event doesn't fire

I have been searching for this problem for a while but I didn't find anything helpful.
I want to perform continuous speech recognition from a wav file using System.Speech.Recognition, but I want to return the result only when the recognition engine has finished its work.
I found a suggestion to use the RecognizeCompleted event handler, but it doesn't fire at all.
Recognition code:
SpeechRecognitionEngine speechRecognitionEngine = null;
bool completed = false;

// create the engine (helper that creates and configures it, not shown)
speechRecognitionEngine = createSpeechEngine("en-GB");

// hook up the events
speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(engine_SpeechRecognized);
speechRecognitionEngine.RecognizeCompleted += new EventHandler<RecognizeCompletedEventArgs>(recognizer_RecognizeCompleted);

// load the dictation grammar
speechRecognitionEngine.LoadGrammar(new DictationGrammar());
speechRecognitionEngine.SetInputToWaveFile("C:\\Converted Audio\\BBC1 cut sport.wav");

// start continuous speech recognition
speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);

while (!completed)
{
    Thread.Sleep(333);
}
Event handler code:
void recognizer_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
{
    completed = true;
}
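For what it's worth, two things stand out in this snippet. First, if this code runs on a UI thread, the polling loop itself can be the problem: System.Speech dispatches its events through the current SynchronizationContext (via AsyncOperationManager), so blocking that thread with Sleep can keep RecognizeCompleted and SpeechRecognized from ever being delivered. Second, even on a worker thread, polling a non-volatile bool is fragile. A ManualResetEvent is the more idiomatic way to wait; a sketch, assuming the rest of the setup stays as above:

// Requires: using System.Threading;
var done = new ManualResetEvent(false);

speechRecognitionEngine.RecognizeCompleted += (s, e) => done.Set();
speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);

// Blocks this thread until RecognizeCompleted fires. With a finite wave
// file, that should happen once the stream has been read to the end,
// even in RecognizeMode.Multiple.
done.WaitOne();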

issue with vs2010 C# windows voice recognition

I have run into an interesting issue with my voice recognition code in C#. I have had this code working before, but I migrated it to another project and it just won't work. I must be missing something, because there are no errors or warnings about the speech recognition, and I do have the reference to System.Speech. Here is the main function:
static void Main(string[] args)
{
    Program prgm = new Program();
    string[] argument = prgm.readConfigFile();

    if (argument[2].ToLower().Contains("true"))
    {
        recognizer = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));
        recognizer.LoadGrammar(new DictationGrammar());
        recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized);
        recognizer.SetInputToDefaultAudioDevice();
        recognizer.RecognizeAsync(RecognizeMode.Multiple);
    }

    prgm._con.updateConsole(argument, prgm._list);
}

static void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    Console.WriteLine(e.Result.Text);
}
along with the recognizer:
recognizer = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));
I did add using System.Speech at the top of my code. Whenever I start talking, the event handler should fire, but it never gets hit (checked with a breakpoint). What am I doing wrong?
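One classic gotcha worth ruling out here: RecognizeAsync() returns immediately, so if prgm._con.updateConsole(...) ever returns, Main exits and the recognizer dies with the process. It is also worth checking with the debugger that argument[2] really contains "true", since the whole setup block is skipped otherwise. A minimal guard, assuming nothing else keeps the process alive:

recognizer.RecognizeAsync(RecognizeMode.Multiple);
// ...
prgm._con.updateConsole(argument, prgm._list);

// Keep Main (and with it the recognizer) alive until the user quits.
Console.ReadLine();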

Speech to Text C# Train For Better Translation

I need a way to make the speech-to-text smarter, as it gets many of the words wrong in the transcription. I cannot find much help on adding a list of words, not commands or a grammar, but vocabulary, to help it transcribe an audio recording more accurately.
Here is the code I found on the web, and it works, but I need a way to train the engine or otherwise make it smarter. Any ideas?
Thanks.
static void Main(string[] args)
{
    // Create an in-process speech recognizer for the en-US locale.
    using (SpeechRecognitionEngine recognizer =
        new SpeechRecognitionEngine(
            new System.Globalization.CultureInfo("en-US")))
    {
        // Create and load a dictation grammar.
        recognizer.LoadGrammar(new DictationGrammar());

        // Add a handler for the speech recognized event.
        recognizer.SpeechRecognized +=
            new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized);

        // Configure input to the speech recognizer.
        recognizer.SetInputToWaveFile(@"c:\test2.wav");

        // Start asynchronous, continuous speech recognition.
        recognizer.RecognizeAsync(RecognizeMode.Multiple);

        // Keep the console window open.
        while (true)
        {
            Console.ReadLine();
        }
    }
}

// Handle the SpeechRecognized event.
static void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    Console.WriteLine("Recognized text: " + e.Result.Text);
    using (System.IO.StreamWriter file = new System.IO.StreamWriter(@"C:\WriteLines2.txt", true))
    {
        file.WriteLine(e.Result.Text);   // append the recognized text to the log file
    }
}
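As far as I know, DictationGrammar has no public API for adding vocabulary or training it programmatically; the in-process engine uses the current user's Windows recognition profile, so running the speech-training wizard in the Windows control panel can improve dictation accuracy. A common in-code workaround is to load a small domain-specific grammar alongside the dictation grammar, since phrases matching a loaded Grammar tend to beat plain dictation. A sketch; the word list here is hypothetical:

// Load domain vocabulary alongside the dictation grammar; both stay active.
var domainWords = new Choices(new[] { "amoxicillin", "hematoma", "tachycardia" });
var domainGrammar = new Grammar(new GrammarBuilder(domainWords)) { Name = "domainTerms" };

recognizer.LoadGrammar(new DictationGrammar());
recognizer.LoadGrammar(domainGrammar);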

Programmatically turn off the automation features of windows speech recognition?

I'm making a program that uses the System.Speech namespace (it's a simple program that will launch movies). I load all of the filenames from a folder and add them to the grammars I want to use. It's working remarkably well; however, there is a hitch: I DON'T want Windows Speech Recognition to interact with Windows at all (i.e., when I say "start", I don't want the Start menu to open; I don't want anything to happen).
Likewise, I have a listbox for the moment that lists all of the movies found in the directory. When I say the show/movie that I want to open, the program doesn't recognize that the name was said, because Windows Speech Recognition selects the listbox item from the list instead of passing it to my program.
The recognition is otherwise working, because I have words like "stop", "play", and "rewind" in the grammar, and when I catch listener_SpeechRecognized, it correctly knows the word(s)/phrase that I'm saying (and currently just types it in a textbox).
Any idea how I might be able to do this?
I'd use the SpeechRecognitionEngine class rather than the SpeechRecognizer class. This creates a speech recognizer that is completely disconnected from Windows Speech Recognition.
private bool Status = false;
SpeechRecognitionEngine sre = new SpeechRecognitionEngine();

// The command vocabulary for the in-process recognizer.
Choices dic = new Choices(new String[] {
    "word1",
    "word2",
});

public Form1()
{
    InitializeComponent();

    // Build and load a grammar from the word list.
    Grammar gmr = new Grammar(new GrammarBuilder(dic));
    gmr.Name = "myGMR";
    sre.LoadGrammar(gmr);

    // Listen continuously on the default microphone; this engine is
    // private to the app and never touches Windows Speech Recognition.
    sre.SpeechRecognized +=
        new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);
    sre.SetInputToDefaultAudioDevice();
    sre.RecognizeAsync(RecognizeMode.Multiple);
}

// Toggles the UI state; hook recognition start/stop in here as needed.
private void button1_Click(object sender, EventArgs e)
{
    if (Status)
    {
        button1.Text = "START";
        Status = false;
        stslable.Text = "Stopped";
    }
    else
    {
        button1.Text = "STOP";
        Status = true;
        stslable.Text = "Started";
    }
}

public void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs ev)
{
    String theText = ev.Result.Text;
    MessageBox.Show(theText);
}
