I'm making a program that uses the System.Speech namespace (it's a simple program that will launch movies). I load all of the filenames from a folder and add them to the grammars I want to use. It's working remarkably well, but there is a hitch: I DON'T want Windows Speech Recognition to interact with Windows at all (i.e. when I say "start", I don't want the Start menu to open... I don't want anything to happen).
Likewise, I have a listbox for the moment that lists all of the movies found in the directory. When I say the show/movie that I want to open, the program isn't recognizing that the name was said because windows speech recognition is selecting the listboxitem from the list instead of passing that to my program.
The recognition is working otherwise, because I have words like "stop", "play", "rewind" in the grammar, and when I catch listener_SpeechRecognized, it will correctly know the word(s)/phrase that I'm saying (and currently just type it in a textbox).
Any idea how I might be able to do this?
I'd use the SpeechRecognitionEngine class rather than the SpeechRecognizer class. This creates a speech recognizer that is completely disconnected from Windows Speech Recognition.
private bool Status = false;
SpeechRecognitionEngine sre = new SpeechRecognitionEngine();
Choices dic = new Choices(new String[] {
"word1",
"word2",
});
public Form1()
{
InitializeComponent();
Grammar gmr = new Grammar(new GrammarBuilder(dic));
gmr.Name = "myGMR";
// My Dic
sre.LoadGrammar(gmr);
sre.SpeechRecognized +=
new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);
sre.SetInputToDefaultAudioDevice();
sre.RecognizeAsync(RecognizeMode.Multiple);
}
private void button1_Click(object sender, EventArgs e)
{
if (Status)
{
button1.Text = "START";
Status = false;
stslable.Text = "Stopped";
}
else {
button1.Text = "STOP";
Status = true;
stslable.Text = "Started";
}
}
public void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs ev)
{
String theText = ev.Result.Text;
MessageBox.Show(theText);
}
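For the movie launcher in the question, the word list would come from the folder instead of a hard-coded array. Below is a minimal sketch of that step, not the asker's actual code: the class name MovieGrammar, both method names, and the separator clean-up are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Speech.Recognition;

public static class MovieGrammar
{
    // Turn file paths into speakable titles: drop the directory and extension,
    // replace '.' and '_' with spaces, skip empty names and case-insensitive duplicates.
    public static string[] TitlesFrom(IEnumerable<string> paths)
    {
        return paths
            .Select(p => Path.GetFileNameWithoutExtension(p))
            .Select(n => n.Replace('.', ' ').Replace('_', ' ').Trim())
            .Where(n => n.Length > 0)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }

    // Build one grammar whose alternatives are the titles found in a folder.
    public static Grammar FromFolder(string folder)
    {
        string[] titles = TitlesFrom(Directory.GetFiles(folder));
        if (titles.Length == 0)
        {
            // A grammar with no alternatives is not useful, so fail early.
            throw new InvalidOperationException("No movie files found in " + folder);
        }

        var grammar = new Grammar(new GrammarBuilder(new Choices(titles)));
        grammar.Name = "movies"; // hypothetical name, only used to identify the grammar later
        return grammar;
    }
}
```

The returned grammar can be passed to sre.LoadGrammar(...) in place of gmr in the constructor above. Because the engine is in-process, saying a title should reach sre_SpeechRecognized rather than selecting the list box item.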
Related
I'm using c# and the System.Speech.Recognition to load a couple of simple grammars I defined. When I say phrases matching the grammars, the engine recognizes the grammar correctly with confidences around 0.95.
But when I pronounce words that are not even in the grammar (even words from different languages, or gibberish), the engine randomly returns a match to a grammar, with text I never pronounced, and still with a high confidence such as 0.92.
Is there something I need to set in the SpeechRecognitionEngine object or in each Grammar object to avoid this problem?
I think I found a solution that works for me, but it would still be nice to find a more elegant one if it exists:
I define a dictation grammar and a "placeholder" grammar. Then I load my own grammars and disable them immediately.
using System.Speech.Recognition;
...
private DictationGrammar dictationGrammar;
private Grammar placeholderGrammar;
private List<Grammar> commands;
public void Initialize()
{
dictationGrammar = new DictationGrammar();
recognizer.LoadGrammarAsync(dictationGrammar);
var builder = new GrammarBuilder();
builder.Append("MYPLACEHOLDER");
placeholderGrammar = new Grammar(builder);
recognizer.LoadGrammarAsync(placeholderGrammar);
commands = new List<Grammar>();
foreach (var grammar in grammarManager.GetGrammars())
{
commands.Add(grammar);
grammar.Enabled = false;
recognizer.LoadGrammarAsync(grammar);
}
}
Then, in the speechRecognized handler, I put the logic: if the placeholder is recognized, enable the commands. If a command is recognized, re-enable the dictation and placeholder grammars and disable all commands:
private async void speechRecognized(object sender, SpeechRecognizedEventArgs e)
{
if (e.Result.Grammar == placeholderGrammar)
{
//go to command mode
placeholderGrammar.Enabled = false;
dictationGrammar.Enabled = false;
foreach (var item in commands)
item.Enabled = true;
}
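else if (commands.Any(x => e.Result.Grammar == x))
{
Do_something_with_recognized_command("!!");
//go back to normal mode: disable the commands, re-enable dictation and the placeholder
foreach (var item in commands)
item.Enabled = false;
placeholderGrammar.Enabled = true;
dictationGrammar.Enabled = true;
}
else
{
//this is dictation: nothing to do
}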
}
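A simpler variant of the same idea is sketched below. It is not the answerer's method: it assumes it is acceptable to keep the dictation grammar enabled all the time as a "sink" for out-of-grammar speech, and to ignore any result that matched it. The class name CommandFilter, the Attach helper, and the 0.8 cut-off are hypothetical, and whether this rejects gibberish as well as the placeholder approach would need testing on your own recognizer.

```csharp
using System;
using System.Speech.Recognition;

public static class CommandFilter
{
    // Assumed cut-off; tune it against your own recordings.
    public const float MinConfidence = 0.8f;

    // A result counts as a command only if it did not come from the
    // dictation grammar and its confidence reaches the cut-off.
    public static bool IsCommand(Grammar matched, float confidence)
    {
        if (matched is DictationGrammar)
        {
            return false;
        }
        return confidence >= MinConfidence;
    }

    // Load a dictation grammar next to the command grammars that are already
    // loaded, and forward only results that pass IsCommand.
    public static void Attach(SpeechRecognitionEngine engine, Action<string> onCommand)
    {
        engine.LoadGrammar(new DictationGrammar());
        engine.SpeechRecognized += (sender, e) =>
        {
            if (IsCommand(e.Result.Grammar, e.Result.Confidence))
            {
                onCommand(e.Result.Text);
            }
        };
    }
}
```

The design choice here is to trade the two-step "placeholder, then command" flow for a single filter, at the cost of dictation competing with the commands on every utterance.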
I'd like to write the speech a user says to text. Can I do this with the Microsoft Speech Platform? Perhaps I'm just misunderstanding how it's supposed to work and what its intended use case is.
I've got this console application now:
static void Main(string[] args)
{
Choices words = new Choices();
words.Add(new string[] { "test", "hello" ,"blah"});
GrammarBuilder gb = new GrammarBuilder();
gb.Append(words);
Grammar g = new Grammar(gb);
SpeechRecognitionEngine sre = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));
sre.LoadGrammar(g);
sre.SetInputToDefaultAudioDevice();
//add listeners
sre.Recognize();
Console.ReadLine();
}
And it only seems to output the words that I specify in Choices.
Would I have to add an entire dictionary of words if I wanted to match (most) of what a user will say?
Furthermore it stops right after it matches a single word. What if I wanted to capture entire sentences?
I'm looking for solutions for A) Capturing a wide array of words, and B) capturing more than one word at once.
Edit:
I found this: http://www.codeproject.com/Articles/483347/Speech-recognition-speech-to-text-text-to-speech-a#torecognizeallspeech
As seen in this page, the DictationGrammar class has a basic library of common words.
To capture more than one word at once I did
sre.RecognizeAsync(RecognizeMode.Multiple);
So my code is now this:
public static SpeechRecognitionEngine sre;
static void Main(string[] args)
{
sre = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));
sre.LoadGrammar(new Grammar(new GrammarBuilder("exit")));
sre.LoadGrammar(new DictationGrammar());
sre.SetInputToDefaultAudioDevice();
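sre.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);
// Start continuous recognition; without this call no events are raised.
sre.RecognizeAsync(RecognizeMode.Multiple);
Console.ReadLine();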
}
private static void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
if (e.Result.Text == "exit")
{
sre.RecognizeAsyncStop();
}
Console.WriteLine("You said: " + e.Result.Text);
}
Please have a look at the following code
private void button2_Click(object sender, EventArgs e)
{
SpeechRecognizer sr = new SpeechRecognizer();
Choices colors = new Choices();
colors.Add(new string[] { "red arrow", "green", "blue" });
GrammarBuilder gb = new GrammarBuilder();
gb.Append(colors);
Grammar g = new Grammar(gb);
sr.LoadGrammar(g);
// SpeechSynthesizer s = new SpeechSynthesizer();
// s.SpeakAsync("start speaking");
sr.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sr_SpeechRecognized);
}
void sr_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
MessageBox.Show(e.Result.Text);
}
This is normal speech recognition code that uses the MS speech engine. You can see that I have loaded a grammar. But there is an issue: it responds not only to the given grammar but also to the built-in MS speech commands, such as the commands to minimize a window, open the Start menu, etc.!
I really don't need that. My application should respond only to my grammar and not to the built-in MS commands. Is there a way I can achieve this?
The SpeechRecognizer object builds on top of the existing Windows Speech system. From MSDN:
Applications use the shared recognizer to access Windows Speech
Recognition. Use the SpeechRecognizer object to add to the Windows
speech user experience.
Consider using a SpeechRecognitionEngine object instead as this runs in-process rather than system-wide.
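As a rough sketch of that switch, here is the grammar from the question loaded into a SpeechRecognitionEngine. The class name ColorListener and the method names are made up for the example, and it is written as a console-style routine instead of a button handler to keep it short.

```csharp
using System;
using System.Speech.Recognition;

public static class ColorListener
{
    // Same phrases as in the question's button2_Click handler.
    public static Grammar BuildColorGrammar()
    {
        var colors = new Choices("red arrow", "green", "blue");
        return new Grammar(new GrammarBuilder(colors));
    }

    public static void Run()
    {
        using (var sre = new SpeechRecognitionEngine())
        {
            sre.LoadGrammar(BuildColorGrammar());
            sre.SpeechRecognized += (sender, e) =>
                Console.WriteLine("Heard: " + e.Result.Text);

            // Unlike the shared SpeechRecognizer, the in-process engine must be
            // given an audio input and started explicitly.
            sre.SetInputToDefaultAudioDevice();
            sre.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening; press Enter to quit.");
            Console.ReadLine();
            sre.RecognizeAsyncCancel();
        }
    }
}
```

In a Windows Forms application the engine would normally be a field of the form, so that it is not disposed while the form is still listening.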
I want to use the Kinect SDK voice recognition to launch applications from the Metro UI.
For example, when I say the word "News", it should launch the News application from the Metro UI.
Thanks to everybody!
Regards!
First you need to make the connection with the audio stream and start listening:
private KinectAudioSource source;
private SpeechRecognitionEngine sre;
private Stream stream;
private void CaptureAudio()
{
this.source = KinectSensor.KinectSensors[0].AudioSource;
this.source.AutomaticGainControlEnabled = false;
this.source.EchoCancellationMode = EchoCancellationMode.CancellationOnly;
this.source.BeamAngleMode = BeamAngleMode.Adaptive;
RecognizerInfo info = SpeechRecognitionEngine.InstalledRecognizers()
.Where(r => r.Culture.TwoLetterISOLanguageName.Equals("en"))
.FirstOrDefault();
if (info == null) { return; }
this.sre = new SpeechRecognitionEngine(info.Id);
if(!isInitialized) CreateDefaultGrammars();
sre.LoadGrammar(CreateGrammars()); //Important step
this.sre.SpeechRecognized +=
new EventHandler<SpeechRecognizedEventArgs>
(sre_SpeechRecognized);
this.sre.SpeechHypothesized +=
new EventHandler<SpeechHypothesizedEventArgs>
(sre_SpeechHypothesized);
this.stream = this.source.Start();
this.sre.SetInputToAudioStream(this.stream, new SpeechAudioFormatInfo(
EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
this.sre.RecognizeAsync(RecognizeMode.Multiple);
}
Note the one important step in the sample, sre.LoadGrammar(CreateGrammars());, which creates and loads the grammar. You therefore have to write the CreateGrammars() method:
private Grammar CreateGrammars()
{
var KLgb = new GrammarBuilder();
KLgb.Culture = sre.RecognizerInfo.Culture;
KLgb.Append("News");
return new Grammar(KLgb);
}
The above sample creates a grammar that listens for the word "News". Once it is recognized (that is, the probability that the spoken word is one in your grammar is higher than a threshold), the speech recognition engine (sre) raises the SpeechRecognized event.
Of course you need to add proper handlers for the two events (SpeechHypothesized and SpeechRecognized):
private void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
this.RecognizedWord = e.Result.Text;
if (e.Result.Confidence > 0.65) InterpretCommand(e);
}
Now all you have to do is write the InterpretCommand method, which does whatever you want (such as running a Metro app ;) ). If you have multiple words in your grammar, the method has to parse the recognized text and verify that it was the word "News" that was recognized.
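One way to structure that check is a phrase-to-action table. The sketch below is only an illustration: CommandInterpreter, Register and Interpret are hypothetical names, and it takes the recognized text as a plain string instead of the SpeechRecognizedEventArgs used above, so that it does not depend on the Kinect or speech assemblies.

```csharp
using System;
using System.Collections.Generic;

public class CommandInterpreter
{
    // Phrase -> action table; lookups ignore case.
    private readonly Dictionary<string, Action> actions =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    public void Register(string phrase, Action action)
    {
        actions[phrase] = action;
    }

    // Returns true if a registered phrase matched and its action ran.
    public bool Interpret(string recognizedText)
    {
        Action action;
        if (recognizedText != null &&
            actions.TryGetValue(recognizedText.Trim(), out action))
        {
            action();
            return true;
        }
        return false;
    }
}
```

InterpretCommand could then reduce to a call such as interpreter.Interpret(e.Result.Text), with the action registered for "News" doing the actual launch. How the Metro app is started is left open here.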
Here you can download the samples from a great book on Kinect: Beginning Kinect Programming with the Microsoft Kinect SDK (unfortunately the book itself is not free). In the folder Chapter7\PutThatThereComplete\ there is a sample using audio that you can take inspiration from.
I am a beginner and I am badly stuck on this. My main aim is to control robots using speech. I started by making a grammar for my speech, and that part works. This is my code, written as a Windows Forms application:
using System; // EventArgs, EventHandler
using System.Speech.Recognition;
using System.Windows.Forms;
using System.Threading;
namespace WindowsFormsApplication1
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
// Create a new SpeechRecognizer instance.
sr = new SpeechRecognizer();
// Create a simple grammar that recognizes "red", "green", or "blue".
Choices colors = new Choices();
colors.Add("red");
colors.Add("green");
colors.Add("blue");
colors.Add("white");
GrammarBuilder gb = new GrammarBuilder();
gb.Append(colors);
// Create the actual Grammar instance, and then load it into the speech recognizer.
Grammar g = new Grammar(gb);
sr.LoadGrammar(g);
// Register a handler for the SpeechRecognized event.
sr.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sr_SpeechRecognized);
}
// Simple handler for the SpeechRecognized event.
private void sr_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
MessageBox.Show(e.Result.Text);
}
private SpeechRecognizer sr;
}
}
With this code, when I say "red" I get "red" in a message box. Now I want to control motors, so I need to communicate with my robot. With help from the internet I made a console application that sends data to my servo controller (an SSC-32). The code for that is:
using System; // Console, TimeoutException
using System.IO.Ports;
using System.Threading;
namespace cConsoleAppMonitorServoCompletion
{
class Program
{
static SerialPort _serialPort;
static void Main(string[] args)
{
try
{
_serialPort = new SerialPort();
_serialPort.PortName = "COM3";
_serialPort.Open();
_serialPort.Write("#27 P1600 S750\r");
Console.WriteLine("#27 P1600 S750\r"); // echo the command that was just sent
string output;
output = "";
//Example: "Q <cr>"
//This will return a "." if the previous move is complete, or a "+" if it is still in progress.
while (!(output == ".")) //loop until you get back a period
{
_serialPort.Write("Q \r");
output = _serialPort.ReadExisting();
Console.WriteLine(output);
Thread.Sleep(10);
}
_serialPort.Close();
}
catch (TimeoutException) { }
}
}
}
Now, when I say "red", instead of showing a message box I want to send a serial command such as _serialPort.Write("#27 P1600 S750\r");
Please help. I have tried but was not successful. I would appreciate a detailed answer, since I am just a beginner. Thanks in advance.
Controlling a robot using voice recognition... an ambitious project for a starter! There could be a million things going wrong here.
Just as important as the ability to write code is the ability to debug it. What can you tell us further - which parts work, which parts don't? Have you single-stepped through the code to see what happens and when, to diagnose where things start to go wrong?
You could also try some debugging output (Console.WriteLine, for example) so you can see the state of variables and the flow of the code as it runs.
It looks like you need to use System.Diagnostics.Process.Start
This page has an example: "How to execute console application from windows form?"
// Simple handler for the SpeechRecognized event.
private void sr_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
System.Diagnostics.Process.Start(@"cmd.exe", @"/k c:\path\my.exe");
}
An ambitious starter project indeed!
Update
private bool LaunchApp(String sAppPath, String sArg)
{
bool bSuccess = false;
try
{
//create a new process
Process myApp = new Process();
myApp.StartInfo.FileName = sAppPath;
myApp.StartInfo.Arguments = sArg;
bSuccess = myApp.Start();
}
catch (Win32Exception e)
{
MessageBox.Show("Error Details: " + e.Message);
}
return bSuccess;
}
If "when I speak red, instead of showing a message box I want to send a serial command" just means calling _serialPort.Write("#27 P1600 S750\r"); instead of showing a message box (i.e. MessageBox.Show(e.Result.Text);), then the task is really simple. Just copy and paste that code into the handler, and add using System.IO.Ports; so that you can work with ports.
So your code will probably look like this:
private void sr_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
//MessageBox.Show(e.Result.Text);
try
{
_serialPort = new SerialPort();
_serialPort.PortName = "COM3";
_serialPort.Open();
_serialPort.Write("#27 P1600 S750\r");
Console.WriteLine("#27 P1600 S750\r"); // echo the command that was just sent
string output;
output = "";
//Example: "Q <cr>"
//This will return a "." if the previous move is complete, or a "+" if it is still in progress.
while (!(output == ".")) //loop until you get back a period
{
_serialPort.Write("Q \r");
output = _serialPort.ReadExisting();
Console.WriteLine(output);
Thread.Sleep(10);
}
_serialPort.Close();
}
catch (TimeoutException) { }
}
P.S.
If you don't understand how the SerialPort class works, see its documentation on MSDN.
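The handler above sends the same command whatever word was recognized. If each color should move the servo differently, a small lookup table is one option. This is only a sketch: the class name ServoCommands and the method TryGetCommand are hypothetical, only the "red" command comes from the question, and the "green" and "blue" pulse widths are made-up placeholders that must be replaced with values suited to your servo.

```csharp
using System;
using System.Collections.Generic;

public static class ServoCommands
{
    // Recognized word -> SSC-32 command string; lookups ignore case.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "red",   "#27 P1600 S750\r" }, // command taken from the question
            { "green", "#27 P1500 S750\r" }, // hypothetical placeholder
            { "blue",  "#27 P1400 S750\r" }, // hypothetical placeholder
        };

    // Returns true and sets command if the word has a mapping.
    public static bool TryGetCommand(string word, out string command)
    {
        command = null;
        return word != null && Map.TryGetValue(word, out command);
    }
}
```

Inside sr_SpeechRecognized this would be used as: string cmd; if (ServoCommands.TryGetCommand(e.Result.Text, out cmd)) _serialPort.Write(cmd);. Opening the port once (for example in Form1_Load) and keeping it in a field avoids reopening COM3 on every recognized word.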