Azure Cognitive Speech-to-text DetailedSpeechRecognitionResult is not detecting explicit punctuation


When I use the detailed output format to read the confidence values, explicit punctuation does not work. For example, when I say "question mark" it types the words "question mark" instead of "?", and when I say "period" it types "period" instead of ".". I have added a checkbox; when it is ticked, punctuation should be turned on.
SpeechConfig config = SpeechConfig.FromSubscription("key", "region");
config.OutputFormat = OutputFormat.Detailed;
if (Properties.Settings.Default.Punctuation)
{
    config.SetServiceProperty("punctuation", "explicit", ServicePropertyChannel.UriQueryParameter);
}
recognizer = new SpeechRecognizer(config);
recognizer.Recognized += SpeechRecognizer_Recognized;
...
private void SpeechRecognizer_Recognized(object sender, SpeechRecognitionEventArgs e)
{
    if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        if (e.Result.Text.ToLower().Equals("new line") || e.Result.Text.ToLower().Equals("newline"))
        {
            SendKeys.SendWait(Environment.NewLine);
        }
        else
        {
            var detailedResults = e.Result.Best();
            if (detailedResults != null && detailedResults.Any())
            {
                var bestResults = detailedResults?.ToList()[0];
                foreach (var word in bestResults.Words)
                {
                    double per = word.Confidence * 100;
                    SendKeys.SendWait($"{word.Word} [{per:0.##}] ");
                }
            }
        }
    }
}

What you are observing is by design. In most circumstances it is not necessary, or even helpful, to inspect the details of a recognized speech result. It looks like you have misinterpreted how to use the details.
You may not realise it, but your own example of detecting "new line" or "newline" as a key phrase, and interpreting that as a request to inject a line feed into the output, is the very same process at work.
For punctuation to be detected in the speech, the first thing that the classifier must do is resolve the words. Only after a word has been resolved can the service post-process the results to classify that word as a natural word or as punctuation.
The process is a bit like this:
1. The word "comma" is detected with high confidence.
2. If the punctuation setting is set to explicit: is the word on its own, or at the end of a recognized sequence that was followed by a pause?
3. If yes, interpret it as "," and not "comma".
For this reason it is important to understand that when the punctuation setting is set to explicit, the punctuation must be isolated out of the normal sentence cadence of the spoken text.
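The idea can be illustrated with a small sketch. This is not how the service is implemented; it is only a toy model of the steps above, assuming the audio has already been split into phrases at each pause. The class name `ExplicitPunctuationSketch`, the `Render` method and the phrase-to-symbol table are all made up for illustration, and the sketch only handles the "word on its own" case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExplicitPunctuationSketch
{
    // Hypothetical table of spoken phrases and their symbols.
    // The real service has its own (undocumented here) mapping.
    private static readonly Dictionary<string, string> Symbols =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "comma", "," },
            { "period", "." },
            { "full stop", "." },
            { "question mark", "?" },
            { "exclamation mark", "!" },
        };

    // Each element of 'phrases' is a run of words spoken between two pauses.
    // A phrase becomes a symbol only when the whole phrase is a punctuation name.
    public static string Render(IEnumerable<string> phrases)
    {
        var parts = phrases.Select(p =>
        {
            string trimmed = p.Trim();
            string symbol;
            return Symbols.TryGetValue(trimmed, out symbol) ? symbol : trimmed;
        });
        return string.Join(" ", parts);
    }
}
```

With this model, "comma" inside a longer phrase stays a word, while "comma" spoken on its own between pauses becomes ",".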
Read this as a sentence with a constant pace without punctuation:
this is a sentence that doesn't have a comma or a full stop but an exclamation mark would look nice
If you read fast and fluently enough, there should be no punctuation in the output, even if the words were recognized with high confidence. To get punctuation into the same text, you actually need to read this script:
This is a sentence that doesn't have a comma.
Comma.
Or a fullstop.
Comma.
But an exclamation mark would look nice.
exclamation mark.
For me, reading that script produced this output:
This is a sentence that doesn't have a comma , or a full stop , but an exclamation mark would look nice !
The per-word analysis for my test looks like this:
word          confidence
this          85.99%
is            95.93%
a             68.49%
sentence      96.99%
that          90.03%
doesn't       96.75%
have          94.57%
a             87.88%
comma         94.58%
comma         94.34%
or            67.14%
a             64.68%
fullstop      77.63%
comma         94.90%
but           91.17%
an            62.65%
exclamation   98.44%
mark          68.58%
would         86.15%
look          91.58%
nice          97.40%
exclamation   97.05%
mark          96.61%
Notice that the words representing the punctuation all have a high confidence rating, but in the output not all of the words were actually interpreted as punctuation. This might be clearer in this screenshot where I have highlighted two commas that are in the output, but are correctly identified as words:
In this screenshot, the panel on the left is populated with e.Result.Text and the panel on the right with the Word and Confidence.
DetailedSpeechRecognitionResult.Words
Returns the Word level timing result list.
The Words list is designed to map each recognised word back to a specific offset and duration in the audio that was submitted for analysis. You would use this information when testing and training the model, or if you wanted to display the text as subtitles for an audio or video clip. Punctuation is not shown at this level. The list is purely about timing: all it does is literally transcribe the spoken audio into English vocabulary. It is the responsibility of other analytical functions to use this information to determine which detected words might represent punctuation, or to determine context or sentiment.
FWIW this is my Recognized event handler:
recognizer.Recognized += (s, e) =>
{
    // Checks result.
    if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
        string text = e.Result.Text;
        if (e.Result.Text.ToLower().Equals("new line") || e.Result.Text.ToLower().Equals("newline"))
            text = Environment.NewLine;
        // update the left textbox
        this.BeginInvoke(SetText, textBox1, text);
        var detailedResults = e.Result.Best();
        if (detailedResults != null && detailedResults.Any())
        {
            var bestResults = detailedResults?.ToList()[0];
            foreach (var word in bestResults.Words)
            {
                // update the right textbox; the p2 format prints the confidence as a percentage
                this.BeginInvoke(SetText, textBox2, $"{word.Word} [{word.Confidence:p2}] " + Environment.NewLine);
            }
        }
    }
    else if (e.Result.Reason == ResultReason.NoMatch)
    {
        Console.WriteLine($"NOMATCH: Speech could not be recognized.");
    }
};
...
delegate void SetTextDelegate(TextBox textBox, string text);
private SetTextDelegate SetText = delegate (TextBox textbox, string text)
{
    textbox.Text += " " + text;
};

Using cognitive services I cannot reproduce your issue. Setting the config.OutputFormat = OutputFormat.Detailed or config.RequestWordLevelTimestamps(); does not affect the explicit punctuation recognition.
What is not clear from your example is the current state of your setting. When logic is toggled by a setting, and the observed behaviour stays the same even after the setting value changes, the first thing to check is the setting value itself.
Please try to comment out your logic to toggle the punctuation like this:
//if (Properties.Settings.Default.Punctuation)
{
    config.SetServiceProperty("punctuation", "explicit", ServicePropertyChannel.UriQueryParameter);
}
If this solves it then there are two considerations:
1. What is the initial state of the Properties.Settings.Default.Punctuation setting? Is your application logic not updating the value when you expect it to? Any mutating logic that affects that setting may need to call Properties.Settings.Default.Save() to save changes. Depending on where your mutating logic executes from, you might also need to call Properties.Settings.Default.Reload() to ensure that the current values are loaded from the store. This is not usually required if you are operating in the same thread space, which you most likely will be in WinForms.
2. Is the config loaded once, and is that once before the setting value has been toggled? That step in the workflow is unclear from your description and the code example. If you are using continuous recognition, or you are creating a single instance of SpeechRecognizer for the lifetime of your Form, then changes to your setting will not be applied to the speech configuration.
You will need to re-initialize the SpeechRecognizer as part of the logic that handles the setting-changed event, or have some other routine in the speech event handlers that detects a change in this setting and restarts the SpeechRecognizer connection and process.
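As a rough sketch of that re-initialization, the recognizer can be kept behind a small host that rebuilds it whenever the punctuation flag changes. `RecognizerHost` and its `factory` delegate are hypothetical names, not part of the Speech SDK. In a real application the factory would build a new SpeechConfig (with or without the "punctuation" service property), create the SpeechRecognizer, attach the Recognized handler and start recognition again.

```csharp
using System;

// Hypothetical helper: owns the current recognizer and rebuilds it
// when the punctuation setting changes.
public sealed class RecognizerHost : IDisposable
{
    private readonly Func<bool, IDisposable> _factory;
    private IDisposable _current;
    private bool _punctuation;

    // 'factory' receives the punctuation flag and returns a ready-to-use recognizer.
    public RecognizerHost(Func<bool, IDisposable> factory, bool punctuation)
    {
        _factory = factory;
        _punctuation = punctuation;
        _current = factory(punctuation);
    }

    // Call this from the checkbox / setting-changed handler.
    public void SetPunctuation(bool enabled)
    {
        if (enabled == _punctuation) return; // nothing changed, keep the recognizer

        _punctuation = enabled;
        _current.Dispose();                  // drop the recognizer built with the old config
        _current = _factory(enabled);        // build a new one from a fresh config
    }

    public void Dispose()
    {
        _current.Dispose();
    }
}
```

The point of the design is that the configuration is only read at construction time, so the only reliable way to apply a changed setting is to construct the recognizer again.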

Related

Is there a way to bridge empty lines with Substring?

With the following code, I delete a TextBox line by line (0, 39). Each line ends with a money amount (1 any Articel 10.00), which I want to deduct from the total amount. For that, I use Substring, but I get errors, probably because the empty spaces are not interpreted. Is there a simple solution to this? Thanks
private void btnDelete_Click(object sender, EventArgs e)
{
    if (TextBox1.Text.Length > 0)
    {
        txtTotal.Text = (Double.Parse(txtTotal.Text) - Double.Parse(TextBox1.Text.Substring(8, 2))).ToString("0.00");
        TextBox1.Text = TextBox1.Text.Remove(0, 39);
    }
    if (TextBox1.Text.Length == 0)
    {
        MessageBox.Show("The cart is empty");
    }
}

Few things you can do to make your life easier (assuming you have to keep a TextBox as you have stated to others.)
Before I get into the details however, the issue seems to be you're having trouble parsing text that represents lines of data, data which contains amounts which you want to act on. If this is an incorrect assumption, disregard this answer.
Ok, back to it...
1. Rather than trying to work with the text directly in the TextBox, start by reading in your entire string as a list of lines (i.e. List<String>). You can do that with the Split function or with regular expressions. See here
2. Use a regular expression on each line to identify not just the type of line it is (an 'item' line or the 'All' line at the bottom) but also the various parts of that line. For instance, you can use a regular expression that starts at the end of the line and goes backwards looking for a number (in the form of a string). Use the result of that for your Parse method to get the actual numeric value.
3. Finally, if you still need to remove the lines of text (I'm not sure if you're removing the text just for your logic or if you need to display it), simply remove them from your list of strings. If the text needs to be displayed back in the UI (doubtful, as it seems it should be blank at the end of processing), just use Join to convert the lines back to a string, then set that back to the TextBox.Text property.
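A minimal sketch of those three steps follows. The class `CartText` and its method names are made up for illustration, and it assumes that every item line ends with an amount written with a dot as the decimal separator (as in "10.00").

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class CartText
{
    // Matches a number such as "10.00" at the very end of a line.
    private static readonly Regex TrailingAmount = new Regex(@"(\d+(?:\.\d+)?)\s*$");

    // Step 1: split the TextBox content into lines.
    public static List<string> ToLines(string text)
    {
        return text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries).ToList();
    }

    // Step 2: read the amount at the end of a line, regardless of column position.
    public static bool TryGetAmount(string line, out decimal amount)
    {
        amount = 0m;
        Match m = TrailingAmount.Match(line);
        return m.Success && decimal.TryParse(
            m.Groups[1].Value, NumberStyles.Number, CultureInfo.InvariantCulture, out amount);
    }

    // Step 3: remove the first line, deduct its amount, and join the rest back together.
    public static string RemoveFirstLine(string text, decimal total, out decimal newTotal)
    {
        List<string> lines = ToLines(text);
        newTotal = total;
        if (lines.Count == 0) return string.Empty;

        decimal amount;
        if (TryGetAmount(lines[0], out amount)) newTotal = total - amount;

        lines.RemoveAt(0);
        return string.Join(Environment.NewLine, lines);
    }
}
```

In the button handler, the returned string would be assigned to TextBox1.Text and `newTotal` formatted into txtTotal.Text. Because the amount is found by pattern rather than by fixed offsets, the number of spaces in the line no longer matters.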
Hope this helps!
Mark
P.S. To (try and) avoid comments such as the ones you got about your design, it may help to start your question by saying something like 'Unfortunately I'm restricted to using a TextBox due to issues outside of this question, hence I'm looking for an answer here.' At least that should cut back on those responses telling you to 'Do it differently!' instead of answering your question.

Digit Grouping Symbol messes with number format

I have a WinForms application, in which I have a label, where I bind a decimal to. I have also added a .Format method to the binding. Like this:
// Add data binding for the label
Binding bindWithFormat = new Binding("Text", viewModel, nameof(viewModel.BindingNumber));
bindWithFormat.Format += viewModel.FormatAsNumber;
lblNumber.DataBindings.Add(bindWithFormat);

// Formatting function
public void FormatAsNumber(object sender, ConvertEventArgs e)
{
    // The method converts only to string type. Test this using the DesiredType.
    if (e.DesiredType != typeof(string)) return;
    // Formats the value with thousand separator and zero decimals
    e.Value = String.Format("{0:N0}", e.Value);
}
This works fine under normal circumstances, but if I choose a particular type of Digit Grouping Symbol, it looks like this (it is supposed to show "15 000 000"):
I first thought it happened when I used a blank space (" ") as the symbol, but when I explicitly type a blank space, it shows as intended. However, there is another symbol I can choose in the regional settings, which looks like a space, but it causes the above formatting error when selected (unlike when I explicitly type a space):
What the heck is going on? According to this website, the symbol is "no break space" (U+00A0). So it is a space. But not a space. And for some reason, it seriously messes up the formatting. What to do?
Bonus info: After playing around some more, it seems to only affect that specific font, that I was using (it only exists in my company). If I change fonts to e.g. Segoe UI, then the problem disappears.
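If changing the font is not an option, one possible workaround is to format with a copy of the culture's number format in which the no-break space is swapped for an ordinary space. This is only a sketch under that assumption; `GroupSeparatorSketch` and `FormatWithPlainSpace` are made-up names, and it handles only U+00A0, the character identified above.

```csharp
using System;
using System.Globalization;

public static class GroupSeparatorSketch
{
    // Formats 'value' as N0, replacing a no-break space group separator
    // with a regular space so that fonts lacking a U+00A0 glyph still render it.
    public static string FormatWithPlainSpace(decimal value, CultureInfo culture)
    {
        // Clone so the shared culture settings are not modified.
        var format = (NumberFormatInfo)culture.NumberFormat.Clone();
        if (format.NumberGroupSeparator == "\u00A0")
        {
            format.NumberGroupSeparator = " ";
        }
        return value.ToString("N0", format);
    }
}
```

Inside FormatAsNumber this could be called with CultureInfo.CurrentCulture in place of the String.Format call.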

(WinRT) How to get TextBox caret index?

I have a problem getting the caret index of a TextBox in a Windows Store app (WP 8.1).
I need to insert specific symbols into the text when a button is pressed.
I tried this:
text.Text = text.Text.Insert(text.SelectionStart, "~");
But this code inserts the symbol at the beginning of the text, not at the caret position.
UPDATE
I updated my code thanks to Ladi. But now I have another problem: I'm building an HTML editor app, so my default TextBlock.Text is: <!DOCTYPE html>\r\n<html>\r\n<head>\r\n</head>\r\n<body>\r\n</body>\r\n</html>
So, for example, when the user inserts a symbol on line 3, the symbol is inserted 2 characters before the caret; on line 4 it is inserted 3 characters before the caret, and so on. Inserting works properly when the symbol is inserted on the first line.
Here's my inserting code:
Index = HTMLBox.SelectionStart;
HTMLBox.Text = HTMLBox.Text.Insert(Index, (sender as AppBarButton).Label);
HTMLBox.Focus(Windows.UI.Xaml.FocusState.Keyboard);
HTMLBox.Select(Index+1,0);
So how can I solve this? I guess the new line characters are causing the trouble.

For your first issue I assume you changed the TextBox.Text before accessing SelectionStart. When you set the text.Text, text.SelectionStart is reset to 0.
Your second issue is related to new lines.
You could say that what you observe is by design. SelectionStart will count one "\r\n" as one character for reasons explained here (see Remarks section). On the other hand, the method string.Insert does not care about that aspect and counts "\r\n" as two characters.
You need to change your code slightly. You cannot use the value of SelectionStart as the insert position. You need to calculate the insert position, accounting for this behavior of SelectionStart.
Here is a verbose code sample with a potential solution.
// normalizedText will allow you to separate the text before
// the caret even without knowing how many new line characters you have.
string normalizedText = text.Text.Replace("\r\n", "\n");
string textBeforeCaret = normalizedText.Substring(0, text.SelectionStart);
// Now that you have the text before the caret you can count the new lines
// that need to be accounted for.
int newLineCount = textBeforeCaret.Count(c => c == '\n');
// Knowing the new lines you can calculate the insert position.
int insertPosition = text.SelectionStart + newLineCount;
text.Text = text.Text.Insert(insertPosition, "~");
Also, you should make sure that SelectionStart does not exhibit similar behavior with other combinations besides "\r\n". If it does, you will need to update the code above.
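The same calculation can be wrapped in a small helper so it can be checked without a UI. `CaretIndex`, `ToStringIndex` and `InsertAtCaret` are illustrative names, and the sketch assumes every line break in the text is "\r\n".

```csharp
using System;
using System.Linq;

public static class CaretIndex
{
    // Converts a SelectionStart value (which counts "\r\n" as one character)
    // into an index usable with string.Insert (which counts it as two).
    public static int ToStringIndex(string text, int selectionStart)
    {
        string normalized = text.Replace("\r\n", "\n");
        int caret = Math.Min(selectionStart, normalized.Length);
        int newLines = normalized.Substring(0, caret).Count(c => c == '\n');
        return caret + newLines;
    }

    public static string InsertAtCaret(string text, int selectionStart, string value)
    {
        return text.Insert(ToStringIndex(text, selectionStart), value);
    }
}
```

For the text "<html>\r\n<head>\r\n</head>", a caret placed just before "</head>" is reported as 14, and the helper maps it to string index 16 because two line breaks precede it.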

String.Replace and Regex.Replace not working with special characters inside IComparer

The following console app works fine:
class Program
{
    static void Main(string[] args)
    {
        string plainx = "‘test data’ random suffix";
        plainx = Regex.Replace(plainx, @"‘", string.Empty);
        Console.WriteLine(plainx);
    }
}
However, it's giving me trouble in an ASP.NET application. I am attaching a screenshot of the VS Debug watch window and Immediate window.
As you can see, the Regex.Replace in the Immediate Window works - but somehow it is not working in the code (line 71). I've also used String.Replace without success.
Edit
It seems the value that was stored in the DB is something other than what the editor shows, which is kind of weird.

Have you actually examined the text being compared? What Unicode code points does it contain?
Your code shows you trying to replace the glyph '‘', which is a left "smart quote". The character's name is LEFT SINGLE QUOTATION MARK and its code point is 0x2018 (aka '\u2018'). This is a character you can't ordinarily enter on a keyboard.
What you are probably seeing is the glyph '`', a "backtick". Its character name is GRAVE ACCENT and its code point is 0x0060 (aka '\u0060'). This is the character typed when you press the [unshifted] tilde key on a standard US keyboard (leftmost key on the number row).
It might, of course, be any of a number of other characters whose glyph is similar to a single quote. See Commonly Confused Characters for more information.
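One way to examine the text is to print each character's code in the debugger or a log. The sketch below is only an illustration; `CodePoints.Describe` is a made-up helper, and it lists UTF-16 code units, which is sufficient for the quote characters discussed here.

```csharp
using System;
using System.Linq;

public static class CodePoints
{
    // Returns the UTF-16 code units of 's' in "U+XXXX" notation, separated by spaces.
    public static string Describe(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }
}
```

Calling it on the first few characters of the stored value shows immediately whether the leading quote is U+2018, U+0060 or something else.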

The single quote in your code is not the same single quote in the string you are testing.
Use the hex value returned from testx[0] directly to guarantee that we are using the correct quote.
plainx = Regex.Replace(plainx, "\u2018", string.Empty);

Try replacing @"‘" with @"\‘".
Code:
string plainx = "‘test data’ random suffix";
plainx = System.Text.RegularExpressions.Regex.Replace(plainx, @"\‘", string.Empty);
Console.WriteLine(plainx);
Console.Read();

Determine POS tagging in English based on database files

I'm a little bit confused how to determine part-of-speech tagging in English. In this case, I assume that one word in English has one type, for example word "book" is recognized as NOUN, not as VERB. I want to recognize English sentences based on tenses. For example, "I sent the book" is recognized as past tense.
Description:
I have a number of database (*.txt) files: NounList.txt, verbList.txt, adjectiveList.txt, adverbList.txt, conjunctionList.txt, prepositionList.txt, articleList.txt. If the input words are available in the databases, I assume that the type of those words can be concluded. But how do I begin the lookup in the databases? For example, for "I sent the book": how do I search the databases for every word, "I" as noun, "sent" as verb, "the" as article, "book" as noun? Is there any better approach than searching every word in every database? I doubt that every database has unique elements.
I enclose my perspective here.
private List<string> ParseInput(String allInput)
{
    List<string> listSentence = new List<string>();
    char[] delimiter = ".?!;".ToCharArray();
    var sentences = allInput.Split(delimiter, StringSplitOptions.RemoveEmptyEntries).Select(s => s.Trim());
    foreach (var s in sentences)
        listSentence.Add(s);
    return listSentence;
}

private void tenseReviewMenu_Click(object sender, EventArgs e)
{
    string allInput = rtbInput.Text;
    List<string> listWord = new List<string>();
    List<string> listSentence = new List<string>();
    HashSet<string> nounList = new HashSet<string>(getDBList("nounList.txt"));
    HashSet<string> verbList = new HashSet<string>(getDBList("verbList.txt"));
    HashSet<string> adjectiveList = new HashSet<string>(getDBList("adjectiveList.txt"));
    HashSet<string> adverbList = new HashSet<string>(getDBList("adverbList.txt"));
    char[] separator = new char[] { ' ', '\t', '\n', ',' /* etc... */ };
    listSentence = ParseInput(allInput);
    foreach (string sentence in listSentence)
    {
        foreach (string word in sentence.Split(separator))
            if (word.Trim() != "")
                listWord.Add(word);
    }
    string testPOS = "";
    foreach (string word in listWord)
    {
        if (nounList.Contains(word.ToLowerInvariant()))
            testPOS += "noun ";
        else if (verbList.Contains(word.ToLowerInvariant()))
            testPOS += "verb ";
        else if (adjectiveList.Contains(word.ToLowerInvariant()))
            testPOS += "adj ";
        else if (adverbList.Contains(word.ToLowerInvariant()))
            testPOS += "adv ";
    }
    tbTest.Text = testPOS;
}
POS tagging is only a secondary part of my assignment, so I use a simple, database-based approach to determine the tags. But if there is a simpler approach to POS tagging (easy to use, easy to understand, easy to turn into pseudocode, easy to design), please let me know.

I hope the pseudocode I present below proves helpful to you. If I find time, I'd also write some code for you.
This problem can be tackled by following the steps below:
1. Create a dictionary of all the common sentence patterns in the English language. For example, Subject + Verb is an English pattern, and sentences like I sleep, Dog barked and Ship will arrive all match the S-V pattern. You can find a list of the most common English patterns here. Please note that for some time you may need to keep revising this dictionary to enhance the accuracy of your program.
2. Try to fit the input sentence into one of the patterns in the dictionary you created above. For example, if the input sentence is Snakes, unlike elephants, are venomous., then your code must be able to find a match with the pattern: Subject, unlike AnotherSubject, Verb Object or S-,unlike-S`-, -V-O. To successfully perform this step, you may need to write code that's good at spotting Structure Markers like the word unlike in this example sentence.
3. When you have found a match for your input sentence in your pattern dictionary, you can easily assign a tag to each word in the sentence. For example, in our sentence, the word Snakes would be tagged as a subject, just like the word elephants; the word are would be tagged as a verb; and finally the word venomous would be tagged as an object.
4. Once you have assigned a unique tag to each of the words in your sentence, you can look up each word in the appropriate text file that you already have and determine whether or not your sentence is valid.
5. If your sentence doesn't match any sentence pattern, then you have two options:
   a) Add the pattern of this unrecognized sentence to your pattern dictionary if it is a valid English sentence.
   b) Or, discard the input sentence as an invalid English sentence.
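As a very rough illustration of steps 1 to 3, here is a toy matcher. Everything in it is an assumption made for the sketch: the class `PatternTagger`, the three hard-coded patterns, the "=" prefix used to mark a literal structure marker, and the rule that a pattern matches when the word count is equal and every marker is in place. A real pattern dictionary would need to be far richer (for instance, "Ship will arrive" would be mis-tagged by these three patterns).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PatternTagger
{
    // Toy pattern dictionary. Slots starting with '=' are literal structure
    // markers; every other slot is the role assigned to the word at that position.
    private static readonly string[][] Patterns =
    {
        new[] { "Subject", "=unlike", "Subject", "Verb", "Object" },
        new[] { "Subject", "Verb", "Object" },
        new[] { "Subject", "Verb" },
    };

    private static readonly char[] Separators = { ' ', ',', '.', '!', '?', ';' };

    // Returns (word, role) pairs for the first matching pattern, or null if none matches.
    public static List<KeyValuePair<string, string>> Tag(string sentence)
    {
        string[] words = sentence.Split(Separators, StringSplitOptions.RemoveEmptyEntries);

        foreach (string[] pattern in Patterns)
        {
            if (pattern.Length != words.Length) continue;

            bool matches = true;
            for (int i = 0; i < pattern.Length && matches; i++)
            {
                if (pattern[i].StartsWith("=", StringComparison.Ordinal))
                {
                    matches = string.Equals(pattern[i].Substring(1), words[i],
                                            StringComparison.OrdinalIgnoreCase);
                }
            }
            if (!matches) continue;

            return words
                .Select((w, i) => new KeyValuePair<string, string>(
                    w, pattern[i].StartsWith("=", StringComparison.Ordinal) ? "Marker" : pattern[i]))
                .ToList();
        }
        return null; // step 5: unknown pattern
    }
}
```

Each returned role tells you which of your word lists to check the word against in step 4.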
Things like what you're trying to achieve are best solved using machine learning techniques so that the system can learn any new patterns. So, you may want to include a trainer system that would add a new pattern to your pattern dictionary whenever it finds a valid English sentence not matching any of the existing patterns. I haven't thought much about how this can be done, but for now, you may manually revise your Sentence Pattern dictionary.
I'd be glad to hear your opinion about this pseudocode and would be available to brainstorm it further.
