How to extract required data?

The serial data I receive is like this
AT+CMGR=3
+CMGR: "REC READ","919742400000",,"2014/01/17 22:52:44+22"
G LED ON
OK
Actually it will be like this
AT+CMGR=3\r\n\r\n+CMGR: "REC READ","919742400000",,"2014/01/17 22:52:44+22"\r\nG LED ON\r\n\r\n\r\nOK\r\n
I need to extract
3 (Message Index can have any integer value)
REC READ (Msg Status)
919742400000 (Originating Number)
2014/01/17 22:52:44+22 (Date Time Stamp)
G LED ON (Actual Message SMS)
The VC#.Net code that I have written is
Regex r = new Regex(@"\+CMGR: (\d+),""(.+)"",""(.+)"",(.*),""(.+)""\r\n(.+)\r\n");
Match m = r.Match(mySMS); //mySMS is a string and contains the above mentioned data
int Index = int.Parse(m.Groups[1].Value); //(3 in above example)
string Status = m.Groups[2].Value; //(REC READ)
string Sender = m.Groups[3].Value; //(919742400000)
string Sent = m.Groups[4].Value; //2014/01/17 22:52:44+22
string Message = m.Groups[5].Value; //G LED ON
Another thing: the actual message (SMS) can contain double-quoted strings like
"G "LED" ON", and I should also be able to extract
G "LED" ON
The above VC# code is not working.
How can I use Regex or any other method to extract the required data?

I would say your main structure is line-based, i.e. you either use ReadLine() or you split the string:
string[] lines = mySMS.Split(new string[] {"\r\n"}, StringSplitOptions.None);
Now lines[2] contains your main data, comma-separated, and lines[3] is your G "LED" ON.
To parse lines[2]:
string[] parts = lines[2].Split(',');
string OriginatingNumber = parts[1].Trim('"');
string DateTimeStamp = parts[3].Trim('"');
parts[0] will hold +CMGR: "REC READ"; use another Split() or break on the : some other way.
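Putting the line-based approach together, here is a minimal sketch that pulls out all five fields. The names CmgrParser and SmsRecord are made up for illustration. It assumes the message body is a single line and that none of the quoted header fields contains a comma.

```csharp
using System;

public static class CmgrParser
{
    // Hypothetical container for the parsed fields (names are illustrative).
    public sealed class SmsRecord
    {
        public int Index;
        public string Status;
        public string Sender;
        public string Timestamp;
        public string Message;
    }

    public static SmsRecord Parse(string response)
    {
        // Dropping empty entries leaves: echo line, +CMGR header, message text, OK.
        string[] lines = response.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
        const string echoPrefix = "AT+CMGR=";
        const string headerPrefix = "+CMGR: ";

        if (lines.Length < 3
            || !lines[0].StartsWith(echoPrefix, StringComparison.Ordinal)
            || !lines[1].StartsWith(headerPrefix, StringComparison.Ordinal))
        {
            throw new FormatException("Unexpected +CMGR response layout.");
        }

        // Header fields: status, sender, (empty), timestamp.
        string[] parts = lines[1].Substring(headerPrefix.Length).Split(',');
        if (parts.Length < 4)
        {
            throw new FormatException("Unexpected +CMGR header.");
        }

        return new SmsRecord
        {
            Index = int.Parse(lines[0].Substring(echoPrefix.Length)),
            Status = parts[0].Trim('"'),
            Sender = parts[1].Trim('"'),
            Timestamp = parts[3].Trim('"'),
            // The message line is taken verbatim, so embedded quotes survive.
            Message = lines[2]
        };
    }
}
```

Because the message is read as a whole line rather than matched inside quotes, a body such as G "LED" ON needs no special handling.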

Related

C# Finding numerical value from specific string

I have a string of varying length that I am trying to retrieve a number from. The format of the string is always:
"some text lines
FC = 1234
more text here
and so on"
So I know the string of numbers comes after "FC = ", and I know it finishes at the next \n. How can I return this number (which will vary in size) into a new string?

Try the following code snippet:
var str = "some text lines \nFC = 1234\n more text here and so on";
Console.WriteLine(Regex.Match(str, @"\d+\.*\d*").Value);

Thanks to all. Think I managed to find a way with Regex, based on ScareCrow's suggestion:
string rgSearch = searchString + @"\d+\.*\d*";
FC = Regex.Match(diagnostics, rgSearch).Value;
FC = FC.Replace(searchString, ""); //Leaves the number only
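As an alternative to matching the prefix and then stripping it, a capture group can return the number directly. This is only a sketch: the class and method names are invented, and it assumes the label is always written exactly as "FC = ".

```csharp
using System.Text.RegularExpressions;

public static class FcReader
{
    // Returns the number that follows "FC = ", or null if the label is absent.
    public static string ReadFc(string text)
    {
        // Group 1 captures an integer with an optional decimal part.
        Match m = Regex.Match(text, @"FC = (\d+(?:\.\d+)?)");
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Anchoring on the label means other digits earlier in the text are not picked up by accident.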

Convert String to Byte Array replacing characters between specific character

I am trying to find a way to take a string (entered into a TextBox) and convert it to a byte array to send out over a serial port / socket.
I am fine with the string-to-byte[] conversion but am struggling a bit with the replacement.
Essentially the GUI allows the user to specify the format of the response to send, and I was looking at something like the following:
User Enters : [2] Test {1} {2} [3]
{1} and {2} are variable fields which can be pulled from the incoming message so they are currently being replaced without issue.
What I am trying to achieve is replace the [2] with an STX character and the [3] with an ETX character with the 2 and 3 being their ASCII equivalents. www.asciitable.com
The user can enter any valid ascii character in this format so [13] for CR etc
Would the best way be to loop through the string, remembering the index of [ and then the index of ], and grab all characters between these two indexes? Or is there a more efficient way?
Thanks,
Daniel.

A regular expression can find digits between brackets and replace them with a calculated value.
Your replacement scheme looks like it might be similar to String.Format but you'll have to compare that and decide on the order of operations and meaning of special characters.
The encoding will throw an exception if the bracketed number is outside of 0-127. You could have some other behavior if you want.
var encoding = Encoding.GetEncoding(Encoding.ASCII.CodePage,
    EncoderFallback.ExceptionFallback,
    DecoderFallback.ExceptionFallback);
var bracketRegex = new Regex(@"\[(?<digits>\d+)\]", RegexOptions.Compiled);
MatchEvaluator convertToCodepoint = (match) =>
    Char.ConvertFromUtf32(Int32.Parse(match.Groups["digits"].Value));
var values = new[] {"a", "b", "c" };
var input = "[2] Test {1} {2} [3]";
// Dump() is a LINQPad extension method that prints the resulting bytes.
encoding.GetBytes(String.Format(bracketRegex.Replace(input, convertToCodepoint), values))
    .Dump();
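Outside LINQPad, the same idea can be wrapped in a small helper. The sketch below uses invented names (ControlCodeTemplate, ToBytes) and assumes the {n} fields have already been filled in, as the question says they are. It checks the 0-127 range explicitly instead of relying on the encoder fallback.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ControlCodeTemplate
{
    // Matches a decimal character code in square brackets, e.g. [2] or [13].
    private static readonly Regex BracketCode = new Regex(@"\[(\d+)\]");

    public static byte[] ToBytes(string template)
    {
        string expanded = BracketCode.Replace(template, m =>
        {
            int code = int.Parse(m.Groups[1].Value);
            if (code > 127)
            {
                throw new ArgumentOutOfRangeException(
                    "template", "Character code " + code + " is outside the ASCII range.");
            }
            // Replace "[n]" with the single character whose code is n.
            return ((char)code).ToString();
        });
        return Encoding.ASCII.GetBytes(expanded);
    }
}
```

Calling ToBytes("[2]Hi[3]") yields the STX byte, the two letters, and the ETX byte.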

I think you could write code similar to this:
string input = textBox.Text; // e.g. "[2] Test {1} {2} [3]"
// Use these lines if you don't know how many times you have to iterate.
var totalOfBraces = input.Count(x => x == '{');
var totalOfBrackets = input.Count(x => x == '[');
var totalOfElements = totalOfBraces + totalOfBrackets;
string firstBracketContent = input.Split('[', ']')[1];
string firstBraceContent = input.Split('{', '}')[1];
That way you can get the elements between braces and brackets and replace them.
I added totalOfElements so that you can use a for loop.
For example:
for (var i = 0; i < totalOfElements; i++) {
    if (i < totalOfBrackets) {
        // Content of the first bracket pair still left in the input
        var textToFind = "[" + input.Split('[', ']')[1] + "]";
        input = input.Replace(textToFind, "some new text");
    } else {
        // Do the same for braces
    }
}
// Now your text is formatted and ready to be converted to byte[]

How to concatenate the whole text of a file into one string, avoiding empty lines between strings

How do I get the whole text of a document concatenated into one string? I'm trying to split the text by dot: string[] words = s.Split('.'); I want to take this text from a text document. But if my text document contains empty lines between strings, for example:
pat said, “i’ll keep this ring.”
she displayed the silver and jade wedding ring which, in another time track,
she and joe had picked out; this
much of the alternate world she had elected to retain. he wondered what - if any - legal basis she had kept in addition. none, he hoped; wisely, however, he said nothing. better not even to ask.
result looks like this:
1. pat said ill keep this ring
2. she displayed the silver and jade wedding ring which in another time track
3. she and joe had picked out this
4. much of the alternate world she had elected to retain
5. he wondered what if any legal basis she had kept in addition
6. none he hoped wisely however he said nothing
7. better not even to ask
but desired correct output should be like this:
1. pat said ill keep this ring
2. she displayed the silver and jade wedding ring which in another time track she and joe had picked out this much of the alternate world she had elected to retain
3. he wondered what if any legal basis she had kept in addition
4. none he hoped wisely however he said nothing
5. better not even to ask
So to do this, I first need to process the text file content to get the whole text as a single string, like this:
pat said, “i’ll keep this ring.” she displayed the silver and jade wedding ring which, in another time track, she and joe had picked out; this much of the alternate world she had elected to retain. he wondered what - if any - legal basis she had kept in addition. none, he hoped; wisely, however, he said nothing. better not even to ask.
I can't do this the same way as I would with list content, for example: string concat = String.Join(" ", text.ToArray());
I'm not sure how to concatenate the text of a text document into a string.

I think this is what you want:
var fileLocation = @"c:\myfile.txt";
var stringFromFile = File.ReadAllText(fileLocation);
//replace Environment.NewLine with any new line character your file uses
var withoutNewLines = stringFromFile.Replace(Environment.NewLine, " ");
//modify to remove any unwanted character
var withoutUglyCharacters = Regex.Replace(withoutNewLines, "[“’”,;-]", "");
//collapse the runs of spaces left behind by the removals
var withoutTwoSpaces = Regex.Replace(withoutUglyCharacters, " {2,}", " ");
var result = withoutTwoSpaces.Split('.').Select(i => i.Trim()).Where(i => i != "").ToList();
So first you read all the text from your file, then you remove all unwanted characters, and then you split by . and return the non-empty items.

Have you tried replacing double new-lines before splitting using a period?
static string[] GetSentences(string filePath) {
    if (!File.Exists(filePath))
        throw new FileNotFoundException($"Could not find file { filePath }!");
    // Join with a space so that words at line ends do not run together.
    var lines = string.Join(" ", File.ReadLines(filePath).Where(line => !string.IsNullOrWhiteSpace(line)));
    var sentences = Regex.Split(lines, @"\.[\s]{1,}?");
    return sentences;
}
I haven't tested this, but it should work.
Explanation:
if (!File.Exists(filePath))
    throw new FileNotFoundException($"Could not find file { filePath }!");
Throws an exception if the file could not be found. It is advisable to surround the method call with a try/catch.
var lines = string.Join(" ", File.ReadLines(filePath).Where(line => !string.IsNullOrWhiteSpace(line)));
Creates a single string from the file, ignoring any lines which are empty or purely whitespace.
var sentences = Regex.Split(lines, @"\.[\s]{1,}?");
Creates a string array, where the string is split at every period that is followed by whitespace.
E.g.:
The string "I came. I saw. I conquered" would become
I came
I saw
I conquered
Update:
Here's the method as a one-liner, if that's your style:
static string[] SplitSentences(string filePath) => File.Exists(filePath) ? Regex.Split(string.Join(" ", File.ReadLines(filePath).Where(line => !string.IsNullOrWhiteSpace(line))), @"\.[\s]{1,}?") : null;
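To produce the punctuation-free output the question asks for, the join and the split can be combined with a cleanup step. The following is a rough sketch with invented names. It takes the lines as a parameter (for example from File.ReadLines) and assumes that only letters, digits and spaces should survive.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SentenceSplitter
{
    public static List<string> Split(IEnumerable<string> lines)
    {
        // Skip blank lines and join the rest with a space so words stay separated.
        string text = string.Join(" ",
            lines.Where(l => !string.IsNullOrWhiteSpace(l)).Select(l => l.Trim()));

        return text.Split('.')
            // Drop everything except letters, digits and spaces.
            .Select(s => Regex.Replace(s, @"[^\p{L}\p{Nd} ]", ""))
            // Collapse the double spaces left behind by removed punctuation.
            .Select(s => Regex.Replace(s, " {2,}", " ").Trim())
            .Where(s => s.Length > 0)
            .ToList();
    }
}
```

A call such as SentenceSplitter.Split(File.ReadLines(path)) would then return one cleaned sentence per list entry.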

I would suggest iterating through all characters and checking whether each one is in the range 'a' <= char <= 'z' or is a space. If it matches, append it to the current line; otherwise, if it is a '.' character, end the current line and start a new one:
List<string> lines = new List<string>();
string line = string.Empty;
foreach(char c in str)
{
if((char.ToLower(c) >= 'a' && char.ToLower(c) <= 'z') || c == 0x20)
line += c;
else if(c == '.')
{
lines.Add(line.Trim());
line = string.Empty;
}
}
Or if you prefer "one-liner"s :
IEnumerable<string> lines = new string(str.Select(c => (char)(((char.ToLower(c) >= 'a' && char.ToLower(c) <= 'z') || c == 0x20) ? c : c == '.' ? '\n' : '\0')).ToArray()).Replace("\0", "").Split('\n').Select(s => s.Trim());

I may be wrong about this, but you may not want to alter the string while splitting it. For example, part of the string contains curly single/double quotes (“). Removing them may not be desired, which raises another issue: reading a text file that contains such quotes (as your example text does) like below:
var stringFromFile = File.ReadAllText(fileLocation);
may not display those characters properly in a text box or the console if the file is not UTF-8 encoded, because ReadAllText decodes as UTF-8 by default. In that case the quotes show up as replacement characters: diamonds in a text box on a form and question marks (?) in the console. To keep the quotes and have them display properly, you can pass the OS's current ANSI encoding as a second parameter to ReadAllText, like below:
string stringFromFile = File.ReadAllText(fileLocation, Encoding.Default);
Below is code using a simple Split call to split the string on periods (.). Hope this helps.
private void button1_Click(object sender, EventArgs e) {
    string fileLocation = @"C:\YourPath\YourFile.txt";
    string stringFromFile = File.ReadAllText(fileLocation, Encoding.Default);
    // Replace line breaks with a space so words at line ends stay separated.
    string bigString = stringFromFile.Replace(Environment.NewLine, " ");
    string[] result = bigString.Split('.');
    int count = 1;
    foreach (string s in result) {
        if (!string.IsNullOrWhiteSpace(s)) {
            textBox1.Text += count + ". " + s.Trim() + Environment.NewLine;
            Console.WriteLine(count + ". " + s.Trim());
            count++;
        }
        else {
            // period at the end of the string
        }
    }
}

read specific websourcecode in c#

When I press a button the following happens:
HttpWebRequest request = (HttpWebRequest)WebRequest
.Create("http://oldschool.runescape.com/slu");
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader sr = new StreamReader(response.GetResponseStream());
richTextBox1.Text = sr.ReadToEnd();
sr.Close();
In short the data gets transferred to my textbox (this works perfectly)
Now if I choose world 78 (for example, from a combobox, it will refer to the last digits of that line) I want to get the value 968, if i choose world 14, I want to get the value 973.
This is an example of the printed data
e(378,true,0,"oldschool78",968,"United States","US","Old School 78");
e(314,true,0,"oldschool14",973,"United States","US","Old School 14");
What can I use to read this?

So there are two problems here, the first is selecting the right line, then getting the number out.
First you want a method for getting each of the lines into a list, e.g. using something like this:
List<String> lines = new List<String>();
string line = sr.ReadLine();
while(line != null)
{
lines.Add(line);
line = sr.ReadLine(); // read the next line
}
Then you need to find the relevant line and get the token out of it.
Probably the simplest way is, for each line, to split the string on ',', '\"', '(' and ')' (using String.Split). That basically gives us the parameters.
E.g.
foreach(string lineInFile in lines)
{
// split the string in to tokens
string[] tokens = lineInFile.Split(',', '\"', '(', ')');
// based on the sample strings and how we've split this,
// we take the entry at index 15
string endParameter = tokens[15]; // endParameter = "Old School 14"
...
We now use a regular expression to extract the number. The pattern we will use is \d+, i.e. one or more digits.
Regex numberFinder = new Regex("\\d+");
Match numberMatch = numberFinder.Match(endParameter);
// we assume that there is a match, because if there isn't the string isn't
// correct, you should do some error handling here
string matchedNumber = numberMatch.Value;
int value = Int32.Parse(matchedNumber); // we convert the string into the number
if(value == desiredValue)
...
If the value matches the one we were looking for (e.g. 14), we now need to get the number you wanted.
We've already split the parameters, and the number we want is the 8th item (i.e. index 7 in string[] tokens). Since, at least in your example, this is just a lone number, we can simply parse it to get the int.
{
return Int32.Parse(tokens[7]);
}
}
Again, we are assuming here that the string is in the format you showed, and you should add error handling here too.
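The two steps (pick the line, pull out the number) can also be done with one regular expression per line. This is a sketch only: WorldList and FindValue are invented names, and the pattern assumes every entry looks exactly like the two sample lines, with the world number at the end of the last quoted field.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WorldList
{
    // Matches e.g.: e(378,true,0,"oldschool78",968,"United States","US","Old School 78");
    // Group 1 is the value after the first quoted field, group 2 the trailing world number.
    private static readonly Regex Entry = new Regex(
        @"^e\(\d+,\w+,\d+,""[^""]*"",(\d+),""[^""]*"",""[^""]*"",""[^""]*?(\d+)""\);");

    // Returns the value for the given world, or null if no line matches.
    public static int? FindValue(string pageSource, int world)
    {
        string[] lines = pageSource.Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            Match m = Entry.Match(line.Trim());
            if (m.Success && int.Parse(m.Groups[2].Value) == world)
            {
                return int.Parse(m.Groups[1].Value);
            }
        }
        return null;
    }
}
```

Lines that do not fit the pattern are simply skipped, so unrelated script content in the page source does not cause an exception.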

Extracting data from plain text string

I am trying to process a report from a system which gives me the following code
000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}
I need to extract the values between the curly brackets {} and save them in to variables. I assume I will need to do this using regex or similar? I've really no idea where to start!! I'm using c# asp.net 4.
I need the following variables
param1 = 000
param2 = GEN
param3 = OK
param4 = 1 //Q
param5 = 1 //M
param6 = 002 //B
param7 = 3e5e65656-e5dd-45678-b785-a05656569e //I
I will name the params based on what they actually mean. Can anyone please help me here? I have tried to split based on spaces, but I get the other garbage with it!
Thanks for any pointers/help!

If the format is pretty constant, you can use .NET string processing methods to pull out the values, something along the lines of
string line =
    "000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}";
int start = line.IndexOf('{');
int end = line.IndexOf('}');
// Length is end - start - 1 so that the closing brace is not included.
string variablePart = line.Substring(start + 1, end - start - 1);
string[] variables = variablePart.Split(' ');
foreach (string variable in variables)
{
    string[] parts = variable.Split('=');
    // parts[0] holds the variable name, parts[1] holds the value
}
Wrote this off the top of my head, so there may be an off-by-one error somewhere. Also, it would be advisable to add error checking e.g. to make sure the input string has both a { and a }.
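With the suggested error checking added, the same approach might look like the sketch below. BraceFields and Parse are invented names, and it assumes the values themselves never contain spaces.

```csharp
using System;
using System.Collections.Generic;

public static class BraceFields
{
    // Returns the name=value pairs found between the first '{' and the next '}'.
    public static Dictionary<string, string> Parse(string line)
    {
        int start = line.IndexOf('{');
        int end = start < 0 ? -1 : line.IndexOf('}', start + 1);
        if (start < 0 || end < 0)
        {
            throw new FormatException("Expected a {...} section in the input.");
        }

        string inner = line.Substring(start + 1, end - start - 1);
        var fields = new Dictionary<string, string>();

        foreach (string pair in inner.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Split on the first '=' only, in case a value contains '='.
            string[] parts = pair.Split(new[] { '=' }, 2);
            if (parts.Length == 2)
            {
                fields[parts[0]] = parts[1];
            }
        }
        return fields;
    }
}
```

The values can then be looked up by name, for example fields["B"], instead of by position.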

I would suggest a regular expression for this type of work.
var objRegex = new System.Text.RegularExpressions.Regex(@"^(\d+)=\[([A-Z]+)\] ([A-Z]+) \{Q=(\d+) M=(\d+) B=(\d+) I=([a-z0-9\-]+)\}$");
var objMatch = objRegex.Match("000=[GEN] OK {Q=1 M=1 B=002 I=3e5e65656-e5dd-45678-b785-a05656569e}");
if (objMatch.Success)
{
Console.WriteLine(objMatch.Groups[1].ToString());
Console.WriteLine(objMatch.Groups[2].ToString());
Console.WriteLine(objMatch.Groups[3].ToString());
Console.WriteLine(objMatch.Groups[4].ToString());
Console.WriteLine(objMatch.Groups[5].ToString());
Console.WriteLine(objMatch.Groups[6].ToString());
Console.WriteLine(objMatch.Groups[7].ToString());
}
I've just tested this out and it works well for me.

Use a regular expression.
Quick and dirty attempt:
(?<ID1>[0-9]*)=\[(?<GEN>[a-zA-Z]*)\] OK {Q=(?<Q>[0-9]*) M=(?<M>[0-9]*) B=(?<B>[0-9]*) I=(?<I>[a-zA-Z0-9\-]*)}
This will generate named groups called ID1, GEN, Q, M, B and I.
Check out the MSDN docs for details on using Regular Expressions in C#.
You can use Regex Hero for quick C# regex testing.
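For reference, here is a small sketch of reading the named groups from C#. The wrapper class and method names are made up, and the braces are escaped here for clarity, which is a slight change from the pattern above.

```csharp
using System.Text.RegularExpressions;

public static class ReportLine
{
    private static readonly Regex Pattern = new Regex(
        @"(?<ID1>[0-9]*)=\[(?<GEN>[a-zA-Z]*)\] OK \{Q=(?<Q>[0-9]*) M=(?<M>[0-9]*) B=(?<B>[0-9]*) I=(?<I>[a-zA-Z0-9\-]*)\}");

    // Returns the text captured by the named group, or null if the line does not match.
    public static string Field(string line, string groupName)
    {
        Match m = Pattern.Match(line);
        return m.Success ? m.Groups[groupName].Value : null;
    }
}
```

For the sample line, ReportLine.Field(line, "B") gives the B value as a string, leading zeros included.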

You can use String.Split
string[] parts = s.Split(new string[] {"=[", "] ", " {Q=", " M=", " B=", " I=", "}"},
StringSplitOptions.None);

This solution breaks up your report code into segments and stores the desired values into an array.
The regular expression matches one report code segment at a time and stores the appropriate values in the "Parsed Report Code Array".
As your example implied, the first two code segments are treated differently than the ones after that. I made the assumption that it is always the first two segments that are processed differently.
private static string[] ParseReportCode(string reportCode) {
const int FIRST_VALUE_ONLY_SEGMENT = 3;
const int GRP_SEGMENT_NAME = 1;
const int GRP_SEGMENT_VALUE = 2;
Regex reportCodeSegmentPattern = new Regex(@"\s*([^\}\{=\s]+)(?:=\[?([^\s\]\}]+)\]?)?");
Match matchReportCodeSegment = reportCodeSegmentPattern.Match(reportCode);
List<string> parsedCodeSegmentElements = new List<string>();
int segmentCount = 0;
while (matchReportCodeSegment.Success) {
if (++segmentCount < FIRST_VALUE_ONLY_SEGMENT) {
string segmentName = matchReportCodeSegment.Groups[GRP_SEGMENT_NAME].Value;
parsedCodeSegmentElements.Add(segmentName);
}
string segmentValue = matchReportCodeSegment.Groups[GRP_SEGMENT_VALUE].Value;
if (segmentValue.Length > 0) parsedCodeSegmentElements.Add(segmentValue);
matchReportCodeSegment = matchReportCodeSegment.NextMatch();
}
return parsedCodeSegmentElements.ToArray();
}
