Splitting up a log file line in C#

Log file lines in question:
[2018-10-25 19:40:34] [Output] : (CHAT-Type) User: message
So the date and output can be split by number of characters right?
Then I can parse the date from within the [].
That would leave me with
(CHAT-Type) User: message
Now I want to split this into Chat Type, Username and message.
I'm struggling to work out how to do all this in C#.
Basically I need it to come out like this:
DateTime
ChatType
User
Message
all separate variables

This looks like a regex kind of problem :)
Here's a regular expression that matches a single line (with named capture groups)
^\[(?'dateTime'.+)\] \[(?'output'.+)\] : \((?'type'.+)\) (?'user'.+): (?'message'.+)$
Regexr link to try online: https://regexr.com/421p4
Since I used .+ for every field, there are no character restrictions, so the match can break (for example, if the space before or after the : is missing, or if the message itself contains ": "). It can be tightened to be more robust. If you'd like, I can write one up.
Also, if you're reading the whole file with a method like File.ReadAllText(), you need the multiline flag so that ^ and $ match at every line, and Regex.Matches to get every match rather than just the first. (Regex.Matches(input, pattern, RegexOptions.Multiline))
Otherwise (if you're iterating through the lines, for example) it doesn't matter because there are no \ns in the string.
The C# code
using System.Text.RegularExpressions;
// ...
string pattern = @"^\[(?<dateTime>.+)\] \[(?<output>.+)\] : \((?<type>.+)\) (?<user>.+): (?<message>.+)$";
string message = "[2018-10-25 19:40:34] [Output] : (CHAT-Type) User: message";
var match = Regex.Match(message, pattern);
You can access the captured groups through the match variable like this:
match.Groups["dateTime"].Value; // "2018-10-25 19:40:34"
match.Groups["user"].Value; // "User"
match.Groups["message"].Value; // "message"
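To end up with the four separate variables the question asks for, the captured date text still has to be converted to a DateTime. Here is a minimal sketch of one way to do that. The LogLineParser class and the TryParse method are hypothetical names, the timestamp format is assumed from the single sample line, and the character classes are tightened so that a message containing ": " does not confuse the user group.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class LogLineParser
{
    // Same structure as the pattern above, with tighter character classes:
    // bracketed fields stop at ']', the chat type stops at ')', the user stops at ':'.
    private static readonly Regex LinePattern = new Regex(
        @"^\[(?<dateTime>[^\]]+)\] \[(?<output>[^\]]+)\] : \((?<type>[^)]+)\) (?<user>[^:]+): (?<message>.*)$");

    public static bool TryParse(string line, out DateTime timestamp,
        out string chatType, out string user, out string message)
    {
        timestamp = default(DateTime);
        chatType = user = message = null;

        Match match = LinePattern.Match(line);
        if (!match.Success)
        {
            return false;
        }

        // Assumption: every timestamp looks like the sample, "2018-10-25 19:40:34".
        if (!DateTime.TryParseExact(match.Groups["dateTime"].Value, "yyyy-MM-dd HH:mm:ss",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out timestamp))
        {
            return false;
        }

        chatType = match.Groups["type"].Value;
        user = match.Groups["user"].Value;
        message = match.Groups["message"].Value;
        return true;
    }
}
```

Calling LogLineParser.TryParse once per line (for example inside a loop over File.ReadLines) avoids the multiline flag entirely, and lines that do not fit the pattern simply return false.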

This is how you split all the values:
string inputValue = "(CHAT-Type) User: message";
int braceCloseIndex = inputValue.IndexOf(')');
int colonIndex = inputValue.IndexOf(':');
string chatType = inputValue.Substring(1, braceCloseIndex - 1).Trim();
string userName = inputValue.Substring(braceCloseIndex + 1, colonIndex - braceCloseIndex - 1).Trim();
string message = inputValue.Substring(colonIndex + 1).Trim();
Console.WriteLine($"chatType: {chatType}, userName: {userName}, msg: {message}");

Related

How can I extract a dynamic length string from multiline string?

I am using "nslookup" to get machine name from IP.
nslookup 1.2.3.4
The output spans multiple lines and the machine name has a variable length. How can I extract "DynamicLengthString" from the output? Every suggestion I found uses IndexOf and Split, but when I tried that it was not a good solution for me. Any advice?
Server: volvo.toyota.opel.tata
Address: 5.6.7.8
Name: DynamicLengthString.toyota.opel.tata
Address: 1.2.3.4
I did it the good old C# way, without regex.
string input = @"Server: volvo.toyota.opel.tata
Address: 5.6.7.8
Name: DynamicLengtdfdfhString.toyota.opel.tata
Address: 1.2.3.4";
string targetLineStart = "Name:";
string[] allLines = input.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
string targetLine = String.Empty;
foreach (string line in allLines)
if (line.StartsWith(targetLineStart))
{
targetLine = line;
}
System.Console.WriteLine(targetLine);
string dynamicLengthString = targetLine.Remove(0, targetLineStart.Length).Split('.')[0].Trim();
System.Console.WriteLine("<<" + dynamicLengthString + ">>");
System.Console.ReadKey();
This extracts "DynamicLengtdfdfhString" from the given input, no matter where the Name-Line is and no matter what comes afterwards.
This is the console version to test & verify it.
You can use Regex
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string Content = "Server: volvo.toyota.opel.tata \rAddress: 5.6.7.8 \rName: DynamicLengthString.toyota.opel.tata \rAddress: 1.2.3.4";
string Pattern = "(?<=DynamicLengthString)(?s)(.*$)";
//string Pattern = @"/^Dy*$/";
MatchCollection matchList = Regex.Matches(Content, Pattern);
Console.WriteLine("Running");
foreach(Match match in matchList)
{
Console.WriteLine(match.Value);
}
}
}
I'm going to assume your output is exactly like you put it.
string output = ExactlyAsInTheQuestion();
var fourthLine = output.Split(new[] { Environment.NewLine }, StringSplitOptions.None)[3];
var nameValue = fourthLine.Substring(9); //skips over "Name:" and the four spaces after it
var firstPartBeforePeriod = nameValue.Split('.')[0];
//firstPartBeforePeriod should equal "DynamicLengthString"
Note that this is a barebones example:
Either check all array indexes before you access them, or be prepared to catch IndexOutOfRangeExceptions.
I've assumed that the four spaces between "Name:" and "DynamicLengthString" are four spaces. If they are a tab character, you'll need to adjust the Substring(9) method to Substring(6).
If "DynamicLengthString" is supposed to also have periods in its value, then my answer does not apply. You'll need to use a regex in that case.
Note: I'm aware that you dismissed Split:
All suggestions IndexOf and Split, but when I try to do like that, I was not a good solution for me.
But based on only this description, it's impossible to know if the issue was in getting Split to work, or it actually being unusable for your situation.
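For comparison, here is a minimal regex sketch that does not depend on the line number or on the exact amount of whitespace after "Name:". The NslookupParser class and ExtractHostLabel method are hypothetical names, and the sketch assumes that lines end in \n or \r\n and that the wanted value is everything up to the first period.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any answer above.
public static class NslookupParser
{
    // Returns the first label of the host on the "Name:" line, or null if there is none.
    public static string ExtractHostLabel(string nslookupOutput)
    {
        // Multiline lets ^ match at the start of every line (after each '\n').
        // [^.\s]+ stops at the first period or whitespace character.
        Match match = Regex.Match(nslookupOutput, @"^Name:\s*(?<host>[^.\s]+)", RegexOptions.Multiline);
        return match.Success ? match.Groups["host"].Value : null;
    }
}
```

Because \s* absorbs spaces and tabs alike, the Substring(9) versus Substring(6) question does not arise here. Input that uses a bare \r as the line separator would not match, since ^ only recognizes \n.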

Identify the string that does not exist in another string using regex and C#

I am trying to capture the part of a string that is not contained in another string.
string searchedString = " This is my search string";
string subsetofSearchedString = "This is my";
My output should be "Search string". I would like to go with only regex so that I can handle complex strings.
The below is the code that I have tried so far and I am not successful.
Match match = new Regex(subsetofSearchedString ).Match(searchedString );
if (!string.IsNullOrWhiteSpace(match.Value))
{
UnmatchedString= UnmatchedString.Replace(match.Value, string.Empty);
}
Update : The above code is not working for the below texts.
text1 = 'Property Damage (2015 ACURA)' Exposure Added Automatically for IP:Claimant DriverLoss Reserve Line :Property DamageReserve Amount $ : STATIP Role(s): Owner, DriverExposure Owner :Jaimee Watson_csr Author:
text2 = 'Property Damage (2015 ACURA)' Exposure Added Automatically for IP:Claimant DriverLoss Reserve Line :Property DamageReserve Amount $ : STATIP Role(s): Owner, Driver
Match match = new Regex(text2).Match(text1);
You can use Regex.Split:
var ans = Regex.Split(searchedString, subsetofSearchedString);
If you want the answer as a single string minus the subset, you can join it:
var ansjoined = String.Join("", ans);
Replacing with String.Empty will also work:
var ans = Regex.Replace(searchedString, subsetofSearchedString, String.Empty);
Answer :
Regex wasn't working for me because of the presence of metacharacters in my string. Regex.Escape did not help me with the comparison.
String Contains worked like a charm here
if (text1.Contains(text2))
{
status = TestResult.Pass;
text1= text1.Replace(text2, string.Empty);
}
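The answer above reports that Regex.Escape did not help, and without the original code it is hard to say why. As a point of comparison, here is a small sketch of both approaches side by side, assuming the goal is simply to remove one literal string from another. The LiteralRemover class and its method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for comparing the two approaches.
public static class LiteralRemover
{
    // Plain string removal: no character in 'literal' has a special meaning.
    public static string RemoveWithString(string text, string literal)
    {
        return text.Replace(literal, string.Empty);
    }

    // Regex removal: Regex.Escape turns metacharacters such as ( ) $ into literals first.
    public static string RemoveWithRegex(string text, string literal)
    {
        return Regex.Replace(text, Regex.Escape(literal), string.Empty);
    }
}
```

For a purely literal match like this one, the string version is the simpler choice. The regex version only pays off when the pattern genuinely needs wildcards or alternatives.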

How to strip a string from the point a hyphen is found within the string C#

I'm currently trying to strip a string of data that may contain the hyphen symbol.
E.g. Basic logic:
string stringin = "test - 9894"; OR Data could be == "test";
if (string contains a hyphen "-"){
Strip stringin;
output would be "test" deleting from the hyphen.
}
Console.WriteLine(stringin);
The current C# code i'm trying to get to work is shown below:
string Details = "hsh4a - 8989";
var regexItem = new Regex("^[^-]*-?[^-]*$");
string stringin;
stringin = Details.ToString();
if (regexItem.IsMatch(stringin)) {
stringin = stringin.Substring(0, stringin.IndexOf("-") - 1); //Strip from the ending chars and - once - is hit.
}
Details = stringin;
Console.WriteLine(Details);
But it throws an error when the string does not contain any hyphens.
How about just doing this?
stringin.Split('-')[0].Trim();
You could even cap the number of substrings with the Split overload that takes a count. Note that the count has to be 2: with a count of 1, Split returns the whole string unsplit.
stringin.Split(new[] { '-' }, 2)[0].Trim();
Your regex is asking for "zero or one repetition of -", which means that it matches even if your input does NOT contain a hyphen. Thereafter you do this
stringin.Substring(0, stringin.IndexOf("-") - 1)
This throws an ArgumentOutOfRangeException: there is no hyphen to find, so IndexOf returns -1 and the requested length becomes -2.
Make a simple change to your regex and it works with or without - ask for "one or more hyphens":
var regexItem = new Regex("^[^-]*-+[^-]*$");
here -------------------------^
It seems that you want the (sub)string starting from the dash ('-') if original one contains '-' or the original string if doesn't have dash.
If it's your case:
String Details = "hsh4a - 8989";
Details = Details.Substring(Details.IndexOf('-') + 1);
I wouldn't use regex for this case if I were you, it makes the solution much more complex than it can be.
For a string that I am sure will contain no more than a couple of dashes, I would use this code, because it is a simple one-liner:
string str= entryString.Split(new [] {'-'}, StringSplitOptions.RemoveEmptyEntries)[0];
If you know that a string might contain a large number of dashes, this approach is not recommended: it allocates a separate string for every segment even though you only want the first one. In that case the solution would look something like this:
int firstDashIndex = entryString.IndexOf("-");
string str = firstDashIndex > -1 ? entryString.Substring(0, firstDashIndex) : entryString;
You don't need a regex for this. A simple IndexOf call will give you the index of the hyphen, then you can clean it up from there.
This is also a great place to start writing unit tests as well. They are very good for stuff like this.
Here's what the code could look like :
string inputString = "ho-something";
string outPutString = inputString;
var hyphenIndex = inputString.IndexOf('-');
if (hyphenIndex > -1)
{
outPutString = inputString.Substring(0, hyphenIndex);
}
return outPutString;
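Following the unit-test suggestion, the IndexOf approach can be wrapped in a small method that is easy to check against the inputs from the question. This is a sketch: the HyphenStripper class and StripFromHyphen method are hypothetical names, and the TrimEnd call is an addition that drops the space left in front of the hyphen.

```csharp
using System;

// Hypothetical helper wrapping the IndexOf approach.
public static class HyphenStripper
{
    // Returns everything before the first hyphen, or the input unchanged if there is none.
    public static string StripFromHyphen(string input)
    {
        int hyphenIndex = input.IndexOf('-');
        return hyphenIndex > -1
            ? input.Substring(0, hyphenIndex).TrimEnd() // TrimEnd removes the space before " - "
            : input;
    }
}
```

Cases worth covering in a test are a string with a hyphen ("test - 9894"), one without ("test"), and one that starts with a hyphen, which should yield an empty string rather than an exception.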

retain the newline in a regex Match, c#

So, I've created the following regex which captures everything I need from my string:
const string tag = ":59";
var result = Regex.Split(message, String.Format(":{0}[^:]?:*[^:]*", tag),RegexOptions.Multiline);
The string follows this pattern:
:59A:/sometext\n
somemore text\n
:71A:somemore text
I'm trying to capture everything in between :59A: and :71A:. This isn't set in stone though, as :71A: could be something else, which is why I was using [^:].
EDIT
So, just to be clear on my requirements. I have a file(string) which is passed into a C# method, which should return only those values specified in the parameter tag. For instance, if the file(string) contains the following tags:
:20:
:21:
:59A:
:71A:
and I pass in 59, then I only need to return everything in between the start of tag :59A: and the start of the next tag, which in this instance is :71A:, but could be something else.
You can use the following code to match what you need:
string input = ":59A:/sometext\nsomemore text\n:71A:somemore text";
string pattern = "(?<=:[^:]+:)[^:]+\n";
var m = Regex.Match(input, pattern, RegexOptions.Singleline).Value;
If you want to use your tag constant, you can use this code
const string tag = ":59";
string input = ":59A:/sometext\nsomemore text\n:71A:somemore text";
string pattern = String.Format("(?<={0}[^:]*:)[^:]+\n", tag);
var m = Regex.Match(input, pattern, RegexOptions.Singleline).Value;
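The patterns above stop at the first colon, so a colon inside the text itself would cut the result short. Here is a sketch of a variant that stops at the next tag instead. The TagReader class and ReadTag method are hypothetical names, and the sketch assumes that every tag starts at the beginning of a line and consists of digits or uppercase letters between two colons.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the tag format is an assumption based on the question's sample.
public static class TagReader
{
    // Returns the text between ":<tagNumber><optional letter>:" and the next tag
    // (newlines included), or null when the tag is not present.
    public static string ReadTag(string message, string tagNumber)
    {
        string pattern = "^:" + Regex.Escape(tagNumber)
            + @"[A-Z]?:(?<body>.*?)(?=^:[0-9A-Z]+:|\z)";

        // Multiline: ^ matches at each line start. Singleline: . also matches '\n'.
        Match match = Regex.Match(message, pattern,
            RegexOptions.Multiline | RegexOptions.Singleline);
        return match.Success ? match.Groups["body"].Value : null;
    }
}
```

The lazy .*? together with the lookahead keeps the newlines inside the captured body and ends the match right before the next tag line, or at the end of the input when the requested tag is the last one.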

Match Multiline & IgnoreSome

I'm trying to extract some information from a JCL source using regex in C#
Basically, this is a string I can have:
//JOBNAME0 JOB (BLABLABLA),'SOME TEXT',MSGCLASS=YES,ILIKE=POTATOES, GRMBL
// IALSOLIKE=TOMATOES, ANOTHER GARBAGE
// FINALLY=BYE
//OTHER STUFF
So I need to extract the jobname JOBNAME0, the info (BLABLABLA), the description 'SOME TEXT' and the other parms MSGCLASS=YES ILIKE=POTATOES IALSOLIKE=TOMATOES FINALLY=BYE.
I must ignore everything that is after the space ... like GRMBL or ANOTHER GARBAGE
I must continue to next line if my last valid char was a , and stop if it there were none.
So far, I have managed to get the jobname, the info and the description, which was pretty easy. For the other parms, I'm able to get them all and split them, but I don't know how to get rid of the garbage.
Here is my code:
var regex = "//([^\\s]*) JOB (\\([^)]*\\))?,?(\\'[^']*\\')?,?([^,]*[,|\\s|$])*";
Match match2 = Regex.Match(test5, regex,RegexOptions.Singleline);
string CarteJob2 = match2.Groups[0].Value;
string JobName2 = match2.Groups[1].Value;
string JobInfo2 = match2.Groups[2].Value;
string JobDesc2 = match2.Groups[3].Value;
IEnumerable<string> parms = match2.Groups[4].Captures.OfType<Capture>().Select(x => x.Value);
string JobParms2 = String.Join("|", parms);
Console.WriteLine(CarteJob2 + "|");
Console.WriteLine(JobName2 + "|");
Console.WriteLine(JobInfo2 + "|");
Console.WriteLine(JobDesc2 + "|");
Console.WriteLine(JobParms2 + "|");
The output I get is this one:
//JOBNAME0 JOB (BLABLABLA),'SOME TEXT',MSGCLASS=YES,ILIKE=POTATOES, GRMBL
// IALSOLIKE=TOMATOES, ANOTHER GARBAGE
// FINALLY=BYE
//OTHER |
JOBNAME0|
(BLABLABLA)|
'SOME TEXT'|
MSGCLASS=YES,|ILIKE=POTATOES,| GRMBL
// IALSOLIKE=TOMATOES,| ANOTHER GARBAGE
// FINALLY=BYE
//OTHER |
The output I would like to see is:
//JOBNAME0 JOB (BLABLABLA),'SOME TEXT',MSGCLASS=YES,ILIKE=POTATOES, GRMBL
// IALSOLIKE=TOMATOES, ANOTHER GARBAGE
// FINALLY=BYE|
JOBNAME0|
(BLABLABLA)|
'SOME TEXT'|
MSGCLASS=YES|ILIKE=POTATOES|IALSOLIKE=TOMATOES|FINALLY=BYE|
Is there a way to get what I want ?
I think I'd try and do this with two Regex expressions.
The first one to get all the starting information from the beginning of the string - job name, info, description.
The second one to get all the parameters, which all seem to have a simple pattern of <param name>=<param value>.
The first Regex might look like this:
^//(?<job>[\d\w]+)[ ]+JOB[ ]+\((?<info>[\d\w]+)\),'(?<description>[\d\w ]+)'
I don't know if rules permit whitespaces to appear in the job name, info or description - adjust as needed. Also, I'm assuming this is the start of the file using the ^ char. Finally, this Regex has groups already defined, so getting values should be easier in C#.
The second Regex might be something like this:
(?<param>[\w\d]+)=(?<value>[\w\d]+)
Again, grouping is added to help get the parameter names and values.
Hope this helps.
EDIT:
A small tip - you can use the @ sign before a string in C# (a verbatim string) to make it easier to write such Regex patterns, because backslashes no longer need to be doubled. For example:
Regex reg = new Regex(@"(?<param>[\w\d]+)=(?<value>[\w\d]+)");
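The two-regex idea still leaves the garbage and the continuation rule open. One way to handle both is to work line by line and only look at the operand field, meaning the text up to the first space. The sketch below does that. The JobCardParser class and ReadParameters method are hypothetical names, and the sketch assumes that parameter names and values consist of word characters only and that continuation lines start with "//" followed by at least one space.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the JCL layout rules are assumptions taken from the question.
public static class JobCardParser
{
    // First line: job name, optional info and description, then the operand field.
    private static readonly Regex Header = new Regex(
        @"^//(?<job>\w+) +JOB +(?<info>\([^)]*\))?,?(?<description>'[^']*')?,?(?<rest>\S*)");

    // Continuation line: "//", at least one space, then the operand field.
    private static readonly Regex Continuation = new Regex(@"^// +(?<rest>\S*)");

    private static readonly Regex Param = new Regex(@"(?<param>\w+)=(?<value>\w+)");

    public static List<string> ReadParameters(string jcl)
    {
        var result = new List<string>();
        string[] lines = jcl.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length == 0)
        {
            return result;
        }

        int index = 0;
        Match current = Header.Match(lines[0]);
        while (current.Success)
        {
            // \S* ends at the first space, so trailing garbage is never captured.
            string operands = current.Groups["rest"].Value;
            foreach (Match parameter in Param.Matches(operands))
            {
                result.Add(parameter.Value);
            }

            // A trailing comma means the statement continues on the next line.
            if (!operands.EndsWith(",", StringComparison.Ordinal) || ++index >= lines.Length)
            {
                break;
            }
            current = Continuation.Match(lines[index]);
        }
        return result;
    }
}
```

Joining the result with String.Join("|", ...) gives the parameter line in the shape the question asks for. The job name, info and description are available from the job, info and description groups of the Header match.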
