What is the best way to parse a string between two characters? (C#)

I have this string:
"Network adapter 'Realtek PCIe GBE Family Controller' on local host"
What is the best way to return only the text between the single quotes (Realtek PCIe GBE Family Controller)?

If you're comfortable with regular expressions, you could use a pattern like:
'[^']*'
It matches the quoted text, quotes included; wrap [^']* in parentheses to capture only the text between the single quotes.

You can use regular expressions, like this:
var s = "hello 'world' hehe";
var m = Regex.Match(s, "'([^']*)'");
string res = null;
if (m.Success) {
    res = m.Groups[1].ToString();
}
Console.WriteLine(res);
The key to the solution is this regular expression:
'([^']*)'
It starts the match when it finds a single quote, and continues until it finds the closing quote, capturing everything in between. The captured group is then retrieved through the Regex API. Note that the capturing groups that you define start at index 1; index zero is reserved to mean "the entire match".
Take a look at the demo on ideone.

You can use the Substring() method to chop it up.
string tempStr = str.Substring(str.IndexOf("'") + 1);
string yourStr = tempStr.Substring(0, tempStr.IndexOf("'"));
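The two Substring() calls above throw if the input has no quote or only one. A minimal sketch of a guarded version follows; QuoteParser and Between are names made up for this illustration, not part of .NET, and returning null for a missing pair is an assumption about what the caller wants.

```csharp
using System;

static class QuoteParser
{
    // Hypothetical helper: returns the text between the first two occurrences
    // of `delimiter`, or null when the input has no complete pair.
    public static string Between(string input, char delimiter)
    {
        int start = input.IndexOf(delimiter);
        if (start < 0) return null;                      // no opening delimiter
        int end = input.IndexOf(delimiter, start + 1);
        if (end < 0) return null;                        // no closing delimiter
        return input.Substring(start + 1, end - start - 1);
    }

    static void Main()
    {
        string s = "Network adapter 'Realtek PCIe GBE Family Controller' on local host";
        Console.WriteLine(QuoteParser.Between(s, '\''));
        // prints: Realtek PCIe GBE Family Controller
    }
}
```

Searching for the closing quote from start + 1 avoids building the intermediate tempStr.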

Related

C# Regex to Get file name without extension?

I want to use regex to get a filename without extension. I'm having trouble getting regex to return a value. I have this:
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var name = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)").Value;
In this case, name always comes back as C:\PERSONAL\TEST\TESTFILE.PDF. What am I doing wrong? I think my search pattern is correct.
(I am aware that I could use Path.GetFileNameWithoutExtension(path); but I specifically want to try using regex.)

You need Groups[1].Value
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if (match.Success)
{
    var name = match.Groups[1].Value;
}
match.Value returns the entire match.
match.Groups[0].Value always has the same value as match.Value.
match.Groups[1].Value returns the value of the first capture group.
For example:
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if (match.Success)
{
    Console.WriteLine(match.Value);
    // the substring covered by the whole match
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[0].Value);
    // always the same as match.Value
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[1].Value);
    // the first capture group, which is (.+?) in this case
    // Output: C:\PERSONAL\TEST\TESTFILE
    Console.WriteLine(match.Groups[2].Value);
    // the second capture group, which is (\.[^\.]+$|$) in this case
    // Output: .PDF
}

Since the data is on the right side of the string, tell the regex engine to work from the end of the string toward the beginning by using the RightToLeft option. This can reduce the processing time and also shortens the pattern needed.
The pattern below is still written left to right. It says: capture everything that is not a \ character (so the match cannot extend past the last backslash), followed by a period.
Regex.Match(@"C:\PERSONAL\TEST\TESTFILE.PDF",
            @"([^\\]+)\.",
            RegexOptions.RightToLeft)
     .Groups[1].Value
Prints out
TESTFILE

Try this:
.*(?=[.][^OS_FORBIDDEN_CHARACTERS]+$)
For Windows:
OS_FORBIDDEN_CHARACTERS = :\/\\\?"><\|
This is a slight modification of:
Regular expression get filename without extention from full filepath
If you are fine to match forbidden characters then simplest regex would be:
.*(?=[.].*$)

Can be a bit shorter and greedier:
var name = Regex.Replace(@"C:\PERS.ONAL\TEST\TEST.FILE.PDF", @".*\\(.*)\..*", "$1"); // "TEST.FILE"

C# RegEx - get only first match in string

I've got an input string that looks like this:
level=<device[195].level>&name=<device[195].name>
I want to create a RegEx that will parse out each of the <device> tags, for example, I'd expect two items to be matched from my input string: <device[195].level> and <device[195].name>.
So far I've had some luck with this pattern and code, but it always finds both of the device tags as a single match:
var pattern = "<device\\[[0-9]*\\]\\.\\S*>";
Regex rgx = new Regex(pattern);
var matches = rgx.Matches(httpData);
The result is that matches will contain a single result with the value <device[195].level>&name=<device[195].name>
I'm guessing there must be a way to 'terminate' the pattern, but I'm not sure what it is.

Use non-greedy quantifiers:
<device\[\d+\]\.\S+?>
Also, use verbatim strings for escaping regexes, it makes them much more readable:
var pattern = @"<device\[\d+\]\.\S+?>";
As a side note, I guess in your case using \w instead of \S would be more in line with what you intended, but I left the \S because I can't know that.
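To see the difference the lazy quantifier makes, here is a small sketch that runs both a greedy and a lazy version of the pattern against the input from the question. DeviceTags and Find are names invented for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DeviceTags
{
    // Hypothetical helper: returns the text of every match of `pattern` in `input`.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        string data = "level=<device[195].level>&name=<device[195].name>";

        // Greedy: \S+ runs to the last '>' in the string, so both tags merge into one match.
        string[] greedy = Find(data, @"<device\[\d+\]\.\S+>");
        Console.WriteLine(greedy.Length);   // 1

        // Lazy: \S+? stops at the first '>' it can, so each tag is its own match.
        string[] lazy = Find(data, @"<device\[\d+\]\.\S+?>");
        Console.WriteLine(lazy.Length);     // 2
        foreach (string tag in lazy)
            Console.WriteLine(tag);
    }
}
```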

It depends on how much of the structure of the angle blocks you need to match, but you can do:
"\\<device.+?\\>"

I want to create a RegEx that will parse out each of the <device> tags
I'd expect two items to be matched from my input string:
1. <device[195].level>
2. <device[195].name>
This should work. Get the matched group from index 1
(<device[^>]*>)
Live demo
String literals for use in programs:
@"(<device[^>]*>)"

Change your repetition operator and use \w instead of \S
var pattern = @"<device\[[0-9]+\]\.\w+>";
String s = @"level=<device[195].level>&name=<device[195].name>";
foreach (Match m in Regex.Matches(s, @"<device\[[0-9]+\]\.\w+>"))
    Console.WriteLine(m.Value);
Output
<device[195].level>
<device[195].name>

Use named match groups and create a linq entity projection. There will be two matches, thus separating the individual items:
string data = "level=<device[195].level>&name=<device[195].name>";
string pattern = @"
(?<variable>[^=]+) # get the variable name
(?:=<device\[) # static '=<device'
(?<index>[^\]]+) # device number index
(?:]\.) # static ].
(?<sub>[^>]+) # Get the sub command
(?:>&?) # Match but don't capture the > and possible &
";
// IgnorePatternWhitespace allows whitespace and # comments inside the pattern; it does not change what is matched.
var items = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
                 .OfType<Match>()
                 .Select(mt => new
                 {
                     Variable = mt.Groups["variable"].Value,
                     Index = mt.Groups["index"].Value,
                     Sub = mt.Groups["sub"].Value
                 })
                 .ToList();
items.ForEach(itm => Console.WriteLine("{0}:{1}:{2}", itm.Variable, itm.Index, itm.Sub));
/* Output
level:195:level
name:195:name
*/

Regex replacing inside of

Well, I have this code:
StreamReader sr = new StreamReader(@"main.cl", true);
String str = sr.ReadToEnd();
Regex r = new Regex(@"&");
string[] line = r.Split(str);
foreach (string val in line)
{
    string Change = val.Replace("puts","System.Console.WriteLine()");
    Console.Write(Change);
}
As you can see, I'm trying to replace puts (content) with Console.WriteLine(content), but that needs regular expressions and I haven't found a good article on how to do this.
Basically, taking * as the value that is coming, I'd like to do this:
string Change = val.Replace("puts *","System.Console.WriteLine(*)");
Then, if I receive:
puts "Hello World";
I want to get:
System.Console.WriteLine("Hello World");

You need to use Regex.Replace to capture part of the input by using a capturing group and include the captured match into the output. Example:
Regex.Replace(
"puts 'foo'", // input
"puts (.*)", // .* means "any number of characters"
"System.Console.WriteLine($1)") // $1 stands for whatever (.*) matched
If the input always ends in a semicolon you would want to move that semicolon outside the WriteLine parens. One way to do that is:
Regex.Replace(
"puts 'foo';", // input
"puts (.*);", // ; outside parens -- now it's not captured
"System.Console.WriteLine($1);") // manually adding the fixed ; at the end
If you intend to adapt these examples it's a good idea to consult a technical reference first; you can find a very good one here.

What you want to do is look at Grouping Expressions. Give the following a try
Regex.Replace(val, "puts (.*);", "System.Console.WriteLine(${1});");
Note that you can also name your groups, as opposed to using their indexes for replacement. You can do this like so:
Regex.Replace(val, "puts (?<str>.*);", "System.Console.WriteLine(${str});");

Simple regex question C#

I need to match the string that is shown in the window displayed below :
8% of setup_av_free.exe from software-files-l.cnet.com Completed
98% of test.zip from 65.55.72.119 Completed
[numeric]%of[filename]from[hostname | IP address]Completed
I have written the regex pattern halfway
if (Regex.IsMatch(text, @"[\d]+%[\s]of[\s](.+?)(\.[^.]*)[\s]from[\s]"))
    MessageBox.Show(text);
and I now need to integrate the following regex into my code above
ValidIpAddressRegex = "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$";
ValidHostnameRegex = "^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$";
The two regexes were taken from this link. They work well when I use Regex.IsMatch to match "123.123.123.123" and "software-files-l.cnet.com". However, I cannot get them to work when I integrate both of them into my existing regex code. I tried several variants without success. Can someone guide me on how to integrate the two regexes into my existing code? Thanks in advance.

You can certainly combine all these regular expressions into one, but I'd recommend against it. Consider this method: first it checks whether your input text has the correct form overall, then it checks whether the "from" part is an IP address or a hostname.
bool CheckString(string text) {
    const string ValidIpAddressRegex = @"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$";
    const string ValidHostnameRegex = @"^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$";
    var match = Regex.Match(text, @"[\d]+%[\s]of[\s](.+?)(\.[^.]*)[\s]from[\s](\S+)");
    if (!match.Success)
        return false;
    string address = match.Groups[3].Value;
    return Regex.IsMatch(address, ValidIpAddressRegex) ||
           Regex.IsMatch(address, ValidHostnameRegex);
}
It does what you want and is much more readable than a single monster-sized regular expression. Unless you are going to call this method millions of times in a loop, there is no reason to be concerned about it being less performant than a single regex.
Also, in case you aren't aware of it, the brackets around \d and \s aren't necessary.

The reason those two regexes do not match your string is that they start with ^ and end with $.
^ matches the start of the string (or of a line if the multiline option is enabled)
$ matches the end of the string (or of a line if the multiline option is enabled)
When you test them on an address alone, the address is the whole string, so they match. In your real text the address sits in the middle of the string, so they do not.
Try removing the ^ at the very beginning and the $ at the very end.
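The effect of the anchors can be checked with a short sketch. It uses the IP address pattern from the question with the ^ and $ stripped, embedded in a simplified line pattern; AnchorDemo, IpBody, IsWholeStringIp and ExtractIp are names invented for this example, and the line pattern is an assumption about the message format rather than a full validator.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // One octet (0-255), taken from ValidIpAddressRegex in the question.
    const string Octet = "(?:[0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])";

    // The IP pattern with the leading ^ and trailing $ removed.
    public static readonly string IpBody = "(?:" + Octet + @"\.){3}" + Octet;

    // Anchored: succeeds only when the whole input is an IP address.
    public static bool IsWholeStringIp(string text) =>
        Regex.IsMatch(text, "^" + IpBody + "$");

    // Unanchored and embedded in the surrounding message; returns null if no IP is found.
    public static string ExtractIp(string text)
    {
        Match m = Regex.Match(text, @"\d+% of \S+ from (" + IpBody + ") Completed");
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        string line = "98% of test.zip from 65.55.72.119 Completed";
        Console.WriteLine(IsWholeStringIp("65.55.72.119")); // True
        Console.WriteLine(IsWholeStringIp(line));           // False: ^ and $ need the IP to be the whole string
        Console.WriteLine(ExtractIp(line));                 // 65.55.72.119
    }
}
```

Keeping the literal " Completed" after the group matters here: without something following it, the unanchored octet alternation could stop early on a shorter alternative.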

Here you go.
^[\d]+%[\s+]of[\s+](.+?)(\.[^.]*)[\s+]from[\s+]((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])|((([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])))[\s+]Completed
Remove the ^ and $ characters from the ValidIpAddressRegex and ValidHostnameRegex samples above, and add them separated by the or character (|) enclosed by parentheses.

You could use this; it should work for all cases. I might have accidentally deleted a character while formatting, so let me know if it doesn't work.
string captureString = "8% of setup_av_free.exe from software-files-l.cnet.com Completed";
Regex reg = new Regex(@"(?<perc>\d+)% of (?<file>\w+\.\w+) from (?<host>" +
    @"(\d+\.\d+\.\d+\.\d+)|(((https?|ftp|gopher|telnet|file|notes|ms-help):" +
    @"((//)|(\\\\))+)?[\w\d:#@%/;$()~_?\+-=\\\.&]*)) Completed");
Match m = reg.Match(captureString);
string perc = m.Groups["perc"].Value;
string file = m.Groups["file"].Value;
string host = m.Groups["host"].Value;

Regular expression to retrieve everything before first slash

I need a regular expression to get the first part of a string, before the first backslash (\).
For example in the following:
C:\MyFolder\MyFile.zip
The part I need is "C:"
Another example:
somebucketname\MyFolder\MyFile.zip
I would need "somebucketname"
I also need a regular expression to retrieve the "right hand" part of it, so everything after the first slash (excluding the slash.)
For example
somebucketname\MyFolder\MyFile.zip
would return
MyFolder\MyFile.zip.

You don't need a regular expression (it would incur too much overhead for a simple problem like this), try this instead:
yourString = yourString.Substring(0, yourString.IndexOf('\\'));
And for finding everything after the first slash you can do this:
yourString = yourString.Substring(yourString.IndexOf('\\') + 1);
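One caveat with the first line above: IndexOf returns -1 when there is no backslash, and Substring(0, -1) throws. A minimal sketch that guards against that case follows; PathParts and SplitAtFirstBackslash are made-up names, and treating a string without a backslash as "all head, empty tail" is an assumption.

```csharp
using System;

static class PathParts
{
    // Hypothetical helper: splits at the first backslash.
    // With no backslash, the whole input is the head and the tail is empty.
    public static (string Head, string Tail) SplitAtFirstBackslash(string path)
    {
        int i = path.IndexOf('\\');
        return i < 0
            ? (path, "")
            : (path.Substring(0, i), path.Substring(i + 1));
    }

    static void Main()
    {
        var a = SplitAtFirstBackslash(@"C:\MyFolder\MyFile.zip");
        Console.WriteLine(a.Head);   // C:
        Console.WriteLine(a.Tail);   // MyFolder\MyFile.zip

        var b = SplitAtFirstBackslash(@"somebucketname\MyFolder\MyFile.zip");
        Console.WriteLine(b.Head);   // somebucketname
        Console.WriteLine(b.Tail);   // MyFolder\MyFile.zip
    }
}
```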

This problem can be handled quite cleanly with the .NET regular expression engine. What makes .NET regular expressions really nice is the ability to use named group captures.
Using a named group capture allows you to define a name for each part of the regular expression you are interested in "capturing", which you can reference later to get at its value. The syntax for a named group is (?<name>Some Regex Expression). Remember also to include the using System.Text.RegularExpressions; directive when using regular expressions in your project.
Enjoy!
//Regular expression
string _regex = @"(?<first_part>[a-zA-Z:0-9]+)\\{1}(?<second_part>(.)+)";
//Example 1
{
    Match match = Regex.Match(@"C:\MyFolder\MyFile.zip", _regex, RegexOptions.IgnoreCase);
    string firstPart = match.Groups["first_part"].Captures[0].Value;
    string secondPart = match.Groups["second_part"].Captures[0].Value;
}
//Example 2
{
    Match match = Regex.Match(@"somebucketname\MyFolder\MyFile.zip", _regex, RegexOptions.IgnoreCase);
    string firstPart = match.Groups["first_part"].Captures[0].Value;
    string secondPart = match.Groups["second_part"].Captures[0].Value;
}

You are aware that .NET's file handling classes do this a lot more elegantly, right?
For example, in your last example you could do:
FileInfo fi = new FileInfo(@"somebucketname\MyFolder\MyFile.zip");
string nameOnly = fi.Name;
For the first example you could do:
FileInfo fi = new FileInfo(@"C:\MyFolder\MyFile.zip");
string driveOnly = fi.Directory.Root.Name.Replace(@"\", "");

This matches all non \ chars
[^\\]*

Here is the regular expression solution using the lazy (non-greedy) quantifier '*?'. Note that the match includes the trailing backslash.
var pattern = "^.*?\\\\";
var m = Regex.Match("c:\\test\\gimmick.txt", pattern);
MessageBox.Show(m.Captures[0].Value);

Split on slash, then get first item
words = s.Split('\\');
words[0]
