Parse a string in C# and replace content - c#

I have a string and need to replace some content based on certain substrings appearing in the string. e.g. a sample string might be
(it.FirstField = "fred" AND it.SecondField = True AND it.ThirdField = False AND it.FifthField = True)
and I want to transform it to:
(it.FirstField = "fred" AND it.SecondField = 'Y' AND it.ThirdField = 'N' AND it.FifthField = True)
i.e. if the substring appears in the string, I want to change the True to 'Y' and the False to 'N', but leave any other True/False values intact.
I have an array of substrings to look for:
string[] booleanFields = { "SecondField", "ThirdField", "FourthField" };
I can use something like if (booleanFields.Any(s => inputString.Contains(s))) to find out if the string contains any of the keywords, but what's the best way to perform the replacement?
Thanks.

In the words of Clippy: it looks like you are trying to parse SQL, would you like some help with that?
You can try and do this via string manipulation, but you are going to run into problems - think about what would happen if you replaced "fred" with something else, perhaps:
(it.FirstField = "it.SecondField = True" AND it.SecondField = True)
I'm loath to recommend it (because it's probably quite difficult), but the correct way to do this is to parse the SQL and manipulate the parsed expression - see Parsing SQL code in C# for what looks like an approach that could make this relatively straightforward.

It's probably not the best answer due to the two very similar lines (one for true/one for false), but this works and is fairly neat for a Regex (with .Dump() ready for LINQPad paste).
It does however assume that you want to replace every ".FieldName = True" within your content (which will include cases where this format is enclosed in quotes as a string value).
void Main()
{
List<string> booleanFields = new List<string> { "SecondField", "ThirdField", "FourthField" };
string s = @"(it.FirstField = ""fred"" AND it.SecondField = True AND it.ThirdField = False AND it.FifthField = True)";
booleanFields.ForEach(bf => s = Regex.Replace(s, String.Format(@"[.]{0}[ ]*=[ ]*True", bf), String.Format(".{0} = 'Y'", bf)));
booleanFields.ForEach(bf => s = Regex.Replace(s, String.Format(@"[.]{0}[ ]*=[ ]*False", bf), String.Format(".{0} = 'N'", bf)));
s.Dump();
}
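The two near-identical lines above could be folded into a single pass. The following is only a sketch under a few assumptions: the class and method names (BooleanFieldRewriter, Rewrite) are made up for illustration, field names are escaped with Regex.Escape in case they contain regex metacharacters, and a trailing \b stops "True" from matching inside a longer word. It shares the limitation already mentioned: text inside quoted string values is rewritten too.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BooleanFieldRewriter
{
    // Hypothetical helper: rewrites ".Field = True/False" to ".Field = 'Y'/'N'"
    // for the listed fields only, in one Regex.Replace call.
    public static string Rewrite(string input, string[] booleanFields)
    {
        // Build an alternation such as "SecondField|ThirdField|FourthField".
        string names = string.Join("|", booleanFields.Select(f => Regex.Escape(f)));

        // "field" captures ".FieldName"; "value" captures True or False.
        string pattern = @"(?<field>\.(?:" + names + @"))\s*=\s*(?<value>True|False)\b";

        // The match evaluator picks 'Y' or 'N' depending on the captured value.
        return Regex.Replace(input, pattern, m =>
            m.Groups["field"].Value + " = " +
            (m.Groups["value"].Value == "True" ? "'Y'" : "'N'"));
    }
}
```

Calling BooleanFieldRewriter.Rewrite(s, new[] { "SecondField", "ThirdField", "FourthField" }) on the sample string leaves it.FirstField and it.FifthField untouched.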

Related

trim string before character but still keep the remain part after it

So I have this string which I have to trim and manipulate a little.
My string example:
string test = "studentName_123.pdf";
Now, what I want to do is somehow remove only the _123 part, so that at the end I have studentName.pdf
What I have tried:
string test_extracted = test.Substring(0, test.LastIndexOf("_") )+".pdf";
This works, but I don't want to add the ".pdf" suffix at the end of the string manually, because I can have strings that are not PDFs, e.g. studentName.docx, studentName.png.
So basically I just want the "_123" part removed while keeping the remaining part after it.
I think this might help you:
string test = "studentName_123.pdf";
string test_extracted = test.Substring(0, test.LastIndexOf("_") )+ test.Substring(test.LastIndexOf("."),test.Length - test.LastIndexOf(".") );
Using Remove(int startIndex, int count):
string test = "studentName_123.pdf";
string test_extracted = test.Remove(test.LastIndexOf("_"), test.LastIndexOf(".") - test.LastIndexOf("_"));
Sounds like you mean something like this?
string extension = Path.GetExtension(test);
string pdfName = Path.GetFileNameWithoutExtension(test).Split('_')[0];
string fullName = pdfName + extension;
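As a sketch of how the Path-based idea above might be wrapped up, the helper below uses LastIndexOf('_') instead of Split('_')[0], so a name such as first_last_123.pdf keeps its earlier underscore, and a name with no underscore is returned unchanged. FileNameHelper and StripSuffix are hypothetical names chosen for this example.

```csharp
using System;
using System.IO;

public static class FileNameHelper
{
    // Hypothetical helper: drops everything from the last underscore up to
    // (but not including) the extension, whatever the extension is.
    public static string StripSuffix(string fileName)
    {
        string extension = Path.GetExtension(fileName);           // e.g. ".pdf"
        string stem = Path.GetFileNameWithoutExtension(fileName); // e.g. "studentName_123"

        int underscore = stem.LastIndexOf('_');
        string trimmedStem = underscore >= 0 ? stem.Substring(0, underscore) : stem;

        return trimmedStem + extension;
    }
}
```

With this, FileNameHelper.StripSuffix("studentName_123.pdf") gives studentName.pdf, and the same call works for .docx or .png names without hard-coding the suffix.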
Since you know the value you will always be replacing in your strings ("_123" in your example), just use the Replace method and replace it with an empty string, since the method expects two arguments:
string test_extracted = test.Replace("_123", "");
This could be solved with a regular expression like this
(\w*)_.*(\.\w*) where the first capture group (\w*) matches everything before the underscore and the second group (\.\w*) matches the file extensions.
Lastly we just have to concatenate the groups without the part in between, like so:
string test = "studentName_123.pdf";
var regex = Regex.Match(test, @"(\w*)_.*(\.\w*)");
string newString = regex.Groups[1].Value + regex.Groups[2].Value;

C# Regex matching using a variable

I am most familiar with PowerShell and have recently moved into using C# as my primary language. In PowerShell it's possible to do the following
$var1 = "abc"
"abc" -match "$var1"
This results in a true statement.
I would like to be able to do the same thing in C#. I know that you can use interpolation with C# and I have tried various ways of using Regex.Match() with no luck.
Example:
string toMatch = "abc";
var result = Regex.Match("abc", $"{{toMatch}}");
var a = Regex.Match("abc", $"{{{toMatch}}}");
var b = Regex.Match("abc", $"{toMatch}");
var c = Regex.Match(toMatch,toMatch);
None of the above seems to work. I am not even sure if what I am trying to do is possible in C#. Ideally I'd like to be able to use a combination of variables and Regex for a match. Something even like this Regex.Match(varToMatch,$"{{myVar}}\\d+\\w{4}")
edit:
After reading some answers here and trying some code out it appears that my real issue is trying to match up against a directory path. Something like "C:\temp\abcfile". For example:
string path = @"C:\temp\abc";
string path2 = @"C:\temp\abc";
string fn = path.Split('\\').LastOrDefault();
path = Regex.Escape(path);
path2 = Regex.Escape(path2);
Regex rx = new Regex(path);
var a = Regex.Match(path.Split('\\').Last().ToString(), $"{fn}");
//Example A works if I split and match on just the file name.
var b = Regex.Match(path, $"{rx}");
//Example B does not work, even though it's a regex object.
var c = Regex.Match(path, $"{{path}}");
//Example C I've tried one, two, and three sets of braces with no luck
var d = Regex.Match(path,path);
// Even a direct variable to variable match returns 0 results.
You seem to have it right in the last example, so perhaps the issue is that you're expecting a bool result instead of a Match result?
Hopefully this small example helps:
int a = 123;
string b = "abc";
string toMatch = "123 and abc";
var result = Regex.Match(toMatch, $"{a}.*{b}");
if (result.Success)
{
Console.WriteLine("Found a match!");
}
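For the directory-path case in the question's edit, one likely culprit is that both the input and the pattern were passed through Regex.Escape. A minimal sketch, assuming the goal is to find a literal path inside some text: escape only the pattern and leave the input alone. PathMatchDemo and its method names are illustrative, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathMatchDemo
{
    // Escape only the pattern side, so backslashes in the literal are
    // treated as plain characters rather than regex escapes.
    public static bool ContainsLiteral(string input, string literal)
    {
        return Regex.IsMatch(input, Regex.Escape(literal));
    }

    // Reproduces the situation in the question: the input was escaped too,
    // so it now holds doubled backslashes that the pattern does not expect.
    public static bool BothEscaped(string path)
    {
        string escaped = Regex.Escape(path);
        return Regex.IsMatch(escaped, escaped);
    }
}
```

For string path = @"C:\temp\abc", ContainsLiteral(path, path) reports a match while BothEscaped(path) does not. A variable can be combined with further regex syntax the same way, for example Regex.Escape(path) + @"\d+".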

How to strip a string from the point a hyphen is found within the string C#

I'm currently trying to strip a string of data that may contain the hyphen symbol.
E.g. Basic logic:
string stringin = "test - 9894"; OR Data could be == "test";
if (string contains a hyphen "-"){
Strip stringin;
output would be "test" deleting from the hyphen.
}
Console.WriteLine(stringin);
The current C# code I'm trying to get to work is shown below:
string Details = "hsh4a - 8989";
var regexItem = new Regex("^[^-]*-?[^-]*$");
string stringin;
stringin = Details.ToString();
if (regexItem.IsMatch(stringin)) {
stringin = stringin.Substring(0, stringin.IndexOf("-") - 1); //Strip from the ending chars and - once - is hit.
}
Details = stringin;
Console.WriteLine(Details);
But it throws an error when the string does not contain any hyphens.
How about just doing this?
stringin.Split('-')[0].Trim();
You could even cap the number of substrings using the Split overload that takes a count; a count of 2 stops splitting after the first hyphen.
stringin.Split(new[] { '-' }, 2)[0].Trim();
Your regex is asking for "zero or one repetition of -", which means that it matches even if your input does NOT contain a hyphen. Thereafter you do this
stringin.Substring(0, stringin.IndexOf("-") - 1)
Which throws an ArgumentOutOfRangeException: there is no hyphen to find, so IndexOf returns -1 and the requested length becomes -2.
Make a simple change to your regex and it works with or without - ask for "one or more hyphens":
var regexItem = new Regex("^[^-]*-+[^-]*$");
here -------------------------^
It seems that you want the substring after the dash ('-') if the original string contains one, or the original string if it doesn't have a dash.
If that's your case:
String Details = "hsh4a - 8989";
Details = Details.Substring(Details.IndexOf('-') + 1);
I wouldn't use regex for this case if I were you, it makes the solution much more complex than it can be.
For a string that I am sure will have no more than a couple of dashes I would use this code, because it is a one-liner and very simple:
string str= entryString.Split(new [] {'-'}, StringSplitOptions.RemoveEmptyEntries)[0];
If you know that a string might contain a large number of dashes, this approach is not recommended: it creates a separate string for every piece, although you are looking just for the first one. In that case the solution would look something like this:
int firstDashIndex = entryString.IndexOf("-");
string str = firstDashIndex > -1? entryString.Substring(0, firstDashIndex) : entryString;
You don't need a regex for this. A simple IndexOf call will give you the index of the hyphen, then you can clean it up from there.
This is also a great place to start writing unit tests as well. They are very good for stuff like this.
Here's what the code could look like :
string inputString = "ho-something";
string outPutString = inputString;
var hyphenIndex = inputString.IndexOf('-');
if (hyphenIndex > -1)
{
outPutString = inputString.Substring(0, hyphenIndex);
}
return outPutString;
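To make that snippet easy to unit test, it could be wrapped in a small method. This is a sketch only: HyphenStripper and BeforeHyphen are made-up names, and the TrimEnd call is an added assumption so that "test - 9894" yields "test" without the trailing space.

```csharp
using System;

public static class HyphenStripper
{
    // Returns the text before the first hyphen, or the whole input
    // when no hyphen is present (IndexOf returns -1 in that case).
    public static string BeforeHyphen(string input)
    {
        int hyphenIndex = input.IndexOf('-');
        return hyphenIndex > -1
            ? input.Substring(0, hyphenIndex).TrimEnd()
            : input;
    }
}
```

Typical cases to cover in tests are an input with a hyphen ("hsh4a - 8989"), one without ("test"), and one that starts with a hyphen ("-abc"), which should come back as an empty string.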

Get the last path of URL

I'm using the following code, which works fine for most of the services, but sometimes the last segment of the URL is User;v=2;mp and I need to get just User. How should I handle it?
I need a general solution, since I guess other URLs may contain different special characters.
If the URL is
https://ldmrrt.ct/odata/GBSRM/User;v=2;mp
string serviceName = _uri.Segments.LastOrDefault();
If the URL is https://ldmrrt.ct/odata/GBSRM/User it works fine.
Just replace:
string serviceName = _uri.Segments.LastOrDefault();
With:
string serviceName = _uri.Segments.LastOrDefault().Split(new[]{';'}).First();
If you need something more flexible, where you can specify what characters to include or skip, you could do something like this (slightly messy, you should probably extract parts of this as separate variables, etc):
// Note the _ and - characters in this example:
Uri _uri = new Uri("https://ldmrrt.ct/odata/GBSRM/User_ex1-ex2;v=2;mp");
// This will return a string: "User_ex1-ex2"
string serviceName =
new string(_uri.Segments
.LastOrDefault()
.TakeWhile(c => Char.IsLetterOrDigit(c)
|| (new char[]{'_', '-'}).Contains(c))
.ToArray());
Update, in response to what I understand to be a question below :) :
You could just use a String.Replace() for that, or you could use filtering by doing something like this:
// Will return: "Userex1ex2v=2mp"
string serviceName =
new string(_uri.Segments
.LastOrDefault()
.Where(c =>
!Char.IsPunctuation(c) // Ignore punctuation
// ..and ignore any "_" or "-":
&& !(new char[]{'_', '-'}).Contains(c))
.ToArray());
If you use this in production, mind you, be sure to clean it up a little, e.g. by splitting into several variables, and defining your char[]-array prior to usage.
In your case you can try to split a "User;v=2;mp" string.
string [] split = words.Split(new Char [] { ';' });
Returns a string array that contains the substrings in this instance that are delimited by elements of a specified Unicode character array.
split.First();
or:
split[0];

Regex for Removing All the repetition of a string and assign to an array

I have the following text in a file:
"SHOP_ORDER001","SHOP_ORDER002","SHOP_ORDER003","SHOP_ORDER004","SHOP_ORDER005"
Now I am getting the values by reading the file and assigning them to an array using Split:
String orderValue = "";
string[] orderArray;
orderValue = File.ReadAllText(@"C:\File.txt");
orderArray = orderValue.Split(',');
But I am getting the values as :
I need the Values in Array as "ORDER001","ORDER002","ORDER003"
The \" you see is just added by the debugger visualizer for strings (because the quote is a special character and needs to be escaped to avoid confusion); don't worry, the backslashes are not in your orderArray.
In case you want to remove quotes too so that your array will be:
SHOP_ORDER001
SHOP_ORDER002
...
Just use this (with LINQ):
var orderArray = orderValue.Split(',').Select(x => x.Trim('"')).ToArray();
By the way String.Split isn't very robust unless you're sure each field will never contain a comma.
EDIT
To answer the point you added in the comments if you need to remove SHOP_ just write this:
var orderArray = orderValue.Split(',')
.Select(x => x.Trim('"').Substring("SHOP_".Length))
.ToArray();
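The Substring call above assumes every field really starts with SHOP_. A slightly more defensive sketch, with hypothetical names (OrderParser, Parse), checks the prefix first and also trims stray whitespace such as a trailing newline from the file:

```csharp
using System;
using System.Linq;

public static class OrderParser
{
    // Splits a comma-separated line, strips surrounding quotes, and removes
    // the given prefix only from fields that actually start with it.
    public static string[] Parse(string line, string prefix)
    {
        return line.Split(',')
            .Select(field => field.Trim().Trim('"'))
            .Select(field => field.StartsWith(prefix, StringComparison.Ordinal)
                ? field.Substring(prefix.Length)
                : field)
            .ToArray();
    }
}
```

Usage would be along the lines of OrderParser.Parse(orderValue, "SHOP_"). As noted above, splitting on commas is only safe while no field can itself contain a comma.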
use this regex
var res = Regex.Matches(orderValue, @"(?<=""SHOP_)[^""]+?(?="")");
You could use this:
string[] result = Regex.Split(orderValue, "(?:^\"SHOP_)|(?:\",\"SHOP_)|(?:\"$)");
However you will have to skip the first and last items in the resulting array as they will always be empty strings.
Silly question but why don't you just do
.Replace("SHOP_", "");
