I have a string which is somewhat like this:
string data = "I have a {apple} and a {orange}";
I need to extract the content inside each {}; there could be, say, 10 of them.
I tried this:
string[] split = data.Split(new char[] { '{', '}' }, StringSplitOptions.RemoveEmptyEntries);
The problem is that my data is dynamic, and I won't know where the {<>} placeholders will appear. It could also be something like this:
Give {Pen} {Pencil}
I guess the above method wouldn't work, so I would really like to know a dynamic way to do this. Any input would be really helpful.
Thanks and Regards
Try this:
string data = "I have a {apple} and a {orange}";
Regex rx = new Regex(@"\{(.*?)\}");
foreach (Match item in rx.Matches(data))
{
Console.WriteLine(item.Groups[1].Value);
}
You need to use Regex to get all values you need.
If the string between {} does not contain nested {} you can use a regex to perform this task:
string data = "I have a {apple} and a {orange}";
Regex reg = new Regex(@"\{(?<Name>[A-Za-z0-9]*)\}");
var matches = reg.Matches(data);
foreach (var m in matches.OfType<Match>())
{
Console.WriteLine($"Found {m.Groups["Name"].Value} at {m.Index}");
}
To replace the strings between {} you can use Regex.Replace:
reg.Replace(data, m => m.Groups["Name"].Value + "_")
// Will produce "I have a apple_ and a orange_"
To get the rest of the string, you can use Regex.Split:
Regex reg2 = new Regex(@"\{[A-Za-z0-9]*\}");
var result = reg2.Split(data);
// will contain "I have a ", " and a ", "", you might want to remove ""
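If the goal is to swap each {name} for a real value, the same pattern can feed a lookup. Below is a minimal sketch, assuming the names contain only letters and digits and are never nested; PlaceholderDemo, ExtractNames and Fill are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Matches {name} where name is letters or digits; nested braces are not supported.
    private static readonly Regex Placeholder = new Regex(@"\{(?<Name>[A-Za-z0-9]+)\}");

    // Returns the names found between braces, in order of appearance.
    public static List<string> ExtractNames(string text)
    {
        var names = new List<string>();
        foreach (Match m in Placeholder.Matches(text))
        {
            names.Add(m.Groups["Name"].Value);
        }
        return names;
    }

    // Replaces each {name} with its value from the dictionary;
    // names that are not in the dictionary are left untouched.
    public static string Fill(string text, IDictionary<string, string> values)
    {
        return Placeholder.Replace(text, m =>
        {
            string name = m.Groups["Name"].Value;
            return values.TryGetValue(name, out string value) ? value : m.Value;
        });
    }
}
```

Calling PlaceholderDemo.ExtractNames("Give {Pen} {Pencil}") should give the two names Pen and Pencil, wherever the braces appear in the text.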
As I understand, you want to split that string into parts like this:
I have a
{apple}
and a
{orange}
And then you want to go over those parts and do something with them, and that something is different depending on whether part is enclosed in {} or not. If so - you need Regex.Split:
string data = "I have a {apple} and a {orange}";
var parts = Regex.Split(data, @"(\{.*?\})");
foreach (var part in parts) {
if (part.StartsWith("{") && part.EndsWith("}")) {
var trimmed = part.TrimStart('{').TrimEnd('}');
// "apple" and "orange" go here
// do something with {} part
}
else {
// "I have a " and " and a " go here
// do something with other part
}
}
I need to find out whether a string contains one of the exact words in a list.
E.g.:
List<string> KeyWords = new List<string>(){"Test","Re Test","ACK"};
String s1 = "Please give the Test";
String s2 = "Please give Re Test";
String s3 = "Acknowledge my work";
Now, when I use KeyWords.Where(x=>x.Contains(s1)) it gives me a match for s1, which is correct, but it should not match s3.
Is there any workaround for this?
Split the string on spaces and compare the resulting words against the keywords.
I hope that works.
How about using regular expressions?
public static class Program
{
public static void Main(string[] args)
{
var keywords = new List<string>() { "Test", "Re Test", "ACK" };
var targets = new[] {
"Please give the Test",
"Please give Re Test",
"Acknowledge my work"
};
foreach (var target in targets)
{
Console.WriteLine($"{target}: {AnyMatches(target, keywords)}");
}
Console.ReadKey();
}
private static bool AnyMatches(string target, IEnumerable<string> keywords)
{
foreach (var keyword in keywords)
{
var regex = new Regex($"\\b{Regex.Escape(keyword)}\\b", RegexOptions.IgnoreCase);
if (regex.IsMatch(target))
return true;
}
return false;
}
}
Creating the regular expressions on the fly on every call is probably not the best option in production, so consider building a list of Regex objects from your keywords once, instead of storing only the keywords in a plain string list.
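One way to do that, as a rough sketch: build the Regex objects once in a constructor and reuse them for every target string. KeywordMatcher and AnyMatch are hypothetical names for this illustration, and the whole-word, case-insensitive matching is carried over from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public sealed class KeywordMatcher
{
    // One prebuilt whole-word pattern per keyword.
    private readonly List<Regex> _patterns;

    public KeywordMatcher(IEnumerable<string> keywords)
    {
        _patterns = keywords
            .Select(k => new Regex(@"\b" + Regex.Escape(k) + @"\b", RegexOptions.IgnoreCase))
            .ToList();
    }

    // True if at least one keyword occurs in the target as a whole word.
    public bool AnyMatch(string target)
    {
        return _patterns.Any(p => p.IsMatch(target));
    }
}
```

With the keywords "Test", "Re Test" and "ACK", the first two sample sentences should match and "Acknowledge my work" should not, because there is no word boundary after "Ack".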
A slightly different solution.
void Main()
{
var KeyWords = new List<string>(){ "Test","Re Test","ACK" };
var array = new string[] {
"Please give the Test",
"Please give Re Test",
"Acknowledge my work"
};
foreach(var c in array)
{
Contains(c,KeyWords); // Your result.
}
}
private bool Contains(string sentence, List<string> keywords) {
var result = keywords.Select(keyWord=>{
var parts3 = Regex.Split(sentence, keyWord, RegexOptions.IgnoreCase).Where(x=>!string.IsNullOrWhiteSpace(x)).First().Split((char[])null); // Split by the keywords and get the rest of the words splitted by empty space
var splitted = sentence.Split((char[])null); // split the original string.
return parts3.Where(t=>!string.IsNullOrWhiteSpace(t)).All(x=>splitted.Any(t=>t.Trim().Equals(x.Trim(),StringComparison.InvariantCultureIgnoreCase)));
}); // Check if all remaining words from parts3 are inside the existing splitted string, thus verifying if full words.
return result.All(x=>x);// if everything matches then it was a match on full word.
}
The idea is to split the sentence by the keyword you are looking for (e.g. split by ACK) and then check whether the remaining words also appear as whole words in the original sentence split on whitespace. If they all match, the keyword was matched as a whole word and the result is true. If the split cut through the middle of a word (a substring match), the leftover fragments won't match any whole word and the result is false.
Your usage of Contains is backwards:
var foundKW = KeyWords.Where(kw => s1.Contains(kw)).ToList();
How about using a regex?
Use a pattern like \bthe\b, where \b represents a word boundary.
List<string> KeyWords = new List<string>(){"Test","Re Test","ACK"};
String s1 = "Please give the Test";
String s2 = "Please give Re Test";
String s3 = "Acknowledge my work";
bool result = false;
foreach(string str in KeyWords)
{
result = Regex.IsMatch(s1, @"\b" + Regex.Escape(str) + @"\b");
if(result)
break;
}
I'm new to regular expressions. I have the following text in my HTML and would like to replace it with something else.
Example HTML:
{{Object id='foo'}}
Extract the id into a variable like this:
string strId = "foo";
So far I have the following Regular Expression code that will capture the Example HTML:
string strStart = "Object";
string strFind = "{{(" + strStart + ".*?)}}";
Regex regExp = new Regex(strFind, RegexOptions.IgnoreCase);
Match matchRegExp = regExp.Match(html);
while (matchRegExp.Success)
{
//At this point, I have this variable:
//{{Object id='foo'}}
//I can find the id='foo' (see below)
//but not sure how to extract 'foo' and use it
string strFindInner = "id='(.*?)'"; //"{{Slider";
Regex regExpInner = new Regex(strFindInner, RegexOptions.IgnoreCase);
Match matchRegExpInner = regExpInner.Match(matchRegExp.Value.ToString());
//Do something with 'foo'
matchRegExp = matchRegExp.NextMatch();
}
I understand this might have a simple solution. I am hoping to learn more about regular expressions, but more importantly I am hoping for a suggestion on how to approach this more cleanly and efficiently.
Thank you
Edit:
Is this an example that I could potentially use: c# regex replace
While I have not solved my initial question with regular expressions, I moved to a simpler solution using Substring, IndexOf and string.Split for the time being. I understand that my code needs to be cleaned up, but I thought I would post the answer that I have so far.
string html = "<p>Start of Example</p>{{Object id='foo'}}<p>End of example</p>";
string strObject = "Object"; //Example (could be "Slider", etc.)
//When found, this will contain "{{Object id='foo'}}"
string strCode = "";
//ie: "id='foo'"
string strCodeInner = "";
//Tags will be a list, but in this example, only "id='foo'"
string[] tags = { };
//Looking for the following "{{Object "
string strFindStart = "{{" + strObject + " ";
int intFindStart = html.IndexOf(strFindStart);
//Then ending in the following
string strFindEnd = "}}";
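//Search for the end marker only after the start marker; -1 means not found
int intFindEnd = intFindStart == -1 ? -1 : html.IndexOf(strFindEnd, intFindStart);
//Must find both Start and End conditions
if (intFindStart != -1 && intFindEnd != -1)
{
//Include the closing "}}" in the extracted code
intFindEnd += strFindEnd.Length;
strCode = html.Substring(intFindStart, intFindEnd - intFindStart);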
//Remove Start and End
strCodeInner = strCode.Replace(strFindStart, "").Replace(strFindEnd, "");
//Split by spaces, this needs to be improved if more than IDs are to be used
//but for proof of concept this is perfect
tags = strCodeInner.Split(new char[] { ' ' });
}
Dictionary<string, string> dictTags = new Dictionary<string, string>();
foreach (string tag in tags)
{
string[] tagSplit = tag.Split(new char[] { '=' });
dictTags.Add(tagSplit[0], tagSplit[1].Replace("'", "").Replace("\"", ""));
}
//At this point, I can replace "{{Object id='foo'}}" with anything I'd like
//What I don't show is that I go into the website's database,
//get the object (ie: Slider) and return the html for slider with the ID of foo
html = html.Replace(strCode, strView);
/*
"html" variable may contain:
<p>Start of Example</p>
<p id="foo">This is the replacement text</p>
<p>End of example</p>
*/
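For comparison, here is a hedged sketch of how the original regex idea could do the extraction and the replacement in one pass. It assumes the tag always has the form {{Name id='...'}} with single quotes and no other attributes; TagReplacer, ReplaceTags and the render callback are hypothetical names for this illustration, and the callback stands in for the database lookup that is not shown above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagReplacer
{
    // Replaces every {{<objectName> id='...'}} with whatever render(id) returns.
    public static string ReplaceTags(string html, string objectName, Func<string, string> render)
    {
        // The named group "id" captures the text between the single quotes.
        string pattern = @"\{\{" + Regex.Escape(objectName) + @"\s+id='(?<id>[^']*)'\s*\}\}";
        return Regex.Replace(
            html,
            pattern,
            m => render(m.Groups["id"].Value),
            RegexOptions.IgnoreCase);
    }
}
```

A call such as TagReplacer.ReplaceTags(html, "Object", id => "<p id=\"" + id + "\">This is the replacement text</p>") hands "foo" to the callback and puts its return value where the tag was.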
I would need some help with matching data in this example string:
req:{REQUESTER_NAME},key:{abc},act:{UPDATE},sku:{ABC123,DEF-123},qty:{10,5}
Essentially, the parameters are separated by ",", but commas also appear inside the {} values. I am not that good with regex, so I need some help.
Desired Output:
req = "REQUESTER_NAME"
key = "abc"
act = "UPDATE"
sku[0] = "ABC123"
sku[1] = "DEF-123"
qty[0] = 10
qty[1] = 5
I would suggest you do the following:
1. Use String Split with "}," as the separator, so that commas inside {} are left alone (e.g. output: req:{REQUESTER_NAME)
2. With each pair of data, do a String Split with the ':' character as the separator (e.g. output: "req", "{REQUESTER_NAME")
3. Do a String Replace of the characters '{' and '}' with "" (e.g. output: REQUESTER_NAME)
4. Do a String Split again with the ',' character as the separator (e.g. output: "ABC123", "DEF-123")
That should parse it for you perfectly. You can store the results into your data structure as the results come in. (Eg. You can store the name at step 2 whereas the value for some might be available at Step 3 and for others at Step 4)
Hope That Helped
Note:
- If you don't know string split - http://www.dotnetperls.com/split-vbnet
- If you don't know string replace - http://www.dotnetperls.com/replace-vbnet
The sample below may help solve your problem, although it involves a lot of string manipulation.
string input = "req:{REQUESTER_NAME},key:{abc},act:{UPDATE},sku:{ABC123,DEF-123},qty:{10,5}";
Console.WriteLine(input);
string[] words = input.Split(new string[] { "}," }, StringSplitOptions.RemoveEmptyEntries);
foreach (string item in words)
{
if (item.Contains(':'))
{
string modifiedString = item.Replace(",", "," + item.Substring(0, item.IndexOf(':')) + ":");
string[] wordsColl = modifiedString.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string item1 in wordsColl)
{
string finalString = item1.Replace("{", "");
finalString = finalString.Replace("}", "");
Console.WriteLine(finalString);
}
}
}
First, use Regex.Matches to get the parameters inside { and }.
string str = "req:{REQUESTER_NAME},key:{abc},act:{UPDATE},sku:{ABC123,DEF-123},qty:{10,5}";
MatchCollection matches = Regex.Matches(str, @"\{.+?\}");
string[] arr = matches.Cast<Match>()
.Select(m => m.Groups[0].Value.Trim(new char[]{'{','}',' '}))
.ToArray();
foreach (string s in arr)
Console.WriteLine(s);
output
REQUESTER_NAME
abc
UPDATE
ABC123,DEF-123
10,5
Then use Regex.Split to get the parameter names:
string[] arr1 = Regex.Split(str, @"\{.+?\}")
.Select(x => x.Trim(new char[]{',',':',' '}))
.Where(x => !string.IsNullOrEmpty(x)) //need this to get rid of empty strings
.ToArray();
foreach (string s in arr1)
Console.WriteLine(s);
output
req
key
act
sku
qty
Now you can easily traverse the parameters, something like this:
for (int i = 0; i < arr.Length; i++)
{
if (arr1[i] == "req")
{
// arr[i] contains the req parameters
}
else if (arr1[i] == "sku")
{
// arr[i] contains the sku parameters;
// use arr[i].Split(',') to get the individual sku values and process them
}
}
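The two passes above can also be folded into one. This is only a sketch, assuming every parameter has the form name:{values} and that values never contain a closing brace; ParameterParser and Parse are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParameterParser
{
    // Parses "name:{v1,v2},other:{v3}" into a map from name to its values.
    public static Dictionary<string, string[]> Parse(string input)
    {
        var result = new Dictionary<string, string[]>();
        // One match per "name:{...}" pair; the comma between pairs is skipped.
        foreach (Match m in Regex.Matches(input, @"(?<name>\w+):\{(?<values>[^}]*)\}"))
        {
            result[m.Groups["name"].Value] = m.Groups["values"].Value.Split(',');
        }
        return result;
    }
}
```

After parsing the sample string, result["sku"] holds two entries and result["req"] holds one, so single and multiple values can be handled the same way.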
Kishore's answer is correct. This extension method may help implement that suggestion:
<Extension()>
Function WideSplit(InputString As String, SplitToken As String) As String()
Dim aryReturn As String()
Dim intIndex As Integer = InputString.IndexOf(SplitToken)
If intIndex = -1 Then
aryReturn = {InputString}
Else
ReDim aryReturn(1)
aryReturn(0) = InputString.Substring(0, intIndex)
aryReturn(1) = InputString.Substring(intIndex + SplitToken.Length)
End If
Return aryReturn
End Function
If you import System.Runtime.CompilerServices, you can use it like this:
Dim stringToParse As String = "req:{REQUESTER_NAME},key:{abc},act:{UPDATE},sku:{ABC123,DEF-123},qty:{10,5}"
Dim strTemp As String
Dim aryTemp As String()
strTemp = stringToParse.WideSplit("req:{")(1)
aryTemp = strTemp.WideSplit("},key:{")
req = aryTemp(0)
aryTemp = aryTemp(1).WideSplit("},act:{")
key = aryTemp(0)
'etc...
You may be able to do this more memory-efficiently, though, as this method creates a number of temporary string allocations.
Kishore's solution is perfect, but here is another solution that works with regex:
Dim input As String = "req:{REQUESTER_NAME},key:{abc},act:{UPDATE},sku:{ABC123,DEF-123},qty:{10,5}"
Dim Array = Regex.Split(input, ":{|}|,")
This does essentially the same thing: it uses a regex to split on :{, } and ,. The solution might be a bit shorter, though. The values will be put into the array like this:
"req", "REQUESTER_NAME","", ... , "qty", "10", "5", ""
Notice after the parameter and its value(s) there will be an empty string in the array. When looping over the array you can use this to let the program know when a new parameter starts. Then you can create a new array/data structure to store its values.
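The loop described above might look roughly like this. The sketch is in C# rather than VB.NET, and SplitGrouping and Group are hypothetical names for this illustration; it relies on the empty string that follows each parameter's values to know where the next parameter starts.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SplitGrouping
{
    public static Dictionary<string, List<string>> Group(string input)
    {
        var result = new Dictionary<string, List<string>>();
        string[] tokens = Regex.Split(input, @":\{|\}|,");
        List<string> current = null;
        foreach (string token in tokens)
        {
            if (token.Length == 0)
            {
                // Empty entry: the current parameter has ended.
                current = null;
            }
            else if (current == null)
            {
                // First token after a boundary is the parameter name.
                current = new List<string>();
                result[token] = current;
            }
            else
            {
                // Any further token is a value of the current parameter.
                current.Add(token);
            }
        }
        return result;
    }
}
```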
I have a list of strings.. each one looks similar to this:
"\n\t\"BLOCK\",\"HEADER-\"\r\n\t\t\"NAME\",\"147430\"\r\n\t\t\"REVISION\",\"0000\"\r\n\t\t\"DATE\",\"11/11/10\"\r\n\t\t\"TIME\",\"10:03:47\"\r\n\t\t\"PMABAR\",\"\"\r\n\t\t\"COMMENT\",\"\"\r\n\t\t\"PTPNAME\",\"0805C\"\r\n\t\t\"CMPNAME\",\"0805C\"\r\n\t\"BLOCK\",\"PRTIDDT-\"\r\n\t\t\"PMAPP\",1\r\n\t\t\"PMADC\",0\r\n\t\t\"ComponentQty\",4\r\n\t\"BLOCK\",\"PRTFORM-\"\r\n\t\t\....(more)...."
What I am trying to do is keep that entire string but replace the DATE, TIME and ComponentQty values.
I want to use the date variable that I have set for DATE, DateTime.Now.ToString("HH:mm:ss") for TIME, and dictionary[part] for ComponentQty. These values would be replaced like so:
"DATE","11/11/10" with "DATE","12/06/11"
"TIME","10:03:47" with "TIME","10:30:10"
"ComponentQty",4 with "ComponentQty", 8
or something similar...
So the new string would look like this:
"\n\t\"BLOCK\",\"HEADER-\"\r\n\t\t\"NAME\",\"147430\"\r\n\t\t\"REVISION\",\"0000\"\r\n\t\t\"DATE\",\"12/06/11\"\r\n\t\t\"TIME\",\"10:30:10\"\r\n\t\t\"PMABAR\",\"\"\r\n\t\t\"COMMENT\",\"\"\r\n\t\t\"PTPNAME\",\"0805C\"\r\n\t\t\"CMPNAME\",\"0805C\"\r\n\t\"BLOCK\",\"PRTIDDT-\"\r\n\t\t\"PMAPP\",1\r\n\t\t\"PMADC\",0\r\n\t\t\"ComponentQty\",8\r\n\t\"BLOCK\",\"PRTFORM-\"\r\n\t\t\....(more)...."
What is the quickest way to do such a thing? I was thinking Regex but I am not too sure on how to go about doing this. Can anyone help?
EDIT:
I used just a normal string replace to do it, but the data being replaced will not always have the static date, time and compQty that I have below (11/11/10, 10:03:47, 4). I need to find a way to make that section not hard-coded, with regex I am assuming.
var newDate = "DATE\",\"" + date + "\"";
var newTime = "TIME\",\"" + DateTime.Now.ToString("HH:mm:ss") + "\"";
var newCompQTY = "ComponentQty\"," + dictionary[part];
trimmedDataBasePart = trimmedDataBasePart.ToUpper().Replace("DATE\",\"11/11/10", newDate);
trimmedDataBasePart = trimmedDataBasePart.ToUpper().Replace("TIME\",\"10:03:47", newTime);
trimmedDataBasePart = trimmedDataBasePart.ToUpper().Replace("COMPONENTQTY\",4", newCompQTY);
I am trying to use the result of a Regex match as the value to replace and am not sure how to do so. This is what I was trying, but it obviously does not work because the var is not a string. Any suggestions?
var newDate = "DATE\",\"" + date + "\"";
var regexedDate = Regex.Match(trimmedDataBasePart, "DATE\",[0-9]+/[0-9]+/[0-9]+");
trimmedDataBasePart = trimmedDataBasePart.ToUpper().Replace(regexedDate, newDate);
Try this:
resultString = Regex.Replace(subjectString, @"(.*\bDATE\b\D*).*?(\\.*\bTIME\b\D*).*?(\\.*\bComponentQty\b\D*)\d+(.*)", "$1NEW_DATE$2NEW_TIME$3NEW_QTY", RegexOptions.Singleline);
Where NEW_DATE should be replaced by your date, NEW_TIME by your time, and NEW_QTY by your new qty.
You can create the replacement string from other variables as you please :)
Well, .NET's handling of group references in replacement strings is awkward. If the replacement text right after $1 starts with a digit (giving "$11"), .NET thinks it has to use backreference #11 and the replacement fails. RegexBuddy also had a bug which produced the wrong regex above. The version below is tested and works:
string subjectString = "\n\t\"BLOCK\",\"HEADER-\"\r\n\t\t\"NAME\",\"147430\"\r\n\t\t\"REVISION\",\"0000\"\r\n\t\t\"DATE\",\"11/11/10\"\r\n\t\t\"TIME\",\"10:03:47\"\r\n\t\t\"PMABAR\",\"\"\r\n\t\t\"COMMENT\",\"\"\r\n\t\t\"PTPNAME\",\"0805C\"\r\n\t\t\"CMPNAME\",\"0805C\"\r\n\t\"BLOCK\",\"PRTIDDT-\"\r\n\t\t\"PMAPP\",1\r\n\t\t\"PMADC\",0\r\n\t\t\"ComponentQty\",4\r\n\t\"BLOCK\",\"PRTFORM-\"\r\n\t\t....(more)....";
Regex regexObj = new Regex(@"^(.*\bDATE\b\D*).*?(\"".*?\bTIME\b\D*).*?(\"".*?\bComponentQty\b\D*)\d+(.*)$", RegexOptions.Singleline);
StringBuilder myResult = new StringBuilder();
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success)
{
for (int i = 1; i < matchResults.Groups.Count; i++)
{
Group groupObj = matchResults.Groups[i];
if (groupObj.Success)
{
myResult.Append(groupObj.Value);
switch (i)
{
case 1:
myResult.Append("NEW_DATE");
break;
case 2:
myResult.Append("NEW_TIME");
break;
case 3:
myResult.Append("NEW QTY");
break;
}
}
}
matchResults = matchResults.NextMatch();
}
Console.WriteLine("Final Result : \n\n\n{0}", myResult.ToString());
Output:
Final Result :
"BLOCK","HEADER-"
"NAME","147430"
"REVISION","0000"
"DATE","NEW_DATE"
"TIME","NEW_TIME"
"PMABAR",""
"COMMENT",""
"PTPNAME","0805C"
"CMPNAME","0805C"
"BLOCK","PRTIDDT-"
"PMAPP",1
"PMADC",0
"ComponentQty",NEW QTY
"BLOCK","PRTFORM-"
....(more)....
By the way, you have an incorrectly escaped dot (\.) in your input string. Cheers and have fun! :)
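A possibly simpler variation is to replace each field with its own small pattern instead of one large one. The sketch below assumes DATE and TIME values are always quoted and ComponentQty is always a bare integer, as in the sample block; HeaderPatcher and Patch are hypothetical names for this illustration. It uses a MatchEvaluator lambda rather than a "$1..." replacement string, which sidesteps the problem of a group reference being followed by a digit.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderPatcher
{
    public static string Patch(string block, string date, string time, int quantity)
    {
        // "DATE","<old value>"  ->  "DATE","<date>"
        block = Regex.Replace(block, "(\"DATE\",\")[^\"]*(\")",
            m => m.Groups[1].Value + date + m.Groups[2].Value);

        // "TIME","<old value>"  ->  "TIME","<time>"
        block = Regex.Replace(block, "(\"TIME\",\")[^\"]*(\")",
            m => m.Groups[1].Value + time + m.Groups[2].Value);

        // "ComponentQty",<digits>  ->  "ComponentQty",<quantity>
        block = Regex.Replace(block, "(\"ComponentQty\",)\\d+",
            m => m.Groups[1].Value + quantity);

        return block;
    }
}
```

Because each pattern starts with the opening quote of the key, a value such as "UPDATE" elsewhere in the block would not be mistaken for the DATE field.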
If you can change the way your source string looks, I would use String.Format:
string s = String.Format("Date={0}, Name={1}, Quantity={2}", date, name, quantity);
The placeholders {0}, {1}, {2} are replaced with the specified arguments which follow.
To make it cleaner I would create a function to parse that string list, and then another function to create such a string list instead of using regexps. I think this will make your code easier to maintain.
Dictionary<string, string> Parse(List<string> data)
{
...
}
List<string> CreateStringList(Dictionary<string, string> values)
{
...
}
List<string> SetValues(List<string> data)
{
Dictionary<string, string> values = Parse(data);
values["DATE"] = "12/06/11";
values["TIME"] = "10:30:10";
values["ComponentQty"] = "4";
return CreateStringList(values);
}
How can I remove duplicate substrings within a string? For instance, if I have a string like smith:rodgers:someone:smith:white, how can I get a new string that has the extra smith removed, like smith:rodgers:someone:white? Also, I'd like to keep the colons even though they are duplicated.
many thanks
string input = "smith:rodgers:someone:smith:white";
string output = string.Join(":", input.Split(':').Distinct().ToArray());
Of course this code assumes that you're only looking for duplicate "field" values. That won't remove "smithsmith" in the following string:
"smith:rodgers:someone:smithsmith:white"
It would be possible to write an algorithm to do that, but quite difficult to make it efficient...
Something like this:
string withoutDuplicates = String.Join(":", myString.Split(':').Distinct().ToArray());
Assuming the format of that string:
var theString = "smith:rodgers:someone:smith:white";
var subStrings = theString.Split(new char[] { ':' });
var uniqueEntries = new List<string>();
foreach(var item in subStrings)
{
if (!uniqueEntries.Contains(item))
{
uniqueEntries.Add(item);
}
}
var uniquifiedStringBuilder = new StringBuilder();
foreach(var item in uniqueEntries)
{
uniquifiedStringBuilder.AppendFormat("{0}:", item);
}
var uniqueString = uniquifiedStringBuilder.ToString().Substring(0, uniquifiedStringBuilder.Length - 1);
Is rather long-winded but shows the process to get from one to the other.
Not sure why you want to keep the duplicate colons. If you are expecting the output to be "smith:rodgers:someone::white", try this code:
public static string RemoveDuplicates(string input)
{
string output = string.Empty;
System.Collections.Specialized.StringCollection unique = new System.Collections.Specialized.StringCollection();
string[] parts = input.Split(':');
foreach (string part in parts)
{
output += ":";
if (!unique.Contains(part))
{
unique.Add(part);
output += part;
}
}
output = output.Substring(1);
return output;
}
Of course, I've not checked for null input, but I'm sure you'll do that ;)