Replace string with regular expression and my own parameter - c#

In my HTML I have several tokens like this:
{PROP_1_1}, {PROP_1_2}, {PROP_37871_1} ...
Currently I replace those tokens with the following code:
htmlBuffer = htmlBuffer.Replace("{PROP_" + prop.PropertyID + "_1}", prop.PropertyDefaultHtml);
where prop is a custom object. But in this case it affects only the tokens ending with '_1'. I would like to propagate this logic to all the rest ending up with '_X' where X is numeric.
How could I implement a regexp pattern to achieve this?

You can use Regex.Replace():
Regex rgx = new Regex(@"\{PROP_" + prop.PropertyID + @"_\d+\}");
htmlBuffer = rgx.Replace(htmlBuffer, prop.PropertyDefaultHtml);

You can do even better by capturing both identifiers in the regular expression. That way you loop over the references that actually exist in the string and look up the property for each one, instead of looping over every property you have and checking whether the string contains a reference to it.
Example:
htmlBuffer = Regex.Replace(htmlBuffer, @"{PROP_(\d+)_(\d+)}", m => {
int id = Int32.Parse(m.Groups[1].Value);
int suffix = Int32.Parse(m.Groups[2].Value);
return properties[id].GetValue(suffix);
});
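A minimal, self-contained sketch of the same match-evaluator idea, in case it helps to see it run end to end. The htmlByPropertyId dictionary and the ReplaceTokens helper are hypothetical stand-ins for the asker's property objects, and leaving unknown tokens untouched is an assumption about the desired behaviour, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical lookup: property ID -> replacement HTML.
var htmlByPropertyId = new Dictionary<int, string>
{
    [1] = "<b>one</b>",
    [37871] = "<i>big</i>",
};

// Replaces every {PROP_<id>_<n>} token whose id is known.
string ReplaceTokens(string html) =>
    Regex.Replace(html, @"\{PROP_(\d+)_\d+\}", m =>
    {
        // TryParse guards against digit runs that do not fit in an int.
        bool known = int.TryParse(m.Groups[1].Value, out int id)
                     && htmlByPropertyId.ContainsKey(id);
        // Unknown tokens are returned unchanged rather than throwing.
        return known ? htmlByPropertyId[id] : m.Value;
    });

Console.WriteLine(ReplaceTokens("{PROP_1_1} {PROP_1_2} {PROP_37871_1} {PROP_9_1}"));
// <b>one</b> <b>one</b> <i>big</i> {PROP_9_1}
```

Because the replacement comes from a delegate rather than a replacement string, any $ characters in the HTML are inserted literally instead of being read as substitution patterns.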

Related

How to find values based on pattern matching from 2 string

I have string
a = "{key1}|{key2}_{key3}-{key4}"
I have another string
b = "abc|qwe_tue-pqr"
I need to extract the values so that
key1="abc", key2="qwe", key3="tue" and key4="pqr"
If the use case is as simple as presented you could perhaps transform a into a regex with named capturing groups:
var r = new Regex(Regex.Escape(a).Replace("\\{","(?<").Replace("}",">[a-z]+)"));
This turns a into a regex like (the | delimiter needed escaping):
(?<key1>[a-z]+)\|(?<key2>[a-z]+)_(?<key3>[a-z]+)-(?<key4>[a-z]+)
You can then get the matches and print them:
var m = r.Match(b);
foreach (Group g in m.Groups) {
    if (g.Name == "0") continue; // skip group 0, which holds the whole match
    Console.WriteLine(g.Name + "=" + g.Value);
}
key1=abc
key2=qwe
key3=tue
key4=pqr
Given the very limited input example, I can't promise it will reliably process everything you throw at it, but it should give you a starting point.

Regular Expression + Options in MongoDB (C# Driver)

I'm working with the MongoDB driver for C# and I'm querying a collection for the list of items that match a field.
I am using the BsonRegularExpression object with the following expression:
"/.*" + summonerName + "/"
This would be the equivalent of LIKE in Sql, but the problem comes when I want it to be case insensitive.
To do so, many sites comment that it must be like this:
BsonRegularExpression expReg = new BsonRegularExpression("/.*" + summonerName + "/", "i");
When I put the options parameter like this: "i" after the expression, it just doesn't return anything.
I leave the entire code here:
public static List<Summoner> GetSummonerByName(String summonerName)
{
try
{
var database = dbClient.GetDatabase(databaseName);
var collection = database.GetCollection<Summoner>(collectionSummoner);
String nombreRegex = "";
var nombreChar = summonerName.ToCharArray();
foreach(Char caracter in nombreChar)
{
nombreRegex += "*";
nombreRegex += caracter;
}
//var filter = Builders<Summoner>.Filter.Regex(u => u.name, new BsonRegularExpression("/.*C*o*r*b*a*n/"));
BsonRegularExpression expReg = new BsonRegularExpression("/.*" + summonerName + "/");
var filter = Builders<Summoner>.Filter.Regex(u => u.name, expReg);
var resultado = collection.Find(filter).ToList();
if(resultado.Count() == 0)
{
InsertSummoner(RiotApiConnectorService.GetSummonerByName(summonerName));
GetSummonerByName(summonerName);
}
return resultado;
}
catch (Exception error)
{
return null;
}
}
Does anyone have experience using regular expressions with the MongoDB driver in C#?
Thanks in advance!
You should define it as
BsonRegularExpression expReg = new BsonRegularExpression(Regex.Escape(summonerName), "i");
The point here is that BsonRegularExpression regex definition should not include regex delimiters, / in your case. The regex instantiation is performed using the BsonRegularExpression class, and the regex delimiters are simply redundant, and are treated here as literal / in the pattern. There are no matches because your data have no slashes.
Next, you do not need .* here because regex searches for a match anywhere in the input text, it does not require a full string match (as is the case with LIKE operator).
Note Regex.Escape(summonerName) is used just in case there are special regex metacharacters in the summonerName, and if the method is not used, the search may fail.
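The effect of the stray delimiters can be seen without a database. The sketch below uses .NET's own Regex class as a stand-in (it is not the MongoDB driver, and the stored value is a made-up sample), but the point carries over: slashes inside the pattern are ordinary characters that the data would have to contain.

```csharp
using System;
using System.Text.RegularExpressions;

string summonerName = "Corban";  // sample search term
string stored = "xXcorbanXx";    // hypothetical stored name

// The slashes are literal characters here, so this cannot match.
bool withDelimiters = Regex.IsMatch(
    stored, "/.*" + summonerName + "/", RegexOptions.IgnoreCase);

// No delimiters, escaped input, case-insensitive option.
bool withoutDelimiters = Regex.IsMatch(
    stored, Regex.Escape(summonerName), RegexOptions.IgnoreCase);

Console.WriteLine(withDelimiters);    // False
Console.WriteLine(withoutDelimiters); // True
```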

retain the newline in a regex Match, c#

So, I've created the following regex, which captures everything I need from my string:
const string tag = ":59";
var result = Regex.Split(message, String.Format(":{0}[^:]?:*[^:]*", tag),RegexOptions.Multiline);
The string follows this pattern:
:59A:/sometext\n
somemore text\n
:71A:somemore text
I'm trying to capture everything in between :59A: and :71A: - this isn't fixed in stone though, as :71A: could be something else. hence, why i was using [^:]
EDIT
So, just to be clear on my requirements. I have a file(string) which is passed into a C# method, which should return only those values specified in the parameter tag. For instance, if the file(string) contains the following tags:
:20:
:21:
:59A:
:71A:
and I pass in 59, then I only need to return everything between the start of tag :59A: and the start of the next tag, which in this instance is :71A: but could be something else.
You can use the following code to match what you need:
string input = ":59A:/sometext\nsomemore text\n:71A:somemore text";
string pattern = "(?<=:[^:]+:)[^:]+\n";
var m = Regex.Match(input, pattern, RegexOptions.Singleline).Value;
If you want to use your tag constant, you can use this code
const string tag = ":59";
string input = ":59A:/sometext\nsomemore text\n:71A:somemore text";
string pattern = String.Format("(?<={0}[^:]*:)[^:]+\n", tag);
var m = Regex.Match(input, pattern, RegexOptions.Singleline).Value;
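If the content may itself contain colons, one alternative is to stop at the next line that starts a tag rather than at the next colon. The sketch below does that; GetTagContent is a hypothetical helper, and it assumes tags always start a line and consist of two digits plus an optional uppercase letter, which the question's samples suggest but do not guarantee.

```csharp
using System;
using System.Text.RegularExpressions;

// Returns the text between ":<tag><optional letter>:" and the next line
// that starts a tag (newlines included), or null when the tag is absent.
string GetTagContent(string message, string tag)
{
    string pattern = "^:" + Regex.Escape(tag)
                   + @"[A-Z]?:(.*?)(?=^:[0-9]{2}[A-Z]?:|\z)";
    // Multiline: ^ matches at line starts. Singleline: . also matches \n.
    Match m = Regex.Match(message, pattern,
        RegexOptions.Multiline | RegexOptions.Singleline);
    return m.Success ? m.Groups[1].Value : null;
}

string input = ":20:ref\n:59A:/sometext\nsomemore text\n:71A:somemore text";
Console.Write(GetTagContent(input, "59"));
// /sometext
// somemore text
```

The lazy (.*?) combined with the lookahead is what keeps the trailing newline in the captured value while leaving the next tag out of it.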

extract query string from a URL string

I am reading from the browser history, and when I come across a Google query I want to extract the query string. I am not using Request or HttpUtility since I am simply parsing a string. However, when I come across URLs like this one, my program fails to parse them properly:
http://www.google.com.mt/search?client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&channel=s&hl=mt&source=hp&biw=986&bih=663&q=hotmail&meta=&btnG=Fittex+bil-Google
What I was trying to do is get the index of q= and the index of & and take the text in between, but in this case the index of the first & is smaller than the index of q=, which gives me errors.
any suggestions?
thanks for your answers, all seem good :) p.s. i couldn't use httputility, not I don't want to. when i add a reference to system.web, httputility isn't included! it's only included in an asp.net application. Thanks again
It's not clear why you don't want to use HttpUtility. You could always add a reference to System.Web and use it:
var parsedQuery = HttpUtility.ParseQueryString(input);
Console.WriteLine(parsedQuery["q"]);
If that's not an option then perhaps this approach will help:
var query = input.Split('&')
.Single(s => s.StartsWith("q="))
.Substring(2);
Console.WriteLine(query);
It splits on & and looks for the single split result that begins with "q=" and takes the substring at position 2 to return everything after the = sign. The assumption is that there will be a single match, which seems reasonable for this case, otherwise an exception will be thrown. If that's not the case then replace Single with Where, loop over the results and perform the same substring operation in the loop.
EDIT: to cover the scenario mentioned in the comments this updated version can be used:
int index = input.IndexOf('?');
var query = input.Substring(index + 1)
.Split('&')
.SingleOrDefault(s => s.StartsWith("q="));
if (query != null)
Console.WriteLine(query.Substring(2));
If you don't want to use System.Web.HttpUtility (thus be able to use the client profile), you can still use Mono HttpUtility.cs which is only an independent .cs file that you can embed in your application. Then you can simply use the ParseQueryString method inside the class to parse the query string properly.
here is the solution -
string GetQueryString(string url, string key)
{
var uri = new Uri(url);
var parsed = HttpUtility.ParseQueryString(uri.Query);
// The indexer already returns a string, or null when the key is absent.
return parsed[key];
}
Why not write code that returns the string from q= up to the next &?
For example:
int start = url.IndexOf("q=") + 2; // skip past "q=" (assumes it is present)
int end = url.IndexOf('&', start); // next separator, or -1 if q is the last parameter
string newString = end < 0 ? url.Substring(start) : url.Substring(start, end - start);
Cheers
Use the tools available:
String UrlStr = "http://www.google.com.mt/search?client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&channel=s&hl=mt&source=hp&biw=986&bih=663&q=hotmail&meta=&btnG=Fittex+bil-Google";
NameValueCollection Items = HttpUtility.ParseQueryString(UrlStr);
String QValue = Items["q"];
If you really need to do the parsing yourself, and are only interested in the value for 'q' then the following would work:
string url = @"http://www.google.com.mt/search?" +
"client=firefoxa&rls=org.mozilla%3Aen-" +
"US%3Aofficial&channel=s&hl=mt&source=hp&" +
"biw=986&bih=663&q=hotmail&meta=&btnG=Fittex+bil-Google";
int question = url.IndexOf("?");
if(question>-1)
{
int qindex = url.IndexOf("q=", question);
if (qindex > -1)
{
int ampersand = url.IndexOf('&', qindex);
string token = null;
if (ampersand > -1)
token = url.Substring(qindex+2, ampersand - qindex - 2);
else
token = url.Substring(qindex+2);
Console.WriteLine(token);
}
}
But do try to look at using a proper URL parser, it will save you a lot of hassle in the future.
(amended this answer to include a check for the '?' token, and to support 'q' values at the end of the query string, without a trailing '&')
And that's why you should use Uri and HttpUtility.ParseQueryString.
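When System.Web is not available, a small parser built only on System.Uri covers the simple case. This is a sketch under assumptions: GetQueryValue is a hypothetical helper, it returns only the first occurrence of a key, it compares names without decoding them, and it treats + as a space, which is the usual form-encoding convention. The sample URL is a shortened version of the one in the question.

```csharp
using System;

// Returns the decoded value of the first parameter named `key`,
// or null if the URL has no such parameter.
string GetQueryValue(string url, string key)
{
    string query = new Uri(url).Query.TrimStart('?');
    foreach (string pair in query.Split('&'))
    {
        int eq = pair.IndexOf('=');
        string name = eq < 0 ? pair : pair.Substring(0, eq);
        if (name != key) continue;  // exact name match, so "biq" never matches "q"
        string raw = eq < 0 ? "" : pair.Substring(eq + 1);
        // '+' conventionally encodes a space in query strings.
        return Uri.UnescapeDataString(raw.Replace('+', ' '));
    }
    return null;
}

string sampleUrl = "http://www.google.com.mt/search?client=firefox-a&hl=mt"
                 + "&biw=986&q=hotmail&meta=&btnG=Fittex+bil-Google";
Console.WriteLine(GetQueryValue(sampleUrl, "q"));     // hotmail
Console.WriteLine(GetQueryValue(sampleUrl, "btnG"));  // Fittex bil-Google
```

Splitting on & first and comparing whole names avoids the index arithmetic that caused the original error.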
HttpUtility is fine for the .NET Framework, but that class is not available to WinRT apps. To get the parameters from a URL in a Windows Store app, use WwwFormUrlDecoder. You construct it from the query string you want to parse; the resulting object is enumerable and also works with lambda expressions.
Here's an example
var stringUrl = "http://localhost/?name=Jonathan&lastName=Morales";
var decoder = new WwwFormUrlDecoder(new Uri(stringUrl).Query); // pass only the query string, not the whole URL
//Using GetFirstByName method
string nameValue = decoder.GetFirstByName("name");
//nameValue has "Jonathan"
//Using Lambda Expressions
var parameter = decoder.FirstOrDefault(p => p.Name.Contains("last")); //IWwwFormUrlDecoderEntry variable type
string parameterName = parameter.Name; //lastName
string parameterValue = parameter.Value; //Morales
You can also see http://www.dzhang.com/blog/2012/08/21/parsing-uri-query-strings-in-windows-8-metro-style-apps

C# Named parameters to a string that replace to the parameter values

I want to replace a named parameter in a string with a value from code, ideally with good performance. For example, given the string:
"Hi {name}, do you like milk?"
How could I replace {name} from code? With regular expressions? Are they too expensive? Which way do you recommend?
How does, for example, NHibernate's HQL replace :my_param with the user-defined value? Or, the one I like better, ASP.NET (MVC) routing: "{controller}/{action}", new { controller = "Hello", ... }?
Have you confirmed that regular expressions are too expensive?
The cost of regular expressions is greatly exaggerated. For such a simple pattern performance will be quite good, probably only slightly less good than direct search-and-replace, in fact. Also, have you experimented with the Compiled flag when constructing the regular expression?
That said, can't you just use the simplest way, i.e. Replace?
string varname = "name";
string pattern = "{" + varname + "}";
Console.WriteLine("Hi {name}".Replace(pattern, "Mike"));
Regex is certainly a viable option, especially with a MatchEvaluator:
Regex re = new Regex(@"\{(\w*?)\}", RegexOptions.Compiled); // store this...
string input = "Hi {name}, do you like {food}?";
Dictionary<string, string> vals = new Dictionary<string, string>();
vals.Add("name", "Fred");
vals.Add("food", "milk");
string q = re.Replace(input, delegate(Match match)
{
string key = match.Groups[1].Value;
return vals[key];
});
Now if you have your replacements in a dictionary, like this:
var replacements = new Dictionary<string, string>();
replacements["name"] = "Mike";
replacements["age"]= "20";
then the Regex becomes quite simple:
Regex regex = new Regex(@"\{(?<key>\w+)\}");
string formattext = "{name} is {age} years old";
string newStr = regex.Replace(formattext,
match=>replacements[match.Groups[1].Captures[0].Value]);
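Both dictionary-based snippets above throw a KeyNotFoundException when the template names a key the dictionary lacks. A hedged variant that leaves unknown placeholders in place is sketched here; Fill is a hypothetical helper name, and keeping the placeholder text (rather than throwing or inserting an empty string) is an assumption about what the caller wants.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Built once and reused across calls.
var placeholder = new Regex(@"\{(?<key>\w+)\}", RegexOptions.Compiled);

string Fill(string template, IDictionary<string, string> values) =>
    placeholder.Replace(template, m =>
        values.TryGetValue(m.Groups["key"].Value, out string value)
            ? value      // known key: substitute its value
            : m.Value);  // unknown key: keep the placeholder text

var replacements = new Dictionary<string, string>
{
    ["name"] = "Mike",
    ["age"] = "20",
};
Console.WriteLine(Fill("{name} is {age} years old and likes {food}", replacements));
// Mike is 20 years old and likes {food}
```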
After thinking about this, I realized what I actually wished for, was that String.Format() would take an IDictionary as argument, and that templates could be written using names instead of indexes.
For string substitutions with lots of possible keys/values, the index numbers result in illegible string templates - and in some cases, you may not even know which items are going to have what number, so I came up with the following extension:
https://gist.github.com/896724
Basically this lets you use string templates with names instead of numbers, and a dictionary instead of an array, and lets you have all the other good features of String.Format(), allowing the use of a custom IFormatProvider, if needed, and allowing the use of all the usual formatting syntax - precision, length, etc.
The example provided in the reference material for String.Format is a great example of how templates with many numbered items become completely illegible - porting that example to use this new extension method, you get something like this:
var replacements = new Dictionary<String, object>()
{
{ "date1", new DateTime(2009, 7, 1) },
{ "hiTime", new TimeSpan(14, 17, 32) },
{ "hiTemp", 62.1m },
{ "loTime", new TimeSpan(3, 16, 10) },
{ "loTemp", 54.8m }
};
var template =
"Temperature on {date1:d}:\n{hiTime,11}: {hiTemp} degrees (hi)\n{loTime,11}: {loTemp} degrees (lo)";
var result = template.Substitute(replacements);
As someone pointed out, if what you're writing needs to be highly optimized, don't use something like this - if you have to format millions of strings this way, in a loop, the memory and performance overhead could be significant.
On the other hand, if you're concerned about writing legible, maintainable code - and if you're doing, say, a bunch of database operations, in the grand scheme of things, this function will not add any significant overhead.
...
For convenience, I did attempt to add a method that would accept an anonymous object instead of a dictionary:
public static String Substitute(this String template, object obj)
{
return Substitute(
template,
obj.GetType().GetProperties().ToDictionary(p => p.Name, p => p.GetValue(obj, null))
);
}
For some reason, this doesn't work: passing an anonymous object like new { name = "value" } to that extension method gives a compile-time error message saying the best match was the IDictionary version of that method. Not sure how to fix that. (anyone?)
How about
string stringVar = "Hello, {0}. How are you doing?";
string arg1 = "John"; // or args[0]
string result = String.Format(stringVar, arg1);
You can even have multiple args: just increment the {x} and add another parameter to the Format() call. Both "string" and "String" have this method, because string is the C# alias for System.String.
A compiled regex might do the trick , especially if there are many tokens to be replaced. If there are just a handful of them and performance is key, I would simply find the token by index and replace using string functions. Believe it or not this will be faster than a regex.
Try using StringTemplate. It's much more powerful than that, but it does the job flawlessly.
Or try this with LINQ if you have all your replacement values stored in a Dictionary object.
For example:
Dictionary<string,string> dict = new Dictionary<string,string>();
dict.Add("replace1","newVal1");
dict.Add("replace2","newVal2");
dict.Add("replace3","newVal3");
var newstr = dict.Aggregate(str, (current, value) => current.Replace(value.Key, value.Value));
dict is your search-replace pairs defined Dictionary object.
str is your string which you need to do some replacements with.
I would go for the mindplay.dk solution... Works quite well.
And, with a slight modification, it supports templates-of-templates, like
"Hi {name}, do you like {0}?", replacing {name} but retaining {0}:
In the given source (https://gist.github.com/896724), replace as follows:
var format = Pattern.Replace(
template,
match =>
{
var name = match.Groups[1].Captures[0].Value;
if (!int.TryParse(name, out int parsedInt))
{
if (!map.ContainsKey(name))
{
map[name] = map.Count;
list.Add(dictionary.ContainsKey(name) ? dictionary[name] : null);
}
return "{" + map[name] + match.Groups[2].Captures[0].Value + "}";
}
else return "{{" + name + "}}";
}
);
Furthermore, it supports a length ({name,30}) as well as a format specifier, or a combination of both.
UPDATE for 2022 (for both .NET 4.8 and .NET 6):
Especially when multi-line string templates are needed, C# 6 and later let you use $ and @ together like this:
(You just need to escape quotes by replacing " with "")
string name = "Mike";
int age = 20 + 14; // 34
string product = "milk";
var htmlTemplateContent = $@"
<!DOCTYPE html>
<html>
<head>
<meta charset=""utf-8"" />
<title>Sample HTML page</title>
</head>
<body>
Hi {name}, now that you're {age.ToString()}, how do you like {product}?
</body>
</html>";
