I'm new to .NET and having a hard time trying to understand the Regex object.
What I'm trying to do is below. It's pseudo-code; I don't know the actual code that makes this work:
string pattern = ...; // has multiple named groups using the Regex syntax (?<groupName>...)
if (new Regex(pattern).Apply(inputString).HasMatches)
{
var matches = new Regex(pattern).Apply(inputString).Matches;
return new DecomposedUrl()
{
Scheme = matches["scheme"].Value,
Address = matches["address"].Value,
Port = int.Parse(matches["port"].Value),
Path = matches["path"].Value,
};
}
What do I need to change to make this code work?
There is no Apply method on Regex. Seems like you may be using some custom extension methods that aren't shown. You also haven't shown the pattern you're using. Other than that, groups can be retrieved from a Match, not a MatchCollection.
Regex simpleEmail = new Regex(@"^(?<user>[^@]*)@(?<domain>.*)$");
Match match = simpleEmail.Match("someone@tempuri.org");
String user = match.Groups["user"].Value;
String domain = match.Groups["domain"].Value;
A Regex instance on my machine doesn't have the Apply method. I'd usually do something more like this:
var match = Regex.Match(input, pattern);
if (match.Success)
{
    return new DecomposedUrl()
    {
        Scheme = match.Groups["scheme"].Value,
        Address = match.Groups["address"].Value,
        // read the "port" group here, not "address"
        Port = int.Parse(match.Groups["port"].Value),
        Path = match.Groups["path"].Value
    };
}
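Since the question never shows the pattern, here is a rough sketch of the whole thing with one filled in. The pattern itself, the shape of DecomposedUrl (including the nullable Port) and the UrlDecomposer name are all assumptions for illustration, not something taken from the question. The port group is optional, so it is checked with Group.Success before parsing.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical result type; the question does not show the real one.
public class DecomposedUrl
{
    public string Scheme { get; set; }
    public string Address { get; set; }
    public int? Port { get; set; }
    public string Path { get; set; }
}

public static class UrlDecomposer
{
    // Assumed pattern: scheme://address[:port][/path]
    private static readonly Regex UrlPattern = new Regex(
        @"^(?<scheme>[a-z][a-z0-9+.-]*)://(?<address>[^:/]+)(?::(?<port>\d{1,5}))?(?<path>/.*)?$",
        RegexOptions.IgnoreCase);

    // Returns null when the input does not match the pattern.
    public static DecomposedUrl Decompose(string input)
    {
        Match match = UrlPattern.Match(input);
        if (!match.Success)
            return null;

        Group port = match.Groups["port"];
        return new DecomposedUrl
        {
            Scheme = match.Groups["scheme"].Value,
            Address = match.Groups["address"].Value,
            // An optional group that did not take part in the match has Success == false.
            Port = port.Success ? int.Parse(port.Value) : (int?)null,
            // Value is an empty string when the path group did not match.
            Path = match.Groups["path"].Value
        };
    }
}
```

Calling UrlDecomposer.Decompose("http://example.com:8080/a/b") should fill all four properties. For real-world URLs the built-in System.Uri class is probably a safer choice than a hand-written pattern.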
How can I most efficiently check to see if an input string starts with a string that belongs in a list of strings?
For example possiblePrefixes = "1234", "1235", "1236". If input = "1235av2425" should return true. If input = "1237352ko" should return false.
You can use Any for this. The idea is to check whether any item in the list is a prefix of the given string.
List<string> list = new List<string>() { "1234", "1235", "1236" };
string input = "1237352ko";
var exists = list.Any(x => input.StartsWith(x)); // returns false
With string input = "1235av2425"; it will return true.
An efficient datastructure for this type of search would be a prefix tree (aka "Trie").
For your example data such a tree might look something like this:
123
|-4
|-5
|-6
This could allow a lookup time that is independent of the number of prefixes you want to check against.
But as far as I know there are no builtin types for this, so you would either need to find a library, or implement it yourself.
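For what it's worth, a minimal trie for this is not much code. The sketch below is only an illustration: the PrefixTrie class and its member names are made up, and it compares characters ordinally (case-sensitive). Lookup walks the input one character at a time and stops as soon as it passes the end of a stored prefix or runs out of matching branches, so the cost depends on the prefix length rather than on how many prefixes are stored.

```csharp
using System.Collections.Generic;

// Hypothetical helper type, not part of the framework.
public sealed class PrefixTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsEndOfPrefix;
    }

    private readonly Node _root = new Node();

    public PrefixTrie(IEnumerable<string> prefixes)
    {
        foreach (string prefix in prefixes)
            Add(prefix);
    }

    public void Add(string prefix)
    {
        Node node = _root;
        foreach (char c in prefix)
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
        }
        node.IsEndOfPrefix = true;
    }

    // True if any stored prefix is a prefix of the input.
    public bool StartsWithAny(string input)
    {
        Node node = _root;
        foreach (char c in input)
        {
            if (node.IsEndOfPrefix)
                return true;
            if (!node.Children.TryGetValue(c, out node))
                return false;
        }
        // The input ended exactly on (or before) a stored prefix.
        return node.IsEndOfPrefix;
    }
}
```

Usage would be along the lines of new PrefixTrie(new[] { "1234", "1235", "1236" }).StartsWithAny(input). Build the trie once and reuse it; for a handful of prefixes the Any/StartsWith version is likely just as fast.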
The solution using Any and StartsWith will be the best in most cases. Looking for an optimized solution will only be necessary if you have a long list of possible prefixes and/or a long list of texts to check against the same prefixes.
In that case, using a pre-compiled regular expression built once from the list of possible prefixes and then re-used for multiple checks might be a little faster.
// Build regular expression once
string[] possiblePrefixes = new string[] { "1234", "1235", "1236" };
var escaped = possiblePrefixes.Select(p => Regex.Escape(p));
string pattern = "^(" + string.Join("|", escaped) + ").*";
Regex regEx = new Regex(pattern, RegexOptions.Compiled);
// Now use it multiple times
string input = "1235av2425";
bool result = regEx.IsMatch(input);
Following are the 2 solutions
Solution # 1 (using Lambda Expression)
List<string> possiblePrefixes = new List<string>() { "1234", "1235", "1236" };
string input = "1235av2425";
var result = possiblePrefixes.Any(x => input.StartsWith(x));
Console.WriteLine(result); //returns True
Solution # 2 (using LINQ query syntax)
List<string> possiblePrefixes = new List<string>() { "1234", "1235", "1236" };
string input = "1235av2425";
var result = (from val in possiblePrefixes
where input.StartsWith(val)
select val).Any();
Console.WriteLine(result); //returns True
Can one store the template of a string in a variable and use interpolation on it?
var name = "Joe";
var template = "Hi {name}";
I then want to do something like:
var result = $template;
The reason is my templates will come from a database.
I guess that these strings will have always the same number of parameters, even if they can change. For example, today template is "Hi {name}", and tomorrow could be "Hello {name}".
Short answer: No, you cannot do what you have proposed.
Alternative 1: use the string.Format method.
You can store in your database something like this:
"Hi {0}"
Then, when you retrieve the string template from the db, you can write:
var template = "Hi {0}"; //retrieved from db
var name = "Joe";
var result = string.Format(template, name);
//now result is "Hi Joe"
With 2 parameters:
var name2a = "Mike";
var name2b = "John";
var template2 = "Hi {0} and {1}!"; //retrieved from db
var result2 = string.Format(template2, name2a, name2b);
//now result2 is "Hi Mike and John!"
Alternative 2: use a placeholder.
You can store in your database something like this:
"Hi {name}"
Then, when you retrieve the string template from the db, you can write:
var template = "Hi {name}"; //retrieved from db
var name = "Joe";
var result = template.Replace("{name}", name);
//now result is "Hi Joe"
With 2 parameters:
var name2a = "Mike";
var name2b = "John";
var template2 = "Hi {name2a} and {name2b}!"; //retrieved from db
var result2 = template2
.Replace("{name2a}", name2a)
.Replace("{name2b}", name2b);
//now result2 is "Hi Mike and John!"
Pay attention to which token you choose for your placeholders. Here I used surrounding curly brackets {}. You should pick something that is unlikely to collide with the rest of your text, and that depends entirely on your context.
This can be done as requested using dynamic compilation, such as through the Microsoft.CodeAnalysis.CSharp.Scripting package. For example:
var name = "Joe";
var template = "Hi {name}";
var result = await CSharpScript.EvaluateAsync<string>(
"var name = \"" + name + "\"; " +
"return $\"" + template + "\";");
Note that this approach is slow, and you'd need to add more logic to handle escaping of quotes (and injection attacks) within strings, but the above serves as a proof-of-concept.
No, you can't do that: an interpolated string is resolved by the compiler, so the value of name has to be available where the string literal is written (at compile time). Consider using String.Format or String.Replace instead.
I just had the same need in my app so will share my solution using String.Replace(). If you're able to use LINQ then you can use the Aggregate method (which is a reducing function, if you're familiar with functional programming) combined with a Dictionary that provides the substitutions you want.
string template = "Hi, {name} {surname}";
Dictionary<string, string> substitutions = new Dictionary<string, string>() {
{ "name", "Joe" },
{ "surname", "Bloggs" },
};
string result = substitutions.Aggregate(template, (args, pair) =>
args.Replace($"{{{pair.Key}}}", pair.Value)
);
// result == "Hi, Joe Bloggs"
This works by starting with the template and then iterating over each item in the substitution dictionary, replacing the occurrences of each one. The result of one Replace() call is fed into the input to the next, until all substitutions are performed.
The {{{pair.Key}}} bit is just to escape the { and } used to find a placeholder.
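One thing to watch with chained Replace() calls is that a substituted value which itself looks like a placeholder can get replaced again by a later step. If that matters, a single pass with Regex.Replace and a match evaluator avoids it. The sketch below is just one way to do it; the TemplateRenderer name, the {word} placeholder syntax and the choice to leave unknown placeholders untouched are assumptions.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, shown only as an illustration.
public static class TemplateRenderer
{
    // Matches {key}, where key is made of letters, digits and underscores.
    private static readonly Regex Placeholder = new Regex(@"\{(?<key>\w+)\}");

    public static string Render(string template, IDictionary<string, string> values)
    {
        // The evaluator runs once per placeholder found in the original template,
        // so substituted text is never scanned again.
        return Placeholder.Replace(template, match =>
        {
            string value;
            return values.TryGetValue(match.Groups["key"].Value, out value)
                ? value
                : match.Value; // unknown placeholder: leave it as written
        });
    }
}
```

With the dictionary from the example above, TemplateRenderer.Render("Hi, {name} {surname}", substitutions) gives the same result as the Aggregate version.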
This is pretty old now, but as I've just come across it it's new to me!
It's a bit overkill for what you need, but I have used Handlebars.NET for this sort of thing.
You can create quite complex templates and merge in hierarchical data structures for the context. There are rules for looping and conditional sections, partial template compositing and even helper function extension points. It also handles many data types gracefully.
There's way too much to go into here, but a short example to illustrate...
var source = @"Hello {{Guest.FirstName}}{{#if Guest.Surname}} {{Guest.Surname}}{{/if}}!";
var template = Handlebars.Compile(source);
var rec = new {
    // a bare null has no type, so it needs a cast inside an anonymous object
    Guest = new { FirstName = "Bob", Surname = (string)null }
};
var resultString = template(rec);
In this case the surname will only be included in the output if the value is not null or empty.
Now admittedly this is more complicated for users than simple string interpolation, but remember that you can still just use {{fieldName}} if you want to, just that you can do lots more as well.
This particular NuGet package is a port of Handlebars.js, so it has a high degree of compatibility. Handlebars.js itself builds on Mustache. There are direct .NET ports of Mustache, but IMHO Handlebars.NET is the business.
I have a string coming in this format, as shown below:
"mark345345@test.com;rtereter@something.com;terst@gmail.com;fault@mail"
What would be the most efficient way to validate each of these above and fail if it is not valid e-mail?
You can use the EmailAddressAttribute class from the System.ComponentModel.DataAnnotations namespace to validate an email address. First split the input into individual addresses, then check each one. The following code collects the valid and invalid addresses separately.
List<string> inputMails = "mark345345@test.com;rtereter@something.com;terst@gmail.com;fault@mail".Split(';').ToList();
List<string> validMails = new List<string>();
List<string> inValidMails = new List<string>();
var validator = new EmailAddressAttribute();
foreach (var mail in inputMails)
{
if (validator.IsValid(mail))
{
validMails.Add(mail);
}
else
{
inValidMails.Add(mail);
}
}
You can use Regex or you might split the string by ';' and try to create a System.Net.Mail.MailAddress instance for each and every address. FormatException will occur if address is not in a recognized format.
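A rough sketch of the MailAddress approach could look like this. The EmailListValidator class and its method names are made up for illustration. Keep in mind that this is only a syntax check, and as far as I know MailAddress is fairly lenient (for example, it does not require a dot in the host part), so test it against the inputs you care about.

```csharp
using System;
using System.Net.Mail;

// Hypothetical helper, not part of the framework.
public static class EmailListValidator
{
    public static bool IsValidAddress(string candidate)
    {
        if (string.IsNullOrWhiteSpace(candidate))
            return false;

        try
        {
            var address = new MailAddress(candidate);
            // MailAddress also accepts forms like "Name <user@host>";
            // require that the whole input is the bare address.
            return string.Equals(address.Address, candidate, StringComparison.OrdinalIgnoreCase);
        }
        catch (FormatException)
        {
            return false;
        }
    }

    // True only if the list is non-empty and every entry is a valid address.
    public static bool AllValid(string semicolonSeparated)
    {
        string[] parts = semicolonSeparated.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 0)
            return false;

        foreach (string part in parts)
        {
            if (!IsValidAddress(part.Trim()))
                return false;
        }
        return true;
    }
}
```

AllValid stops at the first bad entry, which fits the "fail if any is not valid" requirement from the question.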
If you're sure that all e-mails are semicolon-separated, you can split the string and make a list of them. The best way for me to validate each e-mail is to use a regex pattern. I've used this one:
var emailPattern = @"(?=^.{1,64}@)^[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?=.{1,255}$|.{1,255};)(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])(;(?=.{1,64}@)[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?=.{1,255}$|.{1,255};)(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9]))*$";
var emailRegex = new Regex(emailPattern); // build once, reuse for every address
var incomingString = "mark345345@test.com;rtereter@something.com;terst@gmail.com;fault@mail";
var emails = incomingString.Split(';').ToList();
foreach (var email in emails)
{
    if (emailRegex.IsMatch(email))
    {
        // your logic here
    }
}
Since .NET has out-of-the-box ways to validate an email address, I would not use a regex and would rely on .NET instead, e.g. the EmailAddressAttribute from System.ComponentModel.DataAnnotations.
A clean way to use it would be something like:
var emailAddressAttribute = new EmailAddressAttribute();
var groups = yourEmailsString.Split(new [] { ';' }, StringSplitOptions.RemoveEmptyEntries)
.GroupBy(emailAddressAttribute.IsValid);
This will give you 2 groups, the one with the Key == true will be valid email ids
var validEmailIds = groups.Where(group => group.Key)
.SelectMany(group => group);
the one with Key == false will be invalid email ids
var invalidEmailIds = groups.Where(group => !group.Key)
.SelectMany(group => group);
You could also run a foreach loop over the groups afterwards, depending on your needs.
Basically I have a string list like this:
/forum/
/phpld/
/php/
Now I want to check if a URL such as:
http://www.url.com/forum/
contains any values from the string list.
In the above case it should match because /forum/ is in the url.
I was thinking something like this:
foreach (string filter in _filterList)
{
if (PAGEURL.Trim().Contains(filter.Trim()))
{
_parseResultsFinal.Add(PAGEURL);
filteredByURL++;
break;
}
}
But I cannot get the above to be accurate.
How would I do this? :)
Try this:
_filterList.Any(filter => PAGEURL.Trim().Contains(filter.Trim()));
You may do PAGEURL = PAGEURL.Trim() before this expression to not run it each time.
String.Contains() is case-sensitive and culture-insensitive, so if there are any case differences that could be the cause of the 'inaccuracy' that you are experiencing.
If you suspect this may be the problem (or even as a viable alternative) you can try this as the 'if' clause:
if (PAGEURL.Trim().IndexOf(filter.Trim(), StringComparison.OrdinalIgnoreCase) >= 0)
I'm not abundantly clear on what you want to do here, it seems as though if a URL contains any of the filters then you want to add the URL to the list.
List<string> parseResultsFinal = new List<string>();
if (_filterList.Any(x => PAGEURL.Contains(x)))
{
parseResultsFinal.Add(PAGEURL);
}
Try to use that.
I would try the following:
var trimmedUrl = PAGEURL.Replace("http://", "");
var parts = trimmedUrl.Split('/');
var filterList = new List<string> { "forum", "phpld", "php" };
var anyContains = parts.Any(o => filterList.Contains(o));
I'd change segments filters to simple words (without slashes, trimmed before adding to filter list):
var _filterList = new List<string>()
{
"forum", "phpld", "php"
};
Then use a regex to search for the segment in the URL (ignoring case, with an optional slash at the end of the URL):
bool IsSegmentInUrl(string url, string segment)
{
    // Regex.Escape guards against filters that contain regex metacharacters
    string pattern = String.Format(".*/{0}(/|$)", Regex.Escape(segment));
    return Regex.IsMatch(url, pattern, RegexOptions.IgnoreCase);
}
Usage:
if (_filterList.Any(filter => IsSegmentInUrl(PAGEURL, filter)))
{
_parseResultsFinal.Add(PAGEURL);
filteredByURL++;
}
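A more readable solution is to create an extension method (it has to be declared in a static class):
public static bool ContainsSegment(this string url, string segment)
{
    // Regex.Escape guards against filters that contain regex metacharacters
    string pattern = String.Format("http://.*/{0}(/|$)", Regex.Escape(segment));
    return Regex.IsMatch(url, pattern, RegexOptions.IgnoreCase);
}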
Now code looks very self-describing:
if (_filterList.Any(filter => PAGEURL.ContainsSegment(filter)))
{
_parseResultsFinal.Add(PAGEURL);
filteredByURL++;
}
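If you'd rather not build regex patterns at all, the same segment check can probably be done by letting System.Uri parse the URL and comparing path segments directly. This is only a sketch: the UrlSegmentFilter class and ContainsAnySegment method are hypothetical names, and it assumes the URL is absolute and that filters may be written with or without surrounding slashes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, shown only as an illustration.
public static class UrlSegmentFilter
{
    public static bool ContainsAnySegment(string url, IEnumerable<string> filters)
    {
        Uri uri;
        if (!Uri.TryCreate(url.Trim(), UriKind.Absolute, out uri))
            return false;

        // AbsolutePath is e.g. "/forum/thread/1"; split it into "forum", "thread", "1".
        string[] segments = uri.AbsolutePath.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);

        // Normalise filters such as "/forum/" to "forum" and compare ignoring case.
        var wanted = new HashSet<string>(
            filters.Select(f => f.Trim().Trim('/')),
            StringComparer.OrdinalIgnoreCase);

        return segments.Any(s => wanted.Contains(s));
    }
}
```

Because whole segments are compared, a filter of "php" matches /php/ but not /phpbb/, and the host name and query string are never looked at.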
With the following code:
string q = "userID=16555&gameID=60&score=4542.122&time=343114";
What would be the easiest way to parse the values, preferably without writing my own parser? I'm looking for something with the same functionality as Request.QueryString["gameID"].
Pretty easy... Use the HttpUtility.ParseQueryString method.
Untested, but this should work:
var qs = "userID=16555&gameID=60&score=4542.122&time=343114";
var parsed = HttpUtility.ParseQueryString(qs);
var userId = parsed["userID"];
// ^^^^^^ Should be "16555". Note this will be a string of course.
You can do it with linq like this.
string query = "id=3123123&userId=44423&format=json";
Dictionary<string,string> dicQueryString =
query.Split('&')
.ToDictionary(c => c.Split('=')[0],
c => Uri.UnescapeDataString(c.Split('=')[1]));
string userId = dicQueryString["userId"]; // keys are case-sensitive here: "userID" would throw KeyNotFoundException
Edit
If you can use HttpUtility.ParseQueryString, it is a lot more straightforward, and its key lookups are not case-sensitive, unlike the LINQ dictionary above.
As has been mentioned in each of the previous answers, if you are in a context where you can add a dependency to the System.Web library, using HttpUtility.ParseQueryString makes sense. (For reference, the relevant source can be found in the Microsoft Reference Source). However, if this is not possible, I would like to propose the following modification to Adil's answer which accounts for many of the concerns addressed in the comments (such as case sensitivity and duplicate keys):
var q = "userID=16555&gameID=60&score=4542.122&time=343114";
var parsed = q.TrimStart('?')
.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries)
.Select(k => k.Split('='))
.Where(k => k.Length == 2)
.ToLookup(a => a[0], a => Uri.UnescapeDataString(a[1])
, StringComparer.OrdinalIgnoreCase);
var userId = parsed["userID"].FirstOrDefault();
var time = parsed["TIME"].Select(v => (int?)int.Parse(v)).FirstOrDefault();
If you want to avoid the dependency on System.Web that is required to use HttpUtility.ParseQueryString, you could use the Uri extension method ParseQueryString found in System.Net.Http.
Note that you have to convert the response body to a valid Uri so that ParseQueryString works.
Please also note that, according to the MSDN documentation, this method is an extension method for the Uri class, so you need to reference the assembly System.Net.Http.Formatting (in System.Net.Http.Formatting.dll). I installed it via the NuGet package named "System.Net.Http.Formatting", and it works fine.
string body = "value1=randomvalue1&value2=randomValue2";
// "http://localhost/query?" is added to the string "body" in order to create a valid Uri.
string urlBody = "http://localhost/query?" + body;
NameValueCollection coll = new Uri(urlBody).ParseQueryString();
How about this?
using System.Text.RegularExpressions;
// query example
// "name1=value1&name2=value2&name3=value3"
// "?name1=value1&name2=value2&name3=value3"
private Dictionary<string, string> ParseQuery(string query)
{
var dic = new Dictionary<string, string>();
var reg = new Regex("(?:[?&]|^)([^&]+)=([^&]*)");
var matches = reg.Matches(query);
foreach (Match match in matches) {
dic[match.Groups[1].Value] = Uri.UnescapeDataString(match.Groups[2].Value);
}
return dic;
}
System.Net.Http ParseQueryString extension method worked for me. I'm using OData query options and trying to parse out some custom parameters.
options.Request.RequestUri.ParseQueryString();
Seems to give me what I need.
HttpUtility.ParseQueryString will work as long as you are in a web app or don't mind including a dependency on System.Web. Another way to do this is:
// NameValueCollection nameValueCollection = HttpUtility.ParseQueryString(queryString);
NameValueCollection nameValueCollection = new NameValueCollection();
string[] querySegments = queryString.Split('&');
foreach(string segment in querySegments)
{
string[] parts = segment.Split('=');
if (parts.Length > 1) // parts[1] is read below, so a segment without '=' must be skipped
{
string key = parts[0].Trim(new char[] { '?', ' ' });
string val = parts[1].Trim();
nameValueCollection.Add(key, val);
}
}
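The loop above does not URL-decode anything and splits on every '=', so values containing '=' or escaped characters come out wrong. A slightly more defensive variant might look like the sketch below. The QueryStringParser name is made up, and treating '+' as a space and a key without '=' as an empty value are assumptions about what you want.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper, shown only as an illustration.
public static class QueryStringParser
{
    public static NameValueCollection Parse(string queryString)
    {
        // Case-insensitive keys, similar to what Request.QueryString offers.
        var result = new NameValueCollection(StringComparer.OrdinalIgnoreCase);
        if (string.IsNullOrEmpty(queryString))
            return result;

        string trimmed = queryString.TrimStart('?');
        foreach (string segment in trimmed.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Split on the first '=' only, so values may themselves contain '='.
            int separator = segment.IndexOf('=');
            string rawKey = separator < 0 ? segment : segment.Substring(0, separator);
            string rawValue = separator < 0 ? string.Empty : segment.Substring(separator + 1);
            result.Add(Decode(rawKey), Decode(rawValue));
        }
        return result;
    }

    private static string Decode(string text)
    {
        // '+' is turned into a space before unescaping, so "%2B" still decodes to '+'.
        return Uri.UnescapeDataString(text.Replace('+', ' '));
    }
}
```

Usage mirrors the original question: QueryStringParser.Parse(q)["gameID"]. Repeated keys end up comma-joined by the indexer, and GetValues returns them individually.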
For .NET Core there is Microsoft.AspNetCore.WebUtilities.QueryHelpers.ParseQuery
var queryString = QueryHelpers.ParseQuery("?param1=value");
var queryParamValue = queryString["param1"];
Code snippet modified from trackjs.com: