.NET Regular expression for querystring value - c#

I need to strip out any "&id=SomeValue" from a Url.PathAndQuery. Where SomeValue could be an int or a string. And it may or may not be followed by another ampersand.
So it could be
somepage.aspx?cat=22&id=SomeId&param2=4
or
somepage.aspx?cat=tect&id=450
I want to be left with
somepage.aspx?cat=22&param2=4
or
somepage.aspx?cat=tect

Just going off the top of my head...
string url = "somepage.aspx?cat=22&id=SomeId&param2=4";
Regex regex = new Regex(@"([?&])id=[^?&]+");
url = regex.Replace(url, "$1");
System.Diagnostics.Debug.WriteLine("url = " + url);
Update 2010-03-05 11:12 PM PST
I've been shamed by a comment into actually testing my code. What are you, my QA department? Here's a working example using MSTest.
Regex regex = new Regex(@"([\?\&])id=[^\&]+[\&]?");
[TestMethod]
public void RegexReplacesParameterInMiddle()
{
string url = "somepage.aspx?cat=22&id=SomeId&param2=4";
url = regex.Replace(url, "$1");
Assert.AreEqual("somepage.aspx?cat=22&param2=4",url);
}
[TestMethod]
public void RegexReplacesParameterInFront()
{
string url = "somepage.aspx?id=SomeId&cat=22&param2=4";
url = regex.Replace(url, "$1");
Assert.AreEqual("somepage.aspx?cat=22&param2=4", url);
}
[TestMethod]
public void RegexReplacesParameterAtEnd()
{
string url = "somepage.aspx?cat=22&param2=4&id=SomeId";
url = regex.Replace(url, "$1");
Assert.AreEqual("somepage.aspx?cat=22&param2=4&", url);
}
[TestMethod]
public void RegexReplacesSoleParameter()
{
string url = "somepage.aspx?id=SomeId";
url = regex.Replace(url, "$1");
Assert.AreEqual("somepage.aspx?", url);
}
[TestMethod]
public void RegexIgnoresMissingParameter()
{
string url = "somepage.aspx?foo=bar&blet=monkey";
url = regex.Replace(url, "$1");
Assert.AreEqual("somepage.aspx?foo=bar&blet=monkey", url);
}
The regex, interpreted, says:
Look for a "?" or an "&" character (and store it as a backreference)
followed by "id="
followed by one or more non-"&" characters.
optionally followed by another "&"
Then replace that expression with the backreference, so you don't lose your initial ?/&.
note -- as you can see from the tests, this emits a trailing ? or & when the replaced parameter is the only one or the last one, respectively. You could use string methods to get rid of that, though if somebody knows how to keep them out of the result using only regular expressions it would be excellent to see.
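One way to deal with the leftover separator using string methods, as a minimal sketch: it assumes the URL never legitimately ends in "?" or "&". The class and method names are made up for illustration; the pattern is the one from the tests.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IdStripper
{
    // Same pattern as in the tests: a leading ? or &, then id=value, then an optional &.
    private static readonly Regex IdParam = new Regex(@"([\?\&])id=[^\&]+[\&]?");

    // Hypothetical helper: removes the id parameter, then drops a dangling ? or &.
    public static string RemoveId(string url)
    {
        string stripped = IdParam.Replace(url, "$1");
        return stripped.TrimEnd('?', '&');
    }
}
```

With this, the "at end" and "sole parameter" cases come back without the trailing character, while the other cases are unchanged.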

If bad things could happen (e.g., security-wise) when an "id=" parameter is missed by the regular expression, you also need to worry that the query string might contain a percent-encoded equivalent, which the regular expression will not recognize. For example, "id" can be written as "%69%64". Also consider the effects of different capitalizations of "id" on your program. My opinion in this situation is that you should read the RFCs and build a complete class that can transform in both directions between a set of name-value pairs and a query string. System.Uri will not do this. If you are running inside an ASP.NET application, you might investigate whether HttpUtility.ParseQueryString is sufficient.
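A rough sketch of the ParseQueryString route, for comparison with the regex. The helper name and its shape are assumptions made for illustration. The point is that the parser decodes names before the lookup, so an encoded "%69%64" is treated as "id". Whether the returned collection also ignores case for keys is worth verifying on your target framework.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class QueryEditor
{
    // Hypothetical helper: drops one parameter from a path-and-query string.
    public static string RemoveParameter(string pathAndQuery, string name)
    {
        int q = pathAndQuery.IndexOf('?');
        if (q < 0) return pathAndQuery;              // no query string at all

        string path = pathAndQuery.Substring(0, q);
        // ParseQueryString URL-decodes keys and values, so "%69%64" is seen as "id".
        NameValueCollection query = HttpUtility.ParseQueryString(pathAndQuery.Substring(q + 1));
        query.Remove(name);

        // ToString() on the returned collection re-encodes it as name=value pairs.
        string rebuilt = query.ToString();
        return rebuilt.Length == 0 ? path : path + "?" + rebuilt;
    }
}
```

Because the query string is rebuilt from scratch, there is no dangling "?" or "&" to clean up afterwards. Note that the rebuilt string is re-encoded, so it may not be byte-for-byte identical to the input.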

I would first parse the Querystring to Strongly typed values, then I would check using Regex if I needed to.
C# ASP.NET QueryString parser

Related

How to split a url c# [duplicate]

I have a uri string like: http://example.com/file?a=1&b=2&c=string%20param
Is there an existing function that would convert the query parameter string into a dictionary, the same way ASP.NET's Context.Request does?
I'm writing a console app, not a web service, so there is no Context.Request to parse the URL for me.
I know that it's pretty easy to crack the query string myself, but I'd rather use an FCL function if one exists.
Use this:
string uri = ...;
string queryString = new System.Uri(uri).Query;
var queryDictionary = System.Web.HttpUtility.ParseQueryString(queryString);
This code by Tejs isn't the 'proper' way to get the query string from the URI:
string.Join(string.Empty, uri.Split('?').Skip(1));
You can use:
var queryString = url.Substring(url.IndexOf('?')).Split('#')[0];
System.Web.HttpUtility.ParseQueryString(queryString);
MSDN
This should work:
string url = "http://example.com/file?a=1&b=2&c=string%20param";
string querystring = url.Substring(url.IndexOf('?'));
System.Collections.Specialized.NameValueCollection parameters =
System.Web.HttpUtility.ParseQueryString(querystring);
According to MSDN. It is not the exact collection type you are looking for, but it is nevertheless useful.
Edit: Apparently, if you supply the complete URL to ParseQueryString, it will add 'http://example.com/file?a' as the first key of the collection. Since that is probably not what you want, I added the Substring call to get only the relevant part of the URL.
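If an actual dictionary is wanted, the NameValueCollection can be converted afterwards. This is only a sketch: the class and method names are invented for the example, and it assumes an absolute URL so that System.Uri can isolate the query part.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Linq;
using System.Web;

public static class QueryDictionary
{
    // Hypothetical helper: parses the query part of an absolute URL into a dictionary.
    public static Dictionary<string, string> FromUrl(string url)
    {
        NameValueCollection parsed = HttpUtility.ParseQueryString(new Uri(url).Query);
        return parsed.AllKeys
            .Where(key => key != null)               // a bare "?flag" is stored under a null key
            .ToDictionary(key => key, key => parsed[key]);
    }
}
```

A parameter that appears more than once ends up as a single comma-joined value, since that is what the collection's indexer returns.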
I had to do this for a modern windows app. I used the following:
public static class UriExtensions
{
private static readonly Regex _regex = new Regex(@"[?&](\w[\w.]*)=([^?&]+)");
public static IReadOnlyDictionary<string, string> ParseQueryString(this Uri uri)
{
var match = _regex.Match(uri.PathAndQuery);
var parameters = new Dictionary<string, string>();
while (match.Success)
{
parameters.Add(match.Groups[1].Value, match.Groups[2].Value);
match = match.NextMatch();
}
return parameters;
}
}
Have a look at HttpUtility.ParseQueryString(). It'll give you a NameValueCollection instead of a dictionary, but should still do what you need.
The other option is to use string.Split().
string url = @"http://example.com/file?a=1&b=2&c=string%20param";
string[] parts = url.Split(new char[] {'?','&'});
// parts[0] now contains http://example.com/file
// parts[1] = "a=1"
// parts[2] = "b=2"
// parts[3] = "c=string%20param"
For isolated projects, where dependencies must be kept to a minimum, I found myself using this implementation:
var arguments = uri.Query
.Substring(1) // Remove '?'
.Split('&')
.Select(q => q.Split('='))
.ToDictionary(q => q.FirstOrDefault(), q => q.Skip(1).FirstOrDefault());
Do note, however, that I do not handle encoded strings of any kind, as I was using this in a controlled setting, where encoding issues would be a coding error on the server side that should be fixed.
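For cases where encoding does matter but System.Web is still off the table, a variant along the same lines might look like the following. It is a sketch, not a full parser: the names are hypothetical, "+" is assumed to mean a space (form-style encoding), and a repeated key will throw because ToDictionary rejects duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MinimalQueryParser
{
    // Hypothetical helper: splits uri.Query by hand and percent-decodes each part.
    public static Dictionary<string, string> Parse(Uri uri)
    {
        return uri.Query
            .TrimStart('?')                                   // Query is "" when there is no query string
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(pair => pair.Split(new[] { '=' }, 2))     // split on the first '=' only
            .ToDictionary(
                parts => Uri.UnescapeDataString(parts[0]),
                parts => parts.Length > 1
                    ? Uri.UnescapeDataString(parts[1].Replace('+', ' '))
                    : string.Empty);
    }
}
```

Using TrimStart instead of Substring(1) also avoids an exception when the URI has no query string at all.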
In a single line of code:
string xyz = Uri.UnescapeDataString(HttpUtility.ParseQueryString(Request.QueryString.ToString()).Get("XYZ"));
Microsoft Azure offers a framework that makes it easy to perform this.
http://azure.github.io/azure-mobile-services/iOS/v2/Classes/MSTable.html#//api/name/readWithQueryString:completion:
You could reference System.Web in your console application and then look for the Utility functions that split the URL parameters.

C# Regex, any more efficient way to parse string enclosed by symbol?

I'm not sure if it's okay to ask... But here goes.
I implemented a method that parses a string using regex. Each match is passed through the delegates in order (actually, order is not important, I think... wait, is it? But I wrote it this way, and it's not fully tested):
Pattern Regex.Replace: @"(?<!\\)\$.+?\$" then String.Replace: @"\$", @"$"; Replace string enclosed by dollar sign. Ignores backslash ones, then erases backslash. Ex: "$global name$" -> "motherofglobalvar", "Money \$9000" -> "Money $9000"
Pattern Regex.Replace @"(?<!\\)%.+?%" then String.Replace @"\%", @"%"; Replace string enclosed by percentage sign. Ignores backslash ones, then erases backslash. Same as previous example: "%local var%" -> "lordoflocalvar", "It's over 9000\%" -> "It's over 9000%"
Pattern Regex.Replace @"(?<!\\)@" then String.Replace @"\@", @"@"; Replace char '@' with whitespace, ' '. But ignore backslash ones, then erase the backslash. Ex: "I@hit@the@ground@too@hard" -> "I hit the ground too hard", "qw\@op" -> "qw@op"
What I've done without much experience (I think):
//parse variable
public static string ParseVariable(string text)
{
return Regex.Replace(Regex.Replace(Regex.Replace(text, @"(?<!\\)\$.+?\$", match =>
{
string trim = match.Value.Trim('$');
string trimUpper = trim.ToUpper();
return variableGlobal.ContainsKey(trim) ? variableGlobal[trim] : match.Value;
}).Replace(@"\$", @"$"), @"(?<!\\)%.+?%", match =>
{
string trim = match.Value.Trim('%');
string trimUpper = trim.ToUpper();
return variableLocal.ContainsKey(trim) ? variableLocal[trim] : match.Value;
}).Replace(@"\%", @"%"), @"(?<!\\)@", " ").Replace(@"\@", @"@");
}
In short, what I used is: Regex.Replace().Replace()
Since I need to parse 3 kinds of symbols, I chained it as following: Regex.Replace(Regex.Replace(Regex.Replace().Replace()).Replace()).Replace()
Is there any more efficient way than this? I mean, like without need to go through the text 6 times? (3 times regex.replace, 3 times string.replace, where each replace modifies the text to be used by the next replace )
Or is this already the best that can be done?
Thanks.
Here's a unique take on the problem, I think. You can build a class that will be used to construct the overall pattern piece-by-piece. This class will be responsible for the generating of the MatchEvaluator delegate that will be passed to Replace as well.
class RegexReplacer
{
public string Pattern { get; private set; }
public string Replacement { get; private set; }
public string GroupName { get; private set; }
public RegexReplacer NextReplacer { get; private set; }
public RegexReplacer(string pattern, string replacement, string groupName, RegexReplacer nextReplacer = null)
{
this.Pattern = pattern;
this.Replacement = replacement;
this.GroupName = groupName;
this.NextReplacer = nextReplacer;
}
public string GetAggregatedPattern()
{
string constructedPattern = this.Pattern;
string alternation = (this.NextReplacer == null ? string.Empty : "|" + this.NextReplacer.GetAggregatedPattern()); // If there isn't another replacer, then we won't have an alternation; otherwise, we build an alternation between this pattern and the next replacer's "full" pattern
constructedPattern = string.Format("(?<{0}>{1}){2}", this.GroupName, this.Pattern, alternation); // The (?<XXX>) syntax builds a named capture group. This is used by our GetReplaceDelegate method.
return constructedPattern;
}
public MatchEvaluator GetReplaceDelegate()
{
return (match) =>
{
if (match.Groups[this.GroupName] != null && match.Groups[this.GroupName].Length > 0) // Did we get a hit on the group name?
{
return this.Replacement;
}
else if (this.NextReplacer != null) // No? Then is there another replacer to inspect?
{
MatchEvaluator next = this.NextReplacer.GetReplaceDelegate();
return next(match);
}
else
{
return match.Value; // No? Then simply return the value
}
};
}
}
It should be obvious what Pattern and Replacement represent. GroupName is kind of a hack to let the replacement evaluator know which RegexReplacer fragment resulted in the match. NextReplacer points to another replacer instance that holds a different pattern fragment (and so on down the chain).
The idea here is to have a kind of linked list of objects that represents the overall pattern. You can call GetAggregatedPattern on the outermost replacer to get the full pattern: each replacer calls the next replacer's GetAggregatedPattern to get that replacer's pattern fragment, to which it concatenates its own fragment. GetReplaceDelegate generates a MatchEvaluator. This MatchEvaluator compares its own GroupName to the Match's captured groups. If the group name was captured, then we have a hit, and we return this replacer's Replacement value. Otherwise, we step into the next replacer (if there is one) and repeat the group name comparison. If there is no hit on any replacer, then we simply yield back the original value (i.e. what was matched by the pattern; this should be rare).
The usage of such might look like this:
string target = @"$global name$ Money \$9000 %local var% It's over 9000\% I@hit@the@ground@too@hard qw\@op";
RegexReplacer dollarWrapped = new RegexReplacer(@"(?<!\\)\$[^$]+\$", "motherofglobalvar", "dollarWrapped");
RegexReplacer slashDollar = new RegexReplacer(@"\\\$", "$", "slashDollar", dollarWrapped);
RegexReplacer percentWrapped = new RegexReplacer(@"(?<!\\)%[^%]+%", "lordoflocalvar", "percentWrapped", slashDollar);
RegexReplacer slashPercent = new RegexReplacer(@"\\%", "%", "slashPercent", percentWrapped);
RegexReplacer singleAt = new RegexReplacer(@"(?<!\\)@", " ", "singleAt", slashPercent);
RegexReplacer slashAt = new RegexReplacer(@"\\@", "@", "slashAt", singleAt);
RegexReplacer replacer = slashAt;
string pattern = replacer.GetAggregatedPattern();
MatchEvaluator evaluator = replacer.GetReplaceDelegate();
string result = Regex.Replace(target, pattern, evaluator);
Because you want each replacer to know if it got a hit, and because we are hacking this by using group names, you want to make sure that each group name is distinct. A simple way to ensure this would be to use a name that's identical to the variable name since you can't have two variables with the same name within the same scope.
You can see above that I am building each part of the pattern separately, but as I build, I pass the previous replacer as a 4th parameter to the current replacer. This builds the chain of replacers. Once built, I use the last replacer constructed in order to generate the overall pattern and evaluator. If you use anything but, then you will only have part of the overall pattern. Finally, it's simply a matter of passing the generated pattern and evaluator to the Replace method.
Keep in mind that this approach was targeted more at the problem as described. It may work in more general scenarios, but I've only worked with what you've presented. Also, since this is more of a parsing question, a parser may be the proper route to take--although the learning curve is going to be higher.
Also keep in mind that I haven't profiled this code. It certainly doesn't loop over the target string multiple times, but it does involve additional method calls during replacement. You would certainly want to test it in your environment.
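For comparison, a more compact single-pass variant could fold everything into one alternation and one MatchEvaluator. This is a sketch under assumptions: the three markers are taken to be $, % and @ (as the names singleAt and slashAt suggest), the class name is invented, and the two dictionaries stand in for the variableGlobal and variableLocal lookups from the question. It has not been profiled either.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VariableParser
{
    // Alternatives are tried left to right, so escaped symbols are consumed first.
    private static readonly Regex Token = new Regex(
        @"\\(?<escaped>[$%@])|\$(?<global>[^$]+)\$|%(?<local>[^%]+)%|@");

    // Hypothetical helper; globals/locals stand in for variableGlobal/variableLocal.
    public static string Parse(string text,
        IDictionary<string, string> globals, IDictionary<string, string> locals)
    {
        return Token.Replace(text, match =>
        {
            if (match.Groups["escaped"].Success)
                return match.Groups["escaped"].Value;          // "\$" becomes "$"
            if (match.Groups["global"].Success)
                return globals.TryGetValue(match.Groups["global"].Value, out string g) ? g : match.Value;
            if (match.Groups["local"].Success)
                return locals.TryGetValue(match.Groups["local"].Value, out string l) ? l : match.Value;
            return " ";                                        // a bare "@"
        });
    }
}
```

Unknown variable names are left untouched, mirroring the original ParseVariable. Edge cases such as an escaped symbol inside a variable name may behave differently from the chained version, so it is worth testing against real input.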

Fixed string Regular Expression C#

Hi all, I want to know something regarding fixed strings in regular expressions.
How do I represent a fixed string, regardless of special or alphanumeric characters, in C#?
For example, have a look at the following string:
infinity.world.uk/Members/namelist.aspx?ID=-1&fid=X
The entire string before X is fixed (i.e., it will always appear the same); only X is a decimal variable.
I want to append the decimal number X to the fixed string. How do I express that in terms of a C# regular expression?
Appreciate your help.
string fulltext = "infinity.world.uk/Members/namelist.aspx?ID=-1&fid=" + 10;
If you need to modify an existing URL, don't use regex, string.Format, or string.Replace: you will run into problems with the encoding of arguments.
Use Uri and HttpUtility instead:
var url = new Uri("http://infinity.world.uk/Members/namelist.aspx?ID=-1&fid=X");
var query = HttpUtility.ParseQueryString(url.Query);
query["fid"] = 10.ToString();
var newUrl = url.GetLeftPart(UriPartial.Path) + "?" + query;
Result: http://infinity.world.uk/Members/namelist.aspx?ID=-1&fid=10
For example, with query["fid"] = "%"; you correctly get http://infinity.world.uk/Members/namelist.aspx?ID=-1&fid=%25
demo: https://dotnetfiddle.net/zZ9Y1h
String.Format is one way of replacing token values in a string, if that's what you want. In the example below, the {0} is a token, and String.Format takes the fixedString and replaces the token with the value of myDecimal.
string fixedString = "infinity.world.uk/Members/namelist.aspx?ID=-1&fid={0}";
decimal myDecimal = 1.5m;
string myResultString = string.Format(fixedString, myDecimal.ToString());

Replace URL in a string

I'm a beginner in C# and I have the following string,
string url = "svn1/dev";
along with,
string urlMod = "ato-svn3-sslv3.of.lan/svn/dev";
I want to replace svn1 in url with "ato-svn3-sslv3.of.lan".
Although your question still has some inconsistent statements, I believe String.Replace is what you are looking for:
http://msdn.microsoft.com/en-us/library/fk49wtc1.aspx
url = url.Replace("svn1","ato-svn3-sslv3.of.lan");
Strings are immutable so you need to assign the return value to a variable:
string replacement = "ato-svn3-sslv3.of.lan";
url = url.Replace("svn1", replacement);
You can use the string method Replace.
url = url.Replace("svn1", urlMod);
I think you need this:
string url = "svn1/dev";
string anotherUrl = "ato-svn3-sslv3.of.lan/svn/dev";
string toBeReplaced = anotherUrl.Split('/')[0];
url = url.Replace("svn1", toBeReplaced);
It uses split method and replace method.

How can I parse HTTP urls in C#?

My requirement is to parse HTTP URLs and call functions accordingly. In my current implementation, I am using nested if-else statements, which I think is not an optimal approach. Can you suggest a more efficient approach?
The URLs look like this:
server/func1
server/func1/SubFunc1
server/func1/SubFunc2
server/func2/SubFunc1
server/func2/SubFunc2
I think you can get a lot of use out of the System.Uri class. Feed it a URI and you can pull out pieces in a number of arrangements.
Some examples:
Uri myUri = new Uri("http://server:8080/func2/SubFunc2?query=somevalue");
// Get host part (host name or address and port). Returns "server:8080".
string hostpart = myUri.Authority;
// Get path and query string parts. Returns "/func2/SubFunc2?query=somevalue".
string pathpart = myUri.PathAndQuery;
// Get path components, with trailing separators. Returns { "/", "func2/", "SubFunc2" }.
string[] pathsegments = myUri.Segments;
// Get query string. Returns "?query=somevalue".
string querystring = myUri.Query;
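To replace the nested if-else from the question, the path from System.Uri can be used as a key into a lookup table of handlers. The following is only a sketch: the class name, the handler bodies, and the "not found" fallback are placeholders, and it assumes absolute URLs.

```csharp
using System;
using System.Collections.Generic;

public static class UrlDispatcher
{
    // Hypothetical handlers keyed by path; the returned strings are placeholders.
    private static readonly Dictionary<string, Func<string>> Handlers =
        new Dictionary<string, Func<string>>(StringComparer.OrdinalIgnoreCase)
        {
            { "/func1",          () => "func1" },
            { "/func1/SubFunc1", () => "func1.SubFunc1" },
            { "/func2/SubFunc2", () => "func2.SubFunc2" },
        };

    public static string Dispatch(string url)
    {
        // AbsolutePath excludes the query string; drop any trailing slash before the lookup.
        string path = new Uri(url).AbsolutePath.TrimEnd('/');
        return Handlers.TryGetValue(path, out Func<string> handler)
            ? handler()
            : "not found";
    }
}
```

Adding a new URL then means adding one dictionary entry rather than another branch.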
This might be a bit of a late answer, but I recently found myself trying to parse some URLs, and I used a combination of Uri and System.Web.HttpUtility as seen here. My URLs were like http://one-domain.com/some/segments/{param1}?param2=x.... so this is what I did:
var uri = new Uri(myUrl);
string param1 = uri.Segments.Last();
var parameters = HttpUtility.ParseQueryString(uri.Query);
string param2 = parameters["param2"];
Note that in both cases you'll be working with strings, and be especially wary when working with segments.
I combined the split in Suncat2000's answer with string splitting to get at interesting features of the URL. I am passing in a full Uri including https: etc. from another page as the navigation argument e.Parameter:
Uri playlistUri = (Uri)e.Parameter;
string youtubePlaylistUnParsed = playlistUri.Query;
char delimiterChar = '=';
string[] sections = youtubePlaylistUnParsed.Split(delimiterChar);
string YoutubePlaylist = sections[1];
This gets me the playlist in the PLs__ etc. form for use in the Google APIs.
