I am trying to remove empty url type parameters from a string using C#. My code sample is here.
public static string test ()
{
string parameters = "one=aa&two=&three=aaa&four=";
string pattern = "&[a-zA-Z][a-zA-Z]*=&";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(parameters, replacement);
return parameters;
}
public static void Main(string[] args)
{
Console.WriteLine(test());
}
I tried the code in rextester
output: one=aa&two=&three=aaa&four=
expected output: one=aa&three=aaa
You absolutely do not need to roll your own regex for this. Try using HttpUtility.ParseQueryString():
public static string RemoveEmptyUrlParameters(string input)
{
var results = HttpUtility.ParseQueryString(input);
Dictionary<string, string> nonEmpty = new Dictionary<string, string>();
foreach(var k in results.AllKeys)
{
if(!string.IsNullOrWhiteSpace(results[k]))
{
nonEmpty.Add(k, results[k]);
}
}
return string.Join("&", nonEmpty.Select(kvp => $"{kvp.Key}={kvp.Value}"));
}
Regex:
(?:^|&)[a-zA-Z]+=(?=&|$)
This matches start of string or an ampersand ((?:^|&)) followed by at least one (english) letter ([a-zA-Z]+), an equal sign (=) and then nothing, made sure by the positive look-ahead ((?=&|$)) which matches end of string or a new parameter (started by &).
Code:
public static string test ()
{
string parameters = "one=aa&two=&three=aaa&four=";
string pattern = "(?:^|&)[a-zA-Z]+=(?=&|$)";
string replacement = "";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(parameters, replacement);
return result;
}
public static void Main(string[] args)
{
Console.WriteLine(test());
}
Note that this also returns the correct variable (as pointed out by Joel Anderson)
See it live here at ideone.
The result of the Regex replace is not returned by the function. The function returns the variable "parameters", which is never updated or changed.
string parameters = "one=aa&two=&three=aaa&four=";
...
string result = rgx.Replace(parameters, replacement);
return parameters;
....
Perhaps you meant
return result;
I have the following sensitive data:
"Password":"123","RootPassword":"123qwe","PassPhrase":"phrase"
I would like to get the following safe data:
"Password":"***","RootPassword":"***","PassPhrase":"***"
It's my code:
internal class Program
{
private static void Main(string[] args)
{
var data = "\"Password\":\"123\",\"RootPassword\":\"123qwe\",\"PassPhrase\":\"phrase\"";
var safe1 = PasswordReplacer.Replace1(data);
var safe2 = PasswordReplacer.Replace2(data);
}
}
public static class PasswordReplacer
{
private const string RegExpReplacement = "$1***$2";
private const string Template = "(\"{0}\":\").*?(\")";
private static readonly string[] PasswordLiterals =
{
"password",
"RootPassword",
"PassPhrase"
};
public static string Replace1(string sensitiveInfo)
{
foreach (var literal in PasswordLiterals)
{
var pattern = string.Format(Template, literal);
var regex = new Regex(pattern, RegexOptions.IgnoreCase);
sensitiveInfo = regex.Replace(sensitiveInfo, RegExpReplacement);
}
return sensitiveInfo;
}
public static string Replace2(string sensitiveInfo)
{
var multiplePattern = "(\"password\":\")|(\"RootPassword\":\")|(\"PassPhrase\":\").*?(\")"; //?
var regex = new Regex(string.Format(Template, multiplePattern), RegexOptions.IgnoreCase);
return regex.Replace(sensitiveInfo, RegExpReplacement);
}
}
The Replace1 method works as expected, but it handles the keys one by one. My question is: is it possible to do the same using a single regex match? If so, I need help with Replace2.
The Replace2 can look like
public static string Replace2(string sensitiveInfo)
{
var multiplePattern = $"(\"(?:{string.Join("|", PasswordLiterals)})\":\")[^\"]*(\")";
return Regex.Replace(sensitiveInfo, multiplePattern, RegExpReplacement, RegexOptions.IgnoreCase);
}
See the C# demo.
The multiplePattern will hold a pattern like ("(?:password|RootPassword|PassPhrase)":")[^"]*("), see the regex demo. Quick details:
("(?:password|RootPassword|PassPhrase)":") - Group 1 ($1): a " char followed with either password, RootPassword or PassPhrase and then a ":" substring
[^"]* - any zero or more chars other than " as many as possible
(") - Group 2 ($2): a " char.
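If the key list might ever contain regex metacharacters, a slightly more defensive variant of the same idea could look like the sketch below. The class name PasswordMasker, the method name Mask and the use of Regex.Escape are my own additions for illustration, not part of the answer above; the pattern is otherwise the same and is built once instead of on every call.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PasswordMasker
{
    // Key names whose values should be masked (taken from the question).
    private static readonly string[] PasswordLiterals =
    {
        "password",
        "RootPassword",
        "PassPhrase"
    };

    // Built once; for the keys above this is
    // ("(?:password|RootPassword|PassPhrase)":")[^"]*(")
    // Regex.Escape is an assumption-level safeguard: it only matters
    // if a key ever contains characters such as '.' or '('.
    private static readonly Regex MaskRegex = new Regex(
        "(\"(?:" + string.Join("|", PasswordLiterals.Select(k => Regex.Escape(k))) + ")\":\")[^\"]*(\")",
        RegexOptions.IgnoreCase);

    public static string Mask(string sensitiveInfo)
    {
        // $1 restores the "key":" prefix, $2 restores the closing quote.
        return MaskRegex.Replace(sensitiveInfo, "$1***$2");
    }
}
```

Calling PasswordMasker.Mask on the sample data from the question should give "Password":"***","RootPassword":"***","PassPhrase":"***".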
I have this string:
http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&amp;sid=&amp;q=1007040&amp;a=2
How can I pick out the number between "q=" and "&" as an integer?
So in this case I want to get the number: 1007040
What you're actually doing is parsing a URI, so you can use the .NET library to do this properly as follows:
var str = "http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&amp;sid=&amp;q=1007040&amp;a=2";
var uri = new Uri(str);
var query = uri.Query;
var dict = System.Web.HttpUtility.ParseQueryString(query);
Console.WriteLine(dict["amp;q"]); // Outputs 1007040
If you want the numeric string as an integer then you'd need to parse it:
int number = int.Parse(dict["amp;q"]);
Consider using regular expressions
String str = "http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&amp;sid=&amp;q=1007040&amp;a=2";
Match match = Regex.Match(str, @"q=\d+&");
if (match.Success)
{
string resultStr = match.Value.Replace("q=", String.Empty).Replace("&", String.Empty);
int.TryParse(resultStr, out int result); // result = 1007040
}
Seems like you want a query parameter for a uri that's html encoded. You could do:
Uri uri = new Uri(HttpUtility.HtmlDecode("http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&amp;sid=&amp;q=1007040&amp;a=2"));
string q = HttpUtility.ParseQueryString(uri.Query).Get("q");
int qint = int.Parse(q);
A regex approach using groups:
public int GetInt(string str)
{
var match = Regex.Match(str, @"q=(\d*)&");
return int.Parse(match.Groups[1].Value);
}
Absolutely no error checking in that!
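For what it's worth, here is a rough sketch of how the same group-based approach might look with the error checking added. The names QueryNumber and TryGetQ are made up for this example, and the pattern is my own variation: it requires q to start a parameter (after ?, & or the ; of an HTML-encoded &amp;) and lets the value end at &, # or the end of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QueryNumber
{
    // Returns true and sets value when a numeric q parameter is found;
    // returns false (value = 0) when it is missing, empty or not a valid int.
    public static bool TryGetQ(string url, out int value)
    {
        value = 0;
        if (string.IsNullOrEmpty(url))
        {
            return false;
        }

        // (?:^|[?&;]) : q must begin a parameter, not be the tail of another name
        // (\d+)       : one or more digits, captured as group 1
        // (?=[&#]|$)  : the value must end at the next parameter, a fragment or the end
        Match match = Regex.Match(url, @"(?:^|[?&;])q=(\d+)(?=[&#]|$)");

        // int.TryParse also guards against digit runs that overflow an int.
        return match.Success && int.TryParse(match.Groups[1].Value, out value);
    }
}
```

With the URL from the question, TryGetQ should return true and set value to 1007040; for a URL without a q parameter it returns false instead of throwing.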
I'm trying to parse messages transmitted over TCP for my own network protocol using regex, without success.
My commands start with ! followed by COMMAND_NAME and a list of arguments in the format ARGUMENT_NAME=ARGUMENT_VALUE, each enclosed in <>
for example:
!LOGIN?<USERNAME='user'><PASSWORD='password'>;
my code :
public class CommandParser
{
private Dictionary<string, string> arguments = new Dictionary<string, string>();
public CommandParser(string input)
{
Match commandMatch = Regex.Match(input, @"\!([^)]*)\&");
if (commandMatch.Success)
{
CommandName = commandMatch.Groups[1].Value;
}
// Here we call Regex.Match.
MatchCollection matches = Regex.Matches(input,"(?<!\\S)<([a-z0-9]+)=(\'[a-z0-9]+\')>(?!\\S)",
RegexOptions.IgnoreCase);
//
foreach (Match argumentMatch in matches)
{
arguments.Add(
argumentMatch.Groups[1].Value,
argumentMatch.Groups[2].Value);
}
}
public string CommandName { get; set; }
public Dictionary<string, string> Arguments
{
get { return arguments; }
}
/// <summary>
///
/// </summary>
public int ArgumentCount
{
get { return arguments.Count; }
}
}
To find the command name, finding the first word after the "!" should be enough:
/\!\w*/g
To match the key/value pairs in groups, you could try something like:
(\w+)='([a-zA-Z_]*)'
An example of the above regex can be found here.
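Put together in C#, the two patterns could be used roughly as in the sketch below. The class and method names (CommandSketch, ParseName, ParseArguments) are invented for the example. Note that I swapped the value pattern to [^']* so that digits and other characters inside the quotes are accepted too; that is my assumption about what the protocol allows, not something stated in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CommandSketch
{
    // Returns the word right after the leading '!', or an empty string if there is none.
    public static string ParseName(string input)
    {
        Match match = Regex.Match(input, @"^!(\w+)");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    // Collects every <NAME='value'> pair; the surrounding quotes are not part of the value.
    public static Dictionary<string, string> ParseArguments(string input)
    {
        var arguments = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Group 1: argument name, group 2: everything between the single quotes.
        foreach (Match match in Regex.Matches(input, @"<(\w+)='([^']*)'>"))
        {
            // The indexer overwrites on duplicate names instead of throwing like Add would.
            arguments[match.Groups[1].Value] = match.Groups[2].Value;
        }

        return arguments;
    }
}
```

For the sample message !LOGIN?<USERNAME='user'><PASSWORD='password'>; this should yield the name LOGIN and the two arguments USERNAME = user and PASSWORD = password.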
You do not need a regex here; avoid them unless they are the last option left. You can do this with simple C# logic.
string input = "!LOGIN?<USERNAME='user'><PASSWORD='password'>";
string command = input.Substring(1, input.IndexOf('?') - 1);
Console.WriteLine($"command: {command}");
var parameters = input
.Replace($"!{command}?", string.Empty)
.Replace("<", "")
.Split(">".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
string[] kvpair;
foreach(var kv in parameters) {
kvpair = kv.Split('=');
Console.WriteLine($"pname: {kvpair[0]}, pvalue: {kvpair[1]}");
}
Output:
command: LOGIN
pname: USERNAME, pvalue: 'user'
pname: PASSWORD, pvalue: 'password'
I'm trying to replace text. I'm using a dictionary for the task.
public static string cleanString(this String str) {
Dictionary<string, string> dict = new Dictionary<string,string>();
dict.Add("JR", "Junior");
dict.Add("SR", "Senior");
foreach (KeyValuePair<string,string> d in dict) {
if (str.BlindContains(p.Key)) {
str = str.BlindReplace(str, p.Value);
}
}
return str;
}
BlindContains and BlindReplace just ignore the case of the replacement (and BC ensures the string is not part of another word):
public static bool BlindContains(this String str, string toCheck)
{
if (Regex.IsMatch(str, @"\b" + toCheck + @"\b", RegexOptions.IgnoreCase))
return str.IndexOf(toCheck, StringComparison.OrdinalIgnoreCase) >= 0;
return false;
}
public static string BlindReplace(this String str, string oldStr, string newStr)
{
return Regex.Replace(str, oldStr, newStr, RegexOptions.IgnoreCase);
}
The problem
If I call my method on a string, the following occurs:
string mystring = "The jr. is less than the sr.";
mystring.cleanString();
returns "Junior"
However, when I print
Console.WriteLine(Regex.Replace(mystring, "jr.", "junior", RegexOptions.IgnoreCase));
I get the output: "The junior is less than the sr."
Why does the loop compromise the task?
You should be passing the key in your dictionary (which contains the text to search for), rather than the actual string you are searching in.
It should be:
str = str.BlindReplace(p.Key, p.Value);
As opposed to:
str = str.BlindReplace(str, p.Value);
You are currently replacing your string with the value "Junior" because you specified your string as the text to search for, which makes it replace the entire string instead of just the keyword.
In cleanString implementation, I think you made an error in your call to BlindReplace. Instead of:
str = str.BlindReplace(str, p.Value);
I believe you should have called:
str = str.BlindReplace(d.Key, d.Value);
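As a rough illustration of the fix, the whole thing could be collapsed into one extension method along these lines. NameCleaner and CleanString are names chosen for this sketch, and wrapping the key in Regex.Escape and \b is my own addition so that the key is treated as literal text and only matches whole words.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NameCleaner
{
    // Abbreviation -> replacement, as in the question.
    private static readonly Dictionary<string, string> Replacements =
        new Dictionary<string, string>
        {
            { "JR", "Junior" },
            { "SR", "Senior" }
        };

    public static string CleanString(this string str)
    {
        foreach (KeyValuePair<string, string> pair in Replacements)
        {
            // Search for the dictionary KEY (not the input string itself).
            // \b keeps the match to whole words; Regex.Escape makes the key literal.
            string pattern = @"\b" + Regex.Escape(pair.Key) + @"\b";
            str = Regex.Replace(str, pattern, pair.Value, RegexOptions.IgnoreCase);
        }

        return str;
    }
}
```

With this version, "The jr. is less than the sr.".CleanString() should return "The Junior. is less than the Senior." The trailing dots stay because the keys do not include them; if the dot should go too, the keys or the pattern would need to account for it.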
I use a StreamReader to read context.Request.InputStream to the end and end up with a string looking like
"Gamestart=true&GamePlayer=8&CurrentDay=Monday&..."
What would be the most efficient/"clean" way to parse that in a C# console application?
You can use HttpUtility.ParseQueryString
Little sample:
string queryString = "Gamestart=true&GamePlayer=8&CurrentDay=Monday"; //Hardcoded just for example
NameValueCollection qscoll = HttpUtility.ParseQueryString(queryString);
foreach (String k in qscoll.AllKeys)
{
//Prints result in output window.
System.Diagnostics.Debug.WriteLine(k + " = " + qscoll[k]);
}
HttpUtility.ParseQueryString
Parses a query string into a NameValueCollection using UTF8 encoding.
http://msdn.microsoft.com/en-us/library/ms150046.aspx
I know this is a bit of a zombie post but I thought I'd add another answer since HttpUtility adds another assembly reference (System.Web), which may be undesirable to some.
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;
static readonly Regex HttpQueryDelimiterRegex = new Regex(@"\?", RegexOptions.Compiled);
static readonly Regex HttpQueryParameterDelimiterRegex = new Regex(@"&", RegexOptions.Compiled);
static readonly Regex HttpQueryParameterRegex = new Regex(@"^(?<ParameterName>\S+)=(?<ParameterValue>\S*)$", RegexOptions.Compiled);
static string GetPath(string pathAndQuery)
{
var components = HttpQueryDelimiterRegex.Split(pathAndQuery, 2);
return components[0];
}
static Dictionary<string, string> GetQueryParameters(string pathAndQuery)
{
var parameters = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
var components = HttpQueryDelimiterRegex.Split(pathAndQuery, 2);
if (components.Length > 1)
{
var queryParameters = HttpQueryParameterDelimiterRegex.Split(components[1]);
foreach(var queryParameter in queryParameters)
{
var match = HttpQueryParameterRegex.Match(queryParameter);
if (!match.Success) continue;
var parameterName = WebUtility.HtmlDecode(match.Groups["ParameterName"].Value) ?? string.Empty;
var parameterValue = WebUtility.HtmlDecode(match.Groups["ParameterValue"].Value) ?? string.Empty;
parameters[parameterName] = parameterValue;
}
}
return parameters;
}
I wish they would add that same method to WebUtility which is available in System.Net as of .NET 4.0.
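In the meantime, a regex-free variant is also possible with plain string splitting. The sketch below is only one way to do it and makes a few assumptions: it targets .NET 4.5 or later (where WebUtility.UrlDecode is available), it uses URL decoding rather than HTML decoding for names and values, and the names QueryStringSketch and Parse are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class QueryStringSketch
{
    // Parses "a=1&b=2" (with or without a leading '?') into a dictionary.
    // Later duplicates overwrite earlier ones, unlike NameValueCollection.
    public static Dictionary<string, string> Parse(string query)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        if (string.IsNullOrEmpty(query))
        {
            return result;
        }

        string[] pairs = query.TrimStart('?')
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string pair in pairs)
        {
            // Split on the first '=' only, so values may contain '=' themselves.
            int eq = pair.IndexOf('=');
            string name = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? string.Empty : pair.Substring(eq + 1);

            // UrlDecode turns %XX escapes and '+' back into the original characters.
            result[WebUtility.UrlDecode(name)] = WebUtility.UrlDecode(value);
        }

        return result;
    }
}
```

For the string from the question, Parse("Gamestart=true&GamePlayer=8&CurrentDay=Monday") should give three entries, with GamePlayer mapped to "8".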