String whitespace replacement error when using string.join - c#

I am trying to join a list of strings using string.Join. When I use the separator string " OR ", the white spaces end up replaced with "+", which breaks my targetUri string. Below is the code used to join.
if (DocumentSearchListViewModel.Filter == null)
{
return "http://000.000.00.00:8080/value/value/search/json?terms=value%20OR%20value&target=TEST2&maxResults=5";
}
var targetUri = "http://000.000.00.00:8080/value/value/search/json?";
NameValueCollection termsString = System.Web.HttpUtility.ParseQueryString(string.Empty);
if (!string.IsNullOrWhiteSpace(DocumentSearchListViewModel.Filter.Keywords))
{
if (!string.IsNullOrWhiteSpace(DocumentSearchListViewModel.Filter.Author))
{
DocumentSearchListViewModel.Filter.Keywords += (" " + DocumentSearchListViewModel.Filter.Author);
}
IList<string> keywords = DocumentSearchListViewModel.Filter.Keywords.Split();
termsString["terms"] = string.Join(" OR ", keywords);
}
targetUri += termsString.ToString();
targetUri += "&target=TEST2&maxResults=";
targetUri += DocumentSearchListViewModel.Filter.MaxNumberOfResults ?? "5";
return targetUri;
I have searched Google extensively but haven't found anything about string.Join replacing characters. While debugging, I narrowed the problem down to the termsString lines.
Here is an actual example of the string I get out: terms=value1+OR+value2+OR+value3
How would I stop the white spaces from being replaced with + characters?
Cheers,
James

In order to get the URL decoded value on the server side, you should use:
var encoded = "terms=value1+OR+value2+OR+value3";
var decoded = System.Web.HttpUtility.UrlDecode(encoded);
@PanagiotisKanavos, regarding my previous suggestion of using %20 instead of space, take a look at this JS:
var uri1="terms=value1%20OR%20value2%20OR%20value3";
var uri2="terms=value1+OR+value2+OR+value3";
document.write(decodeURIComponent(uri1));
document.write("<br/>");
document.write(decodeURIComponent(uri2));
If you run it, you'll see that decodeURIComponent turns %20 into a space but leaves "+" untouched, so the choice of encoding matters in some contexts.
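For context on where the "+" comes from: string.Join leaves the spaces alone. The collection returned by HttpUtility.ParseQueryString URL-encodes its values when ToString() is called, and that form-style encoding writes a space as "+". If the target server needs %20, one option is to build the terms parameter by hand with Uri.EscapeDataString. The following is a minimal sketch under that assumption; the class and method names (SearchUriBuilder, BuildTermsQuery) are hypothetical and not part of the original code.

```csharp
using System;

public static class SearchUriBuilder
{
    // Hypothetical helper: joins whitespace-separated keywords with " OR "
    // and percent-encodes the result, so spaces become %20 rather than +.
    public static string BuildTermsQuery(string keywords)
    {
        // Passing a null separator array splits on any whitespace.
        string[] terms = keywords.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        string joined = string.Join(" OR ", terms);
        return "terms=" + Uri.EscapeDataString(joined);
    }

    public static void Main()
    {
        Console.WriteLine(BuildTermsQuery("value1 value2 value3"));
        // prints: terms=value1%20OR%20value2%20OR%20value3
    }
}
```

The rest of the query string (target, maxResults) can be appended as in the question, encoding any user-supplied value the same way.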

Related

Remove part of a string between an start and end

Code first:
string myString = "<at>onePossibleName</at> some question here regarding <at>disPossibleName</at>";
// some code to handle myString and save it in myEditedString
Console.WriteLine(myEditedString);
//output now is: some question here regarding <at>disPossibleName</at>
I want to remove <at>onePossibleName</at> from myString. The strings onePossibleName and disPossibleName could be any other strings.
So far I am working with
string myEditedString = string.Join(" ", myString.Split(' ').Skip(1));
The problem is that this breaks if onePossibleName becomes one Possible Name, because the name itself then contains spaces.
The same goes for my attempt with myString.Remove(startIndex, count): it is not a solution either.
The right method depends on what you want: you can use IndexOf with Substring, or a regex.
// Substring and IndexOf method
// Useful if you don't care about the name inside the at tag and just want to remove the first at tag
if (myString.Contains("</at>"))
{
var myEditedString = myString.Substring(myString.IndexOf("</at>") + 5);
}
// Regex method
var stringToRemove = "onePossibleName";
var rgx = new Regex($"<at>{Regex.Escape(stringToRemove)}</at>"); // Escape the name in case it contains regex metacharacters
var myEditedString = rgx.Replace(myString, string.Empty, 1); // The 1 specifies that only the first occurrence will be replaced
You could use this generic regular expression.
var myString = "<at>onePossibleName</at> some question here regarding <at>disPossibleName</at>";
var rg = new Regex(@"<at>(.*?)<\/at>");
var result = rg.Replace(myString, "").Trim();
This would remove all 'at' tags and the content between. The Trim() call is to remove any white space at the beginning/end of the string after the replacement.
string myString = "<at>onePossibleName</at> some question here regarding <at>disPossibleName</at>";
// Start of the first opening tag and end of the first closing tag
int sFrom = myString.IndexOf("<at>");
int sTo = myString.IndexOf("</at>") + "</at>".Length;
// Remove the whole first tag, then trim the space left in front
string myEditedString = myString.Remove(sFrom, sTo - sFrom).TrimStart();
Console.WriteLine(myEditedString);
//output now is: some question here regarding <at>disPossibleName</at>
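The answers above can be combined into one small helper that removes only the first at tag, whatever name it contains (including names with spaces). This is a sketch only; AtTagRemover and RemoveFirstAtTag are hypothetical names, and it assumes at tags are never nested.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtTagRemover
{
    // Lazy match of one <at>...</at> pair plus any whitespace that follows it.
    private static readonly Regex FirstAtTag = new Regex(@"<at>.*?</at>\s*");

    // Removes only the first tag; the count argument of 1 limits the replacement.
    public static string RemoveFirstAtTag(string input)
    {
        return FirstAtTag.Replace(input, string.Empty, 1);
    }

    public static void Main()
    {
        string myString = "<at>one Possible Name</at> some question here regarding <at>disPossibleName</at>";
        Console.WriteLine(RemoveFirstAtTag(myString));
        // prints: some question here regarding <at>disPossibleName</at>
    }
}
```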

How to convert quarter which is provided as "FY18 Q4" to produce "2018.4" in C# using REGEX?

The below code works fine. But, I want to obtain this via Regex.
private decimal GetQuarter(string quarter)
{
var unformattedQuarter = "20" + quarter[2] + quarter[3] + "." + quarter[6];
return Convert.ToDecimal(unformattedQuarter);
}
Input
FY18 Q4
FY19 Q1
FY19 Q2
Output
2018.4
2019.1
2019.2
You can use the pattern
FY(\d{2}) Q(\d)
And replace matches with
20$1.$2
Example
var input = "FY18 Q4\r\nFY19 Q1\r\nFY19 Q2";
var pattern = @"FY(\d{2}) Q(\d)";
var replacement = "20$1.$2";
Console.WriteLine(Regex.Replace(input, pattern, replacement));
Output
2018.4
2019.1
2019.2
Note: Prepending 20 hardcodes the century, so this only works for fiscal years 2000 through 2099 and should be used with caution.
Using the following code, you can extract the first and second occurrences of the numbers from the string into a list and then concatenate them:
string n = "FY18 Q1";
Regex digits = new Regex(@"\d+");
var list = digits.Matches(n);
var finalValue = "20" + list[0].Value + "." + list[1].Value;
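To tie the regex back to the original GetQuarter method, here is a sketch that uses named groups and returns a decimal. It assumes input of exactly the form "FYnn Qn" with a quarter from 1 to 4, and it parses with the invariant culture so that "." is always treated as the decimal separator; the QuarterParser class name is hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class QuarterParser
{
    // Two-digit fiscal year and a single quarter digit, e.g. "FY18 Q4".
    private static readonly Regex QuarterPattern =
        new Regex(@"^FY(?<year>\d{2}) Q(?<quarter>[1-4])$");

    public static decimal GetQuarter(string quarter)
    {
        Match match = QuarterPattern.Match(quarter);
        if (!match.Success)
        {
            throw new FormatException("Expected input like \"FY18 Q4\", got: " + quarter);
        }

        // Assumption carried over from the question: all years are in the 2000s.
        string text = "20" + match.Groups["year"].Value + "." + match.Groups["quarter"].Value;
        return decimal.Parse(text, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(GetQuarter("FY18 Q4").ToString(CultureInfo.InvariantCulture)); // prints: 2018.4
        Console.WriteLine(GetQuarter("FY19 Q1").ToString(CultureInfo.InvariantCulture)); // prints: 2019.1
    }
}
```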

how to convert char @"\" to Escape String \ by C#

I have grabbed some data from a website. One string in the data, named urlresult, is "http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1".
What I want to do is get rid of the first three '\' characters in the urlresult string above. I have tried the function below:
public string ConvertDataToUrl(string urlresult )
{
var url = urlresult.Split('?')[0].Replace(@"\", "") + "?" + urlresult.Split('?')[1];
return url;
}
It returns "http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\\u5317\\u4eac\\u6c83\\u534e\\u521b\\u65b0\\u79d1\\u6280\\u6709\\u9650\\u516c\\u53f8&softwareType=1" which is incorrect.
The correct result is "http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=北京沃华创新科技有限公司&softwareType=1"
I have tried many ways, but none of them has worked. I have no idea how to get the correct result.
I think you may be misled by the debugger because there's no reason that extra "\" characters should get inserted by the code you provided. Often times the debugger will show extra "\" in a quoted string so that you can tell which "\" characters are really there versus which are there to represent other special characters. I would suggest writing the string out with Debug.WriteLine or putting it in a log file. I don't think the information you provided in the question is correct.
As proof of this, I compiled and ran this code:
static void Main(string[] args)
{
var url = @"http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1";
Console.WriteLine("{0}{1}{2}", url, Environment.NewLine,
url.Split('?')[0].Replace(@"\", "") + "?" + url.Split('?')[1]);
}
The output is:
http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1
http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1
You can use the System.Text.RegularExpressions.Regex.Unescape method:
var input = @"\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8";
string unescapedText = System.Text.RegularExpressions.Regex.Unescape(input);
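Since Regex.Unescape also drops the backslash in front of "/", it may be possible to pass the whole URL through it in one step instead of splitting on '?'. The sketch below uses a shortened sample URL (an assumption for brevity) and a hypothetical UrlUnescaper class. Regex.Unescape throws an ArgumentException on escape sequences it does not recognize, so if the data is actually JSON, a JSON parser is likely the more robust choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlUnescaper
{
    // Turns \/ into / and \uXXXX sequences into the characters they encode.
    public static string ConvertDataToUrl(string urlResult)
    {
        return Regex.Unescape(urlResult);
    }

    public static void Main()
    {
        // Shortened sample; the verbatim string keeps the backslashes literal,
        // the way they arrive from the website.
        string raw = @"http:\/\/www.cnopyright.com.cn\/index.php?obligee=\u5317\u4eac&softwareType=1";
        Console.WriteLine(ConvertDataToUrl(raw));
    }
}
```

Whether the Chinese characters display correctly in a console depends on its output encoding; the returned string itself contains them.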

Extract ID and replace everything in `Example HTML`

I am new to regular expressions. My HTML contains the following text, which I would like to replace with something else.
Example HTML:
{{Object id='foo'}}
Extract the id into a variable like this:
string strId = "foo";
So far I have the following Regular Expression code that will capture the Example HTML:
string strStart = "Object";
string strFind = "{{(" + strStart + ".*?)}}";
Regex regExp = new Regex(strFind, RegexOptions.IgnoreCase);
Match matchRegExp = regExp.Match(html);
while (matchRegExp.Success)
{
//At this point, I have this variable:
//{{Object id='foo'}}
//I can find the id='foo' (see below)
//but not sure how to extract 'foo' and use it
string strFindInner = "id='(.*?)'"; //"{{Slider";
Regex regExpInner = new Regex(strFindInner, RegexOptions.IgnoreCase);
Match matchRegExpInner = regExpInner.Match(matchRegExp.Value.ToString());
//Do something with 'foo'
matchRegExp = matchRegExp.NextMatch();
}
I understand this might be a simple solution, I am hoping to gain more knowledge about Regular Expressions but more importantly, I am hoping to receive a suggestion on how to approach this cleaner and more efficiently.
Thank you
Edit:
Is this an example that I could potentially use: c# regex replace
While I am not solving my initial question with Regular Expressions, I did move into a simpler solution using SubString, IndexOf and string.Split for the time being, I understand that my code needs to be cleaned up but thought I would post the answer that I have thus far.
string html = "<p>Start of Example</p>{{Object id='foo'}}<p>End of example</p>";
string strObject = "Object"; //Example; must match the name used in the html
//When found, this will contain "{{Object id='foo'}}"
string strCode = "";
//ie: "id='foo'"
string strCodeInner = "";
//Tags will be a list, but in this example, only "id='foo'"
string[] tags = { };
//Looking for the following "{{Object "
string strFindStart = "{{" + strObject + " ";
int intFindStart = html.IndexOf(strFindStart);
//Then ending in the following
string strFindEnd = "}}";
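//Search for the end marker only after the start marker; keep -1 when either is missing
int intFindEnd = intFindStart == -1 ? -1 : html.IndexOf(strFindEnd, intFindStart);
//Must find both Start and End conditions
if (intFindStart != -1 && intFindEnd != -1)
{
//Include the closing "}}" in the extracted code
intFindEnd += strFindEnd.Length;
strCode = html.Substring(intFindStart, intFindEnd - intFindStart);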
//Remove Start and End
strCodeInner = strCode.Replace(strFindStart, "").Replace(strFindEnd, "");
//Split by spaces, this needs to be improved if more than IDs are to be used
//but for proof of concept this is perfect
tags = strCodeInner.Split(new char[] { ' ' });
}
Dictionary<string, string> dictTags = new Dictionary<string, string>();
foreach (string tag in tags)
{
string[] tagSplit = tag.Split(new char[] { '=' });
dictTags.Add(tagSplit[0], tagSplit[1].Replace("'", "").Replace("\"", ""));
}
//At this point, I can replace "{{Object id='foo'}}" with anything I'd like
//What I don't show is that I go into the website's database,
//get the object (ie: Slider) and return the html for slider with the ID of foo
html = html.Replace(strCode, strView);
/*
"html" variable may contain:
<p>Start of Example</p>
<p id="foo">This is the replacement text</p>
<p>End of example</p>
*/
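For the original regex question, a capture group can pull out the id directly, and Regex.Replace with a match evaluator can substitute each placeholder in one pass. The following is a sketch only: PlaceholderDemo, ReplacePlaceholders and the render callback are hypothetical names, and the callback stands in for the database lookup that the post does not show.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Matches {{Object id='foo'}} or {{Object id="foo"}} and captures the id value.
    private static readonly Regex Placeholder = new Regex(
        @"\{\{Object\s+id=['""](?<id>[^'""]*)['""]\s*\}\}",
        RegexOptions.IgnoreCase);

    // Replaces every placeholder with whatever render returns for its id.
    public static string ReplacePlaceholders(string html, Func<string, string> render)
    {
        return Placeholder.Replace(html, m => render(m.Groups["id"].Value));
    }

    public static void Main()
    {
        string html = "<p>Start of Example</p>{{Object id='foo'}}<p>End of example</p>";
        string result = ReplacePlaceholders(
            html,
            id => "<p id=\"" + id + "\">This is the replacement text</p>");
        Console.WriteLine(result);
        // prints: <p>Start of Example</p><p id="foo">This is the replacement text</p><p>End of example</p>
    }
}
```

Inside the evaluator, m.Groups["id"].Value is the extracted id ("foo" here), which answers the "how to extract 'foo' and use it" part of the question.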

Problems when a json string has extra " in it using JObject.Parse

The following code throws an error:
data = await resposta.Content.ReadAsStringAsync();
dynamic j = JObject.Parse(data);
The data variable contains the following string:
{"code": 100, "message": "The entity with the name "Esther Rea" its not in DB."}
How to take off the " from Esther Rea?
As suggested, the correct solution would be to have whoever's returning this value escape the quotes. However, if that's really not an option, you can try to brute-force your way into escaping the double quotes yourself, assuming the return schema is always the same, using something like this:
var pattern = "(\"message\":\\s+\")(?<messageContent>(.*))(\"})";
var regex = Regex.Match(data, pattern);
var message = regex.Groups["messageContent"].Value;
if (!string.IsNullOrEmpty(message))
{
message = message.Replace("\"", "\\\"");
var newData = Regex.Replace(data, pattern, "$1" + message + "$3");
var jObject = JObject.Parse(newData);
}
This extracts the actual message string and escapes all double quotes in it (message.Replace("\"", "\\\"")), so that parsing succeeds.
If you really want to remove the quotes instead of escaping them, you can do message = message.Replace("\"", "");
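A variant of the same idea uses a match evaluator, which avoids any "$" in the message being interpreted as a substitution in the replacement string. This is a sketch under the same assumption as above (the payload always ends with the "message" property); JsonQuoteRepair and EscapeInnerQuotes are hypothetical names, and quotes that are already escaped in the input would be escaped a second time.

```csharp
using System;
using System.Text.RegularExpressions;

public static class JsonQuoteRepair
{
    // prefix:  "message": "   content: everything up to the final quote
    // suffix:  the closing quote and brace at the end of the payload
    private static readonly Regex MessagePattern = new Regex(
        "(?<prefix>\"message\":\\s*\")(?<content>.*)(?<suffix>\"\\s*\\})\\s*$",
        RegexOptions.Singleline);

    // Escapes the double quotes inside the message value only.
    public static string EscapeInnerQuotes(string data)
    {
        return MessagePattern.Replace(data, m =>
            m.Groups["prefix"].Value
            + m.Groups["content"].Value.Replace("\"", "\\\"")
            + m.Groups["suffix"].Value);
    }

    public static void Main()
    {
        string data = "{\"code\": 100, \"message\": \"The entity with the name \"Esther Rea\" its not in DB.\"}";
        Console.WriteLine(EscapeInnerQuotes(data));
        // prints: {"code": 100, "message": "The entity with the name \"Esther Rea\" its not in DB."}
    }
}
```

The repaired string can then be passed to JObject.Parse as in the answer above.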
