Json: how to properly strip the escape characters with json.net - c#

I have a JSON response in the format below.
"[{\\\"JobID\\\":\\\"1\\\",\\\"BillGenerationDate\\\":\\\"4/29/2013 2:53:34 PM\\\",\\\"BillID\\\":\\\"115743\\\",\\\"BillNo\\\":\\\"115743\\\",\\\"CustomerID\\\":\\\"4041705\\\",\\\"PayStatus\\\":\\\"0\\\",\\\"PaymentRequiredStatus\\\":\\\"True\\\",\\\"ProductName\\\":\\\"Epic FBO test\\\",\\\"Description\\\":\\\"Epic Automation 2\\\\r\\\\n\\\",\\\"ProductType\\\":\\\"eBill \\\",\\\"DueType\\\":\\\"-1\\\",\\\"DueDate\\\":\\\"2013-03-15\\\",\\\"Amount\\\":\\\"63.70\\\",\\\"Cost\\\":\\\"\\\"},
{\\\"JobID\\\":\\\"9\\\",\\\"BillGenerationDate\\\":\\\"5/2/2013 10:21:39 AM\\\",\\\"BillID\\\":\\\"115743\\\",\\\"BillNo\\\":\\\"115743\\\",\\\"CustomerID\\\":\\\"4041705\\\",\\\"PayStatus\\\":\\\"0\\\",\\\"PaymentRequiredStatus\\\":\\\"True\\\",\\\"ProductName\\\":\\\"FBO Test Product\\\",\\\"Description\\\":\\\"FBO Product Test\\\",\\\"ProductType\\\":\\\"eBill \\\",\\\"DueType\\\":\\\"-1\\\",\\\"DueDate\\\":\\\"2013-05-01\\\",\\\"Amount\\\":\\\"150.70\\\",\\\"Cost\\\":\\\"\\\"}]
I believe Json.NET handles the escape characters, so I used the code below to deserialize it to a collection of dictionaries.
var billList = JsonConvert.DeserializeObject<List<Dictionary<string, string>>>(contentCorrected);
But parsing this JSON throws an exception:
"Invalid property identifier character: . Path '[0]', line 1, position 2."
Could we solve this by manipulating the json response string?

THE SHORT ANSWER: first deserialize the escaped string, not to the target CLR type but to another string (repeat if necessary); then deserialize the result to the target type:
// Initial example json string: "\"{\\\"Property1\\\":1988,\\\"Property2\\\":\\\"Some data :D\\\"}\""
// First, deserialize to another string (unescaped string).
string unescapedJsonString = JsonConvert.DeserializeObject<string>(escapedJsonString);
Debug.WriteLine(unescapedJsonString);
// Prints:
// "{\"Property1\":1988,\"Property2\":\"Some data :D\"}"
// Second, deserialize to another string again (necessary in this case).
var finalUnescapedJsonString = JsonConvert.DeserializeObject<string>(unescapedJsonString);
Debug.WriteLine(finalUnescapedJsonString);
// This time prints a final, unescaped, json string:
// {"Property1":1988,"Property2":"Some data :D"}
// Finally, perform final deserialization to the target type, using the last unescaped string.
MyClass targetObject = JsonConvert.DeserializeObject<MyClass>(finalUnescapedJsonString);
LONG ANSWER (but interesting)
Using string.Replace(...) could generate an invalid string, because it can destroy backslashes that certain special characters need in order to be deserialized correctly.
Escaped strings of this kind are usually produced when a string that is already JSON is serialized again (possibly several times). This creates "levels of serialization" (each one is really the serialization of a string that contains reserved characters), and the result is backslashes, alone or in runs of two, three or more (\, \\, \\\), scattered all over the string.
So replacing them with an empty string is not enough to remove them correctly.
THE RIGHT WAY: a better way to get an unescaped string is to deserialize to the string type first (repeating this several times if necessary), and then do a final deserialization to the target CLR type:
// -- SERIALIZATION --
// Initial object
MyClass originObj = new MyClass { Property1 = 1988, Property2 = "Some data :D" };
// "First level" Of serialization.
string jsonString = JsonConvert.SerializeObject(originObj);
Debug.WriteLine(jsonString);
// Prints:
// {"Property1":1988,"Property2":"Some data :D"}
// "Second level" of serialization.
string escapedJsonString = JsonConvert.SerializeObject(jsonString);
Debug.WriteLine(escapedJsonString);
// "{\"Property1\":1988,\"Property2\":\"Some data :D\"}"
// Note the leading and trailing " characters and the backslashes.
// ...
// At this point you could serialize again ("more levels"), producing longer and longer runs of backslashes,
// something like this:
// "\"{\\\"Property1\\\":1988,\\\"Property2\\\":\\\"Some data :D\\\"}\""
// Note that this gets very, very crazy :D
// ...
// -- DESERIALIZATION --
// First deserialize to another string (unescaped string).
string unescapedJsonString = JsonConvert.DeserializeObject<string>(escapedJsonString);
Debug.WriteLine(unescapedJsonString);
// Prints:
// {"Property1":1988,"Property2":"Some data :D"}
// ...
// At this point you could repeat the deserialization to string, if necessary (for example, if many backslashes remain: \\\).
// ...
// Finally, perform final deserialization to the target type, using the last unescaped string.
MyClass targetObject = JsonConvert.DeserializeObject<MyClass>(unescapedJsonString);
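The "repeat if necessary" step can be wrapped in a small loop. The following is a minimal sketch, not part of Json.NET: the helper name UnwrapJsonString and the maxLevels guard are assumptions made for illustration. It relies only on JsonConvert.DeserializeObject<string>, as used above, and keeps unwrapping while the text still looks like a JSON string literal.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public static class JsonUnwrapper
{
    // Hypothetical helper: peels off string-encoding layers until the text
    // no longer starts with a double quote (i.e. is no longer a JSON string).
    public static string UnwrapJsonString(string json, int maxLevels = 5)
    {
        string current = json.Trim();
        for (int level = 0;
             level < maxLevels && current.StartsWith("\"", StringComparison.Ordinal);
             level++)
        {
            // Throws JsonReaderException if the text is not a valid JSON string literal.
            current = JsonConvert.DeserializeObject<string>(current).Trim();
        }
        return current;
    }

    public static void Main()
    {
        string original = "[{\"JobID\":\"1\",\"Cost\":\"\"}]";

        // Simulate a producer that serialized the payload two extra times.
        string escaped = JsonConvert.SerializeObject(JsonConvert.SerializeObject(original));

        string unwrapped = UnwrapJsonString(escaped);
        var billList = JsonConvert.DeserializeObject<List<Dictionary<string, string>>>(unwrapped);

        Console.WriteLine(unwrapped == original);   // True
        Console.WriteLine(billList[0]["JobID"]);    // 1
    }
}
```

The maxLevels guard only exists so that unexpected input cannot loop for long; a payload that is legitimately a plain JSON string (rather than an encoded object or array) would also be unwrapped once, so this sketch assumes the final payload is an object or an array.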

Try contentCorrected = contentCorrected.Replace(@"\", ""); before the deserialization process.

Remove all the "\" characters before you deserialize it. Use the Replace function.
yourJsonString = yourJsonString.Replace("\\", "");
Your JSON string is either incomplete or does not seem to be of type List<Dictionary<string, string>>. Correct the type you want the JSON to be converted to.
I modified your JSON a little, as follows, and it worked.
newJson = "{ \"array\":" + yourJsonString + "}";

The problem occurs when valid double quotes are used within the answer. Removing and/or replacing won't solve this in all cases.
It frustrated me too until I found a simple solution:
var billList = JsonConvert.DeserializeObject<List<Dictionary<string, string>>>(@contentCorrected);

For me the code below works
contentCorrected = contentCorrected.Replace(@"\""", "");

Related

Jsonconvert serializeobject not escaping single quote

In C#, I have an Automobile class with a vehicleTrim field.
I use JsonConvert.SerializeObject to serialize that class, and it does not escape the single quote.
This causes an issue when I try to set the value of an object in the web page via the window.localStorage.setItem function.
Example:
public class Automobile
{
    public string vehicleTrim { get; set; }
}
var test = new Automobile()
{
    vehicleTrim = "designer's package"
};
var serialized = JsonConvert.SerializeObject(test, Formatting.None);
// serialized output: {"vehicleTrim":"designer's package"}
// expected output : {"vehicleTrim":"designer\'s package"}
Now I want to store this JSON object in the web page's localStorage by calling this:
var jsSetScript = $"window.localStorage.setItem('automobile', '{serialized}');";
await Control.EvaluateJavascriptAsync(jsSetScript);
EvaluateJavascriptAsync returns this error when trying to read the JSON: "SyntaxError: Unexpected identifier 's'. Expected ')' to end an argument list."
I manually tried this with the escaped single quote and it was fine. So the question is: how can I make the SerializeObject method escape the single quote?
"\'" is not even a valid JSON string literal. From the JSON spec:
Thus ' does not need to be escaped, but if it is, it must appear as "\u0027". Only 8 characters have a special, abbreviated escaping syntax: \" \\ \/ \b \f \n \r \t. (For further details see RFC 8259.)
If "\u0027" meets your needs, then setting JsonSerializerSettings.StringEscapeHandling to StringEscapeHandling.EscapeHtml should do the trick. From the docs:
StringEscapeHandling Enumeration
Specifies how strings are escaped when writing JSON text.
Member name     Value  Description
Default         0      Only control characters (e.g. newline) are escaped.
EscapeNonAscii  1      All non-ASCII and control characters (e.g. newline) are escaped.
EscapeHtml      2      HTML (<, >, &, ', ") and control characters (e.g. newline) are escaped.
Thus the following now succeeds:
var settings = new JsonSerializerSettings
{
    StringEscapeHandling = StringEscapeHandling.EscapeHtml,
};
var serialized = JsonConvert.SerializeObject(test, Formatting.None, settings);
Console.WriteLine(serialized);
// Outputs {"vehicleTrim":"designer\u0027s package"}
Assert.IsTrue(!serialized.Contains('\''));
// Succeeds
Demo fiddle here.
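If the real goal is to embed the JSON in a script string, an alternative to escaping the apostrophe is to let the serializer produce the JavaScript string literal itself. The sketch below is an assumption about the use case rather than part of the answer above: it serializes the already-serialized JSON a second time, which yields a double-quoted, escaped literal that can be placed in the script without surrounding quotes. BuildSetItemScript is a hypothetical helper name.

```csharp
using System;
using Newtonsoft.Json;

public class Automobile
{
    public string vehicleTrim { get; set; }
}

public static class ScriptBuilder
{
    // Hypothetical helper: builds a localStorage.setItem call for any JSON text.
    public static string BuildSetItemScript(string key, string json)
    {
        // Serializing a .NET string yields a quoted, escaped JSON string,
        // which is used here as the JavaScript string literal.
        string keyLiteral = JsonConvert.SerializeObject(key);
        string valueLiteral = JsonConvert.SerializeObject(json);
        return $"window.localStorage.setItem({keyLiteral}, {valueLiteral});";
    }

    public static void Main()
    {
        var test = new Automobile { vehicleTrim = "designer's package" };
        string serialized = JsonConvert.SerializeObject(test, Formatting.None);

        Console.WriteLine(BuildSetItemScript("automobile", serialized));
        // window.localStorage.setItem("automobile", "{\"vehicleTrim\":\"designer's package\"}");
    }
}
```

Because the value is wrapped in double quotes, the apostrophe needs no escaping, and any double quotes or backslashes inside the JSON are escaped by the second serialization.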

Convert string into a valid JSON in c#

In the code snippet below, the JSON string in the commented-out jsonString variable is valid, while the active one causes JObject.Parse to throw a JsonReaderException with the message:
After parsing a value an unexpected character was encountered: e. Path 'Key', line 1, position 15.
var jsonString = "{\"Key\":\"Value \"extra\" \"}";
//var jsonString = "{\"Key\":\"Value \\\"extra\\\" \"}";
JObject.Parse(jsonString);
Are there any methods available in Newtonsoft.Json or elsewhere that can transform a JSON string to make it valid?
No, because Newtonsoft cannot guess what you want. For example, is extra a new key where you just omitted a comma, is it part of the previous value, or is it something that can be ignored? It would be better to have whatever produces the JSON construct valid JSON.
Using Regex might help you to resolve the existing JSON you have. If you can control how subsequent JSON is generated, you really should fix it at that point.
This solution counts the value as existing from the first " after a "key":, through to the last " before a , or a }, and then it reserializes the value to ensure that it is correctly escaped. If it finds ",, it expects it to be followed by another key ("key":). This is in an attempt to avoid red herrings (i.e. {"key": "test "," value"}) which might otherwise confuse it.
private static string FixJson(string json)
{
    var regex = new Regex("\"(?<key>.*?)\"\\W?:\\W?\"(?<value>.*?)\"(?=,\".*?\"\\W?:|}$)");
    return regex.Replace(json, new MatchEvaluator(m => {
        var key = m.Groups["key"].Value;
        var val = m.Groups["value"].Value;
        return string.Format("\"{0}\":{1}", key, JsonConvert.SerializeObject(val));
    }));
}
Disclaimer: It's a regular expression, it's not foolproof, and if your JSON is more broken than you have indicated, it will probably spit out broken JSON, or incorrect values, so use it at your own risk.
Try it online

Is "[]" valid JSON?

I'm having trouble deserializing this JSON string using JSON.NET (note the quotes):
"[]"
Depending on which JSON validation website you go to, this is valid JSON (jsonlint for example says it is).
The JSON.NET code:
void Main()
{
    string json = "\"[]\"";
    var x = JsonConvert.DeserializeObject<User[]>(json);
    Console.WriteLine(x);
}
// Define other methods and classes here
public class User
{
    public string Id { get; set; }
    public int Age { get; set; }
}
The exception
Error converting value "[]" to type 'UserQuery+User[]'. Path '', line 1, position 4.
Is there a way of forcing JSON.NET to parse this?
Part 1: Is "[]" valid JSON?
There are several documents and standards on JSON, and hundreds of parsers. Some of them assume that JSON can only be an object {} or an array [], while others allow single values such as strings and numbers to be used as JSON.
Read this article, which describes the problem in detail:
What is the minimum valid JSON?
This dispute over JSON validity is a separate question. In your case it doesn't matter, because...
Part 2: why your code isn't working.
Even if we allow non-objects / non-arrays to be valid JSON, your JSON represents a single string equal to "[]". It could contain anything else instead of brackets; it is not array notation, just the two characters "[" and "]".
However, you try to parse this JSON as an array of objects, which results in an error either way.
In other words, even if it is valid JSON, it is a valid JSON string, not a JSON array.
var str1 = JSON.parse("\"[]\""),
str2 = JSON.parse("\"could be anything else, not brackets\""),
arr = JSON.parse("[]");
console.log(typeof str1);
console.log(typeof str2);
console.log(typeof arr);
var str1_s = JSON.stringify([]);
console.log("Valid JSON of an empty array: " + str1_s);
var arr_s = JSON.stringify("[]");
console.log("Partly valid JSON of a string '[]': " + arr_s);
Part 3: what should you do
The best idea is to stop using invalid JSON as input. Tell whoever gave you this JSON that it is not a valid JSON array and you cannot use it. You would be able to deserialize the JSON into your array of User if it were correct, just like this:
string json = "[]";
var x = JsonConvert.DeserializeObject<User[]>(json);
Console.WriteLine(x);
If this JSON comes from a third-party service and you can do nothing about that, then you need to tidy it up and make it valid. Unfortunately, that sometimes happens.
How? It depends on what the value looks like when there ARE objects (users).
It may be a JSON-serialized JSON string (double-serialized); in that case you need to deserialize a string first, and then deserialize the array from it.
Or it may simply have two stray quotes at the beginning and the end, and you can just remove them.
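For the double-serialized case, a minimal sketch could look like the following. The sample payload with one user is made up for illustration; the User class is the one from the question.

```csharp
using System;
using Newtonsoft.Json;

public class User
{
    public string Id { get; set; }
    public int Age { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Raw text: "[{\"Id\":\"a1\",\"Age\":30}]"
        // i.e. a JSON string whose value happens to be a JSON array.
        string json = "\"[{\\\"Id\\\":\\\"a1\\\",\\\"Age\\\":30}]\"";

        // Step 1: deserialize the outer JSON string.
        string inner = JsonConvert.DeserializeObject<string>(json);

        // Step 2: deserialize the array contained in that string.
        User[] users = JsonConvert.DeserializeObject<User[]>(inner);

        Console.WriteLine(inner);          // [{"Id":"a1","Age":30}]
        Console.WriteLine(users.Length);   // 1
        Console.WriteLine(users[0].Age);   // 30
    }
}
```

Simply trimming the outer quotes would only work for the second case above, where the content between them contains no escaped characters.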
It is valid JSON, but the deserializer fails because the data types do not match.
"[]"
is a string, so the deserializer wants to deserialize it to a string.
[]
Is an empty array. So, in short, this should work:
string json = "[]";
var x = JsonConvert.DeserializeObject<User[]>(json);
Console.WriteLine(x);

Deserialize json string that contains singlequote using c#

I have a JSON string that contains a string literal as the value of one of its properties, PostData.
string json = "{\"PostData\": '{\"LastName\": \"O Corner\",\"FirstName\":\"Mark\",\"Address\":\"123 James St\"}'}";
I am trying to deserialize the json using:
var obj = JsonConvert.DeserializeObject<dynamic>(json);
Then I can use the JSON string value of PostData like this:
obj["PostData"].ToString()
But as soon as I get data with a single quote in it, like:
string json = "{\"PostData\": '{\"LastName\": \"O' Corner\",\"FirstName\":\"Mark\",\"Address\":\"123 James St\"}'}";
I get an exception on deserialization. How can I escape the single quote?
I have checked SO for similar issues but didn't get anything working. I also tried one of the solutions mentioned in this thread:
JsonSerializerSettings settings = new JsonSerializerSettings
{
    StringEscapeHandling = StringEscapeHandling.EscapeHtml
};
JsonConvert.SerializeObject(obj, settings);
But I get an error saying that Newtonsoft does not contain a definition for StringEscapeHandling.
I also tried to escape the single quote within the string with \:
'{\"LastName\": \"O\' Corner\",\"FirstName\":\"Mark\",\"Address\":\"123 James St\"}', which didn't work either.
For a start, it is worth noting that strict JSON syntax uses double quotes only; the single quotes around the PostData value are something Json.NET accepts as an extension.
A single-quoted string cannot contain an unescaped single quote, but we can use the Unicode escape \u0027 for the apostrophe. So your original string:
string json = "{\"PostData\": '{\"LastName\": \"O' Corner\",\"FirstName\":\"Mark\",\"Address\":\"123 James St\"}'}";
becomes (shown as the raw JSON text):
{'PostData': {'LastName': 'O\u0027 Corner','FirstName':'Mark','Address':'123 James St'}}
In a C# string literal the compiler would itself turn \u0027 into an apostrophe, so the backslash has to be escaped:
string json = "{'PostData': {'LastName': 'O\\u0027 Corner','FirstName':'Mark','Address':'123 James St'}}";
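If you control the code that produces the JSON, the quoting problem can be avoided altogether by letting the serializer build both levels. This is a minimal sketch under that assumption; the anonymous types simply mirror the property names from the question.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class Program
{
    public static void Main()
    {
        // Inner payload, serialized first.
        string postData = JsonConvert.SerializeObject(new
        {
            LastName = "O' Corner",
            FirstName = "Mark",
            Address = "123 James St"
        });

        // Outer document: PostData becomes an ordinary double-quoted JSON string,
        // so the apostrophe needs no escaping at all.
        string json = JsonConvert.SerializeObject(new { PostData = postData });

        // Reading it back: first the outer object, then the embedded JSON string.
        JObject obj = JObject.Parse(json);
        string roundTripped = (string)obj["PostData"];
        JObject inner = JObject.Parse(roundTripped);

        Console.WriteLine(roundTripped == postData);    // True
        Console.WriteLine((string)inner["LastName"]);   // O' Corner
    }
}
```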

how to validate JSON string before converting to XML in C#

I will receive a response in the form of a JSON string.
We have an existing tool, developed in C#, which takes its input in XML format.
Hence I am converting the JSON string obtained from the server to an XML string using Newtonsoft.Json and passing it to the tool.
Problem:
When converting the JSON response to XML, I get this error:
"Failed to process request. Reason: The ' ' character, hexadecimal value 0x20, cannot be included in a name."
The error indicates that a JSON key contains a space (for example: \"POI Items\":[{\"lat\":{\"value\":\"00\"}]), which cannot be converted to an XML element name.
Is there any approach to identify JSON keys that contain spaces, such as "POI Items", and remove the spaces from them?
Please also suggest any alternative solution that would not require changing the existing solution.
Regards,
Sudhir
You can use Json.Net and replace the names while loading the JSON.
JsonSerializer ser = new JsonSerializer();
var jObj = ser.Deserialize(new JReader(new StringReader(json))) as JObject;
var newJson = jObj.ToString(Newtonsoft.Json.Formatting.None);
public class JReader : Newtonsoft.Json.JsonTextReader
{
    public JReader(TextReader r) : base(r)
    {
    }
    public override bool Read()
    {
        bool b = base.Read();
        if (base.CurrentState == State.Property && ((string)base.Value).Contains(' '))
        {
            base.SetToken(JsonToken.PropertyName, ((string)base.Value).Replace(" ", "_"));
        }
        return b;
    }
}
Input : {"POI Items":[{"lat":{"value":"00","ab cd":"de fg"}}]}
Output: {"POI_Items":[{"lat":{"value":"00","ab_cd":"de fg"}}]}
I recommend using some sort of Regex.Replace().
Search the input string for something like:
\"([a-zA-Z0-9]+) ([a-zA-Z0-9]+)\":
and then replace it with something like this (note the missing space):
\"$1$2\":
The first pair of parentheses captures the first word of a variable name, and the second pair captures the second word. The : guarantees that the operation is applied to variable names only (not to string data), since JSON variable names sit inside a pair of \"s followed by a colon.
It may not be 100% correct, but it is a starting point.
For details, check MSDN and some Regex examples:
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace.aspx
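As a rough illustration of the regex idea, the pattern can be applied with Regex.Replace as below. This is only a sketch: the helper name RemoveSpaceInKeys is hypothetical, the pattern handles keys made of exactly two alphanumeric words, and it assumes no string value contains text that itself looks like a quoted key followed by a colon.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Program
{
    // Hypothetical helper: joins two-word JSON keys ("POI Items" -> "POIItems").
    public static string RemoveSpaceInKeys(string json)
    {
        // $1 and $2 are the two captured words; the trailing colon restricts
        // the match to property names rather than string values.
        return Regex.Replace(json, "\"([a-zA-Z0-9]+) ([a-zA-Z0-9]+)\":", "\"$1$2\":");
    }

    public static void Main()
    {
        string input = "{\"POI Items\":[{\"lat\":{\"value\":\"00\",\"ab cd\":\"de fg\"}}]}";
        Console.WriteLine(RemoveSpaceInKeys(input));
        // {"POIItems":[{"lat":{"value":"00","abcd":"de fg"}}]}
    }
}
```

The value "de fg" is left alone because it is not followed by a colon.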
