Characters added and wrong output during serialization with Json.NET - c#

JSON.NET seems to serialize my code into what appear to be strings, instead of objects. Here's an example of what it returns:
"{\"kvk_nummer\":11111111,\"onderneming\":\"berijf B.V.\",\"vestigingsplaats\":\"AMSTERDAM\",\"actief\":1}"
It also adds strange backslashes. I tried to get rid of them, but none of the answers I've found have helped. Here is the code that returns the string:
getregister r = new getregister
{
    kvk_nummer = col1, // contains an 8 digit number
    onderneming = checkTotaal[col1], // contains a name
    vestigingsplaats = checkTotaal2[col1], // contains a location
    actief = 1 // flag that represents whether the company is active or not
};
yield return JsonConvert.SerializeObject(r);
How can I get JSON.NET to output an object instead of a JSON string?

Looks like you're confusing a few things. Taken from Serialization (C#):
Serialization is the process of converting an object into a stream of bytes to store the object or transmit it to memory, a database, or a file. Its main purpose is to save the state of an object in order to be able to recreate it when needed. The reverse process is called deserialization.
When you serialize into JSON, you get a JSON representation of your object, which is a string representation. Taken from the JSON Wikipedia page:
JavaScript Object Notation or JSON is an open-standard file format that uses human-readable text to transmit data objects consisting of attribute–value pairs and array data types (or any other serializable value).
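In short: your code is doing what you're asking it to do. As for the backslashes: those are escape characters. They appear because the JSON text is itself being shown (or serialized a second time) as a quoted string, so each inner double quote has to be escaped. If you want (JSON.NET to return) an object, return the object you're creating (r):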
return new getregister
{
    kvk_nummer = col1, // contains an 8 digit number
    onderneming = checkTotaal[col1], // contains a name
    vestigingsplaats = checkTotaal2[col1], // contains a location
    actief = 1 // flag that represents whether the company is active or not
};
If you're looking for a way to have JSON.NET return an object, you should look into deserializing, since that takes the string representation (JSON) of your object and turns it back into an actual object for you.
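A minimal sketch of the difference, assuming Json.NET (Newtonsoft.Json) is referenced. The property types of getregister are guesses, since the question does not show the class definition, and the sample values are taken from the output quoted in the question.

```csharp
using System;
using Newtonsoft.Json;

// Assumed shape of the class from the question; the real definition is not shown there.
public class getregister
{
    public int kvk_nummer { get; set; }
    public string onderneming { get; set; }
    public string vestigingsplaats { get; set; }
    public int actief { get; set; }
}

public static class SerializeDemo
{
    public static void Main()
    {
        var r = new getregister
        {
            kvk_nummer = 11111111,
            onderneming = "berijf B.V.",
            vestigingsplaats = "AMSTERDAM",
            actief = 1
        };

        // Serializing once gives plain JSON text.
        string json = JsonConvert.SerializeObject(r);
        Console.WriteLine(json);
        // {"kvk_nummer":11111111,"onderneming":"berijf B.V.","vestigingsplaats":"AMSTERDAM","actief":1}

        // Serializing that string again wraps it in quotes and escapes every inner quote as \".
        string twice = JsonConvert.SerializeObject(json);
        Console.WriteLine(twice);

        // Deserializing turns the JSON text back into an object.
        getregister back = JsonConvert.DeserializeObject<getregister>(json);
        Console.WriteLine(back.onderneming); // berijf B.V.
    }
}
```

If the method's result is later serialized by a framework (for example a Web API action returning a string), that second pass is the likely source of the extra quotes and backslashes; returning the object itself avoids it.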

Related

Problem when serializing objects with strings that contain "/"

I am using DataContractJsonSerializer to serialize an object, and to do this I am using the following function:
public static string Serialize<T>(T obj)
{
    string returnVal = "";
    try
    {
        DataContractJsonSerializer serializer = new DataContractJsonSerializer(obj.GetType());
        using (MemoryStream ms = new MemoryStream())
        {
            serializer.WriteObject(ms, obj);
            returnVal = Encoding.UTF8.GetString(ms.ToArray());
        }
    }
    catch (Exception /*exception*/)
    {
        returnVal = "";
        //log error
    }
    return returnVal;
}
Now, this function works well, except in the following situation (I am hesitant to change it, since I don't know how that would affect the rest of my code).
The situation in which it does not work well
Say I have obj (the argument) an object such as:
[DataContract()]
public class theObject
{
    [DataMember()]
    public string image;
}
in which image holds the Base64 value of a BMP file.
It is a big value but for example it would start as: "Qk1W/QAAAAAAADYAAAAoAAAAawAAAMgAAAABABgAAAAAACD9AADEDgAAxA4AAAAAAAAAAAAA////////////////////////////////////7+/...."
So you see that it contains a lot of /s.
When I pass this object to Serialize, it writes the object to ms with WriteObject, converts the stream to a byte array, and decodes that array into returnVal.
Now let's examine returnVal. It is in JSON format (correct) and when you visualize it as JSON it will show you:
image:"Qk1W/QAAAAAAADYAAAAoAAAAawAAAMgAAAABABgAAAAAACD9AADEDgAAxA4AAAAAAAAAAAAA////////////////////////////////////7+/...."
However! when you visualize it as text it will show you:
"image":"Qk1W\/QAAAAAAADYAAAAoAAAAawAAAMgAAAABABgAAAAAACD9AADEDgAAxA4AAAAAAAAAAAAA\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/7+\/..."
Did you see? It has inserted a \ before every /, and that makes a lot of difference.
So my questions are:
Why does visualizing it as JSON show something different from visualizing it as text?
How can I get the correct value after serialization (without the added \ characters)?
EDIT:
Although one can say that \/ and / are the same, the consequences are not. Later, when I send this JSON to a Web API using
byte[] bytes = Encoding.UTF8.GetBytes(json);
ByteArrayContent byteContent = new ByteArrayContent(bytes);
byteContent.Headers.ContentType = new MediaTypeWithQualityHeaderValue(content);
the version with the added \ characters results in a byte array of 115442 bytes, while the version that only uses / results in a byte array of 86535 bytes. Therefore the results are quite different.
So how can I get the result without the added \ characters?
The standard behavior of the DataContractJsonSerializer is to escape / characters in strings so that they become \/ in JSON. When the JSON is deserialized back to an object, the \/ escape sequences are turned back into /, so no data is lost or corrupted. (Try it and see.) However, it does result in a larger JSON size in bytes. If this is really a concern for you, there are a couple of things you can do to work around it:
Approach 1
Immediately after serializing, you could use string.Replace() to get rid of all backslashes which appear directly before slashes. You can do this right in your Serialize method by changing this line:
returnVal = Encoding.UTF8.GetString(ms.ToArray());
to this:
returnVal = Encoding.UTF8.GetString(ms.ToArray()).Replace("\\/", "/");
Because / has no special meaning in JSON, it's not actually necessary to escape it with \, although it is permissible to do so. (See page 5 of the JSON specification.) DataContractJsonSerializer will still deserialize the JSON just fine even when slashes are not escaped. (Try it yourself and see. I'd make a fiddle for this, but .NET Fiddle doesn't support DataContractJsonSerializer.)
Approach 2 (recommended)
Switch to a better JSON serializer such as Json.Net, which does not escape the slashes in the first place. You can simplify your code and replace your entire Serialize method with JsonConvert.SerializeObject().
Fiddle: https://dotnetfiddle.net/MQKXSD
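As a rough illustration of Approach 2, the sketch below assumes Newtonsoft.Json is referenced. ImageHolder is a hypothetical stand-in for the question's theObject class, and the image value is a shortened, made-up sample rather than real bitmap data.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical stand-in for the question's theObject class.
public class ImageHolder
{
    public string image { get; set; }
}

public static class SlashDemo
{
    public static void Main()
    {
        var obj = new ImageHolder { image = "Qk1W/QAA//7+/" };

        // Json.NET writes forward slashes as they are, without a leading backslash.
        string json = JsonConvert.SerializeObject(obj);
        Console.WriteLine(json); // {"image":"Qk1W/QAA//7+/"}

        // JSON that does contain \/ escapes still decodes to the same value.
        string escaped = "{\"image\":\"Qk1W\\/QAA\\/\\/7+\\/\"}";
        ImageHolder back = JsonConvert.DeserializeObject<ImageHolder>(escaped);
        Console.WriteLine(back.image == obj.image); // True
    }
}
```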

convert byte array to string but not with Convert.ToBase64

Dears
I have a byte array that is returned from a web server; it is part of a JSON-serialized object (a property value).
It looks like this in the JSON string:
,"n":"y1GpP7FibyTYl40Jhx1B90WOi1mecJfpi4IEhbHPbAB64jhV16UlpEPyGpNIzDS4Lct80sIs7FW5Vnf38Z-tzPbtHyFVYYU2AC4SVrwQp9-ELz-..._xW3bmMxuwoBgHpWDTw"
Please note that there is no double equal sign at the end, as there would be for Base64 strings. I've used three dots (...) to make the string representation a little shorter.
I can deserialize the object and get the proper byte array:
var kb = JsonConvert.DeserializeObject<KeyBundle>(Properties.Resources.keyBundleJson);
And I can serialize it back to JSON:
JsonSerializerSettings settings = new JsonSerializerSettings
{
TypeNameHandling = TypeNameHandling.None,
Formatting = Formatting.Indented
};
string json = JsonConvert.SerializeObject(kb, settings);
But the problem is that the resulting property value does not look the same as the original string.
From the web server it was:
y1GpP7FibyTYl40Jhx1B90WOi1mecJfpi4IEhbHPbAB64jhV16UlpEPyGpNIzDS4Lct80sIs7FW5Vnf38Z-tzPbtHyFVYYU2AC4SVrwQp9-ELz-..._xW3bmMxuwoBgHpWDTw
Serialized locally:
y1GpP7FibyTYl40Jhx1B90WOi1mecJfpi4IEhbHPbAB64jhV16UlpEPyGpNIzDS4Lct80sIs7FW5Vnf38Z+tzPbtHyFVYYU2AC4SVrwQp9+ELz+.../xW3bmMxuwoBgHpWDTw==
The underscores became slashes, the minus signs became plus signs, and two equal signs were added at the end.
Is it possible to serialize the byte array exactly as the web server does?
My idea is to serialize it with Json.NET and then replace plus with minus, slash with underscore, and remove the two trailing equal signs.
Is there any other method that gives this result out of the box?
Regards
URLs use a different variant of Base64, with - and _ in place of + and /, which doesn't require additional encoding (e.g. + would otherwise be encoded as %2B). You can simply use the string Replace method to swap those characters.
If you want an out-of-the-box solution, you can try the Microsoft.IdentityModel.Tokens NuGet package:
var encoded = Base64UrlEncoder.Encode(someString);
var decoded = Base64UrlEncoder.Decode(encoded);
For more info: https://en.wikipedia.org/wiki/Base64#URL_applications
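The Replace-based approach can be sketched with nothing but the base class library. The class and method names below are made up for illustration; the padding logic assumes standard Base64, where the encoded length is always a multiple of four.

```csharp
using System;

// Hypothetical helper; the name is chosen for illustration only.
public static class Base64UrlHelper
{
    // Standard Base64, then strip the padding and swap the two URL-unsafe characters.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverse the character swap and restore the padding before decoding.
    public static byte[] Decode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }
}
```

For example, Base64UrlHelper.Encode(new byte[] { 0xFB, 0xFF }) yields "-_8", where plain Base64 would give "+/8=".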

Can I Deserialize a JSON string that contains 0.0 in C#?

The JSON I'm getting back from a webservice has an integer incorrectly represented as 0.0. My deserialization code looks like this:
var serializer = new JsonSerializer();
var ret = serializer.Deserialize<T>(jsonTextReader);
And I get an error like this:
Input string '0.0' is not a valid integer.
My question is, is there a way to specify a less strict deserialization method so that I can parse this string?
EDIT: The web service returns no schema so I don't know why the deserializer tries to convert it to an int instead of a float or double.
I'd say that you should go ahead and create your classes using Json -> C#
var o = (JObject)serializer.Deserialize(myjsondata);
You can use the C# dynamic type to make things easier. This technique also makes refactoring simpler, as it does not rely on magic strings. Use JsonConvert.DeserializeObject<dynamic>() to deserialize this string into a dynamic type, then simply access its properties in the usual way in C#.
I'm not sure why you're getting
Input string '0.0' is not a valid integer.
since if you don't have any JSON data it should just be left as null, and you shouldn't have this problem.
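Another option, not mentioned in the answers above, is a custom JsonConverter that accepts floating-point tokens for int members. This is only a sketch under assumptions: LenientIntConverter and Reading are hypothetical names, and the target class is assumed to declare the affected member as int. Declaring the member as double or decimal instead would avoid the need for a converter altogether.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical converter: lets values such as 0.0 populate int members.
public class LenientIntConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(int);
    }

    public override object ReadJson(JsonReader reader, Type objectType,
                                    object existingValue, JsonSerializer serializer)
    {
        // reader.Value is a long for integer tokens and a double for float tokens.
        // Convert.ToInt32 rounds to the nearest integer instead of rejecting the value.
        return Convert.ToInt32(reader.Value);
    }

    // Writing is left to the default behavior.
    public override bool CanWrite
    {
        get { return false; }
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotSupportedException();
    }
}

// Hypothetical target type with an int member.
public class Reading
{
    public int Count { get; set; }
}

public static class LenientDemo
{
    public static void Main()
    {
        Reading r = JsonConvert.DeserializeObject<Reading>(
            "{\"Count\":0.0}", new LenientIntConverter());
        Console.WriteLine(r.Count); // 0
    }
}
```

With a JsonSerializer instance, as in the question, the same converter can be added through serializer.Converters.Add(new LenientIntConverter()) before calling Deserialize.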

cleaning JSON for XSS before deserializing

I am using the Newtonsoft JSON deserializer. How can one clean JSON for XSS (cross-site scripting)? Should I clean the JSON string before deserializing, or write some kind of custom converter/sanitizer? I am not 100% sure about the best way to approach this.
Below is an example of JSON that has a dangerous script injected and needs "cleaning." I want a way to manage this before I deserialize it. But we need to assume all kinds of XSS scenarios, including Base64-encoded script etc., so the problem is more complex than a simple regex string replace.
{ "MyVar" : "hello<script>bad script code</script>world" }
Here is a snapshot of my deserializer ( JSON -> Object ):
public T Deserialize<T>(string json)
{
    T obj;
    var JSON = cleanJSON(json); //OPTION 1 sanitize here
    var customConverter = new JSONSanitizer();// OPTION 2 create a custom converter
    obj = JsonConvert.DeserializeObject<T>(json, customConverter);
    return obj;
}
JSON is posted from a 3rd party UI interface, so it's fairly exposed, hence the server-side validation. From there, it gets serialized into all kinds of objects and is usually stored in a DB, later to be retrieved and outputted directly in HTML based UI so script injection must be mitigated.
Ok, I am going to try to keep this rather short, because this is a lot of work to write up the whole thing. But, essentially, you need to focus on the context of the data you need to sanitize. From comments on the original post, it sounds like some values in the JSON will be used as HTML that will be rendered, and this HTML comes from an un-trusted source.
The first step is to extract whichever JSON values need to be sanitized as HTML, and for each of those objects you need to run them through an HTML parser and strip away everything that is not in a whitelist. Don't forget that you will also need a whitelist for attributes.
HTML Agility Pack is a good starting place for parsing HTML in C#. How to do this part is a separate question in my opinion - and probably a duplicate of the linked question.
Your worry about base64 strings seems a little over-emphasized in my opinion. It's not like you can simply put aW5zZXJ0IGg0eCBoZXJl into an HTML document and the browser will render it. It can be abused through javascript (which your whitelist will prevent) and, to some extent, through data: URLs (but this isn't THAT bad, as javascript will run in the context of the data page. Not good, but you aren't automatically gobbling up cookies with this). If you have to allow <a> tags, part of the process needs to be validating that the URL is http(s) (or whatever schemes you want to allow).
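The scheme check for links could look roughly like the sketch below. LinkValidator and IsAllowedLink are hypothetical names, and the allowed schemes are assumed to be http and https only.

```csharp
using System;

// Hypothetical helper: accepts only absolute http(s) URLs.
public static class LinkValidator
{
    public static bool IsAllowedLink(string href)
    {
        Uri uri;
        // Relative or malformed values fail TryCreate; javascript: and data: URLs
        // parse as absolute URIs but are rejected by the scheme comparison.
        return Uri.TryCreate(href, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

Values that fail the check would have their href attribute dropped during the whitelist pass.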
Ideally, you would avoid this uncomfortable situation, and instead use something like markdown - then you could simply escape the HTML string, but this is not always something we can control. You'd still have to do some URL validation though.
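For values that are meant to be plain text rather than markup, escaping can be sketched as follows, assuming Newtonsoft.Json is referenced and the payload is a JSON object. JsonHtmlEscaper is a hypothetical name. Whether to encode on input, as here, or at the point where the value is written into HTML is a design decision this sketch does not settle.

```csharp
using System.Linq;
using System.Net;
using Newtonsoft.Json.Linq;

// Hypothetical helper: HTML-encodes every string value in a JSON object.
public static class JsonHtmlEscaper
{
    public static JObject EscapeStrings(string json)
    {
        JObject root = JObject.Parse(json);

        // Snapshot the string values first, then rewrite each one in place.
        foreach (JValue value in root.Descendants().OfType<JValue>().ToList())
        {
            if (value.Type == JTokenType.String)
            {
                // Turns < and > into &lt; and &gt;, so a script tag is shown as text.
                value.Value = WebUtility.HtmlEncode((string)value.Value);
            }
        }
        return root;
    }
}
```

Applied to the example from the question, the MyVar value no longer contains any angle brackets, so a browser would display the script tag as text instead of executing it.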
Interesting! Thanks for asking. We normally use html.urlencode for web forms. I have an enterprise Web API running that has validations like this, for which we created a custom regex. Please have a look at this MSDN link.
This is the sample model, named KeyValue, created to parse the request:
public class KeyValue
{
    public string Key { get; set; }
}
Step 1: Trying with a custom regex
var json = @"[{ 'MyVar' : 'hello<script>bad script code</script>world' }]";
JArray readArray = JArray.Parse(json);
IList<KeyValue> blogPost = readArray.Select(p => new KeyValue { Key = (string)p["MyVar"] }).ToList();
// Validate each extracted value; calling ToString() on the list would only return its type name.
if (blogPost.Any(kv => !Regex.IsMatch(kv.Key,
    @"^[\p{L}\p{Zs}\p{Lu}\p{Ll}\']{1,40}$")))
    Console.WriteLine("InValid");
// ^ anchors the match at the start of the string.
// \p{..} matches any character in the named Unicode category specified by {..}.
// {L} matches any letter.
// {Lu} matches uppercase letters.
// {Ll} matches lowercase letters.
// {Zs} matches space separators.
// \' matches an apostrophe.
// {1,40} specifies the number of characters: no less than 1 and no more than 40.
// $ anchors the match at the end of the string.
Step 2: Using HttpUtility.UrlEncode. This Newtonsoft website link suggests the implementation below.
string json = @"[{ 'MyVar' : 'hello<script>bad script code</script>world' }]";
JArray readArray = JArray.Parse(json);
IList<KeyValue> blogPost = readArray.Select(p => new KeyValue { Key = HttpUtility.UrlEncode((string)p["MyVar"]) }).ToList();

JavaScriptSerializer value for i=3?

JavaScriptSerializer oSerializer = new JavaScriptSerializer();
object i = 3;
string sJSON = oSerializer.Serialize(i); //"3"
The JavaScriptSerializer should serialize its parameter to JSON!
And the result is "3" (which is not JSON).
What am I missing?
Edit:
I've written an email to Douglas Crockford:
3 is not a JSON object/text but a JSON value.
So I think MSDN should clarify the Serialize method.
http://i.stack.imgur.com/VOh3X.png
As has been said many times by different people, the output you are receiving is valid JSON.
From the JSON Specification (the Introduction):
JSON can represent four primitive types (strings, numbers, booleans, and null) and two structured types (objects and arrays).
and further (Section 2.1):
A JSON value MUST be an object, array, number, or string, or one of the following three literal names:
false null true
My interpretation of the specification tells me that the case you describe here is more a JSON value than a JSON object.
You asked it to serialise the value 3, and it did. That's exactly correct.
To be explicit: what exactly are you expecting to come out? JSON gives name-value pairs. The value "3" has no name, because the whole object is 3.
JSON is JavaScript object notation. Pass it an object, and you'll probably get what you're expecting.
You can use an anonymous type as M. Babcock suggests: new { i = 3 }.
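The two cases side by side, as a small sketch. It assumes a .NET Framework project with a reference to System.Web.Extensions, where JavaScriptSerializer lives.

```csharp
using System;
using System.Web.Script.Serialization; // from System.Web.Extensions (.NET Framework)

public static class ValueVersusObject
{
    public static void Main()
    {
        var serializer = new JavaScriptSerializer();

        // A bare value serializes to a JSON value.
        Console.WriteLine(serializer.Serialize(3)); // 3

        // Wrapping it in an anonymous type gives it a name, producing a JSON object.
        Console.WriteLine(serializer.Serialize(new { i = 3 })); // {"i":3}
    }
}
```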
