JSON.NET: Getting nested value when key contains dots? - c#

I want to access a nested value with JSON.NET. I know I can use the .SelectToken() method to access a nested value (see for example this question or this question). My issue is that the JSON I'm trying to access has keys with dots in them:
var json = @"
{
  ""data.dot"": {
    ""value"": 5
  }
}";
var jo = JObject.Parse(json);
Console.WriteLine(jo.SelectToken("data.dot.value")); // <-- doesn't work: returns null

I found the answer while writing this question, so I might as well share my findings.
It turns out that the .SelectToken method is very powerful, and:
allows you to query a JSON with escaped properties by surrounding your key with ['{key}']
allows you to use regex
allows you to filter by path value
allows you to query by complex path
So in my case, I could write:
jo.SelectToken("['data.dot'].value"); // escaped property
jo.SelectToken("$..value"); // complex JSON path
I could also use the JToken indexer, but unlike .SelectToken, which returns null for a missing path, chaining indexers throws a NullReferenceException if the JSON doesn't contain the data.dot key:
jo["data.dot"]["value"]
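To make the difference concrete, here is a minimal sketch, assuming the Newtonsoft.Json package is referenced. The key name no.such.key is made up for illustration; the rest mirrors the JSON from the question.

```csharp
using System;
using Newtonsoft.Json.Linq;

class Program
{
    static void Main()
    {
        var jo = JObject.Parse(@"{ ""data.dot"": { ""value"": 5 } }");

        // Escaped property name: the dot inside ['...'] is treated as part of the key.
        JToken found = jo.SelectToken("['data.dot'].value");
        Console.WriteLine(found); // 5

        // Unescaped path: parsed as data -> dot -> value, so nothing matches.
        Console.WriteLine(jo.SelectToken("data.dot.value") is null); // True

        // Missing key (hypothetical name): SelectToken returns null instead of throwing.
        Console.WriteLine(jo.SelectToken("['no.such.key'].value") is null); // True

        // The indexer can be made equally forgiving with the null-conditional operator.
        JToken viaIndexer = jo["no.such.key"]?["value"];
        Console.WriteLine(viaIndexer is null); // True
    }
}
```

If a missing path should be treated as an error instead, SelectToken also has an overload taking errorWhenNoMatch: true.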

Related

C# Pass filename as json parameter- Getting error "Unrecognized escape sequence. "

I want to pass a filepath through JSON. On deserializing I am getting error:
Unrecognized escape sequence. (43): {"Jobtype": "StepBatch","SelectedId": "D:\Input\file1.CATPart"}
I have escaped characters but it still shows error...am I missing something here?
string json = "{\"Jobtype\": \"StepBatch\",\"SelectedId\": \"D:\\Input\\file1.CATPart\"}";
var jsonObj = new JavaScriptSerializer().Deserialize<List<Arguments>>(json);
The problem is that the content of your string at execution time is:
{"Jobtype": "StepBatch","SelectedId": "D:\Input\file1.CATPart"}
That's not valid JSON, because of the backslashes in the value for SelectedId. You'd need the JSON to be:
{"Jobtype": "StepBatch","SelectedId": "D:\\Input\\file1.CATPart"}
so your C# would have to be:
string json = "{\"Jobtype\": \"StepBatch\",\"SelectedId\": \"D:\\\\Input\\\\file1.CATPart\"}";
However, given that you're immediately deserializing the JSON anyway, I'd suggest getting rid of the JSON part entirely, and just creating the Arguments values yourself.
If you need to produce JSON, create the right values directly, and then get JavaScriptSerializer (or preferably Json.NET) to create the JSON for you, instead of hand-coding it.
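As a rough illustration of that advice, the sketch below lets Json.NET do the escaping. The shape of the Arguments class is an assumption, since the question does not show it; only the two property names come from the JSON above.

```csharp
using System;
using Newtonsoft.Json;

// Assumed shape: the question does not show the real Arguments class.
class Arguments
{
    public string Jobtype { get; set; }
    public string SelectedId { get; set; }
}

class Program
{
    static void Main()
    {
        var args = new Arguments
        {
            Jobtype = "StepBatch",
            // Verbatim string: single backslashes, exactly as the path looks on disk.
            SelectedId = @"D:\Input\file1.CATPart"
        };

        // The serializer doubles the backslashes so the output is valid JSON.
        string json = JsonConvert.SerializeObject(args);
        Console.WriteLine(json);

        // Deserializing restores the original path unchanged.
        Arguments roundTrip = JsonConvert.DeserializeObject<Arguments>(json);
        Console.WriteLine(roundTrip.SelectedId == args.SelectedId); // True
    }
}
```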

JsonPath with JsonTextReader: Token at a Time

I am having an issue with JsonPath behaving differently when I load one token at a time (.Load) with a JsonTextReader than when I load the entire JSON with ReadFrom. Here is an example:
JSON: Path="[*].person" Method=SelectTokens(path)
[
{
"person": {
"personid": 123456
}
},
{
"person": {
"personid": 798
}
}
]
When using .ReadFrom, it'll return the proper 2 elements. If I use .Load though, it'll return 0 elements. However, if I change the path to "person", .ReadFrom returns 0 elements while .Load returns 2 elements.
As a fix, I could change the path so that it'll remove up to the first "." i.e. path = substring(path.index(".")+1); however, this feels more of a hack than a proper fix. I would, of course, also need to ensure that the JSON is an array, but in most of my cases, it would be.
So finally, I am trying to learn how to use JSON Path with arrays when loading a token at a time. Any recommendations?
Full Code
Full JSON
What is happening in the code you have linked to is this: it reads tokens until it encounters an object, then loads a JToken from that object, which reads ahead to the end of the object.
So what you end up with is a JToken per item in the root array. You can then for each JToken call:
token.SelectTokens("person").OfType<JObject>()
because you know the property contains an object.
That is the equivalent of doing "[*].person" JsonPath on the whole parsed JSON.
I hope I have understood your question correctly. If not, please let me know =)
Update:
Based on your comments I understand what you are after. What you could do is create a method like this:
public IEnumerable<JToken> GetTokensByPath(TextReader tr, string path)
{
    // do our best to convert the path to a RegEx
    var regex = new Regex(path.Replace("[*]", @"\[[0-9]*\]"));
    using (var reader = new JsonTextReader(tr))
    {
        while (reader.Read())
        {
            if (regex.IsMatch(reader.Path))
                yield return JToken.Load(reader);
        }
    }
}
I am matching reader.Path against a regex built from the JSON path input. A complete solution would need to handle all of the JSON path grammar; at the moment I only support [*].
This approach is useful when you have a massive file and a deep JSON path selector. You'll keep the stream open longer if you enumerate slowly, but peak memory usage will be much lower.
I hope this helps.
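A self-contained variation on the method above is sketched here, assuming the Newtonsoft.Json package. It differs from the answer in three ways, all of which are my assumptions and not part of the original: the path is regex-escaped so that "." is literal, the pattern is anchored, and property-name tokens are skipped so the value itself is loaded. The class name StreamingJsonPath is made up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

static class StreamingJsonPath
{
    // Yields each value whose reader path matches a simple JSON path.
    // Only plain property names and the [*] wildcard are handled.
    public static IEnumerable<JToken> GetTokensByPath(TextReader tr, string path)
    {
        // Escape every literal segment, then join the segments with an
        // "any array index" pattern wherever the path contained [*].
        string[] segments = path.Split(new[] { "[*]" }, StringSplitOptions.None);
        string pattern = "^" + string.Join(@"\[[0-9]+\]", segments.Select(Regex.Escape)) + "$";
        var regex = new Regex(pattern);

        using (var reader = new JsonTextReader(tr))
        {
            while (reader.Read())
            {
                // A property name reports the same path as its value; skip it
                // so that the value is loaded on the next token.
                if (reader.TokenType == JsonToken.PropertyName)
                    continue;

                if (regex.IsMatch(reader.Path))
                    yield return JToken.Load(reader); // reads to the end of this value
            }
        }
    }
}

class Program
{
    static void Main()
    {
        const string json =
            @"[ { ""person"": { ""personid"": 123456 } }, { ""person"": { ""personid"": 798 } } ]";

        foreach (JToken person in StreamingJsonPath.GetTokensByPath(new StringReader(json), "[*].person"))
            Console.WriteLine(person["personid"]); // 123456, then 798
    }
}
```

Because the method is an iterator, the reader is only disposed once enumeration finishes, which is the "stream stays open longer" trade-off mentioned above.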

How to create json schema from json object string C#

I am evaluating Json.Net.Schema from NewtonSoft and NJsonSchema from GitHub and I cannot figure out how to create a JSON schema from a JSON object. I want it to work exactly like this site does: http://jsonschema.net/#/
What I am looking for
string json = @"{""Name"": ""Bill"",""Age"": 51,""IsTall"": true}";
var jsonSchemaRepresentation = GetSchemaFromJsonObject(json);
I would expect a valid JSON schema in the jsonSchemaRepresentation variable. Does anyone know how I can accomplish this?
Thanks in advance!
The current version of NJsonSchema supports this feature:
The SampleJsonSchemaGenerator generates a JSON Schema from sample JSON data.
var schema = JsonSchema4.FromSampleJson("...");
var schemaJson = schema.ToJson();
... or create a SampleJsonSchemaGenerator instance and call the Generate("...") method.
Actually, neither of the libraries you mentioned supports this functionality.
If you're down to implement it yourself then you will have to parse your JSON, iterate over it recursively and add a new schema depending on the type of what you've just iterated over.
There are also some other tools (in other languages like python) which could be an inspiration, this might get you started.
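For the do-it-yourself route, a very rough sketch of the recursive walk might look like the following. It assumes Json.NET's JToken types for parsing; the names SchemaSketch and InferSchema are invented, and the output covers only "type", "properties" and "items", so it is far from a complete JSON Schema generator.

```csharp
using System;
using Newtonsoft.Json.Linq;

static class SchemaSketch
{
    // Hypothetical helper: maps each JSON value to a minimal schema fragment.
    public static JObject InferSchema(JToken token)
    {
        switch (token.Type)
        {
            case JTokenType.Object:
                var properties = new JObject();
                foreach (JProperty p in ((JObject)token).Properties())
                    properties[p.Name] = InferSchema(p.Value); // recurse into each member
                return new JObject { ["type"] = "object", ["properties"] = properties };

            case JTokenType.Array:
                var arraySchema = new JObject { ["type"] = "array" };
                // Simplification: assume every item looks like the first one.
                JToken firstItem = token.First;
                if (firstItem != null)
                    arraySchema["items"] = InferSchema(firstItem);
                return arraySchema;

            case JTokenType.Integer: return new JObject { ["type"] = "integer" };
            case JTokenType.Float:   return new JObject { ["type"] = "number" };
            case JTokenType.Boolean: return new JObject { ["type"] = "boolean" };
            case JTokenType.Null:    return new JObject { ["type"] = "null" };
            default:                 return new JObject { ["type"] = "string" };
        }
    }
}

class Program
{
    static void Main()
    {
        JToken sample = JToken.Parse(@"{""Name"": ""Bill"",""Age"": 51,""IsTall"": true}");
        Console.WriteLine(SchemaSketch.InferSchema(sample));
    }
}
```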
The string you are submitting to the function is not in the correct format. Try this (add '{' to the start of the string, '}' to the end):
string json = @"{
  ""Name"": ""Bill"",
  ""Age"": 51,
  ""IsTall"": true
}";
var jsonSchemaRepresentation = GetSchemaFromJsonObject(json);

cleaning JSON for XSS before deserializing

I am using Newtonsoft JSON deserializer. How can one clean JSON for XSS (cross site scripting)? Either cleaning the JSON string before de-serializing or writing some kind of custom converter/sanitizer? If so - I am not 100% sure about the best way to approach this.
Below is an example of JSON that has a dangerous script injected and needs "cleaning." I want a way to manage this before I de-serialize it. But we need to assume all kinds of XSS scenarios, including BASE64 encoded script etc., so the problem is more complex than a simple REGEX string replace.
{ "MyVar" : "hello<script>bad script code</script>world" }
Here is a snapshot of my deserializer ( JSON -> Object ):
public T Deserialize<T>(string json)
{
T obj;
var JSON = cleanJSON(json); //OPTION 1 sanitize here
var customConverter = new JSONSanitizer();// OPTION 2 create a custom converter
obj = JsonConvert.DeserializeObject<T>(json, customConverter);
return obj;
}
JSON is posted from a 3rd party UI interface, so it's fairly exposed, hence the server-side validation. From there, it gets serialized into all kinds of objects and is usually stored in a DB, later to be retrieved and outputted directly in HTML based UI so script injection must be mitigated.
Ok, I am going to try to keep this rather short, because this is a lot of work to write up the whole thing. But, essentially, you need to focus on the context of the data you need to sanitize. From comments on the original post, it sounds like some values in the JSON will be used as HTML that will be rendered, and this HTML comes from an un-trusted source.
The first step is to extract whichever JSON values need to be sanitized as HTML, and for each of those objects you need to run them through an HTML parser and strip away everything that is not in a whitelist. Don't forget that you will also need a whitelist for attributes.
HTML Agility Pack is a good starting place for parsing HTML in C#. How to do this part is a separate question in my opinion - and probably a duplicate of the linked question.
Your worry about base64 strings seems a little over-emphasized in my opinion. It's not like you can simply put aW5zZXJ0IGg0eCBoZXJl into an HTML document and the browser will render it. It can be abused through javascript (which your whitelist will prevent) and, to some extent, through data: urls (but this isn't THAT bad, as javascript will run in the context of the data page. Not good, but you aren't automatically gobbling up cookies with this). If you have to allow a tags, part of the process needs to be validating that the URL is http(s) (or whatever schemes you want to allow).
Ideally, you would avoid this uncomfortable situation, and instead use something like markdown - then you could simply escape the HTML string, but this is not always something we can control. You'd still have to do some URL validation though.
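The two simpler pieces mentioned above, escaping a value that should never contain markup and checking that a URL uses an allowed scheme, can be sketched with the base class library alone. The helper names are made up, and this is not a substitute for the whitelist-based HTML sanitizing described earlier.

```csharp
using System;
using System.Net;

static class OutputSafety
{
    // Hypothetical helper: accepts only absolute http or https URLs.
    public static bool IsAllowedUrl(string url)
    {
        return Uri.TryCreate(url, UriKind.Absolute, out Uri uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Hypothetical helper: encodes the value so a browser shows it as text.
    public static string EscapeForHtml(string untrusted)
    {
        return WebUtility.HtmlEncode(untrusted ?? string.Empty);
    }
}

class Program
{
    static void Main()
    {
        // hello&lt;script&gt;bad script code&lt;/script&gt;world
        Console.WriteLine(OutputSafety.EscapeForHtml("hello<script>bad script code</script>world"));

        Console.WriteLine(OutputSafety.IsAllowedUrl("https://example.com/page")); // True
        Console.WriteLine(OutputSafety.IsAllowedUrl("javascript:alert(1)"));      // False
        Console.WriteLine(OutputSafety.IsAllowedUrl("data:text/html;base64,aW5zZXJ0IGg0eCBoZXJl")); // False
    }
}
```

Escaping at output time, in the context where the value is rendered, tends to be more dependable than trying to scrub the JSON on the way in.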
Interesting! Thanks for asking. We normally use html.urlencode in web forms. I have an enterprise web API running that has validations like this, for which we created a custom regex. Please have a look at this MSDN link.
This is the sample model created to parse the request named KeyValue (say)
public class KeyValue
{
public string Key { get; set; }
}
Step 1: Trying with a custom regex
var json = @"[{ 'MyVar' : 'hello<script>bad script code</script>world' }]";
JArray readArray = JArray.Parse(json);
IList<KeyValue> blogPost = readArray.Select(p => new KeyValue { Key = (string)p["MyVar"] }).ToList();
// Validate each value; calling ToString() on the list would only return its type name.
if (blogPost.Any(kv => !Regex.IsMatch(kv.Key ?? "",
    @"^[\p{L}\p{Zs}\p{Lu}\p{Ll}\']{1,40}$")))
    Console.WriteLine("InValid");
// ^ means start looking at this position.
// \p{ ..} matches any character in the named character class specified by {..}.
// {L} matches any letter.
// {Lu} matches uppercase letters.
// {Ll} matches lowercase letters.
// {Zs} matches space separators.
// \' matches an apostrophe.
// {1,40} specifies the number of characters: no less than 1 and no more than 40.
// $ means stop looking at this position.
Step 2: Using HttpUtility.UrlEncode - this newtonsoft website link suggests the below implementation.
string json = @"[{ 'MyVar' : 'hello<script>bad script code</script>world' }]";
JArray readArray = JArray.Parse(json);
IList<KeyValue> blogPost = readArray.Select(p => new KeyValue { Key = HttpUtility.UrlEncode((string)p["MyVar"]) }).ToList();

Generating CSS using parameterized templating

I have already looked at the post: Efficient plain text template engine, but it didn't answer my question. Its documentation is more than a little lacking, and I don't see that it does what I'm trying to do.
I'm wondering if you can iterate over a template and fill in the values with a function, whose parameters come from attributes within the template. e.g.:
"The <comparison property='fruit' value='green'> and the <comparison property='bowl' value='big'>."
becomes, after iterating over each variable and passing it to a function,
"The fruit is green and the bowl is big."
I'm trying to generate a css page based upon a JSON object containing appearance settings.
EDIT: I'm wondering if there's a way to get the straight object from JsonConvert.DeserializeObject(). The JObject has a lot of information I don't need.
(I am not sure if this is what you are looking for, but) I guess, you can combine my previous answer (showing the use of JObject.SelectToken) with regex to create your own templating engine.
string Parse(string json, string template)
{
    var jObj = JObject.Parse(json);
    return Regex.Replace(template, @"\{\{(.+?)\}\}",
        m => jObj.SelectToken(m.Groups[1].Value).ToString());
}
string json = @"{name:""John"" , addr:{state:""CA""}}";
string template = "dummy text. Hello {{name}} at {{addr.state}} dummy text.";
string result = Parse(json, template);
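Applied to the CSS use case from the question, the same idea could look roughly like this. The settings JSON, the template text and the names CssTemplate and Render are all invented for illustration. Unlike the snippet above, a placeholder with no matching value is left in place instead of throwing.

```csharp
using System;
using System.Text.RegularExpressions;
using Newtonsoft.Json.Linq;

static class CssTemplate
{
    // Replaces each {{path}} placeholder with the value found at that path in the JSON.
    public static string Render(string json, string template)
    {
        JObject settings = JObject.Parse(json);
        return Regex.Replace(template, @"\{\{(.+?)\}\}", m =>
        {
            JToken value = settings.SelectToken(m.Groups[1].Value.Trim());
            // Leave unknown placeholders untouched so they are easy to spot.
            return value != null ? value.ToString() : m.Value;
        });
    }
}

class Program
{
    static void Main()
    {
        // Hypothetical appearance settings.
        string json = @"{ ""colors"": { ""text"": ""#333"", ""background"": ""#fff"" } }";
        string template = "body { color: {{colors.text}}; background: {{colors.background}}; }";

        // body { color: #333; background: #fff; }
        Console.WriteLine(CssTemplate.Render(json, template));
    }
}
```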
