Remove fields from JSON dynamically using Json.Net - c#

I have some JSON input, the shape of which I cannot predict, and I have to make some transformations (to call it something) so that some fields are not logged. For instance, if I have this JSON:
{
"id": 5,
"name": "Peter",
"password": "some pwd"
}
then after the transformation it should look like this:
{
"id": 5,
"name": "Peter"
}
The sample above is trivial, but the real case is not so simple. I will have some regular expressions, and any field in the input JSON that matches one of them should not appear in the result. I will have to process nested objects recursively. I've looked at LINQ to JSON but have found nothing that satisfies my needs.
Is there a way of doing this?
Note:
This is part of a logging library. I can use the JSON string if necessary or easier. The thing is that at some point in my logging pipeline I get the object (or string as required) and then I need to strip the sensitive data from it, such as passwords, but also any other client-specified data.

You can parse your JSON into a JToken, then use a recursive helper method to match property names to your regexes. Wherever there's a match, you can remove the property from its parent object. After all sensitive info has been removed, just use JToken.ToString() to get the redacted JSON.
Here is what the helper method might look like:
public static string RemoveSensitiveProperties(string json, IEnumerable<Regex> regexes)
{
JToken token = JToken.Parse(json);
RemoveSensitiveProperties(token, regexes);
return token.ToString();
}
public static void RemoveSensitiveProperties(JToken token, IEnumerable<Regex> regexes)
{
if (token.Type == JTokenType.Object)
{
foreach (JProperty prop in token.Children<JProperty>().ToList())
{
bool removed = false;
foreach (Regex regex in regexes)
{
if (regex.IsMatch(prop.Name))
{
prop.Remove();
removed = true;
break;
}
}
if (!removed)
{
RemoveSensitiveProperties(prop.Value, regexes);
}
}
}
else if (token.Type == JTokenType.Array)
{
foreach (JToken child in token.Children())
{
RemoveSensitiveProperties(child, regexes);
}
}
}
And here is a short demo of its use:
public static void Test()
{
string json = @"
{
""users"": [
{
""id"": 5,
""name"": ""Peter Gibbons"",
""company"": ""Initech"",
""login"": ""pgibbons"",
""password"": ""Sup3rS3cr3tP#ssw0rd!"",
""financialDetails"": {
""creditCards"": [
{
""vendor"": ""Viza"",
""cardNumber"": ""1000200030004000"",
""expDate"": ""2017-10-18"",
""securityCode"": 123,
""lastUse"": ""2016-10-15""
},
{
""vendor"": ""MasterCharge"",
""cardNumber"": ""1001200230034004"",
""expDate"": ""2018-05-21"",
""securityCode"": 789,
""lastUse"": ""2016-10-02""
}
],
""bankAccounts"": [
{
""accountType"": ""checking"",
""accountNumber"": ""12345678901"",
""financialInsitution"": ""1st Bank of USA"",
""routingNumber"": ""012345670""
}
]
},
""securityAnswers"":
[
""Constantinople"",
""Goldfinkle"",
""Poppykosh""
],
""interests"": ""Computer security, numbers and passwords""
}
]
}";
Regex[] regexes = new Regex[]
{
new Regex("^.*password.*$", RegexOptions.IgnoreCase),
new Regex("^.*number$", RegexOptions.IgnoreCase),
new Regex("^expDate$", RegexOptions.IgnoreCase),
new Regex("^security.*$", RegexOptions.IgnoreCase),
};
string redactedJson = RemoveSensitiveProperties(json, regexes);
Console.WriteLine(redactedJson);
}
Here is the resulting output:
{
"users": [
{
"id": 5,
"name": "Peter Gibbons",
"company": "Initech",
"login": "pgibbons",
"financialDetails": {
"creditCards": [
{
"vendor": "Viza",
"lastUse": "2016-10-15"
},
{
"vendor": "MasterCharge",
"lastUse": "2016-10-02"
}
],
"bankAccounts": [
{
"accountType": "checking",
"financialInsitution": "1st Bank of USA"
}
]
},
"interests": "Computer security, numbers and passwords"
}
]
}
Fiddle: https://dotnetfiddle.net/KcSuDt

You can parse your JSON to a JContainer (which is either an object or array), then search the JSON hierarchy using DescendantsAndSelf() for properties with names that match some Regex, or string values that match a Regex, and remove those items with JToken.Remove().
For instance, given the following JSON:
{
"Items": [
{
"id": 5,
"name": "Peter",
"password": "some pwd"
},
{
"id": 5,
"name": "Peter",
"password": "some pwd"
}
],
"RootPasswrd2": "some pwd",
"SecretData": "This data is secret",
"StringArray": [
"I am public",
"This is also secret"
]
}
You can remove all properties whose name matches the pattern "pass.*w.*r.*d" as follows:
var root = (JContainer)JToken.Parse(jsonString);
var nameRegex = new Regex(".*pass.*w.*r.*d.*", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
var query = root.DescendantsAndSelf()
.OfType<JProperty>()
.Where(p => nameRegex.IsMatch(p.Name));
query.RemoveFromLowestPossibleParents();
Which results in:
{
"Items": [
{
"id": 5,
"name": "Peter"
},
{
"id": 5,
"name": "Peter"
}
],
"SecretData": "This data is secret",
"StringArray": [
"I am public",
"This is also secret"
]
}
And you can remove all string values that include the substring secret by doing:
var valueRegex = new Regex(".*secret.*", RegexOptions.IgnoreCase);
var query2 = root.DescendantsAndSelf()
.OfType<JValue>()
.Where(v => v.Type == JTokenType.String && valueRegex.IsMatch((string)v));
query2.RemoveFromLowestPossibleParents();
var finalJsonString = root.ToString();
Which when applied after the first transform results in:
{
"Items": [
{
"id": 5,
"name": "Peter"
},
{
"id": 5,
"name": "Peter"
}
],
"StringArray": [
"I am public"
]
}
For convenience, I am using the following extension methods:
public static partial class JsonExtensions
{
public static TJToken RemoveFromLowestPossibleParent<TJToken>(this TJToken node) where TJToken : JToken
{
if (node == null)
return null;
JToken toRemove;
var property = node.Parent as JProperty;
if (property != null)
{
// Also detach the node from its immediate containing property -- Remove() does not do this even though it seems like it should
toRemove = property;
property.Value = null;
}
else
{
toRemove = node;
}
if (toRemove.Parent != null)
toRemove.Remove();
return node;
}
public static IEnumerable<TJToken> RemoveFromLowestPossibleParents<TJToken>(this IEnumerable<TJToken> nodes) where TJToken : JToken
{
var list = nodes.ToList();
foreach (var node in list)
node.RemoveFromLowestPossibleParent();
return list;
}
}

Related

Remove all occurrences of a particular key from a JSON response in C#

I have a JSON string from which I want to eliminate all the occurrences of a given key.
JSON I have:
string requestBody =
{
"payLoad": [
{
"BaseVersionId_": 9,
"VersionId_": 10,
"AssetCollateralLink": [
{
"AssetId": 137,
"BaseVersionId_": 9,
"VersionId_": 10
},
{
"AssetId": 136,
"BaseVersionId_": 0,
"VersionId_": 1
}
],
"CollateralProvider": [],
"AdvCollateralAllocation": [
{
"LinkId": 91,
"IsDeleted_": false,
"BaseVersionId_": 1,
"VersionId_": 2
}
]
}
]
}
I want to eliminate the keys "BaseVersionId_" and "VersionId_" as follows:
string requestBody =
{
"payLoad": [
{
"AssetCollateralLink": [
{
"AssetId": 137
},
{
"AssetId": 136
}
],
"CollateralProvider": [],
"AdvCollateralAllocation": [
{
"LinkId": 91,
"IsDeleted_": false
}
]
}
]
}
I used JObject.Remove() as follows:
JObject sampleObj1 = new JObject();
sampleObj1 = JsonHelper.JsonParse(requestBody);
sampleObj1.Remove("BaseVersionId_");
but I am able to remove the keys under the payLoad hierarchy only.
How do I remove all occurrences of the key?
The required properties can be removed from the JSON with LINQ (this example handles the AdvCollateralAllocation array of the first payLoad item):
var jsonObj = JObject.Parse(requestBody);
jsonObj.SelectToken("payLoad[0]").SelectToken("AdvCollateralAllocation")
.Select(jt => (JObject)jt)
.ToList()
.ForEach(r =>
r
.Properties()
.ToList()
.ForEach(e =>
{
if (e.Name == "BaseVersionId_" || e.Name == "VersionId_")
e.Remove();
}));
The resultant jsonObj will be without the BaseVersionId_ and VersionId_ names as well as their values.
I'd use JSONPath like this:
var toRemove = jsonObject
.SelectTokens("$.payLoad.[*].AssetCollateralLink.[*]..BaseVersionId_")
.Concat(jsonObject.SelectTokens("$.payLoad.[*].AssetCollateralLink.[*]..VersionId_"))
.Concat(jsonObject.SelectTokens("$.payLoad.[*].AdvCollateralAllocation.[*]..VersionId_"))
.Concat(jsonObject.SelectTokens("$.payLoad.[*].AdvCollateralAllocation.[*]..BaseVersionId_"))
.ToList();
for (int i = toRemove.Count - 1; i >= 0; i--)
{
toRemove[i].Parent?.Remove();
}
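If the goal is to drop a key wherever it occurs, without listing every path, a descendant walk may be simpler. The following is a minimal sketch assuming Newtonsoft.Json; the class name JsonKeyRemover and the method RemoveKeys are hypothetical names chosen for illustration, not part of the library.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of Json.NET.
public static class JsonKeyRemover
{
    // Removes every property whose name is in keysToRemove, at any depth.
    public static string RemoveKeys(string json, params string[] keysToRemove)
    {
        var keys = new HashSet<string>(keysToRemove);
        JToken root = JToken.Parse(json);

        if (root is JContainer container)
        {
            // Materialize the matches first: removing properties while
            // enumerating Descendants() would modify the tree being walked.
            List<JProperty> matches = container.Descendants()
                .OfType<JProperty>()
                .Where(p => keys.Contains(p.Name))
                .ToList();

            foreach (JProperty property in matches)
            {
                property.Remove();
            }
        }

        return root.ToString();
    }
}
```

It could be called as JsonKeyRemover.RemoveKeys(requestBody, "BaseVersionId_", "VersionId_"). Name matching here is case-sensitive; pass a different comparer to the HashSet if that is not what you want.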

Parse Json key/values of a key's value using JObject in C#

I am trying to get the sub key/values of a key's value. What I am trying to accomplish is to remove the elements that are empty or have "-" or have "N/A". I cannot seem to figure out how to iterate over the values to search.
{
"name": {
"first": "Robert",
"middle": "",
"last": "Smith"
},
"age": 25,
"DOB": "-",
"hobbies": [
"running",
"coding",
"-"
],
"education": {
"highschool": "N/A",
"college": "Yale"
}
}
Code:
JObject jObject = JObject.Parse(response);
foreach (var obj in jObject)
{
Console.WriteLine(obj.Key);
Console.WriteLine(obj.Value);
}
I am trying to search "first":"Robert","middle":"","last":"Smith"
You can use the Descendants method to get the child tokens of type JProperty, then filter on their values and print them or remove them one by one:
var properties = json.Descendants()
.OfType<JProperty>()
.Where(p =>
{
if (p.Value.Type != JTokenType.String)
return false;
var value = p.Value.Value<string>();
return string.IsNullOrEmpty(value);
})
.ToList();
foreach (var property in properties)
property.Remove();
Console.WriteLine(json);
This gives you the following result (with the "middle": "" property removed):
{
"name": {
"first": "Robert",
"last": "Smith"
},
"age": 25,
"DOB": "-",
"hobbies": [
"running",
"coding",
"-"
],
"education": {
"highschool": "N/A",
"college": "Yale"
}
}
You can also add more conditions to the return statement, such as return string.IsNullOrEmpty(value) || value.Equals("-"); to remove the "DOB": "-" property as well.
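The property filter above does not reach placeholder strings that sit directly inside arrays, such as the "-" in hobbies. A minimal sketch that covers both cases follows, assuming Newtonsoft.Json; PlaceholderCleaner, RemovePlaceholders and the Placeholders set are hypothetical names, and the set of placeholder strings is taken from the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of Json.NET.
public static class PlaceholderCleaner
{
    // Assumed placeholder strings, taken from the question.
    private static readonly HashSet<string> Placeholders =
        new HashSet<string> { "", "-", "N/A" };

    public static JObject RemovePlaceholders(JObject json)
    {
        // Snapshot the matching string values before mutating the tree.
        List<JValue> matches = json.Descendants()
            .OfType<JValue>()
            .Where(v => v.Type == JTokenType.String && Placeholders.Contains((string)v))
            .ToList();

        foreach (JValue value in matches)
        {
            if (value.Parent is JProperty property)
            {
                property.Remove(); // drop the whole "name": value pair
            }
            else
            {
                value.Remove();    // drop the element from its array
            }
        }

        return json;
    }
}
```

The method mutates and returns the same JObject, so call DeepClone() first if the original tree must be preserved.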
You can recursively iterate JObject properties:
private static void IterateProps(JObject o)
{
foreach (var prop in o.Properties())
{
Console.WriteLine(prop.Name);
if (prop.Value is JObject)
{
IterateProps((JObject)prop.Value);
}
else
{
Console.WriteLine(prop.Value);
}
}
}

How to traverse dynamic nested json in c# recursively

If dataGeneratorType is range, the value can be anything between dataGeneratorStart and dataGeneratorEnd. If dataGeneratorType is array with no length property, the value is one element selected at random from the array; otherwise it is as many randomly selected elements as length specifies (two in the example below). If it is object (which can be nested further), the same logic applies again to its contents. This is where it gets tricky for me. Is there a dynamic way to solve this problem in C#?
Input payload json
{
"temperature": {
"type": "int",
"dataGeneratorType": "range",
"dataGeneratorStart": -5,
"dataGeneratorEnd": 55
},
"salesAmount": {
"type": "float",
"dataGeneratorType": "array",
"dataGeneratorArray": [
0.51,
13.33,
20.01,
1.54
]
},
"city": {
"type": "string",
"dataGeneratorType": "array",
"dataGeneratorArray": [
"UK",
"Iceland",
"Portugal",
"Spain"
]
},
"relatedTags": {
"type": "array",
"dataGeneratorType": "array",
"dataGeneratorArray": [
"Sport",
"Hardware",
"Cycling",
"Magazines"
],
"length": 2
},
"salesDetail": {
"type": "object",
"dataGeneratorType": "object",
"dataGeneratorValue": {
"VAT": {
"type": "float",
"dataGeneratorType": "range",
"dataGeneratorStart": 0.0,
"dataGeneratorEnd": 20.0
},
"discountAmount": {
"type": "float",
"dataGeneratorType": "array",
"dataGeneratorArray": [
0.10,
0.15,
0.20
]
}
}
}
}
To output json:
{
"temperature": 20,
"salesAmount": 20.01,
"city": "Iceland",
"relatedTags": [
"Sport",
"Cycling"
],
"salesDetails": {
"VAT": 15.0,
"discount": 0.1
}
}
Please try this:
static string TransformJson(string inputPayload)
{
JObject obj = JObject.Parse(inputPayload);
var manipulatedObj = obj.DeepClone();
foreach (var child in obj)
{
var key = child.Key;
var t = child.Value["type"];
JToken genType;
if (!((JObject)child.Value).TryGetValue("dataGeneratorType", out genType))
{
continue; // genType is not found so, continue with next object.
}
var str = genType.Type == JTokenType.String ? genType.ToString().ToLower() : null;
switch (str)
{
case "range":
var r = new Random();
if (((string)t).ToLower() == "float")
{
var s = (float)child.Value["dataGeneratorStart"];
var e = (float)child.Value["dataGeneratorEnd"];
manipulatedObj[key] = s + r.NextDouble() * (e - s); // scale into [s, e) instead of [0, e)
}
else
{
var s = (int)child.Value["dataGeneratorStart"];
var e = (int)child.Value["dataGeneratorEnd"];
manipulatedObj[key] = r.Next(s, e);
}
break;
case "array":
var arr = child.Value["dataGeneratorArray"];
JToken lengthToken;
if (!((JObject)child.Value).TryGetValue("length", out lengthToken))
{
lengthToken = 2.ToString();
}
if ((string)t != "array")
{
manipulatedObj[key] = arr[new Random().Next(arr.Count())]; // pick one element at random (assumes a non-empty array)
}
else
{
var count = arr.Count() >= (int)lengthToken ? (int)lengthToken : arr.Count();
var item = new JArray();
foreach (var m in Enumerable.Range(0, count).Select(i => arr[i]))
{
item.Add(m);
}
manipulatedObj[key] = item;
}
break;
case "object":
var transformJson = TransformJson(child.Value["dataGeneratorValue"].ToString());
manipulatedObj[key] = JObject.Parse(transformJson);
break;
default:
manipulatedObj[key] = child.Value["dataGeneratorValue"];
break;
}
}
return manipulatedObj.ToString(Formatting.Indented);
}
And use it as below:
var serializeObject = @"{""temperature"":{""type"":""int"",""dataGeneratorType"":""range"",""dataGeneratorStart"":-5,""dataGeneratorEnd"":55},""salesAmount"":{""type"":""float"",""dataGeneratorType"":""array"",""dataGeneratorArray"":[0.51,13.33,20.01,1.54]},""relatedTags"":{""type"":""array"",""dataGeneratorType"":""array"",""dataGeneratorArray"":[""Sport"",""Hardware"",""Cycling"",""Magazines""],""length"":2},""salesDetail"":{""type"":""object"",""dataGeneratorType"":""object"",""dataGeneratorValue"":{""VAT"":{""type"":""float"",""dataGeneratorType"":""range"",""dataGeneratorStart"":0.0,""dataGeneratorEnd"":20.0},""discountAmount"":{""type"":""float"",""dataGeneratorType"":""array"",""dataGeneratorArray"":[0.1,0.15,0.2]}}}}";
Console.WriteLine(serializeObject);
var outputJson = TransformJson(serializeObject);
Console.WriteLine(System.Environment.NewLine + "Modified Json = " + System.Environment.NewLine);
Console.WriteLine(outputJson);
I just added everything in one function but you can split into multiple functions and/or classes to make unit testing easier.
Here is the gist of the logic for your problem. To parse nested JSON you have to use a recursive call and do the processing for each layer.
public void Traverse(string jsonString)
{
    var jObj = JObject.Parse(jsonString);
    foreach (var item in jObj)
    {
        var generatorType = (string)item.Value["dataGeneratorType"];
        if (generatorType == "range")
        {
            Console.WriteLine("Do some logic");
        }
        else if (generatorType == "array")
        {
            if (item.Value["length"] != null)
            {
                Console.WriteLine("Length is present do more logic");
            }
            else
            {
                Console.WriteLine("No length property present do more logic");
            }
        }
        else if (generatorType == "object")
        {
            Console.WriteLine("It's an object");
            Console.WriteLine(item.Value["dataGeneratorValue"]);
            // Recursive call: the nested object has the same shape as the root,
            // so pass it back in as a JSON string.
            Traverse(item.Value["dataGeneratorValue"].ToString());
        }
    }
}

Find field with specific key in Json

In my C# project I use the Json.NET library.
I have a long JSON document with many subfields, for example:
{
"count": 10,
"Foo1": [
{
"id": "1",
"name": "Name1"
},
{
"id": "2",
"name": "Name3"
},
{
"id": "3",
"name": "Name4"
}
],
"Foo2": [
{
"id": "4",
"name": "Name3",
"specific_field": "specific_values1"
},
{
"id": "5",
"name": "Name3",
"specific_field": "specific_values2"
},
{
"id": "6",
"name": "Name3",
"specific_field": "specific_values3"
}
],
"Foo3": [
{
"id": "7"
},
{
"id": "8"
},
{
"id": "9"
}
]
}
I need to get a list of all specific_field values (ids 4-6), but I can't deserialize the JSON to an object because Foo1, Foo2, ... change dynamically.
Is it possible to get the values of specific_field when I have only the JSON?
I think I found a solution:
var list = new List<string>();
var result = ((JToken)json);
foreach (var res in result)
{
list.AddRange(from foo in res.First let ret = foo["specific_field"] where (dynamic) ret != null select foo["specific_field"].ToString());
}
What do you think of it? Please let me know in the comments.
You could use dynamics:
string json = "your JSON string comes here";
dynamic deserializedValue = JsonConvert.DeserializeObject(json);
var values = deserializedValue["Foo2"];
for (int i = 0; i < values.Count; i++)
{
Console.WriteLine(values[i]["specific_field"]);
}
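When the parent names (Foo1, Foo2, ...) are not known in advance, searching all descendants by property name avoids hard-coding "Foo2". This is a minimal sketch assuming Newtonsoft.Json; FieldFinder and FindValues are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of Json.NET.
public static class FieldFinder
{
    // Returns the values of every property named fieldName, at any depth,
    // in document order.
    public static List<string> FindValues(string json, string fieldName)
    {
        return JObject.Parse(json)
            .Descendants()
            .OfType<JProperty>()
            .Where(p => p.Name == fieldName)
            .Select(p => p.Value.ToString())
            .ToList();
    }
}
```

For the JSON in the question, FieldFinder.FindValues(json, "specific_field") would collect the three values under Foo2 without referring to Foo2 by name.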

Deserialize dynamically named JSON objects in C# (using JSON.Net or otherwise)

I have the following JSON:
{
"aaaa": {
"name": "General Name",
"product": "book",
"host": "book.example.com",
"chapters": {
"bbbb": {
"name": "Chapter 1",
"page": "1",
"end_page": "25"
}
},
"categories" : {
"analysis":{
"Abbbb" : {
"name": "B Chapter",
"id" : "9001"
},
"Acccc" : {
"name": "C Chapter",
"id" : "9001"
},
"Adddd" : {
"name": "D Chapter",
"id" : "9001"
},
"Aeeee" : {
"name": "E Chapter",
"id" : "9001"
},
"Affff" : {
"name": "F Chapter",
"id" : "9001"
},
"Agggg" : {
"name": "G Chapter",
"id" : "9001"
}
},
"sources":{
"acks" : {
"name": "S. Spielberg",
"id" : "9001"
}
}
}
}
"yyyy": {
"name": "Y General Name",
"product": "Y book",
"host": "ybook.example.com",
...
}
"zzzz": {
"name": "Z General Name",
"product": "Z book",
"host": "zbook.example.com",
...
}
The values for aaaa, yyyy, and zzzz can be any string and there can be any number of them.
I need to extract all of the [aaaa|yyyy|zzzz].categories.analysis values. That is, I need to end up with a Dictionary<string, string> of the object name (e.g., Abbbb, Acccc, etc.) and the ID, ignoring the name string.
E.g.,
[Abbbb, 9001]
[Acccc, 9001]
[Adddd, 9001]
...
[Zaaaa, 9001]
I've been at this for way too long and feel like I'm missing something obvious. I've tried JSON.net and native serialization. This is a trivial task in every other language I've used.
I've come close with something like this:
var ajsonObject = JsonConvert.DeserializeObject<dynamic>(jsonString);
var oasearch_categories = ajsonObject.aaaa.categories.analysis;
But again, aaaa can be any string, so I'm not sure how to reference that dynamically.
Took a while, but I figured it out. My requirement changed slightly from the original question: my final result needed to be a dictionary of lists, so I'd end up with a dictionary like:
DICT[ {"9001", ["Abbbb", "Acccc", "Adddd", ...]}, {"9002", ["Zbbbb", "Zdddd", ...]}, etc. ]
| key | | value | | key | | value |
This is the result:
Dictionary<string, List<string>> idsAndTheirNames = new Dictionary<string, List<string>>();
try
{
var ajsonObject = JsonConvert.DeserializeObject<dynamic>(JSONstring);
foreach (var child in ajsonObject.Children())
{
foreach (var product in child.Children())
{
var categories = product.categories.analysis;
foreach (var category in categories.Children())
{
foreach (var subcat in category)
{
string id = (string)subcat.id; //e.g., "9001"
List<string> names;
if (!idsAndTheirNames.TryGetValue(id, out names)) // the indexer would throw for a missing key
{
names = new List<string>();
idsAndTheirNames[id] = names;
}
names.Add((string)category.Name); //e.g., "Abbbb"; "9001" -> ["Abbbb", "Acccc", etc.]
System.Diagnostics.Debug.WriteLine((string)category.Name); //"Abbbb"
System.Diagnostics.Debug.WriteLine((string)subcat.name); //"B Chapter"
System.Diagnostics.Debug.WriteLine((string)subcat.id); //"9001"
}
}
}
}
}
catch (Exception ex)
{
System.Diagnostics.Debug.WriteLine("JSON ERROR: " + ex.Message);
}
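The same traversal can also be written without dynamic by enumerating JObject.Properties(), which exposes each dynamically named key through JProperty.Name. This is a minimal sketch assuming Newtonsoft.Json and the JSON shape from the question; AnalysisIndex and Build are hypothetical names.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of Json.NET.
public static class AnalysisIndex
{
    // Maps each id to the names of the analysis entries that carry it.
    public static Dictionary<string, List<string>> Build(string json)
    {
        var idsToNames = new Dictionary<string, List<string>>();

        // Top-level keys ("aaaa", "yyyy", ...) are arbitrary, so enumerate them.
        foreach (JProperty product in JObject.Parse(json).Properties())
        {
            // SelectToken returns null when the path is absent.
            var analysis = product.Value.SelectToken("categories.analysis") as JObject;
            if (analysis == null) continue;

            foreach (JProperty entry in analysis.Properties())
            {
                string id = (string)(entry.Value as JObject)?["id"];
                if (id == null) continue;

                if (!idsToNames.TryGetValue(id, out List<string> names))
                {
                    names = new List<string>();
                    idsToNames[id] = names;
                }
                names.Add(entry.Name); // e.g. "Abbbb"
            }
        }

        return idsToNames;
    }
}
```

Entries without an id, and top-level objects without categories.analysis, are skipped rather than treated as errors; that choice is an assumption and may need to change for stricter input validation.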
