JSON repeating inputs section from Excel - C#

I have a specific JSON string that I need to match for a REST call. I'm pulling the data from an Excel spreadsheet, and one of the sections has repeating inputs like the sample below. The spreadsheet's "Detail" sheet holds the data as name/value rows (original screenshot omitted).
The JSON I need to generate looks like:
"detailInputs": [
{
"name": "SOGrid",
"repeatingInputs": [
{
"inputs": [
{
"name": "ItemNumber",
"value": "XYZ"
},
{
"name": "Quantity",
"value": "1"
}
]
},
{
"inputs": [
{
"name": "ItemNumber",
"value": "ABC"
},
{
"name": "Quantity",
"value": "3"
}
]
}
]
What I've tried so far is below (note that jsonArrayString is the header information formatted in a previous section):
using (var conn = new OleDbConnection(connectionString))
{
    sheetName = "Detail";
    conn.Open();
    var cmd = conn.CreateCommand();
    cmd.CommandText = $"SELECT * FROM [{sheetName}$]";
    using (var rdr = cmd.ExecuteReader())
    {
        var query = rdr.Cast<DbDataRecord>().Select(row => new {
            name = row[0],
            value = row[1],
            //description = row[2]
        });
        var json = JsonConvert.SerializeObject(query);
        jsonArrayString = jsonArrayString + ",\"detailInputs\":[{\"name\":\"SOGrid\",\"repeatingInputs\":[{\"inputs\": " + json + "}]}]}";
    }
}
This is very close, but it puts all of the repeating inputs into a single "inputs" section.
I also tried assigning the values to a dictionary and a list in the hope of pulling out the appropriate pairs and building the JSON from that. Below is the beginning of that attempt, but I'm not familiar enough with unraveling the key/value pairs to get it formatted correctly.
using (var conn = new OleDbConnection(connectionString))
{
    sheetName = "Detail";
    conn.Open();
    int counter = 0;
    var cmd = conn.CreateCommand();
    cmd.CommandText = $"SELECT * FROM [{sheetName}$]";
    var values = new List<Dictionary<string, object>>();
    var ListValues = new List<string>();
    using (var rdr = cmd.ExecuteReader())
    {
        while (rdr.Read())
        {
            var fieldValues = new Dictionary<string, object>();
            var fieldValuesList = new List<string>();
            for (int i = 0; i < rdr.FieldCount; i++)
            {
                fieldValues.Add(rdr.GetName(i), rdr[i]);
                fieldValuesList.Add(rdr.GetName(i));
            }
            // add the dictionary to the values list
            values.Add(fieldValues);
        }
    }
}
The root question is: how can I create the repeating inputs structure shown in the JSON sample by pulling from the Excel data?

What you want to do is to serialize the contents of the Excel worksheet as the array value of the "repeatingInputs" property, using a specific structure. I would suggest breaking this down into a series of LINQ transformations.
First, introduce a couple of extension methods:
public static class DataReaderExtensions
{
    // Adapted from this answer https://stackoverflow.com/a/1202973
    // To https://stackoverflow.com/questions/1202935/convert-rows-from-a-data-reader-into-typed-results
    // By https://stackoverflow.com/users/3043/joel-coehoorn
    public static IEnumerable<T> SelectRows<T>(this IDataReader reader, Func<IDataRecord, T> select)
    {
        while (reader.Read())
        {
            yield return select(reader);
        }
    }
}

public static class EnumerableExtensions
{
    // Adapted from this answer https://stackoverflow.com/a/419058
    // To https://stackoverflow.com/questions/419019/split-list-into-sublists-with-linq/
    // By https://stackoverflow.com/users/50776/casperone
    public static IEnumerable<List<T>> ChunkWhile<T>(this IEnumerable<T> enumerable, Func<List<T>, T, bool> shouldAdd)
    {
        if (enumerable == null || shouldAdd == null)
            throw new ArgumentNullException();
        return enumerable.ChunkWhileIterator(shouldAdd);
    }

    static IEnumerable<List<T>> ChunkWhileIterator<T>(this IEnumerable<T> enumerable, Func<List<T>, T, bool> shouldAdd)
    {
        List<T> list = new List<T>();
        foreach (var item in enumerable)
        {
            if (list.Count > 0 && !shouldAdd(list, item))
            {
                yield return list;
                list = new List<T>();
            }
            list.Add(item);
        }
        if (list.Count != 0)
        {
            yield return list;
        }
    }
}
The first method packages an IDataReader into an enumerable of typed objects, one for each row. Doing this makes it easier to feed the data reader's contents into subsequent LINQ transformations. The second method breaks a flat enumerable into an enumerable of "chunks" of lists, based on some predicate condition. This will be used to break the rows into chunks at each ItemNumber row.
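As a quick illustration of the chunking, here is a minimal sketch using hypothetical in-memory rows instead of the data reader; a new chunk starts whenever the first name in the current chunk (ItemNumber) repeats:
// Hypothetical rows standing in for the worksheet contents.
var rows = new[]
{
    new { Name = "ItemNumber", Value = "XYZ" },
    new { Name = "Quantity",   Value = "1"   },
    new { Name = "ItemNumber", Value = "ABC" },
    new { Name = "Quantity",   Value = "3"   },
};

// Each chunk ends up holding one ItemNumber/Quantity pair.
var chunks = rows.ChunkWhile((list, row) => list[0].Name != row.Name);
// chunks => [ [ItemNumber=XYZ, Quantity=1], [ItemNumber=ABC, Quantity=3] ]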
Using these two extension methods we can generate the required JSON as follows:
public static string ExtractRows(string connectionString, string sheetName)
{
    using (var conn = new OleDbConnection(connectionString))
    {
        conn.Open();
        using (var cmd = conn.CreateCommand())
        {
            cmd.CommandText = string.Format("SELECT * FROM [{0}]", sheetName);
            using (var rdr = cmd.ExecuteReader())
            {
                var query = rdr
                    // Wrap the IDataReader in a LINQ enumerator returning an array of key/value pairs for each row.
                    // Project the first two columns into a single anonymous object.
                    .SelectRows(r =>
                    {
                        // Check we have two columns in the row, and the first (Name) column value is non-null.
                        // You might instead check that we have at least two columns.
                        if (r.FieldCount != 2 || r.IsDBNull(0))
                            throw new InvalidDataException();
                        return new { Name = r[0].ToString(), Value = r[1] };
                    })
                    // Break the columns into chunks when the first name repeats
                    .ChunkWhile((l, r) => l[0].Name != r.Name)
                    // Wrap in the container Inputs object
                    .Select(r => new { Inputs = r });

                // Serialize in camel case
                var settings = new JsonSerializerSettings
                {
                    ContractResolver = new CamelCasePropertyNamesContractResolver(),
                };
                return JsonConvert.SerializeObject(query, Formatting.Indented, settings);
            }
        }
    }
}
Which will generate the required value for "repeatingInputs":
[
  {
    "inputs": [
      {
        "name": "ItemNumber",
        "value": "XYZ"
      },
      {
        "name": "Quantity",
        "value": "1"
      }
    ]
  },
  {
    "inputs": [
      {
        "name": "ItemNumber",
        "value": "ABC"
      },
      {
        "name": "Quantity",
        "value": "3"
      }
    ]
  }
]
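If you also need the surrounding "detailInputs" wrapper rather than only the array, one option (a sketch, not a drop-in for your header concatenation) is to build the outer shape with anonymous objects and let the serializer handle the quoting. This has to happen inside the using block, because query lazily enumerates the open data reader:
// Sketch: wrap the chunked rows in the outer detailInputs structure,
// still inside ExtractRows where "query" and "settings" are in scope.
var detail = new
{
    DetailInputs = new[]
    {
        new { Name = "SOGrid", RepeatingInputs = query }
    }
};
return JsonConvert.SerializeObject(detail, Formatting.Indented, settings);
// The camel-case resolver turns this into
// {"detailInputs":[{"name":"SOGrid","repeatingInputs":[ ... ]}]}
// which can then be merged with the header JSON built earlier.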

Related

Create a tree node when serializing data

I have some data queried from a SQL database and I use this code to serialize them:
List<Dictionary<string, object>> rows = new List<Dictionary<string, object>>();
DataTable dt = new DataTable();
...
SqlDataAdapter adapt = new SqlDataAdapter();
adapt.Fill(dt);
Dictionary<string, object> row;
foreach (DataRow dr in dt.Rows)
{
    row = new Dictionary<string, object>();
    foreach (DataColumn col in dt.Columns)
    {
        row.Add(col.ColumnName, dr[col]);
    }
    rows.Add(row);
}
return JsonSerializer.Serialize(rows);
It gave me this result when I serialize them:
{
  "operator": "Unknown",
  "extrainfo": "potential client",
  "Name": "John Doe",
  "ID": 568910,
  "LastUpdate": "2021-07-22T00:00:00",
  "Interested?": "Yes",
  "Does it have a valid contract?": "No",
  "Contract type": "Prepaid",
  "Client willing to pay more?": "Yes, up to 20%",
  "Comments": {}
}
I want all data that comes after the LastUpdate column to be serialized inside another node, which is simply called interview.
Here is how I want to serialize them:
{
  "operator": "Unknown",
  "extrainfo": "potential client",
  "Name": "John Doe",
  "ID": 568910,
  "LastUpdate": "2021-07-22T00:00:00",
  "interview": [
    {
      "question": "Interested?",
      "answer": "Yes"
    },
    {
      "question": "Does it have a valid contract?",
      "answer": "No"
    },
    {
      "question": "Contract type",
      "answer": "Prepaid"
    },
    {
      "question": "Client willing to pay more?",
      "answer": "Yes, up to 20%"
    },
    {
      "question": "Comments",
      "answer": ""
    }
  ]
}
Here is what a database row looks like (screenshot omitted).
I want some help on how to do this.
All data that comes after lastUpdate column to be serialized inside another node
"After" is relative:
Your DataTable might define the columns in a different order than they should appear in the JSON
The serializer might use a different ordering than your database schema
Filtering
I would suggest an approach where you list those fields that should be serialized as properties and treat the rest of them as interview question-answer pairs.
var propertyFields = new[] { "operator", "extrainfo", "Name", "ID", "LastUpdate" };
Capturing data
In order to create the required output (for interview) you might need to introduce a class or a struct. I originally used a named ValueTuple to avoid creating one, but depending on your runtime environment it may or may not be available. UPDATE: ValueTuples are not supported by System.Text.Json.JsonSerializer (their members are fields, which it ignores by default), so a small struct is used instead:
struct Interview
{
    [JsonPropertyName("question")]
    public string Question { get; set; }

    [JsonPropertyName("answer")]
    public string Answer { get; set; }
}
Wire up
Let's put all these things together:
static readonly string[] propertyFields = new[] { "operator", "extrainfo", "Name", "ID", "LastUpdate" };
...
Dictionary<string, object> row;
foreach (DataRow dr in dt.Rows)
{
    row = new Dictionary<string, object>();
    var interview = new List<Interview>();
    foreach (DataColumn col in dt.Columns)
    {
        string name = col.ColumnName;
        object value = dr[col];
        if (propertyFields.Contains(col.ColumnName))
            row.Add(name, value);
        else
            interview.Add(new Interview { Question = name, Answer = value.ToString() });
    }
    row.Add("interview", interview);
    rows.Add(row);
}
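The rows list is then serialized exactly as in the original code; a minimal sketch (using the same System.Text.Json serializer, with indentation added here just for readability):
// Serialize the reshaped rows; the JsonPropertyName attributes on the
// Interview struct produce the lower-case "question"/"answer" names.
var options = new JsonSerializerOptions { WriteIndented = true };
return JsonSerializer.Serialize(rows, options);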
@admiri Please look at the serialization example at this link:
https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-how-to?pivots=dotnet-5-0
I substituted a list of tuples for the SQL data; for the purposes of the algorithm it's the same.
Note that the simplest way to do this is to create a POCO class to hold the actual values, with a nested "interview" POCO. If this is coming from SQL then you should know the column structure.
Given your question, I'm going to assume that for whatever reason that isn't possible: you don't know the column structure ahead of time and you're doing this on the fly.
In that case your best bet is to not use any POCO classes - including the dictionary you're currently using - and simply write out the data as JSON. One way to do that is as follows:
static List<(string name, string[] values)> Data = new()
{
    ("operator", new[] { "Unknown" }),
    ("extrainfo", new[] { "potential client" }),
    ("Name", new[] { "John Doe" }),
    ("ID", new[] { "568910" }),
    ("LastUpdate", new[] { "2021-07-22T00:00:00" }),
    ("Interested?", new[] { "Yes" }),
    ("Does it have a valid contract?", new[] { "No" }),
    ("Contract type", new[] { "Prepaid" }),
    ("Client willing to pay more?", new[] { "Yes, up to 20%" }),
    ("Comments", new string[] { }),
};

static string Serialize(List<(string name, string[] values)> data)
{
    using var output = new MemoryStream();
    using (var writer = new Utf8JsonWriter(output, new JsonWriterOptions() { Indented = true }))
    {
        bool foundQA = false;
        writer.WriteStartObject();
        foreach (var row in data)
        {
            if (!foundQA)
            {
                foreach (var value in row.values)
                {
                    writer.WritePropertyName(row.name);
                    if (null != value)
                        writer.WriteStringValue(value);
                    else
                        writer.WriteStringValue("");
                }
                if (row.name == "LastUpdate")
                {
                    writer.WritePropertyName("interview");
                    writer.WriteStartArray();
                    foundQA = true;
                }
            }
            else
            {
                writer.WriteStartObject();
                writer.WritePropertyName("question");
                writer.WriteStringValue(row.name);
                writer.WritePropertyName("answer");
                writer.WriteStringValue(row.values.Length > 0 ? row.values[0] : "");
                writer.WriteEndObject();
            }
        }
        if (foundQA)
        {
            writer.WriteEndArray();
        }
        writer.WriteEndObject();
    }
    return Encoding.UTF8.GetString(output.ToArray());
}

static void Main(string[] args)
{
    string formattedJson = Serialize(Data);
    Console.WriteLine("Formatted output:");
    Console.WriteLine(formattedJson);
}
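For completeness, if the column structure were known ahead of time, the POCO route mentioned above is only a few lines. This is a sketch with property names assumed from the sample JSON, reusing an Interview question/answer type like the struct shown in the earlier answer:
// Hypothetical POCO matching the sample output. Default System.Text.Json
// keeps property names as written, so only the lower-case ones need attributes.
public class ClientRecord
{
    [JsonPropertyName("operator")]
    public string Operator { get; set; }

    [JsonPropertyName("extrainfo")]
    public string ExtraInfo { get; set; }

    public string Name { get; set; }
    public int ID { get; set; }
    public DateTime LastUpdate { get; set; }

    [JsonPropertyName("interview")]
    public List<Interview> Interview { get; set; }
}

// Populate a ClientRecord per row, then:
// string json = JsonSerializer.Serialize(record, new JsonSerializerOptions { WriteIndented = true });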

Deserialization and Exporting of Nested JSON API Data to CSV

I have a requirement to take some data from an API's nested JSON and export it to CSV. I can't find the right syntax for extracting that data for the CSV export without creating multiple classes (there are hundreds of properties, and new ones can be added dynamically).
The JSON response structure from the API is below:
{
  "data": [
    {
      "id": "1",
      "type": "Bus",
      "attributes": {
        "property-two": "2020-12-10",
        "property-three": "D",
        "property-four": null,
        "property-five": 5
      }
    },
    {
      "id": "2",
      "type": "Car",
      "attributes": {
        "property-two": "2020-12-10",
        "property-three": "D",
        "property-four": null,
        "property-five": 5
      }
    }
  ]
}
We only need to export the "attributes" node from each dataset to CSV, but cannot seem to flatten the data or extract just those nodes.
The following code gets a list of JToken objects, but I'm not sure how this can be exported to CSV without re-serializing and de-serializing again. The datatype is dynamic since columns can be added and removed from it.
var jsonObject = JObject.Parse(apiResponseString);
var items = jsonObject["data"].Children()["attributes"].ToList();
//TODO: Export to CSV, DataTable etc
Is it possible to deserialize the data from the attributes nodes only, and how would that be done (whether on the initial JObject.Parse or by serializing the JToken list again)? I'm not tied to JObject, so I can use Newtonsoft or something else as well.
Using Json.Net's LINQ-to-JSON API (JObjects), you can convert your JSON data to CSV as shown below.
First define a couple of short extension methods:
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class JsonHelper
{
    public static string ToCsv(this IEnumerable<JObject> items, bool includeHeaders = true)
    {
        if (!items.Any()) return string.Empty;
        var rows = new List<string>();
        if (includeHeaders)
        {
            rows.Add(items.First().Properties().Select(p => p.Name).ToCsv());
        }
        rows.AddRange(items.Select(jo =>
            jo.Properties().Select(p => p.Value.Type == JTokenType.Null ? null : p.Value).ToCsv()
        ));
        return string.Join(Environment.NewLine, rows);
    }

    public static string ToCsv(this IEnumerable<object> values)
    {
        const string quote = "\"";
        const string doubleQuote = "\"\"";
        return string.Join(",", values.Select(v =>
            v != null ? string.Concat(quote, v.ToString().Replace(quote, doubleQuote), quote) : string.Empty
        ));
    }
}
Then you can do:
var obj = JObject.Parse(json);
var csv = obj.SelectTokens("..attributes").Cast<JObject>().ToCsv();
Here is a working demo: https://dotnetfiddle.net/NF2G2l
I ended up creating an extension method in my JsonHelper class to convert the JToken enumerable to a DataTable. This meets the original requirement to export to CSV simply (we use Aspose Cells, which has a method for DataTable to CSV), but it also allows us to work with the DataTable as an object without defining the columns. The end result looks like this:
Worker Class:
var stringResponse = await apiResponse.Content.ReadAsStringAsync();
var jsonObject = JObject.Parse(stringResponse);
var dataRows = jsonObject.SelectTokens("$..attributes").ToList();
var outputData = new DataTable("MyDataTable");
dataRows.AddToDataTable(outputData, pageNumber);
//TODO: Do something with outputData (to file, to object, to DB etc)
New extension method in the JsonHelper class:
public static void AddToDataTable(this List<JToken> jTokens, DataTable dt, int pageNumber)
{
    foreach (var token in jTokens)
    {
        JObject item;
        JToken jtoken;
        if (pageNumber == 1 && dt.Rows.Count == 0)
        {
            item = (JObject)token;
            jtoken = item.First;
            while (jtoken != null)
            {
                dt.Columns.Add(new DataColumn(((JProperty)jtoken).Name));
                jtoken = jtoken.Next;
            }
        }
        item = (JObject)token;
        jtoken = item.First;
        var dr = dt.NewRow();
        while (jtoken != null)
        {
            dr[((JProperty)jtoken).Name] = ((JProperty)jtoken).Value.ToString();
            jtoken = jtoken.Next;
        }
        dt.Rows.Add(dr);
    }
}
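The final DataTable-to-CSV step above relies on Aspose Cells; if that library isn't available, a minimal hand-rolled sketch (simplistic quoting, and assuming outputData is the DataTable filled by AddToDataTable) could be:
// Sketch: dump the DataTable as quoted CSV without Aspose.
var sb = new StringBuilder();
sb.AppendLine(string.Join(",", outputData.Columns.Cast<DataColumn>()
    .Select(c => "\"" + c.ColumnName.Replace("\"", "\"\"") + "\"")));
foreach (DataRow dr in outputData.Rows)
{
    sb.AppendLine(string.Join(",", dr.ItemArray
        .Select(v => "\"" + (v?.ToString() ?? "").Replace("\"", "\"\"") + "\"")));
}
File.WriteAllText("attributes.csv", sb.ToString());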

How to output JSON array as a single field in CSV using ChoETL

I'm using ChoETL to convert JSON to CSV. Currently, if a property in the JSON object is an array, it is output into separate fields in the CSV.
Example:
{
  "id": 1234,
  "states": [
    "PA",
    "VA"
  ]
},
{
  "id": 1235,
  "states": [
    "CA",
    "DE",
    "MD"
  ]
}
This results in CSV like this (using pipe as a delimiter):
"id"|"states_0"|"states_1"|"states_2"
"1234"|"PA"|"VA"
"1235"|"CA"|"DE"|"MD"
What I would like is for the array to be displayed in a single states field as a comma-separated string:
"id"|"states"
"1234"|"PA,VA"
"1235"|"CA,DE,MD"
Here is the code I have in place to perform the parsing and transformation.
public static class JsonCsvConverter
{
    public static string ConvertJsonToCsv(string json)
    {
        var csvData = new StringBuilder();
        using (var jsonReader = ChoJSONReader.LoadText(json))
        {
            using (var csvWriter = new ChoCSVWriter(csvData).WithFirstLineHeader())
            {
                csvWriter.WithMaxScanRows(1000);
                csvWriter.Configuration.Delimiter = "|";
                csvWriter.Configuration.QuoteAllFields = true;
                csvWriter.Write(jsonReader);
            }
        }
        return csvData.ToString();
    }
}
Edited: Removed test code that wasn't useful
This is how you can produce the expected output using the code below:
var csvData = new StringBuilder();
using (var jsonReader = ChoJSONReader.LoadText(json))
{
    using (var csvWriter = new ChoCSVWriter(csvData)
        .WithFirstLineHeader()
        .WithDelimiter("|")
        .QuoteAllFields()
        .Configure(c => c.UseNestedKeyFormat = false)
        .WithField("id")
        .WithField("states", m => m.ValueConverter(o => String.Join(",", ((Array)o).OfType<string>())))
        )
    {
        csvWriter.Write(jsonReader);
    }
}
Console.WriteLine(csvData.ToString());
Output:
id|states
"1234"|"PA,VA"
"1235"|"CA,DE,MD"
PS: In the next release, this issue will be handled automatically without using value converters.
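If you would rather not rely on ChoETL's value converters at all, one rough alternative (a sketch, assuming the objects are wrapped in a JSON array, which the example above is not quite) is to flatten the states array with Json.NET and write the pipe-delimited CSV by hand:
// Sketch: flatten "states" into a comma-separated string with Json.NET
// and emit the pipe-delimited CSV manually (no ChoETL involved).
var items = JArray.Parse(json);   // assumes json is "[ {...}, {...} ]"
var sb = new StringBuilder();
sb.AppendLine("\"id\"|\"states\"");
foreach (var item in items)
{
    var id = (string)item["id"];
    var states = string.Join(",", item["states"].Values<string>());
    sb.AppendLine($"\"{id}\"|\"{states}\"");
}
Console.WriteLine(sb.ToString());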

Convert SQLDataReader results to JSON, with nested JSON objects

I have a C# application which retrieves an SQL result set in the following format:
customer_id    date_registered    date_last_purchase    loyalty_points
1              2017-01-01         2017-05-02            51
2              2017-01-23         2017-06-21            124
...
How can I convert this to a JSON string, such that the first column (customer_id) is a key, and all other subsequent columns are values within a nested-JSON object for each customer ID?
Example:
{
  1: {
    date_registered: '2017-01-01',
    date_last_purchase: '2017-05-02',
    loyalty_points: 51,
    ...
  },
  2: {
    date_registered: '2017-01-23',
    date_last_purchase: '2017-06-21',
    loyalty_points: 124,
    ...
  },
  ...
}
Besides date_registered, date_last_purchase, and loyalty_points, there may be other columns in the future so I do not want to refer to these column names specifically. Therefore I have already used the code below to fetch the column names, but am stuck after this.
SqlDataReader sqlDataReader = sqlCommand.ExecuteReader();
var columns = new List<string>();
var rows = new List<Dictionary<string, object>>();
for (var i = 0; i < sqlDataReader.FieldCount; i++)
{
    columns.Add(sqlDataReader.GetName(i));
}
while (sqlDataReader.Read())
{
    rows.Add(columns.ToDictionary(column => column, column => sqlDataReader[column]));
}
You could use something like this to convert the data reader to a Dictionary<object, Dictionary<string, object>> and then use Json.NET to convert that to JSON:
var items = new Dictionary<object, Dictionary<string, object>>();
while (sqlDataReader.Read())
{
    var item = new Dictionary<string, object>(sqlDataReader.FieldCount - 1);
    for (var i = 1; i < sqlDataReader.FieldCount; i++)
    {
        item[sqlDataReader.GetName(i)] = sqlDataReader.GetValue(i);
    }
    items[sqlDataReader.GetValue(0)] = item;
}
var json = Newtonsoft.Json.JsonConvert.SerializeObject(items, Newtonsoft.Json.Formatting.Indented);
Update: JSON property names are always strings, so I used object and GetValue for the keys.
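With the sample result set from the question, the serialized output looks roughly like this; the customer_id keys come out as JSON strings, and the exact date formatting depends on the SQL column types (DateTime values serialize in ISO format):
{
  "1": {
    "date_registered": "2017-01-01T00:00:00",
    "date_last_purchase": "2017-05-02T00:00:00",
    "loyalty_points": 51
  },
  "2": {
    "date_registered": "2017-01-23T00:00:00",
    "date_last_purchase": "2017-06-21T00:00:00",
    "loyalty_points": 124
  }
}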

How to serialize specific Json object from C#

I have a requirement where I need to serialize a JSON object in the format below:
[{
  "columns": [{
    "title": "NAME"
  }, {
    "title": "COUNTY"
  }],
  "data": [
    ["John Doe", "Fresno"],
    ["Billy", "Fresno"],
    ["Tom", "Kern"],
    ["King Smith", "Kings"]
  ]
}]
Here I need to build this JSON object from two different sources: one for the columns and one for the data. The columns come from a comma-separated string such as
string columnNames = "Name, County";
and the data comes from a .NET DataTable like
DataTable dt = new DataTable();
I tried the code below using JavaScriptSerializer, but I am not able to produce the required format. The shared format is required to dynamically create a jQuery DataTable. Here is my raw code in C#.
[WebMethod]
public static string ConvertDatadttoString(string appName)
{
    DataTable dt = new DataTable();
    dt.Columns.Add("Name", typeof(string));
    dt.Columns.Add("County", typeof(string));
    dt.Rows.Add("vo", "go.com");
    dt.Rows.Add("pa", "pa.com");

    System.Web.Script.Serialization.JavaScriptSerializer serializer = new System.Web.Script.Serialization.JavaScriptSerializer();
    List<Dictionary<string, object>> rows = new List<Dictionary<string, object>>();
    Dictionary<string, object> row;
    foreach (DataRow dr in dt.Rows)
    {
        row = new Dictionary<string, object>();
        foreach (DataColumn col in dt.Columns)
        {
            row.Add(col.ColumnName, dr[col]);
        }
        rows.Add(row);
    }
    return serializer.Serialize(rows);
}
The above code only serializes the DataTable and does not produce the required format. Thanks.
What I usually do is build a model based off of the data and serialize that model.
This is how I'd imagine your model would look.
public class SampleClass
{
    public IEnumerable<SampleItem> columns { get; set; }
    public IEnumerable<IEnumerable<string>> data { get; set; }
}

public class SampleItem
{
    public string title { get; set; }
}
And this is how I'd imagine you'd get the sample JSON:
var sample = new List<SampleClass>
{
    new SampleClass()
    {
        columns = new List<SampleItem>()
        {
            new SampleItem() { title = "NAME" },
            new SampleItem() { title = "COUNTY" },
        },
        data = new List<List<string>>()
        {
            new List<string> { "John Doe", "Fresno" },
            new List<string> { "Billy", "Fresno" },
            new List<string> { "Tom", "Kern" },
            new List<string> { "King Smith", "Kings" },
        }
    }
};

var serializer = new JavaScriptSerializer();
var json = serializer.Serialize(sample);
I'm sure you can figure out how to create that model based off of your real data. It's not that hard.
It's probably easier to create a class, but if you want to work with a Dictionary<string,object> then you need to first add an entry for your columns:
rows["columns"] = dt.Columns.Cast<DataTableColumn>()
.Select(c => new { title = c.ColumnName }).ToList();
And then you can add your data with something like:
rows["data"] = dt.Rows.Cast<DataRow>.Select(r => r.ItemArray).ToList();
Now you have a Dictionary<string, object> with two items, columns and data. columns contains a collection of objects with a title property, and data just contains an array of arrays, one per row.
But this is a quick and dirty solution. I think creating a class as per @Sam I am's answer is cleaner and easier to maintain in the long run.
If you are starting with a comma separated list of column names, it really shouldn't be much harder to do. Something like:
var columns = columnNames.Split(',').Select(c => c.Trim()).ToArray(); // beware of column names that include commas!
row["columns"] = columns.Select(c => new { title = c }).ToList();
row["data"] = dt.Rows.Cast<DataRow>().Select(r => columns.Select(c => r[c]).ToList()).ToList();
