I have been searching for a way to convert a JSON file to CSV, and vice versa, using C#. I have searched Google and have not come up with anything. Everything I've tried so far from the answers on Stack Overflow just does not work for me. Does anyone know of any tooling or tutorials I could look at to accomplish this with the .NET Framework? Usually I post what I've tried, but I'm clearly far off here, so it would be pointless.
Compromises and Problems
You can accomplish this with the .NET Framework, but there's no clear and obvious way to do it straight-up because of hierarchies and collections. What I mean is that CSV data is flat (just rows and columns), whereas JSON data is hierarchical and can nest objects and collections. Let's take a simple chunk of JSON data that could look like this:
{
"Data": [
{
"Name":"Mickey Mouse",
"Friends":[ "Pluto", "Minnie", "Donald" ]
},
{
"Name":"Pluto",
"Friends":[ "Mickey" ]
}
]
}
The most obvious CSV file for that could be:
Name,Friend
Mickey Mouse,Pluto
Mickey Mouse,Minnie
Mickey Mouse,Donald
Pluto,Mickey
That's the easier conversion but let's say you just have that CSV file. It's not so obvious what the JSON should look like. One could argue that the JSON should look like this:
{
"Data": [
{ "Name":"Mickey Mouse", "Friend":"Pluto" },
{ "Name":"Mickey Mouse", "Friend":"Minnie" },
{ "Name":"Mickey Mouse", "Friend":"Donald" },
{ "Name":"Pluto", "Friend":"Mickey" }
]
}
That resulting JSON file is very different than the input JSON file. My point is that this isn't a simple/obvious conversion so any off-the-shelf or copy/paste solution will be imperfect. Whatever your solution is, you're going to have to make compromises or intelligent decisions.
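To make the compromise concrete, here is a minimal sketch of both directions using plain in-memory collections rather than any JSON or CSV library. The class and method names (`FriendRows`, `Flatten`, `Regroup`) are made up for illustration. Flattening is mechanical; regrouping only works because we decide that rows sharing a `Name` belong to one object, which is an assumption the CSV itself does not state.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class FriendRows
{
    // Nested -> flat: one (Name, Friend) row per friend, like the CSV above.
    public static List<KeyValuePair<string, string>> Flatten(
        IDictionary<string, List<string>> friendsByName)
    {
        return friendsByName
            .SelectMany(p => p.Value.Select(
                f => new KeyValuePair<string, string>(p.Key, f)))
            .ToList();
    }

    // Flat -> nested: the "intelligent decision" is grouping rows by Name.
    // Without that decision you get one JSON object per row instead.
    public static Dictionary<string, List<string>> Regroup(
        IEnumerable<KeyValuePair<string, string>> rows)
    {
        return rows
            .GroupBy(r => r.Key)
            .ToDictionary(g => g.Key, g => g.Select(r => r.Value).ToList());
    }
}
```

Calling `Regroup(Flatten(data))` on the Mickey/Pluto example gives back two entries, but only because the grouping key was chosen by hand.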
.NET Framework Options
Now that we've gotten that out of the way, .NET gives you some capabilities to accomplish this out of the box and there are some good Nuget-supplied options as well. If you want to utilize pure .NET capabilities, you could use a combination of these two SO Answers:
Not perfect but this answer has some great code to get you started in the logic to generate a CSV file
This question and the resulting answers have some good info about generating JSON using just the .NET Framework and without any third-party utilities.
You should be able to apply the concepts in those two links PLUS the compromises and intelligent decisions you need to make from my first "Compromises and Problems" section of this post to accomplish what you need.
Something I've Done Before
I've done something similar where I actually used some functionality in the Microsoft.VisualBasic.FileIO namespace (works great in a C# app) in addition to Web API's serialization functionality to accomplish a CSV->JSON conversion using Dynamic objects (using the dynamic keyword) as an intermediary. The code is provided below. It's not terribly robust and makes some significant compromises but it has worked well for me. If you want to try this, you'll have to create your own version that goes in reverse, but as I mentioned in my first section, that's really the easy part.
using System.Collections.Generic;
using System.Dynamic;
using System.IO;
using System.Linq;
using System.Web.Http;
// NOTE: This is not purely my code. This was put together
// with the help of other SO questions that I wish I had the
// links to so I could credit them. You probably will find
// some chunk(s) of this code elsewhere on SO.
namespace Application1.Controllers
{
public class Foo
{
public string Csv { get; set; }
}
public class JsonController : ApiController
{
[HttpPost]
[Route("~/Csv/ToJson")]
public dynamic[] ConvertCsv([FromBody] Foo input)
{
var data = CsvToDynamicData(input.Csv);
return data.ToArray();
}
internal static List<dynamic> CsvToDynamicData(string csv)
{
var headers = new List<string>();
var dataRows = new List<dynamic>();
using (TextReader reader = new StringReader(csv))
{
using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader))
{
parser.Delimiters = new[] {","};
parser.HasFieldsEnclosedInQuotes = true;
parser.TrimWhiteSpace = true;
var rowIdx = 0;
while (!parser.EndOfData)
{
var colIdx = 0;
dynamic rowData = new ExpandoObject();
var rowDataAsDictionary = (IDictionary<string, object>) rowData;
foreach (var field in parser.ReadFields().AsEnumerable())
{
if (rowIdx == 0)
{
// header
headers.Add(field.Replace("\\", "_").Replace("/", "_").Replace(",", "_"));
}
else
{
if (field == "null" || field == "NULL")
{
rowDataAsDictionary.Add(headers[colIdx], null);
}
else
{
rowDataAsDictionary.Add(headers[colIdx], field);
}
}
colIdx++;
}
if (rowDataAsDictionary.Keys.Any())
{
dataRows.Add(rowData);
}
rowIdx++;
}
}
}
return dataRows;
}
}
}
If you want something more robust, then you can always leverage these great projects:
JSON.NET (This works VERY WELL with creating JSON from dynamic objects. Given that you're not using Web API, this would be the first place I would look to take the dynamic[] return value and convert it to JSON.)
CsvHelper
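For the reverse direction mentioned above (rows back to CSV), a rough sketch could look like the following. It is not from the original answer: `SimpleCsvWriter`, `Escape`, and `ToCsv` are hypothetical names, the header is assumed to come from the keys of the first row, and quoting follows the usual RFC 4180 convention. Since `ExpandoObject` implements `IDictionary<string, object>`, the rows produced by `CsvToDynamicData` could be cast and passed in.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of any library.
public static class SimpleCsvWriter
{
    // Quote a field if it contains a comma, a quote, or a line break,
    // doubling any embedded quotes.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Assumption: the first row defines the columns; rows that lack a
    // key (or hold null) produce an empty cell.
    public static string ToCsv(IList<IDictionary<string, object>> rows)
    {
        if (rows.Count == 0) return "";
        var headers = rows[0].Keys.ToList();
        var sb = new StringBuilder();
        sb.AppendLine(string.Join(",", headers.Select(h => Escape(h))));
        foreach (var row in rows)
        {
            var cells = headers.Select(h =>
            {
                object value;
                return row.TryGetValue(h, out value) && value != null
                    ? Escape(value.ToString())
                    : "";
            });
            sb.AppendLine(string.Join(",", cells));
        }
        return sb.ToString();
    }
}
```

Nested values (arrays, child objects) are simply written with `ToString()` here, which is exactly the kind of compromise described in the first section.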
Instead of combining multiple libraries to convert JSON to CSV and vice versa, you can use Cinchoo ETL, which gives you a unified interface for converting between those two formats.
For a sample JSON file:
[
{
"Name" : "Xytrex Co.",
"Description" : "Industrial Cleaning Supply Company",
"AccountNumber" : "ABC15797531"
},
{
"Name" : "Watson and Powell, Inc.",
"Description" : "Law firm. New York Headquarters",
"AccountNumber" : "ABC24689753"
}
]
To produce CSV file:
Name,Description,AccountNumber
Xytrex Co.,Industrial Cleaning Supply Company,ABC15797531
"Watson and Powell, Inc.",Law firm. New York Headquarters,ABC24689753
JSON to CSV:
using (var p = ChoJSONReader.LoadText(json))
{
using (var w = new ChoCSVWriter(Console.Out)
.WithFirstLineHeader()
)
{
w.Write(p);
}
}
Sample fiddle: https://dotnetfiddle.net/T3u4W2
CSV to JSON:
using (var p = ChoCSVReader.LoadText(csv)
.WithFirstLineHeader()
)
{
using (var w = new ChoJSONWriter(Console.Out))
{
w.Write(p);
}
}
Sample fiddle: https://dotnetfiddle.net/gVlJVX
As Jaxidian mentioned, the problem is that JSON can have a hierarchy and CSV cannot.
So there are two solutions I can suggest:
Create a hierarchical CSV, which shouldn't take much effort:
"Id";"Name";"Age";"Type"
"FriendId"
1;"Mickey Mouse";20;"mouse"
2
3
4
2;"Pluto";7;"dog"
1
3;"Minnie";20;"mouse"
4;"Donald";22;"duck"
Create multiple files. This could take more effort, but it is cleaner and more flexible, e.g. when you export from or import into a database. Maybe this link could help you: http://www.snellman.net/blog/archive/2016-01-12-json-to-multicsv
all.csv (store all characters)
"Id";"Name";"Age";"Type"
1;"Mickey Mouse";20;"mouse"
2;"Pluto";7;"dog"
3;"Minnie";20;"mouse"
4;"Donald";22;"duck"
friends.csv (store all relations)
"FriendKey1";"FriendKey2"
1;2
2;1
1;3
1;4
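The two-file layout above could be produced along these lines. This is only a sketch under assumptions: the `Character` class and the `MultiCsv` helper are invented for illustration, only `Id` and `Name` are written (the `Age` and `Type` columns are left out to keep it short), and names are assumed not to contain quotes.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: each character lists the IDs of its friends.
public class Character
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<int> FriendIds { get; set; } = new List<int>();
}

// Hypothetical helper, not part of any library.
public static class MultiCsv
{
    // Lines for all.csv: one row per character.
    public static List<string> AllLines(IEnumerable<Character> characters)
    {
        var lines = new List<string> { "\"Id\";\"Name\"" };
        lines.AddRange(characters.Select(c => $"{c.Id};\"{c.Name}\""));
        return lines;
    }

    // Lines for friends.csv: one row per relation, keyed by the two IDs.
    public static List<string> FriendLines(IEnumerable<Character> characters)
    {
        var lines = new List<string> { "\"FriendKey1\";\"FriendKey2\"" };
        lines.AddRange(characters.SelectMany(
            c => c.FriendIds.Select(f => $"{c.Id};{f}")));
        return lines;
    }
}
```

Each list can then be written out with `File.WriteAllLines`. The relations file is what lets an importer rebuild the hierarchy later.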
Does anyone know how to convert the below nested JSON to CSV via CHOETL (An ETL framework for .NET)? Thank you!
I'm using this code but it will only return the first equipment record.
CODE:
{
using (var json = new ChoJSONReader("./test.json"))
{
csv.Write(json.Cast<dynamic>().Select(i => new
{
EquipmentId = i.GpsLocation.Equipment[0].EquipmentId,
InquiryValue = i.GpsLocation.Equipment[0].InquiryValue,
Timestamp = i.GpsLocation.Equipment[0].Timestamp
}));
}
}
JSON:
{
"GpsLocation": {
"Equipment": [
{
"EquipmentId": "EQ00001",
"InquiryValue": [
"IV00001"
],
"Timestamp": "2020-01-01 01:01:01.01"
},
{
"EquipmentId": "EQ00002",
"InquiryValue": [
"IV00002"
],
"Timestamp": "2020-01-01 01:01:01.01"
}
]
}
}
As others suggest, the issue is you are only looking at the first element of the array.
It appears that the easiest way to control what you serialise into CSV is by correctly defining your source objects from JSON. JSON Path expressions come in pretty handy.
What I ended up doing here is query all JSON to return an array of Equipment objects regardless of where they are in the hierarchy (which means you may need to filter it a bit better depending on your full JSON).
Then it's pretty easy to define each field based on JSON path and just pass the result to CSVWriter.
Also check out some gotchas that I outlined in the respective comment lines.
void Main()
{
var jsonString = "{\"GpsLocation\":{\"Equipment\":[{\"EquipmentId\":\"EQ00001\",\"InquiryValue\":[\"IV00001\"],\"Timestamp\":\"2020-01-01 01:01:01.01\"},{\"EquipmentId\":\"EQ00002\",\"InquiryValue\":[\"IV00002\"],\"Timestamp\":\"2020-01-01 01:01:01.01\"}]}}";
var jsonReader = new StringReader(jsonString);
var csvWriter = new StringWriter(); // outputs to string, comment out if you want file output
//var csvWriter = new StreamWriter(".\\your_output.csv"); // writes to a file of your choice
using (var csv = new ChoCSVWriter(csvWriter))
using (var json = new ChoJSONReader(jsonReader)
.WithJSONPath("$..Equipment[*]", true) // first, scope the reader to all Equipment objects. Take note of the second parameter: apparently you need to pass true here, as otherwise it just won't return anything
.WithField("EquipmentId", jsonPath: "$.EquipmentId", isArray: false) // then you scope each field in the array to what you want it to be. Since you want scalar values, pass `isArray: false` for better predictability
.WithField("InquiryValue", jsonPath: "$.InquiryValue[0]", isArray: false) // since your InquiryValue is actually an array, you want to obtain first element here. if you don't do this, fields names and values would go askew
.WithField("Timestamp", jsonPath: "$.Timestamp", fieldType: typeof(DateTime), isArray: false)) // you can also supply field type, otherwise it seems to default to `string`
{
csv.WithFirstLineHeader().Write(json);
}
Console.WriteLine(csvWriter.GetStringBuilder().ToString()); // comment this out if writing to file - you won't need it
}
Update summary:
Pivoted to update the code to rely on JSON Path scoping - this seems to allow for field name manipulation with pretty low effort
Looking at your comment, you could probably simplify your file writer a little bit - use StreamWriter instead of StringWriter - see updated code for example
Here is the working sample of producing CSV from your JSON
string json = @"{
""GpsLocation"": {
""Equipment"": [
{
""EquipmentId"": ""EQ00001"",
""InquiryValue"": [
""IV00001""
],
""Timestamp"": ""2020-02-01 01:01:01.01"",
},
{
""EquipmentId"": ""EQ00002"",
""InquiryValue"": [
""IV00002""
],
""Timestamp"": ""2020-01-01 01:01:01.01""
}
]
}
}";
StringBuilder csv = new StringBuilder();
using (var r = ChoJSONReader.LoadText(json)
.WithJSONPath("$.GpsLocation.Equipment")
.WithField("EquipmentId")
.WithField("InquiryValue", jsonPath: "InquiryValue[0]", fieldType: typeof(string))
.WithField("Timestamp", fieldType: typeof(DateTime))
)
{
using (var w = new ChoCSVWriter(csv)
.WithFirstLineHeader())
w.Write(r);
}
Console.WriteLine(csv.ToString());
Output:
EquipmentId,InquiryValue,Timestamp
EQ00001,IV00001,2/1/2020 1:01:01 AM
EQ00002,IV00002,1/1/2020 1:01:01 AM
Sample fiddle: https://dotnetfiddle.net/hJWtqH
Your code is sound, but the issue is that you're only writing the first element of the array by using i.GpsLocation.Equipment[0]. Instead, try looping over everything by putting it into a for loop and changing the [0] to your iterating variable inside that loop.
How can I write a foreach loop over each record in my JSON file using Json.NET?
My JSON file looks something like this:
{
"transactions":[
{
"type":"deposit",
"account_id":123456789012345,
"amount":20000.0
},
{
"type":"deposit",
"account_id":555456789012345,
"amount":20000.0
},
{
"type":"payment",
"account_id":123456789012345,
"amount":20000.0
},
{
"type":"transfer",
"from":555456789012345,
"to":123456789012345,
"amount":20000.0
}
]
}
Update:
I want to read each record, deserialize that record inside the foreach loop, and copy "type", "account_id", etc. into separate strings.
I also want the loop to read the records one by one, not all of them at once.
Update 2:
I want code like this:
dynamic jsonObj = JsonConvert.DeserializeObject(file);
foreach (var obj in jsonObj)
{
}
I'm not really sure why this hasn't already come up, so maybe I'm misunderstanding, but you should be able to do something like this.
var fileJson = JToken.Parse(file);
foreach (var item in fileJson["transactions"])
{
// do stuff
}
That might have a missing cast in it, it's been a little while since I wrote any serialization code, but that should be the gist of it.
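As a cross-check of the same idea, here is a sketch that uses `System.Text.Json` instead of Json.NET (a deliberate swap: it ships with modern .NET and is a NuGet package on .NET Framework). `TransactionReader` and `Describe` are hypothetical names. It loops over the `transactions` array and handles the fact that `transfer` records carry `from`/`to` rather than `account_id`. Note that it parses the whole document first, so it does not address the "one at a time" streaming wish.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of any library.
public static class TransactionReader
{
    // Returns one short description per record in "transactions".
    public static List<string> Describe(string json)
    {
        var result = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement transactions = doc.RootElement.GetProperty("transactions");
            foreach (JsonElement tx in transactions.EnumerateArray())
            {
                string type = tx.GetProperty("type").GetString();
                decimal amount = tx.GetProperty("amount").GetDecimal();

                // Not every record has account_id, so probe for it.
                JsonElement account;
                string who = tx.TryGetProperty("account_id", out account)
                    ? account.GetInt64().ToString()
                    : tx.GetProperty("from").GetInt64() + "->" + tx.GetProperty("to").GetInt64();

                result.Add(type + " " + who + " " + amount);
            }
        }
        return result;
    }
}
```

With Json.NET the loop body would look much the same, using explicit casts such as `(string)item["type"]` on each `JToken`.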
I have a JSON string that I want to be able to amend in C#. I want to be able to delete a set of data when one of its child values matches a certain value.
Take the following
{
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"explainOther":"",
"fl":"*,score",
"indent":"on",
"start":"0",
"q":"*:*",
"hl.fl":"",
"qt":"",
"wt":"json",
"fq":"",
"version":"2.2",
"rows":"2"}
},
"response":{"numFound":2,"start":0,"maxScore":1.0,"docs":
[{
"id":"438500feb7714fbd9504a028883d2860",
"name":"John",
"dateTimeCreated":"2012-02-07T15:00:42Z",
"dateTimeUploaded":"2012-08-09T15:30:57Z",
"score":1.0
},
{
"id":"2f7661ae3c7a42dd9f2eb1946262cd24",
"name":"David",
"dateTimeCreated":"2012-02-07T15:02:37Z",
"dateTimeUploaded":"2012-08-09T15:45:06Z",
"score":1.0
}]
}}
There are two response results shown above. I want to be able to remove the whole parent response result group when its child "id" value is matched, for example if my id was "2f7661ae3c7a42dd9f2eb1946262cd24", I would want the second group to be deleted and thus my result would look as follows.
{
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"explainOther":"",
"fl":"*,score",
"indent":"on",
"start":"0",
"q":"*:*",
"hl.fl":"",
"qt":"",
"wt":"json",
"fq":"",
"version":"2.2",
"rows":"2"}},
"response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
{
"id":"438500feb7714fbd9504a028883d2860",
"name":"John",
"dateTimeCreated":"2012-02-07T15:00:42Z",
"dateTimeUploaded":"2012-08-09T15:30:57Z",
"score":1.0
}]
}}
I will need to perform multiple delete operations on the Json file. The Json file could contain thousands of results and I really need the most performant way possible.
Any help greatly appreciated.
I've been attempting to compress this into a nicer LINQ statement for the last 10 minutes or so, but the fact that the list of known Ids is inherently changing how each element is evaluated means that I'm probably not going to get that to happen.
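var jObj = (JObject)JsonConvert.DeserializeObject(json);
// IDs to treat as already seen. Seed this with the IDs you want removed;
// IDs encountered in the loop are added, so later duplicates are removed too.
var knownIds = new HashSet<string>();
var docsToRemove = new List<JToken>();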
foreach (var doc in jObj["response"]["docs"])
{
var id = (string)doc["id"];
if (knownIds.Contains(id))
{
docsToRemove.Add(doc);
}
else
{
knownIds.Add(id);
}
}
foreach (var doc in docsToRemove)
doc.Remove();
This seems to work well with the crappy little console app I spun up to test, but my testing was limited to the sample data above, so if there are any problems go ahead and leave a comment so I can fix them.
For what it's worth, this will basically run in linear time with respect to how many elements you feed it, which is likely all the algorithmic performance you're going to get without getting hilarious with this problem. Spinning each page of ~100 records off into its own task using the Task Parallel Library, invoking a worker that will handle its own little page and return the cleaned JSON string, comes to mind. That would certainly make this faster if you ran it on a multi-core machine, and I'd be happy to provide some code to get you started on that, but it's also huge overengineering for the scope of the problem as it's presented.
var jObj = (JObject)JsonConvert.DeserializeObject(json);
HashSet<string> idsToDelete = new HashSet<string>() { "2f7661ae3c7a42dd9f2eb1946262cd24" };
jObj["response"]["docs"]
.Where(x => idsToDelete.Contains((string)x["id"]))
.ToList()
.ForEach(doc=>doc.Remove());
var newJson = jObj.ToString();
None of the answers above worked for me. I had to remove the child from its parent (.Parent.Remove()), not just call Remove() on it. Working code example below:
namespace Engine.Api.Formatters
{
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Formatting;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using System.Web.Script.Serialization;
using System.Xml;
using System.Xml.Serialization;
public class ReducedJson
{
public dynamic WriteToStreamAsync(object value)
{
var json = new JavaScriptSerializer().Serialize(value);
var serializedJson = (JObject)JsonConvert.DeserializeObject(json);
foreach (var response in serializedJson["ProductData"]["Motor"]["QuoteResponses"])
{
response["NetCommResults"].Parent.Remove();
foreach (var netCommResult in response["BestPriceQuote"]["NetCommResults"])
{
netCommResult["Scores"].Parent.Remove();
}
}
return serializedJson;
}
}
}
Hope this saves you some time.
I just found another approach.
var aJson = JsonConvert.DeserializeObject<JObject>(json);
var doc = aJson["response"]["docs"];
JObject docs = new JObject();
docs["docs"] = doc;
// remove
docs.SelectTokens(string.Format("docs[?(@.id == '{0}')]", "2f7661ae3c7a42dd9f2eb1946262cd24")).ToList().ForEach(i => i.Remove());
// replace
aJson.SelectToken("response.docs").Replace(docs["docs"]);
What is the best way to convert json (or straightforward XML) to XML with namespaces (or with specific schema), without using strongly typed classes (C#)? (Using XSD, XSLT, template engine or other text based engine).
What is the most effective way (resources / performance)?
For example, to take the following object in json - string:
{
'item': {
'name': 'item #1',
'code': 'itm-123',
'image': {
'@url': 'http://www.foo.com/bar.jpg'
}
}
}
And convert it to:
<foo:item>
<foo:name>item #1</foo:name>
<foo:code>itm-123</foo:code>
<foo:image url="http://www.foo.com/bar.jpg"/>
</foo:item>
(The object can be more complex than the example above)
thanks
You could use Json.NET to do this.
Read this other post.
It shows the conversion in the other direction, but the approach should be pretty much the same.
With Cinchoo ETL, an open source library, this conversion can be done easily as below
string json = @"
{
'item': {
'name': 'item #1',
'code': 'itm-123',
'image': {
'@url': 'http://www.test.com/bar.jpg'
}
}
}";
StringBuilder xml = new StringBuilder();
using (var r = ChoJSONReader.LoadText(json))
{
using (var w = new ChoXmlWriter(xml)
.IgnoreRootName()
.IgnoreNodeName()
.WithDefaultXmlNamespace("foo", "http://temp.com")
)
{
w.Write(r);
}
}
Console.WriteLine(xml.ToString());
Output:
<foo:item xmlns:foo="http://temp.com">
<foo:name>item #1</foo:name>
<foo:code>itm-123</foo:code>
<foo:image url="http://www.test.com/bar.jpg" />
</foo:item>
Sample fiddle: https://dotnetfiddle.net/MITsuL
Disclaimer: I'm the author of this library.
What I have
I have templates that are stored in a database, and JSON data that gets converted into a dictionary in C#.
Example:
Template: "Hi {FirstName}"
Data: "{FirstName: 'Jack'}"
This works easily with one level of data by using a regular expression to pull out anything within {} in the template.
What I want
I would like to be able to go deeper in the JSON than the first layer.
Example:
Template: "Hi {Name: {First}}"
Data: "{Name: {First: 'Jack', Last: 'Smith'}}"
What approach should I be taking? (and some guidance on where to start with your pick)
A regular expression
Not use JSON in the template (in favor of xslt or something similar)
Something else
I'd also like to be able to loop through data in the template, but I have no idea at all where to start with that one!
Thanks heaps
You are in luck! SmartFormat does exactly as you describe. It is a lightweight, open-source string formatting utility.
It supports named placeholders:
var template = " {Name:{Last}, {First}} ";
var data = new { Name = new { First="Dwight", Last="Schrute" } };
var result = Smart.Format(template, data);
// Outputs: " Schrute, Dwight " SURPRISE!
It also supports list formatting:
var template = " {People:{}|, |, and} ";
var data = new { People = new[]{ "Dwight", "Michael", "Jim", "Pam" } };
var result = Smart.Format(template, data);
// Outputs: " Dwight, Michael, Jim, and Pam "
You can check out the unit tests for Named Placeholders and List Formatter to see plenty more examples!
It even has several forms of error-handling (ignore errors, output errors, throw errors).
Note: the named placeholder feature uses reflection and/or dictionary lookups, so you can deserialize the JSON into C# objects or nested Dictionaries, and it will work great!
Here is how I would do it:
Change your template to this format Hi {Name.First}
Now create a JavaScriptSerializer to convert the JSON into a Dictionary<string, object>
JavaScriptSerializer jss = new JavaScriptSerializer();
dynamic d = jss.Deserialize(data, typeof(object));
Now the variable d has the values of your JSON in a dictionary.
Having that you can run your template against a regex to replace {X.Y.Z.N} with the keys of the dictionary, recursively.
Full Example:
public void Test()
{
// Your template is simpler
string template = "Hi {Name.First}";
// some JSON
string data = @"{""Name"":{""First"":""Jack"",""Last"":""Smith""}}";
JavaScriptSerializer jss = new JavaScriptSerializer();
// now `d` contains all the values you need, in a dictionary
dynamic d = jss.Deserialize(data, typeof(object));
// running your template against a regex to
// extract the tokens that need to be replaced
var result = Regex.Replace(template, @"{?{([^}]+)}?}", (m) =>
{
// Skip escape values (ex: {{escaped value}} )
if (m.Value.StartsWith("{{"))
return m.Value;
// split the token by `.` to run against the dictionary
var pieces = m.Groups[1].Value.Split('.');
dynamic value = d;
// go after all the pieces, recursively going inside
// ex: "Name.First"
// Step 1 (value = value["Name"])
// value = new Dictionary<string, object>
// {
// { "First": "Jack" }, { "Last": "Smith" }
// };
// Step 2 (value = value["First"])
// value = "Jack"
foreach (var piece in pieces)
{
value = value[piece]; // go inside each time
}
return value;
});
}
I didn't handle exceptions (e.g. the value couldn't be found). You can handle this case by returning the matched text when the key isn't found: use m.Value for the raw token, or m.Groups[1].Value for the string between {}.
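The missing-key handling described above might be sketched like this. `TemplateFiller` and `Fill` are hypothetical names, and the data is assumed to already be nested `IDictionary<string, object>` instances (which is what `JavaScriptSerializer` produces for JSON objects). Unlike the full example, this version uses a simpler pattern and does not treat `{{...}}` as an escape.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class TemplateFiller
{
    // Replaces {A.B.C} tokens with values looked up in nested dictionaries.
    // Tokens that cannot be resolved are left in the output unchanged.
    public static string Fill(string template, IDictionary<string, object> data)
    {
        return Regex.Replace(template, @"\{([^{}]+)\}", m =>
        {
            object current = data;
            foreach (var piece in m.Groups[1].Value.Split('.'))
            {
                // Each step must land on a dictionary that has the key.
                var dict = current as IDictionary<string, object>;
                if (dict == null || !dict.TryGetValue(piece.Trim(), out current))
                {
                    return m.Value; // not found: keep the raw token
                }
            }
            return current == null ? "" : current.ToString();
        });
    }
}
```

Leaving unresolved tokens visible in the output tends to make template mistakes easier to spot than silently substituting an empty string, though either choice is defensible.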
Have you thought of using JavaScript as your scripting language? I had great success with Jint, although the startup cost is high. Another option is Jurassic, which I haven't used myself.
If you happen to have a web application, using Razor may be an idea, see here.
Using Regex or any sort of string parsing can certainly work for trivial things, but can get painful when you want logic or even just basic hierarchies. If you deserialize your JSON into nested Dictionaries, you can build a parser relatively easily:
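// Untested sketch, just to illustrate the concept.
// Assumes every intermediate value is itself a Dictionary<string, object>.
var parts = "parentObj.childObj.property".Split('.');
Dictionary<string, object> current = yourDeserializedObject;
foreach (var key in parts.Take(parts.Length - 1))
{
// descend one level; throws if the key is missing or not a dictionary
current = (Dictionary<string, object>)current[key];
}
var value = current[parts.Last()];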
Just whatever you do, don't do XSLT. Really, if XSLT is the answer then the question must have been really desperate :)
Why not use NVelocity or something similar?