Multi Line JSON to Datatable - c#

I'm using the following from Newtonsoft to deserialize some JSON data into a DataTable (for the ultimate purpose of saving out to a spreadsheet, if it matters):
var dt = (DataTable)JsonConvert.DeserializeObject(returnData, (typeof(DataTable)));
While this works well enough, it has the problem that nested rows are lost. Below is example data of a similar format. In the ratings section only "Internet Movie Database" is saved, "Rotten Tomatoes" & "Metacritic" are lost in the conversion. Is there a deserialize method that would retain these? I'm willing to consider options that would split the results onto multiple rows OR concatenate the ratings section into a single field.
{
"Title": "Guardians of the Galaxy Vol. 2",
"Year": "2017",
"Rated": "PG-13",
"Released": "05 May 2017",
"Runtime": "136 min",
"Genre": "Action, Adventure, Comedy, Sci-Fi",
"Director": "James Gunn",
"Writer": "James Gunn, Dan Abnett (based on the Marvel comics by), Andy Lanning (based on the Marvel comics by), Steve Englehart (Star-Lord created by), Steve Gan (Star-Lord created by), Jim Starlin (Gamora and Drax created by), Stan Lee (Groot created by), Larry Lieber (Groot created by), Jack Kirby (Groot created by), Bill Mantlo (Rocket Raccoon created by), Keith Giffen (Rocket Raccoon created by), Steve Gerber (Howard the Duck created by), Val Mayerik (Howard the Duck created by)",
"Actors": "Chris Pratt, Zoe Saldana, Dave Bautista, Vin Diesel",
"Plot": "The Guardians struggle to keep together as a team while dealing with their personal family issues, notably Star-Lord's encounter with his father the ambitious celestial being Ego.",
"Language": "English",
"Country": "USA",
"Awards": "Nominated for 1 Oscar. Another 12 wins & 42 nominations.",
"Poster": "https://m.media-amazon.com/images/M/MV5BMTg2MzI1MTg3OF5BMl5BanBnXkFtZTgwNTU3NDA2MTI@._V1_SX300.jpg",
"Ratings": [{
"Source": "Internet Movie Database",
"Value": "7.7/10"
}, {
"Source": "Rotten Tomatoes",
"Value": "84%"
}, {
"Source": "Metacritic",
"Value": "67/100"
}
],
"Metascore": "67",
"imdbRating": "7.7",
"imdbVotes": "482,251",
"imdbID": "tt3896198",
"Type": "movie",
"DVD": "22 Aug 2017",
"BoxOffice": "$389,804,217",
"Production": "Walt Disney Pictures",
"Website": "https://marvel.com/guardians",
"Response": "True"
}
UPDATE
Thanks for the solutions, I'm going to try these when I get home. In the meantime, perhaps to be clearer (or maybe even more complicated), I'd settle for concatenating the Ratings section to a single delimited string/field. What would be ideal is something like below.

The DataTable type to which you're de-serializing is unable to handle the one-to-many relationship between the movie and its ratings.
Try de-serializing to a more specific type that better suits your JSON objects.
You can use json2csharp.com to make a C# class out of a JSON object.
Once you have your C# type, you can de-serialize to that and get the C# equivalent of your objects.
var obj = (RootObject)JsonConvert.DeserializeObject(returnData, (typeof(RootObject)));
or if your JSON data is an array of these objects:
var list = (RootObject[])JsonConvert.DeserializeObject(returnData, (typeof(RootObject[])));
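As a rough sketch of how the typed approach could also produce the single delimited Ratings field the question asks for: the `Rating` and `RootObject` classes below are hypothetical and model only three of the sample's properties, and the `"; "` delimiter and the `MovieTable` helper are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical classes mirroring only the parts of the sample JSON used here.
public class Rating
{
    public string Source { get; set; }
    public string Value { get; set; }
}

public class RootObject
{
    public string Title { get; set; }
    public string Year { get; set; }
    public List<Rating> Ratings { get; set; }
}

public static class MovieTable
{
    // Builds a one-row DataTable whose "Ratings" column holds a delimited string.
    public static DataTable FromJson(string returnData)
    {
        var movie = JsonConvert.DeserializeObject<RootObject>(returnData);

        var dt = new DataTable();
        dt.Columns.Add("Title", typeof(string));
        dt.Columns.Add("Year", typeof(string));
        dt.Columns.Add("Ratings", typeof(string));

        // Join every rating as "Source: Value", separated by "; " (an arbitrary choice).
        var ratings = string.Join("; ",
            (movie.Ratings ?? new List<Rating>()).Select(r => r.Source + ": " + r.Value));

        dt.Rows.Add(movie.Title, movie.Year, ratings);
        return dt;
    }
}
```

Calling `MovieTable.FromJson(returnData)` on the sample should give one row whose Ratings cell lists all three sources. The remaining properties would need matching class members and columns.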

This works if you don't want to declare a class:
var dict = JsonConvert.DeserializeObject<Dictionary<string, object>>(json);
string rating = Convert.ToString(dict["Ratings"]);
var dtScore = JsonConvert.DeserializeObject<DataTable>(rating);
string MetacriticScore = dtScore.Rows[2]["Value"].ToString();
And there is another simple way:
var jsonObj = JsonConvert.DeserializeObject<JObject>(json);
string MetacriticScore = Convert.ToString(jsonObj["Ratings"][2]["Value"]);
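If splitting the result onto multiple rows is preferred, the same class-free idea might be extended as below. This is only a sketch: the `RatingRows` helper and the `RatingSource` / `RatingValue` column names are made up for illustration, and it assumes the top-level property names do not clash with those two columns or with each other when compared case-insensitively.

```csharp
using System.Data;
using Newtonsoft.Json.Linq;

public static class RatingRows
{
    // Produces one row per rating, repeating the scalar movie fields on each row.
    public static DataTable FromJson(string json)
    {
        JObject movie = JObject.Parse(json);
        var dt = new DataTable();

        // One string column per top-level scalar property (arrays and objects are skipped).
        foreach (JProperty p in movie.Properties())
        {
            if (p.Value is JValue) dt.Columns.Add(p.Name, typeof(string));
        }
        dt.Columns.Add("RatingSource", typeof(string));
        dt.Columns.Add("RatingValue", typeof(string));

        var ratings = movie["Ratings"] as JArray ?? new JArray();
        foreach (JToken rating in ratings)
        {
            DataRow row = dt.NewRow();
            foreach (JProperty p in movie.Properties())
            {
                if (p.Value is JValue) row[p.Name] = p.Value.ToString();
            }
            // Missing "Source" or "Value" entries become empty strings.
            row["RatingSource"] = rating["Source"]?.ToString() ?? "";
            row["RatingValue"] = rating["Value"]?.ToString() ?? "";
            dt.Rows.Add(row);
        }
        return dt;
    }
}
```

Note that a movie with no Ratings array yields a table with columns but no rows, so that case may need separate handling.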


How are complex fields in Azure Search represented in a database?

When using Azure Cognitive Search, you can push complex fields to the index with JSON, like so (using a simplified version of the official Hotels example):
{
"HotelId": "1",
"HotelName": "Secret Point Hotel",
"Category": "Boutique",
"Tags": [ "view", "air conditioning", "concierge" ],
"Address": {
"StreetAddress": "677 5th Ave",
"City": "New York",
"StateProvince": "NY",
"PostalCode": "10022",
"Country": "USA"
},
"Rooms": [
{
"Description": "Budget Room, 1 Queen Bed (Cityside)",
"Description_fr": "Chambre Économique, 1 grand lit (côté ville)",
"Type": "Budget Room",
"BaseRate": 96.99,
"BedOptions": "1 Queen Bed",
"SleepsCount": 2,
"SmokingAllowed": true,
"Tags": [ "vcr/dvd" ]
},
{
"Description": "Budget Room, 1 King Bed (Mountain View)",
"Description_fr": "Chambre Économique, 1 très grand lit (Mountain View)",
"Type": "Budget Room",
"BaseRate": 80.99,
"BedOptions": "1 King Bed",
"SleepsCount": 2,
"SmokingAllowed": true,
"Tags": [ "vcr/dvd", "jacuzzi tub" ]
}
]
}
Notice that there are several complex field types here: Tags, Address, and Rooms. Since Rooms is the most difficult of the three, let's model out Rooms. The official SQL scripts do not include any sort of data creation for rooms, so I can't see how an indexer might find or read that data.
How do I need to represent my data (in the view that the indexer is reading from) in order to create a complex collection such as the Rooms example above?
If that's not possible, is it even possible to represent Address in the database in a way that it could be transferred to Azure Search in the schema below?
You will need to create a Data Source that is attached to a SQL view that contains all of your Hotel data. Your view should have columns (Tags, Address, Rooms) that contain embedded JSON which represent the complex types.
This is an example of creating a view with a 'Rooms' column that will contain data from the Rooms table:
CREATE VIEW [dbo].[HotelRooms]
AS
SELECT *, (SELECT *
FROM dbo.Rooms
WHERE dbo.Rooms.HotelID = dbo.Hotels.HotelID FOR JSON AUTO) AS Rooms
FROM dbo.Hotels
GO
This view will have a column Rooms which will be populated with a JSON array, e.g.: [{"Description": "Budget Room"},{"Description": "Another Room"}]
The columns listed in your view should match your index schema exactly. You will need to create a field on your index called Rooms of type Collection(Edm.ComplexType) which will contain child fields such as Description.
For more information please see here:
https://learn.microsoft.com/en-us/azure/search/index-sql-relational-data

How to modify the JSON obtained from serializing a DataSet using Json.Net for purposes of ESRI geocoding

How to introduce the "attributes" level into JSON text below? I'm using a C# dataset populated from SQL server with SerializeObject from Newtonsoft.json.
This is for submitting data to ESRI batch geocoder, as described here.
The format their REST service expects looks like this
{
"records": [
{
"attributes": {
"OBJECTID": 1,
"Address": "4550 Cobb Parkway North NW",
"City": "Acworth",
"Region": "GA"
}
},
{
"attributes": {
"OBJECTID": 2,
"Address": "2450 Old Milton Parkway",
"City": "Alpharetta",
"Region": "GA"
}
}
]
}
The format my C# script creates looks like this (missing the "attributes" level.)
{
"records": [
{
"OBJECTID": 1,
"address": "4550 Cobb Parkway North NW",
"city": "Acworth",
"state": "GA",
"zip": 30101.0
},
{
"OBJECTID": 2,
"address": "2450 Old Milton Parkway",
"city": "Alpharetta",
"state": "GA",
"zip": 30009.0
}
]
}
I've read through the Json.Net documentation and wonder if the JsonConverter class could be helpful. Candidly, I'm at a loss for how to resolve this. I'm a first-time user of Json.Net and a relative newbie with C#.
Here is the C# code used to this point:
SQLStatement = "select OBJECTID, Address, City, Region, Postal from MyAddresses";
SqlDataAdapter geoA = new SqlDataAdapter(SQLStatement, GEOconn);
DataSet GeoDS = new DataSet();
geoA.Fill(GeoDS, "records");
string geoAJSON = JsonConvert.SerializeObject(GeoDS);
Console.WriteLine("{0}", geoAJSON);
You can wrap your rows in another object with an "attributes" property using Json.Net's LINQ-to-JSON API.
In your code, replace this line:
string geoAJSON = JsonConvert.SerializeObject(GeoDS);
with this:
var obj = JObject.FromObject(GeoDS);
obj["records"] = new JArray(
obj["records"].Select(jo => new JObject(new JProperty("attributes", jo)))
);
string geoAJSON = obj.ToString();
Working demo here: https://dotnetfiddle.net/nryw27
Aside: based on your JSON it looks like you are storing postal codes in your database as decimals. Don't do that. They may look like numbers, but you should store them as strings. Postal codes in the US can have leading zeros, which will get dropped when you treat them as numbers. Some international postal codes can contain letters, so a numeric type won't even work in that case.
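To illustrate the leading-zero problem, here is a small sketch that round-trips a postal code through `decimal`, roughly as a numeric database column would. The `PostalCodeDemo` helper is hypothetical and exists only for this demonstration.

```csharp
using System.Globalization;

public static class PostalCodeDemo
{
    // Parses a postal code as a number and formats it back to text.
    public static string ThroughDecimal(string postalCode)
    {
        decimal asNumber = decimal.Parse(postalCode, CultureInfo.InvariantCulture);
        return asNumber.ToString(CultureInfo.InvariantCulture);
    }
}
```

`PostalCodeDemo.ThroughDecimal("30101")` returns "30101", but `PostalCodeDemo.ThroughDecimal("02134")` returns "2134", and a code containing letters would throw a `FormatException`.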

Deserializing and getting the value from completely dynamic JSON

I'm scraping the internet, so the JSON I'm analyzing will be completely different on each webpage. In a nutshell, though, I'm looking to find the author's name.
I'm currently using:
dynamic results = JsonConvert.DeserializeObject(value);
string Author = results.Author;
The problem is, two pages are never the same.
Below are examples from two different web pages. Each contains schema markup that I need to find and deserialize in order to get the author's name.
Example 1:
{
"@context": "https://schema.org",
"@type": "BookSeries",
"author": {
"@type": "Person",
"givenName": "Douglas",
"familyName": "Adams",
"additionalName": "Noel",
"birthDate": "1952-03-11",
"birthPlace": {
"@type": "Place",
"address": "Cambridge, Cambridgeshire, England"
}
}
}
Example 2:
{
"@context": "https://schema.org",
"@type": "WebPage",
"name": "Lecture 12: Graphs, networks, incidence matrices",
"author": "James Beckett",
"description": "These video lectures of Professor Gilbert Strang teaching 18.06 were recorded in Fall 1999 and do not correspond precisely to the current edition of the textbook.",
"publisher": {
"@type": "CollegeOrUniversity",
"name": "MIT OpenCourseWare"
},
"license": "http://creativecommons.org/licenses/by-nc-sa/3.0/us/deed.en_US"
}
Is there a way of being truly dynamic and finding such values within a JSON string, no matter how they're formatted? With static JSON it's very simple. With JSON like this, however, I have absolutely no clue, because you can't turn it into C# classes when the structure is always different.
Any help would be appreciated!
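One possible direction, sketched here under assumptions rather than offered as a definitive answer: Json.Net's `SelectTokens` accepts a JSONPath expression, and `$..author` matches every property named `author` at any depth. The `AuthorFinder` helper is hypothetical, the match is case-sensitive, and the fallback to a `name` property is a guess that neither example confirms.

```csharp
using System.Linq;
using Newtonsoft.Json.Linq;

public static class AuthorFinder
{
    // Reads a token as a string only when it is a simple string value.
    private static string AsString(JToken token)
    {
        return (token as JValue)?.Value as string;
    }

    // Returns the first author name found anywhere in the document, or null.
    public static string FindAuthor(string json)
    {
        JToken root = JToken.Parse(json);

        // "$..author" is JSONPath recursive descent over the whole document.
        foreach (JToken author in root.SelectTokens("$..author"))
        {
            // Case 1: the author is a plain string, as in Example 2.
            string direct = AsString(author);
            if (!string.IsNullOrEmpty(direct)) return direct;

            // Case 2: the author is an object, as in Example 1.
            if (author is JObject person)
            {
                // Assumption: some pages may use a single "name" property.
                string name = AsString(person["name"]);
                if (!string.IsNullOrEmpty(name)) return name;

                // Otherwise combine givenName and familyName when present.
                var parts = new[] { AsString(person["givenName"]), AsString(person["familyName"]) }
                    .Where(s => !string.IsNullOrEmpty(s));
                string joined = string.Join(" ", parts);
                if (joined.Length > 0) return joined;
            }
        }
        return null;
    }
}
```

With the two examples above this should yield "Douglas Adams" and "James Beckett". Pages that express the author in some other shape (an array of authors, for instance) would need further cases.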

Query items in a dictionary in a CosmosDB document

I have documents like this in my CosmosDB database:
{
"id": "12345",
"filename": "foo.txt",
"versions": {
"1": {
"storageAccount": "blob123",
"size": 33
},
"2": {
"storageAccount": "blob123",
"size": 42
}
}
}
(this is a simplified sample)
I need to query on the "storageAccount" property, to check if there are files stored on a given storage account. But I can't find a way to express "for each version".
I tried this, but of course it doesn't work
select top 1 *
from c
join v in c.versions
where v.storageAccount = 'blob123'
Apparently JOIN only works on arrays, not dictionaries. Is there a way to query items in a dictionary?
As a workaround, I can use a UDF, but the performance and cost are terrible (1200 RUs for just 2000 documents when there is no matching document...)
EDIT: updated to more closely reflect actual use case
Unfortunately, this isn't possible today. You cannot iterate over object keys in Cosmos's SQL.
I'd recommend changing the schema to something like:
{
"id": "12345",
"filename": "foo.txt",
"versions": [
{
"id": "1",
"storageAccount": "blob123",
"size": 33
},
{
"id": "2",
"storageAccount": "blob123",
"size": 42
"size": 42
}
]
}
Additionally, you could evaluate a User Defined Function which would return the keys of an object for you, but that will increase your RU costs, though possibly less than sprocs.
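If you do change the schema, existing documents would have to be rewritten. A minimal client-side sketch of that transformation with Json.Net might look like the following. The `VersionsMigration` helper is hypothetical, and reading the documents from Cosmos DB and writing them back is left out.

```csharp
using System.Linq;
using Newtonsoft.Json.Linq;

public static class VersionsMigration
{
    // Rewrites "versions": { "1": {...} } as "versions": [ { "id": "1", ... } ].
    // Works on a copy, so the input document is left unchanged.
    public static JObject ToArraySchema(JObject doc)
    {
        var result = (JObject)doc.DeepClone();
        if (result["versions"] is JObject versions)
        {
            var items = versions.Properties().Select(p =>
            {
                // The dictionary key becomes the "id" of the array item.
                var item = new JObject { ["id"] = p.Name };
                // Copy the version's own properties after the id.
                if (p.Value is JObject body) item.Merge(body);
                return item;
            }).ToList();
            result["versions"] = new JArray(items);
        }
        return result;
    }
}
```

Once the versions are stored as an array, the JOIN query from the question can iterate over them.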

Asp.Net Extracting data from deep within a dictionary<string,object>

Relative newbie here with a little question. I have been extracting a JSON string that looks like this (in this case it is a modified return from Facebook OAuth2):
{"id":"555555555555555","name":"Monkey
Man","last_name":"Man","first_name":"Monkey","email":"test\u0040someaccount.com","location":{"id":"555555555555555","name":"Jungle,
North
Carolina"},"gender":"male","work":[{"employer":{"id":"555555555555555","name":"Big
Boss makes me work"}:"projects":{"current":"doing stuff",
"previous":"other
stuff"},"location":{"id":"555555555555555","name":"Jungle, North
Carolina"},"position":{"id":"555555555555555","name":"IT
monkey"},"start_date":"2010-09"}],"picture":"http://profile.ak.fbcdn.net/static-ak/rsrc.php/v1/yo/r/5555555-555.gif"}
Well, I am able to extract everything to a dictionary by using the following code:
JavaScriptSerializer ser = new JavaScriptSerializer();
Dictionary<string, object> dict = ser.Deserialize<Dictionary<string,object>>(json);
I then extract the data as following from the dictionary and store them in an object called contact which is pretty much just a collection of strings.
if (d.ContainsKey("email"))
{
c.email = d["email"].ToString();
}
else
c.email = "";
I did it this way as I am not guaranteed that all of the information fields will be there.
If the value is itself an object, such as with the address, I use modified code (thanks to the guy who showed me how to do that) like the following:
c.location = (d["location"] as Dictionary<string, object>)["name"].ToString();
Now comes the difficult part that I am stuck on.
I am trying to extract the employer name "Big Boss makes me work" from the following part of the string...
"work":[{"employer":{"id":"555555555555555","name":"Big Boss makes me
work"}:"projects":{"current":"doing stuff", "previous":"other
stuff"},"location":{"id":"555555555555555","name":"Jungle, North
Carolina"},"position":{"id":"555555555555555","name":"IT
monkey"},"start_date":"2010-09"}]
It is storing the data down within an array inside of other objects and I have no idea how to get to the information to extract it, or even how to extract information like this from live oauth2...
"addresses": { "personal": { "street": null, "street_2": null, "city":
"Jungle", "state": "NC", "postal_code": "28677", "region": "United
States" }, "business": { "street": "Tree Street", "street_2": null,
"city": "Jungle", "state": "NC", "postal_code": "28677", "region":
"United States" } }
As you can see this goes three levels deep so my (d["location"] as Dictionary)["name"].ToString(); is pretty useless here. How would you go about getting say the street name from this?
I hope my questions aren't too vague or random. I just need some advice on properly extracting data from the dictionary objects. The ways I come up with involve editing the JSON string, and that causes all sorts of problems, as I just don't understand the dictionary object well enough to figure this out on my own.
Thanks
Running your JSON through jsonlint.com (and correcting it slightly), it looks like this formatted:
{
"id": "555555555555555",
"name": "Monkey Man",
"last_name": "Man",
"first_name": "Monkey",
"email": "test@someaccount.com",
"location": {
"id": "555555555555555",
"name": "Jungle, North Carolina"
},
"gender": "male",
"work": [
{
"employer": {
"id": "555555555555555",
"name": "Big Boss makes me work"
},
"projects": {
"current": "doing stuff",
"previous": "other stuff"
},
"location": {
"id": "555555555555555",
"name": "Jungle, North Carolina"
},
"position": {
"id": "555555555555555",
"name": "IT monkey"
},
"start_date": "2010-09"
}
],
"picture": "http://profile.ak.fbcdn.net/static-ak/rsrc.php/v1/yo/r/5555555-555.gif"
}
Your JSON data in this case isn't really suited to being deserialized into a plain Dictionary object, so that's not the way to go here.
The easier way is to create a C# class whose properties match those of the JavaScript object you're deserializing. Then deserialize the JSON as that class, and the "Big Boss makes me work" value should be at objectFromJson.work[0].employer.name.
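A minimal sketch of what such classes might look like is below. It swaps in Json.Net's `JsonConvert` instead of the `JavaScriptSerializer` from the question, and the class names (`NamedItem`, `WorkEntry`, `Profile`, `ProfileReader`) are made up for illustration. Json.Net matches property names case-insensitively by default, so the PascalCase properties bind to the lowercase JSON keys.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical shape shared by "employer", "location" and "position".
public class NamedItem
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class WorkEntry
{
    public NamedItem Employer { get; set; }
    public NamedItem Location { get; set; }
    public NamedItem Position { get; set; }

    [JsonProperty("start_date")]
    public string StartDate { get; set; }
}

public class Profile
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public NamedItem Location { get; set; }
    public List<WorkEntry> Work { get; set; }
}

public static class ProfileReader
{
    // Returns the first employer's name, or "" when that data is absent.
    public static string FirstEmployerName(string json)
    {
        var profile = JsonConvert.DeserializeObject<Profile>(json);

        // The fields are not guaranteed to be present, so guard each level.
        if (profile?.Work == null || profile.Work.Count == 0) return "";
        return profile.Work[0].Employer?.Name ?? "";
    }
}
```

Properties that are missing from the JSON are simply left null, which replaces the repeated `ContainsKey` checks with null checks on the typed object.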
