How are complex fields in Azure Search represented in a database? - c#

When using Azure Cognitive Search, you can push complex fields to the index with JSON, like so (using a simplified version of the official Hotels example):
{
"HotelId": "1",
"HotelName": "Secret Point Hotel",
"Category": "Boutique",
"Tags": [ "view", "air conditioning", "concierge" ],
"Address": {
"StreetAddress": "677 5th Ave",
"City": "New York",
"StateProvince": "NY",
"PostalCode": "10022",
"Country": "USA"
},
"Rooms": [
{
"Description": "Budget Room, 1 Queen Bed (Cityside)",
"Description_fr": "Chambre Économique, 1 grand lit (côté ville)",
"Type": "Budget Room",
"BaseRate": 96.99,
"BedOptions": "1 Queen Bed",
"SleepsCount": 2,
"SmokingAllowed": true,
"Tags": [ "vcr/dvd" ]
},
{
"Description": "Budget Room, 1 King Bed (Mountain View)",
"Description_fr": "Chambre Économique, 1 très grand lit (Mountain View)",
"Type": "Budget Room",
"BaseRate": 80.99,
"BedOptions": "1 King Bed",
"SleepsCount": 2,
"SmokingAllowed": true,
"Tags": [ "vcr/dvd", "jacuzzi tub" ]
}
]
}
Notice that there are several complex field types here: Tags, Address, and Rooms. Since Rooms is the most difficult of the three, let's model Rooms. The official SQL scripts do not include any data creation for rooms, so I can't see how an indexer might find or read that data.
How do I need to represent my data (in the view that the indexer is reading from) in order to create a complex collection such as the Rooms example above?
If that's not possible, is it even possible to represent Address in the database in a way that it could be transferred to Azure Search in the schema below?

You will need to create a Data Source that is attached to a SQL view that contains all of your Hotel data. Your view should have columns (Tags, Address, Rooms) that contain embedded JSON which represent the complex types.
This is an example of creating a view with a 'Rooms' column that will contain data from the Rooms table:
CREATE VIEW [dbo].[HotelRooms]
AS
SELECT *, (SELECT *
FROM dbo.Rooms
WHERE dbo.Rooms.HotelID = dbo.Hotels.HotelID FOR JSON AUTO) AS Rooms
FROM dbo.Hotels
GO
This view will have a column Rooms, which will be populated with a JSON array, e.g.: [{"Description": "Budget Room"},{"Description": "Another Room"}]
The columns listed in your view should match your index schema exactly. You will need to create a field on your index called Rooms of type Collection(Edm.ComplexType), which will contain child fields such as Description.
For more information please see here:
https://learn.microsoft.com/en-us/azure/search/index-sql-relational-data
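Before wiring up the indexer, it can help to confirm that a Rooms cell from the view really is a JSON array whose keys line up with the child fields you plan to define. The following is only a sketch: the Room class and the RoomsColumn helper are hypothetical names, only a few child fields are shown, and it uses System.Text.Json, which has nothing to do with Azure Search itself.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical child-field shape; property names mirror the JSON keys
// that the view's Rooms column is expected to emit.
public class Room
{
    public string Description { get; set; } = "";
    public string Type { get; set; } = "";
    public double BaseRate { get; set; }
}

public static class RoomsColumn
{
    // Parses the text of one Rooms cell. Throws JsonException when the
    // text is not valid JSON or is not an array of objects.
    public static List<Room> Parse(string columnValue)
    {
        return JsonSerializer.Deserialize<List<Room>>(columnValue) ?? new List<Room>();
    }
}
```

Calling RoomsColumn.Parse on the sample value from the answer should give two Room objects; keys missing from a row simply leave the property at its default value.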

Related

Designing models in WebAPI with respect to MongoDB

I'm implementing a model as follow:
There is an entity called ROBOT. Any ROBOT may have multiple parameters, and any parameter may have settings for a telephone number with some options. Here is an example of this model:
{
"Robot": "Test",
"Parameters": [
{
"Name": "Charge",
"Handling": [
{
"Min": 4,
"Max": 10,
"Telephone": "1111111111",
"Text": "MyTexT"
},
{
"Min": 6,
"Max": 11,
"Telephone": "222222222222",
"Text": "Another Text"
}
]
}
]
}
Could you please help me design a model for this instance with respect to WebAPI and MongoDB?
I think you have to start from the inside and work outward. For example, first create a model for the Handling property as a class consisting of Min, Max, and so on. Then create a class for Parameter that exposes Handling as a property. When that is done, move to the outer level and add Parameters as a property of the main model.
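The inside-out approach described above could look roughly like this. It is a minimal sketch, not a definitive design: the class names and the Robot.Name property are assumptions based on the sample document, and it presumes Handling and Parameters are lists. A MongoDB identifier member would normally be added to the outer class as well; it is left out here to keep the sketch free of driver dependencies.

```csharp
using System.Collections.Generic;

// Innermost level: one handling rule for a parameter.
public class Handling
{
    public int Min { get; set; }
    public int Max { get; set; }
    public string Telephone { get; set; } = "";
    public string Text { get; set; } = "";
}

// Middle level: a named parameter that owns its handling rules.
public class Parameter
{
    public string Name { get; set; } = "";
    public List<Handling> Handling { get; set; } = new List<Handling>();
}

// Outer level: the robot owns its parameters.
// "Name" holds the value shown under "Robot" in the sample (an assumption).
public class Robot
{
    public string Name { get; set; } = "";
    public List<Parameter> Parameters { get; set; } = new List<Parameter>();
}
```

The same classes can serve as the WebAPI request/response model and as the stored document shape, which keeps the nesting in one place.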

Multi Line JSON to Datatable

I'm using the following from Newtonsoft to deserialize some JSON data into a datatable (for the ultimate purpose of saving out to a spreadsheet if it matters);
var dt = (DataTable)JsonConvert.DeserializeObject(returnData, (typeof(DataTable)));
While this works well enough, it has the problem that nested rows are lost. Below is example data of a similar format. In the ratings section only "Internet Movie Database" is saved, "Rotten Tomatoes" & "Metacritic" are lost in the conversion. Is there a deserialize method that would retain these? I'm willing to consider options that would split the results onto multiple rows OR concatenate the ratings section into a single field.
{
"Title": "Guardians of the Galaxy Vol. 2",
"Year": "2017",
"Rated": "PG-13",
"Released": "05 May 2017",
"Runtime": "136 min",
"Genre": "Action, Adventure, Comedy, Sci-Fi",
"Director": "James Gunn",
"Writer": "James Gunn, Dan Abnett (based on the Marvel comics by), Andy Lanning (based on the Marvel comics by), Steve Englehart (Star-Lord created by), Steve Gan (Star-Lord created by), Jim Starlin (Gamora and Drax created by), Stan Lee (Groot created by), Larry Lieber (Groot created by), Jack Kirby (Groot created by), Bill Mantlo (Rocket Raccoon created by), Keith Giffen (Rocket Raccoon created by), Steve Gerber (Howard the Duck created by), Val Mayerik (Howard the Duck created by)",
"Actors": "Chris Pratt, Zoe Saldana, Dave Bautista, Vin Diesel",
"Plot": "The Guardians struggle to keep together as a team while dealing with their personal family issues, notably Star-Lord's encounter with his father the ambitious celestial being Ego.",
"Language": "English",
"Country": "USA",
"Awards": "Nominated for 1 Oscar. Another 12 wins & 42 nominations.",
"Poster": "https://m.media-amazon.com/images/M/MV5BMTg2MzI1MTg3OF5BMl5BanBnXkFtZTgwNTU3NDA2MTI#._V1_SX300.jpg",
"Ratings": [{
"Source": "Internet Movie Database",
"Value": "7.7/10"
}, {
"Source": "Rotten Tomatoes",
"Value": "84%"
}, {
"Source": "Metacritic",
"Value": "67/100"
}
],
"Metascore": "67",
"imdbRating": "7.7",
"imdbVotes": "482,251",
"imdbID": "tt3896198",
"Type": "movie",
"DVD": "22 Aug 2017",
"BoxOffice": "$389,804,217",
"Production": "Walt Disney Pictures",
"Website": "https://marvel.com/guardians",
"Response": "True"
}
UPDATE
Thanks for the solutions, I'm going to try these when I get home. In the meantime, perhaps to be clearer (or maybe even more complicated), I'd settle for concatenating the Ratings section to a single delimited string/field. What would be ideal is something like below.
The DataTable type to which you're de-serializing is unable to handle the one-to-many relationship between the movie and its ratings.
Try de-serializing to a more specific type that better suits your JSON objects.
You can use json2csharp.com to make a C# class out of a JSON object.
Once you have your C# type, you can de-serialize to that and get the C# equivalent of your objects.
var obj = (RootObject)JsonConvert.DeserializeObject(returnData, (typeof(RootObject)));
or if your JSON data is an array of these objects:
var list = (RootObject[])JsonConvert.DeserializeObject(returnData, (typeof(RootObject[])));
If you don't want to declare a class, this also works:
var dict = JsonConvert.DeserializeObject<Dictionary<string, object>>(json);
string rating = Convert.ToString(dict["Ratings"]);
var dtScore = JsonConvert.DeserializeObject<DataTable>(rating);
string MetacriticScore = dtScore.Rows[2]["Value"].ToString();
There is another simple way, using JObject:
var jsonObj = JsonConvert.DeserializeObject<JObject>(json);
string MetacriticScore = Convert.ToString(jsonObj["Ratings"][2]["Value"]);
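For the case mentioned in the update, where the Ratings section should end up as a single delimited field, something along these lines may be enough. This is a sketch under assumptions: RatingsFlattener is a hypothetical helper, the "Source: Value" layout and the "; " delimiter are arbitrary choices, and it uses System.Text.Json rather than the Newtonsoft calls shown in the answers. It also assumes the JSON root is a single movie object.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class RatingsFlattener
{
    // Joins every entry of the "Ratings" array into one string such as
    // "Source A: value; Source B: value". Returns "" when Ratings is absent.
    public static string Flatten(string json, string delimiter = "; ")
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var parts = new List<string>();

        if (doc.RootElement.TryGetProperty("Ratings", out JsonElement ratings)
            && ratings.ValueKind == JsonValueKind.Array)
        {
            foreach (JsonElement rating in ratings.EnumerateArray())
            {
                string source = rating.GetProperty("Source").GetString() ?? "";
                string value = rating.GetProperty("Value").GetString() ?? "";
                parts.Add(source + ": " + value);
            }
        }

        return string.Join(delimiter, parts);
    }
}
```

The resulting string can be written to an extra column of the row that the DataTable conversion produces, so no rating is dropped.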

Query items in a dictionary in a CosmosDB document

I have documents like this in my CosmosDB database:
{
"id": "12345",
"filename": "foo.txt",
"versions": {
"1": {
"storageAccount": "blob123",
"size": 33
},
"2": {
"storageAccount": "blob123",
"size": 42
}
}
}
(this is a simplified sample)
I need to query on the "storageAccount" property, to check if there are files stored on a given storage account. But I can't find a way to express "for each version".
I tried this, but of course it doesn't work
select top 1 *
from c
join v in c.versions
where v.storageAccount = 'blob123'
Apparently JOIN only works on arrays, not dictionaries. Is there a way to query items in a dictionary?
As a workaround, I can use a UDF, but the performance and cost are terrible (1200 RUs for just 2000 documents when there is no matching document...)
EDIT: updated to more closely reflect actual use case
Unfortunately, this isn't possible today. You cannot iterate over object keys in Cosmos's SQL.
I'd recommend changing the schema to something like:
{
"id": "12345",
"filename": "foo.txt",
"versions": [
{
"id": "1"
"storageAccount": "blob123",
"size": 33
},
{
"id": "2"
"storageAccount": "blob123",
"size": 42
}
]
}
Additionally, you could evaluate a User Defined Function which would return the keys of an object for you, but that will increase your RU costs, though possibly less than sprocs.
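If the documents are written from C#, the suggested schema change amounts to turning the dictionary into a list and moving each key into an id property. A minimal sketch follows, assuming hypothetical VersionInfo, VersionEntry, and VersionReshaper types that mirror the simplified sample; it does not touch the Cosmos DB SDK.

```csharp
using System.Collections.Generic;
using System.Linq;

// Value stored under each key of the original "versions" dictionary.
public class VersionInfo
{
    public string StorageAccount { get; set; } = "";
    public int Size { get; set; }
}

// Array element in the reshaped document; the old dictionary key becomes Id.
public class VersionEntry
{
    public string Id { get; set; } = "";
    public string StorageAccount { get; set; } = "";
    public int Size { get; set; }
}

public static class VersionReshaper
{
    // Converts { "1": {...}, "2": {...} } into [ { Id: "1", ... }, { Id: "2", ... } ].
    public static List<VersionEntry> ToArrayShape(IDictionary<string, VersionInfo> versions)
    {
        return versions
            .Select(kv => new VersionEntry
            {
                Id = kv.Key,
                StorageAccount = kv.Value.StorageAccount,
                Size = kv.Value.Size
            })
            .ToList();
    }
}
```

Once versions is stored as an array, a JOIN over c.versions like the one in the question should be able to iterate it, since JOIN works on arrays.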

Asp.Net Extracting data from deep within a dictionary<string,object>-

Relatively new here, with a small question. I have been extracting a JSON string that looks like this (in this case it is a modified return from Facebook OAuth2):
{"id":"555555555555555","name":"Monkey
Man","last_name":"Man","first_name":"Monkey","email":"test\u0040someaccount.com","location":{"id":"555555555555555","name":"Jungle,
North
Carolina"},"gender":"male","work":[{"employer":{"id":"555555555555555","name":"Big
Boss makes me work"}:"projects":{"current":"doing stuff",
"previous":"other
stuff"},"location":{"id":"555555555555555","name":"Jungle, North
Carolina"},"position":{"id":"555555555555555","name":"IT
monkey"},"start_date":"2010-09"}],"picture":"http://profile.ak.fbcdn.net/static-ak/rsrc.php/v1/yo/r/5555555-555.gif"}
I am able to extract everything into a dictionary by using the following code:
JavaScriptSerializer ser = new JavaScriptSerializer();
Dictionary<string, object> dict = ser.Deserialize<Dictionary<string,object>>(json);
I then extract the data as following from the dictionary and store them in an object called contact which is pretty much just a collection of strings.
if (d.ContainsKey("email"))
{
c.email = d["email"].ToString();
}
else
c.email = "";
I did it this way because I was not guaranteed that all of the information fields would be there.
If the value is itself an object, as with the address, I use modified code (thanks to the person who showed me how to do that) like the following.
c.location = (d["location"] as Dictionary<string, object>)["name"].ToString();
Now comes the difficult part that I am stuck on.
I am trying to extract the employer name "Big Boss makes me work" from the following part of the string...
"work":[{"employer":{"id":"555555555555555","name":"Big Boss makes me
work"}:"projects":{"current":"doing stuff", "previous":"other
stuff"},"location":{"id":"555555555555555","name":"Jungle, North
Carolina"},"position":{"id":"555555555555555","name":"IT
monkey"},"start_date":"2010-09"}]
It is storing the data down within an array inside of other objects and I have no idea how to get to the information to extract it, or even how to extract information like this from live oauth2...
"addresses": { "personal": { "street": null, "street_2": null, "city":
"Jungle", "state": "NC", "postal_code": "28677", "region": "United
States" }, "business": { "street": "Tree Street", "street_2": null,
"city": "Jungle", "state": "NC", "postal_code": "28677", "region":
"United States" } }
As you can see, this goes three levels deep, so my (d["location"] as Dictionary<string, object>)["name"].ToString(); is pretty useless here. How would you go about getting, say, the street name from this?
I hope my questions aren't too vague or random. I just need some advice on properly extracting data from the dictionary objects. The ways I have come up with involve editing the JSON string, and that causes all sorts of problems, as I just don't understand the dictionary object well enough to figure this out on my own.
Thanks
Running your JSON through jsonlint.com (and correcting it slightly), it looks like this formatted:
{
"id": "555555555555555",
"name": "Monkey Man",
"last_name": "Man",
"first_name": "Monkey",
"email": "test#someaccount.com",
"location": {
"id": "555555555555555",
"name": "Jungle, North Carolina"
},
"gender": "male",
"work": [
{
"employer": {
"id": "555555555555555",
"name": "Big Boss makes me work"
},
"projects": {
"current": "doing stuff",
"previous": "other stuff"
},
"location": {
"id": "555555555555555",
"name": "Jungle, North Carolina"
},
"position": {
"id": "555555555555555",
"name": "IT monkey"
},
"start_date": "2010-09"
}
],
"picture": "http://profile.ak.fbcdn.net/static-ak/rsrc.php/v1/yo/r/5555555-555.gif"
}
Your JSON data in this case isn't really suitable for deserializing to a straightforward Dictionary object, so that's not the way to go here.
The easier way is to create a C# class whose properties match those of the JavaScript object you're deserializing. Then deserialize the JSON as that object, and the "Big Boss makes me work" value should be at objectFromJson.work[0].employer.name.
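As a rough illustration of the typed approach, the classes below cover only the part of the document needed to reach the employer name. The class names and the ProfileReader helper are hypothetical, and the sketch uses System.Text.Json with case-insensitive property matching instead of the JavaScriptSerializer from the question, so treat it as one possible shape rather than the method the answer had in mind.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// An object with "id" and "name", as used for employer, location and position.
public class NamedItem
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public class WorkEntry
{
    public NamedItem Employer { get; set; } = new NamedItem();
}

public class Profile
{
    public string Email { get; set; } = "";
    public List<WorkEntry> Work { get; set; } = new List<WorkEntry>();
}

public static class ProfileReader
{
    // The JSON keys are lower case, so match property names case-insensitively.
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    // Returns the first employer name, or "" when the field is missing.
    public static string FirstEmployerName(string json)
    {
        Profile profile = JsonSerializer.Deserialize<Profile>(json, Options) ?? new Profile();
        WorkEntry first = profile.Work != null && profile.Work.Count > 0 ? profile.Work[0] : null;
        return first?.Employer?.Name ?? "";
    }
}
```

Fields that are absent from the JSON keep their defaults, which covers the "not guaranteed to be there" concern without a ContainsKey check per field. Keys that the classes do not declare are ignored.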

Couchbase View equivalent of Select count(doc.id) from doc where doc.productId in (select productId from doc where doc.lastupdated between x and y)

I'm trying to count the number of 'comments' related to a product in a Couchbase bucket. That part is easy for a "full" set of data; it's just a simple map/reduce. Things get tricky when I want to limit it to only products that have had changes within a date range. I can do this as two different views in Couchbase: one that gets the product IDs where dateCreated falls within my range, and one that I pass these IDs to and that calculates my stats. The performance of this approach is horrible, though. The keys for the second query aren't necessarily contiguous, so I can't do a start/end on them. I'm using the .NET 2.2 client for Couchbase 4.x.
I'm open to any options, i.e. a super-awesome-do-it-all-in-one-call view, or following the two-view approach if the client has some capacity for bulk gets against non-contiguous keys in a view (I can't find anything on this topic).
Here's my simplified example schema:
{
"comment": {
"key": "key1",
"title": "yay",
"productId": "product1",
"dateCreated": "2016,11,30"
},
"comment": {
"key": "key2",
"title": "booo",
"productId": "product1",
"dateCreated": "2016,12,30"
}
}
Not sure if this is what you want (also not sure about how to translate this to C#), but say you have two documents with ids comment::1 and comment::2 and a Couchbase document for each in this format.
{
"key": "key2",
"title": "booo",
"productId": "product1",
"dateCreated": "2016,12,30"
}
You can define a view (let's call it comments_by_time)
Map
function (doc, meta) {
if (doc.dateCreated) {
var dateParts = doc.dateCreated.split(",");
dateParts = dateParts.map(Number);
emit(dateParts, doc.productId);
}
}
Reduce
_count
Then, you can use the View Query API to do a startKey and endKey range on your documents.
End point
http://<couchbase>:8092/<bucket>/_design/<view>/_view/comments_by_time
Get count of all comments
?reduce=true
{"rows":[ {"key":null,"value":2} ] }
Get documents before a date
?reduce=false&endkey=[2016,12,1]
{"total_rows":2,"rows":[
{"id":"comment::1","key":[2016,11,30],"value":"product1"}
]
}
Between dates
?reduce=false&startkey=[2016,12,1]&endkey=[2017,1,1]
{"total_rows":2,"rows":[
{"id":"comment::2","key":[2016,12,30],"value":"product1"}
]
}
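Since the answer leaves the C# side open, here is one way the range query URL above might be assembled from .NET dates. It is only a sketch: ViewQuery and its parameter names are hypothetical, the host, bucket, and design document are whatever your cluster uses, and it builds the REST URL by hand rather than going through the Couchbase .NET client.

```csharp
using System;

public static class ViewQuery
{
    // Formats a date the way the map function emits keys: [year,month,day].
    public static string ToKey(DateTime date)
    {
        return $"[{date.Year},{date.Month},{date.Day}]";
    }

    // Builds the comments_by_time range URL with URL-encoded start and end keys.
    public static string BuildRangeUrl(string host, string bucket, string designDoc,
                                       DateTime start, DateTime end)
    {
        string startKey = Uri.EscapeDataString(ToKey(start));
        string endKey = Uri.EscapeDataString(ToKey(end));
        return $"http://{host}:8092/{bucket}/_design/{designDoc}/_view/comments_by_time"
             + $"?reduce=false&startkey={startKey}&endkey={endKey}";
    }
}
```

The keys are percent-encoded because brackets and commas are not safe to rely on unescaped in a query string; the server decodes them back to the array form used in the examples.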
