Group, calculation and order of ElasticSearch data - c#

I have a lot of data stored in the following format (I simplified the data to explain the problem).
What I need is:
group all the data by "Action Id" field
calculate the difference between max and min values of "Created Time" for each group (from the previous action)
order the results by the calculated field ("Action duration" - difference between max and min)
I use NEST (C#) to query Elasticsearch. A native Elasticsearch query would also be very helpful; I can translate it to C# myself.
Thank you.

If your mapping looks like this:
PUT /index
{
  "mappings": {
    "doc": {
      "properties": {
        "ActionId": {
          "type": "text",
          "fielddata": true
        },
        "CreatedDate": {
          "type": "date"
        },
        "SubActionName": {
          "type": "text",
          "fielddata": true
        }
      }
    }
  }
}
Your Elasticsearch query should look like this:
GET index/_search
{
  "size": 0,
  "aggs": {
    "actions": {
      "terms": {
        "field": "ActionId"
      },
      "aggs": {
        "date_created": {
          "date_histogram": {
            "field": "CreatedDate",
            "interval": "hour"
          },
          "aggs": {
            "the_max": {
              "max": {
                "field": "CreatedDate"
              }
            },
            "the_min": {
              "min": {
                "field": "CreatedDate"
              }
            },
            "diff_max_min": {
              "bucket_script": {
                "buckets_path": {
                  "max": "the_max",
                  "min": "the_min"
                },
                "script": "params.max - params.min"
              }
            }
          }
        }
      }
    }
  }
}
You can read more about pipeline aggregations in the Elasticsearch documentation.
Hope that helps.
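The query above computes max minus min per hourly bucket inside each action and does not sort the result. If the goal is one duration per ActionId, ordered by that duration, a possible variation is to drop the date_histogram level and add a bucket_sort pipeline aggregation. The sketch below builds such a request body as a plain JavaScript object. It assumes an Elasticsearch version that supports bucket_sort (6.1 or later); the function name and the aggregation names "duration" and "by_duration" are made up for illustration. Note that bucket_sort only reorders the buckets the terms aggregation already returned, so "size" has to be large enough to cover all actions of interest.

```javascript
// Sketch only: builds an Elasticsearch request body as a plain object.
// Field names ("ActionId", "CreatedDate") follow the mapping above;
// the function name and the aggregation names are hypothetical.
function buildDurationQuery(maxGroups) {
  return {
    size: 0,
    aggs: {
      actions: {
        // One bucket per ActionId.
        terms: { field: "ActionId", size: maxGroups },
        aggs: {
          the_max: { max: { field: "CreatedDate" } },
          the_min: { min: { field: "CreatedDate" } },
          // Duration in milliseconds: max(CreatedDate) - min(CreatedDate).
          duration: {
            bucket_script: {
              buckets_path: { max: "the_max", min: "the_min" },
              script: "params.max - params.min"
            }
          },
          // Reorder the ActionId buckets by the computed duration.
          by_duration: {
            bucket_sort: {
              sort: [{ duration: { order: "desc" } }]
            }
          }
        }
      }
    }
  };
}

console.log(JSON.stringify(buildDurationQuery(100), null, 2));
```

The printed JSON can be pasted after GET index/_search to try it out before translating it to NEST.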

Related

Create line chart using QuickChart in C#

I want to create a chart with QuickChart from C#, but I can't work out how to send the data from a list. The X and Y values come from a query that I run, so they change each time. I need to use this library because it makes it easy to create simple graphs.
The X values have to be dates and the Y values are doubles.
{
"type": "line",
"data": {
"datasets": [
{
"label": "Dataset with string point data",
"backgroundColor": "rgba(255, 99, 132, 0.5)",
"borderColor": "rgb(255, 99, 132)",
"fill": false,
"data": [
{
"x": "2020-06-14T09:15:34-07:00",
"y": 75
},
{
"x": "2020-06-16T09:15:34-07:00",
"y": -53
},
{
"x": "2020-06-18T09:15:34-07:00",
"y": 31
},
{
"x": "2020-06-19T09:15:34-07:00",
"y": 6
}
]
}
]
},
"options": {
"responsive": true,
"title": {
"display": true,
"text": "Chart.js Time Point Data"
},
"scales": {
"xAxes": [{
"type": "time",
"display": true,
"scaleLabel": {
"display": true,
"labelString": "Date"
},
"ticks": {
"major": {
"enabled": true
}
}
}],
"yAxes": [{
"display": true,
"scaleLabel": {
"display": true,
"labelString": "value"
}
}]
}
}
}
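One way to approach this is to project the list into objects with "x" and "y" properties and serialize the whole configuration to JSON. The sketch below shows that shape in JavaScript; the point format ({ date, value }) and both function names are assumptions for illustration. It relies on QuickChart accepting the chart configuration in the "c" query parameter of https://quickchart.io/chart. The same projection can be written in C# by mapping the list to objects with x and y properties before serializing.

```javascript
// Sketch only: turns a list of { date, value } points into a Chart.js line
// configuration and a QuickChart URL. The point shape and the function names
// are hypothetical.
function buildLineChartConfig(points, label) {
  return {
    type: "line",
    data: {
      datasets: [{
        label: label,
        fill: false,
        borderColor: "rgb(255, 99, 132)",
        // X must be a date string and Y a number, as in the example above.
        data: points.map(p => ({ x: p.date.toISOString(), y: p.value }))
      }]
    },
    options: {
      scales: {
        xAxes: [{ type: "time" }]
      }
    }
  };
}

function buildQuickChartUrl(config) {
  // The configuration is passed URL-encoded in the "c" query parameter.
  return "https://quickchart.io/chart?c=" + encodeURIComponent(JSON.stringify(config));
}

const points = [
  { date: new Date("2020-06-14T16:15:34Z"), value: 75 },
  { date: new Date("2020-06-16T16:15:34Z"), value: -53 }
];
console.log(buildQuickChartUrl(buildLineChartConfig(points, "Values from query")));
```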

Projecting results from a grouping that also need to be summed

I am trying to get my head around a specific problem to decide whether to take the plunge in converting some personal projects to MongoDb after completing a basic course last week. What I am trying to achieve is a representation of my data based on grouping and then ultimately selecting specific parts of that group to create a new projection which shows my final result. In the code presently, we do the grouping and then do a sub-select to create the final dataset, I am hoping this can be done in a single hit.
Example document
{
"_id": {
"$oid": "600d88b0d7016d5675cd59bd"
},
"DeviceId": {
"$oid": "600d729764ea780882ac559b"
},
"UserId": {
"$oid": "600b660eff59aab915985b1d"
},
"Date": {
"$date": {
"$numberLong": "1611499696095"
}
},
"Records": [
{
"Count": {
"$numberInt": "10"
},
"Test1": {
"Inconclusive": null,
"Passed": true,
"Failed": null
},
"Test2": {
"Inconclusive": null,
"Passed": true,
"Failed": null
}
},
{
"Count": {
"$numberInt": "15"
},
"Test1": {
"Inconclusive": true,
"Passed": null,
"Failed": null
},
"Test2": {
"Inconclusive": null,
"Passed": true,
"Failed": null
}
},
{
"Count": {
"$numberInt": "15"
},
"Test1": {
"Inconclusive": true,
"Passed": null,
"Failed": null
},
"Test2": {
"Inconclusive": null,
"Passed": null,
"Failed": true
}
}
]
}
Ultimately, what I am trying to get is something as close to this as possible:
{
"DeviceId": "600d729764ea780882ac559b",
"Test1Inconclusive": 30,
"Test1Passed": 10,
"Test1Failed": 0,
"Test2Inconclusive": 0,
"Test2Passed": 25,
"Test2Failed": 15
}
So far, all I have managed to get is the data grouped and it is at this point in the existing code (Entity Framework/SQL server) that I would use Linq to pull out the SUM'd values.
[{
$match: {
UserId: ObjectId('600b660eff59aab915985b1d')
}
}, {
$unwind: {
path: '$Records'
}
}, {
$group: {
_id: {
DeviceId: '$DeviceId',
Test1Inconclusive: '$Records.Test1.Inconclusive',
Test1Passed: '$Records.Test1.Passed',
Test1Failed: '$Records.Test1.Failed',
Test2Inconclusive: '$Records.Test2.Inconclusive',
Test2Passed: '$Records.Test2.Passed',
Test2Failed: '$Records.Test2.Failed',
},
Count: {
$sum: '$Records.Count'
}
}
}, {}]
I am not sure if it is possible to do what I want, and if so, how to do the next projection step while performing a subselect of this grouped data. It might even be that my approach is flawed from the start, so feel free to change it completely.
Bonus internet points if you can also give me the MongoDb C# syntax for doing the same (on a MongoCollection)
Following on from the initial version by @turivishal, the answer below worked:
db.collection.aggregate([
{
$match: {
UserId: ObjectId("600b660eff59aab915985b1d")
}
},
{
$unwind: {
path: "$Records"
}
},
{
$group: {
_id: "$DeviceId",
Test1Inconclusive: {
$sum: {
$cond: [
{
$eq: [
"$Records.Test1.Inconclusive",
true
]
},
"$Records.Count",
0
]
}
},
Test1Passed: {
$sum: {
$cond: [
{
$eq: [
"$Records.Test1.Passed",
true
]
},
"$Records.Count",
0
]
}
},
Test1Failed: {
$sum: {
$cond: [
{
$eq: [
"$Records.Test1.Failed",
true
]
},
"$Records.Count",
0
]
}
},
Test2Inconclusive: {
$sum: {
$cond: [
{
$eq: [
"$Records.Test2.Inconclusive",
true
]
},
"$Records.Count",
0
]
}
},
Test2Passed: {
$sum: {
$cond: [
{
$eq: [
"$Records.Test2.Passed",
true
]
},
"$Records.Count",
0
]
}
},
Test2Failed: {
$sum: {
$cond: [
{
$eq: [
"$Records.Test2.Failed",
true
]
},
"$Records.Count",
0
]
}
},
Count: {
$sum: "$Records.Count"
}
}
}
])
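To check the conditional sums by hand, the same logic can be mirrored in plain JavaScript. The sketch below is not MongoDB code: it applies the rule used by each $cond (add Count only when the flag is exactly true) to the Records array of the example document. The function name is made up for illustration.

```javascript
// Sketch only: plain JavaScript that mirrors the $unwind + $group/$cond logic
// for a single device. The function name is hypothetical.
function sumByOutcome(records) {
  const totals = {
    Test1Inconclusive: 0, Test1Passed: 0, Test1Failed: 0,
    Test2Inconclusive: 0, Test2Passed: 0, Test2Failed: 0,
    Count: 0
  };
  for (const record of records) {
    for (const test of ["Test1", "Test2"]) {
      for (const outcome of ["Inconclusive", "Passed", "Failed"]) {
        // Same rule as $cond: add Count only when the flag is exactly true.
        if (record[test][outcome] === true) {
          totals[test + outcome] += record.Count;
        }
      }
    }
    totals.Count += record.Count;
  }
  return totals;
}

// The three records from the example document, with Count as plain numbers.
const records = [
  { Count: 10,
    Test1: { Inconclusive: null, Passed: true, Failed: null },
    Test2: { Inconclusive: null, Passed: true, Failed: null } },
  { Count: 15,
    Test1: { Inconclusive: true, Passed: null, Failed: null },
    Test2: { Inconclusive: null, Passed: true, Failed: null } },
  { Count: 15,
    Test1: { Inconclusive: true, Passed: null, Failed: null },
    Test2: { Inconclusive: null, Passed: null, Failed: true } }
];
console.log(sumByOutcome(records));
```

For the example document this yields the totals the question asks for (Test1Inconclusive 30, Test1Passed 10, Test2Passed 25, Test2Failed 15, the rest 0).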

MongoDb: Rename a property in a complex document

We have documents saving to MongoDb. The problem is that one of our sub-documents has an Id property that is getting returned as _id, which is causing serialize/deserialize issues with the C# driver due to how it interprets Id fields (see http://mongodb.github.io/mongo-csharp-driver/2.0/reference/bson/mapping/)
I would like to rename the property from Id to SetId, but our data is fairly dynamic and simple field rename solutions that I've seen elsewhere do not apply. Here's an example of some heavily edited simple data:
{
"Id": "5a6238dbccf20b38b0db6cf2",
"Title": "Simple Document",
"Layout": {
"Name": "Simple Document Layout",
"Tabs": [
{
"Name": "Tab1",
"Sections": [
{
"Name": "Tab1-Section1",
"Sets": [
{
"Id": 1
}
]
}
]
}
]
}
}
Compare with more complex data:
{
"Id": "5a6238dbccf20b38b0db6abc",
"Title": "Complex Document",
"Layout": {
"Name": "Complex Document Layout",
"Tabs": [
{
"Name": "Tab1",
"Sections": [
{
"Name": "Tab1-Section1",
"Sets": [
{
"Id": 1
}
]
},
{
"Name": "Tab1-Section2",
"Sets": [
{
"Id": 1
}
]
}
]
},
{
"Name": "Tab2",
"Sections": [
{
"Name": "Tab2-Section1",
"Sets": [
{
"Id": 1
}
]
}
]
},
{
"Name": "Tab3",
"Sections": [
{
"Name": "Tab3-Section1",
"Sets": [
{
"Id": 1
},
{
"Id": 2
}
]
}
]
}
]
}
}
Note that the Set.Id field can be on multiple tabs on multiple sections with multiple sets. I just don't know how to approach a query to handle renaming data at all these levels.
I took @Veerum's advice and did a manual iteration over the collection with something like this:
var myCol = db.getCollection('myCol');
// Visit every document that still has a Sets._id somewhere in its layout.
myCol.find({ "Layout.Tabs.Sections.Sets._id": {$exists: true} }).forEach(function(note) {
    for (var tab = 0; tab != note.Layout.Tabs.length; ++tab) {
        for (var section = 0; section != note.Layout.Tabs[tab].Sections.length; ++section) {
            for (var set = 0; set != note.Layout.Tabs[tab].Sections[section].Sets.length; ++set) {
                // Copy _id into SetId, then remove the old field.
                note.Layout.Tabs[tab].Sections[section].Sets[set].SetId = NumberInt(note.Layout.Tabs[tab].Sections[section].Sets[set]._id);
                delete note.Layout.Tabs[tab].Sections[section].Sets[set]._id;
            }
        }
    }
    // Write the modified document back.
    myCol.update({ _id: note._id }, note);
});
Perhaps there is a more efficient way, but we are still on Mongo v3.2 and it seems to work well.
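The rename itself can be tried outside the mongo shell. The sketch below is a pure function over a plain object with the same Layout/Tabs/Sections/Sets shape; the function name and the sample document are made up for illustration, and it skips the NumberInt conversion, which only exists in the shell.

```javascript
// Sketch only: the same rename as a pure function on a plain object.
// The function name and the sample document are hypothetical.
function renameSetIds(doc) {
  for (const tab of doc.Layout.Tabs) {
    for (const section of tab.Sections) {
      for (const set of section.Sets) {
        // Only touch sets that still carry the old field.
        if (Object.prototype.hasOwnProperty.call(set, "_id")) {
          set.SetId = set._id;
          delete set._id;
        }
      }
    }
  }
  return doc;
}

const sample = {
  Title: "Simple Document",
  Layout: {
    Tabs: [
      { Name: "Tab1", Sections: [{ Name: "Tab1-Section1", Sets: [{ _id: 1 }] }] },
      { Name: "Tab3", Sections: [{ Name: "Tab3-Section1", Sets: [{ _id: 1 }, { _id: 2 }] }] }
    ]
  }
};
console.log(JSON.stringify(renameSetIds(sample), null, 2));
```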

Elasticsearch Dynamic Aggregations with NEST

I have the following mapping for products in Elasticsearch.
I am trying to create aggregations from the name/value data in the product specifications. I think what I need is nested aggregations, but I'm struggling with the implementation.
"mappings": {
"product": {
"properties": {
"productSpecification": {
"properties": {
"productSpecificationId": {
"type": "long"
},
"specificationId": {
"type": "long"
},
"productId": {
"type": "long"
},
"name": {
"fielddata": true,
"type": "text"
},
"value": {
"fielddata": true,
"type": "text"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"value": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
},
"description": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"reviewRatingCount": {
"type": "integer"
},
"productId": {
"type": "integer"
},
"url": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"dispatchTimeInDays": {
"type": "integer"
},
"productCode": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
I have now changed the code to the following and am having some success:
.Aggregations(a => a
    .Terms("level1", t => t
        .Field(f => f.ProductSpecification.First().Name)
        .Aggregations(snd => snd
            .Terms("level2", f2 => f2.Field(f3 => f3.ProductSpecification.First().Value))
        )))
Using this code, I can now read back the Name values:
var myagg = response.Aggs.Terms("level1");
if (response.Aggregations != null)
{
    rtxAggs.Clear();
    rtxAggs.AppendText(Environment.NewLine);
    foreach (var bucket in myagg.Buckets)
    {
        rtxAggs.AppendText(bucket.Key);
    }
}
What I can't figure out is how to then get the sub-aggregation values.
After much experimenting and editing, I've managed to get to the bottom of this.
First, I changed productSpecification back to a nested type and then used the following in the aggregation:
.Aggregations(a => a
    .Nested("specifications", n => n
        .Path(p => p.ProductSpecification)
        .Aggregations(aa => aa.Terms("groups", sp => sp.Field(p => p.ProductSpecification.Suffix("name"))
            .Aggregations(aaa => aaa
                .Terms("attribute", tt => tt.Field(ff => ff.ProductSpecification.Suffix("value"))))
            )
        )
    )
)
Then used the following to get the values.
var groups = response.Aggs.Nested("specifications").Terms("groups");
foreach (var bucket in groups.Buckets)
{
    rtxAggs.AppendText(bucket.Key);
    var values = bucket.Terms("attribute");
    foreach (var valBucket in values.Buckets)
    {
        rtxAggs.AppendText(Environment.NewLine);
        rtxAggs.AppendText(" " + valBucket.Key + "(" + valBucket.DocCount + ")");
    }
    rtxAggs.AppendText(Environment.NewLine);
}
All seems to be working fine; hopefully this helps some people. On to my next challenge: boosting fields and filtering on said aggregations.
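For anyone reading the raw JSON instead of the NEST response object, the same walk over the buckets can be sketched in JavaScript. It assumes the aggregation names used above ("specifications", "groups", "attribute"); the function name and the sample response are invented for illustration.

```javascript
// Sketch only: walks a raw Elasticsearch response shaped like the nested
// aggregation above ("specifications" > "groups" > "attribute").
// The function name and the sample response are hypothetical.
function flattenSpecifications(response) {
  const lines = [];
  const groups = response.aggregations.specifications.groups.buckets;
  for (const group of groups) {
    // One line per specification name...
    lines.push(group.key);
    // ...followed by one indented line per value with its document count.
    for (const value of group.attribute.buckets) {
      lines.push("  " + value.key + " (" + value.doc_count + ")");
    }
  }
  return lines;
}

const sampleResponse = {
  aggregations: {
    specifications: {
      doc_count: 5,
      groups: {
        buckets: [
          { key: "colour", doc_count: 3,
            attribute: { buckets: [{ key: "red", doc_count: 2 }, { key: "blue", doc_count: 1 }] } },
          { key: "size", doc_count: 2,
            attribute: { buckets: [{ key: "large", doc_count: 2 }] } }
        ]
      }
    }
  }
};
console.log(flattenSpecifications(sampleResponse).join("\n"));
```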

Filter the aggregated results in sub aggregation using NEST

I have a list of products and would like to get the sold count of each product over two different time periods. I was able to write the Elasticsearch query as follows:
POST /Elastic/products/_search
{
"size": 0,
"query": {
"query_string": {
"query": "(date:[20130501 TO 20140430])"
}
},
"aggs": {
"last12months": {
"terms": {
"field": "product",
"order": {
"last3months": "desc"
},
"exclude": "NA",
"size": 20
},
"aggs": {
"last3months": {
"filter": {
"range": {
"date": {
"gte": "20140201",
"lte": "20140430"
}
}
}
}
}
}
}
}
The result looks as follows:
"aggregations": {
"last12months": {
"doc_count_error_upper_bound": -1,
"sum_other_doc_count": 938,
"buckets": [
{
"key": "xxxx",
"doc_count": 55,
"last3months": {
"doc_count": 41
}
},
{
"key": "yyyy",
"doc_count": 41,
"last3months": {
"doc_count": 14
}
}
]
}
}
I am using NEST (.NET) to generate the query from my application. I can generate most of it, but the sub-aggregation part is tricky: I am not able to apply the range inside the filter on the sub-aggregation.
ElasticSearchClient.Search<Product>(s => s.Size(0)
    .Query(query => query
        .Filtered(filtered => filtered
            .Filter(filter => filter
                .Range(range => range
                    .OnField(ElasticFields.JobDate)
                    .GreaterOrEquals("20140101")
                    .LowerOrEquals("20141231")))
            .Query(q => q
                .QueryString(qs => qs.Query(queryString)))))
    .Aggregations(aggregation => aggregation
        .Terms("last12months", t1 => t1
            .Field("product")
            .OrderDescending("last3months")
            .Size(25)
            // This is the part that does not work: the range has no field or bounds yet.
            .Aggregations(innerAggregation => innerAggregation.Filter("last3months", t2 => t2.Filters(t3 => t3.Range(range => range.LowerOrEquals("").GreaterOrEquals("")))))
        )));
How can I achieve this?
Any help is much appreciated.
