How to search the content of a document attached to an Elasticsearch index - C#

I created the index in Elasticsearch as follows:
this.client.CreateIndex("documents", c => c.Mappings(mp => mp.Map<DocUpload>
(m => m.Properties(ps => ps.Attachment
(a => a.Name(o => o.Document)
.TitleField(t => t.Name(x => x.Title).TermVector(TermVectorOption.WithPositionsOffsets))
)))));
The attachment is base64 encoded before indexing. I am not able to search the content inside any of the documents. Does the base64 encoding cause a problem? Can anyone please help?
The mapping returned in the browser looks like this:
{
"documents": {
"aliases": {},
"mappings": {
"indexdocument": {
"properties": {
"document": {
"type": "attachment",
"fields": {
"content": {
"type": "string"
},
"author": {
"type": "string"
},
"title": {
"type": "string",
"term_vector": "with_positions_offsets"
},
"name": {
"type": "string"
},
"date": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"keywords": {
"type": "string"
},
"content_type": {
"type": "string"
},
"content_length": {
"type": "integer"
},
"language": {
"type": "string"
}
}
},
"documentType": {
"type": "string"
},
"id": {
"type": "long"
},
"lastModifiedDate": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"location": {
"type": "string"
},
"title": {
"type": "string"
}
}
}
},
"settings": {
"index": {
"creation_date": "1465193502636",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "5kCRvhmsQAGyndkswLhLrg",
"version": {
"created": "2030399"
}
}
},
"warmers": {}
}
}
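On the base64 question: as far as I know, the attachment type expects the file content as a base64 string, so the encoding itself should not be what breaks the search. Below is a minimal sketch of the encoding step, assuming the file bytes are already in memory. `AttachmentEncoding` is a hypothetical helper name for illustration, not part of NEST.

```csharp
using System;
using System.Text;

// In-memory bytes stand in for the contents of a real file.
byte[] original = Encoding.UTF8.GetBytes("Hello attachment");

string encoded = AttachmentEncoding.ToBase64(original);
byte[] decoded = AttachmentEncoding.FromBase64(encoded);

// The round trip returns the original text, so nothing is lost by encoding.
Console.WriteLine(encoded);
Console.WriteLine(Encoding.UTF8.GetString(decoded));

// Hypothetical helper (assumption, not a NEST type).
public static class AttachmentEncoding
{
    // Encode raw file bytes into the string stored in the attachment field.
    public static string ToBase64(byte[] fileBytes) => Convert.ToBase64String(fileBytes);

    // Decode the stored string back to the raw bytes.
    public static byte[] FromBase64(string encoded) => Convert.FromBase64String(encoded);
}
```

If the round trip works for your files, the encoding is fine and the analysis settings of the mapping are the next thing to check.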

I found the solution by adding an analyzer:
var fullNameFilters = new List<string> { "lowercase", "snowball" };
client.CreateIndex("mydocs", c => c
.Settings(st => st
.Analysis(anl => anl
.Analyzers(h => h
.Custom("full", ff => ff
.Filters(fullNameFilters)
.Tokenizer("standard"))
)
.TokenFilters(ba => ba
.Snowball("snowball", sn => sn
.Language(SnowballLanguage.English)))
))
.Mappings(mp => mp
.Map<IndexDocument>(ms => ms
.AutoMap()
.Properties(ps => ps
.Nested<Attachment>(n => n
.Name(sc => sc.File)
.AutoMap()
))
.Properties(at => at
.Attachment(a => a.Name(o => o.File)
.FileField(fl=>fl.Analyzer("full"))
.TitleField(t => t.Name(x => x.Title)
.Analyzer("full")
.TermVector(TermVectorOption.WithPositionsOffsets)
)))
))
);


Retrieving list of documents from collection by id in nested list

I have documents like this:
[
// 1
{
"_id": ObjectId("573f3944a75c951d4d6aa65e"),
"Source": "IGN",
"Family": [
{
"Countries": [
{
"uid": 17,
"name": "Japan",
}
]
}
]
},
// 2
{
"_id": ObjectId("573f3d41a75c951d4d6aa65f"),
"Source": "VG",
"Family": [
{
"Countries": [
{
"uid": 17,
"name": "USA"
}
]
}
]
},
// 3
{
"_id": ObjectId("573f4367a75c951d4d6aa660"),
"Source": "NRK",
"Family": [
{
"Countries": [
{
"uid": 17,
"name": "Germany"
}
]
}
]
},
// 4
{
"_id": ObjectId("573f4571a75c951d4d6aa661"),
"Source": "VG",
"Family": [
{
"Countries": [
{
"uid": 10,
"name": "France"
}
]
}
]
},
// 5
{
"_id": ObjectId("573f468da75c951d4d6aa662"),
"Source": "IGN",
"Family": [
{
"Countries": [
{
"uid": 14,
"name": "England"
}
]
}
]
}
]
I want to return only the documents where 'Countries.uid' equals 17,
so that the result is:
[
{
"_id": ObjectId("573f3944a75c951d4d6aa65e"),
"Source": "IGN",
"Family": [
{
"Countries": [
{
"uid": 17,
"name": "Japan",
}
]
}
]
},
{
"_id": ObjectId("573f3d41a75c951d4d6aa65f"),
"Source": "VG",
"Family": [
{
"Countries": [
{
"uid": 17,
"name": "USA"
}
]
}
]
},
{
"_id": ObjectId("573f4367a75c951d4d6aa660"),
"Source": "NRK",
"Family": [
{
"Countries": [
{
"uid": 17,
"name": "Germany"
}
]
}
]
}
]
How can I do this with the official C# MongoDB driver?
I tried this:
public List<Example> getLinkedCountry(string porduitId)
{
var filter = Builders<Example>.Filter.AnyIn("Family.Countries.uid", porduitId);
var cursor = await _certificats.FindAsync(filter);
var docs = cursor.ToList();
return docs;
}
Unfortunately, I think my filter is wrong.
Is there a way to find all the documents by accessing the nested list by id and retrieving it?
Solution 1
Use ElemMatch instead of AnyIn.
var filter = Builders<Example>.Filter.ElemMatch(
x => x.Family,
y => y.Countries.Any(z => z.uid == porduitId));
Solution 2
If you are not confident with the MongoDB .NET Driver syntax, you can build the query in MongoDB Compass and export it as a BsonDocument (the "Export to language" feature).
var filter = new BsonDocument("Family.Countries.uid", porduitId);
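To make it clear which documents these filters are meant to select, here is the same predicate run in memory with LINQ to Objects on the sample data from the question. This is only a sketch: it does not touch MongoDB, and the `Country`, `FamilyEntry` and `NestedFilter` names are assumptions, since the question does not show the real model classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The five sample documents from the question, reduced to the relevant fields.
var docs = new List<Example>
{
    NestedFilter.Doc("IGN", 17, "Japan"),
    NestedFilter.Doc("VG", 17, "USA"),
    NestedFilter.Doc("NRK", 17, "Germany"),
    NestedFilter.Doc("VG", 10, "France"),
    NestedFilter.Doc("IGN", 14, "England"),
};

// Prints IGN, VG, NRK (one per line).
foreach (var doc in NestedFilter.WithCountryUid(docs, 17))
    Console.WriteLine(doc.Source);

// Hypothetical model classes mirroring the document shape.
public record Country(int uid, string name);
public record FamilyEntry(List<Country> Countries);
public record Example(string Source, List<FamilyEntry> Family);

public static class NestedFilter
{
    // Builds a document with one Family entry holding one country.
    public static Example Doc(string source, int uid, string name) =>
        new Example(source, new List<FamilyEntry>
        {
            new FamilyEntry(new List<Country> { new Country(uid, name) })
        });

    // Keeps a document when any Family entry contains a country with the uid.
    public static List<Example> WithCountryUid(IEnumerable<Example> docs, int uid) =>
        docs.Where(x => x.Family.Any(y => y.Countries.Any(z => z.uid == uid)))
            .ToList();
}
```

Note that `uid` is an integer in the sample documents, so the value passed to the driver filter needs to be an int as well rather than a string.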
Just to expand on @Yong Shun's answer:
if you only want to return the matching nested documents rather than the whole document, you have a few options.
Using a projection
var filter = Builders<Example>.Filter.ElemMatch(
x => x.Family,
y => y.Countries.Any(z => z.uid == porduitId));
var project = Builders<Example>.Projection.ElemMatch(
    x => x.Family,
    y => y.Countries.Any(z => z.uid == porduitId)
);
var examples = await collection.Find(filter).Project<Example>(project).ToListAsync();
Using the aggregate pipeline
var filter = Builders<Example>.Filter.ElemMatch(
x => x.Family,
y => y.Countries.Any(z => z.uid == porduitId));
var project = Builders<Example>.Projection.Expression(
    // Flatten the countries of every Family entry, then keep the matching ones
    x => x.Family.SelectMany(y => y.Countries).Where(z => z.uid == porduitId)
);
var result = await collection
    .Aggregate()
    .Match(filter)
    .Project(project)
    .ToListAsync(); // Here result is a list of IEnumerable<Countries>

Elasticsearch filter group by using nest c#

I am using Elasticsearch to get products grouped by category and to run aggregations on the result.
If I use categoryid (numeric) as the field it returns a result, but when I try to use the category name the request fails with Unsuccessful (400).
Please see the code snippet below.
I am getting the document count. Can I get the document data from the same request?
ISearchResponse<Products> results;
results = _client.Search<Products>(s => s
//.Size(int.MaxValue)
.Query(q => q
.Bool(b => b
.Should(
bs => bs.Prefix(p => p.cat_name, "heli"),
bs => bs.Prefix(p => p.pr_name, "heli")
)
)
)
.Aggregations(a => a
.Terms("catname", t => t
.Field(f => f.categoryid)
.Size(int.MaxValue)
.Aggregations(agg => agg
.Max("maxprice", av => av.Field(f2 => f2.price))
.Average("avgprice", av => av.Field(f3 => f3.price))
.Max("maxweight", av => av.Field(f2 => f2.weight))
.Average("avgweight", av => av.Field(f3 => f3.weight))
)
)
)
);
Mapping:
{
"product_catalog": {
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"@version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"cat_name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"categoryid": {
"type": "long"
},
"createdon": {
"type": "date"
},
"fulldescription": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"height": {
"type": "float"
},
"length": {
"type": "float"
},
"pr_name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"price": {
"type": "long"
},
"productid": {
"type": "long"
},
"shortdescription": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"sku": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"updatedon": {
"type": "date"
},
"weight": {
"type": "float"
},
"width": {
"type": "float"
}
}
}
}
}
Can anyone guide me on how to use the category name for grouping?
The cat_name field is of type text, so you can't use it in aggregations by default because fielddata is disabled on text fields for performance reasons.
Based on your mapping, cat_name is also indexed as a keyword sub-field, so you can aggregate on that instead.
Just change this part of your terms aggregation, .Field(f => f.categoryid), to
.Field(f => f.cat_name.Suffix("keyword")) and you should be good.

Create dynamic pivot list using joins

I want to create a dynamic pivot list from a list that has data in the format below:
"products" :
{
"name": "ABC",
"Variance": [
{
"Date": "01-01-2018",
"Value": "10"
},
{
"Date": "02-01-2018",
"Value": "11"
},
{
"Date": "03-01-2018",
"Value": "12"
},
]
},
{
"name": "XYZ",
"Variance": [
{
"Date": "01-01-2018",
"Value": "22"
},
{
"Date": "03-01-2018",
"Value": "24"
},
{
"Date": "04-01-2018",
"Value": "28"
},
],
},
{
"name": "PQR",
"Variance": [
{
"Date": "01-01-2018",
"Value": "20"
},
{
"Date": "02-01-2018",
"Value": "22"
},
{
"Date": "04-01-2018",
"Value": "24"
},
],
}
I want to create a pivot list that returns data like this:
"NewProducts": [{
"Date": "01-01-2018",
"ABC": "10",
"XYZ": "22",
"PQR": "20"
},
{
"Date": "02-01-2018",
"ABC": "11",
"XYZ": "null",
"PQR": "22"
},
{
"Date": "03-01-2018",
"ABC": "12",
"XYZ": "24",
"PQR": "null"
},
{
"Date": "04-01-2018",
"ABC": "null",
"XYZ": "28",
"PQR": "24"
}]
I tried joining on the inner lists, but I am not getting the required results. I want to avoid loops because my product list varies with the user's selections.
I was able to join the lists using for loops, but I would like to keep the use of for loops to a minimum. Any suggestions would be really helpful.
Thanks in advance.
Assuming you want to use a Dictionary to hold the dynamic name/value pairs (Dictionary<string,int> if Value is an int, Dictionary<string,string> for the string values in your sample), you can use LINQ by first flattening the nested structure into a flat list with SelectMany and then grouping by Date. A product that has no entry for a date is simply absent from that date's dictionary rather than stored as null:
var ans = products.SelectMany(p => p.Variance.Select(v => new { p.name, v.Date, v.Value }))
.GroupBy(pv => pv.Date)
.Select(pvg => new { Date = pvg.Key, Fields = pvg.ToDictionary(p => p.name, p => p.Value) });
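If you need every product to appear on every date, with null where it has no entry (as in the expected output in the question), one way is to fill the gaps from the list of product names. The following is a sketch under assumptions: the `Product`, `VarianceEntry`, `PivotRow` and `Pivot` names are made up for illustration, and `Value` is kept as a string as in the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data copied from the question.
var products = new List<Product>
{
    new Product("ABC", new List<VarianceEntry>
    {
        new VarianceEntry("01-01-2018", "10"),
        new VarianceEntry("02-01-2018", "11"),
        new VarianceEntry("03-01-2018", "12"),
    }),
    new Product("XYZ", new List<VarianceEntry>
    {
        new VarianceEntry("01-01-2018", "22"),
        new VarianceEntry("03-01-2018", "24"),
        new VarianceEntry("04-01-2018", "28"),
    }),
    new Product("PQR", new List<VarianceEntry>
    {
        new VarianceEntry("01-01-2018", "20"),
        new VarianceEntry("02-01-2018", "22"),
        new VarianceEntry("04-01-2018", "24"),
    }),
};

foreach (var row in Pivot.ByDate(products))
{
    var cells = row.Fields.Select(kv => kv.Key + "=" + (kv.Value ?? "null"));
    Console.WriteLine(row.Date + ": " + string.Join(", ", cells));
}

// Hypothetical model types (assumptions, not from the question).
public record VarianceEntry(string Date, string Value);
public record Product(string name, List<VarianceEntry> Variance);
public record PivotRow(string Date, Dictionary<string, string> Fields);

public static class Pivot
{
    public static List<PivotRow> ByDate(IReadOnlyCollection<Product> products)
    {
        // Every product name becomes a column on every row.
        var names = products.Select(p => p.name).Distinct().ToList();

        return products
            // Flatten to (name, Date, Value) triples.
            .SelectMany(p => p.Variance.Select(v => new { p.name, v.Date, v.Value }))
            // One group per date, in order of first appearance.
            .GroupBy(pv => pv.Date)
            .Select(g =>
            {
                var present = g.ToDictionary(x => x.name, x => x.Value);
                // Fill in null for products with no entry on this date.
                var fields = names.ToDictionary(
                    n => n,
                    n => present.TryGetValue(n, out var value) ? value : null);
                return new PivotRow(g.Key, fields);
            })
            .ToList();
    }
}
```

ToDictionary throws if the same product has two entries for one date, so real data with duplicates would need a GroupBy or an explicit rule for picking one value first.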

Elasticsearch Dynamic Aggregations with NEST

Hi there, I have the following mapping for product in Elasticsearch.
I am trying to create aggregations from the name/value data in the product specifications. I think I need nested aggregations to achieve this, but I am struggling with the implementation.
"mappings": {
"product": {
"properties": {
"productSpecification": {
"properties": {
"productSpecificationId": {
"type": "long"
},
"specificationId": {
"type": "long"
},
"productId": {
"type": "long"
},
"name": {
"fielddata": true,
"type": "text"
},
"value": {
"fielddata": true,
"type": "text"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"value": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
},
"description": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"reviewRatingCount": {
"type": "integer"
},
"productId": {
"type": "integer"
},
"url": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"dispatchTimeInDays": {
"type": "integer"
},
"productCode": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
I have now changed the code as shown below and I am getting some success:
.Aggregations(a => a
.Terms("level1",t => t
.Field(f=> f.ProductSpecification.First().Name)
.Aggregations(snd => snd
.Terms("level2", f2 => f2.Field(f3 => f3.ProductSpecification.First().Value))
)))
Using this code I can now return the Name values:
var myagg = response.Aggs.Terms("level1");
if(response.Aggregations != null)
{
rtxAggs.Clear();
rtxAggs.AppendText(Environment.NewLine);
foreach(var bucket in myagg.Buckets)
{
rtxAggs.AppendText(bucket.Key);
}
}
What I can't figure out is how to then get the sub-aggregation values.
Right, after much experimenting and editing I've managed to get to the bottom of this.
First I changed productSpecification back to nested in the mapping, and then used the following in the aggregation:
.Aggregations(a => a
.Nested("specifications", n => n
.Path(p => p.ProductSpecification)
.Aggregations(aa => aa.Terms("groups", sp => sp.Field(p => p.ProductSpecification.Suffix("name"))
.Aggregations(aaa => aaa
.Terms("attribute", tt => tt.Field(ff => ff.ProductSpecification.Suffix("value"))))
)
)
)
)
Then I used the following to get the values:
var groups = response.Aggs.Nested("specifications").Terms("groups");
foreach(var bucket in groups.Buckets)
{
rtxAggs.AppendText(bucket.Key);
var values = bucket.Terms("attribute");
foreach(var valBucket in values.Buckets)
{
rtxAggs.AppendText(Environment.NewLine);
rtxAggs.AppendText(" " + valBucket.Key + "(" + valBucket.DocCount + ")");
}
rtxAggs.AppendText(Environment.NewLine);
}
All seems to be working fine, and hopefully this helps some people. On to my next challenge of boosting fields and filtering on these aggregations.
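For anyone unsure what shape the two-level terms aggregation produces, here is an in-memory analogy using LINQ GroupBy. It does not call Elasticsearch or NEST. The `Spec`, `ValueCount`, `NameGroup` and `SpecFacets` names and the sample data are all made up for illustration, and the counts here are per specification entry.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up specification entries standing in for the nested documents.
var specs = new List<Spec>
{
    new Spec("Colour", "Red"),
    new Spec("Colour", "Red"),
    new Spec("Colour", "Blue"),
    new Spec("Size", "Large"),
};

// Same loop structure as the bucket walk above: outer name, inner values.
foreach (var group in SpecFacets.Build(specs))
{
    Console.WriteLine(group.Name);
    foreach (var value in group.Values)
        Console.WriteLine(" " + value.Value + "(" + value.Count + ")");
}

// Hypothetical types (assumptions, not NEST types).
public record Spec(string Name, string Value);
public record ValueCount(string Value, int Count);
public record NameGroup(string Name, List<ValueCount> Values);

public static class SpecFacets
{
    // Outer grouping by name plays the role of the "groups" buckets,
    // inner grouping by value plays the role of the "attribute" buckets.
    public static List<NameGroup> Build(IEnumerable<Spec> specs) =>
        specs.GroupBy(s => s.Name)
             .Select(g => new NameGroup(
                 g.Key,
                 g.GroupBy(s => s.Value)
                  .Select(v => new ValueCount(v.Key, v.Count()))
                  .ToList()))
             .ToList();
}
```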

How to implement a custom sorter in Kendo using Razor

I'm trying to implement a custom compare in a Kendo grid so that numbers sort correctly alongside the text in the column.
The cshtml page has been written using the MVC wrapper with Razor markup:
@(Html.Kendo().Grid<dynamic>().Name("grid")
.Columns(a =>
{
a.Bound("colA").Width("6%");
a.Bound("colB").Width("14%");
a.Bound("colC").Title("numbers and text").Width("5%");
foreach (var issue in LookupHelper.GetFailures().Where(b => b.Source != "Other"))
a.Bound("Issue_" + issue.Id.ToString()).Title(issue.Description).Width("7%");
})
.DataSource(a => a.Ajax().Batch(true)
.Model(b => b.Id("colA"))
.PageSize(20)
.Sort(b => b.Add("colA").Ascending())
.ServerOperation(false)
)
.Events(a => a.Change("grid.change").DataBound("grid.change"))
.Pageable()
.Resizable(a => a.Columns(true))
.Selectable()
.Sortable(a => a.SortMode(GridSortMode.MultipleColumn))
.Filterable()
The Telerik page says it's not supported yet: http://www.telerik.com/forums/custom-sort-example-for-mvc-wrappers
So I'd like to take the generated markup and use string substitution to replace the column definition with one that includes the custom sort function.
Any ideas how to do this?
I've tried .ToHtmlString(), but then the grid doesn't render and only the string is displayed.
Thanks
OK, I solved this by rewriting the grid in JavaScript and still using Razor markup to generate the columns dynamically.
This helped: Mix Razor and Javascript code
<script type="text/javascript">jQuery(function () {
jQuery("#wims-grid-surveillance").kendoGrid({
change: wimsDashboard.changeSurveillance,
dataBound: wimsDashboard.changeSurveillance,
columns: [
{ title: "Well", "width": "5%", "field": "Well", "filterable": {}, "encoded": true },
{ title: "Type", "width": "5%", "field": "Type", "filterable": {}, "encoded": true },
{ title: "Pot.", "width": "3%", format: "{0:n0}","field": "Potential", "filterable": {}, "encoded": true },
{ title: "Status", "width": "4%",
"template": "\u003cdiv style=\u0027vertical-align: top;cursor: pointer;text-align: center;font-size: 300%;color: #=StatusFlag#\u0027 onclick=\u0027wimsPage.bf.openWindow(\u0022/eplant/dll/eplant.dll?Display&page_id=2121&WELL=#=Well#\u0022,\u0022#=Well#\u0022, \u0022/eplant/images/custom_images/WIMS-16x16.png\u0022,\u0022#=Well# - Well Integrity BF Display\u0022);\u0027 \u003e\u25CF\u003c/div\u003e",
"field": "Status",
"filterable": {
extra:false,
operators: {
string:{ contains: "Is"}
},
ui: function (el){
el.kendoDropDownList({
dataSource: [{value:"111",text:"Open"},{value:"0",text:"Shut"}],
dataTextField: "text",
dataValueField: "value",
optionLabel: "--Select Value--",
cell: {operator: "contains"}
});
}
},
"encoded": true,
sortable: {
compare: function (a, b) {
a = (a.Status.split("1").length - 1);
b = (b.Status.split("1").length - 1);
return a<b ? -1 : a==b ? 0 : 1;
}
} },
{ title: "Oper. Status", "width": "4%", "field": "OpStatus", "filterable": { extra:false,
operators: {
string:{ eq: "Is"}
},
ui: function (el){
el.kendoDropDownList({
dataSource: [{value:"Shut In",text:"Shut In"},{value:"Cont. Oper.",text:"Cont. Oper."}],
dataTextField: "text",
dataValueField: "value",
optionLabel: "--Select Value--",
cell: {eq: "Is"}
});
}}, "encoded": true },
{ title: "Active Case", "width": "8%", "field": "Case", "filterable": {}, "encoded": true },
{ title: "Sev.", "width": "3%", "field": "Severity", "filterable": {}, "encoded": true },
{ title: "Days to expiry",
attributes: { "class": "vline" },
width: "4%",
template: "#if (DaysToExpiry == '0') {# <div style='color: #=DaysToExpiryFlag#'>Expired</div> #} else if(DaysToExpiry==null) {##} else {##=DaysToExpiry##}#",
field: "DaysToExpiry",
filterable: {},
encoded: true
}
@foreach (var issue in LookupHelper.GetFailureLocations().Where(b => b.Source != "Other"))
{
<text>
,{ "title": "@issue.Description",
"attributes": { "class": "visible-wide" },
"width": "5%",
"template": "<div class='input-block-level' style='color:transparent; background-color: #if(Issue_@issue.Id == 5){##=dashboardFailureColour.text##}else if (Issue_@issue.Id == 4) {##=dashboardCategory1Colour.text##} else if (Issue_@issue.Id == 3) {##=dashboardCategory2Colour.text##} else if (Issue_@issue.Id == 2) {##=dashboardCategory3Colour.text##} else if (Issue_@issue.Id == 1) {##=dashboardNonApplicableColour.text##} else if (Issue_@issue.Id === 0) {##=dashboardInvalidAttributeColour.text##}else{#none#}#;'>#if(Issue_@issue.Id != null){##=Issue_@issue.Id##}#</div>",
"field": "Issue_@issue.Id",
"filterable": {
extra:false,
operators: {
string:{ eq: "Is"}
},
ui: function (el){
el.kendoDropDownList({
dataSource: [
{ 'value': 0, text:'Error' },
{ 'value': 1, text:'OK' },
{ 'value': 2, text:'Cat3' },
{ 'value': 3, text:'Cat2' },
{ 'value': 4, text:'Cat1' },
{ 'value': 5, text:'Fail' }
],
dataTextField: "text",
dataValueField: "value",
optionLabel: "--Select Value--",
cell: {operator: "eq"}
});
}
},
"encoded": true
}
</text>
}
],
"pageable": { "buttonCount": 10 },
"sortable": { "mode": "multiple" },
"selectable": "Single, Row",
"filterable": true,
"resizable": false,
"scrollable": false,
"dataSource": {
"transport": {
"prefix": "",
"read": {
"url": ""}
},
"pageSize": 20,
"page": 1,
"total": 0,
"type": "aspnetmvc-ajax",
"sort": [{ "field": "Well", "dir": "asc"}],
"schema": {
"data": "Data",
"total": "Total",
"errors": "Errors",
"model": { "id": "Well", "fields": {
"Severity":{"type":"number"},
"Potential":{"type":"number"},
"DaysToExpiry":{"type":"number"},
"Issue_1":{"type":"number"},
"Issue_2":{"type":"number"},
"Issue_3":{"type":"number"},
"Issue_4":{"type":"number"},
"Issue_5":{"type":"number"},
"Issue_6":{"type":"number"},
"Issue_7":{"type":"number"}
}
}
},
"batch": true
}
});
$.fx.off = true;
});
</script>
