ElasticSearch Mapping With NEST 6.6.0 - c#

I'm new to Elasticsearch and have run into a problem while mapping my documents into an ES index.
My document structure is:
public class DocumentItem
{
public string Id { get; set; }
public DocumentType DocType { get; set; }
public Dictionary<string, string> Props { get; set; } = new Dictionary<string, string>();
}
And here's my mapping
var indexResponseFiles = dClient.CreateIndex(sedIndex, c => c
.InitializeUsing(indexConfig)
.Mappings(m => m
.Map<DocumentItem>(mp => mp.AutoMap()
)
));
As you can see, I'm trying to map a Dictionary type. The dictionary keys are different in every document.
My goal is to apply my custom analyzer to all text values of the dictionary, and I have no idea how to do this.

The dynamic templates feature will help you here. You can configure a dynamic template for all string fields under the props object, which will create a mapping with a given analyzer for each such field.
Here is an example that creates text fields with the english analyzer:
var createIndexResponse = await client.CreateIndexAsync("index_name",
c => c.Mappings(m => m
.Map<Document>(mm => mm.DynamicTemplates(dt => dt
.DynamicTemplate("props_fields", t => t
.PathMatch("props.*")
.MatchMappingType("string")
.Mapping(dm => dm.Text(text => text.Analyzer("english"))))))));
and here is the mapping after indexing the following document:
var document = new Document { Id = "1", Name = "name"};
document.Props.Add("field1", "value");
var indexDocument = await client.IndexDocumentAsync(document);
mapping
{
"index_name": {
"mappings": {
"document": {
"dynamic_templates": [
{
"props_fields": {
"path_match": "props.*",
"match_mapping_type": "string",
"mapping": {
"analyzer": "english",
"type": "text"
}
}
}
],
"properties": {
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"props": {
"properties": {
"field1": {
"type": "text",
"analyzer": "english"
}
}
}
}
}
}
}
}
Hope that helps.
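Since the question asks for a custom analyzer rather than english, here is a minimal sketch that combines the two, assuming NEST 6.x. The analyzer name my_custom_analyzer, its tokenizer and filter choice, and the IndexSetup helper are placeholders of mine, not something from the question, so swap in your own analysis settings.

```csharp
using System.Collections.Generic;
using Nest;

// Trimmed copy of the question's document class (DocType left out for brevity).
public class DocumentItem
{
    public string Id { get; set; }
    public Dictionary<string, string> Props { get; set; } = new Dictionary<string, string>();
}

public static class IndexSetup
{
    // Hypothetical helper: registers a custom analyzer in the index settings
    // and points the dynamic template for props.* at it.
    public static ICreateIndexRequest Describe(CreateIndexDescriptor c) => c
        .Settings(s => s
            .Analysis(a => a
                .Analyzers(an => an
                    // "my_custom_analyzer" and its components are placeholders.
                    .Custom("my_custom_analyzer", ca => ca
                        .Tokenizer("standard")
                        .Filters("lowercase", "asciifolding")))))
        .Mappings(m => m
            .Map<DocumentItem>(mp => mp
                .AutoMap()
                .DynamicTemplates(dt => dt
                    .DynamicTemplate("props_fields", t => t
                        .PathMatch("props.*")
                        .MatchMappingType("string")
                        .Mapping(dm => dm
                            .Text(tx => tx.Analyzer("my_custom_analyzer")))))));
}
```

It could then be passed as client.CreateIndex("index_name", IndexSetup.Describe). If you keep .InitializeUsing(indexConfig) from the question, the analyzer could instead be defined in that config, as long as the name used in the template matches.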

Related

NJsonSchema C# code generation creates generic objects instead of classes

I am having trouble understanding NJsonSchema's take on the following schema and the classes generated. I cannot change the schema as this is defined by someone else. The schema posted below is simplified to demonstrate the problem (link to the entire schema). I would like to create a set of C# classes that can output a valid file following the schema. However, the NJsonSchema code generator just fills in generic objects and does not populate the fields and properties described by the schema.
The generator code:
var schema = JsonSchema.FromFileAsync(@"../../../SchemaSnippets.json");
CSharpGeneratorSettings set = new CSharpGeneratorSettings();
set.Namespace = "EPJSON";
set.SchemaType = SchemaType.JsonSchema;
set.GenerateDataAnnotations = true;
set.AnyType = "object";
set.ClassStyle = CSharpClassStyle.Poco;
set.HandleReferences = false;
var generator = new CSharpGenerator(schema.Result, set);
var file = generator.GenerateFile();
File.WriteAllText(@"../../../SchemaSnippetsCode.cs", file);
The Schema:
{
"$schema": "https://json-schema.org/draft-07/schema#",
"properties": {
"Version": {
"patternProperties": {
".*": {
"type": "object",
"properties": {
"version_identifier": {
"type": "string",
"default": "22.2"
}
}
}
},
"group": "Simulation Parameters",
"legacy_idd": {
"field_info": {
"version_identifier": {
"field_name": "Version Identifier",
"field_type": "a"
}
},
"fields": [
"version_identifier"
],
"alphas": {
"fields": [
"version_identifier"
]
},
"numerics": {
"fields": []
}
},
"type": "object",
"maxProperties": 1,
"memo": "Specifies the EnergyPlus version of the IDF file.",
"format": "singleLine"
}
}
}
The generated class output looks like this:
[System.CodeDom.Compiler.GeneratedCode("NJsonSchema", "10.8.0.0 (Newtonsoft.Json v9.0.0.0)")]
[Newtonsoft.Json.JsonProperty("Version", Required = Newtonsoft.Json.Required.DisallowNull, NullValueHandling = Newtonsoft.Json.NullValueHandling.Ignore)]
public System.Collections.Generic.IDictionary<string, object> Version { get; set; }
private System.Collections.Generic.IDictionary<string, object> _additionalProperties;
[Newtonsoft.Json.JsonExtensionData]
public System.Collections.Generic.IDictionary<string, object> AdditionalProperties
{
get { return _additionalProperties ?? (_additionalProperties = new System.Collections.Generic.Dictionary<string, object>()); }
set { _additionalProperties = value; }
}
I was expecting/hoping for something along these lines:
public System.Collections.Generic.IDictionary<string, VersionObject> Version { get; set; }
class VersionObject
{
string version_identifier { get; set; } = "22.2";
}

DateRange search is not working in Elastic search NEST api

I have a table of log records and I want to run a simple search by date.
For example, I wanted to find all the records before 01.06.2019 00:00:00 (dd.MM.yyyy HH:mm:ss), so I wrote this query:
var query = client.Search<SearchEventDto>(s => s
.AllIndices()
.AllTypes()
.Query(q => q
.MatchAll() && +q
.DateRange(r =>r
.Field(f => f.timestamp)
.LessThanOrEquals(new DateTime(2019,06,01, 0, 0, 0))
)
)
);
My Dto looks like this:
public class SearchEventDto : IDto
{
[KendoColumn(Hidden = true, Editable = true)]
public string id { get; set; }
[KendoColumn(Order = 2, DisplayName = "Level")]
public string level { get; set; }
[KendoColumn(Order = 4, DisplayName = "Message")]
public string message { get; set; }
[KendoColumn(Hidden = true)]
public string host { get; set; }
[KendoColumn(Order = 3, DisplayName = "Source")]
public string src { get; set; }
[KendoColumn(Order = 1, DisplayName = "Timestamp", UIType = UIType.DateTime)]
public DateTime timestamp { get; set; }
[KendoColumn(Hidden = true)]
public DateTime time { get; set; }
}
Unfortunately, it is returning all the records without filtering anything.
Where am I going wrong in this?
Thanks in advance!
PS: ES version: 6.7.0, NEST: 6.8
PS: I have integrated the logs with NLog, so now every day it inserts a new index with the date as the name. Here is the mapping for 2019-06-28 (I am using the @timestamp field):
{
"logstash-2019-06-28": {
"mappings": {
"logevent": {
"properties": {
"@timestamp": {
"type": "date"
},
"host": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"level": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"src": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"time": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
I'll post what we figured out in the comments as an answer, as I think there are a couple of things that could be improved to increase performance and readability.
Solution:
The query from the question used .Field(f => f.timestamp), which NEST translated to the field name timestamp, not @timestamp. Simply changing it to .Field("@timestamp") resolves the problem, as this is the actual field name in the index mapping:
{
"logstash-2019-06-28": {
"mappings": {
"logevent": {
"properties": {
"@timestamp": {
"type": "date"
},
..
}
}
}
}
}
We could also mark the timestamp property with the PropertyName attribute to tell NEST to use @timestamp as the field name instead of timestamp:
public class SearchEventDto : IDto
{
[KendoColumn(Order = 1, DisplayName = "Timestamp", UIType = UIType.DateTime)]
[PropertyName("@timestamp")]
public DateTime timestamp { get; set; }
}
and query
var query = client.Search<SearchEventDto>(s => s
.AllIndices()
.AllTypes()
.Query(q => q
.MatchAll() && +q
.DateRange(r =>r
.Field(f => f.timestamp)
.LessThanOrEquals(new DateTime(2019,06,01, 0, 0, 0))
)
)
);
would work just as well.
Improvements:
Query only specific indices:
var query = client.Search<SearchEventDto>(s => s
.AllIndices()
.AllTypes()
..
By using AllIndices() we are telling Elasticsearch to gather documents from all of the indices. We can narrow this down to query only the indices that hold log data:
var query = client.Search<SearchEventDto>(s => s
.Index("logstash-*")
.Type("logevent")
..
Use filter context for date range filter:
.Query(q => q.Bool(b => b.Filter(f => f.DateRange(..))))
This way your query should be faster, because a clause in filter context does not calculate a relevance score. You can read more about it here.
Hope that helps.
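Putting the fix and both improvements together, a minimal sketch could look like the following, assuming NEST 6.x and that the date field in the index is named @timestamp. The LogEvent class and the LogQueries.Before helper are illustrative names of mine, not part of the question's code.

```csharp
using System;
using Nest;

// Illustrative DTO: only the fields needed for the example.
public class LogEvent
{
    // Maps the C# property to the "@timestamp" field in the index.
    [PropertyName("@timestamp")]
    public DateTime Timestamp { get; set; }

    public string Level { get; set; }
    public string Message { get; set; }
}

public static class LogQueries
{
    // Hypothetical helper: returns log events with @timestamp <= cutoff.
    public static ISearchResponse<LogEvent> Before(IElasticClient client, DateTime cutoff) =>
        client.Search<LogEvent>(s => s
            .Index("logstash-*")      // only the log indices, not every index
            .Type("logevent")
            .Query(q => q
                .Bool(b => b
                    // Filter context: no relevance scoring for the range clause.
                    .Filter(f => f
                        .DateRange(r => r
                            .Field(e => e.Timestamp)
                            .LessThanOrEquals(cutoff))))));
}
```

A call such as LogQueries.Before(client, new DateTime(2019, 6, 1)) would then stand in for the original query. The MatchAll() clause is dropped here because a bool query that contains only a filter already matches every document that passes the filter.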

How to deserialise JSON from HubSpot

I am having trouble deserializing JSON received from HubSpot ContactList API.
I am using RestSharp and Newtonsoft, and I'm really struggling to understand how to correctly define the classes required to deserialize the JSON string, which is below:
{
"contacts": [
{
"vid": 2251,
"portal-id": 5532227,
"is-contact": true,
"profile-url": "https://app.hubspot.com/contacts/5532227/contact/2251",
"properties": {
"firstname": {
"value": "Carl"
},
"lastmodifieddate": {
"value": "1554898386040"
},
"company": {
"value": "Cygnus Project"
},
"lastname": {
"value": "Swann"
}
},
"form-submissions": [],
"identity-profiles": [
{
"vid": 2251,
"saved-at-timestamp": 1553635648634,
"deleted-changed-timestamp": 0,
"identities": [
{
"type": "EMAIL",
"value": "cswann@cygnus.co.uk",
"timestamp": 1553635648591,
"is-primary": true
},
{
"type": "LEAD_GUID",
"value": "e2345",
"timestamp": 1553635648630
}
]
}
],
"merge-audits": []
},
{
"vid": 2301,
"portal-id": 5532227,
"is-contact": true,
"profile-url": "https://app.hubspot.com/contacts/5532227/contact/2301",
"properties": {
"firstname": {
"value": "Carlos"
},
"lastmodifieddate": {
"value": "1554886333954"
},
"company": {
"value": "Khaos Control"
},
"lastname": {
"value": "Swannington"
}
},
"identity-profiles": [
{
"vid": 2301,
"saved-at-timestamp": 1553635648733,
"deleted-changed-timestamp": 0,
"identities": [
{
"type": "EMAIL",
"value": "cswann@khaoscontrol.com",
"timestamp": 1553635648578,
"is-primary": true
},
{
"type": "LEAD_GUID",
"value": "c7f403ba",
"timestamp": 1553635648729
}
]
}
],
"merge-audits": []
}
],
"has-more": false,
"vid-offset": 2401
}
If I request only the vid, I correctly get two vids back. It's when I try to read the properties that it fails.
Please help.
Let's reduce the JSON to the minimum needed to reproduce your error:
{
"vid": 2301,
"portal-id": 5532227,
"is-contact": true,
"profile-url": "https://app.hubspot.com/contacts/5532227/contact/2301",
"properties": {
"firstname": {
"value": "Carlos"
},
"lastmodifieddate": {
"value": "1554886333954"
},
"company": {
"value": "Khaos Control"
},
"lastname": {
"value": "Swannington"
}
}
}
And the appropriate class ContactListAPI_Result:
public partial class ContactListAPI_Result
{
[JsonProperty("vid")]
public long Vid { get; set; }
[JsonProperty("portal-id")]
public long PortalId { get; set; }
[JsonProperty("is-contact")]
public bool IsContact { get; set; }
[JsonProperty("profile-url")]
public Uri ProfileUrl { get; set; }
[JsonProperty("properties")]
public Dictionary<string, Dictionary<string, string>> Properties { get; set; }
}
public partial class ContactListAPI_Result
{
public static ContactListAPI_Result FromJson(string json)
=> JsonConvert.DeserializeObject<ContactListAPI_Result>(json);
//public static ContactListAPI_Result FromJson(string json)
// => JsonConvert.DeserializeObject<ContactListAPI_Result>(json, Converter.Settings);
}
public static void toto()
{
string input = @" {
""vid"": 2301,
""portal-id"": 5532227,
""is-contact"": true,
""profile-url"": ""https://app.hubspot.com/contacts/5532227/contact/2301"",
""properties"": {
""firstname"": {
""value"": ""Carlos""
},
""lastmodifieddate"": {
""value"": ""1554886333954""
},
""company"": {
""value"": ""Khaos Control""
},
""lastname"": {
""value"": ""Swannington""
}
}
}";
var foo = ContactListAPI_Result.FromJson(input);
}
But the value of each property is buried in the sub-dictionary, so we can project the object into a more useful one:
public partial class ItemDTO
{
public long Vid { get; set; }
public long PortalId { get; set; }
public bool IsContact { get; set; }
public Uri ProfileUrl { get; set; }
public Dictionary<string, string> Properties { get; set; }
}
Adding the projection to the ContactListAPI_Result class:
public ItemDTO ToDTO()
{
return new ItemDTO
{
Vid = Vid,
PortalId = PortalId,
IsContact = IsContact,
ProfileUrl = ProfileUrl,
Properties =
Properties.ToDictionary(
p => p.Key,
p => p.Value["value"]
)
};
}
Usage :
var result = foo.ToDTO();
Live Demo
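The reduced example covers a single contact, while the full response wraps the contacts in an object with has-more and vid-offset. A minimal sketch of classes for that outer shape, assuming Newtonsoft.Json, is shown below. The names ContactListResponse, Contact, PropertyValue and HubSpotParser are my own, and only a few of the fields from the question's JSON are mapped.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

// Outer wrapper of the contact list response.
public class ContactListResponse
{
    [JsonProperty("contacts")]
    public List<Contact> Contacts { get; set; } = new List<Contact>();

    [JsonProperty("has-more")]
    public bool HasMore { get; set; }

    [JsonProperty("vid-offset")]
    public long VidOffset { get; set; }
}

public class Contact
{
    [JsonProperty("vid")]
    public long Vid { get; set; }

    // Keys are property names such as "firstname"; each value is an object like { "value": "Carl" }.
    [JsonProperty("properties")]
    public Dictionary<string, PropertyValue> Properties { get; set; } = new Dictionary<string, PropertyValue>();
}

public class PropertyValue
{
    [JsonProperty("value")]
    public string Value { get; set; }
}

public static class HubSpotParser
{
    public static ContactListResponse Parse(string json) =>
        JsonConvert.DeserializeObject<ContactListResponse>(json);

    // Prints every property of every contact, one per line.
    public static void Print(ContactListResponse response)
    {
        foreach (var contact in response.Contacts)
            foreach (var property in contact.Properties)
                Console.WriteLine($"{contact.Vid}: {property.Key} = {property.Value.Value}");
    }
}
```

Using a small PropertyValue class instead of Dictionary<string, string> is a design choice: with default settings Json.NET ignores JSON members that have no matching C# property, so extra members on a property object would not break deserialization.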
Creating and managing a class structure for big, nested key/value JSON is a tedious task,
so one approach is to use JToken instead.
You can simply parse your JSON to a JToken and query the parsed object to read the data you want, without creating a class structure for your JSON.
From your post it seems you need to retrieve vid and properties from your JSON, so try the code below:
string json = "Your json here";
JToken jToken = JToken.Parse(json);
var result = jToken["contacts"].ToObject<JArray>()
.Select(x => new
{
vid = Convert.ToInt32(x["vid"]),
properties = x["properties"].ToObject<Dictionary<string, JToken>>()
.Select(y => new
{
Key = y.Key,
Value = y.Value["value"].ToString()
}).ToList()
}).ToList();
//-----------Print the result to console------------
foreach (var item in result)
{
Console.WriteLine(item.vid);
foreach (var prop in item.properties)
{
Console.WriteLine(prop.Key + " - " + prop.Value);
}
Console.WriteLine();
}

C# NEST elasticsearch source filtering returning null for most of the fields

I am new to Elasticsearch with .NET using NEST.
I am trying to execute a simple match-all search and am interested in only a few properties. I am not able to get the values for almost all of the fields in the source; they all show as null.
The index already exists in elasticsearch.
I have a class representing the type.
public class DocType
{
public long CommunicationsDate { get; set; }
public string ControlNumber { get; set; }
public string Kind { get; set; }
public string PrimaryCommuncationsName { get; set; }
public float RiskScore { get; set; }
}
and my mapping is:
PUT relativity
{
"mappings": {
"doctype": {
"properties": {
"comms_date": {
"type": "date"
},
"control_number": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"kind": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"primary_comms_name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"risk_score": {
"type": "float"
}
}
}
}
}
The following query returns the hit count correctly, but the values are null except for the Kind property. I'm not sure what I am doing wrong here. Is this because the property names are different in the C# class, or something else?
return await _connection.Get<ElasticClient>().SearchAsync<DocType>(s => s
.Index("relativity")
.Type("DocType")
.Size(100)
.Source(sf => sf
.Includes(i => i
.Fields(
f => f.ControlNumber,
f => f.PrimaryCommuncationsName,
f => f.RiskScore,
f => f.Kind,
f => f.CommunicationsDate
)
)
)
);
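Property names need to match the field names in your ES index for NEST to map them correctly. By default NEST camel-cases C# property names, so Kind resolves to kind, which exists in the mapping, while ControlNumber resolves to controlNumber, which does not match control_number.
You can use attributes in your C# class to change the mapped names if you want different names on the C# side.
You can use fluent mapping too.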
https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/attribute-mapping.html
Hope it helps.
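As a minimal sketch of the attribute approach, assuming NEST 6.x and its PropertyName attribute, the class from the question could be annotated as follows. The field names are taken from the mapping in the question, and the C# types are left as they were.

```csharp
using Nest;

public class DocType
{
    // Type kept from the question; the field is mapped as "date" in the index.
    [PropertyName("comms_date")]
    public long CommunicationsDate { get; set; }

    [PropertyName("control_number")]
    public string ControlNumber { get; set; }

    // No attribute needed: the default camel-cased name "kind" already matches.
    public string Kind { get; set; }

    [PropertyName("primary_comms_name")]
    public string PrimaryCommuncationsName { get; set; }

    [PropertyName("risk_score")]
    public float RiskScore { get; set; }
}
```

With these attributes in place, expressions such as f => f.ControlNumber in the source filter should resolve to control_number, so the query itself would not need to change.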

Using C# Nest API to get nested json data is not retrieving data

I have the following JSON coming back from Elasticsearch:
{
"_index": "data-2016-01-14",
"_type": "type-data",
"_id": "AVJBBNG-TE8FYIA1rf1p",
"_score": 1,
"_source": {
"@message": {
"timestamp": 1452789770326461200,
"eventID": 1452789770326461200,
"eventName": "New",
"Price": "38.34",
"Qty": 100,
"statistic_LatencyValue_ns": 1142470,
"statistic_LatencyViolation": false,
"statistic_LossViolation": false
},
"@timestamp": "2016-01-14T16:42:50.326Z",
"@fields": {
"timestamp": "1452789770326"
}
},
"fields": {
"@timestamp": [
1452789770326
]
}
}
I'm using NEST to try to get the eventName data. I created the class and marked the property:
public class ElasticTest
{
[ElasticProperty(Type = FieldType.Nested)]
public string eventName { get; set; }
}
But the following query is returning 0 results. What am I doing wrong?
var result = client.Search<CorvilTest>(s => s
.From(0)
.Size(10000)
.Query(x => x
.Term(e => e.eventName,"New"))
);
var r = result.Documents;
Mapping definition:
{
"data-2016-01-14": {
"mappings": {
"type-data": {
"properties": {
"@fields": {
"properties": {
"timestamp": {
"type": "string"
}
}
},
"@message": {
"properties": {
"OrderQty": {
"type": "long"
},
"Price": {
"type": "string"
},
"eventID": {
"type": "long"
},
"eventName": {
"type": "string"
},
"statistic_LatencyValue_ns": {
"type": "long"
},
"statistic_LatencyViolation": {
"type": "boolean"
},
"statistic_LossViolation": {
"type": "boolean"
},
"timestamp": {
"type": "long"
}
}
},
"@timestamp": {
"type": "date",
"format": "dateOptionalTime"
}
}
}
}
}
}
I see that the field @message.eventName is using the standard analyzer, which means that its value is lower-cased and split at word boundaries before indexing. Hence the value "new" is indexed and not "New". Read more about it here. You need to be mindful of this when using a Term query, because a term query does not analyze its input. Another thing is that the field eventName is not of nested type. So the code below should work for you.
var result = client.Search<CorvilTest>(s => s
.From(0)
.Size(10000)
.Query(x => x
.Term(e => e.Message.EventName, "new"))); // Notice I've used "new" and not "New"
var r = result.Documents;
For the above code to work the definition of CorvilTest class should be something like below:
public class CorvilTest
{
[ElasticProperty(Name = "@message")]
public Message Message { get; set; }
/* Other properties if any */
}
public class Message
{
[ElasticProperty(Name = "eventName")]
public string EventName { get; set; }
}
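If you would rather pass the value as the user typed it, a match query is an alternative, because it analyzes the search text with the field's analyzer before matching. The following is only a sketch, assuming the NEST 1.x API implied by ElasticProperty (where match queries use OnField) and the classes defined above. The EventQueries helper and the page size of 100 are my own choices.

```csharp
using Nest;

public class CorvilTest
{
    [ElasticProperty(Name = "@message")]
    public Message Message { get; set; }
}

public class Message
{
    [ElasticProperty(Name = "eventName")]
    public string EventName { get; set; }
}

public static class EventQueries
{
    // Hypothetical helper: finds events by name using a match query,
    // so "New" is lower-cased by the analyzer before it is compared.
    public static ISearchResponse<CorvilTest> ByEventName(IElasticClient client, string name) =>
        client.Search<CorvilTest>(s => s
            .From(0)
            .Size(100)
            .Query(q => q
                .Match(m => m
                    .OnField(e => e.Message.EventName)
                    .Query(name))));
}
```

A call such as EventQueries.ByEventName(client, "New") would then be expected to find the documents that the term query for "New" missed.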
