Elasticsearch NEST 5.x Koelnerphonetic not matching - c#

UPDATE
I changed the approach of the question
I'm trying to apply phonetic search using the Kölner Phonetik encoder; an ngram analyzer is also configured on the index.
Index configuration I'm using:
{
"testnew": {
"settings": {
"index": {
"number_of_shards": "5",
"provided_name": "testnew",
"creation_date": "1489672932033",
"analysis": {
"filter": {
"koelnerPhonetik": {
"replace": "false",
"type": "phonetic",
"encoder": "koelnerphonetik"
}
},
"analyzer": {
"koelnerPhonetik": {
"type": "custom",
"tokenizer": "koelnerPhonetik"
},
"ngram_analyzer": {
"type": "custom",
"tokenizer": "ngram_tokenizer"
}
},
"tokenizer": {
"koelnerPhonetik": {
"type": "standard"
},
"ngram_tokenizer": {
"token_chars": [
"letter",
"digit",
"punctuation",
"symbol"
],
"min_gram": "2",
"type": "ngram",
"max_gram": "20"
}
}
},
...
}
}
}
}
}
I've got one document that looks like this:
{
"_index": "testnew",
"_type": "person",
"_id": "3",
"_score": 1,
"_source": {
"name": "Can",
"fields": {
"phonetic": {
"type": "string",
"analyzer": "koelnerPhonetik"
}
}
}
It is mapped like this:
GET testnew/person/_mapping
"name": {
"type": "text",
"analyzer": "koelnerPhonetik"
}
Why can't I find 'Can' by searching for 'Kan' with this query?
GET testnew/person/_search
{
"query": {
"match": {
"name.phonetic": {
"query": "Kan"
}
}
}
}
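One thing that may be worth checking in the settings above: the phonetic token filter is defined, but neither analyzer lists it under "filter", so the koelnerPhonetik analyzer would only run the standard tokenizer. The query also targets name.phonetic, while the mapping shown only declares name. The Python sketch below is a local check on the settings JSON, not part of NEST or Elasticsearch; the helper name unreferenced_token_filters is hypothetical.

```python
def unreferenced_token_filters(analysis):
    """Return names of token filters that no analyzer lists under "filter"."""
    defined = set(analysis.get("filter", {}))
    used = set()
    for analyzer in analysis.get("analyzer", {}).values():
        used.update(analyzer.get("filter", []))
    return sorted(defined - used)


# Analysis section copied from the question (ngram tokenizer details omitted).
analysis = {
    "filter": {
        "koelnerPhonetik": {
            "replace": "false",
            "type": "phonetic",
            "encoder": "koelnerphonetik",
        }
    },
    "analyzer": {
        "koelnerPhonetik": {"type": "custom", "tokenizer": "koelnerPhonetik"},
        "ngram_analyzer": {"type": "custom", "tokenizer": "ngram_tokenizer"},
    },
    "tokenizer": {"koelnerPhonetik": {"type": "standard"}},
}

# Lists filters that are defined but never applied by any analyzer.
print(unreferenced_token_filters(analysis))
```

If the filter shows up in this list, adding it to the analyzer's "filter" array and reindexing is the likely next step, assuming the phonetic analysis plugin is installed.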

Related

Elasticsearch NEST, edge n-gram with fuzziness

I am using Elasticsearch.Net and NEST v7.10.0.
I have these settings and mappings for elastic search.
{
"settings": {
"index": {
"analysis": {
"filter": {},
"analyzer": {
"keyword_analyzer": {
"filter": [
"lowercase",
"asciifolding",
"trim"
],
"char_filter": [],
"type": "custom",
"tokenizer": "keyword"
},
"edge_ngram_analyzer": {
"filter": [
"lowercase"
],
"tokenizer": "edge_ngram_tokenizer"
},
"edge_ngram_search_analyzer": {
"tokenizer": "lowercase"
}
},
"tokenizer": {
"edge_ngram_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 50,
"token_chars": [
"letter"
]
}
}
}
}
},
"mappings": {
"properties": {
"MatchName": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"edgengram": {
"type": "text",
"analyzer": "edge_ngram_analyzer",
"search_analyzer": "edge_ngram_search_analyzer"
},
"completion": {
"type": "completion"
}
},
"analyzer": "standard"
},
"CompetitionName": {
"type": "text",
"fields": {
"keywordstring": {
"type": "text",
"analyzer": "keyword_analyzer"
},
"edgengram": {
"type": "text",
"analyzer": "edge_ngram_analyzer",
"search_analyzer": "edge_ngram_search_analyzer"
},
"completion": {
"type": "completion"
}
},
"analyzer": "standard"
}
}
}
}
I have indexed 3 documents with values
{
"_source": {
"CompetitionName": "Premiership",
"MatchName": "Dundee Utd - St Johnstone"
}
},
{
"_source": {
"CompetitionName": "2nd Div, Vastra Gotaland UOF",
"MatchName": "IF Limhamn Bunkeflo - FC Rosengaard 1917"
}
},
{
"_source": {
"CompetitionName": "Bundesliga",
"MatchName": "Hertha Berlin - Eintracht Frankfurt"
}
}
I am searching both fields with Fuzziness.Auto for the string "bunde".
I want this search to return all three documents.
But the query below returns nothing.
string value = "bunde";
BoolQuery boolQuery = new BoolQuery
{
Should = new List<QueryContainer>
{
new QueryContainer(new FuzzyQuery
{
Field = Infer.Field<EventHistoryDoc>(path:eventHistoryDoc => eventHistoryDoc.MatchName),
Value = value,
Fuzziness = Fuzziness.Auto,
}),
new QueryContainer(new FuzzyQuery
{
Field = Infer.Field<EventHistoryDoc>(path:eventHistoryDoc => eventHistoryDoc.CompetitionName),
Value = value,
Fuzziness = Fuzziness.Auto,
})
}
};
ISearchRequest searchRequest = new SearchRequest
{
Query = new QueryContainer(boolQuery),
};
var json = _elasticClient.RequestResponseSerializer.SerializeToString(searchRequest);
ISearchResponse<EventHistoryDoc> searchResponse = await _elasticClient.SearchAsync<EventHistoryDoc>(searchRequest);
If I search with the string "bundes", I get only one document:
{
"_source": {
"CompetitionName": "Bundesliga",
"MatchName": "Hertha Berlin - Eintracht Frankfurt"
}
}
Any idea what I should change in the settings, mapping, or query so that all of the documents above are returned?
I am not familiar with the NEST syntax, but in JSON you can achieve the result as follows.
Here is a working example with the index mapping, search query, and search result.
(For now, I have removed the keyword_analyzer and edge_ngram_search_analyzer from the index mapping, since you only wanted to return all the documents using edge ngrams together with fuzziness.)
Index Mapping:
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 50,
"token_chars": [
"letter",
"digit"
]
}
}
},
"max_ngram_diff": 50
},
"mappings": {
"properties": {
"CompetitionName": {
"type": "text",
"analyzer": "my_analyzer"
},
"MatchName": {
"type": "text",
"analyzer": "my_analyzer"
}
}
}
}
Search Query:
{
"query": {
"multi_match": {
"query": "bunde",
"fuzziness": "AUTO"
}
}
}
Search Result:
"hits": [
{
"_index": "64968421",
"_type": "_doc",
"_id": "1",
"_score": 2.483365,
"_source": {
"CompetitionName": "Premiership",
"MatchName": "Dundee Utd - St Johnstone"
}
},
{
"_index": "64968421",
"_type": "_doc",
"_id": "3",
"_score": 2.4444416,
"_source": {
"CompetitionName": "Bundesliga",
"MatchName": "Hertha Berlin - Eintracht Frankfurt"
}
},
{
"_index": "64968421",
"_type": "_doc",
"_id": "2",
"_score": 0.6104546,
"_source": {
"CompetitionName": "2nd Div, Vastra Gotaland UOF",
"MatchName": "IF Limhamn Bunkeflo - FC Rosengaard 1917"
}
}
]
The index mapping provided in the question is also correct. With that same index mapping, searching for bunde using the multi_match query shown above returns all three documents, which is the expected result.
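To see why a fuzzy match against edge n-grams can reach all three documents, it may help to list the prefixes an edge_ngram tokenizer would emit and measure how far each is from "bunde". The Python sketch below is a simplified local model, not Elasticsearch code: it assumes lowercased tokens, min_gram 2 and max_gram 50 as in the question, and uses plain Levenshtein distance, whereas Elasticsearch also counts a transposition as one edit by default. The function names are hypothetical.

```python
def edge_ngrams(token, min_gram=2, max_gram=50):
    """Prefixes of token with lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def closest_gram(query, word):
    """Edge n-gram of word (lowercased) that is nearest to query."""
    return min(edge_ngrams(word.lower()), key=lambda g: edit_distance(query, g))


# One word from each of the three indexed documents.
for word in ["Dundee", "Bunkeflo", "Bundesliga"]:
    gram = closest_gram("bunde", word)
    print(word, gram, edit_distance("bunde", gram))
```

For a five-character term, AUTO fuzziness allows one edit, and each of the three words has an edge n-gram within that distance ("dunde", "bunke", "bunde"). By contrast, a fuzzy query on the standard-analyzed MatchName field compares against whole words such as "dundee", which is two edits away.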

C# deleting json field

{
"from": 0,
"query": {
"bool": {
"must": [
{
"query_string": {
"analyze_wildcard": true,
"default_operator": "and",
"fields": [
"applicationCd"
],
"query": "$applicationCd"
}
},
{
"query_string": {
"analyze_wildcard": true,
"default_operator": "and",
"fields": [
"entityStatusDesc"
],
"query": "$entityStatusDesc"
}
},
{
"query_string": {
"analyze_wildcard": true,
"default_operator": "and",
"fields": [
"stepUserName"
],
"query": "$stepUserName"
}
},
{
"match": {
"model": {
"query": "instance"
}
}
}
]
}
},
"size": 10,
"sort": [
{
"instanceId": {
"missing": "_last",
"order": "desc"
}
}
]
}
I have a JSON file, and I want to delete an entire block inside the must array based on a given field name (applicationCd, entityStatusDesc, or stepUserName). For example, if the given field is applicationCd, I want to remove that whole query_string block, so that my JSON looks like the example below. I would be grateful if anyone could help. Thanks.
{
"from": 0,
"query": {
"bool": {
"must": [
{
//deleted part
"query_string": {
"analyze_wildcard": true,
"default_operator": "and",
"fields": [
"entityStatusDesc"
],
"query": "$entityStatusDesc"
}
},
rest of the code is same
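A minimal sketch of the filtering step, written in Python for brevity rather than C#; the same logic (parse, filter the must array, serialize) should carry over to a C# JSON library. The function name remove_must_clauses_for_field is hypothetical, and the sample body is a trimmed version of the query above.

```python
import copy
import json


def remove_must_clauses_for_field(body, field_name):
    """Return a copy of body without the must clauses whose
    query_string.fields list contains field_name."""
    result = copy.deepcopy(body)
    bool_query = result["query"]["bool"]
    bool_query["must"] = [
        clause
        for clause in bool_query["must"]
        if field_name not in clause.get("query_string", {}).get("fields", [])
    ]
    return result


def query_string_clause(field):
    """Build one query_string clause shaped like those in the question."""
    return {
        "query_string": {
            "analyze_wildcard": True,
            "default_operator": "and",
            "fields": [field],
            "query": "$" + field,
        }
    }


body = {
    "from": 0,
    "query": {
        "bool": {
            "must": [
                query_string_clause("applicationCd"),
                query_string_clause("entityStatusDesc"),
                query_string_clause("stepUserName"),
                {"match": {"model": {"query": "instance"}}},
            ]
        }
    },
    "size": 10,
}

trimmed = remove_must_clauses_for_field(body, "applicationCd")
print(json.dumps(trimmed, indent=2))
```

Clauses without a query_string key, such as the match clause, are left untouched, and the original body is not modified.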

How can I get top 10 results from Elastic search using rest queries with the max hit documents?

I want to get the top 10 documents (rows), ordered so that the documents accessed most recently and most often come first.
Here's what I tried:
{
"size": 10,
"query": {
"range": {
"searchDate": {
"gte": "DateTime.Now.AddDays(-30)"
},
"aggs": {
"top_tags": {
"terms": {
"field": "searchDate"
},
"aggs": {
"top_otf_hits": {
"top_hits": {
"sort": [
{
"searchDate": {
"order": "desc"
}
}
],
"_source": {
"includes": [
"origin",
"destination"
]
},
"size": 1
}
}
}
}
}
}
}
}
{
"from": 0,
"size": 0,
"sort": [{
"searchDate": "desc"
}, "_score"],
"query": {
"range": {
"searchDate": {
"gte": "2018-02-28",
"lte": "2018-03-05",
"format": "yyyy-MM-dd"
}
}
},
"aggs": {
"frequent": {
"terms": {
"field": "tripKey"
},
"aggs": {
"top_otf_hits": {
"top_hits": {
"sort": [{
"searchDate": {
"order": "desc"
}
}],
"_source": {
"include": ["*"]
},
"size": 1
}
}
}
}
}
}
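Two details of the first attempt are worth noting: a C# expression such as DateTime.Now.AddDays(-30) placed inside a JSON string is not evaluated by Elasticsearch, and "aggs" has to sit next to "query" rather than inside the range clause. The Python sketch below builds a request body along the lines of the second query, using Elasticsearch date math ("now-30d/d") for the 30-day window. The function name is hypothetical, the tripKey field is taken from the second query, and the body is untested against a live cluster.

```python
import json


def recent_frequent_body(days=30, top_n=10):
    """Request body: the top_n most frequent tripKey values in the last
    `days` days, each with its most recent document."""
    return {
        "size": 0,  # only the aggregation buckets are needed
        "query": {"range": {"searchDate": {"gte": f"now-{days}d/d"}}},
        "aggs": {
            "frequent": {
                # terms buckets are ordered by document count by default
                "terms": {"field": "tripKey", "size": top_n},
                "aggs": {
                    "top_otf_hits": {
                        "top_hits": {
                            "sort": [{"searchDate": {"order": "desc"}}],
                            "_source": {"includes": ["origin", "destination"]},
                            "size": 1,
                        }
                    }
                },
            }
        },
    }


print(json.dumps(recent_frequent_body(), indent=2))
```

In C#, the same date math string can be passed as the lower bound instead of computing the date on the client.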

nested aggregations filtering

I have my document indexed with locations nested,
{
"name": "name 1",
"locations": [
{
"region": "region1",
"city": "city1",
"suburb": "suburb1"
},
{
"region": "region2",
"city": "city2",
"suburb": "suburb2"
},
{
"region": "region1",
"city": "city5",
"suburb": "suburb4"
}]
}
I have my query as
{
"query": {
"nested": {
"path": "locations",
"query": {
"bool": {
"must": [
{
"term": {
"locations.region.keyword": {
"value": "region1"
}
}
}
]
}
}
}
}
}
I want to aggregate only the cities for region1. I've tried nested aggregations, nested with filter aggregations, and reverse nested aggregations, but nothing seems to work. The problem is that the matching documents also contain other regions in their locations collection, so everything gets aggregated, including cities that don't belong to region1.
Any ideas?
EDIT:
Mappings:
"my_index": {
"mappings": {
"my_type": {
"properties": {
"locations": {
"type": "nested",
"properties": {
"city": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"region": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"suburb": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
Query:
{
"size": 0,
"query": {
"nested": {
"query": {
"bool": {
"must": [
{
"terms": {
"locations.region.keyword": [
"region1"
]
}
}
]
}
},
"path": "locations"
}
},
"aggs": {
"City": {
"nested": {
"path": "locations"
},
"aggs": {
"City": {
"terms": {
"field": "locations.city.keyword",
"size": 100,
"order": [
{
"_count": "desc"
},
{
"_term": "asc"
}
]
},
"aggs": {
"City": {
"reverse_nested": {}
}
}
}
}
}
}
}
Assuming your mapping matches the way the fields are used in the query, you can use the query below, which applies a filter aggregation inside the nested aggregation.
{
"query": {
"match_all": {}
},
"aggs": {
"city_agg": {
"nested": {
"path": "locations"
},
"aggs": {
"filter_locations_regions": {
"filter": {
"term": {
"locations.region.keyword": "region1"
}
},
"aggs": {
"cities_in_region_agg": {
"terms": {
"field": "locations.city.keyword",
"size": 100,
"order": [{
"_count": "desc"
},
{
"_term": "asc"
}]
}
}
}
}
}
}
}
}
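The effect of placing the filter inside the nested aggregation can be illustrated with a small local simulation in Python. This is only a model of the counting, not Elasticsearch code: each entry in locations is treated as its own nested document, the filter keeps those whose region matches, and the remaining cities are counted. The function name cities_in_region is hypothetical.

```python
from collections import Counter


def cities_in_region(documents, region):
    """Count cities over nested location entries whose region matches."""
    counts = Counter()
    for doc in documents:
        for location in doc.get("locations", []):
            # The filter is applied per nested entry, not per parent document.
            if location["region"] == region:
                counts[location["city"]] += 1
    return counts


# The sample document from the question.
docs = [
    {
        "name": "name 1",
        "locations": [
            {"region": "region1", "city": "city1", "suburb": "suburb1"},
            {"region": "region2", "city": "city2", "suburb": "suburb2"},
            {"region": "region1", "city": "city5", "suburb": "suburb4"},
        ],
    }
]

print(cities_in_region(docs, "region1"))
```

A nested query at the top level only selects parent documents, so without the inner filter every location of a matching parent, including city2, would end up in the buckets.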

Deploy IIS Website with CloudFormation template

I have a Visual Studio (C#) deployment package (.zip) that I have pushed up to my S3 storage.
I want to run my CloudFormation script and have it create an instance of an IIS server (I have the script for this) and then deploy the Visual Studio web site to it from the S3 storage.
I'm looking for an example of the template JSON that would do that.
I have a template that does something similar to what you are looking for; it is shown below. It may be more than you need, because it includes an auto scaling group, but it will get you started. Basically, you need the IAM user to interact with CloudFormation. The script in the UserData starts cfn-init, which carries out the steps in the Metadata section.
{
"AWSTemplateFormatVersion": "2010-09-09",
"Description": "Autoscaling for .net Web application.",
"Parameters": {
"InstanceType": {
"Description": "WebServer EC2 instance type",
"Type": "String",
"Default": "m1.small",
"AllowedValues": [
"t1.micro",
"m1.small",
"m1.medium",
"m1.large",
"m1.xlarge",
"m2.xlarge",
"m2.2xlarge",
"m2.4xlarge",
"c1.medium",
"c1.xlarge",
"cc1.4xlarge",
"cc2.8xlarge",
"cg1.4xlarge"
],
"ConstraintDescription": "Must be a valid EC2 instance type."
},
"IamInstanceProfile": {
"Description": "Name of IAM Profile that will be used by instances to access AWS Services",
"Type": "String",
"Default": "YourProfileName"
},
"KeyName": {
"Description": "The EC2 Key Pair to allow access to the instances",
"Default": "yourkeypair",
"Type": "String"
},
"SpotPriceBid": {
"Description": "Max bid price of spot instances",
"Type": "String",
"Default": ".06"
},
"DeployS3Bucket": {
"Description": "The S3 Bucket where deploy files are stored",
"Type": "String",
"Default": "ApplicationBucket"
},
"DeployWebS3Key": {
"Description": "The zip file that holds the website",
"Type": "String",
"Default": "Application.zip"
},
"DNSHostedZone": {
"Type": "String",
"Default": "example.com.",
"AllowedPattern": "^[\\w\\.]*\\.$",
"ConstraintDescription": "DNSDomain must end with '.'"
},
"DNSSubDomain": {
"Type": "String",
"Default": "yoursubdomain"
}
},
"Mappings": {
"RegionToAMIMap": {
"us-east-1": {
"AMI": "ami-1234567"
}
}
},
"Resources": {
"IAMUser": {
"Type": "AWS::IAM::User",
"Properties": {
"Path": "/",
"Policies": [{
"PolicyName": "webuser",
"PolicyDocument": {
"Statement": [{
"Sid": "Stmt1353842250430",
"Action": [
"s3:GetObject"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::HelgaDogWeb*/*"
]
}, {
"Sid": "Stmt1353842327065",
"Action": [
"cloudformation:DescribeStackResource"
],
"Effect": "Allow",
"Resource": [
"*"
]
}
]
}
}
]
}
},
"IAMUserAccessKey": {
"Type": "AWS::IAM::AccessKey",
"Properties": {
"UserName": {
"Ref": "IAMUser"
}
}
},
"WebSecurityGroup": {
"Type": "AWS::EC2::SecurityGroup",
"Properties": {
"GroupDescription": "Enable Access From Elastic Load Balancer.",
"SecurityGroupIngress": [{
"IpProtocol": "tcp",
"FromPort": "443",
"ToPort": "443",
"SourceSecurityGroupOwnerId": {
"Fn::GetAtt": [
"WebLoadBalancer",
"SourceSecurityGroup.OwnerAlias"
]
},
"SourceSecurityGroupName": {
"Fn::GetAtt": [
"WebLoadBalancer",
"SourceSecurityGroup.GroupName"
]
}
}, {
"IpProtocol": "tcp",
"FromPort": "80",
"ToPort": "80",
"SourceSecurityGroupOwnerId": {
"Fn::GetAtt": [
"WebLoadBalancer",
"SourceSecurityGroup.OwnerAlias"
]
},
"SourceSecurityGroupName": {
"Fn::GetAtt": [
"WebLoadBalancer",
"SourceSecurityGroup.GroupName"
]
}
}
]
}
},
"WebLoadBalancer": {
"Type": "AWS::ElasticLoadBalancing::LoadBalancer",
"Properties": {
"Listeners": [{
"InstancePort": "443",
"InstanceProtocol": "HTTPS",
"LoadBalancerPort": "443",
"Protocol": "HTTPS",
"SSLCertificateId": "arn:aws:iam::123456789101:server-certificate/example"
}
],
"AvailabilityZones": {
"Fn::GetAZs": ""
},
"HealthCheck": {
"HealthyThreshold": "3",
"Interval": "30",
"Target": "HTTP:80/healthcheck.aspx",
"Timeout": 8,
"UnhealthyThreshold": "2"
}
}
},
"WebAsSpotLaunchConfiguration": {
"Type": "AWS::AutoScaling::LaunchConfiguration",
"Metadata": {
"AWS::CloudFormation::Init": {
"config": {
"sources": {
"C:\\inetpub\\wwwroot": {
"Fn::Join": [
"/",
[
"http://s3.amazonaws.com", {
"Ref": "DeployS3Bucket"
}, {
"Ref": "DeployWebS3Key"
}
]
]
}
},
"commands": {
"1-set-appPool-identity": {
"command": "C:\\Windows\\System32\\inetsrv\\appcmd set config /section:applicationPools /[name='DefaultAppPool'].processModel.identityType:LocalSystem",
"waitAfterCompletion": "0"
},
"2-add-http-binding": {
"command": "C:\\Windows\\System32\\inetsrv\\appcmd set site /site.name:\"Default Web Site\" /+bindings.[protocol='http',bindingInformation='*:80:']",
"waitAfterCompletion": "0"
}
}
}
},
"AWS::CloudFormation::Authentication": {
"S3AccessCreds": {
"type": "S3",
"accessKeyId": {
"Ref": "IAMUserAccessKey"
},
"secretKey": {
"Fn::GetAtt": [
"IAMUserAccessKey",
"SecretAccessKey"
]
},
"buckets": [{
"Ref": "DeployS3Bucket"
}
]
}
}
},
"Properties": {
"KeyName": {
"Ref": "KeyName"
},
"ImageId": {
"Fn::FindInMap": [
"RegionToAMIMap", {
"Ref": "AWS::Region"
},
"AMI"
]
},
"IamInstanceProfile": {
"Ref": "IamInstanceProfile"
},
"SecurityGroups": [{
"Ref": "WebSecurityGroup"
}
],
"InstanceType": {
"Ref": "InstanceType"
},
"SpotPrice": {
"Ref": "SpotPriceBid"
},
"UserData": {
"Fn::Base64": {
"Fn::Join": [
"",
[
"<script>\n",
"\"C:\\Program Files (x86)\\Amazon\\cfn-bootstrap\\cfn-init.exe\" -v -s ", {
"Ref": "AWS::StackName"
},
" -r WebAsSpotLaunchConfiguration ",
" --access-key ", {
"Ref": "IAMUserAccessKey"
},
" --secret-key ", {
"Fn::GetAtt": [
"IAMUserAccessKey",
"SecretAccessKey"
]
},
"\n",
"</script>"
]
]
}
}
}
},
"WebAsSpotGroup": {
"Type": "AWS::AutoScaling::AutoScalingGroup",
"Properties": {
"AvailabilityZones": {
"Fn::GetAZs": ""
},
"HealthCheckGracePeriod": "120",
"HealthCheckType": "EC2",
"LaunchConfigurationName": {
"Ref": "WebAsSpotLaunchConfiguration"
},
"LoadBalancerNames": [{
"Ref": "WebLoadBalancer"
}
],
"MaxSize": "20",
"MinSize": "1",
"DesiredCapacity": "1"
}
},
"WebAsSpotScaleUpPolicy": {
"Type": "AWS::AutoScaling::ScalingPolicy",
"Properties": {
"AdjustmentType": "PercentChangeInCapacity",
"AutoScalingGroupName": {
"Ref": "WebAsSpotGroup"
},
"Cooldown": "420",
"ScalingAdjustment": "200"
}
},
"WebAsSpotScaleDownPolicy": {
"Type": "AWS::AutoScaling::ScalingPolicy",
"Properties": {
"AdjustmentType": "ChangeInCapacity",
"AutoScalingGroupName": {
"Ref": "WebAsSpotGroup"
},
"Cooldown": "60",
"ScalingAdjustment": "-1"
}
},
"WebAsSpotScaleUpAlarm": {
"Type": "AWS::CloudWatch::Alarm",
"Properties": {
"MetricName": "CPUUtilization",
"Namespace": "AWS/EC2",
"Statistic": "Average",
"Period": "60",
"EvaluationPeriods": "1",
"Threshold": "75",
"AlarmActions": [{
"Ref": "WebAsSpotScaleUpPolicy"
}
],
"Dimensions": [{
"Name": "AutoScalingGroupName",
"Value": {
"Ref": "WebAsSpotGroup"
}
}
],
"ComparisonOperator": "GreaterThanThreshold"
}
},
"WebAsSpotScaleDownAlarm": {
"Type": "AWS::CloudWatch::Alarm",
"Properties": {
"MetricName": "CPUUtilization",
"Namespace": "AWS/EC2",
"Statistic": "Average",
"Period": "60",
"EvaluationPeriods": "2",
"Threshold": "50",
"AlarmActions": [{
"Ref": "WebAsSpotScaleDownPolicy"
}
],
"Dimensions": [{
"Name": "AutoScalingGroupName",
"Value": {
"Ref": "WebAsSpotGroup"
}
}
],
"ComparisonOperator": "LessThanThreshold"
}
},
"DNSRecord": {
"Type": "AWS::Route53::RecordSet",
"Properties": {
"HostedZoneName": {
"Ref": "DNSHostedZone"
},
"Comment": "VPN Host. Created by Cloud Formation.",
"Name": {
"Fn::Join": [
".",
[{
"Ref": "DNSSubDomain"
}, {
"Ref": "DNSHostedZone"
}
]
]
},
"Type": "CNAME",
"TTL": "150",
"ResourceRecords": [{
"Fn::GetAtt": [
"WebLoadBalancer",
"CanonicalHostedZoneName"
]
}
]
},
"DependsOn": "WebLoadBalancer"
}
},
"Outputs": {}
}
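To make it easier to see which URL cfn-init would download the site from, here is a small Python sketch that evaluates the Fn::Join under "sources" using the parameter defaults from the template. It is a local illustration that only handles Ref and Fn::Join; it is not how CloudFormation itself resolves templates, and the function name resolve is hypothetical.

```python
def resolve(value, parameters):
    """Evaluate a tiny subset of CloudFormation intrinsics: Ref and Fn::Join."""
    if isinstance(value, dict):
        if "Ref" in value:
            return parameters[value["Ref"]]
        if "Fn::Join" in value:
            delimiter, parts = value["Fn::Join"]
            return delimiter.join(resolve(part, parameters) for part in parts)
    return value


# Default parameter values from the template above.
parameters = {
    "DeployS3Bucket": "ApplicationBucket",
    "DeployWebS3Key": "Application.zip",
}

# The value of sources["C:\\inetpub\\wwwroot"] in the template.
source = {
    "Fn::Join": [
        "/",
        ["http://s3.amazonaws.com", {"Ref": "DeployS3Bucket"}, {"Ref": "DeployWebS3Key"}],
    ]
}

source_url = resolve(source, parameters)
print(source_url)
```

With your own bucket and key passed as stack parameters, the zip at the resulting URL is what gets unpacked into C:\inetpub\wwwroot.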
I haven't tried it myself, but this post on the AWS site, Using Amazon CloudFront with ASP.NET Apps, may be somewhere to start.
