DocumentDB LINQ match multiple child objects - c#

I have a simple document.
{
Name: "Foo",
Tags: [
{ Name: "Type", Value: "One" },
{ Name: "Category", Value: "A" },
{ Name: "Source", Value: "Example" },
]
}
I would like to make a LINQ query that can find these documents by matching multiple Tags.
That is, not a SQL query, unless there is no other option.
For example:
var tagsToMatch = new List<Tag>()
{
new Tag("Type", "One"),
new Tag("Category", "A")
};
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.Where(d => tagsToMatch.All(tagToMatch => d.Tags.Any(tag => tag == tagToMatch)));
Which gives me the error: Method 'All' is not supported.
I have found examples where a single property on the child object is being matched: LINQ Query Issue with using Any on DocumentDB for child collection
var singleTagToMatch = tagsToMatch.First();
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.SelectMany
(
d => d.Tags
.Where(t => t.Name == singleTagToMatch.Name && t.Value == singleTagToMatch.Value)
.Select(t => d)
);
But it's not obvious how that approach can be extended to support matching multiple child objects.
I found there's a function called ARRAY_CONTAINS which can be used: Azure DocumentDB ARRAY_CONTAINS on nested documents
But all the examples I came across are using SQL queries.
This thread indicates that LINQ support was "coming soon" in 2015, but it was never followed up so I assume it wasn't added.
I haven't come across any documentation for ARRAY_CONTAINS in LINQ, only in SQL.
I tried the following SQL query to see if it does what I want, and it didn't return any results:
SELECT Document
FROM Document
WHERE ARRAY_CONTAINS(Document.Tags, { Name: "Type", Value: "One" })
AND ARRAY_CONTAINS(Document.Tags, { Name: "Category", Value: "A" })
According to the comments on this answer, ARRAY_CONTAINS only works on arrays of primitives, not objects. So it appears not to be suited for what I want to achieve.
It seems the comments on that answer are wrong, and I had syntax errors in my query. I needed to add double quotes around the property names.
Running this query did return the results I wanted:
SELECT Document
FROM Document
WHERE ARRAY_CONTAINS(Document.Tags, { "Name": "Type", "Value": "One" })
AND ARRAY_CONTAINS(Document.Tags, { "Name": "Category", "Value": "A" })
So ARRAY_CONTAINS does appear to achieve what I want, so I'm looking for how to use it via the LINQ syntax.

Using .Contains in the LINQ query will generate SQL that uses ARRAY_CONTAINS.
So:
var tagsToMatch = new List<Tag>()
{
new Tag("Type", "One"),
new Tag("Category", "A")
};
var singleTagToMatch = tagsToMatch.First();
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.Where(d => d.Tags.Contains(singleTagToMatch));
Will become:
SELECT * FROM root WHERE ARRAY_CONTAINS(root["Tags"], {"Name":"Type","Value":"One"})
You can chain .Where calls to create a chain of AND predicates.
So:
// Declare as IQueryable<T> so the result of .Where can be assigned back to it.
IQueryable<T> query = client.CreateDocumentQuery<T>(documentCollectionUri);
foreach (var tagToMatch in tagsToMatch)
{
query = query.Where(s => s.Tags.Contains(tagToMatch));
}
Will become:
SELECT * FROM root WHERE ARRAY_CONTAINS(root["Tags"], {"Name":"Type","Value":"One"}) AND ARRAY_CONTAINS(root["Tags"], {"Name":"Category","Value":"A"})
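If you need to combine the predicates with OR instead, chained .Where calls won't do it, because each call adds an AND. You'll need to build a single combined expression, for example with a predicate builder library.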
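As a rough illustration of what such a predicate builder does, here is a minimal sketch that ORs expressions together by hand. The `PredicateOr` class, its `Or` method, and the `Doc` and `Tag` records are hypothetical names made up for this example. It runs against an in-memory list via `AsQueryable()`; whether the DocumentDB LINQ provider translates the combined expression is an assumption you would need to verify against your SDK version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

var docs = new List<Doc>
{
    new Doc("Foo", new List<Tag> { new Tag("Type", "One"), new Tag("Category", "A"), new Tag("Source", "Example") }),
    new Doc("Bar", new List<Tag> { new Tag("Type", "Two"), new Tag("Category", "A") }),
    new Doc("Baz", new List<Tag> { new Tag("Type", "Two"), new Tag("Category", "B") }),
};
var tagsToMatch = new List<Tag> { new Tag("Type", "One"), new Tag("Category", "A") };

// Build one predicate per tag, then fold them together with OR.
// Note: Aggregate throws if tagsToMatch is empty.
Expression<Func<Doc, bool>> predicate = tagsToMatch
    .Select(tag => (Expression<Func<Doc, bool>>)(d => d.Tags.Contains(tag)))
    .Aggregate((left, right) => left.Or(right));

// In-memory stand-in for client.CreateDocumentQuery<Doc>(documentCollectionUri).
var matches = docs.AsQueryable().Where(predicate).Select(d => d.Name).ToList();
Console.WriteLine(string.Join(", ", matches)); // Foo, Bar

// Hypothetical document types; records give Tag value equality for Contains.
public record Tag(string Name, string Value);
public record Doc(string Name, List<Tag> Tags);

// Hypothetical helper, not part of any SDK.
public static class PredicateOr
{
    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        // Rewrite the right-hand body so both sides share the left-hand parameter.
        var rightBody = new ReplaceParameter(right.Parameters[0], left.Parameters[0])
            .Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, rightBody), left.Parameters);
    }

    private sealed class ReplaceParameter : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ReplaceParameter(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
            => node == _from ? _to : base.VisitParameter(node);
    }
}
```

The parameter rewrite is the important part: two lambdas written separately each have their own parameter object, and a provider can only translate the combined body if both sides refer to the same one.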

Related

Newbie to CosmosDB: how to query collection with multiple values?

I have the following collection and I want to query based on Class and FullName from Students
{
"id" : "ABCD",
"Class" : "Math",
"Students" : [
{
"FullName" : "Dan Smith",
},
{
"FullName" : "Dave Jackson",
},
]
}
The following filter works based on class.
var filter = builder.Eq(x => x.Class, "Math");
var document = await collection.Find(filter).FirstOrDefaultAsync();
But I also want to filter by student. I tried adding another filter, but it fails with the error "Cannot implicitly convert type string to bool":
filter &= builder.Eq(x => x.Students.Any(y => y.FullName,"Dan"));
Because you want to match a nested document inside an array, you need the $elemMatch operator. With the MongoDB .NET Driver, you can do this in either of these ways:
Solution 1: ElemMatch with LINQ Expression
filter &= builder.ElemMatch(x => x.Students, y => y.FullName == "Dan");
Solution 2: ElemMatch with FilterDefinition
filter &= builder.ElemMatch(x => x.Students,
Builders<Student>.Filter.Eq(y => y.FullName, "Dan"));
Both of the above will return no document for your sample data, because Eq is an exact match and no student's FullName is exactly "Dan" (it is "Dan Smith").
If you want to match part of the name, you need the $regex operator.
Solution: With regex match
filter &= builder.ElemMatch(x => x.Students,
Builders<Student>.Filter.Regex(y => y.FullName, "Dan"));
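To make the difference between the equality filter and the regex filter concrete, here is a small in-memory sketch. It does not talk to MongoDB; it only mimics the matching rules with plain LINQ and `System.Text.RegularExpressions` over the two names from the question, so treat it as an analogy rather than the driver's behavior in every detail.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var students = new[] { "Dan Smith", "Dave Jackson" };

// Equality, like Eq(y => y.FullName, "Dan"): the whole string must match.
bool exact = students.Any(name => name == "Dan");

// Unanchored pattern, like Regex(y => y.FullName, "Dan"): matches anywhere in the string.
bool partial = students.Any(name => Regex.IsMatch(name, "Dan"));

// Anchored pattern: only names that start with "Dan".
bool prefix = students.Any(name => Regex.IsMatch(name, "^Dan"));

Console.WriteLine($"{exact} {partial} {prefix}"); // False True True
```

If you only want names that begin with the search term, anchoring the pattern with `^` narrows the match accordingly.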

.NET MongoDB Bad Projection Specification

I'm in the process of upgrading a system from the Legacy Mongo Drivers to the new ones. I've got an issue with the following query.
var orgsUnitReadModels = _readModelService.Queryable<OrganisationalUnitReadModel>()
.Where(x => locations.Contains(x.Id))
.Select(x => new AuditLocationItemViewModel
{
Id = x.Id,
Name = x.Name,
AuditRecordId = auditRecordId,
Type = type,
IsArchived = !x.IsVisible,
AuditStatus = auditStatus
}).ToList();
It produces the following error message, which I don't understand. I would be grateful for assistance explaining what this means and how to fix it.
MongoDB.Driver.MongoCommandException: 'Command aggregate failed: Bad
projection specification, cannot exclude fields other than '_id' in an
inclusion projection: { Id: "$_id", Name: "$Name", AuditRecordId:
BinData(3, 5797FCCCA90C8644B4CB84FED4236D4B), Type: 0, IsArchived: {
$not: [ "$IsVisible" ] }, AuditStatus: 2, _id: 0 }.'
In this example LINQ's Select statement gets translated into MongoDB's $project. Typically you use 0 (or false) to exclude fields and 1 or true to include fields in a final result set. Of course you can also use the dollar syntax to refer to existing fields which happens for instance for Name.
The problem is that you're also trying to include some in-memory constant values as part of the projection. Unfortunately one of them (type) is equal to 0 which is interpreted as if you would like to exclude a field called Type from the pipeline result.
To resolve this ambiguity, MongoDB provides the $literal operator. You can try the following syntax in the Mongo shell:
db.col.aggregate([{ $project: { _id: 0, Id: 1, Name: 1, Type: { $literal: 0 } } }])
It will return 0 as a constant value as you expect. The MongoDB .NET driver documentation mentions literal here but it looks like it only works for strings.
There are a couple of ways to solve this. I think the easiest is to run a simpler .Select first, projecting only the stored fields, and call .ToList() to materialize the query. After that you can run a second, in-memory .Select() to build your AuditLocationItemViewModel and add the constant values there:
var orgsUnitReadModels = _readModelService.Queryable<OrganisationalUnitReadModel>()
.Where(x => locations.Contains(x.Id))
.Select(x => new { x.Id, x.Name, x.IsVisible }).ToList() // translated to $project and run on the server
.Select(x => new AuditLocationItemViewModel // runs in memory, so constants are safe here
{
Id = x.Id,
Name = x.Name,
AuditRecordId = auditRecordId,
Type = type,
IsArchived = !x.IsVisible,
AuditStatus = auditStatus
}).ToList();

How to query multiple fields with SelectTokens?

If I've got a JSON array and I want to extract one field from each object, it's fairly simple:
Data:
{
"Values": [
{
"Name": "Bill",
"Age": "25",
"Address": "1234 Easy St."
},
{
"Name": "Bob",
"Age": "28",
"Address": "1600 Pennsylvania Ave."
},
{
"Name": "Joe",
"Age": "31",
"Address": "653 28th St NW"
}
]
}
Query:
data.SelectTokens("Values[*].Name")
This will give me an array of all the names. But what if I want more than one field? Is there any way to get an array of objects containing names and addresses?
The obvious way is to run SelectTokens twice and then Zip them, but will that work? Are the two result arrays guaranteed to preserve the ordering of the original source data? And is there a simpler way that can do it with just one query?
You can use the union operator ['Name','Address'] to select the values of multiple properties simultaneously. However, at some point you're going to need to generate new objects containing just the required properties, for instance by grouping them by parent object:
var query = data.SelectTokens("Values[*]['Name','Address']")
.Select(v => (JProperty)v.Parent) // Get parent JProperty (which encapsulates name as well as value)
.GroupBy(p => p.Parent) // Group by parent JObject
.Select(g => new JObject(g)); // Create a new object with the filtered properties
While this works and uses only one JSONPath query, it feels a little overly complex. I'd suggest just selecting for the objects, then using a nested query to get the required properties like so:
var query = data.SelectTokens("Values[*]")
.OfType<JObject>()
.Select(o => new JObject(o.Property("Name"), o.Property("Address")));
Or maybe
var query = data.SelectTokens("Values[*]")
.Select(o => new JObject(o.SelectTokens("['Name','Address']").Select(v => (JProperty)v.Parent)));
Demo fiddle here.
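For completeness, here is the second approach as an end-to-end sketch. It assumes the Newtonsoft.Json NuGet package is referenced and uses a shortened copy of the sample data from the question; the variable names are only illustrative.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

var data = JObject.Parse(@"{
  ""Values"": [
    { ""Name"": ""Bill"", ""Age"": ""25"", ""Address"": ""1234 Easy St."" },
    { ""Name"": ""Bob"",  ""Age"": ""28"", ""Address"": ""1600 Pennsylvania Ave."" }
  ]
}");

// Select each object, then copy only the wanted properties into a new JObject.
var query = data.SelectTokens("Values[*]")
    .OfType<JObject>()
    .Select(o => new JObject(o.Property("Name"), o.Property("Address")))
    .ToList();

foreach (var item in query)
{
    Console.WriteLine($"{(string)item["Name"]}: {(string)item["Address"]}");
}
```

Because each result object is built from a single source object, Name and Address stay paired without zipping two separate result lists.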

Select all _id from mongodb collection with c# driver

I have a large document collection in MongoDB and want to get only the list of _id values. The MongoDB query is db.getCollection('Documents').find({},{_id : 0, _id: 1}). But the C# query
IMongoCollection<T> Collection { get; set; }
...
List<BsonDocument> mongoResult = this.Collection.FindAsync(FilterDefinition<T>.Empty, new FindOptions<T, BsonDocument>() { Projection = "{ _id: 0, _id: 1 }" }).Result.ToList();
throws the exception InvalidOperationException: Duplicate element name '_id'.
I want to get only the list of _id values; the other fields are not needed. Documents may have different structures, so excluding all the other fields manually is difficult.
What C# query corresponds to the MongoDB query db.getCollection('Documents').find({},{_id : 0, _id: 1})?
UPDATE: Please do not suggest solutions that pull large amounts of data from the server, for example:
this.Collection.Find(d => true).Project(d => d.Id).ToListAsync().Result;
Since you're using the C# driver, I would recommend using AsQueryable and then LINQ instead.
In my opinion this is better, since you don't need magic strings and you benefit from your LINQ knowledge. It would look something like this:
database.GetCollection<T>("collectionname").AsQueryable().Select(x => x.Id);
Alexey is correct, solutions such as these
var result = (await this.Collection<Foos>
.Find(_ => true)
.ToListAsync())
.Select(foo => foo.Id);
Will pull the entire document collection over the wire, deserialize, and then map the Id out in Linq To Objects, which will be extremely inefficient.
The trick is to use .Project to return just the _id keys, before the query is executed with .ToListAsync().
You can specify the type as a raw BsonDocument if you don't want to use a strongly typed DTO to deserialize into.
var client = new MongoClient(new MongoUrl(connstring));
var database = client.GetDatabase(databaseName);
var collection = database.GetCollection<BsonDocument>(collectionName);
var allIds = (await collection
.Find(new BsonDocument()) // OR (x => true)
.Project(new BsonDocument { { "_id", 1 } })
.ToListAsync())
.Select(x => x["_id"].ToString()); // _id is often an ObjectId, in which case AsString would throw
Which executes a query similar to:
db.getCollection("SomeCollection").find({},{_id: 1})

How do I accurately represent this ElasticSearch query using NEST?

Background / Goal
I have a query in ElasticSearch where I'm using filters on several fields (a relatively small data set, and we know exactly what values should be in those fields at the time we query). The idea is that we'll perform a full-text query, but only after we've filtered on some selections made by the user.
I'm putting ElasticSearch behind a WebAPI controller and figured it made sense to use NEST to accomplish the query.
The query, in plain English
We have filters for several fields. Each inner filter is an or filter, but they're together as an AND.
In SQL, the pseudo-code equivalent would be select * from table where foo in (1,2,3) AND bar in (4,5,6).
Questions
Can I simplify the way I'm thinking about this query, based on what you see below? Am I overlooking some basic approach? This seems heavy but I'm new to ES.
How would I properly represent the query below in NEST syntax?
Is NEST the best choice for this? Should I be using the ElasticSearch library instead and going lower level?
The Query Text
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"or": [
{ "term": { "foo": "something" } },
{ "term": { "foo": "somethingElse" } }
]
},
{
"or": [
{ "term": { "bar": "something" } },
{ "term": { "bar": "somethingElse" } }
]
}
]
}
}
}
},
"size": 100
}
This kind of task is simple and very common in ES.
You can represent it in NEST as follows:
var rs = es.Search<dynamic>(s => s
.Index("your_index").Type("your_type")
.From(0).Size(100)
.Query(q => q
.Filtered(fq => fq
.Filter(ff => ff
.Bool(b => b
.Must(
m1 => m1.Terms("foo", new string[] { "something", "somethingElse" }),
m2 => m2.Terms("bar", new string[] { "something", "somethingElse" })
)
)
)
.Query(qq => qq
.MatchAll()
)
)
)
);
Some notes:
I use a filtered query to filter down to what I need first and search afterwards. In this case it will filter for foo in ("something", "somethingElse") AND bar in ("something", "somethingElse"), then query all filtered results (match_all). You can change match_all to whatever you need. The filtered query gives the best performance because ES only needs to compute scores for the documents in the query part (after filtering), not for all documents.
I use the terms filter, which is simpler and performs better than or. By default, terms matches a document if it contains any of the input terms (OR). You can read more about the available modes (AND, OR, PLAIN, ...) in the documentation.
NEST is the best choice for .NET in my opinion, as it is designed to be simple and easy to use. I only use the lower-level API when I want a new feature that NEST does not support yet, or when NEST has a bug in a function I use.
You can refer here for a brief NEST tutorial: http://nest.azurewebsites.net/nest/writing-queries.html
Update: building bool filters dynamically:
var filters = new List<Nest.FilterContainer>();
filters.Add(Nest.Filter<dynamic>.Terms("foo", new string[] { "something", "somethingElse" }));
// ... more filters
Then replace .Bool(b => b.Must(...)) with .Bool(b => b.Must(filters.ToArray())).
Hope it helps.
