MongoDB .NET Fluent Aggregate Query - c#

I'm trying to write the query below using the fluent syntax of MongoDB. I'm using the latest .NET driver. I don't like the strings for naming the columns and would prefer to not have to do the Bson Serialization as well.
var collection = _mongoDbClient.GetDocumentCollection<JobResult>();
var bsonDocuments = collection.Aggregate()
.Group<BsonDocument>(new BsonDocument{ { "_id", "$RunDateTime" }, { "Count", new BsonDocument("$sum", 1) } })
.Sort(new BsonDocument { { "Count", -1 } })
.Limit(20)
.ToList();
foreach (var bsonDocument in bsonDocuments)
{
jobResultRunDateTimes.Add(BsonSerializer.Deserialize<JobResultRunDateTime>(bsonDocument));
}

The C# driver includes a LINQ implementation that targets the MongoDB aggregation framework, so you should be able to write your query using standard LINQ operators.
The example below groups by an assumed property Id, takes the count of documents in each group, and then sorts. In the example, x is of type JobResult, i.e. the type you use when getting the collection.
var result = collection.AsQueryable()
    .GroupBy(x => x.Id)
    .Select(g => new { g.Key, count = g.Count() })
    .OrderBy(a => a.Key)
    .Take(1)
    .ToList();
For a detailed reference and more examples, refer to the C# driver documentation.
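As a rough sketch of how the original pipeline (group by RunDateTime, count, sort by count descending, limit 20) might look with LINQ operators, the code below runs against an in-memory IQueryable so it has no driver dependency. The property names on JobResult and JobResultRunDateTime and the JobResultQueries helper are assumptions made for illustration; with the driver, the source would be collection.AsQueryable(), and whether every operator translates depends on the driver version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the types mentioned in the question.
public class JobResult
{
    public DateTime RunDateTime { get; set; }
}

public class JobResultRunDateTime
{
    public DateTime RunDateTime { get; set; }
    public int Count { get; set; }
}

public static class JobResultQueries
{
    // Group by RunDateTime, count each group, sort by count descending, keep `limit` rows.
    public static List<JobResultRunDateTime> TopRunDateTimes(IQueryable<JobResult> source, int limit)
    {
        return source
            .GroupBy(x => x.RunDateTime)
            .Select(g => new JobResultRunDateTime { RunDateTime = g.Key, Count = g.Count() })
            .OrderByDescending(r => r.Count)
            .Take(limit)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var day1 = new DateTime(2020, 1, 1);
        var day2 = new DateTime(2020, 1, 2);

        // In-memory data; with the driver this would be collection.AsQueryable().
        var jobs = new List<JobResult>
        {
            new JobResult { RunDateTime = day1 },
            new JobResult { RunDateTime = day2 },
            new JobResult { RunDateTime = day2 },
        }.AsQueryable();

        foreach (var row in JobResultQueries.TopRunDateTimes(jobs, 20))
        {
            Console.WriteLine($"{row.RunDateTime:yyyy-MM-dd}: {row.Count}");
        }
    }
}
```

Projecting straight into JobResultRunDateTime avoids both the string field names and the manual BsonSerializer.Deserialize loop from the question.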

Related

Getting a System.NotSupportedException when using MongoDB.Driver.Linq

I am using MongoDB for my database and ASP.NET Core for my API. I am trying to convert my Items object to an ItemsDTO (Data Transfer Object), because I don't want to return the Id. However, when I use
_items.AsQueryable()
.Where(item => item.Game.Contains("Games/1"))
.Select(item => ItemToDTO(item))
.ToList();
It gives back a System.NotSupportedException: 'ItemToDTO of type Project.Services.ItemService is not supported in the expression tree ItemToDTO({document}).'
By the way ItemToDTO looks like this:
private static ItemDTO ItemToDTO(Item item)
{
return new ItemDTO
{
Name = item.Name,
Type = item.Type,
Price = item.Price,
Game = item.Game,
CostToSell = item.CostToSell,
Description = item.Description
};
}
So I am confused on why this doesn't work because before I was using a SQL Server Database and normal Linq that looked like this:
_context.Items
.Where(item => item.Game.Contains("Games/1"))
.Select(item => ItemToDTO(item))
.ToList();
and it returned the data that I wanted. I know that when using the MongoDB.Driver I am using their LINQ and Queryable objects. I'm just wondering why I am getting the error above.
This is a typical mistake when working with LINQ. The translator cannot look into the body of ItemToDTO, so the most a provider can do is materialize the whole item and then call ItemToDTO on the client to produce the result. The MongoDB provider apparently does not do that, and it is a poor approach for relational databases as well.
So I suggest rewriting your function as an extension method (it has to be declared in a static class):
public static IQueryable<ItemDTO> ItemToDTO(this IQueryable<Item> items)
{
return items.Select(item => new ItemDTO
{
Name = item.Name,
Type = item.Type,
Price = item.Price,
Game = item.Game,
CostToSell = item.CostToSell,
Description = item.Description
});
}
Then your query should work fine:
_items.AsQueryable()
    .Where(item => item.Game.Contains("Games/1"))
    .ItemToDTO()
    .ToList();
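To see why a helper call is opaque to a LINQ provider while an inline initializer is not, one can inspect the two expression trees directly. The sketch below uses only System.Linq.Expressions and no MongoDB driver; the trimmed-down Item and ItemDTO classes and the ProjectionShapes name are hypothetical stand-ins for the types in the question.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical, trimmed-down versions of the types from the question.
public class Item
{
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
}

public class ItemDTO
{
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
}

public static class ProjectionShapes
{
    public static ItemDTO ItemToDTO(Item item) =>
        new ItemDTO { Name = item.Name, Price = item.Price };

    // Opaque: the tree only records "call ItemToDTO(item)"; the method body is compiled code.
    public static readonly Expression<Func<Item, ItemDTO>> ViaMethod =
        item => ItemToDTO(item);

    // Transparent: the tree records every member assignment, so a provider can walk it.
    public static readonly Expression<Func<Item, ItemDTO>> Inline =
        item => new ItemDTO { Name = item.Name, Price = item.Price };
}

public static class Program
{
    public static void Main()
    {
        // The body of the first lambda is a method-call node.
        Console.WriteLine(ProjectionShapes.ViaMethod.Body.NodeType);
        // The body of the second lambda is a member-initializer node.
        Console.WriteLine(ProjectionShapes.Inline.Body.NodeType);
    }
}
```

The first tree has a body of node type Call and the second of node type MemberInit. A provider that wants to build a projection needs the second shape, which is what the extension method above gives it.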

.NET MongoDB Bad Projection Specification

I'm in the process of upgrading a system from the Legacy Mongo Drivers to the new ones. I've got an issue with the following query.
var orgsUnitReadModels = _readModelService.Queryable<OrganisationalUnitReadModel>()
.Where(x => locations.Contains(x.Id))
.Select(x => new AuditLocationItemViewModel
{
Id = x.Id,
Name = x.Name,
AuditRecordId = auditRecordId,
Type = type,
IsArchived = !x.IsVisible,
AuditStatus = auditStatus
}).ToList();
It produces the following error message, which I don't understand. I would be grateful for assistance explaining what this means and how to fix it.
MongoDB.Driver.MongoCommandException: 'Command aggregate failed: Bad
projection specification, cannot exclude fields other than '_id' in an
inclusion projection: { Id: "$_id", Name: "$Name", AuditRecordId:
BinData(3, 5797FCCCA90C8644B4CB84FED4236D4B), Type: 0, IsArchived: {
$not: [ "$IsVisible" ] }, AuditStatus: 2, _id: 0 }.'
In this example, LINQ's Select gets translated into MongoDB's $project stage. Typically you use 0 (or false) to exclude a field and 1 (or true) to include it in the final result set. You can also use the dollar syntax to refer to existing fields, which is what happens for Name, for instance.
The problem is that you're also trying to include some in-memory constant values as part of the projection. Unfortunately, one of them (type) is equal to 0, which is interpreted as a request to exclude a field called Type from the pipeline result.
Because of this ambiguity, MongoDB introduced the $literal operator, and you can try the following syntax in the Mongo shell:
db.col.aggregate([{ $project: { _id: 0, Id: 1, Name: 1, Type: { $literal: 0 } } }])
It will return 0 as a constant value, as you expect. The MongoDB .NET driver documentation mentions $literal, but it looks like it only works for strings.
There are a couple of ways to solve your problem. I think the easiest is to run a simpler .Select first and then call .ToList() to make sure the query is materialized. Once that's done, you can run another in-memory .Select() to build your AuditLocationItemViewModel:
var orgsUnitReadModels = _readModelService.Queryable<OrganisationalUnitReadModel>()
    .Where(x => locations.Contains(x.Id))
    // Server side: project only the fields that live in the document.
    .Select(x => new { x.Id, x.Name, x.IsVisible })
    .ToList()
    // Client side: add the in-memory constants after materialization.
    .Select(x => new AuditLocationItemViewModel
    {
        Id = x.Id,
        Name = x.Name,
        AuditRecordId = auditRecordId,
        Type = type,
        IsArchived = !x.IsVisible,
        AuditStatus = auditStatus
    }).ToList();

Unable to cast object of type 'MongoDB.Bson.BsonString' to type 'MongoDB.Bson.BsonDocument' in MongoDB .NET Driver

I am facing a problem while trying to run an aggregation pipeline using MongoDB .NET client. My code looks like so:
public async Task<IEnumerable<string>> GetPopularTags(int count)
{
var events = _database.GetCollection<Event>(_eventsCollectionName);
var agg = events.Aggregate();
var unwind = agg.Unwind<Event, Event>(e => e.Tags);
var group = unwind.Group(e => e.Tags, v => new { Tag = v.Key, Count = v.Count() });
var sort = group.SortByDescending(e => e.Count);
var project = group.Project(r => r.Tag);
var limit = project.Limit(count);
var result = await limit.SingleOrDefaultAsync();
return result;
}
(separate vars for each stage are just for debugging purposes)
While trying to get the result of the pipeline (last var) I get the following error:
System.InvalidCastException: Unable to cast object of type 'MongoDB.Bson.BsonString' to type 'MongoDB.Bson.BsonDocument'
What am I missing?
Thanks in advance for any help!
SOLUTION
I finally figured out that the fact that I was getting an exception at the last line had nothing to do with where the error was. I tried running .SingleOrDefault() on every stage to see outputs and I noticed that my pipeline had a couple of issues.
My unwind stage was trying to return an Event object, but since it was unwinding Tags property (which was a List<string>), it was trying to set it to string and was throwing an exception. I solved that issue by letting it set an output type to the default of BsonDocument and then in next stage using ["Tags"] accessor to get the value I need. It looked something like this:
var dbResult = await events.Aggregate()
.Unwind(e => e.Tags)
.Group(e => e["Tags"], v => new { Tag = v.Key, Count = v.Count() })
My project stage was not working for some reason. I was not able to get the Tag property (which turned out to be a BsonValue type) to be converted to string. In the end I deleted that stage and replaced it with a dbResult.Select(t => t.Tag.AsString) to cast it to a string. Not the most elegant solution, but better than nothing.
In the end my code ended up looking like so:
public async Task<IEnumerable<string>> GetPopularTags(int count)
{
var events = _database.GetCollection<Event>(_eventsCollectionName);
var dbResult = await events.Aggregate()
.Unwind(e => e.Tags)
.Group(e => e["Tags"], v => new { Tag = v.Key, Count = v.Count() })
.SortByDescending(e => e.Count)
.Limit(count)
.ToListAsync();
var result = dbResult.Select(t => t.Tag.AsString);
return result;
}
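For comparison, the same unwind, group, sort and limit shape can be written as an in-memory LINQ query, which makes it easier to check what each stage is expected to produce. This is only a sketch run against plain objects, not against the driver; the trimmed-down Event class and the TagStats helper are hypothetical, and the ThenBy tie-break is an addition that the original pipeline does not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down version of the Event model from the question.
public class Event
{
    public List<string> Tags { get; set; } = new List<string>();
}

public static class TagStats
{
    public static List<string> PopularTags(IEnumerable<Event> events, int count)
    {
        return events
            .SelectMany(e => e.Tags)                              // like $unwind: one row per tag
            .GroupBy(tag => tag)                                  // like $group on the tag value
            .Select(g => new { Tag = g.Key, Count = g.Count() })
            .OrderByDescending(x => x.Count)                      // like $sort on Count descending
            .ThenBy(x => x.Tag)                                   // tie-break added for stable output
            .Take(count)                                          // like $limit
            .Select(x => x.Tag)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var events = new List<Event>
        {
            new Event { Tags = new List<string> { "a", "b" } },
            new Event { Tags = new List<string> { "a", "c" } },
            new Event { Tags = new List<string> { "a", "b" } },
        };

        foreach (var tag in TagStats.PopularTags(events, 2))
        {
            Console.WriteLine(tag);
        }
    }
}
```

After the SelectMany step each element is a single string rather than an Event, which mirrors why the unwind stage in the question could not keep Event as its output type.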
The problem you're facing can essentially be reduced to the following line of code:
var agg = collection.Aggregate().Project(x => x.Tag);
where Tag is a string property in your model.
It appears that Aggregate() and the other MongoDB driver operators follow the Aggregation Framework more closely than the C# syntax suggests.
Based on your code, the result variable is supposed to be of type string, which the driver maps to MongoDB.Bson.BsonString. However, the Aggregation Framework always returns BSON documents (a single one in this case), so the MongoDB .NET driver cannot perform that deserialization at runtime (BsonDocument -> BsonString).
The first workaround is obvious: return something that resembles a BSON document and can be deserialized from a BsonDocument, for example:
collection.Aggregate().Project(x => new { x.Tag });
and then map the results in memory (the same query is run behind the scenes).
Another approach is to translate your query into LINQ using .AsQueryable(), which lets you return results in a more flexible manner:
collection.AsQueryable().Select(x => x.Tag);
In both cases, the query generated for my projection looks the same:
{aggregate([{ "$project" : { "Tag" : "$Tag", "_id" : 0 } }])}
A bit late, but I had a similar issue and this would have solved it for me:
You will want to make an intermediate class to represent group { Tag = v.Key, Count = v.Count() } and then change the Project to this.
.Project(Builders<YourIntermediateClass>.Projection.Expression(x => x.Tag))

DocumentDB LINQ match multiple child objects

I have a simple document.
{
Name: "Foo",
Tags: [
{ Name: "Type", Value: "One" },
{ Name: "Category", Value: "A" },
{ Name: "Source", Value: "Example" },
]
}
I would like to make a LINQ query that can find these documents by matching multiple Tags.
i.e. Not a SQL query, unless there is no other option.
e.g.
var tagsToMatch = new List<Tag>()
{
new Tag("Type", "One"),
new Tag("Category", "A")
};
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.Where(d => tagsToMatch.All(tagToMatch => d.Tags.Any(tag => tag == tagToMatch)));
Which gives me the error Method 'All' is not supported..
I have found examples where a single property on the child object is being matched: LINQ Query Issue with using Any on DocumentDB for child collection
var singleTagToMatch = tagsToMatch.First();
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.SelectMany
(
d => d.Tags
.Where(t => t.Name == singleTagToMatch.Name && t.Value == singleTagToMatch.Value)
.Select(t => d)
);
But it's not obvious how that approach can be extended to support matching multiple child objects.
I found there's a function called ARRAY_CONTAINS which can be used: Azure DocumentDB ARRAY_CONTAINS on nested documents
But all the examples I came across are using SQL queries.
This thread indicates that LINQ support was "coming soon" in 2015, but it was never followed up so I assume it wasn't added.
I haven't come across any documentation for ARRAY_CONTAINS in LINQ, only in SQL.
I tried the following SQL query to see if it does what I want, and it didn't return any results:
SELECT Document
FROM Document
WHERE ARRAY_CONTAINS(Document.Tags, { Name: "Type", Value: "One" })
AND ARRAY_CONTAINS(Document.Tags, { Name: "Category", Value: "A" })
According to the comments on this answer, ARRAY_CONTAINS only works on arrays of primitives, not objects. So it appears not to be suited for what I want to achieve.
It seems the comments on that answer are wrong, and I had syntax errors in my query. I needed to add double quotes around the property names.
Running this query did return the results I wanted:
SELECT Document
FROM Document
WHERE ARRAY_CONTAINS(Document.Tags, { "Name": "Type", "Value": "One" })
AND ARRAY_CONTAINS(Document.Tags, { "Name": "Category", "Value": "A" })
So ARRAY_CONTAINS does appear to achieve what I want, so I'm looking for how to use it via the LINQ syntax.
Using .Contains in the LINQ query will generate SQL that uses ARRAY_CONTAINS.
So:
var tagsToMatch = new List<Tag>()
{
new Tag("Type", "One"),
new Tag("Category", "A")
};
var singleTagToMatch = tagsToMatch.First();
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.Where(d => d.Tags.Contains(singleTagToMatch));
Will become:
SELECT * FROM root WHERE ARRAY_CONTAINS(root["Tags"], {"Name":"Type","Value":"One"})
You can chain .Where calls to create a chain of AND predicates.
So:
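// Declare the variable as IQueryable<T> so the result of Where can be assigned back to it.
IQueryable<T> query = client.CreateDocumentQuery<T>(documentCollectionUri);
foreach (var tagToMatch in tagsToMatch)
{
    query = query.Where(s => s.Tags.Contains(tagToMatch));
}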
Will become:
SELECT * FROM root WHERE ARRAY_CONTAINS(root["Tags"], {"Name":"Type","Value":"One"}) AND ARRAY_CONTAINS(root["Tags"], {"Name":"Category","Value":"A"})
If you need to chain the predicates using OR then you'll need some expression predicate builder library.
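If pulling in a library is not an option, the core of such a predicate builder is small. The sketch below is a minimal, hand-rolled Or combinator built on System.Linq.Expressions and exercised against an in-memory IQueryable; the PredicateCombiner and Doc names are hypothetical, and whether the DocumentDB LINQ provider can translate the combined expression is an assumption that would need to be checked against the SDK in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class PredicateCombiner
{
    // Rewrites every use of one lambda parameter to another, so that both
    // predicate bodies end up referring to the same parameter.
    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterReplacer(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node) =>
            node == _from ? _to : base.VisitParameter(node);
    }

    // Combines two predicates into a single "left || right" expression tree.
    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        var parameter = left.Parameters[0];
        var rightBody = new ParameterReplacer(right.Parameters[0], parameter).Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(Expression.OrElse(left.Body, rightBody), parameter);
    }
}

// Hypothetical document type, used only for this demonstration.
public class Doc
{
    public string Category { get; set; } = "";
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<Doc, bool>> isA = d => d.Category == "A";
        Expression<Func<Doc, bool>> isB = d => d.Category == "B";

        var docs = new List<Doc>
        {
            new Doc { Category = "A" },
            new Doc { Category = "B" },
            new Doc { Category = "C" },
        }.AsQueryable();

        // Matches the documents whose Category is "A" or "B".
        Console.WriteLine(docs.Where(isA.Or(isB)).Count());
    }
}
```

The parameter rewrite matters because each lambda has its own parameter object; without it the combined tree would reference a parameter that is not declared on the new lambda.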

Select all _id from mongodb collection with c# driver

I have a large document collection in MongoDB and want to get only the list of _id values. The MongoDB query is db.getCollection('Documents').find({},{_id : 0, _id: 1}). But the C# query
IMongoCollection<T> Collection { get; set; }
...
List<BsonDocument> mongoResult = this.Collection.FindAsync(FilterDefinition<T>.Empty, new FindOptions<T, BsonDocument>() { Projection = "{ _id: 0, _id: 1 }" }).Result.ToList();
throws the exception InvalidOperationException: Duplicate element name '_id'.
I want to get only the list of _id values; the other fields are not needed. Documents may have different structures, so excluding all the other fields manually is difficult.
What C# query corresponds to the MongoDB query db.getCollection('Documents').find({},{_id : 0, _id: 1})?
UPDATE: Please do not offer solutions that pull large amounts of data from the server, for example:
this.Collection.Find(d => true).Project(d => d.Id).ToListAsync().Result;
Since you're using the C# driver, I would recommend using AsQueryable and then LINQ instead.
In my opinion it is better, since you don't need the magic strings and you benefit from your LINQ knowledge. It would look something like this:
database.GetCollection<T>("collectionname").AsQueryable().Select(x => x.Id);
Alexey is correct: solutions such as this one
var result = (await this.Collection<Foos>
    .Find(_ => true)
    .ToListAsync())
    .Select(foo => foo.Id);
will pull the entire document collection over the wire, deserialize it, and then map the Id out in LINQ to Objects, which is extremely inefficient.
The trick is to use .Project to return just the _id keys, before the query is executed with .ToListAsync().
You can specify the type as a raw BsonDocument if you don't want to use a strongly typed DTO to deserialize into.
var client = new MongoClient(new MongoUrl(connstring));
var database = client.GetDatabase(databaseName);
var collection = database.GetCollection<BsonDocument>(collectionName);
var allIds = (await collection
    .Find(new BsonDocument()) // OR (x => true)
    .Project(new BsonDocument { { "_id", 1 } })
    .ToListAsync())
    // ToString() works whether _id is an ObjectId or a string; AsString throws for an ObjectId.
    .Select(x => x["_id"].ToString());
Which executes a query similar to:
db.getCollection("SomeCollection").find({},{_id: 1})
