.NET MongoDB Bad Projection Specification

I'm in the process of upgrading a system from the Legacy Mongo Drivers to the new ones. I've got an issue with the following query.
var orgsUnitReadModels = _readModelService.Queryable<OrganisationalUnitReadModel>()
.Where(x => locations.Contains(x.Id))
.Select(x => new AuditLocationItemViewModel
{
Id = x.Id,
Name = x.Name,
AuditRecordId = auditRecordId,
Type = type,
IsArchived = !x.IsVisible,
AuditStatus = auditStatus
}).ToList();
It produces the following error message, which I don't understand. I would be grateful for assistance explaining what this means and how to fix it.
MongoDB.Driver.MongoCommandException: 'Command aggregate failed: Bad
projection specification, cannot exclude fields other than '_id' in an
inclusion projection: { Id: "$_id", Name: "$Name", AuditRecordId:
BinData(3, 5797FCCCA90C8644B4CB84FED4236D4B), Type: 0, IsArchived: {
$not: [ "$IsVisible" ] }, AuditStatus: 2, _id: 0 }.'

Here LINQ's Select is translated into MongoDB's $project stage. In $project, 0 (or false) excludes a field and 1 (or true) includes it. You can also use the dollar syntax to refer to existing fields, as happens for Name ("$Name").
The problem is that you're also including in-memory constant values in the projection. One of them (type) equals 0, which MongoDB interprets as a request to exclude a field called Type from the result. Because the rest of the specification is an inclusion projection, the server rejects the mix.
To resolve this ambiguity MongoDB provides the $literal operator, which you can try in the Mongo shell:
db.col.aggregate([{ $project: { _id: 0, Id: 1, Name: 1, Type: { $literal: 0 } } }])
This returns 0 as a constant value, as you expect. The MongoDB .NET driver documentation mentions $literal, but it looks like it only works for strings.
There are a couple of ways to solve this. I think the easiest is to run a simpler .Select first and call .ToList() to materialize the query, then run a second, in-memory .Select() to build your AuditLocationItemViewModel:
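var orgsUnitReadModels = _readModelService.Queryable<OrganisationalUnitReadModel>()
.Where(x => locations.Contains(x.Id))
// Server side: project only the stored fields that are needed.
.Select(x => new { x.Id, x.Name, x.IsVisible }).ToList()
// Client side: constants such as type (0) are attached in memory.
.Select(x => new AuditLocationItemViewModel
{
Id = x.Id,
Name = x.Name,
AuditRecordId = auditRecordId,
Type = type,
IsArchived = !x.IsVisible,
AuditStatus = auditStatus
}).ToList();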
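The two-step shape can be tried without a database. The following is only a sketch under stated assumptions: an in-memory array with AsQueryable() stands in for the driver's queryable, and the anonymous types, the int Id and the sample values are hypothetical stand-ins for OrganisationalUnitReadModel and AuditLocationItemViewModel. It shows the pattern (project stored fields, materialize, then add constants), not the query the driver sends.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the stored read models.
var stored = new[]
{
    new { Id = 1, Name = "North", IsVisible = true },
    new { Id = 2, Name = "South", IsVisible = false },
};

int type = 0;        // the constant that the server read as "exclude Type"
int auditStatus = 2;

var items = stored.AsQueryable()                     // stand-in for the driver's IQueryable
    .Select(x => new { x.Id, x.Name, x.IsVisible })  // step 1: stored fields only
    .ToList()                                        // materialize; everything below is LINQ to Objects
    .Select(x => new
    {
        x.Id,
        x.Name,
        Type = type,                                 // constants are attached in memory
        IsArchived = !x.IsVisible,
        AuditStatus = auditStatus
    })
    .ToList();

foreach (var item in items)
    Console.WriteLine($"{item.Name}: Type={item.Type}, IsArchived={item.IsArchived}");
```

Since the constants never reach the provider in this shape, a value of 0 cannot be mistaken for an exclusion flag.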

Related

Unable to cast object of type 'MongoDB.Bson.BsonString' to type 'MongoDB.Bson.BsonDocument' in MongoDB .NET Driver

I am facing a problem while trying to run an aggregation pipeline using MongoDB .NET client. My code looks like so:
public async Task<IEnumerable<string>> GetPopularTags(int count)
{
var events = _database.GetCollection<Event>(_eventsCollectionName);
var agg = events.Aggregate();
var unwind = agg.Unwind<Event, Event>(e => e.Tags);
var group = unwind.Group(e => e.Tags, v => new { Tag = v.Key, Count = v.Count() });
var sort = group.SortByDescending(e => e.Count);
var project = group.Project(r => r.Tag);
var limit = project.Limit(count);
var result = await limit.SingleOrDefaultAsync();
return result;
}
(separate vars for each stage are just for debugging purposes)
When I try to get the result of the pipeline (the last var), I get the following error:
System.InvalidCastException: Unable to cast object of type 'MongoDB.Bson.BsonString' to type 'MongoDB.Bson.BsonDocument'
What am I missing?
Thanks in advance for any help!
SOLUTION
I finally figured out that the fact that I was getting an exception at the last line had nothing to do with where the error was. I tried running .SingleOrDefault() on every stage to see outputs and I noticed that my pipeline had a couple of issues.
My unwind stage was trying to return an Event object, but since it was unwinding Tags property (which was a List<string>), it was trying to set it to string and was throwing an exception. I solved that issue by letting it set an output type to the default of BsonDocument and then in next stage using ["Tags"] accessor to get the value I need. It looked something like this:
var dbResult = await events.Aggregate()
.Unwind(e => e.Tags)
.Group(e => e["Tags"], v => new { Tag = v.Key, Count = v.Count() })
My project stage was not working for some reason. I was not able to get the Tag property (which turned out to be a BsonValue type) to be converted to string. In the end I deleted that stage and replaced it with a dbResult.Select(t => t.Tag.AsString) to cast it to a string. Not the most elegant solution, but better than nothing.
In the end my code ended up looking like so:
public async Task<IEnumerable<string>> GetPopularTags(int count)
{
var events = _database.GetCollection<Event>(_eventsCollectionName);
var dbResult = await events.Aggregate()
.Unwind(e => e.Tags)
.Group(e => e["Tags"], v => new { Tag = v.Key, Count = v.Count() })
.SortByDescending(e => e.Count)
.Limit(count)
.ToListAsync();
var result = dbResult.Select(t => t.Tag.AsString);
return result;
}

Your problem can be reduced to this line of code:
var agg = collection.Aggregate().Project(x => x.Tag);
where Tag is a string property in your model.
Aggregate() and the other fluent operators in the MongoDB driver follow the Aggregation Framework more closely than the C# syntax suggests.
In your code the result is supposed to be a string, which the driver maps to MongoDB.Bson.BsonString. However, the Aggregation Framework always returns BSON documents (a single one in this case), so the driver cannot deserialize the result at runtime (BsonDocument -> BsonString).
The first workaround is to return something that resembles a BSON document and can be deserialized from BsonDocument, such as:
collection.Aggregate().Project(x => new { x.Tag });
and then map the results in memory (the same query is run behind the scenes).
Another approach is to express the query in LINQ using .AsQueryable(), which lets you return results more flexibly:
collection.AsQueryable().Select(x => x.Tag);
In both cases the query that's generated for my projection looks the same:
{aggregate([{ "$project" : { "Tag" : "$Tag", "_id" : 0 } }])}

A bit late, but I had a similar issue and this would have solved it for me:
You will want to create an intermediate class to represent the group result { Tag = v.Key, Count = v.Count() } and then change the Project stage to:
.Project(Builders<YourIntermediateClass>.Projection.Expression(x => x.Tag))

How can I make Sum() return 0 instead of 'null'?

I'm trying to use LINQ-to-entities to query my DB, where I have 3 tables: Room, Conference, and Participant. Each room has many conferences, and each conference has many participants. For each room, I'm trying to get a count of its conferences, and a sum of all of the participants for all of the room's conferences. Here's my query:
var roomsData = context.Rooms
.GroupJoin(
context.Conferences
.GroupJoin(
context.Participants,
conf => conf.Id,
part => part.ConferenceId,
(conf, parts) => new { Conference = conf, ParticipantCount = parts.Count() }
),
rm => rm.Id,
data => data.Conference.RoomId,
(rm, confData) => new {
Room = rm,
ConferenceCount = confData.Count(),
ParticipantCount = confData.Sum(cd => cd.ParticipantCount)
}
);
When I try and turn this into a list, I get the error:
The cast to value type 'System.Int32' failed because the materialized value is null. Either the result type's generic parameter or the query must use a nullable type.
I can fix this by changing the Sum line to:
ParticipantCount = confData.Count() == 0 ? 0 : confData.Sum(cd => cd.ParticipantCount)
But the trouble is that this seems to generate a more complex query and add 100ms onto the query time. Is there a better way for me to tell EF that when it is summing ParticipantCount, an empty list for confData should just mean zero, rather than throwing an exception? The annoying thing is that this error only happens with EF; if I create an empty in-memory List<int> and call Sum() on that, it gives me zero, rather than throwing an exception!

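You can use the null-coalescing operator ??. Because ParticipantCount is a non-nullable int, cast it to int? inside Sum so that the nullable overload is used and the result can be coalesced:
ParticipantCount = confData.Sum(cd => (int?)cd.ParticipantCount) ?? 0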

I made it work by changing the Sum line to:
ParticipantCount = (int?)confData.Sum(cd => cd.ParticipantCount)
Confusingly, it seems that even though IntelliSense tells me that the int overload for Sum() is getting used, at runtime it is actually using the int? overload because the confData list might be empty. If I explicitly tell it the return type is int? it returns null for the empty list entries, and I can later null-coalesce the nulls to zero.

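Use DefaultIfEmpty with a default value, projecting to the int first so that the default is 0 rather than a null element:
ParticipantCount = confData.Select(cd => cd.ParticipantCount).DefaultIfEmpty(0).Sum()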

Instead of trying to get EF to generate a SQL query that returns 0 instead of null, you can handle this on the client side as you process the query results (this assumes ParticipantCount was projected as int?):
var results = from r in roomsData.AsEnumerable()
select new
{
r.Room,
r.ConferenceCount,
ParticipantCount = r.ParticipantCount ?? 0
};
The AsEnumerable() forces the SQL query to be evaluated and the subsequent query operators are client-side LINQ-to-Objects.
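For comparison, the in-memory behaviour the question mentions and the guards suggested in the answers can be checked with LINQ to Objects. This is a sketch only: it uses a plain empty list rather than EF, so it shows that the guarded forms are harmless in memory and says nothing about the SQL that EF generates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var empty = new List<int>();

// LINQ to Objects: Sum over an empty sequence of int is simply 0.
int plain = empty.Sum();

// Nullable overload plus null-coalescing; against a database this guards a NULL result.
int coalesced = empty.Sum(x => (int?)x) ?? 0;

// Substitute a single 0 when the sequence is empty, then sum.
int defaulted = empty.DefaultIfEmpty(0).Sum();

Console.WriteLine($"{plain} {coalesced} {defaulted}"); // prints "0 0 0"
```

The difference in the question comes from SQL's SUM returning NULL over zero rows, which a non-nullable int cannot hold.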

DocumentDB LINQ match multiple child objects

I have a simple document.
{
Name: "Foo",
Tags: [
{ Name: "Type", Value: "One" },
{ Name: "Category", Value: "A" },
{ Name: "Source", Value: "Example" },
]
}
I would like to make a LINQ query that can find these documents by matching multiple Tags.
i.e. Not a SQL query, unless there is no other option.
e.g.
var tagsToMatch = new List<Tag>()
{
new Tag("Type", "One"),
new Tag("Category", "A")
};
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.Where(d => tagsToMatch.All(tagToMatch => d.Tags.Any(tag => tag == tagToMatch)));
Which gives me the error: Method 'All' is not supported.
I have found examples where a single property on the child object is being matched: LINQ Query Issue with using Any on DocumentDB for child collection
var singleTagToMatch = tagsToMatch.First();
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.SelectMany
(
d => d.Tags
.Where(t => t.Name == singleTagToMatch.Name && t.Value == singleTagToMatch.Value)
.Select(t => d)
);
But it's not obvious how that approach can be extended to support matching multiple child objects.
I found there's a function called ARRAY_CONTAINS which can be used: Azure DocumentDB ARRAY_CONTAINS on nested documents
But all the examples I came across are using SQL queries.
This thread indicates that LINQ support was "coming soon" in 2015, but it was never followed up so I assume it wasn't added.
I haven't come across any documentation for ARRAY_CONTAINS in LINQ, only in SQL.
I tried the following SQL query to see if it does what I want, and it didn't return any results:
SELECT Document
FROM Document
WHERE ARRAY_CONTAINS(Document.Tags, { Name: "Type", Value: "One" })
AND ARRAY_CONTAINS(Document.Tags, { Name: "Category", Value: "A" })
According to the comments on this answer, ARRAY_CONTAINS only works on arrays of primitives, not objects. So it appears not to be suited to what I want to achieve.
It seems the comments on that answer are wrong, and I had syntax errors in my query. I needed to add double quotes around the property names.
Running this query did return the results I wanted:
SELECT Document
FROM Document
WHERE ARRAY_CONTAINS(Document.Tags, { "Name": "Type", "Value": "One" })
AND ARRAY_CONTAINS(Document.Tags, { "Name": "Category", "Value": "A" })
So ARRAY_CONTAINS does appear to achieve what I want, so I'm looking for how to use it via the LINQ syntax.

Using .Contains in the LINQ query will generate SQL that uses ARRAY_CONTAINS.
So:
var tagsToMatch = new List<Tag>()
{
new Tag("Type", "One"),
new Tag("Category", "A")
};
var singleTagToMatch = tagsToMatch.First();
var query = client
.CreateDocumentQuery<T>(documentCollectionUri)
.Where(d => d.Tags.Contains(singleTagToMatch));
Will become:
SELECT * FROM root WHERE ARRAY_CONTAINS(root["Tags"], {"Name":"Type","Value":"One"})
You can chain .Where calls to create a chain of AND predicates.
So:
// Declared as IQueryable<T> so that the result of Where can be assigned back to it.
IQueryable<T> query = client.CreateDocumentQuery<T>(documentCollectionUri);
foreach (var tagToMatch in tagsToMatch)
{
query = query.Where(s => s.Tags.Contains(tagToMatch));
}
Will become:
SELECT * FROM root WHERE ARRAY_CONTAINS(root["Tags"], {"Name":"Type","Value":"One"}) AND ARRAY_CONTAINS(root["Tags"], {"Name":"Category","Value":"A"})
If you need to chain the predicates using OR then you'll need some expression predicate builder library.
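If pulling in a library is not an option, two predicates can be combined by hand with System.Linq.Expressions. The following is a minimal sketch, not a tested DocumentDB recipe: the Or extension method and the ReplaceParameter visitor are hypothetical helper names, and the example runs against an in-memory queryable. Whether the DocumentDB LINQ provider translates the combined expression is an assumption you would need to verify.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

Expression<Func<int, bool>> isSmall = n => n < 3;
Expression<Func<int, bool>> isLarge = n => n > 8;

// Combine the two predicates into a single "isSmall OR isLarge" expression.
var either = isSmall.Or(isLarge);

var hits = Enumerable.Range(0, 11).AsQueryable().Where(either).ToList();
Console.WriteLine(string.Join(",", hits)); // prints "0,1,2,9,10"

static class PredicateSketch
{
    // Rewrites one lambda parameter into another so both bodies share a single parameter.
    private sealed class ReplaceParameter : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ReplaceParameter(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
            => node == _from ? _to : base.VisitParameter(node);
    }

    // Hypothetical helper: returns a lambda equivalent to left(x) || right(x).
    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        var rightBody = new ReplaceParameter(right.Parameters[0], left.Parameters[0])
            .Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, rightBody), left.Parameters);
    }
}
```

For the tag case, each predicate would be of the form d => d.Tags.Contains(tag), folded together with Or and passed to a single Where call.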

Select all _id from mongodb collection with c# driver

I have a large document collection in MongoDB and want to get only the list of _id values. The MongoDB query is db.getCollection('Documents').find({},{_id : 0, _id: 1}). But the C# query
IMongoCollection<T> Collection { get; set; }
...
List<BsonDocument> mongoResult = this.Collection.FindAsync(FilterDefinition<T>.Empty, new FindOptions<T, BsonDocument>() { Projection = "{ _id: 0, _id: 1 }" }).Result.ToList();
throws the exception InvalidOperationException: Duplicate element name '_id'.
I want only the list of _id values; the other fields are not needed. Documents may have different structures, so excluding all the other fields manually is difficult.
Which C# query corresponds to the MongoDB query db.getCollection('Documents').find({},{_id : 0, _id: 1})?
UPDATE: Please do not offer solutions that pull large amounts of data from the server, such as
this.Collection.Find(d => true).Project(d => d.Id).ToListAsync().Result;

Since you're using the C# driver, I would recommend using AsQueryable and then LINQ instead.
In my opinion it is better, since you don't need magic strings and you benefit from your LINQ knowledge. It would look something like this:
database.GetCollection<T>("collectionname").AsQueryable().Select(x => x.Id);

Alexey is correct, solutions such as these
var result = (await this.Collection<Foos>
.Find(_ => true)
.ToListAsync())
.Select(foo => foo.Id);
Will pull the entire document collection over the wire, deserialize, and then map the Id out in Linq To Objects, which will be extremely inefficient.
The trick is to use .Project to return just the _id keys, before the query is executed with .ToListAsync().
You can specify the type as a raw BsonDocument if you don't want to use a strongly typed DTO to deserialize into.
var client = new MongoClient(new MongoUrl(connstring));
var database = client.GetDatabase(databaseName);
var collection = database.GetCollection<BsonDocument>(collectionName);
var allIds = (await collection
.Find(new BsonDocument()) // OR (x => true)
.Project(new BsonDocument { { "_id", 1 } })
.ToListAsync())
.Select(x => x["_id"].ToString()); // AsString would throw if _id is an ObjectId
Which executes a query similar to:
db.getCollection("SomeCollection").find({},{_id: 1})

Anonymous Type with Linq and Guid

I have a simple table:
ID | Value
When I do this:
var sequence = from c in valuesVault.GetTable()
select new {RandomIDX = Guid.NewGuid(), c.ID, c.Value};
each element in the projection gets the same Guid value. How do I write this so that each element in the projection gets a different random Guid?
Edit
To clarify the issue: the GetTable() method simply calls this:
return this.context.GetTable<T>();
where this.context is the DataContext of type T.
The iteration is done the usual way, nothing fancy:
foreach (var c in seq)
{
Trace.WriteLine(c.RandomIDX + " " + c.Value);
}
Output:
bf59c94e-119c-4eaf-a0d5-3bb91699b04d What is/was your mother's maiden name?
bf59c94e-119c-4eaf-a0d5-3bb91699b04d What was the last name of one of your high school English teachers?
bf59c94e-119c-4eaf-a0d5-3bb91699b04d In elementary school, what was your best friend's first and last name?
Edit 2
I'm using the out-of-the-box LINQ to SQL provider. I had built some generic wrappers around it, but they do not alter the way IQueryable or IEnumerable function in the code.

What is underneath valuesVault.GetTable()?
You probably have a Linq provider such as Linq 2 SQL.
That means that valuesVault.GetTable() is of type IQueryable which in turn means that the entire query becomes an expression.
An expression is a query that is defined but not yet executed.
When sequence is iterated over, the query is executed by the Linq provider, and one of the steps it has to perform is to evaluate this expression: Guid.NewGuid(). Most Linq providers cannot pass that expression to the underlying source (SQL Server wouldn't know what to do with it), so it gets executed once and the result is returned with the rest of the results.
What you could do is to force the valuesVault.GetTable() expression to become a collection by calling the .ToList() or .ToArray() methods. This executes the expression and returns an IEnumerable which represents an in-memory collection.
When performing queries against an IEnumerable, the execution is not passed to the Linq provider but executed by the .NET runtime.
In your case this means that the expression Guid.NewGuid() can be executed correctly.
Try this:
var sequence = from c in valuesVault.GetTable().ToArray()
select new {RandomIDX = Guid.NewGuid(), c.ID, c.Value};
Notice the .ToArray() there. That is what will make the statement go from IQueryable to IEnumerable and that will change its behaviour.

I think it's happening when it gets translated into SQL (ie: it's the database doing it). Since you have no WHERE clauses in your example, you could just do:
var sequence = from c in valuesVault.GetTable().ToList()
select new { RandomID = Guid.NewGuid(), c.ID, c.Value };
Which forces Guid.NewGuid() to be executed in the client. However, it's ugly if your table grows and you start adding filtering clauses. You could solve it by using a second LINQ query that projects a second result set with your new GUIDs:
var sequence = from c in valuesVault.GetTable()
where c.Value > 10
select new { c.ID, c.Value };
var finalSequence = from s in sequence.ToList()
select new { RandomID = Guid.NewGuid(), s.ID, s.Value };

Seems to work for me.
List<int> a = new List<int> {10, 11, 12, 13};
var v = a.Select(i => new {ID = Guid.NewGuid(), I = i});
foreach (var item in v)
{
Console.WriteLine(item);
}
output
{ ID = b760f0c8-8dcc-458e-a924-4401ce02e04c, I = 10 }
{ ID = 2d4a0b17-54d3-4d69-8a5c-d2387e50f054, I = 11 }
{ ID = 906e1dc7-6de4-4f8d-b1cd-c129142a277a, I = 12 }
{ ID = 6a67ef6b-a7fe-4650-a8d7-4d2d3b77e761, I = 13 }

I'm not able to reproduce this behavior with a simple LINQ query. Sample:
List<int> y = new List<int> { 0, 1, 2, 3, 4, 5 };
var result = y.Select(x => new { Guid = Guid.NewGuid(), Id = x }).ToList();
I'm imagining if you try to convert your Table value to a List in Linq, then perform your select, you'll get different Guids.
