Mongo Aggregation With Projection - c#

Given the following document
{
"Id": "MyNormalObjectId",
"CompanyId" : 1234,
"User" :
{
"UserId":4567,
"FirstName":"Tester",
"Lastname":"McCtesterson"
}
}
How do I write my aggregate query to return a list of all unique users in a given company (including first and last name)?
FilterDefinitionBuilder<MyDoc> builder = Builders<MyDoc>.Filter;
var filter = builder.Eq(t => t.CompanyId, companyId);
var aggregate = _col.Aggregate();
aggregate.Match(filter).GroupBy(t=>t.User.UserId, ?????? )
The desired result would be a HashSet of UserMetadata. I've seen a lot of people go straight to BSON for what they need, and I can if I have to, but strongly typed is always better.

You can make use of the $addToSet operator inside the $group pipeline stage
to get the desired output.
db.collection.aggregate([
{
"$match": {
"CompanyId": 1234 // Find Conditions go here
}
},
{
"$group": {
"_id": "$CompanyId",
"Users": {
"$addToSet": "$User",
},
},
},
])
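To make the effect of the pipeline above easier to see, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB; the helper name uniqueUsersForCompany and the sample data are assumptions made for illustration, and comparing users by their JSON form is a simplification of how $addToSet compares embedded documents.

```javascript
// In-memory stand-in for the collection; field names follow the sample document.
const docs = [
  { CompanyId: 1234, User: { UserId: 4567, FirstName: "Tester", Lastname: "McCtesterson" } },
  { CompanyId: 1234, User: { UserId: 4567, FirstName: "Tester", Lastname: "McCtesterson" } },
  { CompanyId: 1234, User: { UserId: 8910, FirstName: "Other", Lastname: "Person" } },
  { CompanyId: 9999, User: { UserId: 1, FirstName: "Else", Lastname: "Where" } },
];

// Hypothetical helper: mimics $match on CompanyId followed by
// $group with $addToSet on the embedded User document.
function uniqueUsersForCompany(allDocs, companyId) {
  const seen = new Map();
  for (const doc of allDocs) {
    if (doc.CompanyId !== companyId) continue; // the $match stage
    // Simplification: use the JSON text of the user as the uniqueness key.
    const key = JSON.stringify(doc.User);
    if (!seen.has(key)) seen.set(key, doc.User);
  }
  return { _id: companyId, Users: [...seen.values()] };
}

const result = uniqueUsersForCompany(docs, 1234);
console.log(result.Users.length); // 2
```

Note that two documents carrying the same UserId but different names would count as two distinct users under this rule, which is also how $addToSet on the whole User sub-document behaves.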


MongoDB - How to update a single object in the array of objects inside a document

I have the following document structure:
{
"Agencies": [
{
"name": "tcs",
"id": "1",
"AgencyUser": [
{
"UserName": "ABC",
"Code": "ABC40",
"Link": "http.ios.com",
"TotalDownloads": 0
},
{
"UserName": "xyz",
"Code": "xyz20",
"Link": "http.ios.com",
"TotalDownloads": 0
}
]
}
]
}
I have multiple agencies, and each agency contains a list of agents.
What I am trying to do is pass in the Code and update the TotalDownloads field of the agent that matches it.
For example, if someone uses the code ABC40, I need to update the TotalDownloads field of the agent called "ABC".
What I have tried is below:
public virtual async Task UpdateAgentUsersDownloadByCode(string Code)
{
var col = _db.GetCollection<Agencies>(Agencies.DocumentName);
FilterDefinition<Agencies> filter = Builders<Agencies>.Filter.Eq("AgencyUsers.Code", Code);
UpdateDefinition<Agencies> update = Builders<Agencies>.Update.Inc(x => x.AgencyUsers.FirstOrDefault().TotalDownloads, 1);
await col.UpdateOneAsync(filter, update);
}
It is giving me the following error:
Unable to determine the serialization information for x => x.AgencyUsers.FirstOrDefault().TotalDownloads.
Where am I going wrong?
Note: From the attached sample document, the array property name: AgencyUser is not matched with the property name that you specified in the update operation, AgencyUsers.
Use arrayFilters with $[<identifier>] positional filtered operator to update the element(s) in the array.
MongoDB syntax
db.Agencies.update({
"AgencyUsers.Code": "ABC40"
},
{
$inc: {
"AgencyUsers.$[agencyUser].TotalDownloads": 1
}
},
{
arrayFilters: [
{
"agencyUser.Code": "ABC40"
}
]
})
Demo @ Mongo Playground
MongoDB .NET Driver syntax
UpdateDefinition<Agencies> update = Builders<Agencies>.Update
.Inc("AgencyUsers.$[agencyUser].TotalDownloads", 1);
UpdateOptions updateOptions = new UpdateOptions
{
ArrayFilters = new[]
{
new BsonDocumentArrayFilterDefinition<Agencies>(
new BsonDocument("agencyUser.Code", Code)
)
}
};
UpdateResult result = await col.UpdateOneAsync(filter, update, updateOptions);
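As a rough illustration of what the filtered positional operator does, the following plain JavaScript sketch applies the same rule to an in-memory object. The function name incrementDownloadsByCode and the sample agency are assumptions for illustration only; the real update runs on the server.

```javascript
// Hypothetical in-memory document, using the AgencyUsers name from the update code.
const agency = {
  name: "tcs",
  AgencyUsers: [
    { UserName: "ABC", Code: "ABC40", TotalDownloads: 0 },
    { UserName: "xyz", Code: "xyz20", TotalDownloads: 0 },
  ],
};

// Mimics $inc on "AgencyUsers.$[agencyUser].TotalDownloads"
// with arrayFilters [{ "agencyUser.Code": code }].
function incrementDownloadsByCode(doc, code) {
  let modified = 0;
  for (const agencyUser of doc.AgencyUsers) {
    // Only elements that satisfy the array filter are touched.
    if (agencyUser.Code === code) {
      agencyUser.TotalDownloads += 1;
      modified += 1;
    }
  }
  return modified;
}

const modifiedCount = incrementDownloadsByCode(agency, "ABC40");
console.log(modifiedCount, agency.AgencyUsers[0].TotalDownloads); // 1 1
```

Unlike the plain positional operator $, which updates only the first matching element, the filtered form updates every element that satisfies the filter, which is what the loop above reproduces.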

How to check only top 1 in nested mongo array

At the moment my notification documents have an events property, which is an array of events. Each event has a status and a date. When querying notifications, I need to check whether the most recent status is opened.
Valid object where most recent event status is opened -
{
"subject" : "Hello there",
"events" : [
{
"status" : "opened",
"date" : 2020-01-02 17:35:31.229Z
},
{
"status" : "clicked",
"date" : 2020-01-01 17:35:31.229Z
},
]
}
Invalid object where status isn't most recent
{
"subject" : "Hello there",
"events" : [
{
"status" : "opened",
"date" : 2020-01-01 17:35:31.229Z
},
{
"status" : "clicked",
"date" : 2020-01-02 17:35:31.229Z
},
]
}
At the moment I have the query that can check if any event has the status opened, but I'm unsure how to query only the top 1 and sorted by the dates of a nested query. Any help would be greatly appreciated.
var filter = Builders<Notification>.Filter.Empty;
filter &= Builders<Notification>.Filter.Regex("events.event", new BsonRegularExpression(searchString, "i"));
var results = await collection.FindSync(filter, findOptions).ToListAsync();
In order to get only the latest event you can use $reduce to iterate over the events and compare each one to the temporarily latest:
db.collection.aggregate([
{
$addFields: {
latestEvent: {
$reduce: {
input: "$events",
initialValue: {status: null, date: 0},
in: {
$mergeObjects: [
"$$value",
{
$cond: [
{
$gt: [{$toDate: "$$this.date"}, {$toDate: "$$value.date"}]
},
"$$this",
"$$value"
]
}
]
}
}
}
}
}
])
See how it works on the playground example
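The same idea can be sketched outside the database with Array.prototype.reduce, which plays the role of $reduce: the accumulator holds the latest event seen so far. The sample notifications, the ISO date strings, and the helper names latestEvent and latestIsOpened are assumptions for illustration.

```javascript
// Hypothetical sample data modelled on the two documents in the question.
const valid = {
  subject: "Hello there",
  events: [
    { status: "opened", date: "2020-01-02T17:35:31.229Z" },
    { status: "clicked", date: "2020-01-01T17:35:31.229Z" },
  ],
};
const invalid = {
  subject: "Hello there",
  events: [
    { status: "opened", date: "2020-01-01T17:35:31.229Z" },
    { status: "clicked", date: "2020-01-02T17:35:31.229Z" },
  ],
};

// Walk the events and keep whichever has the later date,
// starting from a placeholder with no status and the epoch as date.
function latestEvent(events) {
  return events.reduce(
    (value, current) => (new Date(current.date) > new Date(value.date) ? current : value),
    { status: null, date: 0 }
  );
}

function latestIsOpened(notification) {
  return latestEvent(notification.events).status === "opened";
}

console.log(latestIsOpened(valid)); // true
console.log(latestIsOpened(invalid)); // false
```

In the pipeline, the equivalent of latestIsOpened would be a $match on latestEvent.status placed after the $addFields stage.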
For multiple documents, the following returns only the matching documents.
Example:
db.collection.aggregate([{
$addFields: {
lastevent: {
$filter: {
input: '$events',
as: 'element',
cond: {$eq: ['$$element.date',{$max: '$events.date'}]}
}
}
}
}, {
$match: {
'lastevent.status': 'opened'
}
}])
I am a fan of not using an axe for everything, even if it is a good one :)
I take it that events being out of order is rare, so we don't need to spend a lot of resources weeding those out up front, as they will be few.
So my take is to fetch all the opened ones and use simple .NET iteration to remove the few that are out of order, leaving a tidy and easily maintainable method.
public List<Notification> GetValidSubjectStatusList(IMongoCollection<Notification> mongoCollection)
{
    var builder = Builders<Notification>.Filter;
    // Query the candidates whose first event has the status "opened".
    var filter = builder.Eq(x => x.Events.FirstOrDefault().Status, "opened");
    var listOf = mongoCollection.Find(filter).ToList();
    var reducedList = new List<Notification>();
    foreach (var hit in listOf)
    {
        // Keep the hit only if its first event is also its most recent one.
        // Compare dates with dates (the original compared a date with an event).
        if (hit.Events.Any()
            && hit.Events.First()
                .Date.Equals(hit.Events
                    .OrderByDescending(x => x.Date)
                    .First()
                    .Date))
        {
            reducedList.Add(hit);
        }
    }
    return reducedList;
}

C# MongoDB Driver: Can't find the way to run complex query for AnyIn filter in MongoDB

I have a document like this:
{
"id": "xxxxxxxxxxxx",
"groupsAuthorized": [
"USA/California/SF",
"France/IDF/Paris"
]
}
And I have a user who has a list of authorized groups, for example:
"groups": [
"France/IDF",
"USA/NY/NYC"
]
What I'm trying to achieve is to retrieve all documents in the database that the user is authorized to retrieve. Essentially, I want to check whether any string in the document's "groupsAuthorized" list contains, as a substring, any element of the "groups" list in my user's authorizations.
Using the following values:
my document:
{
"id": "xxxxxxxxxxxx",
"groupsAuthorized": [
"USA/California/SF",
"France/IDF/Paris"
]
}
my user permissions:
"groups": [
"France/IDF",
"USA/NY/NYC"
]
The user should be able to retrieve this document, as the string "France/IDF" is contained in the string "France/IDF/Paris". However, if the values had been like this:
my document:
{
"id": "xxxxxxxxxxxx",
"groupsAuthorized": [
"USA/California/SF",
"France/IDF"
]
}
my user permissions:
"groups": [
"France/IDF/Paris",
"USA/NY/NYC"
]
it should not work, because my user is only authorized to view documents from France/IDF/Paris and USA/NY/NYC, and none of the strings in the authorizedGroups of my document contains either of those sequences.
I've tried to use a standard LINQ query to achieve this which is fairly simple:
var userAuthorizedGroups = new List<string> { "France/IDF/Paris", "USA/NY/NYC" };
var results = collection.AsQueryable()
.Where(entity => userAuthorizedGroups
.Any(userGroup => entity.authorizedGroups
.Any(entityAuthorizedGroup => entityAuthorizedGroup.Contains(userGroup))));
But I'm getting the well-known unsupported filter exception that a lot of people seem to run into. I've tried different options found on the internet, like the following:
var userAuthorizedGroups = new List<string> { "France/IDF/Paris", "USA/NY/NYC" };
var filter = Builders<PartitionedEntity<Passport>>.Filter.AnyIn(i => i.authorizedGroups, userAuthorizedGroups);
var results = (await collection.FindAsync(filter)).ToList();
return results;
But the problem is that this only checks whether an element of one array is equal to an element of the other. It will not work for a case like "France/IDF", which should match "France/IDF/Paris" because the string "France/IDF" is contained in the string "France/IDF/Paris" in my document.
I'm getting a bit clueless about how to achieve this using the MongoDB C# driver. I'm starting to think that I should just pull all documents to the client and do the filtering manually, but that would be quite messy.
Does anyone have an idea on this subject?
"I'm starting to think that I should just pull all documents to client and do the filtering manually but that would be quite messy"
Don't do it :)
One place to start is the list of LINQ operators that are supported by the MongoDB .NET driver. .Contains() isn't mentioned there, which means you can't use it here and you'll get an error at runtime, but that does not mean there's no way to do what you're trying to achieve.
The operator closest to contains you can use is $indexOfBytes which returns -1 if there's no match and the position of a substring otherwise. Also since you need to match an array against another array you need two pairs of $map and $anyElementTrue to do exactly what .NET's .Any does.
Your query (MongoDB client) can look like this:
db.collection.find({
$expr: {
$anyElementTrue: {
$map: {
input: "$groupsAuthorized",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: ["France/IDF/Paris", "USA/NY/NYC"],
as: "userGroup",
in: { $ne: [ -1, { $indexOfBytes: [ "$$group", "$$userGroup" ] } ] }
}
}
}
}
}
}
})
Mongo Playground
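For reference, the matching rule described in the question (a document matches when any of its groups contains any of the user's groups as a substring) can be written as two nested some() calls, which is the role the two $map / $anyElementTrue pairs play in the query. This is only an in-memory sketch; the function name isAuthorized is an assumption for illustration.

```javascript
// Returns true when at least one document group contains at least one user group.
function isAuthorized(doc, userGroups) {
  return doc.groupsAuthorized.some((group) =>
    userGroups.some((userGroup) => group.includes(userGroup))
  );
}

// First scenario from the question: "France/IDF" is inside "France/IDF/Paris".
const docA = { groupsAuthorized: ["USA/California/SF", "France/IDF/Paris"] };
console.log(isAuthorized(docA, ["France/IDF", "USA/NY/NYC"])); // true

// Second scenario: the user's groups are more specific than the document's.
const docB = { groupsAuthorized: ["USA/California/SF", "France/IDF"] };
console.log(isAuthorized(docB, ["France/IDF/Paris", "USA/NY/NYC"])); // false
```

Because this is a plain substring test, a user group such as "France/IDF" would also match a document group like "Old/France/IDF". If the groups are meant to be hierarchical paths, a prefix check may be closer to the intent.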
You can run the same query from .NET using BsonDocument class which takes a string (JSON) and converts into a query:
var query = BsonDocument.Parse(@"{
$expr: {
$anyElementTrue:
{
$map:
{
input: '$groupsAuthorized',
as: 'group',
in: {
$anyElementTrue:
{
$map:
{
input: ['France/IDF/Paris', 'USA/NY/NYC'],
as: 'userGroup',
in: { $ne: [-1, { $indexOfBytes: ['$$group', '$$userGroup'] } ] }
}
}
}
}
}
}
}");
var result = col.Find(query).ToList();

MongoDB C# 2.0 upserting sub item in collection [duplicate]

I have documents that look something like this, with a unique index on bars.name:
{ name: 'foo', bars: [ { name: 'qux', somefield: 1 } ] }
I want to either update the sub-document where { name: 'foo', 'bars.name': 'qux' } with $set: { 'bars.$.somefield': 2 }, or create a new sub-document { name: 'qux', somefield: 2 } under { name: 'foo' }.
Is it possible to do this using a single query with upsert, or will I have to issue two separate ones?
Related: 'upsert' in an embedded document (suggests to change the schema to have the sub-document identifier as the key, but this is from two years ago and I'm wondering if there are better solutions now.)
No, there isn't really a better solution to this, so perhaps an explanation will help.
Suppose you have a document in place that has the structure as you show:
{
"name": "foo",
"bars": [{
"name": "qux",
"somefield": 1
}]
}
If you do an update like this
db.foo.update(
{ "name": "foo", "bars.name": "qux" },
{ "$set": { "bars.$.somefield": 2 } },
{ "upsert": true }
)
Then all is fine because matching document was found. But if you change the value of "bars.name":
db.foo.update(
{ "name": "foo", "bars.name": "xyz" },
{ "$set": { "bars.$.somefield": 2 } },
{ "upsert": true }
)
Then you will get a failure. The only thing that has really changed here is that in MongoDB 2.6 and above the error is a little more succinct:
WriteResult({
"nMatched" : 0,
"nUpserted" : 0,
"nModified" : 0,
"writeError" : {
"code" : 16836,
"errmsg" : "The positional operator did not find the match needed from the query. Unexpanded update: bars.$.somefield"
}
})
That is better in some ways, but you really do not want to "upsert" anyway. What you want to do is add the element to the array where the "name" does not currently exist.
So what you really want is the "result" from the update attempt without the "upsert" flag to see if any documents were affected:
db.foo.update(
{ "name": "foo", "bars.name": "xyz" },
{ "$set": { "bars.$.somefield": 2 } }
)
Yielding in response:
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })
So when the modified documents are 0 then you know you want to issue the following update:
db.foo.update(
{ "name": "foo" },
{ "$push": { "bars": {
"name": "xyz",
"somefield": 2
}}}
)
There really is no other way to do exactly what you want. As the additions to the array are not strictly a "set" type of operation, you cannot use $addToSet combined with the "bulk update" functionality there, so that you can "cascade" your update requests.
In this case it seems like you need to check the result, or otherwise accept reading the whole document and checking whether to update or insert a new array element in code.
If you don't mind changing the schema a bit, you can use a structure like this:
{
    "name": "foo",
    "bars": {
        "qux": { "somefield": 1 },
        "xyz": { "somefield": 2 }
    }
}
You can perform your operations in one go.
Reiterating 'upsert' in an embedded document for completeness
I was digging for the same feature and found that, in version 4.2 or above, MongoDB provides a feature called update with an aggregation pipeline.
Combined with some other techniques, this feature makes it possible to achieve an upsert of a sub-document with a single query.
It's a very verbose query, but I believe it's viable if you know you won't have too many records in the sub-collection. Here's an example of how to achieve this:
const documentQuery = { _id: '123' }
const subDocumentToUpsert = { name: 'xyz', id: '1' }
collection.update(documentQuery, [
{
$set: {
sub_documents: {
$cond: {
if: { $not: ['$sub_documents'] },
then: [subDocumentToUpsert],
else: {
$cond: {
if: { $in: [subDocumentToUpsert.id, '$sub_documents.id'] },
then: {
$map: {
input: '$sub_documents',
as: 'sub_document',
in: {
$cond: {
if: { $eq: ['$$sub_document.id', subDocumentToUpsert.id] },
then: subDocumentToUpsert,
else: '$$sub_document',
},
},
},
},
else: { $concatArrays: ['$sub_documents', [subDocumentToUpsert]] },
},
},
},
},
},
},
])
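The nested $cond expressions above are easier to follow when written as ordinary branching. The sketch below mirrors the three cases on a plain array; the function name upsertSubDocument is an assumption for illustration, and nothing here runs against a database.

```javascript
// Mirrors the pipeline: create the array, replace the matching element, or append.
function upsertSubDocument(subDocuments, toUpsert) {
  // Outer $cond: the field is missing (or null), so start a new array.
  if (!subDocuments) return [toUpsert];
  // Inner $cond with $in: an element with the same id already exists.
  if (subDocuments.some((d) => d.id === toUpsert.id)) {
    // $map: replace the matching element, keep the others untouched.
    return subDocuments.map((d) => (d.id === toUpsert.id ? toUpsert : d));
  }
  // $concatArrays: no match, so append the new element.
  return [...subDocuments, toUpsert];
}

const created = upsertSubDocument(undefined, { id: "1", name: "xyz" });
const replaced = upsertSubDocument([{ id: "1", name: "old" }], { id: "1", name: "xyz" });
const appended = upsertSubDocument([{ id: "2", name: "abc" }], { id: "1", name: "xyz" });
console.log(created.length, replaced[0].name, appended.length); // 1 xyz 2
```

As in the pipeline, the whole array is rewritten on every call, which is why the approach suits small sub-collections best.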
There's a way to do it in two queries - but it will still work in a bulkWrite.
This is relevant because in my case not being able to batch it is the biggest hangup. With this solution, you don't need to collect the result of the first query, which allows you to do bulk operations if you need to.
Here are the two successive queries to run for your example:
// Update subdocument if existing
collection.updateMany({
name: 'foo', 'bars.name': 'qux'
}, {
$set: {
'bars.$.somefield': 2
}
})
// Insert subdocument otherwise
collection.updateMany({
name: 'foo', 'bars.name': { $ne: 'qux' }
}, {
$push: {
bars: {
somefield: 2, name: 'qux'
}
}
})
This also has the added benefit of not having corrupted data / race conditions if multiple applications are writing to the database concurrently. You won't risk ending up with two bars: {somefield: 2, name: 'qux'} subdocuments in your document if two applications run the same queries at the same time.

How to mimic URI query

This may be too basic of a question for SO, but I thought I would ask anyway.
I am getting my feet wet with ElasticSearch and am trying to return a single document that has an exact match on my field of interest.
I have the field "StoryText" which is mapped as type "string" and indexed as "not_analyzed".
When I search using the basic URI query:
123.456.0.789:9200/stories/storyphrases/_search?q=StoryText:"The boy sat quietly"
I get back the exact matching document as I expected, with a single hit.
However, when I use the search functionality:
GET 123.456.0.789:9200/stories/storyphrases/_search
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"StoryText" : "The boy sat quietly"
}
}
}
}
}
I get multiple documents returned with many hits (i.e. "The boy sat loudly", "The boy stood quietly" etc. etc.)
Could somebody help me to understand how I need to restructure my search request to mimic the result I get using the basic query parameter?
At present I am using NEST in C# to generate my search request which looks like this
var searchresults = client.Search<stories>(p => p
.Query(q => q
.Filtered(r => r
.Filter(s => s.Term("StoryText", inputtext))
)
)
);
Thanks very much for any and all reads and or thoughts!
UPDATE: Mappings are listed below
GET /stories/storyphrases/_mappings
{
"stories": {
"mappings": {
"storyphrases": {
"dynamic": "strict",
"properties": {
"@timestamp": {
"type": "date",
"format": "date_optional_time"
},
"@version": {
"type": "string"
},
"SubjectCode": {
"type": "string"
},
"VerbCode": {
"type": "string"
},
"LocationCode": {
"type": "string"
},
"BookCode": {
"type": "string"
},
"Route": {
"type": "string"
},
"StoryText": {
"type": "string",
"index": "not_analyzed"
},
"SourceType": {
"type": "string"
},
"host": {
"type": "string"
},
"message": {
"type": "string"
},
"path": {
"type": "string"
}
}
}
}
}
}
Mick
Well, first off you are executing two different queries here. The first is running in a query context whilst the second is essentially a match_all query executing in a filtered context. If your objective is simply to emulate the first query but by passing a JSON body you will need something like
GET 123.456.0.789:9200/stories/storyphrases/_search
{
"query" : {
"query_string" : {
"query" : "StoryText:\"The boy sat quietly\""
}
}
}
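When the phrase comes from user input, the request body can be assembled programmatically so that the phrase stays wrapped in double quotes, as in the URI query. The following is a minimal sketch; the helper name buildPhraseQueryBody is an assumption for illustration, and it only escapes double quotes and backslashes inside the phrase.

```javascript
// Builds a query_string body that searches one field for an exact phrase.
function buildPhraseQueryBody(field, text) {
  // Escape backslashes and double quotes so the phrase stays a single quoted term.
  const escaped = text.replace(/(["\\])/g, "\\$1");
  return { query: { query_string: { query: `${field}:"${escaped}"` } } };
}

const body = buildPhraseQueryBody("StoryText", "The boy sat quietly");
console.log(body.query.query_string.query); // StoryText:"The boy sat quietly"
```

Serialising the result with JSON.stringify produces the escaped quotes needed in the JSON request body.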
To write this simple query using Nest you would use
var searchresults = client.Search<stories>(p => p.QueryString("StoryText:" + inputtext));
or in longer form
var searchresults = client.Search<stories>(p => p
.Query(q => q
.QueryString(qs => qs
.Query("StoryText:" + inputtext)
)
)
);
These both produce the same JSON body and send it to the _search endpoint. Assuming that storyphrases is your Elasticsearch type then you may also wish to include this in your C#.
var searchresults = client.Search<stories>(p => p
.Index("stories")
.Type("storyphrases")
.Query(q => q
.QueryString(qs => qs
.Query("StoryText:" + inputtext)
)
)
);
Having said all that and looking at your filtered query it should do what you expect according to my testing. Is your field definitely not analyzed? Can you post your mapping?
