MongoDB Union and intersection in one call - c#

I want to be able to perform union and then intersection.
My document structure:
{
    "_id" : 1,
    "items" : [
        52711,
        201610,
        273342,
        279449,
        511250
    ]
},
{
    "_id" : 2,
    "items" : [
        246421,
        390200
    ]
}
The collection contains thousands of documents of the above form.
I want to perform a union over a set of documents and then perform an intersection on the sets returned from the union.
For example:
Set 1 contains Id: [1,2,3,4,5]
Set 2 Contains Id: [3,4,5,6,7,8]
Set 3 Contains Id: [12,14,15,16,17]
It should union the items of all documents within set 1, within set 2 and within set 3, and then intersect the three resulting unions.
So far, I have a query that performs the union of the lists, as follows:
db.getCollection('Test').aggregate([
    { "$match": { "_id": { "$in": [1, 2, 3] } } },
    {
        "$group": {
            "_id": 0,
            "data": { "$push": "$items" }
        }
    },
    {
        "$project": {
            "items": {
                "$reduce": {
                    "input": "$data",
                    "initialValue": [],
                    "in": { "$setUnion": ["$$value", "$$this"] }
                }
            }
        }
    }
])
I am currently doing all of this in C#:
var group = new BsonDocument
{
    { "_id", 0 },
    { "data", new BsonDocument { { "$push", "$items" } } }
};
var project = new BsonDocument
{
    { "items", new BsonDocument
        {
            { "$reduce", new BsonDocument
                {
                    { "input", "$data" },
                    { "initialValue", new BsonArray() },
                    { "in", new BsonDocument { { "$setUnion", new BsonArray { "$$value", "$$this" } } } }
                }
            }
        }
    }
};
var result = qaCollection.Aggregate()
    .Match(Builders<QAList>.Filter.In(x => x.Id, list))
    .Group(group)
    .Project(project)
    .FirstOrDefault();
This query takes some time because it can return a large amount of data. It would be really nice if I could pass multiple sets and have the server union each set separately and then intersect the unions, so that the returned data is not too big.
thanks in advance..

Answer based on the answer given to question 24824361:
There is no function to automatically do an intersection in MongoDB across several different documents. However, it is possible to calculate the intersection by taking this approach:
1. Note the number of documents you are intersecting.
2. Unwind the items array.
3. Count the occurrences of each item.
4. Match only those items whose occurrence count equals the number of documents from step 1.
So for example if you are taking the intersection of items in 3 documents, then you unwind the items, count the number of times each item comes up, and finish with just the items which come up 3 times.
This will only work if each document's items array has no duplicates, of course.
So for example, if the source data is like this:
db.test_unionintersection_stackoverflow_42686348.insert([
{ "_id" : 1,
"items" : [ 10, 20, 30, 40, 50 ]},
{ "_id" : 2,
"items" : [ 20, 30, 40, 50, 60, 70, 80 ]},
{ "_id" : 3,
"items" : [ 10, 40, 50, 60, 80 ]},
{ "_id" : 4,
"items" : [ 20, 30, 40, 70, 80 ]}
])
Then if you want the intersection of documents 1,2,3 (for example), you want the result [40, 50].
You can calculate it like this:
var document_ids = [1, 2, 3];
var number_documents = document_ids.length;
db.test_unionintersection_stackoverflow_42686348.aggregate([
    { "$match": { "_id": { "$in": document_ids } } },
    { "$unwind": "$items" },
    { "$project": { "_id": 0, "item": "$items" } },
    { "$group": { "_id": "$item", "count": { "$sum": 1 } } },
    { "$match": { "count": number_documents } },
    { "$group": { "_id": "intersection", "items": { "$push": "$_id" } } }
]);
which gives you the result:
{
"_id" : "intersection",
"items" : [
50.0,
40.0
]
}
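The question asks for a union within each set of ids followed by an intersection across those unions, which the pipeline above does not cover directly. The same counting idea can be sketched in plain JavaScript to show the intended logic; this is only an in-memory illustration, not a server-side pipeline. The names unionOfSet and intersectUnions are hypothetical, and the docs array is a copy of the sample data above.

```javascript
// Hypothetical in-memory copy of the sample collection used above.
const docs = [
  { _id: 1, items: [10, 20, 30, 40, 50] },
  { _id: 2, items: [20, 30, 40, 50, 60, 70, 80] },
  { _id: 3, items: [10, 40, 50, 60, 80] },
  { _id: 4, items: [20, 30, 40, 70, 80] },
];

// Union of the items arrays of every document whose _id is in idSet.
function unionOfSet(docs, idSet) {
  const union = new Set();
  for (const doc of docs) {
    if (idSet.includes(doc._id)) {
      for (const item of doc.items) union.add(item);
    }
  }
  return union;
}

// Intersection of the per-set unions: count in how many unions each item
// appears and keep only the items that appear in every one of them.
function intersectUnions(docs, idSets) {
  const counts = new Map();
  for (const idSet of idSets) {
    for (const item of unionOfSet(docs, idSet)) {
      counts.set(item, (counts.get(item) || 0) + 1);
    }
  }
  return [...counts]
    .filter(([, count]) => count === idSets.length)
    .map(([item]) => item)
    .sort((a, b) => a - b);
}

// Set A = documents 1 and 3, set B = documents 2 and 4.
const result = intersectUnions(docs, [[1, 3], [2, 4]]);
console.log(result); // [ 20, 30, 40, 50, 60, 80 ]
```

Because each union is a Set, an item is counted at most once per set, so the "no duplicates" caveat above no longer applies at the union level.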


Cosmos query to fetch only specific inner array items that meets specific condition

In cosmos DB the document structure is like this
[
{
"id": "1",
"Plants": [
{
"PlantId": 3,
"UniqueQualityId": "3_pe55d74fc5f92b11ab3fe"
},
{
"PlantId": 4,
"UniqueQualityId": "3_pe55d74fc5sdfmsdfklms"
},
{
"PlantId": 10,
"UniqueQualityId": "3_pe55d7akjdsj6ysdssdsd"
},
{
"PlantId": 12,
"UniqueQualityId": "5_fdffpe55d7akjdsj6ysds"
}
],
"CompletionTime": 36
},
{
"id": "2",
"Plants": [
{
"PlantId": 3,
"UniqueQualityId": "3_pe55d74fc5f92b11ab3fe"
},
{
"PlantId": 4,
"UniqueQualityId": "3_pe55d74fc5sdfmsdfklms"
},
{
"PlantId": 3,
"UniqueQualityId": "3_pe55d74fc5f92b11ab3fe"
},
{
"PlantId": 5,
"UniqueQualityId": "3_pe55d7akjdsj6ysdssdsd"
}
],
"CompletionTime": 36
},
{
"id": "2",
"Plants": [
{
"PlantId": 10,
"UniqueQualityId": "3_pe55d74fc5f92b11ab3fe"
},
{
"PlantId": 11,
"UniqueQualityId": "3_pe55d74fc5sdfmsdfklms"
}
],
"CompletionTime": 36
}
]
I need to get the collection of plants that meet a specific condition.
For example, if the query fetches Plants along with some parent data where PlantId in ("3","4"), the output I am expecting is:
[
{
"id": "1",
"Plants": [
{
"PlantId": 3,
"UniqueQualityId": "3_pe55d74fc5f92b11ab3fe"
},
{
"PlantId": 4,
"UniqueQualityId": "3_pe55d74fc5sdfmsdfklms"
}
],
"CompletionTime": 36
},
{
"id": "2",
"Plants": [
{
"PlantId": 3,
"UniqueQualityId": "3_pe55d74fc5f92b11ab3fe"
},
{
"PlantId": 4,
"UniqueQualityId": "3_pe55d74fc5sdfmsdfklms"
}
],
"CompletionTime": 36
}
]
Here the Plants array should contain only the items that meet the filter condition.
I have tried the following queries:
SELECT root["Plants"],root.id FROM root
WHERE EXISTS(select value plant FROM plant in root.Plants WHERE plant.PlantId in ("3","4"))
SELECT root.id,root.Plants FROM root where ARRAY_CONTAINS(c.Plants,{"PlantId": "3"},true)
If any of the plant items meets the condition, the query returns the entire Plants array instead of only the matching items.
Is there a way to return only the array items that meet the condition?
You can use the following query to get the output you want:
SELECT
    c.id,
    ARRAY(
        SELECT VALUE p
        FROM p IN c.Plants
        WHERE p.PlantId IN (3, 4)
    ) AS Plants,
    c.CompletionTime
FROM c
My personal preference, though, would be the query below, which does the same but creates a separate item for every plant. From the context I understand that you are looking for specific plants and want to capture some parent data in the result. In that case it makes sense to have separate results per plant.
SELECT
    c.id,
    p AS Plant,
    c.CompletionTime
FROM c
JOIN
(
    SELECT VALUE p
    FROM p IN c.Plants
    WHERE p.PlantId IN (3, 4)
) AS p
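To make the difference between the two result shapes concrete, here is a rough in-memory JavaScript sketch of what each query returns. It is not Cosmos DB code: the docs array is a trimmed, hypothetical version of the sample data (the second document is given the id "3" for clarity), and perDocument and perPlant are illustrative names.

```javascript
// Trimmed, hypothetical version of the sample documents.
const docs = [
  {
    id: "1",
    CompletionTime: 36,
    Plants: [
      { PlantId: 3, UniqueQualityId: "3_pe55d74fc5f92b11ab3fe" },
      { PlantId: 4, UniqueQualityId: "3_pe55d74fc5sdfmsdfklms" },
      { PlantId: 10, UniqueQualityId: "3_pe55d7akjdsj6ysdssdsd" },
    ],
  },
  {
    id: "3",
    CompletionTime: 36,
    Plants: [
      { PlantId: 10, UniqueQualityId: "3_pe55d74fc5f92b11ab3fe" },
      { PlantId: 11, UniqueQualityId: "3_pe55d74fc5sdfmsdfklms" },
    ],
  },
];
const wanted = [3, 4];

// Shape of the ARRAY(...) query: one result per document,
// with Plants reduced to the matching items.
const perDocument = docs.map((c) => ({
  id: c.id,
  Plants: c.Plants.filter((p) => wanted.includes(p.PlantId)),
  CompletionTime: c.CompletionTime,
}));

// Shape of the JOIN query: one result per matching plant.
const perPlant = docs.flatMap((c) =>
  c.Plants.filter((p) => wanted.includes(p.PlantId)).map((p) => ({
    id: c.id,
    Plant: p,
    CompletionTime: c.CompletionTime,
  }))
);

console.log(perDocument.length, perPlant.length); // 2 2
```

Note that in this sketch the per-document shape keeps document "3" with an empty Plants array, while the per-plant shape drops it. If documents without any matching plant should be excluded from the first query as well, a WHERE EXISTS(...) clause like the one in the question can be added to it.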

$[<identifier>] operator syntax in C# MongoDB Driver for nested array update in a Document

I need to update the Description field of a point (highlighted below) inside a nested document using the C# MongoDB driver. The update works when I run it as a shell query.
Document Structure
{
"ControllerID": "testcon",
"CCNID": "testccn",
"TableGroup": 1,
"Tables": [
{
"ID": 0,
"TableGroupID": 1,
"Blocks": [
{
"ID": 0,
"TableID": 0,
"TableGroupID": 1,
"ControllerID": "testcon",
"Points": [
{
"BlockID": 0,
"TableGroupID": 1,
"ControllerID": "testcon",
"TableID": 0,
"PointDefinitionID": 23,
"Name": "Test Point",
"Description": "Hgfhdhfhey You" <----------- This field needs to be updated
},
{
"BlockID": 0,
"TableGroupID": 1,
"ControllerID": "testcon",
"TableID": 0,
"PointDefinitionID": 24,
"Name": "Test Point",
"Description": "Hgfhdhfhey You"
}
]
}
]
}
]
}
I can successfully update the point Description using this query.
db.ControllerPointCollection.updateOne(
    { "_id": "HRDC_testccn_0_34_1" },
    {
        $set: {
            "Tables.$[t].Blocks.$[b].Points.$[p].Description": "Hey You"
        }
    },
    {
        arrayFilters: [
            { "t.ID": 0 },
            { "b.ID": 0 },
            { "p.PointDefinitionID": 23 }
        ]
    })
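For readers unfamiliar with the filtered positional operator, the effect of the update above can be sketched in plain JavaScript: each $[identifier] selects every element at its nesting level that matches the corresponding arrayFilters entry. This is only an in-memory illustration on a trimmed, hypothetical copy of the document; setPointDescription is an illustrative name, not a driver API.

```javascript
// Trimmed, hypothetical copy of the document shown above.
const doc = {
  Tables: [
    {
      ID: 0,
      Blocks: [
        {
          ID: 0,
          Points: [
            { PointDefinitionID: 23, Description: "Hgfhdhfhey You" },
            { PointDefinitionID: 24, Description: "Hgfhdhfhey You" },
          ],
        },
      ],
    },
  ],
};

// In-memory equivalent of
//   $set "Tables.$[t].Blocks.$[b].Points.$[p].Description"
// with arrayFilters on t.ID, b.ID and p.PointDefinitionID.
function setPointDescription(doc, tableId, blockId, pointDefinitionId, description) {
  let updated = 0;
  for (const t of doc.Tables) {
    if (t.ID !== tableId) continue; // filter for $[t]
    for (const b of t.Blocks) {
      if (b.ID !== blockId) continue; // filter for $[b]
      for (const p of b.Points) {
        if (p.PointDefinitionID !== pointDefinitionId) continue; // filter for $[p]
        p.Description = description;
        updated++;
      }
    }
  }
  return updated;
}

const updated = setPointDescription(doc, 0, 0, 23, "Hey You");
console.log(updated); // 1
```

Each nesting level has its own filter here, which is what the three separate identifiers express in the shell query.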
I tried using the following filter and update objects for the above operation:
var pointFilter = Builders<Point>.Filter.Eq(p => p.PointDefinitionID, point.PointDefinitionID);
var blockFilter = Builders<Block>.Filter.Eq(b => b.ID, point.BlockID) & Builders<Block>.Filter.ElemMatch(b => b.Points, pointFilter);
var tableFilter = Builders<Table>.Filter.Eq(t => t.ID, point.TableID) & Builders<Table>.Filter.ElemMatch(t => t.Blocks, blockFilter);
var filter = Builders<ControllerPointDataDoc>.Filter.Eq(c => c.ID, point.ControllerID) & Builders<ControllerPointDataDoc>.Filter.ElemMatch(c => c.Tables, tableFilter);
var updater = Builders<ControllerPointDataDoc>.Update.Set(c => c.Tables[-1].Blocks[-1].Points[-1].Description, "Hey You");
operationResult.Data = await ControllerPointDataCollection.UpdateOneAsync(filter, updater);
but I am getting the following error.
A write operation resulted in an error.\r\n Too many positional (i.e. '$') elements found in path 'Tables.$.Blocks.$.Points'

MongoDB in C# - what is the right way to perform multiple filtering

I have a MongoDB collection of Persons (person_Id, person_name, person_age)
I want to return an object that contains:
number_of_all_persons
number_of_all_persons with age < 20
number_of_all_persons with age >= 20 && age <= 40
number_of_all_persons with age > 40
What is the right way to do it in Mongo using C#?
Should I run 4 different Filters to achieve result?
You can use Aggregation.
const lowerRange = { $lt: ["$person_age", 20] };
const middleRange = { $and: [{ $gte: ["$person_age", 20] }, { $lte: ["$person_age", 40] }] };
// const upperRange = { $gt: ["$person_age", 40] };
db.range.aggregate([
    {
        $project: {
            ageRange: {
                $cond: {
                    if: lowerRange,
                    then: "lowerRange",
                    else: {
                        $cond: {
                            if: middleRange,
                            then: "middleRange",
                            else: "upperRange"
                        }
                    }
                }
            }
        }
    },
    {
        $group: {
            _id: "$ageRange",
            count: { $sum: 1 }
        }
    }
]);
The total count is not included here, because you can calculate it by summing the counts of the ranges. If you have more than 3 ranges, a better idea is to pass an array of $cond expressions to the project stage, as nested if/else statements become harder to maintain. Here is a sample:
$project: {
    "ageRange": {
        $concat: [
            { $cond: [ { $lt: ["$person_age", 20] }, "0-19", "" ] },
            { $cond: [ { $and: [ { $gte: ["$person_age", 20] }, { $lte: ["$person_age", 40] } ] }, "20-40", "" ] },
            { $cond: [ { $gt: ["$person_age", 40] }, "41+", "" ] }
        ]
    }
}
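The object asked for in the question (the total plus one count per range) can be assembled from the grouped counts. As a rough illustration of the bucketing logic only, here is an in-memory JavaScript sketch; the persons array is made-up sample data, and ageRange and countByRange are hypothetical helper names.

```javascript
// Made-up sample data; only person_age matters for the sketch.
const persons = [
  { person_age: 12 },
  { person_age: 19 },
  { person_age: 20 },
  { person_age: 33 },
  { person_age: 40 },
  { person_age: 41 },
  { person_age: 67 },
];

// Same three ranges as the $cond expressions above.
function ageRange(age) {
  if (age < 20) return "0-19";
  if (age <= 40) return "20-40";
  return "41+";
}

// One pass over the data: total plus one counter per range.
function countByRange(persons) {
  const counts = { total: 0, "0-19": 0, "20-40": 0, "41+": 0 };
  for (const person of persons) {
    counts.total++;
    counts[ageRange(person.person_age)]++;
  }
  return counts;
}

const counts = countByRange(persons);
console.log(counts); // { total: 7, '0-19': 2, '20-40': 3, '41+': 2 }
```

A single pass is enough because every person falls into exactly one range, which is also why the total can be derived from the range counts.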
You have several possible solutions here.
Filter with mongo
Filter with Linq
Looks like you are having an "and" filter here. You can do your filtering as described here: Link to filtering
Or, depending on how large your persons collection is, you could query all objects and filter them in memory with LINQ, using one predicate per range:
var middle = collection.FindAll().Where(x => x.age >= 20 && x.age <= 40).Count();
The above covers only the middle range and is not meant to be copied as-is, but I hope you get the idea behind it.
My approach would be to create a filter with the filter builder and pass it to the Find method call.
var highExamScoreFilter = Builders<BsonDocument>.Filter.ElemMatch<BsonValue>(
    "scores", new BsonDocument
    {
        { "type", "exam" },
        { "score", new BsonDocument { { "$gte", 95 } } }
    });
var highExamScores = collection.Find(highExamScoreFilter).ToList();
Hope this helps somehow.

Grouping Linq and Converting columns to rows

I have a list that currently returns something like this. The Att column can be anything, because a user can enter an Att and Value at any time.
var attr_vals = (from mpav in _db.MemberProductAttributeValues
select mpav);
Results
Id  Att      Value
1   Color    Blue
1   Size     30
1   Special  Slim
3   Color    Blue
4   Size     30
2   Special  Slim
2   Random   Foo Foo
The conversion I am looking for would be similar to this
Converted results
Id  Color  Size  Special  Random
1   Blue   30    Slim     null
2   null   null  Slim     Foo Foo
3   Blue   null  null     null
4   null   30    null     null
Class looks like this so far.
public class MyMappProducts
{
    public int? id { get; set; }
    public Dictionary<string, string> Attributes { get; set; }
    string GetAttribute(string aName)
    {
        return Attributes[aName];
    }
    void setAttribute(string aName, string aValue)
    {
        Attributes[aName] = aValue;
    }
}
Given that your list of attributes might change, creating a class with each attribute as a property would not be a good idea, because you would have to know all the attributes beforehand. Working with a dictionary is easier.
Here's a way of doing what you want (note that the attributes missing from a row are simply not present in its dictionary):
var list = new List<AttributeValue>
{
    new AttributeValue(1, "Color", "Blue"),
    new AttributeValue(1, "Size", "30"),
    new AttributeValue(1, "Special", "Slim"),
    new AttributeValue(3, "Color", "Blue"),
    new AttributeValue(4, "Size", "30"),
    new AttributeValue(2, "Special", "Slim"),
    new AttributeValue(2, "Random", "Foo Foo")
};
// First we group by the id, and then for each group (which is essentially a row now)
// we create a new MyMappProducts containing the id and its attributes.
var result = list.GroupBy(av => av.Id)
    .Select(g => new MyMappProducts
    {
        id = g.Key,
        Attributes = g.ToDictionary(av => av.Attribute, av => av.Value)
    })
    .ToList();
This results in (pretty printed):
[
{
"id": 1,
"Attributes": {
"Color": "Blue",
"Size": "30",
"Special": "Slim"
}
},
{
"id": 3,
"Attributes": {
"Color": "Blue"
}
},
{
"id": 4,
"Attributes": {
"Size": "30"
}
},
{
"id": 2,
"Attributes": {
"Special": "Slim",
"Random": "Foo Foo"
}
}
]
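If the fully pivoted table from the question is needed, with null for every attribute a row lacks, the column set has to be collected first. A minimal JavaScript sketch of that extra step follows; it is an illustration of the idea rather than a translation of the C# above, and pivot is a hypothetical helper name.

```javascript
// The sample rows from the question.
const rows = [
  { Id: 1, Att: "Color", Value: "Blue" },
  { Id: 1, Att: "Size", Value: "30" },
  { Id: 1, Att: "Special", Value: "Slim" },
  { Id: 3, Att: "Color", Value: "Blue" },
  { Id: 4, Att: "Size", Value: "30" },
  { Id: 2, Att: "Special", Value: "Slim" },
  { Id: 2, Att: "Random", Value: "Foo Foo" },
];

function pivot(rows) {
  // Every distinct attribute name becomes a column, in first-seen order.
  const columns = [...new Set(rows.map((r) => r.Att))];
  const byId = new Map();
  for (const { Id, Att, Value } of rows) {
    if (!byId.has(Id)) {
      // Start each row with null in every column.
      const empty = { Id };
      for (const column of columns) empty[column] = null;
      byId.set(Id, empty);
    }
    byId.get(Id)[Att] = Value; // a repeated (Id, Att) pair overwrites the earlier value
  }
  return [...byId.values()].sort((a, b) => a.Id - b.Id);
}

const table = pivot(rows);
console.log(table[1]);
// { Id: 2, Color: null, Size: null, Special: 'Slim', Random: 'Foo Foo' }
```

One difference worth knowing: the sketch silently overwrites a repeated attribute for the same id, whereas ToDictionary in the C# version throws on a duplicate key.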

Is it possible to project sub arrays and retrieve them as one array?

From the mongo sample let's say we have a collection like this;
{ "_id" : 1, "semester" : 1, "grades" : [ 70, 87, 90 ] }
{ "_id" : 2, "semester" : 1, "grades" : [ 90, 88, 92 ] }
{ "_id" : 3, "semester" : 1, "grades" : [ 85, 100, 90 ] }
{ "_id" : 4, "semester" : 2, "grades" : [ 79, 85, 80 ] }
{ "_id" : 5, "semester" : 2, "grades" : [ 88, 88, 92 ] }
{ "_id" : 6, "semester" : 2, "grades" : [ 95, 90, 96 ] }
and then the query is like this;
db.students.find( { semester: 1, grades: { $gte: 85 } },
{ "grades.$": 1 } )
which results to this ;
{ "_id" : 1, "grades" : [ 87 ] }
{ "_id" : 2, "grades" : [ 90 ] }
{ "_id" : 3, "grades" : [ 85 ] }
I would like to have a result like ;
{"grades": [87, 90, 85]}
on one array.
The C# code I implemented gives me an array of LogLists. The code is different from the above, but the operation is much the same:
var result = collectionServerName.Find(x => x.LogList.Any(p => p.Ip.Contains("192")))
.Project(Builders<ServerName>.Projection.Exclude("_id").Include("LogList"))
.ToList();
I have tried following code;
var result = collectionServerName.Find(x => x.LogList.Any(p => p.Ip.Contains("192"))).Project(t => t.LogList.SelectMany(k => k)).ToList();
but it gives me following compile error for SelectMany
The type arguments for method 'Enumerable.SelectMany<TSource, TResult>(IEnumerable<TSource>, Func<TSource, IEnumerable<TResult>>)' cannot be inferred from the usage. Try specifying the type arguments explicitly
If I use only Select ;
var result = collectionServerName.Find(x => x.LogList.Any(p => p.Ip.Contains("192"))).Project(t => t.LogList.Select(k => k)).ToList();
The result type is List<IEnumerable<Log>>, which is not what I want.
The reason I need a single array is that I have to paginate the results before retrieving them. I am using C#. Any help would be appreciated.
What about this one for c# ?
For all grades in one Enumerable
collection
    .AsQueryable()
    .SelectMany(x => x.grades);
For paging, just chain the Skip and Take extension methods, for example .Skip(10).Take(10).
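The flatten-then-page idea can be sketched in plain JavaScript on the sample data from the question. This is an in-memory illustration only; page is a hypothetical helper, and unlike the positional projection in the question, which returns the first matching grade per document, the sketch keeps every grade of 85 or above.

```javascript
// Semester and grades data copied from the question.
const students = [
  { _id: 1, semester: 1, grades: [70, 87, 90] },
  { _id: 2, semester: 1, grades: [90, 88, 92] },
  { _id: 3, semester: 1, grades: [85, 100, 90] },
  { _id: 4, semester: 2, grades: [79, 85, 80] },
];

// Flatten all matching grades of semester 1 into one array
// (the counterpart of SelectMany).
const grades = students
  .filter((s) => s.semester === 1)
  .flatMap((s) => s.grades.filter((g) => g >= 85));

// Counterpart of Skip/Take: pageNumber is 1-based.
function page(items, pageNumber, pageSize) {
  const start = (pageNumber - 1) * pageSize;
  return items.slice(start, start + pageSize);
}

console.log(grades);             // [ 87, 90, 90, 88, 92, 85, 100, 90 ]
console.log(page(grades, 2, 3)); // [ 88, 92, 85 ]
```

Paging only gives stable results if the flattened order is deterministic, so in a real query it is worth sorting before skipping and taking.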
