Summing large amounts of data on mongodb - c#

I'm looking for the most efficient way of performing summing queries against MongoDB.
Currently we insert documents that contain various information and a date time stamp of when the document was created.
We need to sum this data to be viewed in the following ways:
Documents by hour of the day 1-24
Documents by day of the month 1-28/31
Documents by month of the year 1-12
Documents by year
This summed data will be accessed often, and we're afraid that MongoDB will have problems summing such a massive amount of data on every request.
We thought that, when a document is inserted, we could also increment counts in another document that holds these totals. This way, we can quickly pull the counts without summing the data on each request. Our concern is that this may not be the most efficient way to perform this type of operation in MongoDB.
Any thoughts on the best way to accomplish this? My dev team and I are new to MongoDB and we want to make sure we don't fall into a performance trap with summing large sets of data.

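The Aggregation Framework is perfectly suited for this type of query.
I've put together some examples for you below. Note that they use the older group() shell helper, which was deprecated in MongoDB 3.4 and removed in 4.2.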
To start, let's populate some documents:
db.myDocumentCollection.insert({"date" : new Date('01/01/2012'), "topic" : "My Title 1"});
db.myDocumentCollection.insert({"date" : new Date('01/02/2012'), "topic" : "My Title 2"});
db.myDocumentCollection.insert({"date" : new Date('01/02/2012'), "topic" : "My Title 3"});
db.myDocumentCollection.insert({"date" : new Date('01/02/2012'), "topic" : "My Title 4"});
db.myDocumentCollection.insert({"date" : new Date('01/04/2012'), "topic" : "My Title 5"});
db.myDocumentCollection.insert({"date" : new Date('01/05/2012'), "topic" : "My Title 6"});
db.myDocumentCollection.insert({"date" : new Date('01/07/2013'), "topic" : "My Title 7"});
db.myDocumentCollection.insert({"date" : new Date('01/07/2013'), "topic" : "My Title 8"});
db.myDocumentCollection.insert({"date" : new Date('02/07/2013'), "topic" : "My Title 9"});
db.myDocumentCollection.insert({"date" : new Date('02/08/2013'), "topic" : "My Title 10"});
Return number of documents grouped by full date
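db.myDocumentCollection.group(
{
    $keyf : function(doc) {
        // Build a day/month/year key; getMonth() is zero-based, so January appears as 0
        return { "date" : doc.date.getDate()+"/"+doc.date.getMonth()+"/"+doc.date.getFullYear() };
    },
    initial: {count:0},
    // Called once per document; prev is the running result for that key
    reduce: function(obj, prev) { prev.count++; }
})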
Output
[
{
"date" : "1/0/2012",
"count" : 1
},
{
"date" : "2/0/2012",
"count" : 3
},
{
"date" : "4/0/2012",
"count" : 1
},
{
"date" : "5/0/2012",
"count" : 1
},
{
"date" : "7/0/2013",
"count" : 2
},
{
"date" : "7/1/2013",
"count" : 1
},
{
"date" : "8/1/2013",
"count" : 1
}
]
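For comparison, the same per-day count can be written as an aggregation pipeline. This is a minimal sketch rather than the answer's own code: the function name buildDailyCountPipeline is hypothetical, and the stages are built as plain objects that would be passed to db.myDocumentCollection.aggregate(...) in the shell. Unlike getMonth(), the $month operator returns 1 to 12, and the date operators read the stored date in UTC.

```javascript
// Builds the pipeline stages as plain objects (hypothetical helper name).
// In the mongo shell the result would be passed to
// db.myDocumentCollection.aggregate(buildDailyCountPipeline()).
function buildDailyCountPipeline(since) {
  const stages = [];

  // Optional filter: keep only documents dated on or after `since`.
  if (since) {
    stages.push({ $match: { date: { $gte: since } } });
  }

  // One bucket per calendar day, counting the documents in each bucket.
  stages.push({
    $group: {
      _id: {
        year: { $year: "$date" },
        month: { $month: "$date" },      // 1-12, unlike JavaScript's getMonth()
        day: { $dayOfMonth: "$date" }
      },
      count: { $sum: 1 }
    }
  });

  // Return the buckets in chronological order.
  stages.push({ $sort: { "_id.year": 1, "_id.month": 1, "_id.day": 1 } });
  return stages;
}

console.log(JSON.stringify(buildDailyCountPipeline(), null, 2));
```

Dropping day from the _id gives counts per month, and adding { $hour: "$date" } gives counts per hour.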
Return number of documents grouped by day of month for the year 2013
This is perhaps a little more relevant for the kinds of queries you want to do.
Here, we use cond to group only documents dated on or after 1/1/2013.
You could combine $gte and $lte here to select a date range.
db.myDocumentCollection.group(
{
    $keyf : function(doc) {
        return { "date" : doc.date.getDate()+"/"+doc.date.getMonth()};
    },
    cond: {"date" : {"$gte": new Date('01/01/2013')}},
    initial: {count:0},
    reduce: function(obj, prev) { prev.count++; }
})
Output
[
{
"date" : "7/0",
"count" : 2
},
{
"date" : "7/1",
"count" : 1
},
{
"date" : "8/1",
"count" : 1
}
]
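The question's other idea, incrementing pre-computed counters at insert time, could be sketched as follows. This is an illustration under assumptions, not something the answer prescribes: the helper name counterUpdates, the counters collection, and the _id format are all hypothetical, and the buckets are absolute periods in UTC (a given hour of a given day) rather than "hour of day across all days".

```javascript
// Zero-pad a number to two digits, e.g. 7 -> "07".
function pad2(n) {
  return String(n).padStart(2, "0");
}

// For one inserted document's timestamp, return the filter/update pairs
// that bump the year, month, day and hour counters (hypothetical schema).
function counterUpdates(date) {
  const y = date.getUTCFullYear();
  const m = pad2(date.getUTCMonth() + 1); // getUTCMonth() is zero-based
  const d = pad2(date.getUTCDate());
  const h = pad2(date.getUTCHours());

  const ids = [
    `year:${y}`,
    `month:${y}-${m}`,
    `day:${y}-${m}-${d}`,
    `hour:${y}-${m}-${d}T${h}`
  ];

  // Each pair would be applied with an upsert so missing counters are created.
  return ids.map(id => ({ filter: { _id: id }, update: { $inc: { count: 1 } } }));
}

console.log(counterUpdates(new Date(Date.UTC(2013, 1, 7, 13, 5))));
```

In the shell, each pair could be applied with db.counters.updateOne(filter, update, { upsert: true }). Reading a count is then a single lookup by _id, at the cost of four extra small writes per insert.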

Related

Firebase REST API: How to use OrderBy() and StartAt() using the last key id

I'm using the Firebase Realtime Database REST API from C# in my Android application. When the page opens, the application fetches 10 items ordered by view count using .OrderBy("Views").LimitToLast(10). That part works correctly. I also store the last id in a public string, because I fetch another 10 items when the user scrolls down the list view. But when I request the next 10 with .StartAt(LastId).LimitToLast(10), Firebase returns no data, because the returned keys are not sorted correctly when I use .OrderBy("Views"), so I can't determine the correct last id to start from. I found that OrderByKey() orders the keys, which lets me get the correct last id and fetch the next 10 items starting there, but I want to fetch the data ordered by Views.
Summary: how can I get 10 items from the Firebase database ordered by "Views", and then get the next 10 items starting at the last id of the first 10?
My JSON structure:
{
"Main" : {
"News" : {
"Categories" : {
"Education" : {
"-Mn7-ZxkUPO01ifhtpEn" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 5
},
"-Mn7-ZxkUPO01ifhtp11En" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 5
},
"-Mn7-ZxkUPO0112ifhtpEn" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 12
},
"-Mn7-ZxkUPO01dxifhtpEn" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 545
},
"-Mn7-ZxkUPO01sdifhtpEn" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 5
},
"-Mn7-ZxkUPO01ifddhtpEn" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 200
},
"-Mn7-ZxkUPO01ifdshtpEn" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 1
},
"-Mn7-ZxkUPO01ifhtdxpEn" : {
"Text" : "some text",
"Title" : "some title",
"Views" : 223
}
}
}
}
}
}
Thanks in advance
And I thank Frank van Puffelen because he helped me a lot :)
As I said in my comments on your previous question: the Firebase Realtime Database REST API will return the correct, filtered nodes, but they will not be in any defined order - as the order of the keys in a JSON object is undefined by definition. There is nothing you can do to change that.
What you'll need to do is re-order the results in your application code, and then determine the first/last one. If you do that for the nodes you shared, you'll see that the first node has Views 33 and key -Mn7-ZxkUPO01iddfh2tpEn.
To get the previous page, you'd use:
https://stackoverflow.firebaseio.com/70102492/Main/News/Categories/Education.json?print=pretty&orderBy=%22Views%22&limitToLast=10&endAt=33,%22-Mn7-ZxkUPO01iddfh2tpEn%22
So:
We use endAt=33,"-Mn7-ZxkUPO01iddfh2tpEn", which means we end at the first item of the next page. We include both the value of Views and the key in this parameter, since there may be multiple nodes with Views equal to 33, and in that case the database will use the key to select the correct node to end at.
You might want to request 11 items instead of 10, given that you've already shown the -Mn7-ZxkUPO01iddfh2tpEn node.
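The URL above can be assembled from its parts. The following is a small sketch in JavaScript, assuming the same "value,key" form of endAt that the answer uses; the function name previousPageUrl and the example.firebaseio.com host are placeholders, not part of any Firebase library.

```javascript
// Builds a REST URL for the page that ends at a known (value, key) pair.
// baseUrl and path are placeholders for your own database URL and node path.
function previousPageUrl(baseUrl, path, orderChild, pageSize, endValue, endKey) {
  // String arguments must be sent as JSON strings, i.e. wrapped in quotes,
  // which encodeURIComponent turns into %22...%22.
  const quote = s => encodeURIComponent(JSON.stringify(s));

  return `${baseUrl}/${path}.json` +
    `?orderBy=${quote(orderChild)}` +
    `&limitToLast=${pageSize}` +
    `&endAt=${endValue},${quote(endKey)}`;
}

console.log(previousPageUrl(
  "https://example.firebaseio.com",
  "Main/News/Categories/Education",
  "Views",
  11, // one extra item, since the end node was already shown
  33,
  "-Mn7-ZxkUPO01iddfh2tpEn"
));
```

Asking for 11 items and discarding the one whose key equals endKey avoids showing that node twice.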

Firebase Database Rest Api: How to get the next page when use LimitToLast()

I have asked several questions about using LimitToLast() and EndAt() to paginate backward from the end of the data. I am trying to fetch the 10 most viewed items using OrderBy("Views") and then get the next 10 when the user scrolls down.
Fetching the first 10 items ordered by Views works correctly. My problem is how to determine the correct node to use when requesting the previous page (the next 10 items), so that it works even when several nodes have the same value of Views.
As I understand it, we have to take the node with the lowest value of Views, store its value and key, and use them when fetching the next 10 items like this:
OrderBy("Views").LimitToLast(11).EndAt(LastValue,LastId)
But I always get incorrect and often duplicated results, especially when several nodes have the same value of Views.
My JSON structure (30 nodes):
{
"-Mps4qWpgU-L5E3OfAMD" : {
"Text" : 1,
"Views" : 20
},
"-Mps4qoSPhzh3Hzu1i2y" : {
"Text" : 2,
"Views" : 20
},
"-Mps4qxhzwbQ80snD3jF" : {
"Text" : 3,
"Views" : 20
},
"-Mps4r60gkcsFjI33d-q" : {
"Text" : 4,
"Views" : 20
},
"-Mps4rF4Ku1X1FIyR0M8" : {
"Text" : 5,
"Views" : 20
}
More...
}
My C# Code:
public class PostData
{
    public int Views { get; set; }
    public string Text { get; set; }
}

// Get the first 10
var Data = await firebaseclient
    .Child("Education/")
    .OrderBy("Views")
    .LimitToLast(10)
    .OnceAsync<PostData>();

// Order the data by descending Views, then by descending key
var sortedData = Data.OrderByDescending(x => x.Object.Views).ThenByDescending(x => x.Key);

// Save the last id (lowest value of Views) and its value of Views
string LastID = sortedData.Last().Key;
int LastValue = sortedData.Last().Object.Views;

// Get more (previous page); reuse the variables instead of redeclaring them
Data = await firebaseclient
    .Child("Education/")
    .OrderBy("Views")
    .EndAt(LastValue, LastID)
    .LimitToLast(11)
    .OnceAsync<PostData>();

// Order the data by descending Views, then by descending key
sortedData = Data.OrderByDescending(x => x.Object.Views).ThenByDescending(x => x.Key);

// Update the last id and its value of Views
LastID = sortedData.Last().Key;
LastValue = sortedData.Last().Object.Views;
Thanks in advance and sorry for the long post :)
This quick jsbin shows that the correct order for the nodes is:
"-Mps4qWpgU-L5E3OfAMD": {"Text":1,"Views":20},
"-Mps4qoSPhzh3Hzu1i2y": {"Text":2,"Views":20},
"-Mps4qxhzwbQ80snD3jF": {"Text":3,"Views":20},
"-Mps4r60gkcsFjI33d-q": {"Text":4,"Views":20},
"-Mps4rF4Ku1X1FIyR0M8": {"Text":5,"Views":20},
"-Mps4rOgbcEeIZ-Kyi0I": {"Text":6,"Views":20},
"-Mps4rYAe5dIPytLvk4E": {"Text":7,"Views":20},
"-Mps4rgdzKlylkoeOTGc": {"Text":8,"Views":20},
"-Mps4rpqUdlVdfsatirv": {"Text":9,"Views":20},
"-Mps4rzCa-YS60CqXzfy": {"Text":10,"Views":20},
"-Mps4s7am1ZCuLtBp0nk": {"Text":11,"Views":20},
"-Mps4sGxBOgFsTLFSqMk": {"Text":12,"Views":20},
"-Mps4sQ9CLYNFQ4Uynhb": {"Text":13,"Views":20},
"-Mps4sZY5egXpf5_CmFi": {"Text":14,"Views":20},
"-Mps4shiJxx2dQVwen-Y": {"Text":15,"Views":20},
"-Mps4srZmVvPvImOf48D": {"Text":16,"Views":20},
"-Mps4tQpS0qYHlwHd8jJ": {"Text":18,"Views":20},
"-Mps4tZujfdt-q1EnjGq": {"Text":19,"Views":20},
"-Mps4thzzjBtFaRyRcxu": {"Text":20,"Views":20},
"-Mps4tr7SE6t7qZc2513": {"Text":21,"Views":20},
"-Mps4u0-6dOElpUhjzYY": {"Text":22,"Views":20},
"-Mps4u9G_112glr6ijQl": {"Text":23,"Views":20},
"-Mps4uJ-j75wAdmth6lb": {"Text":24,"Views":20},
"-Mps4uS8BCBb7FASfZJ5": {"Text":25,"Views":20},
"-Mps4uaMoWvGEvKNCcVc": {"Text":26,"Views":20},
"-Mps4ujkjYW5CeqSFPqU": {"Text":27,"Views":20},
"-Mps4usz_FyTYx49oz6e": {"Text":28,"Views":20},
"-Mps4v1HhVdmlOOZhYs_": {"Text":29,"Views":20},
"-Mps4vB9i4F0irpPeRgz": {"Text":30,"Views":20},
"-Mps4tH5mnQlklXP5o3I": {"Text":17,"Views":50}
If we order by Views and take the last 10 nodes with this query: orderBy="Views"&limitToLast=10 we get these results:
{
"-Mps4tH5mnQlklXP5o3I" : { "Text" : 17, "Views" : 50 },
"-Mps4u0-6dOElpUhjzYY" : { "Text" : 22, "Views" : 20 },
"-Mps4u9G_112glr6ijQl" : { "Text" : 23, "Views" : 20 },
"-Mps4uJ-j75wAdmth6lb" : { "Text" : 24, "Views" : 20 },
"-Mps4uS8BCBb7FASfZJ5" : { "Text" : 25, "Views" : 20 },
"-Mps4uaMoWvGEvKNCcVc" : { "Text" : 26, "Views" : 20 },
"-Mps4ujkjYW5CeqSFPqU" : { "Text" : 27, "Views" : 20 },
"-Mps4usz_FyTYx49oz6e" : { "Text" : 28, "Views" : 20 },
"-Mps4v1HhVdmlOOZhYs_" : { "Text" : 29, "Views" : 20 },
"-Mps4vB9i4F0irpPeRgz" : { "Text" : 30, "Views" : 20 }
}
When we order them the same as above, the first one of these results in the original list is:
"-Mps4u0-6dOElpUhjzYY": {"Text":22,"Views":20},
So those are the value and key you need to use in your subsequent call to get the page above the last one: orderBy="Views"&limitToLast=10&endAt=20,"-Mps4u0-6dOElpUhjzYY". As far as I can see this gives the expected results.
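The client-side step of re-ordering a page and picking the value and key for the next request might look like the sketch below. It assumes the REST response has already been parsed into an object keyed by node id, and that keys compare as plain strings; sortPage and nextCursor are hypothetical helper names.

```javascript
// Sort a parsed REST response by Views, then by key, both ascending.
// Keys are compared as plain strings rather than with localeCompare.
function sortPage(results) {
  return Object.entries(results)
    .map(([key, value]) => ({ key, views: value.Views, value }))
    .sort((a, b) =>
      a.views - b.views || (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// The first node in that order is where the page above this one must end.
function nextCursor(results) {
  const sorted = sortPage(results);
  return sorted.length ? { value: sorted[0].views, key: sorted[0].key } : null;
}

// Three of the nodes from the result above, in arbitrary order.
const page = {
  "-Mps4tH5mnQlklXP5o3I": { Text: 17, Views: 50 },
  "-Mps4u9G_112glr6ijQl": { Text: 23, Views: 20 },
  "-Mps4u0-6dOElpUhjzYY": { Text: 22, Views: 20 }
};
console.log(nextCursor(page));
```

To display the page most-viewed first, reverse the sorted array. When the next request includes the cursor node again, drop the entry whose key equals the cursor key before appending.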

MongoDB C# List the latest of all entries on a sub document

Is it possible to list all the restaurants and their latest grade, if grades is an array within a restaurant?
{
"_id" : ObjectId("56bf7957b5e096fd06b755b2"),
"grades" : [
{
"date" : ISODate("2014-11-15T00:00:00.000Z"),
"grade" : "Z",
"score" : 38
},
{
"date" : ISODate("2014-05-02T00:00:00.000Z"),
"grade" : "A",
"score" : 10
},
{
"date" : ISODate("2013-03-02T00:00:00.000Z"),
"grade" : "A",
"score" : 7
},
{
"date" : ISODate("2012-02-10T00:00:00.000Z"),
"grade" : "A",
"score" : 13
}
],
"name" : "Brunos On The Boulevard",
}
I would want to get:
{
"_id" : ObjectId("56bf7957b5e096fd06b755b2"),
"grades" : [
{
"date" : ISODate("2014-11-15T00:00:00.000Z"),
"grade" : "Z",
"score" : 38
}
],
"name" : "Brunos On The Boulevard",
}
Explanation
The answer below uses the unwind operator. There's a really simple explanation of it on this answer, should anyone be confused by it as I was.
One option is an aggregate with two operations: an Unwind, which deconstructs the array field of each input document to output one document per element, followed by a sort by date in descending order. You can then get the result you expect by selecting the first element of the aggregate result:
var result = collection.Aggregate()
.Unwind(e => e["grades"])
.SortByDescending(e=>e["grades.date"])
.FirstOrDefault();
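To make the desired output concrete, here is a small client-side sketch in plain JavaScript (not the C# driver) that keeps only the most recent grade of each restaurant document. It assumes the documents are already loaded in memory and that date is a Date or an ISO date string; withLatestGrade is a hypothetical helper name.

```javascript
// Return a copy of the restaurant whose grades array holds only the
// entry with the most recent date (or is empty if there are no grades).
function withLatestGrade(restaurant) {
  const grades = restaurant.grades || [];
  const latest = grades.reduce(
    (best, g) =>
      best === null || new Date(g.date) > new Date(best.date) ? g : best,
    null
  );
  return { ...restaurant, grades: latest ? [latest] : [] };
}

const restaurants = [
  {
    name: "Brunos On The Boulevard",
    grades: [
      { date: "2014-05-02T00:00:00.000Z", grade: "A", score: 10 },
      { date: "2014-11-15T00:00:00.000Z", grade: "Z", score: 38 },
      { date: "2013-03-02T00:00:00.000Z", grade: "A", score: 7 }
    ]
  }
];

// Apply the helper to every restaurant, not just the first one.
console.log(JSON.stringify(restaurants.map(withLatestGrade), null, 2));
```

This does the selection per document, so every restaurant appears once in the result with its own latest grade.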

C# MongoDB index causes weird duplicate exceptions

We have a problem with our indexes. We have an index on our emails, but it throws errors like this:
> db.User.insert({email: "hell33o#gmail.com", "_id" : BinData(3,"iKyq6FvBCdd54TdxxX0JhA==")})
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 11000,
"errmsg" : "E11000 duplicate key error index: placetobe.User.$email_text dup key: { : \"com\", : 0.6666666666666666 }"
}
})
This happens when the index is created with our C# driver like this:
CreateIndexOptions options = new CreateIndexOptions {Unique = true};
_collection.Indexes.CreateOneAsync(Builders<User>.IndexKeys.Text(_ => _.email), options);
resulted in
{
"v" : 1,
"unique" : true,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "email_text",
"ns" : "placetobe.User",
"weights" : {
"email" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 2
}
but if we create it with the MongoDB console like this it works:
{
"v" : 1,
"unique" : true,
"key" : {
"email" : 1
},
"name" : "email_1",
"ns" : "placetobe.User"
}
I don't understand the difference between the two indexes, but they have an effect on our DB. We also have problems with a collection that stores names: we get duplicate exceptions on "Molly" if we try to insert "Molli". With the emails, it seems to give us errors whenever we have two "gmail" emails in the collection, or two ".com" emails, etc.
This is a university project and we have to turn it in tomorrow. We're really in trouble; help would be much appreciated.
You don't want your email to be a text index. Text indexes let you search large amounts of text in MongoDB, for example when parsing through comments. All you want is to make sure your emails aren't duplicated, so you should use an ascending or descending index.
CreateIndexOptions options = new CreateIndexOptions {Unique = true};
_collection.Indexes.CreateOneAsync(Builders<User>.IndexKeys.Ascending(_ => _.email), options);
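The error message in the question (dup key on "com") suggests that the unique constraint was being applied to individual terms of the email rather than to the whole string. The sketch below is a deliberately simplified model of that behaviour, written in JavaScript for illustration only: it is not MongoDB's actual tokenizer, which also stems words and stores weights, and the function names are hypothetical.

```javascript
// Simplified stand-in for text indexing: lower-case the value and split it
// into alphanumeric terms. Real text indexes also apply stemming.
function terms(value) {
  return value.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Terms that two values have in common. Under a unique constraint on terms,
// any shared term would count as a duplicate.
function sharedTerms(a, b) {
  const seen = new Set(terms(a));
  return terms(b).filter(t => seen.has(t));
}

// Two different addresses still share "gmail" and "com" ...
console.log(sharedTerms("alice@gmail.com", "bob@gmail.com"));
// ... whereas comparing whole strings, as an ascending index does, finds no clash.
console.log("alice@gmail.com" === "bob@gmail.com");
```

With an ascending unique index the whole email value is the key, so only an identical address is rejected.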

How to find number of distinct fields in a collection in mongodb

MongoDB provides the flexibility to store unstructured data.
Is there any way, using the MongoDB C# driver, to find the distinct field names in a collection?
For example, given these documents:
{
"_id" : ObjectId("52fb69ff1ecf0322f0ab3129"),
"Serial Number" : "1",
"Name" : "Sameer Singh Rathoud",
"Skill" : "C++",
"City" : "Pune",
"Country" : "India"
}
{
"_id" : ObjectId("52fb69ff1ecf0322f0ab312a"),
"Serial Number" : "2",
"Name" : "Prashant Patil",
"DOB" : "31/07/1978",
"Location" : "Hinjewadi",
"State" : "Maharashtra",
"Country" : "India"
}
I want to get [_id, Serial Number, Name, DOB, Skill, City, State, Country]
I also faced this issue. If you still haven't found a proper solution, or for anyone new searching for a solution to this kind of question, you can use this:
var keys = [];
db.Entity.find().forEach(function(doc){
    for (var key in doc){
        if(keys.indexOf(key) < 0){
            keys.push(key);
        }
    }
});
print(keys);
