I'm new to MongoDB and I'm trying to figure out how to do some more complex queries. I have documents that contain a nested array of DateTime values.
Here is my data:
{ "_id" : ObjectId("537d0b8c2c6b912520798b76"), "FirstName" : "Mary", "LastName" : "Johnson", "Age" : 27, "Phone" : "555 555-1212", "Dates" : [ISODate("2014-05-21T21:34:16.378Z"), ISODate("1987-01-05T08:00:00Z")] }
{ "_id" : ObjectId("537e4a7e2c6b91371c684a34"), "FirstName" : "Walter", "LastName" : "White", "Age" : 52, "Phone" : "800 123-4567", "Dates" : [ISODate("1967-12-25T08:00:00Z"), ISODate("2014-12-25T08:00:00Z")] }
What I want to do is return the documents whose Dates array contains a value within a given range. In my test case the range is 1/1/1987 to 1/10/1987, so I expect to get back only the first document listed above (Mary Johnson), because 1/5/1987 is in its Dates array and falls between 1/1/1987 and 1/10/1987.
I'd like to be able to do this with both the MongoDB command line utility and the C# driver.
With the C# driver I tried the following LINQ query (based on an example in the MongoDB documentation):
DateTime beginRange = new DateTime(1987, 1, 1);
DateTime endRange = new DateTime(1987, 1, 10);
var result = from p in people.AsQueryable<Person>()
             where p.Dates.Any(date => date > beginRange && date < endRange)
             select p;
The above code throws an exception from within the C# Driver code:
Any is only supported for items that serialize into documents. The current serializer is DateTimeSerializer and must implement IBsonDocumentSerializer for participation in Any queries.
When querying the MongoDB database directly from the command line, I tried the following:
db.People.find( {Dates: { $gt: ISODate("1987-01-01"), $lt: ISODate("1987-01-10") } } )
This query results in both of the documents coming back instead of just the one that has 1/5/1987 in its Dates array.
EDIT:
I figured out a way to do it from the C# driver. It isn't as clean as I would like, but it is doable.
Since there was a way to get what I wanted directly from the command-line utility, I figured there must be a way to do it from the C# driver as well, by simply executing the same query through the driver.
string command = "{Dates : { $elemMatch : { $gt: ISODate(\"" + beginRange.ToString("yyyy-MM-dd") + "\"), $lt: ISODate(\"" + endRange.ToString("yyyy-MM-dd") + "\") } } } ";
var bsonDoc = BsonSerializer.Deserialize<BsonDocument>(command);
var queryDoc = new QueryDocument(bsonDoc);
MongoCursor<Person> p = people.Find(queryDoc);
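Building the query as a string works, but the same $elemMatch document can also be constructed directly with the BsonDocument API, which avoids the string concatenation and date formatting. A rough sketch of that variation (using the same beginRange/endRange as above):

// Same { Dates: { $elemMatch: { $gt: ..., $lt: ... } } } query, built without string parsing.
var queryDoc = new QueryDocument("Dates",
    new BsonDocument("$elemMatch",
        new BsonDocument
        {
            { "$gt", beginRange },   // DateTime values convert implicitly to BsonDateTime
            { "$lt", endRange }
        }));

MongoCursor<Person> p = people.Find(queryDoc);

This produces the same query document as the deserialized string, just without the round trip through JSON text.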
C# Driver
Just as the exception suggests, you can't do what you want to do using the C# driver as long as your array is of a primitive type (DateTime) and not a genuine document.
From the MongoDB Linq Any description:
This will only function when the elements of the enumerable are serialized as a document. There is a server bug preventing this from working against primitives.
I guess you can create a document wrapper around a DateTime value so you can still do that:
var result = people.AsQueryable<Person>().Where(
    person => person.Dates.Any(date =>
        date.Value > beginRange && date.Value < endRange));
public class DocumentWrapper<T>
{
    public ObjectId Id { get; private set; }
    public T Value { get; private set; }

    public DocumentWrapper(T value)
    {
        Id = ObjectId.GenerateNewId();
        Value = value;
    }
}
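For the wrapper approach to work, the Dates member of Person has to hold the wrapped documents rather than bare DateTime values. A minimal sketch of what that Person class might look like (the exact class isn't shown in the original post, so this shape is an assumption based on the sample documents above):

// Assumed Person shape when each date is stored as a small embedded document.
public class Person
{
    public ObjectId Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
    public string Phone { get; set; }
    public List<DocumentWrapper<DateTime>> Dates { get; set; }
}

Note that this changes how the documents are stored, so any existing data would need to be migrated to the new shape.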
Native query
As for your shell query, it isn't actually the equivalent of the LINQ query. The equivalent would be:
{
    Dates:
    {
        $elemMatch:
        {
            $gt: ISODate("1987-01-01"),
            $lt: ISODate("1987-01-10")
        }
    }
}
More on $elemMatch can be found in the MongoDB documentation for the operator.
Related
I have documents like this:
class A
{
DateTime T;
...
}
and I would like to find the earliest and the newest document.
Is it better to do this:
var First = db.Collection.AsQueryable().OrderBy(_ => _.t).FirstOrDefault();
var Last = db.Collection.AsQueryable().OrderByDescending(_ => _.t).FirstOrDefault();
or,
var First = db.Collection.AsQueryable().OrderBy(_ => _.t).FirstOrDefault();
var Last = db.Collection.AsQueryable().OrderBy(_ => _.t).LastOrDefault();
or
var C = db.Collection.AsQueryable().OrderBy(_ => _.t);
var First = C.FirstOrDefault();
var Last = C.LastOrDefault();
I am wondering whether anything in the underlying implementation would change the speed between these options. In particular, if the sort has already been done once, is it possible that the result is cached, so that getting the first and the last elements would be faster?
The profiler becomes your friend when you're not sure what the driver generates. To enable logging for all queries, run the following statement on your database:
db.setProfilingLevel(2)
Then, to check the last query executed on the database, run:
db.system.profile.find().limit(1).sort( { ts : -1 } ).pretty()
So for the first code snippet you will get:
"pipeline" : [ { "$sort" : { "t" : 1 } },
{ "$limit" : 1 }
]
"pipeline" : [ { "$sort" : { "t" : -1 } },
{ "$limit" : 1 }
]
For the second pair it prints
"pipeline" : [ { "$sort" : { "t" : 1 } },
{ "$limit" : 1 }
]
and throws a NotSupportedException for LastOrDefault on my machine. If it works on your driver version, you can check the generated MongoDB statement using the profiler.
For the last one, when you hover over C in Visual Studio it shows
{aggregate([{ "$sort" : { "t" : 1 } }])}
but since it's of type IOrderedQueryable<T>, it is just an unmaterialized query: it gets executed on the database when you call FirstOrDefault, generating the same aggregation body as the previous statements. I'm getting a NotSupportedException for LastOrDefault here as well. The driver documentation lists the supported LINQ operators, and both Last and LastOrDefault are not implemented, so you need to sort descending instead.
I have a User class that accumulates lots of DateTime entries in a List<DateTime> Entries field.
Occasionally, I need to get the last 12 entries (or fewer, if there aren't 12 yet). The list can grow very large.
I could add each new entry as its own document in a dedicated collection, but then I'd have to add an ObjectId User field to refer to the related user.
That seems like a lot of overhead: for each entry that holds only a DateTime, adding another ObjectId field may double the collection size.
Since I occasionally need to quickly get only the last 12 entries out of, say, 100,000, I cannot place these entries in a per-user document like:
class PerUserEntries {
    public ObjectId TheUser;
    public List<DateTime> Entries;
}
That's because it's not possible to fetch only N entries from an embedded array in a MongoDB query, AFAIK (if I'm wrong, I'd be glad to hear it!).
So am I doomed to double my collection size or is there a way around it?
Update, regarding @profesor79's answer:
If your answer works, that will be perfect! But unfortunately it fails...
Since I needed to filter on the user entity as well, here is what I did:
With this data:
class EndUserRecordEx {
    public ObjectId Id { get; set; }
    public string UserName;
    public List<EncounterData> Encounters;
}
I am trying this:
var query = EuBatch.Find(u => u.UserName == endUser.UserName)
    .Project<BsonDocument>(
        Builders<EndUserRecordEx>.Projection.Slice(
            u => u.Encounters, 0, 12));

var queryString = query.ToString();
var requests = await query.ToListAsync(); // MongoCommandException
This is the query I get in queryString:
find({ "UserName" : "qXyF2uxkcESCTk0zD93Sc+U5fdvUMPow" }, { "Encounters" : { "$slice" : [0, 15] } })
Here is the error (the MongoCommandException.Result):
{
{
"_t" : "OKMongoResponse",
"ok" : 0,
"code" : 9,
"errmsg" : "Syntax error, incorrect syntax near '17'.",
"$err" : "Syntax error, incorrect syntax near '17'."
}
}
Update: problem identified...
Recently, Microsoft announced DocumentDB protocol support for MongoDB. Apparently, it doesn't yet support all projection operators. I tried the same code against mLab.com, and it works there.
You can use PerUserEntries, as this is a perfectly workable document structure.
To get part of that array we need to add a projection to the query, so that only x elements are returned, and this is done server-side.
Please see snippet below:
static void Main(string[] args)
{
    // To directly connect to a single MongoDB server
    // or use a connection string
    var client = new MongoClient("mongodb://localhost:27017");
    var database = client.GetDatabase("test");
    var collection = database.GetCollection<PerUserEntries>("tar");

    var newData = new PerUserEntries();
    newData.Entries = new List<DateTime>();
    for (var i = 0; i < 1000; i++)
    {
        newData.Entries.Add(DateTime.Now.AddSeconds(i));
    }
    collection.InsertOne(newData);

    var list = collection.Find(new BsonDocument())
        .Project<BsonDocument>(
            Builders<PerUserEntries>.Projection.Slice(x => x.Entries, 0, 3))
        .ToList();

    Console.ReadLine();
}

public class PerUserEntries
{
    public List<DateTime> Entries;
    public ObjectId TheUser;
    public ObjectId Id { get; set; }
}
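The snippet above slices the first three elements. Since the original question asks for the last 12 entries, it may help that $slice also accepts a single negative count, which takes elements from the end of the array. A hedged sketch along those lines (someUserId is a placeholder ObjectId, and the two-argument Slice overload rendering a plain negative $slice is an assumption worth verifying on your driver version):

// Fetch only the last 12 Entries for one user; a negative $slice counts from the end of the array.
// someUserId is a placeholder for the ObjectId of the user being looked up.
var lastEntries = collection
    .Find(Builders<PerUserEntries>.Filter.Eq(x => x.TheUser, someUserId))
    .Project<BsonDocument>(
        Builders<PerUserEntries>.Projection.Slice(x => x.Entries, -12))
    .ToList();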
I have the following aggregation pipeline:
var count = dbCollection
    .Aggregate(new AggregateOptions { AllowDiskUse = true })
    .Match(query)
    .Group(groupby)
    .ToListAsync().Result.Count();
And this gets the following result:
{
"result" : [
{
"_id" : {
"ProfileId" : ObjectId("55f6c727965bb016c81971ba")
}
},
{
"_id" : {
"ProfileId" : ObjectId("55f6c727965bb016c81971bb")
}
}
],
"ok" : 1
}
But it seems the count operation is performed on the client. How can I perform it in MongoDB?
I'm using the MongoDB C# driver 2.0 and MongoDB server 3.0.2.
Add a constant field to your group stage and then group again on that constant field (so that all the results fall into a single group) with an aggregate sum of 1. The first (and only) result will have the sum.
Ex.
var count = dbCollection
    .Aggregate(new AggregateOptions { AllowDiskUse = true })
    .Match(query)
    .Group(groupby)
    .Group<BsonDocument>(new BsonDocument {
        { "_id", "constant" },                    // every result of the first $group falls into this single group
        { "Total", new BsonDocument("$sum", 1) } })
    .ToListAsync().Result.First().GetValue("Total");
I have the following C# model:
[ElasticType(Name = "myType")]
public class MyType
{
    ...

    [ElasticProperty(Name = "ElasticId")]
    [DataMember(Name = "ElasticId")]
    public string ElasticId { get; set; }

    ...

    [ElasticProperty(Name = "DateToBeUsed", Type = FieldType.Date, DateFormat = "date_hour_minute_second_millis")]
    public string DateToBeUsed { get; set; }

    ...
}
The "date_hour_minute_second_millis" correspond to following format : yyyy-MM-dd’T'HH:mm:ss.SSS
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html)
The ES mapping is done with NEST using the Map method and corresponds to this:
"mappings": {
"myType": {
"properties": {
...,
"ElasticId": {
"type": "string"
},
...,
"DateToBeUsed": {
"type": "date",
"format": "date_hour_minute_second_millis"
},
...
}
}
}
I insert a document into this index:
"_source": {
...,
"ElasticId": "2",
...,
"DateToBeUsed": "2012-05-21T09:51:34.073",
...
}
My problem is when I retrieve this object through NEST: the value of DateToBeUsed always comes back formatted as MM/dd/yyyy HH:mm:ss
(e.g. 05/21/2012 09:51:34).
(Using Sense, the value is formatted correctly.)
1) Is this normal?
I need to retrieve the same date format as the one I gave to ES.
(And I think it would be normal to get back the same format as described in the mapping.)
2) Is there a "clean" solution to this problem?
(Re-formatting the date after retrieving the document is not a "clean" solution...)
Thanks for the answers!
Bye.
I've tried to reproduce what you're seeing, using the following code, but the date value is being returned in the Get call as expected:
// client is an IElasticClient instance created elsewhere
string indexName = "so-27927069";

// --- create index ---
client.CreateIndex(cid => cid.Index(indexName));
Console.WriteLine("created index");

// --- define map ---
client.Map<MyType>(m => m
    .Index(indexName)
    .Type("myType")
    .MapFromAttributes());
Console.WriteLine("set mapping");

// ---- index -----
client.Index<MyType>(
    new MyType
    {
        DateToBeUsed = new DateTime(2012, 5, 21, 9, 51, 34, 73)
            .ToString("yyyy-MM-dd'T'HH:mm:ss.fff"),
        ElasticId = "2"
    },
    i => i
        .Index(indexName)
        .Type("myType")
        .Id(2)
);
Console.WriteLine("doc indexed");

// ---- get -----
var doc = client.Get<MyType>(i => i
    .Index(indexName)
    .Type("myType")
    .Id(2)
);

Console.WriteLine();
Console.WriteLine("doc.Source.DateToBeUsed: ");
Console.WriteLine(doc.Source.DateToBeUsed);
Console.WriteLine();
Console.WriteLine("doc.RequestInformation.ResponseRaw: ");
Console.WriteLine(Encoding.UTF8.GetString(doc.RequestInformation.ResponseRaw));
I'm seeing the following result as output:
created index
set mapping
doc indexed
doc.Source.DateToBeUsed:
2012-05-21T09:51:34.073
doc.RequestInformation.ResponseRaw:
{"_index":"so-27927069","_type":"myType","_id":"2","_version":1,"found":true,"_source":{
"ElasticId": "2",
"DateToBeUsed": "2012-05-21T09:51:34.073"
}}
(Watching the traffic via Fiddler, I'm seeing an exact match between the ResponseRaw value and the payload of the response to the Get request.)
I'm on Elasticsearch version 1.5.2 and NEST version 1.6.0. (Maybe the issue you were seeing was fixed sometime in the interim....)
I have a "Payee" BsonDocument like this:
{
    "Token" : "0b21ae960f25c6357286ce6c206bdef2",
    "LastAccessed" : ISODate("2012-07-11T02:14:59.94Z"),
    "Firstname" : "John",
    "Lastname" : "Smith",
    "PayrollInfo" : [{
        "Tag" : "EARNINGS",
        "Value" : "744.11"
    }, {
        "Tag" : "DEDUCTIONS",
        "Value" : "70.01"
    }],
    "Status" : "1",
    "_id" : ObjectId("4fc263158db2b88f762f1aa5")
}
I retrieve this document based on the Payee _id.
var collection = database.GetCollection("Payee");
var query = Query.EQ("_id", _id);
var bDoc = collection.FindOne(query);
Then, at various times, I need to update a specific object inside the PayrollInfo array. So I search the array for the object with the appropriate "Tag" and update its "Value" in the database. I use the following logic to do this:
var bUpdateData = false;  // tracks whether an existing EARNINGS element was updated
var bsonPayrollInfo = bDoc["PayrollInfo", null];
if (bsonPayrollInfo != null)
{
    var ArrayOfPayrollInfoObjects = bsonPayrollInfo.AsBsonArray;
    for (int i = 0; i < ArrayOfPayrollInfoObjects.Count; i++)
    {
        var bInnerDoc = ArrayOfPayrollInfoObjects[i].AsBsonDocument;
        if (bInnerDoc != null)
        {
            if (bInnerDoc["Tag"] == "EARNINGS")
            {
                // Update the matching array element by its index
                var update = Update
                    .Set("PayrollInfo." + i.ToString() + ".Value", 744.11);
                collection.FindAndModify(query, null, update);
                bUpdateData = true;
                break;
            }
        }
    }
}
if (!bUpdateData)
{
    // Use Update.Push. This works fine and is not relevant to the question.
}
All this code works fine, but I think I am being cumbersome in achieving the result. Is there a more concise way of doing this? Essentially, I am trying to find a better way of updating an object inside of an array in a BsonDocument.
Mongo has a positional operator that will let you operate on the matched value in an array. The syntax is: field1.$.field2
Here's an example of how you'd use it from the Mongo shell:
db.dots.insert({tags: [{name: "beer", count: 2}, {name: "nuts", count: 3}]})
db.dots.update({"tags.name": "beer"}, {$inc: {"tags.$.count" : 1}})
result = db.dots.findOne()
{ "_id" : ObjectId("50078284ea80325278ff0c63"), "tags" : [ { "name" : "beer", "count" : 3 }, { "name" : "nuts", "count" : 3 } ] }
Putting my answer here in case it helps you. Based on @MrKurt's answer (thank you!), here is what I did to rework the code.
var collection = database.GetCollection("Payee");
var query = Query.EQ("_id", _id);
if (collection.Count(query) > 0)
{
    // Found the Payee. Let's save his/her Tag for EARNINGS.
    UpdateBuilder update = null;

    // Check if this Payee already has any EARNINGS info saved.
    // If so, we need to update that.
    query = Query.And(query,
        Query.EQ("PayrollInfo.Tag", "EARNINGS"));

    // The update is built based on whether we find the Tag:EARNINGS element in the PayrollInfo array.
    if (collection.Count(query) > 0)
    {
        // There is already an element in PayrollInfo for EARNINGS.
        // Just update that element.
        update = Update
            .Set("PayrollInfo.$.Value", "744.11");
    }
    else
    {
        // This user does not have any prior EARNINGS data. Add it to the user record.
        query = Query.EQ("_id", _id);

        // Add a new element to the PayrollInfo array to store the EARNINGS data.
        update = Update.Push("PayrollInfo",
            new BsonDocument { { "Tag", "EARNINGS" }, { "Value", "744.11" } }
        );
    }

    // Run the update
    collection.FindAndModify(query, null, update);
}
It isn't any shorter than my original code, but it is much more intuitive, and I got to learn a lot about positional operators!