I have an Azure Function triggered by a timer, in which I want to update documents inside Cosmos DB. I'm currently using UpdateOneAsync with the option IsUpsert = true to perform the update (or insert, if the document doesn't exist).
However, I'm doing the update operation inside a foreach loop, so an update operation is performed for each item. How can I do a bulk update (upsert), performing just one operation after the foreach loop finishes?
Here is my code right now:
foreach (var group in GetGroups(date, time, hour))
{
    dic = new MyDictionary<string>();
    // ... some operations

    List<BsonElement> documents = new List<BsonElement>();
    documents.Add(new BsonElement("$inc", new BsonDocument(dic)));
    documents.Add(new BsonElement("$set", new BsonDocument(new Dictionary<string, string>() { { "c", key }, { "d", date } })));

    var doc = clicksDoc.UpdateOneAsync(
        t => t["_id"] == "c-" + key + "-" + date,
        new BsonDocument(documents),
        new UpdateOptions() { IsUpsert = true }).Result;
}
Instead I'd like to perform just one update after the loop. How can I do that?
2020 answer
Bulk support has been added to the .NET SDK:
Introducing Bulk support in the .NET SDK
To use it, first enable bulk execution when you create your client:
CosmosClient client = new CosmosClientBuilder(options.Value.ConnectionString)
    .WithConnectionModeDirect()
    .WithBulkExecution(true)
    .Build();
Then get your container as normal:
Container container = client.GetContainer("databaseName", "containerName");
Then do your bulk operation, e.g. upsert:
public async Task BulkUpsert(List<SomeItem> items)
{
    var concurrentTasks = new List<Task>();
    foreach (SomeItem item in items)
    {
        concurrentTasks.Add(container.UpsertItemAsync(item, new PartitionKey(item.PartitionKeyField)));
    }
    await Task.WhenAll(concurrentTasks);
}
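One caveat: if any single upsert fails, await Task.WhenAll rethrows only the first exception. A common variation (a sketch, not part of the original answer) attaches a continuation to each task so every failed item can be observed individually:

public async Task BulkUpsert(List<SomeItem> items)
{
    var concurrentTasks = new List<Task>();
    foreach (SomeItem item in items)
    {
        concurrentTasks.Add(
            container.UpsertItemAsync(item, new PartitionKey(item.PartitionKeyField))
                .ContinueWith(t =>
                {
                    if (t.IsFaulted)
                    {
                        // t.Exception holds the failure for this one item;
                        // log it or collect it for a retry.
                    }
                }));
    }
    await Task.WhenAll(concurrentTasks); // never throws here; faults were observed above
}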
You can use the BulkUpdateAsync method from the BulkExecutor library:
List<UpdateItem> updateList = initialDocuments.Select(d =>
    new UpdateItem(
        d.id,
        d.AccountNumber,
        new List<UpdateOperation>
        {
            new SetUpdateOperation<string>(
                "NewSimpleProperty",
                "New Property Value"),
            new SetUpdateOperation<dynamic>(
                "NewComplexProperty",
                new
                {
                    prop1 = "Hello",
                    prop2 = "World!"
                }),
            new UnsetUpdateOperation(nameof(FakeOrder.DocumentIndex)),
        }))
    .ToList();
var updateSetResult = BulkUpdateDocuments(_database, _collection, updateList).GetAwaiter().GetResult();
and
var executor = new BulkExecutor(_documentClient, collectionResource);
await executor.InitializeAsync();
return await executor.BulkUpdateAsync(updates);
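Since the question's code appears to use the MongoDB .NET driver (against the Cosmos DB Mongo API), the driver's own bulk write is another option: collect one UpdateOneModel per group inside the loop, then issue a single BulkWriteAsync after it. A sketch reusing the names from the question:

var requests = new List<WriteModel<BsonDocument>>();
foreach (var group in GetGroups(date, time, hour))
{
    dic = new MyDictionary<string>();
    // ... some operations

    var update = new BsonDocument
    {
        { "$inc", new BsonDocument(dic) },
        { "$set", new BsonDocument(new Dictionary<string, string>() { { "c", key }, { "d", date } }) }
    };
    var filter = Builders<BsonDocument>.Filter.Eq("_id", "c-" + key + "-" + date);
    requests.Add(new UpdateOneModel<BsonDocument>(filter, update) { IsUpsert = true });
}
// one driver call after the loop finishes
await clicksDoc.BulkWriteAsync(requests);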
I am trying to create a simple list of objects, but somehow on every iteration of the foreach loop the previous records are overwritten by the new one. So if there are 6 entries in realData, the list ends up with the last record 6 times.
Am I somehow recreating the List instead of adding to it? Is there an alternative I have overlooked for building the List?
My code is:
public async Task<IActionResult> OrderOverview()
{
    var itemList = new List<OrderItemVM>();
    var realData = await _context.OrderItem.ToListAsync();
    var orderItemVM = new OrderItemVM();

    foreach (var item in realData)
    {
        orderItemVM.Id = item.Id;
        orderItemVM.OrderId = item.OrderId;
        orderItemVM.OrderName = _context.Order.Find(item.OrderId).OrderName;
        orderItemVM.ItemName = item.ItemName;
        itemList.Add(orderItemVM);
    }
    return View(itemList);
}
You are modifying the previously added object instead of adding a new one. You should do this:
foreach (var item in realData)
{
    OrderItemVM orderItemVM = new OrderItemVM();
    orderItemVM.Id = item.Id;
    orderItemVM.OrderId = item.OrderId;
    orderItemVM.OrderName = _context.Order.Find(item.OrderId).OrderName;
    orderItemVM.ItemName = item.ItemName;
    itemList.Add(orderItemVM);
}
So, basically, on each iteration you create a new empty object, assign the values to it, and add it to the list.
It happens because you are inserting the same orderItemVM reference into itemList on every iteration.
Also, you can give itemList an initial capacity and avoid reallocations:
var realData = await _context.OrderItem.ToListAsync();
var itemList = new List<OrderItemVM>(realData.Count);
And for this task, you can use LINQ:
public async Task<IActionResult> OrderOverview()
{
    var realData = await _context.OrderItem.ToListAsync();
    var itemList = realData.Select(item => new OrderItemVM
    {
        Id = item.Id,
        OrderId = item.OrderId,
        OrderName = _context.Order.Find(item.OrderId).OrderName,
        ItemName = item.ItemName,
    }).ToList();
    return View(itemList);
}
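As a side note, _context.Order.Find can issue one database query per item here, because realData is already materialized. If that becomes a bottleneck, projecting the order name inside a single EF query lets the provider translate it to one round trip; a sketch, assuming Order exposes an Id key:

public async Task<IActionResult> OrderOverview()
{
    var itemList = await _context.OrderItem
        .Select(item => new OrderItemVM
        {
            Id = item.Id,
            OrderId = item.OrderId,
            // correlated subquery, translated into the same SQL statement
            OrderName = _context.Order
                .Where(o => o.Id == item.OrderId)
                .Select(o => o.OrderName)
                .FirstOrDefault(),
            ItemName = item.ItemName,
        })
        .ToListAsync();
    return View(itemList);
}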
Thanks to Lasse V. Karlsen I discovered the error. I moved the line var orderItemVM = new OrderItemVM(); into the foreach loop. That solved it.
I have the following method, which inserts a row into ADGroup and then inserts a row into the child table GroupExtension:
public async Task AddADGroupsAsync(List<ADGroup> adGroups)
{
    await Task.Run(async () =>
    {
        EntityFramework.ADGroup dbADGroup;
        EntityFramework.ADGroup dbADGroupInsert;
        GroupExtension dbGroupExtension;

        using (var dbContext = new SecurityAPIEntities())
        {
            foreach (var adGroup in adGroups)
            {
                dbADGroup = new EntityFramework.ADGroup
                {
                    Id = Guid.NewGuid(),
                    Name = adGroup.Guid
                };
                dbADGroupInsert = dbContext.ADGroups.Add(dbADGroup);

                dbGroupExtension = new GroupExtension
                {
                    Id = Guid.NewGuid(),
                    CreatedDateTime = DateTime.Now,
                    UpdatedDateTime = DateTime.Now
                };
                dbADGroupInsert.GroupExtensions.Add(dbGroupExtension);
            }
            await dbContext.SaveChangesAsync();
        }
    });
}
The problem is that this code inserts 8914 rows into ADGroup, but only 8676 rows into GroupExtension. This seems like a pretty basic operation, so I'm not sure why it fails on a small number of child inserts. I think this code may have been working when I had the SaveChangesAsync() inside the foreach loop instead of after it.
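For reference, the earlier variant described in that last sentence, with the save inside the loop, would look like this (a sketch; it trades the single big save for one round trip per group):

foreach (var adGroup in adGroups)
{
    // ... build dbADGroup and dbGroupExtension exactly as above ...
    dbADGroupInsert = dbContext.ADGroups.Add(dbADGroup);
    dbADGroupInsert.GroupExtensions.Add(dbGroupExtension);
    await dbContext.SaveChangesAsync(); // persist each parent/child pair immediately
}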
I have just started using MongoDB as a result of dealing with bulk data for my new project. I just set up the database, installed the C# driver for MongoDB, and here is what I tried:
public IHttpActionResult insertSample()
{
    var client = new MongoClient("mongodb://localhost:27017");
    var database = client.GetDatabase("reznext");
    var collection = database.GetCollection<BsonDocument>("sampledata");

    List<BsonDocument> batch = new List<BsonDocument>();
    for (int i = 0; i < 300000; i++)
    {
        batch.Add(new BsonDocument {
            { "field1", 1 },
            { "field2", 2 },
            { "field3", 3 },
            { "field4", 4 }
        });
    }

    collection.InsertManyAsync(batch);
    return Json("OK");
}
But when I check the collection for documents, I see only 42k out of 0.3 million records inserted. I use Robomongo as the client and would like to know what is wrong here. Is there an insertion limit per operation?
You start the insert asynchronously but never wait for the result. Either wait for it:
collection.InsertManyAsync(batch).Wait();
Or use the synchronous call:
collection.InsertMany(batch);
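In a Web API controller, the more idiomatic fix is usually to make the action itself async and await the insert; a sketch of the original action reworked that way (batch built exactly as in the question):

public async Task<IHttpActionResult> insertSample()
{
    var client = new MongoClient("mongodb://localhost:27017");
    var database = client.GetDatabase("reznext");
    var collection = database.GetCollection<BsonDocument>("sampledata");

    List<BsonDocument> batch = new List<BsonDocument>();
    // ... fill batch as in the question ...

    await collection.InsertManyAsync(batch); // completes before the response is returned
    return Json("OK");
}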
I'm playing around with the new MongoDB 2.0 driver and looking to add some faceted searches (a temporary move, before using Elasticsearch).
Here is a method I created to build the aggregation. I guess that it should work.
I also pass a FilterDefinition into the method as a parameter.
But I can't find how to limit my aggregation to the filter.
Any idea?
private void UpdateFacets(SearchResponse response, FilterDefinition<MediaItem> filter, ObjectId dataTableId)
{
    response.FacetGroups = new List<SearchFacetGroup>();
    SearchFacetGroup group = new SearchFacetGroup()
    {
        Code = "CAMERAMODEL",
        Display = "Camera model",
        IsOptional = false
    };

    using (IDataAccessor da = NodeManager.Instance.GetDataAccessor(dataTableId))
    {
        var collection = da.GetCollection<MediaItem>();
        var list = collection.Aggregate()
            .Group(x => ((ImageMetaData)x.MetaData).Exif.CameraModel, g => new { Model = g.Key, Count = g.Count() })
            .ToListAsync().Result;

        foreach (var l in list)
        {
            group.Facets.Add(new SearchFacetContainer()
            {
                Code = l.Model,
                Display = l.Model,
                Hits = l.Count,
                IsSelected = false
            });
        }
    }
    response.FacetGroups.Add(group);
}
I haven't used facets, but the Mongo driver's Aggregate has a .Match operation that accepts a FilterDefinition:
collection1.Aggregate().Match(filter)
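Applied to the aggregation in the question, that might look like this (a sketch; same types as above):

var list = collection.Aggregate()
    .Match(filter) // restrict the facet counts to the current search filter
    .Group(x => ((ImageMetaData)x.MetaData).Exif.CameraModel, g => new { Model = g.Key, Count = g.Count() })
    .ToListAsync().Result;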
I started out with the Mongo client doing some nifty queries and aggregations, but now that I want to use it in .NET/C#, I see that I can't simply run the query as a text field.
Furthermore, after resorting to building an aggregation pipeline and running the collection.Aggregate() function, I'm getting a result set, but I have no idea how to traverse it.
Can anyone help guide me here?
Here's my code:
var coll = db.GetCollection("animals");

var match = new BsonDocument {
    { "$match", new BsonDocument { { "category", "cats" } } }
};

var group = new BsonDocument {
    { "$group", new BsonDocument {
        { "_id", "$species" },
        { "AvgWeight", new BsonDocument { { "$avg", "$weight" } } }
    } }
};

var sort = new BsonDocument { { "$sort", new BsonDocument { { "AvgWeight", -1 } } } };

var pipeline = new[] { match, group, sort };
var args = new AggregateArgs { Pipeline = pipeline };
var res = coll.Aggregate(args);

foreach (var obj in res)
{
    // WHAT TO DO HERE??
}
Also, I should say that I'm a little rusty with C# / ASP.NET / MVC so any room for simplification would be much appreciated.
Your result is an IEnumerable of BsonDocument; you can deserialize them to C# objects using the BsonSerializer. This code snippet just writes them to your console, but you can see that you have typed objects:
List<Average> returnValue = new List<Average>();
returnValue.AddRange(res.Select(x => BsonSerializer.Deserialize<Average>(x)));

foreach (var obj in returnValue)
{
    Console.WriteLine("Species {0}, avg weight: {1}", obj._Id, obj.AvgWeight);
}
Then have a class called Average, where the property names match the names in the BsonDocument. If you want to rename them (because _Id is not so nice in terms of C# naming conventions), you can add a $project BsonDocument to your pipeline.
public class Average
{
    public string _Id { get; set; }
    public double AvgWeight { get; set; }
}
$project sample (add this to your pipeline just before the sort):
var project = new BsonDocument
{
    {
        "$project",
        new BsonDocument
        {
            { "_id", 0 },
            { "Species", "$_id" },
            { "AvgWeight", "$AvgWeight" },
        }
    }
};
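With that projection in place, the Average class can drop the _Id property and use the friendlier names (a sketch):

public class Average
{
    public string Species { get; set; }
    public double AvgWeight { get; set; }
}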