Azure function with Entity Framework and concurrency - c#

I have an IotHub trigger which basically saves an incoming id to the database using Entity Framework if such an id does not exist.
    [FunctionName("MainFunc")]
    public static async Task Run(
        [IoTHubTrigger("messages/events",
            Connection = "IotHubCompatibleEndpointConnectionString",
            ConsumerGroup = "ttx_iothub_trigger_sqldb_cg")]
        EventData eventData, ILogger log)
    {
        string id = GetIdFromMessage(eventData);
        var context = new MyEfDbContext();
        InsertIfNotExists(id);
        DoSomethingElse(context);
        context.SaveChanges();
    }
The problem is that when a lot of messages are sent to the IoT hub, multiple trigger invocations run in parallel (at least when I debug the trigger). When more than one message carries the same id and that id is not yet in the database, InsertIfNotExists() races with itself and one of the invocations fails with a duplicate key exception.
What is the most appropriate way to fix this? Should I just swallow the exception, since the record will end up in the database anyway?

You did not show what InsertIfNotExists does, but as I see it the problem is that context.SaveChanges() is called only at the end: you first modify the context in memory and save later, so several instances can pass the "does not exist" check for the same id before any of them has saved. Saving the insert as early as possible narrows that window, but even that does not guarantee that only one instance inserts a given id.
So I see a few options here.
Option 1. Wrap the insert in a try/catch and, on a duplicate key error, make a second trip to load the record by id, since you then know it has been inserted.
Option 2. Transact-SQL has a MERGE statement, which performs an upsert. If you are using Entity Framework Core (you did not say whether it is Core, but I assume so), you can use an upsert extension; the sample below appears to follow the API of the FlexLabs.EntityFrameworkCore.Upsert package:
    DataContext.DailyVisits
        .Upsert(new DailyVisit
        {
            UserID = userID,
            Date = DateTime.UtcNow.Date,
            Visits = 1,
        })
        .On(v => new { v.UserID, v.Date })
        .WhenMatched(v => new DailyVisit
        {
            Visits = v.Visits + 1,
        })
        .RunAsync();
Option 3. Use transactions.
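Option 1 can be sketched as a small helper. This is a minimal illustration, not the asker's code: `DuplicateTolerantInsert`, `TryInsertAsync` and the `isDuplicateKey` predicate are hypothetical names, and the helper deliberately knows nothing about EF so that the database-specific check stays in one place.

```csharp
using System;
using System.Threading.Tasks;

public static class DuplicateTolerantInsert
{
    // Runs `insert`. If it fails with an exception that `isDuplicateKey`
    // recognises, the row is assumed to exist already and false is returned.
    // Any other exception propagates to the caller unchanged.
    public static async Task<bool> TryInsertAsync(
        Func<Task> insert,
        Func<Exception, bool> isDuplicateKey)
    {
        if (insert == null) throw new ArgumentNullException(nameof(insert));
        if (isDuplicateKey == null) throw new ArgumentNullException(nameof(isDuplicateKey));

        try
        {
            await insert().ConfigureAwait(false);
            return true;   // this invocation created the row
        }
        catch (Exception ex) when (isDuplicateKey(ex))
        {
            return false;  // another invocation inserted the same key first
        }
    }
}
```

With EF Core on SQL Server, the predicate would typically test for a `DbUpdateException` whose inner `SqlException` has `Number` 2601 or 2627 (unique index and unique constraint violations), and `insert` would add the entity and call `SaveChangesAsync` on its own, before `DoSomethingElse`. After a failed save the rejected entity is still tracked as Added, so it is probably safer to detach it or to continue with a fresh context.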

Related

Simple web application with ASP.NET MVC + EF Core sometimes creates several identical entries in the database

I have a web application where users can post offers for products.
Sometimes - less than 1 in 10 cases - saving an offer leads to multiple entries in the database with identical data (usually 2 entries, but 3 or even 4 identical entries have already occurred).
The offer class is a simple entity class with some properties in it. It is also connected to two further entity classes OfferImage and OfferCategory, which store the associated images and the categories in which the offer should appear.
The code to save an item to the database is the following (part of a repository class):
    public class OfferRepository {
        ...
        public async Task InsertAsync(Offer offer)
        {
            Context.Offer.Add(offer);
            await Context.SaveChangesAsync();
        }
        ...
    }
It is called within a service class:
    public class OfferService
    {
        ...
        public async Task CheckinAsync(Offer offer)
        {
            await repository.InsertAsync(offer);
        }
        ...
    }
That method of this service class is called by an mvc controller:
    public async Task<IActionResult> Create(CreateOfferViewModel createOfferViewModel)
    {
        if (ModelState.IsValid)
        {
            ...
            //conversion of the view model object to an Offer object
            ...
            await offerService.CheckinAsync(offer);
        }
        ...
    }
As you can see, the structure is relatively simple. However, this error occurs regularly.
The lifecycle of the service classes is managed with dependency injection in startup.cs. OfferRepository and OfferService are both added as scoped services (services.AddScoped). The context class is added like this:
    services.AddDbContext<Context>(options =>
    {
        options.UseSqlServer(
            Configuration.GetConnectionString("DataConnection"),
            sqlServerOptionsAction: sqlServerOptions =>
            {
                sqlServerOptions.EnableRetryOnFailure(maxRetryCount: 3, maxRetryDelay: TimeSpan.FromSeconds(3), errorNumbersToAdd: null);
            }
        );
    });
To further narrow down the problem, I ran a profiler and got a recording of the INSERT statements (in chronological order):
SPID 59 - 2019-09-24 16:05:19.670 - INSERT INTO Offer ...
SPID 57 - 2019-09-24 16:05:19.673 - INSERT INTO Offer ...
SPID 59 - 2019-09-24 16:05:19.710 - INSERT INTO OfferImage ... / INSERT INTO OfferCategory ...
SPID 57 - 2019-09-24 16:05:19.760 - INSERT INTO OfferImage ... / INSERT INTO OfferCategory ...
What makes me suspicious is that there are two different process IDs that execute the INSERTs. Since the DbContext is scoped by default, there should only be one process ID under which all statements are executed - or am I wrong? If I am not mistaken, this would mean that two requests are executed in parallel, which in turn raises further questions about how this can happen.
As you can see, I am a little confused and hope for help from someone who can explain this or has observed and solved something similar.
(SQL Server is version 13/2016, EF Core is version 2.2)
Thank you very much!

Every request to your API gets its own scope, and therefore its own DbContext and its own database connection. The SPID in a SQL Server trace is a session ID (one per connection), not an operating-system process ID, so two SPIDs inserting the same data a few milliseconds apart mean that two requests were handled in parallel.
If you really don't want duplicates, add a constraint to your database that prevents them (on the product name or something similar). SaveChanges will then throw a DbUpdateException, which you need to catch and inspect: if it is a duplicate violation, send an error back to the user via an HTTP response code (maybe 409 Conflict); if it is something else, it is probably a 5xx error.
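The constraint can be declared in the EF Core model. A minimal sketch, assuming an offer is uniquely identified by its owner and title; the `UserId` and `Title` properties are hypothetical, since the question does not show the `Offer` class:

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical shape: the real Offer entity has other properties.
public class Offer
{
    public int Id { get; set; }
    public int UserId { get; set; }
    public string Title { get; set; }
}

public class Context : DbContext
{
    public Context(DbContextOptions<Context> options) : base(options) { }

    public DbSet<Offer> Offer { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Offer>(offer =>
        {
            // SQL Server cannot index nvarchar(max), so bound the column length.
            offer.Property(o => o.Title).IsRequired().HasMaxLength(200);

            // A second INSERT with the same (UserId, Title) is rejected by the
            // database, and SaveChangesAsync reports it as a DbUpdateException.
            offer.HasIndex(o => new { o.UserId, o.Title }).IsUnique();
        });
    }
}
```

The index only takes effect once a migration (or a manual `CREATE UNIQUE INDEX`) has been applied to the database.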

Is it possible to add dynamic data to a MassTransit courier/routing slip custom event?

I have a MassTransit routing slip configured and working. For reference, the routing slip takes in an ID of an item in a MongoDB database and then creates a "version" of that document in a SQL database using EF Core. The activities (as commands) are:
1. Migrate document to SQL
2. Update audit info in MongoDB document
3. Update MongoDB document status (i.e. to published)
All of the above are write commands.
I have added a new 1st step which runs a query to make sure the MongoDB document is valid (e.g. name and description fields are completed) before running the migration. If this step fails it throws a custom exception, which in turn fires a failed event which is then picked up and managed by my saga. Below is a snippet of my activity code followed by the routing slip builder code:
Activity code
    var result = await _queryDispatcher.ExecuteAsync<SelectModuleValidationResultById, ModuleValidationResult>(query).ConfigureAwait(false);
    if (!result.ModuleValidationMessages.Any())
    {
        return context.Completed();
    }
    return context.Faulted(new ModuleNotValidException
    {
        ModuleId = messageCommand.ModuleId,
        ModuleValidationMessages = result.ModuleValidationMessages
    });
Routing slip builder code
    builder.AddActivity(
        nameof(Step1ValidateModule),
        context.GetDestinationAddress(ActivityHelper.BuildQueueName<Step1ValidateModule>(ActivityQueueType.Execute)),
        new SelectModuleValidationResultById(
            context.Message.ModuleId,
            context.Message.UserId,
            context.Message.LanguageId)
    );
    builder.AddSubscription(
        context.SourceAddress,
        RoutingSlipEvents.ActivityFaulted,
        RoutingSlipEventContents.All,
        nameof(Step1ValidateModule),
        x => x.Send<IModuleValidationFailed>(new
        {
            context.Message.ModuleId,
            context.Message.LanguageId,
            context.Message.UserId,
            context.Message.DeploymentId,
        }));
Whilst all of this works and the event gets picked up by my saga, I would ideally like to add the ModuleValidationMessages (i.e. any failed validation messages) to the event being returned, but I can't figure out how, or even if that's possible (or, more fundamentally, if it's the right thing to do).
It's worth noting that this is a last resort check and that the validation is checked by the client before even trying the migration, so in the worst case I can just leave it as "Has validation issues", but ideally I would like to include the detail in the failed response.

Good use case, and yes, it's possible to add the details you need to the built-in routing slip events. Instead of throwing an exception, you can Terminate the routing slip, and include variables - such as an array of messages, which are added to the RoutingSlipTerminated event that will be published.
This way, it isn't a fault but more of a business decision to terminate the routing slip prematurely. It's a contextual difference, which is why it allows variables to be specified (versus Faulted, which is a full-tilt exception).
You can then pull the array from the variables and use those in your saga or consumer.
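A minimal sketch of the activity rewritten to terminate instead of fault. The argument type and the `ValidateAsync` helper are hypothetical placeholders for the question's `SelectModuleValidationResultById` and `_queryDispatcher` call, and the `using MassTransit;` line assumes MassTransit 8 (earlier versions keep these types in `MassTransit.Courier`):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using MassTransit; // assumption: MassTransit 8 namespace layout

// Hypothetical stand-in for SelectModuleValidationResultById.
public class ValidateModuleArguments
{
    public int ModuleId { get; set; }
}

public class Step1ValidateModule : IExecuteActivity<ValidateModuleArguments>
{
    public async Task<ExecutionResult> Execute(ExecuteContext<ValidateModuleArguments> context)
    {
        // Placeholder for the real validation query from the question.
        IReadOnlyList<string> messages = await ValidateAsync(context.Arguments.ModuleId);

        if (!messages.Any())
            return context.Completed();

        // Terminate ends the routing slip without raising a fault; the
        // variables are carried on the RoutingSlipTerminated event.
        return context.Terminate(new
        {
            context.Arguments.ModuleId,
            ModuleValidationMessages = messages
        });
    }

    // Hypothetical validation that always reports one problem.
    private static Task<IReadOnlyList<string>> ValidateAsync(int moduleId)
    {
        IReadOnlyList<string> messages = new[] { "Description is missing" };
        return Task.FromResult(messages);
    }
}
```

The routing slip subscription would then presumably listen for `RoutingSlipEvents.Terminated` rather than `RoutingSlipEvents.ActivityFaulted`, and the saga would read the messages from the event's variables.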

Task.WhenAll hangs if there are no tasks to complete

I have a method that inserts a list of objects into MongoDB.
    public class StorageService : IStorageService
    {
        public Task<BulkWriteResult<Option>> SaveOptions(List<Option> contracts)
        {
            var context = new MongoContext<Option>();
            return context.SaveCollection(contracts);
        }
    }

    var optionIds = Task
        .WhenAll(storageService.SaveOptions(optionDetails.Values.ToList()))
        .Result;
If the list of contracts is empty, there are no objects to insert into the DB and no tasks to complete, so Task.WhenAll keeps running indefinitely, creating a deadlock.
Question
Is there a way to return an empty / completed task if the list is empty, or is there a better way to get the results of the insert while still correctly handling the case when there are no results?
Update #1
Approximate structure.
WebApi - MVC project
    [AcceptVerbs("POST")]
    public Response<int> DownloadOptions([FromBody] ContractSelector data)
    {
        // ... some controller code
        var optionIds = Task
            .WhenAll(storageService.SaveOptions(optionDetails.Values.ToList()))
            .Result;
        // this method should gather data from multiple APIs
        // so I need Task.Result of all previous operations
        // I could make this method async and use await, but it's not the case here
    }
Class Library project, .NET 4.6.1
    public class StorageService : IStorageService
    {
        public Task<BulkWriteResult<Option>> SaveOptions(List<Option> contracts)
        {
            var context = new MongoContext<Option>();
            return context.SaveCollection(contracts);
        }
    }
Update #2
Why async / await is of no use here. There are 2 external API calls: one gets general info about some asset, and the second gets prices for this asset. I can't change this. So, if I want to get all the info in one method, I must request the general info, wait for the result, and then, based on the general info, request the relevant prices. After this, I want to save the gathered info into the DB and return the list of saved IDs in the response of my API.
Sequence of calls
1. UI
2. WebApi MVC Controller
3. Class Library
3.1 Request asset info - wait for the result
3.2 Get asset info - request prices for selected assets - wait for the result
3.3 Get asset info and prices info - save everything to DB - return response
    // "parameters" replaces the original "params", which is a reserved C# keyword
    var contracts = Task.WhenAll(optionService.GetContracts(parameters)).Result;
    var prices = Task.WhenAll(optionService.GetOptionDetails(contracts)).Result;
    var ids = Task.WhenAll(storageService.SaveOptions(prices.Values.ToList()));
So, the response depends on 3.3, 3.3 depends on 3.2, and 3.2 depends on 3.1. If you know how to turn it all into a non-blocking call, I'm all ears. For now I think that 3 blocking calls in 1 HTTP request are better than 3 separate async HTTP requests.

I can't explain it in detail, but it looks like the issue was caused by an incompatibility between the .NET Framework and the MongoDB driver. Currently I'm using .NET 4.6.1 and MongoDB driver 2.6.1, and it works fine.
Yesterday, any attempt to save an empty list of objects into MongoDB caused a deadlock.
Then I tried to refactor the code and install the latest version of the MongoDB driver, which is 2.7.0. Now any CRUD operation with MongoDB returns the error below. There is no call stack and there are no further details about this exception.
MethodAccessException: Attempt by method 'MongoDB.Driver.ClientSession..ctor(MongoDB.Driver.IMongoClient, MongoDB.Driver.ClientSessionOptions, MongoDB.Driver.IServerSession, Boolean)' to access method 'MongoDB.Driver.Core.Clusters.ClusterClock..ctor()' failed.
Then I downgraded the MongoDB driver to 2.6.1, and when I try to insert an empty list of objects I now get the exception below, which is correct. When I insert at least one record it works as expected.
System.AggregateException: 'One or more errors occurred.'
ArgumentException: Must contain at least 1 request.
Parameter name: requests
Testing code
    var item = new Demo
    {
        Symbol = "SomeSymbol",
        Expiration = "2019-01-01"
    };
    var list = new List<Demo>();
    list.Add(item);
    list.Add(item); // if I comment these lines, then Mongo 2.6.1 returns correct exception
    var optionIds = Task.WhenAll(storageService.Save(list)).Result;
Save method is implemented as a part of Mongo repository pattern
    public Task<BulkWriteResult<T>> SaveCollection(List<T> items)
    {
        var records = new List<ReplaceOneModel<T>>();
        var processes = new List<Task<ReplaceOneResult>>();
        items.ForEach(contract =>
        {
            var record = new ReplaceOneModel<T>(Builders<T>
                .Filter
                .Where(o => o.Id == contract.Id), contract)
            {
                IsUpsert = true
            };
            records.Add(record);
        });
        return Collection.BulkWriteAsync(records);
    }
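For the original question (returning a completed task when the list is empty), one option is a guard in front of the bulk write. The sketch below is only an illustration under stated assumptions: `EmptyBatchGuard`, `SaveOrSkipAsync` and the `emptyResult` parameter are hypothetical, and the `save` delegate stands in for the call to `Collection.BulkWriteAsync`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class EmptyBatchGuard
{
    // Calls `save` only when there is something to write. For an empty
    // collection it returns an already-completed task carrying `emptyResult`,
    // so callers can await or block on the result in both cases.
    public static Task<TResult> SaveOrSkipAsync<TItem, TResult>(
        IReadOnlyCollection<TItem> items,
        Func<IReadOnlyCollection<TItem>, Task<TResult>> save,
        TResult emptyResult)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));
        if (save == null) throw new ArgumentNullException(nameof(save));

        return items.Count == 0
            ? Task.FromResult(emptyResult)
            : save(items);
    }
}
```

Inside `SaveCollection` the same idea reduces to an early `return Task.FromResult<BulkWriteResult<T>>(null);` when `items.Count == 0`, in which case callers have to be prepared for a null result.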

New DbContext is not pulling updated data if database is modified manually

Background
I have a central database my MVC EF web app interacts with following best practices. Here is the offending code:
    // GET: HomePage
    public ActionResult Index()
    {
        using (var db = new MyDbContext())
        {
            return View(new CustomViewModel()
            {
                ListOfStuff = db.TableOfStuff
                    .Where(x => x.Approved)
                    .OrderBy(x => x.Title)
                    .ToList()
            });
        }
    }
I also modify the data in this database's table manually completely outside the web app.
I am not keeping an instance of the DbContext around any longer than is necessary to get the data I need. A new one is constructed per-request.
Problem
The problem I am having is if I delete a row or modify any data from this table manually outside the web app, the data being served by the above code does not reflect these changes.
The only way to get these manual edits of the data to be picked up by the above code is to either restart the web app, or use the web app to make a modification to the database that calls SaveChanges.
Log Results
After logging the query being executed and doing some manual tests there is nothing wrong with the query being generated that would make it return bad data.
However, in logging I saw a confusing line in the query completion times. The first query on app start-up:
-- Completed in 86 ms with result: CachingReader
Then any subsequent queries had the following completion time:
-- Completed in 0 ms with result: CachingReader
What is this CachingReader and how do I disable this?
Culprit
I discovered that the error was introduced elsewhere in my web app, by something that replaced the underlying DbProviderServices in order to provide caching. More specifically, I am using MVCForum, which uses EF Cache.
This forum's CachingConfiguration uses the default CachingPolicy, which caches everything until the data is modified through EF. That is exactly the behavior I was observing.
Solution
I provided my own custom CachingPolicy that does not allow caching on entities where this behavior is undesirable.
    public class CustomCachingPolicy : CachingPolicy
    {
        protected override bool CanBeCached(ReadOnlyCollection<EntitySetBase> affectedEntitySets, string sql, IEnumerable<KeyValuePair<string, object>> parameters)
        {
            foreach (var entitySet in affectedEntitySets)
            {
                var table = entitySet.Name.ToLower();
                if (table.StartsWith("si_") ||
                    table.StartsWith("dft_") ||
                    table.StartsWith("tt_"))
                    return false;
            }
            return base.CanBeCached(affectedEntitySets, sql, parameters);
        }
    }
With this in place, the database logging now always shows:
-- Completed in 86 ms with result: SqlDataReader
Thanks everyone!

Log/audit trail on EF5

I am looking for a good way to implement change logging / an audit trail with EF5 (database first).
The main problem is that an old application is currently running against the same database and creates its logs using triggers. In that application every application user has their own database user, so the triggers take many values, such as the user ID and host, from the connection properties as defaults. Many of the logged tables also have no userID column, so when I use EF, the entity I want to insert, update, or delete carries no user data.
My application (MVC4), however, has only one connection string with a single database user shared by all application users, so the triggers would store the user ID of that one database user.
So what would be a good way to create logs using EF? Is there a way to do it using triggers (passing the userID and other values)?
I have been reading about overriding the onUpdate functions, but I have also read that this won't work in EF5.

In the database context class it is possible to override the SaveChanges function.
You can inspect the change set for entries that need to be logged.
Maybe it's too low-level, i.e. too close to the data layer, but it will work in EF.
You'll get something like this:
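    public override int SaveChanges()
    {
        // Log every newly added entity that opts in via INeedToLogAdd.
        foreach (var entry in ChangeTracker.Entries())
        {
            if (entry.State == EntityState.Added)
            {
                var needToLogAdd = entry.Entity as INeedToLogAdd;
                if (needToLogAdd != null)
                    DoLogAdd(needToLogAdd);
            }
        }
        // The override must return the number of affected rows.
        return base.SaveChanges();
    }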
