How to prevent overwriting changed data - C#

I use SQL Azure and have an application that syncs data with an external resource. The data set is large, approx. 10K records, so I load it from the DB once, update records where necessary over several minutes, and then save the changes. It works, but there is a problem with simultaneous access to the data: if another service makes changes during those minutes, its changes are overwritten.
But in most cases this concerns fields that my application does not touch!
So, for example, my table Device:
public partial class Device : BaseEntity
{
public string Name { get; set; }
public string IMEI { get; set; }
public string SN { get; set; }
public string ICCID { get; set; }
public string MacAddress { get; set; }
public DeviceStatus Status { get; set; }
}
The first service (the application with the long-running process) can modify SN, ICCID and MacAddress, but not Status; the second service, vice versa, can modify only Status.
Code to update in the first service:
_allLocalDevicesWithIMEI = _context.GetAllDevicesWithImei().ToList();
(it gets entities, not DTOs, because there really are many fields that can be changed)
and then:
_context.Devices.Update(localDevice);
for every device that should be changed
and, eventually:
await _context.SaveChangesAsync();
How can I mark the Status field so that it is excluded from change tracking?

One simple way to avoid updating the Status field from the first service is to create an update entity that does not include the Status field, and a separate update entity for the second service that does include it.
Another way is to override the SaveChangesAsync method and control the update logic yourself, but I think that is complex and makes the behavior implicit, so it will not be easy for others to understand your code.
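A minimal sketch of the first suggestion, using plain C# with no EF dependency. DeviceSyncUpdate and DeviceSyncMapper are hypothetical names, the DeviceStatus values are invented for illustration, and Device is a trimmed copy of the entity from the question:

```csharp
using System;

public enum DeviceStatus { Inactive, Active } // illustrative values only

// Trimmed copy of the Device entity from the question (BaseEntity omitted).
public class Device
{
    public string Name { get; set; }
    public string IMEI { get; set; }
    public string SN { get; set; }
    public string ICCID { get; set; }
    public string MacAddress { get; set; }
    public DeviceStatus Status { get; set; }
}

// Hypothetical update model for the sync service: it has no Status member,
// so code working with it cannot overwrite that column by accident.
public class DeviceSyncUpdate
{
    public string IMEI { get; set; }
    public string SN { get; set; }
    public string ICCID { get; set; }
    public string MacAddress { get; set; }
}

public static class DeviceSyncMapper
{
    // Copies only the fields the sync service owns onto a loaded entity.
    public static void Apply(DeviceSyncUpdate update, Device target)
    {
        if (update == null) throw new ArgumentNullException(nameof(update));
        if (target == null) throw new ArgumentNullException(nameof(target));
        if (update.IMEI != target.IMEI)
            throw new ArgumentException("Update and entity refer to different devices.");

        target.SN = update.SN;
        target.ICCID = update.ICCID;
        target.MacAddress = update.MacAddress;
        // target.Status is deliberately left untouched.
    }
}
```

If you stay with tracked entities in EF Core instead, note that _context.Devices.Update(localDevice) marks every property as modified. Letting the change tracker detect changes on its own, or calling _context.Entry(localDevice).Property(d => d.Status).IsModified = false after Update, should keep Status out of the generated UPDATE statement.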

To avoid overwrites, you can add a RowVersion to your entities. This is so-called optimistic concurrency: saving throws an error if the row was changed in the meantime, and you can then retry the operation. Alternatively, you can raise the transaction isolation level to something like RepeatableRead or Serializable to lock these rows for the entire operation (which of course has a heavy performance impact and can cause timeouts). The second option is simple and good enough for background jobs and distributed transactions; the first is more flexible and usually faster, but harder to implement across multiple endpoints/entities.
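To make the optimistic concurrency idea concrete, here is a rough sketch against an in-memory store rather than a real database. DeviceStore, VersionedRow and OptimisticUpdate are hypothetical names; the Version counter stands in for a RowVersion column, and the lock only simulates the atomic check a database performs inside its UPDATE:

```csharp
using System;
using System.Collections.Generic;

// One stored row plus a version counter standing in for a RowVersion column.
public sealed class VersionedRow
{
    public string SN { get; set; }
    public string Status { get; set; }
    public long Version { get; set; }
}

// Hypothetical in-memory store, for illustration only.
public sealed class DeviceStore
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, VersionedRow> _rows = new Dictionary<string, VersionedRow>();

    public void Insert(string imei, string sn, string status)
    {
        lock (_gate) { _rows[imei] = new VersionedRow { SN = sn, Status = status, Version = 1 }; }
    }

    // Returns a copy so callers cannot mutate the stored row directly.
    public VersionedRow Read(string imei)
    {
        lock (_gate)
        {
            var r = _rows[imei];
            return new VersionedRow { SN = r.SN, Status = r.Status, Version = r.Version };
        }
    }

    // Writes only if nobody changed the row since it was read.
    public bool TryWrite(string imei, VersionedRow changed)
    {
        lock (_gate)
        {
            var current = _rows[imei];
            if (current.Version != changed.Version) return false; // conflict detected
            _rows[imei] = new VersionedRow
            {
                SN = changed.SN,
                Status = changed.Status,
                Version = current.Version + 1
            };
            return true;
        }
    }
}

public static class OptimisticUpdate
{
    // Re-reads the row and re-applies the change until the write goes through.
    // Returns the number of attempts that were needed.
    public static int UpdateWithRetry(DeviceStore store, string imei,
                                      Action<VersionedRow> change, int maxAttempts = 3)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            var row = store.Read(imei);
            change(row);
            if (store.TryWrite(imei, row)) return attempt;
        }
        throw new InvalidOperationException("Row kept changing; giving up.");
    }
}
```

In EF Core the same effect is usually obtained by marking a byte[] property with [Timestamp]; SaveChanges then throws DbUpdateConcurrencyException on a conflict, which plays the role of the false return value here.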

Related

Incrementing a single database row field multiple times at once while making sure that all changes are saved?

I have an ASP.NET 5 Web API application using EF Core with Npgsql (PostgreSQL).
I'm trying to implement a file download tracker that increments some fields in my database when a file is requested.
Let's say I have this object:
public class Stats
{
[Key]
public long ID { get; set; }
public long DownloadCount { get; set; }
}
This application will obviously have many users requesting many different files at the same time, resulting in lots of changes made to the same value for each file. Now, I'm wondering whether this type of thing is already implemented in the internal change tracker or not.
If it isn't, how can I implement it in a way that will make sure that every request is counted?

Update Database after years of utilisation

EDIT 1: story example at the end
Years ago, we created tables in order to count how many products there were in our boxes.
There are two simple tables:
product (
code VARCHAR(16) PK,
length INT,
width INT,
height INT
)
box (
pkid INT IDENTITY(1,1),
barcode varchar(18),
product_code VARCHAR(16) FK,
quantity INT
)
And there are two associated classes:
public struct Product
{
public string Code { get; set; }
public int Length { get; set; }
public int Width { get; set; }
public int Heigth { get; set; }
}
public struct Box
{
public int Id { get; set; }
public string BarCode { get; set; }
public Product Product { get; set; }
public int Quantity { get; set; }
}
After years, we need to put multiple different products in the same box, so we now need this:
product (
code VARCHAR(16) PK,
length INT,
width INT,
height INT
)
-- box changed
box (
pkid INT IDENTITY(1,1),
barcode varchar(18)
)
-- stock created
stock (
box_pkid INT M-PK FK,
product_code VARCHAR(16) M-PK FK,
quantity INT
)
and this:
public struct Product
{
public string Code { get; set; }
public int Length { get; set; }
public int Width { get; set; }
public int Heigth { get; set; }
}
public struct Box
{
public int Id { get; set; }
public string BarCode { get; set; }
public Dictionary<Product, int> Content { get; set; } // <-- this changed
public int Quantity { get; set; }
}
But after years, we have a lot of code, maybe with duplicates in some dark places, left behind by colleagues who have since moved on. I am a trainee, so I am asking for the sake of my future experience, in order to avoid this later.
What could be a solution to update our schema while keeping data integrity safe, even with millions of rows in the DB?
Example:
In 2014, we needed to store 10 Romeo and Juliet books in one box. If we had some Hamlet books, then we put them in another box. All 10 Romeo and Juliet books were the 'same' product (same cover, same content, same reference).
Today, we want to store, let's say, different Shakespeare books in the same box. Or maybe different Love books. Or even Romeo and Juliet books AND figurines? So different products together: we should change the box table and Box class, shouldn't we?
You have many challenges; I'd split them into two high-level groups.
Firstly, how do you change your application at the code level, and secondly, how do you migrate your data from the old schema to the new one.
How do you change your code?
The first question is: can you be 100% certain that the classes you list are the only ways the data is accessed and modified? Are there any triggers, stored procedures, batch jobs, or other applications? I'm not aware of any way of finding this out other than by trawling through both the database schema artifacts, and the code base.
Within your "own" application, you have a choice. It's usually better to extend your interface than to modify it. In practical terms, that means that instead of changing the public Product Product { get; set; } signature to handle a dictionary, you keep it around and add public Dictionary<Product, int> Content { get; set; } - if you can guarantee that the old member still works. This limits how many of your class's dependents need rewriting - you only have to worry about clients that need to understand that there could be more than 1 product in a box.
This allows you to follow a "lots of small changes, but the existing code continues to work" model; you can manage this via feature toggles etc. It's much lower risk - so the lesson here is "design your solution to be open to extension, but closed to change".
In this case, it doesn't seem possible - the "set" method may be okay (you can default that to a "one product in a box" solution), but the "get" method would have no graceful way of handling the case where you have more than 1 product in a box. If that's true, you change the class, and look for all the instances where your code won't compile, and follow the chain of dependencies.
For instance, in a typical MVC framework, in this case you'd be changing the model; this should cause the controller to report a compile error. In resolving that error, you almost certainly modify the signature of the controller methods. This in turn should break the view. So you follow that chain; doing this means your schema change becomes a "big bang, all-or-nothing" release. This typically is stressful for all involved...
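As a rough illustration of the "extend rather than modify" option and its limits, the sketch below keeps the old Product member next to the new Content dictionary. It is only an assumption about how such a shim could look: Box is written as a class rather than a struct to keep the initializer simple, and Product is trimmed to its key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Product trimmed to its key for brevity.
public struct Product
{
    public string Code { get; set; }
}

public class Box
{
    public int Id { get; set; }
    public string BarCode { get; set; }

    // New model: several products per box, each with its own quantity.
    public Dictionary<Product, int> Content { get; set; } = new Dictionary<Product, int>();

    // Legacy member kept for old callers; it only works for single-product boxes.
    public Product Product
    {
        get
        {
            if (Content.Count != 1)
                throw new InvalidOperationException(
                    "Box holds " + Content.Count + " products; use Content instead.");
            return Content.Keys.Single();
        }
        set
        {
            // Old "one product per box" behaviour: replace the content, keep the quantity.
            int quantity = Content.Count == 1 ? Content.Values.Single() : 0;
            Content = new Dictionary<Product, int> { [value] = quantity };
        }
    }
}
```

The setter degrades gracefully, but the getter can only throw once a box holds more than one product, which is the "no graceful way" problem described above. Marking the legacy member with [Obsolete] would at least make the compiler list the remaining callers.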
How do you release your change?
This depends hugely on which of the two options you've chosen. @gburton's answer covers the database steps; these are necessary in both code options.
The second challenge is releasing new versions of your software; if it's a desktop client, for instance, you must make sure all clients are updated at the same time as your database change. This is (usually) horrible. If it's a web application, it's usually a little easier - fewer client machines to worry about.
Safely updating a legacy system is a classic problem. I'm guessing from your post that there isn't a nice safe dev copy of the DB, or at least one that is up to date, or you would already have a process to apply here.
I've written this in a system agnostic way even though you're obviously using MS SQL Server.
The key is to use caution, and ensure you are never 100% stuck if something goes wrong.
1. Back up the old DB. Ensure you know how to do this without breaking anything.
2. Restore that backup into a new location.
3. Figure out a test plan (this can be the longest part of the job).
4. Make the changes to the new copy of the DB (don't touch the live one).
5. Run through your test plan to ensure nothing has been broken.
If step 5 showed some errors, you just have to work through them. Once this is done, you have the scary part. The backup restore drill is critical here.
6. Take a backup of the live database (your previous backup is probably out of date; you want as fresh a backup as possible to reduce data loss).
7. Run a backup restore drill to make 100% sure you can recover.
8. Apply the changes to the live database.
9. Re-run your tests.
Recovering a database down to the individual transaction is possible with many database engines. Consider using that process for step 6 if possible. How to achieve this would be a separate question.
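One check that could go into the test plan is that the split of box into box and stock loses no quantities. The sketch below models the rows in memory; the record names and the BoxMigration helper are hypothetical, records need C# 9 or later, and product_code is assumed to be non-null:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Row shapes follow the old and new tables from the question.
public record OldBoxRow(int Pkid, string Barcode, string ProductCode, int Quantity);
public record NewBoxRow(int Pkid, string Barcode);
public record StockRow(int BoxPkid, string ProductCode, int Quantity);

public static class BoxMigration
{
    // Splits each old box row into one box row and one stock row.
    public static (List<NewBoxRow> Boxes, List<StockRow> Stock) Split(IEnumerable<OldBoxRow> oldRows)
    {
        var boxes = new List<NewBoxRow>();
        var stock = new List<StockRow>();
        foreach (var row in oldRows)
        {
            boxes.Add(new NewBoxRow(row.Pkid, row.Barcode));
            stock.Add(new StockRow(row.Pkid, row.ProductCode, row.Quantity));
        }
        return (boxes, stock);
    }

    // Test-plan check: per-product totals must be identical before and after.
    public static bool QuantitiesPreserved(IEnumerable<OldBoxRow> oldRows, IEnumerable<StockRow> stock)
    {
        var before = oldRows.GroupBy(r => r.ProductCode)
                            .ToDictionary(g => g.Key, g => g.Sum(r => r.Quantity));
        var after = stock.GroupBy(r => r.ProductCode)
                         .ToDictionary(g => g.Key, g => g.Sum(r => r.Quantity));
        return before.Count == after.Count
            && before.All(kv => after.TryGetValue(kv.Key, out var total) && total == kv.Value);
    }
}
```

On the real database the same comparison would be done with aggregate queries against the restored copy before anything touches the live system.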

Entity framework (core), save incomplete model containing required fields

Using : dotnet core 1.1, entity framework code first, sql server.
Is there any elegant way to enable a user working on a large form, represented by a complex model (40+ tables/C# objects) with multiple "required" fields, to save their work temporarily and come back to complete it afterward?
Let's say I have this model :
[Table("IdentificationInfo", Schema = "Meta")]
public class IdentificationInfo : PocoBase
{
[...]
public int MetaDataId { get; set; }
[ForeignKey("MetaDataId")]
public virtual MetaData MetaData { get; set; }
public int ProgressId { get; set; }
[ForeignKey("ProgressId")]
public Progress Progress { get; set; }
public virtual MaintenanceInfo MaintenanceInfo { get; set; }
public int PresentationFormId { get; set; }
[ForeignKey("PresentationFormId")]
public PresentationForm PresentationForm { get; set; }
private string _abstract;
[Required]
public string Abstract
{
get { return _abstract; }
set { SetFieldValue(ref _abstract, value, "Abstract"); }
}
[...]
}
[Table("PresentationForm", Schema = "Meta")]
public class PresentationForm : PocoEnumeration
{
[...]
}
The user starts to fill in everything (in a big form with multiple tabs or a really long page!), but needs to stop and save the progress without having had time to fill in the PresentationForm part or the abstract. Normally, in the database, those fields are not null, so it would fail when we try to save the model. Similarly, it would also fail EF validation in the UI.
What would be nice is to use the Progress property to disable EF model validation (model.isValid()) and also allow the database insert even if the fields are null (it is not possible to put default values in those non-nullable fields, as they are often foreign keys to enum-like tables).
For the model validation part, I know we can make some custom validator, with custom annotation such as [RequiredIf("FieldName","Value","Message")]. I'm really curious about some method to do something similar in the database?
Would the easy way be to save the model as JSON in a temporary table as long as the progress status is not completed, retrieve it for editing directly from the JSON when needed, and save it to the real tables only when the status is completed?
To support what you ask elegantly, you should design for it.
A table together with its required columns should be the minimum segment that has to be filled in before any save, so choose the segment size accordingly.
You could set all fields to allow null, but that would be very bad design, so I would not consider that option at all.
Now, if your input consists of several logical parts, they could be different tabs on the form, with each tab stored in its own table in the DB and the main table holding FKs to the other tables.
Those FKs could be nullable, which would let you finish, say, the first 2 tabs, save, and leave the rest for later. You will then know that the FK columns that have values are finished (and maybe can still be edited), while the others are yet to be inserted. You can also have a Status column: Draft/Active/...
What's more, this design would allow you to have configurable tabs: for example, based on a selection on the main input, you could choose which tables can be filled in and enable/disable the appropriate tabs.
If, however, you don't want the FKs to be nullable, then the solution would be some temporary storage, one option being JSON in a single string column, as you mentioned yourself. But I see no issues with nullable FKs in this case.
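For the temporary-storage variant, a sketch of the round trip could look like this. It assumes System.Text.Json (available from .NET Core 3.0; on dotnet core 1.1 a library such as Json.NET would take its place), and IdentificationDraft, DraftRecord, DraftStorage and the Title field are hypothetical stand-ins for the real model:

```csharp
using System;
using System.Text.Json;

// Hypothetical, trimmed-down form model; every field is optional while drafting.
public class IdentificationDraft
{
    public string Title { get; set; }            // invented field, for illustration
    public int? PresentationFormId { get; set; } // nullable until the user picks one
    public string Abstract { get; set; }
}

// Hypothetical row of a drafts table: the whole form lives in one string column.
public class DraftRecord
{
    public Guid Id { get; set; }
    public string Status { get; set; }      // "Draft" until the form is complete
    public string PayloadJson { get; set; }
}

public static class DraftStorage
{
    // Serializes the partially filled form into a draft row.
    public static DraftRecord Save(IdentificationDraft draft)
    {
        return new DraftRecord
        {
            Id = Guid.NewGuid(),
            Status = "Draft",
            PayloadJson = JsonSerializer.Serialize(draft)
        };
    }

    // Restores the form for further editing.
    public static IdentificationDraft Load(DraftRecord record)
    {
        return JsonSerializer.Deserialize<IdentificationDraft>(record.PayloadJson);
    }

    // The real tables are written only once the required fields are present.
    public static bool IsComplete(IdentificationDraft draft)
    {
        return draft.PresentationFormId.HasValue
            && !string.IsNullOrWhiteSpace(draft.Abstract);
    }
}
```

The [Required] constraints stay on the real model, so nothing incomplete can reach the normalized tables; the trade-off is that draft data cannot be queried relationally.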

Working with database & architectural issue

I'm developing a web app (not ASP.NET), and I encountered a small architectural problem:
So, I have two classes to work with users.
public class User
{
public int Id { get; set; }
public string Username { get; set; }
public string Password { get; set; }
// Other properties...
}
public class Profile
{
public int Id { get; set; }
public string PhotoUrl { get; set; }
public string DisplayName { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public List<PostItem> Posts { get; set; }
}
I had to split these classes because there is a feature that allows you to view the profile of a certain member, and obviously you don't want to retrieve data from the database that contains the user's password, name and other private stuff (though it's not displayed in the view). So I'm storing this data in different tables: table Users contains personal information, while table Profiles contains the public information (it can be viewed by anyone).
But at the same time, in order not to break the single responsibility principle, I had to implement UserRepository and ProfileRepository classes that do some checking, adding and other stuff.
And here they come:
Issue 1: the code that handles user registration has turned into real hell: I have to check whether a record with a specific username exists in two different tables by instantiating two repositories.
Issue 2: Also, on the page where you can view public data, the latest posts need to be displayed, but here is another problem: I can't store complicated values in one column, so I have to store posts in another table too. It means that I need to implement PostRepository, and at the same time the Posts property in the Profile class is useless (though I need it to display the latest posts in the view), because in order to retrieve the latest posts you need to look through another table inside UserRepository, but that should be handled by PostRepository. The same goes for comments, for example.
So, this is my small problem. Any advice?
Ok, taking each item in turn;
1) It's perfectly normal to have the identity of a user checked through one repository and their permissions to your application stored in another. In fact, this is the basic idea behind federated identity. Consider that you might extend your application to allow identity to be provided by Facebook, but permissions by your own application, and you will see that separating them makes sense.
2) Yes, absolutely. What makes you think that a high-volume store like Posts is best served by the same repository in which you store a low-change-rate set of data like Permissions? One might be in Mongo, the other in Active Directory, with the identity being OAuth. Since you own the whole application, you see these as unnecessary complexities, whereas they represent good architectural separation.
Identity => not owned by your application. Slow change rate.
Permissions => owned by your application. Slow change rate.
Posts => owned by your application. Fast change rate.
Just looking at those three use-cases, it seems that using different repositories would be a good idea since they have such different profiles. If ultimately your repositories all map to a SQL Server (or other) implementation, then so be it; but by separating these architecturally you can use the best possible underlying implementation.
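One way this separation might look in code, as a sketch only: the interfaces, RegistrationService and the in-memory implementations are all hypothetical. The point is that registration composes the repositories in one place, and the username check only needs the store that owns usernames:

```csharp
using System;
using System.Collections.Generic;

public interface IUserRepository
{
    bool UsernameExists(string username);
    int Add(string username, string passwordHash); // returns the new user id
}

public interface IProfileRepository
{
    void Add(int userId, string displayName);
}

// Orchestrates registration so callers never touch the repositories directly.
public sealed class RegistrationService
{
    private readonly IUserRepository _users;
    private readonly IProfileRepository _profiles;

    public RegistrationService(IUserRepository users, IProfileRepository profiles)
    {
        _users = users;
        _profiles = profiles;
    }

    public int Register(string username, string passwordHash, string displayName)
    {
        if (_users.UsernameExists(username))
            throw new InvalidOperationException("Username is already taken.");

        int id = _users.Add(username, passwordHash);
        _profiles.Add(id, displayName); // the profile is keyed by user id, so no second lookup
        return id;
    }
}

// In-memory stand-ins; real implementations could sit on different stores.
public sealed class InMemoryUserRepository : IUserRepository
{
    private readonly Dictionary<string, int> _idsByUsername =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

    public bool UsernameExists(string username) => _idsByUsername.ContainsKey(username);

    public int Add(string username, string passwordHash)
    {
        int id = _idsByUsername.Count + 1;
        _idsByUsername.Add(username, id);
        return id;
    }
}

public sealed class InMemoryProfileRepository : IProfileRepository
{
    public Dictionary<int, string> DisplayNames { get; } = new Dictionary<int, string>();

    public void Add(int userId, string displayName) => DisplayNames[userId] = displayName;
}
```

A PostRepository would slot in the same way: the page that shows a profile asks it for the latest posts by profile id, instead of Profile carrying a Posts list that some other repository has to fill.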

ServiceStack Request and Response Objects

Is it OK (read: good practice) to re-use POCOs for the request and response DTOs? Our POCOs are lightweight (ORM Lite) with only properties and some decorating attributes.
Or, should I create other objects for the request and/or response?
Thanks,
I would say it depends on your system, and ultimately how big and complex it is now, and has the potential to become.
The ServiceStack documentation doesn't specify which design pattern you should use. Ultimately it provides the flexibility for separating the database model POCO's from the DTOs, but it also provides support for their re-use.
When using OrmLite:
OrmLite was designed so that you could re-use your data model POCOs as your request and response DTOs. As noted from the ServiceStack documentation, this was an intentional design aim of the framework:
The POCOs used in Micro ORMS are particularly well suited for re-using as DTOs since they don't contain any circular references that the Heavy ORMs have (e.g. EF). OrmLite goes 1-step further and borrows pages from NoSQL's playbook where any complex property e.g. List is transparently blobbed in a schema-less text field, promoting the design of frictionless Pure POCOS that are uninhibited by RDBMS concerns.
Consideration:
If you do opt to re-use your POCOs, because it is supported, you should be aware that there are situations where it will be smarter to use separate request and response DTOs.
In many cases these POCO data models already make good DTOs and can be returned directly instead of mapping to domain-specific DTOs.
^ Not all cases. Sometimes the difficulty of choosing your design pattern is foreseeing the cases where it may not be suitable for re-use. So hopefully a scenario will help illustrate a potential problem.
Scenario:
You have a system where users can register for your service.
You, as the administrator, have the ability to list users of your service.
If you take the OrmLite POCO re-use approach, then we may have this User POCO:
public class User
{
[PrimaryKey, AutoIncrement, Alias("Id")]
public int UserId { get; set; }
public string Username { get; set; }
public string Password { get; set; }
public string Salt { get; set; }
public bool Enabled { get; set; }
}
When you make your Create User request you populate Username and Password of your User POCO as your request to the server.
We can't just push this POCO into the database because:
The password in the Password field will be plain text. We are good programmers, and security is important, so we need to create a salt which we add to the Salt property, and hash Password with the salt and update the Password field. OK, that's not a major problem, a few lines of code will sort that before the insert.
The client may have set a UserId, but for create this wasn't required and will cause our database query to fail the insert. So we have to default this value before inserting into the database.
The Enabled property may have been passed with the request. What if somebody has set this? We only wanted to deal with Username and Password, but now we have to consider other fields that would affect the database insert. Similarly they could have set the Salt (though this wouldn't be a problem because we would be overriding the value anyway.) So now you have added validation to do.
But now consider when we come to returning a List<User>.
If you re-use the POCO as your response type, there are a lot of fields that you don't want exposed back to the client. It wouldn't be smart to do:
return Db.Select<User>();
Because you don't have a tight purpose built response for listing Users, the Password hash and the Salt would need to be removed in the logic to prevent it being serialised out in the response.
Consider also that during the registration of a user, that as part of the create request we want to ask if we should send a welcome email. So we would update the POCO:
public class User
{
// Properties as before
...
[Ignore] // This isn't a database field
public bool SendWelcomeEmail { get; set; }
}
We now have the additional property that is only useful in the user creation process. If you use the User POCO over and over again, you will find over time you are adding more and more properties that don't apply to certain contexts.
When we return the list of users, for example, there is now an optional property of SendWelcomeEmail that could be populated - it just doesn't make sense. It can then be difficult to maintain the code.
A key thing to remember is that when sharing a POCO object such that it is used as both a request and response object: Properties that you send as a response will be exposed in a request. You will have to do more validation on requests, ultimately the sharing of the POCO may not save effort.
In this scenario wouldn't it be far easier to do:
public class CreateUserRequest
{
public string Username { get; set; }
public string Password { get; set; }
public bool SendWelcomeEmail { get; set; }
}
public class UserResponse
{
public int UserId { get; set; }
public string Username { get; set; }
public bool Enabled { get; set; }
}
public class User
{
[PrimaryKey, AutoIncrement, Alias("Id")]
public int UserId { get; set; }
public string Username { get; set; }
public string Password { get; set; }
public string Salt { get; set; }
public bool Enabled { get; set; }
}
We know now when we create a request (CreateUserRequest) that we don't have to consider UserId, Salt or Enabled.
When returning a list of users it's now List<UserResponse> and there is no chance the client will see any properties we don't want them to see.
It's clear to other people looking at the code, the required properties for requests, and what will be exposed in response.
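To show how the three classes could fit together, here is a sketch of the two mappings. It is not ServiceStack-specific: the OrmLite attributes are left out so the block stands alone, UserMapping is a hypothetical helper, the PBKDF2 parameters and the default for Enabled are assumptions, and the static RandomNumberGenerator.GetBytes and Rfc2898DeriveBytes.Pbkdf2 calls require .NET 6 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

public class CreateUserRequest
{
    public string Username { get; set; }
    public string Password { get; set; }
    public bool SendWelcomeEmail { get; set; }
}

public class UserResponse
{
    public int UserId { get; set; }
    public string Username { get; set; }
    public bool Enabled { get; set; }
}

// Data model as in the answer, minus the OrmLite attributes.
public class User
{
    public int UserId { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
    public string Salt { get; set; }
    public bool Enabled { get; set; }
}

public static class UserMapping
{
    private const int Iterations = 100_000; // illustrative value, tune for your hardware

    // Builds the row to insert; the client cannot influence UserId, Salt or Enabled.
    public static User ToNewUser(CreateUserRequest request)
    {
        byte[] salt = RandomNumberGenerator.GetBytes(16);
        byte[] hash = Rfc2898DeriveBytes.Pbkdf2(
            request.Password, salt, Iterations, HashAlgorithmName.SHA256, 32);

        return new User
        {
            // UserId stays 0 so the database can assign it.
            Username = request.Username,
            Password = Convert.ToBase64String(hash),
            Salt = Convert.ToBase64String(salt),
            Enabled = false // assumed default; the server decides, not the client
        };
    }

    // Only the fields meant for clients are copied; hash and salt never leave.
    public static UserResponse ToResponse(User user)
    {
        return new UserResponse
        {
            UserId = user.UserId,
            Username = user.Username,
            Enabled = user.Enabled
        };
    }

    public static List<UserResponse> ToResponses(IEnumerable<User> users)
    {
        return users.Select(ToResponse).ToList();
    }
}
```

With this in place, the listing from the scenario would return UserMapping.ToResponses(Db.Select<User>()) rather than the raw rows, and SendWelcomeEmail is read from the request without ever becoming part of the data model.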
Summary:
Sorry, it's a really long answer, but I think this addresses an aspect of sharing POCOs that some people miss, or fail to grasp initially, I was one of them.
Yes you can re-use POCOs for requests and response.
The documentation says it's OK to do so. In fact it is by design.
In many cases it will be fine to re-use.
There are cases where it's not suitable. (My scenario tries to show this, but you'll find as you develop your own real situations.)
Consider how many additional properties may be exposed because your shared POCO tries to support multiple actions, and how much extra validation work may be required.
Ultimately it's about what you are comfortable maintaining.
Hope this helps.
We take a different approach, and my answer is opinionated,
because we work not only with C# clients but mainly with JavaScript clients.
The request and response DTOs, the routes and the data entities are negotiated between
the customer and the front-end analyst. They are part of the specs in a detailed form,
even if the "customer", in some cases, is our own product UI.
These DTOs don't change without an important reason and can be reused on both sides.
But the objects in the data layer can be the same class, a partial class, or a different one.
They can be changed internally, including sensitive or workflow information,
but they have to be compatible with the specification of the API.
We start with the API first, not the database or ORM.
Person { ... }
Address { ... }
ResponseDTO
{
public bool success {get; set;}
public string message {get; set;}
public Person person {get; set;}
public List<Address> addresses {get; set;}
//customer can use the Person & Address, which can be the same or different
//with the ones in the data-layer. But we have defined these POCO's in the specs.
}
RequestDTO
{
public int Id {get; set;}
public FilteByAge {get; set;}
public FilteByZipCode {get; set;}
}
UpdatePersonRequest
{
public int Id {get; set;}
public bool IsNew {get; set;}
public Person person {get; set;}
public List<Address> addresses {get; set;}
}
We don't expose only request or response DTOs.
Person and Address are negotiated with the customer and are referenced in the API specs.
They can be the same as, a part of, or different from the internal data-layer implementation.
The customer will use them in their application, web site, or mobile app,
but the important thing is that we design and negotiate the API interface first.
We also often use the request DTO as the parameter to the business layer function,
which returns the response object or collection.
This way the service code is a thin wrapper in front of the business layer.
ResponseDTO Get(RequestDTO request)
{
return GetPersonData(request);
}
See also the API-first development approach described in the ServiceStack wiki.
This will not be a problem given you are OK with exposing the structure of your data objects (if this is a publicly consumed API). Otherwise, Restsharp is made to be used with simple POCOs :)
I think it all depends on how you're using your DTOs, and how you want to balance re-usability of code against readability. If your requests and responses both use the majority of the properties on your DTOs, then you'll get a lot of re-usability without really lowering readability. If, for instance, your request object has 10 properties but your response only needs 1 of them (or vice versa), someone could argue that it's easier to understand/read if your response object had only that 1 property on it.
In summary, good practice is just clean code. You have to evaluate your particular use case on whether or not your code is easy to use and read. Another way to think of it, is to write code for the next person who will read it, even if that person is you.
