I allow a user to download some data to csv. They can then edit some columns and then upload it back. I need a speed efficient way to compare certain columns between like objects to see what changed.
Currently I pull the original data from the DB and make it a list so it's all in memory. There are about 100k items so it's not that bad. That part takes less than a second. Then I load the csv file into a list as well. Both lists have the same class type.
Then I loop over the csv data (as they probably removed some rows which they didn't change but they could still have changed a lot of rows). For each row in the csv list I query the list that came from the DB to find that object. Now I have the csv object and the object from the database as the same structure. Then I run it through a custom object compare function that looks at certain columns to see if anything changed.
If something did change I have to validate that what they entered is a valid value by querying another reference list for that column. If it's not valid I write it out to an exceptions list. At the end, if there are no exceptions I save to the db. If there are exceptions I don't save anything and I show them the list of errors.
The detail compare provides a list of columns and the old vs new values that changed. I need this to query the reference list to make sure the new value is valid before I make the change. It's fairly inefficient but it gives great detail to the user about what may be an issue with an upload which is very valuable.
This is very slow. I'm looking for ways to speed it up while still being able to give the user detailed information about why it may have failed so they can correct it.
// get all the new records from the csv
var newData = csv.GetRecords<MyTable>().ToArray();
// select all data from database to list
var origData = ctx.MyTable.Select(s => s).ToList();
// look for any changes in the new data and update the database. note we are looping over the new data so if they removed some data from the csv file it just won't loop over those and they won't change
foreach (var d in newData)
{
    // find data so we can compare between new (csv) and current (from db) to see what possibly changed
    var oData = (from o in origData
                 where o.id == d.id
                 select o).FirstOrDefault();
    // only the columns in the updatableColumns list are compared
    var diff = d.DetailedCompare(oData, comparableColumns.ToList());
    if (diff.Count > 0)
    {
        // even though there are differences between the csv record and db record doesn't mean what the user input is valid. only existing ref data is valid and needs to be checked before a change is made
        bool changed = false;
        // make a copy of this original data and we'll check after if we actually were able to make a change to it (was the value provided valid)
        var data = CopyRecord(oData);
        // update this record's data fields that have changed with the new data
        foreach (var v in diff)
        {
            // special check for setting a value to NULL as its always valid to do this but wouldn't show up in ref data to pass the next check below
            if (v.valA == null)
            {
                oData.GetType().GetProperty(v.Prop).SetValue(oData, v.valA);
                oData.UpdatedBy = user;
                oData.UpdatedDate = DateTime.Now;
                changed = true;
            }
            // validate that the value for this column is in the ref table before allowing an update. note exception if not so we can tell the user
            else if (refData[v.Prop].Where(a => a.value == v.valA.ToString()).FirstOrDefault() != null)
            {
                // update the current objects values with the new objects value as it changed and is a valid value based on the ref data defined for that column
                oData.GetType().GetProperty(v.Prop).SetValue(oData, v.valA);
                oData.UpdatedBy = user;
                oData.UpdatedDate = DateTime.Now;
                changed = true;
            }
            else
            {
                // the value provided isn't valid for this column so note this to tell the user
                exceptions.Add(string.Format("Error: ID: {0}, Value: '{1}' is not valid for column [{2}]. Add the reference data if needed and re-import.", d.id, v.valA, v.Prop));
            }
        }
        // we only need to reattach and save off changes IF we actually changed something to a valid ref value and we had no exceptions for this record
        if (changed && exceptions.Count == 0)
        {
            // because our current object was in memory we will reattached it to EF so we can mark it as changed and SaveChanges() will write it back to the DB
            ctx.MyTable.Attach(oData);
            ctx.Entry(oData).State = EntityState.Modified;
            // add a history record for the change to this product
            CreateHistoryRecord(data, user);
        }
    }
}
// wait until the very end before making DB changed. we don't save anything if there are exceptions or nothing changed
if (exceptions.Count == 0)
{
    ctx.SaveChanges();
}
The first big win would be to put your data in a dictionary so you can get to the desired value quickly by ID, without having to search for the object through thousands of objects. I'm pretty sure it'll be faster.
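As a rough illustration of the dictionary idea, here is a minimal sketch. The `Row` class and its `Color` column are hypothetical stand-ins for `MyTable`, and the sketch assumes `id` is unique in the database list (otherwise `ToDictionary` throws). The index is built once, after which each csv row is matched by key instead of by scanning the whole list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the MyTable entity; only id matters for the lookup.
public class Row
{
    public int id { get; set; }
    public string Color { get; set; }
}

public static class LookupSketch
{
    // Build the index once: one pass over the DB rows, then keyed lookups.
    public static Dictionary<int, Row> IndexById(IEnumerable<Row> rows)
    {
        return rows.ToDictionary(r => r.id);
    }

    public static void Main()
    {
        var origData = new List<Row>
        {
            new Row { id = 1, Color = "Red" },
            new Row { id = 2, Color = "Blue" }
        };
        var newData = new[]
        {
            new Row { id = 2, Color = "Green" },
            new Row { id = 99, Color = "Black" }
        };

        var byId = IndexById(origData);
        foreach (var d in newData)
        {
            Row oData;
            if (!byId.TryGetValue(d.id, out oData))
            {
                // The csv contains an id the database does not know about.
                Console.WriteLine("ID {0} is not in the database", d.id);
                continue;
            }
            Console.WriteLine("ID {0}: {1} -> {2}", d.id, oData.Color, d.Color);
        }
    }
}
```

The same approach might also help the reference-data check, for example by keeping a `HashSet<string>` of valid values per column, but that is worth confirming with a profiler first.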
Beyond that I suggest you run your code through a profiler to determine exactly which parts are the slowest. It's entirely possible that DetailedCompare() does something that's terribly slow but may not be obvious.
One thing to consider is running the compares asynchronously, and/or the if (diff.Count > 0) branch. At least for the latter, assuming there are only a few scattered changes, why wait for all the copying and reflection? Put it in a separate function and run it in parallel.
I want to implement a restriction on creating duplicate data in an ASP.NET MVC project.
I have a table tSectionForwardSelling (SectionForwardSellingID, StoreID, SectionID, Amount, Date).
I want to prevent a user from entering duplicate data: if tSectionForwardSelling already contains a row with the same StoreID and SectionID as the data he wants to enter, he should only be able to edit the existing row.
I want to avoid this:
Amount   Date        SectionName   StoreName
$1000    5/20/2015   Men           Clarissa
$2345    5/20/2015   Men           Clarissa
Here is my Create ActionResult from tSectionForwardSellings controller:
// GET: tSectionForwardSellings/Create
public ActionResult Create()
{
    ViewBag.SectionID = new SelectList(db.tSections, "SectionID", "Section_Name");
    ViewBag.StoreID = new SelectList(db.tStores, "StoreID", "Store_Name");
    return View();
}
// POST: tSectionForwardSellings/Create
// To protect from overposting attacks, please enable the specific properties you want to bind to, for
// more details see http://go.microsoft.com/fwlink/?LinkId=317598.
[HttpPost]
[ValidateAntiForgeryToken]
public ActionResult Create([Bind(Include = "SectionForwardSellingID,Amount,Date,StoreID,SectionID")] tSectionForwardSelling tSectionForwardSelling)
{
    if (ModelState.IsValid)
    {
        db.tSectionForwardSellings.Add(tSectionForwardSelling);
        db.SaveChanges();
        return RedirectToAction("Index");
    }
    ViewBag.SectionID = new SelectList(db.tSections, "SectionID", "Section_Name", tSectionForwardSelling.SectionID);
    ViewBag.StoreID = new SelectList(db.tStores, "StoreID", "Store_Name", tSectionForwardSelling.StoreID);
    return View(tSectionForwardSelling);
}
And here the project itself:
https://drive.google.com/file/d/0BwgF9RnNTDDEOVlUMmxub2JxbFU/view?usp=sharing
Do you ever want that duplicated data to exist?
If the table should never contain more than one row for the same StoreID and SectionID values (the store and section shown in your example), then you should solve this in the database, either by creating a composite primary key (clustered index) on those two columns or by creating a unique non-clustered index on them.
Then in your ASP.NET MVC code you can also check whether the row already exists before inserting. You won't strictly have to, though, and your database will never be able to get into a bad state.
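The application-side check could look roughly like the sketch below. It is an in-memory stand-in rather than the actual controller code: the `ForwardSelling` class and the `List<T>` playing the role of the table are assumptions made so the example runs on its own. In the real action the same predicate would presumably be passed to `db.tSectionForwardSellings.Any(...)`, with a model error added when it returns true.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down version of tSectionForwardSelling.
public class ForwardSelling
{
    public int StoreID { get; set; }
    public int SectionID { get; set; }
    public decimal Amount { get; set; }
}

public static class DuplicateCheckSketch
{
    // Returns false, and adds nothing, when a row with the same
    // (StoreID, SectionID) pair is already present.
    public static bool TryAdd(List<ForwardSelling> table, ForwardSelling candidate)
    {
        bool exists = table.Any(r => r.StoreID == candidate.StoreID
                                  && r.SectionID == candidate.SectionID);
        if (exists)
        {
            return false;
        }
        table.Add(candidate);
        return true;
    }

    public static void Main()
    {
        var table = new List<ForwardSelling>();
        var first = new ForwardSelling { StoreID = 1, SectionID = 2, Amount = 1000m };
        var second = new ForwardSelling { StoreID = 1, SectionID = 2, Amount = 2345m };

        Console.WriteLine(TryAdd(table, first));   // accepted
        Console.WriteLine(TryAdd(table, second));  // rejected as a duplicate
    }
}
```

A check like this only improves the error message. Two requests can still pass it at the same moment, which is why the unique constraint in the database remains the actual safeguard.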
I'll echo some of what's already been said and add a few thoughts.
First: have a constraint at the database level that prevents the duplicate scenario outright. This is usually done with a key and some types of indexes can also force this constraint.
Second: before you add anything to the database that must be unique, ask the database for a copy of the object with those parameters. If it exists, simply update the record; if it doesn't, add the new item.
Third: if it's a critical item that must not under any circumstance be duplicated, make sure that for the second step you issue a lock so that no one else can do anything with that key. A lock will ensure that when you search for the item no one else will be able to then add it after you do.
In my own system I use a combination of SQL level locks and Cache based distributed locks. Either way, if it's a critical component, you will want to start to understand this sort of architecture better. In most non-critical low load scenarios you can get away with a simple look up though.
I can't believe it is so hard to get someone to show me a simple working example. It leads me to believe that everyone can only talk like they know how to do it but in reality they don't.
I shortened the post down to only what I want the example to do. Maybe the post was getting too long and scared people away.
To get this bounty I am looking for a WORKING EXAMPLE that I can copy in VS 2010 and run.
What the example needs to do.
Show what datatype should be in my domain for version as a timestamp in mssql 2008
Show nhibernate automatically throwing the "StaleObjectException"
Show me working examples of these 3 scenarios
Scenario 1
User A comes to the site and edits Row1. User B comes(note he can see Row1) and clicks to edit Row1, UserB should be denied from editing the row until User A is finished.
Scenario 2
User A comes to the site and edits Row1. User B comes 30mins later and clicks to edit Row1. User B should be able to edit this row and save. This is because User A took too long to edit the row and lost his right to edit.
Scenario 3
User A comes back from being away. He clicks the update row button and he should be greeted with StaleObjectException.
I am using asp.net mvc and fluent nhibernate. Looking for the example to be done in these.
What I tried
I tried to build my own but I can't get it to throw the StaleObjectException, nor can I get the version number to increment. I tried opening 2 separate browsers and loaded up the index page. Both browsers showed the same version number.
public class Default1Controller : Controller
{
    //
    // GET: /Default1/
    public ActionResult Index()
    {
        var sessionFactory = CreateSessionFactory();
        using (var session = sessionFactory.OpenSession())
        {
            using (var transaction = session.BeginTransaction())
            {
                var firstRecord = session.Query<TableA>().FirstOrDefault();
                transaction.Commit();
                return View(firstRecord);
            }
        }
    }
    public ActionResult Save()
    {
        var sessionFactory = CreateSessionFactory();
        using (var session = sessionFactory.OpenSession())
        {
            using (var transaction = session.BeginTransaction())
            {
                var firstRecord = session.Query<TableA>().FirstOrDefault();
                firstRecord.Name = "test2";
                transaction.Commit();
                return View();
            }
        }
    }
    private static ISessionFactory CreateSessionFactory()
    {
        return Fluently.Configure()
            .Database(MsSqlConfiguration.MsSql2008
                .ConnectionString(c => c.FromConnectionStringWithKey("Test")))
            .Mappings(m => m.FluentMappings.AddFromAssemblyOf<TableA>())
            // .ExposeConfiguration(BuidSchema)
            .BuildSessionFactory();
    }
    private static void BuidSchema(NHibernate.Cfg.Configuration config)
    {
        new NHibernate.Tool.hbm2ddl.SchemaExport(config).Create(false, true);
    }
}
public class TableA
{
    public virtual Guid Id { get; set; }
    public virtual string Name { get; set; }
    // Not sure what data type this should be for timestamp.
    // To eliminate changing to much started with int version
    // but want in the end timestamp.
    public virtual int Version { get; set; }
}
public class TableAMapping : ClassMap<TableA>
{
    public TableAMapping()
    {
        Id(x => x.Id);
        Map(x => x.Name);
        Version(x => x.Version);
    }
}
Will nhibernate stop the row from being retrieved?
No. Locks are only placed for the extent of a transaction, which in a web application ends when the request ends. Also, the default type of transaction isolation mode is Read committed which means that read locks are released as soon as the select statement terminates. If you are reading and making edits in the same request and transaction, you could place a read and write lock on the row at hand which would prevent other transactions from writing to or reading from that row. However, this type of concurrency control doesn't work well in a web application.
Or would the User B be able to still see the row but if he tried to save it would crash?
This would happen if optimistic concurrency was being used. In NHibernate, optimistic concurrency works by adding a version field. Save/update commands are issued with the version upon which the update was based. If that differs from the version in the database table, no rows are updated and NHibernate will throw.
What happens if User A say cancels and does not edit. Do I have to
release the lock myself or is there a timeout can be set to release
the lock?
No, the lock is released at the end of the request.
Overall, your best bet is to opt for optimistic concurrency with version fields managed by NHibernate.
How does it look in code? Do I setup in my fluent nhibernate to
generate a timestamp(not sure if I would timespan datatype).
I would suggest using a version column. If you're using FluentNhibernate with auto mappings, then if you make a column called Version of type int/long it will use that to version by default, alternatively you can use the Version() method in the mapping to do so (it's similar for timestamp).
So now I generated somehow the timestamp and the user is editing a
row(through a gui). Should I be storing the timestamp in memory or
something? Then when the user submits call from memory the timestamp
and id of the row and check?
When the user starts editing a row, you retrieve it and store the current version (the value of the version property). I would recommend putting the current version in a hidden field in the form. When the user saves his changes, you can either do a manual check against the version in the database (check that it's the same as the version in the hidden field), or you can set the version property to the value from the hidden field (if you are using databinding, you could do this automatically). If you set the version property, then when you try to save the entity, NHibernate will check that the version you're saving matches the version in the database, and throws an exception if it doesn't.
NHibernate will issue an update query something like:
UPDATE xyz
SET ...,
    Version = 16
WHERE Id = 1234 AND Version = 15
(assuming your version was 15) - in the process it will also increment the version field
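To make the version check concrete, here is a small self-contained sketch of the same idea without NHibernate. Everything in it (`VersionedStore`, `VersionedRow`, `StaleVersionException`) is hypothetical and only imitates what the `UPDATE ... WHERE Id = ... AND Version = ...` statement achieves: the update goes through only if the version the editor started from is still the current one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception, playing the role of NHibernate's stale-object error.
public class StaleVersionException : Exception
{
    public StaleVersionException(string message) : base(message) { }
}

public class VersionedRow
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Version { get; set; }
}

// In-memory stand-in for a versioned table (not thread-safe, illustration only).
public class VersionedStore
{
    private readonly Dictionary<int, VersionedRow> rows = new Dictionary<int, VersionedRow>();

    public void Insert(int id, string name)
    {
        rows[id] = new VersionedRow { Id = id, Name = name, Version = 1 };
    }

    // Returns a copy, like loading the row to render an edit form
    // with the version kept in a hidden field.
    public VersionedRow Load(int id)
    {
        var r = rows[id];
        return new VersionedRow { Id = r.Id, Name = r.Name, Version = r.Version };
    }

    // Mirrors "UPDATE ... SET Version = Version + 1 WHERE Id = @id AND Version = @expected".
    public void Update(VersionedRow edited)
    {
        var current = rows[edited.Id];
        if (current.Version != edited.Version)
        {
            throw new StaleVersionException("Row " + edited.Id + " was changed by someone else.");
        }
        current.Name = edited.Name;
        current.Version++;
    }
}

public static class OptimisticSketch
{
    public static void Main()
    {
        var store = new VersionedStore();
        store.Insert(1, "original");

        var userA = store.Load(1); // both users open the edit form
        var userB = store.Load(1);

        userB.Name = "B's edit";
        store.Update(userB);       // succeeds, version becomes 2

        userA.Name = "A's edit";
        try
        {
            store.Update(userA);   // still carries version 1
        }
        catch (StaleVersionException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Note that loading the row never changes the version; only a successful update does.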
If so that means the business logic is keeping track of the "row
locking" but in theory someone still could just go Where(x => x.Id ==
id) and grab that row and update at will.
If someone else updates the row via NHibernate, it will increment the version automatically, so when your user tries to save it with the wrong version you will get an exception which you need to decide how to handle (ie. try show some merge screen, or tell the user to try again with the new data)
What happens when the row gets updated? Do you set null to the timestamp?
It updates the version or timestamp (timestamp will get updated to the current time) automatically
What happens if the user never actually finishes updating and leaves. How does the row
ever become unlocked again?
The row is not locked per se, it is instead using optimistic concurrency, where you assume that no-one will change the same row at the same time, and if someone does, then you need to retry the update.
Is there still a race condition what happens or is this next to
impossible in happening? I am just concerned 2 ppl try to get edit the
same row and both of them see it in their gui for editing but one is
actually going to get denied in the end because they lost the race
condition.
If 2 people try to edit the same row at the same time, one of them will lose if you're using optimistic concurrency. The benefit is that they will KNOW that there was a conflict, as opposed to either losing their changes and thinking that it updated, or overwriting someone else's changes without knowing about it.
So I did something like this
var test = session.Query.Where(x => x.Id == id).FirstOrDefault();
// send to user for editing. Has versioning on it. User edits and sends back the data 30 mins later.
The code does
test.Id = vm.Id;
test.ColumnA = vm.ColumnA;
test.Version = vm.Version;
session.Update(test);
session.Commit();
So the above will work right?
The above will throw an exception if someone else has gone in and changed the row. That's the point of it, so you know that a concurrency issue has arisen. Typically you'd show the user a message saying "Someone else has changed this row" with the new row there and possibly their changes also so the user has to select which changes win.
but if I do this
test.Id = vm.Id;
test.ColumnA = vm.ColumnA;
session.Update(test);
session.Commit();
it would not commit right?
Correct as long as you haven't reloaded test (ie. you did test = new Xyz(), not test = session.Load() ) because the Timestamp on the row wouldn't match
If someone else updates the row via NHibernate, it will increment the
version automatically, so when your user tries to save it with the
wrong version you will get an exception which you need to decide how
to handle (ie. try show some merge screen, or tell the user to try
again with the new data)
Can I make it so this is checked when the record is grabbed? I want to
keep it simple at first so that only one person can edit at a time. The
other guy won't even be able to access the record to edit while
someone is editing it.
That's not optimistic concurrency. As a simple answer you could add a CheckOutDate property which you set when someone starts editing the row, and set back to null when they finish. Then when they start to edit, or when you show them the rows to edit, you could exclude all rows whose CheckOutDate falls within, say, the last 10 minutes (then you wouldn't need a scheduled task to reset it periodically).
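A minimal sketch of that check-out rule follows. The `EditableRow` class, the `TryCheckOut` and `CheckIn` helpers and the 10 minute constant are all assumptions for illustration, and the current time is passed in so the behaviour is easy to follow.

```csharp
using System;

// Hypothetical entity with the extra CheckOutDate column.
public class EditableRow
{
    public int Id { get; set; }
    public DateTime? CheckOutDate { get; set; }
}

public static class CheckOutSketch
{
    // A check-out older than this is treated as abandoned.
    public static readonly TimeSpan Timeout = TimeSpan.FromMinutes(10);

    // Succeeds when the row is free or its check-out has expired.
    public static bool TryCheckOut(EditableRow row, DateTime utcNow)
    {
        bool heldByOther = row.CheckOutDate.HasValue
                        && utcNow - row.CheckOutDate.Value < Timeout;
        if (heldByOther)
        {
            return false;
        }
        row.CheckOutDate = utcNow;
        return true;
    }

    // Called when the editor saves or cancels.
    public static void CheckIn(EditableRow row)
    {
        row.CheckOutDate = null;
    }

    public static void Main()
    {
        var row = new EditableRow { Id = 1 };
        var start = new DateTime(2012, 9, 28, 14, 0, 0, DateTimeKind.Utc);

        Console.WriteLine(TryCheckOut(row, start));                // True: user A starts editing
        Console.WriteLine(TryCheckOut(row, start.AddMinutes(5)));  // False: still checked out
        Console.WriteLine(TryCheckOut(row, start.AddMinutes(11))); // True: A's check-out expired
    }
}
```

Against a real table the test and the write would need to happen in one atomic statement (for example a single UPDATE with the date condition in its WHERE clause), otherwise two users could both pass the check. The sketch also does not record who holds the check-out.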
The row is not locked per se, it is instead using optimistic
concurrency, where you assume that no-one will change the same row at
the same time, and if someone does, then you need to retry the update.
I am not sure what you're saying. Does this mean I can do
session.query.Where(x => x.id == id).FirstOrDefault(); all day
long and it will keep getting me the record (though it would keep
incrementing the version)?
The query will NOT increment the version, only an update to it will increment the version.
I don't know that much about nHibernate itself, but if you are prepared to create some stored procs on the database it can >sort of< be done.
You will need one extra data column and two fields in your object model to store information against each row:
A 'hash' of all the field values (using SQL Server CHECKSUM 2008 and later or HASHBYTES for earlier editions) other than the hash field itself and the EditTimestamp field. This could be persisted to the table using INSERT/UPDATE triggers if needs be.
An 'edit-timestamp' of type datetime.
Change your procedures to do the following:
The 'select' procedure should include a where clause similar to 'edit-timestamp < (Now - 30 minutes)' and should update the 'edit-timestamp' to the current time. Run the select with appropriate locking BEFORE updating the row; I'm thinking of a stored procedure with hold locking. Use a persistent date/time (captured once into a variable) rather than calling something like GETDATE() in each statement.
Example (using fixed values):
BEGIN TRAN
    DECLARE @now DATETIME
    SET @now = '2012-09-28 14:00:00'
    SELECT *, @now AS NewEditTimestamp, CHECKSUM(ID, [Description]) AS RowChecksum
    FROM TestLocks
    WITH (HOLDLOCK, ROWLOCK)
    WHERE ID = 3 AND EditTimestamp < DATEADD(mi, -30, @now)
    /* Do all your stuff here while the record is locked */
    UPDATE TestLocks
    SET EditTimestamp = @now
    WHERE ID = 3 AND EditTimestamp < DATEADD(mi, -30, @now)
COMMIT TRAN
If you get a row back from this procedure then you 'have' the 'lock', otherwise, no rows will be returned and there's nothing to edit.
The 'update' procedure should add a where clause similar to 'hash = previously returned hash'
Example (using fixed values):
BEGIN TRAN
    DECLARE @RowChecksum INT
    SET @RowChecksum = -845335138
    UPDATE TestLocks
    SET [Description] = 'New Description'
    WHERE ID = 3 AND CHECKSUM(ID, [Description]) = @RowChecksum
    SELECT @@ROWCOUNT AS RowsUpdated
COMMIT TRAN
So in your scenarios:
User A edits a row. When you return this record from the database, the 'edit-timestamp' has been updated to the current time and you have a row so you know you can edit. User B would not get a row because the timestamp is still too recent.
User B edits the row after 30 minutes. They get a row back because the timestamp has passed more than 30 minutes ago. The hash of the fields will be the same as for user A 30 minutes ago as no updates have been written.
Now user B updates. The previously retrieved hash still matches the hash of the fields in the row, so the update statement succeeds, and we return the row-count to show that the row was updated. User A however, tries to update next. Because the value of the description field has changed, the hashvalue has changed, and so nothing is updated by the UPDATE statement. We get a result of 'zero rows updated' so we know that either the row has since been changed or the row was deleted.
There are probably some issues regarding scalability with all these locks going on and the above code could be optimised (might get problems with clocks going forward/back for example, use UTC), but I wrote these examples just to explain how it could work.
Outside of that I can't see how you can do this without utilising database level row-locking within the select transaction. It might be that you can request those locks via nHibernate, but that's beyond my knowledge of nHibernate I'm afraid.
Have you looked at the ISaveOrUpdateEventListener interface?
public class SaveListener : NHibernate.Event.ISaveOrUpdateEventListener
{
    public void OnSaveOrUpdate(NHibernate.Event.SaveOrUpdateEvent e)
    {
        NHibernate.Persister.Entity.IEntityPersister p = e.Session.GetEntityPersister(null, e.Entity);
        if (p.IsVersioned)
        {
            //TODO: check types etc...
            MyEntity m = (MyEntity) e.Entity;
            DateTime oldversion = (DateTime) p.GetVersion(m, e.Session.EntityMode);
            DateTime currversion = (DateTime) p.GetCurrentVersion(m.ID, e.Session);
            if (oldversion < currversion.AddMinutes(-30))
                throw new StaleObjectStateException("MyEntity", m.ID);
        }
    }
}
Then in your Configuration, register it.
private static void Configure(NHibernate.Cfg.Configuration cfg)
{
    cfg.EventListeners.SaveOrUpdateEventListeners = new NHibernate.Event.ISaveOrUpdateEventListener[] { new SaveListener() };
}
public static ISessionFactory CreateSessionFactory()
{
    return Fluently.Configure()
        .Database(...)
        .Mappings(...)
        .ExposeConfiguration(Configure)
        .BuildSessionFactory();
}
And version the Properties you want to version in your Mapping class.
public class MyEntityMap : ClassMap<MyEntity>
{
    public MyEntityMap()
    {
        Table("MyTable");
        Id(x => x.ID);
        Version(x => x.Timestamp);
        Map(x => x.PropA);
        Map(x => x.PropB);
    }
}
The short answer to your question is that you can't/shouldn't do this in a simple web application with NHibernate's optimistic (version) and pessimistic (row lock) locking. The fact that your transactions only last as long as a request is your limiting factor.
What you CAN do is create another table and entity class, and mappings that manages these "locks". At the lowest level you need an Id of the object being edited and the Id of the user performing the editing, and a datetime of when the lock was acquired. I would make the Id of the object being edited the primary key since you want it to be exclusive...
When a user clicks on a row to edit, you can try to acquire a lock (create a new record in that table with the ids and current datetime). If the lock already exists for another user, then it will fail because you are trying to violate a primary key constraint.
If a lock is acquired, when the user clicks save you need to check that they still have a valid "lock" before performing the actual save. Then, perform the actual save and remove the lock record.
I would also recommend a background service/process that sweeps these locks periodically and removes the ones that have expired or are older than your time limit.
This is my prescribed way of dealing with "locks" in a web environment. Good luck!
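The lock table described above could behave roughly like this sketch. It uses an in-memory dictionary where the answer uses a database table with the edited object's id as primary key, and all names (`EditLock`, `EditLockTable`, `TryAcquire`, `IsHeldBy`, `Release`, `Sweep`) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One record per object currently being edited.
public class EditLock
{
    public int EntityId { get; set; }
    public string UserId { get; set; }
    public DateTime AcquiredUtc { get; set; }
}

// In-memory stand-in for the lock table (not thread-safe, illustration only).
public class EditLockTable
{
    private readonly Dictionary<int, EditLock> locks = new Dictionary<int, EditLock>();
    private readonly TimeSpan maxAge;

    public EditLockTable(TimeSpan maxAge)
    {
        this.maxAge = maxAge;
    }

    // Fails when a lock row already exists, like a primary key violation would.
    public bool TryAcquire(int entityId, string userId, DateTime utcNow)
    {
        if (locks.ContainsKey(entityId))
        {
            return false;
        }
        locks.Add(entityId, new EditLock { EntityId = entityId, UserId = userId, AcquiredUtc = utcNow });
        return true;
    }

    // Checked right before the actual save.
    public bool IsHeldBy(int entityId, string userId)
    {
        EditLock l;
        return locks.TryGetValue(entityId, out l) && l.UserId == userId;
    }

    // Called after a successful save or a cancel.
    public void Release(int entityId, string userId)
    {
        if (IsHeldBy(entityId, userId))
        {
            locks.Remove(entityId);
        }
    }

    // What the background sweeper would do; returns how many locks were removed.
    public int Sweep(DateTime utcNow)
    {
        var expired = locks.Values
            .Where(l => utcNow - l.AcquiredUtc > maxAge)
            .Select(l => l.EntityId)
            .ToList();
        foreach (var id in expired)
        {
            locks.Remove(id);
        }
        return expired.Count;
    }
}

public static class LockTableSketch
{
    public static void Main()
    {
        var table = new EditLockTable(TimeSpan.FromMinutes(30));
        var t0 = new DateTime(2012, 9, 28, 14, 0, 0, DateTimeKind.Utc);

        Console.WriteLine(table.TryAcquire(1, "A", t0));                // A starts editing
        Console.WriteLine(table.TryAcquire(1, "B", t0.AddMinutes(5)));  // B is denied
        table.Sweep(t0.AddMinutes(31));                                 // A's lock expires
        Console.WriteLine(table.TryAcquire(1, "B", t0.AddMinutes(31))); // B may edit now
        Console.WriteLine(table.IsHeldBy(1, "A"));                      // A can no longer save
    }
}
```

This lines up with the three scenarios in the question: B is refused while A holds the lock, B gets in once the lock has been swept, and A's late save is rejected because the lock no longer belongs to A.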
Yes, it is possible to lock a row with NHibernate, but if I understand correctly your scenario is in a web context, and there it is not the best practice.
The best practice is to use optimistic locking with automatic versioning, as you mentioned.
Locking a row when the page opens and releasing it when the page unloads will quickly lead to the row staying locked (javascript issue, page not closed properly...).
Optimistic locking will make NHibernate throw an exception when flushing a transaction which contains objects modified by another session.
If you want true concurrent modification of the same information you may think about a system which merges many users' input into the same document, but that is a system of its own, not managed by the ORM.
You will have to choose a way to deal with session in a web environment.
http://nhibernate.info/doc/nh/en/index.html#transactions-optimistic
The only approach that is consistent with high concurrency and high
scalability is optimistic concurrency control with versioning.
NHibernate provides for three possible approaches to writing
application code that uses optimistic concurrency.
Hey you can try these sites
http://thesenilecoder.blogspot.ca/2012/02/nhibernate-samples-row-versioning-with.html
http://stackingcode.com/blog/2010/12/09/optimistic-concurrency-and-nhibernate
I've got a SQL Server database that I'm trying to build a RESTful API for.
I'm using ADO.Net and Linq to retrieve a single row from a table like this:
[HttpGet]
public tTrip getTripById(Guid id)
{
    var _trip = (from trips in db.tTrip
                 where trips.ID == id
                 select trips).FirstOrDefault();
    return _trip;
}
When I debug the code the correct object is retrieved. If I keep running however, there will be no response. I'm guessing that's because for every foreign key present in the returned row, ADO does another lookup through the other mapped tables which slows down everything by a lot.
If I only select a single column that doesn't contain any FKCs everything works fine.
Any ideas how I can turn off the FKC lookup for that fetched object?
Thank you!
I found the problem - In the ObjectContext class (that's where the 'db' variable comes from btw), I had the ContextOptions.LazyLoadingEnabled variable set to true.
Set it to false and the application returns only the Guid for every entry instead of loading the entry details from the database.
My table has two ID fields (I did not put 2 IDs so dont ask me why). One is a primary key and the other is a nullable duplicate field which will contain the value of the primary key itself.
public static void UpdateDuplicate_ID(Company updatingCompany)
{
    Company tempCompany;
    using (var context = new TestLiveDataContext())
    {
        tempCompany = (from company in context.Companies
                       where company.Id == updatingCompany.Id
                       select company).FirstOrDefault();
        tempCompany.DuplicateId = updatingCompany.DuplicateId;
        context.SubmitChanges();
    }
}
It seems the above code is not working. I can't update the duplicate id with my primary key value. Can anyone tell me whether I am missing anything here?
As much as I can see, updatingCompany and tempCompany appear to be the same record.
If this is the case, you may be overwriting the change outside of this method if you later change the value passed in and save again.
Does beg the question, why don't you just change the value in updatingCompany and then submit changes on its own context, rather than starting up a new one?
That is unless I have misunderstood the problem.