Conditional mapping using GraphDiff

I am having an issue with GraphDiff whilst saving some data, and I just need someone to confirm whether this is possible.
I will provide an example of what is going on:
Firstly, I am using VS2017 (latest revision), EF 6, AutoMapper and GraphDiff.
I have a table which contains the following data.
As you can see this lists data for a ParcelId of 5023, the only difference is the IsAcquired and IsCurrent flags.
I am not going into the code that updates the data with the IsCurrent flag set to true, as it is very complicated. In essence, a screen allows users to enter values; on saving, it sets the records in the second list from IsCurrent = true to false and inserts three new records that have the new values and IsCurrent set to true. This gives us the ability to undo these changes.
Now, I have a different screen that enables you to edit the main data, in other words the data from the first grid.
This uses GraphDiff to update the data. The data in the second grid is NOT retrieved for this edit, but on saving, the GraphDiff operation sees that the data is different and overwrites it.
dbContext.UpdateGraph(a, map => map.OwnedCollection(x => x.ParcelRight));
I need it to ignore the records that have IsCurrent equal to true and only update records that have IsAcquired = true.
I tried:
dbContext.UpdateGraph(a, map => map.OwnedCollection(x => x.ParcelRight
    .Where(r => r.IsAcquired == true).ToList()));
but it did not work.
I found the following which sort of implies that it cannot be done.
Research

Thanks to all who looked at this. It looks like this is something that causes GraphDiff issues, so I have changed my code for this update to use a more traditional LINQ to Entities solution.
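For readers wondering what such a "traditional" update might look like, here is a minimal sketch of the merge step only. It is not the poster's actual code: the ParcelRight shape (Id, ParcelId, IsAcquired, IsCurrent, Value) and the ParcelRightUpdater helper are assumptions made for illustration, and it works on plain in-memory lists so it runs without EF.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a ParcelRight row; only the members used here.
public class ParcelRight
{
    public int Id { get; set; }
    public int ParcelId { get; set; }
    public bool IsAcquired { get; set; }
    public bool IsCurrent { get; set; }
    public decimal Value { get; set; }
}

public static class ParcelRightUpdater
{
    // Copies edited values onto the stored rows, touching only rows flagged
    // IsAcquired. Rows that are not IsAcquired (for example the IsCurrent
    // rows) are never modified or removed.
    public static void ApplyAcquiredEdits(
        IList<ParcelRight> stored, IEnumerable<ParcelRight> edited)
    {
        // Index the incoming IsAcquired rows by key for quick lookup.
        var editedById = edited
            .Where(r => r.IsAcquired)
            .ToDictionary(r => r.Id);

        foreach (var row in stored.Where(r => r.IsAcquired))
        {
            if (editedById.TryGetValue(row.Id, out var incoming))
            {
                row.Value = incoming.Value;
            }
        }
    }
}
```

With EF, stored would presumably be the tracked ParcelRight entities loaded from the context for the parcel, and SaveChanges would be called afterwards, so only the rows changed here are written back.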

Related

Remove list values based on series of other values

I have a situation wherein a List object is built off of values pulled from a MSSQL database. However, this particular table is mysteriously getting an errant record or two tossed in. Removing the records causes trouble even though they have no referential links to any other tables, and they still get recreated without any known user actions taken. This causes some trouble as it puts unwanted values on display that add a little bit of confusion. The specific issue is that this is a platform that allows users to run a search for quotes, and the filtering allows for sales rep selection. The select/dropdown field is showing these errant values, and they need to be removed.
Given that deleting the offending table rows does not provide a desirable result, I was thinking that maybe the best course of action was to modify the code where the List object is created and either filter the values out or remove them after the object is populated. I'd like to do this in a clean, scalable fashion by providing some kind of appendable data object where I could just add in a new string value if something else cropped up, as opposed to doing something clunky that adds new code to find the value and remove it each time.
My thought was to create a string array, and somehow loop through that to remove bad List values, but I wasn't entirely certain that was the best way to approach this, and I could not for the life of me think of a clean approach for this. I would think that the best way would be to add a filter within the Find arguments, but I don't know how to add in an array or list that way. Otherwise I figured to loop through the values either before or after the sorting of the List and remove any matches that way, but I wasn't sure that was the best choice of actions.
I have attached the current code, and would appreciate any suggestions.
int licenseeID = Helper.GetLicenseeIdByLicenseeShortName(Membership.ApplicationName);
List<User> listUsers;
if (Roles.IsUserInRole("Admin"))
{
    //get all users
    listUsers = User.Find(x => x.LicenseeID == licenseeID).ToList();
}
else
{
    //get only the current user
    listUsers = User.Find(x => (x.LicenseeID == licenseeID && x.EmailAddress == Membership.GetUser().Email)).ToList();
}
listUsers.Sort((x, y) => string.Compare(x.FirstName, y.FirstName));
-- EDIT --
I neglected to mention that I did not develop this, I merely inherited its maintenance after the original developer(s) disappeared, and my coworker who was assigned to it left the company. I'm not really skilled at handling ASP.NET sites. Many object sources are hidden and unavailable for edit, I assume due to them being defined in a DLL somewhere. So, for any of these objects that are sourced from database tables, altering the tables will not help, since I would not be able to get the new data anyway.
However, I did try to do the following to filter out the undesirable data:
List<String> exclude = new List<String>(new String[] { "value1" , "value2" });
listUsers = User.Find(x => x.LicenseeID == licenseeID && !exclude.Contains(x.FirstName)).ToList();
Unfortunately it only resulted in an error being displayed to the page.
-- EDIT #2 --
I got the server setup to accept a new event viewer source so I could write info to the Application log to see what was happening. Looks like this installation of ASP.NET does not accept "Contains" as an action on a List object. An error gets kicked out stating that the method is not available.

I would probably add a bit column to the table to flag errant rows and then skip them when I query the table, something like
&& !ErrantData
Another way, which requires a bit more upkeep but doesn't require a db change, would be to keep a text file that gets periodically updated; you read it and remove users from the list based on it.
The bigger issue is unknown rows creeping in your database. Changing user credentials and adding creation timestamps may help you narrow down the search scope.
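Since the question reports that Contains is rejected inside the Find expression, one option is to apply the exclusion list after the results have been materialized, so that it runs as ordinary LINQ to Objects. The following is a rough sketch under that assumption; the User class here is a hypothetical stand-in for the real type in the compiled DLL, and UserFilter is an invented helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the User type that lives in the compiled DLL.
public class User
{
    public string FirstName { get; set; }
}

public static class UserFilter
{
    // Append new unwanted values here as they crop up.
    private static readonly HashSet<string> Excluded =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "value1", "value2" };

    // Runs in memory, so it does not depend on the data layer being able
    // to translate Contains into a query. Also sorts by first name.
    public static List<User> RemoveExcluded(IEnumerable<User> users)
    {
        return users
            .Where(u => !Excluded.Contains(u.FirstName ?? string.Empty))
            .OrderBy(u => u.FirstName, StringComparer.CurrentCulture)
            .ToList();
    }
}
```

In the code from the question this would be called on the already materialized list, for example listUsers = UserFilter.RemoveExcluded(listUsers); in place of the final Sort call. The same set could be loaded from the text file mentioned above instead of being hard-coded.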

Millions of rows in the database, only so much needed

Problem summary:
C# (MVC), entity framework 5.0 and Oracle.
I have a couple of million rows in a view which joins two tables.
I need to populate dropdownlists with filter possibilities.
The options in these dropdownlists should reflect the actual contents
of the view for that column, distinct.
I want to update the dropdownlists whenever you select something, so
that the new options reflect the filtered content, preventing you
from choosing something that would give 0 results.
It's slow.
Question: what's the right way of getting these dropdownlists populated?
Now for more detail.
-- Goal of the page --
The user is presented with some dropdownlists that filter the data in a grid below. The grid represents a view (see "Database") where the results are filtered.
Each dropdownlist represents a filter for a column of the view. Once something is selected, the rest of the page updates. The other dropdownlists now contain the possible values for their corresponding columns that comply with the filter that was just applied in the first dropdownlist.
Once the user has selected a couple of filters, he/she presses the search button and the grid below the dropdownlists updates.
-- Database --
I have a view that selects almost all columns from two tables, nothing fancy there. Like this:
SELECT tbl1.blabla, tbl2.blabla etc etc
FROM table1 tbl1, table2 tbl2
WHERE bsl.bvz_id = bvz.id AND bsl.einddatum IS NULL;
There is a total of 22 columns: 13 VARCHARs (mostly small, 1 - 20; one of them has a size of 2000!), 6 DATEs and 3 NUMBERs (one of them size 38 and one of them 15,2).
There are a couple of indexes on the tables, among which the relevant ID's for the WHERE clause.
Important thing to know: I cannot change the database. Maybe set an index here and there, but nothing major.
-- Entity Framework --
I created a Database first EDMX in my solution and also mapped the view. There are also classes for both tables, but I need data from both of them, so I don't know if I need them. The problem with selecting things from either table is that you can't apply half of the filtering, but maybe there are smart ways I haven't thought of yet.
-- View --
My view is strongly bound to a viewModel. In there I have an IEnumerable for each dropdownlist. The getter for these gets its data from a single IEnumerable called NameOfViewObjects. Like this:
public string SelectedColumn1 { get; set; }
private IEnumerable<SelectListItem> column1Options;
public IEnumerable<SelectListItem> Column1Options
{
    get
    {
        if (column1Options == null)
        {
            column1Options = NameOfViewObjects.Select(item => item.Column1).Distinct()
                .Select(item => new SelectListItem
                {
                    Value = item,
                    Text = item,
                    Selected = item.Equals(SelectedColumn1, StringComparison.InvariantCultureIgnoreCase)
                });
        }
        return column1Options;
    }
}
The two solutions I've tried are:
- 1 -
Selecting all columns in a linq query I need for the dropdownlists (the 2000 varchar is not one of them and there are only 2 date columns), do a distinct on them and put the results into a Hashset. Then I set NameOfViewObjects to point towards this hashset. I have to wait for about 2 minutes for that to complete, but after that, populating the dropdownlists is almost instant (maybe a second for each of them).
model.Beslissingen = new HashSet<NameOfViewObject>(dbBes.NameOfViewObject
    .DistinctBy(item => new
    {
        item.VarcharColumn1,
        item.DateColumn1,
        item.DateColumn2,
        item.VarcharColumn2,
        item.VarcharColumn3,
        item.VarcharColumn4,
        item.VarcharColumn5,
        item.VarcharColumn6,
        item.VarcharColumn7,
        item.VarcharColumn8
    }));
The big problem here is that the object NameOfViewObject is probably quite large, and even though using distinct here results in fewer than 100,000 rows, it still uses over 500 MB of memory. This is unacceptable, because there will be a lot of users using this screen (a lot would be... 10 max, 5 on average simultaneously).
- 2 -
The other solution is to use the same linq query and point NameOfViewObjects towards the IQueryable it produces. This means that every time the view wants to bind a dropdownlist to an IEnumerable, it will fire a query that will find the distinct values for that column in a table with millions of rows where most likely the column it's getting the values from is not indexed. This takes around 1 minute for each dropdownlist (I have 10), so that takes ages.
Don't forget: I need to update the dropdownlists every time one of them has its selection changed.
-- Question --
So I'm probably going at this the wrong way, or maybe one of these solutions should be combined with indexing all of the columns I use, maybe I should use another way to store the data in memory, so it's only a little, but there must be someone out there who has done this before and figured out something smart. Can you please tell me what would be the best way to handle a situation like this?
Acceptable performance:
having to wait for a while (2 minutes) while the page loads, but
everything is fast after that.
having to wait for a couple of seconds every time a dropdownlist
changes
the page does not use more than 500mb of memory

Of course you should have indexes on all columns and combinations in WHERE clauses. No index means table scan and O(N) query times. Those cannot scale under any circumstance.
You do not need millions of entries in a drop down. You need to be smarter about filtering the database down to manageable numbers of entries.
I'd take a page from Google. Their type ahead helps narrow down the entire Internet graph into groups of 25 or 50 per page, with the most likely at the top. Maybe you could manage that, too.
Perhaps a better answer is something like a search engine. If you were a Java developer you might try Lucene/SOLR and indexing. I don't know what the .NET equivalent is.

The first thing to check is your DB: make sure you have the right indexes and entity relations in place.
Next, if you want to build your filter options dynamically, you need to run the query with the existing filters to find out what the next filter can be. There are several ways to do this.
Firstly, you can query the data and extract the values from the result. This has a huge load time and wastes time returning data you don't want (unless you are live-updating the results with the filter and don't have paging, in which case you might as well just get all the data and use LINQ to Objects to filter).
A second option is to run parallel queries, one per filter, that return the possible values: filter A = all possible values of A in the data, filter B = all possible values of B when filtered by A, filter C = all possible values of C when filtered by A and B, etc. This is better than the first option, but not by much.
Another option is to use aggregates to speed things up, i.e. you run parallel queries as above, but instead of returning the data you return how many records match. Aggregate functions are generally quicker, so this should cut your load time dramatically, but you are still repeatedly querying a huge dataset, so it won't be exactly nippy.
You can tweak this further by using EXISTS to just return a 0 or 1.
In this case you would start from a table with all possible filter values and then remove the ones for which the parallel query finds no rows.
The next option, which will be the fastest by a mile, is to cache the filters in the DB in a separate table.
Then you can query that and say: from Cache, where filter = ABC, select D. The problem with this is maintaining the cache, which you would have to do in the DB as part of the save functions, triggers, etc.
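The "one query per filter" idea can be sketched in a few lines. This is only an in-memory illustration of the logic, using rows modelled as column-name to value dictionaries; the FilterOptions class and its parameters are invented for the example. Against the real view, the same Where / Select / Distinct shape would presumably be composed on an IQueryable so that the distinct runs in the database rather than in the web server's memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterOptions
{
    // rows: the data to filter, each row a column-name -> value map.
    // selected: the filters chosen so far (column-name -> chosen value).
    // Returns, for every column, the distinct values that are still
    // reachable when all *other* selected filters are applied.
    public static Dictionary<string, List<string>> Compute(
        List<Dictionary<string, string>> rows,
        Dictionary<string, string> selected,
        IEnumerable<string> columns)
    {
        var result = new Dictionary<string, List<string>>();
        foreach (var column in columns)
        {
            // Ignore the filter on the column itself so the user can still
            // switch to a different value in that dropdown.
            var matching = rows.Where(row =>
                selected.All(f => f.Key == column || row[f.Key] == f.Value));

            result[column] = matching
                .Select(row => row[column])
                .Distinct()
                .OrderBy(v => v, StringComparer.Ordinal)
                .ToList();
        }
        return result;
    }
}
```

Skipping a column's own filter is a design choice; dropping that condition would instead narrow every dropdown, including the one just changed, to the currently selected value.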

Another solution that can be added in addition to the previous suggestions is to use the /*+ result_cache */ hint, if your version of Oracle supports it (Oracle version 11g or later). If the output of the query is small enough for a drop-down list, then when a user enters criteria that matches the same criteria another user used, the results are returned in a few milliseconds instead of a few seconds or minutes. Result cache is wonderful for queries that return a small set of rows out of millions.
select /*+ result_cache */ item_desc from some_table where item_id ...
The result cache is automatically flushed when any insert/updates/deletes occur on the database tables.

I've done something 'kind of' similar in the past - if you can add a table to the database then I'd explore introducing a 'scratchpad' type table where results are temporarily stored as the user refines their search. Since multiple users could be working simultaneously the table would have to have an additional column for identifying the user.
I'd think you'd see some performance benefit since all processing is kept server-side and your app would simply be pulling data from this table. Since you're adding this table you would also have total control over it.
Essentially I'd imagine the program flow would go something like:
User selects some filters and clicks 'Search'.
Server populates scratchpad table with results from that search.
App populates results grid from scratchpad table.
User further refines search and clicks 'Search'.
Server removes/adds rows to scratchpad table as necessary.
App populates results grid from scratchpad table.
And so on.
Rather than having all the users' results in one 'scratchpad' table, you could possibly explore having temporary 'scratchpad' tables per user.

How to deal with deprecated values in (country-)code lists

Let's say we have a code list of all the countries including their country codes. The country code is primary key of the Countries table and it is used as a foreign key in many places in the database. In my application the countries are usually displayed as dropdowns on multiple forms.
Some of the countries that used to exist in the past don't exist any more, for example Serbia and Montenegro, which had the country code of SCG.
I have two objectives:
don't allow the user to use these old values (so these values should not be visible in dropdowns when inserting data)
the user should still be able to (readonly) open old stuff and in this case the deprecated values should be visible in dropdowns.
I see two options:
Rename deprecated values, for instance from 'CountryName' to '!!!!!CountryName'. This approach is the easiest to implement, but with obvious drawbacks.
Add IsActive column to Countries table and set it to false for all deprecated values and true for all other. On all the forms where the user can insert data, display only values which are active. On the readonly forms we can display all values (including deprecated ones) so the user will be able to display old data. But on some of my forms the user should be able to also edit data, which means that the deprecated values should be hidden from him. That means, that each dropbox should have some initialization logic like this: if the data displayed is readonly, then include deprecated values in dropbox and if the data is for edit also, then exclude them. But this is a lot of work and error prone too.
Any other ideas?

I deal with this scenario a lot, and use the 'Active' flag to solve the problem, much as you described. When I populate a drop-down list with values, I only load 'active' data and include up to one deprecated value, but only if it is being used. For example, if I am looking at a person record, and that person has a deprecated country, then that country is included in the drop-down list along with the active countries. I do this in read-only AND in edit modes because, in my cases, if a person record has a deprecated country listed, they can continue to use it, but once they change it to a non-deprecated country and save it, they can never switch back (your use case may vary).
So the key difference is that, even in read-only mode, I don't add all the deprecated countries to the DDL, just the deprecated country that applies to the record I am looking at, and even then only if that record was already using it.
Here is an example of the logic I use when loading the drop down list:
protected void LoadSourceDropdownList(bool AddingNewRecord, int ExistingCode)
{
    using (Entities db = new Entities())
    {
        if (AddingNewRecord) // when we are adding a new record, only show 'active' items in the drop-downlist.
            ddlSource.DataSource = (from q in db.zLeadSources where (q.Active == true) select q);
        else // for existing records, show all active items AND the current value.
            ddlSource.DataSource = (from q in db.zLeadSources where ((q.Active == true) || (q.Code == ExistingCode)) select q);
        ddlSource.DataValueField = "Code";
        ddlSource.DataTextField = "Description";
        ddlSource.DataBind();
        ddlSource.Items.Insert(0, "--Select--");
        ddlSource.Items[0].Value = "0";
    }
}

If you are displaying the record as read-only, why bother loading the standing data at all?
Here's what I would do:
The record will contain the country code in any case. I would also propose returning the country description (which admittedly makes things less efficient), but when the user loads "old stuff", the business service recognises that this record will be read-only, and you don't bother loading the country list (which makes things more efficient).
In my presentation service I will then generally check whether the list of countries is null. If not (r/w), load the data into the list box; if so (r/o), populate the list box from the data in the record. A single entry in the list equals read-only.

You can filter with CollectionViewSource or you could just create a Public Enumerable that filters the full list using LINQ.
CollectionViewSource Class
LINQ: FieldDef.DispSearch is the active condition. Returning an IEnumerable performs a little better than building a List.
public IEnumerable<FieldDefApplied> FieldDefsAppliedSearch
{
    get
    {
        return fieldDefsApplied.Where(df => df.FieldDef.DispSearch).OrderBy(df => df.FieldDef.DispName);
    }
}

Why would you still want to display (for instance) customer-addresses with their OLD country-code?
If I understand correctly, you currently still have 'address' records that point to 'Serbia and Montenegro'. I think if you solve that problem, your current question would be non-existent.
The term "country" is perhaps a little misleading: not all the "countries" in ISO 3166 are actually independent. Rather, many of them are geographically separate territories that are legally portions or dependencies of other countries.
Also note that 'withdrawn country-codes' are reserved for 5 years, meaning that after 5 years they may be reused. So moving away from using the country-code itself as primary key would make sense to me, especially if for historical reasons you would need to back-track previous country-codes.
So why not add a 'withdrawn' field/table that points to the new country ids? You can still check (in sql for instance, since you were already using a table) whether this field is empty to get a true/false check if you need it.
The way I see it: "country" codes may change, countries may merge and countries may divide.
If countries change or merge, you can update your address records with a simple query.
If countries divide, you need a way to determine which address is part of which country.
You could use some automated system to do this (and write lengthy books about it).
OR
(when it is a forum-like site) you could ask the users who still have a withdrawn country that points to multiple alternatives in their account to update their country entry at login, where they can only choose from the list of new countries specified in the withdrawn field.
Think of this simplified country-table setup:
id  cc  cn                      withdrawn
1   DE  Germany
2   CS  Serbia and Montenegro   6,7
3   RH  Southern Rhodesia       5
4   NL  The Netherlands
5   ZW  Zimbabwe
6   RS  Serbia
7   ME  Montenegro
In this example, address-records with country-id 3, get updated with a query to country-id 5, no user interaction (or other solution) needed.
But address-records that specify country-id 2 will be asked to select country-id 6 or 7 (of course in the text presented to the user you use the country-name) or are selected to perform your custom automated update routine on.
Also note: 'withdrawn' is a repeating group and as such you could/should make it into a separate table.
Implementing this idea (without downtime) in your scenario:
sql statement to build a new country-table with numerical id's as primary key.
sql statement to update address-records with new field 'country-id' and fill this field with the country-id from the new country-table that corresponds with country-code specified in that record's address-field.
(sql statement to) create the withdrawn table and populate it with the correct data.
then rewrite the sql statements that supply your forms with data
add the check and 'ask user to update country'-routine
let new forms go live
wait/see for unintended bugs
delete old country-table and (now unused) country-code column from the "address"-table
I am very curious what other experts think about this idea!!
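The decision rule described for the sample table above (one successor: update automatically; several successors: ask the user) could be sketched roughly as follows. The CountryMigration class and its member names are invented for this illustration, and the hard-coded dictionary simply mirrors the 'withdrawn' column of the example table; in practice it would be read from the withdrawn table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountryMigration
{
    // Withdrawn country id -> successor ids, mirroring the sample table:
    // 2 (Serbia and Montenegro) -> 6, 7 and 3 (Southern Rhodesia) -> 5.
    public static readonly Dictionary<int, int[]> Successors =
        new Dictionary<int, int[]>
        {
            { 2, new[] { 6, 7 } },
            { 3, new[] { 5 } },
        };

    // Returns true and the id to store when no user input is needed:
    // either the country is not withdrawn, or it has exactly one successor.
    // Returns false when the user has to pick one of several successors.
    public static bool TryAutoMigrate(int countryId, out int newId)
    {
        newId = countryId;
        if (!Successors.TryGetValue(countryId, out var next))
        {
            return true; // still active, nothing to do
        }
        if (next.Length == 1)
        {
            newId = next[0];
            return true;
        }
        return false; // several successors: ask the user to choose
    }
}
```

For a record where TryAutoMigrate returns false, Successors[countryId] gives the ids to offer in the "please update your country" list.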

ASP.NET MVC3 database manipulation

Right now, I have a simple web application that displays the entries of a database. One of the fields that is visible in the database is a bool?, which is true, false, or neither. Everything in the database originally should have the bool? set to neither.
Here's what I want to get working: when a user edits an entry in the table by selecting either true or false for the bool? field, I want to be able to run some C# code (that I have already written) and have that entry deleted from the database. This means that the next time that the database is loaded, once again all the entries will have neither true nor false selected in the bool? field.
Does someone know how I can do this simply? (I know very little about querying databases or creating web apps in general.)

My problem was less about how to delete the items and more about how to pick out from the database those chosen to be deleted.
I found that this does the trick:
var toBeRemoved = from m in db.Issues
                  where m.Remove.HasValue && m.Remove.Value
                  select m;

I believe when you say "true, false, or neither", the neither means null in the database, so, without seeing your code, I believe you could change the SELECT that retrieves the rows for the view to display, to have a WHERE *field* IS NULL in it. If this doesn't help, please post us some control, view, and model code.
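As a small illustration of both filters on a bool? field, here is a sketch using plain LINQ to Objects. The Issue class is a hypothetical stand-in with only the members needed (the Remove property name is taken from the query above), and IssueQueries is an invented helper.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity; Remove is the bool? field.
public class Issue
{
    public int Id { get; set; }
    public bool? Remove { get; set; }
}

public static class IssueQueries
{
    // Rows the user has not touched yet: Remove is still null
    // (the "WHERE field IS NULL" case).
    public static List<Issue> Undecided(IEnumerable<Issue> issues) =>
        issues.Where(i => i.Remove == null).ToList();

    // Rows explicitly marked for removal. Comparing with == true treats
    // null as "not marked", so no separate HasValue check is needed.
    public static List<Issue> MarkedForRemoval(IEnumerable<Issue> issues) =>
        issues.Where(i => i.Remove == true).ToList();
}
```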

Linq to SQL, Update a lot of Data before One Insert

Before inserting a new value into a table, I need to change one field in all rows of that table.
What is the best way to do this: in C# code, or with a trigger? If C#, can you show me the code?
UPD
*NEW VERSION of Question*
Hello. Before inserting a new value into a table, I need to change one field in all rows of that table that have a specific ID (it is an FK to another table).
What is the best way to do this: in C# code, or with a trigger? If C#, can you show me the code?

You should probably consider changing your design; this doesn't sound like it will scale well. I would probably do it with a trigger if it is always required, but if not, I'd use ExecuteCommand.
var ctx = new MyDataContext();
ctx.ExecuteCommand("UPDATE myTable SET foo = 'bar'");

Looking at your comment on Paul's answer, I feel like I should chime in here. We have a few tables where we need to keep a history of each entry in that table. We implement this by creating a separate table for each. For example, we may have a Comment table, and then a CommentArchive table with a foreign key reference to the CommentId in the Comment table.
A trigger on the Comment table ensures that each time certain fields in the Comment table are updated, the "old" version (which is accessible via the deleted table in the trigger) gets pushed to the CommentArchive table. Obviously, this means several CommentArchive entries may exist for each Comment, but if you're only looking for the "active" comments, you just look in the Comment table. And if you need information about the history of a comment, you can easily use LINQ to SQL to jump from the Comment you're interested in to the CommentArchives that reference it.
Because the triggers we use in the above example only insert a single value into the Archive table for each update, they run very quickly and we get good performance. We had issues recently where I tried making the triggers more complex and we started getting dead-locks with as few as 15 concurrent transactions. So the lesson is that you should make these triggers simple, and make them touch as few rows in as few tables as possible.
