I have a piece of code that goes through all the tables and linked tables in an Access database and, for every table (all linked in this case) that matches a certain criterion, it should add a new linked table and delete the old one. The new one points at a SQL Server database and the old one at Oracle, but that part is irrelevant. The code is:
var dbe = new DBEngine();
Database db = dbe.OpenDatabase(@"C:\Users\x339\Documents\Test.accdb");

// useddatabases, CLOASTableDictionary and i are declared earlier in the method.
foreach (TableDef tbd in db.TableDefs)
{
    if (tbd.Name.Contains("CLOASEUCDBA_T_"))
    {
        useddatabases[i] = tbd.Name;
        string tablename = CLOASTableDictionary[tbd.Name];
        string tablesourcename = CLOASTableDictionary[tbd.Name].Substring(6);

        var newtable = db.CreateTableDef(tablename.Trim());
        newtable.Connect = "ODBC;DSN=sql server copycloas;Trusted_Connection=Yes;APP=Microsoft Office 2010;DATABASE=ILFSView;";
        newtable.SourceTableName = tablesourcename;

        db.TableDefs.Append(newtable);
        db.TableDefs.Delete(tbd.Name);
        i++;
    }
}
foreach (TableDef tbd in db.TableDefs)
{
    Console.WriteLine("After loop " + tbd.Name);
}
There are 3 linked tables in this database: 'CLOASEUCDBA_T_AGENT', 'CLOASEUCDBA_T_CLIENT' and 'CLOASEUCDBA_T_BASIC_POLICY'. The issue with the code is that it updates the first two tables perfectly, but for some unknown reason it never finds the third. Then, in the second loop, it prints it out... it seems to just skip over 'CLOASEUCDBA_T_BASIC_POLICY'. I really don't know why. The weird thing is that if I run the code again, it will change 'CLOASEUCDBA_T_BASIC_POLICY'. Any help would be greatly appreciated.
Modifying a collection while you are iterating over it can sometimes mess things up. Try a slightly different approach:
1. Iterate over the TableDefs collection and build a List (or perhaps a Dictionary) of the items you need to change.
2. Iterate over the List and update the items in the TableDefs collection.
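A sketch of that two-pass approach, reusing the names from your code (useddatabases, CLOASTableDictionary and i are assumed to be declared as in your snippet):

// First pass: collect the tables to relink, without touching the collection.
var tablesToRelink = new List<string>();
foreach (TableDef tbd in db.TableDefs)
{
    if (tbd.Name.Contains("CLOASEUCDBA_T_"))
        tablesToRelink.Add(tbd.Name);
}

// Second pass: now it is safe to append and delete TableDefs.
foreach (string oldName in tablesToRelink)
{
    useddatabases[i] = oldName;
    string tablename = CLOASTableDictionary[oldName];
    string tablesourcename = CLOASTableDictionary[oldName].Substring(6);

    var newtable = db.CreateTableDef(tablename.Trim());
    newtable.Connect = "ODBC;DSN=sql server copycloas;Trusted_Connection=Yes;APP=Microsoft Office 2010;DATABASE=ILFSView;";
    newtable.SourceTableName = tablesourcename;

    db.TableDefs.Append(newtable);
    db.TableDefs.Delete(oldName);
    i++;
}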
I've been using Linq2Entities for quite some time now on small-scale programs, so usually my queries are basic (return elements with a certain value in a specific column, add/update an element, remove an element, ...).
Now I'm moving to a larger-scale program and, given my experience, went with Linq2Entities. Everything was fine until I had to remove a large number of elements.
In Linq2Entities I can only find a Remove method that takes one entity, so to remove 1000 elements I first need to retrieve them, then remove them one by one, then call SaveChanges.
Either I am missing something or I feel that a) this is really bad performance-wise and b) I would be better off making this a procedure in the database.
Am I right in my assumptions? Should massive removals/additions be made as procedures only? Or is there a way to do this efficiently with Linq2Entities that I am missing?
If you have the primary key values, you can use the other features of the Context by creating the objects manually, setting their key(s), attaching them to the context, and setting their state to Deleted.
// ids holds the primary keys to delete, loaded elsewhere.
var ids = new List<Guid>();

foreach (var id in ids)
{
    // Create a stub entity with only the key set and mark it as deleted.
    Employee employee = new Employee();
    employee.Id = id;

    var entityEntry = context.Entry(employee);
    entityEntry.State = EntityState.Deleted;
}

context.SaveChanges();
I am developing a C# ASP.NET web application. I have data being pulled from two databases. One is the database that holds all of our actual data, the second is to be used so that users of the site can save "favorites" and easily find this data later. The databases have the following columns:
Table1:
itemid, itemdept, itemdescription
Table2:
userid, itemid, itemdept, itemdescription
If the item is present in Table2 (the user has already added it), I want to mark the item as removable if it comes up again in a search, and addable if it is not yet in their favorites.
I've got data from both pulled into DataTables so I can compare them, but I feel that using nested foreach loops would be too tedious, as the query is set to return a maximum of 300 results. Also, to do that I'd have to put a bool value in one of the tables to mark that it was found, which seems messy.
I have read up a little on Linq, but can't find anything exactly like this scenario. Could I use Linq to accomplish such a thing? Below is an (admittedly crude) image of the search results page that may help get a better grasp on this. In the real deal, the Add and Remove links will be imagebuttons.
Forgot to ever post the solution to this one, but I went with the HashSet setup, with one loop to compare. Thank you everyone for your comments.
if (User.Identity.IsAuthenticated)
{
    // Flag column added to the search results table (Table1).
    DataColumn dc = new DataColumn("isMarked", System.Type.GetType("System.Int32"));
    ds.Tables[0].Columns.Add(dc);

    // Build the set of itemids the user has already favorited (the favorites table is assumed to be at index 1).
    string[] strArray = ds.Tables[1].AsEnumerable().Select(s => s.Field<string>("itemid")).ToArray<string>();
    HashSet<string> hset = new HashSet<string>(strArray);

    // Single pass over the search results: mark each row as already-favorited or not.
    foreach (DataRow dr in ds.Tables[0].Rows)
    {
        if (hset.Contains(dr["itemid"].ToString().Trim()))
            dr["isMarked"] = 1;
        else
            dr["isMarked"] = 0;
    }
}
I have data with the same schema in a pipe delimited text file and in a database table, including the primary key column.
I have to check if each row in the file is present in the table, if not generate an INSERT statement for that row.
The table has 30 columns, but here I've simplified for this example:
ID Name Address1 Address2 City State Zip
ID is the running identity column; so if a particular ID value from the file is found in the table, there should be no insert statement generated for that.
Here's my attempt, which doesn't feel correct:
foreach (var item in RecipientsInFile)
{
    if (!RecipientsInDB.Any(u => u.ID == item.ID))
    {
        Console.WriteLine(GetInsertSql(item));
    }
}
Console.ReadLine();
EDIT: Sorry, I forgot to ask the actual question: how should I do this?
Thank you very much for all the help.
EDIT: The table has a million-plus rows, while the file has 50K rows. This is a one-time thing, not a permanent project.
I would add all the RecipientsInDB IDs to a HashSet and then test whether the set contains each item's ID.
var recipientsInDBIds = new HashSet<int>(RecipientsInDB.Select(u => u.ID));

foreach (var item in RecipientsInFile)
{
    if (!recipientsInDBIds.Contains(item.ID))
    {
        Console.WriteLine(GetInsertSql(item));
    }
}
Console.ReadLine();
Try comparing the ID lists using .Except()
List<int> dbIDs = Recipients.Select(x=>x.ID).ToList();
List<int> fileIDs = RecipientsFile.Select(x=>x.ID).ToList();
List<int> toBeInserted = fileIDs.Except(dbIDs).ToList();
toBeInserted.ForEach(x=>GetInsertSqlStatementForID(x));
For the pedantic and trollish among us in the comments, please remember the above code (like any source code you find on the interwebs) shouldn't be copy/pasted into your production code. Try this refactoring:
foreach (var item in RecipientsFile.Select(x=>x.ID)
.Except(DatabaseRecipients.Select(x=>x.ID)))
{
GetInsertSqlStatementForID(item);
}
Lots of ways of accomplishing this. Yours is one way.
Another would be to always generate SQL, but generate it in the following manner:
if not exists (select 1 from Recipients where ID = 1234)
    insert Recipients (...) values (...)
if not exists (select 1 from Recipients where ID = 1235)
    insert Recipients (...) values (...)
Another would be to retrieve the entire contents of the database into memory beforehand, loading the database IDs into a HashSet, then only checking that HashSet to see if it exists - would take a little longer to get started, but would be faster for each record.
Any of these three techniques would work - it all depends on how big your database table is, and how big your file is. If they're both relatively small (maybe 10,000 records or so), then any of these should work fine.
EDIT
And there's always option D: Insert all records from the file into a temporary table (could be a real table or a SQL temp table, doesn't really matter) in the database, then use SQL to join the two tables together and retrieve the differences (using not exists or in or whatever technique you want), and insert the missing records that way.
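A rough sketch of option D against SQL Server, assuming the file has already been parsed into a DataTable (fileTable here) matching the simplified schema above; the column types, connectionString and table names are illustrative:

// requires System.Data.SqlClient
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();

    // 1. Stage the file rows in a temp table (types are guesses based on the simplified schema).
    using (var create = new SqlCommand(
        @"CREATE TABLE #FileRecipients
          (ID int, Name varchar(100), Address1 varchar(100), Address2 varchar(100),
           City varchar(50), State varchar(20), Zip varchar(10))", conn))
    {
        create.ExecuteNonQuery();
    }

    using (var bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "#FileRecipients";
        bulk.WriteToServer(fileTable); // column order must match the temp table
    }

    // 2. Insert only the rows whose ID is not already in Recipients.
    using (var insert = new SqlCommand(
        @"INSERT INTO Recipients (Name, Address1, Address2, City, State, Zip)
          SELECT f.Name, f.Address1, f.Address2, f.City, f.State, f.Zip
          FROM #FileRecipients f
          WHERE NOT EXISTS (SELECT 1 FROM Recipients r WHERE r.ID = f.ID)", conn))
    {
        insert.ExecuteNonQuery();
    }
}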
OK, so you think you're a real debugger? Try this one:
I've got a Linq2Sql project where suddenly we've discovered that occasionally, seemingly randomly, we get double-inserts of the same row.
I have subclassed DataContext and overridden SubmitChanges. Inside there I have made a call to GetChangeSet(), both before and after the call to base.SubmitChanges(), and used a text file logger to record the objects in the Inserts collection. Plus I hang on to a reference to the inserted objects long enough to record their autonumbered ID.
When the double-insert happens, I see in the DB that instead of one row each being inserted into MyTableA and MyTableB, there are two each. SQL Profiler shows four insert statements, one after the other in rapid succession:
insert into MyTableA(...
insert into MyTableB(...
insert into MyTableA(...
insert into MyTableB(...
I check in the debug log, and there are only two objects in the Inserts collection: one of MyClassA and one of MyClassB. After the call to base.SubmitChanges(), the changeset is empty (as it should be). And the autonumber IDs show the larger value of the newly inserted rows.
Another piece of useful information: the bug never happens when stepping through in debug mode; only when you run without breakpoints. This makes me suspect it's something to do with the speed of execution.
We have been using the same DataContext subclass for over a year now, and we've never seen this kind of behavior before in our product. It only happens with MyClassA and MyClassB.
To summarize:
From the debug log, everything looks like it's working correctly.
On SQL Profiler, you can see that a double-insert is happening.
This behavior happens frequently but unpredictably, only to the two classes mentioned, excepting that it never happens when stepping through code in debug mode.
EDIT - New information:
Inside my DataContext subclass, I have the following code:
try
{
    base.SubmitChanges(ConflictMode.ContinueOnConflict);
}
catch (ChangeConflictException)
{
    // Automerge database values for members that client has not modified.
    foreach (ObjectChangeConflict occ in ChangeConflicts)
    {
        occ.Resolve(RefreshMode.KeepChanges);
    }
}

// Submit succeeds on second try.
base.SubmitChanges(ConflictMode.FailOnFirstConflict);
MyTableA and MyTableB both have a mandatory foreign key OtherTableID referencing OtherTable. The double insert happens when a ChangeConflictException happens during an update of the common parent table OtherTable.
We're on the scent, now...
When I've had a problem like this before, it has usually come down to multiple threads executing the same code at the same time.
Have you tried using the lock statement to make sure the insert is only executed by a single thread at a time?
MSDN: lock statement
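A minimal sketch of that idea, assuming you can funnel every submit through one shared gate in your DataContext subclass (all names here are illustrative):

// requires System.Data.Linq
public class MyDataContext : DataContext
{
    // One gate shared by every instance; hypothetical name.
    private static readonly object SubmitLock = new object();

    public MyDataContext(string connection) : base(connection) { }

    public override void SubmitChanges(ConflictMode failureMode)
    {
        lock (SubmitLock)
        {
            // Only one thread at a time can run the submit/retry logic.
            base.SubmitChanges(failureMode);
        }
    }
}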
Looks like it's a BUG in Linq2Sql! Here's a repeatable experiment for you:
using (var db1 = new MyDataContext())
{
    var obj1 = db1.MyObjects.Single(x => x.ID == 1);
    obj1.Field1 = 123;
    obj1.RelatedThingies.Add(new RelatedThingy
    {
        Field1 = 456,
        Field2 = "def",
    });

    // A second context updates the same row before the first one submits.
    using (var db2 = new MyDataContext())
    {
        var obj2 = db2.MyObjects.Single(x => x.ID == 1);
        obj2.Field2 = "abc";
        db2.SubmitChanges();
    }

    try
    {
        db1.SubmitChanges(ConflictMode.ContinueOnConflict);
    }
    catch (ChangeConflictException)
    {
        foreach (ObjectChangeConflict occ in db1.ChangeConflicts)
        {
            occ.Resolve(RefreshMode.KeepChanges);
        }
    }

    db1.SubmitChanges(ConflictMode.FailOnFirstConflict);
}
Result: MyObject record with ID = 1 gets updated, Field1 value is 123 and Field2 value is "abc". And there are two new, identical records inserted to RelatedThingy, with MyObjectID = 1, Field1 = 456 and Field2 = "def".
Explain THAT!
UPDATE: After logging this on Microsoft Connect, the nice folks at MS asked me to put together a little demo project highlighting the bug. And wouldn't you know it - I couldn't reproduce it. It seems to be connected to some weird idiosyncrasy of my project. Don't have time to investigate further, and I found a workaround, anyway...
FWIW, we recently found this problem with our retry logic for SubmitChanges. We were doing an InsertAllOnSubmit. When a ChangeConflictException occurred, we would retry with a Resolve(RefreshMode.KeepChanges,true) on each ObjectChangeConflict.
We redid the work a different way (retry logic to re-perform the entire transaction) and that seems to fix the problem.
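Roughly, the shape of it was something like this (CreateContext and ApplyPendingWork stand in for our own code):

const int maxAttempts = 3;
for (int attempt = 1; attempt <= maxAttempts; attempt++)
{
    using (var db = CreateContext())      // fresh DataContext on every attempt
    {
        ApplyPendingWork(db);             // re-read and re-apply the whole unit of work
        try
        {
            db.SubmitChanges(ConflictMode.FailOnFirstConflict);
            break;                        // success, stop retrying
        }
        catch (ChangeConflictException)
        {
            if (attempt == maxAttempts)
                throw;
            // otherwise discard the context and redo the entire transaction
        }
    }
}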
I've got my model updating the database based on information that comes in as a Dictionary. The way I currently do it is below:
SortedItems = db.SortedItems.ToList();
foreach (SortedItem si in SortedItems)
{
    string key = si.PK1 + si.PK2 + si.PK3 + si.PK4;
    if (updates.ContainsKey(key) && updates[key] != si.SortRank)
    {
        si.SortRank = updates[key];
        db.SortedItems.ApplyCurrentValues(si);
    }
}
db.SaveChanges();
Would it be faster to iterate through the dictionary, and do a db lookup for each item? The dictionary only contains the item that have changed, and can be anywhere from 2 items to the entire set. My idea for the alternate method would be:
foreach (KeyValuePair<string, int?> kvp in updates)
{
    SortedItem si = db.SortedItems.Single(s => (s.PK1 + s.PK2 + s.PK3 + s.PK4).Equals(kvp.Key));
    si.SortRank = kvp.Value;
    db.SortedItems.ApplyCurrentValues(si);
}
db.SaveChanges();
EDIT: Assume the number of updates is usually about 5-20% of the DB entries.
Let's look:
Method 1:
You'd iterate through all 1000 items in the database
You'd still visit every item in the Dictionary and have 950 misses against the dictionary
You'd still have 50 update calls to the database.
Method 2:
You'd iterate every item in the dictionary with no misses in the dictionary
You'd have 50 individual lookup calls to the database.
You'd have 50 update calls to the database.
This really depends on how big the dataset is and what % on average get modified.
You could also do something like this (a sketch follows the list):
Method 3:
Build a set of all the keys from the dictionary
Query the database once for all items matching those keys
Iterate over the results and update each item
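A sketch of Method 3, reusing the names from the question and assuming PK1..PK4 are strings (as the concatenation above implies):

var keys = updates.Keys.ToList();

// One query: fetch only the rows whose composite key appears in the dictionary.
var changed = db.SortedItems
                .Where(s => keys.Contains(s.PK1 + s.PK2 + s.PK3 + s.PK4))
                .ToList();

foreach (SortedItem si in changed)
{
    string key = si.PK1 + si.PK2 + si.PK3 + si.PK4;
    if (si.SortRank != updates[key])
        si.SortRank = updates[key];   // entities are tracked, so no ApplyCurrentValues needed
}

db.SaveChanges();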
Personally, I would try to determine your typical case scenario and profile each solution to see which is best. I really think the 2nd solution, though, will result in a ton of database and network hits if you have a large set and a large number of updates, since for each update it would have to hit the database twice (once to get the item, once to update the item).
So yes, this is a very long winded way of saying, "it depends..."
When in doubt, I'd code both and time them based on reproductions of production scenarios.
To add to @James' answer, you would get the fastest results using a stored proc (or a regular SQL command).
The problem with LINQ-to-Entities (and other LINQ providers, if they haven't been updated recently) is that they don't know how to produce SQL updates with where clauses:
update SortedItems set SortRank = @NewRank where PK1 = @PK1 and (etc.)
A stored procedure would do this at the server side, and you would only need a single db call.
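For example, with EF4's ObjectContext (which the ApplyCurrentValues call above suggests you have), you could issue that update directly; newRank and pk1..pk4 here stand in for the values of a single update:

// One round trip per update, no entity materialization.
db.ExecuteStoreCommand(
    "UPDATE SortedItems SET SortRank = {0} WHERE PK1 = {1} AND PK2 = {2} AND PK3 = {3} AND PK4 = {4}",
    newRank, pk1, pk2, pk3, pk4);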