Updating two databases at the same time


I have an SQL Server instance with two databases attached. One is an MS SQL database and the other is a linked server (ODBC) that is an indexed file system (Vision). Let's say the Customer table exists in both databases and should be kept identical. I will populate fields in my application from the linked server, and any changes should be written to both databases. Field names may also differ between the two databases. I use ADO connections in the application and would normally use adapter.Update if I were working with only one database. As I will be making quite a lot of database calls throughout the application, I would prefer to write a data handling class that takes care of this and leaves me with a simple call to that class. I was also thinking of using some kind of database transaction to ensure both systems stay identical.
Does anybody have a suggestion on how to approach this?

I'm thinking you can have 2 separate projects for handling the DataLayer (one for each db) and expose them through a Facade/Adapter that will handle delegating the CRUD operations to both of them, also handling the necessary conversions (you mentioned the fields are not named the same).
In the Facade/Adapter you can also implement Retry Logic and Transactions to ensure both data sources are in sync.
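
The facade idea can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names Customer, ICustomerStore, InMemoryCustomerStore and CustomerFacade are hypothetical, and the in-memory store stands in for real ADO.NET code that would map the Customer properties to each database's own column names. Whether the ODBC linked server can take part in a real distributed transaction is not stated in the question, so the sketch assumes it cannot and undoes the first write by hand if the second one fails.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical application-level record; the property names are illustrative.
public sealed class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// One implementation per data source; each hides its own field names.
public interface ICustomerStore
{
    Customer Find(int id);      // returns null when the row does not exist
    void Save(Customer customer);
    void Delete(int id);
}

// In-memory stand-in. A real store would run SQL through ADO.NET instead.
public sealed class InMemoryCustomerStore : ICustomerStore
{
    private readonly Dictionary<int, string> _rows = new Dictionary<int, string>();

    // Test hook that simulates a failing data source.
    public bool FailOnSave { get; set; }

    public Customer Find(int id)
    {
        string name;
        return _rows.TryGetValue(id, out name)
            ? new Customer { Id = id, Name = name }
            : null;
    }

    public void Save(Customer customer)
    {
        if (FailOnSave) throw new InvalidOperationException("Simulated write failure.");
        _rows[customer.Id] = customer.Name;
    }

    public void Delete(int id)
    {
        _rows.Remove(id);
    }
}

// Facade: the application calls Save once and both stores are written.
public sealed class CustomerFacade
{
    private readonly ICustomerStore _sqlServer;
    private readonly ICustomerStore _linked;

    public CustomerFacade(ICustomerStore sqlServer, ICustomerStore linked)
    {
        _sqlServer = sqlServer;
        _linked = linked;
    }

    // Reads come from the linked server, as described in the question.
    public Customer Find(int id)
    {
        return _linked.Find(id);
    }

    public void Save(Customer customer)
    {
        // Remember the old state so the first write can be undone.
        Customer previous = _sqlServer.Find(customer.Id);
        _sqlServer.Save(customer);
        try
        {
            _linked.Save(customer);
        }
        catch
        {
            // Compensate: restore the old row, or remove the newly added one.
            if (previous != null) _sqlServer.Save(previous);
            else _sqlServer.Delete(customer.Id);
            throw;
        }
    }
}
```

The application would then create one CustomerFacade and call Save for every change. Compensation like this is weaker than a true transaction: if the undo itself fails, the two stores can still drift apart, which is where the retry logic mentioned above would come in.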

Related

Using LINQ to SQL for two different databases

I have a Router that saves changes from incoming messages to a database that is used by our CRM software. The Router uses LINQ to SQL for communication. We just released a completely revised version of the software that was built from the ground up and runs on a different database.
Rather than maintaining two Routers with almost identical code, we want to change the code to work on either database, and change the context dynamically. This requires having a DataContext for each of the two formats.
My question is, can I have a DataContext for a database that doesn't exist on the system as long as the context is never used? I plan on refactoring all database communication to a single dll, and use two different classes that implement the same interface to access the two different databases. I will then only call methods in the correct class, but the dll would hold both DataContexts.
Thanks.
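
The plan described above (one interface, two implementations, only the matching one ever called) might look roughly like this. It is a sketch under assumptions: IMessageStore, RecordingStore, CrmFormat and MessageStoreSelector are hypothetical names, and RecordingStore is a stand-in for a class that would wrap one of the two DataContexts. The sketch does not settle whether a DataContext can exist for a missing database; it sidesteps the question by using Lazy<T>, so the implementation for the unused format is never constructed at all.

```csharp
using System;
using System.Collections.Generic;

// Common contract the Router codes against.
public interface IMessageStore
{
    void SaveChange(string message);
}

// Stand-in implementation; a real one would wrap a LINQ to SQL DataContext.
public sealed class RecordingStore : IMessageStore
{
    private readonly string _label;
    public readonly List<string> Saved = new List<string>();

    public RecordingStore(string label)
    {
        _label = label;
    }

    public void SaveChange(string message)
    {
        Saved.Add(_label + ": " + message);
    }
}

// The two database formats the Router has to support.
public enum CrmFormat { Legacy, Current }

// Picks the implementation at run time and creates it only on first use.
public sealed class MessageStoreSelector
{
    private readonly Dictionary<CrmFormat, Lazy<IMessageStore>> _stores;

    public MessageStoreSelector(Func<IMessageStore> legacy, Func<IMessageStore> current)
    {
        _stores = new Dictionary<CrmFormat, Lazy<IMessageStore>>
        {
            { CrmFormat.Legacy, new Lazy<IMessageStore>(legacy) },
            { CrmFormat.Current, new Lazy<IMessageStore>(current) }
        };
    }

    public IMessageStore For(CrmFormat format)
    {
        return _stores[format].Value;
    }
}
```

With this shape the dll can ship both implementations, while a given installation only ever builds the one whose database is actually present.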

Creating a clear abstraction layer over a convoluted and large SQL database

Almost all of the applications I write at work get their data from a central MSSQL database. This database has about 70 tables, and on average I'd say 25 or so columns per table. The database has developed over 5-10 years (I'm not entirely sure) and is full of idiosyncrasies and quirks. Foreign keys are irregularly implemented when it comes to naming and so on, as well as case and language mixing in table and column names.
I am not able to restructure the database itself as it would break a ton of backwards compatibility for applications needed in the daily work of most people in the office.
I've almost exclusively been using LINQ2SQL for interacting with the database and it works fine, but always requires a lot of manual joining of tables, either in some db repository or 'inline' when coding. So I've finally decided that I have to do something to once and for all ease the pain of working with this leviathan. This would preferably include implementing a clear naming scheme, joining relevant tables with foreign keys properly once and for all etc.
The three routes I can see are:
Creating a number of views, stored procedures and functions in the SQL to ease up my interaction with the DB. This obviously has the bonus of being usable in many languages, as opposed to a solution implemented in e.g. C#. The biggest drawback I can see here is that it would probably take a lot of time to do this properly, as well as being a bit harder to service a year down the road when I haven't looked at the SQL queries for a while. I would also need to implement another DB abstraction step inside my applications as I wouldn't want to work with just straight up DB calls (abstraction upon abstraction seems bad in this case, but maybe I'm wrong?)
Continuing on my LINQ2SQL road, but creating a once-and-for-all repository class that hides all the underlying tables in abstracted calls only. This idea seems more feasible in terms of development time, maintenance and single-point-abstraction.
Pulling off some EF4 reverse-engineering magic, using the designer to hook up relevant foreign keys and renaming table classes to fit my taste.
Any input on how this should/could be done, as well as any recommended reading you might have, would be most appreciated.

We have a very similar situation with our database. We went the EF route, but we used Code First. I know it sounds weird to use Code First when your database already exists, but due to the size of the tables and the number of tables, trying to do it all in the designer was not feasible.
You can use the "Reverse Engineer Code First" option in Entity Framework Power Tools to generate everything you need from your database.

I think a well-thought-out abstraction layer better suits the needs of the application if it is not based on the physical schema of the DB. That is, the main goal of the DAL is to hide tables from users, leaving them only valid "activities" through stored procedures. In most cases this will outperform direct data access, and it gives you one more degree of freedom: you can play with the T-SQL code and implement additional logic or schema changes without needing to change the application.

Synchronizing Entire Databases using Microsoft Sync Framework 2.1

I need the ability to sync multiple remote databases, upload and download, with my main database.
However, the problem is that I need to sync the entire database, and the database schema will be updated constantly. I didn't see any way to grab the entire database schema in code without adding each individual table to the SyncScope.
This is problematic because that scope will always be changing. I solved the initial problem by removing the existing scope and adding a new one, but I still cannot find a simple solution that avoids querying system tables, parsing the results, and passing those results (for 150+ tables) back to my SyncScope.
The reasons I originally looked at Sync Framework are:
I need to be able to manage the direction of the sync (upload/download) when I start a sync programmatically from C# on a button click.
I need the ability to enable that button based on the user's network connectivity.
There are additional tasks that need to be done on a sync download, such as changing connection strings of the mobile units, and storing information about their connection and unit in the database.
There are additional tasks that need to be run on a sync upload, such as verifying data against customer business rules through my OR/M, archiving the data to network storage, restarting the application, and changing connection strings again.
Eventually, I need partial data sets, chosen by the customer at run-time, at the object level, in an OR/M framework. These objects may correspond to one or more tables that I won't know of at design-time, or that may not even exist at design-time.
Does anyone know if another framework encompasses all my requirements, or if there is a simpler way to do this in the sync framework?

For this task, especially with a changing schema, you could consider Merge Replication instead of the Sync framework.

Converting project to SQL Server, design thoughts?

Currently, I'm sitting on an ugly business application written in Access that takes a spreadsheet on a bi-daily basis and imports it into an MDB. I am currently converting a major project that includes this into SQL Server and .NET, specifically in C#.
To house this information there are two tables (alias names here) that I will call Master_Prod and Master_Sheet joined on an identity key parent to the Master_Prod table, ProdID. There are also two more tables to store history, History_Prod and History_Sheet. There are more tables that extend off of Master_Prod but keeping this limited to two tables for explanation purposes.
Since this was written in Access, the subroutine to handle this file is littered with manually coded triggers to deal with history that were and have been a constant pain to keep up with, one reason why I'm glad this is moving to a database server rather than a RAD tool. I am writing triggers to handle history tracking.
My plan is/was to create an object modeling the spreadsheet, parse the data into it and use LINQ to do some checks client side before sending the data to the server... Basically I need to compare the data in the sheet to a matching record (unless none exists, in which case it's new). If any of the fields have been altered I want to send the update.
Originally I was hoping to put this procedure into some sort of CLR assembly that accepts an IEnumerable list since I'll have the spreadsheet in this form already but I've recently learned this is going to be paired with a rather important database server that I am very concerned with bogging down.
Is this worth putting a CLR stored procedure in for? There are other points of entry where data enters, and if I could build a procedure to handle them given the objects passed in, then I could take a lot of business rules away from the application at the expense of potential database performance.
Basically I want to take the update checking away from the client and put it on the database so the data system manages whether or not the table should be updated so the history trigger can fire off.
Thoughts on a better way to implement this along the same direction?
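
The client-side check described in the question (compare each sheet row to its matching record, treat rows without a match as new, and send an update only when a field differs) can be sketched with LINQ as below. This is an illustration under assumptions, not the asker's code: SheetRow, SheetDiff and SheetComparer are hypothetical names, the Description and Price fields are invented for the example, and it assumes every sheet row carries a key that identifies its matching record (called ProdId here after the ProdID column mentioned above).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one parsed spreadsheet line.
public sealed class SheetRow
{
    public int ProdId { get; set; }
    public string Description { get; set; }
    public decimal Price { get; set; }
}

// Result of the comparison: only these rows need to go to the server.
public sealed class SheetDiff
{
    public List<SheetRow> Inserts { get; private set; }
    public List<SheetRow> Updates { get; private set; }

    public SheetDiff(List<SheetRow> inserts, List<SheetRow> updates)
    {
        Inserts = inserts;
        Updates = updates;
    }
}

public static class SheetComparer
{
    public static SheetDiff Compare(IEnumerable<SheetRow> sheet, IEnumerable<SheetRow> existing)
    {
        // Index the current records by key for constant-time lookups.
        Dictionary<int, SheetRow> byId = existing.ToDictionary(r => r.ProdId);

        var inserts = new List<SheetRow>();
        var updates = new List<SheetRow>();

        foreach (SheetRow row in sheet)
        {
            SheetRow current;
            if (!byId.TryGetValue(row.ProdId, out current))
            {
                inserts.Add(row);            // no matching record: new row
            }
            else if (current.Description != row.Description || current.Price != row.Price)
            {
                updates.Add(row);            // at least one field was altered
            }
            // otherwise the row is unchanged and nothing is sent
        }

        return new SheetDiff(inserts, updates);
    }
}
```

Because unchanged rows are never sent, a history trigger on the server would fire only for real inserts and updates.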

Use SSIS. Use Excel Source to read the spreadsheets, perhaps use a Lookup Transformation to detect new items and finally use a SQL Server Destination to insert the stream of missing items into SQL.
SSIS is a much better fit for this kind of job than writing something from scratch, no matter how much fun LINQ is. SSIS packages are easier to debug, maintain and refactor than some dll with forgotten sources. Besides, you will not be able to match the refinements SSIS has in managing its buffers for high-throughput Data Flows.

"Originally I was hoping to put this procedure into some sort of CLR assembly that accepts an IEnumerable list since I'll have the spreadsheet in this form already but I've recently learned this is going to be paired with a rather important database server that I am very concerned with bogging down."
That does not work. Any input to a CLR procedure written in C# still has to follow normal SQL semantics. All that can change is the internal setup. Any communication with the client has to be done in SQL, which means executions / method calls. There is no way to pass in an enumerable of objects directly.

"My plan is/was to create an object modeling the spreadsheet, parse the data into it and use LINQ to do some checks client side before sending the data to the server... Basically I need to compare the data in the sheet to a matching record (unless none exists, in which case it's new). If any of the fields have been altered I want to send the update."
You probably need to pick a "centricity" for your approach - i.e. data-centric or object-centric.
I would probably model the data appropriately first. This is because relational databases (or even non-normalized models represented in relational databases) will often outlive client tools, libraries and applications. I would probably start by trying to model in a normal form, and think about the triggers to maintain audit/history, as you mention, at the same time.
I would typically then think of the data coming in (not an object model or an entity, really). So then I focus on the format and semantics of the inputs and see if there is misfit in my data model - perhaps there were assumptions in my data model which were incorrect. Yes, I'm not thinking of making an object model which validates the spreadsheet even though spreadsheets are notoriously fickle input sources. Like Remus, I would simply use SSIS to bring it in - perhaps to a staging table and then some more validation before applying it to production tables with some T-SQL.
Then I would think about a client tool which had an object model based on my good solid data model.
Alternatively, the object approach would mean modeling the spreadsheet, but also an object model which needs to be persisted to the database - and perhaps you now have two object models (spreadsheet and full business domain) and database model (storage persistence), if the spreadsheet object model is not as complete as the system's business domain object model.
I can think of an example where I had a throwaway external object model kind of like this. It read a "master file" which was a layout file describing an input file. This object model allowed the program to build SSIS packages (and BCP and SQL scripts) to import/export/do other operations on these files. Effectively it was a throwaway object model - it was not used as the actual model for the data in the rows or any kind of navigation between parent and child rows, etc., but simply an internal representation for internal purposes - it didn't necessarily correspond to a "domain" entity.

Linq-to-SQL: how many datacontexts?

I have a SQL Server 2008 database with > 300 tables. The application I have to design is a Windows Forms app, .NET 3.5, C#.
What is the best way to work with Linq-to-SQL?
I intend to make a datacontext for each business entity.
Is there any problem with that?
I need to know whether this way of working with Linq-to-SQL has any disadvantages or can create performance issues.
Thanks.

You should typically have 1 single DBML file (=data context) per database. You should certainly not create a DataContext per business entity, because doing this would make you lose most of the useful capabilities of LINQ to SQL, like memory transactions (unit of work), lazy loading, and doing LINQ queries over multiple entities.
You have a pretty big model (300+ tables), which means a lot of entities. A lot of entities is not a big problem, except for the LINQ to SQL designer. Using the designer with such big models can be pretty annoying. This can be a reason to split a domain into multiple sub domains (each with its own DBML file), but certainly not one per entity. However, keep in mind that you lose the L2S capabilities at the boundaries of the domains.
In the past I advised a team, who had split up their domain of 150+ entities into 5 DBML files, to merge them back together into a single DBML. The pain of editing the model went up, but the pain of using multiple DataContexts went away, which lowered the overall pain drastically for them.

There is no point in making a data context for each business entity; you only need one datacontext per database.

Well, it depends on how many users will use your database simultaneously, not on how many tables there are. So it's all about typical database issues: number of connections, locking and so on.

I now use 1 for the entire database, but there are legitimate uses for having more. For example, I run a script when installing my site that connects to a remote DB and imports and converts data to the new format for deployment. The process uses some temporary tables.
By putting the temporary tables in a separate context, once the site is deployed I can simply delete these contexts and code as they are independent entities.
