Best way to incorporate legacy data

I am working on a price list management program for my business in C# (the prototype is in WinForms, but I am thinking of using WPF for the final app as an MVVM learning exercise).
Our EMS system is based on a COBOL back end and will remain that way for at least 3 years, so I cannot really access its data directly. I want to pull data from the EMS system periodically to ensure that pricing remains in sync (and to show users some other information, such as bin locations, in a read-only manner). What I am looking at doing is...
Use WinBatch to run a report automatically every night, then use Monarch to convert the text report to a flat file (.xls?)
Drop the file into a folder and write a small app to read it in and add it to the database
How should I add this to the database (SQL Express)? I could have a table that is simply replaced completely each time, but I am a beginner at most of this and I am concerned about what would happen if an entire table were replaced while the database was being used by the price list app.
Mike

If you truncate and refill the whole table, do it in one single transaction and take a full table lock. That way readers never see a half-filled table, and the load is faster.
Alternatively, you could update all changed rows, then insert the new (missing) rows, and then delete every row that was not touched in this run (store some kind of version or run number in each row so you can tell which rows those are).
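The update / insert / delete-by-run-number idea can be sketched in memory as follows. This is only an illustration under assumptions: the `PriceRow` shape, the `LastSeenRun` column, and the dictionary standing in for the database table are all hypothetical, and a real version would issue the equivalent SQL inside one transaction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape; the real columns would come from the EMS report.
public sealed class PriceRow
{
    public string PartNumber { get; set; } = "";
    public decimal Price { get; set; }
    public int LastSeenRun { get; set; }   // the "version number" from the answer
}

public static class PriceSync
{
    // Applies one nightly import to the local table (modelled as a dictionary
    // keyed by part number) and returns how many rows were changed.
    public static (int Updated, int Inserted, int Deleted) Apply(
        Dictionary<string, PriceRow> table,
        IEnumerable<(string PartNumber, decimal Price)> incoming,
        int runId)
    {
        int updated = 0, inserted = 0;

        foreach (var (partNumber, price) in incoming)
        {
            if (table.TryGetValue(partNumber, out var row))
            {
                if (row.Price != price) { row.Price = price; updated++; }
                row.LastSeenRun = runId;   // mark the row as present in this run
            }
            else
            {
                table[partNumber] = new PriceRow
                {
                    PartNumber = partNumber, Price = price, LastSeenRun = runId
                };
                inserted++;
            }
        }

        // Rows not stamped with this run id were missing from the report.
        var stale = table.Values
            .Where(r => r.LastSeenRun != runId)
            .Select(r => r.PartNumber)
            .ToList();
        foreach (var key in stale) table.Remove(key);

        return (updated, inserted, stale.Count);
    }

    public static void Main()
    {
        var table = new Dictionary<string, PriceRow>
        {
            ["A100"] = new PriceRow { PartNumber = "A100", Price = 9.99m, LastSeenRun = 1 },
            ["B200"] = new PriceRow { PartNumber = "B200", Price = 4.50m, LastSeenRun = 1 },
        };
        var report = new[] { ("A100", 10.49m), ("C300", 2.00m) };

        var result = Apply(table, report, runId: 2);
        Console.WriteLine($"updated={result.Updated} inserted={result.Inserted} deleted={result.Deleted}");
    }
}
```

Compared with a full truncate and refill, this only touches rows that actually changed, at the cost of one extra column per row.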

First create a .txt file from the legacy application. Then use a batch insert to pull it into a work table for whatever clean-up you need to do. Do the clean-up using T-SQL. Then run T-SQL to insert new data into the proper tables and/or to update rows where data has changed. If there are too many records, do the inserting and updating in batches. Schedule all this as a job to run during hours when the database is not busy.
You can of course do all of this best in SSIS, but I don't know if that is available with Express.

Are there any fields/tables available to tell you when the price was last updated? If so, you can just pull the recently updated rows and apply those to your database... assuming you have a readily available unique primary key in your COBOL app's datastore.
This wouldn't be fully up to date, though, because you're running it as a nightly script to update the database used by the new app. You could maybe create a .NET script that queries the COBOL datastore specifically for whatever price the user is looking for and, if the COBOL datastore's update time is more recent than what you have logged, updates the SQL Server record(s).
(I'm not familiar with COBOL at all, just throwing ideas out there)

Related

Azure SQL: How to determine if table was altered or created and by whom

My team works with a large Azure SQL database where several other teams insert and read data. They sometimes need to create or alter tables, but those actions should be coordinated with our team, and unfortunately that has not been happening. We've had a couple of scenarios where one of those teams updated a stored procedure. As a result, their changes are not under our source control, and if we create a local database for development or do a backup/restore we get errors because of missing references.
We are looking for a way to programmatically determine if a table was created or altered. It doesn't need to be real-time. I considered reading logs and looking for alter or create commands. I've not had much success, as the logs are binary and I don't currently know how to parse them. My other thought is to keep a copy of the master database sys tables and routinely compare them to see if something changed. I'm not sure how well that would work or if I could determine who made the change. Thoughts, ideas?
Please keep in mind that this is using an Azure SQL Database which is a bit more limited than a standard SQL database.

You can use DDL Triggers as explained here.
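-- Database-scoped DDL trigger: fires whenever a table is dropped or altered.
CREATE TRIGGER safety
ON DATABASE
FOR DROP_TABLE, ALTER_TABLE
AS
-- Placeholder: a real trigger would INSERT the event details into a log table.
PRINT 'Save change on a log'
-- EVENTDATA() returns XML describing the event; this extracts the statement text.
-- The same XML also carries /EVENT_INSTANCE/LoginName, i.e. who issued the statement.
SELECT EVENTDATA().value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]','nvarchar(max)');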
Additionally you can use Extended Events to track schema changes. Look at samples here.
Finally you can also see how Azure SQL Auditing may fit your needs.

How Does One Fill a Typed DataSet, Keep it Synchronized, and Receive Updates When the Data Changes?

So I'm developing an application that works as sort of a "sidekick" to a large proprietary application which I do not have the source code for nor the rights to modify. The proprietary application does store all of its data in a Microsoft SQL database (version 2008 R2 or higher, I believe), however, and I have a good idea what the data represents. What I need my application to do is to constantly monitor the data as it is being added, updated, and deleted, and then act on the data automatically (such as raising alerts).
The issue is figuring out the best approach to receiving changes made to the database by the other application as they're happening, because I don't wanna miss a beat.
Here is what I have done so far:
LINQ to SQL: As far as I know, each time I run a query, I receive a new set of data, but I do not get the ability to receive the changes only or be notified of changes.
Typed DataSet using DataSet.Load:
using (IDataReader reader = dataSetInstance.CreateDataReader())
{
    dataSetInstance.Load(reader, LoadOption.OverwriteChanges, dataSetInstance.Table1, dataSetInstance.Table2, dataSetInstance.Table3);
}
This didn't work out too well when I did it. dataSetInstance only contained a set of unfilled tables after calling the Load method. I was hoping to call dataSetInstance.GetChanges and dataSetInstance.AcceptChanges at regular intervals after the first call to dataSetInstance.Load to get only the changes. Am I doing it wrong?
Typed DataSet with tables filled individually using their associated table adapters:
using (Table1TableAdapter adapter = new Table1TableAdapter())
{
    adapter.Fill(dataSetInstance.Table1);
}
using (Table2TableAdapter adapter = new Table2TableAdapter())
{
    adapter.Fill(dataSetInstance.Table2);
}
using (Table3TableAdapter adapter = new Table3TableAdapter())
{
    adapter.Fill(dataSetInstance.Table3);
}
Of course, the problem is that there are actually way more than 3 tables which can add up to quite a lot of repetitive code (and maintenance work), but the real problem is that I will not receive any change notifications since I'm not using the Load/AcceptChanges methods (according to the documentation).
Row retrieval by date/time field: This was something I started work on, but something I stopped after observing the other application modify fields in the rows after creating them. Consider this:
There is a row with a time stamp of a transaction and a boolean field that specifies if the transaction was canceled later on. If it is canceled, the other application simply goes back to that row and toggles the value. The time stamp remains the same, and my application will never know of the news. There is no statute of limitations; the other application can change this field any time in the future.
By the way, I should mention that this other application does not implement any constraints within the database such as foreign and primary keys. I believe I read somewhere in the documentation that for row update events and such to fire on the typed DataTable classes, some sort of primary key is needed.
There must be some way to do this!!!

Have you considered SQL Server Query Notifications? This uses SQL Server Service Broker under the covers.
SqlDependency is the C# class to look at.
Using SqlDependency in a Windows Application (.NET Framework 2.0 example: should be very similar to later versions.)
SqlDependency in an ASP.NET Application

I’d consider solving this at the SQL Server level by implementing auditing triggers or SQL Server traces.
Triggers – the idea is to add triggers to all the tables you want to monitor. The triggers catch every change and store the data in a separate “history” table. Once this is set up, all your application needs to do is read from those history tables.
For more details, see: Creating audit triggers in SQL Server
Traces – you can set up SQL Server traces that store all the information in trace files; your app can then parse the trace files to see what’s going on.

There appears to be no silver bullet to the problem given the conditions, but anything is better than polling the database for changes every minute. What I will probably do now is take Mitch Wheat's suggestion and work from there:
Some tables have rows that are highly likely to change. A recent purchase, for example, is more likely to be cancelled than one from 7 days ago, or 6 months ago, or in the case of 1 year—probably never. The application will only need to monitor queries restricted to a certain time range. Older (in terms of creation time) rows will simply be refreshed at a much slower rate and without prompting from SQL Server query notifications. The application is going to have to tolerate some stale data in order to not needlessly pull entire tables from the database every minute.
For tables without chronological information, the application will have to receive notifications for queries on conditions that are important or have to be acted on right away such as WHERE Quantity < 0.
Some more clever approaches will need to be taken for the rest of the tables. Some tables are never updated and never have rows deleted, but they gain new rows whenever rows in some other table change. For example: every time the NumberOfPeople value changes for a row in table Room, another row is added to one of the tables CheckIn or CheckOut.
A lot more code needs to be written, but the application is probably going to be doing a lot less unnecessary work when it's running.

Auditing record changes in sql server databases

Using only Microsoft-based technologies (MS SQL Server, C#, EAB, etc.), if you needed to keep track of the changes made to a record in a database, which strategy would you use? Triggers, AOP on the DAL, something else? And how would you display the collected data? Is there a pattern for it? Is there a tool or a framework that helps to implement this kind of solution?

The problem with Change Data Capture is that it isn't flexible enough for real auditing. You can't add the columns you need. Also, it dumps the records every three days by default (you can change this, but I don't think you can store them forever), so you need a job that copies the records to a real audit table if you have to keep the data for a long time, which is typical of the need to audit records (we never dump our audit records).
I prefer the trigger approach. You have to be careful when you write the triggers to ensure that they will capture the data if multiple records are changed. We have two tables for each table audited, one to store the datetime and id of the user or process that took the action and one to store the old and new data. Since we do a lot of multiple record processes this is critical for us. If someone reports one bad record, we want to be able to see if it was a process that made the change and if so, what other records might have been affected as well.
At the time you create the audit process, create the scripts to restore a set of audited data to the old values. It's a lot easier to do this when under the gun to fix things, if you already have this set up.

SQL Server 2008 R2 has this built in: look up Change Data Capture in Books Online.

This is probably not a popular opinion, but I'm going to throw it out there anyhow.
I prefer stored procedures for all database writes. If auditing is required, it's right there in the stored procedure. There's no magic happening outside the code, everything that happens is documented right at the point where writes occur.
If, in the future, a table needs to change, one has to go to the stored procedure to make the change. The need to update the audit is documented right there. And because we used a stored procedure, it's simpler to "version" both the table and its audit table.

How to change database design in a deployed application?

Situation
I'm creating a C#/WPF 4 application using a SQL Compact Edition database as a backend with the Entity Framework and deploying with ClickOnce.
I'm fairly new to applications using databases, though I don't suspect I'll have much problem designing and building the original database. However, I'm worried that in the future I'll need to add or change some functionality which will require me to change the database design after the database is already deployed and the user has data in the database.
Questions
Is it even possible to push an updated database design out to users via a ClickOnce update in the same way it is for code changes?
If I did, how would the user's data be affected?
How is this sort of thing done in real situations? What are some best-practices?
I figure that in the worst case, I'd need to build some kind of "version" number into the database or program settings and create some routine to migrate the user's current version of the database to the new one.
I appreciate any insight into my problem. Thanks a lot.

There are some 'tricks' that are employed when designing databases to allow for design changes.
Firstly, many database designers create views to code against, rather than coding directly to the tables. This allows tables to be altered (split or merged, etc) while only requiring that the views are updated. You may want to investigate database refactoring techniques for this.
Secondly, you can indeed add versioning information to the database (commonly done as a 'version' table with a single field). Updating the database can be done through code or through scripts. One system I worked on would automatically check the database version and then progressively update the schema through versions in code until it matched the required version for the runtime. This was quite an undertaking.

I think your "worst" case is actually a pretty good route to go in this situation. Maintain a database version in the DB and have your application check and update the DB as necessary. If you build your updater correctly, it should be able to maintain the user's data. Depending on the update this might involve creating temporary tables to hold the existing data and repopulating new versions of the tables from them. You might be able to include a new SDF file with the new schema in place in the update process and simply transfer the data. It might be slightly easier that way -- you could use file naming to differentiate versions and trigger the update code that way.

Unfortunately version control and change management for databases is desperately, desperately far from what you can do with the rest of your code.
If you have an internal-only environment, there are a number of tools that will help you (DBGhost, Red Gate has a newish app, some deployment management apps). All of them are less than full solutions imho, but they are mostly good enough.
For client-shipped solutions you really don't have anything better than your worst case I'm afraid. Just try and design with flexibility in mind - see Dr.Herbie's answer.
This is not a solved problem basically.

"Smart Client Deployment with ClickOnce" by Brian Noyes has an excellent chapter on this issue. (Chapter 5)
ISBN 978-0-32-119769-6
He suggests something like this:
// Requires: using System.IO; using System.Deployment.Application;
if (ApplicationDeployment.CurrentDeployment.IsFirstRun) {
    MigrateData();
}
private void MigrateData() {
    // After an update, ClickOnce keeps the previous version's data files in a ".pre" subfolder.
    string previousDb = Path.Combine(ApplicationDeployment.CurrentDeployment.DataDirectory, @".pre\mydb.sdf");
    if (!File.Exists(previousDb))
        return;
    string oldConnString = @"Data Source=|DataDirectory|\.pre\mydb.sdf";
    string newConnString = @"Data Source=|DataDirectory|\mydb.sdf";
    // If you are using datasets, perform any migration here, with the old and new table adapters.
    // Otherwise use an .sql data migration script.
    // Store the version of the database in the database, check it at the beginning of your
    // update script, and GOTO the correct line in the SQL script.
}

A common solution is to include a version number somewhere in the database. If you have a table with miscellaneous system data, throw it in there, or create a table with one record just to hold the DB version number. Then whenever the program starts up, check if the database version is less than the expected version. If so, execute the required SQL CREATE, ALTER, etc, commands to bring it up to speed. Have a script or function for each version change. So if you see the database is currently at version 6 and the code expects version 8, execute the 6 to 7 update and the 7 to 8 update.
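A minimal sketch of that step-by-step upgrade, assuming one delegate per version change. The `SchemaMigrator` name and the delegate-per-version layout are illustrative choices, not something from the answer, and the delegates here only record that they ran, where a real application would execute the CREATE/ALTER statements and persist the new version number.

```csharp
using System;
using System.Collections.Generic;

public static class SchemaMigrator
{
    // Runs each single-step upgrade in order until the database reaches the
    // version the code expects. Returns the version actually reached.
    // steps[n] is the upgrade that takes the database from version n to n + 1.
    public static int Upgrade(int currentVersion, int expectedVersion,
                              IReadOnlyDictionary<int, Action> steps)
    {
        while (currentVersion < expectedVersion)
        {
            if (!steps.TryGetValue(currentVersion, out var step))
                throw new InvalidOperationException(
                    $"No upgrade script from version {currentVersion}.");

            step();             // real code: run the schema change statements
            currentVersion++;   // real code: store the new version in the database
        }
        return currentVersion;
    }

    public static void Main()
    {
        var log = new List<string>();
        var steps = new Dictionary<int, Action>
        {
            [6] = () => log.Add("6 to 7"),
            [7] = () => log.Add("7 to 8"),
        };

        int reached = Upgrade(6, 8, steps);
        Console.WriteLine($"Reached version {reached} via: {string.Join(", ", log)}");
    }
}
```

Throwing on a missing step makes a gap in the upgrade chain fail loudly instead of leaving the database at an intermediate version without notice.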
Another method we used on one project I worked on was to ship a schema-only, no-data database with the code. Every time you installed a new version, the installer would also install the latest copy of this blank database. Then, when the program started up, it would compare the user's current database schema with the new database schema and determine on the fly what database changes were needed. For example, if in the "reference schema" table Foo had a column named Bar, and there was no column Bar in the user's current database, we would generate an "alter table Foo add Bar ..." and execute it. While writing the first draft of the program to do this was a fair amount of work, once we'd done it there was pretty much zero maintenance to keep the DB schema up to date. The conversion was just done on the fly.
Note that this scheme doesn't handle DB changes that require changing data values, like if you add a new column that must be initially populated by doing some computation on data from other tables or some such. But if you can generate new data from old data, that must mean that the new data is redundant and your database is not normalized. I don't think the situation ever came up for us.
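The reference-schema comparison can be sketched roughly like this. The in-memory schema representation (table name to column name to column type) is an assumption made for illustration. A real implementation would read both schemas from the databases' metadata, and this sketch only handles added columns, not new tables or changed types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SchemaDiff
{
    // Each schema maps table name -> (column name -> column type).
    // Returns one ALTER TABLE statement per column that exists in the
    // reference schema but is missing from the user's current schema.
    public static List<string> MissingColumnStatements(
        IReadOnlyDictionary<string, Dictionary<string, string>> reference,
        IReadOnlyDictionary<string, Dictionary<string, string>> current)
    {
        var statements = new List<string>();

        foreach (var table in reference.OrderBy(t => t.Key, StringComparer.Ordinal))
        {
            if (!current.TryGetValue(table.Key, out var existingColumns))
                continue;   // creating whole tables is left out of this sketch

            foreach (var column in table.Value.OrderBy(c => c.Key, StringComparer.Ordinal))
            {
                if (!existingColumns.ContainsKey(column.Key))
                    statements.Add(
                        $"ALTER TABLE [{table.Key}] ADD [{column.Key}] {column.Value}");
            }
        }
        return statements;
    }

    public static void Main()
    {
        var reference = new Dictionary<string, Dictionary<string, string>>
        {
            ["Foo"] = new Dictionary<string, string> { ["Id"] = "int", ["Bar"] = "nvarchar(50)" },
        };
        var current = new Dictionary<string, Dictionary<string, string>>
        {
            ["Foo"] = new Dictionary<string, string> { ["Id"] = "int" },
        };

        foreach (var sql in MissingColumnStatements(reference, current))
            Console.WriteLine(sql);
    }
}
```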

I had the same issue with an app in Android with an SQLite database adding a table. I changed the name of the database to include a version extension, like: theDataBaseV1, deleted the previous one and the app works fine.
I just changed the name of the database and the name in this line of code
private static final String DATABASE_NAME = "busesBogotaV2.db";
in the DBManager, where it's going to be opened.
Does anybody know if this trivial solution has any unintended consequences?

Best way to track changes and make changes from MySQL -> MSSQL

So I need to track changes that happen on a MySQL table. I was thinking of using triggers to log all the changes made to it and save these changes in another table. Then I will have a cron script pick up all these changes and propagate them to the MSSQL database.
I really don't expect a lot of information to be propagated, but the data is very time-sensitive. Ideally the MSSQL database will see these changes within a minute, but I know that this requirement may be too demanding.
I was wondering if anyone had a better solution.
I have the bulk of the site written in .NET but use vBulletin for the forums (sorry, but there are no .NET forums as powerful or feature-rich as vBulletin).

The majority of replicator tools use this technique: on insert/update/delete, triggers fill another table with the table name and the PK or a unique key of the affected row.
Then a reader reads this table, does the proper "select" for each insert/update to get the current data, and updates the other database.
HTH
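The reader side of that technique might look roughly like the sketch below. Everything here is a stand-in: the `ChangeEntry` shape, the queue playing the role of the trigger-filled log table, and the two dictionaries playing the roles of the MySQL source and the MSSQL target are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;

public enum ChangeKind { Insert, Update, Delete }

// One row of the hypothetical change-log table filled by the triggers:
// which table, which key, and what kind of change.
public sealed class ChangeEntry
{
    public ChangeEntry(string tableName, int key, ChangeKind kind)
    {
        TableName = tableName; Key = key; Kind = kind;
    }
    public string TableName { get; }
    public int Key { get; }
    public ChangeKind Kind { get; }
}

public static class ChangeLogReplicator
{
    // Drains the change log in order and brings the target in line with the
    // source. Returns the number of log entries processed.
    public static int Replay(
        Queue<ChangeEntry> changeLog,
        IReadOnlyDictionary<(string Table, int Key), string> source,
        IDictionary<(string Table, int Key), string> target)
    {
        int processed = 0;
        while (changeLog.Count > 0)
        {
            var entry = changeLog.Dequeue();
            var id = (entry.TableName, entry.Key);

            if (entry.Kind == ChangeKind.Delete)
                target.Remove(id);
            else if (source.TryGetValue(id, out var currentValue))
                target[id] = currentValue;   // the "select" for insert/update
            // Otherwise the row has since been deleted at the source; a later
            // Delete entry in the log takes care of the target.

            processed++;
        }
        return processed;
    }

    public static void Main()
    {
        var source = new Dictionary<(string Table, int Key), string>
        {
            [("post", 1)] = "edited text",
            [("post", 3)] = "new post",
        };
        var target = new Dictionary<(string Table, int Key), string>
        {
            [("post", 1)] = "old text",
            [("post", 2)] = "to be deleted",
        };
        var log = new Queue<ChangeEntry>(new[]
        {
            new ChangeEntry("post", 1, ChangeKind.Update),
            new ChangeEntry("post", 2, ChangeKind.Delete),
            new ChangeEntry("post", 3, ChangeKind.Insert),
        });

        Console.WriteLine($"Processed {Replay(log, source, target)} entries; target has {target.Count} rows.");
    }
}
```

Because the log stores only keys and the reader fetches the current row, several updates to the same row collapse into the latest state, which keeps the log table small.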
