Explanation of Migrators (FluentMigrator)? - c#

Could someone explain the concept of Migrators (specifically fluentmigrator)?
Here are the (possibly confused) facts Ive gleaned on the subject:
Is it a way to initially create then maintain updates for a database
by way of versioning.
The first migration (or initial version of the
database) would contain all the tables, relationships and properties
required (done either fluently or using a chunk of sql in a script).
When you want to push a change to a database, you would create a new
migration method (Up and Down), something like add a new table or modify a field.
To deploy one of these migrations, you would use a
command line specifying the dll containing the migration, the
connection string and the required version.
If you had a rather complex set of data models, wouldn't it be rather difficult and time consuming to create a migration definition for all of that?
I know with nHibernate/fluent you can easily generate tables for a database without having to define anything other than the models and map files. Is there a way to make this configuration compatible with the Migrator/Versioning?
When nhibernate/fluent is in charge of generating a database, I do not necessarily need to define every thing aspect of the tables. Its done either via convention or via the mapping files. With the migrators I would need to define this level of detail?

Lots of questions here. I'll answer the questions with a focus on FluentMigrator.
Is it a way to initially create then maintain updates for a database
by way of versioning.
FluentMigrator is a way to version control your database schema. Everyone does it in some way. Either manually, with sql scripts, with a tool like SqlCompare or a Visual Studio Database project. All these methods are easy to mess up. It is so easy to make a mistake when releasing a new version and cause the system to crash. Migrations is a better way to handle this.
FluentMigrator allows you to define a change to the schema as code and this is usually checked in to your source control with the other code changes. Meaning that you can say version 1.XX of your system should have version 123 of the database. It means if you roll back your code to the previous version you also know what version of the database to rollback to as well.
It can be used both to create the database schema from the beginning or to start with version control of the schema for an existing database.
A Migration is a way to describe a change to your database schema. FluentMigrator creates a VersionInfo table and stores the unique id (version number) of the Migration after is has been applied.
For example, if I have two Migrations one with Id 1 and one with Id 2. If then I execute the first Migration then Id 1 will be stored in the VersionInfo table and I can look there and know that the version of the database is 1 and that version 2 has not been applied yet.
Being able to know which version the database schema is very useful when pushing changes from Test to Production or if you have multiple copies of the database in Production. For example, I have a customer with offices all around the world and each office has their own copy of the database and all of them are on different versions. Without knowing the database version it would be very difficult to update them safely.
Most of the time I do not need to actually look in the VersionInfo table, FluentMigrator handles this automatically. It compares the assembly with Migrations to the VersionInfo table and figures out which changes have not been applied yet and then executes those.
The first migration (or initial version of the database) would contain
all the tables, relationships and properties required (done either
fluently or using a chunk of sql in a script).
The starting point is up to you. You can have a first migration that is an sql script that you have generated from the current database. You could could also use one of the contrib projects like FluentMigrator.T4 to generate a Fluent Migration. Or you could just decide that the existing database is the starting point and save a copy of it to be able to restore it as version 1.
I have introduced FluentMigrator to a lot of legacy databases without any major problems.
When you want to push a change to a database, you would create a new
migration method (Up and Down), something like add a new table or
modify a field.
Yes, Up is used to apply the change specified in the Migration and Down rolls it back. So Up could be to create a table and Down could be to drop the table.
To deploy one of these migrations, you would use a command line
specifying the dll containing the migration, the connection string and
the required version.
There are three runners available to execute migrations. The command line runner, the Nant task and the MSBuild task. There are usually executed as part of a build script.
The MigrationRunner class can also be used in code. You might do this if you wanted to build your own runner or if you have other needs (like building databases dynamically or automatically updating the database if a new migration is added.)
If you had a rather complex set of data models, wouldn't it be rather
difficult and time consuming to create a migration definition for all
of that?
I have mostly answered this already. It is usually quite easy to generate an sql script for a database. For Sql Server it takes less than a minute to generate the script even for large databases. This script can be saved in a .sql file and executed as the first migration using the Execute.EmbeddedSqlScript expression. It works a treat.
I know with nHibernate/fluent you can easily generate tables for a
database without having to define anything other than the models and
map files. Is there a way to make this configuration compatible with
the Migrator/Versioning?
At the moment, there is no such integration and in practise I, at least, don't miss it. There was some discussion about connecting Fluent NHibernate and FluentMigrator but it would be a lot of work. It would enable scaffolding to generate changes to the model like EF Code First migrations do. It's not on the roadmap at the moment however.
When nhibernate/fluent is in charge of generating a database, I do not
necessarily need to define every thing aspect of the tables. Its done
either via convention or via the mapping files. With the migrators I
would need to define this level of detail?
Yes, you would need to define at that level of detail. FluentMigrators' migrations are a DSL (own little language) for defining schema changes that are translated to sql. You can write sql directly as well using the Execute.Sql expression. Entity Frameworks migrations have that sort of integration which has both advantages and disadvantages.
Check out the wiki or one of the tutorials here, here (part 1) or here (part 2) for more help getting started.

Related

Potential conflict between transferring data and '.ValueGeneratedOnAdd()'

I apologize if this is duplicative; I could find nothing directly pertaining.
The difficulty involves EF Core (v 3.1.8, if it matters), but is not specific or restricted thereto. I am doing code first, creating a number of entities, but the key point is that I am getting my initial data set from an app that I am trying to replace. My new app has a number of structural differences in every corresponding entity, but the data in the old app is still critical, so I will be transferring it to my new database. (Old db is hosted by MS SQL 2008; new db is hosted by MS SQL 2019, if it matters).
Most of the key fields are GUIDs, and the problem is that in EF Core, at the point in the future when I want to use the new app to do more data entry, I will also want the database to choose the GUID. In EF Core Fluent API parlance, that would be, for example:
modelBuilder.Entity("ReplaceOldApp.Models.Address", b =>
{
b.Property<Guid>("AddressID")
.ValueGeneratedOnAdd()
.HasColumnType("uniqueidentifier");
}
However, if I inform EF Core that I want the database to create the key, then it will create the tables such that when I try to transfer the data from the old database (whether using EF or some other means), the new database will ignore the old GUID and create a new, unrelated one. (Or at least, that's what I think will happen. I'm not ready to try it yet.) If that happens, then all of the data from, say, the old Person entity (such as the above-implied Address entity), will no longer be related between their corresponding entities in the new database, because all records will have shiny new GUIDs. I will have all the information, and no way to actually use it.
Obviously I can tell EF Core to inform the database that it will not be creating the GUIDs, and I can then read, unmunge and transfer the data from the old database to the new without fear of data loss (God willing). But then going forward, for any new data entry, the GUIDs will not be automatically genned. I can of course then mod my IEntityTypeConfiguration Fluent API classes for the various entities and do a second migration, re-genning the affected tables, but I'm worried that EF Core will decide that it needs to DROP the tables to accommodate such a change. (Again, I do not know for sure because I have not tried it: sorry.)
So my question is: How would you approach such a situation? Should I ignore EF and do something clever with MS SQL Studio? Should I do two migrations with a transfer in-between? Should I tell the database, even though it has been told to gen the keys, somehow to accept the old keys without changing things, perhaps via LINQ?
============== Edit:
I'm sure SSIS would work to transfer the data from old to new databases, but the learning curve appears daunting, and I am only trying to solve one problem, not gain a new career. Powershell ditto, although it may be a bit more of a hacker's tool, and as such knowledge of it might assist tweaking or help to solve a diverse set of one-time SQL Server headaches. However, again, as would you, I prefer to use what I know, or failing that, learn or learn more about a tool which promises to serve me consistently into the future.
With the very welcome new (to me) information about IDENTITY_INSERT, and information gained from Linq To Sql and identity_insert, I believe I should not use LINQ to SQL because it may assume that IDENTITY_INSERT is OFF and simply filter out the crucial GUID, failing therefore to provide it to the target server. Rather, it seems I can use C# to produce a series of generated SQL statements, and then run each one on the target server inside a TransactionScope(). Because each such insert will thereby run 'in the same connection', the state of IDENTITY_INSERT will be preserved for that entire insert transaction, and (creek don't rise) it should work.
Again, I appreciate your answer, Randy in Marin. It has, it seems, led me to an approach that will work within the potential constraints of my context (EF Core), while allowing me to preserve the crucial existing IDENTITY information. Peace.
Not being an EF programmer, I don't know if there is an option for identity insert that you can enable for a migration. You might search the term to see if it comes up.
Our team support database migrations. We can do it a number of ways. I would not even consider EF because it's not designed for data migrations - or for database design. (And because we tend to use what we know.)
This is not the way I would do it, but it might be better than SSIS if you have not used SSIS. If the tables are in the same database or in databases on the same server, you can use T-SQL to load each table one at a time. Even if not on the same server, a linked server would allow a distributed transaction. (I avoid linked servers like the plague, but for a one time thing like a migration I would tolerate it. I would rather restore a copy of the source database to the destination server to use as a source. Distributed transactions gone wrong have forced me to reboot critical servers.)
Each table can have a 4 part name. If the server part (e.g., using a linked server name) is not present, the local instance is used. If the database part is not present, the current database is used. This is the format I assume for the "src_table" and "dst_table".
[myserver\myinstance].[mydatabase].[myschema].[mytable]
Each table is loaded with T-SQL as follows:
TRUNCATE TABLE dst_table
SET IDENTITY_INSERT dst_table ON
INSERT dst_table (...) SELECT ... FROM src_table
SET IDENTITY_INSERT dst_table OFF -- must be turned off - only 1 table can have this ON
If there are foreign keys, some tables (e.g., def tables) would need to be loaded first.
If the table does not have an IDENTITY column (EF code creates all values), you don't use the IDENTITY_INSERT stuff. It will fail if you use it and there is not an identity column. It will fail if you don't use it and try to insert into an identity column.
If there is a lot of data in a table, the transaction might be too big or slow. Inserting in batches might be called for.
If it was something to run on a schedule, I would likely create a SSIS package to do the load.
If I wanted to try something new, I would use powershell and the DBATools module cmdlets to see if extracting to csv and importing the csv would be efficient. The import cmdlet has a column mapping parameter, among many others. PowerShell could be used to do transformation, but I think this crosses over into SSIS territory.
I have dealt with migrations where the GUIDs and IDs no longer related after the move. Using queries joining the new data to the old data, we were able to fix the related values. It's likely more work to fix it after than to plan for it to be correct from the start.

Update Db in EF5 with MVC5

I am working on a project using Entity Framework 5 with MVC5. My Project is currently running.
I am trying to add a column in a table. But as we know that in EF when we add a field in model it drop and recreate the database, which i can`t do.
One method for this is found code migration. But my manager is not allow me to use that(because its a big database project).
Please help me and suggest something for it.
When I start using code first with Entity Framework, I was in the same situation as you. I was always running Update-Database -F and then watching all my tables get dropped and recreated, even for something as simple as renaming a field.
Versioning databases is hard, but it's much easier with named migrations (which I think it what you mean when you refer to code migrations). I know your boss is against the idea, but it's very flexible.
Essentially you run Add-Migration -Name xxx in your Package Manager Console and Entity Framework will scaffold a configuration class for you with the default commands (both for versioning Up() and Down()) it will execute when you Update-Database. If you don't like the commands, you can change them. You can even move data around if you need to (it's a bit fiddly though).
I think you have four options available to you;
Use code-first automatic migrations: This is what you have at the moment, and doesn't give you enough control over what happens when you update your database. It's good for getting started in the earlier stages of a project, but becomes unwieldy after production.
Use code-first named migrations: Gives you the control you need via Configurations - but your boss has prevented use from using it.
Use a database-first approach: Database First allows you to reverse engineer a model from an existing database. So if you need to make a change, you would change your database first, and then regenerate your models using EF. This is usually favoured by DBA's, but it may mean that you have reimplement some aspects of your existing project.
Dont use entity framework: It's possible that you could revert back to SQL queries, which your boss might accept and gives you the flexibility you need - but who needs that kind of pain?
Let me know if I can help further.

Verify that target database schema complies with what's in Entity Framework?

We have a process where our database guys script changes (and version them using Juneau) to our application's database out-of-band with our code base. They're good at accounting for new columns being null, and not wiping existing data, but occasionally a column rename sneaks in that isn't fully communicated. So they will make some changes to the database schema on a testing server, we'll update Entity Framework to work with those changes, and then commit our code. This process works okay, except for when it's time to deploy.
We have TFS set up to deploy the successful build to the appropriate servers, but there's no guarantee that the database for that environment has been updated. We don't care if extra fields/tables/views/etc. exist in the target database, but we want change the build to check that the database contains at least everything EF is aware of.
I looked at this question, but I don't need the schema to match exactly. Plus, we don't want it creating/modifying the database directly. And this question seems like it's trying to achieve a similar ideal, but still not quite what we're looking to achieve. We just want a integration test of sorts to verify our version of EF will work with the target schema.
I wonder why you try to deploy your application without changes to database. Your application is dependent on the database so the deployment should always be done after the database. It looks like you are going to invest a lot of time to develop validation to fix your incorrect deployment process (where fixing the process itself is the correct solution).
Anyway you can create some "validation" of the database but it will take some time. If you are using EDMX file you can open it as XML and read its SSDL part which describes all expected tables, columns, relations, views (in form of SELECT SQL queries), stored procedures and functions. You can parse this XML part and use system database views (sys.tables, sys.columns, ...) to query if these objects exists in the database.
Another approach can be using database diff. tool to compare your current test database with the target one. This will require the tool which can be executed from command line and you will have to parse its output to find breaking changes.

How to change database design in a deployed application?

Situation
I'm creating a C#/WPF 4 application using a SQL Compact Edition database as a backend with the Entity Framework and deploying with ClickOnce.
I'm fairly new to applications using databases, though I don't suspect I'll have much problem designing and building the original database. However, I'm worried that in the future I'll need to add or change some functionality which will require me to change the database design after the database is already deployed and the user has data in the database.
Questions
Is it even possible to push an updated database design out to users via a clickonce update in the same way it is for code changes?
If I did, how would the user's data be affected?
How is this sort of thing done in real situations? What are some best-practices?
I figure that in the worst case, I'd need to build some kind of "version" number into the database or program settings and create some routine to migrate the user's current version of the database to the new one.
I appreciate any insight into my problem. Thanks a lot.
There are some 'tricks' that are employed when designing databases to allow for design changes.
Firstly, many database designers create views to code against, rather than coding directly to the tables. This allows tables to be altered (split or merged, etc) while only requiring that the views are updated. You may want to investigate database refactoring techniques for this.
Secondly, you can indeed add versioning information to the database (commonly done as a 'version' table with a single field). Updating the database can be done through code or through scripts. One system I worked on would automatically check the database version and then progressively update the schema through versions in code until it matched the required version for the runtime. This was quite an undertaking.
I think your "worst" case is actually a pretty good route to go in this situation. Maintain a database version in the DB and have your application check and update the DB as necessary. If you build your updater correctly, it should be able to maintain the user's data. Depending on the update this might involve creating temporary tables to hold the existing data and repopulating new versions of the tables from them. You might be able to include a new SDF file with the new schema in place in the update process and simply transfer the data. It might be slightly easier that way -- you could use file naming to differentiate versions and trigger the update code that way.
Unfortunately version control and change management for databases is desperately, desperately far from what you can do with the rest of your code.
If you have an internal-only environment there are a number of tools which will help you (DBGhost, Red Gate has a newish app, some deployment management apps) but all of them are less than full solutions imho, but they are mostly good enough.
For client-shipped solutions you really don't have anything better than your worst case I'm afraid. Just try and design with flexibility in mind - see Dr.Herbie's answer.
This is not a solved problem basically.
"Smart Client Deployment with ClickOnce" by Brian Noyes has an excellent chapter on this issue. (Chapter 5)
ISBN 978-0-32-119769-6
He suggests something like this:
if(ApplicationDeployment.CurrentDeployment.IsFirstRun) {
MigrateData();
}
private void MigrateData() {
string previousDb = Path.Combine(ApplicationDeployment.CurrentDeployment.DataDirectory, #".\pre\mydb.sdf");
if(!File.Exists(previousDb))
return;
string oldConnString = #"Data Source=|DataDirectory|\.pre\mydb.sdf";
string newConnString = #"Data Source=|DataDirectory|\mydb.sdf";
//If you are using datasets perform any migration here, with the old and new table adapters.
//Otherwise use an .sql data migration script.
//Store the version of the database in the database, and check that in the beginning of your update script and GOTO the correct line in the SQL script.
}
A common solution is to include a version number somewhere in the database. If you have a table with miscellaneous system data, throw it in there, or create a table with one record just to hold the DB version number. Then whenever the program starts up, check if the database version is less than the expected version. If so, execute the required SQL CREATE, ALTER, etc, commands to bring it up to speed. Have a script or function for each version change. So if you see the database is currently at version 6 and the code expects version 8, execute the 6 to 7 update and the 7 to 8 update.
Another method we used on one project I worked was to ship a schema-only, no data database with the code. Every time you installed a new version the installer would also install the latest copy of this new blank database. Then when the program started it up it would compare the user's current database schema with the new database schema, and determine what database changes were needed on the fly. Like, if in the "reference schema" table Foo had a column named Bar, and there was no column Bar in the user's current database, we would generate a "alter table Foo add Bar ..." and execute it. While writing the first draft of the program to do this was a fair amount of work, once we'd done it there was pretty much zero maintenance to keep the DB schema up to date. The conversion was just done on the fly.
Note that this scheme doesn't handle DB changes that require changing data values, like if you add a new column that must be initially populated by doing some computation on data from other tables or some such. But if you can generate new data from old data, that must mean that the new data is redundant and your database is not normalized. I don't think the situation ever came up for us.
I had the same issue with an app in Android with an SQLite database adding a table. I changed the name of the database to include a version extension, like: theDataBaseV1, deleted the previous one and the app works fine.
I just changed the name of the database and the name in this line of code
private static final String DATABASE_NAME = "busesBogotaV2.db";
in the DBManager when its going to open.
Does anybody knows if this trivial solution has any unintended consequences?

Is it possible to update old database from dbml file ? (C#, .Net 4, Linq, SQL Server)

I began recently a new job, a very interesting project (C#,.Net 4, Linq, VS 2010 and SQL Server). And immediately I got a very exciting challenge: I must implement either a new tool or integrate the logic when program start, or whatever, but what must happen is the following: the customers have previous application and database (full with their specific data). Now a new version is ready and the customer gets the update. In the mean time we made some modification on DB (new table, columns, maybe an old column deleted, or whatever). I’m pretty new in Linq and also SQL databases and my first solution can be: I check the applications/databases version and implement all the changes step by step comparing all tables, columns, keys, constrains, etc. (all this new information I have in my dbml and the old I asked from the existing DB). And I’ll do this each time the version changed. But somehow I feel, this is NOT a smart solution so I look for a general solution of this problem.
Is there a way to update customers DB from the dbml file? To create a new one is not a problem (CreateDatabase with DataContext), is there any update/alter database methods? I guess I’m not the only one who search for such a solution (I found nothing in internet – or I looked for bad keywords). How did you solve this problem? I look also for an external tool, but first for a solution with C#, Linq or something similar.
For any idea thank you in advance!
Best regards,
Emil
What I always do is use Red Gate's SQL Compare to compare the schema of the new database to the schema of the old database. It will generate a change script for you and then you can run that script in code.
We have a table that has a single row in it for program setup information. One of the columns in this table is the database version number. This will instantly tell us what database version the customer has when we do an update. Then we run every script that will update them to the latest version they need to be running. Whenever we release a new version (with database changes), we run the SQL Compare and make a script to go from the previous version to the next. We don't do any scripts that will skip versions, just in case of strange conflicts that may arise from that.
This also gives us the opportunity to do any data massaging we may have to do in between versions by writing a custom script and inserting that into the update scripts. Every update script changes that database version field as well.
This allows us to do a lot of automated updating. Having that database version allows the client to take a peek at that version before the user has a chance to use the application. If it's different and the application needs an update, it will go out to our ftp site and download the update and run the setup automatically.
Basically what you want to be able to do is to script the changes - to be able to run "something" that allows you to update one version of the database to the next and also to make any necessary changes to the data required by that change in the schema.
Good news is that you can do this with SQL, you can write DDL statements to create and modify a database schema.
My solution is to put my database schema maintenance entirely in code, I think this is the best version of the writeup I've done so far:
How to create "embedded" SQL 2008 database file if it doesn't exist?
Why in code? Because it works. May not be the best solution but its one I have had some success with and the results are consistent and repeatable. Oh and its version controlled too.
The big problem you may have in this specific instance is that you need to establish a baseline - to make sure that the existing databases are consistent in terms of their schema. This is where more complex and clever tools may serve you better - being able to do a schema diff and then update has a lot of appeal as a concept for example but equally you're somewhat dependent on having your reference database perfect and that raises other issues.

Categories