Entity Framework AddObject to only "INSERT INTO" certain columns - c#

We have a system that will use the same code to communicate with different client databases. These databases will use the same EF Model, but different connection strings.
Our problem is, not every site will be using the same version of our database structure; some might be missing a few columns or contain a few old columns.
If we upgrade the system to the current version, the database model now has an extra EmergencyContact column. All older databases will now fail, because EF tries to insert into this column (even though we have not set a value for the property).
Is there a way of telling EF to only use the columns for which we have a value when it generates the INSERT INTO query?

EF will be fine if your model is missing columns that exist in the real database, but it will not work if the model has columns that are not in the database, and there is no way to fix that.
Your only choice is to use different schemas for different databases, and write code that manages them (i.e., only instantiate the version of the context you need).

In the case where your model does not match your database schema, EF will only insert/update the columns that exist in the model. However, if any column missing from the model is NOT NULL in the database, the insert will throw an exception. Also, if you created relational constraints on the unknown columns, those of course will not be enforced through the model, as EF does not know about them.

If the persistence layer per site is the only part that changes, then I would extract your EF model into its own versioned assembly, e.g.
DbV1.dll
DbV2.dll
You could then load in the appropriate DLL based on some setting from the client, e.g. information passed as a custom header:
db-version: 1
There are other, more reliable ways; however, I don't know what your current setup is like, so it's difficult to give a precise answer.
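For illustration only, here is a minimal sketch of that DLL-per-version idea; the ClientContext type name and the file layout are assumptions, not something from your setup:

using System;
using System.Reflection;

static object CreateContextForVersion(int dbVersion)
{
    // e.g., "DbV1.dll" or "DbV2.dll" sitting next to the executable
    Assembly assembly = Assembly.LoadFrom(string.Format("DbV{0}.dll", dbVersion));

    // Assumed convention: each assembly exposes a "ClientContext" type
    Type contextType = assembly.GetType(string.Format("DbV{0}.ClientContext", dbVersion));
    if (contextType == null)
        throw new InvalidOperationException("No context found for version " + dbVersion);

    return Activator.CreateInstance(contextType);
}

A shared interface in a common assembly, implemented by every versioned context, would let the calling code use the result without further reflection.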

Related

Potential conflict between transferring data and '.ValueGeneratedOnAdd()'

I apologize if this is duplicative; I could find nothing directly pertaining.
The difficulty involves EF Core (v 3.1.8, if it matters), but is not specific or restricted thereto. I am doing code first, creating a number of entities, but the key point is that I am getting my initial data set from an app that I am trying to replace. My new app has a number of structural differences in every corresponding entity, but the data in the old app is still critical, so I will be transferring it to my new database. (Old db is hosted by MS SQL 2008; new db is hosted by MS SQL 2019, if it matters).
Most of the key fields are GUIDs, and the problem is that in EF Core, at the point in the future when I want to use the new app to do more data entry, I will also want the database to choose the GUID. In EF Core Fluent API parlance, that would be, for example:
modelBuilder.Entity("ReplaceOldApp.Models.Address", b =>
{
    b.Property<Guid>("AddressID")
        .ValueGeneratedOnAdd()
        .HasColumnType("uniqueidentifier");
});
However, if I inform EF Core that I want the database to create the key, then it will create the tables such that when I try to transfer the data from the old database (whether using EF or some other means), the new database will ignore the old GUID and create a new, unrelated one. (Or at least, that's what I think will happen. I'm not ready to try it yet.) If that happens, then all of the data from, say, the old Person entity (such as the above-implied Address entity), will no longer be related between their corresponding entities in the new database, because all records will have shiny new GUIDs. I will have all the information, and no way to actually use it.
Obviously I can tell EF Core to inform the database that it will not be creating the GUIDs, and I can then read, unmunge and transfer the data from the old database to the new without fear of data loss (God willing). But then going forward, for any new data entry, the GUIDs will not be automatically genned. I can of course then mod my IEntityTypeConfiguration Fluent API classes for the various entities and do a second migration, re-genning the affected tables, but I'm worried that EF Core will decide that it needs to DROP the tables to accommodate such a change. (Again, I do not know for sure because I have not tried it: sorry.)
So my question is: How would you approach such a situation? Should I ignore EF and do something clever with MS SQL Studio? Should I do two migrations with a transfer in-between? Should I tell the database, even though it has been told to gen the keys, somehow to accept the old keys without changing things, perhaps via LINQ?
Edit:
I'm sure SSIS would work to transfer the data from old to new databases, but the learning curve appears daunting, and I am only trying to solve one problem, not gain a new career. Powershell ditto, although it may be a bit more of a hacker's tool, and as such knowledge of it might assist tweaking or help to solve a diverse set of one-time SQL Server headaches. However, again, as would you, I prefer to use what I know, or failing that, learn or learn more about a tool which promises to serve me consistently into the future.
With the very welcome new (to me) information about IDENTITY_INSERT, and information gained from Linq To Sql and identity_insert, I believe I should not use LINQ to SQL because it may assume that IDENTITY_INSERT is OFF and simply filter out the crucial GUID, failing therefore to provide it to the target server. Rather, it seems I can use C# to produce a series of generated SQL statements, and then run each one on the target server inside a TransactionScope(). Because each such insert will thereby run 'in the same connection', the state of IDENTITY_INSERT will be preserved for that entire insert transaction, and (creek don't rise) it should work.
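For what it's worth, a minimal sketch of that plan; the dbo.Person table and the batch contents are placeholders, and in practice the INSERT statements would be generated from the old database:

using System.Data.SqlClient;
using System.Transactions;

static void RunGeneratedInserts(string targetConnStr, string[] statements)
{
    // IDENTITY_INSERT is session-scoped, so every statement must run on
    // the same open connection; TransactionScope makes it all-or-nothing.
    using (var scope = new TransactionScope())
    using (var conn = new SqlConnection(targetConnStr))
    {
        conn.Open(); // enlists in the ambient transaction
        foreach (var sql in statements)
        {
            using (var cmd = new SqlCommand(sql, conn))
                cmd.ExecuteNonQuery();
        }
        scope.Complete(); // commit; disposing without this rolls back
    }
}

// Example of the generated batch (dbo.Person is hypothetical):
// SET IDENTITY_INSERT dbo.Person ON
// INSERT dbo.Person (PersonID, Name) VALUES (42, 'Alice')
// SET IDENTITY_INSERT dbo.Person OFF   -- only one table may have it ON

(Note that IDENTITY_INSERT only applies to IDENTITY columns; GUID keys that use a default such as NEWID() can simply be supplied explicitly in the INSERT.)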
Again, I appreciate your answer, Randy in Marin. It has, it seems, led me to an approach that will work within the potential constraints of my context (EF Core), while allowing me to preserve the crucial existing IDENTITY information. Peace.
Not being an EF programmer, I don't know if there is an option for identity insert that you can enable for a migration. You might search the term to see if it comes up.
Our team supports database migrations. We can do them a number of ways. I would not even consider EF, because it's not designed for data migrations - or for database design. (And because we tend to use what we know.)
This is not the way I would do it, but it might be better than SSIS if you have not used SSIS. If the tables are in the same database, or in databases on the same server, you can use T-SQL to load each table one at a time. Even if they're not on the same server, a linked server would allow a distributed transaction. (I avoid linked servers like the plague, but for a one-time thing like a migration I would tolerate one. I would rather restore a copy of the source database to the destination server to use as a source. Distributed transactions gone wrong have forced me to reboot critical servers.)
Each table can have a 4 part name. If the server part (e.g., using a linked server name) is not present, the local instance is used. If the database part is not present, the current database is used. This is the format I assume for the "src_table" and "dst_table".
[myserver\myinstance].[mydatabase].[myschema].[mytable]
Each table is loaded with T-SQL as follows:
TRUNCATE TABLE dst_table
SET IDENTITY_INSERT dst_table ON
INSERT dst_table (...) SELECT ... FROM src_table
SET IDENTITY_INSERT dst_table OFF -- must be turned off - only 1 table can have this ON
If there are foreign keys, some tables (e.g., definition/lookup tables) would need to be loaded first.
If the table does not have an IDENTITY column (EF code creates all values), you don't use the IDENTITY_INSERT stuff. It will fail if you use it and there is not an identity column. It will fail if you don't use it and try to insert into an identity column.
If there is a lot of data in a table, the transaction might be too big or slow. Inserting in batches might be called for.
If it were something to run on a schedule, I would likely create an SSIS package to do the load.
If I wanted to try something new, I would use PowerShell and the DBATools module cmdlets to see if extracting to CSV and importing the CSV would be efficient. The import cmdlet has a column-mapping parameter, among many others. PowerShell could be used to do transformations too, but I think this crosses over into SSIS territory.
I have dealt with migrations where the GUIDs and IDs no longer related after the move. Using queries joining the new data to the old data, we were able to fix the related values. It's likely more work to fix it after than to plan for it to be correct from the start.

Incremental ETL on code first many-to-many association table

I'm setting up a data warehouse (in SQL Server); together with our engineers, we've got almost everything up and running. Our main application also uses SQL Server as a backend, and aims to be code first using Entity Framework. In most tables we added a column like updatedAt to allow for incremental loading into our data warehouse, but there is a many-to-many association table created by Entity Framework which we cannot modify. The table consists of two GUID columns with a composite key, so they are not iterable like an incrementing integer or a date. We are now trying to figure out the options for enabling incremental load on this table, but there is little information to be found.
After searching for a while, I mostly came across posts explaining that it's not possible to manually add columns (such as updatedAt) to the association table, such as here: Create code first, many to many, with additional fields in association table. The suggestion is to split the table into two one-to-many relationships with an explicit join entity (sketched below). We would like to avoid this if possible.
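For reference, a minimal sketch of what that split would look like; the Student/Course/Enrollment names are hypothetical, not our actual entities:

using System;
using System.Collections.Generic;

public class Student
{
    public Guid StudentId { get; set; }
    public virtual ICollection<Enrollment> Enrollments { get; set; }
}

public class Course
{
    public Guid CourseId { get; set; }
    public virtual ICollection<Enrollment> Enrollments { get; set; }
}

// The explicit join entity can carry the updatedAt column for the ETL.
public class Enrollment
{
    public Guid StudentId { get; set; }
    public Guid CourseId { get; set; }
    public DateTime UpdatedAt { get; set; }

    public virtual Student Student { get; set; }
    public virtual Course Course { get; set; }
}

// In OnModelCreating, the composite key is declared explicitly:
// modelBuilder.Entity<Enrollment>().HasKey(e => new { e.StudentId, e.CourseId });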
Another potential option would be to turn on change data capture on the server, but that would potentially defeat the purpose of code first in the application.
Another thought was to add a column in the database itself, not in code, with a default value of the current datetime. But that might also be impossible or incompatible with Entity Framework, as well as defeating the code-first principle.
Are we missing anything? Are there other solutions for this? The ideal solution would be a code first solution, or a solution in the ETL process without affecting the base application, without changing too much. Any suggestions are appreciated.

Entity Framework and Dynamic Schema

I inherited an application that talks to many different client databases.
Most of these tables in the client databases have identical schema - but there are a handful of tables that have extra custom columns that contain tax information (ya - bad idea - I know … I didn't set it up).
These extra columns could be named anything. They are known at runtime as they can be looked up in another table.
I can set up EF so that it will read/write these tables (skipping the dynamic columns), but I really do need this information, as it is tax data.
I think my best route is to have a fixed model with extra properties added that could be filled by these dynamic columns.
How can I get Entity Framework to dynamically read and write these columns without using custom SQL statements on every call?
I can do extra reads and writes to handle these extra columns separately (using custom SQL)… but there must be some way to override EF so that it knows about these extra columns and can handle them correctly.
Any help would be appreciated.
As a first step, you could interrogate INFORMATION_SCHEMA, or other metadata tables, directly to find out whether the table you want your context to map has these columns. Based on that information, you can use a different DbContext (a generic one would probably work), creating it with a mapping configuration that either ignores the columns if they aren't there, or maps them to the POCO class your context desires.
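A rough sketch of that idea, assuming EF6 code first; the Invoice entity and TaxRate property are hypothetical stand-ins for the tax columns:

using System.Data.SqlClient;

// Probe the live database once at startup.
static bool ColumnExists(string connStr, string table, string column)
{
    const string sql =
        @"SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS
          WHERE TABLE_NAME = @t AND COLUMN_NAME = @c";
    using (var conn = new SqlConnection(connStr))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@t", table);
        cmd.Parameters.AddWithValue("@c", column);
        conn.Open();
        return (int)cmd.ExecuteScalar() > 0;
    }
}

// Then, in the context, drop the property from the model when absent:
// protected override void OnModelCreating(DbModelBuilder modelBuilder)
// {
//     if (!_hasTaxRate)
//         modelBuilder.Entity<Invoice>().Ignore(i => i.TaxRate);
// }

One caveat: EF caches the compiled model per context type, so a context whose shape varies per database needs a distinct model (or a custom model cache key) per variant.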

How to get Nhibernate to handle non existing database columns gracefully

I am working on a project that requires the use of multiple databases that are, for the most part, completely identical, but where some columns might be missing. How do you get NHibernate to handle this? For instance, I have a table with four columns: an index and two data columns that will always be available, plus one column that a single customer does not want in their database.
As this is part of a legacy application migration, I do not have the luxury of dictating the database format or even changing the databases. Does anybody have any ideas on how to do this? I cannot get NHibernate Shards to work with this either.
KR
Nicky
I don't know of a way to tell NHibernate to ignore columns that are otherwise mapped.
I would look at creating multiple mapping files for the different databases and then, depending on your environment, configuring your SessionFactory using the correct mapping files.
This may seem like a little more work to set up initially, but it makes it very clear that in database X you have columns A-B-C and in database Y you only have columns A-B.
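A small sketch of that arrangement; the file names and the "full"/"reduced" selection are made up for illustration:

using NHibernate;
using NHibernate.Cfg;

static ISessionFactory BuildFactory(string databaseVersion)
{
    var cfg = new Configuration().Configure(); // reads hibernate.cfg.xml

    // Pick the mapping that matches this customer's schema:
    // the "full" file maps columns A, B and C; the "reduced" one only A and B.
    cfg.AddFile(databaseVersion == "full"
        ? "Mappings/Customer.full.hbm.xml"
        : "Mappings/Customer.reduced.hbm.xml");

    return cfg.BuildSessionFactory();
}

The class stays the same in both mapping files; the reduced mapping simply omits the <property> element for the missing column, so NHibernate never references it in generated SQL.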

Change Entity framework database schema at runtime

In most ASP.NET applications you can change the database store by modifying the connection string at runtime, e.g. I can change from using a test database to a production database by simply changing the value of the "database" field in the connection string.
I'm trying to change the schema (but not necessarily the database itself) with entity framework but no luck.
The problem I'm seeing is that the SSDL content in the edmx XML file stores the schema for each entity set.
See below:
<EntitySet Name="task"
           EntityType="hardModel.Store.task"
           store:Type="Tables"
           Schema="test" />
Now, I have changed the schema attribute value from "test" to "prod" and it works.
But this does not seem to be a good solution.
I would need to update every entity set, as well as the stored procedures (I have 50+ tables), and I can only do this at compile time.
If I then later try to update the entity model, entities that already exist get re-added, because EF does not recognize that the table already exists in the EDM.
Any thoughts?
I have this same issue, and it's really rather annoying, because it's one of those cases where Microsoft really missed the boat. Half the reason to use EF is support for additional databases, but that support falls short unless you go code first - which doesn't really address this problem anyway.
In MS SQL, changing the schema makes very little sense, because the schema is part of the identity of a table. For other types of databases, the schema is very much not part of a table's identity and only determines its location. Connect to Oracle, and changing the database and changing the schema are essentially synonymous.
Update: Upon reading your comments, it's clear that you want to change the referenced schema for each DB, not the database. I've edited the question to clarify this and to restore the sample EDMX you provided, which was hidden in the original formatting.
I'll repeat my comment below here:
If the schemata are in the same DB, you can't switch these at runtime (except with EF 4 code-only). This is because two identically-named and structured tables in two different schemata are considered entirely different tables.
I also agree with JMarsch above: I'd reconsider the design of putting test and production data (or, actually, 'anything and production data') in the same DB. Seems like an invitation to disaster.
Old answer below.
Are you sure you're changing the correct connection string? The database connection string used by EF is embedded inside the EF connection string, which also specifies the location of the CSDL/SSDL/MSL. It's common to have a "normal" connection string for use by some other part of your app (e.g., ASP.NET membership). In this case, when changing DBs you must update both of your connection strings.
Similarly, if you update the connection string at runtime, then you must use specific tools for this which understand the EF connection string format, rather than the usual connection string builder. See the example in the link, and also this help on assigning EF connection strings.
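For example, a sketch using EntityConnectionStringBuilder; the metadata resource names, the HardModelEntities context, and the target catalog are placeholders for your model's actual ones:

using System.Data.EntityClient;

var builder = new EntityConnectionStringBuilder
{
    // Location of the CSDL/SSDL/MSL baked into the model assembly
    Metadata = "res://*/HardModel.csdl|res://*/HardModel.ssdl|res://*/HardModel.msl",
    Provider = "System.Data.SqlClient",
    ProviderConnectionString =
        "Data Source=.;Initial Catalog=ProdDb;Integrated Security=True"
};

using (var context = new HardModelEntities(builder.ConnectionString))
{
    // query as usual
}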
The easiest way to solve the problem is to manually remove all entries like Schema="SchemaName" from the SSDL part of the model.
Everything works properly in this case.
Sorry it's not a robust answer, but I found this project on CodePlex (as well as this question) while googling around for a similar problem:
http://efmodeladapter.codeplex.com/
The features include run-time adjustment of model schema, including:
* Adjusting data-level table prefixes or suffixes
* Adjusting the owner of database objects
Some code from the docs:
public partial class MyObjectContext : BrandonHaynes.ModelAdapter.EntityFramework.AdaptingObjectContext
{
    public MyObjectContext()
        : base(myConnectionString,
               new ConnectionAdapter(
                   new TablePrefixModelAdapter("Prefix",
                       new TableSuffixModelAdapter("Suffix")),
                   System.Reflection.Assembly.GetCallingAssembly()))
    {
        ...
    }
}
Looks like it's exactly what you're looking for.
The connection string for EF is in the config file. There is no need to change the SSDL file.
EDIT
Do you have the prod and test schema in the same database?
If yes, you can fix it by using a separate database for prod and test, with the same schema name in both databases.
If no, you can fix it by using the same schema name in both databases.
If you absolutely must have different schema names, create two EF models, one for test and one for prod, and then select which one to use in code based on a value in your config file (see the sketch below).
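As a minimal sketch, with TestEntities/ProdEntities standing in for the two generated models and "ModelVersion" as an assumed config key:

using System.Configuration;
using System.Data.Objects;

static ObjectContext CreateContext()
{
    // appSettings: <add key="ModelVersion" value="prod" />
    string version = ConfigurationManager.AppSettings["ModelVersion"];
    return version == "prod"
        ? (ObjectContext)new ProdEntities()
        : new TestEntities();
}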
When I create a new "ADO.NET Entity Data Model", there are two properties, "Entity Container Name" and "Namespace", available for editing in design view. Using namespace.EntityContainerName, you can create a new instance specifying a connection string:
MyEntities e = new MyEntities("connstr");
e.MyTable.Count();
I'm not sure if this helps you or not, good luck!
Also, this is a good case for multiple layers (doesn't have to be projects, but could be).
Solution
* DataAccess - Entities here
* Service - Wraps access to DataAccess
* Consumer - Calls Service
In this scenario, the consumer calls service, passing in whatever factor determines which connection string is used. The service then instantiates an instance of data access passing in the appropriate connection string and executes the consumer's query.
Here is a similar question with a better answer:
Changing schema name on runtime - Entity Framework
The solution that worked for me was the one written by Jan Matousek.
Solved my problem by moving to SQL Server and away from MySQL.
MySQL and MSSQL interpret "schemas" differently. In MySQL, schemas are synonyms for databases. When I created the model, the schema name (which is the same as the database name) was hard-coded in the generated model XML. In MSSQL the schema defaults to "dbo", which also gets hard-coded, but this isn't an issue, since in MSSQL schemas and databases are different things.
