linq2sql C#: How to query from a table with a changing schema name

I have a web service which connects to the database of a desktop accounting application.
The database has tables with the same name but different schema names, such as:
[DatabaseName].[202001].[CustomerCredit]
[DatabaseName].[202002].[CustomerCredit]
.
.
.
[DatabaseName].[202014].[CustomerCredit]
[DatabaseName].[202015].[CustomerCredit]
[DatabaseName].[202016].[CustomerCredit]
...
..
[DatabaseName].[2020xx].[CustomerCredit]
Schema names are in the format [Year + incremental number], such as [202014], [202015], [202016], etc.
Whenever I want to query customer credit information, I have to fetch it from the schema with the highest number, e.g. [DatabaseName].[202016].[CustomerCredit] if 202016 is the latest schema in my database.
Note:
New schemas are created in the accounting application's database entirely at the discretion of the application's user; there are no rules, and every installation of the application may have a different number of schemas.
So when developing my web service I have no way of knowing in advance which schema to connect to. At run time I can find the correct schema to query, but I don't know how to make the query fetch table information using that schema name.
I usually create a LINQ-to-SQL DBML class and use its definitions to read information from the database, but I don't know how to handle the changing schema with this approach.
The DBML designer maps schema names like this:
[global::System.Data.Linq.Mapping.TableAttribute(Name="[202001].CustomerCredit")]
However, since my app can only determine the schema name at run time, I don't know how to fix the table declaration for my particular case.
This is easy to handle in ADO.NET, but I don't know the equivalent in LINQ to SQL:
var sql = "select count(*) from [" + Variables.FinancialYearSchemaName + "].CustomerCredit where SFC_Status = 100";

Ultimately, no: most ORMs do not expect the schema to vary at runtime, so most, including EF and LINQ to SQL, do not support this scenario. One possible option would be to have different connection strings, each using a different user account that has a different default schema configured at the database, and to initialize your DB context with the connection string (or connection) that matches the required account. Then, if EF asks the RDBMS for [CustomerCredit], it will look first in that account's default schema (for example [202014].[CustomerCredit]). You should probably avoid also having a CustomerCredit table in the default [dbo] schema in that scenario, to prevent confusion. This is, however, a pretty hacky and ugly solution. But... it should work.
Alternatively, you would have to take more control over the data access, essentially writing your own SQL (presumably with a token replacement for the schema, which has problems of its own).
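For completeness: if you stay on LINQ to SQL and accept the raw-SQL route, DataContext.ExecuteQuery<T> can run the hand-built statement and still materialize the DBML-generated entities. A minimal sketch, assuming the CustomerCredit entity from the DBML and a schema name resolved at run time (e.g. the Variables.FinancialYearSchemaName value from the question):
using System.Collections.Generic;
using System.Data.Linq;
using System.Linq;

public static class CustomerCreditQueries
{
    // Runs ad-hoc SQL against whichever schema was resolved at run time and
    // lets LINQ to SQL map the rows onto the DBML-generated CustomerCredit class.
    public static List<CustomerCredit> GetActiveCredits(DataContext db, string schemaName)
    {
        string sql = "SELECT * FROM [" + schemaName + "].[CustomerCredit] " +
                     "WHERE SFC_Status = {0}";

        // {0} becomes a real SQL parameter; only the schema name is concatenated,
        // so it must come from trusted code, never from user input.
        return db.ExecuteQuery<CustomerCredit>(sql, 100).ToList();
    }
}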

That schema layout is essentially a manual partitioning of the CustomerCredit table. The best solution would be one that makes the partitioning transparent to all users; the code shouldn't have to know how the data is partitioned.
Database Solutions
The benefit of database solutions is that they are transparent, or almost transparent, to users and require minimal maintenance.
Table Partitioning
The clean solution would be to use table partitioning, making the different partitions transparent to all users. Table partitioning used to be an Enterprise-only feature but it became available in all editions since SQL Server 2016 SP1, even Express. This means it's free in all versions still in mainstream support.
The table is partitioned based on a function (e.g. a date-based function) and the partitions can be stored in separate filegroups. Whenever possible, the query optimizer checks the partition boundaries against the query conditions and reads only the partitions that contain the relevant data. E.g. in a date-partitioned table, queries that contain a date filter can search only the relevant partitions.
Partitioned views
Another option, available since at least SQL Server 2000, is to use partitioned views, essentially a UNION ALL view that combines all table partitions, e.g.:
SELECT <select_list1>
FROM [202001].[CustomerCredit]
UNION ALL
SELECT <select_list2>
FROM [202002].[CustomerCredit]
UNION ALL
...
SELECT <select_listn>
FROM Tn;
EF can map entities to views instead of tables. If the criteria for updatable views are met, the partitioned view itself will be updatable and any modifications will be made to the correct table.
The query optimizer can take advantage of CHECK constraints on the tables to search only one table at a time, similar to how partitioned tables work.
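A rough sketch of what the EF Core mapping for querying such a view might look like; the view name CustomerCredits_All, its schema and the key column are assumptions for illustration, not from the original post:
using Microsoft.EntityFrameworkCore;

public class CustomerCredit
{
    public int CustomerCreditID { get; set; }   // assumed key column
    public int SFC_Status { get; set; }
}

public class AccountingContext : DbContext
{
    public DbSet<CustomerCredit> CustomerCredits => Set<CustomerCredit>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<CustomerCredit>(b =>
        {
            // Map the entity to the partitioned view instead of a table;
            // EF will not try to create or migrate the view.
            b.ToView("CustomerCredits_All", "dbo");
            b.HasKey(c => c.CustomerCreditID);
        });
    }
}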
Code solutions
This requires raw SQL queries, and a way to identify the correct table/schema each time a change is made. It requires modifications to the application each time the table partitioning changes, whether those are code modifications, or changes in a configuration file.
In all cases, one query can only read from one table at a time.
Keep ADO.NET
One possibility is to keep using ADO.NET, replacing the table/schema name in a query template. The code will have to map to objects if needed, the same way it already did.
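A minimal sketch of that approach, assuming the schema name has already been resolved at run time; the method and parameter names are invented for illustration:
using System.Data.SqlClient;

public static class CustomerCreditData
{
    // The schema name is an identifier and cannot be parameterized, so it is
    // spliced into the query template; the filter value is a real parameter.
    public static int CountActiveCredits(string connectionString, string schemaName)
    {
        string sql = $"SELECT COUNT(*) FROM [{schemaName}].[CustomerCredit] " +
                     "WHERE SFC_Status = @status";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@status", 100);
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}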
EF Raw SQL
Another is to use EF's raw SQL features, e.g. EF Core's FromSqlRaw, to query a specific table the same way ADO.NET would. The benefit is that EF will map the query results to objects. In EF Core, the raw query can be combined with LINQ operators:
var query=$"select * from [DatabaseName].[{schemaName}].[CustomerCredit]"
var credits = context.CustomerCredits
.FromSqlRaw(query)
.Where(...)
.ToList();
Dapper
Another option is to use Dapper or another micro-ORM with an ad-hoc query, similar to ADO.NET, and map the results to objects:
var query=$"select * from [DatabaseName].[{schemaName}].[CustomerCredit] where customerID=#ID";
var credits=connection.Query<CustomerCredit>(query,new {ID=someID});

Related

Potential conflict between transferring data and '.ValueGeneratedOnAdd()'

I apologize if this is duplicative; I could find nothing directly pertaining.
The difficulty involves EF Core (v 3.1.8, if it matters), but is not specific or restricted thereto. I am doing code first, creating a number of entities, but the key point is that I am getting my initial data set from an app that I am trying to replace. My new app has a number of structural differences in every corresponding entity, but the data in the old app is still critical, so I will be transferring it to my new database. (Old db is hosted by MS SQL 2008; new db is hosted by MS SQL 2019, if it matters).
Most of the key fields are GUIDs, and the problem is that in EF Core, at the point in the future when I want to use the new app to do more data entry, I will also want the database to choose the GUID. In EF Core Fluent API parlance, that would be, for example:
modelBuilder.Entity("ReplaceOldApp.Models.Address", b =>
{
b.Property<Guid>("AddressID")
.ValueGeneratedOnAdd()
.HasColumnType("uniqueidentifier");
}
However, if I inform EF Core that I want the database to create the key, then it will create the tables such that when I try to transfer the data from the old database (whether using EF or some other means), the new database will ignore the old GUID and create a new, unrelated one. (Or at least, that's what I think will happen; I'm not ready to try it yet.) If that happens, then the data from, say, the old Person entity and its related entities (such as the Address entity implied above) will no longer be related in the new database, because all records will have shiny new GUIDs. I will have all the information, and no way to actually use it.
Obviously I can tell EF Core to inform the database that it will not be creating the GUIDs, and I can then read, unmunge and transfer the data from the old database to the new without fear of data loss (God willing). But then going forward, for any new data entry, the GUIDs will not be automatically genned. I can of course then mod my IEntityTypeConfiguration Fluent API classes for the various entities and do a second migration, re-genning the affected tables, but I'm worried that EF Core will decide that it needs to DROP the tables to accommodate such a change. (Again, I do not know for sure because I have not tried it: sorry.)
So my question is: How would you approach such a situation? Should I ignore EF and do something clever with MS SQL Studio? Should I do two migrations with a transfer in-between? Should I tell the database, even though it has been told to gen the keys, somehow to accept the old keys without changing things, perhaps via LINQ?
============== Edit:
I'm sure SSIS would work to transfer the data from old to new databases, but the learning curve appears daunting, and I am only trying to solve one problem, not gain a new career. Powershell ditto, although it may be a bit more of a hacker's tool, and as such knowledge of it might assist tweaking or help to solve a diverse set of one-time SQL Server headaches. However, again, as would you, I prefer to use what I know, or failing that, learn or learn more about a tool which promises to serve me consistently into the future.
With the very welcome new (to me) information about IDENTITY_INSERT, and information gained from Linq To Sql and identity_insert, I believe I should not use LINQ to SQL because it may assume that IDENTITY_INSERT is OFF and simply filter out the crucial GUID, failing therefore to provide it to the target server. Rather, it seems I can use C# to produce a series of generated SQL statements, and then run each one on the target server inside a TransactionScope(). Because each such insert will thereby run 'in the same connection', the state of IDENTITY_INSERT will be preserved for that entire insert transaction, and (creek don't rise) it should work.
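A rough sketch of that idea, assuming a destination table that actually has an IDENTITY column (for plain GUID keys the SET IDENTITY_INSERT statements are unnecessary and would fail); the table name and the pre-generated INSERT statements are placeholders:
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Transactions;

public static class LegacyDataLoader
{
    // Executes pre-generated INSERT statements (each carrying explicit key values)
    // on a single connection, so SET IDENTITY_INSERT stays in effect for the whole
    // load and everything commits or rolls back together.
    public static void LoadAddresses(string connectionString, IEnumerable<string> insertStatements)
    {
        using (var scope = new TransactionScope())
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            Execute(connection, "SET IDENTITY_INSERT [dbo].[Address] ON;");
            foreach (var insert in insertStatements)
            {
                Execute(connection, insert);
            }
            Execute(connection, "SET IDENTITY_INSERT [dbo].[Address] OFF;");

            scope.Complete();
        }
    }

    private static void Execute(SqlConnection connection, string sql)
    {
        using (var command = new SqlCommand(sql, connection))
        {
            command.ExecuteNonQuery();
        }
    }
}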
Again, I appreciate your answer, Randy in Marin. It has, it seems, led me to an approach that will work within the potential constraints of my context (EF Core), while allowing me to preserve the crucial existing IDENTITY information. Peace.
Not being an EF programmer, I don't know if there is an option for identity insert that you can enable for a migration. You might search the term to see if it comes up.
Our team support database migrations. We can do it a number of ways. I would not even consider EF because it's not designed for data migrations - or for database design. (And because we tend to use what we know.)
This is not the way I would do it, but it might be better than SSIS if you have not used SSIS. If the tables are in the same database or in databases on the same server, you can use T-SQL to load each table one at a time. Even if not on the same server, a linked server would allow a distributed transaction. (I avoid linked servers like the plague, but for a one time thing like a migration I would tolerate it. I would rather restore a copy of the source database to the destination server to use as a source. Distributed transactions gone wrong have forced me to reboot critical servers.)
Each table can have a 4 part name. If the server part (e.g., using a linked server name) is not present, the local instance is used. If the database part is not present, the current database is used. This is the format I assume for the "src_table" and "dst_table".
[myserver\myinstance].[mydatabase].[myschema].[mytable]
Each table is loaded with T-SQL as follows:
TRUNCATE TABLE dst_table
SET IDENTITY_INSERT dst_table ON
INSERT dst_table (...) SELECT ... FROM src_table
SET IDENTITY_INSERT dst_table OFF -- must be turned off - only 1 table can have this ON
If there are foreign keys, some tables (e.g., def tables) would need to be loaded first.
If the table does not have an IDENTITY column (EF code creates all values), you don't use the IDENTITY_INSERT stuff. It will fail if you use it and there is not an identity column. It will fail if you don't use it and try to insert into an identity column.
If there is a lot of data in a table, the transaction might be too big or slow. Inserting in batches might be called for.
If it was something to run on a schedule, I would likely create a SSIS package to do the load.
If I wanted to try something new, I would use powershell and the DBATools module cmdlets to see if extracting to csv and importing the csv would be efficient. The import cmdlet has a column mapping parameter, among many others. PowerShell could be used to do transformation, but I think this crosses over into SSIS territory.
I have dealt with migrations where the GUIDs and IDs no longer related after the move. Using queries joining the new data to the old data, we were able to fix the related values. It's likely more work to fix it after than to plan for it to be correct from the start.

What is the best way to access different databases on different engines but with a similar structure

I usually work with MySQL, but also with SQL Server, Oracle and Access; the database structure is almost the same in each. My database stores the configuration and recorded data of a SCADA application ("Supervisory Control And Data Acquisition").
Most of the tables are usually the same, but sometimes my teammates add fields or tables, or change the type of some fields.
I'm writing an application that needs to load some config parameters from the database, then load data, process it and store the new values back. It also needs to add new records.
I have a class that, regardless of the database type, returns an IDbConnection object given the correct connection parameters. With some of its methods I can specify a SQL query and get back an IDataReader or a DataSet.
Now, how should I query data from the database, analyze and recalculate it, and finally store it again?
I'm a bit scared of building a detailed object mapping because of the possibility of changed fields. A simple DataSet/DataTable/DataRow should be OK, but I'd like to use LINQ to query the extracted data in a simpler way.
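For example, a loosely typed DataTable can still be queried with LINQ to DataSet, avoiding a detailed object mapping; the table and column names below are invented for illustration:
using System;
using System.Data;
using System.Linq;

public static class RecordedDataAnalysis
{
    // Queries a loosely typed DataTable with LINQ to DataSet; requires a
    // reference to System.Data.DataSetExtensions.
    public static decimal SumOfLastDay(DataTable recordedData)
    {
        return recordedData.AsEnumerable()
            .Where(row => row.Field<DateTime>("Timestamp") > DateTime.Today.AddDays(-1))
            .Sum(row => row.Field<decimal>("Value"));
    }
}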
Finally, my database has about 60 tables, but in this application I only work with a dozen of them. I have little time to build this application, so I need a fast approach, even if it's not "very beautiful".
Thanks.
You should try an ORM that configures itself automatically according to the schema.
I have found this one. I haven't used similar things in C#, but the approach works nicely in other (dynamic) languages.
http://www.codeproject.com/Articles/117666/Kerosene-ORM
Using an ORM would most probably be the fastest. You could use NHibernate, which supports multiple databases. NHibernate does have a learning curve, so something like a micro-ORM could perhaps be easier to use. PetaPoco is a great micro-ORM and supports SQL Server, SQL Server CE, MySQL, PostgreSQL and Oracle.
These ORMs would create a mapping file for each DB you use, which needs to be updated or recreated when changes are made in the DB.
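A rough sketch of what the micro-ORM route might look like with PetaPoco; the class, table and connection string name are assumptions for illustration:
using System.Collections.Generic;

public class ConfigParameter            // maps by convention to a ConfigParameter table
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Value { get; set; }
}

public static class ConfigLoader
{
    public static List<ConfigParameter> LoadAll()
    {
        // The constructor overload taking a connection string name reads the
        // provider and connection string from the config file.
        var db = new PetaPoco.Database("ScadaDb");
        return db.Fetch<ConfigParameter>("SELECT * FROM ConfigParameter");
    }
}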

Create a database and entity model based on user input

Currently I'm adjusting a system that works with Entity Framework to connect to a SQL Server 2008 R2 database.
For the new part, the key users need to add, change and remove entities that the normal users can use. Before I build a system that saves objects with names and attributes, I wanted to check whether it is possible to create the database dynamically from the entities that the key users define (through a simplified entity designer).
I've searched a bit on the internet but didn't find anything quite like this. Maybe someone here knows something to push me in the right direction?
It sounds like you are best off actually defining tables, columns, indexes and foreign keys dynamically. If you were to use a "database of databases" schema with entities and attributes, you would be unable to index the database effectively, and queries would become extraordinarily slow and nasty.
You can query and change the database schema using SQL Server Management Objects (SMO). I have used them multiple times. They work and are quite nice to work with.
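For illustration, creating a table with SMO at run time might look roughly like this; the server, database and column definitions are placeholders that would come from the key users' entity designer:
using Microsoft.SqlServer.Management.Smo;

public static class DynamicSchemaBuilder
{
    // Creates a simple table with an identity primary key; in the real system the
    // column definitions would come from the key users' entity designer.
    public static void CreateEntityTable(string serverName, string databaseName, string tableName)
    {
        var server = new Server(serverName);
        var database = server.Databases[databaseName];

        var table = new Table(database, tableName);

        var id = new Column(table, "ID", DataType.Int)
        {
            Identity = true,
            IdentitySeed = 1,
            IdentityIncrement = 1
        };
        table.Columns.Add(id);
        table.Columns.Add(new Column(table, "Name", DataType.NVarChar(200)));

        // Primary key index on the ID column.
        var pk = new Index(table, "PK_" + tableName) { IndexKeyType = IndexKeyType.DriPrimaryKey };
        pk.IndexedColumns.Add(new IndexedColumn(pk, "ID"));
        table.Indexes.Add(pk);

        table.Create();
    }
}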
I'm not convinced that Entity Framework brings much to the table here. EF is good for expressing queries and DML on a static schema. If you were to use a dynamic schema you lose most of the benefits. Of course, some benefits remain such as entity key management and being able to use Entity SQL instead of T-SQL. On the downside you have to create all EF metadata at runtime (probably generate EDMX files or dynamic assemblies).
I think it is not worth it. I'd strongly consider building a database schema at runtime and executing queries against it using dynamically built T-SQL. It is much easier to do this than work against the system with EF.
In that sense you are back to DataTables and GridViews which was considered good style even 5 years ago. It's probably not too bad.

Entity Framework Batch Update by ID

I'm using an Entity Framework project in which I need to batch update records by ID. The ID (i.e. the primary key of a particular table) is available at runtime, and I would like to update all records like in the following query:
UPDATE EntityTable
SET [Column] = @p0
WHERE EntityID IN (1, 2, 3, [...])
The problem I'm running into is that I have around 60k IDs (at worst) that I will need to handle, and our database software (SQL Server 2008) can't handle this:
The query processor ran out of internal resources and could not
produce a query plan. This is a rare event and only expected for
extremely complex queries or queries that reference a very large
number of tables or partitions. Please simplify the query. If you
believe you have received this message in error, contact Customer
Support Services for more information.
Through Google searches I've found people using old-school DataTable and SqlDataAdapter calls to accomplish this, but I'd like to stay within the spirit of Entity Framework if possible, or raw sql if necessary. Is there any way to do this in a reasonably efficient manner?
EF doesn't directly support batch updates, so you have to use direct SQL. Your best choice is a stored procedure with a table-valued parameter containing your IDs. EF doesn't support table-valued parameters, so for that call you have to use ADO.NET directly.
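A sketch of that call from ADO.NET, assuming a matching user-defined table type and stored procedure already exist in the database; the names dbo.IdList and dbo.UpdateEntitiesByID are invented for illustration:
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class EntityBatchUpdater
{
    // Passes all IDs in one round trip as a table-valued parameter to a stored
    // procedure that performs the UPDATE with a join against the ID list.
    public static void UpdateByIds(string connectionString, int newValue, IEnumerable<int> ids)
    {
        var idTable = new DataTable();
        idTable.Columns.Add("ID", typeof(int));
        foreach (var id in ids)
            idTable.Rows.Add(id);

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.UpdateEntitiesByID", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@newValue", newValue);

            var tvp = command.Parameters.AddWithValue("@ids", idTable);
            tvp.SqlDbType = SqlDbType.Structured;
            tvp.TypeName = "dbo.IdList";

            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}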
Your current approach can only be improved by dividing your IDs into smaller groups and executing the update for each subset separately.
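A rough sketch of that batching approach; MyDbContext, the chunk size and the use of ExecuteSqlCommand are assumptions, while EntityTable and the column names follow the question:
using System.Linq;

public static class ChunkedUpdater
{
    // MyDbContext is a placeholder for your DbContext-derived type.
    public static void UpdateInChunks(MyDbContext context, int[] entityIds, int newValue)
    {
        const int chunkSize = 2000;   // small enough for the query optimizer

        for (int i = 0; i < entityIds.Length; i += chunkSize)
        {
            var chunk = entityIds.Skip(i).Take(chunkSize);

            var sql = "UPDATE EntityTable SET [Column] = @p0 WHERE EntityID IN ("
                      + string.Join(",", chunk) + ")";

            // Sends the statement directly instead of loading 60k entities
            // into the context just to change one column.
            context.Database.ExecuteSqlCommand(sql, newValue);
        }
    }
}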

Using NHibernate with an ancient database with some "dynamic" tables

I have a legacy database with a pretty evil design that I need to write some applications for. I am not allowed to touch the database design at all, seeing how this is a fragile old system held together by spit and prayers. I am of course very aware that this is not how the database should have been designed in the first place, but real life sometimes gets in the way.
For my new application I am using NHibernate (with Fluent for mappings and NHibernate LINQ for querying) and trying to Do Things Right. So there is IoC and repositories and more interfaces than I can count. However, the DB structure is giving me some headaches.
The system is very much focused around the concept of customers, and each customer lives in a campaign. These campaigns are created by one of the old applications. Each campaign in the system is defined in a table called CampaignSettings. One of the columns of this table is simply a text column called "Table", which refers to a database table that is created at the same time as the campaign entry in CampaignSettings. The name of this table is related to the name of the campaign, which can pretty much be anything the customer wants (within the constraints given by SQL Server (2000 or 2005)). In these tables the customers live.
So that is challenge #1 - I won't know the table names until runtime. And it will change from site to site - no static mapping I guess.
To make it even worse, we have challenge #2 - this campaign table is also dynamic in structure, meaning it has a certain number of columns that are always there (customer id, name, phone number, email address and other housekeeping stuff), and then there are two other sets of columns, added depending on the requirements of the customer on a case-by-case basis.
The old applications use SQL to get the column names present in the table, then add the ones it doesn't know about as "custom fields" in the application. I need to handle this.
I know I probably can't handle these challenges simply by using mapping magic, and I am prepared to do some ugly SQL in addition to the ORM goodness that I get from NHibernate (there are 20-some "static" tables in here as well which NHibernate handles beautifully) - but how?
I will create a Customer entity that I guess I can populate manually by doing direct SQL like
SELECT * FROM SomeCampaignTable WHERE id=<?>
and then going through the columns one by one and putting stuff where it belongs. Not fun, but necessary.
And then I guess to discover the structure of the table in the first place, I could run SQL like this:
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'SomeCampaignTable'
ORDER BY ORDINAL_POSITION
And again do some manual work to configure my object to handle the custom fields.
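That discovery step might look roughly like this with plain ADO.NET; the class and method names are invented, and the table name would come from CampaignSettings at run time:
using System.Collections.Generic;
using System.Data.SqlClient;

public static class CampaignSchemaReader
{
    // Returns the column names of a campaign table so the "custom field"
    // columns can be discovered at run time.
    public static List<string> GetColumns(string connectionString, string tableName)
    {
        const string sql = @"SELECT COLUMN_NAME
                             FROM INFORMATION_SCHEMA.COLUMNS
                             WHERE TABLE_NAME = @table
                             ORDER BY ORDINAL_POSITION";

        var columns = new List<string>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@table", tableName);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    columns.Add(reader.GetString(0));
            }
        }
        return columns;
    }
}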
My question is simply - how can I do this in NHibernate? Is it a simple matter of finding a way to run my own SQL, then looping through the results, or is there a more elegant way to take the pain out of it?
While I appreciate that this database design belongs in some kind of Museum of Torture somewhere, answers like "Add some views" or "Change the DB" won't help me - I will be shot if I suggest something like that.
Thanks for anything that could help save my sanity here!
You might be able to do this with NHibernate using native SQL entity queries. Forget Linq2NH - not that I would recommend Linq2NH for any serious application.
Check this page.
13.1.2. Entity queries
https://www.hibernate.org/hib_docs/nhibernate/1.2/reference/en/html/querysql.html
You could maybe do something like this:
Map your entities based on a 'fake' table to keep NHibernate happy when it compiles the mapping documents (I know you said you can't change the DB, but hopefully it's OK to create an empty table just to keep NH happy).
Then run a query like this, as per 13.1.2 above:
sess.CreateSQLQuery("SELECT tempColumn1 as mappingFileColumn1, tempColumn2 as mappingFileColumn2, tempColumn3 as mappingFileColumn3 FROM tempTableName").AddEntity(typeof(Cat));
NHibernate should stitch together the columns you've returned with the mapped entity and give you an entity of type 'Cat' with all the properties populated. I am speculating here, though; I do not know for sure if this will work, but it's the only way I can think of to use NHibernate for this, given that you don't know the tables/columns at compile time. You definitely cannot use HQL, Criteria, or Linq2NH, since you don't know the tables and columns at compile time, and HQL et al. all translate your mappings to the mapped column names to produce the underlying SQL. Native SQL queries are the only way, I think.
