Best practice when querying restored databases - C#

I am developing an application that diffs an existing database against a backup file (.bak). When connecting to and retrieving data from SQL Server, I generally prefer LINQ, since I find it easier to build queries with.
In this case I do not see how I can do that. My current procedure is to build a SQL string that restores the database from the .bak file, and then to retrieve whatever data differs from the current database.
All my queries are strings into which I format the database name and schema. These strings become quite large and I find them very cluttered (I like clean code).
The structure of the databases is always the same and never changes between the backups and the live database. New backups arrive every day and have to be checked against, so adding a new connection file each time there is a new backup isn't really an option.
Is there any way I can restore the database using LINQ, and then retrieve the data using LINQ queries? Or am I doing it the way I should?

You can only use LINQ for queries, so you cannot use it to restore a database.
Also, you can't mix different databases in the same LINQ query. You can, however, materialize the results of each per-database query and run a third, in-memory LINQ query over them, as you can see in this answer.
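A minimal sketch of that split, assuming hypothetical connection strings, a MyDataContext LINQ-to-SQL context, and a Customers table: the RESTORE runs as plain T-SQL over ADO.NET, and LINQ takes over once the restored copy exists.
using System.Data.SqlClient;
using System.Linq;

// Restore the backup under a scratch name; LINQ cannot do this part.
// Database name and .bak path are placeholders. (A real restore to a
// new name usually also needs WITH MOVE to relocate the files.)
using (var conn = new SqlConnection(masterConnectionString))
using (var cmd = new SqlCommand(
    "RESTORE DATABASE [RestoredDb] FROM DISK = @bak WITH REPLACE", conn))
{
    cmd.Parameters.AddWithValue("@bak", @"C:\Backups\Latest.bak");
    cmd.CommandTimeout = 0;   // restores can run long
    conn.Open();
    cmd.ExecuteNonQuery();
}

// Identical structure on both sides, so one context type serves both;
// only the connection string differs.
var live     = new MyDataContext(liveConnectionString);
var restored = new MyDataContext(restoredConnectionString);

// Materialize each side, then diff in memory with LINQ-to-Objects.
// Except() needs Customer to override Equals/GetHashCode (or to be given
// an IEqualityComparer) to compare by value rather than by reference.
var changed = restored.Customers.ToList()
    .Except(live.Customers.ToList())
    .ToList();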

Related

linq2sql C#: How to query from a table with changing schema name

I have a webservice which tries to connect to the database of a desktop accounting application.
It has tables with the same name but different schema names, such as:
[DatabaseName].[202001].[CustomerCredit]
[DatabaseName].[202002].[CustomerCredit]
...
[DatabaseName].[202014].[CustomerCredit]
[DatabaseName].[202015].[CustomerCredit]
[DatabaseName].[202016].[CustomerCredit]
...
[DatabaseName].[2020xx].[CustomerCredit]
The schema name has the format [Year + IncrementalNumber], e.g. [202014], [202015], [202016], etc.
Whenever I want to query customer credit information, I should fetch it from the schema with the biggest number, e.g. from [DatabaseName].[202016].[CustomerCredit] if [202016] is the latest schema in my db.
Note:
The creation of new schemas in the accounting application's database follows no rules; it is decided entirely by the user of the accounting application, and every instance installed in a different place may have a different number of schemas.
So when developing my webservice I have no way of knowing, prior to development, which schema to connect to. At run time I can find the correct schema to query, but I don't know how to fetch table data with the correct schema name in the query.
I usually create a LINQ-to-SQL .dbml class and use its definitions to read information from the db, but I don't know how to handle the schema change this way.
The DBML designer manages schema names like this:
[global::System.Data.Linq.Mapping.TableAttribute(Name="[202001].CustomerCredit")]
However, since my app can only discover the schema name at run time, I don't know how to fix the table declaration in my special case.
This is easy to handle in ADO.NET, but I don't know its equivalent in LINQ-to-SQL:
var sql = "select count(*) from [" + Variables.FinancialYearSchemaName + "].CustomerCredit where SFC_Status = 100;";
Ultimately, no: most ORMs do not expect the schema to vary at runtime, so most - including EF and LINQ-to-SQL - do not support this scenario. One possible option would be to have different connection strings, each with a different user account, where each account has a different default schema configured at the database - and to initialize your DB context with the connection string or connection that matches the required account. Then, when EF asks the RDBMS for [CustomerCredit], the server will look first in that account's default schema ([202014].[CustomerCredit]). You should probably avoid having a [dbo].[CustomerCredit] in that scenario, to prevent confusion. This is, however, a pretty hacky and ugly solution. But... it should work.
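A rough sketch of that default-schema approach, with invented login names and a hypothetical connection-string helper; the point is only that the server resolves the unqualified table name against the connecting user's default schema.
// One-time setup per financial-year schema, run as T-SQL (names invented):
//   CREATE LOGIN [credit_202014] WITH PASSWORD = '...';
//   CREATE USER  [credit_202014] FOR LOGIN [credit_202014]
//       WITH DEFAULT_SCHEMA = [202014];

// At run time, pick the connection string whose account defaults to the
// schema discovered earlier, and hand it to the ordinary context.
string connectionString = BuildConnectionStringFor(schemaName); // hypothetical helper
using (var context = new AccountingDataContext(connectionString))
{
    // The generated mapping must name the table WITHOUT a schema
    // (Name="CustomerCredit"), so the server resolves it via the
    // user's default schema.
    int count = context.CustomerCredits.Count(c => c.SFC_Status == 100);
}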
Alternatively, you would have to take more control over the data access, essentially writing your own SQL (presumably with a token replacement for the schema, which has problems of its own).
That schema layout is essentially a manual partitioning of the CustomerCredit table. The best solution would be one that makes the partitioning transparent to all users. The code shouldn't know how the data is partitioned.
Database Solutions
The benefit of database solutions is that they are transparent, or almost transparent, to users and require minimal maintenance.
Table Partitioning
The clean solution would be to use table partitioning, making the different partitions transparent to all users. Table partitioning used to be an Enterprise-only feature, but it has been available in all editions since SQL Server 2016 SP1, even Express. This means it's free in all versions still in mainstream support.
The table is partitioned based on a function (e.g. a date-based function) and stored in different files. Whenever possible, the query optimizer checks the partition boundaries against the query conditions and reads only the files that contain the relevant data. E.g. in a date-partitioned table, queries that contain a date filter search only the relevant partitions.
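For orientation, a sketch of the setup this implies, with invented boundary values and column names; the per-schema tables collapse into one table keyed by a Period column. The DDL could equally be run once from SSMS instead of from code:
using System.Data.SqlClient;

// One-time DDL, sent from C# only for illustration. New periods are
// added later with ALTER PARTITION FUNCTION ... SPLIT RANGE.
const string ddl = @"
CREATE PARTITION FUNCTION PF_Period (int)
    AS RANGE RIGHT FOR VALUES (202001, 202002, 202003);
CREATE PARTITION SCHEME PS_Period
    AS PARTITION PF_Period ALL TO ([PRIMARY]);
CREATE TABLE dbo.CustomerCredit (
    Period     int NOT NULL,
    CustomerID int NOT NULL,
    SFC_Status int NOT NULL
    -- remaining columns elided
) ON PS_Period (Period);";

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(ddl, conn))
{
    conn.Open();
    cmd.ExecuteNonQuery();
}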
Partitioned views
Another option, available since at least SQL Server 2000, is to use partitioned views: essentially a UNION ALL view that combines all table partitions, e.g.:
SELECT <select_list1>
FROM [202001].[CustomerCredit]
UNION ALL
SELECT <select_list2>
FROM [202002].[CustomerCredit]
UNION ALL
...
SELECT <select_listn>
FROM [2020nn].[CustomerCredit];
EF can map entities to views instead of tables. If the criteria for updatable views are met, the partitioned view itself will be updatable and any modifications will be made to the correct table.
The query optimizer can take advantage of CHECK constraints on the tables to search only one table at a time, similar to how partitioned tables work.
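If you're on EF Core, pointing the entity at such a view is a one-liner. "CustomerCreditAll" is a made-up view name here; reads go through the view, while writes depend on the view meeting the updatable-view criteria:
// In the DbContext; maps the entity to the partitioned view.
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<CustomerCredit>().ToView("CustomerCreditAll");
}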
Code solutions
This requires raw SQL queries, plus a way to identify the correct table/schema each time a query is made. It also requires modifying the application each time the table partitioning changes, whether through code modifications or through changes in a configuration file.
In all cases, one query can only read from one table at a time.
Keep ADO.NET
One possibility is to keep using ADO.NET, replacing the table/schema name in a query template. The code will have to map the results to objects if needed, the same way it already does.
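A minimal sketch, assuming the schema name was discovered earlier. Identifiers cannot be passed as SQL parameters, so the name is spliced into the template after a shape check; values still go through parameters:
using System;
using System.Data.SqlClient;
using System.Text.RegularExpressions;

// The schema name ends up in the SQL text itself, so only allow the
// expected six-digit [YearNN] shape before splicing it in.
if (!Regex.IsMatch(schemaName, @"^\d{6}$"))
    throw new ArgumentException("Unexpected schema name", nameof(schemaName));

var sql = $"SELECT COUNT(*) FROM [{schemaName}].[CustomerCredit] WHERE SFC_Status = @status";
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(sql, conn))
{
    cmd.Parameters.AddWithValue("@status", 100);
    conn.Open();
    int count = (int)cmd.ExecuteScalar();
}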
EF Raw SQL
Another option is to use EF's raw SQL features, e.g. EF Core's FromSqlRaw, to query a specific table the same way ADO.NET would. The benefit is that EF will map the query results to objects. In EF Core, the raw query can be combined with LINQ operators:
var query=$"select * from [DatabaseName].[{schemaName}].[CustomerCredit]"
var credits = context.CustomerCredits
.FromSqlRaw(query)
.Where(...)
.ToList();
Dapper
Another option is to use Dapper or another micro-ORM with an ad-hoc query, similar to ADO.NET, and map the results to objects:
var query=$"select * from [DatabaseName].[{schemaName}].[CustomerCredit] where customerID=#ID";
var credits=connection.Query<CustomerCredit>(query,new {ID=someID});
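Note that Dapper, like the options above, parameterizes only values (the @ID here), never identifiers; the schema name still has to be spliced into the SQL text, so the same validate-before-splicing caveat applies.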

Complicated queries with LINQ to SQL

I've recently studied some LINQ sources and decided to use it in the project I'm working on. Everything is almost clear except for one thing.
I'm building complicated reports that draw on several tables. Earlier I used stored procedures for this purpose: I formed several intermediate result sets, stored them in temporary tables, and then joined them together using a series of two-table joins.
Trouble is: LINQ doesn't allow the creation of temporary tables. I know that complicated queries are built up in LINQ in a "cascade" way, but if I do it that way, what am I going to see in DataContext.Log in the end? I assume it will be one really huge query that is impossible to understand and use for debugging. Am I right? If I am, how do I find a workaround for this? DataLoadOptions and LoadWith won't do, because I am processing all the data at once, and using them would lead to an avalanche of queries.
Thanks in advance
LINQ definitely lets you create the equivalent of temporary tables: any materialized sequence will do (a List<T> being a good example). So if you need a temp table, you just do this...
var tempTable1 = [LINQ query goes here].ToList();
and voila, you have a temp table. If you don't like anonymous types, you can create classes and use a List<TableClass> instead.
From this point, you can use the new variable as something to select from while running further LINQ queries. If you need it to persist longer, you can store it in Session or pass the variable around.
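A sketch with made-up report tables: each intermediate result is materialized with ToList() (one database round trip each) and then joined in memory, exactly as the temp tables were.
// Stage 1: pull each intermediate result set; each ToList() hits the
// database once and logs one readable query in DataContext.Log.
var salesByRep = (from s in db.Sales
                  group s by s.RepId into g
                  select new { RepId = g.Key, Total = g.Sum(x => x.Amount) })
                 .ToList();
var reps = db.Reps.ToList();

// Stage 2: join the "temp tables" in memory with LINQ-to-Objects.
var report = from r in reps
             join t in salesByRep on r.Id equals t.RepId
             select new { r.Name, t.Total };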

What is best way access different database on different engine but with similar structure

I usually work with MySQL, but also with SQL Server, Oracle and Access; the database structure is almost the same in each. My database stores the configuration and recorded data of a SCADA application ("Supervisory Control And Data Acquisition").
Most of the tables are usually the same, but sometimes my teammates add fields or tables, or change some field types.
I'm writing an application that needs to load some config parameters from the db, then load data, process it, and store the new values back in the db. It also needs to add new records.
I have a class that, independently of the db type, given the correct connection params, returns an IDbConnection object. With some methods I can specify a SQL query and get back an IDataReader or a DataSet.
Now, how should I query data from the db, analyze and recalculate it, and finally store it again?
I'm a bit scared of building a detailed object mapping because of the possibility of changed fields. A simple DataSet/DataTable/DataRow should be OK, but I'd like to use LINQ to query the extracted data in a simpler way.
Finally, my db has about 60 tables, but in this application I work with only a dozen of them. I have only a little time to build this application, so I need a fast way, even if it's not "very beautiful".
Thanks.
You should try an ORM that configures itself automatically according to the schema.
I have found this one. I haven't used similar things in C#, but they work nicely in other (dynamic) languages.
http://www.codeproject.com/Articles/117666/Kerosene-ORM
Using an ORM would most probably be the fastest. You could use NHibernate, which supports multiple databases. NHibernate does have a learning curve, so something like a micro-ORM could perhaps be easier to use. PetaPoco is a great micro-ORM and supports SQL Server, SQL Server CE, MySQL, PostgreSQL and Oracle.
These ORMs would create a mapping file for each DB you use, which needs to be updated or recreated when changes are made in the DB.
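If PetaPoco is an option, here is a sketch of the load-process-store round trip the question describes. The table, column and class names are invented; PetaPoco picks the right SQL dialect from the providerName of the named connection string:
// "ScadaDb" is a hypothetical <connectionStrings> entry in app.config.
var db = new PetaPoco.Database("ScadaDb");

// Load config rows into POCOs, recalculate, and write the values back.
var parameters = db.Fetch<ConfigParam>("SELECT * FROM ConfigParam");
foreach (var p in parameters)
{
    p.Value = Recalculate(p.Value);    // your own processing goes here
    db.Update("ConfigParam", "Id", p); // table name, PK column, poco
}

// Adding a new record works the same way.
db.Insert("RecordedData", "Id", new RecordedData { /* new values */ });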

Should I do the Where clause in SQL or in LINQ?

I have a method that can pass in a where clause to a query. This method then returns a DataSet.
What is faster and more efficient? Passing the where clause through so that SQL Server sends back less data (but does more work), or letting the web server handle the filtering through LINQ?
Edit:
This assumes that the SQL Server is more powerful than the web server (as should probably be the case).
Are you using straight-up ADO.NET to perform your data access? If so, then yes - use a WHERE clause in your SQL and limit the amount of data sent back to your application.
SQL Server is efficient at this: you can design indexes to help it access the data, and you transmit less data back to your client application.
Imagine you have 20,000 rows in a table but are only interested in 100 of them. Of course it is much more efficient to grab only the 100 rows at the source and send them back, rather than the whole lot, which you then filter in your web application.
You have tagged linq-to-sql; if that's the case, then using a Where clause in your LINQ statement will generate the WHERE clause on the SQL Server.
But the overall rule of thumb is: just get the data you are interested in. It's less data over the wire, the query will generally run faster (as long as it's optimised via indexes etc.), and your client application has less work to do - it's already got only the data it's interested in.
SQL Server is good at filtering data; in fact, that's what it's built for, so always make use of that. If you filter in C#, you won't be able to make use of any indexes you have on your tables. It's going to be far less efficient.
Selecting all rows only to discard many/most of them is wasteful, and it will definitely show in the performance.
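To make the difference concrete, here is a sketch against a hypothetical Orders table in a LINQ-to-SQL context:
// Translated to SQL: the WHERE clause runs on the server, and only
// matching rows come over the wire.
var recent = db.Orders
    .Where(o => o.OrderDate >= cutoff)
    .ToList();

// Filtered on the web server: AsEnumerable() switches to LINQ-to-Objects,
// so ALL rows are fetched first and then filtered in C#.
var recentClientSide = db.Orders
    .AsEnumerable()
    .Where(o => o.OrderDate >= cutoff)
    .ToList();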
If you are not using any sort of ORM, then put the WHERE condition at the database level, as filtering should happen at the database level.
But if you're using an ORM like Entity Framework or LINQ to SQL, then from a performance point of view it is the same, as your LINQ Where clause will ultimately be translated into a SQL WHERE clause, as long as you apply Where to an IQueryable.
From the point of view of efficiency, it is SQL Server that should do the job. As long as it does not require multiple database calls, it is always the better solution to use SQL Server. But if you already have a DataSet from the database, you can filter it with LINQ.
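For that last case, a minimal sketch of filtering an already-filled DataSet with LINQ-to-Objects (it needs a reference to System.Data.DataSetExtensions; the table and column names are placeholders):
using System.Data;
using System.Linq;

// dataSet was already returned by the data-access method.
var bigOrders = dataSet.Tables["Orders"].AsEnumerable()
    .Where(row => row.Field<decimal>("Total") > 1000m)
    .ToList();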

A Better DataTable

I have an application that uses DataTables to perform grouping, filtering and aggregation of data. I want to replace the DataTables with my own data structures, so we avoid the unnecessary overhead of using DataTables. So my question is: can LINQ perform the grouping, filtering and aggregation of my data, and if so, is the performance comparable to DataTables, or should I just hunker down and write my own algorithms?
Thanks
Dan R.
Unless you go for simple classes (POCO etc), your own implementation is likely to have nearly as much overhead as DataTable. Personally, I'd look more at using tools like LINQ-to-SQL, Entity Framework, etc. Then you can use either LINQ-to-Objects against local data, or the provider-specific implementation for complex database queries without pulling all the data to the client.
LINQ-to-Objects can do all the things you mention, but it involves having all the data in memory. If you have non-trivial data, a database is recommended. SQL Server Express Edition would be a good starting point if you look at LINQ-to-SQL or Entity Framework.
Edited re comment:
Regular TSQL commands are fine and dandy, but you asked about the difference... the biggest being that LINQ-to-SQL will provide the entire DAL for you, which is a huge time saver, as well as making it possible to get a lot more compile-time safety. But it also allows you to use the same approach to look at your local objects and your database - for example, the following is valid C# 3.0 (except for [someDataSource], see below):
var qry = from row in [someDataSource]
          group row by row.Category into grp
          select new { Category = grp.Key, Count = grp.Count(),
                       TotalValue = grp.Sum(x => x.Value) };
foreach (var x in qry) {
    Console.WriteLine("{0}, {1}, {2}", x.Category, x.Count, x.TotalValue);
}
You'd be better off letting your database handle grouping, filtering and aggregation. DataTables are actually relatively good at this sort of thing (their bad reputation seems to come primarily from inappropriate usage), but not as good as an actual database. Moreover, without a lot of work on your part, I would put my money on the DataTable's having better performance than your homegrown data structure.
Why not use a local database like SQL Server CE or embedded Firebird (or even MS Access!)? Store the data in the local database, do the processing using simple SQL queries, and pull the data back. Much simpler and likely less overhead, plus you don't have to write all the logic for grouping/aggregates etc., as the database systems already have that logic built in, debugged and working.
Yes, you can use LINQ to do all those things using your custom objects.
And I've noticed a lot of people suggest that you do this type of stuff in the database... but you never indicated where the data was coming from.
If the data is coming from the database then at the very least the filtering should probably happen there, unless you are doing something specialized (e.g. working from a cached set of data). And even then, if you are working with a significant amount of cached data, you might do well to put that data into an embedded database like SQLite, as someone else has already mentioned.
