I have an Entity Framework DbContext with two different DbSets.
In my view I am combining these two sets into the same view model and listing them in the same table.
I want to support table paging to be able to query only a page of records at a time sorted by a particular column. I can't see how to do this without reading all of the records from the database and then paging from memory.
For example, I want to be able to sort by date ascending since both tables have a date column. I could simply take the page size from both tables and then sort in memory but the problem comes into play when I am skipping records. I do not know how many to skip in each table since it depends on how many records are found in the other table.
Is there a way to manipulate Entity Framework to do this?
It is possible.
Combine them in the database - a UNION ALL, i.e. Concat in LINQ terms (this can be done in EF).
Project each set (select new {...}) into the same final shape.
Order by, skip and take on that combined projection (see the sketch below).
It will be crap performance-wise, but there is no way around that given you have a broken database model. The server basically has to build a temporary view of all rows for the SQL to find the first ones - that will be slow.
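In EF/LINQ terms, a minimal sketch of those three steps might look like this (TableA/TableB, Date and Title are assumptions for illustration, not names from the question):

// A common shape both tables can be projected into.
public class RowViewModel
{
    public DateTime Date { get; set; }
    public string Title { get; set; }
}

// Project both sets to one shape, concatenate (UNION ALL in SQL), then page.
var combined = context.TableA
    .Select(a => new RowViewModel { Date = a.Date, Title = a.Title })
    .Concat(context.TableB
        .Select(b => new RowViewModel { Date = b.Date, Title = b.Title }));

var page = combined
    .OrderBy(r => r.Date)        // sorting happens on the server,
    .Skip(pageIndex * pageSize)  // so the server decides how many rows
    .Take(pageSize)              // of each table end up in the page
    .ToList();

Because the skip/take is applied to the combined projection, you never have to guess how many rows to skip in each individual table.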
Your best bet is going to be to combine them with a stored procedure or view, and then map that sp/view into Entity Framework. Combining them on the client is going to kill performance - let the server do it for you; it is clearly a server-side task.
I have a webservice which tries to connect to a database of a desktop accounting application.
It has tables with the same name but different schema names, such as:
[DatabaseName].[202001].[CustomerCredit]
[DatabaseName].[202002].[CustomerCredit]
...
[DatabaseName].[202014].[CustomerCredit]
[DatabaseName].[202015].[CustomerCredit]
[DatabaseName].[202016].[CustomerCredit]
...
[DatabaseName].[2020xx].[CustomerCredit]
The schema name has the format [Year+IncrementalNumber], such as [202014], [202015], [202016], etc.
Whenever I want to query customer credit information in the database, I should fetch it from the schema with the highest number, e.g. [DatabaseName].[202016].[CustomerCredit] if 202016 is the latest schema in my db.
Note:
Creation of new schemas in the accounting application's database follows no rules; it is decided entirely by the user of the accounting application, and every instance of the application installed at a different site may have a different number of schemas.
So when I'm developing my webservice, I have no idea which schema to connect to prior to development. At run time I can find the correct schema to query, but I don't know how to fetch table information with the correct schema name in the query.
I usually create a LINQ-to-SQL DBML class and use its definitions to read information from the db, but I don't know how to handle the schema change this way.
The DBML designer manages schema names like this:
[global::System.Data.Linq.Mapping.TableAttribute(Name="[202001].CustomerCredit")]
However, since my app can only retrieve the schema name at run time, I don't know how to fix the table declaration in my special case.
It is so easy to handle in ADO.NET, but I don't know the equivalent in Linq2SQL:
select count(*) from [" + Variables.FinancialYearSchemaName + "].CustomerCredit where SFC_Status = 100;
Ultimately, no: most ORMs do not expect the schema to vary at runtime, so most - including EF and LINQ-to-SQL - do not support this scenario. One possible option would be to have different connection strings, each with a different user account, where each account has a different default schema configured at the database - and initialize your DB context with the connection string or connection that matches the required account. Then if EF asks the RDBMS for [CustomerCredit], it will look first in that account's default schema (e.g. [202014].[CustomerCredit]). You should probably avoid having a [dbo].[CustomerCredit] in that scenario, to prevent confusion. This is, however, a pretty hacky and ugly solution. But... it should work.
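A rough sketch of that idea (the connection-string naming convention and the AccountingDataContext type are assumptions for illustration):

// Hypothetical convention: one named connection string per schema, each
// using a SQL login whose default schema is that [2020xx] schema.
string schemaName = Variables.FinancialYearSchemaName; // e.g. "202016", found at run time
string connectionString = ConfigurationManager
    .ConnectionStrings["Accounting_" + schemaName].ConnectionString;

// The generated LINQ-to-SQL context accepts a connection string; the DBML
// mapping must then reference CustomerCredit *without* a hard-coded schema.
using (var db = new AccountingDataContext(connectionString))
{
    int count = db.CustomerCredits.Count(c => c.SFC_Status == 100);
}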
Alternatively, you would have to take more control over the data access, essentially writing your own SQL (presumably with a token replacement for the schema, which has problems of its own).
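For completeness, LINQ-to-SQL's closest equivalent to the ADO.NET snippet above is probably DataContext.ExecuteQuery, with the schema name spliced into the SQL text (a sketch; the usual caveats about building SQL from strings apply):

// {0} is a real query parameter; the schema name itself cannot be
// parameterized, so it is substituted into the SQL text.
int count = db.ExecuteQuery<int>(
    "select count(*) from [" + Variables.FinancialYearSchemaName +
    "].CustomerCredit where SFC_Status = {0}", 100).Single();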
That schema is essentially a manual partitioning of the CustomerCredit table. The best solution would be one that makes the partitioning transparent to all users. The code shouldn't know how the data is partitioned.
Database Solutions
The benefit of database solutions is that they are transparent or almost transparent to users and require minimal maintenance.
Table Partitioning
The clean solution would be to use table partitioning, making the different partitions transparent to all users. Table partitioning used to be an Enterprise-only feature, but it has been available in all editions since SQL Server 2016 SP1, even Express. This means it's free in all versions still in mainstream support.
The table is partitioned based on a function (e.g. a date-based function) and the partitions are stored in different filegroups. Whenever possible, the query optimizer checks the partition boundaries against the query conditions and reads only the partitions that contain relevant data. E.g. in a date-partitioned table, queries that contain a date filter search only the relevant partitions.
Partitioned views
Another option, available since at least SQL Server 2000, is to use partitioned views, essentially a UNION ALL view that combines all table partitions, e.g.:
SELECT <select_list1>
FROM [202001].[CustomerCredit]
UNION ALL
SELECT <select_list2>
FROM [202002].[CustomerCredit]
UNION ALL
...
SELECT <select_listn>
FROM [2020nn].[CustomerCredit];
EF can map entities to views instead of tables. If the criteria for updatable views are met, the partitioned view itself will be updatable and any modifications will be made to the correct table.
The query optimizer can take advantage of CHECK constraints on the tables to search only one table at a time, similar to how partitioned tables work.
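For example, in EF Core the entity could be mapped to such a view (a sketch; the view name CustomerCreditAll is an assumption):

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Map the entity to the partitioned view instead of a table.
    // If the view is updatable and SaveChanges should write through it,
    // mapping it with ToTable("CustomerCreditAll") also works, since
    // SQL Server treats an updatable view like a table.
    modelBuilder.Entity<CustomerCredit>().ToView("CustomerCreditAll");
}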
Code solutions
This requires raw SQL queries and a way to identify the correct table/schema at run time. It requires modifications to the application each time the table partitioning changes, whether those are code modifications or changes in a configuration file.
In all cases, one query can only read from one table at a time.
Keep ADO.NET
One possibility is to keep using ADO.NET, replacing the table/schema name in a query template. The code will have to map the results to objects if needed, the same way it already does.
EF Raw SQL
Another is to use EF's raw SQL features, e.g. EF Core's FromSqlRaw, to query a specific table the same way ADO.NET would. The benefit is that EF will map the query results to objects. In EF Core, the raw query can be combined with LINQ operators:
var query = $"select * from [DatabaseName].[{schemaName}].[CustomerCredit]";
var credits = context.CustomerCredits
    .FromSqlRaw(query)
    .Where(...)
    .ToList();
Dapper
Another option is to use Dapper or another micro-ORM with an ad-hoc query, similar to ADO.NET, and map the results to objects:
var query = $"select * from [DatabaseName].[{schemaName}].[CustomerCredit] where customerID=@ID";
var credits = connection.Query<CustomerCredit>(query, new { ID = someID });
Background
My backend has a database in SQL Server 2012 which has around 20 tables (the number may increase in time), and each table will initially have approx 100-1000 rows, which might increase in the future.
Now one of my colleagues developed a web application which uses this database and lets clients do CRUD and the usual business logic.
Problem
My task is to create a reporting page for this web application. What I will be doing is giving the client the ability to export all of the data for all of their deeply nested objects from SQL - from all tables or only a couple, with all columns or only a few... in Excel, PDF and other formats in the future. I might also need to query a 3rd party in my business logic to gather further information (out of context for now).
What can I do to achieve the above?
What I know
I can't think of any efficient and extendable solution, as it will involve 100s of columns and 20-odd tables. All I can think of is adding 100s of views for what I might require, but that doesn't sound practical either.
Should I look into BI or SQL reporting, or should this be done in code using an ORM like EF? Or is there any open source code already out there for such generic operations? I am totally confused.
Please note I am asking what to use, not how to use it. Hope I didn't offend anyone.
If you aren't concerned with the client having access to all your database object names, you could write up something yourself without too much effort. If you are creating a page you could query the system views to get a list of all table and column names to populate some sort of filtering (dropdowns, listbox, etc).
You can get a list of all the tables:
select object_id, name from sys.tables
You could get a list of all columns per table:
select object_id, name from sys.columns
object_id is the common key between the views.
Then you could write some dynamic SQL based on the export requirements if you plan to export through SQL.
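A rough sketch of what that dynamic SQL could look like (using Dapper for brevity; plain ADO.NET works the same way, and all names here are illustrative):

// tableName is assumed to come from the sys.tables list above, and the
// requested column names are checked against sys.columns before being
// spliced into the SQL - this guards against injection.
var validColumns = new HashSet<string>(connection.Query<string>(
    @"select c.name
      from sys.columns c
      join sys.tables t on t.object_id = c.object_id
      where t.name = @table",
    new { table = tableName }));

var exportColumns = requestedColumns.Where(validColumns.Contains).ToList();
string sql = $"select [{string.Join("], [", exportColumns)}] from [{tableName}]";

var rows = connection.Query(sql); // dynamic rows, ready for the Excel/PDF exporter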
I use SQL Server and Entity Framework as ORM.
Currently I have a table Product which contains all products of any kind. The different kinds of products possess different attributes.
For example:
All products of kind TV have attributes title, resolution and contrast
Whereas all products of kind Car have attributes like model and horsepower
Based on this scenario I created a table called Attribute which contains all the attributes of a product.
Now to fetch a product from the database, I always have to join in all of its attributes.
To insert a product, I have to insert all the attributes one by one, as individual rows.
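To illustrate, fetching one product under this design looks roughly like this (a sketch; the Product/Attribute entity names and the navigation collection are assumptions):

// Every product load joins in one row per attribute (the EAV pattern).
var product = context.Products
    .Include(p => p.Attributes)
    .First(p => p.ProductID == productId);

// Values come back as generic name/value rows instead of typed columns:
var horsepower = product.Attributes
    .Single(a => a.Name == "horsepower").Value;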
The application is not just a shop or anything like it. It should be possible to add/remove an attribute to/from a kind of product on the fly without changing the db.
But my questions to you are still:
Is this a bad design?
Is there another way of doing it?
Will my solution slow down significantly? (e.g. an insert takes several seconds, assuming the product has hundreds of attributes...)
Update
The problem is that my application is really complex. There are a lot of huge algorithms. The software is used for statistical purposes.
One problem, for example, is the following: in an algorithm table I'm storing which attributes are used for filters. Say an administrator wants to filter all cars that have less than 100 horsepower. The filters are dynamic, which means I have a filter table which stores the filter type (lessThan) and the attribute (horsepower). How can I keep this flexibility with the suggested approaches (with "hardcoded" columns)?
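Concretely, applying such a stored filter against the attribute rows might look like this (a sketch; the filter fields and the NumericValue column are assumptions - lessThan only works if the attribute value is stored as, or convertible to, a number):

// filter.AttributeName == "horsepower", filter.Type == "lessThan", filter.Threshold == 100
IQueryable<Product> query = context.Products;
if (filter.Type == "lessThan")
{
    query = query.Where(p => p.Attributes.Any(a =>
        a.Name == filter.AttributeName &&
        a.NumericValue < filter.Threshold));
}
var matchingCars = query.ToList();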
There is a thing about EF that I don't think everybody is aware of when designing the relations.
When you query something, EF (at least <= 4) wants to create a single SELECT for that query.
What that implies is that if you have entity A with a one-to-many relationship to entity B (say Item to Attributes), then EF joins the two together so that there is a returned row for every dependent B of each A. If A has many properties, multiple dependencies, or - even worse - if B has sub-dependencies of its own, then the returned table gets quite massive, since all of A's properties are copied into every row of a dependent B. Over time, as your entity models grow in complexity, this can turn into a real performance problem.
EF only includes the Bs if you explicitly tell it to eager-load the dependencies with Includes. If the Includes are omitted, your stuff will initially load faster, but once you access the attributes they will be lazy-loaded by EF. This is known as the SELECT N+1 problem (loading N As triggers one extra lazy query per A for its Bs, which can be a huge overhead).
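Both failure modes in a sketch (Items/Attributes are the names used above; the lambda form of Include is from later EF versions - the EF 4 era used string includes):

// Eager loading: one big joined SELECT; the item's columns are repeated
// in every returned row, once per attribute.
var itemsEager = context.Items
    .Include(i => i.Attributes)
    .ToList();

// Lazy loading: one SELECT for the items...
var itemsLazy = context.Items.ToList();
foreach (var item in itemsLazy)
{
    // ...plus one more SELECT per item the first time its attributes
    // are touched - the SELECT N+1 problem.
    var attributeCount = item.Attributes.Count;
}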
While this is not a straight answer to your question, it is something to consider when designing your tables.
Also note that EF supports several inheritance-mapping strategies. One strategy (table-per-type) is to have a common table that is automatically joined with the sub-entity tables. The alternative (table-per-hierarchy), which typically performs better but is harder to upgrade, is to have one table with a super-set of all properties of all sub-classes.
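For what it's worth, in current EF Core (well after this answer was written) those two strategies are chosen explicitly like this:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Table-per-type: a common Product table, automatically joined
    // with one table per sub-entity.
    modelBuilder.Entity<Product>().UseTptMappingStrategy();

    // ...or table-per-hierarchy: one wide table holding the super-set
    // of all sub-class properties plus a discriminator (EF Core's default).
    // modelBuilder.Entity<Product>().UseTphMappingStrategy();
}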
More (over) generalized database design considerations:
The devil is in the details. You can make a whole career out of making good database design choices. There are no silver-bullet database patterns.
EF comes with a lot of limitations. This is the price of the convenience. If the model suits EF well, then EF is quite good, but do consider more flexible alternatives like NHibernate. Sometimes even plain old data tables with views and stored procedures are to be preferred.
EF is not efficient if your model has a lot of small dependents (like a ton of attributes on an item table). It will result in either a monster query and return table, or the SELECT N+1 problem. You can write some tricky multi-part LINQ queries to somewhat compensate, but it is tricky.
SQL's strength is in integrity and reporting, which works best for rather rigid data models.
Depending on the details, your model looks like a great candidate for a NoSQL backend, like RavenDB or MongoDB. NoSQL is much better for dynamic data models and scales really well.
I am developing a C# application working with millions of records retrieved from a relational database (SQL Server). My main table "Positions" contains the following columns:
PositionID, PortfolioCode, SecurityAccount, Custodian, Quantity
Users must be able to retrieve Quantities consolidated by some predefined set of columns, e.g. {PortfolioCode, SecurityAccount}, {PortfolioCode, Custodian}
First, I simply used dynamic queries in my application code but, as the database grew, the queries became slower.
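Such a consolidation query might look like this (a sketch in EF/LINQ terms, assuming a Positions DbSet mapped to the table above):

var totals = context.Positions
    .GroupBy(p => new { p.PortfolioCode, p.SecurityAccount })
    .Select(g => new
    {
        g.Key.PortfolioCode,
        g.Key.SecurityAccount,
        TotalQuantity = g.Sum(p => p.Quantity)  // consolidated quantity per group
    })
    .ToList();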
I wonder if it would be a good idea to add another table that will contain the consolidated quantities. I guess it depends on the distribution of those groups?
Besides, how to synchronize the source table with the consolidated one?
In SQL Server you could use indexed views to do this; they keep the aggregates synchronised with the underlying table, but slow down inserts to that table:
http://technet.microsoft.com/en-us/library/ms191432.aspx
If it's purely a count of grouped rows in a single table, would standard indexing not suffice here? More info on your structure would be useful.
Edit: Also, it sounds a little like you're using your OLTP server as a reporting server? If so, have you considered whether a data warehouse and an ETL process might be appropriate?
I have a legacy database with a pretty evil design that I need to write some applications for. I am not allowed to touch the database design at all, seeing how this is a fragile old system held together by spit and prayers. I am of course very aware that this is not how the database should have been designed in the first place, but real life sometimes gets in the way...
For my new application I am using NHibernate (with Fluent for mappings and NHibernate LINQ for querying) and trying to Do Things Right. So there is IoC and repositories and more interfaces than I can count. However, the DB structure is giving me some headaches.
The system is very much focused around the concept of customers, and each customer lives in a campaign. These campaigns are created by one of the old applications. Each campaign in the system is defined in a table called CampaignSettings. One of the columns of this table is simply a text column called "Table", which refers to a database table that is created at the same time as the campaign entry in CampaignSettings. The name of this table is related to the name of the campaign, which can pretty much be anything the customer wants (within the constraints given by SQL Server (2000 or 2005)). In these tables the customers live.
So that is challenge #1 - I won't know the table names until runtime. And it will change from site to site - no static mapping I guess.
To make it even worse, we have challenge #2 - this campaign table is also dynamic in structure, meaning it has a certain number of columns that are always there (customer id, name, phone number, email address and other housekeeping stuff), and then there are two other sets of columns, added depending on the requirements of the customer on a case-by-case basis.
The old applications use SQL to get the column names present in the table, then add the ones they don't know about as "custom fields" in the application. I need to handle this.
I know I probably can't handle these challenges simply by using mapping magic, and I am prepared to do some ugly SQL in addition to the ORM goodness that I get from NHibernate (there are 20-some "static" tables in here as well which NHibernate handles beautifully) - but how?
I will create a Customer entity that I guess I can populate manually by doing direct SQL like
SELECT * FROM SomeCampaignTable WHERE id=<?>
and then going through the columns one by one and putting stuff where it belongs. Not fun, but necessary.
And then I guess to discover the structure of the table in the first place, I could run SQL like this:
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'SomeCampaignTable'
ORDER BY ORDINAL_POSITION
And again do some manual work to configure my object to handle the custom fields.
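In plain ADO.NET, that discovery step could look like this (a sketch; 'connection' is assumed to be an open SqlConnection, and campaignTableName comes from CampaignSettings at run time):

var customColumns = new List<string>();
using (var cmd = new SqlCommand(
    @"SELECT COLUMN_NAME
      FROM INFORMATION_SCHEMA.COLUMNS
      WHERE TABLE_NAME = @table
      ORDER BY ORDINAL_POSITION", connection))
{
    cmd.Parameters.AddWithValue("@table", campaignTableName);
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
            customColumns.Add(reader.GetString(0));
    }
}
// Anything not in the fixed 'housekeeping' column set becomes a custom field.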
My question is simply - how can I do this in NHibernate? Is it a simple matter of finding a way to run my own SQL, then looping through the results, or is there a more elegant way to take the pain out of it?
While I appreciate that this database design belongs in some kind of Museum of Torture somewhere, answers like "Add some views" or "Change the DB" won't help me - I will be shot if I suggest something like that.
Thanks for anything that could help save my sanity here!
You might be able to do this with NHibernate's Native SQL Entity Queries. Forget Linq2NH - not that I would recommend Linq2NH for any serious application anyway.
Check this page.
13.1.2. Entity queries
https://www.hibernate.org/hib_docs/nhibernate/1.2/reference/en/html/querysql.html
You could maybe do something like this:
Map your entities based on a 'fake' table to keep NHibernate happy when it compiles the mapping documents (I know you said you can't change the DB, but hopefully it's OK to make an empty table just to keep NH happy).
Then run a query like this, as per 13.1.2 above:
sess.CreateSQLQuery("SELECT tempColumn1 as mappingFileColumn1, tempColumn2 as mappingFileColumn2, tempColumn3 as mappingFileColumn3 FROM tempTableName")
    .AddEntity(typeof(Cat));
NHibernate should stitch together the columns you've returned with the mapped entity and give you an entity of type Cat with all the properties populated. I am speculating here though; I do not know for sure if this will work. It's the only way I can think of to use NHibernate for this, given you don't know the tables/columns at compile time. You definitely cannot use HQL, Criteria or Linq2NH, since you don't know the tables and columns at compile time, and HQL et al. all translate your mappings to the mapped column names to produce the underlying SQL. Native SQL queries are the only way, I think.
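Adapted to the question's Customer entity, the idea would look roughly like this (a sketch; the table name is resolved at run time, the column aliases must match the fake-table mapping, and all names are assumptions):

// campaignTableName comes from the CampaignSettings "Table" column at run time.
string sql =
    $"SELECT id AS Id, name AS Name, phone AS Phone " +
    $"FROM [{campaignTableName}] WHERE id = :id";

var customer = session.CreateSQLQuery(sql)
    .AddEntity(typeof(Customer))   // hydrate the mapped Customer entity
    .SetInt32("id", customerId)
    .UniqueResult<Customer>();

The custom-field columns discovered via INFORMATION_SCHEMA would still need to be read separately (e.g. with a scalar query), since they are not part of the static mapping.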