Get min, avg and max values from suppliers table - c#

I have a SQL query that returns a table with 4 columns to my C# project (multiple records per supplier):
Suppliers | Date before | Date after | Dates diff
What do I need? The minimum, average and maximum number of days per supplier for each of the three value columns (Date before, Date after and Dates diff).
What's an elegant way of achieving this? Either C# or SQL is a viable option.
Note: I did it in C# by building a DataTable from the query, then building a list of suppliers. For each supplier in the list I ran through every row in the DataTable, and when the supplier in the list matched the one in the row I did the math and added the results to a new DataTable. Newbie style.

I guess that depends on what you mean by elegant...
To some people that means updating your SQL query to use MIN(), MAX(), AVG(), etc.
To others it would mean creating a POCO, adding it to an enumerable of some sort and using LINQ to get what you want.
Personally I believe both are viable, and to me the decision really depends on too many other factors to mention. In short: does it make sense to perform these operations locally in your program, or to rely on the database to do it for you?
There isn't really a single good answer to be given here.
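For illustration, a minimal sketch of the LINQ route, assuming the query rows have been mapped to a simple POCO (all class, property and variable names here are made up for the example):

using System.Linq;

// Hypothetical POCO for one row of the query result.
public class SupplierRow
{
    public string Supplier { get; set; }
    public int DaysBefore { get; set; }
    public int DaysAfter { get; set; }
    public int DaysDiff { get; set; }
}

// rows is assumed to be an IEnumerable<SupplierRow> built from the DataTable/query.
var statsPerSupplier = rows
    .GroupBy(r => r.Supplier)
    .Select(g => new
    {
        Supplier = g.Key,
        MinDiff = g.Min(r => r.DaysDiff),
        AvgDiff = g.Average(r => r.DaysDiff),
        MaxDiff = g.Max(r => r.DaysDiff)
        // repeat Min/Average/Max for DaysBefore and DaysAfter as needed
    });

The SQL route is the same idea server-side: SELECT Supplier, MIN(...), AVG(...), MAX(...) FROM ... GROUP BY Supplier.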

Related

In sql, what is the best way to store several ordered vectors/lists?

I have a few integer lists of different lengths whose order I want to preserve.
I will be using them as alternatives to each other, never two at the same time.
The number of lists might grow in the future, although I expect it never to reach a value like 50 or so. I might want to insert one value within a list. These lists are relatively seldom modified, and using a manual editor like MS SQL Server Management Studio for this purpose is fine. As far as I can see at the moment, these lists will rarely be used directly in queries; there will be some C# in between.
For storing one ordered list, a linked (or doubly-linked) list seems appropriate. But if I have to store several ordered lists, it seems to me that I will have to add one table for each of them. The same holds if I use an indexed list. On the other hand, I could also store all these lists in one table by transforming them into strings (one comma-separated string per list) that I would then parse in my C# program.
In SQL, what is the best way to store several ordered vectors/lists?
Relational databases like SQL Server typically don't have "arrays" or "lists" - if you need to store more than one value, they have tables for that. And no, it's in no way cumbersome for a database like SQL Server to have even thousands of tables, if you really must. After all, handling tables is the core competency of a relational database - it should be good at it! (And SQL Server is, as are many other mature RDBMS systems.)
And I would strongly recommend not using comma-separated strings: first of all they violate even the first normal form of database design, and as soon as you do need to join those values against something else, you're in trouble. Don't do this - there's absolutely no need for it, and you'll just make your developer life miserable sometime in the future. If you have the opportunity to avoid it, don't do it!
Something like this?
ListId  ItemOrder  ItemValue
1       1          10
1       4          7
1       2          5
2       1          55
1       7          23
2       4          15
SELECT ItemValue FROM [Table] WHERE ListId = 1 ORDER BY ItemOrder
Here each list has an ID (you can use a clustered index here) and the order is given by the ItemOrder field.
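As a rough sketch of how a value can be inserted into the middle of one of these lists (C#/ADO.NET for illustration; the table name Lists and the helper signature are assumptions, not from the question):

using System.Data.SqlClient;

// Hypothetical helper: insert 'value' into list 'listId' at position 'position'
// by shifting the ItemOrder of the items that follow it.
static void InsertIntoList(string connectionString, int listId, int position, int value)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        @"UPDATE Lists SET ItemOrder = ItemOrder + 1
           WHERE ListId = @listId AND ItemOrder >= @position;
          INSERT INTO Lists (ListId, ItemOrder, ItemValue)
          VALUES (@listId, @position, @value);", conn))
    {
        cmd.Parameters.AddWithValue("@listId", listId);
        cmd.Parameters.AddWithValue("@position", position);
        cmd.Parameters.AddWithValue("@value", value);
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}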

What is the best optimization technique for a wildcard search through 100,000 records in sql table

I am working on an ASP.NET MVC application. The application is used by 200 users. These users constantly (every 5 minutes) search for an item in a list of 100,000 items (this list is going to grow by 1-2% every month). The list of 100,000 items is stored in a SQL Server table.
The search is a wildcard search, e.g.:
Select itemCode, itemName, ItemDesc
from tblItems
Where itemName like '%SearchWord%'
The search needs to be really fast, since the main business relies on searching for and selecting items.
I would like to know how to get the best performance; the search results have to come up instantaneously.
What I have tried:
I tried pre-loading all 100,000 records into memcache and then reading from memcache, to avoid a call to SQL Server for every search.
This takes a lot of time. Every time a user searches for an item, we retrieve the 100,000 records from memcache and then do the search, which takes almost 2-3 times longer than a direct SQL search.
I also tried a direct search on the SQL Server table, limiting the results to only 50 records at a time (using TOP 50).
This seems to be OK, but still nowhere near the performance we are seeking.
I would like to hear the possible solutions and links to any articles/code.
Thanks in advance
Run SQL Profiler and capture a tuning trace; the Database Engine Tuning Advisor can then recommend indexes to create on your database.
Also, a query such as the following would be worth a try.
SELECT *
FROM
(
SELECT ROW_NUMBER() OVER ( ORDER BY ColumnA) AS RowNumber, itemCode, itemName, ItemDesc
FROM tblItems
WHERE itemName LIKE '%FooBar%'
) AS RowResults
WHERE RowNumber >= 1 AND RowNumber < 50
ORDER BY RowNumber
EDIT: Updated query to reflect your real scenario.
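If the query is issued from the C# side, it is also worth passing the search term as a parameter rather than concatenating it into the SQL. A sketch (ADO.NET assumed; the method name and connection handling are illustrative):

using System.Collections.Generic;
using System.Data.SqlClient;

// Sketch: run the paged wildcard search with a parameterised pattern.
static List<string> SearchItems(string connectionString, string searchWord)
{
    var names = new List<string>();
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        @"SELECT itemCode, itemName, ItemDesc
          FROM (
              SELECT ROW_NUMBER() OVER (ORDER BY itemName) AS RowNumber,
                     itemCode, itemName, ItemDesc
              FROM tblItems
              WHERE itemName LIKE '%' + @searchWord + '%'
          ) AS RowResults
          WHERE RowNumber >= 1 AND RowNumber < 50
          ORDER BY RowNumber;", conn))
    {
        cmd.Parameters.AddWithValue("@searchWord", searchWord);
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                names.Add(reader.GetString(1));   // itemName; read the other columns as needed
        }
    }
    return names;
}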
How about having a search without the leading wildcard as your primary search....
Where itemName like 'SearchWord%'
and then have a "More Results" button that loads
Where itemName like '%SearchWord%'
(alternatively exclude results from the first result set)
Where itemName not like 'SearchWord%' and itemName like '%SearchWord%'
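A minimal C# sketch of that two-phase flow (hypothetical helper; the table and column names are from the question, and TOP 50 is just to keep result sets small):

using System.Data.SqlClient;

// Phase 1: prefix pattern, which can use an index on itemName.
// Phase 2 ("More Results"): contains pattern, excluding the rows phase 1 already returned.
static SqlCommand BuildSearchCommand(SqlConnection conn, string searchWord, bool moreResults)
{
    string sql = moreResults
        ? @"SELECT TOP 50 itemCode, itemName, ItemDesc FROM tblItems
            WHERE itemName NOT LIKE @prefix AND itemName LIKE @contains"
        : @"SELECT TOP 50 itemCode, itemName, ItemDesc FROM tblItems
            WHERE itemName LIKE @prefix";

    var cmd = new SqlCommand(sql, conn);
    cmd.Parameters.AddWithValue("@prefix", searchWord + "%");
    if (moreResults)
        cmd.Parameters.AddWithValue("@contains", "%" + searchWord + "%");
    return cmd;
}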
A weird alternative which might work; it depends on several assumptions etc. Sorry it's not fully explained, but I'm typing on an iPad so it's hard. (And yes, this solution has been used in high-transaction commercial systems.)
This assumes:
That your query is CPU constrained, not IO
That itemName is not too long, and holds only letters and numbers
That the search word, in total, contains enough selective characters and isn't just highly common characters
That your selection predicates are constrained by a %like%
The basic idea is to expand your query to help the optimiser know which rows need the like scanning.
Step 1. Set up your table
Create an additional 26 or 36 columns, one for each letter/digit. When I've done this for real it has always been a separate table, but putting them on the source table should be OK for a small volume like 100k. Let's call the columns trig_a, trig_b, etc.
Create a trigger for each insert/update/delete that puts a 1 or a 0 into the trig_a column depending on whether the value contains an 'a', and do the same for all 26/36 columns. The trigger to do this is complex, but possible (at least using Oracle). If you get stuck I'm sure SO'ers can create it, or I can dig it out.
At this point, we have a series of columns that indicate whether a field contains a letter/digit etc.
Step 2. Helping your query
With this extra info, we are in a position to help the optimiser. Add the following to your query:
Select ... Where .... And
((trig_a > 0) or (searchword not like '%a%')) and
((trig_b > 0) or (searchword not like '%b%')) and
... Repeat for all columns monitored...
If the optimiser behaves, it can use the (hopefully) lower cost field>0 predicates to reduce the like predicates evaluated.
Notes:
You may need to force the optimiser to scan the trig_? fields first.
Indexes can help on the trig_? fields, especially if they are in the source table.
I haven't shown how to handle upper/lower case; don't forget to handle this.
You might find that doing just a few letters is all you need.
This technique doesn't offer performance gains for every use of LIKE, so it isn't a general-purpose technique for everywhere you use a LIKE.
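As a rough C# sketch of step 2, the extra predicates can be generated from the search word instead of being written out by hand (the trig_ column naming follows the convention above; pre-evaluating the "search word does not contain this character" half of each clause in C# means only the useful (trig_x > 0) conditions are emitted):

using System.Text;

// Sketch: build " AND (trig_x > 0)" predicates for every character that occurs in the search word.
static string BuildTrigPredicates(string searchWord)
{
    string lowered = searchWord.ToLowerInvariant();
    var sb = new StringBuilder();
    foreach (char c in "abcdefghijklmnopqrstuvwxyz0123456789")
    {
        if (lowered.IndexOf(c) >= 0)
            sb.Append(" AND (trig_" + c + " > 0)");
    }
    return sb.ToString();
}

// Usage (the LIKE pattern itself should still be passed as a parameter):
// string sql = "SELECT itemCode, itemName, ItemDesc FROM tblItems"
//            + " WHERE itemName LIKE @pattern" + BuildTrigPredicates(searchWord);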

What is the best way to create a GridView showing summary sales data?

I have developed an eCommerce application in C# and ASP.NET. For the Admin users' "dashboard" landing page, I would like to give them a GridView that shows the total sales dollar amount for a few different time ranges; these would be my columns (i.e. last day, last week, last month, last year, total ever). I would like to show these values for orders in different statuses (i.e. complete, paid but not shipped, in progress). Something similar to this:
|OrderStatus|Today|LastWeek|LastMonth|
|Processed |$10 |$100 |$34000 |
|PaidNotShip|$4 |$12 |$45 |
My question: what is the best/most efficient way to do this? I know that I could write separate SQL statements, union them together and bind the GridView to a SqlDataSource:
(select amountForYesterday, amountForLastWeek from sales where orderStatus = processed)
UNION
(select amountForYesterday, amountForLastWeek from sales where orderStatus = paidnotshipped)
But that seems like a pain and very inefficient, since I would effectively be writing a separate query for each value.
I could also do this in the .cs code-behind on page load and programmatically populate the GridView row by row.
This GridView would only show information for the user's specific organization, so it would have to filter based on that as well.
I'm kind of at a loss as to how to do this without writing a massive query and continually hitting that query and database each time the page is viewed.
Any ideas?
I prefer using LINQ to work with data and/or GridViews (accessing the rows etc.). Have a look at a project I have on GitHub, which does exactly what I am describing here, as an example. Note that this is just a sandbox I used previously for illustration purposes.
GitHub Repo
https://github.com/pauloosthuysen/int
Other useful info:
http://www.codeproject.com/Articles/33685/Simple-GridView-Binding-using-LINQ-to-SQL
The sales figures for LastWeek and LastMonth do not change very often. You could store them in a static Dictionary indexed by organization, or summarize them in a separate table for faster access. That way you will not need to select the same huge number of rows to get the same numbers over and over again. Unless there are special demands, I would stick to the Dictionary solution because it is simple, but a combination could also be a good solution.
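A minimal sketch of the Dictionary idea (all type and member names are made up for the example; LoadSummaryFromDatabase stands in for whatever summary query you already run):

using System;
using System.Collections.Generic;

// Hypothetical cached summary per organization (inside e.g. a DashboardCache helper class).
public class SalesSummary
{
    public decimal LastWeek { get; set; }
    public decimal LastMonth { get; set; }
    public DateTime LoadedAt { get; set; }
}

static readonly Dictionary<int, SalesSummary> _summaryCache = new Dictionary<int, SalesSummary>();
static readonly object _cacheLock = new object();

static SalesSummary GetSummary(int organizationId)
{
    lock (_cacheLock)
    {
        // Reuse the cached figures unless they are older than, say, an hour.
        if (_summaryCache.TryGetValue(organizationId, out var summary) &&
            DateTime.UtcNow - summary.LoadedAt < TimeSpan.FromHours(1))
        {
            return summary;
        }

        summary = LoadSummaryFromDatabase(organizationId);  // your existing SQL/aggregation
        summary.LoadedAt = DateTime.UtcNow;
        _summaryCache[organizationId] = summary;
        return summary;
    }
}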
There is no direct way of doing it.
However, instead of hitting the DB to get the sum of every column, you can do the work with the DataTable that you use for binding to your grid.
All you need to do is use
Dim iSumSal As Integer
iSumSal = StudentTable.Compute("SUM(sal)", "")
and similarly for the other columns.
Once this is done, just add a new row to your DataTable with all the summed values in it.
Then you can bind it to your grid.
Optionally, you can put a label such as "Total:" in the first column of your new row.
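In C#, the same idea looks roughly like this (a sketch; the table, column and grid names are placeholders, and System and System.Data are assumed to be imported):

// Sum the already-loaded column and append a "Total:" row to the bound DataTable.
object sum = salesTable.Compute("SUM(sal)", "");
int iSumSal = sum == DBNull.Value ? 0 : Convert.ToInt32(sum);

DataRow totalRow = salesTable.NewRow();
totalRow[0] = "Total:";        // assumes the first column is a text column
totalRow["sal"] = iSumSal;     // repeat for the other summed columns
salesTable.Rows.Add(totalRow);

GridView1.DataSource = salesTable;
GridView1.DataBind();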
thanks
rahul

C# SQL Server - More Efficient for Multiple Database accesses or multiple loops through data?

In part of my application I have to get the last ID of a table where a condition is met
For example:
SELECT MAX(ID) FROM TABLE WHERE Num = 2
So I can either grab the whole table and loop through it looking for Num = 2, or I can grab only the rows where Num = 2. In the latter case, I know the last item will have the MAX ID.
Either way, I have to do this around 50 times... so would it be more efficient to grab all the data and loop through the list several times looking for each specific condition,
or would it be better to grab the data several times based on the condition, where I know the last item in the list will have the max ID?
I have 6 conditions I will have to base the queries on.
I'm just wondering which is more efficient: looping through a list of around 3500 items several times, or hitting the database several times, where I can already have the data broken down the way I need it.
I can speak for SQL Server: if you create a stored procedure where Num is a parameter that you pass in, you will get the best performance, because the engine can reuse the optimized execution plan of the stored procedure. Of course an index on that field is mandatory.
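For example, a sketch of letting the database do the work, one round trip per condition value (a plain parameterised query is shown; a stored procedure taking Num as a parameter would contain the same statement):

using System;
using System.Data.SqlClient;

// Sketch: get MAX(ID) for one condition value instead of scanning the whole table in C#.
static int? GetMaxId(string connectionString, int num)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("SELECT MAX(ID) FROM [TABLE] WHERE Num = @num", conn))
    {
        cmd.Parameters.AddWithValue("@num", num);
        conn.Open();
        object result = cmd.ExecuteScalar();
        return result == null || result == DBNull.Value ? (int?)null : Convert.ToInt32(result);
    }
}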
Let the database do this work, it's what it is designed to do.
Does this table have a high insert frequency? Does it have a high update frequency, specifically on the column that you're applying the MAX function to? If the answer is no, you might consider adding an IS_MAX BIT column and set it using an insert trigger. That way, the row you want is essentially cached, and it's trivial to look up.

How to manage a million records?

I really need an expert's help to answer my query.
Here is the scenario:
I'm using a SQL select query to retrieve a million records.
I need to perform sorting and grouping on the resulting records, which I'm storing in a DataTable (in one execution)
and looping through for the grouping and sorting.
I know this is childish and not the right way to process it.
How can I manage the million records effectively and apply the grouping and sorting to them?
I really need help here. I've heard of executing the select query batch-wise, but how do I implement the grouping and sorting when I don't have the entire data in hand?
I cannot use SQL ORDER BY and GROUP BY directly; that's against my requirements.
Here is what I'm doing right now.
I have the following objects, i.e. the column names for grouping and sorting:
List<Group> groupList;
List<Sort> sortList;
DataTable reportData; // Here im having the entire records from db
I'm looping through 'reportData' row by row and matching the current and previous rows for the custom grouping and sorting. I'd like to know how the same can be done with batch-wise execution, or whether there is an alternative solution.
I need to perform sorting and grouping on the resulting records, which I'm storing in a DataTable (in one execution) and looping through for grouping and sorting.
What for?
Seriously.
Do not pull the data and then try playing smart with a stupid object model behind it (and DataSets are not particularly smart, sorry).
Group and sort in your SELECT statement, pull the data already grouped and sorted, and be done with it.
A million records was a small amount of data for SQL Server when the original version was released (4.2 it was, a port of Sybase SQL Server) some 17 years ago. These days it is something that likely fits into the processor's third-level cache, and nothing a proper SQL Server even notices it has just processed.
SQL is particularly good at doing projections, and ever since they introduced MARS you can even run multiple queries over one connection, which comes in handy here.
So, go back, throw away the DataSet and the "I'll try to program a sort algorithm" approach, and write proper SQL statements to pull the data as you need it.
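A sketch of that approach (the aggregate, table and column names are placeholders; the point is that GROUP BY/ORDER BY run on the server and only the already-grouped rows come back):

using System.Data;
using System.Data.SqlClient;

// Sketch: let SQL Server group and sort, and fill only the aggregated result into a DataTable.
static DataTable LoadGroupedReport(string connectionString)
{
    const string sql = @"
        SELECT DocumentTypeID, COUNT(*) AS DocumentCount
        FROM Documents
        GROUP BY DocumentTypeID
        ORDER BY DocumentTypeID;";

    var table = new DataTable();
    using (var conn = new SqlConnection(connectionString))
    using (var adapter = new SqlDataAdapter(sql, conn))
    {
        adapter.Fill(table);   // Fill opens and closes the connection itself
    }
    return table;
}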
Sounds like you should implement partition pruning. Partitioning will allow for the kind of separation of content you are describing, in order to get faster queries.
If I understood correctly, in your case I would create a temporary database table with the structure I want, especially to cover my grouping.
Then I would select the records from the main tables and insert them into the temporary one, applying all modifications including grouping.
A specific index on how you want them sorted should also be applied.
After that, just select from this table, do what you have to do, and finally, if the data is not needed any more, delete the temporary table.
I would choose the above solution because a million records in memory smells like trouble to me...
For example (assuming reportData here is a collection of row objects, e.g. a List of a POCO with DocumentTypeID, DocumentName and CreatedOnDate properties, rather than the raw DataTable):
1. Let's assume that you would like to group them by their DocumentTypeID:
var groupByType = reportData.GroupBy(g => g.DocumentTypeID);
2. Sorting alphabetically:
var sortAlphabetically = reportData.OrderBy(g => g.DocumentName);
3. Grouping and sorting (sorting the documents within each group):
var groupAndSort = reportData.GroupBy(g => g.DocumentTypeID)
                             .Select(grp => grp.OrderBy(g => g.DocumentName));
4. Sort and group:
var sortAndGroup = reportData.OrderBy(g => g.DocumentName)
                             .GroupBy(g => g.DocumentTypeID);
5. Multiple grouping and sorting:
var multipleGroupAndSort = reportData.GroupBy(g => new { g.DocumentTypeID, g.CreatedOnDate.Month })
                                     .Select(grp => grp.OrderBy(g => g.DocumentName));
so on and so forth...
But I would still discourage bringing a million rows into the application; it will cost memory. There are of course ways to manage it, through stored procedures etc.
