C# inventory: updating stock

I am building an app with a SQL Server database. I have a main table of products (tblProducts) with a column that holds the quantity in hand (quantity). Another table holds the orders (tblOrders) that come from the supplier.
When an order comes in, I add it to tblOrders and then update tblProducts, adding the newly received amount to the quantity column.
So far, everything is good.
My question: after, say, a year of many orders and many edits to quantity, do you periodically check all orders to verify that the quantity in tblProducts is still correct? Or do you just assume that it is always correct?
What procedures do you use for maintaining this kind of database? Do you sum all orders every time you need the quantity on hand?
Thanks!

This is really up to how you want to implement it.
Trusting that the values will always check out (with adequate testing to ensure only stable code reaches production) is the easiest and fastest way, but it is vulnerable to data corruption, and thus not really recommended.
Always summing up the orders is the safest and most correct way, but it will become increasingly slow as your tables grow. If that is not an issue for you, then this is the recommended option.
What I consider a good intermediate method is a separate tblProductLogs table that stores the stock of an item at a specific timestamp. You sum the inventory at set periods (daily, hourly, up to you), and when you want the current stock you only need to sum the values registered after the last log entry for that item, saving query time. This could be made safer by disabling update operations on the log table, since you never need to modify entries there. It is faster than the second option and somewhat more robust than the first.
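A minimal sketch of that intermediate approach in T-SQL, assuming tblOrders carries ProductId, Quantity and ReceivedAt columns (these names are invented for illustration, not from the question):

-- Periodic snapshot of stock per product.
CREATE TABLE tblProductLogs (
    ProductId  INT      NOT NULL,
    Quantity   INT      NOT NULL,  -- stock on hand at SnapshotAt
    SnapshotAt DATETIME NOT NULL,
    CONSTRAINT PK_tblProductLogs PRIMARY KEY (ProductId, SnapshotAt)
);

-- Current stock = latest snapshot + orders received since then.
SELECT l.Quantity
     + ISNULL((SELECT SUM(o.Quantity)
               FROM tblOrders o
               WHERE o.ProductId = l.ProductId
                 AND o.ReceivedAt > l.SnapshotAt), 0) AS QuantityOnHand
FROM tblProductLogs l
WHERE l.ProductId = @ProductId
  AND l.SnapshotAt = (SELECT MAX(SnapshotAt)
                      FROM tblProductLogs
                      WHERE ProductId = @ProductId);

Seed one initial snapshot row per product when you introduce the table, or the query has nothing to anchor on.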


Ideas on incorrect ORDER BY results

I want to emphasize that I'm looking for ideas, not necessarily a concrete answer since it's difficult to show what my queries look like, but I don't believe that's needed.
The process looks like this:
Table A keeps filling up, like a bucket - an SQL job keeps calling SP_Proc1 every minute or less and it inserts multiple records into table A.
At the same time a C# process keeps calling another procedure SP_Proc2 every minute or less that does an ordered TOP 5 select from table A and returns the results to the C# method. After C# code finishes processing the results it deletes the selected 5 records from table A.
The problematic part is the ordering: the records from table A must be processed 5 at a time in the order specified, but a few times a month SP_Proc2 selects the ordered TOP 5 records in the wrong order, even though all the records are present in table A and have correct values in the columns used for ordering.
Something to note:
I'm ordering by integers, not varchar.
The C# part is using 1 thread.
Both SP_Proc1 and SP_Proc2 use a transaction with the READ COMMITTED or READ COMMITTED SNAPSHOT isolation level.
One column that is used for ordering is a computed value, but a very simple one. It just checks if another column in table A is not null and sets the computed column to either 1 or 0.
There's a unique nonclustered index on primary key Id and a clustered index composed of the same columns used for ordering in SP_Proc2.
I'm using SQL Server 2012 (v11.0.3000)
I'm beginning to think that this might be an SQL bug or maybe the records or index in table A get corrupted and then deleted by the C# process and that's why I can't catch it.
Edit:
To clarify: SP_Proc1 commits a big batch of N records to table A at once, and SP_Proc2 pulls records from table A in batches of 5; it orders the records in the table and selects the TOP 5. Sometimes the wrong batch is selected: the batch itself is ordered correctly, but a different batch should have been selected according to the ORDER BY. I believe Rob Farley might have the right idea.
My guess is that your “out of order TOP 5” is ordered, but that a later five overlaps. Like, one time you get 1231, 1232, 1233, 1234, and 1236, and the next batch is 1235, 1237, and so on.
This can be an issue with locking and blocking. You’ve indicated your processes use transactions, so it wouldn’t surprise me if your 1235 hasn’t been committed yet, but can just be ignored by your snapshot isolation, and your 1236 can get picked up.
It doesn’t sound like there’s a bug here. What I’m describing above is a definite feature of snapshot isolation. If you must have 1235 picked up in an earlier batch than 1236, then don’t use snapshot isolation, and force your table to be locked until each block of inserts is finished.
An alternative suggestion would be to use a table lock (TABLOCK) in both the reading and writing procedures.
Though this is expensive, if you need absolute consistency then this may be the way to go.
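For illustration, the locking variant could look something like this (table and column names invented). The exclusive table lock is held until the commit, so a half-finished batch of inserts can never be read around:

BEGIN TRANSACTION;

-- TABLOCKX takes an exclusive table lock; HOLDLOCK keeps it until commit.
SELECT TOP (5) Id, Payload
FROM TableA WITH (TABLOCKX, HOLDLOCK)
ORDER BY ComputedFlag, SortKey, Id;

-- ... process the five rows, then delete them, inside the same transaction ...

COMMIT TRANSACTION;

The insert procedure would likewise wrap its whole batch in one transaction, so readers only ever see complete batches.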

SQL Server: does the SUM function work fine on large tables?

I'm building a point-of-sale system for a market using C# WinForms and SQL Server 2014.
My problem is the design for storing products and quantities.
The program should display the quantity of a product when making a new sell bill,
so before the user makes a sale, he should know the quantity of the product and see whether it is enough for the order.
I created this table for items:
Item_ID, Item, Buy_Price, Sell_Price, Quantity
But this causes confusion when the user edits bills, whether sell bills or buy bills.
For example, when the user edits a buy bill (invoice) and changes a quantity in it,
the product's quantity cell in the items table will no longer match the quantity in the buy-bills table.
So I want to make it a little more dynamic: remove the quantity cell from the items table
and use the bills tables to get the quantity of any item.
That means when the user searches for an item (product) to sell, the system uses a SUM in a SQL statement:
quantity = sum(item_quantity) in buy_bills - sum(item_quantity) in sell_bills
Will that work? Or will it be too heavy once the database size and record count get larger?
The problem here is not the SUM function as such (it will sum whatever you put in your query, and rather quickly, in general) but the concept of detecting available quantities. Assuming there is more than one selling point, it is impossible to get an accurate available quantity of any product unless you provide some form of concurrency control.
So, you may use a SUM over the tables holding the history of product purchases and sales. But before you accept any change, you need to take a lock that prevents other sellers from selling the same quantity; see the sketch after the list below. After you have committed one sale, you can let the other sellers continue, making sure you refresh the item quantities they are currently shown.
Do bear in mind that this scenario is one of the basic ones that occur in any multiuser business application.
So, to conclude, do provide:
one point in time where you are sure the existing stock cannot be lowered;
a SUM done in whatever way fits your application;
a refresh of any pending sale with the new quantities.
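A minimal sketch of that lock-then-sum pattern in T-SQL, using the table names from the question (the hints and variables here are illustrative, not a drop-in implementation):

BEGIN TRANSACTION;

DECLARE @OnHand INT;

-- UPDLOCK + HOLDLOCK keeps other sellers from reading-then-inserting
-- against the same item until this transaction commits. With an index
-- on item_id these become range locks that also block new phantom rows.
SELECT @OnHand =
    ISNULL((SELECT SUM(item_quantity)
            FROM buy_bills WITH (UPDLOCK, HOLDLOCK)
            WHERE item_id = @ItemId), 0)
  - ISNULL((SELECT SUM(item_quantity)
            FROM sell_bills WITH (UPDLOCK, HOLDLOCK)
            WHERE item_id = @ItemId), 0);

IF @OnHand >= @Requested
    INSERT INTO sell_bills (item_id, item_quantity)  -- other columns omitted
    VALUES (@ItemId, @Requested);

COMMIT TRANSACTION;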
I want to make it a little more dynamic: remove the quantity cell from the items table and use the bills tables to get the quantity of any item. That means when the user searches for an item (product) to sell, the system uses a SUM in a SQL statement: quantity = sum(item_quantity) in buy_bills - sum(item_quantity) in sell_bills. Will that work?
Assuming your syntax is correct, it is possible to sum a column using the SUM function, and you can subtract two sums in a single statement, i.e., SUM(Column1) - SUM(Column2) AS Column3. This assumes that everything else is set up correctly.
Or will it be too heavy when the database size and record count get larger?
That is 100% relative to your database setup and not answerable here. It is possible that things will slow down as the database grows, but if your database is set up correctly (indexed appropriately), it should not be by much.
It is also possible, later on, to get around the issue by pre-loading the data.
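As a bare illustration of that subtraction against the question's tables (ISNULL guards the case where an item appears in only one of them):

SELECT ISNULL((SELECT SUM(item_quantity)
               FROM buy_bills WHERE item_id = @ItemId), 0)
     - ISNULL((SELECT SUM(item_quantity)
               FROM sell_bills WHERE item_id = @ItemId), 0) AS quantity_on_hand;

With an index on item_id in both bill tables, this stays fast even as the tables grow.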

Millions of rows in the database, only so much needed

Problem summary:
C# (MVC), entity framework 5.0 and Oracle.
I have a couple of million rows in a view which joins two tables.
I need to populate dropdownlists with filter possibilities.
The options in these dropdownlists should reflect the actual contents
of the view for that column, distinct.
I want to update the dropdownlists whenever you select something, so
that the new options reflect the filtered content, preventing you
from choosing something that would give 0 results.
It's slow.
Question: what's the right way of getting these dropdownlists populated?
Now for more detail.
-- Goal of the page --
The user is presented with some dropdownlists that filter the data in a grid below. The grid represents a view (see "Database") whose results are filtered.
Each dropdownlist represents a filter for a column of the view. Once something is selected, the rest of the page updates: the other dropdownlists now contain the possible values for their corresponding columns that comply with the filter just applied in the first dropdownlist.
Once the user has selected a couple of filters, he/she presses the search button and the grid below the dropdownlists updates.
-- Database --
I have a view that selects almost all columns from two tables, nothing fancy there. Like this:
SELECT tbl1.blabla, tbl2.blabla etc etc
FROM table1 tbl1, table2 tbl2
WHERE tbl1.bvz_id = tbl2.id AND tbl1.einddatum IS NULL;
There are 22 columns in total: 13 VARCHARs (mostly small, 1-20, though one of them has a size of 2000!), 6 DATEs and 3 NUMBERs (one of size 38 and one of 15,2).
There are a couple of indexes on the tables, including the IDs used in the WHERE clause.
Important thing to know: I cannot change the database. Maybe set an index here and there, but nothing major.
-- Entity Framework --
I created a database-first EDMX in my solution and also mapped the view. There are also classes for both tables, but I need data from both of them, so I don't know if I need those. The problem with selecting from either table alone is that you can't apply half of the filtering, but maybe there are smart ways I haven't thought of yet.
-- View --
My view is strongly bound to a viewmodel. In it I have an IEnumerable for each dropdownlist. The getters for these get their data from a single IEnumerable called NameOfViewObjects, like this:
public string SelectedColumn1 { get; set; }

private IEnumerable<SelectListItem> column1Options;
public IEnumerable<SelectListItem> Column1Options
{
    get
    {
        if (column1Options == null)
        {
            // Deferred query: this is re-evaluated each time the
            // IEnumerable is actually enumerated.
            column1Options = NameOfViewObjects.Select(item => item.Column1)
                .Distinct()
                .Select(item => new SelectListItem
                {
                    Value = item,
                    Text = item,
                    Selected = item.Equals(SelectedColumn1,
                        StringComparison.InvariantCultureIgnoreCase)
                });
        }
        return column1Options;
    }
}
The two solutions I've tried are:
- 1 -
Selecting all the columns I need for the dropdownlists in a LINQ query (the 2000-character VARCHAR is not one of them and there are only 2 date columns), doing a Distinct on them and putting the results into a HashSet, then pointing NameOfViewObjects at this HashSet. I have to wait about 2 minutes for that to complete, but after that, populating the dropdownlists is almost instant (maybe a second for each of them).
model.Beslissingen = new HashSet<NameOfViewObject>(dbBes.NameOfViewObject
.DistinctBy(item => new
{
item.VarcharColumn1,
item.DateColumn1,
item.DateColumn2,
item.VarcharColumn2,
item.VarcharColumn3,
item.VarcharColumn4,
item.VarcharColumn5,
item.VarcharColumn6,
item.VarcharColumn7,
item.VarcharColumn8
}
)
);
The big problem here is that a NameOfViewObject instance is probably quite large, and even with the Distinct, which brings it down to fewer than 100,000 results, it still uses over 500 MB of memory. This is unacceptable, because there will be a lot of users on this screen (a lot being... 10 max, 5 on average, simultaneously).
- 2 -
The other solution is to use the same LINQ query and point NameOfViewObjects at the IQueryable it produces. This means that every time the view binds a dropdownlist to an IEnumerable, it fires a query that finds the distinct values for that column in a table with millions of rows, where most likely the column it's getting the values from is not indexed. This takes around 1 minute per dropdownlist (I have 10), so it takes ages.
Don't forget: I need to update the dropdownlists every time one of them has its selection changed.
-- Question --
So I'm probably going at this the wrong way. Maybe one of these solutions should be combined with indexing all of the columns I use, or maybe I should use another way to store the data in memory so that it stays small, but there must be someone out there who has done this before and figured out something smart. Can you please tell me the best way to handle a situation like this?
Acceptable performance:
having to wait for a while (2 minutes) while the page loads, but everything fast after that;
having to wait a couple of seconds every time a dropdownlist changes;
the page not using more than 500 MB of memory.
Of course you should have indexes on all columns and column combinations that appear in your WHERE clauses. No index means a table scan and O(N) query times, and those cannot scale under any circumstances.
You do not need millions of entries in a drop down. You need to be smarter about filtering the database down to manageable numbers of entries.
I'd take a page from Google. Their type ahead helps narrow down the entire Internet graph into groups of 25 or 50 per page, with the most likely at the top. Maybe you could manage that, too.
Perhaps a better answer is something like a search engine. If you were a Java developer you might try Lucene/Solr and indexing. I don't know what the .NET equivalent is.
The first point to check is your DB: make sure you have the right indexes and entity relations in place.
Next, if you want to dynamically build your filter options, you need to run the query with the existing filters applied to find out what the next filter can be. There are several ways to do this.
First, you can query the data and extract the values from what comes back. This has a huge load time and wastes time returning data you don't want (unless you are live-updating the results with the filter and don't have paging, in which case you might as well fetch all the data and use LINQ to Objects to filter).
A second option is to run a parallel query for each filter that returns its possible values: filter A = all possible values of A in the data, filter B = all possible values of B when filtered by A, C = all possible values of C when filtered by A and B, and so on. This is better than the first option, but not by much.
Another option is to use aggregates to speed things up, i.e. run the parallel queries as above but, instead of returning the data, return how many records would be returned. Aggregate functions are generally quicker, so this cuts your load time dramatically, but you are still repeatedly querying a huge dataset, so it won't be exactly nippy.
You can tweak this further using EXISTS to return just a 0 or 1: you would look at a table of all possible filter values and remove the ones for which the parallel query finds no rows.
The option that will be fastest by a mile is to cache the filters in the DB, in a separate table. You can then query that and say: from the cache, where filter = ABC, select D. The problem is maintaining the cache, which you would have to do in the DB as part of the save routines, triggers, etc. A sketch of this follows.
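For example, a rough sketch of that cache table in Oracle terms (all names invented):

-- Distinct combinations of the filterable columns, kept small.
CREATE TABLE filter_cache (
    column1 VARCHAR2(50),
    column2 VARCHAR2(50),
    column3 VARCHAR2(50)
    -- ... one column per dropdownlist ...
);

-- Rebuild nightly, or maintain it from the save routines/triggers.
TRUNCATE TABLE filter_cache;
INSERT INTO filter_cache (column1, column2, column3)
SELECT DISTINCT column1, column2, column3
FROM the_big_view;

-- Each dropdownlist then queries the small table, not the million-row view.
SELECT DISTINCT column3
FROM filter_cache
WHERE column1 = :selected1
  AND column2 = :selected2;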
Another solution that can be added in addition to the previous suggestions is to use the /*+ result_cache */ hint, if your version of Oracle supports it (Oracle version 11g or later). If the output of the query is small enough for a drop-down list, then when a user enters criteria that matches the same criteria another user used, the results are returned in a few milliseconds instead of a few seconds or minutes. Result cache is wonderful for queries that return a small set of rows out of millions.
select /*+ result_cache */ item_desc from some_table where item_id ...
The result cache is automatically flushed when any insert/updates/deletes occur on the database tables.
I've done something 'kind of' similar in the past - if you can add a table to the database then I'd explore introducing a 'scratchpad' type table where results are temporarily stored as the user refines their search. Since multiple users could be working simultaneously the table would have to have an additional column for identifying the user.
I'd think you'd see some performance benefit since all processing is kept server-side and your app would simply be pulling data from this table. Since you're adding this table you would also have total control over it.
Essentially I'd imagine the program flow would go something like:
User selects some filters and clicks 'Search'.
Server populates scratchpad table with results from that search.
App populates results grid from scratchpad table.
User further refines search and clicks 'Search'.
Server removes/adds rows to scratchpad table as necessary.
App populates results grid from scratchpad table.
And so on.
Rather than having all the users results in one 'scratchpad' table you could possibly explore having temporary 'scratchpad' tables per user.
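A rough sketch of the per-user variant in Oracle terms (names invented):

-- Same shape as the view, no rows, plus a user column.
CREATE TABLE search_scratchpad AS
SELECT * FROM the_big_view WHERE 1 = 0;

ALTER TABLE search_scratchpad ADD (user_id VARCHAR2(64));

-- On "Search": replace this user's rows with the refined result set.
DELETE FROM search_scratchpad WHERE user_id = :uid;

INSERT INTO search_scratchpad
SELECT v.*, :uid FROM the_big_view v
WHERE v.column1 = :filter1; -- plus any further filters

-- Grid and dropdownlists read only this user's slice.
SELECT DISTINCT column2 FROM search_scratchpad WHERE user_id = :uid;

An Oracle global temporary table (ON COMMIT PRESERVE ROWS) could stand in for the user_id column if each user gets their own session.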

What is the best way to create a GridView showing summary sales data?

I have developed an eCommerce application in C# and ASP.NET. For the Admin users' "dashboard" landing page, I would like to give them a GridView that shows the total sales dollar amount for a few different time ranges as columns (i.e. last day, last week, last month, last year, total ever). I would like to show these values for orders in different statuses (i.e. complete, paid but not shipped, in progress). Something similar to this:
|OrderStatus|Today|LastWeek|LastMonth|
|Processed |$10 |$100 |$34000 |
|PaidNotShip|$4 |$12 |$45 |
My question: what is the best/most efficient way to do this? I know that I could write separate SQL statements, union them together and bind the GridView to a SqlDataSource:
(select amountForYesterday, amountForLastWeek from sales where orderStatus = processed)
UNION
(select amountForYesterday, amountForLastWeek from sales where orderStatus = paidnotshipped)
But that seems like a pain and very inefficient, since I would effectively be writing a separate query for each value.
I could also do this in the code-behind on page load and programmatically populate the GridView row by row.
This GridView would only show information for the user's specific organization, so it would have to filter based on that as well.
I'm kind of at a loss as to how to do this without writing a massive query and continually hitting it, and the database, every time the page is viewed.
Any ideas?
I prefer using LINQ to work with data and/or GridViews (accessing the rows, etc.). Have a look at a project I have on GitHub, which does exactly what I am describing here, as an example. Note that it is just a sandbox I previously used for illustration purposes.
GitHub Repo
https://github.com/pauloosthuysen/int
Other useful info:
http://www.codeproject.com/Articles/33685/Simple-GridView-Binding-using-LINQ-to-SQL
The sales figures for LastWeek and LastMonth do not change very often. You could store them in a static Dictionary indexed by organization, or summarize them in a separate table for faster access. This way you will not need to select the same huge number of rows to compute the same numbers over and over again. Unless there are special demands, I would stick with the Dictionary solution because it is simple, but a combination could also be a good solution.
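A rough sketch of the Dictionary idea (SalesSummary and the loader are stand-ins; the invalidation policy is up to you):

using System;
using System.Collections.Concurrent;

public class SalesSummary
{
    // Totals per status for Today / LastWeek / LastMonth would live here.
}

public static class SalesSummaryCache
{
    private static readonly ConcurrentDictionary<int, SalesSummary> cache =
        new ConcurrentDictionary<int, SalesSummary>();

    public static SalesSummary Get(int organizationId)
    {
        // Computed once per organization; later page views hit the dictionary.
        return cache.GetOrAdd(organizationId, LoadSummaryFromDb);
    }

    public static void Invalidate(int organizationId)
    {
        // Call when new orders arrive (or on a timer) to force a recompute.
        SalesSummary ignored;
        cache.TryRemove(organizationId, out ignored);
    }

    private static SalesSummary LoadSummaryFromDb(int organizationId)
    {
        // Stand-in for the real aggregate query against the sales table.
        return new SalesSummary();
    }
}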
There is no direct way of doing it.
However, instead of hitting the DB for the sum of every column, you can do the work in the DataTable that is used for binding to your grid.
All you need to do is use:
Dim iSumSal As Integer
iSumSal = StudentTable.Compute("SUM(sal)", "")
You can compute the other columns the same way.
Once this is done, add a new row to your DataTable with all the summed values in it.
Then you can bind it to your grid.
Optionally, put a text value such as "Total:" in the first column of the new row.
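Since the question is C#, the equivalent there might look like this (Compute returns object, so the result needs converting; salesTable stands in for the bound DataTable):

// "sal" is the column name from the snippet above.
int sumSal = Convert.ToInt32(salesTable.Compute("SUM(sal)", ""));

DataRow totalRow = salesTable.NewRow();
totalRow[0] = "Total:";          // label in the first column
totalRow["sal"] = sumSal;        // the summed value
salesTable.Rows.Add(totalRow);   // the grid now shows a totals row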

Memcached/Microsoft Velocity Performance Question

Just a random query regarding Microsoft Velocity.
Scenario:
Say I want ALL orders from my database. In SQL this is fine: I can do SELECT OrderId, TotalCost... FROM Orders. That's one round trip to my database, and everyone is happy.
Now, if I'm using Memcached or (as I am now) Microsoft Velocity (CTP3), there is no easy way to do this. The options I see look like this (in pseudocode):
FOR EACH OrderId
    Order = cache.TryGet(OrderId)
    IF Order IS NULL
        Order = db.Get(OrderId)
END FOR EACH
which would be LOADS of roundtrips.
Also, consider I want to get orders by Customer
SQL: Select OrderId....TotalCost from Orders where CustomerId = MyCustomerId
One round trip, everyone is happy.
Using a cached solution there are two solutions I see really:
Solution 1:
IF CustomerOrderIdsForCustomerId DOES NOT EXIST
    POPULATE CustomerOrderIdsForCustomerId FROM DATABASE
FOR EACH OrderId IN CustomerOrderIdsForCustomerId
    Order = cache.TryGet(OrderId)
    IF Order IS NULL
        Order = db.Get(OrderId)
END FOR EACH
Solution 2 is to hold a serialized list of all the customer's orders in its own cache object. That reduces round trips, but just seems lame.
Can someone shed light on this situation please?
Just because you have a cache doesn't mean you have to use it for every query! In this instance as you've already identified, it's not really helping you and I'd probably go straight through to the database for this sort of thing.
It depends a bit on your application though - if you think customers are regularly going to be looking at their order history, or you have some function that's analysing orders to see what products are hot, then you might want to use some caching to keep load off your SQL server. In that case, I'd probably go with holding in the cache either a DataTable of the orders, or a collection of Orders and query it with LINQ to show the orders for a customer.
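A rough cache-aside sketch of that idea, treating cache as a Velocity DataCache (Order and LoadAllOrdersFromDb are stand-ins, not from the question):

public List<Order> GetOrdersForCustomer(DataCache cache, int customerId)
{
    var orders = (List<Order>)cache.Get("AllOrders");
    if (orders == null)
    {
        orders = LoadAllOrdersFromDb();   // one database round trip
        cache.Put("AllOrders", orders);   // one cache entry for the whole set
    }

    // Filter the cached collection locally with LINQ; no further round trips.
    return orders.Where(o => o.CustomerId == customerId).ToList();
}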
Keep in mind that a cache is not supposed to be the permanent store for any data (orders, in your case). The cache can help take some load off your DB server, but something has to load the orders into the cache before you can retrieve them. That said, here are a couple of options to consider, if you are using Velocity, that avoid looping through a collection. You will always, however, have to figure out how to deal with data that is not in the cache.
Option 1: Use Regions
You can create a region and get all the objects from that region with one call. In your scenario, you could create an Orders region where you can store all the orders and then use the GetObjectsInRegion method to get all the orders in the cache. Note however that this brings back all the orders in the cache... which might or might not have all the orders that you have in the database.
Option 2: Use Regions And Tags
Velocity lets you tag objects that you put in the cache regions and then retrieve them using those tags. So, in your scenario you could tag the order objects with an "order" tag and then use the GetObjectsByTag method to retrieve them. Since you can use multiple tags, you could also tag them with their customer id tag and then pull them out that way.
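A sketch of both options against the Velocity API (exact names and namespaces shifted between CTPs, so treat this as approximate; order and myCustomerId are stand-ins):

DataCacheFactory factory = new DataCacheFactory();
DataCache cache = factory.GetCache("default");

// One region to hold every order object.
cache.CreateRegion("Orders");

// Tag each order with its customer id as it goes into the region.
cache.Put("order:" + order.OrderId, order,
    new[] { new DataCacheTag("customer:" + order.CustomerId) },
    "Orders");

// Option 1: everything currently cached in the region, one call.
foreach (KeyValuePair<string, object> pair in cache.GetObjectsInRegion("Orders"))
{
    Order cached = (Order)pair.Value;
}

// Option 2: only the orders tagged with a given customer id.
var forCustomer = cache.GetObjectsByTag(
    new DataCacheTag("customer:" + myCustomerId), "Orders");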
These 2 options come with some caveats, so be sure to read up on the documentation:
Velocity Tag-Based Methods
