Pagination and data buffering in a Windows application using C# 2005

Requirement
A .NET Windows application written in C# interacts with an Oracle database to retrieve and save data.
Issue
With a huge volume of data, performance is slow and memory usage is high because the application displays the entire data set on screen. Response time is high due to the database call and client-side data processing.
Proposed Solution
Using pagination (from the Oracle DB) to display partial data on screen, the application's response time will be faster; however, it will make a DB call for each page.
We are looking at a solution that fetches the first page of data from the DB and starts the application, after which a background job pulls the rest of the data from the DB into a local XML store. Then, when the next page is requested, the data is loaded from the XML instead of making a DB call.
Is this design possible?
Is synchronization possible between local XML DB and the Oracle DB?

Personally, I am not sure you really want to go that far, as synchronization and the overall disk I/O could be very "interesting" at best.
Typically, what I have found works well in the past, if you REALLY must have "pre-fetched" records for more of the result set, is to cache, say, the next two and previous two pages in memory. That way the user's transition is smooth, and after each page navigation a background thread goes out and pre-fetches the next page so that you have it ready.
Otherwise, if you do what you are describing, you are only deferring the performance impact while introducing data synchronization and other issues.
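A minimal sketch of that approach, assuming a hypothetical PageCache class; FetchPageFromDb stands in for the real Oracle query (e.g. ROWNUM-based paging), and ThreadPool is used since .NET 2.0 (C# 2005) has no Task API:

using System.Collections.Generic;
using System.Data;
using System.Threading;

// Hypothetical cache: serves the requested page from memory when possible
// and pre-fetches the two pages on either side on a worker thread.
public class PageCache
{
    private readonly Dictionary<int, DataTable> _pages = new Dictionary<int, DataTable>();
    private readonly object _sync = new object();

    public DataTable GetPage(int pageIndex)
    {
        DataTable page;
        lock (_sync)
        {
            if (!_pages.TryGetValue(pageIndex, out page))
            {
                page = FetchPageFromDb(pageIndex);
                _pages[pageIndex] = page;
            }
        }

        // Pre-fetch neighbouring pages in the background.
        ThreadPool.QueueUserWorkItem(delegate
        {
            for (int i = pageIndex - 2; i <= pageIndex + 2; i++)
            {
                if (i < 0 || i == pageIndex) continue;
                lock (_sync)
                {
                    if (_pages.ContainsKey(i)) continue;
                }
                DataTable fetched = FetchPageFromDb(i);
                lock (_sync) { _pages[i] = fetched; }
            }
        });

        return page;
    }

    // Placeholder for the real data access, e.g. an Oracle query using
    // ROWNUM BETWEEN :low AND :high to select the requested page.
    private DataTable FetchPageFromDb(int pageIndex)
    {
        return new DataTable();
    }
}

Evicting entries that fall outside the five-page window would keep the memory footprint bounded.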

Related

How to use a local application database

I have an application in which so much data is loaded at startup that the waiting time is no longer justifiable for users.
Initially only enough data is loaded to fill a listbox explorer, which serves as a browser for loading the remaining information when an item is selected. So much for the data model.
I now intend to maintain a local data source and only update the data that the user selects, but I have to decide whether to keep the finished model objects or the raw data.
Has anyone experimented with the different approaches and can say (or link to) which is the best approach in terms of maintenance and performance? I work with .NET.

How can the data in the DataGrid be populated asynchronously in a streaming fashion?

In a C# Windows Forms application, I load more than 500,000 records from a SQL Server database for analysis.
SELECT TOP 500000 * FROM MobileTrans
When I run the above query in SQL Server Management Studio, data shows up immediately and the load completes in 15 seconds. But when I run this query in my Windows application, it takes 15 seconds without showing anything in the data grid, and then the data appears all at once.
How can I retrieve the results of the query asynchronously in my Windows Forms data grid, the same way SQL Server Management Studio does?
Please send a small sample of code.
You do not need to show a million records to anyone; nobody can look at them all at the same time.
So first load a reasonable amount of data that one could actually see and work with in your app.
In short: use server-side paging of the data if this is only about presentation.
By dramatically reducing the amount of data this way, you may be able to avoid async processing altogether.
If you do need to process it asynchronously, I would populate the data retrieved from the DB into a storage structure (Queue<T>, List<T>, ...) which serves as the source for the visual element you display the data on.
Consider that this can easily grow into fairly complicated scenarios, as it's not entirely clear to me how complex your app is. So the first solution may turn out to be the best one.
EDIT
Here, perhaps, is a useful example of how that (deferred loading) can be achieved:
Implementing Virtual Mode with Just-In-Time Data Loading in the Windows Forms DataGridView Control
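As a rough illustration of that technique, the sketch below enables the DataGridView's virtual mode and handles CellValueNeeded so rows are materialized only as the grid asks for them; LoadRowFromDb is a hypothetical helper that in a real app would read from a small page cache filled in chunks:

using System;
using System.Windows.Forms;

public class VirtualGridForm : Form
{
    private readonly DataGridView _grid = new DataGridView();

    public VirtualGridForm()
    {
        _grid.Dock = DockStyle.Fill;
        _grid.VirtualMode = true;                  // grid requests values on demand
        _grid.Columns.Add("TransId", "TransId");
        _grid.Columns.Add("Amount", "Amount");
        _grid.RowCount = 500000;                   // total count, no rows loaded yet
        _grid.CellValueNeeded += OnCellValueNeeded;
        Controls.Add(_grid);
    }

    private void OnCellValueNeeded(object sender, DataGridViewCellValueEventArgs e)
    {
        // Called lazily per visible cell; a real implementation would
        // page MobileTrans rows into a cache rather than hit the DB here.
        object[] row = LoadRowFromDb(e.RowIndex);  // hypothetical helper
        e.Value = row[e.ColumnIndex];
    }

    private object[] LoadRowFromDb(int rowIndex)
    {
        return new object[] { rowIndex, 0m };      // stub data for the sketch
    }
}

Because only visible cells trigger the event, the form appears immediately while data is fetched just in time.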

How do I process a large amount of data in ASP.NET?

I have a web project in ASP.NET/C#/MySQL where there can be up to some 30,000 rows of data to process.
It's a reporting tool, and I have to show statistics like counts and sums at several levels.
I want to know which would be the better way to go about this.
I can filter my data down to a limited set of columns through which I can query.
Now, is it a good idea to load the data (all rows) into my application at startup and, whenever the user queries, filter that data and do my calculations in code to show my statistics?
Or should I have a stored procedure do all my calculations and, every time the user queries, call the stored procedure to get my statistics?
Thanks
Databases are optimized for this kind of data manipulation, and since you reduce network load as well, I would vote for the second option.
Another possibility is to consider some kind of OLAP solution, where transactional data is already consolidated into smaller chunks that can in turn be queried easily.
I would definitely go with the second option. You are talking about a web application: if you load all the data up front, you must store it somewhere to preserve it across postbacks. If you choose session state, you will end up consuming the web server's memory, since more than one user accesses your site.
If you store it in view state, you will end up with a very large client response, which will make your page very slow to load on the client side and will add network traffic.
Option 2 is the best, because stored procedures are precompiled, which makes them much better in terms of performance. You will also reduce network traffic.
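A minimal sketch of the second option, assuming a hypothetical stored procedure named GetReportStats that performs the counts/sums with GROUP BY on the server (the connection string and parameter name are placeholders):

using System.Data;
using MySql.Data.MySqlClient;

public static class ReportRepository
{
    // Calls a hypothetical stored procedure that aggregates server-side,
    // so only the summarized rows cross the network.
    public static DataTable GetReportStats(string connectionString, int levelFilter)
    {
        using (MySqlConnection conn = new MySqlConnection(connectionString))
        using (MySqlCommand cmd = new MySqlCommand("GetReportStats", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("p_level", levelFilter);

            DataTable stats = new DataTable();
            using (MySqlDataAdapter adapter = new MySqlDataAdapter(cmd))
            {
                adapter.Fill(stats);   // the adapter opens/closes the connection
            }
            return stats;
        }
    }
}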

Mid-tier caching for Windows Forms Application

I have a simple Windows Forms application written in C# 4.0. The application shows some of the records from a database and features a query option initiated by the user.
The records in the database can be called jobs.
Consider the two columns JobID and Status.
These are updated by two background services, which in effect work as producer/consumer services; the status of each job is updated by these services running behind the scenes.
Now the user has an option to query the records from the database, e.g. by status (Submitted, Processing, Completed). This can return thousands of records, and the GUI might hit performance glitches displaying that much data.
Hence it's important to display the query results in chunks, as pages. The GUI isn't refreshed until the user manually refreshes or makes a new query.
For example, since the jobs are constantly being updated by the services, a job's status can differ at any point in time. The basic requirement is that the pages should show the data as it was when it was fetched from the DB.
I am using LINQ to SQL for fetching data from the DB. It's quite easy to use, but it offers nothing like the mid-level caching required to meet this demand. Using process memory to cache the results can push memory usage to the extreme if the number of records is very high, and unfortunately LINQ to SQL doesn't provide any mid-tier caching facilities with its DataContext objects.
What is the preferable way to implement a paging mechanism with C# 4.0 + SQL Server + Windows?
Some alternatives I am considering: a duplicate table/DB that can temporarily store the results as a cache, or the Enterprise Library's Caching Application Block. I believe this is a typical problem faced by many developers; which is the most efficient way to solve it? (NOTE: my application and DB run on the same box.)
While caching is a sure way to improve performance, implementing a caching strategy properly can be more difficult than it may seem. The problem is managing cache expiration, essentially ensuring that the cache stays synchronized to the desired degree. Therefore, before considering caching, consider whether you need it in the first place. From what I can gather from the question, the data model is relatively simple and doesn't require any joins. If that is the case, why not optimize the tables and indexes for pagination? SQL Server and LINQ to SQL will handle pagination over thousands of records transparently and with ease.
You are correct in stating that displaying too many records at once is prohibitive for the GUI and it is also prohibitive for the user. No user will want to see more records than are filling the screen at any given time. Given the constraint that the data doesn't need to be refreshed until requested by the user, it should be safe to assume that the number of queries will be relatively low. The additional constraint that the DB is on the same box as the application further solidifies the point that you don't need caching. SQL server already does caching internally.
All advice about performance tuning states that you should profile and measure performance before attempting optimizations. As stated by Donald Knuth, premature optimization is the root of all evil.
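For reference, a minimal sketch of LINQ to SQL paging via Skip/Take; MyDataContext and its Jobs table are hypothetical stand-ins for the actual model:

using System.Collections.Generic;
using System.Linq;

public static class JobPager
{
    // Skip/Take is translated by LINQ to SQL into a single server-side
    // paged query, so only one page of rows ever crosses the wire.
    public static List<Job> GetPage(MyDataContext db, string status,
                                    int pageIndex, int pageSize)
    {
        return db.Jobs
                 .Where(j => j.Status == status)
                 .OrderBy(j => j.JobID)   // paging needs a stable order
                 .Skip(pageIndex * pageSize)
                 .Take(pageSize)
                 .ToList();
    }
}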

Live Analytic Data

I'm planning on creating a live analytics page for my website, a bit like Google Analytics but with real live data that changes as new users load a page on my site, etc.
The main site is/will be written using ASP.NET/C# as the back end with an MS SQL database, and the front end will support things like JavaScript (jQuery), CSS3, and HTML5 (if required).
I was wondering what methods I can use to build the live analytics page in terms of: how to get the data onto the page, what efficient graphing I can use, and how to store the data with fast input/output.
The first thing that came to mind is Node.js. Could I use it to achieve a live analytics page? Is it a good idea? Are there better alternatives? Any drawbacks?
Would I need a C# application running on a server to use Node.js to send/receive all the data to and from the website?
Would an MS SQL database be fast enough? Would I need to store all the data live, or could I store it in chunks every x seconds/minutes? (Which would be more efficient?)
(A diagram illustrating my initial thoughts accompanied the original question.)
Edit:
I'm going to be using this system across multiple sites; I could be getting anywhere from 10 hits at a time to around 1,000,000 (highly unlikely, but still possible). I want to be able to scale this system and adapt it to the environment it's in.
It really depends on how "real time" the realtime data needs to be. For example, I made this recently:
http://www.reed.co.uk/labs/realtime/
Which shows job applications coming into the system. Obviously there is way too much going on during busy periods to actually query the main database in real time, so what we do is query a sliding "window" and cache it on the server: a chunk of the last 5 minutes' worth of events.
We then play this back to the user as if it's happening "now". Accepting a little latency as part of the SLA (where the users don't really care) can make the whole system vastly more scalable.
[EDIT- further explanation]
The data is just retrieved from a basic stored procedure call. Naturally, a big system like Reed has hundreds of transactions per second, so we can't keep hitting the main cluster for every user.
All we do is make sure we have a current window, in this case the last 5 minutes of data, cached on the server. When a client comes to the site, we get that last 5 minutes of data and play it back like it's happening right now. The end user is none the wiser, but it means all clients are reading off the cache. Once the cache is 5 minutes old, we invalidate it and start again. That means a maximum of one DB hit every five minutes, making the system vastly more scalable (not that it really needs to be, as it's just for fun, really).
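A minimal sketch of that window cache, assuming a hypothetical GetLastFiveMinutesFromDb stored procedure call and using .NET 4's built-in MemoryCache with absolute expiration:

using System;
using System.Collections.Generic;
using System.Runtime.Caching;

public static class EventWindowCache
{
    private const string CacheKey = "last5min";

    // All clients read the same cached window; at most one DB hit
    // per five minutes, regardless of how many users are connected.
    public static List<AppEvent> GetWindow()
    {
        ObjectCache cache = MemoryCache.Default;
        List<AppEvent> window = cache[CacheKey] as List<AppEvent>;
        if (window == null)
        {
            window = GetLastFiveMinutesFromDb();   // hypothetical SP call
            var policy = new CacheItemPolicy
            {
                AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(5)
            };
            cache.Set(CacheKey, window, policy);
        }
        return window;
    }

    private static List<AppEvent> GetLastFiveMinutesFromDb()
    {
        return new List<AppEvent>();               // stub for the sketch
    }
}

public class AppEvent { /* timestamp, event type, etc. */ }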
Just so you are aware, Google Analytics already offers live user tracking. Inside the dashboard of a site on Google Analytics, click the Home button on the top bar and then the Real-Time button on the left bar. Considering the design work and quality of that service, it may be a better option than attempting to recreate it. If you do choose to build your own, you can at least use their service as a benchmark for the desired features.
Using APIs like the Google Charts API (https://developers.google.com/chart/) would be a good approach to displaying your stored data with reduced development time. If you provide more information on the number of hits you expect and the scale of the server this software will be hosted on, it will be easier to answer the speed questions.
