Caching strategy for better performance [duplicate]

Possible Duplicate: Linq performance for in-memory collection
Closed 10 years ago.
I have a web application with around 1 million users. Almost every page in the application calls the GetUser() method (to load the first name in the activity stream and other user details).
Right now I hit the database on every call, and I am considering caching all the users in memory and serving GetUser() and search results from there with LINQ.
My only concern is whether caching all users in memory is a good idea. Would I be wasting RAM?
I personally think fetching from RAM is much faster than fetching from the DB (even when the DB is optimized and indexed).
Note that I have already handled cache invalidation/updating/etc.
Does Stack Overflow cache all of its users?
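For reference, a rough sketch of the "cache everything" idea described in the question: load all users into a ConcurrentDictionary once and serve GetUser() and searches from memory with LINQ. The User type and its fields are placeholders, not the asker's real schema.

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

public class InMemoryUserStore
{
    private readonly ConcurrentDictionary<int, User> _users;

    public InMemoryUserStore(IEnumerable<User> allUsers)
    {
        // One bulk load at startup; invalidation/updating is assumed
        // to be handled elsewhere, as the question states.
        _users = new ConcurrentDictionary<int, User>(
            allUsers.Select(u => new KeyValuePair<int, User>(u.Id, u)));
    }

    // O(1) in-memory lookup instead of a database round trip per page view.
    public User GetUser(int id)
    {
        User user;
        _users.TryGetValue(id, out user);
        return user;
    }

    // LINQ search over the in-memory collection. Note this is a full
    // scan; with ~1M users a secondary index structure may be needed.
    public List<User> SearchByFirstName(string prefix)
    {
        return _users.Values
                     .Where(u => u.FirstName.StartsWith(prefix))
                     .ToList();
    }
}

As a rough back-of-the-envelope check on the RAM question: a million small user objects of a few hundred bytes each lands on the order of a few hundred MB, which is often acceptable on a dedicated web server, but worth measuring before committing.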

We did something similar, but instead of turning to LINQ, we installed a copy of SQL Server Express on each web server. We pushed user-data changes out to each of the web servers, and the local app used a middle tier that only pulled data from the local database periodically (but at least that was local, instead of everyone hitting the central database).
Which technology you use for the caching, and how the app (or LINQ) knows when to refresh its local copy, depends on how stale the cached data is allowed to be.

If GetUser will be returning the same set of users the majority of the time, and most users will rarely be retrieved, you might try a hybrid approach: set up a dictionary (or some other collection), check that collection first, and if the user isn't there, fetch it from the database and store it in the collection (see the sketch below).
With this approach you could also use the Cache, since it already has built-in mechanisms to let entries go stale and clean itself up.
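A minimal sketch of that hybrid (cache-aside) approach, using MemoryCache from System.Runtime.Caching so entries can expire and clean themselves up. It reuses the illustrative User type from the sketch above; the GetUserFromDatabase placeholder and the 30-minute sliding expiration are assumptions.

using System;
using System.Runtime.Caching;

public class UserCacheService
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    public User GetUser(int userId)
    {
        string key = "user:" + userId;

        // 1. Check the cache first.
        var cached = Cache.Get(key) as User;
        if (cached != null)
            return cached;

        // 2. Cache miss: hit the database, then store the result so
        //    later requests for the same user are served from memory.
        User user = GetUserFromDatabase(userId);
        if (user != null)
        {
            Cache.Set(key, user, new CacheItemPolicy
            {
                SlidingExpiration = TimeSpan.FromMinutes(30)
            });
        }
        return user;
    }

    // Placeholder for the real data-access call.
    private User GetUserFromDatabase(int userId)
    {
        throw new NotImplementedException();
    }
}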
Having said this, I worked on a project in the past where we did the same thing for users (we only had about 100 users, though), and all our research and testing found it was faster to go to the database every time.

Related

Retrieve all images at once or individually from SQL Server

I have a number of images stored as VARBINARY(MAX) (using FILESTREAM) in a database. I'm looking to retrieve about 10 images or so at a time.
The prescribed, most common way in ASP.NET is to use an HTTP handler and hit the database for each individual image. That seems fine, but it is a bit slow at times.
Is it best to download all images for a given page at the same time in one big chunk, or should I try to grab each one individually? What's the best practice?
Probably best to serve them individually from a domain that doesn't have cookies set, or make sure your handler works with multiple simultaneous requests. That way you can stream multiple results from the DB at the same time, and stream multiple images from your web server as it receives them.
Well,
I think many people will have different opinions and reasons about what the best practice is for them, because in reality it all depends on the hardware, the software, the data structure, and whether the data is normalized.
In general, SQL Server prefers set-based operations, meaning loops are generally slower. On the other hand, loops can be safer for IOPS-related issues and tend to take fewer locks.
I am not sure which object mapper or built-in SQL library you are using (I have a feeling you may be using LINQ on top of a SQL class you built), but it also depends on the library, and I would definitely recommend Dapper.
I think reading them all at once would be faster, and here is why: if, as you say, you hit the database once per image, each request adds the latency of reconnecting to the database. With one connection, the data retrieval is direct and the connection is already open, without requiring further session authentication.
I would recommend downloading them all at once (see the sketch below) and showing the end user a loading screen during the process. Also, for retrieving data, I believe this link is very helpful: https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
Depending on your SQL Server edition and the features it offers, different options may be available to you.
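Since Dapper was recommended above, here is a hedged sketch of fetching all of a page's images in a single round trip with it; the Images table, its PageId column, and the connection string are assumptions for illustration.

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Dapper;

public class ImageRow
{
    public int ImageId { get; set; }
    public byte[] Data { get; set; }
}

public static class ImageRepository
{
    public static List<ImageRow> GetImagesForPage(string connectionString, int pageId)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            // One connection, one set-based query: all rows come back
            // together instead of one handler request per image.
            return connection.Query<ImageRow>(
                "SELECT ImageId, Data FROM Images WHERE PageId = @PageId",
                new { PageId = pageId }).ToList();
        }
    }
}

The point of the single query is the one made above: one connection and one set-based statement, rather than one reconnect and one authentication handshake per image.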

If data caching is used like a session, would it have better performance?

I am maintaining an ASP.NET application where I found that the previous developers implemented data caching like a session, meaning they store data in the cache per user session, like this:
Public Function GetDataCache(ByVal dataCacheKey As String) As Object
    ' Append the user ID so each user gets a distinct cache entry
    dataCacheKey = dataCacheKey & Convert.ToString(LoginSessionDO.UserID)
    Return Cache(dataCacheKey)
End Function
In this application there are many screens where the user can temporarily add multiple rows of data to a grid; the rows are stored in the cache for that particular user only, until the user presses the Save button to persist the data to the database.
My question: if caching is used like a session, will it give any performance improvement?
I could change it in my dev environment to check performance, but we cannot generate production-like load in our environment, and I cannot change and deploy code to production without some assurance.
Please advise:
Is the caching good the way it's implemented?
Since it's used like a session, would it perform better than Session?
The cache will need to be cleared out; otherwise all items will remain until the app domain recycles. Session has a much shorter expiry and can be explicitly abandoned, on log out for example.
This might become a scaling issue if your site grows. However, the shorter expiry time of the session might cause you issues with saving if the data is no longer there when expected. A staging table in the DB might be a better approach.
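As an illustration of the expiry point above, a hedged C# sketch of giving per-user cache entries a finite sliding lifetime so they don't sit in memory until the app domain recycles; the 20-minute window (mirroring a typical session timeout) is an assumption.

using System;
using System.Web;
using System.Web.Caching;

public static class PerUserCache
{
    public static void Set(string key, int userId, object value)
    {
        HttpContext.Current.Cache.Insert(
            key + userId,                  // per-user key, as in the original code
            value,
            null,                          // no cache dependency
            Cache.NoAbsoluteExpiration,
            TimeSpan.FromMinutes(20));     // evict after 20 idle minutes
    }

    public static object Get(string key, int userId)
    {
        return HttpContext.Current.Cache[key + userId];
    }
}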
An edit from several years after the initial answer.
In reality, it would be much preferable to store the added rows on the client side and then submit them all in one go. Either of the server-side options above runs into issues if the app domain recycles in the middle of a session, and both will cause scaling issues on the server with enough users/data.

Periodic updates to Access DB in C# over VPN

I know this may be considered a generic question, but I honestly don't have the first clue even where to start. I've tried searching, and have not found any results that fit the application.
I'm trying to develop a front-end for an Access 2010 database that will allow users to add/modify records. Several of the users use a VPN to connect to the DB, and the current model we are using of an Access 2010 Navigation Form is horrendously slow, regardless of connection speed. I have verified that we can reach the DB over VPN with no privilege issues or security concerns, but even through the OleDb engine there is significant latency on the data access.
What I would like to do is to be able to have updates sent/received in a background process, say every 5-10 minutes, so that the end user will be able to update it as they need to, and have the changes written without the user really being aware of the latency. Would simply using a background worker suffice to do this, or is there a better way to send "packets" of updates over the connection?
Again, I know this is not code-specific exactly, but I've never worked with C# and DB updates before, so I'm learning as I go. Nearly all the results I've found deal with engines other than OleDb, such as SQL Server, but we are locked into using Access (an .accdb file) since we don't have any other database engines available to us. I appreciate any and all help, in whatever form it comes.
This is a new enough project that so far the only code that I've developed for this has consisted of initializing the connection to the DB to verify that it's even possible.
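For what it's worth, a minimal sketch of the timer-driven background flush the question describes, assuming a thread-safe queue of pending SQL statements, the ACE OLE DB provider, and an illustrative .accdb path; parameterized commands would be preferable in real code.

using System;
using System.Collections.Concurrent;
using System.Data.OleDb;
using System.Timers;

public class BackgroundUpdater
{
    private readonly ConcurrentQueue<string> _pendingUpdates = new ConcurrentQueue<string>();
    private readonly Timer _timer;
    private readonly string _connectionString =
        @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\\server\share\mydb.accdb;";

    public BackgroundUpdater()
    {
        _timer = new Timer(TimeSpan.FromMinutes(5).TotalMilliseconds);
        _timer.Elapsed += (s, e) => FlushUpdates();
        _timer.Start();
    }

    public void QueueUpdate(string sql)
    {
        // UI code calls this; the write happens later, off the UI thread,
        // so the user never waits on the VPN latency.
        _pendingUpdates.Enqueue(sql);
    }

    private void FlushUpdates()
    {
        if (_pendingUpdates.IsEmpty) return;

        // One connection per flush amortizes the slow VPN handshake
        // across the whole batch of queued statements.
        using (var connection = new OleDbConnection(_connectionString))
        {
            connection.Open();
            string sql;
            while (_pendingUpdates.TryDequeue(out sql))
            {
                using (var command = new OleDbCommand(sql, connection))
                {
                    command.ExecuteNonQuery();
                }
            }
        }
    }
}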
MS Access is not designed for concurrency. When you open an Access DB on a remote machine you are also downloading the entire database into the client machine's memory, which is why the bigger the file, the slower it becomes. If you want it to be concurrent, use SQL Server Express and link it to your MS Access application; that way it is much faster and better.
Besides, it's free, and it can be scaled up if you need it.

Mid-tier caching for Windows Forms Application

I have a simple Windows Forms application written in C# 4.0. The application shows some of the records from a database and features a query option initiated by the user.
The records in the database can be called jobs. Consider two columns, JobID and Status.
These are updated by two background services that effectively work as producer and consumer; the status of each job is updated by these services as they run.
The user has the option to query records from the database, for example by status (Submitted, Processing, Completed). This can return thousands of records, and the GUI might hit performance glitches displaying that much data.
Hence, it's important to display the query results in pages. The GUI isn't refreshed until the user manually refreshes or makes a new query.
Since the jobs are constantly updated by the services, a job's status can be different at any point in time. The basic requirement is that each page shows the data as it was at the time it was fetched from the DB.
I am using LINQ to SQL to fetch data from the DB. It's quite easy to use, but it lacks the kind of mid-level caching required to meet this demand. Using process memory to cache the results can push memory usage to the extreme if the number of records is very high, and unfortunately LINQ doesn't provide any mid-tier caching facilities with its DataContext objects.
What is the preferable way to implement a paging mechanism with C# 4.0 + SQL Server on Windows?
Some alternatives I'm considering: a duplicate table/DB that can temporarily store the results as a cache, or the Enterprise Library's Caching Application Block. I believe this is a typical problem faced by many developers. What is the most efficient way to solve it? (Note: my application and DB run on the same box.)
While caching is a sure way to improve performance, implementing a caching strategy properly can be more difficult than it seems. The problem is managing cache expiration, or essentially ensuring that the cache stays synchronized to the desired degree. So before considering caching, consider whether you need it in the first place. From what I can gather from the question, the data model is relatively simple and doesn't require any joins. If that is the case, why not optimize the tables and indexes for pagination? SQL Server and LINQ to SQL will handle pagination over thousands of records transparently and with ease, as in the sketch below.
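A hedged sketch of what that pagination looks like with LINQ to SQL: Skip/Take are translated into a paged SQL query on the server, and ToList() snapshots the page at fetch time, which also satisfies the requirement that a page not change until the user refreshes. The Job type and column names are taken from the question; the IQueryable would typically be the Jobs table of a generated DataContext.

using System.Collections.Generic;
using System.Linq;

public class Job
{
    public int JobID { get; set; }
    public string Status { get; set; }
}

public static class JobPager
{
    // 'jobs' would typically be dataContext.Jobs from a generated
    // LINQ to SQL DataContext; Skip/Take are translated into a paged
    // SQL query rather than filtering in memory.
    public static List<Job> GetPage(IQueryable<Job> jobs, string status,
                                    int pageIndex, int pageSize)
    {
        return jobs
            .Where(j => j.Status == status)
            .OrderBy(j => j.JobID)           // a stable order is required for paging
            .Skip(pageIndex * pageSize)
            .Take(pageSize)
            .ToList();                       // snapshot the page at fetch time
    }
}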
You are correct that displaying too many records at once is prohibitive for the GUI, and it is also prohibitive for the user: no user wants to see more records than fit on the screen at any given time. Given the constraint that the data doesn't need refreshing until the user requests it, it is safe to assume the number of queries will be relatively low. The additional constraint that the DB is on the same box as the application further supports the point that you don't need caching; SQL Server already caches internally.
All advice about performance tuning says you should profile and measure before attempting optimizations. As Donald Knuth stated, premature optimization is the root of all evil.

Approach for caching data from data logger

Greetings,
I've been working on a C#.NET app that interacts with a data logger. The user can query and obtain logs for a specified time period, and view plots of the data. Typically a new data log is created every minute and stores a measurement for a few parameters. To get meaningful information out of the logger, a reasonable number of logs need to be acquired - data for at least a few days. The hardware interface is a UART to USB module on the device, which restricts transfers to a maximum of about 30 logs/second. This becomes quite slow when reading in the data acquired over a number of days/weeks.
What I would like to do is improve the perceived performance for the user. I realize that with the hardware speed limitation the user will have to wait for the full download cycle at least the first time they acquire a larger set of data. My goal is to cache all data seen by the app, so that it can be obtained faster if ever requested again. The approach I have been considering is to use a light database, like SqlServerCe, that can store the data logs as they are received. I am then hoping to first search the cache prior to querying a device for logs. The cache would be updated with any logs obtained by the request that were not already cached.
Finally my question - would you consider this to be a good approach? Are there any better alternatives you can think of? I've tried to search SO and Google for reinforcement of the idea, but I mostly run into discussions of web request/content caching.
Thanks for any feedback!
Seems like a very reasonable approach. Personally I'd go with SQL CE for storage; make sure you index the column holding the record's datetime, then use TableDirect on that index for getting and inserting data so it's blazing fast. Since your data is already chronological, there's no need to involve the (slower) SQL query processor: just seek to the date (or the end) and roll forward with a SqlCeResultSet, as in the sketch below. You'll end up being limited only by I/O. I profiled doing really, really similar stuff on a project and found TableDirect with SQL CE was just as fast as a flat binary file.
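A hedged sketch of that TableDirect pattern with System.Data.SqlServerCe; the Logs table, the IX_Logs_Timestamp index name, and the column ordinals are assumptions for illustration.

using System;
using System.Data;
using System.Data.SqlServerCe;

public static class LogCache
{
    public static void ReadLogsSince(string connectionString, DateTime from)
    {
        using (var connection = new SqlCeConnection(connectionString))
        {
            connection.Open();
            using (var command = new SqlCeCommand("Logs", connection))
            {
                command.CommandType = CommandType.TableDirect;
                command.IndexName = "IX_Logs_Timestamp"; // index on the datetime column

                using (SqlCeResultSet rs = command.ExecuteResultSet(ResultSetOptions.Scrollable))
                {
                    // Seek straight to the first log at or after 'from',
                    // then roll forward; no query processor involved.
                    if (rs.Seek(DbSeekOptions.FirstEqualOrGreater, from))
                    {
                        while (rs.Read())
                        {
                            DateTime timestamp = rs.GetDateTime(0);
                            double value = rs.GetDouble(1);
                            // ... hand the row to the plotting/cache layer
                        }
                    }
                }
            }
        }
    }
}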
I think you're on the right track wanting to store it locally in some queryable form.
I'd strongly recommend SQLite. There's a .NET class here.
