Currently we are using Session to store DataTables in our pages so that we don't have to hit the database to fetch the same DataTable again and again. My worry is that this uses server memory, and if a large number of users log in some day, the server's response will slow down and our application might even crash.
Please tell me: is it a good idea to store DataTables in Session, or should we fetch them from the database every time?
As a general rule of thumb I would say don't use session; I haven't had to use it for a long time. As soon as you move into a web-farm situation, session either gets a lot slower, a lot more complicated, or both.
Whether you will get away with it or not really depends on how much data you are storing in session, and how many users will be active within the session timeout period.
There are a lot of caching and in memory database options available today that may be a better option. Finally, while the solution as described sounds questionable, I wouldn't optimize the existing solution until you have actually measured a problem.
This depends on what is being stored in the DataTables. In any case, I would use the ASP.NET Cache to store them, for the following reasons:
Cache entries have an expiry, which means they can be removed automatically based on a sliding or absolute expiration time.
Cache entries will automatically be evicted if the process comes under memory pressure.
You can make a cached item specific to one user, or global to all users, based on its key.
for example:
// personalized cache item
string personalCacheKey = string.Format("MyDataTable_{0}", (int)Session["UserID"]);
DataTable myPersonalDataTable = (DataTable)Cache[personalCacheKey];
if (myPersonalDataTable == null)
{
    myPersonalDataTable = database.dosomething();
    Cache.Insert(personalCacheKey, myPersonalDataTable, null, Cache.NoAbsoluteExpiration, new TimeSpan(0, 30, 0)); // 30 minutes
}
// global (non user specific) cached item
string globalCacheKey = "MyDataTable";
DataTable globalDataTable = (DataTable)Cache[globalCacheKey];
if (globalDataTable == null)
{
    globalDataTable = database.dosomething();
    Cache.Insert(globalCacheKey, globalDataTable, null, Cache.NoAbsoluteExpiration, new TimeSpan(0, 30, 0)); // 30 minutes (again)
}
The issue that you have now, however, is what happens if the underlying data gets updated, and whether it is acceptable for your application to present "old" cached data. If it is not acceptable, you will have to forcibly remove an item from the cache, and there are a few mechanisms for that.
You can set up a SqlCacheDependency (which I have never personally used), or you can just clear out the cached object yourself using Cache.Remove(cacheKey).
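For the manual route, a minimal sketch (reusing the cache keys from the snippets above) might look like this:
// After writing new data to the database, evict the stale cached copies so
// the next read repopulates the cache from the database.
Cache.Remove("MyDataTable");                                             // the global entry
Cache.Remove(string.Format("MyDataTable_{0}", (int)Session["UserID"])); // the per-user entry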
It is preferable to store "commonly used" data in memory; that's good logic. However, "Session" state exists for the life of that session, and hence for that user. Secondly, depending on how long the user's session lives, as you already said, this could consume valuable resources on the server side.
What you may want to consider instead is the "Cache" object, as it serves the same purpose but with expiration.
DataTable users = new DataTable();
if (Cache["users"] == null)
{
    users = getUsers(customer); // load from the database on a cache miss
    Cache.Add("users", users, null, System.Web.Caching.Cache.NoAbsoluteExpiration, new TimeSpan(0, 60, 0), System.Web.Caching.CacheItemPriority.Default, null);
}
else
{
    users = (DataTable)Cache["users"];
}
There are several places in .NET where you can keep data around for re-use:
(1) ViewState
(2) Cache
(3) Session
(4) Cookies
But I would go for the "Cache" object.
If you can't increase memory on the web server then the obvious answer is to not store it in session state and get it from the database every time.
The problem with this is what impact will it have on your database? Are you just moving the problem from the web server to the database server?
It is much easier to scale out web servers than it is to scale up/out databases (and often cheaper if you're using something like SQL Server).
If your DataTable has a small number of records and does not contain sensitive data, you can use ViewState as well, but the data really should be small: this approach serializes the data and stores it on the client side, then posts it back to the server with every request.
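A minimal sketch of that approach in a Web Forms code-behind (the GetUsersFromDatabase() helper is hypothetical):
private DataTable Users
{
    get
    {
        // Rebuild from the database only if the table isn't already in ViewState.
        if (ViewState["Users"] == null)
            ViewState["Users"] = GetUsersFromDatabase(); // hypothetical DB call
        return (DataTable)ViewState["Users"];
    }
}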
Related
I have a C# .NET Core 2.2 web server process that exposes an API. When a request comes in, the server needs to make its own HTTP request to a database API. Depending on the query, the response from the database can be very large, and in some cases it is large enough that my .NET process crashes with "Memory quota exceeded" in the logs.
The code that sends off the request looks like this:
string endpoint_url = "<database service url>";
var request_body = new StringContent(query, Encoding.UTF8, "<content type>");
request_body.Headers.ContentType.CharSet = "";
try {
    var request_task = Http.client.PostAsync(endpoint_url, request_body);
    if (await Task.WhenAny(request_task, Task.Delay(timeoutSeconds * 1000)) == request_task) {
        request_task.Result.EnsureSuccessStatusCode();
        var response = await request_task.Result.Content.ReadAsStringAsync();
        JObject json_result = JObject.Parse(response);
        if (json_result["errors"] is null) {
            return json_result;
        } else {
            // return error
        }
    } else {
        // return timeout error
    }
} catch (Exception e) {
    // return error
}
My question is: what is the best way to protect my web service from going down when a query returns a large response like this? The .NET Core best practices suggest that I shouldn't be loading the response body into a string wholesale, but they don't really suggest an alternative.
I want to fail gracefully and return an error to the client rather than causing an outage of the .NET service, so setting some kind of limit on the response size would work. Unfortunately the database service in question does not return a Content-Length header, so I can't just check that.
My web server currently has 512 MB of memory available, which I know is not much, but I'm concerned that this error could happen for a large response regardless of the amount of memory I have available. My main concern is guaranteeing that my .NET service won't crash regardless of the size of the response from the database service.
If Http.client is an HttpClient, you can restrict the maximum amount of data it will read before aborting the operation and throwing an exception via its MaxResponseContentBufferSize property. By default it's set to 2 GB, which explains why your server goes away when it only has 512 MB of RAM, so you can set it to something like 10-20 MB and handle the exception when the limit is exceeded.
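A rough sketch of what that could look like with the code from the question, simplified to omit the timeout handling (it assumes Http.client is a shared HttpClient, and the property must be set before the first request is sent on that instance):
// Cap buffered responses at ~20 MB; anything larger throws HttpRequestException.
Http.client.MaxResponseContentBufferSize = 20 * 1024 * 1024;

try {
    var response = await Http.client.PostAsync(endpoint_url, request_body);
    response.EnsureSuccessStatusCode();
    var body = await response.Content.ReadAsStringAsync();
    return JObject.Parse(body);
} catch (HttpRequestException) {
    // Thrown when the response exceeds MaxResponseContentBufferSize
    // (and for other HTTP-level failures) -- return an error to the caller here.
}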
The simplest approach you could use is to make the decision based on the returned row count.
If you are using ExecuteReader then it will not tell you the number of rows up front, but you can overcome this limitation by simply returning two result sets. The first result set would have a single row with a single column that tells you the row count, and based on that you can decide whether or not to call NextResult and process the requested data.
If you are using stored procedures, you can use an output parameter to report the retrieved row count, populated from either the @@ROWCOUNT variable or the ROWCOUNT_BIG() function. Yet again, you can branch on that value.
The pro side of these solutions is that you don't have to read any record if it would outgrow your available space.
The con side of these solutions is that determining the threshold could be hard, because it could depend on the query itself, on one (or more) parameter(s) of it, on the table size, etc.
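A minimal sketch of the two-result-set variant (the SQL batch, connection string, identifiers, and threshold are all illustrative, and System.Data.SqlClient is assumed):
// The batch returns the count first, then the data:
//   SELECT COUNT(*) FROM dbo.Results WHERE QueryId = @id;
//   SELECT * FROM dbo.Results WHERE QueryId = @id;
const int maxRows = 10000; // tune to what your memory budget allows

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(batchSql, connection))
{
    command.Parameters.AddWithValue("@id", queryId);
    connection.Open();

    using (var reader = command.ExecuteReader())
    {
        reader.Read();
        int rowCount = reader.GetInt32(0);

        if (rowCount > maxRows)
        {
            // Too large: return an error without ever reading the data rows.
            throw new InvalidOperationException("Result set too large.");
        }

        reader.NextResult(); // advance to the actual data
        while (reader.Read())
        {
            // process each row
        }
    }
}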
Well, you definitely shouldn't be creating an unbounded string that could be larger than your heap size, but it's more complicated than just that advice. As others are pointing out, the entire system needs to work together to be able to return large results with a limited memory footprint.
The simplest answer to your direct question - how can I send back an error if the response won't fit in memory - would be to create a buffer of some limited "max" size and read only that much data from the response. If it doesn't fit in your buffer then it's too large and you can return an error.
But in general that's a poor design because the "max" is impossible to statically derive - it depends on server load.
The better answer is to avoid buffering the entire result before sending it to the client and instead stream the results to the client - read in a buffer full of data and write out that buffer - or some processed form of that buffer - to the client. But that requires some synergy between the back-end API, your service and possibly the client.
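As an illustration only, a streaming pass-through in an ASP.NET Core controller might look roughly like the sketch below; it assumes the client can consume the raw database response, which may not hold if you need to parse the whole JSON object, as the next paragraph notes.
// Ask for the response headers only, so the body is never buffered in full.
var upstreamRequest = new HttpRequestMessage(HttpMethod.Post, endpoint_url) { Content = request_body };
using (var upstream = await Http.client.SendAsync(upstreamRequest, HttpCompletionOption.ResponseHeadersRead))
{
    upstream.EnsureSuccessStatusCode();
    Response.ContentType = "application/json";

    // Copy the upstream body to the client in chunks instead of one big string.
    using (var upstreamBody = await upstream.Content.ReadAsStreamAsync())
    {
        await upstreamBody.CopyToAsync(Response.Body);
    }
}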
If your service has to parse a complete object - as you're showing with JObject.Parse - then you'll likely need to re-think your design in general.
Just want to double check something. I have the following code:
if (HttpContext.Current.Cache["DataTable"] == null)
{
    Cache.Insert("DataTable", DtMaster, null, DateTime.Now.AddMinutes(2),
        System.Web.Caching.Cache.NoSlidingExpiration);
}
Say user A logs in and creates a DataTable containing 3 rows; if user B then logged on from a completely different machine, would they also see 3 rows?
I guess I'm asking: do items stored in the cache become available to all users?
Thanks.
Yes.
There is one instance of the Cache class per application domain. As a result, the Cache object that is returned by the Cache property is the Cache object for all requests in the application domain.
HttpContext.Cache
Yes.
It will work for multiple users. Please don't forget to clear the cache whenever a change happens in the database that affects the records in this DataTable.
I want to build a grid. I get 1000 rows of data from SQL Server through WCF and initially show the first 10 rows in the view; as the user scrolls, the view asks the controller for rows 10-20, then 20-30, and so on up to rows 990-1000. But I must go to SQL Server through WCF only once for all 1000 rows (I cannot hit SQL Server for every page, e.g. 0-10, 10-20, 20-30), and since the view only shows 10 rows at a time, my problem is where to keep the other 990 rows on the controller side.
How do I keep those 990 rows of data available to the controller between requests?
You can make use of caching for this.
Either use System.Web.Caching,
or use MemoryCache.
Depending on your setup, you might also be able to use OutputCache:
[OutputCache(Duration = 10, VaryByParam = "none")]
public ActionResult Result()
{
    return Data();
}
See http://www.asp.net/mvc/overview/older-versions-1/controllers-and-routing/improving-performance-with-output-caching-cs for more around this.
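As a rough sketch of the MemoryCache route (the WCF client call, cache key, and row type are illustrative; System.Runtime.Caching and System.Linq are assumed):
// Fetch all 1000 rows from WCF once, keep them in an in-memory cache,
// and serve each scroll request as a slice of the cached list.
public ActionResult GetRows(int start, int count)
{
    var cache = MemoryCache.Default;
    var rows = cache.Get("AllRows") as List<RowDto>;

    if (rows == null)
    {
        rows = wcfClient.GetAllRows(); // hypothetical WCF call returning all 1000 rows
        cache.Set("AllRows", rows,
            new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(10) });
    }

    return Json(rows.Skip(start).Take(count), JsonRequestBehavior.AllowGet);
}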
Your description is quite confusing; sorry if I misunderstood your requirement.
If it involves 1000+ rows of data, session is not a good option, especially if your program uses session for other things as well.
Since you are using MVC, you can take advantage of options such as ViewData and TempData. You can read more about them here.
I have used TempData before and it can handle a large amount of data (I did not count how much, but it was quite large), so it should be a much better option than session.
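For illustration, TempData usage might look like the sketch below; note that by default a TempData entry only survives until the next request that reads it, so Peek()/Keep() is needed to hold it across several scroll requests (the WCF client and row type are hypothetical):
// First request: load everything once and stash it.
public ActionResult Index()
{
    List<RowDto> rows = wcfClient.GetAllRows(); // hypothetical WCF call
    TempData["AllRows"] = rows;
    return View(rows.Take(10));
}

// Subsequent scroll requests: read a slice without consuming the entry.
public ActionResult GetRows(int start, int count)
{
    var rows = TempData.Peek("AllRows") as List<RowDto>; // Peek does not mark it for deletion
    if (rows == null)
        return new HttpStatusCodeResult(410); // data gone; a reload is required
    return Json(rows.Skip(start).Take(count), JsonRequestBehavior.AllowGet);
}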
I am working in C#, and am trying to use DirectorySearch to query the groups of an extremely large Microsoft ActiveDirectory LDAP server.
So, in my application, I'm going to have a paged list of groups, with searching capability. Naturally, I don't want to hammer my LDAP server with passing me the entire result set for these queries every time I hit "Next Page".
Is there a way, using DirectorySearch, to retrieve ONLY a single arbitrary page's results, rather than returning the entire result-set in one method call?
Similar questions:
DirectorySearch.PageSize = 2 doesn't work
c# Active Directory Services findAll() returns only 1000 entries
Many questions like these exist, where someone asks about paging (meaning from LDAP server to app server), and gets responses involving PageSize and SizeLimit. However, those properties only affect paging between the C# server and the LDAP server, and in the end, the only relevant methods that DirectorySearch has are FindOne() and FindAll().
What I'm looking for is basically "FindPaged(pageSize, pageNumber)" (the pageNumber being the really important bit). I don't just want the first 1000 results; I want (for example) the 100th set of 1000 results. The app can't wait for 100,000 records to be passed from the LDAP server to the app server, even if they are broken up into 1,000-record chunks.
I understand that DirectoryServices.Protocols has SearchRequest, which (I think?) allows you to use a PageResultRequestControl, which looks like it has what I'm looking for (although it seems the paging information comes back in "cookies", and I'm not sure how I'd retrieve them). But if there's a way to do this without rewriting the entire thing to use Protocols instead, I'd rather not have to do so.
I just can't imagine there's no way to do this... Even SQL has Row_Number.
UPDATE:
The PageResultRequestControl does not help - It's forward-only and sequential (You must call and get the first N results before you can get the "cookie" token necessary to make a call to get result N+1).
However, the cookie does appear to have some sort of reproducible ordering... On a result set I was working on, I iterated one by one through the results, and each time the cookie came out thusly:
1: {8, 0, 0, 0}
2: {11, 0, 0, 0}
3: {12, 0, 0, 0}
4: {16, 0, 0, 0}
When I iterated through two by two, I got the same numbers (11, 16).
This makes me think that if I could figure out the code of how those numbers are generated, I could create a cookie ad-hoc, which would give me exactly the paging I'm looking for.
The PageResultRequestControl is indeed the way to do this; it's part of the LDAP protocol. You'll just have to figure out what that implies for your code, sorry. There should be a way to use it from where you are, but, having said that, I'm working in Java and I've just had to write a dozen or so request controls and extended-operation classes for use with JNDI, so you might be out of luck ... or you might have to do as I did. Warning: ASN.1 parsing follows not that far behind :-|
Sadly, it appears there may not be a way to do this given current C# libraries.
All of the standard C# 4.0 LDAP libraries return top-N results (as in FindAll(), which returns every result; FindOne(), which returns the first result; or SearchResult with PageResultRequestControl, which returns results N through N+M but requires you to retrieve results 1 through N-1 before you'll have a cookie token that you can pass with the request in order to get the next set).
I haven't been able to find any third-party LDAP libraries that allow this, either.
Unless a better solution is found, my path forward will be to modify the interface to instead display the top X results, with no client paging capabilities (obviously still using server-side paging as appropriate).
I may pursue a forward-only paging system at a later date, by passing the updated cookie to the client with the response, and passing it back with a click of a "More Results" type of button.
It might be worth pursuing at a later date, whether or not these cookies can be hand-crafted.
UPDATE:
I spoke with Microsoft Support and confirmed this: there is no way to do dynamic paging with LDAP servers. This is a limitation of LDAP servers themselves.
You can use Protocols and the Paging control (if your LDAP server supports it) to step forward at will, but there is no cross-server (or even cross-version) standard for the cookie, so you can't reasonably craft your own, and there's no guarantee that the cookie can be reused for repeated queries.
A full solution involves using Protocols (with Paging as above) to pull your pageable result set into SQL, whether into a temp table or a permanent storage table, and allow your user to page and sort through THAT result set in the traditional manner. Bear in mind your results won't be precisely up to date, but with some smart cache updating you can minimize that risk.
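For reference, a minimal sketch of that forward-only paging with System.DirectoryServices.Protocols might look like this (host, base DN, filter, and page size are all illustrative; System.Linq is also assumed):
using (var connection = new LdapConnection("ldap.example.com"))
{
    connection.SessionOptions.ProtocolVersion = 3;
    connection.Bind(); // or Bind(new NetworkCredential(...)) as appropriate

    var request = new SearchRequest(
        "DC=example,DC=com", "(objectClass=group)", SearchScope.Subtree, "cn");

    var pageControl = new PageResultRequestControl(1000);
    request.Controls.Add(pageControl);

    while (true)
    {
        var response = (SearchResponse)connection.SendRequest(request);

        foreach (SearchResultEntry entry in response.Entries)
        {
            // write each group into your SQL staging table here
        }

        // The server returns an opaque cookie; feed it into the next request.
        // An empty cookie means there are no more pages.
        var pageResponse = response.Controls
            .OfType<PageResultResponseControl>()
            .FirstOrDefault();
        if (pageResponse == null || pageResponse.Cookie.Length == 0)
            break;

        pageControl.Cookie = pageResponse.Cookie;
    }
}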
Maybe you want to iterate through your "pages" using the range-attribute accordingly:
----copy & paste----
This sample retrieves entries 0-500, inclusively.
DirectoryEntry group = new DirectoryEntry("LDAP://CN=Sales,DC=Fabrikam,DC=COM");
DirectorySearcher groupMember = new DirectorySearcher
    (group, "(objectClass=*)", new string[] { "member;Range=0-500" }, SearchScope.Base);
SearchResult result = groupMember.FindOne();

// Each entry contains a property name and the path (ADsPath).
// The following code returns the property name from the PropertyCollection.
String propName = String.Empty;
foreach (string s in result.Properties.PropertyNames)
{
    if (s.ToLower() != "adspath")
    {
        propName = s;
        break;
    }
}
foreach (string member in result.Properties[propName])
{
    Console.WriteLine(member);
}
----copy & paste----
For more information see:
Enumerating Members in a Large Group
https://msdn.microsoft.com/en-us/library/ms180907.aspx
Range Retrieval of Attribute Values
https://msdn.microsoft.com/en-us/library/cc223242.aspx
Searching Using Range Retrieval
https://msdn.microsoft.com/en-us/library/aa367017.aspx
In one sentence, what I ultimately need to know is how to share objects between mid-tier functions without requiring the application tier to pass the data model objects.
I'm working on building a mid-tier layer in our current environment for the company I am working for. Currently we are using primarily .NET for programming and have built custom data models around all of our various database systems (ranging from Oracle, OpenLDAP, MSSQL, and others).
I'm running into issues trying to pull our model from the application tier and move it into a series of mid-tier libraries. The main issue I'm running into is that the application tier has the ability to hang on to a cached object throughout the duration of a process and make updates based on the cached data, but the Mid-Tier operations do not.
I'm trying to keep the model objects out of the application as much as possible so that when we make a change to the underlying database structure, we can edit and redeploy the mid-tier easily and multiple applications will not need to be rebuilt. I'll give a brief update of what the issue is in pseudo-code, since that is what us developers understand best :)
main
{
    MidTierServices.UpdateCustomerName("testaccount", "John", "Smith");

    // since the data takes up to 4 seconds to be replicated from
    // write server to read server, the function below is going to
    // grab old data that does not contain the first name and last
    // name update.... John Smith will be overwritten w/ previous
    // data
    MidTierServices.UpdateCustomerPassword("testaccount", "jfjfjkeijfej");
}

MidTierServices
{
    void UpdateCustomerName(string username, string first, string last)
    {
        Customer custObj = DataRepository.GetCustomer(username);

        /*******************
        validation checks and business logic go here...
        *******************/

        custObj.FirstName = first;
        custObj.LastName = last;
        DataRepository.Update(custObj);
    }

    void UpdateCustomerPassword(string username, string password)
    {
        // does not contain first and last updates
        Customer custObj = DataRepository.GetCustomer(username);

        /*******************
        validation checks and business logic go here...
        *******************/

        custObj.Password = password;

        // overwrites changes made by other functions since data is stale
        DataRepository.Update(custObj);
    }
}
On a side note, the options I've considered are: building a home-grown caching layer, which takes a lot of time and is a very difficult concept to sell to management; or using a different modeling layer that has built-in caching support, such as NHibernate. The latter would also be hard to sell to management, because it would take a very long time to tear apart our entire custom model and replace it with a third-party solution. Additionally, not a lot of vendors support our large array of databases. For example, .NET has LINQ to ActiveDirectory, but not a LINQ to OpenLDAP.
Anyway, sorry for the novel, but it's more of an enterprise-architecture question, and not a simple code question such as "How do I get the current date and time in .NET?"
Edit
Sorry, I forgot to add some very important information in my original post. I feel very bad because Cheeso went through a lot of trouble to write a very in depth response which would have fixed my issue were there not more to the problem (which I stupidly did not include).
The main reason I'm facing the current issue is in concern to data replication. The first function makes a write to one server and then the next function makes a read from another server which has not received the replicated data yet. So essentially, my code is faster than the data replication process.
I could resolve this by always reading and writing to the same LDAP server, but my admins would probably murder me for that. They specifically set up a server that is only used for writing and then 4 other servers, behind a load balancer, that are only used for reading. I'm in no way an LDAP administrator, so I'm not aware whether that is standard procedure.
You are describing a very common problem.
The normal approach to address it is through the use of Optimistic Concurrency Control.
If that sounds like gobbledegook, it's not; it's a pretty simple idea. The concurrency part of the term refers to the fact that there are updates happening to the data-of-record, and those updates are happening concurrently, possibly from many writers. (Your situation is a degenerate case where a single writer is the source of the problem, but it's the same basic idea.) The optimistic part I'll get to in a minute.
The Problem
It's possible when there are multiple writers that the read+write portion of two updates become interleaved. Suppose you have A and B, both of whom read and then update the same row in a database. A reads the database, then B reads the database, then B updates it, then A updates it. If you have a naive approach, then the "last write" will win, and B's writes may be destroyed.
Enter optimistic concurrency. The basic idea is to presume that the update will work, but check. Sort of like the trust but verify approach to arms control from a few years back. The way to do this is to include a field in the database table, which must be also included in the domain object, that provides a way to distinguish one "version" of the db row or domain object from another. The simplest is to use a timestamp field, named lastUpdate, which holds the time of last update. There are other more complex ways to do the consistency check, but timestamp field is good for illustration purposes.
Then, when the writer or updater wants to update the DB, it can only update the row for which the key matches (whatever your key is) and also when the lastUpdate matches. This is the verify part.
Since developers understand code, I'll provide some pseudo-SQL. Suppose you have a blog database, with an index, a headline, and some text for each blog entry. You might retrieve the data for a set of rows (or objects) like this:
SELECT ix, Created, LastUpdated, Headline, Dept FROM blogposts
    WHERE CONVERT(Char(10), Created, 102) = @targdate
This sort of query might retrieve all the blog posts in the database for a given day, or month, or whatever.
With simple optimistic concurrency, you would update a single row using SQL like this:
UPDATE blogposts SET Headline = @NewHeadline, LastUpdated = @NewLastUpdated
    WHERE ix = @ix AND LastUpdated = @PriorLastUpdated
The update can only happen if the index matches (and we presume that's the primary key) and the LastUpdated field is the same as what it was when the data was read. Also note that you must ensure the LastUpdated field is set to a new value on every update to the row.
A more rigorous update might insist that none of the columns had been updated. In this case there's no timestamp at all. Something like this:
UPDATE Table1 SET Col1 = @NewCol1Value,
    Col2 = @NewCol2Value,
    Col3 = @NewCol3Value
WHERE Col1 = @OldCol1Value AND
    Col2 = @OldCol2Value AND
    Col3 = @OldCol3Value
Why is it called "optimistic"?
OCC is used as an alternative to holding database locks, which is a heavy-handed approach to keeping data consistent. A DB lock might prevent anyone from reading or updating the db row, while it is held. This obviously has huge performance implications. So OCC relaxes that, and acts "optimistically", by presuming that when it comes time to update, the data in the table will not have been updated in the meantime. But of course it's not blind optimism - you have to check right before update.
Using Optimistic Concurrency in practice
You said you use .NET. I don't know if you use DataSets for your data access, strongly typed or otherwise. But .NET DataSets, or specifically DataAdapters, include built-in support for OCC. You can specify and hand-code the UpdateCommand for any DataAdapter, and that is where you can insert the consistency checks. This is also possible within the Visual Studio design experience.
If you get a violation, the update will return a result showing that ZERO rows were updated. You can check this in the DataAdapter.RowUpdated event. (Be aware that in the ADO.NET model, there's a different DataAdapter for each sort of database. The link there is for SqlDataAdapter, which works with SQL Server, but you'll need a different DA for different data sources.)
In the RowUpdated event, you can check for the number of rows that have been affected, and then take some action if the count is zero.
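As a rough illustration (assuming a SqlDataAdapter whose UpdateCommand carries the LastUpdated check shown earlier; System.Data and System.Data.SqlClient are assumed, and the adapter name is illustrative):
adapter.RowUpdated += (sender, e) =>
{
    // RecordsAffected == 0 means no row matched both the key and the old
    // LastUpdated value: another writer got there first.
    if (e.StatementType == StatementType.Update && e.RecordsAffected == 0)
    {
        e.Status = UpdateStatus.SkipCurrentRow;
        e.Row.RowError = "Concurrency violation: the row was changed by another writer.";
    }
};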
Summary
Verify the contents of the database have not been changed, before writing updates. This is called optimistic concurrency control.
Other links:
MSDN on Optimistic Concurrency Control in ADO.NET
Tutorial on using SQL Timestamps for OCC