IIS 7.0 and above. No load balancer is involved in this setup. The file being requested is a small spacer image, which can be requested synchronously or loaded asynchronously with jQuery. The file itself is not important; it is just a way to get the end user to hit this IIS server for analytics.
I have a requirement to capture the machine name of visitors from IIS logs. The current log already has the client IP address in it. The problem is that IPs are short-lived in our environment, and if I don't resolve an IP to a machine name soon enough, it is not useful. So we need the machine name for a visiting IP determined pretty much in real time.
What is a good approach to go about this? These are the options I found...
1) Enable reverse DNS lookup in IIS -> http://www.expta.com/2010/01/how-to-enable-reverse-dns-lookup-in-iis.html. This affects server performance, and I am worried it will end up holding the user request and make the page load slowly because of the added expense of the reverse lookup operation.
2) Write an IIS logging module that enhances logging by doing a reverse lookup of IPs and writing machine names in the log. >> I'm afraid this will slow the request turnaround time for the end user and affect server performance due to the reverse DNS lookup. This is pretty much me doing point 1 above myself instead of relying on Microsoft's built-in capability. In the end, the real-time reverse DNS lookups will still affect performance.
3) Same as point 1 or 2 above, but I will change the HTML of the page users are hitting so it loads the IIS-hosted image file with an async JavaScript call (as opposed to an inline call). That way the end user doesn't have to wait for this IIS request to complete and the rest of the page (the content that matters to them) can load without depending on the spacer image request. But the browser will still dedicate one thread to the async image load, so it is still a performance hit for the end user.
4) Just use default IIS logging, which logs in real time. Have a separate C# app read the log file every 5 minutes or so, detect the newly added lines, parse them to get the IP, do a reverse lookup to find the machine name, and log it to a database or flat file as needed. The flip side is that I now pretty much need to process the log immediately, because if I don't, the IP might have been assigned to a different machine by the time my application reads the log, finds the entry and does a reverse lookup on it. I also have to deal with the complexity of reading only the log entries added since the previous read, etc.
5) http://www.iis.net/learn/extensions/advanced-logging-module/advanced-logging-for-iis-real-time-logging -> I guess this is the same as point 2 above, except it is written in VC++ instead of C#, so I assume the same disadvantages apply to this method as well.
So every method out there seems to have downsides. What do you think is a good way to go about solving this problem?
Reversing an IP to a machine name is not reliably possible because of the way routing works - many machines can come in via the same IP.
If you have found a way to map an IP to a machine name that is acceptable to you, one approach could be to simply have the site serve the image and do all the necessary discovery in a normal request handler. That way you also have more information about the user (cookies, authentication headers, ...). This approach may be more flexible than configuring IIS logging.
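To make that concrete, here is a minimal sketch of such a handler, assuming the spacer request is routed through managed code. The handler name, the TODO logging call and the inline GIF are illustrative; the reverse lookup is queued to the thread pool so the visitor never waits on DNS.

using System;
using System.Net;
using System.Threading;
using System.Web;

public class SpacerImageHandler : IHttpHandler   // hypothetical handler name
{
    // 1x1 transparent GIF served from memory so no file I/O is needed.
    private static readonly byte[] SpacerGif = Convert.FromBase64String(
        "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7");

    public void ProcessRequest(HttpContext context)
    {
        string clientIp = context.Request.UserHostAddress;

        // Do the expensive reverse lookup off the request thread.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                string machineName = Dns.GetHostEntry(clientIp).HostName;
                // TODO: write clientIp + machineName to your own log/database here.
            }
            catch (Exception)
            {
                // Lookup failed or timed out - nothing useful to record.
            }
        });

        context.Response.ContentType = "image/gif";
        context.Response.OutputStream.Write(SpacerGif, 0, SpacerGif.Length);
    }

    public bool IsReusable { get { return true; } }
}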
Related
I have a C# application acting as an HTTP server which hypothetically can be reached at example1.com, example2.com, etc.
The server does not have this information at startup. Instead, it looks at the "host" field in every HTTP request to learn its "known names" and populates a list, i.e., ('example1.com', 'example2.com', 'localhost')
If the server receives an incorrect or malicious HTTP request with an invalid host field, it will still add the wrong hostname.
I want to check the host field on HTTP requests coming into my server to see if they correspond to the current machine. Is it possible to do this without any additional network requests?
The app would need to test whether it's actually example.com. I don't see any other (reliable) solution. You can't necessarily rely on DNS lookups since the webapp could have a private address.
You can set up a special endpoint for these tests. The flow I imagine is something like this:
The server receives a request for www.example.com/blah.html
It's the first time that the application is asked about www.example.com so to make sure that it really is www.example.com, it generates and stores a large random number, say 123456 and an index, say 5.
The application then sends a challenge to www.example.com/verify_hostname, passing the index as a parameter (i.e. www.example.com/verify_hostname?index=5).
The thread that handles the verification request looks up the stored random number by its index, and responds with 123456.
After receiving a response of 123456, the server now knows that it really is accessible through www.example.com, at least for now.
Of course other variations of this solution are possible.
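For illustration only, a minimal C# sketch of that flow could look like the following; the /verify_hostname path, the WebClient call and the shared dictionary are assumptions rather than an existing API.

using System;
using System.Collections.Concurrent;
using System.Net;
using System.Threading;

public static class HostVerifier
{
    // index -> random challenge value, shared by all request threads.
    private static readonly ConcurrentDictionary<int, long> Challenges =
        new ConcurrentDictionary<int, long>();
    private static int _nextIndex;

    // Called when a Host header is seen for the first time.
    public static bool Verify(string host)
    {
        int index = Interlocked.Increment(ref _nextIndex);
        long challenge = BitConverter.ToInt64(Guid.NewGuid().ToByteArray(), 0);
        Challenges[index] = challenge;   // must be stored before the request below

        using (var client = new WebClient())
        {
            string answer = client.DownloadString(
                "http://" + host + "/verify_hostname?index=" + index);
            return answer.Trim() == challenge.ToString();
        }
    }

    // Handler behind /verify_hostname: looks up the challenge by its index.
    public static string HandleVerificationRequest(int index)
    {
        long value;
        return Challenges.TryGetValue(index, out value) ? value.ToString() : "unknown";
    }
}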
Note that this approach leverages the fact that the application's threads have shared memory to store the random number and index. If the webapp is deployed in a cluster, you'd need to replace this simple authentication scheme. One way to do it would be using a shared secret and challenge-response, or some other cryptographic solution.
One other thing - in the solution I propose there's an inherent race condition between the thread that stores the random number and index and the thread that verifies them. You'd need to make sure that the chosen random number and index are stored before the verification thread tries to read them. Eventually consistent collections won't be enough to guarantee this, so they might fail from time to time.
The current situation: I have written a C# application server which communicates with several applications (computer/smartphone/web). Now I have the problem that the application server has to deal with a lot of requests, and it is becoming very slow.
My idea was to change the application server to work as a software cluster. To select the correct application server, I want to write a load balancer that chooses the application server with the lowest workload.
My problem is that I don't know how to write the load balancer. Should the load balancer work as a proxy, so that all the traffic goes through it, or should it redirect the client to an application server so that the client then communicates with that application server directly?
Actually, there are off-the-shelf products that do exactly what you're looking for. One of the most established is HAProxy, which acts as an HTTP/TCP load balancer / HA proxy. It can select the appropriate server based on previous client requests (e.g. by cookie insertion; it supports other methods too), which I believe does exactly what you need.
Back to the question:
Should the load balancer work as a proxy, so that all the traffic goes through the load balancer or should the load balancer redirect to the application server
A proxy implementation is the normal route to take. Redirecting is not such a good idea; it causes some disturbing issues on the client side, especially in browsers (e.g. bookmarks won't work as intended), and I would say it wouldn't have much gain over using a proxy (aside from removing the load balancer node if balancing is done on the client side).
that I don't know how to write the load balancer
The short answer is that you don't need to write your own; as I said before, there are well-established products in this area. However, if you do want to write your own, the HAProxy Architecture manual and "Writing a load balancer proxy from ground up" would be a good start.
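To give a feel for the proxy shape being discussed, here is a heavily reduced round-robin TCP proxy sketch. The ports and backend addresses are made up, and a real balancer would also need health checks, timeouts and connection limits.

using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

class MiniProxy
{
    // Backend pool - addresses are placeholders.
    static readonly IPEndPoint[] Backends =
    {
        new IPEndPoint(IPAddress.Parse("10.0.0.11"), 9001),
        new IPEndPoint(IPAddress.Parse("10.0.0.12"), 9001),
    };
    static int _next;

    static void Main()
    {
        var listener = new TcpListener(IPAddress.Any, 8080);
        listener.Start();
        while (true)
        {
            TcpClient client = listener.AcceptTcpClient();
            // Simple round-robin selection of a backend.
            int index = (Interlocked.Increment(ref _next) & int.MaxValue) % Backends.Length;
            IPEndPoint backend = Backends[index];
            Task.Run(() => Relay(client, backend));
        }
    }

    static void Relay(TcpClient client, IPEndPoint backend)
    {
        using (client)
        using (var server = new TcpClient())
        {
            server.Connect(backend);
            var clientStream = client.GetStream();
            var serverStream = server.GetStream();
            // Pump bytes in both directions until one side closes.
            Task.WaitAny(clientStream.CopyToAsync(serverStream),
                         serverStream.CopyToAsync(clientStream));
        }
    }
}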
Answering in two parts:
1) You need proxy functionality, not a redirect or a router function. A redirect would reveal the IPs/URLs of your backend server pool to the client, which you certainly do not want; the clients could always bypass your LB once they know the backend IPs. Thus, all the traffic must flow through the proxy.
2) I would not recommend entering the realm of writing a LB. It's a pretty specialized function, and there are many free/commercial baked products that can be deployed for this. You might choose one of HAProxy, Apache HTTPD, Microsoft NLB, or NginX. Each one offers a choice of many load-balancing algorithms that you may want to use.
Redirecting would change the URL for the end-user, which is usually not a good idea.
What you're attempting to do is possible, but very complicated. There are numerous factors that constitute 'workload', including CPU, drive activity (possibly on multiple drives), network activity (possibly on multiple network cards), and software locking. Being able to effectively monitor all of those things is a very large project (I've never even heard of anyone taking locks into account). Entire companies are dedicated to doing stuff like that.
For your situation, I would recommend Microsoft's built-in Network Load Balancing. It does more of a random load balancing, but it gets the job done, and for the vast majority of applications, random distribution of requests results in a fairly even workload.
If that's not sufficient, get a hardware load balancer, or plan on at least two weeks of hardcore coding to properly balance based on CPU, drive activity, and network activity.
There are ready-to-use load balancers such as Apache + mod_cluster.
The configuration can look like this: Apache+mod_cluster -> Tomcat1, Tomcat2, Tomcat3, Tomcat4.
All requests come to Apache+mod_cluster; if a request is not for static content, it is distributed among Tomcat1, Tomcat2, Tomcat3 and Tomcat4.
If the request is for static content, it is handled by Apache alone.
It is possible, and advisable, to configure sticky sessions.
The main advantage of mod_cluster is that the load balancing is done server-side.
Apache + mod_cluster can also handle HTTPS requests.
http://mod-cluster.jboss.org/
I'm working on a Cloud-Hosted ZipFile creation service.
This is a Cross-Origin WebApi2 service used to provide ZipFiles from a file system that cannot host any server side code.
The basic operation goes like this:
User makes a POST request with a string[] of URLs that correspond to file locations
WebApi reads the array into memory, and creates a ticket number
WebApi returns the ticket number to the user
AJAX callback then redirects the user to a web address with the ticket number appended, which returns the zip file in the HttpResponseMessage
In order to handle the ticket system, my design approach was to set up a global Dictionary that paired a randomly generated 10-digit number with a List<String> value, and the dictionary was paired with a Queue storing 10,000 entries at a time. ([Reference here][1])
This is partially due to the fact that WebApi does not support Cache
When I make my AJAX call locally, it works 100% of the time. When I make the call remotely, it works about 20% of the time.
When it fails, this is the error I get:
The given key was not present in the dictionary.
Meaning, the ticket number was not found in the Global Dictionary Object.
We (with the help of Stack) tracked down the issue to multiple servers in the Cloud.
In this case, there are three.
That doesn't mean there is a one-in-three chance of this working; what seems to be going on is this:
Calls made while the browser is on the cloud site work 100% of the time because the same machine handles the whole operation end-to-end
Calls made from other sites work far less often because there is no continuity between the machine who takes the AJAX call, and the machine who takes the subsequent REDIRECT to the website to download the file. It's simple luck of the draw if the same machine handles both.
Now, I'm sure we could create a database to handle requests, but that seems like a lot more work to maintain state among these machines.
Is there any non-database way for these machines to maintain the same dictionary across all sessions that doesn't involve setting up a fourth machine just to handle the queue?
Is the reason for the dictionary simply to have a queue of operations?
It seems you either need:
A third machine that hosts the queue (despite your objection). If you're using Azure, an obvious choice might be the distributed Azure Cache Service.
To forget about the dictionary and just have the server package and deliver the requested result, perhaps in an asynchronous operation.
If your ASP.NET web app uses session state, you will need to configure an external session state provider (either the Redis Cache Service or a SQL Server session state provider).
There's a step-by-step guide here.
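If you do move the tickets into a shared store, a sketch along these lines with StackExchange.Redis could replace the in-memory dictionary; the connection string, the "zip:" key prefix, the 15-minute TTL and the GUID-based ticket are all assumptions standing in for your 10-digit ticket scheme.

using System;
using System.Collections.Generic;
using StackExchange.Redis;

public class TicketStore
{
    // One multiplexer shared by the whole app; the connection string is a placeholder.
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("mycache.redis.cache.windows.net,password=...");

    public string CreateTicket(IEnumerable<string> urls)
    {
        string ticket = Guid.NewGuid().ToString("N").Substring(0, 10);
        IDatabase db = Redis.GetDatabase();
        // Store the URL list as one string; let abandoned tickets expire on their own.
        db.StringSet("zip:" + ticket, string.Join("\n", urls), TimeSpan.FromMinutes(15));
        return ticket;
    }

    public string[] ResolveTicket(string ticket)
    {
        RedisValue value = Redis.GetDatabase().StringGet("zip:" + ticket);
        return value.IsNullOrEmpty ? null : ((string)value).Split('\n');
    }
}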
I'm having an issue sending large volumes of emails out from an ASP.Net application. I won't post the code, but instead explain what's going on. The code should send emails to 4000 recipients but seems to stall at 385/387.
The code creates the content for the email in a string.
It then selects a list of email address to send to.
Looping through the data via a datareader it picks out the email address and sends an email.
The email sending is done by a separate method which can handle failures and returns its outcome.
As each record is sent I produce an XML node in an XML document to log each specific attempt to send.
The loop seems to end prematurely and the XML document is saved to disk.
Now I know the code works. I have run it locally using the same SMTP machine and it worked fine with 500 records. Granted there was less content, but I can't see how that would make any difference.
I don't think the page itself times out, but even if it did, I was under the impression that .NET would continue processing the page, even if the user saw a page timeout error.
Any suggestions appreciated, because I'm pretty stumped.
You're sending lots of emails. During the span of a single request? IIS will kill a request if it takes longer than a certain (configurable) amount of time.
You need to use a separate process to do stuff like this. Whether that's a Timer you start from within global.asax, or a Thread which checks for a list of emails in a database/app_data directory, or a service you send a request to via WCF, or some combination of these.
The way I've handled this in the past is to queue the emails into a SQL Server table and then launch another thread to actually process/send the emails. Another aspx utility page can give me the status of the queue or restart the processing.
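Roughly, that queue-then-process pattern can look like the sketch below; the table and column names, the connection string, the poll interval and the SendOne() helper are all placeholders.

using System;
using System.Data.SqlClient;
using System.Threading;

public static class MailQueueWorker
{
    public static void Start()
    {
        var worker = new Thread(ProcessQueue) { IsBackground = true };
        worker.Start();
    }

    private static void ProcessQueue()
    {
        while (true)
        {
            using (var conn = new SqlConnection("...connection string..."))
            {
                conn.Open();
                using (var cmd = new SqlCommand(
                    "SELECT TOP 50 Id, Recipient, Body FROM MailQueue WHERE SentOn IS NULL", conn))
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // SendOne() would wrap SmtpClient.Send and record success/failure.
                        SendOne(reader.GetString(1), reader.GetString(2));
                    }
                }
            }
            Thread.Sleep(TimeSpan.FromSeconds(30));   // poll interval
        }
    }

    private static void SendOne(string recipient, string body)
    {
        // Placeholder for the existing send-with-failure-handling method.
    }
}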
I also highly recommend that you use an existing, legitimate, third-party mailing service for your SMTP server if you are sending mail out to the general public. Otherwise you run the risk of your ISP shutting off your mail access or (worse) your own server being blacklisted.
If the web server has a timeout setting, it will kill the page if it runs too long.
I recommend you check the value of HttpServerUtility.ScriptTimeout - if this is set then when a script has run for that length of time, it will be shut down.
Something you could do to help is go completely old-school - combine some Response.Write calls with the occasional Response.Flush to send some data back to the client browser; this tends to keep the script alive (it certainly worked on an old ASP.NET 1.1 site we had).
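As a hedged illustration of both suggestions together, inside the page's code-behind it might look roughly like this; GetRecipients() and SendEmail() stand in for the existing data reader and send method, and the timeout value is only an example.

protected void Page_Load(object sender, EventArgs e)
{
    Server.ScriptTimeout = 3600;   // seconds; raise the per-request limit

    foreach (string recipient in GetRecipients())   // placeholder for the datareader loop
    {
        SendEmail(recipient);                       // placeholder for the existing send method
        Response.Write(" ");                        // trickle a byte back to the client
        Response.Flush();                           // keeps the connection from looking idle
    }
}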
Also, you need to take into account when this script is being run. The server may well have been configured to perform an application reset (by default this happens every 29 hours in IIS); if your server is set to something like 24 hours and this coincides with the time your script is run, you could be seeing that too - although the fact that the script is logging its responses probably rules that out, unless your XML document is badly formed?
All that being said, I'd go with Will's answer of using a separate process (not just a thread hosted by the site), or, as Bryan said, go with a proper mailing service, which will help you with things like bounce-backs, click tracking, reporting, open counts, etc.
I've got a project where I'm hitting a bunch of custom Windows Performance Counters on multiple servers and aggregating them into a database. If a server is down, I want to skip it, and just continue on with my day.
Currently I'm checking whether a server is live by doing a DirectoryInfo on a share that I have to look at later in the process anyway, then checking the .Exists property. This is my current code snippet for testing:
DirectoryInfo di = new DirectoryInfo(machine.Share_Path);
if (!di.Exists)
{
    log.Warn("Could not access " + machine.Name + "! Maybe its down?");
    continue; // Skips to the next server in my loop where this snippet exists.
}
This works, but it's pretty slow. It takes about 68 seconds on average for the di.Exists bit to finish its work, and I ideally need to know within a second whether or not a server is accessible. Pinging also isn't an option, since a server can be pingable but not "live" in our environment.
I'm still kind of fresh to the .NET world, so I'm open to any advice people can offer.
Thanks in advance.
-Weegee
Ping First, Ask Questions Later
Why not ping first, and then do the di.Exists if you get a response?
That would allow you to fail early in the case that the machine is not reachable, and not waste time on machines that are down hard.
I have, in fact, used this method successfully before.
MSDN Ping Documentation
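A small sketch of that ping-first probe, using System.Net.NetworkInformation.Ping with an example one-second timeout:

using System.Net.NetworkInformation;

bool LooksAlive(string host)
{
    try
    {
        using (var ping = new Ping())
        {
            PingReply reply = ping.Send(host, 1000);   // timeout in milliseconds
            return reply.Status == IPStatus.Success;
        }
    }
    catch (PingException)
    {
        return false;   // e.g. the name could not be resolved
    }
}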
Parallelize
Another option you have is to parallelize the checking, and act on the servers as they are found to be available.
You could use the Parallel.ForEach() method, and use a thread-safe queue along with a simple consumer thread to do the required action, as sketched below. Combined with the checking method above, this could alleviate almost all of the bottleneck in the up/down checking.
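A rough sketch of that idea; the Machine type, the machines list and CollectCounters() are assumptions based on the question, LooksAlive() is the probe from the previous sketch, and for brevity the queue is drained after the parallel loop rather than by a concurrent consumer thread.

using System.Collections.Concurrent;
using System.Threading.Tasks;

var reachable = new ConcurrentQueue<Machine>();

// Probe all servers in parallel; only the reachable ones get queued.
Parallel.ForEach(machines, machine =>
{
    if (LooksAlive(machine.Name))
        reachable.Enqueue(machine);
});

// Consume the queue and do the real work against live servers only.
Machine next;
while (reachable.TryDequeue(out next))
{
    CollectCounters(next);   // placeholder for the existing aggregation step
}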
Knock on the Door
Yet another method would be to check whether the required remote service is running (either by hitting its port directly or by querying it via WMI).
Since WMI is almost always running when a machine is up, your connection should be very quick to either succeed or fail.
The only "quick" way I can think of to see if it's up without relying on ping would be to create a socket and see if you can actually connect to the port of the service you're trying to reach.
This would be the equivalent of telnet servername 135 to see if it's up.
Specifically...
Create a .NET TCP socket client (System.Net.Sockets.TcpClient)
Call BeginConnect() as an asynchronous operation, to connect to the server in question on one of the RPC ports that your directory exists code would use anyway (TCP 135, 139, or 445).
If you don't hear back from it within X milliseconds, call Close() to cancel the connection.
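A sketch of that connect-with-timeout check; port 135 and the 500 ms budget are example values.

using System;
using System.Net.Sockets;

bool CanConnect(string host, int port = 135, int timeoutMs = 500)
{
    using (var client = new TcpClient())
    {
        IAsyncResult ar = client.BeginConnect(host, port, null, null);
        if (!ar.AsyncWaitHandle.WaitOne(timeoutMs))
            return false;            // no answer in time; disposing tears the attempt down
        try
        {
            client.EndConnect(ar);   // throws if the connection was actively refused
            return true;
        }
        catch (SocketException)
        {
            return false;
        }
    }
}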
Disclaimer: I have no idea what effect this would have on any threat/firewall protection that may see this type of Connect / Disconnect with no data sent activity as a threat.
Opening a socket to a specific port usually does the trick. If you really want it to be fast, be sure to set the NoDelay property on the new socket (disabling the Nagle algorithm) so there is no buffering.
How fast it is will largely depend on latency, but this is probably the fastest way I know to connect to an endpoint, and it's pretty simple to parallelize using the async methods. How fast you can check will largely depend on your network topology, but in tests against 1000 servers (latency between 0-75 ms) I've been able to get the connectivity state in ~30 seconds. Not scientific data at all, but it should give you the idea.
Also, don't ever do this through UNC file shares, because if the server no longer exists you will have a lot of dangling connections that take forever to time out. So if you have a lot of servers with invalid DNS records and you try to poll them, you will bring Windows down completely over time. Things like File.Exists and any other file access will cause this.
The "full-blown" option would be to install a monitoring tool like SCOM (System Center Operations Manager); it has an SDK you can use to query SCOM for performance and maintenance information about the machines being monitored. That might be a bridge too far, though...
Telnet is another option. Try telnetting to the target machine to see if it responds.
Create a small Windows Service that you install on your target machine, and have the sysadmin stop it when they perform maintenance on the target machine (just use a batch file to net stop / net start the service).