I'm developing an application that manages devices on the network. At a certain point in the application, I must ping (actually it's not a ping, it's an SNMP get) all computers on the network to check whether each one is one of my managed devices.
My problem is that pinging all computers on the network is very slow (especially because most of them won't respond to my message and will simply time out), so it has to be done asynchronously.
I tried to use the TPL to do this with the following code:
public static void FindDevices(Action<IPAddress> callback)
{
//Returns a list of all host names with a net view command
List<string> hosts = FindHosts();
foreach (string host in hosts)
{
Task.Run(() =>
{
CheckDevice(host, callback);
});
}
}
But it runs VERY slowly. When I paused execution and checked the Threads window, I saw that only one thread was pinging the network, so the tasks were effectively running synchronously.
When I use normal threads it runs a lot faster, but Tasks were supposed to be better. I'd like to know why my Tasks aren't running in parallel.
**EDIT**
Comments asked for code on CheckDevice, so here it goes:
private static void CheckDevice(string host, Action<IPAddress> callback)
{
int commlength, miblength, datatype, datalength, datastart;
string output;
SNMP conn = new SNMP();
IPHostEntry ihe;
try
{
ihe = Dns.Resolve(host);
}
catch (Exception)
{
return;
}
// Send sysLocation SNMP request
byte[] response = conn.get("get", ihe.AddressList[0], "MyDevice", "1.3.6.1.2.1.1.6.0");
if (response[0] != 0xff)
{
// If response, get the community name and MIB lengths
commlength = Convert.ToInt16(response[6]);
miblength = Convert.ToInt16(response[23 + commlength]);
// Extract the MIB data from the SNMP response
datatype = Convert.ToInt16(response[24 + commlength + miblength]);
datalength = Convert.ToInt16(response[25 + commlength + miblength]);
datastart = 26 + commlength + miblength;
output = Encoding.ASCII.GetString(response, datastart, datalength);
if (output.StartsWith("MyDevice"))
{
callback(ihe.AddressList[0]);
}
}
}
Your issue is that you are iterating over an item that is not thread-safe, the List.
If you replace it with a thread-safe collection like ConcurrentBag, you should find the threads will run in parallel.
I was a bit confused as to why this was only running one thread; I believe it is this block of code:
try
{
ihe = Dns.Resolve(host);
}
catch (Exception)
{
return;
}
I think this is throwing exceptions and returning; hence you only see one thread. This also ties into your observation that if you added a sleep it worked correctly.
Remember that when you pass a string you're passing the reference to the string in memory, not the value. Anyway, the ConcurrentBag seems to resolve your issue. This answer might also be relevant
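Another possible explanation, offered here as an assumption rather than a diagnosis: each CheckDevice call blocks on DNS and on the SNMP timeout, and the thread pool behind Task.Run only adds threads gradually when its workers are blocked. A minimal sketch of one way around that is to start each blocking check with TaskCreationOptions.LongRunning, which hints that the task wants its own thread. DeviceScanSketch and CheckDeviceBlocking are hypothetical names, and the check is simulated with a sleep instead of a real SNMP get.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class DeviceScanSketch
{
    // Hypothetical stand-in for the blocking SNMP check: it blocks for a while
    // and treats hosts whose name starts with "dev" as managed devices.
    public static bool CheckDeviceBlocking(string host)
    {
        Thread.Sleep(50); // simulates waiting for a reply or a timeout
        return host.StartsWith("dev", StringComparison.Ordinal);
    }

    // Starts one LongRunning task per host, waits for all of them,
    // and returns the hosts that passed the check in sorted order.
    public static List<string> FindDevices(IEnumerable<string> hosts)
    {
        var found = new ConcurrentBag<string>();
        Task[] tasks = hosts.Select(host => Task.Factory.StartNew(() =>
        {
            if (CheckDeviceBlocking(host)) found.Add(host);
        }, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default)).ToArray();

        Task.WaitAll(tasks);
        return found.OrderBy(h => h, StringComparer.Ordinal).ToList();
    }
}
```

With several hundred hosts, one thread per host may be too many; capping the number of concurrent checks would be a reasonable refinement.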
Edit: Keeping the original question for continuity.
I then edited the question with replacement code for the ReadLine() method, using ReadExisting instead. It works; however, I still have the same freeze, where the app becomes unresponsive. The debugger says it's stuck (it takes a while to freeze, sometimes seconds, sometimes minutes) in the while () {} loop where I wait for the complete message. More explanations below:
-- obsolete --
What is a good way to handle serialport.readtimeout exception?
try
{
serialPort1.Write(Command_);
if (!IsWriteComm_)
{
Response_ = serialPort1.ReadLine().Replace("\r", "");
}
}
catch (TimeoutException err)
{
DateTime d = DateTime.Now;
rtboxDiag.AppendText("\n" + d.ToString("HH:mm:ss") + ": ");
rtboxDiag.AppendText(err.Message);
if (!serialPort1.IsOpen)
InitConnection();
return Textbox_;
}
This bit of code is executed on a timer tick event.
I was having a weird "crash" of the app with an IO exception:
"The I/O operation has been aborted because of either a thread exit or an application request."
No matter what I do, I am not able to "recover", meaning I am no longer able to poll data from the serial port.
I added this exception catch and it does log the exception. Weirdly enough, the !serialPort1.IsOpen test is false (meaning the port is still open).
What might be a hint is that this error does STOP the timer somehow, which is not something I am doing in code. So I suspect something related to the timer rather than the serial port, but I could be wrong.
Closing the port manually and reconnecting does not fix the problem.
Disconnecting and reconnecting the USB does not fix the problem.
However, closing the app and relaunching it does fix the problem (without even disconnecting the MCU or power cycling the MCU/hardware).
-- /obsolete --
Edit: the problem appears after a few seconds, sometimes minutes, of flawless operation. I cannot reproduce the issue using a serial port terminal polling the data the same way, at the same frequency. It seems the problem is not coming from the hardware itself.
cheers
Edit: I have yet to test the following modification. I'm not sure it will fix this problem (I doubt it), but at least it's an attempt at not using ReadLine(), which from what I've gathered is not good practice.
Anyway, here it is:
try
{
serialPort1.Write(Command_);
if (!IsWriteComm_)
{
while (!SerialRxCplt) ;
Response_ = SerialRxResponse.Replace("\r", "").Replace("\n", "");
SerialRxCplt = false;
//Response_ = serialPort1.ReadLine().Replace("\r", "");
}
}
catch (TimeoutException err)
{
DateTime d = DateTime.Now;
rtboxDiag.AppendText("\n" + d.ToString("HH:mm:ss") + ": ");
rtboxDiag.AppendText(err.Message);
if (!serialPort1.IsOpen)
InitConnection();
return Textbox_;
}
And I have the DataReceived event enabled:
private void serialPort1_DataReceived(object sender, System.IO.Ports.SerialDataReceivedEventArgs e)
{
var serialPort = (System.IO.Ports.SerialPort)sender;
string dataReceived = serialPort.ReadExisting();
ProcessSerialData(dataReceived);
}
And this is how I am processing the data, manually "waiting" for the \n character, which tells me when the data has been fully received.
private void ProcessSerialData(string data)
{
SerialRxBuffer += data;
if (SerialRxBuffer.Contains("\n"))
{
SerialRxCplt = true;
SerialRxResponse = SerialRxBuffer;
SerialRxBuffer = "";
}
else
{
SerialRxCplt = false;
}
}
Any input is welcome.
I have added "stuff" for debugging inside that while loop, and it works fine for a while and then freezes; no error or exception is thrown there. For some reason I have a feeling it's not related to the serial port.
I have even added this:
try
{
serialPort1.Write(Command_);
if (!IsWriteComm_)
{
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
while (!SerialRxCplt || Timer2StopWatchMilli > 5)
{
Timer2StopWatchMilli = stopWatch.Elapsed.TotalMilliseconds;
ExceptionMessage = Timer2StopWatchMilli.ToString();
IsException = true;
}
stopWatch.Stop();
if (!SerialRxCplt)
return Textbox_;
Response_ = SerialRxResponse.Replace("\r", "").Replace("\n", "");
SerialRxCplt = false;
//Response_ = serialPort1.ReadLine().Replace("\r", "");
}
}
The ExceptionMessage and IsException fields help me get an idea of what's happening in that loop. In normal operation it is what you would expect: increments on the order of 0.0x milliseconds, and data is processed correctly. When it freezes, nothing looks abnormal. I initially thought I was somehow getting "stuck" in an infinite loop, but that || Timer2StopWatchMilli > 5 should get me out of it, acting as some sort of timeout.
One extra piece of info: when it freezes, one CPU core is fully loaded. (I have a 6-core CPU, and it shows 16-17% in Task Manager; memory usage is low, < 30 MB.)
Any help is welcome.
I fixed it by clearing the RX/TX and stream buffers after each successful transaction.
I think data was being sent to the PC faster than it could be read, causing data to eventually accumulate in the Rx buffer.
private void SerialPortClearBuffers()
{
serialPort1.DiscardOutBuffer();
serialPort1.DiscardInBuffer();
serialPort1.BaseStream.Flush();
}
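Separately from the buffer clearing, the wait loop in the question may be worth a second look: with the condition !SerialRxCplt || Timer2StopWatchMilli > 5, the loop keeps running once the elapsed time passes 5 ms, so it cannot act as a timeout, which would be consistent with one core being fully loaded. A minimal sketch of a wait that blocks with a real timeout instead of spinning is given here. LineWaiter is a hypothetical helper, not part of SerialPort, and it assumes one response line per request.

```csharp
using System;
using System.Text;
using System.Threading;

// Hypothetical helper: collects chunks until a newline arrives and lets a
// caller wait for the complete line with a timeout instead of spinning.
public sealed class LineWaiter
{
    private readonly object _gate = new object();
    private readonly StringBuilder _buffer = new StringBuilder();
    private readonly ManualResetEventSlim _lineReady = new ManualResetEventSlim(false);
    private string _line;

    // Call this from the DataReceived handler with the text from ReadExisting().
    public void Append(string chunk)
    {
        lock (_gate)
        {
            _buffer.Append(chunk);
            string text = _buffer.ToString();
            int end = text.IndexOf('\n');
            if (end < 0) return; // no complete line yet

            _line = text.Substring(0, end).Replace("\r", "");
            _buffer.Clear();
            _buffer.Append(text.Substring(end + 1)); // keep anything after the newline
            _lineReady.Set();
        }
    }

    // Blocks without burning CPU; returns null if no full line arrived in time.
    public string WaitForLine(TimeSpan timeout)
    {
        if (!_lineReady.Wait(timeout)) return null;
        lock (_gate)
        {
            string line = _line;
            _line = null;
            _lineReady.Reset();
            return line;
        }
    }
}
```

The polling code would then call WaitForLine after Write and treat a null result as a timeout. Blocking inside a UI timer tick still stalls the UI for up to the timeout, so a short value or a background thread is probably advisable.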
I'm having an issue with ZeroMQ, which I believe is because I'm not very familiar with it.
I'm trying to build a very simple service where multiple clients connect to a server and send a query. The server responds to each query.
When I use REQ-REP socket combination (client using REQ, server binding to a REP socket) I'm able to get close to 60,000 messages per second at server side (when client and server are on the same machine). When distributed across machines, each new instance of client on a different machine linearly increases the messages per second at the server and easily reaches 40,000+ with enough client instances.
Now REP socket is blocking, so I followed ZeroMQ guide and used the rrbroker pattern (http://zguide.zeromq.org/cs:rrbroker):
REQ (client) <----> [server ROUTER -- DEALER --- REP (workers running on different threads)]
However, this completely screws up the performance. I'm getting only around 4000 messages per second at the server when running across machines. Not only that, each new client started on a different machine reduces the throughput of every other client.
I'm pretty sure I'm doing something stupid. I'm wondering if ZeroMQ experts here can point out any obvious mistakes. Thanks!
Edit: Adding code as per advice. I'm using the clrzmq nuget package (https://www.nuget.org/packages/clrzmq-x64/)
Here's the client code. A timer counts how many responses are received every second.
for (int i = 0; i < numTasks; i++) { Task.Factory.StartNew(() => Client(), TaskCreationOptions.LongRunning); }
void Client()
{
using (var ctx = new Context())
{
Socket socket = ctx.Socket(SocketType.REQ);
socket.Connect("tcp://192.168.1.10:1234");
while (true)
{
socket.Send("ping", Encoding.Unicode);
string res = socket.Recv(Encoding.Unicode);
}
}
}
Server - case 1: The server keeps track of how many requests are received per second
using (var zmqContext = new Context())
{
Socket socket = zmqContext.Socket(SocketType.REP);
socket.Bind("tcp://*:1234");
while (true)
{
string q = socket.Recv(Encoding.Unicode);
if (q.CompareTo("ping") == 0) {
socket.Send("pong", Encoding.Unicode);
}
}
}
With this setup, at server side, I can see around 60,000 requests received per second (when client is on the same machine). When on different machines, each new client increases number of requests received at server as expected.
Server Case 2: This is essentially rrbroker from ZMQ guide.
void ReceiveMessages(Context zmqContext, string zmqConnectionString, int numWorkers)
{
List<PollItem> pollItemsList = new List<PollItem>();
routerSocket = zmqContext.Socket(SocketType.ROUTER);
try
{
routerSocket.Bind(zmqConnectionString);
PollItem pollItem = routerSocket.CreatePollItem(IOMultiPlex.POLLIN);
pollItem.PollInHandler += RouterSocket_PollInHandler;
pollItemsList.Add(pollItem);
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("{0}", ze.Message);
return;
}
dealerSocket = zmqContext.Socket(SocketType.DEALER);
try
{
dealerSocket.Bind("inproc://workers");
PollItem pollItem = dealerSocket.CreatePollItem(IOMultiPlex.POLLIN);
pollItem.PollInHandler += DealerSocket_PollInHandler;
pollItemsList.Add(pollItem);
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("{0}", ze.Message);
return;
}
// Start the worker pool; cant connect
// to inproc socket before binding.
workerPool.Start(numWorkers);
while (true)
{
zmqContext.Poll(pollItemsList.ToArray());
}
}
void RouterSocket_PollInHandler(Socket socket, IOMultiPlex revents)
{
RelayMessage(routerSocket, dealerSocket);
}
void DealerSocket_PollInHandler(Socket socket, IOMultiPlex revents)
{
RelayMessage(dealerSocket, routerSocket);
}
void RelayMessage(Socket source, Socket destination)
{
bool hasMore = true;
while (hasMore)
{
byte[] message = source.Recv();
hasMore = source.RcvMore;
destination.Send(message, message.Length, hasMore ? SendRecvOpt.SNDMORE : SendRecvOpt.NONE);
}
}
Where the worker pool's start method is:
public void Start(int numWorkerTasks=8)
{
for (int i = 0; i < numWorkerTasks; i++)
{
QueryWorker worker = new QueryWorker(this.zmqContext);
Task task = Task.Factory.StartNew(() =>
worker.Start(),
TaskCreationOptions.LongRunning);
}
Console.WriteLine("Started {0} with {1} workers.", this.GetType().Name, numWorkerTasks);
}
public class QueryWorker
{
Context zmqContext;
public QueryWorker(Context zmqContext)
{
this.zmqContext = zmqContext;
}
public void Start()
{
Socket socket = this.zmqContext.Socket(SocketType.REP);
try
{
socket.Connect("inproc://workers");
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("Could not create worker, error: {0}", ze.Message);
return;
}
while (true)
{
try
{
string message = socket.Recv(Encoding.Unicode);
if (message.CompareTo("ping") == 0)
{
socket.Send("pong", Encoding.Unicode);
}
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("Could not receive message, error: " + ze.ToString());
}
}
}
}
Could you post some source code or at least a more detailed explanation of your test case? In general the way to build out your design is to make one change at a time, and measure at each change. You can always move stepwise from a known working design to more complex ones.
Most probably the 'ROUTER' is the bottleneck.
Check out these related questions on this:
Client maintenance in ZMQ ROUTER
Load testing ZeroMQ (ZMQ_STREAM) for finding the maximum simultaneous users it can handle
ROUTER (and ZMQ_STREAM, which is just a variant of ROUTER) internally has to maintain the client mapping, hence IMO it can accept limited connections from a particular client. It looks like ROUTER can multiplex multiple clients, only as long as, each client has only one active connection.
I could be wrong here - but I am not seeing much proof to the contrary (simple working code that scales to multi-clients with multi-connections with ROUTER or STREAM).
There certainly is a very severe restriction on concurrent connections with ZeroMQ, though it looks like no one knows what is causing it.
I have done performance testing on calling a native unmanaged DLL function from C# with various methods:
1. C++/CLI wrapper
2. PInvoke
3. ZeroMQ/clrzmq
The last might be interesting for you.
My finding at the end of the performance test was that the ZMQ binding clrzmq was not useful for me: it still produced a factor of 100 performance overhead even after I tried to optimize the PInvoke calls within the source code of the binding. Therefore I used ZMQ without a binding, with my own PInvoke calls. These calls must be declared with the cdecl calling convention and with the "SuppressUnmanagedCodeSecurity" attribute to get the most speed.
I had to import just 5 functions, which was fairly easy.
In the end, the speed was a bit slower than a direct PInvoke call, but that was with ZMQ in the path (in my case over "inproc").
This may give you a hint to try it without the binding, if speed matters to you.
This is not a direct answer to your question, but it may help you increase performance in general.
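As a rough sketch of what such declarations might look like: the answer does not say which five functions were imported, so the selection here is an assumption, as are the class name ZmqNative and the library name "libzmq" (the actual DLL name depends on the build you ship). The functions themselves belong to the libzmq C API from version 3.2 onward.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;

// Hypothetical wrapper class; the attribute skips the per-call security
// stack walk on .NET Framework, as the answer recommends.
[SuppressUnmanagedCodeSecurity]
public static class ZmqNative
{
    private const string Lib = "libzmq"; // assumed library name

    // void *zmq_ctx_new (void);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr zmq_ctx_new();

    // void *zmq_socket (void *context, int type);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr zmq_socket(IntPtr context, int type);

    // int zmq_connect (void *socket, const char *endpoint);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    public static extern int zmq_connect(IntPtr socket, string endpoint);

    // int zmq_send (void *socket, const void *buf, size_t len, int flags);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern int zmq_send(IntPtr socket, byte[] buffer, UIntPtr length, int flags);

    // int zmq_recv (void *socket, void *buf, size_t len, int flags);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern int zmq_recv(IntPtr socket, byte[] buffer, UIntPtr length, int flags);
}
```

Cleanup functions such as zmq_close are left out to keep the sketch short; real code would need them as well.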
As part of an effort to automate starting/stopping some of our NServiceBus services, I'd like to know when a service has finished processing all the messages in its input queue.
The problem is that, while the NServiceBus service is running, my C# code is reporting one less message than is actually there. So it thinks that the queue is empty when there is still one message left. If the service is stopped, it reports the "correct" number of messages. This is confusing because, when I inspect the queues myself using the Private Queues view in the Computer Management application, it displays the "correct" number.
I'm using a variant of the following C# code to find the message count:
var queue = new MessageQueue(path);
return queue.GetAllMessages().Length;
I know this will perform horribly when there are many messages. The queues I'm inspecting should only ever have a handful of messages at a time.
I have looked at other related questions, but haven't found the help I need.
Any insight or suggestions would be appreciated!
Update: I should have mentioned that this service is behind a Distributor, which is shut down before trying to shut down this service. So I have confidence that new messages will not be added to the service's input queue.
The thing is that it's not actually "one less message"; the difference depends on the number of messages currently being processed by the endpoint, which in a multi-threaded process can be as high as the number of threads.
There's also the issue of client processes that continue to send messages to that same queue.
Probably the only "sure" way of handling this is to count the messages multiple times with a delay in between; if the number stays at zero over a certain number of attempts, you can assume the queue is empty.
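The repeated-count idea could be sketched roughly like this. QueueDrainSketch and its parameter names are hypothetical, and the count itself is supplied by the caller as a delegate, so the sketch makes no assumption about how the queue is read.

```csharp
using System;
using System.Threading;

public static class QueueDrainSketch
{
    // Returns true once getCount() has reported zero on `requiredZeroReads`
    // consecutive reads; returns false if `maxReads` is used up first.
    public static bool WaitUntilEmpty(Func<int> getCount, int requiredZeroReads, int maxReads, TimeSpan delay)
    {
        int zeroStreak = 0;
        for (int read = 0; read < maxReads; read++)
        {
            // Any non-zero reading resets the streak.
            zeroStreak = getCount() == 0 ? zeroStreak + 1 : 0;
            if (zeroStreak >= requiredZeroReads) return true;
            Thread.Sleep(delay);
        }
        return false;
    }
}
```

With the counting code from the question, the delegate could be something like () => new MessageQueue(path).GetAllMessages().Length. The number of reads and the delay are tuning choices that depend on how long a single message takes to process.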
WMI was the answer! Here's a first pass at the code. It could doubtless be improved.
public int GetMessageCount(string queuePath)
{
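// Requires a reference to System.Management (using System.Management;).
// The query text needs its own name; two locals called "query" would not compile.
const string queryText = "select * from Win32_PerfRawData_MSMQ_MSMQQueue";
var query = new WqlObjectQuery(queryText);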
var searcher = new ManagementObjectSearcher(query);
var queues = searcher.Get();
foreach (ManagementObject queue in queues)
{
var name = queue["Name"].ToString();
if (AreTheSameQueue(queuePath, name))
{
// Depending on the machine (32/64-bit), this value is a different type.
// Casting directly to UInt64 or UInt32 only works on the relative CPU architecture.
// To work around this run-time unknown, convert to string and then parse to int.
var countAsString = queue["MessagesInQueue"].ToString();
var messageCount = int.Parse(countAsString);
return messageCount;
}
}
return 0;
}
private static bool AreTheSameQueue(string path1, string path2)
{
// Tests whether two queue paths are equivalent, accounting for differences
// in case and length (if one path was truncated, for example by WMI).
string sanitizedPath1 = Sanitize(path1);
string sanitizedPath2 = Sanitize(path2);
if (sanitizedPath1.Length > sanitizedPath2.Length)
{
return sanitizedPath1.StartsWith(sanitizedPath2);
}
if (sanitizedPath1.Length < sanitizedPath2.Length)
{
return sanitizedPath2.StartsWith(sanitizedPath1);
}
return sanitizedPath1 == sanitizedPath2;
}
private static string Sanitize(string queueName)
{
var machineName = Environment.MachineName.ToLowerInvariant();
return queueName.ToLowerInvariant().Replace(machineName, ".");
}
I have a video kiosk set up in my lobby; it lets people check in and print a badge with their picture, name, etc. There is also a remote support tool that unfortunately crashes sometimes. I have a function on the kiosk that fixes this issue, but right now you must go to the kiosk to trigger it.
I have also written a management tool that uses WMI to monitor and manage some other aspects of the kiosk. I would like to be able to trigger this repair function via this application. I have spent countless hours on Google trying to figure this out with no luck. Maybe I am not searching for the right things.
My question is this: in C#, how can I call the repair function in my kiosk application from the admin application over the network?
OK, on my Server form, I have a BackgroundWorker that runs a TcpListener. You will want to put this TcpListener in a BackgroundWorker, otherwise you will never be able to stop it from executing until it accepts a TcpClient.
Also, you will want to process any data you receive from this background thread in the main thread of execution to prevent cross thread exceptions:
private TcpListener _listener;
private const int port = 8000;
private void Worker_TcpListener(object sender, DoWorkEventArgs e) {
BackgroundWorker worker = sender as BackgroundWorker;
do {
try {
_listener = new TcpListener(IPAddress.Any, port);
_listener.Start();
TcpClient client = _listener.AcceptTcpClient(); // blocks until a client connects
int MAX = client.ReceiveBufferSize;
NetworkStream stream = client.GetStream();
Byte[] buffer = new Byte[MAX];
int len = stream.Read(buffer, 0, MAX);
if (0 < len) {
string data = Encoding.UTF8.GetString(buffer, 0, len); // decode only the bytes actually read
worker.ReportProgress(len, data);
}
stream.Close();
client.Close();
} catch (SocketException) {
// See MSDN: Windows Sockets V2 API Error Code Doc for details of error code
} catch (ThreadAbortException) { // If I have to call Abort on this thread
return;
} finally {
_listener.Stop();
}
} while (!worker.CancellationPending);
}
This would not be good for large messages (like JPEG files and such), but it works great for short strings in which I have encoded special data to look for.
This data is sent back to my main thread of execution (using the ReportProgress method), where the data is processed:
private void Worker_TcpListener(object sender, ProgressChangedEventArgs e) {
if (e.UserState != null) {
int len = e.ProgressPercentage;
string data = e.UserState.ToString();
if (!String.IsNullOrEmpty(data) && (3 < len)) {
string head = data.Substring(0, 3);
string item = data.Substring(3);
if (!String.IsNullOrEmpty(item)) {
if (head == "BP:") {
string[] split = data.Split(';');
if (2 < split.Length) {
string box = split[0].Substring(3); // Box Number
string qty = split[1].Substring(2); // Quantity
string customer = split[2].Substring(2); // Customer Name
MyRoutine(box, qty, customer);
}
}
}
}
}
}
The code above just sits and runs all day long.
Meanwhile, I have about 10 Pocket PC devices in our plant that could send data at any time. The code for them is written in VB, and I really hope I have time to finish my C# version one of these days, but here it is:
Private Sub SendToServer(ByVal serialNum As String, ByVal qty As Integer, ByVal customer As String)
Cursor.Current = Cursors.WaitCursor
Try
Dim strPacket As String = String.Format("BP:{0};Q:{1};C:{2};", serialNum, qty, customer)
Dim colon As Integer = p7_txtIPAddress.Text.IndexOf(":")
Dim host As String = p7_txtIPAddress.Text.Substring(0, colon)
Dim port As Integer = CInt(p7_txtIPAddress.Text.Substring(colon + 1))
Dim dataPacket As [Byte]() = Encoding.ASCII.GetBytes(strPacket)
Using client As New TcpClient(host, port)
Dim stream As NetworkStream = client.GetStream()
stream.Write(dataPacket, 0, dataPacket.Length)
End Using
Catch err As Exception
MessageBox.Show(err.Message, "Print To Server TCP Error")
Finally
Cursor.Current = Cursors.Default
End Try
End Sub
I don't know if that is what you are trying to do, but it works and is reliable.
Obviously, the code I have in production is larger and includes other things (i.e. employee validation, error loggers, etc.) that you would not find useful. I have cut a lot of those out, and I hope I did not cut out anything necessary.
This should give you an idea of how to move forward, at least.
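To make the "special data" convention concrete, here is a small sketch of building and parsing the "BP:...;Q:...;C:...;" text used in the code above. BoxPacket is a hypothetical name, and the field meanings (box number, quantity, customer) are taken from the comments in the handler; a repair command for the kiosk could follow the same pattern with a different prefix.

```csharp
using System;

// Hypothetical helper that mirrors the packet layout used by the sender and
// the ProgressChanged handler.
public sealed class BoxPacket
{
    public string Box;
    public string Quantity;
    public string Customer;

    // Builds the text that the sender writes to the socket.
    public static string Format(string box, int qty, string customer)
    {
        return string.Format("BP:{0};Q:{1};C:{2};", box, qty, customer);
    }

    // Parses the text back; returns null when it is not a well-formed BP packet.
    public static BoxPacket Parse(string data)
    {
        if (string.IsNullOrEmpty(data) || !data.StartsWith("BP:", StringComparison.Ordinal))
            return null;

        string[] parts = data.Split(';');
        if (parts.Length < 3
            || !parts[1].StartsWith("Q:", StringComparison.Ordinal)
            || !parts[2].StartsWith("C:", StringComparison.Ordinal))
            return null;

        return new BoxPacket
        {
            Box = parts[0].Substring(3),      // text after "BP:"
            Quantity = parts[1].Substring(2), // text after "Q:"
            Customer = parts[2].Substring(2)  // text after "C:"
        };
    }
}
```

Keeping the parsing in one place like this makes it easier to reject malformed input before it reaches the routine that acts on it.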
I am writing a test harness to test an HTTP POST. The test case sends 8 HTTP requests using UploadValuesAsync in the WebClient class at 10-second intervals: it sleeps 10 seconds after every 8 requests. I record the start time and end time of each request. When I compute the average response time, I get around 800 ms. But when I run this test case synchronously using the UploadValues method in WebClient, I get an average response time of 250 milliseconds. Can you tell me why there is a difference between these two methods? I was expecting a lower response time with async, but I did not get that.
Here is the code that sends 8 requests async:
var count = 0;
foreach (var nameValueCollection in requestCollections)
{
count++;
NameValueCollection collection = nameValueCollection;
PostToURL(collection,uri);
if (count % 8 == 0)
{
Thread.Sleep(TimeSpan.FromSeconds(10));
count = 0;
}
}
UPDATED
Here is code that sends 8 requests SYNC
public void PostToURLSync(NameValueCollection collection,Uri uri)
{
var response = new ServiceResponse
{
Response = "Not Started",
Request = string.Join(";", collection.Cast<string>()
.Select(col => String.Concat(col, "=", collection[col])).ToArray()),
ApplicationId = collection["ApplicationId"]
};
try
{
using (var transportType2 = new DerivedWebClient())
{
transportType2.Expect100Continue = false;
transportType2.Timeout = TimeSpan.FromMilliseconds(2000);
response.StartTime = DateTime.Now;
var responeByte = transportType2.UploadValues(uri, "POST", collection);
response.EndTime = DateTime.Now;
response.Response = Encoding.Default.GetString(responeByte);
}
}
catch (Exception exception)
{
Console.WriteLine(exception.ToString());
}
response.ResponseInMs = (int)response.EndTime.Subtract(response.StartTime).TotalMilliseconds;
responses.Add(response);
Console.WriteLine(response.ResponseInMs);
}
Here is the code that post to the HTTP URI
public void PostToURL(NameValueCollection collection,Uri uri)
{
var response = new ServiceResponse
{
Response = "Not Started",
Request = string.Join(";", collection.Cast<string>()
.Select(col => String.Concat(col, "=", collection[col])).ToArray()),
ApplicationId = collection["ApplicationId"]
};
try
{
using (var transportType2 = new DerivedWebClient())
{
transportType2.Expect100Continue = false;
transportType2.Timeout = TimeSpan.FromMilliseconds(2000);
response.StartTime = DateTime.Now;
transportType2.UploadValuesCompleted += new UploadValuesCompletedEventHandler(transportType2_UploadValuesCompleted);
transportType2.UploadValuesAsync(uri, "POST", collection,response);
}
}
catch (Exception exception)
{
Console.WriteLine(exception.ToString());
}
}
Here is the upload completed event
private void transportType2_UploadValuesCompleted(object sender, UploadValuesCompletedEventArgs e)
{
var now = DateTime.Now;
var response = (ServiceResponse)e.UserState;
response.EndTime = now;
response.ResponseInMs = (int) response.EndTime.Subtract(response.StartTime).TotalMilliseconds;
Console.WriteLine(response.ResponseInMs);
if (e.Error != null)
{
response.Response = e.Error.ToString();
}
else
if (e.Result != null && e.Result.Length > 0)
{
string downloadedData = Encoding.Default.GetString(e.Result);
response.Response = downloadedData;
}
//Recording response in Global variable
responses.Add(response);
}
One problem you're probably running into is that .NET, by default, throttles outgoing HTTP connections to the limit (2 concurrent connections per remote host) specified by the relevant RFC. Assuming 2 concurrent connections and 250ms per request, that means the response time for your first 2 requests will be 250ms, the second 2 will be 500ms, the third 2 will be 750ms, and the last 2 will be 1000ms. This would yield a 625ms average response time, which is not far from the 800ms you're seeing.
To remove the throttling, increase ServicePointManager.DefaultConnectionLimit to the maximum number of concurrent connections you want to support, and you should see your average response time go down a lot.
A secondary problem may be that the server itself is slower at handling multiple concurrent connections than at handling one request at a time. Even once you remove the throttling problem above, I'd expect each of the async requests to execute, on average, somewhat slower than if the server was only executing one request at a time. How much slower depends on how well the server is optimized for concurrent requests.
A final problem may be caused by test methodology. For example, if your test client is simulating a browser session by storing cookies and re-sending cookies with each request, that may run into problems with some servers that serialize requests from a single user. This is often a simplification for server apps so they won't have to deal with locking cross-request state like session state. If you're running into this problem, make sure that each WebClient sends different cookies to simulate different users.
I'm not saying that you're running into all three of these problems (you might be running into only 1 or 2), but these are the most likely culprits for the problem you're seeing.
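The arithmetic behind that 625ms figure can be checked with a few lines. This is only a model under the stated assumptions (all requests start together, each takes the same time once it gets a connection); ThrottleMath is a hypothetical name.

```csharp
using System;

public static class ThrottleMath
{
    // Average completion time when `requests` start together, at most
    // `connectionLimit` run at once, and each takes `perRequestMs` once started.
    public static double AverageResponseMs(int requests, int connectionLimit, double perRequestMs)
    {
        double total = 0;
        for (int i = 0; i < requests; i++)
        {
            int wave = i / connectionLimit + 1; // 1st wave finishes at 1x, 2nd at 2x, ...
            total += wave * perRequestMs;
        }
        return total / requests;
    }
}
```

For 8 requests, a limit of 2 and 250ms each, the waves finish at 250, 500, 750 and 1000ms, which averages to 625ms; with a limit of 8 the same model gives 250ms.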
As Justin suggested, I tried ServicePointManager.DefaultConnectionLimit, but that did not fix the issue. I was not able to reproduce the other problems Justin suggested; I am not sure how to reproduce them in the first place.
What I did was run the same piece of code on a peer machine, where it ran perfectly with the response time I expected. The difference between the two machines is the operating system: mine is running Windows Server 2003 and the other machine is running Windows Server 2008.
As it worked on the other machine, I suspect that it might be one of the problems Justin described, or server settings on 2003, or something else. I did not spend much time digging into this issue after that. As this is a test harness, the issue had low priority for us, and we had no time to pursue it further.
As I have no clue what exactly fixed it, I am not accepting any answer other than this one, because at the very least I know that switching to Server 2008 fixed the issue.