UploadValuesAsync response time - c#

I am writing test harness to test a HTTP Post. Test case would send 8 http request using UploadValuesAsync in webclient class in 10 seconds interval. It sleeps 10 seconds after every 8 request. I am recording start time and end time of each request. When I compute the average response time. I am getting around 800 ms. But when I run this test case synchronously using UploadValues method in web client I am getting average response time 250 milliseconds. Can you tell me why is difference between these two methods? I was expecting the less response time in Aync but I did not get that.
Here is code that sends 8 requests async
var count = 0;
foreach (var nameValueCollection in requestCollections)
{
count++;
NameValueCollection collection = nameValueCollection;
PostToURL(collection,uri);
if (count % 8 == 0)
{
Thread.Sleep(TimeSpan.FromSeconds(10));
count = 0;
}
}
UPDATED
Here is code that sends 8 requests SYNC
public void PostToURLSync(NameValueCollection collection,Uri uri)
{
var response = new ServiceResponse
{
Response = "Not Started",
Request = string.Join(";", collection.Cast<string>()
.Select(col => String.Concat(col, "=", collection[col])).ToArray()),
ApplicationId = collection["ApplicationId"]
};
try
{
using (var transportType2 = new DerivedWebClient())
{
transportType2.Expect100Continue = false;
transportType2.Timeout = TimeSpan.FromMilliseconds(2000);
response.StartTime = DateTime.Now;
var responeByte = transportType2.UploadValues(uri, "POST", collection);
response.EndTime = DateTime.Now;
response.Response = Encoding.Default.GetString(responeByte);
}
}
catch (Exception exception)
{
Console.WriteLine(exception.ToString());
}
response.ResponseInMs = (int)response.EndTime.Subtract(response.StartTime).TotalMilliseconds;
responses.Add(response);
Console.WriteLine(response.ResponseInMs);
}
Here is the code that post to the HTTP URI
public void PostToURL(NameValueCollection collection,Uri uri)
{
var response = new ServiceResponse
{
Response = "Not Started",
Request = string.Join(";", collection.Cast<string>()
.Select(col => String.Concat(col, "=", collection[col])).ToArray()),
ApplicationId = collection["ApplicationId"]
};
try
{
using (var transportType2 = new DerivedWebClient())
{
transportType2.Expect100Continue = false;
transportType2.Timeout = TimeSpan.FromMilliseconds(2000);
response.StartTime = DateTime.Now;
transportType2.UploadValuesCompleted += new UploadValuesCompletedEventHandler(transportType2_UploadValuesCompleted);
transportType2.UploadValuesAsync(uri, "POST", collection,response);
}
}
catch (Exception exception)
{
Console.WriteLine(exception.ToString());
}
}
Here is the upload completed event
private void transportType2_UploadValuesCompleted(object sender, UploadValuesCompletedEventArgs e)
{
var now = DateTime.Now;
var response = (ServiceResponse)e.UserState;
response.EndTime = now;
response.ResponseInMs = (int) response.EndTime.Subtract(response.StartTime).TotalMilliseconds;
Console.WriteLine(response.ResponseInMs);
if (e.Error != null)
{
response.Response = e.Error.ToString();
}
else
if (e.Result != null && e.Result.Length > 0)
{
string downloadedData = Encoding.Default.GetString(e.Result);
response.Response = downloadedData;
}
//Recording response in Global variable
responses.Add(response);
}

One problem you're probably running into is that .NET, by default, will throttle outgoing HTTP connections to the limit (2 concurrent connections per remote host) that are mandated by the relevant RFC. Assuming 2 concurrent connections and 250ms per request, that means the response time for your first 2 requests will be 250ms, the second 2 will be 500ms, the third 750ms, and the last 1000ms. This would yield a 625ms average response time, which is not far from the 800ms you're seeing.
To remove the throttling, increase ServicePointManager.DefaultConnectionLimit to the maximum number of concurrent connections you want to support, and you should see your average response time go down alot.
A secondary problem may be that the server itself is slower handling multiple concurrent connections than handing one request at a time. Even once you unblock the throttling problem above, I'd expect each of the async requests to, on average, execute somewhat slower than if the server was only executing one request at a time. How much slower depends on how well the server is optimized for concurrent requests.
A final problem may be caused by test methodology. For example, if your test client is simulating a browser session by storing cookies and re-sending cookies with each request, that may run into problems with some servers that will serialize requests from a single user. This is often a simplification for server apps so they won't have to deal with locking cross-requests state like session state. If you're running into this problem, make sure that each WebClient sends different cookies to simulate different users.
I'm not saying that you're running into all three of these problems-- you might be only running into 1 or 2-- but these are the most likley culprits for the problem you're seeing.

As Justin said, I tried ServicePointManager.DefaultConnectionLimit but that did not fix the issue. I could not able reproduce other problems suggested by Justin. I am not sure how to reproduce them in first place.
What I did, I ran the same piece of code in peer machine that runs perfectly response time that I expected. The difference between the two machines is operating systems. Mine is running on Windows Server 2003 and other machine is running on Windows Server 2008.
As it worked on the other machines, I suspect that it might be one of the problem specified by Justin or could be server settings on 2003 or something else. I did not spend much time after that to dig this issue. As this is a test harness that we had low priority on this issue. We left off with no time further.
As I have no glue on what exactly fixed it, I am not accepting any answer other than this. Becuase at very least I know that switching to server 2008 fixed this issue.

Related

GRPC performance vs WCF performance

We have a legacy app that runs on top of WCF, so we are trying to move off of it, and find another technology. One of the problems is that we need performance from the wire, so part of evaluating GRPC is evaluating how quickly it works, but also how many simultaneous clients we can run.
So, to that end, we're simulating many calls with relatively low amount of data being passed through, but high number of calls. In that respect, WCF has turned out to be significantly better than GRPC, which is very unexpected. Is there possibly something wrong with the way the test were conceived and implemented?
The server code:
public override Task<TestReply> Test(TestRequest request, ServerCallContext context)
{
var ret = new char[request.Size];
var a = (int)'a';
for (var i = 0; i < request.Size; i++)
{
ret[i] = (char)(a + (i % 26));
}
return Task.FromResult(new TestReply { Message = new string(ret) });
}
The client code:
static void Main(string[] args)
{
AppContext.SetSwitch("System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport", true);
using var channel = GrpcChannel.ForAddress("http://remote_server:8001", new GrpcChannelOptions { Credentials = ChannelCredentials.Insecure });
var client = new Greeter.GreeterClient(channel);
string TestMethod(int i)
{
var request = new TestRequest {Size = i};
return client.Test(request).Message;
}
var start = DateTime.Now;
for (var i = 0; i < 15625; i++)
{
var val = TestMethod(10);
}
var end = DateTime.Now;
}
If we run one a single instance of the client, it takes just under 7 seconds. If we run 64 instances simultaneously, each takes an average of 23 seconds. Part of the problem is that running 64 instances is also CPU intensive, on both client and server. With 64 clients, the client will see 85-95% CPU utilization, and the server will see 70-80%.
By comparison, WCF will run a single instance of that code in 2.4 seconds, and 64 in an average of 9 seconds, and never experience significant CPU utilization on either.
Are we using GRPC wrongly? Is there something wrong with the test? What can we do to make GRPC run a little faster/leaner?

UWP AppServiceConnection not getting value

I'm trying to communicate between my host app and service via AppServiceConnection. I'm using the following code in my host app:
using (var connection = new AppServiceConnection())
{
connection.AppServiceName = extension.AppServiceName;
connection.PackageFamilyName = extension.PackageFamilyName;
var connectionStatus = await connection.OpenAsync();
if (connectionStatus == AppServiceConnectionStatus.Success)
{
var response = await connection.SendMessageAsync(requestMessage);
if (response.Status == AppServiceResponseStatus.Success)
returnValue = response.Message as ValueSet;
}
}
And my service code:
private async void OnRequestReceived(AppServiceConnection sender, AppServiceRequestReceivedEventArgs args)
{
var messageDeferral = args.GetDeferral();
var message = args.Request.Message;
var returnData = new ValueSet();
var command = message["Command"] as string;
switch (command)
{
case "ACTION":
var value = await AsyncAction();
returnData = new ValueSet { { "Value", JsonConvert.SerializeObject(value) } };
break;
default:
break;
}
await args.Request.SendResponseAsync(returnData);
messageDeferral.Complete();
}
This works some of the time, but other times the ValueSet (returnValue) is randomly empty when the host receives it. It has Value in it when returned in the service, but when I get the response in the host, nothing.
I've verified that the service is indeed setting the value, adding it to the ValueSet and returning it correctly.
Note that my service is receiving the request, the host is receiving the response and the response status is Success; a failed connection isn't the issue.
Sometimes this happens only once before requests start working again, other times it will happen ten times in a row.
The first working response after a failure always takes significantly longer than normal.
Also, I have no issues in the request from host to service. It's always service to host where the problem shows up.
Has anyone else run into this issue and figured it out?
In the process of creating a sample app I realized what the problem was. I was performing an asynchronous action (albeit a very short one) before my line var messageDeferral = args.GetDeferral();. It appears that this was allowing the background task to be closed before it had responded to the host. Simply moving that line to the beginning of the OnRequestReceived function fixed the problem for me.
So for anyone who runs into a similar issue, get your deferral before you do anything else! Spare yourself the pain I went through.

Pinging all computers in network with TPL

I'm developing an application that manages devices in the network, at a certain point in the applicaiton, I must ping (actually it's not a ping, it's a SNMP get) all computers in the network to check if it's type is of my managed device.
My problem is that pinging all computers in the network is very slow (specially because most of them won't respond to my message and will simply timeout) and has to be done asynchronously.
I tried to use TLP to do this with the following code:
public static void FindDevices(Action<IPAddress> callback)
{
//Returns a list of all host names with a net view command
List<string> hosts = FindHosts();
foreach (string host in hosts)
{
Task.Run(() =>
{
CheckDevice(host, callback);
});
}
}
But it runs VERY slow, and when I paused execution I checked threads window and saw that it only had one thread pinging the network and was thus, running tasks synchronously.
When I use normal threads it runs a lot faster, but Tasks were supposed to be better, I'd like to know why aren't my Tasks optimizing parallelism.
**EDIT**
Comments asked for code on CheckDevice, so here it goes:
private static void CheckDevice(string host, Action<IPAddress> callback)
{
int commlength, miblength, datatype, datalength, datastart;
string output;
SNMP conn = new SNMP();
IPHostEntry ihe;
try
{
ihe = Dns.Resolve(host);
}
catch (Exception)
{
return;
}
// Send sysLocation SNMP request
byte[] response = conn.get("get", ihe.AddressList[0], "MyDevice", "1.3.6.1.2.1.1.6.0");
if (response[0] != 0xff)
{
// If response, get the community name and MIB lengths
commlength = Convert.ToInt16(response[6]);
miblength = Convert.ToInt16(response[23 + commlength]);
// Extract the MIB data from the SNMP response
datatype = Convert.ToInt16(response[24 + commlength + miblength]);
datalength = Convert.ToInt16(response[25 + commlength + miblength]);
datastart = 26 + commlength + miblength;
output = Encoding.ASCII.GetString(response, datastart, datalength);
if (output.StartsWith("MyDevice"))
{
callback(ihe.AddressList[0]);
}
}
}
Your issue is that you are iterating a none thread safe item the List.
If you replace it with a thread safe object like the ConcurrentBag you should find the threads will run in parallel.
I was a bit confused as to why this was only running one thread, I believe it is this line of code:
try
{
ihe = Dns.Resolve(host);
}
catch (Exception)
{
return;
}
I think this is throwing exceptions and returning; hence you only see one thread. This also ties into your observation that if you added a sleep it worked correctly.
Remember that when you pass a string your passing the reference to the string in memory, not the value. Anyway, the ConcurrentBag seems to resolve your issue. This answer might also be relevant

c# Why the WebClient times out most of the timeswhen it is invoked through a thread?

I am working on a project which uses a timed web client. Class structure is like this.
Controller => Main supervisor of class
Form1, SourceReader, ReportWriter, UrlFileReader, HTTPWorker, TimedWebClient.
HTTPworker is the class to get the page source when the url is given.
TimedWebClient is the class to handle the timeout of the WebClient. Here is the code.
class TimedWebClient : WebClient
{
int Timeout;
public TimedWebClient()
{
this.Timeout = 5000;
}
protected override WebRequest GetWebRequest(Uri address)
{
var objWebRequest = base.GetWebRequest(address);
objWebRequest.Timeout = this.Timeout;
return objWebRequest;
}
}
In HTTPWorker i have
TimedWebClient wclient = new TimedWebClient();
wclient.Proxy = WebRequest.GetSystemWebProxy();
wclient.Headers["Accept"] = "application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*";
wclient.Headers["User-Agent"] = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; MDDC)";
string pagesource = wclient.DownloadData(requestUrl);
UTF8Encoding objUTF8 = new UTF8Encoding();
responseData = objUTF8.GetString(pagesource);
I have handled exceptions there.
In Form1 i have a background controller and a urllist.
First Implementation :
First I took one url at a time and gave it to the ONLY Controller object to process.
Then it worked fine. But as it is sequential it took a long time when the list is too large.
Second Implementation:
Then in the Do_Work of the backgroundworker I made seven controllers and seven threads. Each controller has unique HTTPWorker object. But now it throws exceptions saying "timedout".
Below is the code in Form1.cs backgroundworker1_DoWork.
private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
{
bool done = false;
while (!backgroundWorker1.CancellationPending && !done)
{
int iterator = 1;
int tempiterator = iterator;
Controller[] cntrlrarray = new Controller[numofcontrollers];
Thread[] threadarray = new Thread[numofcontrollers];
int cntrlcntr = 0;
for ( cntrlcntr = 0; cntrlcntr < numofcontrollers; cntrlcntr++)
{
cntrlrarray[cntrlcntr] = new Controller();
}
cntrlcntr = 0;
for (iterator = 1; iterator <= this.urlList.Count; iterator++)
{
int assignedthreads = 0;
for (int threadcounter = 0; threadcounter < numofcontrollers; threadcounter++)
{
cntrlcntr = threadcounter;
threadarray[threadcounter] = new Thread(() => cntrlrarray[cntrlcntr].Process(iterator - 1));
threadarray[threadcounter].Name = this.urlList[iterator - 1];
threadarray[threadcounter].Start();
backgroundWorker1.ReportProgress(iterator);
assignedthreads++;
if (iterator == this.urlList.Count)
{
break;
}
else
{
iterator++;
}
}
for (int threadcounter = 0; threadcounter < assignedthreads; threadcounter++)
{
cntrlcntr = threadcounter;
threadarray[threadcounter].Join();
}
if (iterator == this.urlList.Count)
{
break;
}
else
{
iterator--;
}
}
done = true;
}
}
What is the reason and the solution for this?
Appolgises for being too lengthy. Thank you in advance.
The sky... it's full of Threads! Seriously, though - don't use this many threads. That's what asynchronous I/O is for. If you're using .NET 4.5, this is very easy to do using await/async, otherwise it's a bit of boilerplate code, but it's still far preferable to this.
With that out of the way, the amount of TCP connections is quite limited by default. Even if there was a use for having 1000 downloads at once (and it probably isn't, since you're sharing bandwidth), you simply can't create and drop TCP connections willy-nilly - there's a limit to open TCP connections (anywhere from 5 to 20, unless you're on a server). You can change this, but it's usually preferred to do things differently. See this entry. This might also be a problem if this application is not running alone (which it probably isn't, given that you wouldn't have such a problem on server Windows). For example, torrent clients often bump into the half-open connection limit (a connection which is still waiting for the end of the initial TCP handskahe). This would be detriminal to your application, of course).
Now, even if you keep under this limit, there's also a fixed amount of outbound and inbound ports to use when communicating. This is a problem when you quickly open and close TCP connections, because TCP keeps the connection alive in the background for about 4 minutes (to make sure no wrong packets arrive to the port, which could be reused in the meantime). This means that if you create enough connections in this time interval, you're going to "starve" your port pool, and every new TCP connection will be denied (so your browser will temporarily stop working, etc.).
Next, a 5 second timeout is pretty low. Really. Imagine that it would take a second to complete a handshake (that's a ping of ~300ms, which is still within the realm of reasonable internet response). Suddenly, you've got a new connection, which has to wait for the other handshakes to finish, and it might take a few seconds just for that. And that's still just the initiation of the connection. Then there's the DNS lookup, and the response of the HTTP server itself... 5 seconds is a low timeout.
In short, it's not the multi-threading - it's the massive amounts of (useless) connections you're opening. Also, for URLs on a single web, you should look into Keep-Alive connections - they can reuse the already opened TCP connection, which significantly mitigates this problem.
Now, to get deeper into this. You're starting and destroying threads needlessly. Instead, it would be a better idea to have a URL queue and several thread consumers, that would take input from the queue. This way, you'll only have those 7 (or whatever the number) threads that poll from the queue as long as there's something in it, which saves a lot of system resources (and improves your performance). I'm thinking that the Thread.Join you're doing might also have something to do with your issues. Even though you're running the thing in a background worker, it just might be possible there's something strange hapenning in there.

ZeroMQ performance issue

I'm having an issue with ZeroMQ, which I believe is because I'm not very familiar with it.
I'm trying to build a very simple service where multiple clients connect to a server and sends a query. The server responds to this query.
When I use REQ-REP socket combination (client using REQ, server binding to a REP socket) I'm able to get close to 60,000 messages per second at server side (when client and server are on the same machine). When distributed across machines, each new instance of client on a different machine linearly increases the messages per second at the server and easily reaches 40,000+ with enough client instances.
Now REP socket is blocking, so I followed ZeroMQ guide and used the rrbroker pattern (http://zguide.zeromq.org/cs:rrbroker):
REQ (client) <----> [server ROUTER -- DEALER --- REP (workers running on different threads)]
However, this completely screws up the performance. I'm getting only around 4000 messages per second at the server when running across machines. Not only that, each new client started on a different machine reduces the throughput of every other client.
I'm pretty sure I'm doing something stupid. I'm wondering if ZeroMQ experts here can point out any obvious mistakes. Thanks!
Edit: Adding code as per advice. I'm using the clrzmq nuget package (https://www.nuget.org/packages/clrzmq-x64/)
Here's the client code. A timer counts how many responses are received every second.
for (int i = 0; i < numTasks; i++) { Task.Factory.StartNew(() => Client(), TaskCreationOptions.LongRunning); }
void Client()
{
using (var ctx = new Context())
{
Socket socket = ctx.Socket(SocketType.REQ);
socket.Connect("tcp://192.168.1.10:1234");
while (true)
{
socket.Send("ping", Encoding.Unicode);
string res = socket.Recv(Encoding.Unicode);
}
}
}
Server - case 1: The server keeps track of how many requests are received per second
using (var zmqContext = new Context())
{
Socket socket = zmqContext.Socket(SocketType.REP);
socket.Bind("tcp://*:1234");
while (true)
{
string q = socket.Recv(Encoding.Unicode);
if (q.CompareTo("ping") == 0) {
socket.Send("pong", Encoding.Unicode);
}
}
}
With this setup, at server side, I can see around 60,000 requests received per second (when client is on the same machine). When on different machines, each new client increases number of requests received at server as expected.
Server Case 2: This is essentially rrbroker from ZMQ guide.
void ReceiveMessages(Context zmqContext, string zmqConnectionString, int numWorkers)
{
List<PollItem> pollItemsList = new List<PollItem>();
routerSocket = zmqContext.Socket(SocketType.ROUTER);
try
{
routerSocket.Bind(zmqConnectionString);
PollItem pollItem = routerSocket.CreatePollItem(IOMultiPlex.POLLIN);
pollItem.PollInHandler += RouterSocket_PollInHandler;
pollItemsList.Add(pollItem);
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("{0}", ze.Message);
return;
}
dealerSocket = zmqContext.Socket(SocketType.DEALER);
try
{
dealerSocket.Bind("inproc://workers");
PollItem pollItem = dealerSocket.CreatePollItem(IOMultiPlex.POLLIN);
pollItem.PollInHandler += DealerSocket_PollInHandler;
pollItemsList.Add(pollItem);
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("{0}", ze.Message);
return;
}
// Start the worker pool; cant connect
// to inproc socket before binding.
workerPool.Start(numWorkers);
while (true)
{
zmqContext.Poll(pollItemsList.ToArray());
}
}
void RouterSocket_PollInHandler(Socket socket, IOMultiPlex revents)
{
RelayMessage(routerSocket, dealerSocket);
}
void DealerSocket_PollInHandler(Socket socket, IOMultiPlex revents)
{
RelayMessage(dealerSocket, routerSocket);
}
void RelayMessage(Socket source, Socket destination)
{
bool hasMore = true;
while (hasMore)
{
byte[] message = source.Recv();
hasMore = source.RcvMore;
destination.Send(message, message.Length, hasMore ? SendRecvOpt.SNDMORE : SendRecvOpt.NONE);
}
}
Where the worker pool's start method is:
public void Start(int numWorkerTasks=8)
{
for (int i = 0; i < numWorkerTasks; i++)
{
QueryWorker worker = new QueryWorker(this.zmqContext);
Task task = Task.Factory.StartNew(() =>
worker.Start(),
TaskCreationOptions.LongRunning);
}
Console.WriteLine("Started {0} with {1} workers.", this.GetType().Name, numWorkerTasks);
}
public class QueryWorker
{
Context zmqContext;
public QueryWorker(Context zmqContext)
{
this.zmqContext = zmqContext;
}
public void Start()
{
Socket socket = this.zmqContext.Socket(SocketType.REP);
try
{
socket.Connect("inproc://workers");
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("Could not create worker, error: {0}", ze.Message);
return;
}
while (true)
{
try
{
string message = socket.Recv(Encoding.Unicode);
if (message.CompareTo("ping") == 0)
{
socket.Send("pong", Encoding.Unicode);
}
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("Could not receive message, error: " + ze.ToString());
}
}
}
}
Could you post some source code or at least a more detailed explanation of your test case? In general the way to build out your design is to make one change at a time, and measure at each change. You can always move stepwise from a known working design to more complex ones.
Most probably the 'ROUTER' is the bottleneck.
Check out these related questions on this:
Client maintenance in ZMQ ROUTER
Load testing ZeroMQ (ZMQ_STREAM) for finding the maximum simultaneous users it can handle
ROUTER (and ZMQ_STREAM, which is just a variant of ROUTER) internally has to maintain the client mapping, hence IMO it can accept limited connections from a particular client. It looks like ROUTER can multiplex multiple clients, only as long as, each client has only one active connection.
I could be wrong here - but I am not seeing much proof to the contrary (simple working code that scales to multi-clients with multi-connections with ROUTER or STREAM).
There certainly is a very severe restriction on concurrent connections with ZeroMQ, though it looks like no one know what is causing it.
I have done done performance testing on calling a native unmanaged DLL function with various methods from C#:
1. C++/CLI wrapper
2. PInvoke
3. ZeroMQ/clrzmq
The last might be interesting for you.
My finding at the end of my performance test was that using the ZMQ binding clrzmq was not useful and produced a factor of 100 performance overhead after I tried to optimize the PInvoke calls within the source code of the binding. Therefore I have used the ZMQ without a binding but with PInvoke calls.these calls must be done with the cdecl convention and with the option "SuppressUnmanagedCodeSecurity" to get most speed.
I had to import just 5 functions which was fairly easy.
At the end the speed was a bit slower than a PInvoke call but with the ZMQ-in my case over "inproc".
This may give you the hint to try it without the binding, if speed is interesting for you.
This is not a direct answer for your question but may help you to increase performance in general.

Categories