How to optimize SOA requests in HPC - c#

I want to use HPC to run some simulations, and I'm going to use SOA. The following code comes from some sample materials; I modified it by adding the outer for loop. I have run into a performance problem. This basic sample does nothing except call a service method that returns the value it receives as a parameter, yet it is slow. I have 60 computers with 4-core processors and a 1Gb network. The first phase, sending the messages, takes about 2 seconds, and then I have to wait another 7 seconds for the return values. All values arrive at more or less the same time. Another problem is that I cannot reuse the session object. That is why the outer for loop is outside the using block; I want to put it inside the using, but then I get a timeout or a message that the BrokerClient has ended.
Can I reuse a BrokerClient or DurableSession object?
How can I speed up this whole process of message passing?
static void Main(string[] args)
{
const string headnode = "Head-Node.hpcCluster.edu.edu";
const string serviceName = "EchoService";
const int numRequests = 1000;
SessionStartInfo info = new SessionStartInfo(headnode, serviceName);
for (int j = 0; j < 100; j++)
{
using (DurableSession session = DurableSession.CreateSession(info))
{
Console.WriteLine("done session id = {0}", session.Id);
NetTcpBinding binding = new NetTcpBinding(SecurityMode.Transport);
using (BrokerClient<IService1> client = new BrokerClient<IService1>(session, binding))
{
for (int i = 0; i < numRequests; i++)
{
EchoRequest request = new EchoRequest("hello world!");
client.SendRequest<EchoRequest>(request, i);
}
client.EndRequests();
foreach (var response in client.GetResponses<EchoResponse>())
{
try
{
string reply = response.Result.EchoResult;
Console.WriteLine("\tReceived response for request {0}: {1}", response.GetUserData<int>(), reply);
}
catch (Exception ex)
{
}
}
}
session.Close();
}
}
}
Second version with Session instead of DurableSession, which is working better, but I have problem with Session reuse:
using (Session session = Session.CreateSession(info))
{
for (int i = 0; i < 100; i++)
{
count = 0;
Console.WriteLine("done session id = {0}", session.Id);
NetTcpBinding binding = new NetTcpBinding(SecurityMode.Transport);
using (BrokerClient<IService1> client = new BrokerClient<IService1>( session, binding))
{
//set getresponse handler
client.SetResponseHandler<EchoResponse>((item) =>
{
try
{
Console.WriteLine("\tReceived response for request {0}: {1}",
item.GetUserData<int>(), item.Result.EchoResult);
}
catch (SessionException ex)
{
Console.WriteLine("SessionException while getting responses in callback: {0}", ex.Message);
}
catch (Exception ex)
{
Console.WriteLine("Exception while getting responses in callback: {0}", ex.Message);
}
if (Interlocked.Increment(ref count) == numRequests)
done.Set();
});
// start to send requests
Console.Write("Sending {0} requests...", numRequests);
for (int j = 0; j < numRequests; j++)
{
EchoRequest request = new EchoRequest("hello world!");
client.SendRequest<EchoRequest>(request, j);
}
client.EndRequests();
Console.WriteLine("done");
Console.WriteLine("Retrieving responses...");
// Main thread block here waiting for the retrieval process
// to complete. As the thread that receives the "numRequests"-th
// responses does a Set() on the event, "done.WaitOne()" will pop
done.WaitOne();
Console.WriteLine("Done retrieving {0} responses", numRequests);
}
}
// Close connections and delete messages stored in the system
session.Close();
}
I get an exception during the second call to EndRequests: The server did not provide a meaningful reply; this might be caused by a contract mismatch, a premature session shutdown or an internal server error.

Don't use DurableSession for computations where the individual requests are shorter than about 30 seconds. A DurableSession is backed by an MSMQ queue in the broker. Your requests and responses may be round-tripped to disk, which causes performance problems when the amount of computation per request is small. Use Session instead.
In general, for performance reasons, don't use DurableSession unless you absolutely need the durable behavior in the broker. In this case, since you are calling GetResponses immediately after sending the requests, Session will work fine for you.
You can reuse a Session or DurableSession object to create any number of BrokerClient objects, as long as you haven't called Session.Close.
If it's important to process the responses in parallel on the client side, use BrokerClient.SetResponseHandler to set a callback function which will handle responses asynchronously (rather than use client.GetResponses, which handles them synchronously). Look at the HelloWorldR2 sample code for details.
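The callback approach needs some way to know when the last response of a batch has arrived. The following is a minimal sketch of that bookkeeping using only base class library types; it does not call the HPC SOA API. The Task.Run calls merely simulate response callbacks, and the names BatchCompletion and RunBatch are hypothetical. A fresh CountdownEvent per batch avoids carrying a counter or event state over from one batch to the next.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BatchCompletion
{
    // Runs one simulated batch: fires `requestCount` callbacks on thread-pool
    // threads and blocks until all of them have run or the timeout expires.
    // Task.Run is only a stand-in for real response callbacks.
    public static bool RunBatch(int requestCount, TimeSpan timeout, Action<int> onResponse)
    {
        // A new countdown per batch, so no state leaks between batches.
        // It is not disposed here because late callbacks may still signal it.
        var pending = new CountdownEvent(requestCount);

        for (int i = 0; i < requestCount; i++)
        {
            int requestId = i; // per-iteration copy for the closure
            Task.Run(() =>
            {
                try { onResponse(requestId); }
                finally { pending.Signal(); } // count the response even if the handler throws
            });
        }

        // True when every callback signalled before the timeout.
        return pending.Wait(timeout);
    }
}
```

In a real client, the body of the response handler would take the place of onResponse, and the thread that sent the requests would wait on the countdown after ending the batch.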

Related

How to read from multiple EventHub partitions simultaneously with high throughput?

My one role instance needs to read data from 20-40 EventHub partitions at the same time (context: this is our internal virtual partitioning scheme - 20-40 partitions represent scale out unit).
In my prototype I use the code below, but I get a maximum throughput of 8 MBPS. If I run the same console app multiple times, the throughput (perfmon counter) multiplies accordingly, so I think this is neither a VM network limit nor an EventHub service-side limit.
I wonder whether I create clients correctly here...
Thank you!
Zaki
const string EventHubName = "...";
const string ConsumerGroupName = "...";
var connectionStringBuilder = new ServiceBusConnectionStringBuilder();
connectionStringBuilder.SharedAccessKeyName = "...";
connectionStringBuilder.SharedAccessKey = "...";
connectionStringBuilder.Endpoints.Add(new Uri("sb://....servicebus.windows.net/"));
connectionStringBuilder.TransportType = TransportType.Amqp;
var clientConnectionString = connectionStringBuilder.ToString();
var eventHubClient = EventHubClient.CreateFromConnectionString(clientConnectionString, EventHubName);
var runtimeInformation = await eventHubClient.GetRuntimeInformationAsync().ConfigureAwait(false);
var consumerGroup = eventHubClient.GetConsumerGroup(ConsumerGroupName);
var offStart = DateTime.UtcNow.AddMinutes(-10);
var offEnd = DateTime.UtcNow.AddMinutes(-8);
var workUnitManager = new WorkUnitManager(runtimeInformation.PartitionCount);
var readers = new List<PartitionReader>();
for (int i = 0; i < runtimeInformation.PartitionCount; i++)
{
var reader = new PartitionReader(
consumerGroup,
runtimeInformation.PartitionIds[i],
i,
offStart,
offEnd,
workUnitManager);
readers.Add(reader);
}
internal async Task Read()
{
try
{
Console.WriteLine("Creating a receiver for '{0}' with offset {1}", this.partitionId, this.startOffset);
EventHubReceiver receiver = await this.consumerGroup.CreateReceiverAsync(this.partitionId, this.startOffset).ConfigureAwait(false);
Console.WriteLine("Receiver for '{0}' has been created.", this.partitionId);
var stopWatch = new Stopwatch();
stopWatch.Start();
while (true)
{
var message =
(await receiver.ReceiveAsync(1, TimeSpan.FromSeconds(10)).ConfigureAwait(false)).FirstOrDefault();
if (message == null)
{
continue;
}
if (message.EnqueuedTimeUtc >= this.endOffset)
{
break;
}
this.processor.Push(this.partitionIndex, message);
}
this.Duration = TimeSpan.FromMilliseconds(stopWatch.ElapsedMilliseconds);
}
catch (Exception ex)
{
Console.WriteLine(ex);
throw;
}
}
The code snippet you provided effectively creates one connection to the Service Bus service and then runs all receivers on that single connection (at the protocol level, it creates multiple AMQP links on the same connection).
To achieve high throughput for receive operations, you need to create multiple connections and tune the ratio of receivers to connections. That is what happens when you run the above code in multiple processes.
Here's how:
You need to go one layer down in the .NET client SDK API and code at the MessagingFactory level; you can start with 1 MessagingFactory per EventHubClient. A MessagingFactory represents 1 connection to the EventHubs service. Code to create a dedicated connection per EventHubClient:
var connStr = new ServiceBusConnectionStringBuilder("Endpoint=sb://servicebusnamespacename.servicebus.windows.net/;SharedAccessKeyName=saskeyname;SharedAccessKey=sakKey");
connStr.TransportType = TransportType.Amqp;
var msgFactory = MessagingFactory.CreateFromConnectionString(connStr.ToString());
var ehClient = msgFactory.CreateEventHubClient("teststream");
I added connStr in my sample only to emphasize assigning TransportType to Amqp.
You will end up with multiple connections on outgoing port 5671.
If you rewrite your code with 1 MessagingFactory per EventHubClient (or a reasonable ratio), you are all set. In your code, you will need to move the EventHubClient creation into the reader.
The only extra criterion to consider when creating multiple connections is the bill: only 100 connections (senders and receivers combined) are included in the basic SKU. I guess you are already on standard (as you have more than 1 TU), which includes 1000 connections, so there is no need to worry, but I mention it just in case.
~Sree
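The "reasonable ratio" of receivers to connections mentioned above can be planned before any client is created. The following is a minimal sketch, assuming a simple round-robin split; PartitionPlanner and AssignToConnections are hypothetical names, and the code deliberately touches no Service Bus types.

```csharp
using System;
using System.Collections.Generic;

public static class PartitionPlanner
{
    // Distributes partition IDs round-robin over `connectionCount` buckets.
    // Bucket k lists the partitions whose receivers would share connection k.
    public static List<List<string>> AssignToConnections(
        IReadOnlyList<string> partitionIds, int connectionCount)
    {
        if (partitionIds == null) throw new ArgumentNullException(nameof(partitionIds));
        if (connectionCount < 1) throw new ArgumentOutOfRangeException(nameof(connectionCount));

        var buckets = new List<List<string>>(connectionCount);
        for (int k = 0; k < connectionCount; k++)
        {
            buckets.Add(new List<string>());
        }

        for (int i = 0; i < partitionIds.Count; i++)
        {
            buckets[i % connectionCount].Add(partitionIds[i]);
        }

        return buckets;
    }
}
```

Each bucket would then get its own MessagingFactory, created as in the snippet above, and the receivers for the partitions in that bucket would be created from that factory's EventHubClient. Varying connectionCount and measuring is one way to tune the ratio.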
A good option is to create a Task for each partition.
This is a copy of my implementation, which can process about 2.5k messages per second per partition. The rate also depends on your downstream speed.
static void EventReceiver()
{
for (int i = 0; i < EventHubPartitionCount; i++) // partition IDs run from 0 to count - 1
{
Task.Factory.StartNew((state) =>
{
Console.WriteLine("Starting worker to process partition: {0}", state);
var factory = MessagingFactory.Create(ServiceBusEnvironment.CreateServiceUri("sb", "tests-eventhub", ""), new MessagingFactorySettings()
{
TokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider("Listen", "PGSVA7L="),
TransportType = TransportType.Amqp
});
var client = factory.CreateEventHubClient("eventHubName");
var group = client.GetConsumerGroup("customConsumer");
Console.WriteLine("Group: {0}", group.GroupName);
var receiver = group.CreateReceiver(state.ToString(), DateTime.Now);
while (true)
{
if (cts.IsCancellationRequested)
{
receiver.Close();
break;
}
var messages = receiver.Receive(20);
messages.ToList().ForEach(aMessage =>
{
// Process your event
});
Console.WriteLine(counter);
}
}, i);
}
}

Pinging all computers in network with TPL

I'm developing an application that manages devices on the network. At a certain point in the application, I must ping (actually it's not a ping, it's an SNMP get) all computers on the network to check whether each one is one of my managed devices.
My problem is that pinging all computers on the network is very slow (especially because most of them won't respond to my message and will simply time out) and has to be done asynchronously.
I tried to use the TPL to do this with the following code:
public static void FindDevices(Action<IPAddress> callback)
{
//Returns a list of all host names with a net view command
List<string> hosts = FindHosts();
foreach (string host in hosts)
{
Task.Run(() =>
{
CheckDevice(host, callback);
});
}
}
But it runs VERY slowly, and when I paused execution and checked the Threads window, I saw that only one thread was pinging the network, so the tasks were effectively running synchronously.
When I use normal threads it runs a lot faster, but Tasks are supposed to be better. I'd like to know why my Tasks aren't running in parallel.
**EDIT**
Comments asked for code on CheckDevice, so here it goes:
private static void CheckDevice(string host, Action<IPAddress> callback)
{
int commlength, miblength, datatype, datalength, datastart;
string output;
SNMP conn = new SNMP();
IPHostEntry ihe;
try
{
ihe = Dns.Resolve(host);
}
catch (Exception)
{
return;
}
// Send sysLocation SNMP request
byte[] response = conn.get("get", ihe.AddressList[0], "MyDevice", "1.3.6.1.2.1.1.6.0");
if (response[0] != 0xff)
{
// If response, get the community name and MIB lengths
commlength = Convert.ToInt16(response[6]);
miblength = Convert.ToInt16(response[23 + commlength]);
// Extract the MIB data from the SNMP response
datatype = Convert.ToInt16(response[24 + commlength + miblength]);
datalength = Convert.ToInt16(response[25 + commlength + miblength]);
datastart = 26 + commlength + miblength;
output = Encoding.ASCII.GetString(response, datastart, datalength);
if (output.StartsWith("MyDevice"))
{
callback(ihe.AddressList[0]);
}
}
}
Your issue is that you are iterating over a collection that is not thread-safe: the List.
If you replace it with a thread-safe collection such as ConcurrentBag, you should find the tasks run in parallel.
I was a bit confused as to why this was only running one thread. I believe it is this piece of code:
try
{
ihe = Dns.Resolve(host);
}
catch (Exception)
{
return;
}
I think this is throwing exceptions and returning; hence you only see one thread. This also ties into your observation that if you added a sleep it worked correctly.
Remember that when you pass a string you're passing a reference to the string in memory, not the value. Anyway, the ConcurrentBag seems to resolve your issue. This answer might also be relevant.
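For illustration, here is a minimal sketch that probes hosts in parallel and collects the matches in a ConcurrentBag. It swaps the question's Task.Run loop for Parallel.ForEach with an explicit degree of parallelism, which is a different technique from the one in the question and is not confirmed by the answer to fix the slowdown. DeviceScanner, FindMatches and the probe delegate are hypothetical; probe stands in for the SNMP check done by CheckDevice.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class DeviceScanner
{
    // Calls `probe` for every host, using up to `maxParallel` concurrent
    // workers, and returns the hosts for which the probe returned true.
    public static ConcurrentBag<string> FindMatches(
        IEnumerable<string> hosts, Func<string, bool> probe, int maxParallel)
    {
        var matches = new ConcurrentBag<string>(); // safe to Add from many threads
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(hosts, options, host =>
        {
            try
            {
                if (probe(host))
                {
                    matches.Add(host);
                }
            }
            catch (Exception)
            {
                // Treat a failed probe as "not a match", as the question's
                // CheckDevice does when DNS resolution fails.
            }
        });

        return matches;
    }
}
```

A call such as FindMatches(FindHosts(), host => IsMyDevice(host), 32) would block until every host has been probed; IsMyDevice is again a hypothetical wrapper around the SNMP request.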

Redis Booksleeve client, ResultCompletionMode.PreserveOrder not working

When I print out received messages on the Console the displayed messages are all messed up, each message containing 5 string sub-messages that are printed on the Console before control reverts back to the incoming message callback. I strongly assume this is because the incoming message event is raised async in Booksleeve?
I refer to the following post, How does PubSub work in BookSleeve/ Redis?, where the author, Marc Gravell, pointed to the ability to force sync reception by setting Completion Mode to "PreserveOrder". I have done that, tried before and after connecting the client. Neither seems to work.
Any ideas how I can receive messages and print them on the console in the exact order they were sent? I only have one single publisher in this case.
Thanks
Edit:
Below some code snippets to show how I send messages and the Booksleeve wrapper I quickly wrote.
Here the client (I have a similar Client2 that receives the messages and checks order, but I omitted it as it seems trivial).
class Client1
{
const string ClientId = "Client1";
private static Messaging Client { get; set; }
private static void Main(string[] args)
{
var settings = new MessagingSettings("127.0.0.1", 6379, -1, 60, 5000, 1000);
Client = new Messaging(ClientId, settings, ReceiveMessage);
Client.Connect();
Console.WriteLine("Press key to start sending messages...");
Console.ReadLine();
for (int index = 1; index <= 100; index++)
{
//I turned this off because I want to preserve
//the order even if messages are sent in rapid succession
//Thread.Sleep(5);
var msg = new MessageEnvelope("Client1", "Client2", index.ToString());
Client.SendOneWayMessage(msg);
}
Console.WriteLine("Press key to exit....");
Console.ReadLine();
Client.Disconnect();
}
private static void ReceiveMessage(MessageEnvelope msg)
{
Console.WriteLine("Message Received");
}
}
Here the relevant code snippets of the library:
public void Connect()
{
RequestForReplyMessageIds = new ConcurrentBag<string>();
Connection = new RedisConnection(Settings.HostName, Settings.Port, Settings.IoTimeOut);
Connection.Closed += OnConnectionClosed;
Connection.CompletionMode = ResultCompletionMode.PreserveOrder;
Connection.SetKeepAlive(Settings.PingAliveSeconds);
try
{
if (Connection.Open().Wait(Settings.RequestTimeOutMilliseconds))
{
//Subscribe to own ClientId Channel ID
SubscribeToChannel(ClientId);
}
else
{
throw new Exception("Could not connect Redis client to server");
}
}
catch
{
throw new Exception("Could not connect Redis Client to Server");
}
}
public void SendOneWayMessage(MessageEnvelope message)
{
SendMessage(message);
}
private void SendMessage(MessageEnvelope msg)
{
//Connection.Publish(msg.To, msg.GetByteArray());
Connection.Publish(msg.To, msg.GetByteArray()).Wait();
}
private void IncomingChannelSubscriptionMessage(string channel, byte[] body)
{
var msg = MessageEnvelope.GetMessageEnvelope(body);
//forward received message
ReceivedMessageCallback(msg);
//release requestMessage if returned msgId matches
string msgId = msg.MessageId;
if (RequestForReplyMessageIds.Contains(msgId))
{
RequestForReplyMessageIds.TryTake(out msgId);
}
}
public void SubscribeToChannel(string channelName)
{
if (!ChannelSubscriptions.Contains(channelName))
{
var subscriberChannel = Connection.GetOpenSubscriberChannel();
subscriberChannel.Subscribe(channelName, IncomingChannelSubscriptionMessage).Wait();
ChannelSubscriptions.Add(channelName);
}
}
Without seeing exactly how you are checking for this, it is hard to comment, but what I can say is that any threading oddity is going to be hard to track down and fix, and is therefore very unlikely to be addressed in BookSleeve, given that it has been superseded. However! It will absolutely be checked in StackExchange.Redis. Here's a rig I've put together in SE.Redis (and, embarrassingly, it did highlight a slight bug, fixed in next release, so .222 or later); output first:
Subscribing...
Sending (preserved order)...
Allowing time for delivery etc...
Checking...
Received: 500 in 2993ms
Out of order: 0
Sending (any order)...
Allowing time for delivery etc...
Checking...
Received: 500 in 341ms
Out of order: 306
(keep in mind that 500 x 5ms is 2500, so we should not be amazed by the 2993ms number, or the 341ms - this is mainly the cost of the Thread.Sleep we have added to nudge the thread-pool into overlapping them; if we remove that, both loops take 0ms, which is awesome - but we can't see the overlapping issue so convincingly)
As you can see, the first run has the correct order output; the second run has mixed order, but it is ten times faster. And that is when doing trivial work; for real work it would be even more noticeable. As always, it is a trade-off.
Here's the test rig:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using StackExchange.Redis;
static class Program
{
static void Main()
{
using (var conn = ConnectionMultiplexer.Connect("localhost"))
{
var sub = conn.GetSubscriber();
var received = new List<int>();
Console.WriteLine("Subscribing...");
const int COUNT = 500;
sub.Subscribe("foo", (channel, message) =>
{
lock (received)
{
received.Add((int)message);
if (received.Count == COUNT)
Monitor.PulseAll(received); // wake the test rig
}
Thread.Sleep(5); // you kinda need to be slow, otherwise
// the pool will end up doing everything on one thread
});
SendAndCheck(conn, received, COUNT, true);
SendAndCheck(conn, received, COUNT, false);
}
Console.WriteLine("Press any key");
Console.ReadLine();
}
static void SendAndCheck(ConnectionMultiplexer conn, List<int> received, int quantity, bool preserveAsyncOrder)
{
conn.PreserveAsyncOrder = preserveAsyncOrder;
var sub = conn.GetSubscriber();
Console.WriteLine();
Console.WriteLine("Sending ({0})...", (preserveAsyncOrder ? "preserved order" : "any order"));
lock (received)
{
received.Clear();
// we'll also use received as a wait-detection mechanism; sneaky
// note: this does not do any cheating;
// it all goes to the server and back
for (int i = 0; i < quantity; i++)
{
sub.Publish("foo", i);
}
Console.WriteLine("Allowing time for delivery etc...");
var watch = Stopwatch.StartNew();
if (!Monitor.Wait(received, 10000))
{
Console.WriteLine("Timed out; expect less data");
}
watch.Stop();
Console.WriteLine("Checking...");
lock (received)
{
Console.WriteLine("Received: {0} in {1}ms", received.Count, watch.ElapsedMilliseconds);
int wrongOrder = 0;
for (int i = 0; i < Math.Min(quantity, received.Count); i++)
{
if (received[i] != i) wrongOrder++;
}
Console.WriteLine("Out of order: " + wrongOrder);
}
}
}
}

ZeroMQ performance issue

I'm having an issue with ZeroMQ, which I believe is because I'm not very familiar with it.
I'm trying to build a very simple service where multiple clients connect to a server and send a query. The server responds to this query.
When I use REQ-REP socket combination (client using REQ, server binding to a REP socket) I'm able to get close to 60,000 messages per second at server side (when client and server are on the same machine). When distributed across machines, each new instance of client on a different machine linearly increases the messages per second at the server and easily reaches 40,000+ with enough client instances.
Now REP socket is blocking, so I followed ZeroMQ guide and used the rrbroker pattern (http://zguide.zeromq.org/cs:rrbroker):
REQ (client) <----> [server ROUTER -- DEALER --- REP (workers running on different threads)]
However, this completely screws up the performance. I'm getting only around 4000 messages per second at the server when running across machines. Not only that, each new client started on a different machine reduces the throughput of every other client.
I'm pretty sure I'm doing something stupid. I'm wondering if ZeroMQ experts here can point out any obvious mistakes. Thanks!
Edit: Adding code as per advice. I'm using the clrzmq nuget package (https://www.nuget.org/packages/clrzmq-x64/)
Here's the client code. A timer counts how many responses are received every second.
for (int i = 0; i < numTasks; i++) { Task.Factory.StartNew(() => Client(), TaskCreationOptions.LongRunning); }
void Client()
{
using (var ctx = new Context())
{
Socket socket = ctx.Socket(SocketType.REQ);
socket.Connect("tcp://192.168.1.10:1234");
while (true)
{
socket.Send("ping", Encoding.Unicode);
string res = socket.Recv(Encoding.Unicode);
}
}
}
Server - case 1: The server keeps track of how many requests are received per second
using (var zmqContext = new Context())
{
Socket socket = zmqContext.Socket(SocketType.REP);
socket.Bind("tcp://*:1234");
while (true)
{
string q = socket.Recv(Encoding.Unicode);
if (q.CompareTo("ping") == 0) {
socket.Send("pong", Encoding.Unicode);
}
}
}
With this setup, at server side, I can see around 60,000 requests received per second (when client is on the same machine). When on different machines, each new client increases number of requests received at server as expected.
Server Case 2: This is essentially rrbroker from ZMQ guide.
void ReceiveMessages(Context zmqContext, string zmqConnectionString, int numWorkers)
{
List<PollItem> pollItemsList = new List<PollItem>();
routerSocket = zmqContext.Socket(SocketType.ROUTER);
try
{
routerSocket.Bind(zmqConnectionString);
PollItem pollItem = routerSocket.CreatePollItem(IOMultiPlex.POLLIN);
pollItem.PollInHandler += RouterSocket_PollInHandler;
pollItemsList.Add(pollItem);
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("{0}", ze.Message);
return;
}
dealerSocket = zmqContext.Socket(SocketType.DEALER);
try
{
dealerSocket.Bind("inproc://workers");
PollItem pollItem = dealerSocket.CreatePollItem(IOMultiPlex.POLLIN);
pollItem.PollInHandler += DealerSocket_PollInHandler;
pollItemsList.Add(pollItem);
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("{0}", ze.Message);
return;
}
// Start the worker pool; cant connect
// to inproc socket before binding.
workerPool.Start(numWorkers);
while (true)
{
zmqContext.Poll(pollItemsList.ToArray());
}
}
void RouterSocket_PollInHandler(Socket socket, IOMultiPlex revents)
{
RelayMessage(routerSocket, dealerSocket);
}
void DealerSocket_PollInHandler(Socket socket, IOMultiPlex revents)
{
RelayMessage(dealerSocket, routerSocket);
}
void RelayMessage(Socket source, Socket destination)
{
bool hasMore = true;
while (hasMore)
{
byte[] message = source.Recv();
hasMore = source.RcvMore;
destination.Send(message, message.Length, hasMore ? SendRecvOpt.SNDMORE : SendRecvOpt.NONE);
}
}
Where the worker pool's start method is:
public void Start(int numWorkerTasks=8)
{
for (int i = 0; i < numWorkerTasks; i++)
{
QueryWorker worker = new QueryWorker(this.zmqContext);
Task task = Task.Factory.StartNew(() =>
worker.Start(),
TaskCreationOptions.LongRunning);
}
Console.WriteLine("Started {0} with {1} workers.", this.GetType().Name, numWorkerTasks);
}
public class QueryWorker
{
Context zmqContext;
public QueryWorker(Context zmqContext)
{
this.zmqContext = zmqContext;
}
public void Start()
{
Socket socket = this.zmqContext.Socket(SocketType.REP);
try
{
socket.Connect("inproc://workers");
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("Could not create worker, error: {0}", ze.Message);
return;
}
while (true)
{
try
{
string message = socket.Recv(Encoding.Unicode);
if (message.CompareTo("ping") == 0)
{
socket.Send("pong", Encoding.Unicode);
}
}
catch (ZMQ.Exception ze)
{
Console.WriteLine("Could not receive message, error: " + ze.ToString());
}
}
}
}
Could you post some source code or at least a more detailed explanation of your test case? In general the way to build out your design is to make one change at a time, and measure at each change. You can always move stepwise from a known working design to more complex ones.
Most probably the 'ROUTER' is the bottleneck.
Check out these related questions on this:
Client maintenance in ZMQ ROUTER
Load testing ZeroMQ (ZMQ_STREAM) for finding the maximum simultaneous users it can handle
ROUTER (and ZMQ_STREAM, which is just a variant of ROUTER) internally has to maintain the client mapping, hence IMO it can accept only a limited number of connections from a particular client. It looks like ROUTER can multiplex multiple clients only as long as each client has only one active connection.
I could be wrong here, but I am not seeing much proof to the contrary (simple working code that scales to multiple clients with multiple connections with ROUTER or STREAM).
There certainly is a very severe restriction on concurrent connections with ZeroMQ, though it looks like no one knows what is causing it.
I have done performance testing on calling a native unmanaged DLL function with various methods from C#:
1. C++/CLI wrapper
2. PInvoke
3. ZeroMQ/clrzmq
The last might be interesting for you.
My finding at the end of my performance test was that the ZMQ binding clrzmq was not useful: it produced a factor of 100 performance overhead, even after I tried to optimize the PInvoke calls within the source code of the binding. Therefore I used ZMQ without a binding, through direct PInvoke calls. These calls must be made with the cdecl convention and with the option "SuppressUnmanagedCodeSecurity" to get the most speed.
I had to import just 5 functions, which was fairly easy.
In the end the speed was a bit slower than a plain PInvoke call, but that was with ZMQ in between (in my case over "inproc").
This may give you the hint to try it without the binding, if speed matters to you.
This is not a direct answer for your question but may help you to increase performance in general.
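As a rough sketch of what calling ZMQ directly through PInvoke could look like, the declarations below combine the cdecl convention with SuppressUnmanagedCodeSecurity as described above. The answer does not say which five functions were imported, so the selection here is an assumption, as are the library name "libzmq" and the class name NativeZmq. The signatures follow the libzmq C API from version 3.2 onward. On .NET Core and later the attribute is accepted but has no effect.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;

// Skips the managed security stack walk on each native call (.NET Framework).
[SuppressUnmanagedCodeSecurity]
public static class NativeZmq
{
    // Assumed library name; adjust to the native DLL that is actually deployed.
    private const string Lib = "libzmq";

    // void *zmq_ctx_new (void);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr zmq_ctx_new();

    // void *zmq_socket (void *context, int type);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr zmq_socket(IntPtr context, int type);

    // int zmq_connect (void *socket, const char *endpoint);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    public static extern int zmq_connect(IntPtr socket, string endpoint);

    // int zmq_send (void *socket, const void *buf, size_t len, int flags);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern int zmq_send(IntPtr socket, byte[] buffer, UIntPtr length, int flags);

    // int zmq_recv (void *socket, void *buf, size_t len, int flags);
    [DllImport(Lib, CallingConvention = CallingConvention.Cdecl)]
    public static extern int zmq_recv(IntPtr socket, byte[] buffer, UIntPtr length, int flags);
}
```

The declarations compile without the native library present; the library is only loaded on the first call. A complete wrapper would also need the cleanup functions (zmq_close and zmq_ctx_term in the C API) and error handling via zmq_errno.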

HttpWebRequest Limitations? Or bad implementation

I am trying to build a C# console app that will monitor about 3000 URLs (I just need to know that a HEAD request returned 200, not necessarily the content, etc.).
My attempt here was to build a routine that checks the web URLs, looping and creating threads that each execute the routine. If I run with fewer than 20 threads, it executes OK most of the time, but if I use more than 20 threads, some of the URLs time out. I tried increasing the timeout to 30 seconds, and the same thing occurs. The network I am running this on is more than capable of executing 50 HTTP HEAD requests (10 Mbit connection at the ISP), and both CPU and network usage are very low while the routine executes.
When a timeout occurs, I test the same IP in a browser and it works fine. I tested this repeatedly, and a URL reported as "timed out" was never actually timing out.
The reason I want to run more than 20 threads is that I want to perform this test every 5 minutes. With some of the URLs taking a full 10 seconds (or longer if the timeout is set higher), I want to make sure it can run through all URLs within 2-3 minutes.
Is there a better way to check whether a URL is available, or should I be looking at the system/network for an issue?
MAIN
while (rdr.Read())
{
Thread t = new Thread(new ParameterizedThreadStart(check_web));
t.Start(rdr[0]);
}
static void check_web(object weburl)
{
bool isok;
isok = ConnectionAvailable(weburl.ToString());
}
public static bool ConnectionAvailable(string strServer)
{
try
{
strServer = "http://" + strServer;
HttpWebRequest reqFP = (HttpWebRequest)HttpWebRequest.Create(strServer);
reqFP.Timeout = 10000;
reqFP.Method = "HEAD";
HttpWebResponse rspFP = (HttpWebResponse)reqFP.GetResponse();
if (HttpStatusCode.OK == rspFP.StatusCode)
{
Console.WriteLine(strServer + " - OK");
rspFP.Close();
return true;
}
else
{
Console.WriteLine(strServer + " Server returned error..");
rspFP.Close();
return false;
}
}
catch (WebException x)
{
if (x.ToString().Contains("timed out"))
{
Console.WriteLine(strServer + " - Timed out");
}
else
{
Console.WriteLine(x.Message.ToString());
}
return false;
}
}
Just remember, you asked.
Very bad implementation.
Do not go creating threads like that. It does very little good to have more threads than processor cores. The extra threads will pretty much just compete with each other, especially since they're all running the same code.
You need to implement using blocks. If you throw an exception (and chances are you will), then you will be leaking resources.
What is the purpose in returning a bool? Do you check it somewhere? In any case, your error and exception processing are a mess.
When you get a non-200 response, you don't display the error code.
You're comparing against the Message property to decide if it's a timeout. Microsoft should put a space between the "time" and "out" just to spite you.
When it's not a timeout, you display only the Message property, not the entire exception, and the Message property is already a string and doesn't need you to call ToString() on it.
Next Batch of Changes
This isn't finished, I don't think, but try this one:
public static void Main()
{
// Don't mind the interpretation. I needed an excuse to define "rdr"
using (var conn = new SqlConnection())
{
conn.Open();
using (var cmd = new SqlCommand("SELECT Url FROM UrlsToCheck", conn))
{
using (var rdr = cmd.ExecuteReader())
{
while (rdr.Read())
{
// Use the thread pool. Please.
ThreadPool.QueueUserWorkItem(
delegate(object weburl)
{
// I invented a reason for you to return bool
if (!ConnectionAvailable(weburl.ToString()))
{
// Console would be getting pretty busy with all
// those threads
Debug.WriteLine(
String.Format(
"{0} was not available",
weburl));
}
},
rdr[0]);
}
}
}
}
}
public static bool ConnectionAvailable(string strServer)
{
try
{
strServer = "http://" + strServer;
var reqFp = (HttpWebRequest)WebRequest.Create(strServer);
reqFp.Timeout = 10000;
reqFp.Method = "HEAD";
// BTW, what's an "FP"?
using (var rspFp = (HttpWebResponse) reqFp.GetResponse()) // IDisposable
{
if (HttpStatusCode.OK == rspFp.StatusCode)
{
Debug.WriteLine(string.Format("{0} - OK", strServer));
return true; // Dispose called when using is exited
}
// Include the error because it's nice to know these things
Debug.WriteLine(String.Format(
"{0} Server returned error: {1}",
strServer, rspFp.StatusCode));
return false;
}
}
catch (WebException x)
{
// Don't tempt fate and don't let programs read human-readable messages
if (x.Status == WebExceptionStatus.Timeout)
{
Debug.WriteLine(string.Format("{0} - Timed out", strServer));
}
else
{
// The FULL exception, please
Debug.WriteLine(x.ToString());
}
return false;
}
}
Almost Done - Not Tested Late Night Code
public static void Main()
{
using (var conn = new SqlConnection())
{
conn.Open();
using (var cmd = new SqlCommand("", conn))
{
using (var rdr = cmd.ExecuteReader())
{
if (rdr == null)
{
return;
}
while (rdr.Read())
{
ThreadPool.QueueUserWorkItem(
CheckConnectionAvailable, rdr[0]);
}
}
}
}
}
private static void CheckConnectionAvailable(object weburl)
{
try
{
// If this works, it's a lot simpler
var strServer = new Uri("http://" + weburl);
using (var client = new WebClient())
{
client.UploadDataCompleted += ClientOnUploadDataCompleted;
client.UploadDataAsync(
strServer, "HEAD", new byte[] {}, strServer);
}
}
catch (WebException x)
{
Debug.WriteLine(x);
}
}
private static void ClientOnUploadDataCompleted(
object sender, UploadDataCompletedEventArgs args)
{
if (args.Error == null)
{
Debug.WriteLine(string.Format("{0} - OK", args.UserState));
}
else
{
Debug.WriteLine(string.Format("{0} - Error", args.Error));
}
}
Use the ThreadPool class. Don't spawn hundreds of threads like this. Threads have such a huge overhead that in your case the CPU will spend 99% of its time on context switching and 1% doing real work.
Don't use threads.
Use async callbacks and queues. Why create a thread per request when the resource they all want is access to the outside world? Limit your threads to about 5, and then implement a class that uses a queue. Split the code into two parts, the fetch and the process: one controls the flow of data while the other controls access to the outside world.
Use whatever language you like, but you won't go wrong if you think of threads as being for processing and number crunching, and async callbacks as being for resource management.
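The "limit the workers and queue the rest" advice can be sketched with async calls and a semaphore instead of dedicated threads. This is a minimal sketch under stated assumptions, not the code from the answers above: it swaps HttpWebRequest for HttpClient, and the names UrlMonitor, HeadIsOkAsync and CheckAllAsync are hypothetical. The probe is passed in as a delegate so the throttling logic can be exercised without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class UrlMonitor
{
    // One shared client; the 10 second timeout mirrors the question's setting.
    private static readonly HttpClient Http =
        new HttpClient { Timeout = TimeSpan.FromSeconds(10) };

    // Sends a HEAD request and reports whether the server answered 200 OK.
    public static async Task<bool> HeadIsOkAsync(string url)
    {
        try
        {
            using (var request = new HttpRequestMessage(HttpMethod.Head, url))
            using (var response = await Http.SendAsync(request).ConfigureAwait(false))
            {
                return response.StatusCode == HttpStatusCode.OK;
            }
        }
        catch (HttpRequestException) { return false; }  // connection or DNS failure
        catch (TaskCanceledException) { return false; } // HttpClient reports a timeout this way
    }

    // Runs `probe` over the distinct URLs with at most `maxConcurrent` in flight.
    public static async Task<Dictionary<string, bool>> CheckAllAsync(
        IEnumerable<string> urls, int maxConcurrent, Func<string, Task<bool>> probe)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = urls.Distinct().Select(async url =>
            {
                await gate.WaitAsync().ConfigureAwait(false); // queue until a slot is free
                try
                {
                    bool ok = await probe(url).ConfigureAwait(false);
                    return new KeyValuePair<string, bool>(url, ok);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            var results = await Task.WhenAll(tasks).ConfigureAwait(false);
            return results.ToDictionary(r => r.Key, r => r.Value);
        }
    }
}
```

A caller would pass UrlMonitor.HeadIsOkAsync as the probe, for example CheckAllAsync(urls, 20, UrlMonitor.HeadIsOkAsync). No thread is blocked while a request is waiting on the network, so the concurrency limit can be tuned independently of the number of processor cores.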
