No synchronized non-blocking read method in basic Stream/StreamReader class - c#

Recently I have been experimenting with secured networking over System.Net.Sockets using the BouncyCastle library.
The TlsStream class in BouncyCastle inherits from the base Stream class (not NetworkStream), and StreamReader/StreamWriter seem to be a convenient way to read and write.
I tend to use one thread per endpoint (server or client) to handle both reading and writing:
void CommunicationLoop() // Loops in Thread A
{
while (true)
{
ReadFromStream(); // If data available. It always hangs/blocks here(if there's no data to be read.)
WriteToStream(); // If user input something.
}
}
void ReadFromStream()
{
String line;
while ( StreamReader.Peek() > -1 )
// Or ((line = StreamReader.ReadLine()) != null) / (Stream.Read(buff, 0, buff.Length) > 0)
// or any synchronized Readxxx() methods.
// It always hangs/blocks here(if there's no data to be read.)
{
line = StreamReader.ReadLine();
Console.WriteLine($"Received: {line}");
}
}
void WriteToStream()
{
//...
}
I did a lot of research, and everyone suggests using the async methods to solve the problem.
I would like to know: is there really no official method to check whether there is data to be read in a StreamReader/Stream and, if there is none, skip the read instead of hanging while waiting for input (like NetworkStream.DataAvailable)?
Also, if the communication on one connection is not heavy, isn't using one thread for both reading and writing on the server side (there might be connections from MANY CLIENTS to ONE SERVER) more efficient, since it saves resources?
Thanks.

I would like to know that, is there really no official method/function to check if there is data to be read in StreamReader/Stream
Check the documentation for StreamReader. As far as I can see there is no way to check for waiting data without using the async methods.
Also, if the communication for 1 connection is not heavy, isn't using 1 thread dealing with both read/write in server side (there might be multiple connections from MANY CLIENTS to ONE SERVER) more efficient(saves resource)
This should not be more efficient than using the async methods. Consider the case where all clients are idle: your approach would use one thread per client, whereas the async methods would not use any threads, assuming they use non-blocking I/O in the backend. It is possible that the sync methods have slightly lower overhead, since they can do the synchronization in the kernel rather than in .NET, but this would need benchmarking to verify.
Is there some specific reason you do not want to use the async methods?
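For reference, here is a minimal sketch of the async read loop this answer has in mind, assuming a line-based protocol. A MemoryStream stands in for the TLS stream, and ReadLoopAsync and the sample text are illustrative names that do not come from BouncyCastle.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;

static class AsyncReadSketch
{
    // Reads lines until the stream ends. While no data is available the
    // await yields to the caller instead of blocking the calling thread.
    public static async Task<List<string>> ReadLoopAsync(Stream stream)
    {
        var received = new List<string>();
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            while ((line = await reader.ReadLineAsync()) != null)
            {
                received.Add(line);
            }
        }
        return received;
    }

    public static void Main()
    {
        // Assumption: any readable Stream works here; a MemoryStream keeps the sketch runnable.
        var stream = new MemoryStream(Encoding.UTF8.GetBytes("hello\nworld\n"));
        Task<List<string>> readTask = ReadLoopAsync(stream);

        // The writing side of the communication loop could run here while the read is pending.

        foreach (string line in readTask.Result)
        {
            Console.WriteLine($"Received: {line}");
        }
    }
}
```

Whether this avoids tying up a thread depends on how the concrete stream implements ReadAsync: the base Stream class falls back to running the blocking Read on a thread-pool thread, so a stream that does not override it still blocks a thread, just not the caller's.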

Related

How does PubSub work in BookSleeve/ Redis?

I wonder what the best way is to publish and subscribe to channels using BookSleeve. I currently implement several static methods (see below) that let me publish content to a specific channel with the newly created channel being stored in private static Dictionary<string, RedisSubscriberConnection> subscribedChannels;.
Is this the right approach, given that I want to publish and subscribe to channels within the same application (note: my wrapper is a static class)? Is it enough to create one channel even if I want to both publish and subscribe? Obviously I would not publish to the same channel that I subscribe to within the same application, but I tested it and it worked:
RedisClient.SubscribeToChannel("Test").Wait();
RedisClient.Publish("Test", "Test Message");
Here are my questions:
1) Will it be more efficient to setup a dedicated publish channel and a dedicated subscribe channel rather than using one channel for both?
2) What is the difference between "channel" and "PatternSubscription" semantically? My understanding is that I can subscribe to several "topics" through PatternSubscription() on the same channel, correct? But if I want to have different callbacks invoked for each "topic" I would have to setup a channel for each topic correct? Is that efficient or would you advise against that?
Here are the code snippets.
Thanks!!!
public static Task<long> Publish(string channel, byte[] message)
{
return connection.Publish(channel, message);
}
public static Task SubscribeToChannel(string channelName)
{
string subscriptionString = ChannelSubscriptionString(channelName);
RedisSubscriberConnection channel = connection.GetOpenSubscriberChannel();
subscribedChannels[subscriptionString] = channel;
return channel.PatternSubscribe(subscriptionString, OnSubscribedChannelMessage);
}
public static Task UnsubscribeFromChannel(string channelName)
{
string subscriptionString = ChannelSubscriptionString(channelName);
if (subscribedChannels.Keys.Contains(subscriptionString))
{
RedisSubscriberConnection channel = subscribedChannels[subscriptionString];
Task task = channel.PatternUnsubscribe(subscriptionString);
//remove channel subscription
channel.Close(true);
subscribedChannels.Remove(subscriptionString);
return task;
}
else
{
return null;
}
}
private static string ChannelSubscriptionString(string channelName)
{
return channelName + "*";
}
1: there is only one channel in your example (Test); a channel is just the name used for a particular pub/sub exchange. It is, however, necessary to use 2 connections due to specifics of how the redis API works. A connection that has any subscriptions cannot do anything else except:
listen to messages
manage its own subscriptions (subscribe, psubscribe, unsubscribe, punsubscribe)
However, I don't understand this:
private static Dictionary<string, RedisSubscriberConnection>
You shouldn't need more than one subscriber connection unless you are catering for something specific to you. A single subscriber connection can handle an arbitrary number of subscriptions. A quick check on client list on one of my servers, and I have one connection with (at time of writing) 23,002 subscriptions. Which could probably be reduced, but: it works.
2: pattern subscriptions support wildcards; so rather than subscribing to /topic/1, /topic/2/ etc you could subscribe to /topic/*. The name of the actual channel used by publish is provided to the receiver as part of the callback signature.
Either can work. It should be noted that the performance of publish is affected by the total number of unique subscriptions, but frankly it is still stupidly fast (as in: 0ms) even if you have tens of thousands of subscribed channels using subscribe rather than psubscribe.
But from the documentation for publish:
Time complexity: O(N+M) where N is the number of clients subscribed to the receiving channel and M is the total number of subscribed patterns (by any client).
I recommend reading the redis documentation of pub/sub.
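To make the O(N+M) figure concrete, here is a toy in-memory model of how a publish fans out to exact and pattern subscriptions. It is not Redis or BookSleeve code: ToyBroker and its methods are hypothetical names, and only a trailing `*` wildcard is modelled.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: illustrates the lookup cost, not the real Redis implementation.
class ToyBroker
{
    // Channel name -> handlers subscribed to exactly that channel (the "N" part).
    private readonly Dictionary<string, List<Action<string, string>>> exact =
        new Dictionary<string, List<Action<string, string>>>();

    // Every pattern subscription from every client (the "M" part).
    private readonly List<KeyValuePair<string, Action<string, string>>> patterns =
        new List<KeyValuePair<string, Action<string, string>>>();

    public void Subscribe(string channel, Action<string, string> handler)
    {
        List<Action<string, string>> handlers;
        if (!exact.TryGetValue(channel, out handlers))
        {
            exact[channel] = handlers = new List<Action<string, string>>();
        }
        handlers.Add(handler);
    }

    public void PatternSubscribe(string pattern, Action<string, string> handler)
    {
        patterns.Add(new KeyValuePair<string, Action<string, string>>(pattern, handler));
    }

    // Returns the number of deliveries. The handler receives the actual channel name.
    public int Publish(string channel, string message)
    {
        int delivered = 0;

        // One dictionary lookup, then N handlers for this channel.
        List<Action<string, string>> handlers;
        if (exact.TryGetValue(channel, out handlers))
        {
            foreach (var handler in handlers) { handler(channel, message); delivered++; }
        }

        // Every pattern has to be tested against the channel name: M checks.
        foreach (var entry in patterns)
        {
            string pattern = entry.Key;
            bool matches = pattern.EndsWith("*", StringComparison.Ordinal)
                ? channel.StartsWith(pattern.Substring(0, pattern.Length - 1), StringComparison.Ordinal)
                : channel == pattern;
            if (matches) { entry.Value(channel, message); delivered++; }
        }
        return delivered;
    }

    public static void Main()
    {
        var broker = new ToyBroker();
        broker.Subscribe("/topic/1", (channel, msg) => Console.WriteLine("exact   " + channel + ": " + msg));
        broker.PatternSubscribe("/topic/*", (channel, msg) => Console.WriteLine("pattern " + channel + ": " + msg));
        broker.Publish("/topic/1", "first");
        broker.Publish("/topic/2", "second");
    }
}
```

The point of the model is that exact subscriptions cost a lookup plus the subscribers of that one channel, while every pattern is checked on every publish, which is why the number of patterns shows up in the complexity.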
Edit for follow on questions:
a) I assume I would have to "publish" synchronously (using Result or Wait()) if I want to guarantee the order of sending items from the same publisher is preserved when receiving items, correct?
that won't make any difference at all; since you mention Result / Wait(), I assume you're talking about BookSleeve - in which case the multiplexer already preserves command order. Redis itself is single threaded, and will always process commands on a single connection in order. However: the callbacks on the subscriber may be executed asynchronously and may be handed (separately) to a worker thread. I am currently investigating whether I can force this to be in-order from RedisSubscriberConnection.
Update: from 1.3.22 onwards you can set the CompletionMode to PreserveOrder - then all callbacks will be completed sequentially rather than concurrently.
b) after making adjustments according to your suggestions I get a great performance when publishing few items regardless of the size of the payload. However, when sending 100,000 or more items by the same publisher performance drops rapidly (down to 7-8 seconds just to send from my machine).
Firstly, that time sounds high - testing locally I get (for 100,000 publications, including waiting for the response for all of them) 1766ms (local) or 1219ms (remote) (that might sound counter-intuitive, but my "local" isn't running the same version of redis; my "remote" is 2.6.12 on Centos; my "local" is 2.6.8-pre2 on Windows).
I can't make your actual server faster or speed up the network, but: in case this is packet fragmentation, I have added (just for you) a SuspendFlush() / ResumeFlush() pair. This disables eager-flushing (i.e. when the send-queue is empty; other types of flushing still happen); you might find this helps:
conn.SuspendFlush();
try {
// start lots of operations...
} finally {
conn.ResumeFlush();
}
Note that you shouldn't Wait until you have resumed, because until you call ResumeFlush() there could be some operations still in the send-buffer. With that all in place, I get (for 100,000 operations):
local: 1766ms (eager-flush) vs 1554ms (suspend-flush)
remote: 1219ms (eager-flush) vs 796ms (suspend-flush)
As you can see, it helps more with remote servers, as it will be putting fewer packets through the network.
I cannot use transactions because later on the to-be-published items are not all available at once. Is there a way to optimize with that knowledge in mind?
I think that is addressed by the above - but note that recently CreateBatch was added too. A batch operates a lot like a transaction - just: without the transaction. Again, it is another mechanism to reduce packet fragmentation. In your particular case, I suspect the suspend/resume (on flush) is your best bet.
Do you recommend having one general RedisConnection and one RedisSubscriberConnection or any other configuration to have such wrapper perform desired functions?
As long as you're not performing blocking operations (blpop, brpop, brpoplpush etc), or putting oversized BLOBs down the wire (potentially delaying other operations while it clears), then a single connection of each type usually works pretty well. But YMMV depending on your exact usage requirements.

How to create a C# TCP listener that keeps all clients connected in one thread, sending events only when clients write to it?

Say we want an API like this:
var Listner = new ServerSocket();
Listner.Bind(URL);
Listner.OnData((senderClient, ClientDataStream) => {/* ... */})
We also want the delegate passed to OnData to be executed in a limited multithreaded task pool that does not affect socket receiving performance.
A new task for a given senderClient shall be added to the end of the task pool only when the current task for that senderClient has finished executing.
Of course, while working in OnData we shall be able to write data back to clients through the socket.
We cannot provide information on the length of the next ClientDataStream when parsing the current frame. So ClientDataStream shall provide the ability to read as much as needed from it in the form of an async operation, like:
{
    byte[] header = ClientDataStream.Read(5).Result; // Result blocks until the bytes have arrived
    /* */
    byte[] body = ClientDataStream.Read(someDinamicVarNWeGotFromThatFirstFiveBytes).Result; //...
}
and while the task waits, it should allow other tasks to work.
Is there such a smart socket server in .NET out of the box or in some OSS library?
I'm not aware of a ServerSocket class in .NET. It's just Socket. It can do stuff asynchronously. There is an extensive article on MSDN: http://msdn.microsoft.com/en-us/library/5w7b7x5f%28v=vs.110%29.aspx
The API is somewhat different from your pseudocode. There is no OnData event, but a BeginReceive method that takes a callback method.
The Socket class does not support async/await out of the box (if you're using .NET 4.5), but I ran into this blog article that defines some extension methods for the class to make it possible to use that programming model as well.
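As a rough sketch of the "read a header, then read as many bytes as the header says" requirement on top of async/await: the helper below works on any Stream, so with a real connection one could pass the NetworkStream returned by TcpClient.GetStream(). FrameReader and the 1-byte length header are assumptions made for illustration; the question's protocol uses a 5-byte header.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class FrameReader
{
    // Reads exactly `count` bytes, awaiting (not blocking) while data is pending.
    // A single ReadAsync may return fewer bytes than requested, hence the loop.
    public static async Task<byte[]> ReadExactlyAsync(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = await stream.ReadAsync(buffer, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException("Connection closed before the frame was complete.");
            }
            offset += read;
        }
        return buffer;
    }

    // Hypothetical frame layout: one length byte followed by that many payload bytes.
    public static async Task<byte[]> ReadFrameAsync(Stream stream)
    {
        byte[] header = await ReadExactlyAsync(stream, 1);
        return await ReadExactlyAsync(stream, header[0]);
    }

    public static void Main()
    {
        // Two frames back to back: [3 | 10 20 30] and [1 | 99].
        var stream = new MemoryStream(new byte[] { 3, 10, 20, 30, 1, 99 });
        byte[] first = ReadFrameAsync(stream).Result;
        byte[] second = ReadFrameAsync(stream).Result;
        Console.WriteLine(first.Length + " bytes, then " + second.Length + " byte");
    }
}
```

While a read is awaited, no thread is dedicated to that client, which is what allows other clients' handlers to run in the meantime.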

WSAEWOULDBLOCK handling

I have written a socket for a server in C++/CLI that uses Winsock. The sockets use async methods for sending, receiving and accepting connections. After deploying my socket in the production environment, the send function stopped working, giving me the error WSAEWOULDBLOCK. From my research on the net, this means the network buffer for socket I/O is full or the network is too busy to perform my operation at that moment. However, I have not seen any specific solution that addresses this problem. My temporary solution was to put a do-while loop around the WSASend call, making the thread sleep for X ms and then try again. This resulted in far higher latency than the previous socket (the .NET Socket class) and large lag spikes.
My code for sending data is as follows:
void Connectivity::ConnectionInformation::SendData(unsigned char data[], const int length)
{
if (isClosed || sendError)
return;
Monitor::Enter(this->syncRoot);
try
{
sendInfo->buf = (char*)data;
sendInfo->len = length;
do
{
state = 0;
if (WSASend(connection, sendInfo, 1, bytesSent, 0, NULL, NULL) == SOCKET_ERROR)
{
state = WSAGetLastError();
if (state == WSAEWOULDBLOCK)
{
Thread::Sleep(SleepTime);
//The network is busy, so we need to wait a bit before the data can be sent
//Might want to decrease the value, since this could potentially lead to lag
}
else if (state != WSA_IO_PENDING)
{
this->sendError = true;
//The send error flag makes sure that the close function doesn't get called
//during packet processing, which could cause a lot of null reference exceptions.
}
}
}
while (state == WSAEWOULDBLOCK);
}
finally
{
Monitor::Exit(this->syncRoot);
}
}
Is there a way to use, for example, the WSAEventSelect method to get a callback when I am able to send data? According to the documentation on MSDN, the wait-for-data method could also get stuck in this error. Does anyone have a solution for getting around this?
The error code WSAEWOULDBLOCK means that you attempted to operate on a non-blocking socket but the operation could not be completed immediately. This is not a real error - it means that you can retry later or schedule an asynchronous IO (which wouldn't fail). But this is not what you want in the first place. Let me explain:
You are supposed to use sockets in one of two ways:
Synchronous, blocking.
Asynchronous, non-blocking, callback-based.
You are mixing the two which gets you the worst of both. You created a non-blocking socket and use it in a potentially blocking way.
Alas, I'm not fully qualified to give best practices for native-code sockets. I suggest you read all of the docs for WSASend, because they seem to explain all of this.
Now, why would this strange error code even exist? It is a performance optimization. You can speculatively try to send synchronously (which is very fast). And only if it fails you are supposed to schedule an asynchronous IO. If you don't need that optimization (which you don't) don't do it.
As @usr says, I need to have either LPWSAOVERLAPPED or LPWSAOVERLAPPED_COMPLETION_ROUTINE set to a value in order to make the operation non-blocking. However, after testing, I found out that I need to have an LPWSAOVERLAPPED object in order for the completion routine to be called. The MSDN documentation of the WSASend function also mentions that if both the overlapped object and the completion routine are NULL, the socket is treated as a non-overlapped socket.
Thanks, and merry xmas everyone! :)

How to properly parallelise job heavily relying on I/O

I'm building a console application that has to process a bunch of data.
Basically, the application grabs references from a DB. For each reference, it parses the content of a file and makes some changes. The files are HTML files, and the process does heavy work with RegEx replacements (finding references and transforming them into links). The result is then stored on the file system and sent to an external system.
To summarize the process sequentially:
var refs = GetReferencesFromDB(); // ~5000 DataRows returned
foreach (var reference in refs) // "ref" is a reserved word in C#, so the variable is named "reference"
{
    var filePath = GetFilePath(reference); // This method looks up in a previously loaded file list
    var html = File.ReadAllText(filePath); // Read html locally, or from a network drive
    var convertedHtml = ParseHtml(html);
    File.WriteAllText(destinationFilePath, convertedHtml); // Copy the result locally, or to a network drive
    SendToWs(reference, convertedHtml);
}
My program works correctly but is quite slow. That's why I want to parallelise the process.
So far, I have made a simple parallelization by adding AsParallel:
var refs = GetReferencesFromDB().AsParallel();
refs.ForAll(reference => // "ref" is a reserved word in C#
{
    var filePath = GetFilePath(reference);
    var html = File.ReadAllText(filePath);
    var convertedHtml = ParseHtml(html);
    File.WriteAllText(destinationFilePath, convertedHtml);
    SendToWs(reference, convertedHtml);
});
This simple change decreases the duration of the process (25% less time). However, my understanding of parallelization is that there won't be much benefit (or, worse, there will be a slowdown) when parallelizing over resources that rely on I/O, because the I/O capacity won't magically double.
That's why I think I should change my approach: not parallelize the whole process, but create dependent, chained, queued tasks.
I.e., I should create a flow like:
Queue read file. When finished, Queue ParseHtml. When finished, Queue both send to WS and write locally. When finished, log the result.
However, I don't know how to implement such a thing.
I feel it will end up as a set of consumer/producer queues, but I didn't find a good sample.
Moreover, I'm not sure whether there will be any benefit.
Thanks for any advice.
[Edit] In fact, I'm the perfect candidate for using .NET 4.5 (C# 5)... if only it was RTM :)
[Edit 2] Another thing that makes me think it's not correctly parallelized is that, in the resource monitor, the graphs of CPU, network I/O and disk I/O are not stable: when one is high, the others are low to medium.
You're not leveraging any async I/O APIs in any of your code. Everything you're doing occupies a thread, and all your I/O operations waste CPU resources by blocking. AsParallel is for compute-bound tasks; if you want to take advantage of async I/O, you need to use the Asynchronous Programming Model (APM) based APIs available today in <= v4.0. This is done by looking for BeginXXX/EndXXX methods on the I/O classes you're using and using them whenever available.
Read this post for starters: TPL TaskFactory.FromAsync vs Tasks with blocking methods
Next, you don't want to use AsParallel in this case anyway. AsParallel enables streaming, which results in immediately scheduling a new Task per item, but you don't need or want that here. You'd be much better served by partitioning the work using Parallel.ForEach.
Let's see how you can use this knowledge to achieve max concurrency in your specific case:
var refs = GetReferencesFromDB();
// Using Parallel.ForEach here will partition and process your data on separate worker threads
Parallel.ForEach(
    refs,
    reference => // "ref" is a reserved word in C#, so the parameter is named "reference"
    {
        string filePath = GetFilePath(reference);
        byte[] fileDataBuffer = new byte[1048576];
        // Need to use FileStream API directly so we can enable async I/O
        FileStream sourceFileStream = new FileStream(
            filePath,
            FileMode.Open,
            FileAccess.Read,
            FileShare.Read,
            8192,
            true);
        // Use FromAsync to read the data from the file
        Task<int> readSourceFileStreamTask = Task<int>.Factory.FromAsync(
            sourceFileStream.BeginRead,
            sourceFileStream.EndRead,
            fileDataBuffer,
            0,
            fileDataBuffer.Length,
            null);
        // Add a continuation that will fire when the async read is completed
        readSourceFileStreamTask.ContinueWith(readSourceFileStreamAntecedent =>
        {
            int sourceFileStreamBytesRead;
            try
            {
                // Determine exactly how many bytes were read
                // NOTE: this will propagate any potential exception that may have occurred in EndRead
                sourceFileStreamBytesRead = readSourceFileStreamAntecedent.Result;
            }
            finally
            {
                // Always clean up the source stream
                sourceFileStream.Close();
                sourceFileStream = null;
            }
            // This is here to make sure you don't end up trying to read files larger than this sample code can handle
            if (sourceFileStreamBytesRead == fileDataBuffer.Length)
            {
                throw new NotSupportedException("You need to implement reading files larger than 1MB. :P");
            }
            // Convert the file data to a string
            string html = Encoding.UTF8.GetString(fileDataBuffer, 0, sourceFileStreamBytesRead);
            // Parse the HTML
            string convertedHtml = ParseHtml(html);
            // This is here to make sure you don't end up trying to write files larger than this sample code can handle
            if (Encoding.UTF8.GetByteCount(convertedHtml) > fileDataBuffer.Length)
            {
                throw new NotSupportedException("You need to implement writing files larger than 1MB. :P");
            }
            // Convert the file data back to bytes for writing, remembering how many bytes were produced
            int convertedByteCount = Encoding.UTF8.GetBytes(convertedHtml, 0, convertedHtml.Length, fileDataBuffer, 0);
            // Need to use FileStream API directly so we can enable async I/O
            // FileMode.Create truncates an existing file so no stale bytes are left behind
            FileStream destinationFileStream = new FileStream(
                destinationFilePath,
                FileMode.Create,
                FileAccess.Write,
                FileShare.None,
                8192,
                true);
            // Use FromAsync to write the data to the file (only the bytes that were actually produced)
            Task destinationFileStreamWriteTask = Task.Factory.FromAsync(
                destinationFileStream.BeginWrite,
                destinationFileStream.EndWrite,
                fileDataBuffer,
                0,
                convertedByteCount,
                null);
            // Add a continuation that will fire when the async write is completed
            destinationFileStreamWriteTask.ContinueWith(destinationFileStreamWriteAntecedent =>
            {
                try
                {
                    // NOTE: we call wait here to observe any potential exceptions that might have occurred in EndWrite
                    destinationFileStreamWriteAntecedent.Wait();
                }
                finally
                {
                    // Always close the destination file stream
                    destinationFileStream.Close();
                    destinationFileStream = null;
                }
            },
            TaskContinuationOptions.AttachedToParent);
            // Send to external system **concurrent** to writing to destination file system above
            SendToWs(reference, convertedHtml);
        },
        TaskContinuationOptions.AttachedToParent);
    });
Now, here are a few notes:
This is sample code, so I'm using a 1MB buffer to read/write files. This is excessive for HTML files and wasteful of system resources. You can either lower it to suit your max needs or implement chained reads/writes into a StringBuilder, which is an exercise I leave up to you since I'd be writing ~500 more lines of code to do async chained reads/writes. :P
You'll note that on the continuations for the read/write tasks I have TaskContinuationOptions.AttachedToParent. This is very important, as it will prevent the worker thread that Parallel.ForEach starts the work with from completing until all the underlying async calls have completed. If this was not here, you would kick off work for all 5000 items concurrently, which would pollute the TPL subsystem with thousands of scheduled Tasks and not scale properly at all.
I call SendToWs concurrently with writing the file to the file share here. I don't know what underlies the implementation of SendToWs, but it too sounds like a good candidate for making async. Right now it's assumed to be pure compute work and, as such, is going to burn a CPU thread while executing. I leave it as an exercise to you to figure out how best to leverage what I've shown you to improve throughput there.
This is all typed free form; my brain was the only compiler here, and SO's syntax highlighting is all I used to make sure the syntax was good. So please forgive any syntax errors, and let me know if I screwed up anything so badly that you can't make heads or tails of it, and I'll follow up.
The good news is your logic could be easily separated into steps that go into a producer-consumer pipeline.
Step 1: Read file
Step 2: Parse file
Step 3: Write file
Step 4: SendToWs
If you are using .NET 4.0 you can use the BlockingCollection data structure as the backbone for each step's producer-consumer queue. The main thread will enqueue each work item into step 1's queue, where it will be picked up, processed and then forwarded on to step 2's queue, and so on.
If you are willing to move on to the Async CTP then you can take advantage of the new TPL Dataflow structures for this as well. There is the BufferBlock<T> data structure, among others, that behaves in a similar manner to BlockingCollection and integrates well with the new async and await keywords.
Because your algorithm is IO bound the producer-consumer strategies may not get you the performance boost you are looking for, but at least you will have a very elegant solution that would scale well if you could increase the IO throughput. I am afraid steps 1 and 3 will be the bottlenecks and the pipeline will not balance well, but it is worth experimenting with.
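A minimal sketch of such a pipeline with BlockingCollection might look like the following. It only models the read and parse stages and collects the output on the calling thread, where the write and SendToWs steps would go; PipelineSketch, the delegates passed to Run and the capacity of 8 are all illustrative assumptions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class PipelineSketch
{
    // Each stage runs on its own task; bounded queues apply back-pressure so a
    // fast stage cannot run arbitrarily far ahead of a slow one.
    public static List<string> Run(
        IEnumerable<string> paths,
        Func<string, string> read,
        Func<string, string> parse)
    {
        var toParse = new BlockingCollection<string>(8);
        var parsed = new BlockingCollection<string>(8);

        // Step 1: read each file and hand the content to the parser queue.
        Task reader = Task.Factory.StartNew(() =>
        {
            try { foreach (string path in paths) toParse.Add(read(path)); }
            finally { toParse.CompleteAdding(); } // tells the next stage no more items will come
        }, TaskCreationOptions.LongRunning);

        // Step 2: parse whatever the reader produced.
        Task parser = Task.Factory.StartNew(() =>
        {
            try { foreach (string html in toParse.GetConsumingEnumerable()) parsed.Add(parse(html)); }
            finally { parsed.CompleteAdding(); }
        }, TaskCreationOptions.LongRunning);

        // Steps 3 and 4 (write file, SendToWs) would consume this queue; here we just collect.
        List<string> results = parsed.GetConsumingEnumerable().ToList();
        Task.WaitAll(reader, parser); // surfaces exceptions thrown inside the stages
        return results;
    }

    public static void Main()
    {
        // Stand-ins for File.ReadAllText and ParseHtml so the sketch runs without files.
        List<string> output = Run(
            new[] { "a.html", "b.html" },
            path => "<p>" + path + "</p>",
            html => html.ToUpperInvariant());
        foreach (string item in output) Console.WriteLine(item);
    }
}
```

Error handling is kept minimal: a stage that throws completes its output queue so later stages finish, but a production version would also need to cancel the stage before it, which could otherwise block on a full queue.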
Just a suggestion, but have you looked into the consumer/producer pattern? A certain number of threads would read your files from disk and feed the content to a queue. Then another set of threads, known as the consumers, would "consume" the queue as it's filled. http://zone.ni.com/devzone/cda/tut/p/id/3023
Your best bet in this kind of scenario is definitely the producer-consumer model: one thread to pull the data and a bunch of workers to process it. There's no easy way around the I/O, so you might as well focus on optimizing the computation itself.
I will now try to sketch a model:
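// shared between the producer and the consumers (assuming the references are DataRows)
var queue = new Queue<DataRow>();
var itemAvailable = new AutoResetEvent(false); // signals that an item was placed in the queue

// producer thread
var refs = GetReferencesFromDB(); // ~5000 DataRows returned
foreach (var reference in refs) // "ref" and "event" are reserved words in C#, hence the names used here
{
    lock (queue)
    {
        queue.Enqueue(reference);
        itemAvailable.Set();
    }
    // if the queue is limited, test if the queue is full and wait.
}

// consumer threads
while (true)
{
    DataRow value = null;
    lock (queue)
    {
        if (queue.Count > 0)
        {
            value = queue.Dequeue();
        }
    }
    if (value != null)
    {
        // process value
    }
    else
    {
        itemAvailable.WaitOne(); // wait until the producer signals a new item
    }
}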
You can find more details about producer/consumer in part 4 of Threading in C#: http://www.albahari.com/threading/part4.aspx
I think your approach to split up the list of files and process each file in one batch is ok.
My feeling is that you might get more of a performance gain if you play with the degree of parallelism.
See: var refs = GetReferencesFromDB().AsParallel().WithDegreeOfParallelism(16); This would process up to 16 files at the same time. Currently you are probably processing 2 or 4 files at a time, depending on the number of cores you have, which is only efficient for pure computation without I/O. For I/O-intensive tasks, adjusting this can bring large performance improvements by reducing processor idle time.
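A small sketch for experimenting with this setting: it runs simulated blocking I/O through PLINQ and reports the highest number of operations that were in flight at once. DegreeSketch, MeasurePeak and the 20 ms sleep are made up for illustration; the sleep stands in for a file or network call.

```csharp
using System;
using System.Linq;
using System.Threading;

static class DegreeSketch
{
    // Runs itemCount simulated I/O operations and returns the peak concurrency observed.
    public static int MeasurePeak(int degree, int itemCount)
    {
        int inFlight = 0, peak = 0;
        Enumerable.Range(0, itemCount)
            .AsParallel()
            .WithDegreeOfParallelism(degree)
            .ForAll(i =>
            {
                int now = Interlocked.Increment(ref inFlight);
                int seen;
                // Lock-free "peak = max(peak, now)".
                while (now > (seen = Volatile.Read(ref peak)) &&
                       Interlocked.CompareExchange(ref peak, now, seen) != seen)
                {
                }
                Thread.Sleep(20); // stands in for a blocking file or network call
                Interlocked.Decrement(ref inFlight);
            });
        return peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak concurrency with degree 16: " + MeasurePeak(16, 64));
    }
}
```

The observed peak never exceeds the requested degree, but it can be lower, because the thread pool decides how quickly worker threads are actually made available.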
If you are going to split up and join tasks back using producer-consumer look at this sample: Using Parallel Linq Extensions to union two sequences, how can one yield the fastest results first?

Is a non-blocking, single-threaded, asynchronous web server (like Node.js) possible in .NET?

I was looking at this question, looking for a way to create a single-threaded, event-based nonblocking asynchronous web server in .NET.
This answer looked promising at first, by claiming that the body of the code runs in a single thread.
However, I tested this in C#:
using System;
using System.IO;
using System.Threading;
class Program
{
static void Main()
{
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
var sc = new SynchronizationContext();
SynchronizationContext.SetSynchronizationContext(sc);
{
var path = Environment.ExpandEnvironmentVariables(
@"%SystemRoot%\Notepad.exe");
var fs = new FileStream(path, FileMode.Open,
FileAccess.Read, FileShare.ReadWrite, 1024 * 4, true);
var bytes = new byte[1024];
fs.BeginRead(bytes, 0, bytes.Length, ar =>
{
sc.Post(dummy =>
{
var res = fs.EndRead(ar);
// Are we in the same thread?
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
}, null);
}, null);
}
Thread.Sleep(100);
}
}
And the result was:
1
5
So it seems like, contrary to the answer, the thread initiating the read and the thread ending the read are not the same.
So now my question is, how do you achieve a single-threaded, event-based nonblocking asynchronous web server in .NET?
The whole SetSynchronizationContext thing is a red herring: it is just a mechanism for marshalling, and the work still happens in the IO Thread Pool.
What you are asking for is a way to queue and harvest Asynchronous Procedure Calls for all your IO work from the main thread. Many higher-level frameworks wrap this kind of functionality, the most famous one being libevent.
There is a great recap on the various options here: Whats the difference between epoll, poll, threadpool?.
.NET already takes care of scaling for you by having a special "IO Thread Pool" that handles IO access when you call the BeginXYZ methods. This IO Thread Pool must have at least 1 thread per processor on the box. See: ThreadPool.SetMaxThreads.
If a single-threaded app is a critical requirement (for some crazy reason) you could, of course, interop all of this stuff in using DllImport (see an example here)
However it would be a very complex and risky task:
Why don't we support APCs as a completion mechanism? APCs are really not a good general-purpose completion mechanism for user code. Managing the reentrancy introduced by APCs is nearly impossible; any time you block on a lock, for example, some arbitrary I/O completion might take over your thread. It might try to acquire locks of its own, which may introduce lock ordering problems and thus deadlock. Preventing this requires meticulous design, and the ability to make sure that someone else's code will never run during your alertable wait, and vice-versa. This greatly limits the usefulness of APCs.
So, to recap. If you want a single threaded managed process that does all its work using APC and completion ports, you are going to have to hand code it. Building it would be risky and tricky.
If you simply want high-scale networking, you can keep using BeginXYZ and family and rest assured that it will perform well, since it uses I/O completion ports. You pay a minor price for marshalling work between threads and for the particulars of the .NET implementation.
From: http://msdn.microsoft.com/en-us/magazine/cc300760.aspx
The next step in scaling up the server is to use asynchronous I/O. Asynchronous I/O alleviates the need to create and manage threads. This leads to much simpler code and also is a more efficient I/O model. Asynchronous I/O utilizes callbacks to handle incoming data and connections, which means there are no lists to set up and scan and there is no need to create new worker threads to deal with the pending I/O.
An interesting side fact is that single-threaded is not the fastest way to do async sockets on Windows using completion ports; see: http://doc.sch130.nsc.ru/www.sysinternals.com/ntw2k/info/comport.shtml
The goal of a server is to incur as few context switches as possible by having its threads avoid unnecessary blocking, while at the same time maximizing parallelism by using multiple threads. The ideal is for there to be a thread actively servicing a client request on every processor and for those threads not to block if there are additional requests waiting when they complete a request. For this to work correctly however, there must be a way for the application to activate another thread when one processing a client request blocks on I/O (like when it reads from a file as part of the processing).
What you need is a "message loop" which takes the next task on a queue and executes it. Additionally, every task needs to be coded so that it completes as much work as possible without blocking, and then enqueues additional tasks to pick up later whatever work needs more time. There is nothing magical about this: never use a blocking call and never spawn additional threads.
For example, when processing an HTTP GET, the server can read as much data as is currently available on the socket. If this is not enough data to handle the request, then enqueue a new task to read from the socket again in the future. In the case of a FileStream, timeouts are not supported (CanTimeout is false and setting ReadTimeout throws InvalidOperationException), so use its asynchronous read methods and be prepared to read fewer bytes than the entire file.
C# 5 actually makes this pattern much easier. Many people think that the async functionality implies multithreading, but that is not the case. Using async, you can essentially get the task queue I mentioned earlier without ever explicitly managing it.
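As a rough illustration of the message-loop idea combined with async, the sketch below installs a custom SynchronizationContext whose queue is drained by a single thread, so the code before and after an await runs on that same thread. SingleThreadContext and EventLoopSketch are hypothetical names, and Task.Delay stands in for real asynchronous I/O.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// A minimal single-threaded message loop: callbacks are queued and run one at a time.
sealed class SingleThreadContext : SynchronizationContext
{
    private readonly BlockingCollection<KeyValuePair<SendOrPostCallback, object>> queue =
        new BlockingCollection<KeyValuePair<SendOrPostCallback, object>>();

    // Continuations of awaits captured under this context land here instead of on the thread pool.
    public override void Post(SendOrPostCallback d, object state)
    {
        queue.Add(new KeyValuePair<SendOrPostCallback, object>(d, state));
    }

    // Runs queued callbacks on the calling thread until Complete() is called.
    public void RunLoop()
    {
        foreach (var item in queue.GetConsumingEnumerable())
        {
            item.Key(item.Value);
        }
    }

    public void Complete()
    {
        queue.CompleteAdding();
    }
}

static class EventLoopSketch
{
    // Returns the ids of all threads on which handler code ran.
    public static HashSet<int> Run()
    {
        var threadIds = new HashSet<int>();
        var context = new SingleThreadContext();
        SynchronizationContext previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            context.Post(async _ =>
            {
                threadIds.Add(Thread.CurrentThread.ManagedThreadId);
                await Task.Delay(10); // stands in for asynchronous I/O
                threadIds.Add(Thread.CurrentThread.ManagedThreadId); // resumed via Post on the loop thread
                context.Complete();
            }, null);
            context.RunLoop();
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
        return threadIds;
    }

    public static void Main()
    {
        Console.WriteLine("Handler code ran on " + Run().Count + " thread(s)");
    }
}
```

Only the handler code is confined to one thread here. The completion of the underlying operation (a timer in this sketch, the I/O thread pool for real sockets) still happens elsewhere and merely posts the continuation back to the loop.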
Yes, it's called Manos de mono
Seriously, the entire idea behind manos is a single threaded asynchronous event driven web server.
High performance and scalable. Modeled after tornadoweb, the technology that powers friend feed, Manos is capable of thousands of simultaneous connections, ideal for applications that create persistent connections with the server.
The project appears to be barely maintained and probably isn't production ready, but it makes a good case study demonstrating that this is possible.
Here's a great article series explaining what I/O Completion Ports are and how they can be accessed via C# (i.e. you need to P/Invoke into Win32 API calls from kernel32.dll).
Note: libuv, the cross-platform I/O framework behind node.js, uses IOCP on Windows and libev on Unix operating systems.
http://www.theukwebdesigncompany.com/articles/iocp-thread-pooling.php
I am surprised nobody mentioned Kayak. It's basically C#'s answer to Python's Twisted, JavaScript's node.js or Ruby's EventMachine.
I've been fiddling with my own simple implementation of such an architecture and I've put it up on GitHub. I'm doing it more as a learning thing. But it's been a lot of fun and I think I'll flesh it out more.
It's very alpha, so it's liable to change, but the code looks a little like this:
// Start the event loop.
EventLoop.Start(() => {
    // Create a Hello World server on port 1337.
    Server.Create((req, res) => {
        res.Write("<h1>Hello World</h1>");
    }).Listen("http://*:1337");
});
More information about it can be found here.
I developed a server based on HttpListener and an event loop, supporting MVC, WebApi and routing. From what I have seen, the performance is far better than standard IIS+MVC: for the MvcMusicStore I went from 100 requests per second at 100% CPU to 350 at 30% CPU.
If anybody would give it a try, I would love some feedback!
There is also a template for creating websites based on this structure.
Note that I DON'T USE ASYNC/AWAIT until absolutely necessary. The only tasks I use there are the ones for I/O-bound operations like writing to the socket or reading files.
PS: any suggestion or correction is welcome!
Documentation
MvcMusicStore sample port on Node.Cs
Packages on Nuget
You can use this framework, SignalR, and this blog post about it.
Some kind of support from the operating system is essential here. For example, Mono uses epoll on Linux with asynchronous I/O, so it should scale really well (it still uses a thread pool). If you are looking for performance and scalability, definitely try it.
On the other hand, an example of a C# web server (with native libraries) built around the idea you mentioned is Manos de Mono. The project has not been active lately; however, the idea and code are generally available. Read this (especially the "A closer look at Manos" part).
Edit:
If you just want to have the callback fired on your main thread, you can slightly abuse an existing synchronization context such as the WPF dispatcher. Your code, translated to this approach:
using System;
using System.IO;
using System.Threading;
using System.Windows;

namespace Node
{
    class Program
    {
        public static void Main()
        {
            var app = new Application();
            app.Startup += ServerStart;
            app.Run();
        }

        private static void ServerStart(object sender, StartupEventArgs e)
        {
            // The dispatcher owns the message loop of the main thread.
            var dispatcher = ((Application) sender).Dispatcher;
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
            var path = Environment.ExpandEnvironmentVariables(
                @"%SystemRoot%\Notepad.exe");
            // The last argument (true) opens the file for asynchronous I/O.
            var fs = new FileStream(path, FileMode.Open,
                FileAccess.Read, FileShare.ReadWrite, 1024 * 4, true);
            var bytes = new byte[1024];
            fs.BeginRead(bytes, 0, bytes.Length, ar =>
            {
                // This callback runs on an I/O thread; hand the rest of the
                // work back to the main thread through the dispatcher.
                dispatcher.BeginInvoke(new Action(() =>
                {
                    var res = fs.EndRead(ar);
                    // Are we in the same thread?
                    Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
                }));
            }, null);
        }
    }
}
This prints what you wish. Plus, you can set priorities with the dispatcher. But I agree, this is ugly and hacky, and I do not know why I would do it this way for any reason other than answering your demo request ;)
First, about SynchronizationContext. It's just like Sam wrote: the base class won't give you single-thread functionality. You probably got that idea from WindowsFormsSynchronizationContext, which provides functionality to execute code on the UI thread.
You can read more here
I've written a piece of code that works with ThreadPool parameters (again, something Sam already pointed out).
This code registers 3 asynchronous actions to be executed on a free thread. They run in parallel until one of them changes the ThreadPool parameters. Then each action is executed on the same thread.
It only proves that you can force a .NET app to use one thread.
Real implementation of web server that would receive and process calls on only one thread is something entirely different :).
Here's the code:
using System;
using System.Threading;

namespace SingleThreadTest
{
    class Program
    {
        class TestState
        {
            internal string ID { get; set; }
            internal int Count { get; set; }
            internal int ChangeCount { get; set; }
        }

        // Signaled when one of the work items has finished all its iterations.
        static ManualResetEvent s_event = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
            int nWorkerThreads;
            int nCompletionPortThreads;
            ThreadPool.GetMaxThreads(out nWorkerThreads, out nCompletionPortThreads);
            Console.WriteLine(String.Format("Max Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
            ThreadPool.GetMinThreads(out nWorkerThreads, out nCompletionPortThreads);
            Console.WriteLine(String.Format("Min Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
            // Work item B shrinks the thread pool once its Count reaches 5.
            ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), new TestState() { ID = "A  ", Count = 10, ChangeCount = 0 });
            ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), new TestState() { ID = " B ", Count = 10, ChangeCount = 5 });
            ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), new TestState() { ID = "  C", Count = 10, ChangeCount = 0 });
            s_event.WaitOne();
            Console.WriteLine("Press enter...");
            Console.In.ReadLine();
        }

        static void LetsRunLikeCrazy(object o)
        {
            // Stop rescheduling once any work item has finished.
            if (s_event.WaitOne(0))
            {
                return;
            }
            TestState oState = o as TestState;
            if (oState != null)
            {
                // Are we in the same thread?
                Console.WriteLine(String.Format("Hello. Start id: {0} in thread: {1}", oState.ID, Thread.CurrentThread.ManagedThreadId));
                Thread.Sleep(1000);
                oState.Count -= 1;
                if (oState.ChangeCount == oState.Count)
                {
                    int nWorkerThreads = 1;
                    int nCompletionPortThreads = 1;
                    // Note: SetMaxThreads returns false and changes nothing if the
                    // requested maximum is below the number of processors, so check
                    // the values printed below on a multi-core machine.
                    ThreadPool.SetMinThreads(nWorkerThreads, nCompletionPortThreads);
                    ThreadPool.SetMaxThreads(nWorkerThreads, nCompletionPortThreads);
                    ThreadPool.GetMaxThreads(out nWorkerThreads, out nCompletionPortThreads);
                    Console.WriteLine(String.Format("New Max Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
                    ThreadPool.GetMinThreads(out nWorkerThreads, out nCompletionPortThreads);
                    Console.WriteLine(String.Format("New Min Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
                }
                if (oState.Count > 0)
                {
                    Console.WriteLine(String.Format("Hello. End   id: {0} in thread: {1}", oState.ID, Thread.CurrentThread.ManagedThreadId));
                    // Re-queue the same state for the next iteration.
                    ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), oState);
                }
                else
                {
                    Console.WriteLine(String.Format("Hello. End   id: {0} in thread: {1}", oState.ID, Thread.CurrentThread.ManagedThreadId));
                    s_event.Set();
                }
            }
            else
            {
                Console.WriteLine("Error !!!");
                s_event.Set();
            }
        }
    }
}
LibuvSharp is a wrapper for libuv, which is used in the node.js project for async I/O. But it contains only low-level TCP/UDP/Pipe/Timer functionality. And it will stay like that; writing a web server on top of it is an entirely different story. It doesn't even support DNS resolving, since this is just a protocol on top of UDP.
I believe it's possible, here is an open-source example written in VB.NET and C#:
https://github.com/perrybutler/dotnetsockets/
It uses Event-based Asynchronous Pattern (EAP), IAsyncResult Pattern and thread pool (IOCP). It will serialize/marshal the messages (messages can be any native object such as a class instance) into binary packets, transfer the packets over TCP, and then deserialize/unmarshal the packets at the receiving end so you get your native object to work with. This part is somewhat like Protobuf or RPC.
It was originally developed as a "netcode" for real-time multiplayer gaming, but it can serve many purposes. Unfortunately I never got around to using it. Maybe someone else will.
The source code has a lot of comments so it should be easy to follow. Enjoy!
Here is one more implementation of an event-loop web server, called SingleSand. It executes all custom logic inside a single-threaded event loop, but the web server is hosted in ASP.NET.
Answering the question: it is generally not possible to run a purely single-threaded app because of .NET's multi-threaded nature. Some activities run in separate threads, and the developer cannot change their behavior.
