RabbitMQ on Windows seems to limit connection creation per process - c#

I want to understand the behavior I'm seeing with the following code. Using the RabbitMQ.Client library version 6.2.2.
Expected behavior: Connections are created quickly and the process does not slow down.
Actual behavior: The first 6 connections are created quickly; after that there is a significant slowdown and connections are created one by one (about 1s apart).
Note: starting the program multiple times shows similar behavior. That leads me to believe that the bottleneck is per-process rather than RabbitMQ or system resources.
Note 2: system resources are not the bottleneck (AFAIK).
Does anybody know what is causing the observed behavior? RabbitMQ installed on Windows 10 with default settings.
using RabbitMQ.Client;
using System;

namespace ConsoleApp1
{
    internal class Program
    {
        static void Main(string[] args)
        {
            for (int i = 0; i < 50; i++)
            {
                var factory = new ConnectionFactory() { HostName = "localhost" };
                var connection = factory.CreateConnection();
                var channel1 = connection.CreateModel();
                var channel2 = connection.CreateModel();
                Console.WriteLine(i);
            }
            Console.ReadLine();
        }
    }
}
EDIT: I know this violates every best practice regarding "Single connection per process". I'm just curious what is limiting the connection creation and if there is any setting that can control this behavior.

The .NET client uses the ThreadPool, which probably doesn't have enough threads available out of the box. You need to increase the number available:
https://github.com/rabbitmq/rabbitmq-dotnet-client/blob/main/projects/TestApplications/MassPublish/Program.cs#L21
See issues and discussion here:
https://github.com/rabbitmq/rabbitmq-dotnet-client/search?q=threadpool
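As a minimal sketch of what "increasing the amount available" could look like, the snippet below raises the thread pool minimums before any connections are opened. Above its minimum the pool adds threads only gradually, which would be consistent with the one-at-a-time pacing described in the question. The helper name `ThreadPoolTuning.RaiseMinThreads` is hypothetical, and the value 64 is an assumption for illustration, not a figure from the RabbitMQ client.

```csharp
using System;
using System.Threading;

static class ThreadPoolTuning
{
    // Hypothetical helper: raises both thread pool minimums to at least `minimum`.
    // Returns true if the pool accepted the new values.
    public static bool RaiseMinThreads(int minimum)
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        return ThreadPool.SetMinThreads(Math.Max(worker, minimum), Math.Max(io, minimum));
    }
}

static class Demo
{
    static void Main()
    {
        // 64 is an illustrative assumption; size it to the number of
        // connections the process opens at roughly the same time.
        bool accepted = ThreadPoolTuning.RaiseMinThreads(64);

        ThreadPool.GetMinThreads(out int worker, out int io);
        Console.WriteLine($"accepted={accepted} minWorker={worker} minIo={io}");

        // ... create the ConnectionFactory and connections after this point ...
    }
}
```

Calling this once at the top of `Main`, before the loop that creates connections, keeps the change in one place and leaves the rest of the program untouched.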
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

Related

Using NetMQMonitor to detect server disconnects?

I am looking for a better way to detect disconnects when a Router/server goes down or is unavailable due to a poor connection (I'm listening from a Dealer/client running on wifi). I found zmq_socket_monitor() and discovered that NetMQ has the same feature. My understanding from the documentation is that when you monitor a socket you give it an inproc address, and it notifies you of any socket changes using that address. I couldn't really find any examples of the NetMQMonitor except the unit tests. My question is whether I am using it correctly in the code below. Is it valid to use it alongside a NetMQPoller?
// run poller on a separate thread
_poller = new NetMQPoller { _dealer, _subscriber, _outgoingMessageQueue, _subscriptionChanges};
_poller.RunAsync();
// run a monitor listening for Connected and Disconnected events
_monitor = new NetMQMonitor(_dealer, "inproc://rep.inproc", SocketEvents.Disconnected | SocketEvents.Connected);
_monitor.EventReceived += _monitor_EventReceived;
_monitor.StartAsync();
**** UPDATE ****
So... after posting this I discovered the answer in the NetMQPoller tests on github, so that answers whether you can use the NetMQMonitor with a NetMQPoller, but I'm still curious if anyone has thoughts on the overall approach of using a monitor to track connection state. Here's the relevant code for anyone interested:
[Fact]
public void Monitoring()
{
    var listeningEvent = new ManualResetEvent(false);
    var acceptedEvent = new ManualResetEvent(false);
    var connectedEvent = new ManualResetEvent(false);
    using (var rep = new ResponseSocket())
    using (var req = new RequestSocket())
    using (var poller = new NetMQPoller())
    using (var repMonitor = new NetMQMonitor(rep, "inproc://rep.inproc", SocketEvents.Accepted | SocketEvents.Listening))
    using (var reqMonitor = new NetMQMonitor(req, "inproc://req.inproc", SocketEvents.Connected))
    {
        repMonitor.Accepted += (s, e) => acceptedEvent.Set();
        repMonitor.Listening += (s, e) => listeningEvent.Set();
        repMonitor.AttachToPoller(poller);
        int port = rep.BindRandomPort("tcp://127.0.0.1");
        reqMonitor.Connected += (s, e) => connectedEvent.Set();
        reqMonitor.AttachToPoller(poller);
        poller.RunAsync();
        req.Connect("tcp://127.0.0.1:" + port);
        req.SendFrame("a");
        rep.SkipFrame();
        rep.SendFrame("b");
        req.SkipFrame();
        Assert.True(listeningEvent.WaitOne(300));
        Assert.True(connectedEvent.WaitOne(300));
        Assert.True(acceptedEvent.WaitOne(300));
    }
}
Using the monitor is exactly the right way to look for changes in connection state.
Under the hood the management threads are ping-ponging across the connection. If the ping-pongs dry up, then there is a problem. This detects network issues, but also detects things like crashes; if the process at one end of a socket dies, the process at the other end is informed of the dead connection.
The only inadequacy is if it matters to you what happens to sent messages. Different sockets cache messages in different places, some being biased to keeping them at the sending end until the receiver is ready, others storing them at the receiving end. If the connection dies and you want your undelivered messages back (to send elsewhere, perhaps), you can't get them. ZMQ is like a post office. As soon as you hand the letter over the counter, they cannot and will not give it back to you, even if you can still see it!
This is the nature of the Actor model, which is what ZMQ implements. Communicating Sequential Processes, a development of the Actor model, does not store messages in a channel at all, meaning that if the connection dies the application still owns the unsent message. Sometimes it's useful to know for sure that a message definitely was not delivered.

azure queue performance

For Windows Azure queues the scalability target per storage is supposed to be around 500 messages / second (http://msdn.microsoft.com/en-us/library/windowsazure/hh697709.aspx). I have the following simple program that just writes a few messages to a queue. The program takes 10 seconds to complete (4 messages / second). I am running the program from inside a virtual machine (in west-europe) and my storage account is also located in west-europe. I have not set up geo-replication for my storage. My connection string is set up to use the http protocol.
// http://blogs.msdn.com/b/windowsazurestorage/archive/2010/06/25/nagle-s-algorithm-is-not-friendly-towards-small-requests.aspx
ServicePointManager.UseNagleAlgorithm = false;
CloudStorageAccount storageAccount=CloudStorageAccount.Parse(ConfigurationManager.AppSettings["DataConnectionString"]);
var cloudQueueClient = storageAccount.CreateCloudQueueClient();
var queue = cloudQueueClient.GetQueueReference(Guid.NewGuid().ToString());
queue.CreateIfNotExist();
var w = new Stopwatch();
w.Start();
for (int i = 0; i < 50; i++)
{
    Console.WriteLine("nr {0}", i);
    queue.AddMessage(new CloudQueueMessage("hello " + i));
}
w.Stop();
Console.WriteLine("elapsed: {0}", w.ElapsedMilliseconds);
queue.Delete();
Any idea how I can get better performance?
EDIT:
Based on Sandrino Di Mattia's answer I re-analyzed the code I've originally posted and found out that it was not complete enough to reproduce the error. In fact I had created a queue just before the call to ServicePointManager.UseNagleAlgorithm = false; The code to reproduce my problem looks more like this:
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["DataConnectionString"]);
var cloudQueueClient = storageAccount.CreateCloudQueueClient();
var queue = cloudQueueClient.GetQueueReference(Guid.NewGuid().ToString());
//ServicePointManager.UseNagleAlgorithm = false; // If you change the Nagle setting here, the performance will be okay.
queue.CreateIfNotExist();
ServicePointManager.UseNagleAlgorithm = false; // TOO LATE: the queue was already created while Nagle was still enabled
var w = new Stopwatch();
w.Start();
for (int i = 0; i < 50; i++)
{
    Console.WriteLine("nr {0}", i);
    queue.AddMessage(new CloudQueueMessage("hello " + i));
}
w.Stop();
Console.WriteLine("elapsed: {0}", w.ElapsedMilliseconds);
queue.Delete();
The suggested solution from Sandrino to configure the ServicePointManager using the app.config file has the advantage that the ServicePointManager is initialized when the application starts up, so you don't have to worry about time dependencies.
I answered a similar question a few days ago: How to achive more 10 inserts per second with azure storage tables.
For adding 1000 items in table storage it took over 3 minutes, and with the changes I described in my answer it dropped to 4 seconds (250 requests/sec). In the end, table storage and storage queues aren't all that different. The backend is the same, data is simply stored in a different way. And both table storage and queues are exposed through a REST API, so if you improve the way you handle your requests, you'll get a better performance.
The most important changes:
expect100Continue: false
useNagleAlgorithm: false (you're already doing this)
Parallel requests combined with connectionManagement/maxconnection
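The same three settings can also be applied in code rather than in app.config. The sketch below assumes a .NET Framework style application that uses `ServicePointManager`. The helper name `StorageHttpDefaults` and the limit of 48 are assumptions for illustration. As the question's edit shows, the call has to run before the first storage request, because a service point keeps the settings that were in effect when it was created.

```csharp
using System;
using System.Net;

static class StorageHttpDefaults
{
    // Call once at startup, before the first storage request creates a ServicePoint.
    // maxConnections is an assumed, tunable value; it plays the role of
    // connectionManagement/maxconnection in app.config.
    public static void Apply(int maxConnections)
    {
        ServicePointManager.Expect100Continue = false;
        ServicePointManager.UseNagleAlgorithm = false;
        ServicePointManager.DefaultConnectionLimit = maxConnections;
    }
}

static class Demo
{
    static void Main()
    {
        StorageHttpDefaults.Apply(48); // 48 is an illustrative assumption

        Console.WriteLine("Nagle enabled: " + ServicePointManager.UseNagleAlgorithm);
        Console.WriteLine("Connection limit: " + ServicePointManager.DefaultConnectionLimit);

        // ... create the CloudStorageAccount and queue client after this point ...
    }
}
```

Raising the connection limit only helps once requests are actually issued in parallel; a single sequential loop still sends one request at a time.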
Also, ServicePointManager.DefaultConnectionLimit should be increased before making a service point. Actually Sandrino's answer says the same thing but using config.
Turn off proxy auto-detection, even in the cloud (the auto-detect setting in the proxy config element); it slows initialisation.
Choose distributed partition keys.
Collocate your account near to compute, and customers.
Design to add more accounts as needed.
Microsoft set the SLA at 2,000 tps on queues and tables as of 07 2012.
I didn't read Sandrino's linked answer, sorry, just was on this question as I was watching Build 2012 session on exactly this.

Is a non-blocking, single-threaded, asynchronous web server (like Node.js) possible in .NET?

I was looking at this question, looking for a way to create a single-threaded, event-based nonblocking asynchronous web server in .NET.
This answer looked promising at first, by claiming that the body of the code runs in a single thread.
However, I tested this in C#:
using System;
using System.IO;
using System.Threading;

class Program
{
    static void Main()
    {
        Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
        var sc = new SynchronizationContext();
        SynchronizationContext.SetSynchronizationContext(sc);
        {
            var path = Environment.ExpandEnvironmentVariables(
                @"%SystemRoot%\Notepad.exe");
            var fs = new FileStream(path, FileMode.Open,
                FileAccess.Read, FileShare.ReadWrite, 1024 * 4, true);
            var bytes = new byte[1024];
            fs.BeginRead(bytes, 0, bytes.Length, ar =>
            {
                sc.Post(dummy =>
                {
                    var res = fs.EndRead(ar);
                    // Are we in the same thread?
                    Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
                }, null);
            }, null);
        }
        Thread.Sleep(100);
    }
}
And the result was:
1
5
So it seems like, contrary to the answer, the thread initiating the read and the thread ending the read are not the same.
So now my question is, how do you achieve a single-threaded, event-based nonblocking asynchronous web server in .NET?
The whole SetSynchronizationContext is a red herring; it is just a mechanism for marshalling, and the work still happens in the IO Thread Pool.
What you are asking for is a way to queue and harvest Asynchronous Procedure Calls for all your IO work from the main thread. Many higher level frameworks wrap this kind of functionality, the most famous one being libevent.
There is a great recap on the various options here: Whats the difference between epoll, poll, threadpool?.
.NET already takes care of scaling for you by having a special "IO Thread Pool" that handles IO access when you call the BeginXYZ methods. This IO Thread Pool must have at least 1 thread per processor on the box. See: ThreadPool.SetMaxThreads.
If single threaded app is a critical requirement (for some crazy reason) you could, of course, interop all of this stuff in using DllImport (see an example here)
However it would be a very complex and risky task:
Why don't we support APCs as a completion mechanism? APCs are really not a good general-purpose completion mechanism for user code. Managing the reentrancy introduced by APCs is nearly impossible; any time you block on a lock, for example, some arbitrary I/O completion might take over your thread. It might try to acquire locks of its own, which may introduce lock ordering problems and thus deadlock. Preventing this requires meticulous design, and the ability to make sure that someone else's code will never run during your alertable wait, and vice-versa. This greatly limits the usefulness of APCs.
So, to recap. If you want a single threaded managed process that does all its work using APC and completion ports, you are going to have to hand code it. Building it would be risky and tricky.
If you simply want high scale networking, you can keep using BeginXYZ and family and rest assured that it will perform well, since it uses APC. You pay a minor price marshalling stuff between threads and the .NET particular implementation.
From: http://msdn.microsoft.com/en-us/magazine/cc300760.aspx
The next step in scaling up the server is to use asynchronous I/O. Asynchronous I/O alleviates the need to create and manage threads. This leads to much simpler code and also is a more efficient I/O model. Asynchronous I/O utilizes callbacks to handle incoming data and connections, which means there are no lists to set up and scan and there is no need to create new worker threads to deal with the pending I/O.
An interesting side fact is that single threaded is not the fastest way to do async sockets on Windows using completion ports; see: http://doc.sch130.nsc.ru/www.sysinternals.com/ntw2k/info/comport.shtml
The goal of a server is to incur as few context switches as possible by having its threads avoid unnecessary blocking, while at the same time maximizing parallelism by using multiple threads. The ideal is for there to be a thread actively servicing a client request on every processor and for those threads not to block if there are additional requests waiting when they complete a request. For this to work correctly however, there must be a way for the application to activate another thread when one processing a client request blocks on I/O (like when it reads from a file as part of the processing).
What you need is a "message loop" which takes the next task on a queue and executes it. Additionally, every task needs to be coded so that it completes as much work as possible without blocking, and then enqueues a follow-up task to pick up the remaining work later. There is nothing magical about this: never use a blocking call and never spawn additional threads.
For example, when processing an HTTP GET, the server can read as much data as is currently available on the socket. If this is not enough data to handle the request, then enqueue a new task to read from the socket again in the future. In the case of a FileStream, you want to set the ReadTimeout on the instance to a low value and be prepared to read fewer bytes than the entire file.
C# 5 actually makes this pattern much simpler. Many people think that the async functionality implies multithreading, but that is not the case. Using async, you can essentially get the task queue I mentioned earlier without ever explicitly managing it.
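To make the "task queue" idea concrete, here is a minimal sketch of a single-threaded loop built on a custom SynchronizationContext. The base `SynchronizationContext.Post` hands work to the thread pool, which is why the test in the question printed two different thread IDs. The derived class below queues the work and runs it on the loop thread instead. All type names (`SingleThreadContext`, `EventLoop`) are hypothetical, and this is only the queueing core, not a web server.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Queues posted callbacks so that one thread can run them in order.
sealed class SingleThreadContext : SynchronizationContext
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();

    // Await continuations arrive here and are queued instead of going to the pool.
    public override void Post(SendOrPostCallback d, object state) => _queue.Add(() => d(state));

    // Runs queued work on the calling thread until Complete() is called.
    public void RunLoop()
    {
        foreach (Action work in _queue.GetConsumingEnumerable()) work();
    }

    public void Complete() => _queue.CompleteAdding();
}

static class EventLoop
{
    // Runs `main` and all of its await continuations on the calling thread.
    public static void Run(Func<Task> main)
    {
        var previous = SynchronizationContext.Current;
        var context = new SingleThreadContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            Task task = main();
            // Stop the loop once the top-level task has finished.
            task.ContinueWith(_ => context.Complete(), TaskScheduler.Default);
            context.RunLoop();
            task.GetAwaiter().GetResult(); // rethrow any exception from main
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }
}

static class Demo
{
    static void Main()
    {
        EventLoop.Run(async () =>
        {
            int before = Thread.CurrentThread.ManagedThreadId;
            await Task.Delay(50); // the timer fires elsewhere; the continuation is posted back
            int after = Thread.CurrentThread.ManagedThreadId;
            Console.WriteLine(before == after);
        });
    }
}
```

The I/O completion itself still happens on a pool or timer thread; only the user code after each `await` is marshalled back to the loop thread. Work posted after the top-level task has finished is not handled in this sketch.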
Yes, it's called Manos de mono
Seriously, the entire idea behind manos is a single threaded asynchronous event driven web server.
High performance and scalable. Modeled after tornadoweb, the technology that powers friend feed, Manos is capable of thousands of simultaneous connections, ideal for applications that create persistent connections with the server.
The project appears to be low on maintenance and probably wouldn't be production ready but it makes a good case study as a demonstration that this is possible.
Here's a great article series explaining what IO Completion Ports are and how they can be accessed via C# (i.e. you need to PInvoke into Win32 API calls from the Kernel32.dll).
Note: libuv, the cross-platform IO framework behind node.js, uses IOCP on Windows and libev on unix operating systems.
http://www.theukwebdesigncompany.com/articles/iocp-thread-pooling.php
I am surprised nobody mentioned kayak. It's basically C#'s answer to Python's twisted, JavaScript's node.js or Ruby's eventmachine.
I've been fiddling with my own simple implementation of such an architecture and I've put it up on github. I'm doing it more as a learning thing. But it's been a lot of fun and I think I'll flesh it out more.
It's very alpha, so it's liable to change, but the code looks a little like this:
//Start the event loop.
EventLoop.Start(() => {
    //Create a Hello World server on port 1337.
    Server.Create((req, res) => {
        res.Write("<h1>Hello World</h1>");
    }).Listen("http://*:1337");
});
More information about it can be found here.
I developed a server based on HttpListener and an event loop, supporting MVC, WebApi and routing. From what I have seen the performance is far better than standard IIS+MVC: for the MVCMusicStore I moved from 100 requests per second at 100% CPU to 350 at 30% CPU.
If anybody would give it a try, I am keen for feedback!
There is also a template for creating websites based on this structure.
Note that I DON'T USE ASYNC/AWAIT until absolutely necessary. The only tasks I use there are the ones for the I/O bound operations like writing on the socket or reading files.
PS: any suggestion or correction is welcome!
Documentation
MvcMusicStore sample port on Node.Cs
Packages on Nuget
You can use this framework, SignalR,
and this blog post about it.
Some kind of support from the operating system is essential here. For example, Mono uses epoll on Linux with asynchronous I/O, so it should scale really well (still thread pool). If you are looking for performance and scalability, definitely try it.
On the other hand, an example of a C# (with native libs) web server based around the idea you have mentioned is Manos de Mono. The project has not been active lately; however, the idea and code are generally available. Read this (especially the "A closer look at Manos" part).
Edit:
If you just want to have callback fired on your main thread, you can do a little abuse of existing synchronization contexts like the WPF dispatcher. Your code, translated to this approach:
using System;
using System.IO;
using System.Threading;
using System.Windows;

namespace Node
{
    class Program
    {
        public static void Main()
        {
            var app = new Application();
            app.Startup += ServerStart;
            app.Run();
        }

        private static void ServerStart(object sender, StartupEventArgs e)
        {
            var dispatcher = ((Application) sender).Dispatcher;
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
            var path = Environment.ExpandEnvironmentVariables(
                @"%SystemRoot%\Notepad.exe");
            var fs = new FileStream(path, FileMode.Open,
                FileAccess.Read, FileShare.ReadWrite, 1024 * 4, true);
            var bytes = new byte[1024];
            fs.BeginRead(bytes, 0, bytes.Length, ar =>
            {
                dispatcher.BeginInvoke(new Action(() =>
                {
                    var res = fs.EndRead(ar);
                    // Are we in the same thread?
                    Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
                }));
            }, null);
        }
    }
}
This prints what you wish. Plus you can set priorities with the dispatcher. But I agree, this is ugly and hacky, and I do not know why I would do it that way for any reason other than answering your demo request ;)
First, about SynchronizationContext. It's just like Sam wrote. The base class won't give you single-thread functionality. You probably got that idea from WindowsFormsSynchronizationContext, which provides functionality to execute code on the UI thread.
You can read more here
I've written a piece of code that works with ThreadPool parameters. (Again something Sam already pointed out).
This code registers 3 asynchronous actions to be executed on free thread. They run in parallel until one of them changes ThreadPool parameters. Then each action is executed on the same thread.
It only proves that you can force .net app to use one thread.
Real implementation of web server that would receive and process calls on only one thread is something entirely different :).
Here's the code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.IO;

namespace SingleThreadTest
{
    class Program
    {
        class TestState
        {
            internal string ID { get; set; }
            internal int Count { get; set; }
            internal int ChangeCount { get; set; }
        }

        static ManualResetEvent s_event = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
            int nWorkerThreads;
            int nCompletionPortThreads;
            ThreadPool.GetMaxThreads(out nWorkerThreads, out nCompletionPortThreads);
            Console.WriteLine(String.Format("Max Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
            ThreadPool.GetMinThreads(out nWorkerThreads, out nCompletionPortThreads);
            Console.WriteLine(String.Format("Min Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
            ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), new TestState() { ID = "A ", Count = 10, ChangeCount = 0 });
            ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), new TestState() { ID = " B ", Count = 10, ChangeCount = 5 });
            ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), new TestState() { ID = " C", Count = 10, ChangeCount = 0 });
            s_event.WaitOne();
            Console.WriteLine("Press enter...");
            Console.In.ReadLine();
        }

        static void LetsRunLikeCrazy(object o)
        {
            if (s_event.WaitOne(0))
            {
                return;
            }
            TestState oState = o as TestState;
            if (oState != null)
            {
                // Are we in the same thread?
                Console.WriteLine(String.Format("Hello. Start id: {0} in thread: {1}", oState.ID, Thread.CurrentThread.ManagedThreadId));
                Thread.Sleep(1000);
                oState.Count -= 1;
                if (oState.ChangeCount == oState.Count)
                {
                    int nWorkerThreads = 1;
                    int nCompletionPortThreads = 1;
                    ThreadPool.SetMinThreads(nWorkerThreads, nCompletionPortThreads);
                    ThreadPool.SetMaxThreads(nWorkerThreads, nCompletionPortThreads);
                    ThreadPool.GetMaxThreads(out nWorkerThreads, out nCompletionPortThreads);
                    Console.WriteLine(String.Format("New Max Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
                    ThreadPool.GetMinThreads(out nWorkerThreads, out nCompletionPortThreads);
                    Console.WriteLine(String.Format("New Min Workers: {0} Ports: {1}", nWorkerThreads, nCompletionPortThreads));
                }
                if (oState.Count > 0)
                {
                    Console.WriteLine(String.Format("Hello. End id: {0} in thread: {1}", oState.ID, Thread.CurrentThread.ManagedThreadId));
                    ThreadPool.QueueUserWorkItem(new WaitCallback(LetsRunLikeCrazy), oState);
                }
                else
                {
                    Console.WriteLine(String.Format("Hello. End id: {0} in thread: {1}", oState.ID, Thread.CurrentThread.ManagedThreadId));
                    s_event.Set();
                }
            }
            else
            {
                Console.WriteLine("Error !!!");
                s_event.Set();
            }
        }
    }
}
LibuvSharp is a wrapper for libuv, which is used in the node.js project for async IO. But it contains only low level TCP/UDP/Pipe/Timer functionality. And it will stay like that; writing a webserver on top of it is an entirely different story. It doesn't even support dns resolving, since this is just a protocol on top of udp.
I believe it's possible, here is an open-source example written in VB.NET and C#:
https://github.com/perrybutler/dotnetsockets/
It uses Event-based Asynchronous Pattern (EAP), IAsyncResult Pattern and thread pool (IOCP). It will serialize/marshal the messages (messages can be any native object such as a class instance) into binary packets, transfer the packets over TCP, and then deserialize/unmarshal the packets at the receiving end so you get your native object to work with. This part is somewhat like Protobuf or RPC.
It was originally developed as a "netcode" for real-time multiplayer gaming, but it can serve many purposes. Unfortunately I never got around to using it. Maybe someone else will.
The source code has a lot of comments so it should be easy to follow. Enjoy!
Here is one more implementation of the event-loop web server called SingleSand. It executes all custom logic inside single-threaded event loop but the web server is hosted in asp.net.
Answering the question, it is generally not possible to run a pure single threaded app because of .NET multi-threaded nature. There are some activities that run in separate threads and developer cannot change their behavior.

Can we create 300,000 threads in a C# application and run it on a PC?

I am trying to imitate a scenario where 300,000 consumers are accessing a server. So I am trying to create the pseudo clients, by repeatedly querying the server from the concurrent threads.
But the first hurdle to be cleared is whether it is possible to run 300,000 threads on a PC. Here is the code I am using to see initially how many threads I can get at most; later I will replace the test function with the actual function:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;

namespace CheckThread
{
    class Program
    {
        static int count;

        public static void TestThread(int i)
        {
            while (true)
            {
                Console.Write("\rThread Executing : {0}", i);
                Thread.Sleep(500);
            }
        }

        static void Main(string[] args)
        {
            count = 0;
            int limit = 0;
            if (args.Length != 1)
            {
                Console.WriteLine("Usage CheckThread <number of threads>");
                return;
            }
            else
            {
                limit = Convert.ToInt32(args[0]);
            }
            Console.WriteLine();
            while (count < limit)
            {
                ThreadStart newThread = new ThreadStart(delegate { TestThread(count); });
                Thread mythread = new Thread(newThread);
                mythread.Start();
                Console.WriteLine("Thread # {0}", count++);
            }
            while (true)
            {
                Thread.Sleep(30*1000);
            }
        } // end of main
    } // end of CheckThread class
} // end of namespace
Now what I am trying might be unrealistic, but still, if there is a way out to do it and you know, then please help me.
Each thread will create its own stack and local storage, you are looking at roughly 512k of stack space per thread on a 32bit OS, I think the stack space doubles on a 64 bit OS. A quick back of the spreadsheet calc gives us 146.484375 gigs of stack space for your 300k clients.
So, no, don't create 300k threads, but rather use the threadpool to simulate 300k requests, although tbh I think you would be better off with several test clients spamming your server through a network interface.
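A minimal sketch of simulating many requests without one thread per client is shown below. It uses async tasks with a cap on how many are in flight at once, rather than dedicated threads. `LoadSimulator` is a hypothetical name, `Task.Delay` stands in for the real server call, and the numbers (300,000 clients, 1,000 concurrent) are assumptions to be tuned.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class LoadSimulator
{
    // Runs `clients` simulated requests with at most `maxConcurrent` in flight
    // at once and returns how many completed. `request` is the call that hits the server.
    public static async Task<int> RunAsync(int clients, int maxConcurrent, Func<int, Task> request)
    {
        int completed = 0;
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            Task[] tasks = Enumerable.Range(0, clients).Select(async id =>
            {
                await gate.WaitAsync(); // wait for a free slot without blocking a thread
                try
                {
                    await request(id);
                    Interlocked.Increment(ref completed);
                }
                finally
                {
                    gate.Release();
                }
            }).ToArray();

            await Task.WhenAll(tasks);
        }
        return completed;
    }
}

static class Demo
{
    static void Main()
    {
        // Task.Delay is a placeholder for a real network call (an assumption for this sketch).
        int done = LoadSimulator.RunAsync(300000, 1000, id => Task.Delay(10))
                                .GetAwaiter().GetResult();
        Console.WriteLine("Completed requests: " + done);
    }
}
```

Each pending request costs a small heap object instead of a thread stack, so the memory estimate above no longer applies. A single machine may still run out of sockets or bandwidth long before it reaches 300,000 real connections.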
There are a lot of web load-testing tools available. Good starting point : http://www.webperformance.com/library/reports/TestingAspDotNet/
You can alter the maximum number of threads by calling the ThreadPool.SetMaxThreads method. 300,000 threads will probably make your PC explode*
*This is probably an exaggeration
Language-agnostic answer:
Probably the better way to go about this is to use the Reactor pattern, with a maximum of 1 or 2 concurrent threads per core.
.NET commits the entire stack (1MB) for each CLR thread, so, as Ben says, your PC may actually explode. Or possibly run out of memory.
Well, what was the result of your test when you tried to create 300K threads? I'm not going to try it on mine!
You could not connect up 300K clients at once anyway because there are not enough sockets available on a single server (hence farming).
I have done some server testing and, by tweaking the registry to make more sockets available, I have had 24K sockets connected to a server, all on one box. That was roughly what I was expecting, since the server<>client connection requires one socket object at each end and there are only 64K sockets available. I did not attempt to create 24K threads for my testing; I used a client thread class that opened/closed connections on multiple client socket objects in a list.
Rgds,
Martin

.NET best practices for MongoDB connections?

I've been playing with MongoDB recently (It's AMAZINGLY FAST) using the C# driver on GitHub. Everything is working just fine in my little single threaded console app that I'm testing with. I'm able to add 1,000,000 documents (yes, million) in under 8 seconds running single threaded. I only get this performance if I use the connection outside the scope of a for loop. In other words, I'm keeping the connection open for each insert rather than connecting for each insert. Obviously that's contrived.
I thought I'd crank it up a notch to see how it works with multiple threads. I'm doing this because I need to simulate a website with multiple concurrent requests. I'm spinning up between 15 and 50 threads, still inserting a total of 150,000 documents in all cases. If I just let the threads run, each creating a new connection for each insert operation, the performance grinds to a halt.
Obviously I need to find a way to share, lock, or pool the connection. Therein lies the question. What's the best practice in terms of connecting to MongoDB? Should the connection be kept open for the life of the app (there is substantial latency opening and closing the TCP connection for each operation)?
Does anyone have any real world or production experience with MongoDB, and specifically the underlying connection?
Here is my threading sample using a static connection that's locked for insert operations. Please offer suggestions that would maximize performance and reliability in a web context!
private static Mongo _mongo;

private static void RunMongoThreaded()
{
    _mongo = new Mongo();
    _mongo.Connect();
    var threadFinishEvents = new List<EventWaitHandle>();
    for(var i = 0; i < 50; i++)
    {
        var threadFinish = new EventWaitHandle(false, EventResetMode.ManualReset);
        threadFinishEvents.Add(threadFinish);
        var thread = new Thread(delegate()
        {
            RunMongoThread();
            threadFinish.Set();
        });
        thread.Start();
    }
    WaitHandle.WaitAll(threadFinishEvents.ToArray());
    _mongo.Disconnect();
}

private static void RunMongoThread()
{
    for (var i = 0; i < 3000; i++)
    {
        var db = _mongo.getDB("Sample");
        var collection = db.GetCollection("Users");
        var user = GetUser(i);
        var document = new Document();
        document["FirstName"] = user.FirstName;
        document["LastName"] = user.LastName;
        lock (_mongo) // Lock the connection - not ideal for threading, but safe and seemingly fast
        {
            collection.Insert(document);
        }
    }
}
Most answers here are outdated and no longer applicable, as the .NET driver has matured and had numerous features added.
Looking at the documentation of the new 2.0 driver found here:
http://mongodb.github.io/mongo-csharp-driver/2.0/reference/driver/connecting/
The .net driver is now thread safe and handles connection pooling. According to documentation
It is recommended to store a MongoClient instance in a global place, either as a static variable or in an IoC container with a singleton lifetime.
The thing to remember about a static connection is that it's shared among all your threads. What you want is one connection per thread.
When using mongodb-csharp you treat it like you would an ADO connection.
When you create a Mongo object it borrows a connection from the pool, which it owns until it is disposed. So after the using block the connection is back into the pool.
Creating Mongo objects is cheap and fast.
Example
for(var i=0;i<100;i++)
{
    using(var mongo1 = new Mongo())
    using(var mongo2 = new Mongo())
    {
        mongo1.Connect();
        mongo2.Connect();
    }
}
Database Log
Wed Jun 02 20:54:21 connection accepted from 127.0.0.1:58214 #1
Wed Jun 02 20:54:21 connection accepted from 127.0.0.1:58215 #2
Wed Jun 02 20:54:21 MessagingPort recv() errno:0 No error 127.0.0.1:58214
Wed Jun 02 20:54:21 end connection 127.0.0.1:58214
Wed Jun 02 20:54:21 MessagingPort recv() errno:0 No error 127.0.0.1:58215
Wed Jun 02 20:54:21 end connection 127.0.0.1:58215
Notice it only opened 2 connections.
I put this together using mongodb-csharp forum.
http://groups.google.com/group/mongodb-csharp/browse_thread/thread/867fa78d726b1d4
Somewhat tangential but still of interest is CSMongo, a C# driver for MongoDB created by the developer of jLinq. Here's a sample:
//create a database instance
using (MongoDatabase database = new MongoDatabase(connectionString)) {
    //create a new document to add
    MongoDocument document = new MongoDocument(new {
        name = "Hugo",
        age = 30,
        admin = false
    });
    //create entire objects with anonymous types
    document += new {
        admin = true,
        website = "http://www.hugoware.net",
        settings = new {
            color = "orange",
            highlight = "yellow",
            background = "abstract.jpg"
        }
    };
    //remove fields entirely
    document -= "languages";
    document -= new[] { "website", "settings.highlight" };
    //or even attach other documents
    MongoDocument stuff = new MongoDocument(new {
        computers = new [] {
            "Dell XPS",
            "Sony VAIO",
            "Macbook Pro"
        }
    });
    document += stuff;
    //insert the document immediately
    database.Insert("users", document);
}
Connection Pool should be your answer.
The feature is being developed (please see http://jira.mongodb.org/browse/CSHARP-9 for more detail).
Right now, for a web application, the best practice is to connect at BeginRequest and release the connection at EndRequest. But I think that operation is too expensive to do for each request without a connection pool. So I decided to have a global Mongo object and use it as a shared resource for every thread (if you get the latest C# driver from github right now, they have also improved the performance for concurrency a bit).
I don't know the disadvantages of using a global Mongo object, so let's wait for another expert to comment on this.
But I think I can live with it until the connection pool feature has been completed.
I am using the csharp-mongodb driver and its connection pool doesn't help me :( I have about 10-20 requests to mongodb per web request (150 users online on average). And I can't even monitor statistics or connect to mongodb from the shell; it throws an exception.
I have created repository, which open and dispose connection per request. I rely on such things as:
1) Driver has connection pool
2) After my research (I have posted some questions in user groups about this) I understood that creating a Mongo object and opening a connection is not a heavy operation.
But today my production went down :(
Maybe I have to keep the connection open per request...
here is link to user group http://groups.google.com/group/mongodb-user/browse_thread/thread/3d4a4e6c5eb48be3#
