I have a Windows Service application which is performing some calls to SQL Server. I have a particular unit of work to do which involves saving one row to the Message table and updating multiple rows in the Buffer table.
I have wrapped these two SQL statements in a TransactionScope to ensure that either both are committed or neither is.
The high level code looks like this:
public static void Save(Message message)
{
using (var transactionScope = new TransactionScope())
{
MessageData.Save(message.TransactionType,
message.Version,
message.CaseNumber,
message.RouteCode,
message.BufferSetIdentifier,
message.InternalPatientNumber,
message.DistrictNumber,
message.Data,
message.DateAssembled,
(byte)MessageState.Inserted);
BufferLogic.FlagSetAsAssembled(message.BufferSetIdentifier);
transactionScope.Complete();
}
}
This has all worked perfectly on my development machine with a local SQL Server installation.
On deploying the Windows Service to a server (but connecting back to my local machine's SQL Server) I am intermittently getting this error message:
System.ArgumentNullException: Value cannot be null.
at System.Threading.Monitor.Enter(Object obj)
at System.Data.ProviderBase.DbConnectionPool.TransactedConnectionPool.TransactionEnded(Transaction transaction, DbConnectionInternal transactedObject)
at System.Data.SqlClient.SqlDelegatedTransaction.SinglePhaseCommit(SinglePhaseEnlistment enlistment)
at System.Transactions.TransactionStateDelegatedCommitting.EnterState(InternalTransaction tx)
at System.Transactions.CommittableTransaction.Commit()
at System.Transactions.TransactionScope.InternalDispose()
at System.Transactions.TransactionScope.Dispose()
at OpenLink.Logic.MessageLogic.Save(Message message) in E:\DevTFS\P0628Temp\OpenLink\OpenLink.Logic\MessageLogic.cs:line 30
at OpenLinkMessageAssembler.OpenLinkMessageAssemblerService.RunService() in E:\DevTFS\P0628Temp\OpenLink\OpenLinkMessageAssembler\OpenLinkMessageAssemblerService.cs:line 99
I believe the line of code being referred to by the exception is where the using block is closed, thus calling the Dispose() method of the TransactionScope. I'm at a bit of a loss here, as the exception seems to be thrown by the internal workings of the TransactionScope class.
One thing that may be significant is that when installing on the server, I had to enable some of the settings for the Distributed Transaction Coordinator to allow network access. This got me thinking that when it's all on my local machine, DTC is probably not used.
Could DTC be part of the cause of this exception?
I also considered whether it was to do with connection pools being maxed out, but would expect a more useful exception than what I'm getting. I kept running the query in this question to check the connection pool size, and it never exceeded four.
My ultimate question is, why is this error intermittently occurring? How can I diagnose what's causing it?
Edit: Threading
@Joe suggested this could be a threading issue. I've therefore included the skeleton code of my Windows Service below to see if it is problematic.
Note that the EventLogger class writes only to the Windows event log and does not connect to SQL Server.
partial class OpenLinkMessageAssemblerService : ServiceBase
{
private volatile bool _isStopping;
private readonly ManualResetEvent _stoppedEvent;
private readonly int _stopTimeout = Convert.ToInt32(ConfigurationManager.AppSettings["ServiceOnStopTimeout"]);
Thread _workerThread;
public OpenLinkMessageAssemblerService()
{
InitializeComponent();
_isStopping = false;
_stoppedEvent = new ManualResetEvent(false);
ServiceName = "OpenLinkMessageAssembler";
}
protected override void OnStart(string[] args)
{
try
{
_workerThread = new Thread(RunService) { IsBackground = true };
_workerThread.Start();
}
catch (Exception exception)
{
EventLogger.LogError(ServiceName, exception.ToString());
throw;
}
}
protected override void OnStop()
{
// Set the global flag so it can be picked up by the worker thread
_isStopping = true;
// Allow worker thread to exit cleanly until timeout occurs
if (!_stoppedEvent.WaitOne(_stopTimeout))
{
_workerThread.Abort();
}
}
private void RunService()
{
// Check global flag which indicates whether service has been told to stop
while (!_isStopping)
{
try
{
var buffersToAssemble = BufferLogic.GetNextSetForAssembly();
if (!buffersToAssemble.Any())
{
Thread.Sleep(30000);
continue;
}
... // Some validation code removed here for clarity
string assembledMessage = string.Empty;
buffersToAssemble.ForEach(b => assembledMessage += b.Data);
var messageParser = new MessageParser(assembledMessage);
var message = messageParser.Parse();
MessageLogic.Save(message); // <-- This calls the method which results in the exception
}
catch (Exception exception)
{
EventLogger.LogError(ServiceName, exception.ToString());
throw;
}
}
_stoppedEvent.Set();
}
}
Check that you have set up DTC on both your web server and your DB server, if they are on separate machines.
http://itknowledgeexchange.techtarget.com/sql-server/how-to-configure-dtc-on-windows-2008/
For logging, I would suggest putting a try/catch inside the transaction scope. However, if you are logging to the database, you will need to use TransactionScopeOption.Suppress so the log write is not part of the transaction:
using(TransactionScope scope4 = new
TransactionScope(TransactionScopeOption.Suppress))
{
...
}
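To make the Suppress suggestion concrete, here is a minimal sketch. The class and method names (SuppressSketch, LogOutsideTransaction) are made up for illustration, and the Console.WriteLine call stands in for whatever database log write you actually use. It only shows that code inside a Suppress scope sees no ambient transaction, so a log write there would not be rolled back with the outer scope.

```csharp
using System;
using System.Transactions;

public static class SuppressSketch
{
    // Hypothetical logger stand-in. Returns true if it ran inside an ambient transaction.
    public static bool LogOutsideTransaction(string message)
    {
        using (var suppress = new TransactionScope(TransactionScopeOption.Suppress))
        {
            // Inside a Suppress scope there is no ambient transaction.
            bool enlisted = Transaction.Current != null;
            Console.WriteLine(message); // a database log write would go here
            suppress.Complete();
            return enlisted;
        }
    }

    public static bool Demo()
    {
        using (var scope = new TransactionScope())
        {
            try
            {
                throw new InvalidOperationException("simulated failure");
            }
            catch (InvalidOperationException ex)
            {
                // scope.Complete() is never called, so the outer transaction rolls back,
                // while the log call above runs outside of it.
                return LogOutsideTransaction(ex.Message);
            }
        }
    }
}
```

Calling SuppressSketch.Demo() returns false, meaning the logging call did not see the outer transaction.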
I worked around this by stopping the transaction from being escalated to DTC. By using SQL 2008 instead of SQL 2005, the transaction does not get escalated, and all is fine.
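If you want to confirm whether a given unit of work is being escalated to DTC, one option is to inspect the ambient transaction's DistributedIdentifier, which stays Guid.Empty until the transaction is promoted. The sketch below is only a diagnostic aid under that assumption. EscalationProbe and its methods are hypothetical names, and the work delegate stands in for the two data-access calls from the question.

```csharp
using System;
using System.Transactions;

public static class EscalationProbe
{
    // True when the ambient transaction has been promoted to a distributed (DTC) transaction.
    public static bool IsDistributed()
    {
        var current = Transaction.Current;
        return current != null &&
               current.TransactionInformation.DistributedIdentifier != Guid.Empty;
    }

    // Runs the given work inside a TransactionScope and reports whether it escalated.
    public static bool ProbeInsideScope(Action work)
    {
        using (var scope = new TransactionScope())
        {
            work(); // e.g. the Message insert followed by the Buffer update
            bool escalated = IsDistributed();
            scope.Complete();
            return escalated;
        }
    }
}
```

Logging the result of IsDistributed() just before transactionScope.Complete() on both the development machine and the server should show whether the two environments behave differently.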
You do not mention your .NET version, but according to
http://support.microsoft.com/kb/960754, there is an issue with version 2.50727.4016 of System.Data.dll.
If your server has this older version, I would try to get the updated one from Microsoft.
I'm encountering an issue where a service is exiting on errors that should never propagate up.
I built a microservice manager (in .NET, as the local environment doesn't support .NET Core and some of its native microservice abilities).
It was built in VS2019 targeting .NET 4.5.2 (I know, but this is the world we live in).
The microservice manager is built and installed as a Windows service. The entry point looks like this (the #if/#else was for testing locally; it works as intended when registered as a Windows service):
Program.cs (Entry point)
static class Program
{
/// <summary>
/// The main entry point for the application.
/// </summary>
static void Main()
{
#if DEBUG
Scheduler myScheduler = new Scheduler();
myScheduler.OnDebug();
System.Threading.Thread.Sleep(System.Threading.Timeout.Infinite);
#else
ServiceBase[] ServicesToRun;
ServicesToRun = new ServiceBase[]
{
new Scheduler()
};
ServiceBase.Run(ServicesToRun);
#endif
}
}
Scheduler.cs
//(confidential code hidden)
`private static readonly Configuration config = Newtonsoft.Json.JsonConvert.DeserializeObject<Configuration>(
File.ReadAllText(configFilePath)
);
public Scheduler()
{
//InitializeComponent(); //windows service, doesnt need UI components initialized
}
public void OnDebug()
{
OnStart(null); //triggers when developing locally
}
protected override async void OnStart(string[] args)
{
try
{
logger.Log($@"Service manager starting...");
logger.Log($@"Finding external services... {config.services.Count} services found.");
foreach (var service in config.services)
{
try
{
if (service.disabled)
{
logger.Log(
$@"Skipping {service.name}: disabled=true in Data Transport Service's appSettings.json file");
continue;
}
logger.Queue($@"Starting: {service.name}...");
string serviceLocation = service.useRelativePath
? Path.Combine(assemblyLocation, service.path)
: service.path;
var svc = Assembly.LoadFrom(serviceLocation);
var assemblyType = svc.GetType($@"{svc.GetName().Name}.Program");
var methodInfo = assemblyType.GetMethod("Main");
var instanceObject = Activator.CreateInstance(assemblyType, new object[0]);
methodInfo.Invoke(instanceObject, new object[0]);
logger.Queue(" Running").Send("");
}
catch (TargetInvocationException ex)
{
logger.Queue(" Failed").Send("");
logger.Log("an error occurred", LOG.LEVEL.CRITICAL, ex);
}
catch (Exception ex)
{
logger.Queue(" Failed").Send("");
logger.Log("an error occurred", LOG.LEVEL.CRITICAL, ex);
}
}
logger.Log("Finished loading services.");
}
catch (Exception ex)
{
logger.Log($@"Critical error encountered", LOG.LEVEL.CRITICAL, ex);
}
}
Microservice:
public [Confidential]()
{
if (currentProfile == null)
{
var errMsg =
$@"Service not loaded, Profile not found, check appSettings.currentProfile: '{config.currentProfile}'";
logger.Log(errMsg,severity: LOG.LEVEL.CRITICAL);
throw new SettingsPropertyNotFoundException(errMsg);
}
if (currentProfile.disabled)
{
var errMsg = $@"Service not loaded: {config.serviceName}, Service's appSettings.currentProfile.disabled=true";
logger.Log(errMsg,LOG.LEVEL.WARN);
throw new ArgumentException(errMsg);
}
logger.Log($@"Loading: '{config.serviceName}' with following configuration:{Environment.NewLine}{JsonConvert.SerializeObject(currentProfile,Formatting.Indented)}");
logger.Queue($@"Encrypting config file passwords...");
bool updateConfig = false;
foreach (var kafkaSource in config.dataTargets)
{
if (!kafkaSource.password.IsEncrypted())
{
updateConfig = true;
logger.Queue($@"%tabEncrypting: {kafkaSource.name}");
kafkaSource.password = kafkaSource.password.Encrypt();
}
else
{
logger.Queue($@"%tabAlready encrypted: {kafkaSource.name}");
}
}
logger.Send(Environment.NewLine);
if (updateConfig)
{
File.WriteAllText(
configFilePath,
Newtonsoft.Json.JsonConvert.SerializeObject(config));
}
var _source = config.dataSources.FirstOrDefault(x=>x.name==currentProfile.dataSource);
var _target = config.dataTargets.FirstOrDefault(x => x.name == currentProfile.dataTarget);
source = new Connectors.Sql(logger,
_source?.name,
_source?.connectionString,
_source.pollingInterval,
_source.maxRowsPerSelect,
_source.maxRowsPerUpdate);
target = new Connectors.KafkaProducer(logger)
{
bootstrapServers = _target?.bootstrapServers,
name = _target?.name,
password = _target?.password.Decrypt(),
sslCaLocation = Path.Combine(assemblyLocation,_target?.sslCaLocation),
topic = _target?.topic,
username = _target?.username
};
Start();
}
public void Start()
{
Timer timer = new Timer();
try
{
logger.Log($@"SQL polling interval: {source.pollingInterval} seconds");
timer.Interval = source.pollingInterval * 1000;
timer.Elapsed += new ElapsedEventHandler(this.OnTimer);
timer.Start();
if (currentProfile.executeOnStartup)
Run();
}
catch (Exception ex)
{
var sb = new StringBuilder();
sb.AppendLine($@"Critical error encountered loading external service: {config.serviceName}.");
if (!timer.Enabled)
sb.AppendLine($@"service unloaded - Schedule not started!");
else
sb.AppendLine($@"service appears to be loaded and running on schedule.");
logger.Log(sb.ToString(), LOG.LEVEL.CRITICAL, ex);
}
}
public void OnTimer(object sender, ElapsedEventArgs e)
{
try
{
Run();
}
catch (Exception ex)
{
logger.Log($@"Critical error during scheduled run on service: {config.serviceName}.", LOG.LEVEL.CRITICAL, ex);
}
}
public async void Run()
{
//Get new alarm events from SQL source
logger.Queue("Looking for new alarms...");
var rows = await GetNewEvents();`
The exception occurred in the GetNewEvents method, which attempted to open a SqlConnection to a SQL Server that was unavailable due to network issues. That method intentionally throws an exception, which should propagate up to OnTimer, where it gets caught and logged while the timer keeps running. During development and testing I simulated this type of error with invalid credentials, a bad connection string, and so on, and it worked as expected: the error was logged and the service kept running. Recently, for some reason, the error is not caught in OnTimer. It propagates up to where it should be caught by Start (but isn't). After that it should be caught by the parent service manager, which is entirely wrapped in a try/catch with no rethrows, and above that (because there could be multiple microservices managed by that service) the entry point to the service manager is also wrapped in a try/catch with no rethrows, all for isolation from microservice errors. Yet the error from a very downstream component is now propagating all the way up.
Typically, this code runs 24/7 with no issues, and the microservice it loads from the config file launches and runs fine. The entry into that specific microservice starts with a try {...} catch (Exception ex) {...} block.
The concept is to have a microservice manager that can launch a number of microservices without having to install all of them as Windows services, and to have some level of configuration driven by a config file that dictates how the main service runs.
The microservice represented here opens a SQL connection, reads data, performs business logic, and publishes results to Kafka. It does this on a polling interval dictated by the config file contained in the microservice. As stated above, it has run for months without issue.
Recently, I noticed the main microservice manager service was not running on the Windows server. I investigated the server Application Log and found a "Runtime Error" that essentially stated that the microservice, while attempting to connect to SQL, failed (network issue) and caused the entire microservice manager to exit. To my understanding, the way I'm launching the microservice should isolate it from the main service manager app. Additionally, the main service manager app is wrapped in a very generic try/catch block. The entry point to the microservice itself is wrapped in a try/catch, and almost every component in the microservice is wrapped in try/catch per business need. The scenario that faulted (can't connect to SQL) intentionally throws an error for logging purposes, but it should be caught by the immediate parent try/catch, which does not propagate or rethrow, and only logs the error to a text file and the Windows server Application Log.
How is it that this exception is bubbling up through isolation points and causing the main service to fault and exit? I tested this exact scenario (being unable to connect to SQL) extensively during development and prior to release, and it generated the correct log entry and tried again on the next polling cycle as expected.
I haven't tried any other approaches yet, as I feel they would be band-aid fixes at best, given that I don't understand why the original design is suddenly failing. The server hasn't changed: no patching, security updates, etc.
From the server Application Log:
Application: DataTransportService.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.Exception
at Connectors.SqlHelper.DbHelper+d__13`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].MoveNext()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
at IntelligentAlarms.IntelligentAlarm+d__14.MoveNext()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)
at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(System.Threading.Tasks.Task)
at IntelligentAlarms.IntelligentAlarm+d__12.MoveNext()
at System.Runtime.CompilerServices.AsyncMethodBuilderCore+<>c.b__6_1(System.Object)
at System.Threading.QueueUserWorkItemCallback.WaitCallback_Context(System.Object)
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
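One possible explanation, offered as an assumption rather than a confirmed diagnosis: Run() is declared async void. An exception that escapes an async void method is not delivered to the caller, so the try/catch in OnTimer never sees it. With no SynchronizationContext it is rethrown on a thread-pool thread, which matches the AsyncMethodBuilderCore and QueueUserWorkItemCallback frames in the log above and terminates the process. The sketch below shows the alternative shape, where the method returns Task and the caller observes the failure. GetNewEventsAsync is a hypothetical stand-in for GetNewEvents, and the class name is made up.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncVoidSketch
{
    // Hypothetical stand-in for GetNewEvents(): fails after an asynchronous hop.
    public static async Task<int> GetNewEventsAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("SQL Server unavailable");
    }

    // Returning Task (instead of async void) lets the caller observe the exception.
    public static async Task RunAsync()
    {
        int rows = await GetNewEventsAsync();
        Console.WriteLine(rows);
    }

    // Timer-style handler: waits on the task inside the try/catch so the failure is caught here.
    // With "async void Run()" this catch block would never run for errors raised by Run.
    public static string OnTimer()
    {
        try
        {
            RunAsync().GetAwaiter().GetResult();
            return "ok";
        }
        catch (Exception ex)
        {
            return "caught: " + ex.Message;
        }
    }
}
```

Blocking with GetAwaiter().GetResult() is acceptable here only because a System.Timers.Timer callback in a service runs on a thread-pool thread with no SynchronizationContext. An async event handler with its own try/catch around "await RunAsync()" would be the non-blocking equivalent.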
I have a strange problem when working with MySQL in C#. When executing a command I sometimes (not always) get an "An established connection was aborted by the software in your host machine" error. I cannot solve this problem and can't find any solution on the web.
As a workaround I developed a simple ReConnection class. When this exception occurs I simply call the ReOpen() method. It works perfectly with ADO.NET and I have been using this trick for more than 2 years.
public class ReConnection : DbConnection
{
private readonly int _reconnectAttempts;
private readonly int _reconnectInterval;
private bool _isOpened;
private int _reConnectSessions;
private readonly object _syncRoot = new object();
private readonly DbCommand _pingCommand;
private readonly DbConnection _innerConnection;
public ReConnection(DbConnection innerConnection,
int reconnectAttempts = 4, int reconnectInterval = 1000)
{
_innerConnection = innerConnection;
_pingCommand = _innerConnection.CreateCommand();
_pingCommand.CommandText = "SELECT 1 FROM DUAL";
_reconnectAttempts = reconnectAttempts;
_reconnectInterval = reconnectInterval;
}
public override void Open()
{
_innerConnection.Open();
_isOpened = true;
}
public override void Close()
{
_isOpened = false;
while (_reConnectSessions > 0)
Thread.Sleep(100);
_innerConnection.Close();
}
internal bool ReOpen()
{
var restored = false;
lock (_syncRoot)
{
_reConnectSessions++;
var retries = _reconnectAttempts;
try
{
_pingCommand.ExecuteNonQuery();
restored = true;
}
catch (Exception)
{
while (_isOpened &&
((retries > 0) || (_reconnectAttempts == -1)) &&
!restored)
{
if ((retries < _reconnectAttempts) || (_reconnectAttempts == -1))
Thread.Sleep(_reconnectInterval);
try
{
_innerConnection.Close();
try
{
_innerConnection.Open();
}
catch{}
//the exception occurs here: Unable to write data to the transport connection: An established connection was aborted by the software in your host machine.
_innerConnection.Close(); // closing the corrupted connection
_innerConnection.Open(); // opening the new connection
//at this point a transport connection suddenly appears that data can be written to
_pingCommand.ExecuteNonQuery();
restored = true;
}
catch (Exception){}
retries--;
}
}
}
}
}
}
But recently I developed an app with Entity Framework, and a similar exception occurs there too. I can't apply the previous workaround to Entity Framework.
Has anyone encountered this problem with MySQL?
What is the actual reason behind this exception?
The connection expires and is terminated by the host after some time. This is managed by MySQL, so there is nothing you can do about it in code.
If a particular query takes a long time to execute (some of my queries take ~20 minutes to return), you can set the timeouts via the connection string: just add "ConnectionIdleTimeout=1800;Command Timeout=0;" to your connection string.
The connection idle timeout stops the MySQL server from killing the connection until the given time has elapsed (setting it to 0 is not recommended; otherwise the host will eventually be swamped with open connections and report an error about exceeding the maximum number of open connections), while Command Timeout=0 stops any command from timing out by itself. A combination of the two should help with the error.
However, I don't personally like to cache the DbConnection, given the timeout behavior from the server side. I prefer to rely solely on the server to time out the connections, and on C# side, I usually just use:
using MySqlConnection connection = new MySqlConnection("yourConnectionString");
connection.Open();
I have written a windows service but when I try to stop the service it says that the service cannot be stopped at this time. Here's my whole class:
public partial class RenewalsService : ServiceBase
{
private readonly ManualResetEvent _shutdownEvent = new ManualResetEvent(false);
private Thread _thread;
public RenewalsService()
{
InitializeComponent();
this.CanStop = true;
}
protected override void OnStart(string[] args)
{
_thread = new Thread(WorkerThread)
{
Name = "Renewals Service Thread",
IsBackground = true
};
_thread.Start();
}
protected override void OnStop()
{
try
{
if (!_shutdownEvent.SafeWaitHandle.IsClosed)
{
_shutdownEvent.Set();
}
if (_thread.IsAlive)
{
if (!_thread.Join(3000))
{
// give the thread 3 seconds to stop
_thread.Abort();
}
}
}
catch (Exception ex)
{
// _thread.Join may raise an error at this point. If it does we dont care. We dont care about any other exceptions
// since we are already in the process of closing the service.
}
finally
{
IError logger = new Logger();
Exception ex = new Exception("The Renewals service has been stopped.");
logger.Log(this, SeverityEnum.Warning, ex);
Environment.ExitCode = 0;
Environment.Exit(Environment.ExitCode);
}
}
private void WorkerThread()
{
try
{
while (!_shutdownEvent.WaitOne(1))
{
string timeToRun = ConfigurationManager.AppSettings["RunTime"];
string[] timeStrings = timeToRun.Split(':');
TimeSpan runtime = new TimeSpan(0, Int32.Parse(timeStrings[0]), Int32.Parse(timeStrings[1]), Int32.Parse(timeStrings[2]));
if (DateTime.Today.TimeOfDay.Hours == runtime.Hours &&
DateTime.Today.TimeOfDay.Minutes == runtime.Minutes)
{
Renewals renewals = new Renewals();
renewals.GenerateRenewal();
}
}
}
catch (Exception ex)
{
IError logger = new Logger();
logger.Log(this, SeverityEnum.Warning, ex);
this.OnStop();
}
}
}
What's missing to ensure the user can stop the service?
Your code looks ok to me, so here's a couple of things to check.
First, does the GenerateRenewal() method take a long time to complete? If so, you might need to periodically check _shutdownEvent inside that method for a timely shutdown. Of course, you've marked the thread as a background thread so it should shut down when you tell the service to stop anyway. I haven't seen background threads hold up process termination, but I guess there's always that chance.
Second, the more likely culprit to me is that the service has already shut down due to an exception. The Services console doesn't automatically refresh when a service shuts down, so it's possible you see the Stop link available to you when it shouldn't be. If you hit F5, the console will refresh, and if your service has stopped, the Start link should be the only one available. Check your log files to see if your exception handlers have been triggered.
UPDATE
So it looks like your WorkerThread() method is throwing an exception, which causes the service to stop. This explains why the Stop link is giving you the error message when you click it.
Providing you have sufficient permissions on your box, use this link to debug your service to find out why the exception is occurring.
HTH
The base ServiceBase class calls your overridden virtual method OnStop() when the Windows Service Control Manager ("the SCM") has sent the service a "Stop" command. In the method's implementation you are supposed to do whatever is necessary to get your service to a stopped state, then return from the method back to the ServiceBase class, which handles the interaction with the SCM, in this case to tell the SCM that your service is now stopped. The SCM will decide when your service process should be terminated, and the ServiceBase class handles that without you needing to do anything explicit.
For a well-behaved service, you should either just return at the end of your OnStop method, or throw an exception. The ServiceBase class will handle things appropriately, including logging your exception, if you have thrown one, as an error in the Windows Event Log. If your method may take a while to get your service stopped, you should call base.RequestAdditionalTime() at the appropriate points, so the base class can tell the SCM that you haven't just hung, your service is in the process of stopping.
I think your main problem lies in these lines:
Environment.ExitCode = 0;
Environment.Exit(Environment.ExitCode);
You never return to the base class at all... so the ServiceBase class never has a chance to respond gracefully to the SCM... you are just unilaterally terminating the process hosting your service. This is not what a well-behaved Windows service does.
The ServiceBase class is designed to be able to support multiple services hosted in a single service process. Individual services should not concern themselves with the lifetime of the host service process, only with the logical state of their own service.
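As a rough illustration of the "signal, wait, then simply return" approach described above, here is a sketch of the worker logic pulled out into a plain class so it can be exercised without the SCM. StoppableWorker and its members are hypothetical names, and the unit of work passed in stands in for the renewal check from the question.

```csharp
using System;
using System.Threading;

public sealed class StoppableWorker
{
    private readonly ManualResetEvent _shutdownEvent = new ManualResetEvent(false);
    private readonly Action _unitOfWork;
    private readonly TimeSpan _pollInterval;
    private Thread _thread;

    public StoppableWorker(Action unitOfWork, TimeSpan pollInterval)
    {
        _unitOfWork = unitOfWork;
        _pollInterval = pollInterval;
    }

    public void Start()
    {
        _thread = new Thread(Loop) { Name = "Worker", IsBackground = true };
        _thread.Start();
    }

    // Signals the loop to finish and waits up to the timeout.
    // It never aborts the thread and never terminates the process.
    public bool Stop(TimeSpan timeout)
    {
        _shutdownEvent.Set();
        return _thread == null || _thread.Join(timeout);
    }

    private void Loop()
    {
        // WaitOne doubles as the sleep between iterations and the shutdown check.
        while (!_shutdownEvent.WaitOne(_pollInterval))
        {
            try
            {
                _unitOfWork();
            }
            catch (Exception ex)
            {
                // Log and keep looping; do not call OnStop or exit from the worker thread.
                Console.Error.WriteLine(ex);
            }
        }
    }
}
```

With something like this, OnStop would call Stop(TimeSpan.FromSeconds(3)) on the worker and then return, leaving Environment.Exit out entirely so that ServiceBase can report the stopped state to the SCM.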
I have two self hosted services running on the same network. The first is sampling an excel sheet (or other sources, but for the moment this is the one I'm using to test) and sending updates to a subscribed client.
The second connects as a client to instances of the first service, optionally evaluates some formula on these inputs, and then broadcasts the originals or the results as updates to a subscribed client in the same manner as the first. All of this is happening over a TCP binding.
My problem occurs when the second service attempts to subscribe to two of the first service's feeds at once, as it would do if a new calculation is using two or more for the first time. I keep getting TimeoutExceptions which appear to occur when the second feed is subscribed to. I put a breakpoint in the called method on the first server and, stepping through it, found that it is able to fully complete and return true back up the call stack, which indicates that the problem might be some annoying intricacy of WCF.
The first service is running on port 8081 and this is the method that gets called:
public virtual bool Subscribe(int fid)
{
try
{
if (fid > -1 && _fieldNames.LeftContains(fid))
{
String sessionID = OperationContext.Current.SessionId;
Action<Object, IUpdate> toSub = MakeSend(OperationContext.Current.GetCallbackChannel<ISubClient>(), sessionID);//Make a callback to the client's callback method to send the updates
if (!_callbackList.ContainsKey(fid))
_callbackList.Add(fid, new Dictionary<String, Action<Object, IUpdate>>());
_callbackList[fid][sessionID] = toSub;//add the callback method to the list of callback methods to call when this feed is updated
String field = GetItem(fid);//get the current stored value of that field
CheckChanged(fid, field);//add or update field, usually returns a bool if the value has changed but also updates the last value reference, used here to ensure there is a value to send
FireOne(toSub, this, MakeUpdate(fid, field));//sends an update so the subscribing service will have a first value
return true;
}
return false;
}
catch (Exception e)
{
Log(e);//report any errors before returning a failure
return false;
}
}
The second service is running on port 8082 and is failing in this method:
public int AddCalculation(string name, string input)
{
try
{
Calculation calc;
try
{
calc = new Calculation(_fieldNames, input, name);//Perform slow creation before locking - better wasted one thread than several blocked ones
}
catch (FormatException e)
{
throw Fault.MakeCalculationFault(e.Message);
}
lock (_calculations)
{
int id = nextID();
foreach (int fid in calc.Dependencies)
{
if (!_calculations.ContainsKey(fid))
{
lock (_fieldTracker)
{
DataRow row = _fieldTracker.Rows.Find(fid);
int uses = (int)(row[Uses]) + 1;//update uses of that feed
try
{
if (uses == 1){//if this is the first use of this field
SubServiceClient service = _services[(int)row[ServiceID]];//get the stored connection (as client) to that service
service.Subscribe((int)row[ServiceField]);//Failing here, but only on second call and not if subscribed to each seperately
}
}
catch (TimeoutException e)
{
Log(e);
throw Fault.MakeOperationFault(FaultType.NoItemFound, "Service could not be found");//can't be caught, if this timed out then outer connection timed out
}
_fieldTracker.Rows.Find(fid)[Uses] = uses;
}
}
}
return id;
}
}
catch (FormatException f)
{
Log(f.Message);
throw Fault.MakeOperationFault(FaultType.InvalidInput, f.Message);
}
}
The ports these are on could change but are never shared. The tcp binding used is set up in code with these settings:
_tcpbinding = new NetTcpBinding();
_tcpbinding.PortSharingEnabled = false;
_tcpbinding.Security.Mode = SecurityMode.None;
This is in a common library to ensure they both have the same set up, which is also a reason why it is declared in code.
I have already tried altering the Service Throttling Behavior for more concurrent calls but that didn't work. It's commented out for now since it didn't work but for reference here's what I tried:
ServiceThrottlingBehavior stb = new ServiceThrottlingBehavior
{
MaxConcurrentCalls = 400,
MaxConcurrentSessions = 400,
MaxConcurrentInstances = 400
};
host.Description.Behaviors.RemoveAll<ServiceThrottlingBehavior>();
host.Description.Behaviors.Add(stb);
Has anyone had similar issues of methods working correctly but still timing out when sending back to the caller?
This was a difficult problem and from everything I could tell, it is an intricacy of WCF. It cannot handle one connection being reused very quickly in a loop.
It seems to lock up the socket connection, though trying to add GC.Collect() didn't free up whatever resources it was contesting.
In the end the only way I found to work was to create another connection to the same endpoint for each concurrent request and perform them on separate threads. Might not be the cleanest way but it was all that worked.
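For reference, the shape of that workaround was roughly the following. This is a simplified sketch rather than the actual code: PerCallClients and InvokeEach are names made up here, and createClient stands in for however you construct a SubServiceClient for the endpoint. The clients are returned alongside the results so the caller can keep them open, since closing a session-based client would presumably end its subscription.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PerCallClients
{
    // Creates one client per argument and runs each call on its own task.
    // Results come back in the same order as the arguments.
    public static KeyValuePair<TClient, TResult>[] InvokeEach<TClient, TArg, TResult>(
        Func<TClient> createClient,
        IEnumerable<TArg> args,
        Func<TClient, TArg, TResult> call)
    {
        var tasks = args.Select(arg => Task.Run(() =>
        {
            TClient client = createClient(); // a separate connection for this request
            return new KeyValuePair<TClient, TResult>(client, call(client, arg));
        })).ToArray();

        // Throws AggregateException if any of the calls failed.
        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }
}
```

In the second service this would replace the direct service.Subscribe(...) call for the feeds that need subscribing at the same time, with the delegate being something like (client, fid) => client.Subscribe(fid).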
Something that might come in handy is that I used the svc trace viewer to monitor the WCF calls to try and track the problem, I found out how to use it from this article: http://www.codeproject.com/Articles/17258/Debugging-WCF-Apps
Does anyone have a solid pattern for accessing Redis via the BookSleeve library?
I mean:
BookSleeve's author @MarcGravell recommends not opening and closing the connection every time, but rather maintaining one connection throughout the app. But how can you handle network breaks? That is, the connection might be opened successfully in the first place, but when some code tries to read from or write to Redis, there is the possibility that the connection has dropped and you must reopen it (and fail gracefully if it won't open, but that is up to your design needs).
I am looking for code snippet(s) that cover general Redis connection opening, and a general 'alive' check (plus an optional wake-up if not alive) that would be used before each read/write.
This question suggests a nice approach to the problem, but it's only partial (it does not recover a lost connection, for example), and the accepted answer to that question points in the right direction but does not demonstrate concrete code.
I hope this thread will get solid answers and eventually become a sort of a Wiki with regards to BookSleeve use in .Net applications.
-----------------------------
IMPORTANT UPDATE (21/3/2014):
-----------------------------
Marc Gravell (@MarcGravell) / Stack Exchange have recently released the StackExchange.Redis library that ultimately replaces Booksleeve. This new library, among other things, internally handles reconnections and renders my question redundant (that is, it's not redundant for Booksleeve nor my answer below, but I guess the best way going forward is to start using the new StackExchange.Redis library).
Since I haven't got any good answers, I came up with this solution (BTW thanks @Simon and @Alex for your answers!).
I want to share it with all of the community as a reference. Of course, any corrections will be highly appreciated.
using System;
using System.Net.Sockets;
using BookSleeve;
namespace Redis
{
public sealed class RedisConnectionGateway
{
private const string RedisConnectionFailed = "Redis connection failed.";
private RedisConnection _connection;
private static volatile RedisConnectionGateway _instance;
private static object syncLock = new object();
private static object syncConnectionLock = new object();
public static RedisConnectionGateway Current
{
get
{
if (_instance == null)
{
lock (syncLock)
{
if (_instance == null)
{
_instance = new RedisConnectionGateway();
}
}
}
return _instance;
}
}
private RedisConnectionGateway()
{
_connection = getNewConnection();
}
private static RedisConnection getNewConnection()
{
return new RedisConnection("127.0.0.1" /* change with config value of course */, syncTimeout: 5000, ioTimeout: 5000);
}
public RedisConnection GetConnection()
{
lock (syncConnectionLock)
{
if (_connection == null)
_connection = getNewConnection();
if (_connection.State == RedisConnectionBase.ConnectionState.Opening)
return _connection;
if (_connection.State == RedisConnectionBase.ConnectionState.Closing || _connection.State == RedisConnectionBase.ConnectionState.Closed)
{
try
{
_connection = getNewConnection();
}
catch (Exception ex)
{
throw new Exception(RedisConnectionFailed, ex);
}
}
if (_connection.State == RedisConnectionBase.ConnectionState.Shiny)
{
try
{
var openAsync = _connection.Open();
_connection.Wait(openAsync);
}
catch (SocketException ex)
{
throw new Exception(RedisConnectionFailed, ex);
}
}
return _connection;
}
}
}
}
With other systems (such as ADO.NET), this is achieved using a connection pool. You never really get a new Connection object, but in fact get one from the pool.
The pool itself manages new connections and dead connections, independently from the caller's code. The idea here is to have better performance (establishing a new connection is costly) and to survive network problems (the caller code will fail while the server is down but resume when it comes back online). There is in fact one pool per AppDomain, per "type" of connection.
This behavior shows when you look at ADO.NET connection strings. For example, the SQL Server connection string (ConnectionString property) has the notions of 'Pooling', 'Max Pool Size', 'Min Pool Size', etc. There is also a ClearAllPools method that can be used to programmatically reset the current AppDomain's pools if needed.
I don't see anything close to this kind of feature looking into BookSleeve code, but it seems to be planned for the next release: BookSleeve RoadMap.
In the meantime, I suppose you can write your own connection pool, as the RedisConnection has an Error event you can use to detect when it's dead.
I'm not a C# programmer, but the way I'd look at the problem is the following:
I'd code a generic function that would take as parameters the redis connection and a lambda expression representing the Redis command
if trying to execute the Redis command results in an exception pointing to a connectivity issue, I'd re-initialize the connection and retry the operation
if no exception is raised just return the result
Here is some sort of pseudo-code:
function execute(redis_con, lambda_func) {
try {
return lambda_func(redis_con)
}
catch(connection_exception) {
redis_con = reconnect()
return lambda_func(redis_con)
}
}
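For C# readers, the pseudo-code above could be sketched along these lines. This is only an illustration of the retry-once idea, not BookSleeve API: RedisRetry, Execute, and the isConnectivityError predicate are names invented here, and the connection type is left generic so nothing about RedisConnection is assumed.

```csharp
using System;

public static class RedisRetry
{
    // Runs the command once; on a connectivity error, reconnects and retries a single time.
    // The caller's connection reference is replaced when a reconnect happens.
    public static TResult Execute<TConn, TResult>(
        ref TConn connection,
        Func<TConn> reconnect,
        Func<TConn, TResult> command,
        Func<Exception, bool> isConnectivityError)
    {
        try
        {
            return command(connection);
        }
        catch (Exception ex) when (isConnectivityError(ex))
        {
            connection = reconnect();
            // A second failure propagates to the caller.
            return command(connection);
        }
    }
}
```

What counts as a connectivity error is up to the caller; a predicate such as ex => ex is SocketException would be one plausible choice, given that the gateway code above already catches SocketException.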