I have a WCF Service hosted in a Windows service as described here.
I have scheduled a nightly restart of the service, but sometimes the restart fails: the service hangs in the Stopping state and the EXE process has to be killed manually. It most likely hangs on the line _ESSServiceHost.Close();, because nothing after that line is written to the log file. It is possible, but not very likely, that the service gets the stop request while it is busy.
Moreover, the underlying process cannot be killed because it depends on services.exe, so only a server restart works.
What could be wrong with this approach?
protected override void OnStop()
{
    try
    {
        if (_ESSServiceHost != null)
        {
            _ESSServiceHost.Close();
            _ESSServiceHost = null;
            //Never reaches the following line
            Tools.LogInfo("Services stopped.");
        }
    }
    catch (Exception ex)
    {
        Tools.LogError(ex.Message);
    }
}
This is how I stop the service:
private bool StopService(ServiceController scESiftServer)
{
int i = 0;
if (scESiftServer.Status == ServiceControllerStatus.Running)
{
try
{
scESiftServer.Stop();
}
catch (Exception ex)
{
Tools.LogEvent("Exception ...");
return false;
}
while (scESiftServer.Status != ServiceControllerStatus.Stopped && i < 120)
{
Thread.Sleep(1000);
scESiftServer.Refresh();
i++;
}
}
if (scESiftServer.Status != ServiceControllerStatus.Stopped)
{
//This line gets executed
Tools.LogEvent("Failed within 120 sec...");
return false;
}
else
{
Tools.LogEvent("OK ...");
}
return true;
}
Could something like this help?
var task = Task.Run(() => _ESSServiceHost.Close(TimeSpan.FromSeconds(299)));
if (!task.Wait(TimeSpan.FromSeconds(300)))
{
_ESSServiceHost.Abort();
}
But _ESSServiceHost.Abort() should be called internally by the Close method if needed.
Target framework is 4.5, installed is .NET 4.7.2.
I found out that the service probably hangs after a series of malformed requests (Expected record type 'Version', found '71'. etc.).
I found in the svclog file that my service hangs after a series of malformed requests that arrive on Saturday and Sunday at approx. 5:15 AM. The error messages were Expected record type 'Version', found '71'. and Error while reading message framing format at position 0 of stream (state: ReadingVersionRecord). I could not find the cause of these malformed requests, so I tried to fix the service to withstand the "attack".
I have modified the OnStop method as follows:
protected override void OnStop()
{
try
{
if (_ESSServiceHost != null)
{
Tools.LogInfo("Stopping ESService.");
var abortTask = Task.Run(() => _ESSServiceHost.Abort());
var closeTask = Task.Run(() => _ESSServiceHost.Close(TimeSpan.FromSeconds(300)));
try
{
if (_ESSServiceHost.State == CommunicationState.Faulted)
{
Tools.LogInfo("ESSServiceHost.State == CommunicationState.Faulted");
if (!abortTask.Wait(TimeSpan.FromSeconds(60)))
Tools.LogInfo("Failed to Abort.");
}
else
{
if (!closeTask.Wait(TimeSpan.FromSeconds(301)))
{
Tools.LogInfo("Failed to Close - trying Abort.");
if (!abortTask.Wait(TimeSpan.FromSeconds(60)))
Tools.LogInfo("Failed to Abort.");
}
}
}
catch (Exception ex)
{
Tools.LogException(ex, "ESSServiceHost.Close");
try
{
Tools.LogInfo("Abort.");
if (!abortTask.Wait(TimeSpan.FromSeconds(60)))
Tools.LogInfo("Failed to Abort.");
}
catch (Exception ex2)
{
Tools.LogException(ex2, "ESSServiceHost.Abort");
}
}
_ESSServiceHost = null;
Tools.LogInfo("ESService stopped.");
}
}
catch (Exception ex)
{
Tools.LogException(ex,"OnStop");
}
}
Today, on Monday, I checked the svclog: the "attacks" with malformed requests were still there, but my service lived happily through them. So it seems to be fixed. Moreover, only:
Stopping ESService.
ESService stopped.
events were logged in my log file. No aborts etc. So I guess that putting the Close call on a separate thread fixed the problem, but I absolutely do not know why.
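One detail in the OnStop code above: abortTask is queued with Task.Run at the same time as closeTask, so Abort() can run concurrently with Close() rather than only as a fallback. A minimal sketch of a stricter ordering, assuming the intent is to call Abort only when Close throws or times out, could look like this. ShutdownHelper and CloseOrAbort are hypothetical names, not part of WCF; the two delegates stand in for _ESSServiceHost.Close(...) and _ESSServiceHost.Abort().

```csharp
using System;
using System.Threading.Tasks;

public static class ShutdownHelper
{
    // Hypothetical helper (not a WCF API): runs close() on a worker thread and
    // calls abort() only if close() throws or does not finish within the timeout.
    // Returns true when close() completed cleanly, false when abort() was used.
    public static bool CloseOrAbort(Action close, Action abort, TimeSpan timeout)
    {
        var closeTask = Task.Run(close);
        try
        {
            if (closeTask.Wait(timeout))
                return true;
        }
        catch (AggregateException)
        {
            // close() threw; fall through to abort().
        }

        abort();
        return false;
    }
}
```

In OnStop this might be called as ShutdownHelper.CloseOrAbort(() => _ESSServiceHost.Close(TimeSpan.FromSeconds(30)), () => _ESSServiceHost.Abort(), TimeSpan.FromSeconds(35)); the timeout values here are placeholders, not recommendations.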
I have a call to write to a log file whenever the API is called. This works flawlessly on single machine but as soon as it is moved to a web farm, then nothing gets written. No errors are raised either.
Here is how things are arranged
Call to API
[HttpGet]
public model.Returns Get([FromUri] model.Requests requests)
Within this function, Get, there is a call that writes to the log file, like so:
//record request
Task.Run(() => writers.WriteApiLogAsync(requests));
Within the WriteApiLogAsync is this
logs.Add(new model.Log()
{
Type = requests.t,
When = DateTime.Now,
PhoneId = requests.p,
Location = requests.l,
Phone = requests.ph,
PhoneType = requests.pt
});
//file is locked, attempt write on next round but stack log entries
if (logLocker.IsWriterLockHeld) return;
try
{
var result = await writeLogs(logs);
if(result) logLocker.ReleaseWriterLock();
}
catch (OutOfMemoryException ex)
{
WriteErrorAsnc(ex.Message, ex.ToString(), "WriteApiLogAsync_OutOfMemoryException");
logs.Clear();
logLocker.ReleaseWriterLock();
}
catch (IOException ex)
{
WriteErrorAsnc(ex.Message, ex.ToString(), "WriteApiLogAsync_IOException");
logs.Clear();
logLocker.ReleaseWriterLock();
}
catch (Exception ex)
{
WriteErrorAsnc(ex.Message, ex.ToString(), "WriteApiLogAsync_Exception");
logs.Clear();
logLocker.ReleaseWriterLock();
}
finally
{
if (logs.Count > 1000)
{
logs.Clear();
logLocker.ReleaseWriterLock();
}
}
logs is set as
private static List<model.Log> logs = new List<model.Log>();
I didn't specify a machine key as I understand it is primarily for encryption/decryption which there isn't any.
Anyone have an idea of the cause or can point to somewhere with same issue?
I have a C# Windows service which performs various tasks. It works perfectly on my local system, but as soon as I start it on my production server, it doesn't perform one particular task.
This is how my service is structured:
public static void Execute()
{
try
{
// .... some work ....
foreach (DataRow dr in dt.Rows)
{
string cc = dr["ccode"].ToString();
Task objTask = new Task(delegate { RequestForEachCustomer(cc); });
objTask.Start();
}
}
catch (Exception ex)
{
// Logging in DB + Text File
}
}
public static void RequestForEachCustomer(object cc)
{
try
{
// .... some work ....
foreach (DataRow dr in dt.Rows)
{
WriteLog("RequestForEachCustomer - Before Task");
Task objTask = new Task(delegate { RequestProcessing(dr); });
objTask.Start();
WriteLog("RequestForEachCustomer - After Task");
}
}
catch (Exception ex)
{
// Logging in DB + Text File
}
}
public static void RequestProcessing(object dr)
{
try
{
WriteLog("Inside RequestProcessing");
// .... some work ....
}
catch (Exception ex)
{
// Logging in DB + Text File
}
}
What happens on the production server is that it logs the last entry in RequestForEachCustomer, which is "RequestForEachCustomer - After Task", but it doesn't log the entry from RequestProcessing, which means the task is not starting at all. There are no exceptions in either the database or the text file.
There are no events logged in the Windows Event Viewer either. Also, the service keeps working (if I insert another record in the database, it is processed by the service immediately, so the service isn't stuck either. It just doesn't seem to process the RequestProcessing task.)
I am baffled by this and it would be great if someone could point out the mistake I am making. Oh, by the way, did I forget to mention that this service was working perfectly a few days ago on the server, and it is still working fine on my local PC?
EDIT :
WriteLog :
public static void WriteErrorLog(string Message)
{
StreamWriter sw = null;
try
{
lock (locker)
{
sw = new StreamWriter(AppDomain.CurrentDomain.BaseDirectory + "\\Logs\\LogFile.txt", true);
sw.WriteLine(DateTime.Now.ToString() + ": " + Message);
sw.Flush();
sw.Close();
}
}
catch (Exception excep)
{
try
{
// .... Inserting ErrorLog in DB ....
}
catch
{
throw excep;
}
throw excep;
}
}
I also log an entry in OnStop(), something like "Service Stopped", and it is logged every time I stop my service, so the problem can't be in the WriteLog function.
I suggest you refactor your code as in this MSDN example. What bothers me in your code is that you never wait for the tasks to finish anywhere.
The following example starts 10 tasks, each of which is passed an index as a state object. Tasks with an index from two to five throw exceptions. The call to the WaitAll method wraps all exceptions in an AggregateException object and propagates it to the calling thread.
Source : Task.WaitAll Method (Task[])
This line from example might be of some importance :
Task.WaitAll(tasks.ToArray());
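As a rough illustration of that advice (not the MSDN sample itself), the sketch below keeps a reference to every started task, waits for all of them with Task.WaitAll, and reports the exceptions that surface through the AggregateException. TaskWaitSketch, ProcessAll and the work delegate are hypothetical names; work stands in for something like RequestProcessing. The loop variable is copied into a local before it is captured, which avoids tasks sharing one variable on older C# compilers.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TaskWaitSketch
{
    // Starts one task per item, waits for all of them, and returns the number
    // of exceptions thrown by the tasks (0 when everything succeeded).
    public static int ProcessAll(IEnumerable<string> items, Action<string> work)
    {
        var tasks = new List<Task>();
        foreach (var item in items)
        {
            var current = item; // local copy so each task captures its own value
            tasks.Add(Task.Run(() => work(current)));
        }

        try
        {
            Task.WaitAll(tasks.ToArray());
            return 0;
        }
        catch (AggregateException ae)
        {
            // Exceptions thrown inside the tasks surface here instead of vanishing.
            foreach (var inner in ae.InnerExceptions)
                Console.WriteLine("Task failed: " + inner.Message);
            return ae.InnerExceptions.Count;
        }
    }
}
```

Waiting like this blocks the calling thread, so in a service it would normally sit on a worker thread rather than in OnStart.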
Basically, I am trying to catch any exception from a block of code, and fire said code once.
try
{
    CODE
}
catch (Exception e)
{
    DO THIS ONCE
}
finally
{
    CODE
}
In Depth
So I have been creating a TCP/socket server which can work with multiple clients and send/receive (I/O) data. It works well, and has for a long time now. But I have found that my console says this:
This is bad because if it thinks the user disconnected twice, it can create many problems. The way I know whether a user has disconnected is by sending data to them every 200 ms. If there is an error, I print that they disconnected, remove them from the client list, and close their stream/TCP connection.
static bool currentlyUsing;
private static void PingClient(Object o)
{
if (!currentlyUsing)
{
if (clientsConnected.Count != 0)
{
foreach (Client c in clientsConnected)
{
try
{
c.tcp.Client.Blocking = false;
c.tcp.Client.Send(new byte[1], 0, 0);
}
catch (Exception e)
{
currentlyUsing = true;
Console.WriteLine("[INFO] Client Dissconnected: IP:" + c.ip + " PORT:" + c.port.ToString() + " Reason:" + e.Message);
clientsConnected.Remove(c);
c.tcp.Close();
break;
}
finally
{
currentlyUsing = false;
}
GC.Collect();
}
}
}
}
Is there a way to make it so it catches it only once, or catches it multiple times but only fires the code I want once?
If I understand your question correctly: you want to try to run the code on each iteration of the foreach block, and always run the finally code for each iteration, but only run the catch code once?
If so:
Before the foreach block, define:
bool caught = false;
And then after:
catch (Exception e)
{
if (caught == false)
{
caught = true;
...
}
}
I was making multiple timers. So it overlapped.
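If overlapping timer callbacks are the concern, one common pattern is an atomic re-entrancy guard instead of a plain bool flag such as currentlyUsing. The following is only a sketch under that assumption; NonReentrantCallback is a hypothetical wrapper name, and the body delegate stands in for the ping logic in PingClient.

```csharp
using System;
using System.Threading;

public sealed class NonReentrantCallback
{
    private int _running; // 0 = idle, 1 = an invocation is in progress
    private readonly Action _body;

    public NonReentrantCallback(Action body)
    {
        _body = body;
    }

    // Runs the body unless another invocation is already in progress.
    // Returns false when the call was skipped.
    public bool Invoke()
    {
        // Atomically switch 0 -> 1; any other caller sees 1 and backs off.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
            return false;

        try
        {
            _body();
            return true;
        }
        finally
        {
            Interlocked.Exchange(ref _running, 0);
        }
    }
}
```

A timer callback would then call Invoke() on a single shared instance, so a tick that arrives while the previous one is still running is skipped rather than processed twice.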
I am using IPC over local TCP to communicate from a client to a server thread. The connection itself doesn't seem to throw any errors, but every time I try to make one of the associated calls, I get this message:
System.Runtime.Remoting.RemotingException: Could not connect to an IPC Port: The System cannot Find the file specified
What I am attempting to figure out is WHY. Because this WAS working correctly, until I transitioned the projects in question (yes, both) from .NET 3.5 to .NET 4.0.
Listen Code
private void ThreadListen()
{
_listenerThread = new Thread(Listen) {Name = "Listener Thread", Priority = ThreadPriority.AboveNormal};
_listenerThread.Start();
}
private void Listen()
{
_listener = new Listener(this);
LifetimeServices.LeaseTime = TimeSpan.FromDays(365);
IDictionary props = new Hashtable();
props["port"] = 63726;
props["name"] = "AGENT";
TcpChannel channel = new TcpChannel(props, null, null);
ChannelServices.RegisterChannel(channel, false);
RemotingServices.Marshal(_listener, "Agent");
Logger.WriteLog(new LogMessage(MethodBase.GetCurrentMethod().Name, "Now Listening for commands..."));
LogEvent("Now Listening for commands...");
}
Selected Client Code
private void InitializeAgent()
{
try
{
_agentController =
(IAgent)RemotingServices.Connect(typeof(IAgent), IPC_URL);
//Note: IPC_URL was originally "ipc://AGENT/AGENT"
// It has been changed to read "tcp://localhost:63726/Agent"
SetAgentPid();
}
catch (Exception ex)
{
HandleError("Unable to initialize the connected agent.", 3850244, ex);
}
}
//This is the method that throws the error
public override void LoadTimer()
{
// first check to see if we have already set the agent process id and set it if not
if (_agentPid < 0)
{
SetAgentPid();
}
try
{
TryStart();
var tries = 0;
while (tries < RUNCHECK_TRYCOUNT)
{
try
{
_agentController.ReloadSettings();//<---Error occurs here
return;
} catch (RemotingException)
{
Thread.Sleep(2000);
tries++;
if (tries == RUNCHECK_TRYCOUNT)
throw;
}
}
}
catch (Exception ex)
{
HandleError("Unable to reload the timer for the connected agent.", 3850243, ex);
}
}
If you need to see something I haven't shown, please ask, I'm pretty much flying blind here.
Edit: I think the issue is the IPC_URL String. It is currently set to "ipc://AGENT/AGENT". The thing is, I have no idea where that came from, why it worked before, or what might be stopping it from working now.
Update
I was able to get the IPC Calls working correctly by changing the IPC_URL String, but I still lack understanding of why what I did worked. Or rather, why the original code stopped working and I needed to change it in the first place.
The string I am using now is "tcp://localhost:63726/Agent"
Can anyone tell me, not why the new string works, I know that...but Why did the original string work before and why did updating the project target to .NET 4.0 break it?
So my application is exchanging request/responses with a server (no problems), until the internet connection dies for a couple of seconds, then comes back. Then a code like this:
response = (HttpWebResponse)request.GetResponse();
will throw an exception, with a status like ReceiveFailure, ConnectFailure, KeepAliveFailure etc.
Now, it's quite important that if the internet connection comes back, I am able to continue communicating with the server, otherwise I'd have to start again from the beginning and that will take a long time.
How would you go about resuming this communication when the internet is back?
At the moment, I keep on checking for a possibility to communicate with the server, until it is possible (at least theoretically). My code attempt looks like this:
try
{
response = (HttpWebResponse)request.GetResponse();
}
catch (WebException ex)
{
// We have a problem receiving stuff from the server.
// We'll keep on trying for a while
if (ex.Status == WebExceptionStatus.ReceiveFailure ||
ex.Status == WebExceptionStatus.ConnectFailure ||
ex.Status == WebExceptionStatus.KeepAliveFailure)
{
bool stillNoInternet = true;
// keep trying to talk to the server
while (stillNoInternet)
{
try
{
response = (HttpWebResponse)request.GetResponse();
stillNoInternet = false;
}
catch
{
stillNoInternet = true;
}
}
}
}
However, the problem is that the second try-catch statement keeps throwing an exception even when the internet is back.
What am I doing wrong? Is there another way to go about fixing this?
Thanks!
You should recreate the request each time, and you should execute the retries in a loop with a wait between each retry. The wait time should progressively increase with each failure.
E.g.
ExecuteWithRetry (delegate {
// retry the whole connection attempt each time
HttpWebRequest request = ...;
response = request.GetResponse();
...
});
private void ExecuteWithRetry (Action action) {
// Use a maximum count, we don't want to loop forever
// Alternatively, you could use a time-based limit (e.g. try for up to 30 minutes)
const int maxRetries = 5;
bool done = false;
int attempts = 0;
while (!done) {
attempts++;
try {
action ();
done = true;
} catch (WebException ex) {
if (!IsRetryable (ex)) {
throw;
}
if (attempts >= maxRetries) {
throw;
}
// Back-off and retry a bit later, don't just repeatedly hammer the connection
Thread.Sleep (SleepTime (attempts));
}
}
}
private int SleepTime (int retryCount) {
// I just made these times up; choose correct values depending on your needs.
// Progressively increase the wait time as the number of attempts increases.
switch (retryCount) {
case 0: return 0;
case 1: return 1000;
case 2: return 5000;
case 3: return 10000;
default: return 30000;
}
}
private bool IsRetryable (WebException ex) {
return
ex.Status == WebExceptionStatus.ReceiveFailure ||
ex.Status == WebExceptionStatus.ConnectFailure ||
ex.Status == WebExceptionStatus.KeepAliveFailure;
}
I think what you are trying to do is this:
HttpWebResponse RetryGetResponse(HttpWebRequest request)
{
while (true)
{
try
{
return (HttpWebResponse)request.GetResponse();
}
catch (WebException ex)
{
if (ex.Status != WebExceptionStatus.ReceiveFailure &&
ex.Status != WebExceptionStatus.ConnectFailure &&
ex.Status != WebExceptionStatus.KeepAliveFailure)
{
throw;
}
}
}
}
When you want to retry something on failure, instead of thinking of it as something to do when a call fails, think of it as looping until you succeed (or until you hit a failure that you don't want to retry). The above will keep on retrying until either a response is returned or a different exception is thrown.
It would also be a good idea to introduce a maximum retry limit of some sort (for example stop retrying after 1 hour).
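A time-based limit like that could be sketched as follows. TimedRetry, Execute and all parameter names are hypothetical; isRetryable stands in for a check such as the WebExceptionStatus test above, and the action is expected to build a fresh request on every attempt.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class TimedRetry
{
    // Calls action() until it succeeds. A failure is rethrown when it is not
    // retryable or when another delay would exceed the overall time budget.
    public static T Execute<T>(Func<T> action, Func<Exception, bool> isRetryable,
                               TimeSpan maxDuration, TimeSpan delay)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                return action();
            }
            catch (Exception ex)
            {
                if (!isRetryable(ex) || clock.Elapsed + delay >= maxDuration)
                    throw;

                Thread.Sleep(delay); // pause before the next attempt
            }
        }
    }
}
```

A fixed delay keeps the sketch short; a delay that grows with each failed attempt would fit the same structure.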
If it's still doing it when you get the connection back - then my guess is that it's simply returning the same result again.
You might want to try recreating the request anew each time; other than that, I don't see much wrong with the code or logic, apart from the fact that you're blocking this thread forever. But that might be okay if this is, in itself, a worker thread.