Starting application pools with WMI\ADSI (C#) hangs immediately after reboot - c#

I have encountered a strange situation: starting an application pool from a Windows service (written in C#, set to "Automatic" startup) using WMI or ADSI hangs when it is done immediately after the server reboots.
I'll describe the issue:
We are developing a large application (Windows 2003 Server SP2, IIS 6.0) which contains the following main processes (these processes are invoked & initialized using a windows service startup procedure when the application is started):
1) XServer1.exe, XServer2.exe - These processes are native COM-Exe servers. They contain some logic, but mainly supply COM objects to other processes via DCOM (mainly .NET2COM interop calls & pure COM calls). For example, some of the classic ASP "Application Scope static objects" (w3wp.exe) are COM objects which "live" inside these processes.
2) dllhost.exe - this is a COM+ application. Some of our DLLs are loaded into this process which acts as a "state server" (the same idea as the ASP.NET out-of-proc sessions server, but for classic ASP pages).
3) 3 different IIS application pools (we'll call them appPool1\2\3) - containers of our ASP pages, ASP.NET pages, WCF services etc. Code (native C++ COM dlls & C#) in these application pools (w3wp.exe's) usually makes DCOM calls to the processes described in (1) & (2). Only appPool1 can be configured as a Web Garden.
In order to Start\Stop our application we have written a windows Service (C#) which controls these procedures. Our service process is called XWinService.exe. The service depends on the following windows services (the list began with the first 4 services, ongoing tries made the list like this...):
W3SVC
aspnet_state
COMSysApp
DcomLaunch
winmgmt
lanmanserver
lanmanworkstation
seclogon
Browser
TermService
The summary of the Stop procedure of the application (implemented by the service):
1) Stop all 3 IIS application pools (appPool1\2\3) - This prevents w3wp.exe processes from coming alive while the application is shutting down. It is implemented with WMI from C# (System.Management.dll).
2) Stop XServer1\2.exe
3) Stop the COM+ application (dllhost.exe).
The summary of the Start procedure of the application (implemented by the service):
1) Execute the Stop procedure - This ensures that no HTTP hits will wake a w3wp.exe process before it's time.
2) Invokes & initializes the XServer1\2.exe COM-Exe servers - Initialization is required prior to any w3wp.exe invocation. The w3wp.exe processes can access these servers only after certain objects have been initialized. This is implemented by .NET2COM interop (eventually DCOM).
3) Invokes & initializes the dllhost.exe (COM+ application) process - This is implemented by the ComAdmin Catalog API (C#).
4) Starts our 3 application pools - This allows incoming HTTP hits to wake w3wp.exe processes and start serving requests.
This is the C# code responsible for starting\stopping the application pools (WMI). It runs in our service process (XWinService.exe):
ConnectionOptions co = new ConnectionOptions();
ManagementScope scope = new ManagementScope(@"\\localhost\root\MicrosoftIISV2", co);
foreach (string appPool in AppPools)
{
    string objPath = string.Format("IISApplicationPool.Name='W3SVC/AppPools/{0}'", appPool);
    using (ManagementObject mc = new ManagementObject(objPath))
    {
        mc.Scope = scope;
        if (Operation.ToLower() == "start")
        {
            mc.InvokeMethod("Start", null, null); // ### The problematic line of code ###
        }
        else if (Operation.ToLower() == "stop")
        {
            mc.InvokeMethod("Stop", null, null);
        }
        else if (Operation.ToLower() == "recycle")
        {
            mc.InvokeMethod("Recycle", null, null);
        }
    }
}
 
Now the issue:
Prior to rebooting the server, starting the service manually (from the services.msc tool) succeeds without any problems; stopping it is also OK. We set the service to "Automatic" startup, so that it starts when the server (Win2K3 SP2) boots, and rebooted the server. When the server came up (the login screen appeared), our service was stuck (status = "Starting") and NEVER started (it hung for 2 days!).
Analyzing the processes revealed the following:
1) The XWinService.exe process was stuck on the problematic line of code (### above ###). It hung for 2 days until we killed the process. Please note: Shutting down the application pools (the Start procedure begins with a Stop procedure) did not hang!
2) From a DUMP file taken (with DebugDiag tool) from XWinService.exe during this "hang" we can see the thread which is waiting. This is the (native) stack trace of it:
Thread 6 - System ID 2784
Entry point mscorwks!Thread::intermediateThreadProc
Create time 11/19/2009 1:40:05 PM
Time spent in user mode 0 Days 00:00:00.078
Time spent in kernel mode 0 Days 00:00:00.781
This thread is making a COM call to multi-threaded apartment (MTA) in process 884
Function Source
ntdll!KiFastSystemCallRet
ntdll!NtRequestWaitReplyPort+c
rpcrt4!LRPC_CCALL::SendReceive+230
rpcrt4!I_RpcSendReceive+24
ole32!ThreadSendReceive+138
ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+112
ole32!CRpcChannelBuffer::SendReceive2+d3
ole32!CAptRpcChnl::SendReceive+ab
ole32!CCtxComChnl::SendReceive+1a9
rpcrt4!NdrProxySendReceive+43
rpcrt4!NdrClientCall2+206
rpcrt4!ObjectStublessClient+8b
rpcrt4!ObjectStubless+f
….
This thread is calling (via DCOM) a component in process 884, which is svchost.exe, running the following services: AeLookupSvc, AudioSrv, Browser, CryptSvc, dmserver, EventSystem, helpsvc, lanmanserver, lanmanworkstation, Schedule, seclogon, SENS, ShellHWDetection, TrkWks, winmgmt, wuauserv, WZCSVC.
As you can see the "winmgmt" service (responsible for WMI) is running in this process and our service depends on it, so our service will start after winmgmt is started (the same for IIS W3SVC service).
The svchost.exe process (884) was dumped and we can see a thread (waiting for a DCOM call to end) accessing process 2880, which is wmiprvse.exe (I guess this is the WMI server. I don't know if it's relevant, but there were 2 instances of this process). This is the native call stack of the thread (in svchost.exe):
Thread 48 - System ID 3816
Entry point wbemcore!CCoreQueue::_ThreadEntry
Create time 11/19/2009 1:40:56 PM
Time spent in user mode 0 Days 00:00:00.00
Time spent in kernel mode 0 Days 00:00:00.00
This thread is making a COM call to multi-threaded apartment (MTA) in process 2880
Function Source
ntdll!KiFastSystemCallRet
ntdll!NtRequestWaitReplyPort+c
rpcrt4!LRPC_CCALL::SendReceive+230
rpcrt4!I_RpcSendReceive+24
ole32!ThreadSendReceive+138
ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+112
ole32!CRpcChannelBuffer::SendReceive2+d3
ole32!CAptRpcChnl::SendReceive+ab
ole32!CCtxComChnl::SendReceive+1a9
…
3) Setting our service to "Manual" and starting it (manually - after logging into the server or starting it remotely from a different server immediately after reboot) is OK - nothing hangs.
4) We deleted our service (from the registry!) and placed a batch file in the Windows "startup" folder. This batch file calls the service's code, but runs it as a normal C# executable. After a server reboot, it also hung on the same problematic line of code (again... for 2 days until we killed it).
5) Using ADSI (System.DirectoryServices) instead of WMI had the same results (starting the application pools hung!).
We have been digging into this for the past 2 weeks...
My questions:
==========
1) Did anyone encounter the same issue?
2) Does anyone know why it hangs? Is there any additional service dependency we should take in mind?
3) Does anyone have a solution for this issue?
4) Why is this happening after a reboot only when the service to set to "Automatic" startup? If we do it manually - everything is Ok!
Small update:
We have noticed that on VMs (VMware stations) the service hangs after reboot for an average of ~40min, until it starts (note: it never fails to start, but 40min is way too much). An event log message is recorded in the system event log stating that our service hung for more than 16min (source: Service Control Manager, Event ID: 7044).
On "regular" machines (bare metal) the average time until the service starts is ~55 hours!!! Again, an event log entry is recorded as described above.
The average values were calculated from 10 different VMs & 8 different "real" servers.

I see no one has responded, but I'll post some news anyway...
We have found out, that prior to starting the application pools, setting the service status to "Started" and opening a new thread (new Thread(...)) which runs the code above (starting the app pools with WMI) solves the issue.
This is the pseudo code of the OnStart method of the service:
OnStart {
    StopProcedure();
    InvokeInitXServer1And2(); // COM-Exe servers
    InvokeInitCOMPlusApplication(); // dllhost.exe
    SetServiceStatus(SERVICE_STARTED);
    Thread worker = new Thread(new ThreadStart(IISAppPoolStartWMI)); // Calls the code above
    worker.Start();
}
This is the only way the service starts in a reasonable time (max of 3 min, avg of ~1.5 min, on both real machines and VMs) and a w3wp.exe process is started.
If anyone has an explanation for it (MTA\STA issues?!?!?) I'll be happy to read it.
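The workaround described above (report "started" first, then do the slow step on another thread) can be sketched without any service plumbing. This is only an illustration under assumptions: the class name DeferredStartSketch, the Finished event and the Error property are hypothetical, and a real service would derive from System.ServiceProcess.ServiceBase and pass the WMI app-pool start as the slow step.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the service class; a real service would derive
// from System.ServiceProcess.ServiceBase and do this inside OnStart.
public sealed class DeferredStartSketch
{
    private readonly Action _slowStep;   // e.g. the WMI app-pool start
    private Thread _worker;

    // Signalled when the background step has finished, successfully or not.
    public ManualResetEvent Finished { get; } = new ManualResetEvent(false);

    // Holds the exception thrown by the slow step, if any.
    public Exception Error { get; private set; }

    public DeferredStartSketch(Action slowStep)
    {
        if (slowStep == null) throw new ArgumentNullException(nameof(slowStep));
        _slowStep = slowStep;
    }

    // Returns immediately, so the caller (the Service Control Manager in a
    // real service) is not blocked while the slow step runs.
    public void OnStart()
    {
        _worker = new Thread(RunSlowStep) { IsBackground = true, Name = "AppPoolStarter" };
        _worker.Start();
    }

    private void RunSlowStep()
    {
        try { _slowStep(); }
        catch (Exception ex) { Error = ex; }   // a real service would log this
        finally { Finished.Set(); }
    }
}
```

The trade-off is that the service reports "Started" before the application pools are actually up, so anything that depends on them has to check separately (here, by waiting on Finished and inspecting Error).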

Related

C# COM application crash debugging

The C# console application running .NET 4.5.2 (app1) opens a COM application (app2) and does some work with that app2's API. So far all of the work is successful, but sometimes when app1 attempts to close app2, app2 hangs permanently.
If the process for app2 is ended with task manager then app1 reports access denied. Does that occur because the terminated process is no longer available or does it occur because it was blocking a thread in app1 and it was unable to report the error until the thread was allowed to continue?
The code used to terminate app2 is
private static void CloseSW(SldWorks swApp, Process sw_proces)
{
// Close with API call
if (Task.Run(() => { swApp.CloseAllDocuments(true); swApp.ExitApp(); }).Wait(TimeSpan.FromSeconds(20)))
return;
// Kill process if API call failed
if (Task.Run(() => { SWHelper.CloseSW(sw_proces); }).Wait(TimeSpan.FromSeconds(20)))
return;
// Unable to close SolidWorks, ignore error and continue
// This will eventually cause SolidWorks to crash and the crash handler will take over
}
This code should never take much more than 40 seconds to complete, but maybe the COM interop is causing some unexpected behaviour?
I am unable to reproduce this error on a development machine. What is the best way to trace the exact point of failure? It is possible that the failure is not in CloseSW but at some point before this. Is there a better way to trace the error than to write each line to a log file?
It is also worth noting that this code works for 60 - 150 runs before it has any errors and both applications are closed between each run.
I have control of the remote environment so remote debugging is an option, but I've never set that up before.
Typically, when COM interop causes issues, it is because IIS is having trouble with the object using the current ISAPI.dll. Please verify that the permissions in your assembly are configured to work with your current version of IIS.
A few questions to help assist would be, which framework version are you using, which version of IIS and what is your Application Pool using for a framework.
HTH

Best way to implement windows service as "worker"

I have an ASP .NET page which allows users to start programs. These programs and the parameter are stored in a database and a windows service then executes these programs.
The programs are dlls which implements my IPlugin interface, so I can add them at runtime (the dlls are loaded at runtime so I can add them at runtime without compiling or restarting the service).
I created the ASP .NET page, more than 10 programs (plugins) and the windows service. Everything is running fine, but I think the implementation of the windows service is bad.
The windows service periodically queries the database and executes the needed program if it gets a new entry. The service can run multiple programs in parallel (at the moment 3 programs).
Currently my service method looks like this:
while (Alive)
{
// gets all running processes from the database
Processes = Proc.GetRunningProcs();
// if there are less than 3 processes running and
// a process is in queue
if (ReadyToRun())
{
// get next program from queue, set the status to
// running and update the entry in the database
Proc.ProcData proc = GetNextProc();
proc.Status = Proc.ProcStatus.Running;
Proc.Update(proc);
// create a new thread and execute the program
Thread t = new Thread(new ParameterizedThreadStart(ExecuteProc));
t.IsBackground = true;
t.Start(proc);
}
Thread.Sleep(1000);
}
I have a method that queries the database for entries with status 'Canceling' (if a user cancels a program, the status will be set to 'Canceling') and does a Thread.Abort().
Is there a better practice, like using tasks with the cancel mechanism? Or is the whole concept (storing the processes in the database (program name, parameter, status, ...) and querying this information periodically) wrong?
As an alternative, you can use an existing library such as Quartz.NET http://www.quartz-scheduler.net/. It takes care of job persistence, job scheduling and many other things. All you have to do is create an adapter and put it into the Windows service.
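The "tasks with the cancel mechanism" idea raised in the question could look roughly like the sketch below, which replaces Thread.Abort() with cooperative cancellation and caps parallelism with a semaphore. Everything here is hypothetical: JobRunnerSketch, the integer job ids and the Action<CancellationToken> delegate stand in for the question's ProcData rows and IPlugin calls, and each plugin is assumed to check the token itself.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical runner; not the asker's actual service code.
public sealed class JobRunnerSketch
{
    private readonly SemaphoreSlim _slots;
    private readonly ConcurrentDictionary<int, CancellationTokenSource> _running =
        new ConcurrentDictionary<int, CancellationTokenSource>();

    public JobRunnerSketch(int maxParallel)
    {
        _slots = new SemaphoreSlim(maxParallel, maxParallel);
    }

    // Waits for a free slot, then runs the job; the job must poll the token.
    public async Task RunAsync(int jobId, Action<CancellationToken> work)
    {
        await _slots.WaitAsync().ConfigureAwait(false);
        var cts = new CancellationTokenSource();
        _running[jobId] = cts;
        try
        {
            await Task.Run(() => work(cts.Token), cts.Token).ConfigureAwait(false);
        }
        finally
        {
            CancellationTokenSource removed;
            _running.TryRemove(jobId, out removed);
            cts.Dispose();
            _slots.Release();
        }
    }

    // Intended to be called when the database row switches to 'Canceling'.
    // Returns false if the job is not (or no longer) running.
    public bool Cancel(int jobId)
    {
        CancellationTokenSource cts;
        if (!_running.TryGetValue(jobId, out cts)) return false;
        try
        {
            cts.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            return false;   // the job finished between the lookup and the cancel
        }
    }
}
```

Unlike Thread.Abort(), this only stops a job at the points where the job itself calls token.ThrowIfCancellationRequested(), so a plugin that never checks the token cannot be cancelled this way.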

Problem with calling Console application (WCF Service) from webform

I am using a ASP.net webform application to run an existing console application which get all records from DB and send them through a third party WCF service. Locally everything is working fine. When I run the application it opens the console, gets the records and sends them. But now I pushed my files over to Test server along with the exe file and related config files. But when I access the application through the browser (test url) I get the same error message time and again and I don't see the console window. Sometimes everything works fine but never two times in a row.
The error message is:
"There was no end point listening at '.....svc' that could accept message. This is often caused by incorrect address or soap action.
System.net.webexception. Remote name could not be resolved
at System.Net.HttpWebRequest.GetRequestStream
at System.ServiceModel.Channels.HttpOutput.Webrequest.HttpOutput.GetOutputStream()
The code I have used in the webform to call console application is:
ProcessStartInfo p = new ProcessStartInfo();
p.Arguments = _updateNow.ToString();
p.FileName="something";
p.UseShellExecute = false;// tried true too without luck
Process.Start(p);
Error message denotes "there is no end point" and sounds like there is problem with the WCF service but if I double click the executable in Test there is no problem. What could be the possible problem or should I redo the console application functionality to my main webform application?
Update: After adding Thread.Sleep(3000) after Process.Start(p), I'm having no problem. So seems like main application is not waiting for the batch process to complete. How to solve this problem?
It seems like there is a short delay between starting the console application and the WCF web service becoming initialised and available to use. This is to be expected.
You could either:
Work around the issue using Thread.Sleep() and possibly with a couple of catch - retry blocks.
Have the console application report to the creating process when it is ready to receive requests (for example by having it write to the standard output and using redirected streams).
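The second option (the child reports readiness on its standard output) could be sketched as follows. The helper name and the "READY" marker are assumptions for illustration; the console application would have to be changed to print that line once its WCF client is usable.

```csharp
using System;
using System.IO;

public static class ReadySignalSketch
{
    // Reads lines until one equals the marker. Returns false if the stream
    // ends first, i.e. the child exited without ever reporting readiness.
    public static bool WaitForReadyLine(TextReader childOutput, string marker)
    {
        string line;
        while ((line = childOutput.ReadLine()) != null)
        {
            if (string.Equals(line.Trim(), marker, StringComparison.Ordinal))
                return true;
        }
        return false;
    }
}
```

With a ProcessStartInfo that sets UseShellExecute = false and RedirectStandardOutput = true, the started process's StandardOutput reader can be passed as childOutput. Note that ReadLine blocks, so this sketch has no timeout of its own.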
However, at this point I'd probably reconsider the architecture slightly. Starting a new process is relatively costly, and on top of that initialising a WCF service is relatively costly too. If this is being done once per request, then as well as the above timing issues you are also incurring performance penalties.
Is it not possible to change the architecture slightly so that a single external process (for example a Windows service) is used for all requests instead of spawning a new process each time?

How do I wait until a console application is idle?

I have a console application that starts up, hosts a bunch of services (long-running startup), and then waits for clients to call into it. I have integration tests that start this console application and make "client" calls. How do I wait for the console application to complete its startup before making the client calls?
I want to avoid doing Thread.Sleep(int) because that's dependent on the startup time (which may change) and I waste time if the startup is faster.
Process.WaitForInputIdle works only on applications with a UI (and I confirmed that it does throw an exception in this case).
I'm open to awkward solutions like, have the console application write a temp file when it's ready.
One option would be to create a named EventWaitHandle. This creates a synchronization object that you can use across processes. Then you have your 'client' applications wait until the event is signalled before proceeding. Once the main console application has completed the startup it can signal the event.
http://msdn.microsoft.com/en-us/library/41acw8ct(VS.80).aspx
As an example, your "Server" console application might have the following. This is not compiled so it is just a starting point :)
using System.Threading;
static EventWaitHandle _startedEvent;
static void Main()
{
    _startedEvent = new EventWaitHandle(false, EventResetMode.ManualReset, @"Global\ConServerStarted");
    DoLongRunnningInitialization();
    // Signal the event so that all the waiting clients can proceed
    _startedEvent.Set();
}
The clients would then be doing something like this
using System.Threading;
static void Main()
{
    EventWaitHandle startedEvent = new EventWaitHandle(false, EventResetMode.ManualReset, @"Global\ConServerStarted");
    // Wait for the event to be signalled; if it is already signalled, this falls through immediately.
    startedEvent.WaitOne();
    // ... continue communicating with the server console app now ...
}
What about setting a mutex, and removing it once start up is done. Have the client app wait until it can grab the mutex before it starts doing things.
Include an is ready check in the app's client interface, or have it return a not ready error if called before it's ready.
Create a WCF service that you can use for querying the status of the server process. Only start this service if a particular command is passed on the command line. The following traits will ensure a very fast startup of this service:
Host this service as the first operation of the client application
Use the net.tcp or net.pipe binding because they start very quickly
Keep this service as simple as possible to ensure that as long as the console application doesn't terminate, it will remain available
The test runner can attempt to connect to this service. Retry the attempt if it fails until the console application terminates or a reasonably short timeout period expires. As long as the console application doesn't terminate unexpectedly you can rely on this service to provide any additional information before starting your tests in a reasonably short period of time.
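The retry loop described above might be sketched like this. The probe delegate is a placeholder for whatever check the test runner uses (for example, trying to connect to the status service); the helper name and parameters are assumptions, and the sketch only covers the timeout, not the "console application has terminated" exit condition.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ReadinessPollSketch
{
    // Calls probe repeatedly until it returns true or the timeout elapses.
    // A probe that throws is treated the same as "not ready yet".
    public static bool WaitUntilReady(Func<bool> probe, TimeSpan timeout, TimeSpan interval)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                if (probe()) return true;
            }
            catch (Exception)
            {
                // e.g. connection refused while the server is still starting
            }
            if (clock.Elapsed >= timeout) return false;
            Thread.Sleep(interval);
        }
    }
}
```

Compared with a fixed Thread.Sleep, the tests start as soon as the probe succeeds and still fail within a bounded time if the server never comes up.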
Since the two (the console application and the integration test app that makes the client calls, as I understand it) are separate applications, there should be a mechanism, a bridge, that acts as a mediator between them (a socket, an external file, the registry, etc.).
Another possibility could be that you come up with an average time the console takes to load the services and use that time in your test app; well, just thinking out loud!

Why does my .NET service start really slow on a XP boot

I have a .NET Windows service which acts as a host for some WCF services. In the OnStart method the service hosts are created and started. The service is configured to start up automatically. This works well on Windows 7 (32bit and 64bit), and it can be started with "net start" on Windows XP Pro SP3. The service startup with the "net start" command takes about 20 seconds.
But when Windows XP Pro SP3 is booting, there's a timeout message in the event log. The service itself does not fail to start, though the services that depend on it do. The problem can be reproduced on various XP machines. Core count and memory do not have an influence. The updates are up to date.
Now it's getting curious: I analyzed the trace and found out that the service is taking about 60 seconds for startup. Thus I've added a call to RequestAdditionalTime(480000). But now the service takes slightly more than 480 seconds. The relation is obvious. The time is consumed in the following code section:
var asyncResults = new List<IAsyncResult>();
foreach (var host in myHosts)
asyncResults.Add(host.BeginOpen(null, host));
// wait until finished
while (asyncResults.Count != 0)
{
IAsyncResult ar = asyncResults[0];
if (!ar.IsCompleted) ar.AsyncWaitHandle.WaitOne(1000);
if (ar.IsCompleted)
{
asyncResults.Remove(ar);
var co = (ICommunicationObject)ar.AsyncState;
try
{
co.EndOpen(ar);
}
catch (Exception ex)
{
...
}
}
}
Do you have any idea what's happening here?
Hey, I found the resolution myself by doing some intensive log research.
In the event log there were some services which started AFTER the timeout of my service had been reached. As my service is running as a special user, I could detect two services which were actually triggered by my own service. Thus I added those to the service's dependencies and it works.
I wonder if there's documentation where the dependencies of WCF are listed.
For reference, here are the services my service depends on:
http
RPCSS
CryptSvc
HTTPFilter
RasMan
The latter two were those causing the deadlock.
