How to use an interactive command line program from another .NET program - c#

I need to write a wrapper for an interactive command line program.
That means I need to be able to send commands to the other program via its standard input and receive the response via its standard output.
The problem is that the standard output stream seems to be blocked while the input stream is still open. As soon as I close the input stream I get the response, but then I cannot send any more commands.
This is what I am using at the moment (mostly from here):
void Main() {
Process process;
process = new Process();
process.StartInfo.FileName = "atprogram.exe";
process.StartInfo.Arguments = "interactive";
// Set UseShellExecute to false for redirection.
process.StartInfo.UseShellExecute = false;
process.StartInfo.CreateNoWindow = true;
// Redirect the standard output of the command.
// This stream is read asynchronously using an event handler.
process.StartInfo.RedirectStandardOutput = true;
// Set our event handler to asynchronously read the output.
process.OutputDataReceived += (s, e) => Console.WriteLine(e.Data);
// Redirect standard input as well. This stream is used synchronously.
process.StartInfo.RedirectStandardInput = true;
process.Start();
// Start the asynchronous read of the output stream.
process.BeginOutputReadLine();
String inputText;
do
{
inputText = Console.ReadLine();
if (inputText == "q")
{
process.StandardInput.Close(); // After this line the output stream unblocks
Console.ReadLine();
return;
}
else if (!String.IsNullOrEmpty(inputText))
{
process.StandardInput.WriteLine(inputText);
}
} while (true);
}
I also tried reading the standard output stream synchronously, but with the same result. Any method call on the output stream blocks indefinitely until the input stream is closed, even Peek() and EndOfStream.
Is there any way to communicate with the other process in a full duplex kind of way?

I tried to reproduce your problem with a small test suite of my own.
Instead of using event handlers I do it in the most trivial way I could conceive: Synchronously. This way no extra complexity is added to the problem.
Here is my little "echoApp", which I wrote in Rust just for giggles, and also to have a chance of running into the eternal line termination wars problem (\n vs \r vs \r\n). Depending on the way your command line application is written, this could indeed be one of your problems.
use std::io;
fn main() {
let mut counter = 0;
loop {
let mut input = String::new();
let _ = io::stdin().read_line(&mut input);
match &input.trim() as &str {
"quit" => break,
_ => {
println!("{}: {}", counter, input);
counter += 1;
}
}
}
}
And, being too lazy to create a solution for such a small test, I used F# instead of C# for the controlling side. It is easy enough to read, I think:
open System.Diagnostics;
let echoPath = @"E:\R\rustic\echo\echoApp\target\debug\echoApp.exe"
let createControlledProcess path =
let p = new Process()
p.StartInfo.UseShellExecute <- false
p.StartInfo.RedirectStandardInput <- true
p.StartInfo.RedirectStandardOutput <- true
p.StartInfo.Arguments <- ""
p.StartInfo.FileName <- path
p.StartInfo.CreateNoWindow <- true
p
let startupControlledProcess (p : Process) =
if p.Start()
then
p.StandardInput.NewLine <- "\r\n"
else ()
let shutdownControlledProcess (p : Process) =
p.StandardInput.WriteLine("quit");
p.WaitForExit()
p.Close()
let interact (p : Process) (arg : string) : string =
p.StandardInput.WriteLine(arg);
let o = p.StandardOutput.ReadLine()
// Every other line comes back empty: the echo app prints the input with its
// trailing line break still attached, and println! then adds a second one.
if o = "" then p.StandardOutput.ReadLine()
else o
let p = createControlledProcess echoPath
startupControlledProcess p
let results =
[
interact p "Hello"
interact p "World"
interact p "Whatever"
interact p "floats"
interact p "your"
interact p "boat"
]
shutdownControlledProcess p
Executing this in F# Interactive (Ctrl+A, Alt+Enter in Visual Studio) yields:
val echoPath : string = "E:\R\rustic\echo\echoApp\target\debug\echoApp.exe"
val createControlledProcess : path:string -> Process
val startupControlledProcess : p:Process -> unit
val shutdownControlledProcess : p:Process -> unit
val interact : p:Process -> arg:string -> string
val p : Process = System.Diagnostics.Process
val results : string list =
["0: Hello"; "1: World"; "2: Whatever"; "3: floats"; "4: your"; "5: boat"]
val it : unit = ()
I could not reproduce any blocking or deadlocks etc.
So, in your case I would investigate whether your NewLine property needs some tweaking (see the function startupControlledProcess). If the controlled application does not recognize an input as a complete line, it might not respond because it is still waiting for the rest of the line, and you would get the effect you describe.

process.BeginOutputReadLine();
This does not work as expected, because it waits until the output stream is closed. That happens when the process ends, and the process ends only once its input stream is closed.
As a workaround, combine process.StandardOutput.ReadLine() with asynchronous handling that you implement yourself.
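A minimal sketch of that workaround, assuming a dedicated background task is acceptable: one task drains the output with ReadLine() and queues the lines, while the calling thread stays free to write to standard input. OutputPump and PumpLines are hypothetical names, and the demo uses a StringReader in place of process.StandardOutput so that it runs without a child process.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

static class OutputPump
{
    // Hypothetical helper: reads lines on a background task and queues them,
    // so the caller can keep writing to the child's standard input meanwhile.
    public static Task PumpLines(TextReader reader, BlockingCollection<string> lines)
    {
        return Task.Run(() =>
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
            lines.CompleteAdding(); // end of stream: the child closed its output
        });
    }

    static void Main()
    {
        // Stand-in for process.StandardOutput; with a real child process you
        // would pass process.StandardOutput after process.Start().
        var fakeOutput = new StringReader("0: Hello\n1: World\n");
        var lines = new BlockingCollection<string>();
        Task pump = PumpLines(fakeOutput, lines);

        // In a real wrapper, process.StandardInput.WriteLine(...) calls go here.
        foreach (string line in lines.GetConsumingEnumerable())
        {
            Console.WriteLine(line);
        }
        pump.Wait();
    }
}
```

Note that no reader on the parent side can help if the child itself buffers its output when writing to a pipe; in that case the data simply has not been written yet.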

Related

Why does calling the Tesseract process cause this service to crash randomly?

I have a .NET Core 2.1 service which runs on an Ubuntu 18.04 VM and calls Tesseract OCR 4.00 via a Process instance. I would like to use an API wrapper, but I could only find one available and it is only in beta for the latest version of Tesseract -- the stable wrapper uses version 3 instead of 4. In the past, this service worked well enough, but I have been changing it so that document/image data is written and read from disk less frequently in an attempt to improve speed. The service used to call many more external processes (such as ImageMagick) which were unnecessary due to the presence of an API, so I have been replacing those with API calls.
Recently I've been testing this with a sample file taken from real data. It's a faxed document PDF that has 133 pages, but is only 5.8 MB in spite of that due to grayscale and resolution. The service takes a document, splits it into individual pages, then assigns multiple threads (one thread per page) to call Tesseract and process them using Parallel.For. The thread limits are configurable. I am aware that Tesseract has its own multithreading environment variable (OMP_THREAD_LIMIT). I found in prior testing that setting it to "1" is ideal for our set up at the moment, but in my recent testing for this issue I have tried leaving it unset (dynamic value) with no improvement.
The issue is that unpredictably, when Tesseract is called, the service will hang for about a minute and then crash, with the only error showing in journalctl being:
dotnet[32328]: Error while reaping child. errno = 10
dotnet[32328]: at System.Environment.FailFast(System.String, System.Exception)
dotnet[32328]: at System.Environment.FailFast(System.String)
dotnet[32328]: at System.Diagnostics.ProcessWaitState.TryReapChild()
dotnet[32328]: at System.Diagnostics.ProcessWaitState.CheckChildren(Boolean)
dotnet[32328]: at System.Diagnostics.Process.OnSigChild(Boolean)
I can't find anything at all online for this particular error. It would seem to me, based on related research I've done on the Process class, that this is occurring when the process is exiting and dotnet is trying to clean up the resources it was using. I'm really at a loss as to how to even approach this problem, although I have tried a number of "guesses" such as changing thread limit values. There is no cross-over between threads. Each thread has its own partition of pages (based on how Parallel.For partitions a collection) and it sets to work on those pages, one at a time.
Here is the process call, called from within multiple threads (8 is the limit we normally set):
private bool ProcessOcrPage(IMagickImage page, int pageNumber, object instanceId)
{
var inputPageImagePath = Path.Combine(_fileOps.GetThreadWorkingDirectory(instanceId), $"ocrIn_{pageNumber}.{page.Format.ToString().ToLower()}");
string outputPageFilePathWithoutExt = Path.Combine(_fileOps.GetThreadOutputDirectory(instanceId),
$"pg_{pageNumber.ToString().PadLeft(3, '0')}");
page.Write(inputPageImagePath);
var cmdArgs = $"-l eng \"{inputPageImagePath}\" \"{outputPageFilePathWithoutExt}\" pdf";
bool success;
_logger.LogStatement($"[Thread {instanceId}] Executing the following command:{Environment.NewLine}tesseract {cmdArgs}", LogLevel.Debug);
var psi = new ProcessStartInfo("tesseract", cmdArgs)
{
RedirectStandardError = true,
RedirectStandardOutput = true,
UseShellExecute = false,
CreateNoWindow = true
};
// 0 is not the default value for this environment variable. It should remain unset if there
// is no config value, as it is determined dynamically by default within OpenMP.
if (_processorConfig.TesseractThreadLimit > 0)
psi.EnvironmentVariables.Add("OMP_THREAD_LIMIT", _processorConfig.TesseractThreadLimit.ToString());
using (var p = new Process() { StartInfo = psi })
{
string standardErr, standardOut;
int exitCode;
p.Start();
standardOut = p.StandardOutput.ReadToEnd();
standardErr = p.StandardError.ReadToEnd();
p.WaitForExit();
exitCode = p.ExitCode;
if (!string.IsNullOrEmpty(standardOut))
_logger.LogStatement($"Tesseract stdOut:\n{standardOut}", LogLevel.Debug, nameof(ProcessOcrPage));
if (!string.IsNullOrEmpty(standardErr))
_logger.LogStatement($"Tesseract stdErr:\n{standardErr}", LogLevel.Debug, nameof(ProcessOcrPage));
success = p.ExitCode == 0;
}
return success;
}
EDIT 4: After much testing and discussion with Clint in chat, here is what we learned. The error is raised from a Process event "OnSigChild," that much is obvious from the stack trace, but there is no way to hook into the same event that raises this error. The process never times out given a timeout of 10 seconds (Tesseract typically only takes a few seconds to process a given page). Curiously, if the process timeout is removed and I wait on the standard output and error streams to close, it will hang for a good 20-30 seconds, but the process does not appear in ps auxf during this hang time. From the best that I can tell, Linux is able to determine that the process is done executing, but .NET is not. Otherwise, the error seems to be raised at the very moment that the process is done executing.
The most baffling thing to me is still that the process handling part of the code really hasn't changed very much compared to the working version of this code we have in production. This suggests that it's an error I made somewhere, but I am simply unable to find it. I think I will have to open up an issue on the dotnet GitHub tracker.
"Error while reaping child"
Processes hold some resources in the kernel. On Unix, when the parent dies, the init process is responsible for cleaning up the kernel resources of both zombie and orphan processes (also known as reaping the child). .NET Core reaps child processes as soon as they terminate.
"I have discovered that removing the stdout and stderr stream ReadToEnd
calls causes the processes to end immediately instead of hang, with
the same error"
The error is due to the fact that you are calling p.ExitCode prematurely, before the process has finished; the ReadToEnd calls merely delay that access.
Summary of updated code
StartInfo.FileName should point to a filename that you want to start
Set UseShellExecute to false if the process should be created directly from the executable file, and to true if the shell should be used to start the process.
Added asynchronous read operations on the standard output and error streams.
AutoResetEvents signal when the output and error read operations complete.
Process.Close() releases the resources.
It is easier to set and use ArgumentList than the Arguments property.
Redhat Blog on NetProcess on Linux
Revised Module
private bool ProcessOcrPage(IMagickImage page, int pageNumber, object instanceId)
{
StringBuilder output = new StringBuilder();
StringBuilder error = new StringBuilder();
int exitCode;
var inputPageImagePath = Path.Combine(_fileOps.GetThreadWorkingDirectory(instanceId), $"ocrIn_{pageNumber}.{page.Format.ToString().ToLower()}");
string outputPageFilePathWithoutExt = Path.Combine(_fileOps.GetThreadOutputDirectory(instanceId),
$"pg_{pageNumber.ToString().PadLeft(3, '0')}");
page.Write(inputPageImagePath);
var cmdArgs = $"-l eng \"{inputPageImagePath}\" \"{outputPageFilePathWithoutExt}\" pdf";
bool success;
_logger.LogStatement($"[Thread {instanceId}] Executing the following command:{Environment.NewLine}tesseract {cmdArgs}", LogLevel.Debug);
using (var outputWaitHandle = new AutoResetEvent(false))
using (var errorWaitHandle = new AutoResetEvent(false))
{
try
{
using (var process = new Process())
{
process.StartInfo = new ProcessStartInfo
{
WindowStyle = ProcessWindowStyle.Hidden,
FileName = "tesseract", // The question runs on Ubuntu, where the executable has no .exe suffix; verify that this is the program you want to start
RedirectStandardOutput = true,
RedirectStandardError = true,
UseShellExecute = false,
CreateNoWindow = true,
Arguments = cmdArgs,
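WorkingDirectory = Path.GetDirectoryName(inputPageImagePath) // Assumption: the original referenced an undefined variable "path"; adjust to your working directory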
};
if (_processorConfig.TesseractThreadLimit > 0)
process.StartInfo.EnvironmentVariables.Add("OMP_THREAD_LIMIT", _processorConfig.TesseractThreadLimit.ToString());
process.OutputDataReceived += (sender, e) =>
{
if (e.Data == null)
{
outputWaitHandle.Set();
}
else
{
output.AppendLine(e.Data);
}
};
process.ErrorDataReceived += (sender, e) =>
{
if (e.Data == null)
{
errorWaitHandle.Set();
}
else
{
error.AppendLine(e.Data);
}
};
process.Start();
process.BeginOutputReadLine();
process.BeginErrorReadLine();
// Treat the run as timed out if the process or either stream fails to finish in time.
if (!process.WaitForExit(ProcessTimeOutMiliseconds) || !outputWaitHandle.WaitOne(ProcessTimeOutMiliseconds) || !errorWaitHandle.WaitOne(ProcessTimeOutMiliseconds))
{
// Cancel the reads if the process is still producing output after the timeout; this prevents an ObjectDisposedException
process.CancelOutputRead();
process.CancelErrorRead();
Console.ForegroundColor = ConsoleColor.Red;
Console.WriteLine("Timed Out");
//To release allocated resource for the Process
process.Close();
//Timed out
return false;
}
Console.ForegroundColor = ConsoleColor.Green;
Console.WriteLine("Completed On Time");
exitCode = process.ExitCode;
// The captured text lives in the StringBuilders filled by the event handlers.
string standardOut = output.ToString();
string standardErr = error.ToString();
if (!string.IsNullOrEmpty(standardOut))
_logger.LogStatement($"Tesseract stdOut:\n{standardOut}", LogLevel.Debug, nameof(ProcessOcrPage));
if (!string.IsNullOrEmpty(standardErr))
_logger.LogStatement($"Tesseract stdErr:\n{standardErr}", LogLevel.Debug, nameof(ProcessOcrPage));
process.Close();
return exitCode == 0;
}
}
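catch (Exception ex)
{
// Log the failure and report it to the caller, so that every code path returns a value.
_logger.LogStatement($"Tesseract failed: {ex}", LogLevel.Debug, nameof(ProcessOcrPage));
return false;
}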
}
}
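A shorter variant of the same idea, offered as a sketch rather than as the answer's method: instead of event handlers and AutoResetEvents it uses StreamReader.ReadToEndAsync() to drain both streams concurrently, and it reads ExitCode only after WaitForExit() has returned. RunAndCapture and Run are hypothetical names, there is no timeout handling, and the demo assumes that a "dotnet" executable is on the PATH.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class RunAndCapture
{
    // Hypothetical helper: runs a program, drains both streams concurrently,
    // and reads ExitCode only after WaitForExit has returned.
    public static (int ExitCode, string StdOut, string StdErr) Run(string fileName, string arguments)
    {
        var psi = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (var p = new Process { StartInfo = psi })
        {
            p.Start();
            // Start both reads before waiting, so a full stderr pipe cannot
            // block the child while we are still reading stdout.
            Task<string> stdOut = p.StandardOutput.ReadToEndAsync();
            Task<string> stdErr = p.StandardError.ReadToEndAsync();
            p.WaitForExit();
            return (p.ExitCode, stdOut.Result, stdErr.Result);
        }
    }

    static void Main()
    {
        // Assumption: "dotnet" is on the PATH. In the OCR service this would
        // be "tesseract" with the arguments built in ProcessOcrPage.
        var result = Run("dotnet", "--version");
        Console.WriteLine($"exit code {result.ExitCode}");
        Console.WriteLine(result.StdOut);
    }
}
```

Reading the two streams one after the other with ReadToEnd(), as in the question, can stall if the child fills the stderr pipe while the parent is still blocked on stdout; starting both reads up front avoids that.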

Communicate with process in C#

I need to communicate with an external executable (ampl.exe) using standard input and standard output. This exe performs calculations that take several minutes and prints some output to the console. It has a prompt, so I can launch calculations successively through its standard input as soon as the previous calculation has finished.
The external exe is launched as follows:
var myProcess = new Process();
myProcess.StartInfo = new ProcessStartInfo("ampl.exe");
myProcess.StartInfo.CreateNoWindow = true;
myProcess.StartInfo.UseShellExecute = false;
myProcess.StartInfo.RedirectStandardOutput = true;
myProcess.StartInfo.RedirectStandardError = true;
myProcess.StartInfo.RedirectStandardInput = true;
myProcess.Start();
I communicate with it by using myProcess.StandardInput and myProcess.StandardOutput (synchronous way).
I use standard input to launch the calculation, for example:
myProcess.StandardInput.WriteLine("solve;");
I want to wait for the solve statement to finish, get the results from files, prepare the input files for the next calculation and then launch a second solve.
My problem is that I do not know when the first calculation has finished, that is, when the exe is waiting for a new command on its standard input.
The only way I found is to add a specific display command and wait until it appears on standard output:
myProcess.StandardInput.WriteLine("solve;");
myProcess.StandardInput.WriteLine("print 'calculDone';");
string output = myProcess.StandardOutput.ReadLine();
while (!output.Contains("calculDone"))
{
output = myProcess.StandardOutput.ReadLine();
}
Is there another way to do this that avoids the display command?
Edit: following the advice given, I tried the asynchronous way. But I still need to print a marker such as 'commandDone' to know when the solve statement has ended. I do not get the prompt of ampl.exe (which is 'ampl : ') in the standard output of the process.
AutoResetEvent eventEnd = new AutoResetEvent(false);
var myProcess = new Process();
myProcess.StartInfo = new ProcessStartInfo("ampl.exe");
myProcess.StartInfo.CreateNoWindow = true;
myProcess.StartInfo.UseShellExecute = false;
myProcess.StartInfo.RedirectStandardOutput = true;
myProcess.StartInfo.RedirectStandardError = true;
myProcess.StartInfo.RedirectStandardInput = true;
myProcess.EnableRaisingEvents = true;
myProcess.OutputDataReceived += (sender, e) =>
{
if (e.Data == "commandDone")
{
eventEnd.Set();
}
else if (e.Data != null)
{
Console.WriteLine("ampl: {0}", e.Data);
}
};
myProcess.Start();
myProcess.BeginOutputReadLine();
myProcess.StandardInput.WriteLine("solve;");
myProcess.StandardInput.WriteLine("print 'commandDone';");
eventEnd.WaitOne();
The best option would be to use the Process.OutputDataReceived event instead of a tight while loop. It is like the event-based async pattern: you launch an asynchronous task and wait for an event callback telling you it is done. The continuation of the asynchronous task goes in the event handler. Remember to unsubscribe the event handler the first time it fires, otherwise it will keep firing when you don't want it to.
Another option, which I have never tried, is the Process.WaitForInputIdle() method, but I'm not sure it will work in your particular case. If it does, you wouldn't need to write anything to the input stream. Note that it applies only to processes with a user interface and therefore a message loop, so it is unlikely to help with a console program.
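If the marker approach from the question stays, it can be wrapped in a small helper so that each command is sent and awaited in one call. The following is only a sketch under assumptions: SentinelProtocol and SendAndWait are hypothetical names, the helper takes TextWriter and TextReader so the demo can run against in-memory stand-ins, and the print syntax is copied from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class SentinelProtocol
{
    // Hypothetical helper: sends one command followed by a print of the marker,
    // then collects output lines until the marker comes back.
    public static List<string> SendAndWait(TextWriter input, TextReader output, string command, string marker)
    {
        input.WriteLine(command);
        input.WriteLine($"print '{marker}';");
        input.Flush();

        var lines = new List<string>();
        string line;
        while ((line = output.ReadLine()) != null && !line.Contains(marker))
        {
            lines.Add(line);
        }
        return lines;
    }

    static void Main()
    {
        // Stand-ins for myProcess.StandardInput and myProcess.StandardOutput.
        var fakeInput = new StringWriter();
        var fakeOutput = new StringReader("Solving...\nobjective 42\ncalculDone\n");
        List<string> result = SendAndWait(fakeInput, fakeOutput, "solve;", "calculDone");
        Console.WriteLine(string.Join(" | ", result));
    }
}
```

Choosing a marker that cannot occur in normal solver output (for example one containing a random suffix) reduces the risk of stopping early.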

Need to open command line task on different thread and monitor [duplicate]

I'm currently rendering the output of a command line process into a text box. The problem is that in a normal command prompt window, one of the lines that is written has a load bar kind of thing... where every few seconds it outputs a "." to the screen.... After a few dots, it will start a new line and then continue loading until it has completed its process.
With the following code, instead of these "." characters appearing one by one, my OutputDataReceived handler waits for the whole line to be written out, so the load bar is useless. That is, it waits for "............." and only then acts upon it.
Is there a way to keep track of every character being output to the screen rather than what seems to be per line outputs?
//Create process
System.Diagnostics.Process process = new System.Diagnostics.Process();
// arguments.ProcessStartInfo contains the following declaration:
// ProcessStartInfo = new ProcessStartInfo( "Cmd.exe" )
// {
// WorkingDirectory = executableDirectoryName,
// UseShellExecute = false,
// RedirectStandardInput = true,
// RedirectStandardOutput = true,
// CreateNoWindow = true,
// }
process.StartInfo = arguments.ProcessStartInfo;
//Start the process
StringBuilder sb = new StringBuilder();
bool alreadyThrownExit = false;
// The following event only seems to be run per line output rather than each character rendering the command line process useless
process.OutputDataReceived += ( sender, e ) =>
{
sb.AppendLine( e.Data );
CommandLineHelper.commandLineOutput = sb.ToString();
arguments.DelegateUpdateTextMethod();
if( !alreadyThrownExit )
{
if( process.HasExited )
{
alreadyThrownExit = true;
arguments.DelegateFinishMethod();
process.Close();
}
}
};
process.Start();
process.StandardInput.WriteLine( arguments.Command );
process.StandardInput.WriteLine( "exit" );
process.BeginOutputReadLine();
If you want asynchronous processing of the stdout of the given process on a per-character basis, you can use the TextReader.ReadAsync() method. Instead of the code you have to handle the OutputDataReceived event, just do something like this:
process.Start();
// Ignore Task object, but make the compiler happy
var _ = ConsumeReader(process.StandardOutput);
process.StandardInput.WriteLine( arguments.Command );
process.StandardInput.WriteLine( "exit" );
where:
async Task ConsumeReader(TextReader reader)
{
char[] buffer = new char[1];
while ((await reader.ReadAsync(buffer, 0, 1)) > 0)
{
// process character...for example:
Console.Write(buffer[0]);
}
}
Alternatively, you could just create a dedicated thread and use that to call TextReader.Read() in a loop:
process.Start();
new Thread(() =>
{
int ch;
while ((ch = process.StandardOutput.Read()) >= 0)
{
// process character...for example:
Console.Write((char)ch);
}
}).Start();
process.StandardInput.WriteLine( arguments.Command );
process.StandardInput.WriteLine( "exit" );
IMHO the latter is more efficient, as it doesn't require as much cross-thread synchronization. But the former is more similar to the event-driven approach you would have had with the OutputDataReceived event.

Process Wrapping, Some output not displaying

I have a small wrapping application to give a GUI to an existing console application. I'm using the ProcessStartInfo and Process classes to bind to the .exe, and then using BeginErrorReadLine() and BeginOutputReadLine() to redirect any messages into the new GUI. Everything works fine except when the console application calls Console.Write() instead of Console.WriteLine(), in which case the text passed to Write is not displayed at all. I think the problem is that WriteLine inserts a line break after the text and Write does not. Is there any way to circumvent this? I can't change Write to WriteLine in the original command line program, as Write is used to prompt for input.
Relevant Code:
var startInfo = new ProcessStartInfo(ServerFile);
startInfo.RedirectStandardInput = true;
startInfo.RedirectStandardError = true;
startInfo.RedirectStandardOutput = true;
ServerProc = new Process();
ServerProc.StartInfo = startInfo;
ServerProc.EnableRaisingEvents = true;
ServerProc.ErrorDataReceived += new DataReceivedEventHandler(ServerProc_ErrorDataReceived);
ServerProc.OutputDataReceived += new DataReceivedEventHandler(ServerProc_OutputDataReceived);
private void ServerProc_ErrorDataReceived(object sender, DataReceivedEventArgs e)
{
Dispatcher.Invoke(new Action(() =>
{
ConsoleTextBlock.Text += e.Data + "\r\n";
ConsoleScroll.ScrollToEnd();
}));
}
private void ServerProc_OutputDataReceived(object sender, DataReceivedEventArgs e)
{
Dispatcher.Invoke(new Action(() =>
{
ConsoleTextBlock.Text += e.Data + "\r\n";
ConsoleScroll.ScrollToEnd();
}));
}
The problem you are experiencing is that the Process class is set up for convenient line-oriented event-based processing of the output of the process. You cannot use this functionality if you need to read partial lines as they are being output.
Nevertheless, the Process class does give you the tools you need if you need finer-grained control over the output than the line-oriented facilities. If you redirect the output, then Process.StandardOutput is a StreamReader and you have the whole StreamReader API that you can use instead of being forced to read entire lines.
For example, here is a character-by-character read of the standard output:
var start = DateTime.Now;
int n;
while ((n = ServerProc.StandardOutput.Read()) != -1)
{
var c = (char)n;
var delta = (DateTime.Now - start).TotalMilliseconds;
Console.WriteLine("c = {0} (0x{1:X}) delta = {2}",
char.IsWhiteSpace(c) ? '*' : c, n, delta);
}
If we run it on another console program that produces this output:
Console.Write("abc");
Thread.Sleep(1000);
Console.WriteLine("def");
It produces this output:
c = a (0x61) delta = 44.0025
c = b (0x62) delta = 44.0025
c = c (0x63) delta = 44.0025
c = d (0x64) delta = 1109.0634
c = e (0x65) delta = 1110.0635
c = f (0x66) delta = 1110.0635
c = * (0xD) delta = 1110.0635
c = * (0xA) delta = 1110.0635
which shows that the "abc" was read one second before the rest of the line.
However, this is not convenient if you like the event-oriented I/O already provided by Process. You can either:
use the async Stream API
use threads that do blocking reads
and perhaps even roll your own event-based I/O that meets your needs.
You are writing a program that parses another program's character-oriented output, so without full lines you will need some sort of timeout to indicate "I am now satisfied that the program is done producing output for the time being." This can be easy or hard depending on how predictable the output is. For example, you might be able to recognize a prompt that ends with a "?". It depends on your situation.
The point is that you'll have to use the StandardOutput and StandardError StreamReader properties if you want something other than line-oriented I/O.
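As an illustration of the prompt-recognition idea, here is a minimal sketch that reads character by character and returns as soon as the text read so far ends with an expected prompt. PromptReader and ReadUntilPrompt are hypothetical names, and the prompt string is an assumption about the wrapped program; the demo uses a StringReader in place of ServerProc.StandardOutput.

```csharp
using System;
using System.IO;
using System.Text;

static class PromptReader
{
    // Hypothetical helper: returns as soon as the accumulated text ends with
    // the expected prompt, without waiting for a line break. If the stream
    // ends first, whatever was read is returned.
    public static string ReadUntilPrompt(TextReader reader, string prompt)
    {
        var sb = new StringBuilder();
        int ch;
        while ((ch = reader.Read()) != -1)
        {
            sb.Append((char)ch);
            if (EndsWith(sb, prompt))
            {
                break;
            }
        }
        return sb.ToString();
    }

    // Compares the tail of the buffer with the suffix without allocating a string.
    static bool EndsWith(StringBuilder sb, string suffix)
    {
        if (sb.Length < suffix.Length) return false;
        int offset = sb.Length - suffix.Length;
        for (int i = 0; i < suffix.Length; i++)
        {
            if (sb[offset + i] != suffix[i]) return false;
        }
        return true;
    }

    static void Main()
    {
        // Stand-in for ServerProc.StandardOutput.
        var fakeOutput = new StringReader("Server started\r\nContinue? more text");
        string text = ReadUntilPrompt(fakeOutput, "? ");
        Console.WriteLine(text);
    }
}
```

Against a live process this call blocks until the prompt arrives, so it belongs on a background thread or task rather than on the UI thread.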

Process.Start vs Process `p = new Process()` in C#?

As is asked in this post, there are two ways to call another process in C#.
Process.Start("hello");
And
Process p = new Process();
p.StartInfo.FileName = "hello.exe";
p.Start();
p.WaitForExit();
Q1 : What are the pros/cons of each approach?
Q2 : How to check if error happens with the Process.Start() method?
With the first method you might not be able to use WaitForExit, as the method returns null if the process is already running.
How you check if a new process was started differs between the methods. The first one returns a Process object or null:
Process p = Process.Start("hello");
if (p != null) {
// A new process was started
// Here it's possible to wait for it to end:
p.WaitForExit();
} else {
// The process was already running
}
The second one returns a bool:
Process p = new Process();
p.StartInfo.FileName = "hello.exe";
bool s = p.Start();
if (s) {
// A new process was started
} else {
// The process was already running
}
p.WaitForExit();
For simple cases, the advantage is mainly convenience. Obviously you have more options (working path, choosing between shell-exec, etc) with the ProcessStartInfo route, but there is also a Process.Start(ProcessStartInfo) static method.
Re checking for errors; Process.Start returns the Process object, so you can wait for exit and check the error code if you need. If you want to capture stderr, you probably want either of the ProcessStartInfo approaches.
Very little difference. The static method returns a process object, so you can still use the "p.WaitForExit()" etc - using the method where you create a new process it would be easier to modify the process parameters (processor affinity and such) before launching the process.
Other than that - no difference. A new process object is created both ways.
In your second example - that is identical to this:
Process p = Process.Start("hello.exe");
p.WaitForExit();
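For Q2, a minimal sketch of checking for errors, assuming UseShellExecute is false: a program that cannot be started at all surfaces as a Win32Exception from Process.Start, while a program that starts and then fails is visible through ExitCode. StartChecks and TryRun are hypothetical names, and the file name in the demo is deliberately bogus.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;

static class StartChecks
{
    // Hypothetical helper: returns the exit code, or null if the program
    // could not be started at all.
    public static int? TryRun(string fileName, string arguments)
    {
        var psi = new ProcessStartInfo(fileName, arguments) { UseShellExecute = false };
        try
        {
            using (Process p = Process.Start(psi))
            {
                if (p == null) return null; // no new process was started
                p.WaitForExit();
                return p.ExitCode;
            }
        }
        catch (Win32Exception ex)
        {
            // Thrown, for example, when the file does not exist or is not executable.
            Console.WriteLine($"Could not start '{fileName}': {ex.Message}");
            return null;
        }
    }

    static void Main()
    {
        // The file name below is deliberately bogus to exercise the failure path.
        int? code = TryRun("no-such-program-for-demo", "");
        Console.WriteLine(code.HasValue ? $"exit code {code.Value}" : "not started");
    }
}
```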
