I'm trying to do something that seems like it should be relatively simple: Call jpegoptim from C#.
I can get it to write to disk fine, but getting it to accept a stream and emit a stream has so far eluded me - I always end up with 0 length output or the ominous "Pipe has been ended."
One approach I tried:
var processInfo = new ProcessStartInfo(
    jpegOptimPath,
    "-m" + quality + " -T1 -o -p --strip-all --all-normal"
);
processInfo.CreateNoWindow = true;
processInfo.WindowStyle = ProcessWindowStyle.Hidden;
processInfo.UseShellExecute = false;
processInfo.RedirectStandardInput = true;
processInfo.RedirectStandardOutput = true;
processInfo.RedirectStandardError = true;

using (var process = Process.Start(processInfo))
{
    await Task.WhenAll(
        inputStream.CopyToAsync(process.StandardInput.BaseStream),
        process.StandardOutput.BaseStream.CopyToAsync(outputStream)
    );

    while (!process.HasExited)
    {
        await Task.Delay(100);
    }

    // Do stuff with outputStream here - always length 0 or exception
}
I've also tried this solution:
http://alabaxblog.info/2013/06/redirectstandardoutput-beginoutputreadline-pattern-broken/
using (var process = new Process())
{
    process.StartInfo.UseShellExecute = false;
    process.StartInfo.CreateNoWindow = true;
    process.StartInfo.RedirectStandardError = true;
    process.StartInfo.RedirectStandardOutput = true;
    process.StartInfo.FileName = fileName;
    process.StartInfo.Arguments = arguments;
    process.Start();
    //Thread.Sleep(100);
    using (Task processWaiter = Task.Factory.StartNew(() => process.WaitForExit()))
    using (Task<string> outputReader = Task.Factory.StartNew(() => process.StandardOutput.ReadToEnd()))
    using (Task<string> errorReader = Task.Factory.StartNew(() => process.StandardError.ReadToEnd()))
    {
        Task.WaitAll(processWaiter, outputReader, errorReader);
        standardOutput = outputReader.Result;
        standardError = errorReader.Result;
    }
}
Same problem. Output length 0. If I let jpegoptim run without the output redirect I get what I'd expect - an optimized file - but not when I run it this way.
There's gotta be a right way to do this?
Update: Found a clue - don't I feel sheepish - jpegoptim never supported piping to stdin until an experimental build in 2014, fixed this year. The build I have is from an older library, dated 2013. https://github.com/tjko/jpegoptim/issues/6
A partial solution - see deadlock issue below. I had multiple problems in my original attempts:
You need a build of jpegoptim that will read and write pipes instead of files only. As mentioned, builds prior to mid-2014 can't do it. The GitHub "releases" of jpegoptim are useless zips of source, not built releases, so you'll need to look elsewhere for actual built binaries.
You need to call it properly, passing --stdin and --stdout, and, depending on how you'll be responding to it, avoid parameters that might cause it to write nothing, like -T1 (which, when the optimization would amount to only 1% or less, causes it to emit nothing to stdout).
You need to perform the non-trivial task of redirecting both input and output on the Process class, while avoiding a buffer overrun on the input side that will once again get you 0 output - the obvious stream.CopyToAsync() overruns the Process's very limited 4096-byte (4K) pipe buffer and gets you nothing.
So many routes to nothing. None signalling why.
var processInfo = new ProcessStartInfo(
    jpegOptimPath,
    "-m" + quality + " --strip-all --all-normal --stdin --stdout"
);
processInfo.CreateNoWindow = true;
processInfo.WindowStyle = ProcessWindowStyle.Hidden;
processInfo.UseShellExecute = false;
processInfo.RedirectStandardInput = true;
processInfo.RedirectStandardOutput = true;
processInfo.RedirectStandardError = true;

using (var process = new Process())
{
    process.StartInfo = processInfo;
    process.Start();

    int chunkSize = 4096; // Process pipes have a limited 4096-byte buffer
    var buffer = new byte[chunkSize];
    int bufferLen = 0;
    var inputStream = process.StandardInput.BaseStream;
    var outputStream = process.StandardOutput.BaseStream;

    // Feed the source stream ("input") to jpegoptim one chunk at a time.
    // NB: ReadAsync may legally return fewer bytes than requested mid-stream,
    // and stdin is never closed here - see the deadlock note below.
    do
    {
        bufferLen = await input.ReadAsync(buffer, 0, chunkSize);
        await inputStream.WriteAsync(buffer, 0, bufferLen);
        inputStream.Flush();
    }
    while (bufferLen == chunkSize);

    // Then pull everything jpegoptim wrote back out into "output"
    do
    {
        bufferLen = await outputStream.ReadAsync(buffer, 0, chunkSize);
        if (bufferLen > 0)
            await output.WriteAsync(buffer, 0, bufferLen);
    }
    while (bufferLen > 0);

    while (!process.HasExited)
    {
        await Task.Delay(100);
    }

    output.Flush();
}
There are some areas for improvement here. Improvements welcome.
Biggest problem: On some images, this deadlocks on the outputStream.ReadAsync line.
It all belongs in separate methods to break it up - I unrolled a bunch of methods to keep this example simple.
There are a bunch of flushes that may not be necessary.
The code here is meant to handle anything that streams in and out. The 4096 bytes is a hard pipe-buffer limit that any Process will deal with, but the assumption that all the input goes in, then all the output comes out, is likely a bad one, and based on my research it could deadlock for other kinds of process. It appears that jpegoptim behaves in this (very buffered, very unpipe-like) way when passed --stdin --stdout, however, so this code copes well for this specific task.
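For processes that genuinely interleave reading and writing, the usual cure for the deadlock is to pump stdin and stdout concurrently and close stdin once the input is exhausted, so the child sees EOF. A minimal sketch of that pattern - the RunFilterAsync helper name is mine, and it assumes a jpegoptim build that honors --stdin/--stdout:

static async Task RunFilterAsync(ProcessStartInfo processInfo, Stream input, Stream output)
{
    using (var process = Process.Start(processInfo))
    {
        // Writer: copy the source into the child's stdin, then close it.
        // Closing is what delivers EOF; without it many filters wait forever.
        var writeTask = Task.Run(async () =>
        {
            await input.CopyToAsync(process.StandardInput.BaseStream);
            process.StandardInput.Close();
        });

        // Reader: drain the child's stdout at the same time, so neither
        // pipe's 4K buffer can fill up and stall the other side.
        var readTask = process.StandardOutput.BaseStream.CopyToAsync(output);

        await Task.WhenAll(writeTask, readTask);
        process.WaitForExit();
    }
}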
Related
I use the following method to compress the pdf:
private bool CompressPDF(string Input, string Output, string CompressValue)
{
    try
    {
        Process proc = new Process();
        ProcessStartInfo psi = new ProcessStartInfo();
        psi.CreateNoWindow = true;
        psi.ErrorDialog = false;
        psi.UseShellExecute = false;
        psi.WindowStyle = ProcessWindowStyle.Hidden;
        psi.FileName = string.Concat(Path.GetDirectoryName(Application.ExecutablePath), "\\ghost.exe");
        string args = "-sDEVICE=pdfwrite -dCompatibilityLevel=1.4" + " -dPDFSETTINGS=/" + CompressValue + " -dNOPAUSE -dQUIET -dBATCH" + " -sOutputFile=\"" + Output + "\" " + "\"" + Input + "\"";
        psi.Arguments = args;
        // Start the execution and wait for Ghostscript to finish
        proc.StartInfo = psi;
        proc.Start();
        proc.WaitForExit();
        return true;
    }
    catch
    {
        return false;
    }
}
I put the PDF settings on "printer" by default. I can't figure out why the file size of my PDF files sometimes increases.
Ghostscript (more accurately, its pdfwrite device) doesn't 'compress' files.
It is possible, by judicious use of settings which do things like downsample images to trade quality for file size, to get a smaller file produced, but there is absolutely no guarantee that this is the case.
Without seeing the input file, there is no possible way to comment on why your file increases in size, but (for example) a PDF 1.5 file can use compressed streams and a compressed xref, and the pdfwrite device never uses those, so that could be one reason.
The canned 'PDFSETTINGS' cover a multitude of different controls; you should read up on those and understand what is actually going on. If your original file happens to already have traded quality for size, then it's entirely likely that the printer settings (which are reasonably conservative) will not actually do anything at all.
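For finer control than the canned presets, the individual distiller parameters behind them can be set directly. A sketch of what the argument string might look like with explicit image downsampling, reusing the Input/Output parameters from the method above (the 72 dpi value is purely illustrative, not a recommendation):

// Sketch: bypass -dPDFSETTINGS and drive the downsampling controls directly.
// The resolution value here is illustrative only.
string args = "-sDEVICE=pdfwrite -dCompatibilityLevel=1.4"
    + " -dDownsampleColorImages=true -dColorImageResolution=72"
    + " -dNOPAUSE -dQUIET -dBATCH"
    + " -sOutputFile=\"" + Output + "\" \"" + Input + "\"";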
My C# program needs to send data to a 3rd-party program via its standard input. However, the program waits for the input stream to reach EOF before processing. Here's my code:
// Starts the process.
var process = new Process();
process.StartInfo.CreateNoWindow = true;
process.StartInfo.UseShellExecute = false;
process.StartInfo.RedirectStandardInput = true;
process.StartInfo.RedirectStandardOutput = true;
process.StartInfo.FileName = "foo.exe";
process.Start();
// Sends data to child process.
var input = process.StandardInput;
input.WriteLine("eval('2 * PI')");
input.Flush();
input.Close();
// Reads the result.
var output = process.StandardOutput;
var result = output.ReadLine();
The child program won't do anything, and my C# code gets stuck at the output.ReadLine() call. However, if I kill the C# process, the child starts to work on exactly the data I've sent. How can I make the child encounter an EOF while I'm still alive?
StreamWriter might not be sending an actual EOF when it closes the stream. You could try writing one to the stream yourself just before you close it. Something like this might work:
input.Write((char)26);
You may have to find out what the process expects as EOF.
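That said, if the child simply reads until its stdin pipe closes (rather than scanning for a control character), disposing the redirected StandardInput is what actually delivers EOF on its side. A minimal sketch under that assumption:

// Sketch: for children that read stdin to end-of-stream, closing the
// redirected pipe is what produces EOF on their end.
using (var stdin = process.StandardInput)
{
    stdin.WriteLine("eval('2 * PI')");
}   // Dispose flushes and closes the pipe; the child's next read sees EOF
var result = process.StandardOutput.ReadLine();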
I want to be able to take a screenshot of a web site, and am attempting to use PhantomJS with ASP.NET.
I have tried using page.render, which saves the screenshot to a file. It works as a console application, but not when I call it from an ASP.NET handler. It is probably due to file permissions, since simple scripts (like hello.js) work fine.
That is okay, my preference would be not to write to a file, but to deal with the bytes and return an image directly from the handler.
I am a bit lost as to how to do that. I noticed a method called page.renderBase64, but do not know how to use it.
Currently I am using an IHttpHandler.
There is a similar question here, but that person eventually dropped PhantomJS. I like the look of it and want to continue using it if possible.
Running Phantomjs using C# to grab snapshot of webpage
According to your last comment, you can do the following in your PhantomJS script file:
var system = require('system');
var base64image = page.renderBase64('PNG');
system.stdout.write(base64image);
in C#:
var startInfo = new ProcessStartInfo {
    //some other parameters here
    ...
    FileName = pathToExe,
    Arguments = String.Format("{0}", someParameters),
    UseShellExecute = false,
    CreateNoWindow = true,
    RedirectStandardOutput = true,
    RedirectStandardError = true,
    RedirectStandardInput = true,
    WorkingDirectory = pdfToolPath
};
var p = new Process();
p.StartInfo = startInfo;
p.Start();
// Read the output (and error) streams before waiting; a child that writes
// more than the pipe buffer holds would otherwise deadlock against WaitForExit:
string output = p.StandardOutput.ReadToEnd();
string error = p.StandardError.ReadToEnd();
p.WaitForExit(timeToExit);
The output variable now holds the base64 string returned from PhantomJS, and you can do with it what you planned.
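To turn that base64 text back into image bytes on the C# side, Convert.FromBase64String does the job; the Trim below is just a guard against a trailing newline in the console output, and the file name is illustrative:

// Sketch: decode the console output back into raw PNG bytes
byte[] imageBytes = Convert.FromBase64String(output.Trim());
File.WriteAllBytes("snapshot.png", imageBytes); // illustrative destination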
Use the wrapper for PhantomJS from here: NReco wrapper
You can get the js for rasterizing here: rasterize
And then the following code in C# would do the job.
var phantomJS=new PhantomJS();
phantomJS.Run("rasterize.js", new[] { "http://google.com","ss.pdf" });
This question stemmed from my lack of understanding of what a base64 string actually was.
In the JavaScript file that PhantomJS runs, I can write the base64 image directly to the console like so:
var base64image = page.renderBase64('PNG');
console.log(base64image);
In the C# code that runs PhantomJS, I can convert the console output back to bytes and write the image to the response, like so:
var info = new ProcessStartInfo(path, string.Join(" ", args));
info.RedirectStandardInput = true;
info.RedirectStandardOutput = true;
info.UseShellExecute = false;
info.CreateNoWindow = true;
var p = Process.Start(info); // Process.Start already starts it; no second Start() call needed
var base64image = p.StandardOutput.ReadToEnd();
var bytes = Convert.FromBase64CharArray(base64image.ToCharArray(), 0, base64image.Length);
p.WaitForExit();
context.Response.ContentType = "image/png"; // set the type before writing the body
context.Response.OutputStream.Write(bytes, 0, bytes.Length);
This seems to avoid file locking issues I was having.
Using CasperJS coupled with PhantomJS, I've been getting beautiful shots of webpages.
var casper = require('casper').create();
casper.start('http://target.aspx', function() {
    this.capture('snapshot.png');
});
casper.run(function() {
    this.echo('finished');
});
I highly recommend you check out that tool. I'm still not sure how to do the post-backs, though.
Set the 'WorkingDirectory' property of the ProcessStartInfo object in order to specify where the file is saved.
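For instance, a sketch (the path is illustrative) so that a relative path passed to page.render lands somewhere the handler can read:

// Sketch: relative paths inside the child (e.g. page.render('snapshot.png'))
// resolve against WorkingDirectory
startInfo.WorkingDirectory = @"C:\inetpub\myapp\temp"; // illustrative path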
I have tried calling a Process(console application) using the following code:
ProcessStartInfo pi = new ProcessStartInfo();
pi.UseShellExecute = false;
pi.RedirectStandardOutput = true;
pi.CreateNoWindow = true;
pi.FileName = @"C:\fakepath\go.exe";
pi.Arguments = "FOO BAA";
Process p = Process.Start(pi);
StreamReader streamReader = p.StandardOutput;
char[] buf = new char[256];
string line = string.Empty;
int count;
while ((count = streamReader.Read(buf, 0, 256)) > 0)
{
    line += new String(buf, 0, count);
}
It works in only some cases.
The file that does not work is 1.30 MB in size;
I don't know if that is the reason it doesn't work correctly.
line ends up an empty string.
I hope this is clear.
Can someone point out my error? Thanks in advance.
A couple thoughts:
The various Read* methods of StreamReader require the child process to have finished before they run; otherwise you may get no output, depending on timing. Look at the Process.WaitForExit() function if you want to use this route.
Also, unless you have a specific reason for allocating buffers (a pain in the butt IMO), I would just use ReadLine() in a loop, or, since the process has exited, ReadToEnd() to get the whole output. Neither requires arrays of char, which open you up to arithmetic errors with buffer sizes.
If you want to go asynchronous and consume output as the process runs, you will want the BeginOutputReadLine() function (see MSDN); a sketch follows below.
Don't forget that errors are handled differently, so if for any reason your app writes to STDERR, you will want to use the corresponding error-output functions to read that output as well.
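A sketch of that asynchronous pattern, reusing the pi from the question above; the handler fires once per line, and a null Data marks the end of the stream:

// Sketch: consume stdout asynchronously, line by line
var sb = new StringBuilder();
var p = new Process { StartInfo = pi };
p.OutputDataReceived += (sender, e) =>
{
    if (e.Data != null)      // null signals end of stream
        sb.AppendLine(e.Data);
};
p.Start();
p.BeginOutputReadLine();
p.WaitForExit();
string line = sb.ToString();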
In C# (.NET 4.0 running under Mono 2.8 on SuSE) I would like to run an external batch command and capture its output in binary form. The external tool I use is called 'samtools' (samtools.sourceforge.net) and among other things it can return records from an indexed binary file format called BAM.
I use Process.Start to run the external command, and I know that I can capture its output by redirecting Process.StandardOutput. The problem is, that's a text stream with an encoding, so it doesn't give me access to the raw bytes of the output. The almost-working solution I found is to access the underlying stream.
Here's my code:
Process cmdProcess = new Process();
ProcessStartInfo cmdStartInfo = new ProcessStartInfo();
cmdStartInfo.FileName = "samtools";
cmdStartInfo.RedirectStandardError = true;
cmdStartInfo.RedirectStandardOutput = true;
cmdStartInfo.RedirectStandardInput = false;
cmdStartInfo.UseShellExecute = false;
cmdStartInfo.CreateNoWindow = true;
cmdStartInfo.Arguments = "view -u " + BamFileName + " " + chromosome + ":" + start + "-" + end;
cmdProcess.EnableRaisingEvents = true;
cmdProcess.StartInfo = cmdStartInfo;
cmdProcess.Start();

// Prepare to read each alignment (binary)
var br = new BinaryReader(cmdProcess.StandardOutput.BaseStream);
while (!cmdProcess.StandardOutput.EndOfStream)
{
    // Consume the initial, undocumented BAM data
    br.ReadBytes(23);
    // ... more parsing follows
But when I run this, the first 23 bytes that I read are not the first 23 bytes of the output, but from somewhere several hundred or thousand bytes downstream. I assume the StreamReader does some buffering, so the underlying stream has already advanced, say, 4K into the output. The underlying stream does not support seeking back to the start.
And I'm stuck here. Does anyone have a working solution for running an external command and capturing its stdout in binary form? The output may be very large, so I would like to stream it.
Any help appreciated.
By the way, my current workaround is to have samtools return the records in text format, then parse those, but this is pretty slow and I'm hoping to speed things up by using the binary format directly.
Using StandardOutput.BaseStream is the correct approach, but you must not use any other property or method of cmdProcess.StandardOutput. For example, accessing cmdProcess.StandardOutput.EndOfStream will cause the StreamReader for StandardOutput to read part of the stream, removing the data you want to access.
Instead, simply read and parse the data from br (assuming you know how to parse the data, and won't read past the end of stream, or are willing to catch an EndOfStreamException). Alternatively, if you don't know how big the data is, use Stream.CopyTo to copy the entire standard output stream to a new file or memory stream.
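A sketch of that approach, streaming the raw stdout into an arbitrary sink (the destination stream here is a placeholder for wherever the bytes should go):

// Sketch: read raw bytes from BaseStream and never touch
// cmdProcess.StandardOutput itself, or its StreamReader will
// buffer away part of the binary output.
Stream raw = cmdProcess.StandardOutput.BaseStream;
var buffer = new byte[81920];
int n;
while ((n = raw.Read(buffer, 0, buffer.Length)) > 0)
{
    destination.Write(buffer, 0, n); // Read returning 0 means EOF
}
cmdProcess.WaitForExit();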
Since you explicitly specified running on SuSE Linux under Mono, you can work around the problem by using native Unix calls to create the redirection and read from the stream. Such as:
using System;
using System.Diagnostics;
using System.IO;
using Mono.Unix;

class Test
{
    public static void Main()
    {
        int reading, writing;
        Mono.Unix.Native.Syscall.pipe(out reading, out writing);
        int stdout = Mono.Unix.Native.Syscall.dup(1);
        Mono.Unix.Native.Syscall.dup2(writing, 1);
        Mono.Unix.Native.Syscall.close(writing);

        Process cmdProcess = new Process();
        ProcessStartInfo cmdStartInfo = new ProcessStartInfo();
        cmdStartInfo.FileName = "cat";
        cmdStartInfo.CreateNoWindow = true;
        cmdStartInfo.Arguments = "test.exe";
        cmdProcess.StartInfo = cmdStartInfo;
        cmdProcess.Start();

        Mono.Unix.Native.Syscall.dup2(stdout, 1);
        Mono.Unix.Native.Syscall.close(stdout);

        Stream s = new UnixStream(reading);
        byte[] buf = new byte[1024];
        int bytes = 0;
        int current;
        while ((current = s.Read(buf, 0, buf.Length)) > 0)
        {
            bytes += current;
        }
        Mono.Unix.Native.Syscall.close(reading);
        Console.WriteLine("{0} bytes read", bytes);
    }
}
Under unix, file descriptors are inherited by child processes unless marked otherwise (close on exec). So, to redirect stdout of a child, all you need to do is change the file descriptor #1 in the parent process before calling exec. Unix also provides a handy thing called a pipe which is a unidirectional communication channel, with two file descriptors representing the two endpoints. For duplicating file descriptors, you can use dup or dup2 both of which create an equivalent copy of a descriptor, but dup returns a new descriptor allocated by the system and dup2 places the copy in a specific target (closing it if necessary). What the above code does, then:
1. Creates a pipe with endpoints reading and writing
2. Saves a copy of the current stdout descriptor
3. Assigns the pipe's write endpoint to stdout and closes the original
4. Starts the child process so it inherits stdout connected to the write endpoint of the pipe
5. Restores the saved stdout
6. Reads from the reading endpoint of the pipe by wrapping it in a UnixStream
Note, in native code, a process is usually started by a fork+exec pair, so the file descriptors can be modified in the child process itself, but before the new program is loaded. This managed version is not thread-safe as it has to temporarily modify the stdout of the parent process.
Since the code starts the child process without managed redirection, the .NET runtime does not change any descriptors or create any streams. So the only reader of the child's output will be the user code, which uses a UnixStream to work around the StreamReader's encoding issue.
I checked out what's happening with Reflector. It seems that StreamReader doesn't read until you call Read on it, though it's created with a buffer size of 0x1000, so it may read ahead. Luckily, until you actually read from it, you can safely get the buffered data out of it: it has a private field byte[] byteBuffer, and two integer fields, byteLen and bytePos; the first says how many bytes are in the buffer, the second how many you have consumed (it should be zero). So first read this buffer out with reflection, then create the BinaryReader.
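A sketch of that rescue via reflection; the byteBuffer/byteLen/bytePos field names match the .NET Framework's internal StreamReader layout, so treat this as fragile and runtime-specific:

// Sketch: salvage whatever the StreamReader has already buffered, then
// continue with a BinaryReader over the raw BaseStream. Relies on private
// .NET Framework field names - fragile across runtime versions.
var reader = cmdProcess.StandardOutput;
var flags = System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance;
var byteBuffer = (byte[])typeof(StreamReader).GetField("byteBuffer", flags).GetValue(reader);
int byteLen = (int)typeof(StreamReader).GetField("byteLen", flags).GetValue(reader);
int bytePos = (int)typeof(StreamReader).GetField("bytePos", flags).GetValue(reader);

// Bytes [bytePos, byteLen) were pulled off the pipe but not yet consumed
var salvaged = new byte[byteLen - bytePos];
Array.Copy(byteBuffer, bytePos, salvaged, 0, salvaged.Length);

var br = new BinaryReader(reader.BaseStream); // continue from here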
Maybe you can try like this:
public class ThirdExe
{
    private static TongueSvr _instance = null;
    private Diagnostics.Process _process = null;
    private Stream _messageStream;
    private byte[] _recvBuff = new byte[65536];
    private int _recvBuffLen;
    private Queue<TonguePb.Msg> _msgQueue = new Queue<TonguePb.Msg>();

    void StartProcess()
    {
        try
        {
            _process = new Diagnostics.Process();
            _process.EnableRaisingEvents = false;
            _process.StartInfo.FileName = "d:/code/boot/tongueerl_d.exe"; // Your exe
            _process.StartInfo.UseShellExecute = false;
            _process.StartInfo.CreateNoWindow = true;
            _process.StartInfo.RedirectStandardOutput = true;
            _process.StartInfo.RedirectStandardInput = true;
            _process.StartInfo.RedirectStandardError = true;
            _process.ErrorDataReceived += new Diagnostics.DataReceivedEventHandler(ErrorReceived);
            _process.Exited += new EventHandler(OnProcessExit);
            _process.Start();
            _messageStream = _process.StandardInput.BaseStream;
            _process.BeginErrorReadLine();
            AsyncRead();
        }
        catch (Exception e)
        {
            Debug.LogError("Unable to launch app: " + e.Message);
        }
    }

    // Kick off an overlapped read directly on the raw stdout stream,
    // appending after any bytes still waiting to be parsed
    private void AsyncRead()
    {
        _process.StandardOutput.BaseStream.BeginRead(_recvBuff, _recvBuffLen, _recvBuff.Length - _recvBuffLen,
            new AsyncCallback(DataReceived), null);
    }

    void DataReceived(IAsyncResult asyncResult)
    {
        int nread = _process.StandardOutput.BaseStream.EndRead(asyncResult);
        if (nread == 0)
        {
            Debug.Log("process read finished"); // process exit
            return;
        }
        _recvBuffLen += nread;
        Debug.LogFormat("recv data size.{0} remain.{1}", nread, _recvBuffLen);
        ParseMsg();
        AsyncRead();
    }

    // Messages are length-prefixed: a 4-byte big-endian length, then the payload
    void ParseMsg()
    {
        if (_recvBuffLen < 4)
        {
            return;
        }
        int len = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(_recvBuff, 0));
        if (len > _recvBuffLen - 4)
        {
            Debug.LogFormat("current call can't parse the NetMsg for data incomplete");
            return;
        }
        TonguePb.Msg msg = TonguePb.Msg.Parser.ParseFrom(_recvBuff, 4, len);
        Debug.LogFormat("recv msg count.{1}:\n {0} ", msg.ToString(), _msgQueue.Count + 1);
        _recvBuffLen -= len + 4;
        // Shift any unparsed remainder to the front of the buffer
        Array.Copy(_recvBuff, len + 4, _recvBuff, 0, _recvBuffLen);
        _msgQueue.Enqueue(msg);
    }
}
The key is the BeginRead call on _process.StandardOutput.BaseStream, and, very importantly, it converts the raw reads into an asynchronous event pattern much like Process.OutputDataReceived, but for binary data.
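For reference, the same loop can be written with async/await instead of Begin/EndRead; the semantics are identical, with a zero-byte read meaning the child closed its stdout. A sketch:

// Sketch: equivalent read loop using async/await
private async Task ReadLoopAsync()
{
    var stream = _process.StandardOutput.BaseStream;
    int nread;
    while ((nread = await stream.ReadAsync(_recvBuff, _recvBuffLen, _recvBuff.Length - _recvBuffLen)) > 0)
    {
        _recvBuffLen += nread;
        ParseMsg();
    }
    Debug.Log("process read finished"); // zero-byte read: process exited
}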