Get document text from a print job in C#

I'd like to get the text of documents that were sent to network printers by reading them from the printer queue. Printing the contents of the printer spool file yields garbage, though. (I already had to give up my dream of capturing the JobStream itself on the fly; various sources say it is almost always null.) Since the output is garbage, I have a terrible suspicion: it is naive to think that characters get sent to a printer. Maybe it's just a bunch of coordinates and shade codes for black and white, or color codes for color printers. So is there any hope of acquiring the original text from the spool file or a printer JobStream?

It sure looks like graphics data is being sent to the printer, so the print queue is too late a point to capture the text. What is needed here is a custom print driver. I found only proprietary solutions, though.

Related

Zebra ZPL print job fails intermittently

I am generating ZPL labels in a C# Windows service. The service is simple in structure: it uses System.IO.FileSystemWatcher to detect when a new file is created by our ERP, then parses the file, gets a chunk of data from SQL about this job, and formats it into validated ZPL.
It then uses the StreamWriter and TcpClient classes to create a connection to a Zebra label printer and sends the ZPL to port 9100. This is a technique we have used in the past without issue.
We use exclusively Zebra GK420D printers.
Here is the weird bit. Sometimes, when the job is sent to the printer, the LED just flashes and no label is printed. If you look at the configuration page in the web interface for that printer, it reports that it is busy processing a job. The job appears in the job log absolutely fine, but the printer is seized up. You can't print a config label (as you usually would by holding down the feed button for a few seconds). You can reboot the printer, resubmit the job, and it will print, but this is not guaranteed. Frequently it will just flash again. You can send the same ZPL to another printer and it will print fine.
The ZPL being produced is around 4000 - 4500 bytes long. We have validated the ZPL using online tools to reproduce the label we want to print, and they all appear to be fine.
Has anyone seen anything like this before? It is baffling us here...
Check the firmware on the printer to make sure it's the most current. It sounds like you're doing the right things, even with the pause. I know if you send more data than it has available the printer will stall until it's power cycled.
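For reference, the sending step described in the question (a TcpClient and StreamWriter writing ZPL to port 9100) might look roughly like the sketch below. The class name `ZplSender`, the timeout value, and the choice of ASCII encoding are assumptions for illustration, not details from the original service. The sketch opens one connection per job and flushes and closes it explicitly; it is not claimed to cure the stall described above.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

// Hypothetical helper: sends one ZPL document over a raw TCP connection.
public static class ZplSender
{
    public static void Send(string host, int port, string zpl, int timeoutMs = 5000)
    {
        using (var client = new TcpClient())
        {
            // Assumption: fail the send rather than block forever if the printer stalls.
            client.SendTimeout = timeoutMs;
            client.Connect(host, port);

            using (NetworkStream stream = client.GetStream())
            // Assumption: the label contains only ASCII characters.
            using (var writer = new StreamWriter(stream, Encoding.ASCII))
            {
                writer.Write(zpl);
                writer.Flush(); // push everything out before the connection closes
            }
        }
    }
}
```

A call such as `ZplSender.Send("192.0.2.10", 9100, zpl)` would send one label; the address here is a placeholder.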

c# Print string *directly* to printer

I need to print a string exactly as it is without drawing a picture before printing.
Graphics.DrawString draws the specified text string, so I do NOT want to use that: it draws a picture of the string. What I need is the exact string.
The only solution I found is to send a file to the printer, but I don't like that solution.
Nowadays printers are not typewriters.
In order to print anything, you have to draw a picture of it first. Keep in mind that not all pictures are JPG files.
One solution is to stream your output to a .txt file and prompt with a PrintDialog.

Get the data sent to the printer

In my project we need to use a virtual printer, catch the file it produces (most of the time it's a bitmap), extract data from it, and transform it into XML like so:
<document name="file://C:\DOCUME~1\ilanit\LOCALS~1\Temp\p0129600584.htm">
<lineXY x="0" y="0" height="1656" width="2275" />
Is it something like Redmon you are looking for (used in conjunction with output to a file and then launching an application)? If so, you can use it, and there are others out there too. Redmon is a little dated and, depending on the OS, you might have issues. If you can, add more detail and specifics to your question, as it's a bit confusing.
UPDATE (based on comments): If the source is a PDF or some other document (e.g. Word) that has actual text and not just graphics (scan/image) type data, you could use a PostScript driver (type 1 might work best) and then extract the text after you capture the print file. If you are not going to use the print file for actual output and just need the data, you can always try the Generic Text driver in Windows, as it will ignore graphics and just put the text in the output file. As long as the output is consistent, a little Regex should be able to pull out what you need.
If the data is graphical in nature such as a scanned image that you are printing, you will need to capture the print job, turn it into a graphic image (as it will be a print file with PCL or Postscript etc.) and then run it through an OCR engine to pull out what you need.
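As a rough illustration of the "little Regex" step on Generic Text driver output, the sketch below pulls `Name: value` pairs out of captured text. The layout it assumes (one labelled field per line, such as `Invoice No: 12345`) and the names `PrintTextExtractor` and `ExtractFields` are hypothetical; the real pattern depends entirely on what the printed document looks like.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts "Name: value" lines from captured print text.
public static class PrintTextExtractor
{
    // Assumed layout: a label made of letters and spaces, a colon, then the value.
    private static readonly Regex FieldPattern = new Regex(
        @"^[ \t]*(?<name>[A-Za-z][A-Za-z ]*?)[ \t]*:[ \t]*(?<value>[^\r\n]+?)[ \t]*\r?$",
        RegexOptions.Multiline);

    public static Dictionary<string, string> ExtractFields(string printedText)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in FieldPattern.Matches(printedText))
        {
            // A later occurrence of the same label overwrites the earlier one.
            fields[m.Groups["name"].Value] = m.Groups["value"].Value;
        }
        return fields;
    }
}
```

Lines that do not match the assumed layout are simply skipped, which is why consistent output from the driver matters.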

How to get "printer ready bytes" from a source in c#?

I'm in a bit of trouble here, hoping you can help a fellow programmer out.
I have an application that receives a pointer to raw bytes (plus length and stuff) and sends said raw data to a printer. This is important, I have no choice but to use this method to get any printing done.
If I send a raw string, it will print with no problem. However, I need to be able to print formatted text, images, etc. So the thing is... I would like to be able to get printer ready bytes from a given source (maybe a pdf, or html, does not matter as long as it contains formatted text and/or images). It would be like "splitting" the print command like so:
a) Open file and read data
b) Load printer data into memory
c) Send bytes to printer
Obviously, I've got a) and c) covered, it's b) the one that's breaking my head.
Any thoughts?
Thanks in advance for your help.
What you need is for the printer processor to receive your print command and create the formatted data. You wouldn't want to do this yourself, I hope: formatting to printer-ready data is very hard and takes months of work, even if you know PS, AFP, PCL, or whatever is used nowadays by heart. Instead, the printer processor of Windows should be used.
If you're on Windows (I assume, because you use C#, but perhaps you use Mono), you can send any printer command to a file (simply use the FILE: port). To create the formatted data, use any PDF library you have, or use RTF, which is supported by the .NET Framework, and send it to the selected printer (which should match the same printer that's on the other end of your application), which is configured on port FILE:.
The raw print data is then on disk, which you can simply read in as a byte array and send to your actual printer using the application you already got.
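The last step, reading the captured file back and passing it to the existing raw-send routine, could be sketched as follows. `CapturedJobForwarder` and the `sendRaw` delegate are hypothetical names; the delegate stands in for whatever the application already uses to push bytes to the printer.

```csharp
using System;
using System.IO;

// Hypothetical helper: forwards a file captured via the FILE: port as raw bytes.
public static class CapturedJobForwarder
{
    // Returns the number of bytes handed to the sender.
    public static int Forward(string capturedFilePath, Action<byte[]> sendRaw)
    {
        // The file already contains driver-formatted data, so it is read as-is.
        byte[] printerReady = File.ReadAllBytes(capturedFilePath);
        sendRaw(printerReady);
        return printerReady.Length;
    }
}
```

This only makes sense if the printer configured on FILE: uses the same driver as the physical printer, as the answer points out.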

Is PrintSystemJobInfo.JobStream broken?

I get the queue from my targeted printer and go through the list of jobs on it. When a job is not IsSpooling, I try to read the JobStream to see the print job.
So far JobStream has always been null. My printed output comes from a DOS application and should be pure text. I've paused the printer to save the rain forest, but I should still be able to get the spooled data, right?
Am I missing something, or is PrintSystemJobInfo.JobStream broken?
This value is almost always going to be null. Refer to this forum post: http://www.vbforums.com/showthread.php?t=549634
If you want the actual binary JobStream, your best bet is to read the spool file (.SPL) out of the "C:\Windows\System32\spool\PRINTERS" directory. You can pause the job before it's printed, or set the "keep print jobs" setting as mentioned in the linked forum post. Be forewarned, though: this data comes in a gamut of formats, all depending on the driver creating the spool file and the application initiating the print. Extracting data out of this stream is no trivial task, as it will change from printer driver to printer driver. If you are working with a single known printer, you may have success.
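Because the spool format varies by driver, a first step might be to look at the leading bytes of the .SPL file before trying to parse it. The sketch below is a heuristic only: it recognizes the `%!PS` PostScript header and the PJL universal exit sequence (`ESC%-12345X`), and treats data made up purely of printable ASCII and common control characters as plain text. The names `SpoolSniffer` and `SpoolFormat` are hypothetical, and anything else, including EMF spool data and text in an extended DOS code page, is reported as unknown.

```csharp
using System;
using System.Text;

public enum SpoolFormat { Unknown, PostScript, Pjl, PlainText }

// Hypothetical helper: guesses the format of spool data from its content.
public static class SpoolSniffer
{
    private static readonly byte[] PsHeader = Encoding.ASCII.GetBytes("%!PS");
    private static readonly byte[] PjlHeader = Encoding.ASCII.GetBytes("\u001B%-12345X");

    public static SpoolFormat Guess(byte[] data)
    {
        if (data == null || data.Length == 0) return SpoolFormat.Unknown;
        if (StartsWith(data, PsHeader)) return SpoolFormat.PostScript;
        if (StartsWith(data, PjlHeader)) return SpoolFormat.Pjl;

        // Assumption: plain text means printable ASCII plus tab, CR, LF, form feed.
        foreach (byte b in data)
        {
            bool printable = b >= 0x20 && b <= 0x7E;
            bool control = b == 0x09 || b == 0x0A || b == 0x0C || b == 0x0D;
            if (!printable && !control) return SpoolFormat.Unknown;
        }
        return SpoolFormat.PlainText;
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

The bytes would typically come from `File.ReadAllBytes` on a spool file, which usually requires elevated permissions to read.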
