Display DICOM monochrome2 having bits stored less than bits allocated - c#

I want to display a DICOM file with photometric interpretation MONOCHROME2.
Some of the specifications of the image are:
Rows: 1024
Columns: 1024
No of Frames: 622
Bits Allocated: 16
Bits Stored: 10
High Bit: 9
Pixel Representation: 0
Sample per pixel: 1
I am using gdcm.ImageRegionReader to extract a single frame's byte array in the following way:
gdcm.ImageRegionReader _regionReader = new gdcm.ImageRegionReader();
_regionReader.SetRegion(_boxRegion); // _boxRegion is some region (a single frame)
_regionReader.ReadIntoBuffer(Result, (uint)Result.Length); // Result is the pre-allocated frame buffer (byte[])
Marshal.Copy(Result.ToArray(), 0, _imageData.GetScalarPointer(), Result.ToArray().Length);
_viewer.SetInput(_imageData); // _viewer = vtkImageViewer
But when I display the file, it renders like this..
but the original image looks like this..
So can someone help me with how to load and display MONOCHROME2 DICOM images?

Disclaimer: I have never used the toolkit in question; I am attempting to answer based on my understanding of DICOM. In my experience with DICOM, syntax was rarely the problem; the real problem was the concepts and terms.
I see two problems in your output.
One is about the part of the image that is rendered. Notice that the entire data set is not rendered in your output. Check the toolkit documentation to see how to set the dimensions/bounds when rendering the image.
The other problem is the output quality. Initially I suspected the Transfer Syntax might be the issue. I do not think it is, but just make sure you are decompressing the image before rendering; I am not sure how your toolkit handles compression while rendering.
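A likely contributor to the poor quality, given Bits Stored = 10 inside 16-bit words, is that the raw values only span 0..1023 and therefore look almost black when interpreted against the full 16-bit range. Below is a minimal sketch of rescaling one frame to 8-bit grayscale for display; it assumes unsigned little-endian 16-bit samples (which Pixel Representation = 0 and High Bit = 9 suggest), and the method name is only illustrative.
// Sketch: expand 10-bit values stored in 16-bit words to 8-bit grayscale for display.
static byte[] To8BitGrayscale(byte[] frame)
{
    int pixelCount = frame.Length / 2;
    byte[] display = new byte[pixelCount];
    for (int i = 0; i < pixelCount; i++)
    {
        // Keep the 10 significant bits (High Bit = 9), then rescale 0..1023 to 0..255.
        int value = BitConverter.ToUInt16(frame, i * 2) & 0x03FF;
        display[i] = (byte)(value * 255 / 1023);
    }
    return display;
}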
There is another way available to render pixel data in the toolkit.
_ImageViewer.SetRenderWindow(renderWindow);
_ImageViewer.GetRenderer().AddActor2D(sliceStatusActor);
_ImageViewer.GetRenderer().AddActor2D(usageTextActor);
_ImageViewer.SetSlice(_MinSlice);
_ImageViewer.Render();
The above code is copied from http://www.vtk.org/Wiki/VTK/Examples/CSharp/IO/ReadDICOMSeries; detailed code is available there.
Following links may also be helpful:
http://vtk.1045678.n5.nabble.com/How-to-map-negative-grayscale-to-color-td5737080.html
https://www.codeproject.com/Articles/31581/Displaying-bit-Images-Using-C

You should really use vtkGDCMImageReader2 in your code instead. vtkGDCMImageReader2 precisely encapsulates gdcm::ImageRegionReader for binding with VTK.
If for some reason you cannot use this class directly, simply copy/paste the C++ code from within the main function into your C# code. A rough sketch of the intended usage follows.
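For orientation only, here is roughly what that usage would look like, assuming the vtkgdcm C# bindings expose vtkGDCMImageReader2 with the standard VTK reader interface (the file path is a placeholder and the sketch is untested against the actual bindings):
vtkGDCMImageReader2 reader = new vtkGDCMImageReader2();
reader.SetFileName(@"C:\data\multiframe.dcm"); // placeholder path
reader.Update();                               // reads and decodes the pixel data

_viewer.SetInput(reader.GetOutput());          // _viewer is the existing vtkImageViewer
_viewer.Render();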
See:
http://gdcm.sourceforge.net/2.6/html/classvtkGDCMImageReader2.xhtml
http://gdcm.sourceforge.net/2.6/html/classgdcm_1_1ImageRegionReader.xhtml

Related

How to extract text from PDF using iTextSharp version 4.1.6? [duplicate]

We are developing a PDF parser to be used along with our system.
The requirement is that we store all the information from any PDF document and are able to reproduce the document as such (with minimal changes from the original document).
We did some googling and found iTextSharp to be the best fit for our purpose.
We are developing our project using .NET.
As the title suggests, we need a comparison between specific versions of iTextSharp (4.1.6 vs 5.x). We know that 4.1.6 is the last version of iTextSharp with the LGPL/MPL license; the 5.x versions are AGPL.
We would like to have a good comparison between the versions before choosing the LGPL version or buying a license for the AGPL version (we don't want to publish our code).
I did browse through the revision changes in iTextSharp, but I would like to know whether any content exists that makes a good comparison between the versions.
Thanks in advance!
I'm the CTO of iText Software, so just like Michaël, who already answered in the comment section, I'm at the same time the most authoritative source as well as a biased source.
There's a very simple comparison chart on the iText web site.
This chart doesn't cover text extraction, so allow me to list the relevant improvements since iText 5.
You've probably also found this page.
In case you wonder about the bug fixes and the performance improvements regarding text parsing, this is a more exhaustive list:
5.0.0: Text extraction: major overhaul to perform calculations in user space. This allows the parser to correctly determine line breaks, even if the text or page is rotated.
5.0.1: Refactored callback so method signature won't need to change as render callback API evolves.
5.0.1: Refactoring to make it easier for outside users to interact with the content stream processor. Also refactored render listener so text and image event listening occurs in the same interface (reduces a lot of non-value-add complexity)
5.0.1: New filtering functionality for text renderers.
5.0.1: Additional utility method for previewing PDF content.
5.0.1: Added a much more advanced text renderer listener that can reconstruct page content based on physical location of text on the page
5.0.1: Added support for XObject Form processing (text added via PdfTemplate can now be parsed)
5.0.1: Added rudimentary support for XObject Image callbacks
5.0.1: Bug fix - text extraction wasn't correct for certain page orientations
5.0.1: Bug fix - matrices were being concatenated in the wrong order.
5.0.1: PdfTextExtractor: changed the default render listener (new location aware strategy)
5.0.1: Getters for GraphicsState
5.0.2: Major refactoring of interface to text extraction functionality: for instance introduction of class PdfReaderContentParser
5.0.2: CMapAwareDocumentFont: Tweaks to make processing quasi-invalid PDF files more robust
5.0.2: PdfContentReaderTool: null pointer handling, plus a few well placed flush calls
5.0.2: PdfContentReaderTool: Show details on resource entries
5.0.2: PdfContentStreamProcessor: Adjustment so embedded images don't cause parsing problems and improvements to EI detection
5.0.2: LocationTextExtractionStrategy: Fixed anti-parallel algorithm, plus accounting for negative inter-character offsets. Change to text extraction strategy that builds out the text model first, then computes concatenation requirements.
5.0.2: Adjustments to line segment implementation; optimization of changes made by Bruno to text extraction; for example: introduction of the class MarkedContentInfo.
5.0.3: added method to get area of image in user units
5.0.3: better parsing of inline images
5.0.3: Adding an extra check for begin/end sequences when parsing a ToUnicode stream.
5.0.4: Content streams in arrays should be parsed as if they were separated by whitespace
5.0.4: Expose CTM
5.0.4: Refactor to pull inline image processing into its own class. Added parsing of image data if there is no filter applied (there are some PDFs where there is no white space between the end of the image data and the EI operator). Ultimately, it will be best to actually parse the image data, but this will require a pretty big refactoring of the iText decoders (to work from streams instead of byte[] of known lengths).
5.0.4: Handle multi-stage filters; Correct bug that pulled whitespace as first byte of inline image stream.
5.0.4: Applying stream filters to inline images.
5.0.4: PdfReader: Expose filter decoder for arbitrary byte arrays (instead of only streams)
5.0.6: CMapParser: Fix to read broken ToUnicode cmaps.
5.0.6: handle slightly malformed embedded images
5.0.6: CMapAwareDocumentFont: Some PDFs have a diff map bigger than 256 characters.
5.0.6: performance: Cache the fonts used in text extraction
5.1.2: PRTokeniser: Made the algorithm to find startxref more memory efficient.
5.1.2: RandomAccessFileOrArray: Improved handling for huge files that can't be mapped
5.1.2: CMapAwareDocumentFont: fix NPE if mapping doesn't get initialized (I'd rather wind up with junk characters than throw an unexpected exception down the road)
5.1.3: refactoring of how filters are applied to streams, adjust parser so it can handle multi-stage filters
5.1.3: images: allow correct decoding of 1bpc bitmask images
5.1.3: images: add jbig2 streams to pass through
5.1.3: images: handle null and indirect references in decode parameters, throw exception if unable to decode an image
5.2.0: Better error messages and better handling zero sized files and attempts to read past the end of the file.
5.2.0: Removed restriction that using memory mapping requires the file be smaller than ~2GB.
5.2.0: Avoid NullPointerException in RandomAccessFileOrArray
5.2.0: Made a utility method in pdfContentStreamProcessor private and clarified the stateful nature of the class
5.2.0: LocationTextExtractionStrategy: bounds checking on string lengths and refactoring to make code easier to read.
5.2.0: Better handling of color space dictionaries in images.
5.2.0: improve handling of quasi improper inline image content.
5.2.0: don't decode inline image streams until we absolutely need them.
5.2.0: avoid NullPointerException if resource dictionary isn't provided.
5.3.0: LocationTextExtractionStrategy: old comparison approach caused runtime exceptions in Java 7
5.3.3: incorporate the text-rise parameter
5.3.3: expose glyph-by-glyph information
5.3.3: Bugfix: text to user space transformation was being applied multiple times for sub-textrenderinfo objects
5.3.3: Bugfix: Correct baseline calculation so it doesn't include final character spacing
5.3.4: Added low-level filtering hook to LocationTextExtractionStrategy.
5.3.5: Fixed bug in PRTokeniser: handle case where number is at end of stream.
5.3.5: Replaced StringBuffer with StringBuilder in PRTokeniser for performance reasons.
5.4.2: Added an isChunkAtWordBoundary() method to LocationTextExtractionStrategy to check if a space character should be inserted between a previous chunk and the current one.
5.4.2: Added a getCharSpaceWidth() method to LocationTextExtractionStrategy to get the width of a space character.
5.4.2: Added a getText() method to LocationTextExtractionStrategy to get the text of the current Chunk.
5.4.2: Added an appendTextChunk() method to SimpleTextExtractionStrategy to expose the append process so that subclasses can add text from outside the text parse operation.
5.4.5: Added MultiFilteredRenderListener class for PDF parser.
5.4.5: Added GlyphRenderListener and GlyphTextRenderListener classes for processing each glyph rather than processing chunks of text.
5.4.5: Added method getMcid() in TextRenderInfo.
5.4.5: fixed resource leak when many inline images were in content stream
5.5.0: CMapAwareDocumentFont: if font space width isn't defined, use the default width for the font.
5.5.0: PdfContentReader: avoid exception when displaying an empty dictionary.
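For reference, here is a minimal sketch of what text extraction looks like with the iText 5 parser API named throughout this list (PdfTextExtractor together with LocationTextExtractionStrategy); the file name is a placeholder:
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Sketch: extract text page by page with the location-aware strategy.
PdfReader reader = new PdfReader("input.pdf"); // placeholder file name
var text = new System.Text.StringBuilder();
for (int page = 1; page <= reader.NumberOfPages; page++)
{
    ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();
    text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page, strategy));
}
reader.Close();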
There are some things that you won't be able to do if you don't upgrade. For instance, you won't be able to do the things described in these slides.
If you look at the roadmap for iText, you'll see that we'll invest even more time on text extraction in the future.
In all honesty: using the 5-year-old version wouldn't only be like reinventing the wheel, it would also be like falling into every pitfall we've fallen into over the last 5 years. I can assure you that buying a license will be less expensive.

C# values differ from MATLAB values

I am working on a project with audio files. I read a file and parse it. I compare my parsed values with other sources and everything seems fine. (FYI: a WAV file with 16 bits per sample, 44,100 Hz and 7211 sample points.)
Since every data point is defined with 16 bits, I expect my value range to be [-65536, +65536]. I get -4765 as the minimum and 5190 as the maximum from the values I read.
But when I perform the same operation in MATLAB, I get -0.07270813 and 0.079193115 respectively. This is not a big problem, because MATLAB seems to be normalizing my values to the range [-1, +1]. When I plot, I get the same figure.
But when I take the FFT in both applications (I am using Lomont FFT), the results differ greatly and I am not sure what is wrong with my code. Lomont seems fine, but the results are inconsistent.
Is this difference normal, or should I use another algorithm for this specific operation? Can anyone suggest a better FFT algorithm in C# (I already tried NAudio, Exocortex, etc.) to get results consistent with MATLAB (I suppose that its results are correct), or any advice or suggestions on this difference?
(My C# plots and the MATLAB plots were attached here as images.)
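The scale difference itself is expected: MATLAB's wavread/audioread returns 16-bit PCM normalized to [-1, 1], while reading the raw shorts in C# keeps the integer scale, and that scale factor propagates directly into the FFT magnitudes. A minimal sketch of normalizing the samples before the transform (the method is only illustrative, not part of any particular FFT library):
// Sketch: convert 16-bit PCM samples to the [-1, 1] range MATLAB uses,
// so FFT magnitudes become comparable between the two environments.
static double[] Normalize(short[] samples)
{
    double[] normalized = new double[samples.Length];
    for (int i = 0; i < samples.Length; i++)
    {
        normalized[i] = samples[i] / 32768.0; // 2^15, full scale of signed 16-bit PCM
    }
    return normalized;
}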

What is the formula to calculate a QR Code's maximum data?

I've Googled and read quite a bit on QR codes and the maximum data that can be stored for the various settings, all of it presented in tabular format. I can't seem to find anything giving a formula or a proper explanation of how these values are calculated.
What I would like to do is this:
Present the user with a form, allowing them to choose Format, EC & Version.
Then they can type in some data and generate a QR code.
Done deal. That part is easy.
The addition I would like to include is a "remaining character count" so that they (the user) can see how much more data they can type in, as well as what effect the properties have on the storage capacity of the QR code.
Does anyone know where I can find the formula(s)? Or do I need to purchase ISO 18004:2006?
A formula to calculate the amount of data you can put in a QR Code would be quite complex to devise, not to mention that it would need some approximations for the calculation to be possible at all. The formula would have to calculate the number of modules dedicated to the data in your QR Code based on its version, and then calculate how many codewords (which are sets of 8 modules) would be used for the error correction.
To calculate the number of modules that will be used for the data, you need to know how many modules will be used for the function patterns. While this is not a problem for the three finder patterns, the timing patterns or the version/format information, there is a problem with the alignment patterns, as their number depends on the QR Code's version, meaning you would have to use a table at that point anyway.
For the second part, I have to say I don't know how to calculate the number of error-correcting codewords based on the correction capacity. For some reason, there are more error-correcting codewords used than there should be to match the error correction capacity; for example, a 6-H QR Code can correct up to 32.6% of the data, instead of the 30% set by the H correction level.
In any case, as you can see, a formula would be quite complex to implement. Using a table, as already suggested, is probably the best thing you could do.
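For what it's worth, the one piece that does reduce to a clean formula is the symbol size: a version-v QR Code measures (4v + 17) modules per side, so version 1 is 21x21 and version 40 is 177x177. A small sketch of that piece (everything beyond it, alignment patterns and error-correction codewords included, still comes from the tables):
// Sketch: the only simple closed-form part of QR capacity is the symbol size.
static int ModulesPerSide(int version)
{
    return 4 * version + 17; // version 1..40 -> 21..177 modules per side
}

static int TotalModules(int version)
{
    int side = ModulesPerSide(version);
    // Finder, timing, alignment, and format/version areas still have to be
    // subtracted using the tabulated values before data capacity is known.
    return side * side;
}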
I wrote the original AIM specification for QR Code back in the '90s for Denso Corporation, and was also project editor for both editions of the ISO/IEC 18004 standard. It was felt to be much easier for people producing code printing software to use a look-up table rather than calculate capacities from a formula - no easy job as there are several independent variables that have to be taken into account iteratively when parsing the text to be encoded to minimise its length in bits, in order to achieve the smallest symbol. The most crucial factor is the mix of characters in the data, the sequence and lengths of sub-strings of numeric, alphanumeric, Kanji data, with the overhead needed to signal each change of character set, then the required level of error correction. I did produce a guidance section for this which is contained in the ISO standard.
The storage is determined by the QR mode and the version/type that you are using. More specifically, the calculation is based on how 'compressible' the characters are and what algorithm the QR generator is allowed to use on the content present.
More information can be found at http://en.wikipedia.org/wiki/QR_code#Storage

Read TIFF-File Header in C#

I'd like to know the physical size of a TIFF picture represented by an Image. This information can be calculated from ImageWidth and ImageLength, using ResolutionUnit and X/Y-Resolution as parameters. However, I'm not able to extract this information from a TIFF picture.
The description of the TIFF file header can be found at Adobe.
As files can easily reach sizes above 400 MB, such as big maps, I'm looking for a way to scan only the header of a TIFF file to retrieve basic metadata like Resolution, ResolutionUnit, ResolutionX, ResolutionY and so on...
Do you know a good way to extract this information?
The TiffBitmapDecoder class's CodecInfo property may give you what you want.
http://msdn.microsoft.com/en-us/library/system.windows.media.imaging.tiffbitmapdecoder%28v=vs.110%29.aspx
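As a complement to the CodecInfo suggestion, the frame-level properties (PixelWidth, PixelHeight, DpiX, DpiY) are probably closer to what the question asks for. A sketch, assuming the WPF imaging stack is available (PresentationCore/WindowsBase references); whether the decode stays fully lazy for a 400 MB file depends on the codec, so this is something to verify rather than a guarantee:
using System.IO;
using System.Windows.Media.Imaging;

// Sketch: read TIFF metadata without forcing a full pixel decode.
// DelayCreation + BitmapCacheOption.None keeps the decoder lazy.
using (FileStream stream = File.OpenRead(@"C:\maps\big.tif")) // hypothetical path
{
    var decoder = new TiffBitmapDecoder(stream,
        BitmapCreateOptions.DelayCreation, BitmapCacheOption.None);
    BitmapFrame frame = decoder.Frames[0];

    int widthPx = frame.PixelWidth;
    int heightPx = frame.PixelHeight;
    double dpiX = frame.DpiX;
    double dpiY = frame.DpiY;

    double widthInches = widthPx / dpiX;   // physical size from resolution
    double heightInches = heightPx / dpiY;
}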

handling pixels in xsl fo

I have a web front end and I am trying to handle layout with tables. Because my tables all contain columns with widths in pixels, what is the best way to handle this inside the PDF to get a consistent layout?
I am using FO.NET, and the code I use to convert pixels to inches is shown below. However, on different machines I am getting inconsistent results...
<xsl:value-of disable-output-escaping="yes" select="floor(#width div 72)"/>
<xsl:text>in</xsl:text>
Is there a way, using C#, to get the screen resolution and any other info to get a more accurate result?
To answer my own question... use percentages with a fixed max width on the table; that way I can get the XSL to work out the percentage of each column based on the total width of the table. This workaround seems like the best and most flexible way to handle my situation. The biggest problem is that pixels cannot be translated into XSL-FO: if a person is working on a page and then moves to a different machine, the output PDF could be drastically different.
On that note: I would like mm to be supported alongside pixels in WYSIWYG editors... As I am using jQuery client side, I will most likely enhance my tables, therefore requiring me to create this plugin. I hope this info helps anyone else who wants to create PDFs from client-side WYSIWYG editors; I am sure this info is applicable to other scenarios too... :)
In the class 'System.Windows.Forms.Screen' there are several functions and values concerning the screen.
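Screen gives the bounds in pixels, but the pixel-to-inch conversion also needs the DPI, which can be read from a Graphics object. A sketch of both (note this reports the machine running the code, typically the server, not the end user's display, which is why the percentage-based workaround above is more robust):
using System;
using System.Drawing;
using System.Windows.Forms;

// Sketch: screen resolution in pixels plus the DPI used for pixel-to-inch conversion.
Rectangle bounds = Screen.PrimaryScreen.Bounds;    // e.g. 1920 x 1080 pixels

using (Graphics g = Graphics.FromHwnd(IntPtr.Zero))
{
    float dpiX = g.DpiX;                           // typically 96 on Windows
    float widthInInches = bounds.Width / dpiX;
}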
