byte array to pdf - c#

I am trying to convert content of a file stored in a sql column to a pdf.
I use the following piece of code:
byte[] bytes;
BinaryFormatter bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream();
bf.Serialize(ms, fileContent);
bytes = ms.ToArray();
System.IO.File.WriteAllBytes("hello.pdf", bytes);
The PDF generated is corrupt in the sense that when I open it in Notepad++, I see some junk header (which is the same irrespective of the fileContent). The junk header is NUL SOH NUL NUL NUL ....

You shouldn't be using the BinaryFormatter for this - that's for serializing .Net types to a binary file so they can be read back again as .Net types.
If it's stored in the database - hopefully as a varbinary - then all you need to do is get the byte array from that (how depends on your data access technology; EF and LINQ to SQL, for example, create a mapping that makes getting the byte array trivial) and then write it to the file as you do in your last line of code.
With any luck - I'm hoping that fileContent here is the byte array? In which case you can just do
System.IO.File.WriteAllBytes("hello.pdf", fileContent);

Usually this happens if something is wrong with the byte array.
File.WriteAllBytes("filename.PDF", Byte[]);
This creates a new file, writes the specified byte array to the file, and then closes the file. If the target file already exists, it is overwritten.
Asynchronous implementation of this is also available.
public static System.Threading.Tasks.Task WriteAllBytesAsync
(string path, byte[] bytes, System.Threading.CancellationToken cancellationToken = default);
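For example, a minimal sketch of the async variant (the overload is available from .NET Core 2.0 onwards; the method and parameter names here are just illustrative):
// Sketch: write the PDF bytes asynchronously instead of blocking the calling thread.
public static async System.Threading.Tasks.Task SavePdfAsync(string path, byte[] pdfBytes)
{
    await System.IO.File.WriteAllBytesAsync(path, pdfBytes);
}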

System.Text.Encoding.Default.GetBytes fails

Here is my sample code:
Code Snippet 1: This code executes on my file repository server and returns the file as an encoded string using the WCF service:
byte[] fileBytes = new byte[0];
using (FileStream stream = System.IO.File.OpenRead(@"D:\PDFFiles\Sample1.pdf"))
{
fileBytes = new byte[stream.Length];
stream.Read(fileBytes, 0, fileBytes.Length);
stream.Close();
}
string retVal = System.Text.Encoding.Default.GetString(fileBytes); // fileBytes size is 209050
Code Snippet 2:
The client box, which requested the PDF file, receives the encoded string, converts it back to a PDF and saves it locally.
byte[] encodedBytes = System.Text.Encoding.Default.GetBytes(retVal); /// GETTING corrupted here
string pdfPath = @"C:\DemoPDF\Sample2.pdf";
using (FileStream fileStream = new FileStream(pdfPath, FileMode.Create)) //encodedBytes is 327279
{
fileStream.Write(encodedBytes, 0, encodedBytes.Length);
fileStream.Close();
}
The above code works absolutely fine on Framework 4.5 and 4.6.1.
When I use the same code in ASP.NET Core 2.0, it fails to convert to a byte array properly. I am not getting any runtime error, but the final PDF cannot be opened after it is created; it throws an error saying the PDF file is corrupted.
I tried Encoding.Unicode and Encoding.UTF8 as well, but I get the same error for the final PDF.
Also, I have noticed that when I use Encoding.Unicode, at least the original byte array and the result byte array are the same size; with other encodings even the byte counts don't match.
So, the question is: is System.Text.Encoding.Default.GetBytes broken in .NET Core 2.0?
I have edited my question for better understanding.
Sample1.pdf exists on a different server; WCF is used to transmit the data to the client, which stores the encoded stream and converts it back to Sample2.pdf.
Hopefully my question makes some sense now.
1: the number of times you should ever use Encoding.Default is essentially zero; there may be a hypothetical case, but if there is one: it is elusive
2: PDF files are not text, so trying to use an Encoding on them is just... wrong; you aren't "GETTING corrupted here" - it just isn't text.
You may wish to see Extracting text from PDFs in C# or Reading text from PDF in .NET
If you simply wish to copy the content without parsing it: File.Copy or Stream.CopyTo are good options.
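For example, if the client only needs to persist what it receives, a sketch along these lines copies the stream straight to disk with no Encoding involved (GetPdfStreamFromService is a hypothetical stand-in for however your WCF call hands you the data):
// Sketch: copy the incoming stream straight to a file - no text encoding anywhere.
using (Stream sourceStream = GetPdfStreamFromService())            // hypothetical service call
using (FileStream destination = File.Create(@"C:\DemoPDF\Sample2.pdf"))
{
    sourceStream.CopyTo(destination);
}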

How to Read .DSS format audio files into Byte array

In my application, I read .DSS format audio files into a byte array with the following code:
byte[] bt = File.ReadAllBytes(Filepath);
But I am unable to get the data into bytes, even though the file plays fine in the audio player.
How can I read these files into a byte array?
Here I am attaching a snapshot of what bt contains; it shows 255 for all bytes.
TIA
To check whether this is an issue with File.ReadAllBytes, try reading the file with a stream instead, like this:
using (var fileStream = new FileStream(FilePath, FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[fileStream.Length];
fileStream.Read(buffer, 0, (int) fileStream.Length);
// use buffer;
}
UPDATE: since that isn't working either, there may be an issue with your file. Check whether any process is locking or using it at the moment. Also, try opening the file in a hex editor and see whether there is really any meaningful data present. I'd also create a clean testing app/sandbox to verify.
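If you would rather check from code than open a hex editor, a quick sketch like this dumps the first few bytes so you can see whether anything other than 0xFF is in there (assuming FilePath points at your file):
// Sketch: print the first 32 bytes of the file in hex for inspection.
byte[] head = File.ReadAllBytes(FilePath);
for (int i = 0; i < Math.Min(32, head.Length); i++)
{
    Console.Write(head[i].ToString("X2") + " ");
}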
Well, the DSS format is copyrighted, and you'll likely not find a lot of information about it.
255 (0xFF) is commonly used in DSS files to indicate that a byte is not in use. You will see many of them in the header of the DSS file; later, in the audio part, they become sparser.
That means a value of 255 in the region of bytes 83-97 that you show does NOT mean that something went wrong.
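If you want to confirm that from code, a small sketch that counts how many bytes differ from 0xFF will tell you whether there is real audio data further into the file (assuming bt already holds the file contents):
// Sketch: count how many bytes are not the 0xFF filler value.
int nonFillerBytes = 0;
foreach (byte value in bt)
{
    if (value != 0xFF) nonFillerBytes++;
}
Console.WriteLine(nonFillerBytes + " of " + bt.Length + " bytes differ from 0xFF");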

GZipStream creates an invalid charset

I have a simple function to create a gzip file. This function works fine and passes the unit test. Then I hosted the generated file on Amazon S3.
But it produces some invalid characters when the input value contains a Unicode character.
e.g. アームバンド & ケース > 9ÎvøS‰
public static void CompressStringToFile(string fileName, string value)
{
// Use GZipStream to write compressed bytes to target file.
using (FileStream f2 = new FileStream(fileName, FileMode.Create))
using (GZipStream gz = new GZipStream(f2,CompressionMode.Compress, false))
{
byte[] b = Encoding.Unicode.GetBytes(value);
gz.Write(b, 0, b.Length);
gz.Flush();
}
}
The output of GZip compression isn't meant to be text. It's effectively arbitrary binary content, which you should only use to decompress it to the original binary content... which in your case is UTF-16-encoded text. You shouldn't expect to be able to read the gzip file as a text file.
GZip itself doesn't interpret the (binary) data that it's given - it just compresses it, so it can be faithfully decompressed later on. GZip couldn't care less whether it's text, an image, a sound file, whatever: it just does the best it can to compress it.
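To get the original string back you have to reverse both steps: decompress, then decode with the same encoding you compressed with. A minimal sketch (assuming the file was written by the CompressStringToFile method above):
// Sketch: decompress the gzip file and decode with the same encoding used when compressing.
public static string DecompressFileToString(string fileName)
{
    using (FileStream f = File.OpenRead(fileName))
    using (GZipStream gz = new GZipStream(f, CompressionMode.Decompress))
    using (MemoryStream ms = new MemoryStream())
    {
        gz.CopyTo(ms);
        return Encoding.Unicode.GetString(ms.ToArray());
    }
}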

zipping memory stream in silverlight

I'm using SLSharpZipLib to try to compress a byte[] before sending it over the network to a server. The byte[] contains JPEG data, which is already compressed by the JPEG encoder.
You may ask: if JPEG already compresses the image, why do I need to compress it more? Well, because I tried it and it worked.
Here is what happened:
I wrote the bytes in the byte[] to a txt file; the size of the txt file is ~5k. I compressed it with WinZip and the resulting file was ~2k, so that's about a 50% reduction in file size. However, when I try to do the same with the byte[] and use SLSharpZipLib to compress it, the reduction in size is minimal.
Here is the code I used:
MemoryStream msCompressed = new MemoryStream();
GZipOutputStream gzCompressed = new GZipOutputStream(msCompressed);
gzCompressed.SetLevel(9);
// allframes is a byte array.
gzCompressed.Write(allframes, 0, allframes.Length);
gzCompressed.Finish();
gzCompressed.IsStreamOwner = false;
gzCompressed.Close();
// i used byte[] compresseddata = msCompressed.ToArray() but i thought i'll try this too.
msCompressed.Seek(0, SeekOrigin.Begin);
byte[] compresseddata = new byte[msCompressed.Length];
msCompressed.Read(compresseddata, 0, compresseddata.Length);
==================================================================================
From debugging the code, I can see that the difference in size between allframes.Length and compresseddata.Length is minimal, but if that same data is written to a text file and zipped with WinZip, its size is reduced by 50%.
This is how I write the same data to a txt file:
TextWriter tw = new StreamWriter(MainPage.fs); // fs is a filestream.
foreach (byte b in allframes )
{
tw.Write(b);
}
===============================================================================
Am I doing something wrong? Am I misunderstanding something?
Thanks up front :)
You are not comparing like with like.
There is no point in compressing JPEG image data as it is compressed already. Writing it out to a text file won't give you the same file size as writing it to a binary file.
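To see why the text file compresses so well: TextWriter.Write on a byte writes its decimal digits (e.g. the three characters "255"), not the raw byte, so the text file is larger than the raw data and full of repetitive digit patterns that WinZip squeezes easily. A small sketch of the difference (file names are just examples):
// Sketch: a TextWriter stores the digits of each byte, a binary stream stores the byte itself.
using (var tw = new StreamWriter("frames.txt"))
{
    byte b = 255;
    tw.Write(b);        // appends the three characters "255"
}
using (var fs = File.Create("frames.bin"))
{
    fs.WriteByte(255);  // appends the single raw byte 0xFF
}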
Probably not; I would imagine WinZip has a superior compression implementation to SLSharpZipLib. You can try varying the compression level, but other than that, I would try different Silverlight-compatible zip libraries.
JPEG as you've correctly pointed out is already a highly compressed file type, so finding a compression algorithm that can find further redundancy is going to be difficult.
Best regards,

Reliable way to convert a file to a byte[]

I found the following code on the web:
private byte [] StreamFile(string filename)
{
FileStream fs = new FileStream(filename, FileMode.Open,FileAccess.Read);
// Create a byte array of file stream length
byte[] ImageData = new byte[fs.Length];
//Read block of bytes from stream into the byte array
fs.Read(ImageData,0,System.Convert.ToInt32(fs.Length));
//Close the File Stream
fs.Close();
return ImageData; //return the byte data
}
Is it reliable enough to use to convert a file to a byte[] in C#, or is there a better way to do this?
byte[] bytes = System.IO.File.ReadAllBytes(filename);
That should do the trick. ReadAllBytes opens the file, reads its contents into a new byte array, then closes it. Here's the MSDN page for that method.
byte[] bytes = File.ReadAllBytes(filename)
or ...
var bytes = File.ReadAllBytes(filename)
Not to repeat what everyone has already said, but keep the following cheat sheet handy for file manipulations:
System.IO.File.ReadAllBytes(filename);
File.Exists(filename)
Path.Combine(folderName, restOfThePath);
Path.GetFullPath(path); // converts a relative path to absolute one
Path.GetExtension(path);
All these answers with .ReadAllBytes(). Another, similar (I won't say duplicate, since they were trying to refactor their code) question was asked on SO here: Best way to read a large file into a byte array in C#?
A comment was made on one of the posts regarding .ReadAllBytes():
File.ReadAllBytes throws OutOfMemoryException with big files (tested with 630 MB file and it failed) – juanjo.arana Mar 13 '13 at 1:31
A better approach, to me, would be something like this, with BinaryReader:
public static byte[] FileToByteArray(string fileName)
{
byte[] fileData = null;
using (FileStream fs = File.OpenRead(fileName))
{
var binaryReader = new BinaryReader(fs);
fileData = binaryReader.ReadBytes((int)fs.Length);
}
return fileData;
}
But that's just me...
Of course, this all assumes you have the memory to handle the byte[] once it is read in, and I didn't put in the File.Exists check to ensure the file is there before proceeding, as you'd do that before calling this code.
Looks good enough as a generic version. You can modify it to meet your needs, if they're specific enough.
Also test for exceptions and error conditions, such as the file not existing or not being readable, etc.
You can also do the following to save some space:
byte[] bytes = System.IO.File.ReadAllBytes(filename);
Others have noted that you can use the built-in File.ReadAllBytes. The built-in method is fine, but it's worth noting that the code you post above is fragile for two reasons:
Stream is IDisposable - you should place the FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read) initialization in a using statement to ensure the file is closed. Failing to do this may mean that the stream remains open if a failure occurs, which will mean the file remains locked - and that can cause other problems later on.
fs.Read may read fewer bytes than you request. In general, the .Read method of a Stream instance will read at least one byte, but not necessarily all bytes you ask for. You'll need to write a loop that retries reading until all bytes are read. This page explains this in more detail.
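A read loop along these lines is the usual pattern for that (a sketch, not taken from the linked page):
// Sketch: keep reading until every byte has been consumed, or fail if the stream ends early.
public static byte[] ReadFully(string filename)
{
    using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        byte[] data = new byte[fs.Length];
        int offset = 0;
        while (offset < data.Length)
        {
            int read = fs.Read(data, offset, data.Length - offset);
            if (read == 0)
                throw new EndOfStreamException("File ended before all bytes were read.");
            offset += read;
        }
        return data;
    }
}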
string filePath = @"D:\MiUnidad\testFile.pdf";
byte[] bytes = await System.IO.File.ReadAllBytesAsync(filePath);
