I have to read image binary data from a database and save it as a TIFF image on the filesystem. I was using the following code:
private static bool SavePatientChartImageFileStream(byte[] ImageBytes, string ImageFilePath, string IMAGE_NAME)
{
bool success = false;
try
{
using (FileStream str = new FileStream(Path.Combine(ImageFilePath, IMAGE_NAME), FileMode.Create))
{
str.Write(ImageBytes, 0, Convert.ToInt32(ImageBytes.Length));
success = true;
}
}
catch (Exception ex)
{
success = false;
}
return success;
}
Since these image binaries are transferred through merge replication, it sometimes happens that an image binary has not been completely transferred when we request it with a NOLOCK hint. This results in ImageBytes containing only 1 byte of data, which is saved as a 0 KB corrupted TIFF image.
I have changed the above code to:
private static bool SavePatientChartImage(byte[] ImageBytes, string ImageFilePath, string IMAGE_NAME)
{
bool success = false;
System.Drawing.Image newImage;
try
{
using (MemoryStream stream = new MemoryStream(ImageBytes))
{
using (newImage = System.Drawing.Image.FromStream(stream))
{
newImage.Save(Path.Combine(ImageFilePath, IMAGE_NAME));
success = true;
}
}
}
catch (Exception ex)
{
success = false;
}
return success;
}
In this case if ImageBytes is of 1 byte or incomplete, it won't save image and will return success as false.
I cannot remove NOLOCK as we are having extreme locking.
The second code is slower than the first one. I tried it with 500 images, and there was a difference of 5 seconds.
I don't understand the difference between these two pieces of code, or which one to use when. Please help me understand.
In the first version of the code, you are essentially taking a bunch of bytes and writing them to the filesystem. There's no verification of a valid TIFF file because the code neither knows nor cares it's a TIFF file. It's just a bunch of bytes without any business logic attached.
In the second code, you're taking the bytes, wrapping them in a MemoryStream, and then feeding them into an Image object, which parses the entire file and reads it as a TIFF file. This gives you the validation you need - it can tell when the data is invalid - but you're essentially going over the entire file twice, once to read it in (with additional overhead for parsing) and once to write it to disk.
Assuming you don't need any validation that requires deep parsing of the image file (# of colors, image dimensions, etc) you can skip this overhead by simply checking if the byte[] ImageBytes is of length 1 (or find any other good indicator of corrupt data) and skip writing if it doesn't match. In effect, do your own validation, rather than using the Image class as a validator.
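The "do your own validation" idea could look something like the sketch below. It assumes the data is a classic TIFF, which starts with the byte-order mark "II" or "MM" followed by the number 42 in that byte order. TiffSaver, LooksLikeTiff and TrySaveTiff are hypothetical names, not part of the original code. Note that a header check like this only rejects data that is too short or clearly not a TIFF; it cannot tell whether the tail of a partially replicated image is missing.

```csharp
using System.IO;

static class TiffSaver
{
    // Cheap sanity check: a classic TIFF starts with "II*\0" (little-endian)
    // or "MM\0*" (big-endian). This does NOT prove the rest of the file is intact.
    public static bool LooksLikeTiff(byte[] bytes)
    {
        // The TIFF header alone is 8 bytes (byte order, magic number, first IFD offset).
        if (bytes == null || bytes.Length < 8) return false;

        bool littleEndian = bytes[0] == 0x49 && bytes[1] == 0x49 && bytes[2] == 0x2A && bytes[3] == 0x00;
        bool bigEndian = bytes[0] == 0x4D && bytes[1] == 0x4D && bytes[2] == 0x00 && bytes[3] == 0x2A;
        return littleEndian || bigEndian;
    }

    // Writes the bytes to disk only if they pass the header check.
    public static bool TrySaveTiff(byte[] bytes, string folder, string fileName)
    {
        if (!LooksLikeTiff(bytes)) return false;
        File.WriteAllBytes(Path.Combine(folder, fileName), bytes);
        return true;
    }
}
```

This keeps the speed of the first version (a single write, no image parsing) while still refusing the 1-byte case from the question.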
I think the main difference between the two is that in the second code you are writing the source byte[] to a MemoryStream object first, which means that the data becomes essentially independent of the database. So, you could potentially incorporate this MemoryStream into the first code to achieve the same results.
Hello
I've been working on a terminal-like application to get better at programming in C#, just something to help me learn. I've decided to add a feature that will copy a file exactly as it is to a new file. It seems to work almost perfectly. When opened in Notepad++, the files are only a few lines apart in length, and very, very close to the same in actual file size. However, the duplicated copy of the file never runs. It says the file is corrupt. I have a feeling the problem is in the methods I created for reading and rewriting binary data. The code is as follows; thanks for the help. Sorry for the spaghetti code too, I get a bit sloppy when I'm messing around with new ideas.
Class that handles the file copying/writing
using System;
using System.IO;
//using System.Collections.Generic;
namespace ConsoleFileExplorer
{
class FileTransfer
{
private BinaryWriter writer;
private BinaryReader reader;
private FileStream fsc; // file to be duplicated
private FileStream fsn; // new location of file
int[] fileData;
private string _file;
public FileTransfer(String file)
{
_file = file;
fsc = new FileStream(file, FileMode.Open);
reader = new BinaryReader(fsc);
}
// Reads all the original files data to an array of bytes
public byte[] ReadAllDataToArray()
{
byte[] bytes = reader.ReadBytes((int)fsc.Length); // reading bytes from the original file
return bytes;
}
// writes the array of original byte data to a new file
public void WriteDataFromArray(byte[] fileData, string path) // got a feeling this is the problem :p
{
fsn = new FileStream(path, FileMode.Create);
writer = new BinaryWriter(fsn);
int i = 0;
while(i < fileData.Length)
{
writer.Write(fileData[i]);
i++;
}
}
}
}
Code that interacts with this class.
(Sleep(5000) is there because I was expecting an error on the first attempt...)
case '3':
Console.Write("Enter source file: ");
string sourceFile = Console.ReadLine();
if (sourceFile == "")
{
Console.Clear();
Console.ForegroundColor = ConsoleColor.DarkRed;
Console.Error.WriteLine("Must input a proper file path.\n");
Console.ForegroundColor = ConsoleColor.White;
Menu();
} else {
Console.WriteLine("Copying Data"); System.Threading.Thread.Sleep(5000);
FileTransfer trans = new FileTransfer(sourceFile);
//copying the original files data
byte[] data = trans.ReadAllDataToArray();
Console.Write("Enter Location to store data: ");
string newPath = Console.ReadLine();
// Just for me to make sure it doesnt exit if i forget
if(newPath == "")
{
Console.Clear();
Console.ForegroundColor = ConsoleColor.DarkRed;
Console.Error.WriteLine("Cannot have empty path.");
Console.ForegroundColor = ConsoleColor.White;
Menu();
} else
{
Console.WriteLine("Writing data to file"); System.Threading.Thread.Sleep(5000);
trans.WriteDataFromArray(data, newPath);
Console.WriteLine("File stored.");
Console.ReadLine();
Console.Clear();
Menu();
}
}
break;
File compared to new file: [screenshots of the original file and the new file]
You're not properly disposing the file streams and the binary writer. Both tend to buffer data (which is a good thing, especially when you're writing one byte at a time). Use using, and your problem should disappear. Unless somebody is editing the file while you're reading it, of course.
BinaryReader and BinaryWriter do not just write "raw data". They also add metadata as needed - they're designed for serialization and deserialization, rather than reading and writing bytes. Now, in the particular case of ReadBytes and Write(byte[]), those are really just raw bytes; but there's not much point in using these classes just for that. Reading and writing bytes is the thing every Stream gives you - and that includes FileStreams. There's no reason to use BinaryReader/BinaryWriter here whatsoever - the file streams give you everything you need.
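To make the "metadata" point concrete, here is a small sketch (BinaryWriterDemo is a hypothetical name) that captures what BinaryWriter emits for a string versus a byte array. The string overload writes a length prefix before the characters, while the byte[] overload writes the bytes unchanged.

```csharp
using System.IO;

static class BinaryWriterDemo
{
    // Returns the bytes BinaryWriter produces for a string:
    // a length prefix followed by the encoded characters.
    public static byte[] WriteString(string text)
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = new BinaryWriter(ms))
            {
                writer.Write(text);
            }
            return ms.ToArray(); // ToArray still works after the stream is closed
        }
    }

    // Returns the bytes BinaryWriter produces for a byte array: the same bytes, unchanged.
    public static byte[] WriteBytes(byte[] data)
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = new BinaryWriter(ms))
            {
                writer.Write(data);
            }
            return ms.ToArray();
        }
    }
}
```

For example, WriteString("AB") yields the three bytes 2, 65, 66 (length prefix, 'A', 'B'), whereas WriteBytes(new byte[] { 65, 66 }) yields just 65, 66.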
A better approach would be to simply use
using (var fsn = ...)
{
fsn.Write(fileData, 0, fileData.Length);
}
or even just
File.WriteAllBytes(fileName, fileData);
Maybe you're thinking that writing a byte at a time is closer to "the metal", but that simply isn't the case. At no point during this does the CPU pass a byte at a time to the hard drive. Instead, the hard drive copies data directly from RAM, with no intervention from the CPU. And most hard drives still can't write (or read) arbitrary amounts of data from the physical media - instead, you're reading and writing whole sectors. If the system really did write a byte at a time, you'd just keep rewriting the same sector over and over again, just to write one more byte.
An even better approach would be to use the fact that you've got file streams open, and stream the files from source to destination rather than first reading everything into memory, and then writing it back to disk.
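A minimal sketch of that streaming approach, assuming .NET 4 or later where Stream.CopyTo is available (StreamingCopy and Copy are hypothetical names):

```csharp
using System.IO;

static class StreamingCopy
{
    // Copies a file by streaming it from source to destination in chunks,
    // so the whole file never has to fit in memory at once.
    public static void Copy(string sourcePath, string destinationPath)
    {
        using (var source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (var destination = new FileStream(destinationPath, FileMode.Create, FileAccess.Write))
        {
            source.CopyTo(destination);
        } // disposing the streams flushes any buffered data to disk
    }
}
```

The using blocks are what guarantee the destination is flushed and closed, which is the part the original code was missing.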
There is a File.Copy() method in C#; see https://msdn.microsoft.com/ru-ru/library/c6cfw35a(v=vs.110).aspx
If you want to implement it yourself, try placing a breakpoint inside your methods and stepping through with the debugger. It is like the story of the fisherman who is given a rod rather than a fish: the point is to learn how to catch the fish yourself.
Also, look at your int[] fileData field and the byte[] fileData parameter of the last method; maybe this is the problem.
I have an SSIS script task, written in C#, that zips a file.
I have a problem when zipping a file of approximately 1 GB.
I tried to implement the code below and still get a 'System.OutOfMemoryException' error:
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at ST_4cb59661fb81431abcf503766697a1db.ScriptMain.AddFileToZipUsingStream(String sZipFile, String sFilePath, String sFileName, String sBackupFolder, String sPrefixFolder) in c:\Users\dtmp857\AppData\Local\Temp\vsta\84bef43d323b439ba25df47c365b5a29\ScriptMain.cs:line 333
at ST_4cb59661fb81431abcf503766697a1db.ScriptMain.Main() in c:\Users\dtmp857\AppData\Local\Temp\vsta\84bef43d323b439ba25df47c365b5a29\ScriptMain.cs:line 131
This is the snippet of code when zipping file:
protected bool AddFileToZipUsingStream(string sZipFile, string sFilePath, string sFileName, string sBackupFolder, string sPrefixFolder)
{
bool bIsSuccess = false;
try
{
if (File.Exists(sZipFile))
{
using (ZipArchive addFile = ZipFile.Open(sZipFile, ZipArchiveMode.Update))
{
addFile.CreateEntryFromFile(sFilePath, sFileName);
//Move File after zipping it
BackupFile(sFilePath, sBackupFolder, sPrefixFolder);
}
}
else
{
//from https://stackoverflow.com/questions/28360775/adding-large-files-to-io-compression-ziparchiveentry-throws-outofmemoryexception
using (var zipFile = ZipFile.Open(sZipFile, ZipArchiveMode.Update))
{
var zipEntry = zipFile.CreateEntry(sFileName);
using (var writer = new BinaryWriter(zipEntry.Open()))
using (FileStream fs = File.Open(sFilePath, FileMode.Open))
{
var buffer = new byte[16 * 1024];
using (var data = new BinaryReader(fs))
{
int read;
while ((read = data.Read(buffer, 0, buffer.Length)) > 0)
writer.Write(buffer, 0, read);
}
}
}
//Move File after zipping it
BackupFile(sFilePath, sBackupFolder, sPrefixFolder);
}
bIsSuccess = true;
}
catch (Exception ex)
{
throw ex;
}
return bIsSuccess;
}
What am I missing? Please suggest a fix, or point me to a tutorial or best practice for handling this problem.
I know this is an old post but what can I say, it helped me sort out some stuff and still comes up as a top hit on Google.
So there is definitely something wrong with the System.IO.Compression library!
First and Foremost...
You must make sure to turn off "Prefer 32-bit". Having this set (in my case with a build for "AnyCPU") causes so many inconsistent issues.
Now with that said, I took some demo files (several less than 500MB, one at 500MB, and one at 1GB), and created a sample program with 3 buttons that made use of the 3 methods.
Button 1 - ZipFile.CreateFromDirectory(AbsolutePath, TargetFile);
Button 2 - ZipArchive.CreateEntryFromFile(AbsolutePath, RelativePath);
Button 3 - Using the [16 * 1024] Byte Buffer method from above
Now here is where it gets interesting. (Assuming that the program is built as "AnyCPU" and with "Prefer 32-bit" unchecked)... all 3 methods worked on a Windows 64-bit OS, regardless of how much memory it had.
However, as soon as I ran the same test on a 32-bit OS, regardless of how much memory it had, ONLY method 1 worked!
Methods 2 and 3 blew up with the OutOfMemoryException AND, to add insult to injury, method 3 (the preferred method of chunking) actually corrupted more files than method 2!
By corrupted, I mean that of my files, the 500MB and the 1GB file ended up in the zipped archive but at a size less than the original (they were basically truncated).
So I dunno... since there are not many 32-bit OS around anymore, I guess maybe it is a moot point.
But seems like there are some bugs in the System.IO.Compression Framework!
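For comparison, here is a sketch of the chunked approach that opens the archive in ZipArchiveMode.Create instead of ZipArchiveMode.Update. The documentation for Update mode notes that the content of the archive is held in memory, so Create mode may avoid the large allocation when the zip file does not exist yet. Whether this resolves the failures described above in a 32-bit process is an assumption, not something tested here. ZipHelper and CreateZipStreaming are hypothetical names.

```csharp
using System.IO;
using System.IO.Compression;

static class ZipHelper
{
    // Creates a new zip at zipPath containing one entry streamed from filePath.
    // Create mode writes entries straight through to the underlying stream.
    public static void CreateZipStreaming(string zipPath, string filePath, string entryName)
    {
        using (var zipStream = new FileStream(zipPath, FileMode.Create))
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Create))
        using (var source = File.OpenRead(filePath))
        {
            ZipArchiveEntry entry = archive.CreateEntry(entryName);
            using (Stream target = entry.Open())
            {
                source.CopyTo(target, 16 * 1024); // copy in 16 KB chunks
            }
        }
    }
}
```

This only covers the "zip does not exist yet" branch; adding to an existing archive still requires Update mode.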
How do I convert a byte array to an image and open it with some process (such as Windows Photo Viewer)?
I don't want to convert the array data to an image file and save it to disk. What I would like to do is convert the byte array to a memory stream or something similar, and use that to open the image.
Is it possible? (I don't want to show it in a picture box or anything like that.)
You will not be able to open it in an external viewer unless it's a file. However, if you don't care about that file, use a temporary one:
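// Requires: using System.Diagnostics; using System.Drawing; using System.Drawing.Imaging; using System.IO;
public void ViewImage(Byte[] ImageBytes)
{
    try
    {
        // Wrap the caller's bytes in a stream so Image can decode them
        using (MemoryStream ms = new MemoryStream(ImageBytes))
        using (Image img = Image.FromStream(ms))
        {
            String tmpFile = Path.GetTempFileName();
            tmpFile = Path.ChangeExtension(tmpFile, "jpg");
            img.Save(tmpFile, ImageFormat.Jpeg); // encode explicitly as JPEG to match the extension
            if (File.Exists(tmpFile))
                Process.Start(tmpFile); //Windows will use file association to open a viewer
        }
    }
    catch (ArgumentException)
    {
        // Image.FromStream throws ArgumentException when the bytes are not a valid image
        //React appropriately
    }
}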
Since this forces saving the image as a JPG, if the type of the original image is important, more logic should be added to deal with that fact.
Using C# how can I test a file is a jpeg? Should I check for a .jpg extension?
Thanks
Several options:
You can check for the file extension:
static bool HasJpegExtension(string filename)
{
// add other possible extensions here
return Path.GetExtension(filename).Equals(".jpg", StringComparison.InvariantCultureIgnoreCase)
|| Path.GetExtension(filename).Equals(".jpeg", StringComparison.InvariantCultureIgnoreCase);
}
or check for the correct magic number in the header of the file:
static bool HasJpegHeader(string filename)
{
using (BinaryReader br = new BinaryReader(File.Open(filename, FileMode.Open, FileAccess.Read)))
{
UInt16 soi = br.ReadUInt16(); // Start of Image (SOI) marker (FFD8)
UInt16 marker = br.ReadUInt16(); // JFIF marker (FFE0) or EXIF marker(FFE1)
return soi == 0xd8ff && (marker & 0xe0ff) == 0xe0ff;
}
}
Another option would be to load the image and check for the correct type. However, this is less efficient (unless you are going to load the image anyway) but will probably give you the most reliable result (Be aware of the additional cost of loading and decompression as well as possible exception handling):
static bool IsJpegImage(string filename)
{
try
{
using (System.Drawing.Image img = System.Drawing.Image.FromFile(filename))
{
// Two image formats can be compared using the Equals method
// See http://msdn.microsoft.com/en-us/library/system.drawing.imaging.imageformat.aspx
//
return img.RawFormat.Equals(System.Drawing.Imaging.ImageFormat.Jpeg);
}
}
catch (OutOfMemoryException)
{
// Image.FromFile throws an OutOfMemoryException
// if the file does not have a valid image format or
// GDI+ does not support the pixel format of the file.
//
return false;
}
}
Open the file as a stream and look for the magic number for JPEG.
JPEG image files begin with FF D8 and end with FF D9. JPEG/JFIF files contain the ASCII code for 'JFIF' (4A 46 49 46) as a null terminated string. JPEG/Exif files contain the ASCII code for 'Exif' (45 78 69 66) also as a null terminated string.
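As a rough sketch of that check on data already in memory (JpegSignature and HasJpegMarkers are hypothetical names): it only looks at the first and last two bytes, so it can rule a file out but cannot prove the content in between is a valid JPEG, and files with extra bytes appended after FF D9 would be rejected.

```csharp
static class JpegSignature
{
    // True if the data starts with the SOI marker (FF D8)
    // and ends with the EOI marker (FF D9).
    public static bool HasJpegMarkers(byte[] data)
    {
        if (data == null || data.Length < 4) return false;

        bool startsWithSoi = data[0] == 0xFF && data[1] == 0xD8;
        bool endsWithEoi = data[data.Length - 2] == 0xFF && data[data.Length - 1] == 0xD9;
        return startsWithSoi && endsWithEoi;
    }
}
```

For a file on disk, the same test could be fed with File.ReadAllBytes, or by seeking to the last two bytes of a FileStream to avoid reading the whole file.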
OMG, so many of these code examples are wrong, wrong, wrong.
EXIF files have a marker of 0xffe1, JFIF files have a marker of 0xffe0. So all code that relies on 0xffe0 to detect a JPEG file will miss all EXIF files.
Here's a version that will detect both, and can easily be altered to return only for JFIF or only for EXIF. (Useful when trying to recover your iPhone pictures, for example).
public static bool HasJpegHeader(string filename)
{
    try
    {
        // 0000000: ffd8 ffe0 0010 4a46 4946 0001 0101 0048 ......JFIF.....H
        // 0000000: ffd8 ffe1 14f8 4578 6966 0000 4d4d 002a ......Exif..MM.*
        // Read-only access is enough for inspecting the header
        using (BinaryReader br = new BinaryReader(File.Open(filename, FileMode.Open, FileAccess.Read)))
        {
            // BinaryReader reads little-endian, so the constants below are byte-swapped
            UInt16 soi = br.ReadUInt16(); // Start of Image (SOI) marker (FFD8)
            UInt16 marker = br.ReadUInt16(); // JFIF marker (FFE0) EXIF marker (FFE1)
            UInt16 markerSize = br.ReadUInt16(); // size of marker data (incl. the size field itself)
            UInt32 four = br.ReadUInt32(); // JFIF 0x4649464a or Exif 0x66697845
            Boolean isJpeg = soi == 0xd8ff && (marker & 0xe0ff) == 0xe0ff;
            Boolean isExif = isJpeg && four == 0x66697845;
            Boolean isJfif = isJpeg && four == 0x4649464a;
            if (isJpeg)
            {
                if (isExif)
                    Console.WriteLine("EXIF: {0}", filename);
                else if (isJfif)
                    Console.WriteLine("JFIF: {0}", filename);
                else
                    Console.WriteLine("JPEG: {0}", filename);
            }
            // Return isJfif or isExif instead to detect only one of the two variants
            return isJpeg;
        }
    }
    catch
    {
        // Unreadable or too-short files are treated as "not a JPEG"
        return false;
    }
}
You could try loading the file into an Image and then check the format
Image img = Image.FromFile(filePath);
bool isJpeg = img.RawFormat.Equals(ImageFormat.Jpeg);
img.Dispose(); // release the file handle held by the Image
Alternatively you could open the file and check the header to get the type
You could find documentation on the jpeg file format, specifically the header information. Then try to read this information from the file and compare it to the expected jpeg header bytes.
Read the header bytes. This article contains info on several common image formats, including JPEG:
Using Image File Headers To Verify Image Format
JPEG Header Information
Once you have the extension you could use a regular expression to validate it.
^.*\.(jpg|JPG)$
This will loop through each file in the current directory and will output if any found files with JPG or JPEG extension are Jpeg images.
foreach (FileInfo f in new DirectoryInfo(".").GetFiles())
{
    if (f.Extension.ToUpperInvariant() == ".JPG"
        || f.Extension.ToUpperInvariant() == ".JPEG")
    {
        using (Image image = Image.FromFile(f.FullName))
        {
            // RawFormat returns a new ImageFormat instance, so compare with Equals, not ==
            if (image.RawFormat.Equals(ImageFormat.Jpeg))
            {
                Console.WriteLine(f.FullName + " is a Jpeg image");
            }
        }
    }
}
Depending on the context in which you're looking at this file, you need to remember that you can't open the file until the user tells you to open it.
(The link is to a Raymond Chen blog entry.)
The code here:
http://mark.michaelis.net/Blog/RetrievingMetaDataFromJPEGFilesUsingC.aspx
Shows you how to get the Meta Data. I guess that would throw an exception if your image wasn't a valid JPEG.
Checking the file extension is not enough as the filename might be lying.
A quick and dirty way is to try and load the image using the Image class and catching any exceptions:
Image image = Image.FromFile(@"c:\temp\test.jpg");
This isn't ideal as you could get any kind of exception, such as OutOfMemoryException, FileNotFoundException, etc. etc.
The most thorough way is to treat the file as binary and ensure the header matches the JPG format. I'm sure it's described somewhere.
The best way would be to try to create an image from it using the Drawing.Bitmap(string) constructor and see if it fails to do so or throws an exception. The problem with some of the answers is this: firstly, the extension is purely arbitrary, it could be jpg, jpeg, jpe, bob, tim, whatever. Secondly, just using the header isn't enough to be 100% sure. It can definitely determine that a file isn't a jpeg but can't guarantee that a file is a jpeg, since an arbitrary binary file could have the same byte sequence at the start.
Just take the media type of file and verify:
private bool isJpeg()
{
    string p = currFile.Headers.ContentType.MediaType;
    // Accept only the JPEG media types; "image/png" would be a different format
    return p.Equals("image/jpeg", StringComparison.OrdinalIgnoreCase) || p.Equals("image/pjpeg", StringComparison.OrdinalIgnoreCase);
}
After checking the extension of the file, read the first four bytes of the image and the last two bytes, as shown here. The last two bytes should have the values 255, 217.
Other file types can be checked in the same way:
Validate image from file in C#
http://www.garykessler.net/library/file_sigs.html
// after checking the extension of the file
byte[] ValidFileSignture = new byte[] { 255, 216, 255, 224 }; // FF D8 FF E0
byte[] bufferforCheck = new byte[ValidFileSignture.Length];
Stream _inputStream = file.InputStream;
_inputStream.Read(bufferforCheck, 0, ValidFileSignture.Length);
if (Enumerable.SequenceEqual(bufferforCheck, ValidFileSignture))
{
    //file OK
}
System.Web.MimeMapping.GetMimeMapping(filename).StartsWith("image/");
MimeMapping.GetMimeMapping produces these results:
file.jpg: image/jpeg
file.gif: image/gif
file.jpeg: image/jpeg
file.png: image/png
file.bmp: image/bmp
file.tiff: image/tiff
file.svg: application/octet-stream
You can use the Path.GetExtension Method.
Is there a way to know how many bytes of a stream have been used by StreamReader?
I have a project where we need to read a file that has a text header followed by the start of the binary data. My initial attempt to read this file was something like this:
private long _dataOffset;
void ReadHeader(string path)
{
    using (FileStream stream = File.OpenRead(path))
    {
        StreamReader textReader = new StreamReader(stream);
        string line; // declared outside the loop so the while condition can see it
        do
        {
            line = textReader.ReadLine();
            handleHeaderLine(line);
        } while (line != "DATA"); // Yes, they used "DATA" to mark the end of the header
        _dataOffset = stream.Position;
    }
}
private byte[] ReadDataFrame(string path, int frameNum)
{
using (FileStream stream = File.OpenRead(path))
{
stream.Seek(_dataOffset + frameNum * cbFrame, SeekOrigin.Begin);
byte[] data = new byte[cbFrame];
stream.Read(data, 0, cbFrame);
return data;
}
return null;
}
The problem is that when I set _dataOffset to stream.Position, I get the position that the StreamReader has read to, not the end of the header. As soon as I thought about it this made sense, but I still need to be able to know where the end of the header is and I'm not sure if there's a way to do it and still take advantage of StreamReader.
You can find out how many bytes the StreamReader has actually returned (as opposed to read from the stream) in a number of ways, none of them too straightforward I'm afraid.
1. Call textReader.CurrentEncoding.GetByteCount on all of the text read so far (adding the bytes for the line terminators that ReadLine strips) and then seek to this position in the stream.
2. Use some reflection hackery to retrieve the value of the private variable of the StreamReader object that corresponds to the current byte position within the internal buffer (different from the position of the stream itself: usually behind it, and never ahead of it, of course). Judging by .NET Reflector, this variable seems to be named bytePos.
3. Don't bother using a StreamReader at all but instead implement your custom ReadLine function built on top of the Stream or even BinaryReader (BinaryReader is guaranteed never to read further ahead than what you request). This custom function must read from the stream char by char, so you'd actually have to use the low-level Decoder object (unless the encoding is ASCII/ANSI, in which case things are a bit simpler due to single-byte encoding).
Option 1 is going to be the least efficient I would imagine (since you're effectively re-encoding text you just decoded), and option 3 the hardest to implement, though perhaps the most elegant. I'd probably recommend against using the ugly reflection hack (option 2), even though it looks tempting, being the most direct solution and only taking a couple of lines. (To be quite honest, the StreamReader class really ought to expose this variable via a public property, but alas it does not.) So in the end, it's up to you, but either method 1 or 3 should do the job nicely enough...
Hope that helps.
So the data is utf8 (the default encoding for StreamReader). This is a multibyte encoding, so IndexOf would be inadvisable. You could:
Encoding.UTF8.GetByteCount(string)
on your data so far, adding 1 or 2 bytes for the missing line ending.
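A sketch of that byte-counting idea is below. It relies on assumptions that may not hold for the real file: the header is UTF-8 without a byte-order mark, and every header line ends with "\r\n" (so 2 bytes are added per line). HeaderOffset and FindDataOffset are hypothetical names.

```csharp
using System.IO;
using System.Text;

static class HeaderOffset
{
    // Reads header lines up to and including the "DATA" line and returns the
    // byte offset at which the binary data starts.
    // Assumptions: UTF-8 text, no byte-order mark, every line ends with "\r\n".
    public static long FindDataOffset(Stream stream)
    {
        long offset = 0;
        // leaveOpen: true so the caller can keep using the stream afterwards
        using (var reader = new StreamReader(stream, new UTF8Encoding(false), false, 1024, true))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // ReadLine strips the terminator, so add its 2 bytes back
                offset += Encoding.UTF8.GetByteCount(line) + 2;
                if (line == "DATA") return offset;
            }
        }
        throw new InvalidDataException("No DATA line found in header.");
    }
}
```

The StreamReader will have buffered past the header, so the stream's own Position is not meaningful afterwards; set stream.Position to the returned offset before reading the binary data.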
If you need to count bytes, I'd go with the BinaryReader. You can take the results and cast them about as needed, but I find its idea of its current position to be more reliable (in that since it reads in binary, it's immune to character-set problems).
So your last line contains 'DATA' + an unknown amount of data bytes. You could extract the position by using IndexOf() with your last read line. Then readjust the stream.Position.
But I am not sure if you should use ReadLine() at all in this case. Maybe it would be better to read byte by byte until you reach the 'DATA' mark.
The line breaks are easily identifiable without needing to decode the stream first (except for some encodings rarely used for text files like EBCDIC, UTF-16, UTF-32), so you can just read each line as bytes and then decode the entire line:
using (FileStream stream = File.OpenRead(path)) {
List<byte> buffer = new List<byte>();
bool hasCr = false;
bool done = false;
while (!done) {
int b = stream.ReadByte();
if (b == -1) throw new IOException("End of file reached in header.");
if (b == 13) {
hasCr = true;
} else if (b == 10 && hasCr) {
string line = Encoding.UTF8.GetString(buffer.ToArray(), 0, buffer.Count);
if (line == "DATA") {
done = true;
} else {
HandleHeaderLine(line);
}
buffer.Clear();
hasCr = false;
} else {
if (hasCr) buffer.Add(13);
hasCr = false;
buffer.Add((byte)b);
}
}
_dataOffset = stream.Position;
}
Instead of closing the stream and opening it again, you could of course just keep on reading the data.