CSVReader throwing error at the last row last column - c#

I am using CSVReader in my project for reading .csv files in a Windows service.
We pass the .csv file to CSVReader for processing, and it was working fine.
But recently we decided to save the CSV file to a database table and read it from there.
When a user submits a CSV file for processing, our aspx page reads the file,
converts it to a byte array, and saves it to the database.
The service then reads the table, picks up the byte array, and converts it to a file stream.
This stream is passed to CSVReader for further work.
Now it throws an error for the last row, last column.
It's happening only after saving to and reading from the database.
I am getting the following error and have no idea how to fix it.
"The CSV appears to be corrupt near record '9' field '3 at position '494'. Current raw data : '
Here is the code.
Converting the file to a byte array:
try
{
fs = new FileStream(filepath, FileMode.Open);
fsbuffer = new byte[fs.Length];
fs.Read(fsbuffer, 0, (int)fs.Length);
}
Reading from the database into a byte array:
myobject.FileByteArray = ObjectToByteArray(row);
public byte[] ObjectToByteArray(DataRow row)
{
if (row["fileBytearray"] == null)
return null;
try
{
BinaryFormatter bf = new BinaryFormatter();
System.IO.MemoryStream ms = new System.IO.MemoryStream();
bf.Serialize(ms, row["fileBytearray"]);
return ms.ToArray();
}
}
Stream fileStream = new MemoryStream(myobject.FileByteArray);
using (CsvReader csv =
new CsvReader(new StreamReader(fileStream, System.Text.Encoding.UTF7), hasHeader, ','))

I haven't been able to recreate your issue, so, without seeing the save and load from beginning to end, I'll suggest trying the following for retrieval:
public MemoryStream LoadReportData(int rowId)
{
MemoryStream stream = new MemoryStream();
using (BinaryWriter writer = new BinaryWriter(stream, System.Text.Encoding.UTF8, true)) // leaveOpen: true, so disposing the writer doesn't close the stream we return (this overload needs .NET 4.5+)
{
using (DbConnection connection = db.CreateConnection())
{
DbCommand selectCommand = connection.CreateCommand();
selectCommand.CommandText = "SELECT CSVData FROM YourTable WHERE Id = @rowId";
db.AddInParameter(selectCommand, "@rowId", DbType.Int32, rowId);
connection.Open();
using (IDataReader reader = selectCommand.ExecuteReader(CommandBehavior.SequentialAccess))
{
while (reader.Read())
{
int startIndex = 0;
int bufferSize = 8192;
byte[] buffer = new byte[bufferSize];
long retVal = reader.GetBytes(0, startIndex, buffer, 0, bufferSize);
while (retVal == bufferSize)
{
writer.Write(buffer);
writer.Flush();
startIndex += bufferSize;
retVal = reader.GetBytes(0, startIndex, buffer, 0, bufferSize);
}
writer.Write(buffer, 0, (int)retVal);
}
}
}
}
stream.Position = 0; // rewind so the caller can read from the beginning
return stream;
}
You'll need to replace the selectCommand SQL and parameters with whatever SQL you are using to return the data.
This method uses a data reader to sequentially read bytes from the appropriate row (identified by rowId in my example) and column (called CSVData in my example), which should avoid any truncation issues on the way out (it could be that your DataTable object is only returning the first n bytes). The MemoryStream object could be used to re-save the CSV file to the file system for testing, or fed straight into your CSVReader.
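If it helps, here's a hedged sketch of how the returned stream might be fed to the reader; the temp path is an assumption for illustration, and hasHeader is the flag from your own snippet:

    using (MemoryStream csvStream = LoadReportData(rowId))
    {
        // Optionally dump the bytes to disk so you can diff them against the original file.
        File.WriteAllBytes(@"C:\temp\roundtrip.csv", csvStream.ToArray());

        csvStream.Position = 0; // rewind before handing the stream to the reader
        using (CsvReader csv = new CsvReader(new StreamReader(csvStream), hasHeader, ','))
        {
            // process records exactly as before
        }
    }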
If you can post your Save method (where you actually persist the data to the database) then we can check that too to make sure that truncation isn't happening there, either.
One other suggestion I can make right now involves your loading the file into a byte array. If you are going to load the file in one go, then you can simply replace:
try
{
fs = new FileStream(filepath, FileMode.Open);
fsbuffer = new byte[fs.Length];
fs.Read(fsbuffer, 0, (int)fs.Length);
}
with
byte[] fileBytes = File.ReadAllBytes(filepath);

Related

Error decompressing gzipstream -- The magic number in GZip header is not correct

I'm using the C# System.IO classes (framework 4.0) to compress an image pulled off the file system with FileDialog and then inserted into a SQL Server database as a varbinary(max) data type. The problem I'm having is that when I pull the data out of the database and attempt to decompress it, I get the subject error with an additional message -- make sure you are passing in a GZip stream.
The code to get the file:
OpenFileDialog dlgOpen = new OpenFileDialog();
if (dlgOpen.ShowDialog() == DialogResult.OK)
{
FileStream fs = File.OpenRead(dlgOpen.FileName);
byte[] picbyte1 = new byte[fs.Length];
byte[] picbyte = Compress(picbyte1);
fs.Read(picbyte, 0, System.Convert.ToInt32(picbyte.Length));
String ImageName = dlgOpen.FileName;
//String bs64OfBytes = Convert.ToBase64String(picbyte);
fs.Close();
//additional code inserts into database
....
}
The compress method:
private static byte[] Compress(byte[] data)
{
var output = new MemoryStream();
using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
{
gzip.Write(data, 0, data.Length);
gzip.Close();
}
return output.ToArray();
}
The decompress method:
private static byte[] Decompress(byte[] data)
{
var output = new MemoryStream();
var input = new MemoryStream();
input.Write(data, 0, data.Length);
input.Position = 0;
using (var gzip = new GZipStream(input, CompressionMode.Decompress, true))
{
var buff = new byte[64];//also used 32
var read = gzip.Read(buff, 0, buff.Length);//error occurs here
while (read > 0)
{
output.Write(buff, 0, read);
read = gzip.Read(buff, 0, buff.Length);
}
gzip.Close();
}
return output.ToArray();
}
You need to insert one line and remove another:
FileStream fs = File.OpenRead(dlgOpen.FileName);
byte[] picbyte1 = new byte[fs.Length];
fs.Read(picbyte1, 0, (int)fs.Length); // <-- Add this one
byte[] picbyte = Compress(picbyte1);
// fs.Read(picbyte, 0, System.Convert.ToInt32(picbyte.Length)); // <-- And remove this one
// ...
You are reading the image in your code, but the operations are in the wrong order:
// Original but incorrect sequence
FileStream fs = File.OpenRead(dlgOpen.FileName); // Open the file
byte[] picbyte1 = new byte[fs.Length]; // Assign the array
byte[] picbyte = Compress(picbyte1); // Compress the assigned array, but there is no contents...
fs.Read(picbyte, 0, System.Convert.ToInt32(picbyte.Length)); // You read the file into the already compressed bytes, overwriting them...
So you have saved the first part of the original file, not the compressed one (though the number of saved bytes corresponds to the number of compressed bytes). If you send this to the DB and read it back, the decompressor does not find its magic number.
As an improvement, you can change these lines:
FileStream fs = File.OpenRead(dlgOpen.FileName);
byte[] picbyte1 = new byte[fs.Length];
fs.Read(picbyte1, 0, (int)fs.Length); // line that I suggested to add
which can probably be changed to:
byte[] picbyte1 = File.ReadAllBytes(dlgOpen.FileName);
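Putting it together, a hedged end-to-end sketch of the corrected flow, reusing the Compress and Decompress methods shown above (SequenceEqual needs a using System.Linq; directive):

    byte[] original = File.ReadAllBytes(dlgOpen.FileName); // read the whole image first
    byte[] compressed = Compress(original);                // then compress it
    byte[] roundTripped = Decompress(compressed);          // sanity-check the pair
    System.Diagnostics.Debug.Assert(original.SequenceEqual(roundTripped),
        "Compress/Decompress should round-trip before the bytes go anywhere near the database");
    // ... insert 'compressed' into the varbinary(max) column as before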

System.OutOfMemory exception when trying to read large files

public static byte[] ReadMemoryMappedFile(string fileName)
{
long length = new FileInfo(fileName).Length;
using (var stream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.Read, FileShare.ReadWrite))
{
using (var mmf = MemoryMappedFile.CreateFromFile(stream, null, length, MemoryMappedFileAccess.Read, null, HandleInheritability.Inheritable, false))
{
using (var viewStream = mmf.CreateViewStream(0, length, MemoryMappedFileAccess.Read))
{
using (BinaryReader binReader = new BinaryReader(viewStream))
{
var result = binReader.ReadBytes((int)length);
return result;
}
}
}
}
}
OpenFileDialog openfile = new OpenFileDialog();
openfile.Filter = "All Files (*.*)|*.*";
openfile.ShowDialog();
byte[] buff = ReadMemoryMappedFile(openfile.FileName);
texteditor.Text = BitConverter.ToString(buff).Replace("-", " "); // <-- A first chance exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll
I get a System.OutOfMemoryException when trying to read large files.
I've been reading everything I could find on the web for four weeks and have tried a lot, but I still can't find a good solution to my problem.
Please help.
Update
public byte[] FileToByteArray(string fileName)
{
byte[] buff = null;
FileStream fs = new FileStream(fileName,
FileMode.Open,
FileAccess.Read);
BinaryReader br = new BinaryReader(fs);
long numBytes = new FileInfo(fileName).Length;
buff = br.ReadBytes((int)numBytes);
//return buff;
return File.ReadAllBytes(fileName);
}
OR
public static byte[] FileToByteArray(FileStream stream, int initialLength)
{
// If we've been passed an unhelpful initial length, just
// use 32K.
if (initialLength < 1)
{
initialLength = 32768;
}
BinaryReader br = new BinaryReader(stream);
byte[] buffer = new byte[initialLength];
int read = 0;
int chunk;
while ((chunk = br.Read(buffer, read, buffer.Length - read)) > 0)
{
read += chunk;
// If we've reached the end of our buffer, check to see if there's
// any more information
if (read == buffer.Length)
{
int nextByte = br.BaseStream.ReadByte(); // BaseStream.ReadByte returns -1 at end of stream (BinaryReader.ReadByte would throw instead)
// End of stream? If so, we're done
if (nextByte == -1)
{
return buffer;
}
// Nope. Resize the buffer, put in the byte we've just
// read, and continue
byte[] newBuffer = new byte[buffer.Length * 2];
Array.Copy(buffer, newBuffer, buffer.Length);
newBuffer[read] = (byte)nextByte;
buffer = newBuffer;
read++;
}
}
// Buffer is now too big. Shrink it.
byte[] ret = new byte[read];
Array.Copy(buffer, ret, read);
return ret;
}
I still get a System.OutOfMemoryException when trying to read large files.
If your file is 4 GB, then BitConverter will turn each byte into an "XX-" string. Each char in the string is 2 bytes, and there are 3 chars per byte: 2 × 3 × 4,294,967,295 bytes = 25,769,803,770 bytes. You need 25+ GB of free memory to fit the entire string, plus you already have your file in memory as a byte array.
Besides, no single object in a .NET program may be over 2 GB. The theoretical limit for a string length would be 1,073,741,823 chars, and you would also need to be running a 64-bit process.
So the solution in your case: open a FileStream, read the first 16,384 bytes (or however much fits on your screen), convert to hex, display, and remember the file offset. When the user wants to navigate to the next or previous page, seek to that position in the file on disk, read, and display again, and so on.
You need to read the file in chunks, keep track of where you are in the file, page the contents on screen and use seek and position to move up and down in the file stream.
You will not be able to display a 4 GB file by reading all of it into memory first, whatever the approach.
The approach is to virtualize the data, reading only the visible lines when the user scrolls. If you need a read-only text viewer, you can use a WPF ItemsControl with a virtualizing stack panel, bound to a custom IList collection that lazily fetches lines from the file, calculating the file offset for each line index.
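As a rough sketch of the seek-and-page idea described above (the method name and default page size are illustrative, not from the original code):

    static string ReadHexPage(string fileName, long offset, int pageSize = 16384)
    {
        using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        {
            fs.Seek(offset, SeekOrigin.Begin);         // jump straight to the requested page
            byte[] buffer = new byte[pageSize];
            int read = fs.Read(buffer, 0, buffer.Length);
            return BitConverter.ToString(buffer, 0, read).Replace("-", " ");
        }
    }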

Read all contents of memory mapped file or Memory Mapped View Accessor without knowing the size of it

I need something similar to ReadToEnd or ReadAllBytes to read all of the contents of a MemoryMappedFile using the MemoryMappedViewAccessor when I don't know its size. How can I do it?
I have searched for it, I have seen this question, but it is not the thing I am looking for:
How can I quickly read bytes from a memory mapped file in .NET?
Edit:
There is a problem: (int)stream.Length is not giving me the correct length; it gives the size of the internal buffer used instead! I need to refresh this question because it is very pressing.
Rather use the Stream:
public static Byte[] ReadMMFAllBytes(string fileName)
{
using (var mmf = MemoryMappedFile.OpenExisting(fileName))
{
using (var stream = mmf.CreateViewStream())
{
using (BinaryReader binReader = new BinaryReader(stream))
{
return binReader.ReadBytes((int)stream.Length);
}
}
}
}
This is difficult to answer since there are still many details of your application that you haven't specified, but I think both Guffa's and Amer's answers are still partially correct:
A MemoryMappedFile is more memory than file; it is a sequence of 4 KB pages in memory. So stream.Length will in fact give you all of the bytes (there is no "internal buffer size"), but it might give you more bytes than you expect, since the size is always rounded up to a 4 KB boundary.
The "file" semantic comes from associating the MemoryMappedFile with a real filesystem file. Assuming that the process which creates the file always adjusts the file size, you can then get the precise size of the file from the filesystem.
If all of the above would fit your application, then the following should work:
static byte[] ReadMemoryMappedFile(string fileName)
{
long length = new FileInfo(fileName).Length;
using (var stream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.Read, FileShare.ReadWrite))
{
using (var mmf = MemoryMappedFile.CreateFromFile(stream, null, length, MemoryMappedFileAccess.Read, null, HandleInheritability.Inheritable, false))
{
using (var viewStream = mmf.CreateViewStream(0, length, MemoryMappedFileAccess.Read))
{
using (BinaryReader binReader = new BinaryReader(viewStream))
{
var result = binReader.ReadBytes((int)length);
return result;
}
}
}
}
}
To write the data, you can use this:
private static void WriteData(string fileName, byte[] data)
{
using (var stream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite))
{
using (var mmf = MemoryMappedFile.CreateFromFile(stream, null, data.Length, MemoryMappedFileAccess.ReadWrite, null, HandleInheritability.Inheritable, true))
{
using (var view = mmf.CreateViewAccessor())
{
view.WriteArray(0, data, 0, data.Length);
}
}
stream.SetLength(data.Length); // Make sure the file is the correct length, in case the data got smaller.
}
}
But by the time you do all of the above, you might do just as well to use the file directly and avoid the memory mapping. If mapping it to the filesystem isn't acceptable, then Guffa's answer of encoding the length (or an end marker) in the data itself is probably best.
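For what it's worth, a hypothetical round trip with the two helpers above (the file path is an assumption for illustration):

    string path = @"C:\temp\data.bin";
    WriteData(path, new byte[] { 1, 2, 3 });
    byte[] roundTrip = ReadMemoryMappedFile(path);
    // roundTrip.Length == 3, because WriteData trims the file with SetLength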
You can't do that.
A view accessor is created with a minimum size of a system page, which means that it may be larger than the actual file. A view stream is just a stream form of an accessor, so it will also give the same behaviour.
"views are provided in units of system pages, and the size of the view
is rounded up to the next system page size"
http://msdn.microsoft.com/en-us/library/dd267577.aspx
The accessor will gladly read and write outside the actual file without throwing an exception. When reading, any bytes outside the file will just be zero. When writing, the bytes written outside the file are just ignored.
To read the file from a memory mapped file with the exact size of the original file, you have to already know that size.
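To illustrate the point about embedding the length in the data itself, here is a sketch assuming a 4-byte Int32 prefix convention; the method names are made up, and the map is assumed to have been created with capacity for the payload plus 4 bytes:

    // Writer side: store the payload length in the first 4 bytes of the map.
    static void WriteWithLengthPrefix(MemoryMappedFile mmf, byte[] data)
    {
        using (var accessor = mmf.CreateViewAccessor())
        {
            accessor.Write(0, data.Length);             // 4-byte Int32 length prefix
            accessor.WriteArray(4, data, 0, data.Length);
        }
    }

    // Reader side: recover the exact payload size despite page-size rounding.
    static byte[] ReadWithLengthPrefix(MemoryMappedFile mmf)
    {
        using (var accessor = mmf.CreateViewAccessor())
        {
            int length = accessor.ReadInt32(0);
            byte[] data = new byte[length];
            accessor.ReadArray(4, data, 0, length);
            return data;
        }
    }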
The Stream created by a MemoryMappedFile has a length aligned to the file system page size (usually 4096). You have to get the file size from somewhere else. If the mapping is backed by an actual file, you could use this code:
byte[] ReadAllMemoryMappedFileBytes(string filePath)
{
var fileInfo = new FileInfo(filePath);
using (var file = MemoryMappedFile.CreateFromFile(filePath, FileMode.Open))
using (var accessor = file.CreateViewAccessor())
{
byte[] bytes = new byte[fileInfo.Length];
accessor.ReadArray(0, bytes, 0, bytes.Length);
return bytes;
}
}
Use the FileInfo class to get the length, as shown below:
using System.Data;
using System.IO;
using System.IO.Compression;
using System.IO.MemoryMappedFiles;
// ...
public void WriteToMemoryMap(DataSet ds, string key, string fileName)
{
var bytes = CompressData(ds);
using (MemoryMappedFile objMf = MemoryMappedFile.CreateFromFile(fileName, FileMode.OpenOrCreate, key, bytes.Length))
{
using (MemoryMappedViewAccessor accessor = objMf.CreateViewAccessor())
{
accessor.WriteArray(0, bytes, 0, bytes.Length);
}
}
}
public DataSet ReadFromMemoryMap(string fileName)
{
var fi = new FileInfo(fileName);
var length = (int)fi.Length;
var newBytes = new byte[length];
using (MemoryMappedFile objMf = MemoryMappedFile.CreateFromFile(fileName, FileMode.Open))
{
using (MemoryMappedViewAccessor accessor = objMf.CreateViewAccessor())
{
accessor.ReadArray(0, newBytes, 0, length);
}
}
return DecompressData(newBytes);
}
public byte[] CompressData(DataSet ds)
{
try
{
byte[] data = null;
var memStream = new MemoryStream();
var zipStream = new GZipStream(memStream, CompressionMode.Compress);
ds.WriteXml(zipStream, XmlWriteMode.WriteSchema);
zipStream.Close();
data = memStream.ToArray();
memStream.Close();
return data;
}
catch (Exception)
{
return null;
}
}
public DataSet DecompressData(byte[] data)
{
try
{
var memStream = new MemoryStream(data);
var unzipStream = new GZipStream(memStream, CompressionMode.Decompress);
var objDataSet = new DataSet();
objDataSet.ReadXml(unzipStream, XmlReadMode.ReadSchema);
unzipStream.Close();
memStream.Close();
return objDataSet;
}
catch (Exception)
{
return null;
}
}
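Hypothetical usage of the helpers above (the map key, file name, and myDataSet variable are illustrative):

    WriteToMemoryMap(myDataSet, "MyMapKey", @"C:\temp\dataset.bin");
    DataSet restored = ReadFromMemoryMap(@"C:\temp\dataset.bin");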
Just @Amer Sawan's solution translated to VB.NET:
' Usage Example:
' Dim ReadBytes As Byte() = ReadMemoryMappedFile(Name:="My MemoryMappedFile Name") ' Read the byte-sequence from memory.
' Dim Message As String = System.Text.Encoding.ASCII.GetString(ReadBytes.ToArray) ' Convert the bytes to String.
' Message = Message.Trim({ControlChars.NullChar}) ' Remove null chars (leading zero-bytes)
' MessageBox.Show(Message, "", MessageBoxButtons.OK) ' Show the message. '
'
''' <summary>
''' Reads a byte-sequence from a <see cref="IO.MemoryMappedFiles.MemoryMappedFile"/> without knowing the exact size.
''' Note that the returned byte-length is rounded up to 4kb,
''' this means if the mapped memory-file was written with 1 byte-length, this method will return 4096 byte-length.
''' </summary>
''' <param name="Name">Indicates an existing <see cref="IO.MemoryMappedFiles.MemoryMappedFile"/> assigned name.</param>
''' <returns>System.Byte().</returns>
Private Function ReadMemoryMappedFile(ByVal Name As String) As Byte()
Try
Using MemoryFile As IO.MemoryMappedFiles.MemoryMappedFile =
IO.MemoryMappedFiles.MemoryMappedFile.OpenExisting(Name, IO.MemoryMappedFiles.MemoryMappedFileRights.ReadWrite)
Using Stream = MemoryFile.CreateViewStream()
Using Reader As New BinaryReader(Stream)
Return Reader.ReadBytes(CInt(Stream.Length))
End Using ' Reader
End Using ' Stream
End Using ' MemoryFile
Catch exNoFile As IO.FileNotFoundException
Throw
Catch ex As Exception
Throw
End Try
End Function
I wanted something like the MemoryStream.ToArray() method to return all bytes, and the code below works for me:
using (MemoryMappedFile mmf = MemoryMappedFile.OpenExisting(MemoryMappedName))
{
using (MemoryMappedViewStream stream = mmf.CreateViewStream())
{
using (MemoryStream memStream = new MemoryStream())
{
stream.CopyTo(memStream);
return memStream.ToArray();
}
}
}
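Note that, as the other answers explain, the view stream's length is rounded up to the system page size, so the array returned here can be padded with trailing zero bytes.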
Cheers!

How to Read First 512 Bytes of data from a .dat file in C#?

Hi,
How do I read the first 512 bytes of data from a .dat file in C#?
My .dat files contain binary data.
I am currently using File.ReadAllBytes to read data from the .dat file, but it reads all the data; I want to read only the first 512 bytes and then stop.
Do I need to use a loop for this, or is there another approach?
Any help is appreciated.
You can try this:
byte[] buffer = new byte[512];
try
{
using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
{
var bytes_read = fs.Read(buffer, 0, buffer.Length); // the using block disposes the stream, so no explicit Close is needed
if (bytes_read != buffer.Length)
{
// Couldn't read 512 bytes
}
}
}
catch (System.UnauthorizedAccessException ex)
{
Debug.Print(ex.Message);
}
You can use a byte[] variable, and FileStream.Read for that.
A simple but effective approach:
var result = ""; //Define a string variable. This doesn't have to be a string, this is just an example.
using (BinaryReader br = new BinaryReader(File.OpenRead(openFileDailog1.FileName))) //Begin reading the file with the BinaryReader class.
{
br.BaseStream.Seek(0x4D, SeekOrigin.Begin); //Put the beginning of a .dat file here. I put 0x4D, because it's the generic start of a file.
result = Encoding.UTF8.GetString(br.ReadBytes(512)); //You don't have to return it as a string, this is just an example.
}
br.Close(); //Close the BinaryReader.
using System.IO; enables access to the BinaryReader class.
Hope this helps!
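For completeness, a compact variant of the same idea (filename is assumed to hold the path; ReadBytes simply returns fewer bytes if the file is shorter than 512):

    byte[] first512;
    using (var br = new BinaryReader(File.OpenRead(filename)))
    {
        first512 = br.ReadBytes(512);
    }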

Certain Files getting corrupted by SQL Server FileStream

I am saving files to a SQL Server 2008 (Express) database using FILESTREAM. The trouble I'm having is that certain files seem to be getting corrupted in the process.
For example, if I save a Word or Excel document in one of the newer formats (docx or xlsx), then when I try to open the file I get an error message saying that the data is corrupted, asking whether I would like Word/Excel to try to recover it. If I click Yes, Office is able to 'recover' the data and opens the file in compatibility mode.
However, if I zip the file first, then after extracting the contents I'm able to open the file without a problem. Strangely, if I save an mp3 file to the database, I have the reverse issue: I can open the file no problem, but if I save a zipped version of the mp3 I can't even extract the contents of that zip. When I tried to save a pdf or PowerPoint file I ran into similar problems (the pdf I could only read if I zipped it first, and the ppt I couldn't read at all).
Update: here's my code that I'm using to write to the database and to read
To write to the database:
SQL = "SELECT Attachment.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() FROM Activity " +
"WHERE RowID = CAST(#RowID as uniqueidentifier)";
transaction = connection.BeginTransaction();
command.Transaction = transaction;
command.CommandText = SQL;
command.Parameters.Clear();
command.Parameters.Add(rowIDParam);
SqlDataReader readerFS = null;
readerFS= command.ExecuteReader();
string path = (string)readerFS[0].ToString();
byte[] context = (byte[])readerFS[1];
int length = context.Length;
SqlFileStream targetStream = new SqlFileStream(path, context, FileAccess.Write);
int blockSize = 1024 * 512; //half a megabyte
byte[] buffer = new byte[blockSize];
int bytesRead = sourceStream.Read(buffer, 0, buffer.Length);
while (bytesRead > 0)
{
targetStream.Write(buffer, 0, bytesRead);
bytesRead = sourceStream.Read(buffer, 0, buffer.Length);
}
targetStream.Close();
sourceStream.Close();
readerFS.Close();
transaction.Commit();
And to read:
SqlConnection connection = null;
SqlTransaction transaction = null;
try
{
connection = getConnection();
connection.Open();
transaction = connection.BeginTransaction();
SQL = "SELECT Attachment.PathName(), + GET_FILESTREAM_TRANSACTION_CONTEXT() FROM Activity"
+ " WHERE ActivityID = #ActivityID";
SqlCommand command = new SqlCommand(SQL, connection);
command.Transaction = transaction;
command.Parameters.Add(new SqlParameter("ActivityID", activity.ActivityID));
SqlDataReader reader = command.ExecuteReader();
string path = (string)reader[0];
byte[] context = (byte[])reader[1];
int length = context.Length;
reader.Close();
SqlFileStream sourceStream = new SqlFileStream(path, context, FileAccess.Read);
int blockSize = 1024 * 512; //half a megabyte
byte[] buffer = new byte[blockSize];
List<byte> attachmentBytes = new List<byte>();
int bytesRead = sourceStream.Read(buffer, 0, buffer.Length);
while (bytesRead > 0)
{
bytesRead = sourceStream.Read(buffer, 0, buffer.Length);
foreach (byte b in buffer)
{
attachmentBytes.Add(b);
}
}
FileStream outputStream = File.Create(outputPath);
foreach (byte b in attachmentBytes)
{
byte[] barr = new byte[1];
barr[0] = b;
outputStream.Write(barr, 0, 1);
}
outputStream.Close();
sourceStream.Close();
command.Transaction.Commit();
Your read code is incorrect:
while (bytesRead > 0)
{
bytesRead = sourceStream.Read(buffer, 0, buffer.Length);
foreach (byte b in buffer)
{
attachmentBytes.Add(b);
}
}
If bytesRead is less than buffer.Length, you still add the entire buffer to attachmentBytes. Thus you always corrupt the returned document by appending whatever garbage sits at the end of the last buffer past bytesRead. (There are further problems: the first chunk, read before the loop, is overwritten before it is ever appended, and after the final zero-byte read the stale buffer is appended once more.)
Other than that, allow me a really WTF moment. Reading a stream into a List<byte>?? C'mon! First, I don't see why you need an intermediate in-memory copy at all: you can simply read buffer by buffer and write each buffer straight into the outputStream. Second, if you must use intermediate in-memory storage, use a MemoryStream, not a List<byte>.
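A hedged sketch of that fix, reusing path, context, and outputPath from the question, writing each chunk straight to the output file:

    using (var sourceStream = new SqlFileStream(path, context, FileAccess.Read))
    using (var outputStream = File.Create(outputPath))
    {
        byte[] buffer = new byte[1024 * 512]; // half a megabyte
        int bytesRead;
        while ((bytesRead = sourceStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            outputStream.Write(buffer, 0, bytesRead); // only the bytes actually read
        }
    }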
I had the exact same problem a few months back and figured out that I was adding an extra byte at the end of the file when reading it from FILESTREAM.
