How to read a file from the x line in .NET

I read a file from the file system and FTP it to an FTP server. I now have a request to skip the first line (it's a CSV with header info). I think I can somehow do this with the Offset of the stream.Read method (or the write method) but I am not sure how to translate the byte array offset from a single line.
How would I calculate the offset to tell it to only read from the 2nd line of the file?
Thanks
// Read the file to be uploaded into a byte array
stream = File.OpenRead(currentQueuePathAndFileName);
var buffer = new byte[stream.Length];
stream.Read(buffer, 0, buffer.Length);
stream.Close();
// Get the stream for the request and write the byte array to it
var reqStream = request.GetRequestStream();
reqStream.Write(buffer, 0, buffer.Length);
reqStream.Close();
return request;

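You could use File.ReadAllLines, which returns an array of strings. strArray.Skip(1) then returns every line except the first.
UPDATE
Here is the code (Stream.Write takes bytes, so the remaining lines have to be joined and encoded first):
var lines = File.ReadAllLines(fileName);
if (lines.Length > 1)
{
    // Drop the header line and rejoin the rest
    var body = string.Join(Environment.NewLine, lines.Skip(1)) + Environment.NewLine;
    // Assumes the server expects UTF-8; use the encoding your file actually needs
    var bytes = Encoding.UTF8.GetBytes(body);
    var reqStream = request.GetRequestStream();
    reqStream.Write(bytes, 0, bytes.Length);
    reqStream.Close();
}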

If you just need to skip one line, you can use a StreamReader: read the first line and then read the rest of the contents as usual. Note that StreamReader buffers its input, so stream.Position after ReadLine() does not point at the start of the second line. Read the remainder through the reader instead.
using (var stream = File.OpenRead(currentQueuePathAndFileName))
using (var reader = new StreamReader(stream))
{
    reader.ReadLine(); // discard the header line
    var rest = reader.ReadToEnd();
    // Re-encode with the encoding the reader detected for the file
    var buffer = reader.CurrentEncoding.GetBytes(rest);
    var reqStream = request.GetRequestStream();
    reqStream.Write(buffer, 0, buffer.Length);
    reqStream.Close();
    return request;
}
And why are you using the request stream? Aren't you supposed to use HttpContext.Current.Response.OutputStream?

You could read the stream into a string, split it into lines, drop the first one and join the rest back together. Alternatively, you could remove the first line with a regex, or just look for the first line feed and copy the string from there on.
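The last option (look for a line feed and copy from there) might look like the following minimal sketch. The class and method names are made up for illustration, and it assumes the whole file fits comfortably in memory as a string.

```csharp
using System;

public static class FirstLineSkipper
{
    // Hypothetical helper: returns everything after the first line break.
    // Searching for '\n' covers both "\n" and "\r\n" line endings.
    public static string SkipFirstLine(string text)
    {
        int lineFeed = text.IndexOf('\n');
        // No line feed means the text is a single line, so nothing is left.
        return lineFeed < 0 ? string.Empty : text.Substring(lineFeed + 1);
    }
}
```

Files that use a bare carriage return as the line ending are not handled by this sketch.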

You should use the StreamReader class. By calling ReadLine until the result is null, you read all the lines one by one. Just skip the first one, as your requirement says. Reading the whole file into memory is usually not a good idea (it does not scale as the file becomes bigger).
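A minimal sketch of that line-by-line approach follows. The class and method names are hypothetical, and it assumes the default UTF-8 handling of StreamReader and StreamWriter suits the file.

```csharp
using System.IO;

public static class LineCopier
{
    // Hypothetical helper: copies every line except the first from input to output.
    // Disposing the reader and writer also closes the two streams.
    public static void CopySkippingFirstLine(Stream input, Stream output)
    {
        using (var reader = new StreamReader(input))
        using (var writer = new StreamWriter(output))
        {
            reader.ReadLine(); // discard the header line (null if the input is empty)
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Only one line is held in memory at a time.
                // WriteLine appends Environment.NewLine, so line endings may be normalized.
                writer.WriteLine(line);
            }
        }
    }
}
```

In the question's scenario, input would be File.OpenRead(currentQueuePathAndFileName) and output would be request.GetRequestStream().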

The brute-force approach is to iterate through the source byte array and look for the first occurrence of a carriage return character followed by a line feed (CR+LF). That gives you the offset in the source array at which the second line of the CSV file starts.
Here's an example:
const byte carriageReturn = 13;
const byte lineFeed = 10;
byte[] file = File.ReadAllBytes(currentQueuePathAndFileName);
int offset = 0;
// Stop one short of the end so that file[i + 1] is always valid
for (int i = 0; i < file.Length - 1; i++)
{
    if (file[i] == carriageReturn && file[i + 1] == lineFeed)
    {
        offset = i + 2; // first byte after the CR+LF pair
        break;
    }
}
Note that this example assumes the source CSV file was created on Windows, since line breaks are expressed differently on other platforms.
Related resources:
Environment.NewLine Property

Related

C# - Stream.Read offset is working incorrectly

I have a test code like below. I am reading from stream, offsetting by 2 positions, and then taking next 2 bytes. I would hope that result would be an array with 2 elements. This does not work though - offset is completely ignored, and full sized array is always returned, with only offset blocks having values. But this means my result table is still very large, it just has a lot of unwanted zeroes
How can I rework below code, so that file.Read() returns only an array of 2 bytes instead of 10 when length = 2 and offset = 2? In real world scenario I am dealing with large files (>2gigs) so filtering out the result array is not an option.
Edit: As the issue is unclear: the code below requires me to always define an output array that is the same size as the stream. Instead, I would like an output whose size equals length. In the example below I would like to have var buffer = new byte[2], but that throws an exception because file.Read seems to ignore offset and length and always returns 10 elements (with only 2 of them being read, the rest being dummy zeroes).
private byte[] GetFilePart(int length, int offset)
{
//build some dummy content
var content = new byte[10];
for (int i = 0; i<10; i++)
{
content[i] = 1;
}
//read the data from content
var buffer = new byte[10];
using (Stream file = new MemoryStream(content))
{
file.Read(buffer, offset, length);
}
return buffer;
}

Looks like it's working properly to me; maybe your confusion would clear a bit if you inited your content array with something like:
for (int i = 1; i <= 10; i++)
{
    content[i - 1] = (byte)i;
}
Then each byte would have a different number, which makes it easier to see which bytes end up where.
offset relates to where into buffer the Stream will write the bytes to (it reads from the start of content). It does not relate to what bytes are read out of content.
Imagine Read as being called WriteBytesInto(byte[] whatBuffer, int whereToStartWriting, int howManyBytesToWrite) - you provide the buffer it will write into and tell it where to start and how many to do
If you did this, having inited content to be incrementing numbers:
file.Read(buffer, 2, 3); // read 3 bytes from stream and write to buffer at index 2
file.Read(buffer, 0, 2); // read 2 bytes from stream and write to buffer at index 0
Your buffer would end up looking like:
4,5,1,2,3,0,0,0,0,0
The 1,2,3 having been written first, then the 4,5 written next
If you want to skip two bytes from the stream (i.e. read the 3rd and 4th bytes from content), Seek() the stream or set its Position. As canton7 advises in the comments, if the stream is not seekable, read and discard some bytes instead.
How can I rework below code, so that file.Read() returns only an array of 2 bytes instead of 10 when length = 2 and offset = 2?
Well, file.Read doesn't return an array at all; it modifies an array you give it. If you want a 2 byte array, give it a 2 byte array:
byte[] buf = new byte[2];
file.Read(buf, 0, buf.Length);
If you want to open a file, skip the first 7 bytes and then read bytes 8th and 9th into your length-of-2 byte array then:
byte[] buf = new byte[2];
file.Position = 7; //absolute skip to 8th byte
file.Read(buf, 0, buf.Length);
For more on seeking in streams see Stream.Seek(0, SeekOrigin.Begin) or Position = 0
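Putting the pieces together, the question's GetFilePart could be reworked along these lines. This is only a sketch: the class and method names are invented, and it assumes the stream is seekable.

```csharp
using System;
using System.IO;

public static class StreamSlice
{
    // Hypothetical helper: returns up to `length` bytes, starting `skip` bytes into a seekable stream.
    public static byte[] ReadPart(Stream source, long skip, int length)
    {
        source.Position = skip;         // choose where in the stream to start reading
        var buffer = new byte[length];  // the result is only as large as requested
        int total = 0;
        while (total < length)
        {
            // The second argument is an index into buffer, not into the stream.
            int read = source.Read(buffer, total, length - total);
            if (read == 0) break;       // end of stream reached early
            total += read;
        }
        if (total < length) Array.Resize(ref buffer, total);
        return buffer;
    }
}
```

With content initialized to 1 through 10, StreamSlice.ReadPart(new MemoryStream(content), 2, 2) yields a two-element array holding 3 and 4.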

Replace a byte of data

I'm trying to replace only one byte of data from a file, meaning something like 0X05 -> 0X15.
I'm using Replace function to do this.
using (StreamReader reader = new System.IO.StreamReader(Inputfile))
{
content = reader.ReadToEnd();
content = content.Replace("0x05","0x15");
reader.Close();
}
using (FileStream stream = new FileStream(outputfile, FileMode.Create))
{
using (BinaryWriter writer = new BinaryWriter(stream, Encoding.UTF8))
{
writer.Write(content);
}
}
Technically speaking, only that byte of data had to be replaced with the new byte, but I see that many bytes of data have changed.
Why are other bytes changing? How can I achieve this?

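You're talking about bytes, but you've written code that reads strings. Strings are an interpretation of bytes, so if you truly do mean bytes, mangling them through strings is the wrong way to go: decoding and re-encoding can alter any byte sequence that isn't valid text in the chosen encoding, and BinaryWriter.Write(string) also prepends a length prefix to the output.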
Anyways, there are helper methods to make your life easy, if the file is relatively small (maybe up to 500mb - I'd switch to using an incremental streaming reading/changing/writing method if it's bigger than this)
If you want bytes changed:
var b = File.ReadAllBytes("path");
for(int x = 0; x < b.Length; x++)
if(b[x] == 0x5)
b[x] = (byte)0x15;
File.WriteAllBytes("path", b);
If your file is a text file that literally has "0x05" in it:
File.WriteAllText("path", File.ReadAllText("path").Replace("0x05", "0x15"));
In response to your question in the comments, and assuming you want your file to grow by 2 bytes for each 0x05 it contains (so a 1000-byte file that contains three 0x05 bytes will be 1006 bytes after being written), it is probably simplest to:
var b = File.ReadAllBytes("path");
using(FileStream fs = new FileStream("path", FileMode.Create)) //replace file
{
for(int x = 0; x < b.Length; x++)
if(b[x] == 0x5) {
fs.WriteByte((byte)0x15);
fs.WriteByte((byte)0x5);
fs.WriteByte((byte)0x15);
} else
fs.WriteByte(b[x]);
}
Don't worry about writing a single byte at a time - it is buffered elsewhere in the IO chain. You could go for a solution that writes blocks of bytes from the array if you wanted.. this is just easier to code/understand
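For files too large to load with File.ReadAllBytes, the incremental approach mentioned above could be sketched like this. The names and the buffer size are arbitrary choices for illustration, and the input and output are assumed to be different streams.

```csharp
using System.IO;

public static class ByteReplacer
{
    // Hypothetical helper: copies input to output, replacing every `from` byte with `to`.
    public static void ReplaceByte(Stream input, Stream output, byte from, byte to)
    {
        var buffer = new byte[81920]; // arbitrary chunk size
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == from) buffer[i] = to;
            }
            output.Write(buffer, 0, read); // write only the bytes actually read
        }
    }
}
```

A typical call would pass File.OpenRead(inputPath) and File.Create(outputPath), each wrapped in a using block, with 0x05 and 0x15 as the two byte values.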

How to read file bytes from byte offset?

If I am given a .cmp file and a byte offset 0x598, how can I read a file from this offset?
I can ofcourse read file bytes like this
byte[] fileBytes = File.ReadAllBytes("upgradefile.cmp");
But how can I read it from byte offset 0x598
To explain a bit more, actually from this offset the actual data starts that I have to read and before this byte offset it is just header data, so basically I have to read file from that offset till end.

Try code like this:
using (BinaryReader reader = new BinaryReader(File.Open("upgradefile.cmp", FileMode.Open)))
{
long offset = 0x598;
if (reader.BaseStream.Length > offset)
{
reader.BaseStream.Seek(offset, SeekOrigin.Begin);
byte[]fileBytes = reader.ReadBytes((int) (reader.BaseStream.Length - offset));
}
}

If you are not familiar with Streams, Linq, or whatever, I have simplest solution for you:
Read entire file into memory (I hope you deal with small files):
byte[] fileBytes = File.ReadAllBytes("upgradefile.cmp");
Calculate how many bytes are present in array after given offset:
int startOffset = 0x598; // hexadecimal literal; the decimal value 1432 works just as well
int howManyBytesToRead = fileBytes.Length - startOffset;
Then just copy data to new array:
byte[] newArray = new byte[howManyBytesToRead];
int pos = 0;
for (int i = startOffset; i < fileBytes.Length; i++)
{
    newArray[pos] = fileBytes[i];
    pos = pos + 1;
}
If you understand how it works you can look at Array.Copy method in Microsoft documentation.
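For reference, the same copy written with Array.Copy might look like the sketch below. The helper name is made up, and it assumes startOffset is not larger than the array length.

```csharp
using System;

public static class ArrayTail
{
    // Hypothetical helper: returns the bytes from startOffset to the end of the array.
    public static byte[] From(byte[] source, int startOffset)
    {
        var tail = new byte[source.Length - startOffset];
        // Array.Copy(sourceArray, sourceIndex, destinationArray, destinationIndex, length)
        Array.Copy(source, startOffset, tail, 0, tail.Length);
        return tail;
    }
}
```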

By not using ReadAllBytes.
Get a stream, move to the position, and read the rest of the file.
You basically complain that a convenience method made to allow a one line read of a whole file is not what you want - ignoring that it is just that, a convenience method. The normal way to deal with files is opening them and using a Stream.
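As a rough sketch of that stream-based approach (the class and method names are hypothetical, and the stream is assumed to be seekable):

```csharp
using System.IO;

public static class OffsetReader
{
    // Hypothetical helper: reads everything from `offset` to the end of a seekable stream.
    public static byte[] ReadFrom(Stream source, long offset)
    {
        source.Seek(offset, SeekOrigin.Begin);
        using (var rest = new MemoryStream())
        {
            source.CopyTo(rest); // copies from the current position to the end
            return rest.ToArray();
        }
    }
}
```

For the file in the question, this would be called as OffsetReader.ReadFrom(fs, 0x598), where fs comes from File.OpenRead("upgradefile.cmp") inside a using block.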

C# Beginner File Reading

Okay I searched for an answer to this but couldn't find it.
here's the code:
FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read);
byte[] fileText = new byte[fs.Length];
int bytesRead = fs.Read(fileText, 0, fileText.Length);
Console.WriteLine(Encoding.ASCII.GetString(fileText, 0, bytesRead));
Let me get this straight:
1) We declare a FileStream.
2) We declare a byte array and set its CAPACITY to fs.Length.
3) Why does fs.Read() return an INTEGER?
4) How does this line display the text from the .txt file on the console? We passed the byte[] to the GetString() method, but isn't that byte[] empty? We only set its capacity to fs.Length. Where did the reading happen, and how?
TIA

If you are trying to read a text file and display all its lines in the console:
foreach(string line in File.ReadAllLines("YourFilePath"))
{
Console.WriteLine(line);
}
In your method
FileStream fs = new FileStream("YourFilePath", FileMode.Open, FileAccess.Read);
Opens the file for reading into stream fs.
byte[] fileText = new byte[fs.Length];
Gets the number of bytes in the file content, and creates a byte array of that size
int bytesRead = fs.Read(fileText, 0, fileText.Length);
Reads the byte content, from 0 to end of content (we have length from last statement), i.e. the complete contents into the array you created. So, now your byte array fileText has all the byte contents from the file.
It returns the number of bytes read in this operation, if you need that for some reason. This can be <= the number of bytes you wanted to read (less if fewer bytes were available in the file content). In your case, it will normally be the same as fileText.Length, since you sized the array from the file length.
System.Console.WriteLine(Encoding.ASCII.GetString(fileText, 0, bytesRead));
Converts the byte array into ASCII encoded text and writes to console.

The Read method returns the number of bytes that were read into the buffer parameter. You passed in an array that is filled with data when the Read method is actually called. You also passed in the number of bytes that you want to read.
See the MSDN documentation for more details.

FileStream.Read() returns the number of bytes actually read. It could be that you ask for 4096 bytes, but get 0, or 1, or 1000. This is what the docs say:
Return Value
Type: System.Int32
The total number of bytes read into the buffer. This might be less than the number of bytes requested if that number of bytes are not currently available, or zero if the end of the stream is reached.
If you are reading text, you can use one of the helpful File methods: File.ReadAllText, File.ReadAllLines, or File.OpenText which gives you a StreamReader object where you can read line-by-line.
If you need to read the bytes (this is a much lower-level usage, and really you should be able to use a StreamReader), then you don't want to create a buffer the length of the stream, since this could crash your program with an OutOfMemoryException. Instead, make the buffer something like 4096 bytes, then call FileStream.Read in a loop, until it returns 0. Note, however, that you are not reading text lines here, and a line break may come in the middle of the buffer. Here's an example:
using (var fileStream = File.OpenRead("c:\\file.txt"))
{
    var buffer = new byte[4096];
    int read;
    // Always fill the buffer from index 0; the stream tracks its own position
    while ((read = fileStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Decode only the bytes that were actually read in this pass
        var s = Encoding.ASCII.GetString(buffer, 0, read);
        Console.Write(s);
    }
}

3) has been answered here already.
As for 4): the Read method actually also fills the buffer with bytes and returns the number of bytes it filled into the buffer.
Passing the buffer and the number of bytes read to Encoding.GetString() interprets the bytes from the file as character codes for the given encoding (ASCII in your case) and creates a string from the byte array based on that encoding.
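To make the two steps visible, here is a small sketch: the array starts out filled with zeroes, Read copies the stream's bytes into it, and GetString decodes only the bytes that were read. The helper name is invented, and it assumes a stream with a known Length, positioned at its start.

```csharp
using System.IO;
using System.Text;

public static class ReadDemo
{
    // Hypothetical helper: reads a whole stream into a buffer and decodes it as ASCII.
    public static string ReadAllAscii(Stream source)
    {
        var buffer = new byte[source.Length]; // allocated; every element is still 0
        int total = 0;
        // Read copies bytes into buffer and reports how many it copied.
        // Loop because a single call may return fewer bytes than requested.
        while (total < buffer.Length)
        {
            int read = source.Read(buffer, total, buffer.Length - total);
            if (read == 0) break; // end of stream
            total += read;
        }
        return Encoding.ASCII.GetString(buffer, 0, total);
    }
}
```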

Unable to read beyond the end of the stream

I did some quick method to write a file from a stream but it's not done yet. I receive this exception and I can't find why:
Unable to read beyond the end of the stream
Is there anyone who could help me debug it?
public static bool WriteFileFromStream(Stream stream, string toFile)
{
FileStream fileToSave = new FileStream(toFile, FileMode.Create);
BinaryWriter binaryWriter = new BinaryWriter(fileToSave);
using (BinaryReader binaryReader = new BinaryReader(stream))
{
int pos = 0;
int length = (int)stream.Length;
while (pos < length)
{
int readInteger = binaryReader.ReadInt32();
binaryWriter.Write(readInteger);
pos += sizeof(int);
}
}
return true;
}
Thanks a lot!

Not really an answer to your question but this method could be so much simpler like this:
public static void WriteFileFromStream(Stream stream, string toFile)
{
// dont forget the using for releasing the file handle after the copy
using (FileStream fileToSave = new FileStream(toFile, FileMode.Create))
{
stream.CopyTo(fileToSave);
}
}
Note that I also removed the return value: it is pretty much useless, since your code has only one return statement and it always returns true.
Apart from that, you perform a Length check on the stream, but many streams don't support checking Length.
As for your problem: you first check whether the stream is at its end. If not, you read 4 bytes. Here is the problem. Let's say you have an input stream of 6 bytes. First you check whether the stream is at its end. The answer is no, since there are 6 bytes left. You read 4 bytes and check again. Of course the answer is still no, since there are 2 bytes left. Now you read another 4 bytes, but that of course fails since there are only 2 bytes (ReadInt32 reads the next 4 bytes).
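If you do want to keep the integer-by-integer loop, one way to avoid the failing final read is to loop only while at least four bytes remain. The following is a sketch under the assumption that the input stream is seekable (so Length and Position are available); the names are made up.

```csharp
using System.IO;

public static class IntCopier
{
    // Hypothetical helper: copies as many whole 4-byte integers as the stream holds.
    // Returns the number of trailing bytes that did not form a complete Int32.
    // The reader and writer are not disposed, so the caller's streams stay open.
    public static int CopyWholeInts(Stream input, Stream output)
    {
        var reader = new BinaryReader(input);
        var writer = new BinaryWriter(output);
        long remaining = input.Length - input.Position;
        while (remaining >= sizeof(int))
        {
            writer.Write(reader.ReadInt32());
            remaining -= sizeof(int);
        }
        writer.Flush();
        return (int)remaining;
    }
}
```

For the 6-byte stream described above, this copies one integer and returns 2 for the leftover bytes instead of throwing.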

I presume that the input stream contains ints only (Int32). You could test for the end of the stream with the PeekChar() method:
while (binaryReader.PeekChar() != -1)
{
int readInteger = binaryReader.ReadInt32();
binaryWriter.Write(readInteger);
}

You are doing while (pos < length) and length is the actual length of the stream in bytes. So you are effectively counting the bytes in the stream and then trying to read that many number of ints (which is incorrect). You could take length to be stream.Length / 4 since an Int32 is 4 bytes.

try
int length = (int)binaryReader.BaseStream.Length;

After the binary reader has read the stream, the stream's position is at the end. You have to set the position back to zero with stream.Position = 0; before reading it again.
