Get/delete last character of file without loading into memory

How can I get the last character in a file and, if it is a certain character, delete it without loading the entire file into memory?
This is what I have so far:
    using (var fileStream = new FileStream("file.txt", FileMode.Open, FileAccess.ReadWrite))
    {
        fileStream.Seek(-1, SeekOrigin.End);
        if (Convert.ToChar(fileStream.ReadByte()) == ']')
        {
            // Deleting the character should go here
        }
    }

If your text file is encoded as ASCII or UTF-8 (in which ] is stored as a single byte), you can simply truncate the file by one byte:
    if (fileStream.ReadByte() == ']')
        fileStream.SetLength(fileStream.Length - 1);

You can move the FileStream to the last byte and check what it is, then use Douglas's solution to delete it:
    using (FileStream fs = new FileStream(filePath, FileMode.Open))
    {
        // Seek already updates the position, so there is no need to assign it to fs.Position
        fs.Seek(-1, SeekOrigin.End);
        if (fs.ReadByte() == ']')
            fs.SetLength(fs.Length - 1);
    }
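
The two answers above can be combined into one small helper. This is only a sketch: the class name FileTail and the method TrimTrailingByte are made up for illustration, and it assumes the character is stored as a single byte (ASCII, or an ASCII character in UTF-8). It also guards against an empty file, where seeking to -1 from the end would throw.

```csharp
using System;
using System.IO;

public static class FileTail
{
    // Hypothetical helper: removes the last byte of the file if it equals `expected`.
    // Returns true when a byte was removed.
    public static bool TrimTrailingByte(string path, byte expected)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            if (fs.Length == 0)
                return false;              // nothing to inspect in an empty file

            fs.Seek(-1, SeekOrigin.End);   // position on the last byte
            if (fs.ReadByte() != expected)
                return false;

            fs.SetLength(fs.Length - 1);   // truncate in place; the file is never loaded
            return true;
        }
    }
}
```

It could be called as FileTail.TrimTrailingByte("file.txt", (byte)']').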

Related

UTF-8 remove BOM

I have an XML file with a UTF-8 BOM at the beginning of the file, which prevents me from using existing code that reads UTF-8 files.
How can I remove the BOM from the XML file in an easy way?
Here I have a variable xmlfile of type byte[] that I convert to a string. xmlfile contains the entire XML file.
    byte[] xmlfile = (byte[])myReader["xmlSQL"];
    string xmlstring = Encoding.UTF8.GetString(xmlfile);

Great stuff, DBC :) That worked well with your link. To fix my problem, where I had a UTF-8 BOM at the beginning of my XML file, I simply added a MemoryStream and a StreamReader, which automatically stripped the BOM from the XML bytes (htmlbytes).
It was really easy to add to the existing code.
    byte[] htmlbytes = (byte[])myReader["xmlMelding"];
    var memorystream = new MemoryStream(htmlbytes);
    // StreamReader detects and skips the BOM while decoding
    var s = new StreamReader(memorystream).ReadToEnd();

Encoding.GetString() has an overload that accepts an offset into the byte[] array. Simply check whether the array starts with a BOM and, if so, skip it when calling GetString(), e.g.:
    byte[] xmlfile = (byte[])myReader["xmlSQL"];
    int offset = 0;
    if (xmlfile.Length >= 3 &&
        xmlfile[0] == 0xEF &&
        xmlfile[1] == 0xBB &&
        xmlfile[2] == 0xBF)
    {
        offset += 3;
    }
    string xmlstring = Encoding.UTF8.GetString(xmlfile, offset, xmlfile.Length - offset);
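
As a variation on the check above, the three BOM bytes do not have to be hard-coded, because Encoding.UTF8.GetPreamble() returns them. The sketch below wraps that in a helper; the names BomHelper and DecodeUtf8 are invented for this example and are not part of the original answers.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomHelper
{
    // Hypothetical helper: decodes UTF-8 bytes, skipping a leading BOM if one is present.
    public static string DecodeUtf8(byte[] data)
    {
        byte[] bom = Encoding.UTF8.GetPreamble();   // the bytes EF BB BF
        int offset = 0;
        if (data.Length >= bom.Length && data.Take(bom.Length).SequenceEqual(bom))
            offset = bom.Length;                    // start decoding after the BOM

        return Encoding.UTF8.GetString(data, offset, data.Length - offset);
    }
}
```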

Stream reader.Read number of character

Is there a stream reader class that can read just a given number of chars from a string, or bytes from a byte[]?
For example, reading a string:
    string chunk = streamReader.ReadChars(5); // Read next 5 chars
or reading bytes:
    byte[] bytes = streamReader.ReadBytes(5); // Read next 5 bytes
Note that the return type of the method and the name of the class do not matter. I just want to know whether something like this exists so that I can use it.
I have a byte[] from a MIDI file that I want to read in C#. I need to be able to read a given number of bytes (or chars, if I convert it to hex) so that I can validate the MIDI data and read from it more easily.

Thanks for the comments. I didn't know there were overloads of the Read methods. I was able to do this with FileStream:
    using (FileStream fileStream = new FileStream(path, FileMode.Open))
    {
        byte[] chunk = new byte[4];
        // Read may return fewer than 4 bytes near the end of the file
        int read = fileStream.Read(chunk, 0, 4);
        // Hyphen-separated hex pairs, e.g. "4D-54-68-64"
        string hexLetters = BitConverter.ToString(chunk, 0, read);
    }
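
For a byte[] that is already in memory, as in the question, BinaryReader over a MemoryStream is close to the ReadBytes(5) call the question asks for. The following is a minimal sketch; the class ChunkReader and the method ReadTag are hypothetical names, and it assumes the bytes being read are ASCII text such as a chunk identifier.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkReader
{
    // Hypothetical helper: reads the first `count` bytes of `data` as ASCII text.
    public static string ReadTag(byte[] data, int count)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            // ReadBytes returns a shorter array if the stream ends early
            byte[] chunk = reader.ReadBytes(count);
            if (chunk.Length < count)
                throw new EndOfStreamException("Not enough bytes.");

            return Encoding.ASCII.GetString(chunk);
        }
    }
}
```

Further calls to ReadBytes on the same reader continue from where the previous call stopped, so a file can be consumed chunk by chunk.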

You can do something like the following, although I am not sure it fits your problem. Note that it reads the whole stream into memory and drops the line breaks.
    StreamReader sr = new StreamReader(stream);
    StringBuilder S = new StringBuilder();
    string line;
    // ReadLine returns null at the end of the stream
    while ((line = sr.ReadLine()) != null)
    {
        S.Append(line);
    }
Once S holds the text, you can take substrings from it.

FileStream returning null characters every other character

I seem to be having some issues with a FileStream in C#.
I am trying to read the last line from a very large text file (10 MB) that is generated by an MSI installer.
The code I am using is:
    string path = @"C:\uninstall.log";
    byte[] buffer = new byte[100];
    using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        long len = fs.Length;
        fs.Seek(-100, SeekOrigin.End);
        fs.Read(buffer, 0, 100);
    }
    string foo = Encoding.UTF8.GetString(buffer);
    Console.WriteLine("\"" + foo + "\"");
But the output looks similar to this:
H E L L O W O R L D ! ! ! B L A H B L A H
Apparently the data that is read contains a '\0' (null) character after every other character.
Does anyone know what is causing this?

Use Encoding.Unicode instead. Your file is encoded in UTF-16, not UTF-8.

The file is probably a UTF-16 file, not a UTF-8 file. Just try using Encoding.Unicode instead of Encoding.UTF8.

It sounds like the file is actually UTF-16 encoded. Change Encoding.UTF8 to Encoding.Unicode in your GetString() call.
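
Putting the answers together, reading the last line of a UTF-16 file might look like the sketch below. The names Utf16Tail, ReadTail and LastLine are invented for this example. It assumes little-endian UTF-16 (what Encoding.Unicode decodes), starts on an even byte offset so that no 2-byte code unit is split, and assumes the last line fits within maxBytes.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf16Tail
{
    // Hypothetical helper: decodes up to `maxBytes` from the end of a UTF-16LE file.
    public static string ReadTail(string path, int maxBytes)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            long start = Math.Max(0, fs.Length - maxBytes);
            if (start % 2 != 0)
                start++;                        // stay aligned to 2-byte code units

            fs.Seek(start, SeekOrigin.Begin);
            var buffer = new byte[(int)(fs.Length - start)];
            int read = 0;
            while (read < buffer.Length)        // Read may return fewer bytes than requested
            {
                int n = fs.Read(buffer, read, buffer.Length - read);
                if (n == 0) break;
                read += n;
            }
            return Encoding.Unicode.GetString(buffer, 0, read);
        }
    }

    // Hypothetical helper: returns the text after the last line break in the tail.
    public static string LastLine(string path, int maxBytes)
    {
        string tail = ReadTail(path, maxBytes).TrimEnd('\r', '\n');
        int i = tail.LastIndexOf('\n');
        return i < 0 ? tail : tail.Substring(i + 1);
    }
}
```

Only the tail of the file is read, so the size of the log does not matter.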

Processing and updating a large file row by row

I am processing a 200 MB text file. I have to read each row in the file, update one or two columns, and then save the row. What is the best way to do this?
I was thinking of loading it into a DataTable, but holding a file that big in memory is a pain.
I realise I should do it in batches, but what is the best way to do that?
I don't think I want to load it into a database first, because I can't do a mass update anyway. I would have to read line by line there too.
As an update: my files have columns in any order, and I always need to update two or more columns.
Thanks.

Read a line, parse it, and write fields into a temp file. When all the lines are done, delete the original file and rename the temp file.
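
The steps in this answer can be sketched roughly as follows. The class LineRewriter, the method RewriteLines and the ".tmp" suffix are all assumptions made for illustration, and the transform delegate stands in for whatever per-row update is needed.

```csharp
using System;
using System.IO;

public static class LineRewriter
{
    // Hypothetical helper: rewrites a file line by line through a temp file.
    public static void RewriteLines(string path, Func<string, string> transform)
    {
        string tempPath = path + ".tmp";    // assumed temp-file name

        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            // Only one line is held in memory at a time
            while ((line = reader.ReadLine()) != null)
                writer.WriteLine(transform(line));
        }

        // Replace the original only after every line has been written
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```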

To add to what Ants said, you have options.
Line by line:
    using (StreamReader fileStream = new StreamReader(sourceFileName))
    using (StreamWriter ansiWriter = new StreamWriter(destinationFileName,
        false, Encoding.GetEncoding(20127))) // code page 20127 is US-ASCII
    {
        string fileContent;
        while ((fileContent = fileStream.ReadLine()) != null)
        {
            // Strings are immutable, so keep the value the method returns
            fileContent = YourReplaceMethod(fileContent);
            ansiWriter.WriteLine(fileContent);
        }
    }
Bulk (today's machines should be able to handle 200 MB without problems):
    byte[] bytes = File.ReadAllBytes(sourceFileName);
    byte[] writeMeBytes = YourReplaceMethod(bytes);
    File.WriteAllBytes(destinationFileName, writeMeBytes);

How to read a file starting at a specific cursor point in C#?

I want to read a file but not from the beginning of the file but at a specific point of a file. For example I want to read a file after 977 characters after the beginning of the file, and then read the next 200 characters at once. Thanks.

If you want to read the file as text, skipping characters (not bytes):
    using (var textReader = System.IO.File.OpenText(path))
    {
        // read and disregard the first 977 chars
        // (ReadBlock keeps reading until the buffer is full or the file ends; Read may stop early)
        var buffer = new char[977];
        textReader.ReadBlock(buffer, 0, buffer.Length);
        // read 200 chars
        buffer = new char[200];
        textReader.ReadBlock(buffer, 0, buffer.Length);
    }
If you merely want to skip a certain number of bytes (not characters):
    using (var fileStream = System.IO.File.OpenRead(path))
    {
        // seek to starting point
        fileStream.Seek(977, SeekOrigin.Begin);
        // read up to 200 bytes; the return value is the number actually read
        var buffer = new byte[200];
        int read = fileStream.Read(buffer, 0, buffer.Length);
    }

You can use LINQ and convert the resulting array of chars to a string. Note that both approaches below load the whole file into memory with File.ReadAllText.
Add these namespaces:
    using System.Linq;
    using System.IO;
Then you can get an array of b characters starting at index a from your text file:
    char[] c = File.ReadAllText(FilePath).ToCharArray().Skip(a).Take(b).ToArray();
and build a string from the characters in c:
    string r = new string(c);
For example, I have this text in a file:
hello how are you ?
I use this code:
    char[] c = File.ReadAllText(FilePath).ToCharArray().Skip(6).Take(3).ToArray();
    string r = new string(c);
    MessageBox.Show(r);
and it shows: how
Way 2
Even simpler, using the Substring method:
    string s = File.ReadAllText(FilePath);
    string r = s.Substring(6, 3);
    MessageBox.Show(r);
Good luck.

    using (var fileStream = System.IO.File.OpenRead(path))
    {
        // seek to starting point
        fileStream.Position = 977;
        // read
    }

If you want to read specific data types from files, System.IO.BinaryReader is a good choice.
To skip by characters, let the reader decode them (BinaryReader assumes UTF-8 unless you pass a different Encoding to its constructor):
    using (var binaryreader = new BinaryReader(File.OpenRead(path)))
    {
        // read and discard the first 977 chars
        binaryreader.ReadChars(977);
        // read
        char[] data = binaryreader.ReadChars(200);
        // do what you want with data
    }
If instead you know that every character in the source file occupies exactly 1 or 2 bytes, you can seek directly:
    int bytesPerChar = 1; // 1 or 2, depending on the character size in the source file
    using (var binaryreader = new BinaryReader(File.OpenRead(path)))
    {
        // seek to starting point
        binaryreader.BaseStream.Position = 977 * bytesPerChar;
        // read (for 2-byte UTF-16 characters, pass Encoding.Unicode to the BinaryReader constructor)
        char[] data = binaryreader.ReadChars(200);
        // do what you want with data
    }
