Read and output a text file using StreamReader char by char - C#

What I am trying to do is read the file a.txt and output each character on its own line. I am having real difficulty solving this problem, and any help would be really appreciated. If you write the code, please comment it so I can understand it more clearly, as I am a beginner. Thanks.
namespace ConsoleApplication13
{
    class Program
    {
        static void Main(string[] args)
        {
            using (StreamReader r = new StreamReader("a.txt"))
            {
                string @char;
                while((@char = r.ReadBlock() != null))
                    foreach(char i in @char)
                    {
                        Console.WriteLine(i);
                    }
            }
        }
    }
}

I want to read the file and output its contents character by character, each character on a new line.
OK; there are a lot of ways to do that; the simplest (for small files) would be:
string body = File.ReadAllText("a.txt");
foreach (char c in body) Console.WriteLine(c);
To use ReadBlock to handle the file in chunks (not lines):
using (StreamReader r = new StreamReader("a.txt"))
{
    char[] buffer = new char[1024];
    int read;
    while ((read = r.ReadBlock(buffer, 0, buffer.Length)) > 0)
    {
        for (int i = 0; i < read; i++)
            Console.WriteLine(buffer[i]);
    }
}
This reads blocks of up to 1024 characters at a time, then writes out whatever we read, each character on a new line. The variable read tells us how many characters we read on that iteration; the read > 0 test (hidden slightly, but it is there) asks "have we reached the end of the file?" - as ReadBlock will return 0 at the end.

Related

Creating lines of 152 characters and adjusting line endings at ends of words

I am trying to write a little utility class for myself to do some formatting of text so that each line is as close as possible to 152 characters in length. I have written this code:
StreamReader sr = new StreamReader("C:\\Users\\Owner\\Videos\\XSplit\\Luke11\\Luke11fromweb.txt");
StreamWriter sw = new StreamWriter("C:\\Users\\Owner\\Videos\\XSplit\\Luke11\\Luke11raw.txt");
int count = 152;
char chunk;
do
{
    for (int i = 0; i < count; i++)
    {
        chunk = (char)sr.Read();
        sw.Write(chunk);
    }
    while (Char.IsWhiteSpace((char)sr.Peek()) == false && (char)sr.Peek() > -1)
    {
        chunk = (char)sr.Read();
        sw.Write(chunk);
    }
    sw.WriteLine();
} while (sr.Peek() >= 0);
sr.Close();
sw.Close();
The for statement works fine. It reads and writes 152 characters without flaw. However, there is no guarantee that 152 characters will fall at the end of a word. So I wrote the nested while statement to check if the next character is a space, and if not, to read and write that character. The inner while statement is supposed to stop when it sees that the next character is a space, and then write in the line end statement.
After the reader and writer have gone through the entire document, I close them both and should have a new document where all the lines are approximately 152 characters long and end at the end of a word.
Obviously this isn't working as I anticipated and that is the reason for my question. Since the for statement works, there is something wrong in my nested while statement (perhaps the condition?) and I am not exiting the program without errors.
Any advice would be appreciated. Thanks in advance.

Your end-of-file test is incorrect:
while (Char.IsWhiteSpace((char)sr.Peek()) == false && (char)sr.Peek() > -1)
You mean:
while (Char.IsWhiteSpace((char)sr.Peek()) == false && sr.Peek() > -1)
As per the docs:
The Peek method returns an integer value in order to determine whether the end of the file, or another error has occurred. This allows a user to first check if the returned value is -1 before casting it to a Char type.
Note "before casting": once -1 has been cast to char it is no longer negative, so comparing the cast value with -1 can never detect the end of the file.
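To see why the order matters, here is a small sketch of the two checks at end of input. The class and method names are made up for illustration, and a StringReader stands in for the file so the example needs nothing on disk.

```csharp
using System;
using System.IO;

public static class PeekEofDemo
{
    // Mirrors the original condition "(char)sr.Peek() > -1":
    // the value is cast to char first and compared afterwards.
    public static bool CastThenCompare(TextReader reader)
    {
        int peeked = reader.Peek();               // -1 at end of input
        char asChar = unchecked((char)peeked);    // -1 wraps around to '\uffff'
        return (int)asChar > -1;                  // a char is never negative
    }

    // Mirrors the corrected condition "sr.Peek() > -1":
    // the int is compared before any cast happens.
    public static bool CompareThenCast(TextReader reader)
    {
        return reader.Peek() > -1;
    }

    public static void Main()
    {
        using (var reader = new StringReader(""))
        {
            Console.WriteLine(CastThenCompare(reader));  // True
            Console.WriteLine(CompareThenCast(reader));  // False
        }
    }
}
```

With the cast applied first, the loop condition stays true at the end of the file, which is why the original inner while never stops there.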

Might I suggest something like the following.
using System;
using System.IO;
public class Program
{
    public static void Main()
    {
        int maxLength = 152;
        string inputPath = @"c:\Users\Owner\Videos\XSplit\Luke11\Luke11fromweb.txt";
        string outputPath = @"c:\Users\Owner\Videos\XSplit\Luke11\Luke11raw.txt";
        try
        {
            if (File.Exists(outputPath))
            {
                File.Delete(outputPath);
            }
            // Write to the output file and read from the input file.
            using (StreamWriter sw = new StreamWriter(outputPath))
            {
                using (StreamReader sr = new StreamReader(inputPath))
                {
                    do
                    {
                        WriteMaxPlus(sr, sw, maxLength);
                    }
                    while (sr.Peek() >= 0);
                }
            }
        }
        catch (Exception e)
        {
            Console.WriteLine("The process failed: {0}", e.ToString());
        }
    }
    private static void WriteMaxPlus(StreamReader sr, StreamWriter sw, int maxLength)
    {
        // Copy up to maxLength characters, stopping early at end of file.
        for (int i = 0; i < maxLength; i++)
        {
            if (sr.Peek() >= 0)
            {
                sw.Write((char)sr.Read());
            }
        }
        // Then keep copying until the next whitespace so the line ends at the end of a word.
        while (sr.Peek() >= 0 && !Char.IsWhiteSpace((char)sr.Peek()))
        {
            sw.Write((char)sr.Read());
        }
        sw.WriteLine();
    }
}

Counting total characters of a file

Hi I'm pretty new to C# and trying to do some exercises to get up to speed with it. I'm trying to count the total number of characters in a file but it's stopping after the first word, would someone be able to tell me where I am going wrong? Thanks in advance
public void TotalCharacterCount()
{
    string str;
    int count, i, l;
    count = i = 0;
    StreamReader reader = File.OpenText("C:\\Users\\Lewis\\file.txt");
    str = reader.ReadLine();
    l = str.Length;
    while (str != null && i < l)
    {
        count++;
        i++;
        str = reader.ReadLine();
    }
    reader.Close();
    Console.Write("Number of characters in the file is : {0}\n", count);
}

If you want to know the size of a file in bytes:
long length = new System.IO.FileInfo("C:\\Users\\Lewis\\file.txt").Length;
Console.Write($"Size of the file in bytes : {length}");
If you want to count characters to play around with C#, then here is some sample code that might help you:
int totalCharacters = 0;
// Using will do the reader.Close for you.
using (StreamReader reader = File.OpenText("C:\\Users\\Lewis\\file.txt"))
{
    string str = reader.ReadLine();
    while (str != null)
    {
        // ReadLine strips the line terminator, so line breaks are not counted.
        totalCharacters += str.Length;
        str = reader.ReadLine();
    }
}
// If you add the $ in front of the string, then you can interpolate expressions
Console.Write($"Number of characters in the file is : {totalCharacters}");

it's stopping after the first word
That is because the loop condition includes && i < l: you increment i on every iteration but never change the value of l, so the check fails once i reaches the length of the first line. (By the way, l is not a very good name; I was sure it was 1, not l.)
If you need the total number of characters in the file, you could read the whole file into a string variable and take its Length:
var count = File.ReadAllText(path).Length;
The Length property of FileInfo gives the size of the file in bytes, which is not necessarily equal to the character count (depending on the encoding, a character may take more than one byte).
As for the way you read the file, the result also depends on whether you want to count newline characters and the like.
Consider the following sample
static void Main(string[] args)
{
    var sampleWithEndLine = "a\r\n";
    var length1 = "a".Length;
    var length2 = sampleWithEndLine.Length;
    var length3 = @"a
".Length;
    Console.WriteLine($"First sample: {length1}");
    Console.WriteLine($"Second sample: {length2}");
    Console.WriteLine($"Third sample: {length3}");
    var totalCharacters = 0;
    File.WriteAllText("sample.txt", sampleWithEndLine);
    using(var reader = File.OpenText("sample.txt"))
    {
        string str = reader.ReadLine();
        while (str != null)
        {
            totalCharacters += str.Length;
            str = reader.ReadLine();
        }
    }
    Console.WriteLine($"Second sample read with stream reader: {totalCharacters}");
    Console.ReadKey();
}
For the second sample, Length returns 3, because the string actually contains three characters, while with the stream reader you get 1, because, as the ReadLine documentation says: "The string that is returned does not contain the terminating carriage return or line feed. The returned value is null if the end of the input stream is reached."
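The three ways of counting discussed in this thread can be put side by side. The following is only a sketch: the class and method names are invented for illustration, and it assumes the file is UTF-8, which is what File.WriteAllText and File.OpenText use by default.

```csharp
using System;
using System.IO;

public static class CountDemo
{
    // Size on disk, in bytes.
    public static long CountBytes(string path)
    {
        return new FileInfo(path).Length;
    }

    // Every decoded character, line terminators included.
    public static int CountAllChars(string path)
    {
        int count = 0;
        using (StreamReader reader = File.OpenText(path))
        {
            while (reader.Read() != -1)
            {
                count++;
            }
        }
        return count;
    }

    // Characters per line summed up; ReadLine drops "\r" and "\n".
    public static int CountCharsWithoutLineBreaks(string path)
    {
        int count = 0;
        using (StreamReader reader = File.OpenText(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                count += line.Length;
            }
        }
        return count;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // One two-byte character (U+00E9) followed by CR LF.
            File.WriteAllText(path, "\u00e9\r\n");
            Console.WriteLine(CountBytes(path));                  // 4
            Console.WriteLine(CountAllChars(path));               // 3
            Console.WriteLine(CountCharsWithoutLineBreaks(path)); // 1
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Which of the three numbers is "the character count" depends on what you want to measure, so it is worth deciding that before picking an approach.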

How does StreamReader read all chars, including 0x0D 0x0A chars?

I have an old .txt file I am trying to convert. Many lines (but not all) end with "0x0D 0x0D 0x0A".
This code reads all of the lines.
StreamReader srFile = new StreamReader(gstPathFileName);
while (!srFile.EndOfStream) {
    string stFileContents = srFile.ReadLine();
    ...
}
This results in extra "" strings between each .txt line. As there are some blank lines between the paragraphs, removing all "" strings removes those blank lines.
Is there a way to have StreamReader read all of the chars including the "0x0D 0x0D 0x0A"?
Edited two hours later ... the file is huge, 1.6MB.

A very simple reimplementation of ReadLine. I have done a version that returns an IEnumerable<string> because it's easier. I've put it in an extension method, hence the static class. The code is heavily commented, so it should be easy to read.
using System.Collections.Generic;
using System.IO;
using System.Linq;
public static class StreamEx
{
    public static string[] ReadAllLines(this TextReader tr, string separator)
    {
        return tr.ReadLines(separator).ToArray();
    }
    // StreamReader is based on TextReader
    public static IEnumerable<string> ReadLines(this TextReader tr, string separator)
    {
        // Handling of empty file: old remains null
        string old = null;
        // Read buffer
        var buffer = new char[128];
        while (true)
        {
            // If we already read something
            if (old != null)
            {
                // Look for the separator
                int ix = old.IndexOf(separator);
                // If found
                if (ix != -1)
                {
                    // Return the piece of line before the separator
                    yield return old.Remove(ix);
                    // Then remove the piece of line before the separator plus the separator
                    old = old.Substring(ix + separator.Length);
                    // And continue
                    continue;
                }
            }
            // old doesn't contain any separator, let's read some more chars
            int read = tr.ReadBlock(buffer, 0, buffer.Length);
            // If there are no more chars to read, break the cycle
            if (read == 0)
            {
                break;
            }
            // Add the just read chars to the old chars
            // note that null + "somestring" == "somestring"
            old += new string(buffer, 0, read);
            // A new "round" of the while cycle will search for the separator
        }
        // Now we have to handle chars after the last separator
        // If we read something
        if (old != null)
        {
            // Return all the remaining characters
            yield return old;
        }
    }
}
Note that, as written, it won't directly handle your problem :-) But it lets you select the separator you want to use. So you use "\r\n" and then you trim the excess '\r'.
Use it like this:
using (var sr = new StreamReader("somefile"))
{
    // Little LINQ to strip excess \r and to make an array
    // (note that by making an array you'll put all the file
    // in memory)
    string[] lines = sr.ReadLines("\r\n").Select(x => x.TrimEnd('\r')).ToArray();
}
or
using (var sr = new StreamReader("somefile"))
{
    // Little LINQ to strip excess \r
    // (note that the file will be read line by line, so only
    // a line at a time is in memory, plus some remaining characters
    // of the next line in the old buffer)
    IEnumerable<string> lines = sr.ReadLines("\r\n").Select(x => x.TrimEnd('\r'));
    foreach (string line in lines)
    {
        // Do something
    }
}

You could always use a BinaryReader and manually read in lines a byte at a time. Keep hold of the bytes, then when you come across 0x0d 0x0d 0x0a, make a new string of the bytes for the current line.
Note:
I'm assuming that your encoding is Encoding.UTF8 but your case might be different. Accessing bytes directly, I don't know off-hand how to interpret the encoding.
If your file has extra information, e.g. a byte order mark, that will be returned too.
Here it is:
public static IEnumerable<string> ReadLinesFromStream(string fileName)
{
    using ( var fileStream = File.OpenRead(fileName) )
    using ( BinaryReader binaryReader = new BinaryReader(fileStream) )
    {
        var bytes = new List<byte>();
        // Compare stream positions instead of calling PeekChar(), which decodes
        // characters and can fail on byte sequences that are not valid text.
        while ( fileStream.Position < fileStream.Length )
        {
            bytes.Add(binaryReader.ReadByte());
            bool newLine = bytes.Count > 2
                && bytes[bytes.Count - 3] == 0x0d
                && bytes[bytes.Count - 2] == 0x0d
                && bytes[bytes.Count - 1] == 0x0a;
            if ( newLine )
            {
                // Decode everything before the three-byte terminator.
                yield return Encoding.UTF8.GetString(bytes.Take(bytes.Count - 3).ToArray());
                bytes.Clear();
            }
        }
        if ( bytes.Count > 0 )
            yield return Encoding.UTF8.GetString(bytes.ToArray());
    }
}

A very easy solution (not optimized for memory consumption) could be:
var allLines = File.ReadAllText(gstPathFileName)
    .Split('\n');
Then, if you need to remove trailing carriage return characters, do:
for(var i = 0; i < allLines.Length; ++i)
    allLines[i] = allLines[i].TrimEnd('\r');
You can put the relevant processing into that for loop if you want. Or, if you do not want to keep the array, use this instead of the for:
foreach(var line in allLines.Select(x => x.TrimEnd('\r')))
{
    // use 'line' here ...
}

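This code works well ... reads every char.
char[] acBuf = null;
int iReadLength = 100;
while (srFile.Peek() >= 0) {
    acBuf = new char[iReadLength];
    // Read may return fewer than iReadLength chars (for example on the last chunk),
    // so build the string only from the chars actually read.
    int iCharsRead = srFile.Read(acBuf, 0, iReadLength);
    string s = new string(acBuf, 0, iCharsRead);
}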

C# - Read External CSV File Character by Character

What is the easiest way to read a file character by character in C#?
Currently, I am reading line by line by calling System.IO.File.ReadLine(). I see that there is a Read() function, but it doesn't return a character...
I would also like to know how to detect the end of a line using such an approach. The input file in question is a CSV file.

Open a TextReader (e.g. by File.OpenText - note that File is a static class, so you can't create an instance of it) and repeatedly call Read. That returns int rather than char so it can also indicate end of file:
int readResult = reader.Read();
if (readResult != -1)
{
    char nextChar = (char) readResult;
    // ...
}
Or to loop:
int readResult;
while ((readResult = reader.Read()) != -1)
{
    char nextChar = (char) readResult;
    // ...
}
Or for more funky goodness:
public static IEnumerable<char> ReadCharacters(string filename)
{
    using (var reader = File.OpenText(filename))
    {
        int readResult;
        while ((readResult = reader.Read()) != -1)
        {
            yield return (char) readResult;
        }
    }
}
...
foreach (char c in ReadCharacters("foo.txt"))
{
    ...
}
Note that by default, File.OpenText will use an encoding of UTF-8. Specify an encoding explicitly if that isn't what you want.
EDIT: To find the end of a line, you'd check whether the character is \n... you'd potentially want to handle \r specially too, if this is a Windows text file.
But if you want each line, why not just call ReadLine? You can always iterate over the characters in the line afterwards...
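For completeness, the end-of-line check described in the edit could look roughly like the sketch below. The class and method names are hypothetical, and it simply treats "\n", "\r\n" and a lone "\r" as line ends; it does not try to handle line breaks inside quoted CSV fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineSplitDemo
{
    // Reads one character at a time and yields a line whenever
    // a line terminator is seen.
    public static IEnumerable<string> ReadLinesByChar(TextReader reader)
    {
        var current = new StringBuilder();
        int readResult;
        while ((readResult = reader.Read()) != -1)
        {
            char c = (char)readResult;
            if (c == '\r' || c == '\n')
            {
                // For a Windows "\r\n" pair, swallow the '\n' as well.
                if (c == '\r' && reader.Peek() == '\n')
                {
                    reader.Read();
                }
                yield return current.ToString();
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        // Text after the last terminator, if any.
        if (current.Length > 0)
        {
            yield return current.ToString();
        }
    }

    public static void Main()
    {
        using (var reader = new StringReader("a,b\r\nc,d\n"))
        {
            foreach (string line in ReadLinesByChar(reader))
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

The Main method above prints the two lines a,b and c,d. In a real program a StreamReader (for example from File.OpenText) would take the place of the StringReader.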

Here is a snippet from MSDN:
using (StreamReader sr = new StreamReader(path))
{
    char[] c = null;
    while (sr.Peek() >= 0)
    {
        c = new char[1];
        sr.Read(c, 0, c.Length);
        // do something with c[0]
    }
}

C#: FileStream.Read() doesn't read the file up to the end, but returns 0

Here is how I do it:
static void Main(string[] args)
{
    string FileName = "c:\\error.txt";
    long FilePosition = 137647;
    FileStream fr = new FileStream(FileName, FileMode.Open);
    byte[] b = new byte[1024];
    string data = string.Empty;
    fr.Seek(FilePosition, SeekOrigin.Begin);
    UTF8Encoding encoding = new UTF8Encoding();
    while (fr.Read(b, 0, b.Length) > 0)
    {
        data += encoding.GetString(b);
    }
    fr.Close();
    string[] str = data.Split(new string[] { "\r\n" }, StringSplitOptions.None);
    foreach (string s in str)
    {
        Console.WriteLine(s);
    }
    Console.ReadKey();
}
The str array ends with these lines:
***** History for hand T5-2847880-18 (TOURNAMENT: S-976-46079) *****
Start hand: Tue Aug 11 18:14
but there are more lines in the file.
I've uploaded error.txt to sendspace: http://www.sendspace.com/file/5vgjtn
And here is the full console output: the_same_site/file/k05x3a
Please help! I'm really clueless here.
Thanks in advance!

Your code has some subtle errors and problems:
You assume that the whole buffer has been filled when you call GetString(b).
You assume that each buffer ends at the end of a character. Use a TextReader (e.g. StreamReader) to read text data, avoiding this sort of problem.
You're not closing the file if an exception occurs (use a using statement).
You're using string concatenation in a loop: prefer StringBuilder.
As others have pointed out, File.ReadAllLines would avoid a lot of this work. There's also File.ReadAllText, and TextReader.ReadToEnd for non-files.
Finally, just use Encoding.UTF8 instead of creating a new instance unless you really need to tweak some options.
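Putting those points together, a minimal sketch might look like this. The names ReadFromOffsetDemo and ReadTextFrom are invented for the example, and it assumes that the byte offset falls on a character boundary and that the file is UTF-8; neither is guaranteed for an arbitrary position in an arbitrary file.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadFromOffsetDemo
{
    // Seeks to a byte offset, then lets a StreamReader do the decoding.
    public static string ReadTextFrom(string path, long byteOffset)
    {
        // The using statements close the file even if an exception is thrown.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            stream.Seek(byteOffset, SeekOrigin.Begin);
            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                // The reader handles partial buffers and multi-byte
                // characters internally.
                return reader.ReadToEnd();
            }
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // "skip\r\n" is 6 bytes, so offset 6 is the start of "first".
            File.WriteAllText(path, "skip\r\nfirst\r\nsecond");
            string text = ReadTextFrom(path, 6);
            foreach (string line in text.Split(new[] { "\r\n" }, StringSplitOptions.None))
            {
                Console.WriteLine(line);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Because ReadToEnd builds the whole string in one go, there is no concatenation loop to replace with a StringBuilder here; for very large files, reading line by line with ReadLine may be the better fit.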

Not technically an answer to your question, but you could replace all that with:
string[] str = File.ReadAllLines("c:\\error.txt");
Edit (as promised):
Rather than missing a piece, it seems to me that you will have a duplicate of the last part. You're not reading a full 1024 bytes from the file, but you are turning all 1024 bytes into a string and appending it.
Your loop should be like this instead:
int bytesRead;
while ((bytesRead = fr.Read(b, 0, b.Length)) > 0)
{
    data += encoding.GetString(b, 0, bytesRead);
}
Other than that: what Jon said :)

Why don't you make your life easier and do this?
string[] str = System.IO.File.ReadAllLines("c:\\error.txt");

You might find it significantly easier to simply use File.ReadLines(). Skip the lines you don't care about, instead of using the position.
int counter = 0;
foreach (string s in File.ReadLines(FileName))
{
    ++counter;
    // 50 is a placeholder for the number of lines you want to skip
    if (counter > 50)
    {
        Console.WriteLine(s);
    }
}
You could also use the StreamReader, which lets you wrap the stream after setting its position, then use the ReadLine() method.

I know this is old, but check this:
public void StreamBuffer(Stream sourceStream, Stream destinationStream, int buffer = 4096)
{
    using (var memoryStream = new MemoryStream())
    {
        sourceStream.CopyTo(memoryStream);
        // ToArray() returns only the bytes written; GetBuffer() can include unused capacity.
        var memoryBuffer = memoryStream.ToArray();
        for (int i = 0; i < memoryBuffer.Length;)
        {
            var networkBuffer = new byte[buffer];
            int j = 0;
            for (; j < networkBuffer.Length && i < memoryBuffer.Length; j++)
            {
                networkBuffer[j] = memoryBuffer[i];
                i++;
            }
            // Write only the bytes copied in this round; the last chunk may be short.
            destinationStream.Write(networkBuffer, 0, j);
        }
    }
}
...and keep in mind that memory streams are a great way to get a byte array from a stream to be used in other ways. Use them.
