Reading in files that contain specific characters in C#

I have a text file named C:/test.txt:
1 2 3 4
5 6
I want to read every number in this file using StreamReader.
How can I do that?

Do you really need to use a StreamReader to do this?
IEnumerable<int> numbers =
Regex.Split(File.ReadAllText(@"c:\test.txt"), @"\D+").Where(s => s.Length > 0).Select(int.Parse);
(Obviously if it's impractical to read the entire file in one hit then you'll need to stream it, but if you're able to use File.ReadAllText then that's the way to do it, in my opinion.)
For completeness, here's a streaming version:
public IEnumerable<int> GetNumbers(string fileName)
{
using (StreamReader sr = File.OpenText(fileName))
{
string line;
while ((line = sr.ReadLine()) != null)
{
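foreach (string item in Regex.Split(line, @"\D+"))
{
// skip the empty strings Split yields for blank lines or leading/trailing separators
if (item.Length > 0) yield return int.Parse(item);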
}
}
}
}

using (StreamReader reader = new StreamReader(stream))
{
string contents = reader.ReadToEnd();
Regex r = new Regex("[0-9]+"); // "+" matches whole numbers rather than single digits
Match m = r.Match(contents);
while (m.Success)
{
int number = Convert.ToInt32(m.Value);
// do something with the number
m = m.NextMatch();
}
}

Something like so might do the trick, if what you want is to read integers from a file and store them in a list.
try
{
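List<int> theIntegers = new List<int>();
using (StreamReader sr = new StreamReader("C:/test.txt"))
{
string line;
while ((line = sr.ReadLine()) != null)
{
// sr.Read() returns character codes, so split each line and parse the tokens instead
foreach (string token in line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
theIntegers.Add(int.Parse(token));
}
}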
}
catch (Exception e)
{
//Do something clever to deal with the exception here
}

Solution for big files:
class Program
{
const int ReadBufferSize = 4096;
static void Main(string[] args)
{
var result = new List<int>();
using (var reader = new StreamReader(@"c:\test.txt"))
{
var readBuffer = new char[ReadBufferSize];
var buffer = new StringBuilder();
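int charsRead;
while ((charsRead = reader.Read(readBuffer, 0, readBuffer.Length)) > 0)
{
// only the first charsRead entries of the buffer hold fresh data
for (int i = 0; i < charsRead; i++)
{
char c = readBuffer[i];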
if (!char.IsDigit(c))
{
// we found non digit character
int newInt;
if (int.TryParse(buffer.ToString(), out newInt))
{
result.Add(newInt);
}
buffer.Remove(0, buffer.Length);
}
else
{
buffer.Append(c);
}
}
}
// check buffer
if (buffer.Length > 0)
{
int newInt;
if (int.TryParse(buffer.ToString(), out newInt))
{
result.Add(newInt);
}
}
}
result.ForEach(Console.WriteLine);
Console.ReadKey();
}
}

I might be wrong, but StreamReader does not let you set a delimiter.
You can, however, use String.Split() with a delimiter (a space in your case?) to extract all the numbers into a separate array.
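
A minimal sketch of the String.Split() idea, assuming the numbers are separated by spaces or tabs. The NumberReader class and ReadNumbers method are names made up for this illustration, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class NumberReader
{
    // Hypothetical helper: reads every whitespace-separated integer from a reader.
    public static List<int> ReadNumbers(TextReader reader)
    {
        var numbers = new List<int>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Split on spaces and tabs; RemoveEmptyEntries drops the gaps
            // left by repeated separators and by blank lines.
            string[] tokens = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string token in tokens)
            {
                numbers.Add(int.Parse(token));
            }
        }
        return numbers;
    }
}
```

Since StreamReader derives from TextReader, it could be called as NumberReader.ReadNumbers(new StreamReader(@"C:\test.txt")), ideally inside a using block. int.Parse throws on anything that is not a number, so int.TryParse may be the better fit for untrusted input.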

Something like this ought to work:
using (var sr = new StreamReader("C:/test.txt"))
{
var s = sr.ReadToEnd();
var numbers = (from x in s.Split('\n')
from y in x.Split(' ')
select int.Parse(y));
}

Something like this:
using System;
using System.IO;
class Test
{
public static void Main()
{
string path = @"C:\Test.txt";
try
{
if( File.Exists( path ) )
{
using( StreamReader sr = new StreamReader( path ) )
{
while( sr.Peek() >= 0 )
{
char c = ( char )sr.Read();
if( Char.IsNumber( c ) )
Console.Write( c );
}
}
}
}
catch (Exception e)
{
Console.WriteLine("The process failed: {0}", e.ToString());
}
}
}

Related

Read a flat file, group the lines, and write to a file (replace empty spaces with '*')

E2739158012008-10-01O9918107NPF7547379999010012008-10-0100125000000
E2739158PU0000-00-00 010012008-10-0100081625219
E3180826011985-01-14L9918007NPM4927359999010011985-01-1400005620000
E3180826PU0000-00-00 020011985-01-14000110443500021997-01-1400000518799
E3292015011985-01-16L9918007NPM4927349999010011985-01-1600003623300
I have this flat file, and I need to group its lines by the characters in positions 2 through 8 (for example 2739158, 3180826, 3292015) and write the result to another flat file.
Each group should become a single line that starts with the group field (positions 2 through 8), followed by the data of every line in that group: the leading 'E' and then the rest of the line from the 9th position onward.
I also need to replace every empty space with a star ('*').
For example
1st Line
2739158**E**012008-10-01O9918107NPF7547379999010012008-10-0100125000000*****E**012008-10-01O9918107NPF7547379999010012008-10-0100125000000
2nd Line
3180826**E**011985-01-14L9918007NPM4927359999010011985-01-1400005620000**E**011985-01-14L9918007NPM4927359999010011985-01-140000562000**E**011985-01-14L9918007NPM4927359999010011985-01-140000562000***
3rd Line
3292015**E**011985-01-16L9918007NPM4927349999010011985-01-1600003623300****
Can we do this with StreamReader in C#, please?
Any help would be highly appreciated. The file is larger than 285 MB, so is it a good idea to read it through StreamReader?
Thanks

@jdweng: Thanks very much for your input. I tried it without grouping and it works as expected. Thanks to everyone who tried to solve the issue.
// working variables that the loop below relies on
string sTest = string.Empty, sOuterLine = string.Empty, sOneLine = string.Empty, sOtherLine = string.Empty;
bool bOtherLine = false;
List<SortLines> lines = new List<SortLines>();
List<String> FinalLines = new List<String>();
using (StreamReader sr = new StreamReader(@"C:\data\Input1.txt"))
{
sr.ReadLine();
string line = "";
while (!sr.EndOfStream)
{
line = sr.ReadLine();
//line = line.Trim();
if (line.Length > 0)
{
line = line.Replace(" ", "*");
SortLines newLine = new SortLines()
{
key = line.Substring(1, 7),
line = line
};
if (sTest != newLine.key)
{
//Add the Line Items to String List
sOuterLine = sTest + sOneLine;
FinalLines.Add(sOuterLine);
string sFinalLine = newLine.line.Remove(1, 7);
string snewLine = newLine.key + sFinalLine;
sTest = snewLine.Substring(0, 7);
//To hold the data for the 1st occurence
sOtherLine = snewLine.Remove(0, 7);
bOtherLine = true;
string sKey = newLine.key;
lines.Add(newLine);
}
else if (sTest == newLine.key)
{
string sConcatLine = String.Empty;
string sFinalLine = newLine.line.Remove(1, 7);
//Check if 1st Set
if (bOtherLine == true)
{
sOneLine = sOtherLine + sFinalLine;
bOtherLine = false;
}
//If not add subsequent data
else
{
sOneLine = sOneLine + sFinalLine;
}
//Check for the last line in the flat file
if (sr.Peek() == -1)
{
sOuterLine = sTest + sOneLine;
FinalLines.Add(sOuterLine);
}
}
}
}
}
//Remove the Empty List
FinalLines.RemoveAll(x => x == "");
StreamWriter srWriter = new StreamWriter(@"C:\data\test.txt");
foreach (var group in FinalLines)
{
srWriter.WriteLine(group);
}
srWriter.Flush();
srWriter.Close();

Try code below :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
const string INPUT_FILENAME = @"c:\temp\test.txt";
const string OUTPUT_FILENAME = @"c:\temp\test1.txt";
static void Main(string[] args)
{
List<SortLines> lines = new List<SortLines>();
StreamReader reader = new StreamReader(INPUT_FILENAME);
string line = "";
while ((line = reader.ReadLine()) != null)
{
line = line.Trim();
if (line.Length > 0)
{
line = line.Replace(" ", "*");
// positions 2 through 8 are zero-based index 1, length 7
SortLines newLine = new SortLines() { key = line.Substring(1, 7), line = line };
lines.Add(newLine);
}
}
reader.Close();
var groups = lines.GroupBy(x => x.key);
StreamWriter writer = new StreamWriter(OUTPUT_FILENAME);
foreach (var group in groups)
{
foreach (SortLines sortLine in group)
{
writer.WriteLine(sortLine.line);
}
}
writer.Flush();
writer.Close();
}
}
public class SortLines : IComparable<SortLines>
{
public string line { get; set; }
public string key { get; set; }
public int CompareTo(SortLines other)
{
return key.CompareTo(other.key);
}
}
}
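
For a file of this size, a streaming variant avoids holding every line in memory. The sketch below makes two assumptions that the question does not confirm: lines with the same key are already adjacent (as in the sample), and each output line is the key followed by, for every record, its first character plus everything after the key. FlatFileGrouper and GroupSortedLines are hypothetical names:

```csharp
using System;
using System.IO;
using System.Text;

public static class FlatFileGrouper
{
    // Hypothetical helper: merges consecutive lines that share the key in
    // positions 2 through 8 (zero-based index 1, length 7) into one output line.
    public static void GroupSortedLines(TextReader input, TextWriter output)
    {
        string currentKey = null;
        var merged = new StringBuilder();
        string line;
        while ((line = input.ReadLine()) != null)
        {
            if (line.Length < 8) continue;        // too short to contain a key
            line = line.Replace(' ', '*');        // stars instead of empty spaces
            string key = line.Substring(1, 7);
            string rest = line[0] + line.Substring(8);
            if (key != currentKey)
            {
                // A new group starts: flush the previous one, if any.
                if (currentKey != null) output.WriteLine(currentKey + merged.ToString());
                currentKey = key;
                merged.Clear();
            }
            merged.Append(rest);
        }
        // Flush the last group.
        if (currentKey != null) output.WriteLine(currentKey + merged.ToString());
    }
}
```

It would be called with a StreamReader over the input file and a StreamWriter over the output file, both in using blocks. Only one group is buffered at a time, so memory use stays small even for a 285 MB file. If equal keys are not adjacent, the file has to be sorted first or grouped in memory as in the answer above.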

Replace value & save file during reading CSV file (C#)

I'm reading csv file:
string line;
StreamReader sr = new StreamReader(file.ToString());
while ((line = sr.ReadLine()) != null)
{
string col1 = line.Split(',')[10]; //old value
col1 = "my value"; //new value
}
sr.Close();
sr.Dispose();
I want to replace old value by the new.
Then I need to save the file with the changes.
How can I do that?

I suggest using the File class instead of streams and readers. LINQ is very convenient for querying data:
var modifiedData = File
.ReadLines(file.ToString())
.Select(line => line.Split(','))
.Select(items => {
//TODO: put relevant logic here: given items we should return csv line
items[10] = "my value";
return string.Join(",", items);
})
.ToList(); // <- we have to store modified data in memory
File.WriteAllLines(file.ToString(), modifiedData);
Another possibility (say, when the initial file is too large to fit in memory) is to save the modified data into a temporary file and then move it over the original:
var modifiedData = File
.ReadLines(file.ToString())
.Select(line => line.Split(','))
.Select(items => {
//TODO: put relevant logic here: given items we should return csv line
items[10] = "my value";
return string.Join(",", items);
});
string tempFile = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.tmp");
File.WriteAllLines(tempFile, modifiedData);
File.Delete(file.ToString());
File.Move(tempFile, file.ToString());

Reading an entire file at once is memory-expensive, not to mention creating a parallel copy of it. Using streams can avoid that. Try this:
void Modify()
{
using (var fs = new FileStream(file, FileMode.Open, FileAccess.ReadWrite))
{
string line;
long position;
while ((line = fs.ReadLine(out position)) != null)
{
var tmp = line.Split(',');
tmp[1] = "00"; // new value
var newLine = string.Join(",", tmp);
fs.WriteLine(position, newLine);
}
}
}
with extensions:
static class FileStreamExtensions
{
private static readonly char[] newLine = Environment.NewLine.ToCharArray();
private static readonly int length = Environment.NewLine.Length;
private static readonly char eof = '\uFFFF';
public static string ReadLine(this FileStream fs, out long position)
{
position = fs.Position;
var chars = new List<char>();
char c;
while ((c = (char)fs.ReadByte()) != eof && (chars.Count < length || !chars.Skip(chars.Count - length).SequenceEqual(newLine)))
{
chars.Add(c);
}
fs.Position--;
if (chars.Count == 0)
return null;
return new string(chars.ToArray());
}
public static void WriteLine(this FileStream fs, long position, string line)
{
var bytes = line.ToCharArray().Concat(newLine).Select(c => (byte)c).ToArray();
fs.Position = position;
fs.Write(bytes, 0, bytes.Length);
}
}
The shortcoming is that you must keep your values the same length, because the new bytes overwrite the old ones in place. E.g. 999 and __9 are both of length 3. Fixing this makes things much more complicated, so I'd leave it this way.
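
One way to honor the same-length constraint is to pad the new value to the width of the field it replaces. This is only a sketch: FixedWidthField and FitToWidth are made-up names, and the '_' padding character simply mirrors the __9 example above:

```csharp
using System;

public static class FixedWidthField
{
    // Hypothetical helper: left-pads a replacement value so it occupies
    // exactly as many characters as the field it overwrites.
    public static string FitToWidth(string newValue, int width, char padding = '_')
    {
        if (newValue.Length > width)
        {
            // A longer value would overwrite the following field or line break.
            throw new ArgumentException("Value is longer than the field it replaces.", nameof(newValue));
        }
        return newValue.PadLeft(width, padding);
    }
}
```

With this, FixedWidthField.FitToWidth("9", 3) gives "__9", which can be written over a three-character field without shifting the rest of the file.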

c# Remove rows from csv

I have two CSV files. The first file contains a list of users, and the second contains a list of duplicate users. I'm trying to remove the rows in the first file that also appear in the second file.
Here's the code I have so far:
StreamWriter sw = new StreamWriter(path3);
StreamReader sr = new StreamReader(path2);
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
string user = sr.ReadLine();
if (line != user)
{
sw.WriteLine(line);
}
}
File 1 example:
Modify,ABAMA3C,Allpay - Free State - HO,09072701
Modify,ABCG327,Processing Centre,09085980
File 2 Example:
Modify,ABAA323,Group HR Credit Risk & Finance
Modify,ABAB959,Channel Sales & Service,09071036
Any suggestions?
Thanks.

All you'd have to do is change the following file paths in the code below and you will get a file back (file one) without the duplicate users from file 2. This code was written with the idea in mind that you want something that is easy to understand. Sure there are other more elegant solutions, but I wanted to make it as basic as possible for you:
(Paste this in the main method of your program)
string line;
StreamReader sr = new StreamReader(@"C:\Users\J\Desktop\texts\First.txt");
StreamReader sr2 = new StreamReader(@"C:\Users\J\Desktop\texts\Second.txt");
List<String> fileOne = new List<string>();
List<String> fileTwo = new List<string>();
while (sr.Peek() >= 0)
{
line = sr.ReadLine();
if(line != "")
{
fileOne.Add(line);
}
}
sr.Close();
while (sr2.Peek() >= 0)
{
line = sr2.ReadLine();
if (line != "")
{
fileTwo.Add(line);
}
}
sr2.Close();
var t = fileOne.Except(fileTwo);
StreamWriter sw = new StreamWriter(@"C:\Users\justin\Desktop\texts\First.txt");
foreach(var z in t)
{
sw.WriteLine(z);
}
sw.Flush();
sw.Close();

If this is not homework, but a production thing, and you can install assemblies, you'll save 3 hours of your life if you swallow your pride and use a piece of the VB library:
There are many exceptions (CR/LF between commas=legal in quotes; different types of quotes; etc.) This will handle anything excel will export/import.
Sample code to load a 'Person' class pulled from a program I used it in:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(CSVPath)
Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
Reader.Delimiters = New String() {","}
Reader.TrimWhiteSpace = True
Reader.HasFieldsEnclosedInQuotes = True
While Not Reader.EndOfData
Try
Dim st2 As New List(Of String)
st2.addrange(Reader.ReadFields())
If iCount > 0 Then ' ignore first row = field names
Dim p As New Person
p.CSVLine = st2
p.FirstName = st2(1).Trim
If st2.Count > 2 Then
p.MiddleName = st2(2).Trim
Else
p.MiddleName = ""
End If
p.LastNameSuffix = st2(0).Trim
If st2.Count > 5 Then
p.TestCase = st2(5).Trim
End If
If st2(3) > "" Then
p.AccountNumbersFromCase.Add(st2(3))
End If
While p.CSVLine.Count < 15
p.CSVLine.Add("")
End While
cases.Add(p)
End If
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
End Try
iCount += 1
End While
End Using
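
The same parser can be used from C# by referencing the Microsoft.VisualBasic assembly. A minimal sketch, assuming comma-delimited input with optional quoted fields; CsvRows and Parse are names invented for this example:

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvRows
{
    // Hypothetical helper: parses comma-separated text into rows of fields,
    // letting TextFieldParser deal with quotes and embedded commas.
    public static List<string[]> Parse(TextReader source)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            parser.TrimWhiteSpace = true;
            while (!parser.EndOfData)
            {
                // ReadFields throws MalformedLineException on a line it cannot parse.
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

For the two-file comparison, each file could be parsed this way and the rows compared on whichever column identifies the user.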

Use this to close the streams properly:
using(var sw = new StreamWriter(path3))
using(var sr = new StreamReader(path2))
{
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
string user = sr.ReadLine();
if (line != user)
{
sw.WriteLine(line);
}
}
}
For help with the actual removal or comparison logic, answer El Ronnoco's comment above...

You need to close the streams or use a using statement:
sw.Close();
using(StreamWriter sw = new StreamWriter(@"c:\test3.txt"))

You can use LINQ...
class Program
{
static void Main(string[] args)
{
var fullList = "TextFile1.txt".ReadAsLines();
var removeThese = "TextFile2.txt".ReadAsLines();
//Change this line if you need to change the filter results.
//Note: this assume you are wanting to remove results from the first
// list when the entire record matches. If you want to match on
// only part of the list you will need to split/parse the records
// and then filter your results.
var cleanedList = fullList.Except(removeThese);
cleanedList.WriteAsLinesTo("result.txt");
}
}
public static class Tools
{
public static IEnumerable<string> ReadAsLines(this string filename)
{
using (var reader = new StreamReader(filename))
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
public static void WriteAsLinesTo(this IEnumerable<string> lines, string filename)
{
using (var writer = new StreamWriter(filename) { AutoFlush = true, })
foreach (var line in lines)
writer.WriteLine(line);
}
}

using(var sw = new StreamWriter(path3))
using(var sr = new StreamReader(path))
{
string []arrRemove = File.ReadAllLines(path2);
HashSet<string> listRemove = new HashSet<string>();
foreach(string s in arrRemove)
{
string []sa = s.Split(',');
if( sa.Length < 2 ) continue;
listRemove.Add(sa[1].ToUpper());
}
string line = sr.ReadLine();
while( line != null )
{
string []sa = line.Split(',');
if( sa.Length < 2 )
sw.WriteLine(line);
else if( !listRemove.Contains(sa[1].ToUpper()) )
sw.WriteLine(line);
line = sr.ReadLine();
}
}

How to give a file path in code instead of command line

I've been working on my module exercises and I came across this code snippet which reads the text file and prints the details about it.
It's working fine, but I just want to know how to give the path of the text file in the code itself other than giving the path in the command line.
Below is my code.
class Module06
{
public static void Exercise01(string[] args)
{
string fileName = args[0];
FileStream stream = new FileStream(fileName, FileMode.Open);
StreamReader reader = new StreamReader(stream);
int size = (int)stream.Length;
char[] contents = new char[size];
for (int i = 0; i < size; i++)
{
contents[i] = (char)reader.Read();
}
reader.Close();
Summarize(contents);
}
static void Summarize(char[] contents)
{
int vowels = 0, consonants = 0, lines = 0;
foreach (char current in contents)
{
if (Char.IsLetter(current))
{
if ("AEIOUaeiou".IndexOf(current) != -1)
{
vowels++;
}
else
{
consonants++;
}
}
else if (current == '\n')
{
lines++;
}
}
Console.WriteLine("Total no of characters: {0}", contents.Length);
Console.WriteLine("Total no of vowels : {0}", vowels);
Console.WriteLine("Total no of consonants: {0}", consonants);
Console.WriteLine("Total no of lines : {0}", lines);
}
}

In your static void Main, call
string[] args = {"filename.txt"};
Module06.Exercise01(args);

Reading a text file is much easier with File.ReadAllText: you don't need to think about closing the file, you just use it. It accepts the file name as a parameter.
string fileContent = File.ReadAllText("path to my file");

string fileName = @"path\to\file.txt";

C# Find if a word is in a document

I am looking for a way to check if the "foo" word is present in a text file using C#.
I could use a regular expression, but I'm not sure it will work if the word is split across two lines. I have the same issue with a StreamReader that enumerates over the lines.
Any comments?

What's wrong with a simple search?
If the file is not large and memory is not a problem, simply read the entire file into a string (with the ReadToEnd() method) and use string.Contains().
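
A sketch of that simple search, with one addition that is my own assumption about what the question needs: an option to drop line breaks first, so a word broken across two lines still matches. TextSearch and ContainsText are hypothetical names:

```csharp
using System;
using System.IO;

public static class TextSearch
{
    // Hypothetical helper: reads everything from the reader and looks for
    // the text, ignoring case.
    public static bool ContainsText(TextReader reader, string text, bool ignoreLineBreaks)
    {
        string contents = reader.ReadToEnd();
        if (ignoreLineBreaks)
        {
            // Joining the lines lets "fo" + line break + "o" match "foo",
            // but it also glues the last word of one line to the first word of the next.
            contents = contents.Replace("\r", "").Replace("\n", "");
        }
        return contents.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

It can be called with a StreamReader over the file, ideally inside a using block. Whether joining lines is acceptable depends on how the file was wrapped, so the flag is left to the caller.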

Here ya go. So we look at the string as we read the file and we keep track of the first word last word combo and check to see if matches your pattern.
string pattern = "foo";
string input = null;
string lastword = string.Empty;
string firstword = string.Empty;
bool result = false;
FileStream FS = new FileStream("File name and path", FileMode.Open, FileAccess.Read, FileShare.Read);
StreamReader SR = new StreamReader(FS);
while ((input = SR.ReadLine()) != null)
{
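int firstSpace = input.IndexOf(" ");
// a line without spaces is a single word; Substring(0, -1) would throw
firstword = firstSpace < 0 ? input : input.Substring(0, firstSpace);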
if(lastword.Trim() != string.Empty) { firstword = lastword.Trim() + firstword.Trim(); }
Regex RegPattern = new Regex(pattern);
Match Match1 = RegPattern.Match(input);
string value1 = Match1.ToString();
if (pattern.Trim() == firstword.Trim() || value1 != string.Empty) { result = true; }
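string trimmed = input.Trim();
// LastIndexOf returns -1 when there is no space, so + 1 yields the whole line
lastword = trimmed.Substring(trimmed.LastIndexOf(" ") + 1);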
}

Here is a quick quick example using LINQ
class Program
{
static void Main(string[] args)
{
{ //LINQ version
bool hasFoo = "file.txt".AsLines()
.Any(l => l.Contains("foo"));
}
{ // No LINQ or Extension Methods needed
bool hasFoo = false;
foreach (var line in Tools.AsLines("file.txt"))
if (line.Contains("foo"))
{
hasFoo = true;
break;
}
}
}
}
public static class Tools
{
public static IEnumerable<string> AsLines(this string filename)
{
using (var reader = new StreamReader(filename))
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
while (line.EndsWith("-") && !reader.EndOfStream)
line = line.Substring(0, line.Length - 1)
+ reader.ReadLine();
yield return line;
}
}
}

What about if the line contains football? Or fool? If you are going to go down the regular expression route you need to look for word boundaries.
Regex r = new Regex(@"\bfoo\b"); // verbatim string, otherwise \b is a backspace character
Also ensure you are taking into consideration case insensitivity if you need to.
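
Both points can be combined in a small helper. This is a sketch under the assumption that a case-insensitive, whole-word match is what you want; WordSearch and ContainsWholeWord are made-up names:

```csharp
using System.Text.RegularExpressions;

public static class WordSearch
{
    // Hypothetical helper: true only when the word appears on its own,
    // not as part of a longer word such as "football" or "fool".
    public static bool ContainsWholeWord(string input, string word)
    {
        // Regex.Escape keeps characters like '.' or '+' in the word from
        // being treated as regex syntax.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
    }
}
```

The \b anchors only behave as word boundaries when the word itself starts and ends with a letter, digit or underscore.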

You don't need regular expressions in a case this simple. Simply loop over the lines and check if it contains foo.
using (StreamReader sr = File.OpenText("filename"))
{
string line = null;
while (!sr.EndOfStream) {
line = sr.ReadLine();
if (line.Contains("foo"))
{
// foo was found in the file
}
}
}

You could construct a regex which allows for newlines to be placed between every character.
private static bool IsSubstring(string input, string substring)
{
string[] letters = new string[substring.Length];
for (int i = 0; i < substring.Length; i += 1)
{
letters[i] = substring[i].ToString();
}
string regex = @"\b" + string.Join(@"(\r?\n?)", letters) + @"\b";
return Regex.IsMatch(input, regex, RegexOptions.ExplicitCapture);
}
