How to find average from multiple lines of a text file


I need to take a number from every line of a text file and find the average. I'm using a StreamReader to read the number from each line, but I don't know where to go from here. Here's what I've done so far:
using (StreamReader sr = new StreamReader("pupilSkiTimes.txt"))
{
string line = "";
while ((line = sr.ReadLine()) != null)
{
string[] components = line.Split("~".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
skiTime.Add(components[7]);
}
sr.Close();
}
How do I get this to read from every line of the text file and, once that's done, how do I get the average?
In case you need to know, the data I'm trying to read are doubles, e.g. "23.43".

Here is how I would do it. As you mentioned in the comments, components[7] holds the double value read from each line of the file.
We need to parse it to a double, add it to a running sum, and divide the sum by the number of values that were parsed successfully. If you want the average over all lines, including those where parsing failed, move count++ out of the if statement.
using (StreamReader sr = new StreamReader("pupilSkiTimes.txt"))
{
string line;
double sum = 0;
int count = 0;
while ((line = sr.ReadLine()) != null)
{
string[] components = line.Split("~".ToCharArray(),
StringSplitOptions.RemoveEmptyEntries);
if (double.TryParse(components[7], out var result))
{
count++;
sum += result;
}
}
sr.Close();
var average = sum / count;
Console.WriteLine(average);
}

A small reusable method is handy for this situation; I use something similar in my own code.
It takes the file path, the separator, and the index of the field to average.
static double getAvgAtIndex(string fPath, char seperator, int index)
{
double sum = 0;
int counter = 0;
using (StreamReader sr = new StreamReader(fPath))
{
string line = "";
while ((line = sr.ReadLine()) != null)
{
double rawData = 0;
string[] lineData = line.Split(seperator, StringSplitOptions.RemoveEmptyEntries);
double.TryParse(lineData[index], out rawData);
sum += rawData;
counter++;
}
sr.Close();
}
return sum / counter;
}
Usage:
static void Main(string[] args)
{
Console.WriteLine("The Avg is: {0}", getAvgAtIndex(@"..\myTextFile.txt", '~', 1));
// The Avg is: 34.688
}

Here is how to use LINQ to clean up the code a bit
static class Program
{
static void Main(string[] args)
{
var data = File.ReadAllLines("pupilSkiTimes.txt")
.Select((line)=> line.Split("~".ToCharArray(), StringSplitOptions.RemoveEmptyEntries));
List<double> skiTime = new List<double>();
foreach (var parts in data)
{
if (double.TryParse(parts[7], out double x))
{
skiTime.Add(x);
}
}
double average = skiTime.Average();
}
}
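The parse, sum and count steps from the answers above can also be folded into one small helper. The following is only a sketch: the class name SkiTimes and the method AverageAt are made up for illustration, and it assumes the numbers use a dot as the decimal separator (hence CultureInfo.InvariantCulture). Lines that are too short or do not parse are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class SkiTimes
{
    // Hypothetical helper: averages the field at `index` over all lines
    // where that field exists and parses as a double.
    public static double AverageAt(IEnumerable<string> lines, char separator, int index)
    {
        var values = new List<double>();
        foreach (string line in lines)
        {
            string[] parts = line.Split(new[] { separator }, StringSplitOptions.RemoveEmptyEntries);
            // Skip lines that are too short or whose field is not a number.
            if (parts.Length > index &&
                double.TryParse(parts[index], NumberStyles.Float, CultureInfo.InvariantCulture, out double value))
            {
                values.Add(value);
            }
        }
        // Average() throws on an empty sequence, so return NaN instead.
        return values.Count > 0 ? values.Average() : double.NaN;
    }
}
```

With the file from the question this could be called as SkiTimes.AverageAt(File.ReadLines("pupilSkiTimes.txt"), '~', 7). File.ReadLines (from System.IO) streams the file lazily instead of loading every line at once.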

Related

How to read C# . using StreamReader

I'm trying to read a string with StreamReader, but I don't know how to read it.
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
namespace
{
class Program
{
static void Main(string[] args)
{
string itemCostsInput = "25.34\n10.99\n250.22\n21.87\n50.24\n15";
string payerCountInput = "8\n";
string individualCostInput = "52.24\n";
double individualCost = RestaurantBillCalculator.CalculateIndividualCost(reader2, totalCost);
Debug.Assert(individualCost == 54.14);
uint payerCount = RestaurantBillCalculator.CalculatePayerCount(reader3, totalCost);
Debug.Assert(payerCount == 9);
}
}
}
}
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace as
{
public static class RestaurantBillCalculator
{
public static double CalculateTotalCost(StreamReader input)
{
// I want to read the input (not System.IO.StreamReader,
// 25.34
// 10.99
// 250.22
// 21.87
// 50.24
// 15
//below is what i tried..
int[] numbers = new int[6];
for (int i = 0; i < 5; i++)
{
numbers[int.Parse(input.ReadLine())]++;
}
for (int i = 0; i < 5; i++)
{
Console.WriteLine(numbers[i]);
}
return 0;
}
public static double CalculateIndividualCost(StreamReader input, double totalCost)
{
return 0;
}
public static uint CalculatePayerCount(StreamReader input, double totalCost)
{
return 0;
}
}
}
Even when I googled it, only file input/output results came up for that phrase.
I just want to take a simple string and read it.
int[] numbers = new int[6]; // The number at the index number
// take the given numbers
for (int i = 0; i < n; i++)
{
numbers[int. Parse(sr. ReadLine())]++;
}
I tried the above method, but it didn't work.
I just want to read the contents of itemCostsInput as they are and store each value by index. If I just call Console.WriteLine on the reader, it prints System.IO.StreamReader.
I want to read and save each of the values in itemCostsInput.
I'm sorry I'm not good at English
I expected the input to be read as:
25.34
10.99
250.22
21.87
50.24
15
but the console prints System.IO.StreamReader.

I think these lines are the ones causing the most trouble:
for (int i = 0; i < 5; i++)
{
numbers[int.Parse(input.ReadLine())]++;
}
Should be
for (int i = 0; i < 5; i++)
{
numbers[i] = int.Parse(input.ReadLine());
}
But since your input is decimal (in string form, because it comes from the StreamReader), int.Parse will fail on values like 25.34, so numbers should probably be an array of decimals filled with decimal.Parse.
Also note that the loop assumes the input has at least 5 lines; with fewer, ReadLine returns null and the program breaks. I'm leaving this here in the hope that it clarifies something for you.
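Both points (decimal values and an unknown number of lines) can be handled by reading until ReadLine returns null. The sketch below is not the asker's RestaurantBillCalculator; BillSketch and SumLines are hypothetical names. It accepts any TextReader, and it assumes a dot as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class BillSketch
{
    // Reads one number per line until the reader is exhausted and returns the sum.
    // Lines that do not parse as a decimal are ignored.
    public static decimal SumLines(TextReader input)
    {
        decimal total = 0m;
        string line;
        while ((line = input.ReadLine()) != null)
        {
            if (decimal.TryParse(line.Trim(), NumberStyles.Number,
                                 CultureInfo.InvariantCulture, out decimal value))
            {
                total += value;
            }
        }
        return total;
    }
}
```

Because StreamReader and StringReader both derive from TextReader, the same method works for a file and for an in-memory string: BillSketch.SumLines(new StringReader("25.34\n10.99\n250.22\n21.87\n50.24\n15")) returns 373.66.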

Your code does not make sense in its current state.
Please read up on Streams.
Usually you'd get a stream from a file or from a network connection but not from a string.
You are confusing integer and double.
The double data type represents floating point numbers.
It seems to me that you just started programming and are missing out on most of the fundamentals.
First, convert your string input into a stream:
static System.IO.Stream GetStream(string input)
{
Stream stream = new MemoryStream();
StreamWriter writer = new StreamWriter(stream);
writer.Write(input);
writer.Flush();
stream.Position = 0;
return stream;
}
Now you can convert your input to a stream like this:
// ... code ...
string itemCostsInput = "25.34\n10.99\n250.22\n21.87\n50.24\n15";
var dataStream = GetStream(itemCostsInput);
// ... code ...
Now that you have converted your string input into a stream, you can start to parse your data and extract the numbers:
static List<double> GetDoubleFromStream(Stream stream)
{
if (stream == null) {
return new List<double>();
}
const char NEWLINE = '\n';
List<double> result = new List<double>();
using (var reader = new StreamReader(stream))
{
// Continue until end of stream has been reached.
while (reader.Peek() > -1)
{
string temp = string.Empty;
// Read while not end of stream and char is not new line.
while (reader.Peek() != NEWLINE && reader.Peek() > -1) {
temp += (char)reader.Read();
}
// Perform another read operation
// to skip the current new line character
// and continue reading.
reader.Read();
// Parse data to double if valid.
if (!(string.IsNullOrEmpty(temp)))
{
double d;
// Allow decimal points and ignore culture.
if (double.TryParse(
temp,
NumberStyles.AllowDecimalPoint,
CultureInfo.InvariantCulture,
out d))
{
result.Add(d);
}
}
}
}
return result;
}
This would be your intermediate result:
// ... code ...
string itemCostsInput = "25.34\n10.99\n250.22\n21.87\n50.24\n15";
var dataStream = GetStream(itemCostsInput);
var result = GetDoubleFromStream(dataStream);
// ... code ...

Parsing a huge text file(around 2GB) with custom delimiters

I have a huge text file around 2GB which I am trying to parse in C#.
The file has custom delimiters for rows and columns. I want to parse the file and extract the data and write to another file by inserting column header and replacing RowDelimiter by newline and ColumnDelimiter by tab so that I can get the data in tabular format.
sample data:
1'~'2'~'3#####11'~'12'~'13
RowDelimiter: #####
ColumnDelimiter: '~'
I keep getting a System.OutOfMemoryException on the following line:
while ((line = rdr.ReadLine()) != null)
public void ParseFile(string inputfile,string outputfile,string header)
{
using (StreamReader rdr = new StreamReader(inputfile))
{
string line;
while ((line = rdr.ReadLine()) != null)
{
using (StreamWriter sw = new StreamWriter(outputfile))
{
//Write the Header row
sw.Write(header);
//parse the file
string[] rows = line.Split(new string[] { ParserConstants.RowSeparator },
StringSplitOptions.None);
foreach (string row in rows)
{
string[] columns = row.Split(new string[] {ParserConstants.ColumnSeparator},
StringSplitOptions.None);
foreach (string column in columns)
{
sw.Write(column + "\\t");
}
sw.Write(ParserConstants.NewlineCharacter);
Console.WriteLine();
}
}
Console.WriteLine("File Parsing completed");
}
}
}

As already mentioned in the comments, you won't be able to use ReadLine to handle this; you'll essentially have to process the data one byte (or character) at a time. The good news is that this is basically how ReadLine works anyway, so we're not losing much in this case.
Using a StreamReader we can read a series of characters from the source stream (in whatever encoding you need) into an array. Using that and a StringBuilder we can process the stream in chunks and check for separator sequences on the way.
Here's a method that will handle an arbitrary delimiter:
public static IEnumerable<string> ReadDelimitedRows(StreamReader reader, string delimiter)
{
char[] delimChars = delimiter.ToArray();
int matchCount = 0;
char[] buffer = new char[512];
int rc = 0;
StringBuilder sb = new StringBuilder();
while ((rc = reader.Read(buffer, 0, buffer.Length)) > 0)
{
for (int i = 0; i < rc; i++)
{
char c = buffer[i];
if (c == delimChars[matchCount])
{
if (++matchCount >= delimChars.Length)
{
// found full row delimiter
yield return sb.ToString();
sb.Clear();
matchCount = 0;
}
}
else
{
if (matchCount > 0)
{
// append previously matched portion of the delimiter
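// Append(char[], int, int) copies the characters themselves;
// Append(delimChars.Take(...)) would append the enumerable's type name.
sb.Append(delimChars, 0, matchCount);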
matchCount = 0;
}
sb.Append(c);
}
}
}
// return the last row if found
if (sb.Length > 0)
yield return sb.ToString();
}
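This should handle most cases where part of your row delimiter appears in the actual data. One limitation: after a partial match fails, the current character is not re-checked against the start of the delimiter, so a delimiter that begins immediately after a failed partial match can be missed.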
In order to translate your file from the input format you describe to a simple tab-delimited format you could do something along these lines:
const string RowDelimiter = "#####";
const string ColumnDelimiter = "'~'";
using (var reader = new StreamReader(inputFilename))
using (var writer = new StreamWriter(File.Create(outputFilename)))
{
foreach (var row in ReadDelimitedRows(reader, RowDelimiter))
{
// WriteLine ends each output row with a newline, replacing the row delimiter.
writer.WriteLine(row.Replace(ColumnDelimiter, "\t"));
}
}
That should process fairly quickly without eating up too much memory. Some adjustments might be required for non-ASCII output.
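For comparison, here is a simpler (and probably slower) variant of the same idea: append every character to the StringBuilder and check after each one whether the builder now ends with the delimiter. This is a sketch rather than a tuned implementation; DelimitedRows, Read and EndsWith are names invented for the example. It splits at the first occurrence of the delimiter, the same way string.Split does.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class DelimitedRows
{
    // Lazily yields the rows of `reader`, split on an arbitrary multi-character delimiter.
    public static IEnumerable<string> Read(TextReader reader, string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
            throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));

        var sb = new StringBuilder();
        var buffer = new char[4096];
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                sb.Append(buffer[i]);
                if (EndsWith(sb, delimiter))
                {
                    // Drop the delimiter and emit the finished row.
                    sb.Length -= delimiter.Length;
                    yield return sb.ToString();
                    sb.Clear();
                }
            }
        }
        // Emit a trailing row that is not followed by a delimiter.
        if (sb.Length > 0)
            yield return sb.ToString();
    }

    private static bool EndsWith(StringBuilder sb, string suffix)
    {
        if (sb.Length < suffix.Length) return false;
        int offset = sb.Length - suffix.Length;
        for (int i = 0; i < suffix.Length; i++)
        {
            if (sb[offset + i] != suffix[i]) return false;
        }
        return true;
    }
}
```

Only the current row is held in memory, so this stays within bounds as long as individual rows are of reasonable size.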

Read the data into a buffer and then do your parsing.
using (StreamReader rdr = new StreamReader(inputfile))
using (StreamWriter sw = new StreamWriter(outputfile))
{
char[] buffer = new char[256];
int read;
//Write the Header row
sw.Write(header);
string remainder = string.Empty;
while ((read = rdr.Read(buffer, 0, buffer.Length)) > 0)
{
// Prepend the unfinished tail of the previous chunk so that a row
// separator split across two reads is still recognised.
string bufferData = remainder + new string(buffer, 0, read);
//parse the file
string[] rows = bufferData.Split(
new string[] { ParserConstants.RowSeparator },
StringSplitOptions.None);
int completeRows = rows.Length - 1;
remainder = rows.Last();
foreach (string row in rows.Take(completeRows))
{
string[] columns = row.Split(
new string[] {ParserConstants.ColumnSeparator},
StringSplitOptions.None);
foreach (string column in columns)
{
sw.Write(column + "\t");
}
sw.Write(ParserConstants.NewlineCharacter);
Console.WriteLine();
}
}
if (remainder.Length > 0)
{
string[] columns = remainder.Split(
new string[] {ParserConstants.ColumnSeparator},
StringSplitOptions.None);
foreach (string column in columns)
{
sw.Write(column + "\t");
}
sw.Write(ParserConstants.NewlineCharacter);
Console.WriteLine();
}
Console.WriteLine("File Parsing completed");
}

The problem you have is that you are eagerly consuming the whole file and placing it in memory. Attempting to split a 2GB file in memory is going to be problematic, as you now know.
The solution? Consume one line at a time. Because your file doesn't have a standard line separator, you'll have to implement a custom parser that does this for you. The following code does just that (or I think it does; I haven't tested it). It can probably be improved a lot from a performance perspective, but it should at least get you started in the right direction (C# 7 syntax):
public static IEnumerable<string> GetRows(string path, string rowSeparator)
{
bool tryParseSeparator(StreamReader reader, char[] buffer)
{
var count = reader.Read(buffer, 0, buffer.Length);
if (count != buffer.Length)
return false;
return Enumerable.SequenceEqual(buffer, rowSeparator);
}
using (var reader = new StreamReader(path))
{
int peeked;
var rowBuffer = new StringBuilder();
var separatorBuffer = new char[rowSeparator.Length];
while ((peeked = reader.Peek()) > -1)
{
if ((char)peeked == rowSeparator[0])
{
if (tryParseSeparator(reader, separatorBuffer))
{
yield return rowBuffer.ToString();
rowBuffer.Clear();
}
else
{
rowBuffer.Append(separatorBuffer);
}
}
else
{
rowBuffer.Append((char)reader.Read());
}
}
if (rowBuffer.Length > 0)
yield return rowBuffer.ToString();
}
}
Now you have a lazy enumeration of rows from your file, and you can process it as you intended to:
foreach (var row in GetRows(inputFile, ParserConstants.RowSeparator))
{
var columns = row.Split(new string[] {ParserConstants.ColumnSeparator},
StringSplitOptions.None);
//etc.
}

I think this should do the trick...
public void ParseFile(string inputfile, string outputfile, string header)
{
int blockSize = 1024;
using (var file = File.OpenRead(inputfile))
{
using (StreamWriter sw = new StreamWriter(outputfile))
{
int bytes = 0;
long parsedBytes = 0; // long, because the byte count of a ~2GB file may overflow int
var buffer = new byte[blockSize];
string lastRow = string.Empty;
while ((bytes = file.Read(buffer, 0, buffer.Length)) > 0)
{
// Because the buffer edge could split a RowDelimiter, we need to keep the
// last row from the prior split operation. Append the new buffer to the
// last row from the prior loop iteration.
lastRow += Encoding.Default.GetString(buffer,0, bytes);
//parse the file
string[] rows = lastRow.Split(new string[] { ParserConstants.RowSeparator }, StringSplitOptions.None);
// We cannot process the last row in this set because it may not be a complete
// row, and tokens could be clipped.
if (rows.Count() > 1)
{
for (int i = 0; i < rows.Count() - 1; i++)
{
sw.Write(new Regex(ParserConstants.ColumnSeparator).Replace(rows[i], "\t") + ParserConstants.NewlineCharacter);
}
}
lastRow = rows[rows.Count() - 1];
parsedBytes += bytes;
// The following statement is not quite true because we haven't parsed the lastRow.
Console.WriteLine($"Parsed {parsedBytes:N0} bytes");
}
// Now that there are no more bytes to read, we know that the lastrow is complete.
sw.Write(new Regex(ParserConstants.ColumnSeparator).Replace(lastRow, "\t"));
}
}
Console.WriteLine("File Parsing completed.");
}

Late to the party here, but in case anyone else wants an easy way to load such a large CSV file with custom delimiters, Cinchoo ETL does the job for you.
using (var parser = new ChoCSVReader("CustomNewLine.csv")
.WithDelimiter("~")
.WithEOLDelimiter("#####")
)
{
foreach (dynamic x in parser)
Console.WriteLine(x.DumpAsJson());
}
Disclaimer: I'm the author of this library.

How to read from a text file then convert the text into a string then an integer

OK, so I'm trying to read text from a text file as a string and then convert it to an integer so that I can use the values for my array (I know there are simpler ways of stating how big a 2D array is, but I want to do it this way so I can learn).
Map.txt (first line of the file):
20, 20
The rest of the file is just an integer map.
Here is the code that reads the text and displays the map. Again, I want to take the first line of Map.txt as a string and convert it to ints so that I can use them for other things.
static void worldLoad()
{
int counter = 0; //Why do I need to declare it as 0....
string line;
//Read the file
System.IO.StreamReader file = new System.IO.StreamReader(@"Map.txt");
while((line = file.ReadLine()) != null)
{
Console.WriteLine(line);
counter = counter + 1;
if(counter == 1)
{
Console.Clear();
}
if(counter == 21)
{
break;
}
}
}

No need to convert file.ReadLine() to a string, it's already a string. What you want to do is int.TryParse(). See below:
static void Main(string[] args)
{
int counter = 0;
string line;
int output;
//Read the file
System.IO.StreamReader file = new System.IO.StreamReader(@"Map.txt");
while ((line = file.ReadLine()) != null)
{
int.TryParse(line, out output);
Console.WriteLine(line);
counter = counter + 1;
if (counter == 1)
{
Console.Clear();
}
if (counter == 21)
{
break;
}
}
}
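Note that int.TryParse returns false for a whole line such as "20, 20", because the line contains two numbers and a comma. One way to handle the header is to split it on the comma first and parse each part. The sketch below uses a hypothetical helper, MapHeader.TryParseSize, and assumes the first number is the width and the second the height; the question does not say which is which.

```csharp
using System;
using System.Globalization;

public static class MapHeader
{
    // Hypothetical helper: parses a header such as "20, 20" into two integers.
    // Treating the first value as width and the second as height is an assumption.
    public static bool TryParseSize(string line, out int width, out int height)
    {
        width = 0;
        height = 0;
        if (line == null) return false;

        string[] parts = line.Split(',');
        return parts.Length == 2
            && int.TryParse(parts[0].Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out width)
            && int.TryParse(parts[1].Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out height);
    }
}
```

The first line returned by file.ReadLine() could be passed to this method before the loop that prints the rest of the map, and the two values could then be used to size the 2D array.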

How to search a file for a string, display the line containing the string and also the 6 lines preceding it

I am trying to search through a text file for a string. Once I have found this string, I need to display that line and also the 6 preceding lines, which contain the details about the error message in the string. I have been searching for similar code and have found the following, but it doesn't meet my requirements. I'm just wondering if it's possible to do this.
Thanks,
John.
private static void Main(string[] args)
{
string cacheline = "";
string line;
System.IO.StreamReader file = new
System.IO.StreamReader(@"D:\Temp\AccessOutlook.txt");
List<string> lines = new List<string>();
while ((line = file.ReadLine()) != null)
{
if (line.Contains("errors"))
{
lines.Add(cacheline);
}
cacheline = line;
}
file.Close();
foreach (var l in lines)
{
Console.WriteLine(l);
}
}
}

This is probably what you want:
static void Main(string[] args)
{
Queue<string> lines = new Queue<string>();
using (var reader = new StreamReader(args[0]))
{
string line;
while ((line = reader.ReadLine()) != null)
{
if (line.Contains("error"))
{
Console.WriteLine("----- ERROR -----");
foreach (var errLine in lines)
Console.WriteLine(errLine);
Console.WriteLine(line);
Console.WriteLine("-----------------");
}
lines.Enqueue(line);
while (lines.Count > 6)
lines.Dequeue();
}
}
}

You can keep caching the lines until you find the line you are looking for:
using(var file = new StreamReader(@"D:\Temp\AccessOutlook.txt"))
{
List<string> lines = new List<string>();
string line;
while ((line = file.ReadLine()) != null)
{
if (!line.Contains(myString))
{
lines.Add(line);
}
else
{
Console.WriteLine(string.Join(Environment.NewLine, lines.Concat(new[] { line })));
}
if(lines.Count > 6) lines.RemoveAt(0);
}
}

string filename = "filename"; // Put your own filename here.
string target = "target"; // Put your target string here.
int numLinesToShow = 7;
var lines = File.ReadAllLines(filename);
int index = Array.FindIndex(lines, element => element.Contains(target));
if (index >= 0)
{
int start = Math.Max(0, index - numLinesToShow + 1);
var result = lines.Skip(start).Take(numLinesToShow).ToList();
// Use result.
}

The code below will open the file, search for the line you want, and then write the 6 preceding lines to the Console.
var lines = File.ReadAllLines(filePath);
// Index of the first line equal to textToFind, or -1 if there is none.
int lineIndex = Array.IndexOf(lines, textToFind);
if (lineIndex >= 0)
{
var startLine = Math.Max(0, lineIndex - 6);
for (int i = startLine; i < lineIndex; i++)
{
Console.WriteLine(lines[i]);
}
}

Remove Duplicate Lines From Text File?

Given an input file of text lines, I want duplicate lines to be identified and removed. Please show a simple snippet of C# that accomplishes this.

For small files:
string[] lines = File.ReadAllLines("filename.txt");
File.WriteAllLines("filename.txt", lines.Distinct().ToArray());

This should do (and will cope with large files).
Note that it only removes duplicate consecutive lines, i.e.
a
b
b
c
b
d
will end up as
a
b
c
b
d
If you want no duplicates anywhere, you'll need to keep a set of lines you've already seen.
using System;
using System.IO;
class DeDuper
{
static void Main(string[] args)
{
if (args.Length != 2)
{
Console.WriteLine("Usage: DeDuper <input file> <output file>");
return;
}
using (TextReader reader = File.OpenText(args[0]))
using (TextWriter writer = File.CreateText(args[1]))
{
string currentLine;
string lastLine = null;
while ((currentLine = reader.ReadLine()) != null)
{
if (currentLine != lastLine)
{
writer.WriteLine(currentLine);
lastLine = currentLine;
}
}
}
}
}
Note that this assumes Encoding.UTF8, and that you want to use files. It's easy to generalize as a method though:
static void CopyLinesRemovingConsecutiveDupes
(TextReader reader, TextWriter writer)
{
string currentLine;
string lastLine = null;
while ((currentLine = reader.ReadLine()) != null)
{
if (currentLine != lastLine)
{
writer.WriteLine(currentLine);
lastLine = currentLine;
}
}
}
(Note that that doesn't close anything - the caller should do that.)
Here's a version that will remove all duplicates, rather than just consecutive ones:
static void CopyLinesRemovingAllDupes(TextReader reader, TextWriter writer)
{
string currentLine;
HashSet<string> previousLines = new HashSet<string>();
while ((currentLine = reader.ReadLine()) != null)
{
// Add returns true if it was actually added,
// false if it was already there
if (previousLines.Add(currentLine))
{
writer.WriteLine(currentLine);
}
}
}

For a long file (with non-consecutive duplicates) I'd copy the file line by line, building a hash-to-position lookup table as I went.
As each line is copied, check for its hash value; if there is a collision, double-check that the line really is the same and, if so, skip it and move on to the next.
It's only worth it for fairly large files, though.
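One possible reading of that idea is sketched below. It is an assumption about what was meant, not code from the answer: HashDedupe, Copy and Matches are invented names, the output is written as UTF-8 with "\n" line endings, and the output stream must be both readable and seekable (for example a FileStream opened with FileAccess.ReadWrite). For every line written it remembers the hash code, offset and byte length, and on a hash hit it re-reads the earlier line from the output to rule out a collision.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class HashDedupe
{
    // Copies lines from input to output, skipping any line already written.
    public static void Copy(TextReader input, Stream output)
    {
        // hash code -> (offset, byte length) of lines already written with that hash
        var seen = new Dictionary<int, List<(long Offset, int Length)>>();
        byte[] newline = Encoding.UTF8.GetBytes("\n");
        string line;
        while ((line = input.ReadLine()) != null)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(line);
            int hash = line.GetHashCode();
            if (!seen.TryGetValue(hash, out var candidates))
            {
                candidates = new List<(long Offset, int Length)>();
                seen[hash] = candidates;
            }

            bool duplicate = false;
            foreach (var c in candidates)
            {
                // Same hash: re-read the earlier line to rule out a collision.
                if (c.Length == bytes.Length && Matches(output, c.Offset, bytes))
                {
                    duplicate = true;
                    break;
                }
            }
            if (duplicate) continue;

            // Append the new line and remember where it was written.
            output.Seek(0, SeekOrigin.End);
            candidates.Add((output.Position, bytes.Length));
            output.Write(bytes, 0, bytes.Length);
            output.Write(newline, 0, newline.Length);
        }
    }

    // Returns true if the bytes stored at `offset` equal `expected`.
    private static bool Matches(Stream stream, long offset, byte[] expected)
    {
        var actual = new byte[expected.Length];
        stream.Seek(offset, SeekOrigin.Begin);
        int total = 0;
        while (total < actual.Length)
        {
            int n = stream.Read(actual, total, actual.Length - total);
            if (n == 0) return false;
            total += n;
        }
        for (int i = 0; i < actual.Length; i++)
        {
            if (actual[i] != expected[i]) return false;
        }
        return true;
    }
}
```

Memory use grows with the number of unique lines (one small entry each) rather than with their total length, at the cost of extra seeks and reads whenever a hash is seen again.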

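Here's a streaming approach that should incur less overhead than reading all unique strings into memory. Note that it stores only hash codes, so two different lines whose hash codes collide will be treated as duplicates.
var sr = new StreamReader(File.OpenRead(@"C:\Temp\in.txt"));
var sw = new StreamWriter(File.OpenWrite(@"C:\Temp\out.txt"));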
var lines = new HashSet<int>();
while (!sr.EndOfStream)
{
string line = sr.ReadLine();
int hc = line.GetHashCode();
if(lines.Contains(hc))
continue;
lines.Add(hc);
sw.WriteLine(line);
}
sw.Flush();
sw.Close();
sr.Close();

I am new to .NET and have written something simpler; it may not be very efficient. Please feel free to share your thoughts.
class Program
{
static void Main(string[] args)
{
string[] emp_names = File.ReadAllLines("D:\\Employee Names.txt");
List<string> newemp1 = new List<string>();
for (int i = 0; i < emp_names.Length; i++)
{
newemp1.Add(emp_names[i]); //passing data to newemp1 from emp_names
}
for (int i = 0; i < emp_names.Length; i++)
{
List<string> temp = new List<string>();
int duplicate_count = 0;
for (int j = newemp1.Count - 1; j >= 0; j--)
{
if (emp_names[i] != newemp1[j]) //checking for duplicate records
temp.Add(newemp1[j]);
else
{
duplicate_count++;
if (duplicate_count == 1)
temp.Add(emp_names[i]);
}
}
newemp1 = temp;
}
string[] newemp = newemp1.ToArray(); //assigning into a string array
Array.Sort(newemp);
File.WriteAllLines("D:\\Employee Names.txt", newemp); //now writing the data to a text file
Console.ReadLine();
}
}
