Parsing text file using C#

I'm looking for a good way to parse the values highlighted with the yellow boxes out of this text file using C#. Each section is delimited by a TERM # line, which I forgot to highlight. I tried this:
string fileName = "ATMTerminalTotals.txt";
StreamReader sr = new StreamReader(fileName);
string[] delimiter = new string[] { " " };
while (!sr.EndOfStream)
{
string[] lines = sr.ReadLine().Split(delimiter, StringSplitOptions.RemoveEmptyEntries);
foreach (string line in lines)
{
Console.WriteLine(line);
}
}
Console.ReadLine();
It's safe to say I am reading lines correctly and removing the white space. However, as an amateur programmer, I'm not sure how to reliably "know" that I am getting the values I need from this report. Any advice?

I've tested this with a very simple program to parse the given file.
Basically, I created two classes: a page class that holds a collection of terminal report objects (the TRAN TYPE rows).
These rows could arguably be represented as a transaction class and a billing class too.
The program first parses the data, setting the properties it needs, and then simply reads those properties.
I rushed it to keep it as simple as possible, with no error handling etc. It's just to give you a sense of how I'd start solving this kind of task. Hope it helps.
Adam
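using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
namespace TerminalTest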
{
class Program
{
public class TerminalReport
{
public string Word { get; set; }
public int Denials { get; set; }
public int Approvals { get; set; }
public int Reversals { get; set; }
public double Amount { get; set; }
public int ON_US { get; set; }
public int Alphalink { get; set; }
public int Interchange { get; set; }
public int Surcharged { get; set; }
public static TerminalReport FromLine(string line)
{
TerminalReport report = new TerminalReport();
report.Word = line.Substring(0, 11);
line = line.Replace(report.Word, string.Empty).Trim();
string[] split = line.Split(' ');
int i = 0;
// transaction summary
report.Denials = int.Parse(split[i++]);
report.Approvals = int.Parse(split[i++]);
report.Reversals = int.Parse(split[i++]);
report.Amount = double.Parse(split[i++]);
// billing counts
report.ON_US = int.Parse(split[i++]);
report.Alphalink = int.Parse(split[i++]);
report.Interchange = int.Parse(split[i++]);
report.Surcharged = int.Parse(split[i++]);
return report;
}
}
public class TerminalPage
{
public int PageNumber { get; set; }
public double TotalSurcharges { get; set; }
public List<TerminalReport> Rows { get; set; }
public TerminalPage(int num)
{
PageNumber = num;
Rows = new List<TerminalReport>();
}
public int TotalDenials
{
get
{
return Rows.Sum(r => r.Denials);
}
}
public int TotalApprovals
{
get
{
return Rows.Sum(r => r.Approvals);
}
}
public int TotalReversals
{
get
{
return Rows.Sum(r => r.Reversals);
}
}
public double TotalAmount
{
get
{
return Rows.Sum(r => r.Amount);
}
}
public int TotalON_US
{
get
{
return Rows.Sum(r => r.ON_US);
}
}
public int TotalAlphalink
{
get
{
return Rows.Sum(r => r.Alphalink);
}
}
public int TotalInterchange
{
get
{
return Rows.Sum(r => r.Interchange);
}
}
public int TotalSurcharged
{
get
{
return Rows.Sum(r => r.Surcharged);
}
}
}
private static string CleanString(string text)
{
return Regex.Replace(text, @"\s+", " ").Replace(",", string.Empty).Trim();
}
private static List<TerminalPage> ParseData(string filename)
{
using (StreamReader sr = new StreamReader(File.OpenRead(filename)))
{
List<TerminalPage> pages = new List<TerminalPage>();
int pageNumber = 1;
TerminalPage page = null;
bool parse = false;
while (!sr.EndOfStream)
{
string line = sr.ReadLine();
line = CleanString(line);
if (line.StartsWith("TRAN TYPE"))
{
// get rid of the ----- line
sr.ReadLine();
parse = true;
if (page != null)
{
pages.Add(page);
}
page = new TerminalPage(pageNumber++);
}
else if (line.StartsWith("="))
{
parse = false;
}
else if (line.StartsWith("TOTAL SURCHARGES:"))
{
line = line.Replace("TOTAL SURCHARGES:", string.Empty).Trim();
page.TotalSurcharges = double.Parse(line);
}
else if (parse)
{
TerminalReport r = TerminalReport.FromLine(line);
page.Rows.Add(r);
}
}
if (page != null)
{
pages.Add(page);
}
return pages;
}
}
static void Main(string[] args)
{
string filename = @"C:\bftransactionsp.txt";
List<TerminalPage> pages = ParseData(filename);
foreach (TerminalPage page in pages)
{
Console.WriteLine("TotalSurcharges: {0}", page.TotalSurcharges);
foreach (TerminalReport r in page.Rows)
Console.WriteLine(r.Approvals);
}
}
}
}

I'm not sure I'd split it by spaces, actually. The text file looks like it's laid out in fixed-width columns, so you might want to read 10 characters (or whatever the width of the column is) at a time. I'd also parse the whole file into a dictionary so you get entries like
dict["WDL FRM CHK"]["# DENIALS"] = 236
then you can easily retrieve the values you want from there, and if you ever need more values in the future, you've got them.
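The dictionary idea above could be sketched roughly as follows. The column names, the 11-character label width, and the `ReportDictionary` class are assumptions for illustration only, since the real report layout isn't shown here. For brevity this sketch takes the row label by fixed width and then splits the remaining cells on spaces rather than reading each column by width.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ReportDictionary
{
    // Hypothetical column headers; replace with the ones in the real report.
    private static readonly string[] Columns = { "# DENIALS", "# APPROVALS", "# REVERSALS" };

    public static Dictionary<string, Dictionary<string, int>> Parse(
        IEnumerable<string> lines, int labelWidth = 11)
    {
        var dict = new Dictionary<string, Dictionary<string, int>>();
        foreach (string line in lines)
        {
            if (line.Length <= labelWidth) continue;

            // The row label (e.g. "WDL FRM CHK") is assumed to fill the first column.
            string label = line.Substring(0, labelWidth).Trim();
            string[] cells = line.Substring(labelWidth)
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            if (cells.Length < Columns.Length) continue;

            var row = new Dictionary<string, int>();
            bool ok = true;
            for (int i = 0; i < Columns.Length; i++)
            {
                // AllowThousands lets values such as "1,854" parse as 1854.
                if (!int.TryParse(cells[i], NumberStyles.AllowThousands,
                        CultureInfo.InvariantCulture, out int value))
                {
                    ok = false;
                    break;
                }
                row[Columns[i]] = value;
            }
            if (ok) dict[label] = row; // header and separator lines are skipped
        }
        return dict;
    }
}
```

With this in place, `ReportDictionary.Parse(File.ReadLines(fileName))["WDL FRM CHK"]["# DENIALS"]` would return the denial count for that row, provided the row exists and the assumed layout matches.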
Alternatively, you can use regexes. You can grab the first two values with a regex like
^WDL FRM CHK\s+(?<denials>[0-9,]+)\s+(?<approvals>[0-9,]+)$
and read a captured value using
m.Groups["approvals"]
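Put together, the named-group approach might look like the sketch below. The `RowRegex` class and `TryReadCounts` method are hypothetical names, and the trailing `$` is dropped from the pattern on the assumption that more columns follow the first two numbers on the line.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RowRegex
{
    // Same named groups as above, without the end-of-line anchor.
    private static readonly Regex WdlFromChecking = new Regex(
        @"^WDL FRM CHK\s+(?<denials>[0-9,]+)\s+(?<approvals>[0-9,]+)");

    public static bool TryReadCounts(string line, out int denials, out int approvals)
    {
        denials = 0;
        approvals = 0;
        Match m = WdlFromChecking.Match(line);
        if (!m.Success) return false;

        // AllowThousands accepts the comma in values such as "1,854".
        return int.TryParse(m.Groups["denials"].Value, NumberStyles.AllowThousands,
                   CultureInfo.InvariantCulture, out denials)
            && int.TryParse(m.Groups["approvals"].Value, NumberStyles.AllowThousands,
                   CultureInfo.InvariantCulture, out approvals);
    }
}
```

Returning `false` instead of throwing makes it easy to run the same check against every line of the report and only act on the rows that match.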

Either way, I recommend wrapping your StreamReader in a using block so it is disposed even if an exception is thrown:
using (StreamReader sr = new StreamReader(fileName))
{
// do stuff
}
Read more on MSDN

Given that it seems to have a standard, regular format, I would use regular expressions. You can check the starting code to figure out what row you're on, then an expression that will parse out the numbers and ignore whitespace will, very likely, be easier than handling it manually.

using System;
using System.Text.RegularExpressions;
namespace ConsoleApplication3
{
class Program
{
static void Main(string[] args)
{
Regex exp = new Regex(@"WDL FRM CHK(\s)+[0-9,]+(\s)+(?<approvals>[0-9,]+)(\s)+"); // [0-9,] so that values containing a zero also match
string str = "WDL FRM CHK 236 1,854 45,465 123 3";
Match match = exp.Match(str);
if (match.Success)
{
Console.WriteLine("Approvals: " + match.Groups["approvals"].Value);
}
Console.ReadLine();
}
}
}
Adapted from the following article to parse one of your numbers:
How to match a pattern by using regular expressions and Visual C#

Related

How to import a CSV file that contains quote signs when the delimiter is a comma?

I am having a problem importing a CSV file. I want to make an object from the columns, but I can't read it in properly.
The header line looks like this: Title,Year,Genre,Rating,Votes,Directors
A data line looks like this: The Last of Us: Mass Effect 2,2010,"Action, Adventure, Drama",9.5,19961,Casey Hudson
The problem is that I get the exception "Input string is not in correct form".
I am using a comma as the delimiter. Is there a way to treat quotes as delimiters too?
Also, everything inside the quotes belongs to the Genre attribute.
I am using this code as the CsvParser right now:
using Games.Models;
using System.Globalization;
using System.Text;
namespace Games.Utils
{
public class CsvParser
{
private readonly string _path;
public char Delimiter { get; set; } = ',';
public bool SkipFirst { get; set; } = true;
public bool Verbose { get; set; } = true;
public NumberFormatInfo NumberFormatInfo { get; private set; } = new NumberFormatInfo();
public Encoding Encoding { get; set; } = Encoding.Default;
public CsvParser(string path) => _path = path;
public IEnumerable<Game> StreamParseGames() => GenerateGames(Enumerables.EnumerateStreamReaderLines(new(_path, Encoding)));
public IEnumerable<Game> TextParseGames() => GenerateGames(File.ReadAllLines(_path, Encoding));
private IEnumerable<Game> GenerateGames(IEnumerable<string> lineProvider)
{
if (SkipFirst) lineProvider = lineProvider.Skip(1);
int lineNum = SkipFirst ? 1 : 0;
foreach (var line in lineProvider)
{
string[] parts = line.Split(Delimiter);
Game game;
try
{
game = new()
{
Title = parts[0],
Year = Convert.ToInt32(parts[1], NumberFormatInfo),
Genre = parts[2],
Rating = Convert.ToDouble(parts[3], NumberFormatInfo),
Votes = Convert.ToDouble(parts[4], NumberFormatInfo),
Directors = parts[5],
};
}
catch (FormatException e)
{
if (Verbose) Console.WriteLine($"Line {lineNum + 1:000000} omitted due: {e.Message}");
continue;
}
catch (IndexOutOfRangeException e)
{
if (Verbose) Console.WriteLine($"Line {lineNum + 1:000000} omitted due: {e.Message}");
continue;
}
finally
{
++lineNum;
}
yield return game;
}
}
}
}
I'd suggest you use CsvHelper which can deal with that instead of rolling your own CSV parser.
using CsvHelper;
using CsvHelper.Configuration;
var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Delimiter = ",",
};
using (var reader = new StreamReader("path\\to\\file.csv"))
using (var csv = new CsvReader(reader, config))
{
var records = csv.GetRecords<Foo>();
}
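If adding a NuGet package is not an option, a different technique (not CsvHelper) is the `TextFieldParser` class from the `Microsoft.VisualBasic.FileIO` namespace, which understands quoted fields. The sketch below assumes that assembly is available to the project; the `QuotedCsv` class name is made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class QuotedCsv
{
    // Reads every row from the reader; commas inside double quotes
    // stay part of the field instead of splitting it.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

For the sample data line in the question this yields six fields, with the third one being the whole genre list, so `parts[3]` is the rating again and `Convert.ToDouble` no longer sees "Adventure".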

How to fix the coding error 'input string was not in a correct format'

Here are the full details of my code:
public partial class Form1 : Form
{
List<Sales> sales = new List<Sales>();
BindingSource bs = new BindingSource();
public Form1()
{
InitializeComponent();
LoadCSV();
bs.DataSource = sales;
dgvSales.DataSource = bs;
}
private void Form1_Load(object sender, EventArgs e)
{
}
private void LoadCSV()
{
string filePath = @"c:\Users\demo\Task3_shop_data.csv";
List<string> lines = new List<string>();
lines = File.ReadAllLines(filePath).ToList();
foreach (string line in lines)
{
List<string> items = line.Split(',').ToList();
Sales s = new Sales();
s.TextBook = items[0];
s.Subject = items[1];
s.Seller = items[2];
s.Purchaser = items[3];
s.purchasedPrice = float.Parse(items[4]);
s.SalePrice = items[6];
s.Rating = items[7];
sales.Add(s);
}
}
}
}
My Sales class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace MichaelSACU301task3
{
internal class Sales
{
public string TextBook { get; set; }
public string Subject { get; set; }
public string Seller { get; set; }
public string Purchaser { get; set; }
public float purchasedPrice { get; set; }
public string SalePrice { get; set; }
public string Rating { get; set; }
}
}
I tried launching it, but the error message keeps appearing. Can someone please help me fix this problem?
Use float.TryParse before assigning to the purchasedPrice property; if the value cannot be converted, remember the line in a list. In the example below, the code that reads the file is in a separate class, which returns a list of sales and a list of int holding the indexes of lines whose purchasedPrice data is invalid. You should also consider validating the other fields and checking that each line has the expected number of values after the split.
public class FileOperations
{
public static (List<Sales>, List<int>) LoadSalesFromFile()
{
List<Sales> sales = new List<Sales>();
List<int> InvalidLine = new List<int>();
string filePath = @"c:\Users\demo\Task3_shop_data.csv";
List<string> lines = File.ReadAllLines(filePath).ToList();
for (int index = 0; index < lines.Count; index++)
{
var parts = lines[index].Split(','); // split the current line, not always the first one
// validate purchase price
if (float.TryParse(parts[4], out var purchasePrice))
{
Sales s = new Sales();
s.TextBook = parts[0];
s.Subject = parts[1];
s.Seller = parts[2];
s.Purchaser = parts[3];
s.purchasedPrice = purchasePrice;
s.SalePrice = parts[6];
s.Rating = parts[7];
sales.Add(s);
}
else
{
// failed to convert purchase price
InvalidLine.Add(index);
}
}
return (sales, InvalidLine);
}
}
Call the above code in your form:
var (salesList, invalidLines) = FileOperations.LoadSalesFromFile();
if (invalidLines.Count > 0)
{
// use to examine bad lines in file
}
else
{
// uses sales list
}
The error is probably caused by float.Parse() being unable to convert items[4] to a float.
You can inspect the value of items[4] by setting a breakpoint in Visual Studio.
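A small helper along these lines can combine the column-count check with a culture-independent parse. It is only a sketch: the `SalesLineValidator` name is invented, the eight-column minimum is inferred from the code above reading up to index 7, and it assumes prices use a dot as the decimal separator. A header row in the file is one plausible reason for the original exception, and this check would reject it too.

```csharp
using System.Globalization;

public static class SalesLineValidator
{
    // The loading code reads items[0] through items[7], so at least 8 columns are needed.
    public const int ExpectedColumns = 8;

    public static bool TryReadPrice(string line, out float price)
    {
        price = 0f;
        string[] parts = line.Split(',');
        if (parts.Length < ExpectedColumns) return false;

        // InvariantCulture makes "12.5" parse the same way on every machine,
        // regardless of the regional settings of the user running the program.
        return float.TryParse(parts[4].Trim(), NumberStyles.Float,
            CultureInfo.InvariantCulture, out price);
    }
}
```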

Order lines from a text file by number found in a certain spot of the line

I'm sorry that I interrupt you in this manner, I'm new to C# and I've been struggling with this problem for days... Maybe it will seem easy for you :)
I have this text file in this format
name|ID|domain|grade|verdict
Ryan|502322|Computers|9,33|Undefined
Marcel|302112|Automatics|6,22|Undefined
Alex|301234|Computers|5,66|Undefined
Leo|201122|Automatics|3,22|Undefined
How can I sort the text file using any methods (including LINQ) so that the list from the text file will be ordered by domain, and then descending by the grade column? Like this:
name|ID|domain|grade|verdict
Marcel|302112|Automatics|6,22|Undefined
Leo|201122|Automatics|3,22|Undefined
Ryan|502322|Computers|9,33|Undefined
Alex|301234|Computers|5,66|Undefined
To read the file, I'm using var Students = File.ReadAllLines(@"filepath");. I don't know if that's the smartest approach. Then I write the result using File.WriteAllLines.
Thanks in advance! Sorry once again, I know it should be easy, but for me it is really tough :(
You can use something like this:
var students = File.ReadAllLines(@"filepath");
var headers = students[0];
var orders = students.Skip(1).Select(x => x.Split('|'))
// "9,33" becomes 933; this ordering only holds while every grade has two decimal digits
.Select(x => new { Domain = x[2], Grade = int.Parse(x[3].Replace(",", "")), All = x })
.OrderBy(x => x.Domain).ThenByDescending(x => x.Grade).Select(x => string.Join("|", x.All)).ToList();
orders.Insert(0, headers);
students = orders.ToArray();
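A variant of the same idea that parses the grade as a real decimal, so it does not depend on the number of digits after the comma, might look like this. The `StudentSorter` class is a hypothetical name, and the sketch assumes every data line has at least four `|`-separated fields with a comma as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class StudentSorter
{
    // Number format that treats "," as the decimal separator, as in "9,33".
    private static readonly NumberFormatInfo CommaDecimal = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Keeps the header first, then orders by domain and by grade descending.
    public static string[] Sort(string[] lines)
    {
        var sorted = lines.Skip(1)
            .Select(l => l.Split('|'))
            .OrderBy(f => f[2], StringComparer.Ordinal)
            .ThenByDescending(f => decimal.Parse(f[3], NumberStyles.AllowDecimalPoint, CommaDecimal))
            .Select(f => string.Join("|", f));
        return new[] { lines[0] }.Concat(sorted).ToArray();
    }
}
```

The result can be passed straight to File.WriteAllLines, for example `File.WriteAllLines(path, StudentSorter.Sort(File.ReadAllLines(path)))`.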
Try the following code:
private void ReadFile()
{
char Delimiter = '|';
string[] Lines = File.ReadAllLines(@"E:\RaftehHa.txt", Encoding.Default);
List<string[]> FileRows = Lines.Select(line =>
line.Split(new[] { Delimiter }, StringSplitOptions.RemoveEmptyEntries)).ToList();
DataTable dt = new DataTable();
dt.Columns.AddRange(FileRows[0].Select(col => new DataColumn() { ColumnName = col }).ToArray());
FileRows.RemoveAt(0);
FileRows.ForEach(row => dt.Rows.Add(row));
DataView dv = dt.DefaultView;
dv.Sort = " ID ASC ";
dt = dv.ToTable();
dataGridView1.DataSource = dt;
}
This is much the same as what has already been mentioned above, but since you say you are new to C#, I have tried to add a little structure to the code while leaving the completion to you.
public class Data
{
public Data(string inputLine)
{
var split = inputLine.Split('|');
Name = split[0];
Id = int.Parse(split[1]);
Domain = split[2];
Grade = double.Parse(split[3].Replace(",", "."));
Verdict = split[4];
}
public string Name { get; }
public int Id { get; }
public string Domain { get; }
public double Grade { get; }
public string Verdict { get; }
}
public class DataFile
{
public static IEnumerable<Data> Read(string fileName)
{
var input = File.ReadAllLines(fileName);
return input.Skip(1).Select(p => new Data(p)); // skip header
}
public static void Write(IEnumerable<Data> data)
{
// todo :)
}
}
void Main()
{
var input = DataFile.Read(@"C:\Temp\ExampleData.txt");
var result = input.OrderBy(p => p.Domain).ThenByDescending(p => p.Grade);
DataFile.Write(result);
}
using System;
using System.IO;
using System.Linq;
using System.Collections.Generic;
enum DomainType {
Automatics, // 0
Computers // 1
}
class Data {
public int Id { get; set; }
public string Name { get; set; }
public string Verdict { get; set; }
public DomainType Domain { get; set; }
public Tuple<int, int> Grade { get; set; }
}
public static class Program {
static IEnumerable<Data> FileContent(string path) {
string line;
using (var reader = File.OpenText(path))
{
bool skipHeader = false;
while((line = reader.ReadLine()) != null)
{
if (!skipHeader) {
skipHeader = true;
continue;
}
var fields = line.Split('|');
string name = fields[0];
int id = int.Parse(fields[1]);
var domain = (DomainType)Enum.Parse(typeof(DomainType), fields[2]);
var grade = Tuple.Create(int.Parse(fields[3].Split(',')[0]),
int.Parse(fields[3].Split(',')[1]));
string verdict = fields[4];
var data = new Data() {
Name = name, Id = id, Domain = domain, Grade = grade, Verdict = verdict };
yield return data;
}
}
}
public static void Main() {
var result = FileContent("path_to_file").OrderBy(data => data.Domain);
foreach (var line in result) {
Console.WriteLine(line.Name);
}
}
}

Need to extract text field alone from JSON

Could anybody help me extract only the text field from the JSON below (the response from my program) using C#?
[{"unMeta":{}},[{"t":"Plain","c":[{"t":"Str","c":"{\"language\":\"en\",\"textAngle\":0.0,\"orientation\":\"Up\",\"regions\":[{\"boundingBox\":\"7,7,476,264\",\"lines\":
[{\"boundingBox\":\"7,7,476,58\",\"words\":
[{\"boundingBox\":\"7,7,42,44\",\"text\":\"If\"},{\"boundingBox\":\"62,16,283,49\",\"text\":\"computers\"},{\"boundingBox\":\"361,9,122,45\",\"text\":\"can't\"}]},
{\"boundingBox\":\"7,77,451,57\",\"words\":
[{\"boundingBox\":\"7,77,149,56\",\"text\":\"adapt\"},{\"boundingBox\":\"172,77,155,57\",\"text\":\"easily,\"},{\"boundingBox\":\"338,79,120,43\",\"text\":\"then\"}]},
{\"boundingBox\":\"8,146,460,57\",\"words\":
[{\"boundingBox\":\"8,146,178,56\",\"text\":\"maybe\"},{\"boundingBox\":\"201,147,82,44\",\"text\":\"the\"},{\"boundingBox\":\"299,148,169,55\",\"text\":\"people\"}]},
{\"boundingBox\":\"7,214,414,57\",\"words\":
[{\"boundingBox\":\"7,214,145,57\",\"text\":\"using\"},{\"boundingBox\":\"166,216,137,43\",\"text\":\"them\"},{\"boundingBox\":\"318,231,103,29\",\"text\":\"can.\"}]}]}]}"}]}]]
I am using the code below, but response.Regions throws an error. I need to extract only the text field from the JSON above, looping through the nodes.
static async Task readJsonOutput(string response)
{
StringBuilder stringBuilder = new StringBuilder();
if (response != null && response.Regions != null) // the error occurs on response.Regions
{
foreach (var item in response.Regions)
{
foreach (var line in item.Lines)
{
foreach (var word in line.Words)
{
stringBuilder.Append(word.Text);
stringBuilder.Append(" ");
}
stringBuilder.AppendLine();
}
stringBuilder.AppendLine();
}
}
string result = stringBuilder.ToString();
//return stringBuilder.ToString();
}
}
public class Region
{
public string BoundingBox { get; set; }
public List<Line> Lines { get; set; }
}
public class Line
{
public string BoundingBox { get; set; }
public List<Word> Words { get; set; }
}
public class Word
{
public string BoundingBox { get; set; }
public string Text { get; set; }
}
Part of the problem is that the JSON is double serialized. So you need to parse it, get the "real" JSON from the innermost c property, and parse that part a second time. Then you can extract the text properties from there.
Using Json.Net's LINQ-to-JSON API, you can do it like this:
var innerJson = (string)JToken.Parse(json).SelectTokens("$..c").Last();
var words = JToken.Parse(innerJson).SelectTokens("$..text").Select(t => (string)t);
var text = string.Join(" ", words);
Demo here: https://dotnetfiddle.net/UvBRqv
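The same two-step parse can also be sketched with System.Text.Json instead of Json.Net, which is a swapped-in library rather than what the answer above uses. The `OcrText` class is a hypothetical name, and the fixed path `[1][0].c[0].c` is an assumption taken from the single sample response in the question; a different response shape would need a different path.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class OcrText
{
    // Returns the recognized words, one output line per OCR line.
    public static string Extract(string outerJson)
    {
        using JsonDocument outer = JsonDocument.Parse(outerJson);

        // First pass: the innermost "c" property holds the real JSON as a string.
        string innerJson = outer.RootElement[1][0]
            .GetProperty("c")[0]
            .GetProperty("c")
            .GetString();

        // Second pass: parse that string and walk regions -> lines -> words.
        using JsonDocument inner = JsonDocument.Parse(innerJson);
        var lines = new List<string>();
        foreach (JsonElement region in inner.RootElement.GetProperty("regions").EnumerateArray())
        {
            foreach (JsonElement line in region.GetProperty("lines").EnumerateArray())
            {
                var words = new List<string>();
                foreach (JsonElement word in line.GetProperty("words").EnumerateArray())
                {
                    words.Add(word.GetProperty("text").GetString());
                }
                lines.Add(string.Join(" ", words));
            }
        }
        return string.Join("\n", lines);
    }
}
```

Unlike the one-line join above, this keeps the line breaks from the OCR result, which is closer to what the nested loops in the question were trying to build.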

Skip reading the first line of the csv file

I am a beginner in programming, and it's really difficult for me to analyze and debug how to skip reading the first line of the CSV file. I need some help.
I need the IDs to fill the combobox in my form, which contains all the IDs. So that the header is not included when browsing and displaying, I need to skip the first line.
public bool ReadEntrie(int id, ref string name, ref string lastname, ref
string phone, ref string mail, ref string website)
{
int count = 0;
CreateConfigFile();
try
{
fs = new FileStream(data_path, FileMode.Open);
sr = new StreamReader(fs);
string temp = "";
bool cond = true;
while (cond == true)
{
if ((temp = sr.ReadLine()) == null)
{
sr.Close();
fs.Close();
cond = false;
if (count == 0)
return false;
}
if (count == id)
{
string[] stringSplit = temp.Split(',');
int _maxIndex = stringSplit.Length;
name = stringSplit[0].Trim('"');
lastname = stringSplit[1].Trim('"');
phone = stringSplit[2].Trim('"');
mail = stringSplit[3].Trim('"');
website = stringSplit[4].Trim('"');
}
count++;
}
sr.Close();
fs.Close();
return true;
}
catch
{
return false;
}
}
@Somadina's answer is correct, but I would suggest a better alternative. You could use a CSV file parser library such as CsvHelper.
You can get the library from Nuget or Git. Nuget command would be:
Install-Package CsvHelper
Declare the following namespaces:
using CsvHelper;
using CsvHelper.Configuration;
Here's how simple your code looks when you use such a library:
class Program
{
static void Main(string[] args)
{
var csv = new CsvReader(File.OpenText("Path_to_your_csv_file"));
csv.Configuration.IgnoreHeaderWhiteSpace = true;
csv.Configuration.RegisterClassMap<MyCustomObjectMap>();
var myCustomObjects = csv.GetRecords<MyCustomObject>();
foreach (var item in myCustomObjects.ToList())
{
// Apply your application logic here.
Console.WriteLine(item.Name);
}
}
}
public class MyCustomObject
{
// Note: You may want to use a type converter to convert the ID to an integer.
public string ID { get; set; }
public string Name { get; set; }
public string Lastname { get; set; }
public string Phone { get; set; }
public string Mail { get; set; }
public string Website { get; set; }
public override string ToString()
{
return Name.ToString();
}
}
public sealed class MyCustomObjectMap : CsvClassMap<MyCustomObject>
{
public MyCustomObjectMap()
{
// In the name method, you provide the header text - i.e. the header value set in the first line of the CSV file.
Map(m => m.ID).Name("id");
Map(m => m.Name).Name("name");
Map(m => m.Lastname).Name("lastname");
Map(m => m.Phone).Name("phone");
Map(m => m.Mail).Name("mail");
Map(m => m.Website).Name("website");
}
}
Some more details in an answer here.
To skip the first line, just replace the line:
if (count == id)
with
if (count > 0 && count == id)
MORE THOUGHTS ON YOUR APPROACH
Because you used the ref keyword, each line you read will overwrite the values previously stored in the parameters. A better way to do this is to create a class to hold all the properties of interest. Then, for each line you read, populate an instance of the class and add it to a list. Your method signature (even the return type) will change as a result.
From your code, the class will look like this:
public class DataModel
{
public string Name { get; set; }
public string LastName { get; set; }
public string Phone{ get; set; }
public string Mail { get; set; }
public string Website{ get; set; }
}
Then your method will be like this:
public IList<DataModel> ReadEntrie(int id, string data_path)
{
int count = 0;
CreateConfigFile();
var fs = new FileStream(data_path, FileMode.Open);
var sr = new StreamReader(fs);
try
{
var list = new List<DataModel>();
string temp = "";
bool cond = true;
while (cond == true)
{
if ((temp = sr.ReadLine()) == null)
{
cond = false;
if (count == 0)
throw new Exception("Failed");
break; // end of file: stop before temp is dereferenced below
}
if (count > 0 && count == id)
{
string[] stringSplit = temp.Split(',');
var item = new DataModel();
item.Name = stringSplit[0].Trim('"');
item.LastName = stringSplit[1].Trim('"');
item.Phone = stringSplit[2].Trim('"');
item.Mail = stringSplit[3].Trim('"');
item.Website = stringSplit[4].Trim('"');
// add item to list
list.Add(item);
}
count++;
}
return list;
}
catch
{
throw; // or do whatever you wish
}
finally
{
sr.Close();
fs.Close();
}
}
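If the goal is simply to load every data row and ignore the header, the loop can be reduced to a LINQ pipeline. This is a sketch under assumptions: the `Entry` and `EntryReader` names are made up, fields are assumed not to contain commas of their own, and rows with fewer than five fields are silently skipped.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class Entry
{
    public string Name { get; set; }
    public string LastName { get; set; }
    public string Phone { get; set; }
    public string Mail { get; set; }
    public string Website { get; set; }
}

public static class EntryReader
{
    // Skip(1) drops the header line; the rest become Entry objects.
    public static List<Entry> Parse(IEnumerable<string> lines)
    {
        return lines
            .Skip(1)
            .Where(l => !string.IsNullOrWhiteSpace(l))
            .Select(l => l.Split(','))
            .Where(f => f.Length >= 5)
            .Select(f => new Entry
            {
                Name = f[0].Trim('"'),
                LastName = f[1].Trim('"'),
                Phone = f[2].Trim('"'),
                Mail = f[3].Trim('"'),
                Website = f[4].Trim('"')
            })
            .ToList();
    }
}
```

Calling `EntryReader.Parse(File.ReadLines(data_path))` would then give a list whose indexes can be used to fill the combobox, with the header never appearing in it.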
