read large text file into array with specific format - c#

I have a text file with the following format of information:
1 1.2323232 2.2356 4.232 1.23664
2 1.344545
3 6.2356 7.56455212
etc....
How do I read the file in C#, parse it into an array, then do some processing on it?

Use File Helpers.
For eg. All you would need is to define the record parsing as this :
[DelimitedRecord("|")]
public class Orders
{
public int OrderID;
public string CustomerID;
[FieldConverter(ConverterKind.Date, "ddMMyyyy")] public DateTime OrderDate;
public decimal Freight;
}
And read the file in as this :
FileHelperEngine engine = new FileHelperEngine(typeof(Orders));
// to Read use:
Orders[] res = engine.ReadFile("TestIn.txt") as Orders[];
// to Write use:
engine.WriteFile("TestOut.txt", res);
You could change the delimiter to " " & suitably update the member types as well.

Hey your code looks like there is an ID value at position one. So I created some example code.
private List<MyValues> Read(string fileName)
{
var result = new List<MyValues>();
var line = new string[] { };
using (StreamReader sr = new StreamReader(fileName))
{
while (sr.Peek() > -1)
{
line = sr.ReadLine().Trim().Split(' ');
var val = new MyValues();
val.Id = Convert.ToInt32(line.ElementAt(0));
for (int n = 1; n < line.Count(); n++)
{
val.Values.Add(Convert.ToDouble(line[n]));
}
result.Add(val);
}
}
return result;
}
class MyValues
{
public int Id = 0;
public List<double> Values = new List<double>();
}

Related

C# Read file, split at ; write in array

so I have a file like this:
1;2;5
1;3;3
1;4;3
1;5;1
1;6;0
and I want every number as easy accassible as possibe
so I thought multi dimentional array
that's the idea so far:
System.IO.StreamReader file = new System.IO.StreamReader(#"C:\Users\Hisfantor\Desktop\transport.txt");
int count = 0;
string linee;
string line;
string[] extract;
while ((linee = file.ReadLine()) != null)
{
count++;
}
textBox1.Text = Convert.ToString(count);
double[,] destinations = new double[(int)count, 3];
for (int i = 0; i < count; i++)
{
line = file.ReadLine();
extract = line.Split(';');
destinations[i, 1] = Convert.ToDouble(extract[0]);
destinations[i, 2] = Convert.ToDouble(extract[1]);
destinations[i, 3] = Convert.ToDouble(extract[2]);
listBox1.Items.Add(destinations[i, 1]);
}
file.Close();
I tried different things, but never get anything in the listbox(just for testing)
Try this:
using System.IO.StreamReader file = new System.IO.StreamReader(#"C:\Users\Hisfantor\Desktop\transport.txt");
int count = 0;
string line;
string[] extract;
while ((line = file.ReadLine()) != null)
{
extract = line.Split(';');
var lineValue = new LineValue()
{
Col1 = Convert.ToDouble(extract[0]),
Col2 = Convert.ToDouble(extract[1]),
Col3 = Convert.ToDouble(extract[2]),
};
listBox1.Items.Add(lineValue);
count++;
}
textBox1.Text = Convert.ToString(count);
public class LineValue
{
public double Col1 { get; set; }
public double Col2 { get; set; }
public double Col3 { get; set; }
public override string ToString() => $"{Col1};{Col2};{Col3}";
}
I suggest jagged array, i.e. array of array double[][] instead of 2D one (double[,]). Then you can query the file with a help of Linq:
using System.Linq;
...
double[][] destinations = File
.ReadLines(#"C:\Users\Hisfantor\Desktop\transport.txt")
.Where(line => !string.IsNullOrWhiteSpace(line))
.Select(line => line
.Split(';')
.Select(item => double.Parse(item))
.ToArray())
.ToArray();
foreach (var array in destinations)
listBox1.Items.Add(array[1]);
This is how I would approach it.
class Item
{
public int Value1 { get; set; }
public int Value2 { get; set; }
public int Value3 { get; set; }
}
string line;
List<Item> items = new List<Item>();
using (StreamReader reader = new StreamReader(path))
{
while ((line = reader.ReadLine()) != null)
{
string[] tokens = line.Split(';');
if (tokens.Length == 3)
{
int value;
Item item = new Item();
item.Value1 = int.TryParse(tokens[0], out value) ? value : 0;
item.Value2 = int.TryParse(tokens[1], out value) ? value : 0;
item.Value3 = int.TryParse(tokens[2], out value) ? value : 0;
items.Add(item);
}
}
}
This code reads a line at a time and populates a list of Items as it reads them. It includes error checking to avoid unexpected exceptions due to invalid inputs.
Also, you should wrap StreamReader in a using block to ensure it cleans up in a timely manner, even if an exception exits your method prematurely.

binary search in a sorted list in c#

I am retrieving client id\ drum id from a file and storing them in a list.
then taking the client id and storing it in another list.
I need to display the client id that the user specifies (input_id) on a Datagrid.
I need to get all the occurrences of this specific id using binary search.
the file is already sorted.
I need first to find the occurrences of input_id in id_list.
The question is: how to find all the occurrences of input_id in the sorted list id_list using binary search?
using(StreamReader sr= new StreamReader(path))
{
List<string> id_list = new List<string>();
List<string> all_list= new List<string>();
List<int> indexes = new List<int>();
string line = sr.ReadLine();
line = sr.ReadLine();
while (line != null)
{
all_list.Add(line);
string[] break1 = line.Split('/');
id_list.Add(break1[0]);
line = sr.ReadLine();
}
}
string input_id = textBox1.Text;
Data in the file:
client id/drum id
-----------------
123/321
231/3213
321/213123 ...
If the requirement was to use binary search I would create a custom class with a comparer, and then find an element and loop forward/backward to get any other elements. Like:
static void Main(string[] args
{
var path = #"file path...";
// read all the Ids from the file.
var id_list = File.ReadLines(path).Select(x => new Drum
{
ClientId = x.Split('/').First(),
DrumId = x.Split('/').Last()
}).OrderBy(o => o.ClientId).ToList();
var find = new Drum { ClientId = "231" };
var index = id_list.BinarySearch(find, new DrumComparer());
if (index != -1)
{
List<Drum> matches = new List<Drum>();
matches.Add(id_list[index]);
//get previous matches
for (int i = index - 1; i > 0; i--)
{
if (id_list[i].ClientId == find.ClientId)
matches.Add(id_list[i]);
else
break;
}
//get forward matches
for (int i = index + 1; i < id_list.Count; i++)
{
if (id_list[i].ClientId == find.ClientId)
matches.Add(id_list[i]);
else
break;
}
}
}
public class Drum
{
public string DrumId { get; set; }
public string ClientId { get; set; }
}
public class DrumComparer : Comparer<Drum>
{
public override int Compare(Drum x, Drum y) =>
x.ClientId.CompareTo(y.ClientId);
}
If i understand you question right then this should be a simple where stats.
// read all the Ids from the file.
var Id_list = File.ReadLines(path).Select(x => new {
ClientId = x.Split('/').First(),
DrumId = x.Split('/').Last()
}).ToList();
var foundIds = Id_list.Where(x => x.ClientId == input_id);

Read lines of data from CSV then display data

I have to read info from a txt file, store it in a manner (array or list), then display the data. Program must include at least one additional class.
I've hit a wall and can't progress.
string, string, double, string
name,badge,salary,position
name,badge,salary,position
name,badge,salary,position
I'm sorry and I know the code below is disastrous but I'm at a loss and am running out of time.
namespace Employees
{
class Program
{
static void Main()
{
IndividualInfo collect = new IndividualInfo();
greeting();
collect.ReadInfo();
next();
for (int i = 0; i < 5; i++)
{
displayInfo(i);
}
exit();
void greeting()
{
Console.WriteLine("\nWelcome to the Software Development Company\n");
}
void next()
{
Console.WriteLine("\n*Press enter key to display information . . . *");
Console.Read();
}
void displayInfo(int i)
{
Console.WriteLine($"\nSoftware Developer {i + 1} Information:");
Console.WriteLine($"\nName:\t\t\t{collect.nameList[i]}");
}
void exit()
{
Console.WriteLine("\n\n*Press enter key to exit . . . *");
Console.Read();
Console.Read();
}
}
}
}
class IndividualInfo
{
public string Name { get; set; }
//public string Badge{ get; set; }
//public string Position{ get; set; }
//public string Salary{ get; set; }
public void ReadInfo()
{
int i = 0;
string inputLine;
string[] eachLine = new string[4];
string[,] info = new string[5, 4]; // 5 developers, 4x info each
StreamReader file = new StreamReader("data.txt");
while ((inputLine = file.ReadLine()) != null)
{
eachLine = inputLine.Split(',');
for (int x = 0; x < 5; x++)
{
eachLine[x] = info[i, x];
x++;
}
i++;
}
string name = info[i, 0];
string badge = info[i, 1];
string position = info[i, 2];
double salary = Double.Parse(info[i, 3]);
}
public List<string> nameList = new List<string>();
}
So far I think I can collect it with a two-dimensional array, but a List(s) would be better. Also, the code I've posted up there won't run because I can't yet figure out a way to get it to display. Which is why I'm here.
using System.IO;
static void Main(string[] args)
{
using(var reader = new StreamReader(#"C:\test.csv"))
{
List<string> listA = new List<string>();
List<string> listB = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(';');
listA.Add(values[0]);
listB.Add(values[1]);
}
}
}
https://www.rfc-editor.org/rfc/rfc4180
or
using Microsoft.VisualBasic.FileIO;
var path = #"C:\Person.csv"; // Habeeb, "Dubai Media City, Dubai"
using (TextFieldParser csvParser = new TextFieldParser(path))
{
csvParser.CommentTokens = new string[] { "#" };
csvParser.SetDelimiters(new string[] { "," });
csvParser.HasFieldsEnclosedInQuotes = true;
// Skip the row with the column names
csvParser.ReadLine();
while (!csvParser.EndOfData)
{
// Read current line fields, pointer moves to the next line.
string[] fields = csvParser.ReadFields();
string Name = fields[0];
string Address = fields[1];
}
}
http://codeskaters.blogspot.ae/2015/11/c-easiest-csv-parser-built-in-net.html
or
LINQ way:
var lines = File.ReadAllLines("test.txt").Select(a => a.Split(';'));
var csv = from line in lines
select (from piece in line
select piece);
^^Wrong - Edit by Nick
It appears the original answerer was attempting to populate csv with a 2 dimensional array - an array containing arrays. Each item in the first array contains an array representing that line number with each item in the nested array containing the data for that specific column.
var csv = from line in lines
select (line.Split(',')).ToArray();
This question was fully addressed here:
Reading CSV file and storing values into an array

Reading delimited text file?

I am trying to read from a delimited text file, but everything is returned in in one row and one column.
My connections string is
OleDbConnection con = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
Path.GetDirectoryName(#textBox1txtPath.Text) + ";" +
"Extended Properties=\"text;HDR=YES;IMEX=1;Format=Delimited(|)\"");
And my text file reads:
ItemNumber|ProductStatus|UPC
0000012|closed|2525
Please assist
Okay, so one option would be to take a different approach. Consider the following code:
// read the entire file and store each line
// as a new element in a string[]
var lines = File.ReadAllLines(pathToFile);
// we can skip the first line because it's
// just headings - if you need the headings
// just grab them off the 0 index
for (int i = 1; i < lines.Length; i++)
{
var vals = lines[i].Split('|');
// do something with the vals because
// they are now in a zero-based array
}
This gets rid of that monstrosity of a connection string, eliminates the overhead of an Odbc driver, and drastically increases the readability of the code.
i don't know exactly what do you need, but you can do this:
if you have string str with the whole text in it you can do
string[] lines = str.Split('\n');// split it to lines;
and then for each line you can do
string[] cells = line.Split('|');// split a line to cells
if we take it to the next level we can do:
public class line
{
public int ItemNumber { get; set; }
public string ProductStatus { get; set; }
public int UPC { get; set; }
public line(string currLine)
{
string[] cells = currLine.Split('|');
int item;
if(int.TryParse(cells[0], out item))
{
ItemNumber = item;
}
ProductStatus = cells[1];
int upc;
if (int.TryParse(cells[2], out upc))
{
UPC = upc;
}
}
}
and then:
string[] lines = str.Substring(str.IndexOf("\n")).Split('\n');// split it to lines;
List<line> tblLines = new List<line>();
foreach(string curr in lines)
{
tblLines.Add(new line(curr);
}
It's right in the framework -- TextFieldParser. Don't worry about the namespace, it was originally a shim for folks converting from VB6, but it's very useful. Here's a SSCCE that demonstrates its use for a number of different delimiters:
class Program
{
static void Main(string[] args)
{
var comma = #"one,""two, yo"",three";
var tab = "one\ttwo, yo\tthee";
var random = #"onelol""two, yo""lolthree";
var parser = CreateParser(comma, ",");
Console.WriteLine("Parsing " + comma);
Dump(parser);
Console.WriteLine();
parser = CreateParser(tab, "\t");
Console.WriteLine("Parsing " + tab);
Dump(parser);
Console.WriteLine();
parser = CreateParser(random, "lol");
Console.WriteLine("Parsing " + random);
Dump(parser);
Console.WriteLine();
Console.ReadLine();
}
private static TextFieldParser CreateParser(string value, params string[] delims)
{
var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(ToStream(value));
parser.Delimiters = delims;
return parser;
}
private static void Dump(TextFieldParser parser)
{
while (!parser.EndOfData)
foreach (var field in parser.ReadFields())
Console.WriteLine(field);
}
static Stream ToStream(string value)
{
return new MemoryStream(Encoding.Default.GetBytes(value));
}
}

Removing quotes in file helpers

I have a .csv file(I have no control over the data) and for some reason it has everything in quotes.
"Date","Description","Original Description","Amount","Type","Category","Name","Labels","Notes"
"2/02/2012","ac","ac","515.00","a","b","","javascript://"
"2/02/2012","test","test","40.00","a","d","c",""," "
I am using filehelpers and I am wondering what the best way to remove all these quotes would be? Is there something that says "if I see quotes remove. If no quotes found do nothing"?
This messes with the data as I will have "\"515.00\"" with unneeded extra quotes(especially since I want in this case it to be a decimal not a string".
I am also not sure what the "javascript" is all about and why it was generated but this is from a service I have no control over.
edit
this is how I consume the csv file.
using (TextReader textReader = new StreamReader(stream))
{
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue;
object[] transactions = engine.ReadStream(textReader);
}
You can use the FieldQuoted attribute described best on the attributes page here. Note that the attribute can be applied to any FileHelpers field (even if it type Decimal). (Remember that the FileHelpers class describes the spec for your import file.. So when you mark a Decimal field as FieldQuoted, you are saying in the file, this field will be quoted.)
You can even specify whether or not the quotes are optional with
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
Here is a console application which works with your data:
class Program
{
[DelimitedRecord(",")]
[IgnoreFirst(1)]
public class Format1
{
[FieldQuoted]
[FieldConverter(ConverterKind.Date, "d/M/yyyy")]
public DateTime Date;
[FieldQuoted]
public string Description;
[FieldQuoted]
public string OriginalDescription;
[FieldQuoted]
public Decimal Amount;
[FieldQuoted]
public string Type;
[FieldQuoted]
public string Category;
[FieldQuoted]
public string Name;
[FieldQuoted]
public string Labels;
[FieldQuoted]
[FieldOptional]
public string Notes;
}
static void Main(string[] args)
{
var engine = new FileHelperEngine(typeof(Format1));
// read in the data
object[] importedObjects = engine.ReadString(#"""Date"",""Description"",""Original Description"",""Amount"",""Type"",""Category"",""Name"",""Labels"",""Notes""
""2/02/2012"",""ac"",""ac"",""515.00"",""a"",""b"","""",""javascript://""
""2/02/2012"",""test"",""test"",""40.00"",""a"",""d"",""c"","""","" """);
// check that 2 records were imported
Assert.AreEqual(2, importedObjects.Length);
// check the values for the first record
Format1 customer1 = (Format1)importedObjects[0];
Assert.AreEqual(DateTime.Parse("2/02/2012"), customer1.Date);
Assert.AreEqual("ac", customer1.Description);
Assert.AreEqual("ac", customer1.OriginalDescription);
Assert.AreEqual(515.00, customer1.Amount);
Assert.AreEqual("a", customer1.Type);
Assert.AreEqual("b", customer1.Category);
Assert.AreEqual("", customer1.Name);
Assert.AreEqual("javascript://", customer1.Labels);
Assert.AreEqual("", customer1.Notes);
// check the values for the second record
Format1 customer2 = (Format1)importedObjects[1];
Assert.AreEqual(DateTime.Parse("2/02/2012"), customer2.Date);
Assert.AreEqual("test", customer2.Description);
Assert.AreEqual("test", customer2.OriginalDescription);
Assert.AreEqual(40.00, customer2.Amount);
Assert.AreEqual("a", customer2.Type);
Assert.AreEqual("d", customer2.Category);
Assert.AreEqual("c", customer2.Name);
Assert.AreEqual("", customer2.Labels);
Assert.AreEqual(" ", customer2.Notes);
}
}
(Note, your first line of data seems to have 8 fields instead of 9, so I marked the Notes field with FieldOptional).
Here’s one way of doing it:
string[] lines = new string[]
{
"\"Date\",\"Description\",\"Original Description\",\"Amount\",\"Type\",\"Category\",\"Name\",\"Labels\",\"Notes\"",
"\"2/02/2012\",\"ac\",\"ac\",\"515.00\",\"a\",\"b\",\"\",\"javascript://\"",
"\"2/02/2012\",\"test\",\"test\",\"40.00\",\"a\",\"d\",\"c\",\"\",\" \"",
};
string[][] values =
lines.Select(line =>
line.Trim('"')
.Split(new string[] { "\",\"" }, StringSplitOptions.None)
.ToArray()
).ToArray();
The lines array represents the lines in your sample. Each " character must be escaped as \" in C# string literals.
For each line, we start off by removing the first and last " characters, then proceed to split it into a collection of substrings, using the "," character sequence as the delimiter.
Note that the above code will not work if you have " characters occurring naturally within your values (even if escaped).
Edit: If your CSV is to be read from a stream, all your need to do is:
var lines = new List<string>();
using (var streamReader = new StreamReader(stream))
while (!streamReader.EndOfStream)
lines.Add(streamReader.ReadLine());
The rest of the above code would work intact.
Edit: Given your new code, check whether you’re looking for something like this:
for (int i = 0; i < transactions.Length; ++i)
{
object oTrans = transactions[i];
string sTrans = oTrans as string;
if (sTrans != null &&
sTrans.StartsWith("\"") &&
sTrans.EndsWith("\""))
{
transactions[i] = sTrans.Substring(1, sTrans.Length - 2);
}
}
I have the same predicament and I replace the quotes when I load the value into my list object:
using System;
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;
namespace WindowsFormsApplication6
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
LoadCSV();
}
private void LoadCSV()
{
List<string> Rows = new List<string>();
string m_CSVFilePath = "<Path to CSV File>";
using (StreamReader r = new StreamReader(m_CSVFilePath))
{
string row;
while ((row = r.ReadLine()) != null)
{
Rows.Add(row.Replace("\"", ""));
}
foreach (var Row in Rows)
{
if (Row.Length > 0)
{
string[] RowValue = Row.Split(',');
//Do something with values here
}
}
}
}
}
}
This code might help which I developed:
using (StreamReader r = new StreamReader("C:\\Projects\\Mactive\\Audience\\DrawBalancing\\CSVFiles\\Analytix_ABC_HD.csv"))
{
string row;
int outCount;
StringBuilder line=new StringBuilder() ;
string token="";
char chr;
string Eachline;
while ((row = r.ReadLine()) != null)
{
outCount = row.Length;
line = new StringBuilder();
for (int innerCount = 0; innerCount <= outCount - 1; innerCount++)
{
chr=row[innerCount];
if (chr != '"')
{
line.Append(row[innerCount].ToString());
}
else if(chr=='"')
{
token = "";
innerCount = innerCount + 1;
for (; innerCount < outCount - 1; innerCount++)
{
chr=row[innerCount];
if(chr=='"')
{
break;
}
token = token + chr.ToString();
}
if(token.Contains(",")){token=token.Replace(",","");}
line.Append(token);
}
}
Eachline = line.ToString();
Console.WriteLine(Eachline);
}
}

Categories