C# stream reader reading whitespaces and bypassing them - c#

I am creating a program that reads a text file and puts the data into an array. Some columns are intentionally blank, and a blank must still count as a value. When my program reads a blank column, however, it picks up the next value and stores it where a 0 or an empty string should go. I tried counting the spaces between columns to detect blanks, but that is unreliable because the data varies in length. Any ideas about how I might do this?
Here is what my text data looks like.
Data1 Data2 Data3
1.325 1.57 51.2
2.2 21.85
12.5 25.13
15.85 13.78 1.85
I need my array to look like this
firstRow['1.325','1.57','51.2'];
secondRow['2.2','0','21.85'];

If your file is tab-separated, use line.Split('\t') to get an array of substrings for each line. You can then convert each substring to your data type. In your case it must be nullable, e.g. decimal?, so that a blank column can be represented as null.
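A minimal sketch of that idea, assuming the file really is tab-separated and the numbers use invariant-culture formatting (both are assumptions about your data; TabRowParser and ParseRow are just illustrative names):

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class TabRowParser
{
    // Splits one tab-separated line and maps each field to a nullable decimal.
    // Blank or unparsable fields become null instead of being skipped,
    // so every value stays in its own column position.
    public static decimal?[] ParseRow(string line)
    {
        return line.Split('\t')
            .Select(field => decimal.TryParse(field.Trim(), NumberStyles.Number,
                                              CultureInfo.InvariantCulture, out var value)
                ? value
                : (decimal?)null)
            .ToArray();
    }
}
```

With this, TabRowParser.ParseRow("2.2\t\t21.85") gives three elements: 2.2, null and 21.85. If you want 0 instead of null, you can replace the nulls afterwards with GetValueOrDefault().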

Here's a starting point if you have a list of headers in the order they appear in the data and if your values are always aligned to the headers.
import io, csv, sys
data = '''\
Data 1 Data 2 Data 3
1.325 1.57 51.2
2.2 21.85
12.5 25.13
15.85 13.78 1.85
'''
headers = ['Data 1', 'Data 2', 'Data 3'] # order should match headers
f = io.StringIO(data)
h = f.readline()
indexes = [h.find(s) for s in headers]
rows = []
for line in f:
    line = line[:-1] # strip trailing linefeed
    d = {}
    for key, index in list(zip(headers, indexes))[::-1]: # slice from the right
        val = line[index:]
        line = line[:index]
        d[key] = val.strip()
    rows.append(d)
writer = csv.DictWriter(sys.stdout, headers)
writer.writeheader()
writer.writerows(rows)
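Since the question is about C#, here is a rough sketch of the same header-offset idea in C#. It assumes every header is found in the header line, that the headers are passed in left-to-right order, and that no value is wider than its column; HeaderAlignedParser, ColumnStarts and SplitRow are made-up names for illustration:

```csharp
using System;
using System.Linq;

public static class HeaderAlignedParser
{
    // Finds where each header starts in the header line; those offsets are
    // then used as column boundaries for every data line.
    public static int[] ColumnStarts(string headerLine, string[] headers)
    {
        return headers.Select(h => headerLine.IndexOf(h, StringComparison.Ordinal)).ToArray();
    }

    // Cuts a data line at the column offsets. A column that is blank (or that
    // lies beyond the end of a short line) yields an empty string.
    public static string[] SplitRow(string line, int[] starts)
    {
        var fields = new string[starts.Length];
        for (int i = 0; i < starts.Length; i++)
        {
            int begin = Math.Min(starts[i], line.Length);
            int end = i + 1 < starts.Length ? Math.Min(starts[i + 1], line.Length) : line.Length;
            fields[i] = line.Substring(begin, end - begin).Trim();
        }
        return fields;
    }
}
```

You would call ColumnStarts once on the first line of the file and then SplitRow on every following line. A row whose middle column is blank comes back with an empty string in the middle position, which you can map to "0" if that is what your array needs.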

Since I ran out of time, what I did was count the consecutive spaces; whenever the count reaches a threshold (10 in my case), I add an empty value to my array.
string[] lsData = pData.Split(' ');
string[] lsData1 = new string[18];
int newArrayData = 0;
int spaceCounter = 0;
for (int i = 0; i < lsData.Length; i++)
{
    if (lsData[i] != "")
    {
        lsData1[newArrayData] = lsData[i];
        newArrayData++;
        spaceCounter = 0;
    }
    else
    {
        spaceCounter++;
    }
    if (spaceCounter >= 10)
    {
        lsData1[newArrayData] = "";
        newArrayData++;
        spaceCounter = 0;
    }
}

Related

Sequentially arranged and add placeholders in text file

I have a text file which contains data. There are 3 columns; each column starts and ends at a specific position in the line. The first column (300, 301, 302, 304...) is always numeric, the second column is a string, and the last column is currency.
The current .txt file is missing some numbers (303, 305).
I was able to find the missing numbers, add them to an array and write that to the file.
My goal is to write all the columns' data sequentially to the text file, including the missing rows. For columns 2 and 3, I want 0 as the placeholder for the missing data, aligned with its own column.
I'm close but need help.
//read file
string[] lines = File.ReadAllLines(FilePath);
var Numbers = new List<int>();
int i = 0;
foreach (var line in lines)
{
    //get value of first column
    var FirstColumn = line.Substring(0, 3);
    //add it to array
    Numbers.Add(Convert.ToInt32(FirstColumn));
    ++i;
}
//find missing numbers add to array
var result = Enumerable.Range(Numbers.Min(), Numbers.Count);
//write to file
using (StreamWriter file = new StreamWriter(OutPutFile, true))
{
    foreach (var item in result.ToArray())
    {
        file.WriteLine(item);
    }
}
Console.ReadKey();
Current .txt file
300     Family Guy      1,123
301     Dexters Lab     456
302     Rugrats         1,789.52
304     Scooby-Doo      321
306     Recess          2,654
307     Popeye          1,987.02
GOAL: Desired Output .txt file
300     Family Guy      1,123
301     Dexters Lab     456
302     Rugrats         1,789.52
303     0               0
304     Scooby-Doo      321
305     0               0
306     Recess          2,654
307     Popeye          1,987.02
You are reading the first column, but not the rest. What I do is create a dictionary, using the first number as the key, and stuffing the other two fields into a System.ValueTuple (you may need to include the System.ValueTuple NuGet package to get this to work).
First I set some stuff up:
const int column1Start = 0;
const int column1Length = 3;
const int column2Start = 8;
const int column2Length = 15;
const int column3Start = 24;
int indexMin = int.MaxValue; //calculated during the first
int indexMax = int.MinValue; //pass through the file
Then I create my dictionary. That (string, decimal) syntax describes a 2-tuple that contains a string and a decimal number (kind of like the ordered-pairs you were taught about in high school).
Dictionary<int, (string, decimal)> data = new Dictionary<int, (string, decimal)>();
Then I make a pass through the file's lines, reading through the data, and stuffing the results in my dictionary (and calculating the max and min values for that first column):
var lines = File.ReadAllLines(fileName);
foreach (var line in lines) {
    //no error checking
    var indexString = line.Substring(column1Start, column1Length);
    var cartoon = line.Substring(column2Start, column2Length).TrimEnd();
    var numberString = line.Substring(column3Start);
    if (int.TryParse(indexString, out var index)) {
        //I have to parse the first number - otherwise there's nothing to index on
        if (!decimal.TryParse(numberString, out var number)){
            number = 0.0M;
        }
        data.Add(index, (cartoon, number));
        if (index < indexMin) {
            indexMin = index;
        }
        if (index > indexMax) {
            indexMax = index;
        }
    }
}
Finally, with all my data in hand, I iterate from the min value to the max value, fetching the other two columns out of my dictionary:
for (int i = indexMin; i <= indexMax; ++i) {
    if (!data.TryGetValue(i, out var val)){
        val = ("0", 0.0M);
    }
    Console.WriteLine($"{i,5} {val.Item1,-column2Length - 2} {val.Item2, 10:N}");
}
My formatting isn't quite the same as yours (I cleaned it up a bit). You can do what you want. My results look like:
  300 Family Guy          1,123.00
  301 Dexters Lab           456.00
  302 Rugrats             1,789.52
  303 0                       0.00
  304 Scooby-Doo            321.00
  305 0                       0.00
  306 Recess              2,654.00
  307 Popeye              1,987.02

Need only the last few items from each row in a CSV file

I have a CSV file (using ';' as the separator). I have used a StreamReader to read in each line of the file. The file contains almost 4000 rows and each row has 16 columns. I only need the last 5 numbers from each row, but I am unsure as to how to split each row and get only the last 5 numbers.
Example data:
2002;10;;0;0 EUR;122;448 823 EUR;8315;6 973 EUR;192233;586 EUR;6;13;55;66;81
2002;9;;0;0 EUR;62;750 138 EUR;4784;10 294 EUR;137390;697 EUR;13;51;55;62;74
2002;8;;0;0 EUR;56;801 650 EUR;6377;7 454 EUR;177197;522 EUR;12;13;19;28;85
So for the first row, the data I actually need is { 6; 13; 55; 66; 81 }
I am writing the part of the logic as per the example you provided. This would split one entire row and return you the last five numbers in an array.
string row = "2002; 10; ; 0; 0 EUR; 122; 448 823 EUR; 8315; 6 973 EUR; 192233; 586 EUR; 6; 13; 55; 66; 81";
string[] rowArray = row.Trim().Split(';');
string[] numbers = rowArray.Skip(Math.Max(0, rowArray.Length - 5)).ToArray();
numbers will contain the last five values, which you can access by index: numbers[0], numbers[1], and so on up to numbers[4].
Note: You have to split the data read from the StreamReader into rows. Once you have the rows, loop through each row and use the above three lines of code to get the last five numbers.
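Putting the note above into code, a small sketch could look like the following. It assumes the last five fields of every row are plain integers, as in the sample data; LastFive and Extract are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastFive
{
    // Returns the last `count` fields of each ';'-separated line, parsed as ints.
    // Lines with fewer than `count` fields are skipped.
    // int.Parse throws if one of those fields is not a whole number.
    public static List<int[]> Extract(IEnumerable<string> lines, int count = 5)
    {
        var result = new List<int[]>();
        foreach (var line in lines)
        {
            string[] fields = line.Split(';');
            if (fields.Length < count) continue;
            result.Add(fields.Skip(fields.Length - count)
                             .Select(f => int.Parse(f.Trim()))
                             .ToArray());
        }
        return result;
    }
}
```

You can feed it the lines you already read with your StreamReader, or pass File.ReadLines(path) from System.IO directly. For the first sample row it returns { 6, 13, 55, 66, 81 }.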
You can do this easily with the String.Split method.
foreach(var line in file)
{
    var result = line.Split(';');
    var last = result.Length-1;
    var first = result[last-4];
    var second = result[last-3];
    var third = result[last-2];
    var fourth = result[last-1];
    var fifth = result[last];
}
As a side note, a library that I have found very helpful when dealing with CSV files is LINQtoCSV. There is a NuGet package available so it can be easily added to a project. If you are going to have to do anything else with this data, you may want to check it out.
Edit:
Here is an example of doing this with LINQtoCSV. If you read the documentation they show how to set up a more strongly typed class that you could read into, for simplicity here I am just doing it in a raw fashion.
// Define the class for reading, both IDataRow and DataRowItem
// are part of the LINQtoCSV namespace
public class MyRow : List<DataRowItem>, IDataRow
{
}
// Create the context, the file description and then read the file.
var context = new CsvContext();
var inputFileDescription = new CsvFileDescription
{
    SeparatorChar = ';',
    FirstLineHasColumnNames = false, // Change this if yours does
};
// Note: You don't need to use your stream reader, just use the LINQtoCSV
// Read method to load the data into an IEnumerable. You can read the
// documentation for more information and options on loading/reading the
// data.
var products = context.Read<MyRow>(@"yourfile.csv", inputFileDescription);
// Iterate all the rows and grab the last 5 items from the row
foreach (var row in products)
{
    var last = row.Count - 1;
    var first = row[last - 4];
    var second = row[last - 3];
    var third = row[last - 2];
    var fourth = row[last - 1];
    var fifth = row[last];
}
You can try with Cinchoo ETL library, to parse the file and access the last 5 members as below
foreach (dynamic rec in new ChoCSVReader("quotes.csv").WithDelimiter(";"))
{
    Console.WriteLine("{0}", rec[11]);
    Console.WriteLine("{0}", rec[12]);
    Console.WriteLine("{0}", rec[13]);
    Console.WriteLine("{0}", rec[14]);
    Console.WriteLine("{0}", rec[15]);
}

c# Read lines from a file and replace with text from DataGridView Data

I am relatively new to C#. I am creating a Windows application that reads all the lines from a text file. The user enters the string to be replaced in Column[0] of the DataGridView control and the replacement text in Column[1].
I have created two string arrays column0 and column1.
However, I am getting an error while replacing the string in line (column0, column1)
The following is my code:
string[] column0 = new string[dgvMapping.Rows.Count];
string[] column1 = new string[dgvMapping.Rows.Count];
int j = 0;
foreach(DataGridViewRow row in dgvMapping.Rows)
{
    if (!string.IsNullOrEmpty(Convert.ToString(row.Cells[0].Value)))
    {
        column0[j] = Convert.ToString(row.Cells[0].Value);
        column1[j] = Convert.ToString(row.Cells[1].Value);
        j++;
    }
}
var _data = string.Empty;
String[] arrayofLine = File.ReadAllLines(ofd.FileName);
using (StreamWriter sw = new StreamWriter(ofd.FileName + ".output"))
{
    for (int i = 0; i < arrayofLine.Length; i++)
    {
        string line = arrayofLine[i];
        line = line.Replace(column0[i], column1[i]);
        sw.WriteLine(line);
    }
}
I am using OpenFileDialog to select the file.
The Error While Executing:
You are looping around a file of unknown number of lines, and assuming that the count of lines in the grid is exactly the same as that of the file. Your code will only work if both the file and the gridView have the same number of lines.
One solution is to loop over the array of lines (as you already do) and, for each line, search for a DataGridViewRow whose key is contained in that line. If there is one, replace all occurrences of the key in that line with the value obtained from the grid; otherwise leave the line unchanged.
Check out the code below :
// Convert the row collection to a list, so that we can query it easily with LINQ
List<DataGridViewRow> mySearchList = dataGridView1.Rows.Cast<DataGridViewRow>().ToList();
const int KEY_INDEX = 0; // Search index in the grid
const int VALUE_INDEX = 1; // Value (replace) index in the grid
// Runs inside the using (StreamWriter sw = ...) block from the question
for (int i = 0; i < arrayofLine.Length; i++)
{
    string line = arrayofLine[i];
    // Get the first grid row whose non-empty key is contained in this line
    DataGridViewRow matchedRow = mySearchList.FirstOrDefault(obj =>
        !string.IsNullOrEmpty(Convert.ToString(obj.Cells[KEY_INDEX].Value)) &&
        line.Contains(Convert.ToString(obj.Cells[KEY_INDEX].Value)));
    // If this row exists, replace the key with the value (obtained from the grid)
    if (matchedRow != null)
    {
        string key = Convert.ToString(matchedRow.Cells[KEY_INDEX].Value);
        string value = Convert.ToString(matchedRow.Cells[VALUE_INDEX].Value);
        line = line.Replace(key, value);
    }
    // Write every line, changed or not, so unmatched lines are not lost
    sw.WriteLine(line);
}
Stuartd is correct… there are more lines in the file than there are elements to search. I am not sure what the search is doing in a sense that it seems somewhat limited. The code appears to search for each item depending on what line it is. The searched value in column 0 and the replace value in column 1 of row 0… will only replace those values for the FIRST line in the file. The DataGridViews second row values will search/replace only the SECOND line and so on. This seems odd.
For example, the two string arrays (column0 and column1) have their sizes set to the number of rows in dgvMapping. Let's say there are 5 rows in the grid; then each array holds 5 strings. When you start the loop to write the strings, the loop starts at 0 and stops at the number of lines in the file. The code uses this i variable as an index into the two arrays. If there are more lines in the file than there are rows in the grid, you will get the error.
Again, it seems odd to do the search and replace this way. Assuming you want to search for EACH term in column 0 and replace it with the matching string in column 1, you will need to loop through EACH row of the grid for EACH line in the file. This applies ALL the search/replace terms in the grid to ALL the lines in the file. If this is what you want to accomplish, below is one way to achieve it; however, there are possibly better ways.
The code below reads the file into one big string. Then the code loops through ALL the grid rows to search/replace the strings in the big string. Hope this helps.
string bigString = File.ReadAllText(ofd.FileName);
try {
    using (StreamWriter sw = new StreamWriter(ofd.FileName + ".output")) {
        for (int k = 0; k < dgvMapping.Rows.Count; k++) {
            if (dgvMapping.Rows[k].Cells[0].Value != null && dgvMapping.Rows[k].Cells[1].Value != null) {
                string searchTerm = dgvMapping.Rows[k].Cells[0].Value.ToString();
                string replaceTerm = dgvMapping.Rows[k].Cells[1].Value.ToString();
                if (searchTerm != "") {
                    bigString = bigString.Replace(searchTerm, replaceTerm);
                } else {
                    // the search term is empty
                }
            } else {
                // one of the terms is null
            }
        }
        sw.WriteLine(bigString);
    }
}
catch (Exception ex) {
    MessageBox.Show("Write Error: " + ex.Message);
}
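If you would rather keep working line by line, the "every grid row for every line" loop described above can be sketched without any WinForms types. This is only an illustration: LineReplacer and ReplaceAll are made-up names, and the pairs are assumed to have been collected from the grid beforehand (column 0 as the key, column 1 as the value):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineReplacer
{
    // Applies every (search, replace) pair to every line, in the order given.
    // Pairs with a null or empty search term are ignored, because
    // string.Replace throws on an empty search string.
    public static List<string> ReplaceAll(IEnumerable<string> lines,
                                          IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var usable = pairs.Where(p => !string.IsNullOrEmpty(p.Key)).ToList();
        var output = new List<string>();
        foreach (var line in lines)
        {
            string current = line;
            foreach (var pair in usable)
            {
                current = current.Replace(pair.Key, pair.Value ?? string.Empty);
            }
            output.Add(current);
        }
        return output;
    }
}
```

The result can be written out with File.WriteAllLines. Note that the pairs are applied one after another, so a later pair can also match text that an earlier replacement produced.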

How to read a space-delimited text file with empty columns

I'm trying to read a space-delimited file using StreamReader.
To do this, I read the file line by line, split each line into an array, and read a specific item by index.
The problem is that in some rows a column is empty, which causes the program to pick up the wrong item.
col1 col2 col3
a b c
d e
f g h
For example, I'm having problems with the second row.
Unless you have fixed-width columns, you won't know which value is supposed to be empty. If you have control over the format, you should wrap values in quotes, or use a CSV format with quoted values and escaped inner quotes; you then get the luxury of viewing the file in Excel for free :-)
https://en.wikipedia.org/wiki/Comma-separated_values
I can see two approaches to this:
1. Use exact widths/spaces to find each element's position inside the row.
2. Analyze element positions inside every row that has fewer than three elements. Compare the x position of each element to the x positions of the headers, for example:
string header = "col1 col2 col3";
string row1 = "adfgdgdfg c";
int[] headerPoss = { header.IndexOf("col1"), header.IndexOf("col2"), header.IndexOf("col3") };
string[] row1Elements = row1.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
int[] rowElementsPos = new int[row1Elements.Length];
for (int i = 0; i < row1Elements.Length; i++)
    rowElementsPos[i] = row1.IndexOf(row1Elements[i]);
for (int i = 0; i < row1Elements.Length; i++)
{
    // Pick the header whose start position is closest to this element's position
    int nearest = Enumerable.Range(0, headerPoss.Length)
        .OrderBy(h => Math.Abs(headerPoss[h] - rowElementsPos[i]))
        .First();
    Console.WriteLine("This element is from column {0}", nearest + 1);
}
Console.ReadKey();
Output :

Aligning strings into columns

I have a collection of strings that the user can add to or remove from. I need a way to print the strings out in columns so that the first letters of the strings are aligned. However, the number of columns must be changeable at run time: although the default is 4 columns, the user can opt for any number from 1 to 6. I have no idea how to format an unknown quantity of strings into an unknown number of columns.
Example Input:
it we so be a i o u t y z c yo bo go an
Example output of four columns
"Words" with 2 letters:
it so be we
yo bo go an
"Words" with 1 letter:
a i o u
t y z c
Note: not worried about parsing of the words I already have that in my code which I can add if helpful.
If you are trying to create fixed-width columns, you can use string.PadLeft(totalWidth, paddingChar) and string.PadRight(totalWidth, paddingChar) when you are creating your rows.
http://msdn.microsoft.com/en-us/library/system.string.padleft.aspx
You can loop through your words and call .PadXXXX(width) on each word. It will automatically pad your words with the correct number of spaces to make your string the width you supplied.
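As a rough sketch of that loop, assuming you want every column as wide as the longest word plus a small gap (ColumnFormatter, Format and the gap parameter are illustrative names, not part of any library):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ColumnFormatter
{
    // Lays the words out left-to-right in `columns` columns. Every column is as
    // wide as the longest word plus `gap` spaces, so first letters line up.
    public static string Format(IReadOnlyList<string> words, int columns, int gap = 2)
    {
        if (columns < 1) throw new ArgumentOutOfRangeException(nameof(columns));
        if (words.Count == 0) return string.Empty;

        int width = words.Max(w => w.Length) + gap;
        var sb = new StringBuilder();
        for (int i = 0; i < words.Count; i++)
        {
            bool endOfRow = (i + 1) % columns == 0 || i == words.Count - 1;
            // The last word of a row is not padded, which avoids trailing spaces.
            sb.Append(endOfRow ? words[i] : words[i].PadRight(width));
            if (endOfRow && i != words.Count - 1) sb.Append('\n');
        }
        return sb.ToString();
    }
}
```

Because the column count is just a parameter, the user's choice of 1 to 6 columns can be passed straight in, e.g. Console.WriteLine(ColumnFormatter.Format(twoLetterWords, 4)), where twoLetterWords is whatever list your parsing code produces.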
You can divide the total line width by the number of columns and pad each string to that length. You may also want to trim extra long strings. Here's an example that pads strings that are shorter than the column width and trims strings that are longer. You may want to tweak the behavior for longer strings:
int Columns = 4;
int LineLength = 80;
public void WriteGroup(String[] group)
{
    // determine the column width given the number of columns and the line width
    int columnWidth = LineLength / Columns;
    for (int i = 0; i < group.Length; i++)
    {
        if (i > 0 && i % Columns == 0)
        { // Finished a complete line; write a new-line to start on the next one
            Console.WriteLine();
        }
        if (group[i].Length > columnWidth)
        { // This word is too long; truncate it to the column width
          // (Write, not WriteLine, so the row is not broken mid-line)
            Console.Write(group[i].Substring(0, columnWidth));
        }
        else
        { // Write out the word with spaces padding it to fill the column width
            Console.Write(group[i].PadRight(columnWidth));
        }
    }
}
If you call the above method with this sample code:
var groupOfWords = new String[] { "alphabet", "alegator", "ant",
"ardvark", "ark", "all", "amp", "ally", "alley" };
WriteGroup(groupOfWords);
Then you should get output that looks like this:
alphabet            alegator            ant                 ardvark
ark                 all                 amp                 ally
alley
