Sequentially arranged and add placeholders in text file


I have a text file that contains data in 3 columns; each column starts and ends at a fixed position in the line. The first column (300, 301, 302, 304...) is always numeric, the second column is a string, and the last column is currency.
The current .txt file is missing some numbers (303, 305).
I was able to find the missing numbers, add them to an array, and write them to the file.
My goal is to write all of the columns' data sequentially to the text file, including the missing rows. For columns 2 and 3, I want 0 as the placeholder for the missing data, aligned with its own column.
I'm close but need help:
//read file
string[] lines = File.ReadAllLines(FilePath);
var Numbers = new List<int>();
int i = 0;
foreach (var line in lines)
{
//get value of first column
var FirstColumn = line.Substring(0, 3);
//add it to array
Numbers.Add(Convert.ToInt32(FirstColumn));
++i;
}
//find missing numbers add to array
var result = Enumerable.Range(Numbers.Min(), Numbers.Count);
//write to file
using (StreamWriter file = new StreamWriter(OutPutFile, true))
{
foreach (var item in result.ToArray())
{
file.WriteLine(item);
}
}
Console.ReadKey();
Current .txt file
300 Family Guy 1,123
301 Dexters Lab 456
302 Rugrats 1,789.52
304 Scooby-Doo 321
306 Recess 2,654
307 Popeye 1,987.02
GOAL: Desired Output .txt file
300 Family Guy 1,123
301 Dexters Lab 456
302 Rugrats 1,789.52
303 0 0
304 Scooby-Doo 321
305 0 0
306 Recess 2,654
307 Popeye 1,987.02

You are reading the first column, but not the rest. What I do is create a dictionary, using the first number as the key, and stuff the other two fields into a System.ValueTuple (on older framework versions you need to add the System.ValueTuple NuGet package to get this to work).
First I set some stuff up:
const int column1Start = 0;
const int column1Length = 3;
const int column2Start = 8;
const int column2Length = 15;
const int column3Start = 24;
int indexMin = int.MaxValue; //calculated during the first
int indexMax = int.MinValue; //pass through the file
Then I create my dictionary. That (string, decimal) syntax describes a 2-tuple that contains a string and a decimal number (kind of like the ordered-pairs you were taught about in high school).
Dictionary<int, (string, decimal)> data = new Dictionary<int, (string, decimal)>();
Then I make a pass through the file's lines, reading through the data, and stuffing the results in my dictionary (and calculating the max and min values for that first column):
var lines = File.ReadAllLines(fileName);
foreach (var line in lines) {
//no error checking
var indexString = line.Substring(column1Start, column1Length);
var cartoon = line.Substring(column2Start, column2Length).TrimEnd();
var numberString = line.Substring(column3Start);
if (int.TryParse(indexString, out var index)) {
//I have to parse the first number - otherwise there's nothing to index on
if (!decimal.TryParse(numberString, out var number)){
number = 0.0M;
}
data.Add(index, (cartoon, number));
if (index < indexMin) {
indexMin = index;
}
if (index > indexMax) {
indexMax = index;
}
}
}
Finally, with all my data in hand, I iterate from the min value to the max value, fetching the other two columns out of my dictionary:
for (int i = indexMin; i <= indexMax; ++i) {
if (!data.TryGetValue(i, out var val)){
val = ("0", 0.0M);
}
Console.WriteLine($"{i,5} {val.Item1,-column2Length - 2} {val.Item2, 10:N}");
}
My formatting isn't quite the same as yours (I cleaned it up a bit). You can do what you want. My results look like:
300 Family Guy 1,123.00
301 Dexters Lab 456.00
302 Rugrats 1,789.52
303 0 0.00
304 Scooby-Doo 321.00
305 0 0.00
306 Recess 2,654.00
307 Popeye 1,987.02

Related

C# stream reader reading whitespaces and bypassing them

I am creating a program that reads a text file and puts the data into an array. My problem is that a column is sometimes intentionally blank, and the blank must still count as a value. When my program reads a blank column, it reads the next value instead and puts it in the array where the value should be 0 or blank. I have tried counting the spaces between columns to use as a condition, but the spaces are not reliable since the data varies in length. Any ideas about how I might do this?
Here is what my text data looks like.
Data1 Data2 Data3
1.325 1.57 51.2
2.2 21.85
12.5 25.13
15.85 13.78 1.85
I need my array to look like this
firstRow['1.325','1.57','51.2'];
secondRow['2.2','0','21.85'];

If your file is tab-delimited, use line.Split('\t') to get an array of substrings for each line. Then you can convert each substring into your data type. In your case it must be nullable, e.g. decimal?.
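As a minimal sketch of that idea, assuming the file really is tab-delimited and uses "." as the decimal separator: the class and method names (TabRowParser, ParseRow) are hypothetical, chosen only for illustration. A blank or unparsable field becomes null instead of being skipped, so each value stays in its own column.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class TabRowParser
{
    // Splits one tab-delimited line into fields.
    // A blank or unparsable field becomes null rather than being dropped.
    public static decimal?[] ParseRow(string line)
    {
        return line.Split('\t')
            .Select(field => decimal.TryParse(field.Trim(), NumberStyles.Number,
                                              CultureInfo.InvariantCulture, out var value)
                ? value
                : (decimal?)null)
            .ToArray();
    }
}
```

For a line such as "2.2\t\t21.85", ParseRow returns three elements: 2.2, null, and 21.85. You can then map null to 0 or to an empty string, whichever your array needs.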

Here's a starting point if you have a list of headers in the order they appear in the data and if your values are always aligned to the headers.
import io, csv, sys
data = '''\
Data 1 Data 2 Data 3
1.325 1.57 51.2
2.2 21.85
12.5 25.13
15.85 13.78 1.85
'''
headers = ['Data 1', 'Data 2', 'Data 3'] # order should match headers
f = io.StringIO(data)
h = f.readline()
indexes = [h.find(s) for s in headers]
rows = []
for line in f:
    line = line[:-1] # strip trailing linefeed
    d = {}
    for key, index in list(zip(headers, indexes))[::-1]: # slice from the right
        val = line[index:]
        line = line[:index]
        d[key] = val.strip()
    rows.append(d)
writer = csv.DictWriter(sys.stdout, headers)
writer.writeheader()
writer.writerows(rows)

Since I had run out of time, what I did was count the number of consecutive spaces, and if the count reaches a threshold (in my case, 10) I add an empty value to my array:
string[] lsData = pData.Split(' ');
string[] lsData1 = new string[18];
int newArrayData = 0;
int spaceCounter = 0;
for (int i = 0; i < lsData.Length; i++)
{
if (lsData[i] != "")
{
lsData1[newArrayData] = lsData[i];
newArrayData++;
spaceCounter = 0;
}
else
{
spaceCounter++;
}
if (spaceCounter >= 10)
{
lsData1[newArrayData] = "";
newArrayData++;
spaceCounter = 0;
}
}

Need only the last few items from each row in a CSV file

I have a CSV file (using ';' as the separator). I have used a StreamReader to read in each line of the file. The file contains almost 4000 rows and each row has 16 columns. I only need the last 5 numbers from each row, but I am unsure as to how to split each row and get only the last 5 numbers.
Example data:
2002;10;;0;0 EUR;122;448 823 EUR;8315;6 973 EUR;192233;586 EUR;6;13;55;66;81
2002;9;;0;0 EUR;62;750 138 EUR;4784;10 294 EUR;137390;697 EUR;13;51;55;62;74
2002;8;;0;0 EUR;56;801 650 EUR;6377;7 454 EUR;177197;522 EUR;12;13;19;28;85
So for the first row, the data I actually need is { 6; 13; 55; 66; 81 }

I am writing the part of the logic as per the example you provided. This would split one entire row and return you the last five numbers in an array.
string row = "2002; 10; ; 0; 0 EUR; 122; 448 823 EUR; 8315; 6 973 EUR; 192233; 586 EUR; 6; 13; 55; 66; 81";
string[] rowArray = row.Trim().Split(';');
string[] numbers = rowArray.Skip(Math.Max(0, rowArray.Length - 5)).ToArray();
numbers will contain the last five numbers, which you can access by index: numbers[0], numbers[1], and so on up to numbers[4].
Note: You have to split the data read from the StreamReader into rows. Once you have the rows, loop through each one and use the Split and Skip lines above to get the last five numbers.

You can do this easily with the String.Split method.
foreach (var line in file)
{
    // Split the current line (not an unrelated variable) on the separator
    var result = line.Split(';');
    var last = result.Length - 1;
    var first = result[last - 4];
    var second = result[last - 3];
    var third = result[last - 2];
    var fourth = result[last - 1];
    var fifth = result[last];
}
As a side note, a library that I have found very helpful when dealing with CSV files is LINQtoCSV. There is a NuGet package available so it can be easily added to a project. If you are going to have to do anything else with this data, you may want to check it out.
Edit:
Here is an example of doing this with LINQtoCSV. If you read the documentation they show how to set up a more strongly typed class that you could read into, for simplicity here I am just doing it in a raw fashion.
// Define the class for reading, both IDataRow and DataRowItem
// are part of the LINQtoCSV namespace
public class MyRow : List<DataRowItem>, IDataRow
{
}
// Create the context, the file description and then read the file.
var context = new CsvContext();
var inputFileDescription = new CsvFileDescription
{
SeparatorChar = ';',
FirstLineHasColumnNames = false, // Change this if yours does
};
// Note: You don't need to use your stream reader, just use the LINQtoCSV
// Read method to load the data into an IEnumerable. You can read the
// documentation for more information and options on loading/reading the
// data.
var products = context.Read<MyRow>(@"yourfile.csv", inputFileDescription);
// Iterate all the rows and grab the last 5 items from the row
foreach (var row in products)
{
var last = row.Count - 1;
var first = row[last - 4];
var second = row[last - 3];
var third = row[last - 2];
var fourth = row[last - 1];
var fifth = row[last];
}

You can try with Cinchoo ETL library, to parse the file and access the last 5 members as below
foreach (dynamic rec in new ChoCSVReader("quotes.csv").WithDelimiter(";"))
{
Console.WriteLine("{0}", rec[11]);
Console.WriteLine("{0}", rec[12]);
Console.WriteLine("{0}", rec[13]);
Console.WriteLine("{0}", rec[14]);
Console.WriteLine("{0}", rec[15]);
}

how to read a csv file and display row values and column values

I'm a beginner in C# and need help. I have a .csv file containing student details in the following columns. Based on the highest marks scored by the individuals, I need to get the output shown below:
(The data in the CSV file may be, say, 100 in each row and column cell.)
Name ,Maths,Science,English(header of csv)
Akash, 80, 67 , 54
Manoj, 64, 56 , 72
Subas, 78, 84 , 63
I can do the read operation and display the whole line. My problem is that, for each subject, I want to display the subject and the name of the student with the highest score.
Sample Output:
English- Manoj
Maths - Subas
Science- Akash
I'm stuck in the middle, and any answers that don't use VB are highly appreciated.

I don't see how your requirement description matches the sample output. Here is code to get each subject and the student with the highest grade in it.
List<string[]> data = new List<string[]>();
using (StringReader sr = new StringReader(csvText))
{
while (sr.Peek() >= 0)
{
data.Add(sr.ReadLine().Split(','));
}
}
//Iterates through columns, skipping first one (names)
List<string> output = new List<string>();
for (int i = 1; i < data[0].Count(); i++)
{
string subjectName = data[0][i];
Dictionary<string,int> grades = new Dictionary<string, int>();
//Iterates through rows, skipping first one (headers)
for (int j = 1; j < data.Count; j++)
{
grades.Add(data[j][0].Trim(), Convert.ToInt32(data[j][i])); // row j, column i
}
output.Add(subjectName + " - " + grades.Aggregate((l, r) => l.Value > r.Value ? l : r).Key);
}
Credit to answers in this question for the Aggregate linq: Get key of highest value in Dictionary

How to distribute percentage of weight in given scenario

Question: I have a table X with an arbitrary number of rows (it could be 10 rows, 100 rows, and so on). I also have percentage weights, say 33%, 40%, and 27%. Let's name them:
A=33%
B=40%
C=27%
I have to add one more column that assigns one of these weights to each row at random:
Row | Weight
row1 | A
row2 | C
row3 | B
.
.
.
row100 | B
Suppose the table has 1000 rows; then the weights should be assigned randomly so that
A=330
B=400
C=270
What I made:
The program below has to distribute segments on the basis of value. For example, the code below iterates up to 1000, but it distributes the values like
A=300
B=400
C=300
instead of
A=250, B=450, C=300, as the weights are 25%, 45%, 30%.
It should be generic for any number n; in this code, n = 1000 (iterations):
static void Main(string[] args)
{
//var t = Console.ReadLine().ToObservable();
List<string> li = new List<string>();
//t.Subscribe(m => Console.Write(m));
for (int i = 1; i <= 1000; i++)
{
li.Add(GetSegment(i, "2.5,6.5,10.0", "A,B,C"));
}
Console.WriteLine("A Contains {0}",li.Count(x => x.Contains("A")));
Console.WriteLine("B Contains {0}", li.Count(x => x.Contains("B")));
Console.WriteLine("C Contains {0}", li.Count(x => x.Contains("C")));
Console.ReadLine();
}
public static string GetSegment(long seed, string raw_segments, string segname)
{
var segmentsValue = raw_segments.Split(',').Select(entry => (double.Parse(entry))).ToArray();
var segmentName = segname.Split(',').Select(entry => entry).ToArray();
double theNumber = seed % 10;
double index1 = segmentsValue.Where(entry => entry > theNumber).First();
int index = Array.IndexOf(segmentsValue, index1);
return segmentName[index].ToString();
}

So you have some number of objects and you want to assign them randomly to three bins, based on some set distribution. For example, you want 33% in bin A, 40% in bin B, and the remaining 27% in bin C.
If your distribution doesn't have to be exact (that is, you don't require that, given 1,000 items, bin A contains exactly 330 items), then this is very easy: for each row you generate a random number from 0 to 999 and assign the row to the appropriate bin. For example:
int[] ranges = new int[]{330, 730, 1000};
var rnd = new Random();
for (var i = 0; i < 1000; ++i)
{
var r = rnd.Next(1000);
if (r < ranges[0])
Console.WriteLine("bin A");
else if (r < ranges[1])
Console.WriteLine("bin B");
else
Console.WriteLine("bin C");
}
On average over many runs, that will give you 33% in bin A, 40% in bin B, and 27% in bin C. But for any individual run the number of items in each bin will vary somewhat. For example, in one run you might end up with 327, 405, 268.
With a little work, you could adapt that method so that it doesn't over-assign any bin. Basically, when a bin fills up, remove it from the list of ranges. You'd need your list of ranges to be dynamic so that you could remove items and keep working, but it would allow you to exactly fill each bin.
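One way to sketch that adaptation, as a variation rather than the exact dynamic-range list described above: weight each pick by the remaining capacity of every bin, so a bin that has filled up has zero weight and drops out on its own. The names ExactBins and Assign are hypothetical, and the quotas are assumed to be given as item counts rather than percentages.

```csharp
using System;
using System.Collections.Generic;

static class ExactBins
{
    // Assigns one bin label per item, in random order, so that each bin
    // receives exactly its quota. The number of items is the sum of the quotas.
    public static char[] Assign(IDictionary<char, int> quotas, Random rnd)
    {
        // Copy the quotas so the caller's dictionary is not modified.
        var remaining = new Dictionary<char, int>(quotas);
        int left = 0;
        foreach (var q in remaining.Values) left += q;

        var result = new char[left];
        for (int i = 0; i < result.Length; ++i)
        {
            // Pick one of the remaining slots, then find the bin that owns it.
            int r = rnd.Next(left);
            foreach (var bin in new List<char>(remaining.Keys))
            {
                if (r < remaining[bin])
                {
                    result[i] = bin;
                    remaining[bin]--;
                    left--;
                    break;
                }
                r -= remaining[bin];
            }
        }
        return result;
    }
}
```

Called with quotas of 330, 400, and 270 for A, B, and C, this returns 1,000 labels containing exactly those counts, with the order depending on the Random instance you pass in.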
If the number of items is small enough, you could create an array with the numbers from 0 to N-1, shuffle it, and assign the numbers that way. For example:
// builds an array of numbers from 0 to 999.
var numbers = Enumerable.Range(0, 1000).ToArray();
Shuffle(numbers);
Use the Fisher-Yates shuffle to shuffle the array. See https://gist.github.com/mikedugan/8249637 (among many others) for an implementation.
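For reference, a minimal sketch of a Fisher-Yates shuffle is shown here. It is not the linked implementation; the class name ArrayShuffler is hypothetical, and unlike the one-argument Shuffle(numbers) call above, this version takes the Random instance as a parameter so the caller controls seeding.

```csharp
using System;

static class ArrayShuffler
{
    // In-place Fisher-Yates shuffle. Walks from the end of the array and
    // swaps each element with a randomly chosen element at or before it.
    public static void Shuffle<T>(T[] items, Random rnd)
    {
        for (int i = items.Length - 1; i > 0; --i)
        {
            int j = rnd.Next(i + 1); // 0 <= j <= i
            T tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }
}
```

With this version the call would be ArrayShuffler.Shuffle(numbers, new Random()).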
You now have an array containing the numbers from 0 to 999 in random order. This is like pre-assigning a unique random number to each of your records. So when you go through your list of records you look up its corresponding random number in the numbers array. For example:
for (var i = 0; i < 1000; ++i)
{
var value = numbers[i];
char bin;
if (value < 330) bin = 'A';
else if (value < 730) bin = 'B';
else bin = 'C';
Console.WriteLine("Record {0} goes to bin {1}", i, bin);
}

Find The Highest Score In Each Row Algorithm

In order to train my coding-fu, I decided to register on the CodeEval platform. I stumbled upon an exercise that I thought was pretty simple, but for some reason there is a bug that I have been unable to resolve for a long time.
Here's the situation (I've put only what seem to be more important from the text):
"the participants calculated votes that they received for each painting and inserted them in the table. But, they could not determine which movement has won and whose work received the highest score, so they asked you to help.
You need to determine and print the highest score of each category in the table."
More on the exercise at the following link:
https://www.codeeval.com/open_challenges/208/
This is a sample input that the platform uses to verify that my algorithm is OK:
333 967 860 -742 -279 -905 |
-922 380 -127 630 38 -548 |
258 -522 157 -580 357 -502 |
963 486 909 -416 -936 -239 |
517 571 107 -676 531 -782 |
542 265 -171 251 -93 -638
Here's my output from this sample :
967 630 357 963 571
At first, I couldn't understand what was wrong. But it seems that after the last "|", my code freezes and "jumps" to the second line of the file I'm reading. My code looked pretty OK for what I was doing.
Here is the sample code :
//Sample code to read in test cases:
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System;
class Program
{
static void Main(string[] args)
{
using (StreamReader reader = File.OpenText(args[0]))
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
if (null == line)
continue;
List<int> highestScores = new List<int>();
var temporaryNumbers = new List<int>();
string[] splittedLine = line.Split(' ');
foreach (var s in splittedLine)
{
if (s == "|")
{
highestScores.Add(temporaryNumbers.Max());
temporaryNumbers.Clear();
continue;
}
int value;
if (int.TryParse(s, out value))
{
temporaryNumbers.Add(value);
continue;
}
continue;
}
if(highestScores.Count == 0)
continue;
var newLine = highestScores.Aggregate(string.Empty, (current, value)=> current + (value + " "));
Console.Out.WriteLine(newLine);
}
}
}
I guess my question is how to fix a situation like this. It doesn't happen on just one line of their input; it happens on every line. At the last |, the code jumps to the next line, if there is one.

In broad strokes, this is how I'd go about handling this:
First split your string into rows using Split('|') (let's call the resulting array rows). Now create a List<int> called columnMax. Then loop through rows, and for each row call Split(' '), dropping empty entries (let's call the result cells). We know from the original assignment that the rows are all the same length, so we loop through cells with a for loop and check:
var value = int.Parse(cells[i]); // leaving out error checking for now
// but you could use TryParse to catch bad data
if (columnMax.Count <= i)
{
columnMax.Add(value);
}
else if (columnMax[i] < value)
{
columnMax[i] = value;
}
Now at the end of your loop, columnMax should contain all the maximums for each column (i.e. category).
Just for kicks, here's a Linq solution:
var maximums = input.Split(new [] {'|'}, StringSplitOptions.RemoveEmptyEntries)
.Aggregate((IEnumerable<int>)null,(m,r) =>
{
var cells = r.Split(new [] {' '}, StringSplitOptions.RemoveEmptyEntries).Select(c => int.Parse(c));
return m == null ? cells : cells.Zip(m, Math.Max);
});

I was going to post the whole solution, but I see this is a contest you're participating in.
So this is my help:
Try to split your problem into smaller problems and solve one at a time. Right now your code is a little bit messy.
First, create a method that loads all the file entries and returns a string collection with the scores for each line in the file. This will require a few more methods, for example to convert a string[] to an int[], like this one.
static void StringToIntegers()
{
var input = "333 967 860 -742 -279 -905 | -922 380 -127 630 38 -548 | 258 -522 157 -580 357 -502 | 963 486 909 -416 -936 -239 | 517 571 107 -676 531 -782 | 542 265 -171 251 -93 -638";
var primaryArray = input.Split('|');
foreach (var block in primaryArray)
{
var trimmedBlock = block.Trim();
var secondaryArray = trimmedBlock.Split(' ');
var intArray = StringArrToIntArr(secondaryArray);
}
}
private static int[] StringArrToIntArr(string[] secondaryArray)
{
int[] intArray = new int[secondaryArray.Length];
for (int i = 0; i < secondaryArray.Length; i++)
{
if (!int.TryParse(secondaryArray[i], out intArray[i]))
throw new FormatException(string.Format("The string {0} is not a compatible int type",
secondaryArray[i]));
}
return intArray;
}
Then, for each int collection, call a method that groups each category's scores into separate int arrays, from which you can return the maximum for each one.
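That last step could be sketched roughly as follows, assuming each input line is non-empty and every row has the same number of columns as the first. The names CategoryScores and ColumnMaxima are hypothetical, and this is only one possible shape for the method, not a reference solution to the challenge.

```csharp
using System;
using System.Linq;

static class CategoryScores
{
    // Parses one "|"-separated line into rows of ints, then returns
    // the highest value found in each column (category).
    public static int[] ColumnMaxima(string line)
    {
        int[][] rows = line
            .Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(row => row
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(s => int.Parse(s))
                .ToArray())
            .Where(row => row.Length > 0)
            .ToArray();

        // Assumes at least one row, and that all rows are as long as the first.
        return Enumerable.Range(0, rows[0].Length)
            .Select(col => rows.Max(row => row[col]))
            .ToArray();
    }
}
```

For the sample input in the question, the column maxima are 963 967 909 630 531 -239, which you can print with string.Join(" ", CategoryScores.ColumnMaxima(line)).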
