How to read individual columns from CSV file? - c#

Suppose the following is my CSV file:
Step,Magnetization,Energy
1,0.009375,12
2,0.009375,12
3,0.009375,12
4,0.009375,12
5,0.009375,12
I want to read the file and create three separate lists or arrays.
So, I wrote the following code:
class Program
{
static void Main(string[] args)
{
string csvFilePath = @"ising.csv";
CsvConfiguration myConfig = new CsvConfiguration(CultureInfo.CurrentCulture)
{
Delimiter = ","
};
using (var reader = new StreamReader(csvFilePath))
using (var csv = new CsvReader(reader, myConfig))
{
List<double> xAxisForSteps = new List<double>();
List<double> yAxisForMagnetization = new List<double>();
List<double> yAxisForEnergy = new List<double>();
while (csv.Read())
{
int step = csv.GetField<int>("Step");
double magnetization = csv.GetField<double>("Magnetization");
int energy = csv.GetField<int>("Energy");
xAxisForSteps.Add(step);
yAxisForMagnetization.Add(magnetization);
yAxisForEnergy.Add(energy);
}
}
}
}
This gives the following error:
An unhandled exception of type 'CsvHelper.ReaderException' occurred in CsvHelper.dll
Additional information: The header has not been read.
You must call ReadHeader() before any fields can be retrieved by name.
IReader state:
ColumnCount: 0
CurrentIndex: -1
HeaderRecord:
IParser state:
ByteCount: 0
CharCount: 27
Row: 1
RawRow: 1
Count: 3
RawRecord:
Step,Magnetization,Energy
How to resolve it?
EDIT:
After calling csv.ReadHeader() I get the following error:
An unhandled exception of type 'CsvHelper.ReaderException' occurred in CsvHelper.dll
Additional information: No header record was found.
IReader state:
ColumnCount: 0
CurrentIndex: -1
HeaderRecord:
IParser state:
ByteCount: 0
CharCount: 0
Row: 0
RawRow: 0
Count: 0
RawRecord:

Try changing your code like this:
List<double> yAxisForEnergy = new List<double>();
// Read() advances to the header row; ReadHeader() then parses it.
if (csv.Read() && csv.ReadHeader())
{
while (csv.Read())
I don't think this is the most obvious design, but according to the documentation this is how it has to be done.
Note that parsing depends on CultureInfo.CurrentCulture, since not all cultures use . as the decimal separator. Consider specifying CultureInfo.InvariantCulture instead.
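To illustrate the culture note, here is a minimal sketch that uses only the standard library (no CsvHelper). The names CultureDemo, ParseInvariant and CommaDecimalFormat are hypothetical, and the comma-decimal format is built by hand purely for illustration rather than taken from a real culture.

```csharp
using System;
using System.Globalization;

public static class CultureDemo
{
    // Parses with the invariant culture, where '.' is always the decimal separator.
    public static double ParseInvariant(string text) =>
        double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);

    // Hand-built format (an assumption for this demo) that mimics cultures
    // using ',' for decimals and '.' for digit grouping.
    public static NumberFormatInfo CommaDecimalFormat() =>
        new NumberFormatInfo { NumberGroupSeparator = ".", NumberDecimalSeparator = "," };

    public static void Main()
    {
        Console.WriteLine(ParseInvariant("0.009375"));

        // Under the comma-decimal format, '.' is not a decimal point, so the
        // same text either fails to parse or yields a different number.
        bool ok = double.TryParse("0.009375",
                                  NumberStyles.Float | NumberStyles.AllowThousands,
                                  CommaDecimalFormat(), out double value);
        Console.WriteLine(ok ? value.ToString(CultureInfo.InvariantCulture) : "parse failed");
    }
}
```

In the question's code, the equivalent change would be to pass CultureInfo.InvariantCulture to the CsvConfiguration constructor instead of CultureInfo.CurrentCulture.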

Related

Configuration of network incorrect

I'm a novice to Keras and Tensorflow. I am trying, so far unsuccessfully, to adapt this tutorial, which is written for Python (a language I'm not familiar with at all), and have come up with the following code fragment.
var Functions = new int[] { 1, 2, 3, 4 };
var BatchSize = 64;
var InputDim = Functions.Count();
var OutputDim = 256;
var RnnUnits = 1024;
var iLayer1
= new Embedding(InputDim,
OutputDim,
input_shape: new Shape(new int[] { BatchSize, 0 } ) );
var iLayer2
= new GRU(RnnUnits,
return_sequences: true,
stateful: true, recurrent_initializer: "glorot_uniform");
var iLayer3 = new Dense(InputDim);
var iSequential = new Sequential();
iSequential.Add(iLayer1);
iSequential.Add(iLayer2);
iSequential.Add(iLayer3);
While this compiles, I'm getting the error message
Python.Runtime.PythonException:
"ValueError : Input 0 is incompatible with layer gru_1: expected ndim=3, found ndim=4"
when
iSequential.Add(iLayer2);
is executed. To my superficial understanding, this means that iLayer1 is configured in a way that makes it impossible to operate it together with iLayer2, but I have no idea what to do.
Edit: After some messing around, I got the error message
ValueError : slice index 0 of dimension 0 out of
bounds. for 'gru_1/strided_slice_10' (op: 'StridedSlice') with
input shapes: [0,64,256], [1], [1], [1] and with
computed input tensors: input[1] = <0>, input[2] = <1>, input[3] = <1>.
Any ideas?
If C# Keras uses the same conventions as Python Keras, your input shape for the embedding should not include the batch size.
Since you are forced to specify the batch size because of stateful: true, you need to use the batch_input_shape argument instead of input_shape.
I'm not sure about the 0 there. Is this the C# convention for variable length?
The error says that the second layer received a 4D tensor from the previous layer, while that tensor should have been 3D.
Options:
batch_input_shape: new Shape(new int[] { BatchSize, 0 } )
batch_shape: new Shape(new int[] { BatchSize, 0 } )
input_shape: new Shape(new int[] { 0 } ), batch_size: BatchSize
If none of these work in C#, you will have to try the functional API model instead of the sequential model.

NAudio WaveFileReader.ReadNextSampleFrame() returns invalid amplitudes

I recorded the note D to a wav file and want to get an array of its amplitudes. I started playing the note before the recording began, so there is no empty interval.
Then I used NAudio.WaveFileReader.ReadNextSampleFrame() to get its amplitudes, but the output contains a lot of zeros and only occasional non-zero values, e.g.:
0
0
0.1312
0
0
... a lot of zeros
0.12312
0.123123
0
0
..a lot of zeros
0
0.12312
This is not correct, because Ableton shows that the sound is wavy.
Here is my code:
using (var reader = new WaveFileReader("record.wav"))
{
var leftAmplitudes = new List<float>();
var rightAmplitudes = new List<float>();
for (int i = 0; i < reader.SampleCount; i++)
{
var sampleFrame = reader.ReadNextSampleFrame();
leftAmplitudes.Add(sampleFrame[0]);
rightAmplitudes.Add(sampleFrame[1]);
}
}
Do you know how I can get the actual amplitudes?

Sequentially arranged and add placeholders in text file

I have a text file with three columns. Each column starts and ends at a fixed position in the line. The first column (300, 301, 302, 304...) is always numeric, the second column is a string, and the last column is a currency amount.
The current .txt file is missing some numbers (303 and 305).
I was able to find the missing numbers, add them to an array, and write that to the file.
My goal is to write all the rows to the text file in sequence, including the missing ones. For columns 2 and 3 of a missing row, I want 0 as the placeholder, aligned with its column.
I'm close but need help.
//read file
string[] lines = File.ReadAllLines(FilePath);
var Numbers = new List<int>();
int i = 0;
foreach (var line in lines)
{
//get value of first column
var FirstColumn = line.Substring(0, 3);
//add it to array
Numbers.Add(Convert.ToInt32(FirstColumn));
++i;
}
//find missing numbers add to array
var result = Enumerable.Range(Numbers.Min(), Numbers.Count);
//write to file
using (StreamWriter file = new StreamWriter(OutPutFile, true))
{
foreach (var item in result.ToArray())
{
file.WriteLine(item);
}
}
Console.ReadKey();
Current .txt file
300     Family Guy      1,123
301     Dexters Lab     456
302     Rugrats         1,789.52
304     Scooby-Doo      321
306     Recess          2,654
307     Popeye          1,987.02
GOAL: Desired Output .txt file
300     Family Guy      1,123
301     Dexters Lab     456
302     Rugrats         1,789.52
303     0               0
304     Scooby-Doo      321
305     0               0
306     Recess          2,654
307     Popeye          1,987.02
You are reading the first column, but not the rest. What I do is create a dictionary, using the first number as the key, and stuffing the other two fields into a System.ValueTuple (on .NET Framework versions older than 4.7 you need to add the System.ValueTuple NuGet package to get this to work).
First I set some stuff up:
const int column1Start = 0;
const int column1Length = 3;
const int column2Start = 8;
const int column2Length = 15;
const int column3Start = 24;
int indexMin = int.MaxValue; //calculated during the first
int indexMax = int.MinValue; //pass through the file
Then I create my dictionary. That (string, decimal) syntax describes a 2-tuple that contains a string and a decimal number (kind of like the ordered-pairs you were taught about in high school).
Dictionary<int, (string, decimal)> data = new Dictionary<int, (string, decimal)>();
Then I make a pass through the file's lines, reading through the data, and stuffing the results in my dictionary (and calculating the max and min values for that first column):
var lines = File.ReadAllLines(fileName);
foreach (var line in lines) {
//no error checking
var indexString = line.Substring(column1Start, column1Length);
var cartoon = line.Substring(column2Start, column2Length).TrimEnd();
var numberString = line.Substring(column3Start);
if (int.TryParse(indexString, out var index)) {
//I have to parse the first number - otherwise there's nothing to index on
if (!decimal.TryParse(numberString, out var number)){
number = 0.0M;
}
data.Add(index, (cartoon, number));
if (index < indexMin) {
indexMin = index;
}
if (index > indexMax) {
indexMax = index;
}
}
}
Finally, with all my data in hand, I iterate from the min value to the max value, fetching the other two columns out of my dictionary:
for (int i = indexMin; i <= indexMax; ++i) {
if (!data.TryGetValue(i, out var val)){
val = ("0", 0.0M);
}
Console.WriteLine($"{i,5} {val.Item1,-column2Length - 2} {val.Item2, 10:N}");
}
My formatting isn't quite the same as yours (I cleaned it up a bit). You can do what you want. My results look like:
  300 Family Guy          1,123.00
  301 Dexters Lab           456.00
  302 Rugrats             1,789.52
  303 0                       0.00
  304 Scooby-Doo            321.00
  305 0                       0.00
  306 Recess              2,654.00
  307 Popeye              1,987.02

Need only the last few items from each row in a CSV file

I have a CSV file (using ';' as the separator). I have used a StreamReader to read in each line of the file. The file contains almost 4000 rows and each row has 16 columns. I only need the last 5 numbers from each row, but I am unsure as to how to split each row and get only the last 5 numbers.
Example data:
2002;10;;0;0 EUR;122;448 823 EUR;8315;6 973 EUR;192233;586 EUR;6;13;55;66;81
2002;9;;0;0 EUR;62;750 138 EUR;4784;10 294 EUR;137390;697 EUR;13;51;55;62;74
2002;8;;0;0 EUR;56;801 650 EUR;6377;7 454 EUR;177197;522 EUR;12;13;19;28;85
So for the first row, the data I actually need is { 6; 13; 55; 66; 81 }
Here is the core logic, based on the example you provided. It splits one entire row and returns the last five values in an array.
string row = "2002; 10; ; 0; 0 EUR; 122; 448 823 EUR; 8315; 6 973 EUR; 192233; 586 EUR; 6; 13; 55; 66; 81";
string[] rowArray = row.Trim().Split(';');
string[] numbers = rowArray.Skip(Math.Max(0, rowArray.Length - 5)).ToArray();
numbers then contains the last five values, which you can access by index: numbers[0], numbers[1], and so on up to numbers[4]. Because the sample row above has a space after each semicolon, call Trim() on each value if you need it without surrounding whitespace.
Note: You have to split the data read from the StreamReader into rows. Once you have the rows, loop through each row and use the three lines of code above to get the last five values.
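The loop described in the note could be sketched as follows. This is only an illustration under a few assumptions: LastFiveDemo, LastFields and ReadLastFields are hypothetical names, blank lines are skipped, and a StringReader stands in for the StreamReader over the real file so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LastFiveDemo
{
    // Returns the last `count` fields of one ';'-separated row, trimmed.
    public static string[] LastFields(string row, int count = 5)
    {
        string[] fields = row.Split(';');
        return fields.Skip(Math.Max(0, fields.Length - count))
                     .Select(f => f.Trim())
                     .ToArray();
    }

    // Reads rows from any TextReader (a StreamReader over the file in practice).
    public static List<string[]> ReadLastFields(TextReader reader)
    {
        var result = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) continue; // skip blank lines
            result.Add(LastFields(line));
        }
        return result;
    }

    public static void Main()
    {
        // Two rows from the question, fed through a StringReader for the demo.
        string sample =
            "2002;10;;0;0 EUR;122;448 823 EUR;8315;6 973 EUR;192233;586 EUR;6;13;55;66;81\n" +
            "2002;9;;0;0 EUR;62;750 138 EUR;4784;10 294 EUR;137390;697 EUR;13;51;55;62;74";

        foreach (string[] numbers in ReadLastFields(new StringReader(sample)))
            Console.WriteLine(string.Join(", ", numbers));
    }
}
```

For the two sample rows this prints "6, 13, 55, 66, 81" and "13, 51, 55, 62, 74". The values stay strings here; convert them with int.Parse if you need numbers.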
You can do this easily with the String.Split method.
foreach(var line in file)
{
var result = line.Split(';');
var last = result.Length-1;
var first = result[last-4];
var second = result[last-3];
var third = result[last-2];
var fourth = result[last-1];
var fifth = result[last];
}
As a side note, a library that I have found very helpful when dealing with CSV files is LINQtoCSV. There is a NuGet package available so it can be easily added to a project. If you are going to have to do anything else with this data, you may want to check it out.
Edit:
Here is an example of doing this with LINQtoCSV. If you read the documentation they show how to set up a more strongly typed class that you could read into, for simplicity here I am just doing it in a raw fashion.
// Define the class for reading, both IDataRow and DataRowItem
// are part of the LINQtoCSV namespace
public class MyRow : List<DataRowItem>, IDataRow
{
}
// Create the context, the file description and then read the file.
var context = new CsvContext();
var inputFileDescription = new CsvFileDescription
{
SeparatorChar = ';',
FirstLineHasColumnNames = false, // Change this if yours does
};
// Note: You don't need to use your stream reader, just use the LINQtoCSV
// Read method to load the data into an IEnumerable. You can read the
// documentation for more information and options on loading/reading the
// data.
var products = context.Read<MyRow>(@"yourfile.csv", inputFileDescription);
// Iterate all the rows and grab the last 5 items from the row
foreach (var row in products)
{
var last = row.Count - 1;
var first = row[last - 4];
var second = row[last - 3];
var third = row[last - 2];
var fourth = row[last - 1];
var fifth = row[last];
}
You can try with Cinchoo ETL library, to parse the file and access the last 5 members as below
foreach (dynamic rec in new ChoCSVReader("quotes.csv").WithDelimiter(";"))
{
Console.WriteLine("{0}", rec[11]);
Console.WriteLine("{0}", rec[12]);
Console.WriteLine("{0}", rec[13]);
Console.WriteLine("{0}", rec[14]);
Console.WriteLine("{0}", rec[15]);
}

Find The Highest Score In Each Row Algorithm

To train my coding-fu, I decided to register on the CodeEval platform. I stumbled upon an exercise that I thought was pretty simple, but for some reason there is a bug that I have been unable to resolve for a long time.
Here's the situation (I've put only what seem to be more important from the text):
"the participants calculated votes that they received for each painting and inserted them in the table. But, they could not determine which movement has won and whose work received the highest score, so they asked you to help.
You need to determine and print the highest score of each category in the table."
More on the exercise at the following link:
https://www.codeeval.com/open_challenges/208/
This is a sample input that the platform uses to verify that my algorithm is OK:
333 967 860 -742 -279 -905 |
-922 380 -127 630 38 -548 |
258 -522 157 -580 357 -502 |
963 486 909 -416 -936 -239 |
517 571 107 -676 531 -782 |
542 265 -171 251 -93 -638
Here's my output from this sample :
967 630 357 963 571
At first, I couldn't understand what was wrong. But it seems that after the last
"|", my code freezes and "jumps" on the second line from the file I'm reading. My code looked pretty ok for what I was doing.
Here is the sample code :
//Sample code to read in test cases:
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System;
class Program
{
static void Main(string[] args)
{
using (StreamReader reader = File.OpenText(args[0]))
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
if (null == line)
continue;
List<int> highestScores = new List<int>();
var temporaryNumbers = new List<int>();
string[] splittedLine = line.Split(' ');
foreach (var s in splittedLine)
{
if (s == "|")
{
highestScores.Add(temporaryNumbers.Max());
temporaryNumbers.Clear();
continue;
}
int value;
if (int.TryParse(s, out value))
{
temporaryNumbers.Add(value);
continue;
}
continue;
}
if(highestScores.Count == 0)
continue;
var newLine = highestScores.Aggregate(string.Empty, (current, value)=> current + (value + " "));
Console.Out.WriteLine(newLine);
}
}
}
I guess my question is: how do I fix a situation like this? It doesn't skip just one line of the input they use; it happens on every line. At the last |, the code jumps to the next line, if there is one.
In broad strokes, this is how I'd go about handling this:
First split your string into rows using Split('|') (let's call the resulting array rows). Then create a List<int> called columnMax. Loop through rows and, for each row, call Trim() and then Split(' ') so that the spaces around each | do not produce empty cells (let's call the result cells). We know from the original assignment that all rows have the same length, so we can loop through cells with a for loop and check:
var value = int.Parse(cells[i]); // leaving out error checking for now
// but you could use TryParse to catch bad data
if (columnMax.Count <= i)
{
columnMax.Add(value);
}
else if (columnMax[i] < value)
{
columnMax[i] = value;
}
Now at the end of your loop, columnMax should contain all the maximums for each column (i.e. category).
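Put together, the approach described above could look like the sketch below. ColumnMaxDemo and ColumnMaximums are hypothetical names, empty entries are removed while splitting instead of trimming, and int.Parse is kept without error handling, as in the fragment above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ColumnMaxDemo
{
    // Returns the maximum of each column, for rows separated by '|'
    // and cells separated by spaces.
    public static List<int> ColumnMaximums(string line)
    {
        var columnMax = new List<int>();
        string[] rows = line.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string row in rows)
        {
            string[] cells = row.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            for (int i = 0; i < cells.Length; i++)
            {
                // Throws FormatException on bad data; use TryParse to handle it.
                int value = int.Parse(cells[i], CultureInfo.InvariantCulture);
                if (columnMax.Count <= i)
                    columnMax.Add(value);       // first row: seed the column
                else if (columnMax[i] < value)
                    columnMax[i] = value;       // later rows: keep the larger value
            }
        }
        return columnMax;
    }

    public static void Main()
    {
        string input = "333 967 860 -742 -279 -905 | -922 380 -127 630 38 -548 | " +
                       "258 -522 157 -580 357 -502 | 963 486 909 -416 -936 -239 | " +
                       "517 571 107 -676 531 -782 | 542 265 -171 251 -93 -638";
        Console.WriteLine(string.Join(" ", ColumnMaximums(input)));
    }
}
```

For the sample input from the question this prints "963 967 909 630 531 -239", one maximum per column.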
Just for kicks, here's a Linq solution:
var maximums = input.Split(new [] {'|'}, StringSplitOptions.RemoveEmptyEntries)
.Aggregate((IEnumerable<int>)null,(m,r) =>
{
var cells = r.Split(new [] {' '}, StringSplitOptions.RemoveEmptyEntries).Select(c => int.Parse(c));
return m == null ? cells : cells.Zip(m, Math.Max);
});
I was going to post the whole solution, but I see this is a contest you're participating in.
So here is my help:
Try to split your problem into smaller problems and solve one thing at a time. At the moment your code is a little bit messy.
First, create a method that loads all the file entries and returns a collection of strings with the scores for each line in the file. This will require a few more methods to convert string[] to int[], like this one.
static void StringToIntegers()
{
var input = "333 967 860 -742 -279 -905 | -922 380 -127 630 38 -548 | 258 -522 157 -580 357 -502 | 963 486 909 -416 -936 -239 | 517 571 107 -676 531 -782 | 542 265 -171 251 -93 -638";
var primaryArray = input.Split('|');
foreach (var block in primaryArray)
{
var trimmedBlock = block.Trim();
var secondaryArray = trimmedBlock.Split(' ');
var intArray = StringArrToIntArr(secondaryArray);
}
}
private static int[] StringArrToIntArr(string[] secondaryArray)
{
int[] intArray = new int[secondaryArray.Length];
for (int i = 0; i < secondaryArray.Length; i++)
{
if (!int.TryParse(secondaryArray[i], out intArray[i]))
throw new FormatException(string.Format("The string {0} is not a compatible int type",
secondaryArray[i]));
}
return intArray;
}
Then, for each int collection, call a method that groups the scores of each category into separate int arrays, so that you can return the maximum for each one.
