Output data to specific CSV columns from a Dictionary in C#

I am trying to output the values from a dictionary to a CSV file, and that part works. The problem is getting each value into the right column: I need specific values from the dictionary to be written to specific columns in the CSV.
Dictionary<string, List<string>> file = new Dictionary<string, List<string>>();
for (var i = 0; i < stringList.Count(); i++)
{
string line = stringList[i];
string path = line.Replace("\r\n", "");
path = path.Replace(" ", "");
path = path.TrimEnd(':');
if (File.Exists(path))
{
file[path] = file.ContainsKey(path) ? file[path] : new List<string>();
for (var j = i + 1; j < stringList.Count(); j++)
{
string line2 = stringList[j];
string path2 = line2.Replace("\r\n", "");
path2 = path2.Replace(" ", "");
path2 = path2.TrimEnd(':');
if (File.Exists(path2))
{
i = j - 1;
break;
}
else
{
if (path2.Contains("Verified") | path2.Contains("Algorithm"))
{
var strings = path2.Split(':');
var listValue = strings[1].Trim();
file[path].Add(listValue);
}
}
}
}
}
using (var writer = new StreamWriter(outputdir + "\\output_" +
DateTime.Now.ToString("yyyy_MM_dd_HHmmss") + ".csv"))
{
writer.WriteLine("FilePath,Signature,HashValueSHA1, HashValueSHA2, HashValueMD5, Other");
foreach (var keyvaluepair in file)
{
if (!keyvaluepair.Value.Contains("Unsigned"))
{
var values = String.Join(",", keyvaluepair.Value.Distinct().Select(x => x.ToString()).ToArray());
writer.WriteLine("{0},{1}", keyvaluepair.Key, values);
}
}
}
Current Output looks like below:
Sample output I need as below:
The Dictionary key(string) would hold the file path and the values(List) would hold something like below:
Signed
sha1RSA
md5RSA
md5RSA
Signed
sha1RSA
sha1RSA
sha256RSA
sha256RSA
Please suggest how I can get the required output.

I had a longer answer typed, but I see the problem.
On this line
var values = String.Join(",", keyvaluepair.Value.Distinct().Select(x => x.ToString()).ToArray());
take out Distinct. It looks like you have the correct number of items in each string, but if a list contains multiple blank entries Distinct is eliminating the duplicates. If a list contains two or three blank entries you need all of them. If you delete duplicate blanks your columns won't line up.
Also, when you use Distinct there's no guarantee that items will come back in any particular order. In this case the order is very important so that values end up in the right columns.
So in your example above, even though there's a blank in the third column of the first row, the value from the fourth column ends up in the third column and the blank goes to the end.
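To see the column shift concretely, here is a minimal sketch. The sample values and the DistinctDemo helper names are made up for illustration; the only point is that dropping a repeated blank changes the number of fields in the row.

```csharp
using System;
using System.Linq;

public static class DistinctDemo
{
    // Joins every value, so blanks keep their column positions.
    public static string JoinAll(string[] values)
    {
        return string.Join(",", values);
    }

    // Joins after Distinct, so a repeated blank (or repeated hash name) is dropped.
    public static string JoinDistinct(string[] values)
    {
        return string.Join(",", values.Distinct());
    }
}
```

With the hypothetical input { "Signed", "", "sha256RSA", "" }, JoinAll produces four fields while JoinDistinct produces only three, which is exactly the kind of mismatch that pushes values under the wrong header.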
That will likely fix the immediate problem. I'd recommend not using a List<string> when you're expecting a certain number of values (they need to match up with columns) because a List<string> can contain any number of values.
Instead, try something like this:
public class WhateverThisIs
{
public string Signature { get; set; }
public string HashValueSha1 { get; set; }
public string HashValueSha2 { get; set; }
public string HashValueMd5 { get; set; }
public string Other { get; set; }
}
Then, as a starting point, use Dictionary<string, WhateverThisIs>.
Then the part that outputs lines would look more like this:
var value = keyvaluepair.Value;
var values = String.Join(",", value.Signature, value.HashValueSha1, value.HashValueSha2,
value.HashValueMd5, value.Other);
(And yes, that handles null values: String.Join treats a null element as an empty string.)
If you want to replace nulls or empty values with "N/A" then you'd need a separate function for that, like
string ReplaceNullWithNa(string value)
{
return string.IsNullOrEmpty(value) ? "N/A" : value;
}
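Putting the pieces above together might look roughly like the sketch below. The names SignatureInfo, FromValues and ToCsvRow are hypothetical, and so is the rule that picks a column from the text of each value (a "sha1", "sha256" or "md5" prefix); the question does not say how the values should be classified, so adjust that part to your data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the WhateverThisIs class above.
public class SignatureInfo
{
    public string Signature { get; set; }
    public string HashValueSha1 { get; set; }
    public string HashValueSha2 { get; set; }
    public string HashValueMd5 { get; set; }
    public string Other { get; set; }
}

public static class CsvRows
{
    // Assumed classification rule: choose the column from the value's text.
    public static SignatureInfo FromValues(IEnumerable<string> values)
    {
        var info = new SignatureInfo();
        foreach (var v in values)
        {
            if (v == "Signed" || v == "Unsigned") info.Signature = v;
            else if (v.StartsWith("sha1", StringComparison.OrdinalIgnoreCase)) info.HashValueSha1 = v;
            else if (v.StartsWith("sha256", StringComparison.OrdinalIgnoreCase)) info.HashValueSha2 = v;
            else if (v.StartsWith("md5", StringComparison.OrdinalIgnoreCase)) info.HashValueMd5 = v;
            else info.Other = v;
        }
        return info;
    }

    static string NaIfEmpty(string value)
    {
        return string.IsNullOrEmpty(value) ? "N/A" : value;
    }

    // Always emits six fields, so every value stays under its own header.
    public static string ToCsvRow(string path, SignatureInfo info)
    {
        return string.Join(",", path,
            NaIfEmpty(info.Signature), NaIfEmpty(info.HashValueSha1),
            NaIfEmpty(info.HashValueSha2), NaIfEmpty(info.HashValueMd5),
            NaIfEmpty(info.Other));
    }
}
```

Because each value is assigned to a named property, duplicates simply overwrite the same slot and a missing value becomes "N/A" instead of shifting the rest of the row. Note that this sketch does not quote fields, so a path containing a comma would still need extra handling.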

Related

What is the easiest way to split columns from a txt file

I've been looking around a bit but haven't really found a good example with what I'm struggling right now.
I have a .txt file with a couple of columns as follows:
# ID,YYYYMMDD, COLD,WATER, OD, OP,
52,20120406, 112, 91, 20, 130,
53,20130601, 332, 11, 33, 120,
And I'm reading these from the file into a string[] array.
I'd like to split them into a list
for example
List results, and [0] index will be the first index of the columns
results[0].ID
results[0].COLD
etc..
Now I've been looking around, and came up with the "\\s+" split
but I'm not sure how to go about it since each entry is under another one.
string[] lines = File.ReadAllLines(path);
List<Bus> results = new List<Bus>();
//Bus = class with all the vars in it
//such as Bus.ID, Bus.COLD, Bus.YYYYMMDD
foreach (var line in lines) {
var val = line.Split("\\s+");
//not sure where to go from here
}
Would greatly appreciate any help!
Kind regards, Venomous.
I suggest using Linq, something like this:
List<Bus> results = File
.ReadLines(@"C:\MyFile.txt") // we have no need to read All lines in one go
.Skip(1) // skip file's title
.Select(line => line.Split(','))
.Select(items => new Bus( //TODO: check constructor's syntax
int.Parse(items[0]), // ID
DateTime.ParseExact(items[1].Trim(), "yyyyMMdd", CultureInfo.InvariantCulture), // YYYYMMDD
int.Parse(items[2]))) // COLD
.ToList();
I would do
public class Foo
{
public int Id {get; set;}
public string Date {get; set;}
public double Cold {get; set;}
//...more
}
Then read the file
var l = new List<Foo>();
foreach (var line in lines)
{
var sp = line.Split(',');
var foo = new Foo
{
Id = int.Parse(sp[0].Trim()),
Date = sp[1].Trim(),//or parse the date to a DateTime struct
Cold = double.Parse(sp[2].Trim())
};
l.Add(foo);
}
//now l contains a list filled with Foo objects
I would probably keep a List of properties and use reflection to populate the object, something like this :
var columnMap = new[]{"ID","YYYYMMDD","COLD","WATER","OD","OP"};
var properties = columnMap.Select(typeof(Bus).GetProperty).ToList();
var resultList = new List<Bus>();
foreach(var line in lines)
{
var val = line.Split(',');
var adding = new Bus();
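// Guard against the extra empty field produced by the trailing comma
for(int i=0;i<val.Length && i<properties.Count;i++)
{
// Set only the property that belongs to column i
properties[i].SetValue(adding,val[i].Trim());
}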
resultList.Add(adding);
}
This assumes, however, that all of your properties are strings.
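If the properties are typed, one possible variation is to convert each field to the property's type before setting it. The sketch below uses Convert.ChangeType for that; the LineMapper and BusRow names are hypothetical, and the date column is deliberately kept as a string because Convert.ChangeType cannot parse the yyyyMMdd format on its own.

```csharp
using System;
using System.Globalization;
using System.Reflection;

public static class LineMapper
{
    // Fills the named properties of a new T from one comma-separated line.
    public static T Map<T>(string line, string[] columnNames) where T : new()
    {
        var fields = line.Split(',');
        var item = new T();
        // Stop at the shorter of the two so a trailing comma does no harm.
        for (int i = 0; i < columnNames.Length && i < fields.Length; i++)
        {
            PropertyInfo property = typeof(T).GetProperty(columnNames[i]);
            if (property == null) continue; // no property with that name
            // Throws FormatException if the text does not fit the property type.
            object converted = Convert.ChangeType(
                fields[i].Trim(), property.PropertyType, CultureInfo.InvariantCulture);
            property.SetValue(item, converted, null);
        }
        return item;
    }
}

// Hypothetical row type with typed properties.
public class BusRow
{
    public int ID { get; set; }
    public string YYYYMMDD { get; set; }
    public int COLD { get; set; }
}
```

A call such as LineMapper.Map<BusRow>(line, new[] { "ID", "YYYYMMDD", "COLD" }) would then fill the three properties from the first three fields of a data line.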
Something like this perhaps...
results.Add(new Bus
{
ID = val[0],
YYYYMMDD = val[1],
COLD = val[2],
WATER = val[3],
OD = val[4],
OP = val[5]
});
Keep in mind that all of the values in the val array are still strings at this point. If the properties of Bus are typed, you will need to parse them into the correct types e.g. assume ID is typed as an int...
ID = string.IsNullOrEmpty(val[0]) ? default(int) : int.Parse(val[0]),
Also, if the column headers are actually present in the file in the first line, you'll need to skip/disregard that line and process the rest.
Given that we have the Bus class with all the variables from your textfile:
class Bus
{
public int id;
public DateTime date;
public int cold;
public int water;
public int od;
public int op;
public Bus(int _id, DateTime _date, int _cold, int _water, int _od, int _op)
{
id = _id;
date = _date;
cold = _cold;
water = _water;
od = _od;
op = _op;
}
}
Then we can list them all in the results list like this:
List<Bus> results = new List<Bus>();
foreach (string line in File.ReadAllLines(path))
{
if (line.StartsWith("#"))
continue;
string[] parts = line.Replace(" ", "").Split(','); // Remove all spaces and split at commas
results.Add(new Bus(
int.Parse(parts[0]),
DateTime.ParseExact(parts[1], "yyyyMMdd", CultureInfo.InvariantCulture),
int.Parse(parts[2]),
int.Parse(parts[3]),
int.Parse(parts[4]),
int.Parse(parts[5])
));
}
And access the values as you wish:
results[0].id;
results[0].cold;
//etc.
I hope this helps.

C# Why can't I implicitly convert type 'string' to 'System.Collections.Generic.List<int>'?

I am trying to figure out how to solve the error as stated in the title, which occurs on the bold line within this snippet:
while (textIn.Peek() != -1)
{
string row = textIn.ReadLine();
string[] columns = row.Split('|');
StudentClass studentList = new StudentClass();
studentList.Name = columns[0];
**studentList.Scores = columns[1];**
students.Add(studentList);
}
The previous line of code loads the names just fine since it is not a List within the class I have created, but "Scores" is within a list, however. What modifications would I need to do? These values are supposed to be displayed within a textbox from a text file upon loading the application.
Here is the class in which "Scores" is in (I have highlighted it):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace MyNameSpace
{
//set the class to public
public class StudentClass
{
public StudentClass()
{
this.Scores = new List<int>();
}
public StudentClass (string Name, List<int> Scores)
{
this.Name = Name;
this.Scores = Scores;
}
public string Name
{ get;
set;
}
//initializes the scores
**public List<int> Scores
{ get;
set;
}**
public override string ToString()
{
string names = this.Name;
foreach (int myScore in Scores)
{ names += "|" + myScore.ToString();
}
return names;
}
public int GetScoreTotal()
{
int sum = 0;
foreach (int score in Scores)
{ sum += score;
}
return sum;
}
public int GetScoreCount()
{ return Scores.Count;
}
public void addScore(int Score)
{
Scores.Add(Score);
}
}
}
You can't just assign a string containing a sequence of numbers to a property of type List<int>.
You need to split the string into separate numbers, then parse these substrings to get the integers they represent.
E.g.
var text = "1 2 3 4 5 6";
var numerals = text.Split(' ');
var numbers = numerals.Select(x => int.Parse(x)).ToList();
I.e. in your code replace:
studentList.Scores = columns[1];
with:
studentList.Scores = columns[1].Split(' ').Select(int.Parse).ToList();
(Or your own multi-line, more readable/debuggable equivalent.)
You'll need to modify the parameter passed to Split() according to how the scores are formatted in your column.
You'll also need to add using System.Linq; at the top if you don't already have it.
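A more defensive multi-line version could look like the sketch below. ScoreParser is a hypothetical helper, and silently skipping tokens that are not integers is an assumption; you may prefer to report them instead.

```csharp
using System;
using System.Collections.Generic;

public static class ScoreParser
{
    // Splits the text on the separator and keeps only tokens that parse as integers.
    public static List<int> Parse(string text, char separator)
    {
        var scores = new List<int>();
        var tokens = text.Split(new[] { separator }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var token in tokens)
        {
            int value;
            if (int.TryParse(token.Trim(), out value))
                scores.Add(value);
        }
        return scores;
    }
}
```

In the loop from the question this would be used as studentList.Scores = ScoreParser.Parse(columns[1], ' '); with the separator changed to match the file.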
As for the question itself: how would the compiler know how to convert the string to a list, when there are so many possible string representations of a list? If it tried to do this, it would be an incredibly slow operation.
Fix
To fix your code you could replace your loop with this.
while (textIn.Peek() != -1)
{
string row = textIn.ReadLine();
StudentClass studentList = new StudentClass();
int index = row.IndexOf("|");
//no '|' means there are no scores; keep the student with an empty score list
if (index < 0) {
studentList.Name = row;
students.Add(studentList);
continue;
}
studentList.Name = row.Substring(0, index);
studentList.Scores = row.Substring(index + 1).Split('|').Select(int.Parse).ToList();
students.Add(studentList);
}
There are however a number of problems even with this fix.
For one if you were to add another list delimited by '|' it would become increasingly hard for you to parse using this kind of approach.
I suggest instead that you look at serializing your class(es) with something a little more powerful and generic like Json.Net.
Hope this helps.

How to count occurrences of numbers stored in a file containing multiple delimiters?

This is my input, stored in a file:
50|Carbon|Mercury|P:4;P:00;P:1
90|Oxygen|Mars|P:10;P:4;P:00
90|Serium|Jupiter|P:4;P:16;P:10
85|Hydrogen|Saturn|P:00;P:10;P:4
Now I want to take P:4 from the first row, then P:00, and so on, and count the occurrences of each in every other row, so the expected output would be:
P:4 3(found in 2nd row,3rd row,4th row(last cell))
P:00 2 (found on 2nd row,4th row)
P:1 0 (no occurences are there so)
P:10 1
P:16 0
etc.....
Likewise, I would like to print the occurrences of each and every proportion.
So far I have succeeded in splitting the input row by row and storing it in my class objects like this:
public class Planets
{
//My rest fields
public string ProportionConcat { get; set; }
public List<proportion> proportion { get; set; }
}
public class proportion
{
public int Number { get; set; }
}
I have already filled my planet objects, and my list of planet objects ends up looking like this:
List<Planets> Planets = new List<Planets>();
Planets[0]:
{
Number:50
name: Carbon
object:Mercury
ProportionConcat:P:4;P:00;P:1
proportion[0]:
{
Number:4
},
proportion[1]:
{
Number:00
},
proportion[2]:
{
Number:1
}
}
Etc...
I know I can loop through and perform the search and count, but that would need 2 to 3 loops and the code would be a little messy, so I want a better way to do this.
How do I search for and count every proportion across my list of planet objects?
Well, if you have already parsed the proportions, you can create a new class for the output data:
// Class to store the result
public class Values
{
public int Count; // number of times the proportion occurs
public readonly HashSet<int> Rows = new HashSet<int>(); // row numbers where the proportion occurs
/// <summary>Records one occurrence of the proportion.</summary>
/// <param name="rowNumber">Number of the row in which the proportion occurs</param>
public void Increment(int rowNumber)
{
++Count; // increase the occurrence count
Rows.Add(rowNumber); // remember the row in which it occurred
}
}
And use this code to fill it. I'm not sure it's "messy", and I don't see the need to complicate the code with LINQ. What do you think?
var result = new Dictionary<int, Values>(); // results keyed by proportion; each value holds the occurrence count and the rows where the proportion occurs
for (var i = 0; i < Planets.Count; i++) // we use for instead of foreach for finding row number. i == row number
{
var planet = Planets[i];
foreach (var proportion in planet.proportion)
{
if (!result.ContainsKey(proportion.Number)) // if our result dictionary doesn't contain proportion
result.Add(proportion.Number, new Values()); // we add it to dictionary and initialize our result class for this proportion
result[proportion.Number].Increment(i); // increment count of entries and add row number
}
}
You can use var count = Regex.Matches(lineString, input).Count;. Try this example
var list = new List<string>
{
"50|Carbon|Mercury|P:4;P:00;P:1",
"90|Oxygen|Mars|P:10;P:4;P:00",
"90|Serium|Jupiter|P:4;P:16;P:10",
"85|Hydrogen|Saturn|P:00;P:10;P:4"
};
int totalCount;
var result = CountWords(list, "P:4", out totalCount);
Console.WriteLine("Total Found: {0}", totalCount);
foreach (var foundWords in result)
{
Console.WriteLine(foundWords);
}
public class FoundWords
{
public string LineNumber { get; set; }
public int Found { get; set; }
}
private List<FoundWords> CountWords(List<string> words, string input, out int total)
{
total = 0;
int[] index = {0};
var result = new List<FoundWords>();
foreach (var f in words.Select(word => new FoundWords {Found = Regex.Matches(word, input).Count, LineNumber = "Line Number: " + (index[0] + 1)}))
{
result.Add(f);
total += f.Found;
index[0]++;
}
return result;
}
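One thing to watch with a plain pattern: searching for "P:1" also matches the start of "P:10" and "P:16". A possible workaround, sketched below, is to escape the token and require that no further digit follows it. TokenCounter is a hypothetical helper, and the "no digit after the token" rule is an assumption based on the sample data.

```csharp
using System.Text.RegularExpressions;

public static class TokenCounter
{
    // Counts whole-token matches: the token must not be followed by another digit.
    public static int Count(string line, string token)
    {
        string pattern = Regex.Escape(token) + @"(?!\d)";
        return Regex.Matches(line, pattern).Count;
    }
}
```

For the line "90|Oxygen|Mars|P:10;P:4;P:00", Regex.Matches(line, "P:1").Count reports one match, while TokenCounter.Count(line, "P:1") reports none.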
I made a DotNetFiddle for you here: https://dotnetfiddle.net/z9QwmD
string raw =
@"50|Carbon|Mercury|P:4;P:00;P:1
90|Oxygen|Mars|P:10;P:4;P:00
90|Serium|Jupiter|P:4;P:16;P:10
85|Hydrogen|Saturn|P:00;P:10;P:4";
string[] splits = raw.Split(
new string[] { "|", ";", "\r\n", "\n" }, // include "\r\n" so Windows line endings leave no stray '\r'
StringSplitOptions.None
);
foreach (string p in splits.Where(s => s.ToUpper().StartsWith(("P:"))).Distinct())
{
Console.WriteLine(
string.Format("{0} - {1}",
p,
splits.Count(s => s.ToUpper() == p.ToUpper())
)
);
}
Basically, you can use .Split to split on multiple delimiters at once, it's pretty straightforward. After that, everything is gravy :).
Obviously my code simply outputs the results to the console, but that part is fairly easy to change. Let me know if there's anything you didn't understand.

Reading a delimited text file by line and by delimiter in C#

I would just like to apologise for posting this; there are a lot of questions like this here, but I can't find one that is specific to my case.
I have a list and each item contains a DateTime, int and a string. I have successfully written all list items to a .txt file which is delimited by commas.
For example 09/04/2015 22:12:00,10,Posting on Stackoverflow.
I need to loop through the file line by line, with each line's fields running from index 0 through index 2. At the moment I am able to call index 3, which returns the DateTime of the second list item in the text file. The file is written line by line, but I am struggling to read it back with the delimiters and line breaks.
I am sorry if I am not making much sense, I will appreciate any help, thank you.
string[] lines = File.ReadAllLines( filename );
foreach ( string line in lines )
{
string[] col = line.Split(',');
// process col[0], col[1], col[2]
}
You can read all the lines at once via var lines = File.ReadAllLines(pathToFile);. Then you can split each line into an array of fields via:
foreach (var line in lines) {
String[] fields = line.Split(',');
}
If you don't have any stray commas in your file and the only commas are true delimiters, this will mean that fields will always be a 3 element array, with each of your fields in succession.
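If the last field can contain commas of its own, one possible workaround is the Split overload that takes a maximum number of substrings, so that everything after the second comma stays in the text field. This is only a sketch: EntryParser is a hypothetical name, and it assumes the free-text field is always the last one.

```csharp
using System;

public static class EntryParser
{
    // Splits into at most three fields; commas after the second one stay in the text.
    public static string[] SplitFields(string line)
    {
        return line.Split(new[] { ',' }, 3);
    }
}
```

For a line such as "09/04/2015 22:12:00,10,Posting, on Stackoverflow" this returns three fields, and the third one keeps its embedded comma.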
Alternatively, you can do the following:
public static List<Values> GetValues(string path)
{
List<Values> valuesCollection = new List<Values>();
using (var f = new StreamReader(path))
{
string line = string.Empty;
while ((line = f.ReadLine()) != null)
{
var parts = line.Split(',');
valuesCollection.Add(new Values(Convert.ToDateTime(parts[0]), Convert.ToInt32(parts[1]), parts[2]));
}
}
return valuesCollection;
}
class Values
{
public DateTime Date { get; set; }
public int IntValue { get; set; }
public string StringValue { get; set; }
public Values()
{
}
public Values(DateTime date, int intValue, string stringValue)
{
this.Date = date;
this.IntValue = intValue;
this.StringValue = stringValue;
}
}
You could then iterate through the list or collection of values and access each object and its properties. For example:
foreach (var values in GetValues(path))
{
Console.WriteLine(values.Date);
Console.WriteLine(values.IntValue);
Console.WriteLine(values.StringValue);
}

Is there a code pattern for mapping a CSV with random column order to defined properties?

I have a CSV that is delivered to my application from various sources. The CSV will always have the same number columns and the header values for the columns will always be the same.
However, the columns may not always be in the same order.
Day 1 CSV may look like this
ID,FirstName,LastName,Email
1,John,Lennon,jlennon@applerecords.com
2,Paul,McCartney,macca@applerecords.com
Day 2 CSV may look like this
Email,FirstName,ID,LastName
resident1@friarpark.com,George,3,Harrison
ringo@allstarrband.com,Ringo,4,Starr
I want to read in the header row for each file and have a simple mechanism for associating each "column" of data with the associated property I have defined in my class.
I know I can use selection statements to figure it out, but that seems like a "bad" way to handle it.
Is there a simple way to map "columns" to properties using a dictionary or class at runtime?
Use a Dictionary to map column heading text to column position.
Hard-code mapping of column heading text to object property.
Example:
// Parse first line of text to add column heading strings and positions to your dictionary
...
// Parse data row into an array, indexed by column position
...
// Assign data to object properties
x.ID = row[myDictionary["ID"]];
x.FirstName = row[myDictionary["FirstName"]];
...
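Filled in, the outline above might look like the sketch below. The Person class and the HeaderMappedCsv helper are hypothetical names, and the sketch assumes plain comma-separated values with no quoted fields.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical target class; the property names match the column headings.
public class Person
{
    public string ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

public static class HeaderMappedCsv
{
    public static List<Person> Parse(IEnumerable<string> lines)
    {
        var people = new List<Person>();
        Dictionary<string, int> position = null;
        foreach (var line in lines)
        {
            var row = line.Split(',');
            if (position == null)
            {
                // First line: remember where each heading sits.
                position = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
                for (int i = 0; i < row.Length; i++)
                    position[row[i].Trim()] = i;
                continue;
            }
            // Data line: look each column up by heading, not by fixed index.
            people.Add(new Person
            {
                ID = row[position["ID"]],
                FirstName = row[position["FirstName"]],
                LastName = row[position["LastName"]],
                Email = row[position["Email"]]
            });
        }
        return people;
    }
}
```

Because the dictionary is rebuilt from each file's own header line, the Day 1 and Day 2 files from the question can go through the same method unchanged. A missing heading would raise a KeyNotFoundException, which may or may not be the behaviour you want.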
You don't need a design pattern for this purpose.
http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader
I have used this reader and it is pretty good. It lets you access fields by name, such as row["firstname"] or row["id"], which you can parse to create your objects.
I have parsed both CSV files using Microsoft.VisualBasic.FileIO.TextFieldParser. I have populated DataTable after parsing both csv files:
DataTable dt;
private void button1_Click(object sender, EventArgs e)
{
dt = new DataTable();
ParseCSVFile("day1.csv");
ParseCSVFile("day2.csv");
dataGridView1.DataSource = dt;
}
private void ParseCSVFile(string sFileName)
{
var dIndex = new Dictionary<string, int>();
using (TextFieldParser csvReader = new TextFieldParser(sFileName))
{
csvReader.Delimiters = new string[] { "," };
var colFields = csvReader.ReadFields();
for (int i = 0; i < colFields.Length; i++)
{
string sColField = colFields[i];
if (sColField != string.Empty)
{
dIndex.Add(sColField, i);
if (!dt.Columns.Contains(sColField))
dt.Columns.Add(sColField);
}
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
if (fieldData.Length > 0)
{
DataRow dr = dt.NewRow();
foreach (var kvp in dIndex)
{
int iVal = kvp.Value;
if (iVal < fieldData.Length)
dr[kvp.Key] = fieldData[iVal];
}
dt.Rows.Add(dr);
}
}
}
}
day1.csv and day2.csv as mentioned in the question.
Here is what the output in dataGridView1 looks like:
Here is a simple generic method that will take a CSV file (broken into string[]) and create from it a list of objects. The assumption is that the object properties will have the same name as the headers. If this is not the case you might look into the DataMemberAttribute property and modify accordingly.
private static List<T> ProcessCSVFile<T>(string[] lines)
{
List<T> list = new List<T>();
Type type = typeof(T);
string[] headerArray = lines[0].Split(new char[] { ',' });
PropertyInfo[] properties = new PropertyInfo[headerArray.Length];
for (int prop = 0; prop < properties.Length; prop++)
{
properties[prop] = type.GetProperty(headerArray[prop]);
}
for (int count = 1; count < lines.Length; count++)
{
string[] valueArray = lines[count].Split(new char[] { ',' });
T t = Activator.CreateInstance<T>();
list.Add(t);
for (int value = 0; value < valueArray.Length; value++)
{
properties[value].SetValue(t, valueArray[value], null);
}
}
return list;
}
Now, in order to use it just pass your file formatted as an array of strings. Let's say the class you want to read into looks like this:
class Music
{
public string ID { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public string Email { get; set; }
}
So you can call this:
List<Music> newlist = ProcessCSVFile<Music>(list.ToArray());
...and everything gets done with one call.
