WPF list filtering - c#

I am new to WPF so this is probably an easy question. I have an app that reads some words from a csv file and stores them in a list of strings. What I am trying to do is parametise this list to show the most popular words in my list. So in my UI I want to have a text box which when I enter a number e.g. 5 would filter the original list leaving only the 5 most popular (frequent) words in the new list. Can anyone assist with this final step? Thanks -
public class VM
{
public VM()
{
Words = LoadWords(fileList);
}
public IEnumerable<string> Words { get; private set; }
string[] fileList = Directory.GetFiles(#"Z:\My Documents\", "*.csv");
private static IEnumerable<string> LoadWords(String[] fileList)
{
List<String> words = new List<String>();
//
if (fileList.Length == 1)
{
try
{
foreach (String line in File.ReadAllLines(fileList[0]))
{
string[] rows = line.Split(',');
words.AddRange(rows);
}
}
catch (Exception ex)
{
System.Windows.MessageBox.Show(ex.Message, "Problem!");
}
}
else
{
System.Windows.MessageBox.Show("Please ensure that you have ONE read file in the source folder.", "Problem!");
}
return words;
}
}

A LINQ query that groups by the word and orders by the count of that word descending should do it. Try this
private static IEnumerable<string> GetTopWords(int Count)
{
var popularWords = (from w in words
group w by w
into grp
orderby grp.Count() descending
select grp.Key).Take(Count).ToList();
return popularWords;
}

You could use CollectionViewSource.GetDefaultView(viewModel.Words), which returns ICollectionView.
ICollectionView exposes Filter property of type Predicate<object>, that you could involve for filtering.
So the common scenario looks like:
ViewModel exposes property PopularCount, that is binded to some textbox in View.
ViewModel listens for PopularCount property's changing.
When notification occured, model obtains ICollectionView for viewModel.Words collection and set up Filter property.
You could find working sample of Filter property usage here. If you get stuck with code, let me know.

Instead of reading all the words into the list and then sorting it based on the frequency, a cleaner approach would be to create a custom class MyWord that stores the word and the frequency. While reading the file, the frequency of the word can be incremented. The class can implement IComparable<T> to compare the words based on the frequency.
public class MyWord : IComparable<MyWord>
{
public MyWord(string word)
{
this.Word = word;
this.Frequency = 0;
}
public MyWord(string word, int frequency)
{
this.Word = word;
this.Frequency = frequency;
}
public string Word { get; private set;}
public int Frequency { get; private set;}
public void IncrementFrequency()
{
this.Frequency++;
}
public void DecrementFrequency()
{
this.Frequency--;
}
public int CompareTo(MyWord secondWord)
{
return this.Frequency.CompareTo(secondWord.Frequency);
}
}
The main class VM would have these members,
public IEnumerable<MyWord> Words { get; private set; }
private void ShowMostPopularWords(int numberOfWords)
{
SortMyWordsDescending();
listBox1.Items.Clear();
for (int i = 0; i < numberOfWords; i++ )
{
listBox1.Items.Add(this.Words.ElementAt(i).Word + "|" + this.Words.ElementAt(i).Frequency);
}
}
And the call to ShowMostPopularWords()
private void Button_Click(object sender, RoutedEventArgs e)
{
int numberOfWords;
if(Int32.TryParse(textBox1.Text, NumberStyles.Integer, CultureInfo.CurrentUICulture, out numberOfWords))
{
ShowMostPopularWords(numberOfWords);
}
}

I'm not sure if grouping and ordering of the 'words' list is what you want but if yes this could be a way of doing it:
int topN = 3;
List<string> topNWords = new List<string>();
string[] words = new string[] {
"word5",
"word1",
"word1",
"word1",
"word2",
"word2",
"word2",
"word3",
"word3",
"word4",
"word5",
"word6",
};
// [linq query][1]
var wordGroups = from s in words
group s by s into g
select new { Count = g.Count(), Word = g.Key };
for (int i = 0; i < Math.Min(topN, wordGroups.Count()); i++)
{
// (g) => g.Count is a [lambda expression][2]
// .OrderBy and Reverse are IEnumerable extension methods
var element = wordGroups.OrderBy((g) => g.Count).Reverse().ElementAt(i);
topNWords.Add(element.Count + " - " + element.Word);
}
Thsi could be made much shorter by using ordering in the linq select clause but I wished to introduce you to inline lambdas and ienumerable extensions too.
The short version could be:
topNWords = (from s in words
group s by s
into g
orderby g.Count() descending
select g.Key).Take(Math.Min(topN, g.Count()).ToList();

Related

How to Get Groups of Files from GetFiles()

I have to process files everyday. The files are named like so:
fg1a.mmddyyyy
fg1b.mmddyyyy
fg1c.mmddyyyy
fg2a.mmddyyyy
fg2b.mmddyyyy
fg2c.mmddyyyy
fg2d.mmddyyyy
If the entire file group is there for a particular date, I can process it. If it isn't there, I should not process it. I may have several partial file groups that run over several days. So when I have fg1a.12062017, fg1b.12062017 and fg1c.12062017, I can process that group (fg1) only.
Here is my code so far. It doesn't work because I can't figure out how to get only the full groups to add to the the processing file list.
fileList = Directory.GetFiles(#"c:\temp\");
string[] fileGroup1 = { "FG1A", "FG1B", "FG1C" }; // THIS IS A FULL GROUP
string[] fileGroup2 = { "FG2A", "FG2B", "FG2C", "FG2D" };
List<string> fileDates = new List<string>();
List<string> procFileList;
// get a list of file dates
foreach (string fn in fileList)
{
string dateString = fn.Substring(fn.IndexOf('.'), 9);
if (!fileDates.Contains(dateString))
{
fileDates.Add(dateString);
}
}
bool allFiles = true;
foreach (string fg in fileGroup1)
{
foreach (string fd in fileDates)
{
string finder = fg + fd;
bool foundIt = false;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
foundIt = true;
}
}
if (!foundIt)
{
allFiles = false;
}
else
{
foreach (string fn in fileList)
{
procFileList.Add(fn);
}
}
}
}
foreach (string fg in fileGroup2)
{
foreach (string fd in fileDates)
{
string finder = fg + fd;
bool foundIt = false;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
foundIt = true;
}
}
if (!foundIt)
{
allFiles = false;
}
else
{
foreach (string fn in fileList)
{
procFileList.Add(fn);
}
}
}
}
Any help or advice would be greatly appreciated.
Because it can sometimes get messy dealing with multiple lists, groupings, and parsing file names, I would start by creating a class that represents a FileGroupItem. This class would have a Parse method that takes in a file path, and then has properties that represent the group part and date part of the file name, as well as the full path to the file:
public class FileGroupItem
{
public string DatePart { get; set; }
public string GroupName { get; set; }
public string FilePath { get; set; }
public static FileGroupItem Parse(string filePath)
{
if (string.IsNullOrWhiteSpace(filePath)) return null;
// Split the file name on the '.' character to get the group and date parts
var fileParts = Path.GetFileName(filePath).Split('.');
if (fileParts.Length != 2) return null;
return new FileGroupItem
{
GroupName = fileParts[0],
DatePart = fileParts[1],
FilePath = filePath
};
}
}
Then, in my main code, I would create a list of the file group definitions, and then populate a list of FileGroupItems from the directory we're scanning. After that, we can determine if any file group definition is complete by comparing it's items (in a case-insensitive way) to the actual FileGroupItems we found in the directory (after first grouping the FileGroupItems by it's DatePart). If the intersection of these two lists has the same number of items as the file group definition, then it's complete and we can process that group.
Maybe it will make more sense in code:
private static void Main()
{
var scanDirectory = #"f:\public\temp\";
var processedDirectory = #"f:\public\temp2\";
// The lists that define a complete group
var fileGroupDefinitions = new List<List<string>>
{
new List<string> {"FG1A", "FG1B", "FG1C"},
new List<string> {"FG2A", "FG2B", "FG2C", "FG2D"}
};
// Populate a list of FileGroupItems from the files
// in our directory, and group them on the DatePart
var fileGroups = Directory.EnumerateFiles(scanDirectory)
.Select(FileGroupItem.Parse)
.GroupBy(f => f.DatePart);
// Now go through each group and compare the items
// for that date with our file group definitions
foreach (var fileGroup in fileGroups)
{
foreach (var fileGroupDefinition in fileGroupDefinitions)
{
// Get the intersection of the group definition and this file group
var intersection = fileGroup
.Where(f => fileGroupDefinition.Contains(
f.GroupName, StringComparer.OrdinalIgnoreCase))
.ToList();
// If all the items in the definition are there, then process the files
if (intersection.Count == fileGroupDefinition.Count)
{
foreach (var fileGroupItem in intersection)
{
Console.WriteLine($"Processing file: {fileGroupItem.FilePath}");
// Move the file to the processed directory
File.Move(fileGroupItem.FilePath,
Path.Combine(processedDirectory,
Path.GetFileName(fileGroupItem.FilePath)));
}
}
}
}
Console.WriteLine("\nDone!\nPress any key to exit...");
Console.ReadKey();
}
I think you could simplify your algorithm so you just have file groups as a prefix and a number of files to expect, fg1 is 3 files for a given date
I think your code to find the distinct dates present is a good idea, though you should use a hash set rather than a list, if you occasionally expect a large number of dates.. ("Valentine's Day?" - Ed)
Then you just need to work on the other loop that does the checking. An algorithm like this
//make a new Dictionary<string,int> for the filegroup prefixes and their counts3
//eg myDict["fg1"] = 3; myDict["fg2"] = 4;
//list the files in the directory, into an array of fileinfo objects
//see the DirectoryInfo.GetFiles method
//foreach string d in the list of dates
//foreach string fgKey in myDict.Keys - the list of group prefixes
//use a bit of Linq to get all the fileinfos with a
//name starting with the group and ending with the date
var grplist = myfileinfos.Where(fi => fi.Name.StartsWith(fg) && fi.Name.EndsWith(d));
//if the grplist.Count == the filegroup count ( myDict[fgKey] )
//then send every file in grplist for processing
//remember that grplist is a collection of fileinfo objects,
//if your processing method takes a string filename, use fileinfo.Fullname
Putting your file groupings into one dictionary will make things a lot easier than having them as x separate arrays
I haven't written all the code for you, but I've comment sketched the algorithm, and I've put in some of the more awkward bits like the link, dictionary declaration and how to fill it.. have a go at fleshing it out with code, ask any questions in a comment on this post
First, create an array of the groups to make processing easier:
var fileGroups = new[] {
new[] { "FG1A", "FG1B", "FG1C" },
new[] { "FG2A", "FG2B", "FG2C", "FG2D" }
};
Then you can convert the array into a Dictionary to map each name back to its group:
var fileGroupMap = fileGroups.SelectMany(g => g.Select(f => new { key = f, group = g })).ToDictionary(g => g.key, g => g.group);
Then, preprocess the files you get from the directory:
var fileList = from fname in Directory.GetFiles(...)
select new {
fname,
fdate = Path.GetExtension(fname),
ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
};
Now you can take your fileList and group by date and group, and then filter to just completed groups:
var profFileList = (from file in fileList
group file by new { file.fdate, fgroup = fileGroupMap[file.ffilename] } into fng
where fng.Key.fgroup.All(f => fng.Select(fn => fn.ffilename).Contains(f))
from fn in fng
select fn.fname).ToList();
Since you didn't preserve the groups, I flattened the groups at the end of the query into just a list of files to be processed. If you needed, you could keep them in groups and process the groups instead.
Note: If a file exists that belongs to no group, you will get an error from the lookup in fileGroupMap. If that is a possiblity you can filter the fileList to just known names as follows:
var fileList = from fname in GetFiles
let ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
where fileGroupMap.Keys.Contains(ffilename)
select new {
fname,
fdate = Path.GetExtension(fname),
ffilename
};
Also note that having a name in multiple groups will cause an error in the creation of fileGroupMap. If that is a possibility, the queries would become more complex and have to be written differently.
Here is a simple class
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string[] filenames = { "fg1a.12012017", "fg1b.12012017", "fg1c.12012017", "fg2a.12012017", "fg2b.12012017", "fg2c.12012017", "fg2d.12012017" };
new SplitFileName(filenames);
List<List<SplitFileName>> results = SplitFileName.GetGroups();
}
}
public class SplitFileName
{
public static List<SplitFileName> names = new List<SplitFileName>();
string filename { get; set; }
string prefix { get; set; }
string letter { get; set; }
DateTime date { get; set; }
public SplitFileName() { }
public SplitFileName(string[] splitNames)
{
foreach(string name in splitNames)
{
SplitFileName splitName = new SplitFileName();
names.Add(splitName);
splitName.filename = name;
string[] splitArray = name.Split(new char[] { '.' });
splitName.date = DateTime.ParseExact(splitArray[1],"MMddyyyy", System.Globalization.CultureInfo.InvariantCulture);
splitName.prefix = splitArray[0].Substring(0, splitArray[0].Length - 1);
splitName.letter = splitArray[0].Substring(splitArray[0].Length - 1,1);
}
}
public static List<List<SplitFileName>> GetGroups()
{
return names.OrderBy(x => x.letter).GroupBy(x => new { date = x.date, prefix = x.prefix })
.Where(x => string.Join(",",x.Select(y => y.letter)) == "a,b,c,d")
.Select(x => x.ToList())
.ToList();
}
}
}
With everyone's help, I solved it too. This is what I'm going with because it's the most maintainable for me but the solutions were so smart!!! Thanks everyone for your help.
private void CheckFiles()
{
var fileGroups = new[] {
new [] { "FG1A", "FG1B", "FG1C", "FG1D" },
new[] { "FG2A", "FG2B", "FG2C", "FG2D", "FG2E" } };
List<string> fileDates = new List<string>();
List<string> pfiles = new List<string>();
// get a list of file dates
foreach (string fn in fileList)
{
string dateString = fn.Substring(fn.IndexOf('.'), 9);
if (!fileDates.Contains(dateString))
{
fileDates.Add(dateString);
}
}
// check if a date has all the files
foreach (string fd in fileDates)
{
int fgCount = 0;
// for each file group
foreach (Array masterfg in fileGroups)
{
foreach (string fg in masterfg)
{
// see if all the files are there
bool foundIt = false;
string finder = fg + fd;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
pfiles.Add(fn);
}
}
fgCount++;
}
if (fgCount == pfiles.Count())
{
foreach (string fn in pfiles)
{
procFileList.Add(fn);
}
pfiles.Clear();
}
else
{
pfiles.Clear();
}
}
}
return;
}

How to compare two csv files by 2 columns?

I have 2 csv files
1.csv
spain;russia;japan
italy;russia;france
2.csv
spain;russia;japan
india;iran;pakistan
I read both files and add data to lists
var lst1= File.ReadAllLines("1.csv").ToList();
var lst2= File.ReadAllLines("2.csv").ToList();
Then I find all unique strings from both lists and add it to result lists
var rezList = lst1.Except(lst2).Union(lst2.Except(lst1)).ToList();
rezlist contains this data
[0] = "italy;russia;france"
[1] = "india;iran;pakistan"
At now I want to compare, make except and union by second and third column in all rows.
1.csv
spain;russia;japan
italy;russia;france
2.csv
spain;russia;japan
india;iran;pakistan
I think I need to split all rows by symbol ';' and make all 3 operations (except, distinct and union) but cannot understand how.
rezlist must contains
india;iran;pakistan
I added class
class StringLengthEqualityComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
...
}
public int GetHashCode(string obj)
{
...
}
}
StringLengthEqualityComparer stringLengthComparer = new StringLengthEqualityComparer();
var rezList = lst1.Except(lst2,stringLengthComparer ).Union(lst2.Except(lst1,stringLengthComparer),stringLengthComparer).ToList();
Your question is not very clear: for instance, is india;iran;pakistan the desired result primarily because russia is at element[1]? Isn't it also included because element [2] pakistan does not match france and japan? Even though thats unclear, I assume the desired result comes from either situation.
Then there is this: find all unique string from both lists which changes the nature dramatically. So, I take it that the desired results are because "iran" appears in column[1] no where else in column[1] in either file and even if it did, that row would still be unique due to "pakistan" in col[2].
Also note that a data sample of 2 leaves room for a fair amount of error.
Trying to do it in one step makes it very confusing. Since eliminating dupes found in 1.CSV is pretty easy, do it first:
// parse "1.CSV"
List<string[]> lst1 = File.ReadAllLines(#"C:\Temp\1.csv").
Select(line => line.Split(';')).
ToList();
// parse "2.CSV"
List<string[]> lst2 = File.ReadAllLines(#"C:\Temp\2.csv").
Select(line => line.Split(';')).
ToList();
// extracting once speeds things up in the next step
// and leaves open the possibility of iterating in a method
List<List<string>> tgts = new List<List<string>>();
tgts.Add(lst1.Select(z => z[1]).Distinct().ToList());
tgts.Add(lst1.Select(z => z[2]).Distinct().ToList());
var tmpLst = lst2.Where(x => !tgts[0].Contains(x[1]) ||
!tgts[1].Contains(x[2])).
ToList();
That results in the items which are not in 1.CSV (no matching text in Col[1] nor Col[2]). If that is really all you need, you are done.
Getting unique rows within 2.CSV is trickier because you have to actually count the number of times each Col[1] item occurs to see if it is unique; then repeat for Col[2]. This uses GroupBy:
var unique = tmpLst.
GroupBy(g => g[1], (key, values) =>
new GroupItem(key,
values.ToArray()[0],
values.Count())
).Where(q => q.Count == 1).
GroupBy(g => g.Data[2], (key, values) => new
{
Item = string.Join(";", values.ToArray()[0]),
Count = values.Count()
}
).Where(q => q.Count == 1).Select(s => s.Item).
ToList();
The GroupItem class is trivial:
class GroupItem
{
public string Item { set; get; } // debug aide
public string[] Data { set; get; }
public int Count { set; get; }
public GroupItem(string n, string[] d, int c)
{
Item = n;
Data = d;
Count = c;
}
public override string ToString()
{
return string.Join(";", Data);
}
}
It starts with tmpList, gets the rows with a unique element at [1]. It uses a class for storage since at this point we need the array data for further review.
The second GroupBy acts on those results, this time looking at col[2]. Finally, it selects the joined string data.
Results
Using 50,000 random items in File1 (1.3 MB), 15,000 in File2 (390 kb). There were no naturally occurring unique items, so I manually made 8 unique in 2.CSV and copied 2 of them into 1.CSV. The copies in 1.CSV should eliminate 2 if the 8 unique rows in 2.CSV making the expected result 6 unique rows:
NepalX and ItalyX were the repeats in both files and they correctly eliminated each other.
With each step it is scanning and working with less and less data, which seems to make it pretty fast for 65,000 rows / 130,000 data elements.
your GetHashCode()-Method in EqualityComparer are buggy. Fixed version:
public int GetHashCode(string obj)
{
return obj.Split(';')[1].GetHashCode();
}
now the result are correct:
// one result: "india;iran;pakistan"
btw. "StringLengthEqualityComparer"is not a good name ;-)
private void GetUnion(List<string> lst1, List<string> lst2)
{
List<string> lstUnion = new List<string>();
foreach (string value in lst1)
{
string valueColumn1 = value.Split(';')[0];
string valueColumn2 = value.Split(';')[1];
string valueColumn3 = value.Split(';')[2];
string result = lst2.FirstOrDefault(s => s.Contains(";" + valueColumn2 + ";" + valueColumn3));
if (result != null)
{
if (!lstUnion.Contains(result))
{
lstUnion.Add(result);
}
}
}
}
class Program
{
static void Main(string[] args)
{
var lst1 = File.ReadLines(#"D:\test\1.csv").Select(x => new StringWrapper(x)).ToList();
var lst2 = File.ReadLines(#"D:\test\2.csv").Select(x => new StringWrapper(x));
var set = new HashSet<StringWrapper>(lst1);
set.SymmetricExceptWith(lst2);
foreach (var x in set)
{
Console.WriteLine(x.Value);
}
}
}
struct StringWrapper : IEquatable<StringWrapper>
{
public string Value { get; }
private readonly string _comparand0;
private readonly string _comparand14;
public StringWrapper(string value)
{
Value = value;
var split = value.Split(';');
_comparand0 = split[0];
_comparand14 = split[14];
}
public bool Equals(StringWrapper other)
{
return string.Equals(_comparand0, other._comparand0, StringComparison.OrdinalIgnoreCase)
&& string.Equals(_comparand14, other._comparand14, StringComparison.OrdinalIgnoreCase);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
return obj is StringWrapper && Equals((StringWrapper) obj);
}
public override int GetHashCode()
{
unchecked
{
return ((_comparand0 != null ? StringComparer.OrdinalIgnoreCase.GetHashCode(_comparand0) : 0)*397)
^ (_comparand14 != null ? StringComparer.OrdinalIgnoreCase.GetHashCode(_comparand14) : 0);
}
}
}

How to count occurences of number stored in file containing multiple delimeters?

This is my input store in file:
50|Carbon|Mercury|P:4;P:00;P:1
90|Oxygen|Mars|P:10;P:4;P:00
90|Serium|Jupiter|P:4;P:16;P:10
85|Hydrogen|Saturn|P:00;P:10;P:4
Now i will take my first row P:4 and then next P:00 and then next like wise and want to count occurence in every other row so expected output will be:
P:4 3(found in 2nd row,3rd row,4th row(last cell))
P:00 2 (found on 2nd row,4th row)
P:1 0 (no occurences are there so)
P:10 1
P:16 0
etc.....
Like wise i would like to print occurence of each and every proportion.
So far i am successfull in splitting row by row and storing in my class file object like this:
public class Planets
{
//My rest fields
public string ProportionConcat { get; set; }
public List<proportion> proportion { get; set; }
}
public class proportion
{
public int Number { get; set; }
}
I have already filled my planet object like below and Finally my List of planet object data is like this:
List<Planets> Planets = new List<Planets>();
Planets[0]:
{
Number:50
name: Carbon
object:Mercury
ProportionConcat:P:4;P:00;P:1
proportion[0]:
{
Number:4
},
proportion[1]:
{
Number:00
},
proportion[2]:
{
Number:1
}
}
Etc...
I know i can loop through and perform search and count but then 2 to 3 loops will be required and code will be little messy so i want some better code to perform this.
Now how do i search each and count every other proportion in my planet List object??
Well, if you have parsed proportions, you can create new struct for output data:
// Class to storage result
public class Values
{
public int Count; // count of proportion entry.
public readonly HashSet<int> Rows = new HashSet<int>(); //list with rows numbers.
/// <summary> Add new proportion</summary>
/// <param name="rowNumber">Number of row, where proportion entries</param>
public void Increment(int rowNumber)
{
++Count; // increase count of proportions entries
Rows.Add(rowNumber); // add number of row, where proportion entry
}
}
And use this code to fill it. I'm not sure it's "messy" and don't see necessity to complicate the code with LINQ. What do you think about it?
var result = new Dictionary<int, Values>(); // create dictionary, where we will storage our results. keys is proportion. values - information about how often this proportion entries and rows, where this proportion entry
for (var i = 0; i < Planets.Count; i++) // we use for instead of foreach for finding row number. i == row number
{
var planet = Planets[i];
foreach (var proportion in planet.proportion)
{
if (!result.ContainsKey(proportion.Number)) // if our result dictionary doesn't contain proportion
result.Add(proportion.Number, new Values()); // we add it to dictionary and initialize our result class for this proportion
result[proportion.Number].Increment(i); // increment count of entries and add row number
}
}
You can use var count = Regex.Matches(lineString, input).Count;. Try this example
var list = new List<string>
{
"50|Carbon|Mercury|P:4;P:00;P:1",
"90|Oxygen|Mars|P:10;P:4;P:00",
"90|Serium|Jupiter|P:4;P:16;P:10",
"85|Hydrogen|Saturn|P:00;P:10;P:4"
};
int totalCount;
var result = CountWords(list, "P:4", out totalCount);
Console.WriteLine("Total Found: {0}", totalCount);
foreach (var foundWords in result)
{
Console.WriteLine(foundWords);
}
public class FoundWords
{
public string LineNumber { get; set; }
public int Found { get; set; }
}
private List<FoundWords> CountWords(List<string> words, string input, out int total)
{
total = 0;
int[] index = {0};
var result = new List<FoundWords>();
foreach (var f in words.Select(word => new FoundWords {Found = Regex.Matches(word, input).Count, LineNumber = "Line Number: " + index[0] + 1}))
{
result.Add(f);
total += f.Found;
index[0]++;
}
return result;
}
I made a DotNetFiddle for you here: https://dotnetfiddle.net/z9QwmD
string raw =
#"50|Carbon|Mercury|P:4;P:00;P:1
90|Oxygen|Mars|P:10;P:4;P:00
90|Serium|Jupiter|P:4;P:16;P:10
85|Hydrogen|Saturn|P:00;P:10;P:4";
string[] splits = raw.Split(
new string[] { "|", ";", "\n" },
StringSplitOptions.None
);
foreach (string p in splits.Where(s => s.ToUpper().StartsWith(("P:"))).Distinct())
{
Console.WriteLine(
string.Format("{0} - {1}",
p,
splits.Count(s => s.ToUpper() == p.ToUpper())
)
);
}
Basically, you can use .Split to split on multiple delimiters at once, it's pretty straightforward. After that, everything is gravy :).
Obviously my code simply outputs the results to the console, but that part is fairly easy to change. Let me know if there's anything you didn't understand.

CSV-file values to List<List>

I have a CSV file that I want some values from. One problem is that I don't know how many columns the file has. The number can be different every time I get a new CSV file. It will always have columns and rows with values. I will get it from a normal excel-file.
I want the method to return a List<List>.
ListA(FirstName, LastName, PhoneNumber... and so on) here I don't know how many items ListA will have. It can be different every time.
Inside ListA I want lists of persons like this:
ListA[FirstName] = List1(Zlatan, Lionel, Anders.....)
ListA[LastName] = List2(Ibrahimovic, Messi, Svensson.....) .. and so on.
You could create a class Person
class person {
private string FirstName;
private string LastName;
// others
}
Open the File and split each row in the file with the String.Split()-Method then convert each value and create Objects, which you can add to a List.
List<Person> persons = new List<Person>();
persons.Add(personFromFile);
Thats a pretty short solution but it works
Edit: Variable Fields per Row
If thats the case you could use a List<string[]> stringArraylist; and then add the results of the String.Split()-Method to it.
List<string[]> stringArraylist;
stringArraylist = new List<string[]>();
stringArraylist.Add("Andrew;Pearson;...;lololo;".Split(';'));
Is that more of what you wanted?
There are a lot of questions on SO that deal with parsing CSV files. See here for one: Reading CSV files in C#. I am fairly certain there are some solutions built in to .NET, though I can't recall what they are at the moment. (#ZoharPeled suggested TextFieldParser)
Most of the parsing solutions with give you a collection of rows where each item is a collection of columns. So assuming you have something like a IEnumerable<IList<string>>, you could create a class and use LINQ queries to get what you need:
public class CSVColumns
{
public IEnumerable<IList<string>> CSVContents { get; private set; }
public CSVColumns(IEnumerable<IList<string>> csvcontents)
{
this.CSVContents = csvcontents;
}
public List<string> FirstNames
{
get { return GetColumn("FirstName"); }
}
public List<string> LastNames
{
get { return GetColumn("LastName"); }
}
/// <summary>
/// Gets a collection of the column data based on the name of the column
/// from the header row.
/// </summary>
public List<string> GetColumn(string columnname)
{
//Get the index of the column with the name
var firstrow = CSVContents.ElementAtOrDefault(0);
if (firstrow != null)
{
int index = -1;
foreach (string s in firstrow)
{
index++;
if (s == columnname)
{
return GetColumn(index, true);
}
}
}
return new List<string>();
}
/// <summary>
/// Gets all items from a specific column number but skips the
/// header row if needed.
/// </summary>
public List<string> GetColumn(int index, bool hasHeaderRow = true)
{
IEnumerable<IList<string>> columns = CSVContents;
if (hasHeaderRow)
columns = CSVContents.Skip(1);
return columns.Select(list =>
{
try
{
return list[index];
}
catch (IndexOutOfRangeException ex)
{
return "";
}
}
).ToList();
}
}
I finally got a solution and it's working for me. My friend made it so all creed to him. No user here on stackoverflow so I post it instead.
private List<Attributes> LoadCsv()
{
string filename = #"C:\Desktop\demo.csv";
// Get the file's text.
string whole_file = System.IO.File.ReadAllText(filename);
// Split into lines.
whole_file = whole_file.Replace('\n', '\r');
string[] lines = whole_file.Split(new char[] { '\r' },
StringSplitOptions.RemoveEmptyEntries);
// See how many rows and columns there are.
int num_rows = lines.Length;
int num_cols = lines[0].Split(';').Length;
// Allocate the data array.
string[,] values = new string[num_rows, num_cols];
// Load the array.
for (int r = 0; r < num_rows; r++)
{
string[] line_r = lines[r].Split(';');
for (int c = 0; c < num_cols; c++)
{
values[r, c] = line_r[c];
}
}
var attr = new List<Attributes>();
for (var r = 0; r < num_rows; r++)
{
if (r == 0)
{
for (var c = 0; c < num_cols; c++)
{
attr.Add(new Attributes());
attr[c].Name = values[r, c];
attr[c].Value = new List<String>();
}
}
else
{
for (var b = 0; b < num_cols; b++)
{
var input = values[r, b];
attr[b].Value.Add(input);
}
}
}
// Return the values.
return attr;
}

remove list-items with Linq when List.property = myValue

I have the following code:
List<ProductGroupProductData> productGroupProductDataList = FillMyList();
string[] excludeProductIDs = { "871236", "283462", "897264" };
int count = productGroupProductDataList.Count;
for (int removeItemIndex = 0; removeItemIndex < count; removeItemIndex++)
{
if (excludeProductIDs.Contains(productGroupProductDataList[removeItemIndex].ProductId))
{
productGroupProductDataList.RemoveAt(removeItemIndex);
count--;
}
}
Now i want to do the same with linq. Is there any way for this?
The second thing would be, to edit each List-Item property with linq.
you could use RemoveAll.
Example:
//create a list of 5 products with ids from 1 to 5
List<Product> products = Enumerable.Range(1,5)
.Select(c => new Product(c, c.ToString()))
.ToList();
//remove products 1, 2, 3
products.RemoveAll(p => p.id <=3);
where
// our product class
public sealed class Product {
public int id {get;private set;}
public string name {get; private set;}
public Product(int id, string name)
{
this.id=id;
this.name=name;
}
}
Firstly corrected version of your current code that won't skip entries
List<ProductGroupProductData> productGroupProductDataList = FillMyList();
string[] excludeProductIDs = { "871236", "283462", "897264" };
int count = productGroupProductDataList.Count;
for (int removeItemIndex = 0; removeItemIndex < count; removeItemIndex++)
{
while (removeItemIndex < count && excludeProductIDs.Contains(productGroupProductDataList[removeItemIndex].ProductId)) {
productGroupProductDataList.RemoveAt(removeItemIndex);
count--;
}
}
}
This linq code would do the job.
List<ProductGroupProductData> productGroupProductDataList = FillMyList();
string[] excludeProductIDs = { "871236", "283462", "897264" };
productGroupProductDataList=productGroupProductDataList.Where(x=>!excludedProductIDs.Contains(x.ProductId)).ToList();
Alternatively using paolo's answer of remove all the last line would be would be
productGroupProductDataList.RemoveAll(p=>excludedProductIDs.Contains(p=>p.ProductId));
What you mean by "The second thing would be, to edit each List-Item property with linq."?
As per your comment here's a version that creates a set that excludes the elements rather than removing them from the original list.
var newSet = from p in productGroupProductDataList
where !excludeProductIDs.Contains(p.ProductId))
select p;
The type of newSet is IEnumerable if you need (I)List you can easily get that:
var newList = newSet.ToList();

Categories