Group files in a directory based on their prefix - c#

I have a folder with pictures:
Folder 1:
Files:
ABC-138923
ABC-3223
ABC-33489
ABC-3111
CBA-238923
CBA-1313
CBA-1313
DAC-38932
DAC-1111
DAC-13893
DAC-23232
DAC-9999
I want to go through this folder and count how many pictures I have for each prefix.
For example, above there are 4 pictures with the prefix ABC and 3 with the prefix CBA.
I'm having a hard time figuring out how to loop through this. Can anyone give me a hand?

Not a loop, but more clear and readable:
string[] fileNames = ...; //some initializing code
var prefixes = fileNames.GroupBy(x => x.Split('-')[0]).
Select(y => new {Prefix = y.Key, Count = y.Count()});
Upd:
To display the count for each prefix:
foreach (var prefix in prefixes)
{
Console.WriteLine("Prefix: {0}, Count: {1}", prefix.Prefix, prefix.Count);
}

Here it is with a 'foreach' loop:
var directoryPath = @".\Folder1\";
var prefixLength = 3;
var accumulator = new Dictionary<string, int>();
foreach (var file in System.IO.Directory.GetFiles(directoryPath)) {
var prefix = file.Replace(directoryPath, string.Empty).Substring(0, prefixLength);
if (!accumulator.ContainsKey(prefix))
{
accumulator.Add(prefix, 0);
}
accumulator[prefix]++;
}
foreach(var prefix in accumulator.Keys) {
Console.WriteLine("{0}: {1}", prefix, accumulator[prefix]);
}

in C#,
using System.IO;
using System.Collections.Generic;
...
DirectoryInfo dir = new DirectoryInfo("C:\\yourfolder");
FileInfo[] files = dir.GetFiles();
List<string> prefix = new List<string>();
List<int> count = new List<int>();
foreach (FileInfo file in files)
{
if (prefix.Count > 0)
{
Boolean AddNew = true;
for (int i = 0; i < prefix.Count; i++)
{
if (file.Name.Substring(0, 3) == prefix[i])
{
count[i]++;
AddNew = false;
}
}
if (AddNew)
{
prefix.Add(file.Name.Substring(0, 3));
count.Add(1);
}
}
else
{
prefix.Add(file.Name.Substring(0, 3));
count.Add(1);
}
}
...
The prefix list is parallel to the count list, so prefix[i] goes with count[i]; to read the results, loop over both lists with the same index. I haven't tested or optimized it, but if you're heading down this route (C#), this could be a start.

The algorithm:
Create a dictionary:
Dictionary<string, int> D;
Loop through the directory using:
foreach (var file in System.IO.Directory.GetFiles(dir))
...
Complete the following 3 steps for each file:
Extract the prefix and see if a matching key exists in D. If TRUE, go to step 3.
Insert the prefix as a new key in D, with value 0
Increment the key's value by 1
To display results when the entire directory has been processed:
foreach (KeyValuePair<string, int> pair in D)
Console.WriteLine("{0} prefix has {1} files", pair.Key, pair.Value);
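The steps above can be put together roughly as follows. This is only a minimal sketch: the class name PrefixCounter and the method CountByPrefix are made up for illustration, and it assumes the prefix is everything before the first '-' in the file name, as in the question's listing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PrefixCounter
{
    // Hypothetical helper (not from the original answers):
    // counts file names by the text before the first '-'.
    public static Dictionary<string, int> CountByPrefix(IEnumerable<string> paths)
    {
        var counts = new Dictionary<string, int>();
        foreach (var path in paths)
        {
            // Work on the file name only, so directory names cannot affect the prefix
            string name = Path.GetFileName(path);
            string prefix = name.Split('-')[0];

            // TryGetValue leaves 'current' at 0 when the key is missing,
            // which covers the "insert with value 0" step
            counts.TryGetValue(prefix, out int current);
            counts[prefix] = current + 1;
        }
        return counts;
    }
}
```

You would call it as PrefixCounter.CountByPrefix(Directory.GetFiles(dir)) and print the pairs with the foreach shown above.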

Related

Count how many files starts with the same first characters c#

I want to write a function that counts how many distinct prefixes (the first 10 characters of the file name) occur among the files in a selected folder.
For example, if the folder contains files named File1, File2, File3, the count should be 1, because all 3 files start with the same characters, "File". If the folder contains
File1,File2,File3,Docs1,Docs2,pdfs1,pdfs2,pdfs3,pdfs4
the count should be 3, because there are 3 unique values of fileName.Substring(0, 4).
I've tried something like this, but it gives the total number of files in the folder.
int count = 0;
foreach (string file in Directory.GetFiles(folderLocation))
{
string fileName = Path.GetFileName(file);
if (fileName.Substring(0, 10) == fileName.Substring(0, 10))
{
count++;
}
}
Any idea how to count this?
You can try querying the directory with the help of LINQ:
using System.IO;
using System.Linq;
...
int n = 10;
int count = Directory
.EnumerateFiles(folderLocation, "*.*")
.Select(file => Path.GetFileNameWithoutExtension(file))
.Select(file => file.Length > n ? file.Substring(0, n) : file)
.GroupBy(name => name, StringComparer.OrdinalIgnoreCase)
.OrderByDescending(group => group.Count())
.FirstOrDefault()
?.Count() ?? 0;
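Note that the query above returns the size of the largest group of files sharing a prefix. If the goal is instead the number of distinct prefixes, as in the question's example, a minimal sketch could look like this. The class DistinctPrefixes and its Count method are hypothetical names, and the case-insensitive comparison is an assumption carried over from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DistinctPrefixes
{
    // Hypothetical helper: number of distinct leading substrings of length n
    public static int Count(IEnumerable<string> paths, int n)
    {
        return paths
            .Select(p => Path.GetFileName(p))
            // Names shorter than n are used whole instead of throwing
            .Select(name => name.Length > n ? name.Substring(0, n) : name)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .Count();
    }
}
```

With the question's second example and n = 4, DistinctPrefixes.Count(Directory.EnumerateFiles(folderLocation), 4) would give 3.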
You could keep a list of the prefixes seen so far, and check whether each file's prefix is already in that list:
int count = 0;
int length = 10; // number of leading characters to compare
List<string> list = new List<string>(); // prefixes seen so far
foreach (string file in Directory.GetFiles(folderLocation))
{
bool inKnown = false;
string fileName = Path.GetFileName(file);
// Use the whole name when it is shorter than the prefix length
string prefix = fileName.Length > length ? fileName.Substring(0, length) : fileName;
foreach (string s in list)
{
if (s == prefix)
{
inKnown = true;
break;
}
}
if (!inKnown)
{
count++;
list.Add(prefix);
}
}
The limitation here is that you are asking if the first ten characters are the same, but your examples given showed the first 4, so just adjust the length variable according to how many characters you would like to check for.
@acornTime gave me the idea. His solution didn't work for me, but this did. Thanks for the help!
List<string> list = new List<string>();
foreach (string file in Directory.GetFiles(folderLocation))
{
string fileName = Path.GetFileName(file);
list.Add(fileName.Substring(0, 10));
}
list = list.Distinct().ToList();
//count how many items are in list
int count = list.Count;

Get Filepath, and move the files in an other Directory if they contain an ID in the name

Basically, I need to check whether a file exists in 4 versions, which means that the same 11-digit code appears in 4 file names.
Once the check is done, I need to move the files to another server.
My problem is that I can get the IDs, and I know when an ID appears 4 times, but I don't know how to get the file paths back from the ID so that I can move the files.
Any kind of help would be much appreciated.
static void Main(string[] args)
{
string ExtractIDFromFileName(string filename)
{
return filename.Split('_').Last();
}
Dictionary<string, int> GetDictOfIDCounts()
{
List<string> allfiles = Directory.GetFiles("C:/Users/Desktop/Script/tiptop", "*.txt").Select(Path.GetFileNameWithoutExtension).ToList();
allfiles.AddRange(Directory.GetFiles("C:/Users/Desktop/Script/tiptop", "*.top").Select(Path.GetFileNameWithoutExtension).ToList());
Dictionary<string, int> dict = new Dictionary<string, int>();
foreach (var x in allfiles)
{
string fileID = ExtractIDFromFileName(x);
if (dict.ContainsKey(fileID))
{
dict[fileID]++;
}
else
{
dict.Add(fileID, 1);
}
}
return dict;
}
var result = GetDictOfIDCounts();
foreach (var item in result)
{
//Console.WriteLine("{0} > {1}", item.Key, item.Value);
if (item.Value == 4)
{
//When we know that those ID appear 4 times, I need to grab back the FilePath and then move the files in an other DIR.
Console.WriteLine("{0} > {1}", item.Key, item.Value);
}
}
Console.ReadLine();
}
You'll want to use File.Move: https://learn.microsoft.com/en-us/dotnet/api/system.io.file.move?view=netframework-4.7.2
It's pretty straightforward. Make the ID in your dictionary the full path of the source file, then File.Move(dict.Key, some-variable-holding-the-new-directory-and-file-name);
Since you're going to want to use the filepath, just switch over to using the instance version of Directory & File: DirectoryInfo & FileInfo:
replace
List<string> allfiles = Directory.GetFiles("C:/Users/Desktop/Script/tiptop", "*.txt").Select(Path.GetFileNameWithoutExtension).ToList();
with
DirectoryInfo di = new DirectoryInfo("C:/Users/Desktop/Script/tiptop");
var allFiles = di.GetFiles("*.txt");
Make the FileInfo the key of the dictionary. Then you can do things like
dict.Key.FullName
Try this to get files you want, it uses GroupBy on the ID and Count(). I haven't compiled it so there might be errors.
var files = Directory.GetFiles(@"C:\Users\Desktop\Script\tiptop", "*.*")
.Where(file => {
var ext = Path.GetExtension(file).ToLower();
return ext == ".txt" || ext == ".top";
})
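.Select(file => new { Path = file, Id = System.IO.Path.GetFileNameWithoutExtension(file).Split('_').Last()}) // strip the extension so .txt and .top files share an ID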
.GroupBy(file => file.Id)
.Where(grp => grp.Count() >= 4)
.SelectMany(x => x)
.Select(x => x.Path);
Another solution would be to use a Dictionary<string, List<string>> instead of a Dictionary<string, int>. When adding a new key to the dictionary, you would add new List<string> { x }, so that each ID keeps the list of its file names (store the full paths rather than the extension-less names if you need them for File.Move). Instead of checking item.Value == 4 in your if, you would check the size of the list, i.e. item.Value.Count == 4. You then still have the file names, so you can use them to move the files.
Hope it helps!
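The Dictionary<string, List<string>> idea can be sketched as below. This is an illustration under assumptions, not tested against the asker's folders: the names IdGrouping, GroupById and MoveCompleteSets are made up, and the ID is taken to be the text after the last '_' in the file name without its extension, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class IdGrouping
{
    // Hypothetical helper: maps each ID to the full paths that carry it
    public static Dictionary<string, List<string>> GroupById(IEnumerable<string> paths)
    {
        var byId = new Dictionary<string, List<string>>();
        foreach (var path in paths)
        {
            string id = Path.GetFileNameWithoutExtension(path).Split('_').Last();
            if (!byId.TryGetValue(id, out var list))
            {
                list = new List<string>();
                byId.Add(id, list);
            }
            list.Add(path); // keep the full path so the file can be moved later
        }
        return byId;
    }

    // Moves every file whose ID occurs exactly 'required' times into targetDir
    public static void MoveCompleteSets(IEnumerable<string> paths, string targetDir, int required)
    {
        foreach (var pair in GroupById(paths).Where(p => p.Value.Count == required))
        {
            foreach (var source in pair.Value)
            {
                File.Move(source, Path.Combine(targetDir, Path.GetFileName(source)));
            }
        }
    }
}
```

Keeping the paths in the dictionary value is what removes the need to look the files up again once an ID is known to be complete. Note that File.Move throws if the destination file already exists.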

How to Get Groups of Files from GetFiles()

I have to process files everyday. The files are named like so:
fg1a.mmddyyyy
fg1b.mmddyyyy
fg1c.mmddyyyy
fg2a.mmddyyyy
fg2b.mmddyyyy
fg2c.mmddyyyy
fg2d.mmddyyyy
If the entire file group is there for a particular date, I can process it. If it isn't there, I should not process it. I may have several partial file groups that run over several days. So when I have fg1a.12062017, fg1b.12062017 and fg1c.12062017, I can process that group (fg1) only.
Here is my code so far. It doesn't work because I can't figure out how to get only the full groups to add to the the processing file list.
fileList = Directory.GetFiles(@"c:\temp\");
string[] fileGroup1 = { "FG1A", "FG1B", "FG1C" }; // THIS IS A FULL GROUP
string[] fileGroup2 = { "FG2A", "FG2B", "FG2C", "FG2D" };
List<string> fileDates = new List<string>();
List<string> procFileList;
// get a list of file dates
foreach (string fn in fileList)
{
string dateString = fn.Substring(fn.IndexOf('.'), 9);
if (!fileDates.Contains(dateString))
{
fileDates.Add(dateString);
}
}
bool allFiles = true;
foreach (string fg in fileGroup1)
{
foreach (string fd in fileDates)
{
string finder = fg + fd;
bool foundIt = false;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
foundIt = true;
}
}
if (!foundIt)
{
allFiles = false;
}
else
{
foreach (string fn in fileList)
{
procFileList.Add(fn);
}
}
}
}
foreach (string fg in fileGroup2)
{
foreach (string fd in fileDates)
{
string finder = fg + fd;
bool foundIt = false;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
foundIt = true;
}
}
if (!foundIt)
{
allFiles = false;
}
else
{
foreach (string fn in fileList)
{
procFileList.Add(fn);
}
}
}
}
Any help or advice would be greatly appreciated.
Because it can sometimes get messy dealing with multiple lists, groupings, and parsing file names, I would start by creating a class that represents a FileGroupItem. This class would have a Parse method that takes in a file path, and then has properties that represent the group part and date part of the file name, as well as the full path to the file:
public class FileGroupItem
{
public string DatePart { get; set; }
public string GroupName { get; set; }
public string FilePath { get; set; }
public static FileGroupItem Parse(string filePath)
{
if (string.IsNullOrWhiteSpace(filePath)) return null;
// Split the file name on the '.' character to get the group and date parts
var fileParts = Path.GetFileName(filePath).Split('.');
if (fileParts.Length != 2) return null;
return new FileGroupItem
{
GroupName = fileParts[0],
DatePart = fileParts[1],
FilePath = filePath
};
}
}
Then, in my main code, I would create a list of the file group definitions, and then populate a list of FileGroupItems from the directory we're scanning. After that, we can determine whether any file group definition is complete by comparing its items (in a case-insensitive way) to the actual FileGroupItems we found in the directory (after first grouping the FileGroupItems by their DatePart). If the intersection of these two lists has the same number of items as the file group definition, then it's complete and we can process that group.
Maybe it will make more sense in code:
private static void Main()
{
var scanDirectory = @"f:\public\temp\";
var processedDirectory = @"f:\public\temp2\";
// The lists that define a complete group
var fileGroupDefinitions = new List<List<string>>
{
new List<string> {"FG1A", "FG1B", "FG1C"},
new List<string> {"FG2A", "FG2B", "FG2C", "FG2D"}
};
// Populate a list of FileGroupItems from the files
// in our directory, and group them on the DatePart
var fileGroups = Directory.EnumerateFiles(scanDirectory)
.Select(FileGroupItem.Parse)
.Where(f => f != null) // Parse returns null for names that don't have exactly two parts
.GroupBy(f => f.DatePart);
// Now go through each group and compare the items
// for that date with our file group definitions
foreach (var fileGroup in fileGroups)
{
foreach (var fileGroupDefinition in fileGroupDefinitions)
{
// Get the intersection of the group definition and this file group
var intersection = fileGroup
.Where(f => fileGroupDefinition.Contains(
f.GroupName, StringComparer.OrdinalIgnoreCase))
.ToList();
// If all the items in the definition are there, then process the files
if (intersection.Count == fileGroupDefinition.Count)
{
foreach (var fileGroupItem in intersection)
{
Console.WriteLine($"Processing file: {fileGroupItem.FilePath}");
// Move the file to the processed directory
File.Move(fileGroupItem.FilePath,
Path.Combine(processedDirectory,
Path.GetFileName(fileGroupItem.FilePath)));
}
}
}
}
Console.WriteLine("\nDone!\nPress any key to exit...");
Console.ReadKey();
}
I think you could simplify your algorithm so that each file group is just a prefix plus the number of files to expect: fg1 is 3 files for a given date.
I think your code to find the distinct dates present is a good idea, though you should use a hash set rather than a list, if you occasionally expect a large number of dates.. ("Valentine's Day?" - Ed)
Then you just need to work on the other loop that does the checking. An algorithm like this
//make a new Dictionary<string,int> for the filegroup prefixes and their counts
//eg myDict["fg1"] = 3; myDict["fg2"] = 4;
//list the files in the directory, into an array of fileinfo objects
//see the DirectoryInfo.GetFiles method
//foreach string d in the list of dates
//foreach string fgKey in myDict.Keys - the list of group prefixes
//use a bit of Linq to get all the fileinfos with a
//name starting with the group and ending with the date
var grplist = myfileinfos.Where(fi => fi.Name.StartsWith(fgKey) && fi.Name.EndsWith(d));
//if the grplist.Count == the filegroup count ( myDict[fgKey] )
//then send every file in grplist for processing
//remember that grplist is a collection of fileinfo objects,
//if your processing method takes a string filename, use fileinfo.FullName
Putting your file groupings into one dictionary will make things a lot easier than having them as x separate arrays.
I haven't written all the code for you, but I've sketched the algorithm in comments, and I've put in some of the more awkward bits like the LINQ, the dictionary declaration and how to fill it. Have a go at fleshing it out with code, and ask any questions in a comment on this post.
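One possible way to flesh out the comment sketch above is shown here. It is a sketch under assumptions: GroupChecker and CompleteGroupFiles are hypothetical names, it works on bare file names such as "fg1a.12062017" rather than FileInfo objects to keep the example small, and it matches the group prefix case-insensitively.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupChecker
{
    // Hypothetical helper. 'expected' maps a group prefix to the number of
    // files that make the group complete, e.g. { "fg1", 3 }, { "fg2", 4 }.
    public static List<string> CompleteGroupFiles(
        IEnumerable<string> fileNames, IDictionary<string, int> expected)
    {
        var names = fileNames.ToList();

        // Distinct date suffixes, including the leading '.', kept in a HashSet
        var dates = new HashSet<string>(
            names.Where(n => n.IndexOf('.') >= 0)
                 .Select(n => n.Substring(n.LastIndexOf('.'))));

        var result = new List<string>();
        foreach (var d in dates)
        {
            foreach (var fgKey in expected.Keys)
            {
                // All files for this group prefix on this date
                var grp = names.Where(n =>
                    n.StartsWith(fgKey, StringComparison.OrdinalIgnoreCase) &&
                    n.EndsWith(d, StringComparison.Ordinal)).ToList();

                // Only a full group is passed on for processing
                if (grp.Count == expected[fgKey]) result.AddRange(grp);
            }
        }
        return result;
    }
}
```

Comparing counts only works if no unexpected file shares a group prefix and date; if that can happen, checking the exact member names would be safer.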
First, create an array of the groups to make processing easier:
var fileGroups = new[] {
new[] { "FG1A", "FG1B", "FG1C" },
new[] { "FG2A", "FG2B", "FG2C", "FG2D" }
};
Then you can convert the array into a Dictionary to map each name back to its group:
var fileGroupMap = fileGroups.SelectMany(g => g.Select(f => new { key = f, group = g })).ToDictionary(g => g.key, g => g.group);
Then, preprocess the files you get from the directory:
var fileList = from fname in Directory.GetFiles(...)
select new {
fname,
fdate = Path.GetExtension(fname),
ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
};
Now you can take your fileList and group by date and group, and then filter to just completed groups:
var profFileList = (from file in fileList
group file by new { file.fdate, fgroup = fileGroupMap[file.ffilename] } into fng
where fng.Key.fgroup.All(f => fng.Select(fn => fn.ffilename).Contains(f))
from fn in fng
select fn.fname).ToList();
Since you didn't preserve the groups, I flattened the groups at the end of the query into just a list of files to be processed. If you needed, you could keep them in groups and process the groups instead.
Note: If a file exists that belongs to no group, you will get an error from the lookup in fileGroupMap. If that is a possibility, you can filter the fileList to just known names as follows:
var fileList = from fname in Directory.GetFiles(...)
let ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
where fileGroupMap.Keys.Contains(ffilename)
select new {
fname,
fdate = Path.GetExtension(fname),
ffilename
};
Also note that having a name in multiple groups will cause an error in the creation of fileGroupMap. If that is a possibility, the queries would become more complex and have to be written differently.
Here is a simple class
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string[] filenames = { "fg1a.12012017", "fg1b.12012017", "fg1c.12012017", "fg2a.12012017", "fg2b.12012017", "fg2c.12012017", "fg2d.12012017" };
new SplitFileName(filenames);
List<List<SplitFileName>> results = SplitFileName.GetGroups();
}
}
public class SplitFileName
{
public static List<SplitFileName> names = new List<SplitFileName>();
string filename { get; set; }
string prefix { get; set; }
string letter { get; set; }
DateTime date { get; set; }
public SplitFileName() { }
public SplitFileName(string[] splitNames)
{
foreach(string name in splitNames)
{
SplitFileName splitName = new SplitFileName();
names.Add(splitName);
splitName.filename = name;
string[] splitArray = name.Split(new char[] { '.' });
splitName.date = DateTime.ParseExact(splitArray[1],"MMddyyyy", System.Globalization.CultureInfo.InvariantCulture);
splitName.prefix = splitArray[0].Substring(0, splitArray[0].Length - 1);
splitName.letter = splitArray[0].Substring(splitArray[0].Length - 1,1);
}
}
public static List<List<SplitFileName>> GetGroups()
{
return names.OrderBy(x => x.letter).GroupBy(x => new { date = x.date, prefix = x.prefix })
.Where(x => string.Join(",",x.Select(y => y.letter)) == "a,b,c,d")
.Select(x => x.ToList())
.ToList();
}
}
}
With everyone's help, I solved it too. This is what I'm going with because it's the most maintainable for me but the solutions were so smart!!! Thanks everyone for your help.
private void CheckFiles()
{
var fileGroups = new[] {
new [] { "FG1A", "FG1B", "FG1C", "FG1D" },
new[] { "FG2A", "FG2B", "FG2C", "FG2D", "FG2E" } };
List<string> fileDates = new List<string>();
List<string> pfiles = new List<string>();
// get a list of file dates
foreach (string fn in fileList)
{
string dateString = fn.Substring(fn.IndexOf('.'), 9);
if (!fileDates.Contains(dateString))
{
fileDates.Add(dateString);
}
}
// check if a date has all the files
foreach (string fd in fileDates)
{
int fgCount = 0;
// for each file group
foreach (Array masterfg in fileGroups)
{
foreach (string fg in masterfg)
{
// see if all the files are there
bool foundIt = false;
string finder = fg + fd;
foreach (string fn in fileList)
{
if (fn.ToUpper().Contains(finder))
{
pfiles.Add(fn);
}
}
fgCount++;
}
if (fgCount == pfiles.Count())
{
foreach (string fn in pfiles)
{
procFileList.Add(fn);
}
pfiles.Clear();
}
else
{
pfiles.Clear();
}
}
}
return;
}

How to search for multiple strings and keep counters for them

What I'm trying to do is the following: I have hundreds of log files that I need to search through and do some counting on. The basic idea is this: take a .txt file and read every line; if search item 1 is found, increment the counter for search item 1; if search item 2 is found, increment the counter for search item 2; and so on. For example, if the file contained something like...
a b c
d e f
g h i
j k h
And If I specified the searchables to be e & h, the output should say
e : 1
h : 2
The number of search terms is variable: the user can give anything from 1 search term to 10, so I'm not sure how to implement n counters based on the number of searchables.
Below is what I have so far. It's just a basic approach to see what works and what doesn't. Right now, it only keeps the count for one of the search terms. At the moment I am writing the results to the console just to test; ultimately, they will be written to a .txt or .xlsx file. Any help will be appreciated!
string line;
int Scounter = 0;
int Mcounter = 0;
List<string> searchables = new List<string>();
private void search_Log(string p)
{
searchables.Add("S");
searchables.Add("M");
StreamReader reader = new StreamReader(p);
while ((line = reader.ReadLine()) != null)
{
for (int i = 0; i < searchables.Count(); i++)
{
if (line.Contains(searchables[i]))
{
Scounter++;
}
}
}
reader.Close();
Console.WriteLine("# of S: " + Scounter);
Console.WriteLine("# of M: " + Mcounter);
}
A common approach to this is to use a Dictionary<string, int> to track the values and counts:
// Initialise the dictionary:
Dictionary<string, int> counters = new Dictionary<string, int>();
Then later:
if (line.Contains(searchables[i]))
{
if (counters.ContainsKey(searchables[i]))
{
counters[searchables[i]] ++;
}
else
{
counters.Add(searchables[i], 1);
}
}
Then, when you are finished processing:
// Add in any searches which had no results:
foreach (var searchTerm in searchables)
{
if (counters.ContainsKey(searchTerm) == false)
{
counters.Add(searchTerm, 0);
}
}
foreach (var item in counters)
{
Console.WriteLine("Value {0} occurred {1} times", item.Key, item.Value);
}
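Putting the fragments above together, a self-contained version might look like this. It is a minimal sketch: TermCounter and CountLinesContaining are made-up names, and, like the Contains check above, it counts the lines that contain a term, not every occurrence within a line.

```csharp
using System;
using System.Collections.Generic;

public static class TermCounter
{
    // Hypothetical helper: for each search term, counts the lines containing it
    public static Dictionary<string, int> CountLinesContaining(
        IEnumerable<string> lines, IEnumerable<string> searchables)
    {
        var counters = new Dictionary<string, int>();

        // Seed every term with 0 so that terms without matches still appear
        foreach (var term in searchables) counters[term] = 0;

        foreach (var line in lines)
        {
            foreach (var term in searchables)
            {
                if (line.Contains(term)) counters[term]++;
            }
        }
        return counters;
    }
}
```

For a log file you could pass File.ReadLines(p) as the lines argument, which reads the file lazily instead of loading it all at once.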
You could use a class for the searchables, so that searchables becomes a List<Searchable>:
public class Searchable
{
public string searchTerm;
public int count;
}
then
while ((line = reader.ReadLine()) != null)
{
foreach (var searchable in searchables)
{
if (line.Contains(searchable.searchTerm))
{
searchable.count++;
}
}
}
This would be one of many ways to track multiple search terms and their counts.
You can make use of linq here:
string lines = reader.ReadToEnd();
var result = lines.Split(new string[]{" ","\r\n"},StringSplitOptions.RemoveEmptyEntries)
.GroupBy(x=>x)
.Select(g=> new
{
Alphabet = g.Key ,
Count = g.Count()
}
);
Input:
a b c
d e f
Output :
a: 1
b: 1
c: 1
d: 1
e: 1
f: 1
This version counts 1 to n search terms that occur 1 to n times per line of the file. It accounts for the possibility of a term occurring more than once on one line.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication5
{
class Program
{
static void Main(string[] args)
{
Func<string, string[], Dictionary<string, int>> searchForCounts = null;
searchForCounts = (filePathAndName, searchTerms) =>
{
Dictionary<string, int> results = new Dictionary<string, int>();
if (string.IsNullOrEmpty(filePathAndName) || !File.Exists(filePathAndName))
return results;
using (TextReader tr = File.OpenText(filePathAndName))
{
string line = null;
while ((line = tr.ReadLine()) != null)
{
for (int i = 0; i < searchTerms.Length; ++i)
{
var searchTerm = searchTerms[i].ToLower();
var index = 0;
while (index > -1)
{
index = line.IndexOf(searchTerm, index, StringComparison.OrdinalIgnoreCase);
if (index > -1)
{
if (results.ContainsKey(searchTerm))
results[searchTerm] += 1;
else
results[searchTerm] = 1;
index += searchTerm.Length; // move past this match; "Length - 1" would loop forever on one-character terms
}
}
}
}
}
return results;
};
var counts = searchForCounts("D:\\Projects\\ConsoleApplication5\\ConsoleApplication5\\TestLog.txt", new string[] { "one", "two" });
Console.WriteLine("----Counts----");
foreach (var keyPair in counts)
{
Console.WriteLine("Term: " + keyPair.Key.PadRight(10, ' ') + " Count: " + keyPair.Value.ToString());
}
Console.ReadKey(true);
}
}
}
Input:
OnE, TwO
Output:
----Counts----
Term: one Count: 7
Term: two Count: 15

Counting words using LinkedList

I have a class WordCount which has string wordDic and int count. Next, I have a List<WordCount>.
I have ANOTHER List<string> which has lots of words inside it. I am trying to use the List<WordCount> to count the occurrences of each word inside the List<string>.
Below is where I am stuck.
class WordCount
{
string wordDic;
int count;
}
List<WordCount> usd = new List<WordCount>();
foreach (string word in wordsList)
{
if (usd.wordDic.Contains(new WordCount {wordDic=word, count=0 }))
usd.count[value] = usd.counts[value] + 1;
else
usd.Add(new WordCount() {wordDic=word, count=1});
}
I don't know how to properly implement this in code but I am trying to search my List to see if the word in wordsList already exists and if it does, add 1 to count but if it doesn't then insert it inside usd with count of 1.
Note: *I have to use Lists to do this. I am not allowed to use anything else like hash tables...*
This is the answer before you edited to only use lists...btw, what is driving that requirement?
List<string> words = new List<string> {...};
// For case-insensitive you can instantiate with
// new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
Dictionary<string, int> counts = new Dictionary<string, int>();
foreach (string word in words)
{
if (counts.ContainsKey(word))
{
counts[word] += 1;
}
else
{
counts[word] = 1;
}
}
If you can only use lists, can you use a List<KeyValuePair<string,int>> counts? It holds the same kind of pairs as a dictionary, although it would not guarantee unique keys. The solution would be very similar. If you can only use plain lists, the following will work.
List<string> words = new List<string>{...};
List<string> foundWord = new List<string>();
List<int> countWord = new List<int>();
foreach (string word in words)
{
if (foundWord.Contains(word))
{
countWord[foundWord.IndexOf(word)] += 1;
}
else
{
foundWord.Add(word);
countWord.Add(1);
}
}
Using your WordCount class
List<string> words = new List<string>{...};
List<WordCount> foundWord = new List<WordCount>();
foreach (string word in words)
{
WordCount match = foundWord.SingleOrDefault(w => w.wordDic == word);
if (match!= null)
{
match.count += 1;
}
else
{
foundWord.Add(new WordCount { wordDic = word, count = 1 });
}
}
You can use Linq to do this.
static void Main(string[] args)
{
List<string> wordsList = new List<string>()
{
"Cat",
"Dog",
"Cat",
"Hat"
};
List<WordCount> usd = wordsList.GroupBy(x => x)
.Select(x => new WordCount() { wordDic = x.Key, count = x.Count() })
.ToList();
}
Use linq: Assuming your list of words :
string[] words = { "blueberry", "chimpanzee", "abacus", "banana", "abacus","apple", "cheese" };
You can do:
var count =
from word in words
group word.ToUpper() by word.ToUpper() into g
where g.Count() > 0
select new { g.Key, Count = g.Count() };
(or in your case, select new WordCount()... it'll depend on how you have your constructor set up)...
the result will look like:
First, all of your class members are private, so they cannot be accessed from outside the class. Let's assume your code runs inside the WordCount class too.
Second, your count member is an int, so the following statement will not work:
usd.count[value] = usd.counts[value] + 1;
And I think you've made a typo, mixing up counts and count.
To solve your problem, find the counter corresponding to your word. If it exists, increment its count; otherwise, create a new one.
foreach (string word in wordsList) {
WordCount counter = usd.Find(c => c.wordDic == word);
if (counter != null) // Counter exists
counter.count++;
else
usd.Add(new WordCount() { wordDic=word, count = 1 }); // Create new one
}
You should use a Dictionary, as it's faster than a list for "Contains"-style lookups.
Just replace your list with this:
Dictionary<string, WordCount> usd = new Dictionary<string, WordCount>();
foreach (string word in wordsList)
{
if (usd.ContainsKey(word.ToLower()))
usd[word.ToLower()].count++;
else
usd.Add(word.ToLower(), new WordCount() {wordDic=word, count=1});
}
