Extract multiple substring in the same line - c#

I'm trying to build a logparser but i'm stuck.
Right now my program goes trough multiple file in a directory and read all the file line by line.
I was able to identify the substring i was looking for "fct=" and extract the value next to the "=" using delimiter but i notice that when i have a line with more then one "fct=" it doesnt see it.
So i restart my code and i find a way to get the index position of all occurence of fct= in the same line using an extension method that put the index in a list but i dont see how i can use this list to get the value next to the "=" and using my delimiter.
How can i extract the value next to the "=" knowing the start position of "fct=" and the delimiter at the end of the wanted value?
I'm starting in C# so let me know if i can give you more information.
Thanks,
Here's an example of what i would like to parse:
<dat>FCT=10019,XN=KEY,CN=ROHWEPJQSKAUMDUC FCT=666</dat></logurl>
<dat>XN=KEY,CN=RTU FCT=4515</dat></logurl>
<dat>XN=KEY,CN=RT</dat></logurl>
I would like t retrieve 10019,666 and 4515.
namespace LogParserV1
{
class Program
{
static void Main(string[] args)
{
int counter = 0;
string[] dirs = Directory.GetFiles(#"C:/LogParser/LogParserV1", "*.txt");
string fctnumber;
char[] enddelimiter = { '<', ',', '&', ':', ' ', '\\', '\'' };
foreach (string fileName in dirs)
{
StreamReader sr = new StreamReader(fileName);
{
String lineRead;
while ((lineRead = sr.ReadLine()) != null)
{
if (lineRead.Contains("fct="))
{
List<int> list = MyExtensions.GetPositions(lineRead, "fct");
//int start = lineRead.IndexOf("fct=") + 4;
// int end = lineRead.IndexOfAny(enddelimiter, start);
//string result = lineRead.Substring(start, end - start);
fctnumber = result;
//System.Console.WriteLine(fctnumber);
list.ForEach(Console.WriteLine);
}
// affiche tout les ligne System.Console.WriteLine(lineRead);
counter++;
}
System.Console.WriteLine(fileName);
sr.Close();
}
}
// Suspend the screen.
System.Console.ReadLine();
}
}
}
namespace ExtensionMethods
{
public class MyExtensions
{
public static List<int> GetPositions(string source, string searchString)
{
List<int> ret = new List<int>();
int len = searchString.Length;
int start = -len;
while (true)
{
start = source.IndexOf(searchString, start + len);
if (start == -1)
{
break;
}
else
{
ret.Add(start);
}
}
return ret;
}
}
}

You could simplify your code a lot by using Regex pattern matching instead.
The following pattern: (?<=FCT=)[0-9]* will match any group of digits preceded by FCT=.
Try it out
This enables us to do the following:
string input = "<dat>FCT=10019,XN=KEY,CN=ROHWEPJQSKAUMDUC FCT=666</dat></logurl>...";
string pattern = "(?<=FCT=)[0-9]*";
var values = Regex.Matches(input, pattern).Cast<Match>().Select(x => x.Value);

I have tested this solution with your data, and it gives me the expected results (10019,666 and 4515)
string data = #"<dat>FCT=10019,XN=KEY,CN=ROHWEPJQSKAUMDUC FCT=666</dat></logurl>
<dat>XN=KEY,CN=RTU FCT=4515</dat></logurl>
<dat>XN=KEY,CN=RT</dat></logurl>";
char[] delimiters = { '<', ',', '&', ':', ' ', '\\', '\'' };
Regex regex = new Regex("fct=(.+)", RegexOptions.IgnoreCase);
var values = data.Split(delimiters).Select(x => regex.Match(x).Groups[1].Value);
values = values.Where(x => !string.IsNullOrWhiteSpace(x));
values.ToList().ForEach(Console.WriteLine);
I hope my solution will be helpful, let me know.

Below code is usefull to extract the repeated words with linq in text
string text = "Hi Naresh, How are you. You will be next Super man";
IEnumerable<string> strings = text.Split(' ').ToList();
var result = strings.AsEnumerable().Select(x => new {str = Regex.Replace(x.ToLowerInvariant(), #"[^0-9a-zA-Z]+", ""), count = Regex.Matches(text.ToLowerInvariant(), #"\b" + Regex.Escape(Regex.Replace(x.ToLowerInvariant(), #"[^0-9a-zA-Z]+", "")) + #"\b").Count}).Where(x=>x.count>1).GroupBy(x => x.str).Select(x => x.First());
foreach(var item in result)
{
Console.WriteLine(item.str +" = "+item.count.ToString());
}

As always, break down the porblem into smaller bits. See if the following methods help in any way. Tying it up to your code is left as an excercise.
private const string Prefix = "fct=";
//make delimiter look up fast
private static HashSet<char> endDelimiters =
new HashSet<char>(new [] { '<', ',', '&', ':', ' ', '\\', '\'' });
private static string[] GetAllFctFields(string line) =>
line.Split(new string[] { Prefix });
private static bool TryGetValue(string delimitedString, out string value)
{
var buffer = new StringBuilder(delimitedString.Length);
foreach (var c in delimitedString)
{
if (endDelimiters.Contains(c))
break;
buffer.Append(c);
}
//I'm assuming that no end delimiter is a format error.
//Modify according to requirements
if (buffer.Length == delimitedString.Length)
{
value = null;
return false;
}
value = buffer.ToString();
return true;
}

Something like :
class Program
{
static void Main(string[] args)
{
char[] enddelimiter = { '<', ',', '&', ':', ' ', '\\', '\'' };
var fct = "fct=";
var lineRead = "fct=value1,useless text fct=vfct=alue2,fct=value3";
var values = new List<string>();
int start = lineRead.IndexOf(fct);
while(start != -1)
{
start += fct.Length;
int end = lineRead.IndexOfAny(enddelimiter, start);
if (end == -1)
end = lineRead.Length;
string result = lineRead.Substring(start, end - start);
values.Add(result);
start = lineRead.IndexOf(fct, end);
}
values.ForEach(Console.WriteLine);
}
}

You can split the line by string[]
char[] enddelimiter = { '<', ',', '&', ':', ' ', '\\', '\'' };
while ((lineRead = sr.ReadLine()) != null)
{
string[] parts1 = lineRead.Split(new string[] { "fct=" },StringSplitOptions.None);
if(parts1.Length > 0)
{
foreach(string _ar in parts1)
{
if(!string.IsNullOrEmpty(_ar))
{
if(_ar.IndexOfAny(enddelimiter) > 0)
{
MessageBox.Show(_ar.Substring(0, _ar.IndexOfAny(enddelimiter)));
}
else
{
MessageBox.Show(_ar);
}
}
}
}
}

Related

C# string.split() separate string by uppercase

I've been using the Split() method to split strings. But this work if you set some character for condition in string.Split(). Is there any way to split a string when is see Uppercase?
Is it possible to get few words from some not separated string like:
DeleteSensorFromTemplate
And the result string is to be like:
Delete Sensor From Template
Use Regex.split
string[] split = Regex.Split(str, #"(?<!^)(?=[A-Z])");
Another way with regex:
public static string SplitCamelCase(string input)
{
return System.Text.RegularExpressions.Regex.Replace(input, "([A-Z])", " $1", System.Text.RegularExpressions.RegexOptions.Compiled).Trim();
}
If you do not like RegEx and you really just want to insert the missing spaces, this will do the job too:
public static string InsertSpaceBeforeUpperCase(this string str)
{
var sb = new StringBuilder();
char previousChar = char.MinValue; // Unicode '\0'
foreach (char c in str)
{
if (char.IsUpper(c))
{
// If not the first character and previous character is not a space, insert a space before uppercase
if (sb.Length != 0 && previousChar != ' ')
{
sb.Append(' ');
}
}
sb.Append(c);
previousChar = c;
}
return sb.ToString();
}
I had some fun with this one and came up with a function that splits by case, as well as groups together caps (it assumes title case for whatever follows) and digits.
Examples:
Input -> "TodayIUpdated32UPCCodes"
Output -> "Today I Updated 32 UPC Codes"
Code (please excuse the funky symbols I use)...
public string[] SplitByCase(this string s) {
var ʀ = new List<string>();
var ᴛ = new StringBuilder();
var previous = SplitByCaseModes.None;
foreach(var ɪ in s) {
SplitByCaseModes mode_ɪ;
if(string.IsNullOrWhiteSpace(ɪ.ToString())) {
mode_ɪ = SplitByCaseModes.WhiteSpace;
} else if("0123456789".Contains(ɪ)) {
mode_ɪ = SplitByCaseModes.Digit;
} else if(ɪ == ɪ.ToString().ToUpper()[0]) {
mode_ɪ = SplitByCaseModes.UpperCase;
} else {
mode_ɪ = SplitByCaseModes.LowerCase;
}
if((previous == SplitByCaseModes.None) || (previous == mode_ɪ)) {
ᴛ.Append(ɪ);
} else if((previous == SplitByCaseModes.UpperCase) && (mode_ɪ == SplitByCaseModes.LowerCase)) {
if(ᴛ.Length > 1) {
ʀ.Add(ᴛ.ToString().Substring(0, ᴛ.Length - 1));
ᴛ.Remove(0, ᴛ.Length - 1);
}
ᴛ.Append(ɪ);
} else {
ʀ.Add(ᴛ.ToString());
ᴛ.Clear();
ᴛ.Append(ɪ);
}
previous = mode_ɪ;
}
if(ᴛ.Length != 0) ʀ.Add(ᴛ.ToString());
return ʀ.ToArray();
}
private enum SplitByCaseModes { None, WhiteSpace, Digit, UpperCase, LowerCase }
Here's another different way if you don't want to be using string builders or RegEx, which are totally acceptable answers. I just want to offer a different solution:
string Split(string input)
{
string result = "";
for (int i = 0; i < input.Length; i++)
{
if (char.IsUpper(input[i]))
{
result += ' ';
}
result += input[i];
}
return result.Trim();
}

How to Split an Already Split String

I have a code as below.
foreach (var item in betSlipwithoutStake)
{
test1 = item.Text;
splitText = test1.Split(new char[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
if (!test.Exists(str => str == splitText[0]))
test.Add(splitText[0]);
}
I'm getting values like "Under 56.5 Points (+56.5)".
Now I want to split again with everything after '(' for each items in the list so i will get a new list and can use it. How can I do that?
if you want to extract value inside parenthesis:
foreach (var item in betSlipwithoutStake)
{
test1 = item.Text;
splitText = test1.Split(new char[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
if (!test.Exists(str => str == splitText[0]))
if(splitText[0].Contains("("))
test.Add(splitText[0].Split('(', ')')[1]);
else
test.Add(splitText[0]);
}
Well, assuming you are after a solution without regular expressions, and that you have a List<string> test declared, you can follow up with a substring, with indexes (and some error handling):
foreach (var item in betSlipwithoutStake)
{
test1 = item.Text;
splitText = test1.Split(new char[] { ':' }, StringSplitOptions.RemoveEmptyEntries);
if (splitText.Length == 0)
continue;
string stringToCheck = splitText[0];
int openParenIndex = stringToCheck.IndexOf('(');
int closeParenIndex = stringToCheck.LastIndexOf(')');
if (openParenIndex >=0 && closeParenIndex >= 0)
{
// get what's inside the outermost set of parens
int length = closeParenIndex - openParenIndex + 1;
stringToCheck = stringToCheck.Substring(openParenIndex, length);
}
if (!test.Exists(str => str == splitText[0]))
test.Add(splitText[0]);
}
You can find out about all of the methods to use with strings here.

how to split comma with double quotes in c#?

string strExample =
"\"10553210\",\"na\",\"398,633,000\",\"20130709\",\"20130502\",\"20120724\",";
how to split above string with ","
I need an answer like
string[] arrExample = YourFunc(strExample);
arrExample[0] == "10553210";
arrExample[1] == "na";
arrExample[2] == "398,633,000";
...
with split option.
thanks in advance
Here is an easy way,
using Microsoft.VisualBasic.FileIO;
IList<string> arrExample;
using(var csvParser = new TextFieldParser(new StringReader(strExample))
{
fields = csvParser.ReadFields();
}
You may split not by comma "," but by whole string "\",\"".
Do not forget to Trim leading and trailing quotations ":
String strExample =
"\"10553210\",\"na\",\"398,633,000\",\"20130709\",\"20130502\",\"20120724\"";
string[] arrExample = St.Trim('"').Split(new String[] {"\",\""}, StringSplitOptions.None);
You can split on "," , The first and last entry you have to clean the " in the last and first entry:
string[] arr = strExample .Split(new string[] { "\",\"" },
StringSplitOptions.None);
//remove the extra quotes from the last and the first entry
arr[0] = arr[0].SubString(1,arr[0].Length - 1);
int last = arr.Length - 1;
arr[last] = arr[last].SubString(0,arr[last].Length - 1);
string[] arrExample = strExample.Split(",");
would do it, but your code won't compile. I assume you meant:
string strExample = "10553210,na,398,633,000,20130709,20130502,20120724";
If this isn't what you meant, please correct the question.
Assuming you meant this:
string strExample = "\"10553210\",\"na\",\"398,633,000\",\"20130709\",\"20130502\",\"20120724\"";
Split then Select the substring:
string[] parts = strExample.Split(',').Select(x => x.Substring(1, x.Length - 2)).ToArray();
Result:
strExample.Split(',');
You need to escape the double quotes if they're meant to be contained in your example string.
Using the example from Jodrell
private string[] SplitFields(string csvValue)
{
//if there aren't quotes, use the faster function
if (!csvValue.Contains('\"') && !csvValue.Contains('\''))
{
return csvValue.Trim(',').Split(',');
}
else
{
//there are quotes, use this built in text parser
using(var csvParser = new Microsoft.VisualBasic.FileIO.TextFieldParser(new StringReader(csvValue.Trim(','))))
{
csvParser.Delimiters = new string[] { "," };
csvParser.HasFieldsEnclosedInQuotes = true;
return csvParser.ReadFields();
}
}
}
This worked for me
public static IEnumerable<string> SplitCSV(string strInput)
{
string[] str = strInput.Split(',');
if (str == null)
yield return null;
StringBuilder quoteS = null;
foreach (string s in str)
{
if (s.StartsWith("\""))
{
if (s.EndsWith("\""))
{
yield return s;
}
quoteS = new StringBuilder(s);
continue;
}
if (quoteS != null)
{
quoteS.Append($",{s}");
if (s.EndsWith("\""))
{
string s1 = quoteS.ToString();
quoteS = null;
yield return s1;
}
else
continue;
}
yield return s;
}
}
static void Main(string[] args)
{
string s = "111,222,\"33,44,55\",666,\"77,88\",\"99\"";
Console.WriteLine(s);
var sp = SplitCSV(s);
foreach (string s1 in sp)
{
Console.WriteLine(s1);
}
Console.ReadKey();
}
you can do that by doing this ..
string stringname= "10553210,na,398,633,000,20130709,20130502,20120724";
List<String> asd = stringname.Split(',');
or if you wanr array then
array[] asd = stringname.Split(',').ToArray;

How do I make letters to uppercase after each of a set of specific characters

I have a collection of characters (',', '.', '/', '-', ' ') then I have a collection of strings (about 500).
What I want to do as fast as possible is: after each of the characters I want to make the next letter uppercase.
I want the first capitalized as well and many of the strings are all uppercase to begin with.
EDIT:
I modified tdragons answer to this final result:
public static String CapitalizeAndStuff(string startingString)
{
startingString = startingString.ToLower();
char[] chars = new[] { '-', ',', '/', ' ', '.'};
StringBuilder result = new StringBuilder(startingString.Length);
bool makeUpper = true;
foreach (var c in startingString)
{
if (makeUpper)
{
result.Append(Char.ToUpper(c));
makeUpper = false;
}
else
{
result.Append(c);
}
if (chars.Contains(c))
{
makeUpper = true;
}
}
return result.ToString();
}
Then I call this method for all my strings.
string a = "fef-aw-fase-fes-fes,fes-,fse--sgr";
char[] chars = new[] { '-', ',' };
StringBuilder result = new StringBuilder(a.Length);
bool makeUpper = true;
foreach (var c in a)
{
if (makeUpper)
{
result.Append(Char.ToUpper(c));
makeUpper = false;
}
else
{
result.Append(c);
}
if (chars.Contains(c))
{
makeUpper = true;
}
}
public static string Capitalise(string text, string targets, CultureInfo culture)
{
bool capitalise = true;
var result = new StringBuilder(text.Length);
foreach (char c in text)
{
if (capitalise)
{
result.Append(char.ToUpper(c, culture));
capitalise = false;
}
else
{
if (targets.Contains(c))
capitalise = true;
result.Append(c);
}
}
return result.ToString();
}
Use it like this:
string targets = ",./- ";
string text = "one,two.three/four-five six";
Console.WriteLine(Capitalise(text, targets, CultureInfo.InvariantCulture));
char[] chars = new char[] { ',', '.', '/', '-', ' ' };
string input = "Foo bar bar foo, foo, bar,foo-bar.bar_foo zz-";
string result = input[0] + new string(input.Skip(1).Select((c, i) =>
chars.Contains(input[i]) ? char.ToUpper(input[i + 1]) : input[i + 1]
).ToArray());
Console.WriteLine(result);
Or you can use a simply regex expression:
var result = Regex.Replace(str, #"([.,-][a-z]|\b[a-z])", m => m.Value.ToUpper());
You can stringSplit your whole string multiple times, once for each element, rinse and repeate, and then uppcase each block.
char[] tempChar = {',','-'};
List<string> tempList = new List();
tempList.Add(yourstring);
foreach(var currentChar in tempChar)
{
List<string> tempSecondList = new List();
foreach(var tempString in tempList)
{
foreach(var tempSecondString in tempString.split(currentchar))
{
tempSecondList.Add(tempSecondString);
}
}
tempList = tempSecondList;
}
I hope i did count correct, anyway, afterwards make every entry in tempList Upper

How to word by word iterate in string in C#?

I want to iterate over string as word by word.
If I have a string "incidentno and fintype or unitno", I would like to read every word one by one as "incidentno", "and", "fintype", "or", and "unitno".
foreach (string word in "incidentno and fintype or unitno".Split(' ')) {
...
}
var regex = new Regex(#"\b[\s,\.-:;]*");
var phrase = "incidentno and fintype or unitno";
var words = regex.Split(phrase).Where(x => !string.IsNullOrEmpty(x));
This works even if you have ".,; tabs and new lines" between your words.
Slightly twisted I know, but you could define an iterator block as an extension method on strings. e.g.
/// <summary>
/// Sweep over text
/// </summary>
/// <param name="Text"></param>
/// <returns></returns>
public static IEnumerable<string> WordList(this string Text)
{
int cIndex = 0;
int nIndex;
while ((nIndex = Text.IndexOf(' ', cIndex + 1)) != -1)
{
int sIndex = (cIndex == 0 ? 0 : cIndex + 1);
yield return Text.Substring(sIndex, nIndex - sIndex);
cIndex = nIndex;
}
yield return Text.Substring(cIndex + 1);
}
foreach (string word in "incidentno and fintype or unitno".WordList())
System.Console.WriteLine("'" + word + "'");
Which has the advantage of not creating a big array for long strings.
Use the Split method of the string class
string[] words = "incidentno and fintype or unitno".Split(" ");
This will split on spaces, so "words" will have [incidentno,and,fintype,or,unitno].
Assuming the words are always separated by a blank, you could use String.Split() to get an Array of your words.
There are multiple ways to accomplish this. Two of the most convenient methods (in my opinion) are:
Using string.Split() to create an array. I would probably use this method, because it is the most self-explanatory.
example:
string startingSentence = "incidentno and fintype or unitno";
string[] seperatedWords = startingSentence.Split(' ');
Alternatively, you could use (this is what I would use):
string[] seperatedWords = startingSentence.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
StringSplitOptions.RemoveEmptyEntries will remove any empty entries from your array that may occur due to extra whitespace and other minor problems.
Next - to process the words, you would use:
foreach (string word in seperatedWords)
{
//Do something
}
Or, you can use regular expressions to solve this problem, as Darin demonstrated (a copy is below).
example:
var regex = new Regex(#"\b[\s,\.-:;]*");
var phrase = "incidentno and fintype or unitno";
var words = regex.Split(phrase).Where(x => !string.IsNullOrEmpty(x));
For processing, you can use similar code to the first option.
foreach (string word in words)
{
//Do something
}
Of course, there are many ways to solve this problem, but I think that these two would be the simplest to implement and maintain. I would go with the first option (using string.Split()) just because regex can sometimes become quite confusing, while a split will function correctly most of the time.
When using split, what about checking for empty entries?
string sentence = "incidentno and fintype or unitno"
string[] words = sentence.Split(new char[] { ' ', ',' ,';','\t','\n', '\r'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string word in words)
{
// Process
}
EDIT:
I can't comment so I'm posting here but this (posted above) works:
foreach (string word in "incidentno and fintype or unitno".Split(' '))
{
...
}
My understanding of foreach is that it first does a GetEnumerator() and the calles .MoveNext until false is returned. So the .Split won't be re-evaluated on each iteration
public static string[] MyTest(string inword, string regstr)
{
var regex = new Regex(regstr);
var phrase = "incidentno and fintype or unitno";
var words = regex.Split(phrase);
return words;
}
? MyTest("incidentno, and .fintype- or; :unitno",#"[^\w+]")
[0]: "incidentno"
[1]: "and"
[2]: "fintype"
[3]: "or"
[4]: "unitno"
I'd like to add some information to JDunkerley's awnser.
You can easily make this method more reliable if you give a string or char parameter to search for.
public static IEnumerable<string> WordList(this string Text,string Word)
{
int cIndex = 0;
int nIndex;
while ((nIndex = Text.IndexOf(Word, cIndex + 1)) != -1)
{
int sIndex = (cIndex == 0 ? 0 : cIndex + 1);
yield return Text.Substring(sIndex, nIndex - sIndex);
cIndex = nIndex;
}
yield return Text.Substring(cIndex + 1);
}
public static IEnumerable<string> WordList(this string Text, char c)
{
int cIndex = 0;
int nIndex;
while ((nIndex = Text.IndexOf(c, cIndex + 1)) != -1)
{
int sIndex = (cIndex == 0 ? 0 : cIndex + 1);
yield return Text.Substring(sIndex, nIndex - sIndex);
cIndex = nIndex;
}
yield return Text.Substring(cIndex + 1);
}
I write a string processor class.You can use it.
Example:
metaKeywords = bodyText.Process(prepositions).OrderByDescending().TakeTop().GetWords().AsString();
Class:
public static class StringProcessor
{
private static List<String> PrepositionList;
public static string ToNormalString(this string strText)
{
if (String.IsNullOrEmpty(strText)) return String.Empty;
char chNormalKaf = (char)1603;
char chNormalYah = (char)1610;
char chNonNormalKaf = (char)1705;
char chNonNormalYah = (char)1740;
string result = strText.Replace(chNonNormalKaf, chNormalKaf);
result = result.Replace(chNonNormalYah, chNormalYah);
return result;
}
public static List<KeyValuePair<String, Int32>> Process(this String bodyText,
List<String> blackListWords = null,
int minimumWordLength = 3,
char splitor = ' ',
bool perWordIsLowerCase = true)
{
string[] btArray = bodyText.ToNormalString().Split(splitor);
long numberOfWords = btArray.LongLength;
Dictionary<String, Int32> wordsDic = new Dictionary<String, Int32>(1);
foreach (string word in btArray)
{
if (word != null)
{
string lowerWord = word;
if (perWordIsLowerCase)
lowerWord = word.ToLower();
var normalWord = lowerWord.Replace(".", "").Replace("(", "").Replace(")", "")
.Replace("?", "").Replace("!", "").Replace(",", "")
.Replace("<br>", "").Replace(":", "").Replace(";", "")
.Replace("،", "").Replace("-", "").Replace("\n", "").Trim();
if ((normalWord.Length > minimumWordLength && !normalWord.IsMemberOfBlackListWords(blackListWords)))
{
if (wordsDic.ContainsKey(normalWord))
{
var cnt = wordsDic[normalWord];
wordsDic[normalWord] = ++cnt;
}
else
{
wordsDic.Add(normalWord, 1);
}
}
}
}
List<KeyValuePair<String, Int32>> keywords = wordsDic.ToList();
return keywords;
}
public static List<KeyValuePair<String, Int32>> OrderByDescending(this List<KeyValuePair<String, Int32>> list, bool isBasedOnFrequency = true)
{
List<KeyValuePair<String, Int32>> result = null;
if (isBasedOnFrequency)
result = list.OrderByDescending(q => q.Value).ToList();
else
result = list.OrderByDescending(q => q.Key).ToList();
return result;
}
public static List<KeyValuePair<String, Int32>> TakeTop(this List<KeyValuePair<String, Int32>> list, Int32 n = 10)
{
List<KeyValuePair<String, Int32>> result = list.Take(n).ToList();
return result;
}
public static List<String> GetWords(this List<KeyValuePair<String, Int32>> list)
{
List<String> result = new List<String>();
foreach (var item in list)
{
result.Add(item.Key);
}
return result;
}
public static List<Int32> GetFrequency(this List<KeyValuePair<String, Int32>> list)
{
List<Int32> result = new List<Int32>();
foreach (var item in list)
{
result.Add(item.Value);
}
return result;
}
public static String AsString<T>(this List<T> list, string seprator = ", ")
{
String result = string.Empty;
foreach (var item in list)
{
result += string.Format("{0}{1}", item, seprator);
}
return result;
}
private static bool IsMemberOfBlackListWords(this String word, List<String> blackListWords)
{
bool result = false;
if (blackListWords == null) return false;
foreach (var w in blackListWords)
{
if (w.ToNormalString().Equals(word))
{
result = true;
break;
}
}
return result;
}
}

Categories