Split large string into smaller chunks in c# - c#

I have a large string separated by newline character. This string contains 100 lines. I want to split these line into small chunks say chunk of 20 also based on newline character.
Let's say the string variable is like this,
Line1 This is line2 Line3 is here I am Line4
Now I want to split this large string variable into small chunks of 2. The result should be 2 strings as,
Line1 This is line2
Line3 is here I am Line4
Using Split function, I am not getting the expected results. Please help me in achieving this.
Thanks in advance,
Vijay

The simple approach (Split on Environment.NewLine, then loop and append):
public static List<string> GetStringSegments(string originalString, int linesPerSegment)
{
List<string> segments = new List<string>();
string[] allLines = originalString.Split(new string[] {Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries);
StringBuilder sb = new StringBuilder();
int linesProcessed = 0;
for (int i = 0; i < allLines.Length; i++)
{
sb.AppendLine(allLines[i]);
linesProcessed++;
if (linesProcessed == linesPerSegment
|| i == allLines.Length-1)
{
segments.Add(sb.ToString());
sb.Clear();
inesProcessed = 0;
}
}
return segments;
}
The above approach is slightly inefficient since it requires splitting the string first into individual lines, which creates unnecessary strings. A string of 1000 lines will create an array of 1000 strings. We can improved this if we just scan the string and search for \n:
public static List<string> GetStringSegments(string original, int linesPerSegment)
{
List<string> segments = new List<string>();
int startIndex = 0;
int newLinesEncountered = 0;
for (int i = 0; i < original.Length; i++)
{
if (original[i] == '\n')
{
newLinesEncountered++;
}
if (newLinesEncountered == linesPerSegment
|| i == original.Length - 1)
{
segments.Add(original.Substring(startIndex, (i - startIndex + 1)));
startIndex = i + 1;
newLinesEncountered = 0;
}
}
return segments;
}

You can use something like the batch operator from http://www.make-awesome.com/2010/08/batch-or-partition-a-collection-with-linq
string s = "[YOUR DATA]";
var lines = s.Split(new[]{Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries);
foreach(var batch in lines.Batch(20))
{
foreach(batchLine in batch)
{
Console.Writeline(batchLine);
}
}
static class LinqEx
{
// from http://www.make-awesome.com/2010/08/batch-or-partition-a-collection-with-linq
public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> collection,
int batchSize)
{
List<T> nextbatch = new List<T>(batchSize);
foreach (T item in collection)
{
nextbatch.Add(item);
if (nextbatch.Count == batchSize)
{
yield return nextbatch;
nextbatch = new List<T>(batchSize);
}
}
if (nextbatch.Count > 0)
yield return nextbatch;
}
}

As several people mentioned, using string.Split will split the whole string into memory, which might be an allocation-heavy operation. This is why we have the TextReader class and its descendants, which should provide better memory performance, and might also be clearer, logically:
using (var reader = new StringReader(myString))
{
do
{
StringBuilder newString = null;
StringWriter newStringWriter = null;
if (lineCounter % 20 == 0)
{
newString = new StringBuilder();
newStringWriter = new StringWriter(newString);
newStringCollection.Add(newString);
}
string line = reader.ReadLine();
if (!string.isNullOrEmpty(line))
{
newStringWriter.WriteLine(line);
lineCounter++;
}
}
while (line != null)
}
We're using the StringReader to read our big string, one line at a time. And the corresponding StringWriter writes those lines to the new string, one line a time. After every 20 lines, we start a new StringBuilder (and the appropriate StringWriter wrapper).

split the strings by newline.
Then merge/fetch the number of strings together while using the strings.

string s = "Line1\nThis is line2 \nLine3 is here\nI am Line4";
string [] str = s.split('\n');
List<String> str1 = new List<String>();
for(int i=0; i<str.Length; i+=2)
{
string ss = str[i];
if(i+1 <str.Length)
ss += '\n' + str[i+1];
str1.Add(ss);
}
str = str1.ToArray();
If condition has been checked inside loop because may be the length of str is odd

var strAray = myLongString.Split('\n').ToList();
var skip=0;
var take=20;
var chunk = strAray.Skip(skip).Take(take).ToList();
While(chunk.Count >0)
{
foreach(var line in chunk)
{
// use line string
}
skip++;
chunk = strAray.Skip(skip).Take(take).ToList()
}

Related

Split one string and put it in two arrays

I'd like to split several strings of a text file into two strings each (example: car;driver). I do not know how to put the first word in array1 and the second word in array2. So I tried with an if query for a semicolon to put every single letter of word1 in array1 and the same with the second word to put them back together to the words later.
But I think it's too complicated what I've done and I am stuck now, lol.
Here I show a piece of my code:
private void BtnShow_Click(object sender, EventArgs e)
{
LibPasswords.Items.Clear();
string path = "passwords.txt";
int counter = 0;
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
using (StreamReader reader = new StreamReader(fs))
{
while (reader.ReadLine() != null)
{
counter++;
}
//for (int i = 0; i < counter; i++)
//{
// var Website = reader.ReadLine().Split(';').Select(x => new String[] { x });
// var Passwort = reader.ReadLine().Split(';').Select(y => new String[] { y });
// LibPasswords.Items.Add(String.Format(table, Website, Passwort));
//}
string[] firstWord = new string[counter];
string[] lastWord = new string[counter];
int i = 0;
int index = 0;
while (reader.Peek() >= 0)
{
string ch = reader.Read().ToString();
if (ch != ";")
{
firstWord[i] = ch;
i++;
}
else
{
index = 1;
}
while (reader.Peek() >= 0)
{
??????????????????????????????????
}
}
}
}
Sorry for my English, it's not my mother tongue.
As you don't know in advance how many lines there are, it is more convenient to use a List<string> instead of a string[]. A List will automatically increase its capacity as needed.
You can use the string.Split method to split the string at the ';' into an array. If the resulting array has the correct number of parts, you can add those parts to the Lists.
List<string> firstWord = new List<string>();
List<string> lastWord = new List<string>();
string fileName = #"C:\temp\SO61715409.txt";
foreach (string line in File.ReadLines(fileName))
{
string[] parts = line.Split(new char[] { ';' });
if (parts.Length == 2)
{
firstWord.Add(parts[0]);
lastWord.Add(parts[1]);
}
}

How to reverse an array of strings without changing the position of special characters in C#

I'm working on reversing a sentence. I'm able to do it. But I'm not sure, how to reverse the word without changing the special characters positions. I'm using regex but as soon as it finds the special characters it's stopping the reversal of the word.
Following is the code:
Console.WriteLine("Enter:");
string w = Console.ReadLine();
string rw = String.Empty;
String[] arr = w.Split(' ');
var regexItem = new Regex("^[a-zA-Z0-9]*$");
StringBuilder appendString = new StringBuilder();
for (int i = 0; i < arr.Length; i++)
{
char[] chararray = arr[i].ToCharArray();
for (int j = chararray.Length - 1; j >= 0; j--)
{
if (regexItem.IsMatch(rw))
{
rw = appendString.Append(chararray[j]).ToString();
}
}
sb.Append(' ');
}
Console.WriteLine(rw);
Console.ReadLine();
Example : Input
Marshall! Hello.
Expected output
llahsram! olleh.
A basic solution with regex and LINQ. Try it online.
public static void Main()
{
Console.WriteLine("Marshall! Hello.");
Console.WriteLine(Reverse("Marshall! Hello."));
}
public static string Reverse(string source)
{
// we split by groups to keep delimiters
var parts = Regex.Split(source, #"([^a-zA-Z0-9])");
// if we got a group of valid characters
var results = parts.Select(x => x.All(char.IsLetterOrDigit)
// we reverse it
? new string(x.Reverse().ToArray())
// or we keep the delimiters as it
: x);
// then we concat all of them
return string.Concat(results);
}
The same solution without LINQ. Try it online.
public static void Main()
{
Console.WriteLine("Marshall! Hello.");
Console.WriteLine(Reverse("Marshall! Hello."));
}
public static bool IsLettersOrDigits(string s)
{
foreach (var c in s)
{
if (!char.IsLetterOrDigit(c))
{
return false;
}
}
return true;
}
public static string Reverse(char[] s)
{
Array.Reverse(s);
return new string(s);
}
public static string Reverse(string source)
{
var parts = Regex.Split(source, #"([^a-zA-Z0-9])");
var results = new List<string>();
foreach(var x in parts)
{
results.Add(IsLettersOrDigits(x)
? Reverse(x.ToCharArray())
: x);
}
return string.Concat(results);
}
This is a solution without LINQ. I wasn't sure about what are considered special characters.
string sentence = "Marshall! Hello.";
List<string> words = sentence.Split(' ').ToList();
List<string> reversedWords = new List<string>();
foreach (string word in words)
{
char[] arr = new char[word.Length];
for( int i=0; i<word.Length; i++)
{
if(!Char.IsLetterOrDigit((word[i])))
{
for ( int x=0; x< i; x++)
{
arr[x] = arr[x + 1];
}
arr[i] = word[i];
}
else
{
arr[word.Length - 1 - i] = word[i];
}
}
reversedWords.Add(new string(arr));
}
string reversedSentence = string.Join(" ", reversedWords);
Console.WriteLine(reversedSentence);
And this is the output:
Updated Output = llahsraM! olleH.
Here is a non-regex version that does what you want:
var sentence = "Hello, john!";
var parts = sentence.Split(' ');
var reversed = new StringBuilder();
var charPositions = sentence.Select((c, idx) => new { Char = c, Index = idx })
.Where(_ => !char.IsLetterOrDigit(_.Char));
for (int i = 0; i < parts.Length; i++)
{
var chars = parts[i].ToCharArray();
for (int j = chars.Length - 1; j >= 0; j--)
{
if (char.IsLetterOrDigit(chars[j]))
{
reversed.Append(chars[j]);
}
}
}
foreach (var ch in charPositions)
{
reversed.Insert(ch.Index, ch.Char);
}
// olleH, nhoj!
Console.WriteLine(reversed.ToString());
Basically the trick is to remember the position of special (i.e. non letter or digit) characters and insert them at the end to those positions.
This solution is without LINQ and Regex. It may not be an efficient answer but working properly for small string values.
// This will reverse the string and special characters will just stay there.
public string ReverseString(string rString)
{
StringBuilder ss = new StringBuilder(rString);
int y = 0;
// The idea is to swap values. Like swapping first value with last one. It will keep swapping unless it reaches at the middle of the string where no swapping will be needed.
// This first loop is to detect first values.
for(int i=rString.Length-1;i>=0;i--)
{
// This condition is to check if the values is String or not. If it is not string then it is considered as special character which will just stay there at same old position.
if(Char.IsLetter(Convert.ToChar(rString.Substring(i,1))))
{
// This is second loop which is starting from end to swap values from end with first.
for (int k = y; k < rString.Length; k++)
{
// Again checking last values if values are string or not.
if (Char.IsLetter(Convert.ToChar(rString.Substring(k, 1))))
{
// This is swapping. So st1 is First value in that string
// st2 is the last item in that string
char st1 = Convert.ToChar(rString.Substring(k, 1));
char st2 = Convert.ToChar(rString.Substring(i, 1));
//This is swapping. So last item will go to first position and first item will go to last position, To make sure string is reversed.
// Remember when the string value is Special Character, swapping will move forward without swapping.
ss[rString.IndexOf(rString.Substring(i, 1))] = st1;
ss[rString.IndexOf(rString.Substring(k, 1))] = st2;
y++;
// When the swapping is done for first 2 items. The loop will stop to change the values.
break;
}
else
{
// This is just increment if value was Special character.
y++;
}
}
}
}
return ss.ToString();
}
Thanks!

How to split a string on the nth occurrence?

What I want to do is to split on the nth occurrence of a string (in this case it's "\t"). This is the code I'm currently using and it splits on every occurrence of "\t".
string[] items = input.Split(new char[] {'\t'}, StringSplitOptions.RemoveEmptyEntries);
If input = "one\ttwo\tthree\tfour", my code returns the array of:
one
two
three
four
But let's say I want to split it on every "\t" after the second "\t". So, it should return:
one two
three
four
There is nothing built in.
You can use the existing Split, use Take and Skip with string.Join to rebuild the parts that you originally had.
string[] items = input.Split(new char[] {'\t'},
StringSplitOptions.RemoveEmptyEntries);
string firstPart = string.Join("\t", items.Take(nthOccurrence));
string secondPart = string.Join("\t", items.Skip(nthOccurrence))
string[] everythingSplitAfterNthOccurence = items.Skip(nthOccurrence).ToArray();
An alternative is to iterate over all the characters in the string, find the index of the nth occurrence and substring before and after it (or find the next index after the nth, substring on that etc... etc... etc...).
[EDIT] After re-reading the edited OP, I realise this doesn't do what is now asked. This will split on every nth target; the OP wants to split on every target AFTER the nth one.
I'll leave this here for posterity anyway.
If you were using the MoreLinq extensions you could take advantage of its Batch method.
Your code would then look like this:
string text = "1\t2\t3\t4\t5\t6\t7\t8\t9\t10\t11\t12\t13\t14\t15\t16\t17";
var splits = text.Split('\t').Batch(5);
foreach (var split in splits)
Console.WriteLine(string.Join("", split));
I'd probably just use Oded's implementation, but I thought I'd post this for an alternative approach.
The implementation of Batch() looks like this:
public static class EnumerableExt
{
public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(this IEnumerable<TSource> source, int size)
{
TSource[] bucket = null;
var count = 0;
foreach (var item in source)
{
if (bucket == null)
bucket = new TSource[size];
bucket[count++] = item;
if (count != size)
continue;
yield return bucket;
bucket = null;
count = 0;
}
if (bucket != null && count > 0)
yield return bucket.Take(count);
}
}
It is likely that you will have to split and re-combine. Something like
int tabIndexToRemove = 3;
string str = "My\tstring\twith\tloads\tof\ttabs";
string[] strArr = str.Split('\t');
int numOfTabs = strArr.Length - 1;
if (tabIndexToRemove > numOfTabs)
throw new IndexOutOfRangeException();
str = String.Empty;
for (int i = 0; i < strArr.Length; i++)
str += i == tabIndexToRemove - 1 ?
strArr[i] : String.Format("{0}\t", strArr[i]);
Result:
My string withloads of tabs
I hope this helps.
// Return a substring of str upto but not including
// the nth occurence of substr
function getNth(str, substr, n) {
var idx;
var i = 0;
var newstr = '';
do {
idx = s.indexOf(c);
newstr += str.substring(0, idx);
str = str.substring(idx+1);
} while (++i < n && (newstr += substr))
return newstr;
}

Split a long string to a customized string

Hello I have a long string. I want to split it as some kind of format that has many return carrages.
Each line has 5 short words.
Ex.
string input="'250.0','250.00','250.01','250.02','250.03','250.1','250.10','250.11','250.12','250.13','250.2','250.20','250.21','250.22','250.23','250.3','250.30','250.31','250.32','250.33','250.4','250.40','250.41','250.42','250.43','250.5','250.50','250.51','250.52','250.53','250.6','250.60','250.61','250.62','250.63','250.7','250.70','250.71','250.72','250.73','250.8','250.80','250.81','250.82','250.83','250.9','250.90','250.91','250.92','250.93','357.2','357.20','362.01','362.02','362.03','362.04','362.05','362.06','362.07','366.41','648.0','648.00','648.01','648.02','648.03','648.04'";
It has 66 short words.
string output = "'250.0','250.00','250.01','250.02','250.03',
'250.1','250.10','250.11','250.12','250.13',
'250.2','250.20','250.21','250.22','250.23',
'250.3','250.30','250.31','250.32','250.33',
'250.4','250.40','250.41','250.42','250.43',
'250.5','250.50','250.51','250.52','250.53',
'250.6','250.60','250.61','250.62','250.63',
'250.7','250.70','250.71','250.72','250.73',
'250.8','250.80','250.81','250.82','250.83',
'250.9','250.90','250.91','250.92','250.93',
'357.2','357.20','362.01','362.02','362.03',
'362.04','362.05','362.06','362.07','366.41',
'648.0','648.00','648.01','648.02','648.03',
'648.04'";
I thought that I have to count char ',' in the string first such as in the example. But it could be kind of clumsy.
Thanks for advice.
If i've understood you correctly you want to
split those words by comma
group the result into lines where each line contains 5 words
build a string with Environment.NewLine as separator
string input = "'250.0','250.00','250.01','250.02','250.03','250.1','250.10','250.11','250.12','250.13','250.2','250.20','250.21','250.22','250.23','250.3','250.30','250.31','250.32','250.33','250.4','250.40','250.41','250.42','250.43','250.5','250.50','250.51','250.52','250.53','250.6','250.60','250.61','250.62','250.63','250.7','250.70','250.71','250.72','250.73','250.8','250.80','250.81','250.82','250.83','250.9','250.90','250.91','250.92','250.93','357.2','357.20','362.01','362.02','362.03','362.04','362.05','362.06','362.07','366.41','648.0','648.00','648.01','648.02','648.03','648.04'";
int groupCount = 5;
var linesGroups = input.Split(',')
.Select((s, index) => new { str = s, Position = index / groupCount, Index = index })
.GroupBy(x => x.Position);
StringBuilder outputBuilder = new StringBuilder();
foreach (var grp in linesGroups)
{
outputBuilder.AppendLine(String.Join(",", grp.Select(x => x.str)));
}
String output = outputBuilder.ToString();
Edit: The result is:
'250.0','250.00','250.01','250.02','250.03'
'250.1','250.10','250.11','250.12','250.13'
'250.2','250.20','250.21','250.22','250.23'
'250.3','250.30','250.31','250.32','250.33'
'250.4','250.40','250.41','250.42','250.43'
'250.5','250.50','250.51','250.52','250.53'
'250.6','250.60','250.61','250.62','250.63'
'250.7','250.70','250.71','250.72','250.73'
'250.8','250.80','250.81','250.82','250.83'
'250.9','250.90','250.91','250.92','250.93'
'357.2','357.20','362.01','362.02','362.03'
'362.04','362.05','362.06','362.07','366.41'
'648.0','648.00','648.01','648.02','648.03'
'648.04'
If you want to append every line with a comma(like in your example):
foreach (var grp in linesGroups)
{
outputBuilder.AppendLine(String.Join(",", grp.Select(x => x.str)) + ",");
}
// remove last comma + Environment.NewLine
outputBuilder.Length -= ( 1 + Environment.NewLine.Length );
How about this solution:
private static IEnumerable<string> SplitLongString(string input, char separator, int groupSize)
{
int indexCurrent = 0;
int indexLastOccurence = 0;
int separatorCounter = 0;
foreach (var character in input)
{
indexCurrent++;
if (character == separator)
{
separatorCounter++;
if (separatorCounter % groupSize == 0)
{
yield return input.Substring(indexLastOccurence, indexCurrent - indexLastOccurence);
indexLastOccurence = indexCurrent;
}
}
}
if (indexCurrent != indexLastOccurence)
{
yield return input.Substring(indexLastOccurence, indexCurrent - indexLastOccurence);
}
}
And you would call it:
var result = SplitLongString(input, ',', 5);
foreach (var row in result)
{
Console.WriteLine(row);
}
The simplest way would be to follow the approach from #Tim's deleted (edit: not deleted any more) answer:
Split the string into parts by comma (using string.Split)
Rearrange the obtained parts in any way you need: for example, packing by 5 in a line.
Something like that (not tested):
Console.WriteLine("string output =");
var parts = sourceString.Split(',');
int i = 0;
for (; i < parts.Length; i++)
{
if (i % 5 == 0)
Console.Write(' "');
Console.Write(parts[i]);
Console.Write(',');
if (i % 5 == 4 && i != parts.Length - 1)
Console.WriteLine('" +');
}
Console.WriteLine('";');
var input="'250.0','250.00','250.01','250.02','250.03','250.1','250.10','250.11','250.12','250.13','250.2','250.20','250.21','250.22','250.23','250.3','250.30','250.31','250.32','250.33','250.4','250.40','250.41','250.42','250.43','250.5','250.50','250.51','250.52','250.53','250.6','250.60','250.61','250.62','250.63','250.7','250.70','250.71','250.72','250.73','250.8','250.80','250.81','250.82','250.83','250.9','250.90','250.91','250.92','250.93','357.2','357.20','362.01','362.02','362.03','362.04','362.05','362.06','362.07','366.41','648.0','648.00','648.01','648.02','648.03','648.04'";
var wordsArray = input.Split(',');
var sbOutput = new StringBuilder();
for (int i = 1; i < wordsArray.Length +1; i++)
{
sbOutput.AppendFormat("{0},", wordsArray[i-1]);
if(i % 5 == 0)
sbOutput.AppendLine();
}
var output = sbOutput.ToString();
Do something like this:
string[] words = input.Split(',');
int wordsInString = words.Length;

Counting number of words in C#

I'm trying to count the number of words from a rich textbox in C# the code that I have below only works if it is a single line. How do I do this without relying on regex or any other special functions.
string whole_text = richTextBox1.Text;
string trimmed_text = whole_text.Trim();
string[] split_text = trimmed_text.Split(' ');
int space_count = 0;
string new_text = "";
foreach(string av in split_text)
{
if (av == "")
{
space_count++;
}
else
{
new_text = new_text + av + ",";
}
}
new_text = new_text.TrimEnd(',');
split_text = new_text.Split(',');
MessageBox.Show(split_text.Length.ToString ());
char[] delimiters = new char[] {' ', '\r', '\n' };
whole_text.Split(delimiters,StringSplitOptions.RemoveEmptyEntries).Length;
Since you are only interested in word count, and you don't care about individual words, String.Split could be avoided. String.Split is handy, but it unnecessarily generates a (potentially) large number of String objects, which in turn creates an unnecessary burden on the garbage collector. For each word in your text, a new String object needs to be instantiated, and then soon collected since you are not using it.
For a homework assignment, this may not matter, but if your text box contents change often and you do this calculation inside an event handler, it may be wiser to simply iterate through characters manually. If you really want to use String.Split, then go for a simpler version like Yonix recommended.
Otherwise, use an algorithm similar to this:
int wordCount = 0, index = 0;
// skip whitespace until first word
while (index < text.Length && char.IsWhiteSpace(text[index]))
index++;
while (index < text.Length)
{
// check if current char is part of a word
while (index < text.Length && !char.IsWhiteSpace(text[index]))
index++;
wordCount++;
// skip whitespace until next word
while (index < text.Length && char.IsWhiteSpace(text[index]))
index++;
}
This code should work better with cases where you have multiple spaces between each word, you can test the code online.
There are some better ways to do this, but in keeping with what you've got, try the following:
string whole_text = richTextBox1.Text;
string trimmed_text = whole_text.Trim();
// new line split here
string[] lines = trimmed_text.Split(Environment.NewLine.ToCharArray());
// don't need this here now...
//string[] split_text = trimmed_text.Split(' ');
int space_count = 0;
string new_text = "";
Now make two foreach loops. One for each line and one for counting words within the lines.
foreach (string line in lines)
{
// Modify the inner foreach to do the split on ' ' here
// instead of split_text
foreach (string av in line.Split(' '))
{
if (av == "")
{
space_count++;
}
else
{
new_text = new_text + av + ",";
}
}
}
new_text = new_text.TrimEnd(',');
// use lines here instead of split_text
lines = new_text.Split(',');
MessageBox.Show(lines.Length.ToString());
}
This was a phone screening interview question that I just took (by a large company located in CA who sells all kinds of devices that starts with a letter "i"), and I think I franked... after I got offline, I wrote this. I wish I were able to do it during interview..
static void Main(string[] args)
{
Debug.Assert(CountWords("Hello world") == 2);
Debug.Assert(CountWords(" Hello world") == 2);
Debug.Assert(CountWords("Hello world ") == 2);
Debug.Assert(CountWords("Hello world") == 2);
}
public static int CountWords(string test)
{
int count = 0;
bool wasInWord = false;
bool inWord = false;
for (int i = 0; i < test.Length; i++)
{
if (inWord)
{
wasInWord = true;
}
if (Char.IsWhiteSpace(test[i]))
{
if (wasInWord)
{
count++;
wasInWord = false;
}
inWord = false;
}
else
{
inWord = true;
}
}
// Check to see if we got out with seeing a word
if (wasInWord)
{
count++;
}
return count;
}
Have a look at the Lines property mentioned in #Jay Riggs comment, along with this overload of String.Split to make the code much simpler. Then the simplest approach would be to loop over each line in the Lines property, call String.Split on it, and add the length of the array it returns to a running count.
EDIT: Also, is there any reason you're using a RichTextBox instead of a TextBox with Multiline set to True?
I use an extension method for grabbing word count in a string. Do note, however, that double spaces will mess the count up.
public static int CountWords(this string line)
{
var wordCount = 0;
for (var i = 0; i < line.Length; i++)
if (line[i] == ' ' || i == line.Length - 1)
wordCount++;
return wordCount;
}
}
Your approach is on the right path. I would do something like, passing the text property of richTextBox1 into the method. This however won't be accurate if your rich textbox is formatting HTML, so you'll need to strip out any HTML tags prior to running the word count:
public static int CountWords(string s)
{
int c = 0;
for (int i = 1; i < s.Length; i++)
{
if (char.IsWhiteSpace(s[i - 1]) == true)
{
if (char.IsLetterOrDigit(s[i]) == true ||
char.IsPunctuation(s[i]))
{
c++;
}
}
}
if (s.Length > 2)
{
c++;
}
return c;
}
We used an adapted form of Yoshi's answer, where we fixed the bug where it would not count the last word in a string if there was no white-space after it:
public static int CountWords(string test)
{
int count = 0;
bool inWord = false;
foreach (char t in test)
{
if (char.IsWhiteSpace(t))
{
inWord = false;
}
else
{
if (!inWord) count++;
inWord = true;
}
}
return count;
}
using System.Collections;
using System;
class Program{
public static void Main(string[] args){
//Enter the value of n
int n = Convert.ToInt32(Console.ReadLine());
string[] s = new string[n];
ArrayList arr = new ArrayList();
//enter the elements
for(int i=0;i<n;i++){
s[i] = Console.ReadLine();
}
string str = "";
//Filter out duplicate values and store in arr
foreach(string i in s){
if(str.Contains(i)){
}else{
arr.Add(i);
}
str += i;
}
//Count the string with arr and s variables
foreach(string i in arr){
int count = 0;
foreach(string j in s){
if(i.Equals(j)){
count++;
}
}
Console.WriteLine(i+" - "+count);
}
}
}
int wordCount = 0;
bool previousLetterWasWhiteSpace = false;
foreach (char letter in keyword)
{
if (char.IsWhiteSpace(letter))
{
previousLetterWasWhiteSpace = true;
}
else
{
if (previousLetterWasWhiteSpace)
{
previousLetterWasWhiteSpace = false;
wordCount++;
}
}
}
public static int WordCount(string str)
{
int num=0;
bool wasInaWord=true;;
if (string.IsNullOrEmpty(str))
{
return num;
}
for (int i=0;i< str.Length;i++)
{
if (i!=0)
{
if (str[i]==' ' && str[i-1]!=' ')
{
num++;
wasInaWord=false;
}
}
if (str[i]!=' ')
{
wasInaWord=true;
}
}
if (wasInaWord)
{
num++;
}
return num;
}
class Program
{
static void Main(string[] args)
{
string str;
int i, wrd, l;
StringBuilder sb = new StringBuilder();
Console.Write("\n\nCount the total number of words in a string
:\n");
Console.Write("---------------------------------------------------
---\n");
Console.Write("Input the string : ");
str = Console.ReadLine();
l = 0;
wrd = 1;
foreach (var a in str)
{
sb.Append(a);
if (str[l] == ' ' || str[l] == '\n' || str[l] == '\t')
{
wrd++;
}
l++;
}
Console.WriteLine(sb.Replace(' ', '\n'));
Console.Write("Total number of words in the string is : {0}\n",
wrd);
Console.ReadLine();
}
This should work
input.Split(' ').ToList().Count;
This can show you the number of words in a line
string line = Console.ReadLine();
string[] word = line.Split(' ');
Console.WriteLine("Words " + word.Length);
You can also do it in this way!! Add this method to your extension methods.
public static int WordsCount(this string str)
{
return Regex.Matches(str, #"((\w+(\s?)))").Count;
}
And call it like this.
string someString = "Let me show how I do it!";
int wc = someString.WordsCount();

Categories