Split string into multiple strings with specific criteria - c#

I want to split a string into multiple strings based on the following criteria:
Each result must contain at least 2 words
The words in a result must be next to each other in the original string
For example:
"hello how are you" I want to split into:
"hello how are you"
"hello how are"
"hello how"
"how are"
"how are you"
"are you"
No result may appear more than once.
What I got so far is this:
string input = "hello how are you";
List<string> words = input.Split(' ').ToList();
List<string> inputs = new List<string>();
string temp = String.Empty;
for (int i = 0; i < words.Count; i++)
{
temp += words[i] + " ";
if (i > 0)
{
inputs.Add(temp);
}
}
It outputs the following:
hello how
hello how are
hello how are you
I want to get the others too and need a little help with that.

One approach would be to iterate over each word and get all its possible sequences.
Example:
string input = "hello how are you";
List<string> words = input.Split(' ').ToList();
List<string> inputs = new List<string>();
for (int i = 0; i < words.Count; i++)
{
var temp = words[i];
for(int j = i+1;j < words.Count;j++) {
temp += " " + words[j];
inputs.Add(temp);
}
}
//hello how
//hello how are
//hello how are you
//how are
//how are you
//are you

Here's the idea, using words and inputs from the question:
for (int i = 0; i < words.Count - 1; i++)
{
for (int j = i + 1; j < words.Count; j++)
{
// rebuild the string from words[i] through words[j] and add it to the list
inputs.Add(string.Join(" ", words.Skip(i).Take(j - i + 1)));
}
}
The idea is to consider each word except the last as a starting word (since nothing would follow the last word). For each starting word, consider each possible ending word (the first is the next word in the list and the last is the final word). Then, for each starting/ending pair, rebuild the string from all the words between them, inclusive, and add it to the list.
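The start/end pairing described above could be packaged as a small helper. This is only a sketch: the class name WordSequences and method name Build are made up for illustration, and it assumes words are separated by spaces.

```csharp
using System;
using System.Collections.Generic;

public static class WordSequences
{
    // Returns every run of two or more adjacent words, grouped by starting word.
    // (Hypothetical helper; the names are not from the original thread.)
    public static List<string> Build(string input)
    {
        string[] words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new List<string>();

        for (int i = 0; i < words.Length - 1; i++)      // starting word
        {
            for (int j = i + 1; j < words.Length; j++)  // ending word
            {
                // Join words[i] through words[j] inclusive, i.e. j - i + 1 words.
                result.Add(string.Join(" ", words, i, j - i + 1));
            }
        }
        return result;
    }

    public static void Main()
    {
        foreach (string s in Build("hello how are you"))
            Console.WriteLine(s);
    }
}
```

Because every (start, end) pair is visited exactly once, no sequence is produced twice as long as the input itself has no repeated words.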

Related

Ignoring whitespace when adding to char array

I'm doing a school exercise where the user inputs a string and the program must check if it's a palindrome. My only problem currently is that I can't get the loop to ignore whitespaces included in the input string.
Console.Write("Insert string: ");
string input = Console.ReadLine();
char[] charArray = new char[input.Length];
for (int i = 0; i < input.Length; i++)
{
if (Char.IsWhiteSpace(input, i))
{
continue;
}
else
{
charArray[i] += input[i];
}
}
string original = new string(charArray);
I've seemingly tried everything I know, but the whitespaces just get added to the array no matter what I try. Is there a simple solution for this?
[EDIT] Ok you can try the replace method which replaces what you provide with what you want instead (space into no space)
string str = "This is a test";
str = str.Replace(" ", "");
MessageBox.Show(str);
How about using the framework and going this route:
char[] charArray = input.Replace(" ", "").ToCharArray();
Maybe this could work ?
Console.Write("Insert string: ");
string input = Console.ReadLine();
char[] charArray = input.Where(character => !Char.IsWhiteSpace(character)).ToArray();
When a whitespace character is encountered, you skip that index, so that position in charArray is never written and keeps its default value ('\0'), leaving a gap in the resulting string. So, write to a new array/string instead:
var newString = string.Empty;
for(int i = 0; i < input.Length; i++)
{
if(!Char.IsWhiteSpace(input[i]))
{
newString += input[i];
}
}
or something more like your code:
Console.Write("Insert string: ");
string input = Console.ReadLine();
char[] charArray = new char[input.Length];
var newString = string.Empty;
for (int i = 0; i < input.Length; i++)
{
if (Char.IsWhiteSpace(input, i))
{
continue;
}
else
{
newString += input[i];
}
}
Console.WriteLine(newString);
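Since the exercise is a palindrome check, the stripped string can then be compared with its reverse. The following is a rough sketch, not the asker's code: the names PalindromeCheck, StripWhiteSpace and IsPalindrome are invented here, and ignoring letter case is an assumption the question does not state.

```csharp
using System;
using System.Linq;

public static class PalindromeCheck
{
    // Copies only the non-whitespace characters into a new string.
    public static string StripWhiteSpace(string input)
    {
        return new string(input.Where(c => !char.IsWhiteSpace(c)).ToArray());
    }

    // Assumption: case is ignored when comparing; the question does not say.
    public static bool IsPalindrome(string input)
    {
        string cleaned = StripWhiteSpace(input);
        char[] reversed = cleaned.ToCharArray();
        Array.Reverse(reversed);
        return string.Equals(cleaned, new string(reversed), StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.Write("Insert string: ");
        string input = Console.ReadLine() ?? string.Empty;
        Console.WriteLine(IsPalindrome(input) ? "Palindrome" : "Not a palindrome");
    }
}
```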

Split string and append to new string without the last element, til the stringlist is empty

I want to split a string into a list and then join all the items except the last into a new string. This should be repeated until the list is empty.
So let's say my split string list looks like this:
01
02
03
04
then I want my new list to look like this:
01.02.03
01.02
01
Splitting the string and building the first correct string is no problem. The problem is how to get the loop to "start over" without the last element and keep doing this until the list is empty. This is how far I've come:
var separator = ".";
var listOfStrings = "01.02.03.04";
var separatedStringList = listOfStrings.Split(separator).ToList();
string newString;
foreach(var item in separatedStringList)
{
if(separatedStringList.Last != item){
newString += item;
}
}
You can do the following to get the list of strings.
var listOfStrings = "01.02.03.04";
var separatedStringList = listOfStrings.Split('.').ToList();
var list =
Enumerable.Range(1, separatedStringList.Count - 1)
.Select(i => string.Join(".", separatedStringList.Take(i)))
.ToList();
foreach(var s in list) Console.WriteLine(s);
This will output
01
01.02
01.02.03
And if you want them in the opposite order just throw in a Reverse() before the ToList(), or change the argument passed to Take from i to separatedStringList.Count - i.
var listOfStrings = "01.02.03.04";
var arr = listOfStrings.Split(new char[] { '.' });
List<string> results = new List<string>();
for (int i = 1; i < arr.Length; i++)
{
var str = String.Join(".", arr.Reverse().Skip(i).Reverse());
results.Add(str);
}
Edit:
for (int i = 1; i < arr.Length; i++)
{
var str = String.Join(".", arr.Take(arr.Length - i));
results.Add(str);
}
You can use LINQ's Take method here:
for (int i = separatedStringList.Count - 1; i > 0; i--)
{
newString = String.Join(".", separatedStringList.Take(i).ToArray());
Console.WriteLine(newString);
}
I am not sure if you want separate strings or one spanning multiple lines (or maybe a list of strings?). My answer most probably won't solve your problem, but it might be a good starting point.
Don't see any problem:
var separator = '.';
var listOfStrings = "01.02.03.04";
var separatedStringList = listOfStrings.Split(separator).ToList();
var result = new List<string>();
//take "01"
var temp = separatedStringList[0];
//add "01" to list
result.Add(temp);
if(separatedStringList.Count > 1)
{
//loop through second till last
for(int i = 1; i < separatedStringList.Count - 1; i++)
{
//make temp "01.02" then "01.02.03"
temp += "." + separatedStringList[i];
//add temp to list
result.Add(temp);
}
}
For opposite direction:
var separator = '.';
var listOfStrings = "01.02.03.04";
var separatedStringList = listOfStrings.Split(separator).ToList();
var result = new string[separatedStringList.Count - 1];
var temp = separatedStringList[0];
result[result.Length - 1] = temp;
for (int i = 1; i < separatedStringList.Count - 1; i++)
{
temp += "." + separatedStringList[i];
result[result.Length - i - 1] = temp;
}
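A more literal take on "start over without the last element" is to remove the last item and re-join what is left until nothing more can be removed. This is a sketch under the assumption that a single remaining element ends the process, as in the expected output of the question; PrefixJoiner and Build are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixJoiner
{
    // Produces "01.02.03", "01.02", "01" for the input "01.02.03.04".
    public static List<string> Build(string input, char separator)
    {
        var parts = new List<string>(input.Split(separator));
        var result = new List<string>();

        // Drop the last element, then join the rest; stop at a single element.
        while (parts.Count > 1)
        {
            parts.RemoveAt(parts.Count - 1);
            result.Add(string.Join(separator.ToString(), parts));
        }
        return result;
    }

    public static void Main()
    {
        foreach (string s in Build("01.02.03.04", '.'))
            Console.WriteLine(s);
    }
}
```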

How to split string in 2 word string set in c#

I have a string in C#. I want to split that string into strings of two words each, like:
string str = "Split handles splitting upon string and character delimiters.";
Output should be:
1: "Split handles"
2: "splitting upon"
3: "string and"
4: "character delimiters."
What should be the best method to do this?
Here is what I have tried so far:
private List<string> Spilt(string text)
{
List<string> bunch = new List<string>();
int block = 15;
string[] words = text.Split(' ');
int length = words.Length;
int remain = 0;
while(remain < length)
{
bunch.Add(string.Join(" ", words.Take(block)));
remain += block;
}
return bunch;
}
The simplest approach would be to split at each space, and then "re-join" the pairs back, like this:
var pairs = str.Split(' ')
.Select((s,i) => new {s, i})
.GroupBy(n => n.i / 2)
.Select(g => string.Join(" ", g.Select(p=>p.s)))
.ToList();
Demo on ideone.
Try this
string str = "Split handles splitting upon string and character delimiters.";
var strnew = str.Split(' ');
var strRes = string.Empty;
int j = 1;
for (int i = 0; i < strnew.Length; i=i+2)
{
strRes += j.ToString()+": " + @"""" + strnew[i] + " " + strnew[i+1] + @"""" +"\n" ;
j++;
}
Console.Write(strRes);
// print strRes
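A plain loop can also do the pairing while coping with an odd number of words, which indexing i + 1 directly would not. This is a minimal sketch; WordPairs and Build are invented names, and leaving a trailing single word as its own entry is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;

public static class WordPairs
{
    // Groups words two at a time; a trailing single word becomes its own entry.
    public static List<string> Build(string text)
    {
        string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var pairs = new List<string>();

        for (int i = 0; i < words.Length; i += 2)
        {
            // Take two words, or one if only one is left.
            int count = Math.Min(2, words.Length - i);
            pairs.Add(string.Join(" ", words, i, count));
        }
        return pairs;
    }

    public static void Main()
    {
        var pairs = Build("Split handles splitting upon string and character delimiters.");
        for (int i = 0; i < pairs.Count; i++)
            Console.WriteLine((i + 1) + ": \"" + pairs[i] + "\"");
    }
}
```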

why this loop is running so slowly: c#

I'm working on a text mining project and am currently trying to implement information gain. I have data in which each line represents a document, so a newline character separates documents.
I have to generate a matrix in which the columns are all the distinct words across all documents and the rows are the documents. Each cell is either 1 (true) or 0 (false), depending on whether the word is present in that document.
There are 987 documents, 22860 words in total, and 3680 distinct words, so 3680 words are compared against 22860. This is slow, but I'm fine with it. The loop that takes more time is the one where I traverse the list of word objects to generate the matrix. See below.
Note: I have already removed all repeated words within each document.
class word_list
{
public string word;
public List<bool> doc= new List<bool>();
};//class ends
private void button2_Click(object sender, EventArgs e)
{
//Convert the string into an array of words
string[] w1 = richTextBox1.Text.Trim().Split('\n',' ').Select(x => x.Trim().ToLower()).Distinct().ToArray(); //all distinct words
string[] rich_doc = richTextBox1.Text.Trim().Split('\n'); //all documents array
List<word_list> words = new List<word_list>();
richTextBox2.Text+=("no. of distict words: " + w1.Length + ", no. of docs " + rich_doc.Length);
for (int i = 0; i < w1.Length; i++)
{
word_list temp = new word_list();
temp.word = w1[i]; //temp has the current distict word as class object
for(int j=0;j<rich_doc.Length;j++)//traverse all doc array
{
temp.doc.Add(false);
List<string> doc_word = Regex.Split(rich_doc[j], @"\b").Distinct(StringComparer.CurrentCultureIgnoreCase).ToList();
//richTextBox2.Text += ("\n no. of words in this doc: " + doc_word.Count);
//richTextBox2.SelectionStart = richTextBox1.Text.Length;
//richTextBox2.Focus();
int doc_count = doc_word.Count; // number of distinct words in this doc
for (int k = 0; k < doc_count; k++)//All words in a doc are compared
{
if(doc_word[k].ToLower() == w1[i].ToLower())
{
temp.doc[temp.doc.Count-1]=true;
break;
}
}
}
if ((words.Count - 1)>=0)
richTextBox2.Text += ("\n word(" + words.Count + "/" + w1.Length + "): " + words[words.Count - 1].word);
richTextBox2.SelectionStart = richTextBox1.Text.Length;
richTextBox2.Focus();
words.Add(temp);
}
//generate matrix
int t = rich_doc.Length; //no. of docs
int word_count = words.Count;
richTextBox1.Text = "Doc";
foreach (word_list w in words)
{
richTextBox1.Text += "\t" + w.word;
}
richTextBox1.Text += "\n";
//This loop is slow
for (int i = 0; i < t; i++) //traverse through number of docs
{
richTextBox1.Text += i + 1;
for (int h = 0; h < word_count; h++)//traverse through each distinct word in the list
{
if (words[h].doc[i])
richTextBox1.Text += "\t1";
else
richTextBox1.Text += "\t0";
}
richTextBox1.Text += "\n";
}
}//end of button 2
ta.speot.is is correct. Strings are supposed to be built with StringBuilder, using Append for instance, and only after the loop you assign the string to richTextBox1.Text. The code would look like this:
//generate matrix
StringBuilder sb = new StringBuilder();
int t = rich_doc.Length; //no. of docs
int word_count = words.Count;
sb.Append("Doc");
foreach (word_list w in words)
{
sb.Append("\t");
sb.Append(w.word);
}
sb.AppendLine();
//This loop is not slow anymore :)
for (int i = 0; i < t; i++) //traverse through number of docs
{
sb.Append(i + 1);
for (int h = 0; h < word_count; h++)//traverse through each distinct word in the list
{
if (words[h].doc[i])
sb.Append("\t1");
else
sb.Append("\t0");
}
sb.AppendLine();
}
richTextBox1.Text = sb.ToString();
EDIT: There are valuable comments below. Changing the RichTextBox.Text property is the most expensive operation here.
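The earlier word-by-document comparison could probably be sped up in the same spirit by building one HashSet per document and doing set lookups instead of scanning each document's words for every vocabulary word. The sketch below is not the asker's method: it tokenizes on whitespace rather than with Regex.Split, sorts the vocabulary so the column order is predictable, and uses the made-up name TermMatrix. It returns the matrix as one string, so the text box would be assigned only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class TermMatrix
{
    // Builds a tab-separated presence matrix: one row per line of text,
    // one column per distinct word (assumption: words are split on whitespace).
    public static string Build(string text)
    {
        char[] separators = { ' ', '\t', '\r' };

        // One case-insensitive word set per document, built a single time.
        List<HashSet<string>> docs = text
            .Split('\n')
            .Select(line => new HashSet<string>(
                line.Split(separators, StringSplitOptions.RemoveEmptyEntries),
                StringComparer.OrdinalIgnoreCase))
            .ToList();

        // Distinct lower-cased vocabulary, sorted for a stable column order.
        List<string> vocabulary = docs
            .SelectMany(d => d)
            .Select(w => w.ToLowerInvariant())
            .Distinct()
            .OrderBy(w => w, StringComparer.Ordinal)
            .ToList();

        var sb = new StringBuilder("Doc");
        foreach (string word in vocabulary)
            sb.Append('\t').Append(word);
        sb.Append('\n');

        for (int i = 0; i < docs.Count; i++)
        {
            sb.Append(i + 1);
            foreach (string word in vocabulary)
                sb.Append(docs[i].Contains(word) ? "\t1" : "\t0"); // hash lookup, no inner scan
            sb.Append('\n');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(Build("the cat sat\nthe dog ran"));
    }
}
```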

Reading a Text File Line by Line to Make a Map in XNA

I want to read a text file in order to build map.
For example I have this map:
0#0000000
0#0000000
0#0000000
000000000
000000000
000000000
000000000
I know I should use this:
StreamReader reader = new StreamReader(Application.StartupPath+@"/TestMap.MMAP");
string line = reader.ReadToEnd();
reader.Close();
Now, for example, I want to read the "#" character on line 2. How can I do this?
Please help me.
Solved:
Thank you (@L.B and @user861114), at last my problem was solved:
string[,] item = new string[9, 7];
string[] line = File.ReadAllLines(Application.StartupPath + @"/TestMap.MMAP");
for (int j = 0; j < 7; j++)
{
for (int i = 0; i < 9; i++)
{
item[i, j] = line[j].Substring(i, 1);
Console.WriteLine(i + " " + j + "=" + item[i, j]);
}
}
string[] lines = File.ReadAllLines(path); // path is the location of your map file
then you can access
char ch = lines[1][1]; //second line's second char
I think it's fairly easy:
string[] strs = myString.Split('\n'); // split into an array of lines at each newline
char[] chars = strs[1].ToCharArray(); // chars[1] is the second char, "#"
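If the map size should not be hard-coded, the grid dimensions can be taken from the lines themselves. This is a sketch with invented names (MapGrid, Parse); it assumes every row has the same length and uses an in-memory array in place of File.ReadAllLines so it runs without a map file.

```csharp
using System;

public static class MapGrid
{
    // Converts map rows into a [row, column] grid of characters.
    public static char[,] Parse(string[] lines)
    {
        int rows = lines.Length;
        int cols = rows == 0 ? 0 : lines[0].Length;
        var grid = new char[rows, cols];

        for (int r = 0; r < rows; r++)
        {
            // Assumption: the map is rectangular; reject ragged rows early.
            if (lines[r].Length != cols)
                throw new FormatException("Row " + (r + 1) + " has a different length.");

            for (int c = 0; c < cols; c++)
                grid[r, c] = lines[r][c];
        }
        return grid;
    }

    public static void Main()
    {
        // In the game, this array would come from File.ReadAllLines on the map file.
        string[] lines = { "0#0000000", "0#0000000", "000000000" };
        char[,] grid = Parse(lines);
        Console.WriteLine(grid[1, 1]); // second line, second character
    }
}
```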
