How to ignore reading contents of /*comment*/ while reading a file - c#

Below is my code:
string ckeywords = File.ReadAllText("E:\\ckeywords.csv");
string[] clines = File.ReadAllLines("E:\\cprogram\\cpro\\bubblesort.c");
string letters="";
foreach(string line in clines)
{
char[] c = line.ToCharArray();
foreach(char i in c)
{
if (i == '/' || i == '"')
{
break;
}
else
{
letters = letters + i;
}
}
}
letters = Regex.Replace(letters, @"[^a-zA-Z ]+", " ");
List<string> listofc = letters.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
List<string> listofcsv = ckeywords.Split(new char[] { ',', '\t', '\n', ' ' }, StringSplitOptions.RemoveEmptyEntries).Select(p => p.Trim()).ToList();
List<string> Commonlist = listofcsv.Intersect(listofc).ToList();
With this if condition I can skip the contents of single-line comments and anything between double quotes ("").
I also need to skip the contents of multi-line comments. Which condition should I use?
Suppose my .c file contains the comment below. With the code above, I don't know how to start at /*, iterate to */ and ignore everything in between.
/*printf("Sorted list in ascending order:\n");
for ( c = 0 ; c < n ; c++ )
printf("%d\n", array[c]);*/

I solved my problem: I can now skip the contents of /* */ in a simple way, without using regular expressions.
Here is my code:
string[] clines = File.ReadAllLines("E:\\cprogram\\cpro\\bubblesort.c");
List<string> list = new List<string>();
int startIndexofcomm, endIndexofcomm;
for (int i = 0; i < clines.Length ; i++ )
{
if (clines[i].Contains(@"/*"))
{
startIndexofcomm = clines[i].IndexOf(@"/*");
list.Add(clines[i].Substring(0, startIndexofcomm));
while(!(clines[i].Contains(@"*/")))
{
i++;
}
endIndexofcomm = clines[i].IndexOf(@"*/");
list.Add(clines[i].Substring(endIndexofcomm+2));
continue;
}
list.Add(clines[i]);
}

Here is code that naively does the following:
It strips out any multi-line comments starting with /* and ending with */, even if there are newlines between the two.
It strips out any single-line comments starting with // and ending at the end of the line
It does not strip out any comments like the above if they're within a string that starts with " and ends with a ".
LINQPad code:
void Main()
{
var code = File.ReadAllText(@"d:\temp\test.c");
code.Dump("input");
bool inString = false;
bool inSingleLineComment = false;
bool inMultiLineComment = false;
var output = new StringBuilder();
int index = 0;
while (index < code.Length)
{
// First deal with single line comments: // xyz
if (inSingleLineComment)
{
if (code[index] == '\n' || code[index] == '\r')
{
inSingleLineComment = false;
output.Append(code[index]);
index++;
}
else
index++;
continue;
}
// Then multi-line comments: /* ... */
if (inMultiLineComment)
{
if (code[index] == '*' && index + 1 < code.Length && code[index + 1] == '/')
{
inMultiLineComment = false;
index += 2;
}
else
index++;
continue;
}
// Then deal with strings
if (inString)
{
output.Append(code[index]);
if (code[index] == '"')
inString = false;
index++;
continue;
}
// If we get here we're not in a string or in a comment
if (code[index] == '"')
{
// We found the start of a string
output.Append(code[index]);
inString = true;
index++;
}
else if (code[index] == '/' && index + 1 < code.Length && code[index + 1] == '/')
{
// We found the start of a single line comment; skip both characters of //
inSingleLineComment = true;
index += 2;
}
else if (code[index] == '/' && index + 1 < code.Length && code[index + 1] == '*')
{
// We found the start of a multi line comment; skip both characters of /*
// so that the * is not mistaken for the start of */ in input like /*/
inMultiLineComment = true;
index += 2;
}
else
{
// Just another character
output.Append(code[index]);
index++;
}
}
output.ToString().Dump("output");
}
Sample input:
This should be included // This should not
This should also be included /* while this
should not */ but this should again be included.
Any comments in " /* strings */ " should be included as well.
This goes for "// single line comments" as well.
Sample output (note that there are some spaces at the end of some of the lines below that aren't visible):
This should be included
This should also be included but this should again be included.
Any comments in " /* strings */ " should be included as well.
This goes for "// single line comments" as well.
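The state machine above treats every " as a string delimiter, so a C string containing an escaped quote (\") would end the string too early. Below is a minimal sketch of one way to extend the same idea; the class and method names (CommentStripper, Strip) are made up for illustration, and it assumes the input is C-like source where ' starts a character literal. It is not a full C lexer (no trigraphs, no line continuations).

```csharp
using System;
using System.Text;

public static class CommentStripper
{
    // Removes // and /* */ comments from C-like source, leaving string and
    // character literals (including escaped quotes such as \") untouched.
    public static string Strip(string code)
    {
        var output = new StringBuilder(code.Length);
        int i = 0;
        while (i < code.Length)
        {
            char c = code[i];
            char next = i + 1 < code.Length ? code[i + 1] : '\0';

            if (c == '"' || c == '\'')
            {
                // Copy the whole literal, honouring backslash escapes.
                char quote = c;
                output.Append(c);
                i++;
                while (i < code.Length)
                {
                    output.Append(code[i]);
                    if (code[i] == '\\' && i + 1 < code.Length)
                    {
                        output.Append(code[i + 1]); // the escaped character
                        i += 2;
                        continue;
                    }
                    if (code[i] == quote) { i++; break; }
                    i++;
                }
            }
            else if (c == '/' && next == '/')
            {
                // Skip to the end of the line; the newline itself is kept.
                while (i < code.Length && code[i] != '\n' && code[i] != '\r') i++;
            }
            else if (c == '/' && next == '*')
            {
                // Skip past the closing */ (or to the end if it is missing).
                int end = code.IndexOf("*/", i + 2, StringComparison.Ordinal);
                i = end < 0 ? code.Length : end + 2;
            }
            else
            {
                output.Append(c);
                i++;
            }
        }
        return output.ToString();
    }
}
```

Calling CommentStripper.Strip(File.ReadAllText(path)) would then return the source with comments removed; an unterminated /* simply swallows the rest of the input.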

Related

Use split to remove parts of a string that are surrounded by curly double quotes

I use the following as a parameter to the split function in C#:
private char[] delimiterComment = { '(', '{', '[', '\u201C' };
private char[] delimiterEndComment = { ')', '}', ']', '\u201D' };
It works on all the "brackets" but not the "curly double quotes". I don't understand why. Is it a bug in split, or a feature of the curly quote characters?
I have as an input string something like:
“this is a pre comment” then some vital info [there might be an embedded comment] and then some more vital info (then there is a post comment)
I wish to strip off the comments, but capture them in a structure, leaving a clean info string. It all worked beautifully with brackets, till I tried to add curly double quotes as additional delimiters...
(I am aware that embedded comments are being gathered as post comments deliberately)
The code I have written is as follows:
class CommentSplit
{
public bool split = false;
public bool error = false;
public string original = "";
public string remainder = "";
public string preComment = "";
public string postComment = "";
public CommentSplit(string inString, char[] startComment, char[] endComment, string[] ignoreStrings, string[] addStrings, bool usePre) // creator
{
if (inString == null)
return;
original = inString;
string[] starts = inString.Split(startComment);
if (starts.Length == 1)
{
remainder = inString;
return;
}
if (starts[0] != "")
remainder += starts[0].TrimEnd();
for (int i = 1; i < starts.Length; i++)
{
string[] ends = starts[i].Split(endComment);
if (ends.Length != 2) // more than one end comment for a start comment - BUT what about one start and one end comment
{
error = true;
return;
}
if (addStrings == null)
{
if (ignoreStrings == null)
{
if ((remainder == "") && usePre)
preComment += ends[0];
else
postComment += ends[0];
}
else
{
bool ignore = false;
for (int z = 0; z < ignoreStrings.Length; z++)
{
if (ends[0].ToLower() == ignoreStrings[z])
ignore = true;
}
if (!ignore) // was a comment but we might want to ignore it
{
if ((remainder == "") && usePre)
{
if (preComment != "")
preComment += " ";
preComment += ends[0];
}
else
{
if (postComment != "")
postComment += " ";
postComment += ends[0];
}
}
}
}
else
{
bool add = false;
for (int z = 0; z < addStrings.Length; z++)
{
if (ends[0].ToLower() == addStrings[z])
add = true;
}
if (add) // was a comment but want it in the remainder
{
if (remainder != "")
remainder += " ";
remainder += ends[0];
}
else
{
if (ignoreStrings == null)
{
if ((remainder == "") && usePre)
preComment += ends[0];
else
postComment += ends[0];
}
else
{
bool ignore = false;
for (int z = 0; z < ignoreStrings.Length; z++)
{
if (ends[0].ToLower() == ignoreStrings[z])
ignore = true;
}
if (!ignore) // was a comment but we might want to ignore it
{
if ((remainder == "") && usePre)
{
if (preComment != "")
preComment += " ";
preComment += ends[0];
}
else
{
if (postComment != "")
postComment += " ";
postComment += ends[0];
}
}
}
}
}
if (remainder != "")
remainder += " ";
remainder += ends[1].Trim();
}
split = true;
} // CommentSplit
}
I should note that I am a retired C programmer dabbling in C#, so my style may not be idiomatic OOP. I did originally include straight (non-curly) double quotes, but they are not important, and indeed they mess things up because there is no separate opening and closing version of them.
Just put the double quote between single quotes; no escape character is needed:
private char[] delimiterComment = { '(', '{', '[', '\u201C', '"' };
private char[] delimiterEndComment = { ')', '}', ']', '\u201D', '"' };
Input :
string s = "abc(121), {12}, \" HI \"";
Console.WriteLine(string.Join(Environment.NewLine, s.Split(delimiterComment)));
Output :
abc
121),
12},
HI
So you want to cut off comments, e.g.
123 (456 [789) abc ] -> 123 abc ]
In this case you can try a simple loop:
//TODO: I suggest combining starts and ends into array of pairs, e.g.
// KeyValuePair<string,string>[]
private static string CutOffComments(string source, char[] starts, char[] ends) {
if (string.IsNullOrEmpty(source))
return source;
StringBuilder sb = new StringBuilder(source.Length);
int commentIndex = -1;
foreach (var c in source) {
if (commentIndex >= 0) { // within a comment, looking for its end
if (c == ends[commentIndex])
commentIndex = -1;
}
else { // outside a comment; are we starting a new one?
commentIndex = Array.IndexOf(starts, c);
if (commentIndex < 0)
sb.Append(c);
}
}
//TODO:
// if (commentIndex >= 0) // dangling comment, e.g. 123[456
return sb.ToString();
}
Usage:
string source = "123[456]789";
// 123789
string result = CutOffComments(source, delimiterComment, delimiterEndComment);
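The TODO above suggests pairing each start delimiter with its end delimiter. A minimal sketch of that idea follows, which also collects the removed comments since the question wanted to keep them. The names PairedComments, Pairs and Extract are hypothetical, and the delimiter set is simply the one from the question; nested delimiters are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PairedComments
{
    // Hypothetical helper: maps each opening delimiter to its closing one.
    private static readonly Dictionary<char, char> Pairs = new Dictionary<char, char>
    {
        { '(', ')' }, { '{', '}' }, { '[', ']' }, { '\u201C', '\u201D' }
    };

    // Returns the text outside the delimiters and collects the delimited
    // parts (without their delimiters) in 'comments'.
    public static string Extract(string source, List<string> comments)
    {
        var remainder = new StringBuilder();
        var current = new StringBuilder();
        char? expectedEnd = null;

        foreach (char c in source ?? string.Empty)
        {
            if (expectedEnd.HasValue)
            {
                if (c == expectedEnd.Value)
                {
                    // End of the current comment: store it and reset.
                    comments.Add(current.ToString());
                    current.Clear();
                    expectedEnd = null;
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (Pairs.TryGetValue(c, out char end))
            {
                expectedEnd = end; // a comment starts here
            }
            else
            {
                remainder.Append(c);
            }
        }

        // An unterminated comment is kept as a comment rather than dropped.
        if (expectedEnd.HasValue)
            comments.Add(current.ToString());

        return remainder.ToString();
    }
}
```

Because only the matching closing character ends a comment, a curly-quoted comment is closed by \u201D and not by a stray bracket inside it.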
It's something else in your code, as this small verifiable example works fine:
char[] delimiterComment = { '(', '{', '[', '\u201C', '\u201D', '"', '“', '”', '}', ']', ')' };
string stringWithComment = "this has a “COMMENT” yeah really";
var result = stringWithComment.Split(delimiterComment);
//Output:
//result[0] = "this has a "
//result[1] = "COMMENT"
//result[2] = " yeah really"

Function to return the acronym of a string

How can I write a function which given an input string, passes back the acronym for the string using only If/Then/Else, simple String functions, and Looping syntax (not use the Split( ) function or its equivalent)?
String s_input, s_acronym
s_input = "Mothers against drunk driving"
s_acronym = f_get_acronym(s_input)
print "acronym = " + s_acronym
/* acronym = MADD */
My code is below. I'm just looking to see whether there is a better solution.
static string f_get_acronym(string s_input)
{
string s_acronym = "";
for (int i = 0; i < s_input.Length; i++)
{
if (i == 0 && s_input[i].ToString() != " ")
{
s_acronym += s_input[i];
continue;
}
if (s_input[i - 1].ToString() == " " && s_input[i].ToString() != " ")
{
s_acronym += s_input[i];
}
}
return s_acronym.ToUpper();
}
Regex is the way to go in C#. I know you only wanted simple functions, but I want to put this here for any further readers who shall be directed on the right path. ;)
var regex = new Regex(@"(?:\s*(?<first>\w))\w+");
var matches = regex.Matches("Mothers against drunk driving");
foreach (Match match in matches)
{
Console.Write(match.Groups["first"].Value);
}
private static string f_get_acronym(string s_input)
{
if (string.IsNullOrWhiteSpace(s_input))
return string.Empty;
string accr = string.Empty;
accr += s_input[0];
while (s_input.Length > 0)
{
int index = s_input.IndexOf(' ');
if (index > 0)
{
s_input = s_input.Substring(index + 1);
if (s_input.Length == 0)
break;
accr += s_input[0];
}
else
{
break;
}
}
return accr.ToUpper();
}
Keep it simple:
public static string Acronym(string input)
{
string result = string.Empty;
char last = ' ';
foreach(var c in input)
{
if(char.IsWhiteSpace(last))
{
result += c;
}
last = c;
}
return result.ToUpper();
}
Best practice says you should use a StringBuilder when adding to a string in a loop though. Don't know how long your acronyms are going to be.
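For reference, here is a minimal sketch of the same loop using a StringBuilder, as suggested above. The name AcronymBuilder is just for illustration. It also skips the whitespace characters themselves, so runs of several spaces do not end up in the result.

```csharp
using System;
using System.Text;

public static class AcronymBuilder
{
    public static string Acronym(string input)
    {
        if (string.IsNullOrEmpty(input))
            return string.Empty;

        var result = new StringBuilder();
        bool previousWasSpace = true; // treat the start as following a space

        foreach (char c in input)
        {
            bool isSpace = char.IsWhiteSpace(c);
            // Keep a character only when it starts a new word.
            if (previousWasSpace && !isSpace)
                result.Append(char.ToUpperInvariant(c));
            previousWasSpace = isSpace;
        }
        return result.ToString();
    }
}
```

AcronymBuilder.Acronym("Mothers against drunk driving") returns "MADD".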
Your best way to do so would be to set up a loop to loop over every letter. If it is the first letter in the string, OR the first letter after a space, add that letter to a temp string, which is returned at the end.
eg (basic c++)
string toAcronym(string sentence)
{
string acronym = "";
bool wasSpace = true;
for (int i = 0; i < sentence.length(); i++)
{
if (wasSpace == true)
{
if (sentence[i] != ' ')
{
wasSpace = false;
acronym += toupper(sentence[i]);
}
else
{
wasSpace = true;
}
}
else if (sentence[i] == ' ')
{
wasSpace = true;
}
}
return acronym;
}
This could be further improved by checking to make sure the letter to add to the acronym is a letter, and not a number / symbol. OR...
If it is the first letter in the string, add it to the acronym. Then, run a loop for "find next of" a space. Then, add the next character. Continuously loop the "find next of" space until it returns null / eof / end of string, then return.
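The second approach (repeatedly finding the next space) could look roughly like this in C#. This is only a sketch under the question's constraints (no Split); the names SpaceScanner and FromSpaces are hypothetical, and only the plain space character is treated as a separator.

```csharp
using System;
using System.Text;

public static class SpaceScanner
{
    public static string FromSpaces(string sentence)
    {
        var acronym = new StringBuilder();
        int start = 0;
        while (start < sentence.Length)
        {
            // 'start' is either the first character or the one after a space.
            if (sentence[start] != ' ')
                acronym.Append(char.ToUpperInvariant(sentence[start]));

            int space = sentence.IndexOf(' ', start);
            if (space < 0)
                break;          // no more spaces: we are done
            start = space + 1;  // continue just after the space
        }
        return acronym.ToString();
    }
}
```

The check for a non-space character at 'start' is what keeps consecutive spaces from being added to the acronym.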

Parsing HTML source using a loop

I have a little problem: I am trying to parse an HTML string in my code, and I want it to separate the individual numbers with a space (" ") between each one.
I have made this loop to get rid of the tags
char[] array = new char[source.Length];
int arrayIndex = 0;
bool inside = false;
for (int i = 0; i < source.Length; i++)
{
numberfori = i;
char let = source[i];
if (let == '<')
{
inside = true;
continue;
}
if (let == '>')
{
inside = false;
continue;
}
if (!inside)
{
array[arrayIndex] = let;
Console.WriteLine(arrayIndex);
arrayIndex++;
}
}
return new string(array, 1, arrayIndex);
now this returns :-
201549.0717593/2203.5732.6719.4412.86
but i need :-
2015 49.0 7 175 9 3/22 0 3.57 32.67 19.44 12.86
And here is the HTML string the loop runs through, so you can see where the numbers come from:
>2015</a></td><td class="text-right">49.0</td><td class="text-right">7</td><td class="text-right">175</td><td class="text-right">9</td><td class="text-right">3/22</td><td class="text-right">0</td><td class="text-right">3.57</td><td class="text-right">32.67</td><td class="text-right">19.44</td><td class="text-right">12.86</td></tr><tr><td><a data
Eventually I want to put each of these numbers into its own variable, but I need to split them first. One step at a time :)
Thank you for your help
Try adding this:
if (let == '>')
{
inside = false;
if (arrayIndex > 0 && array[arrayIndex - 1] != ' ')
{
array[arrayIndex] = ' ';
arrayIndex++;
}
continue;
}
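Putting the whole thing together, a minimal sketch with a StringBuilder instead of a fixed char array might look like this. TagStripper and StripTags are made-up names, and this is only meant for simple fragments like the one in the question; real-world HTML is better handled by a proper HTML parser.

```csharp
using System;
using System.Text;

public static class TagStripper
{
    // Removes everything between '<' and '>' and puts a single space where
    // one or more tags separated two pieces of text.
    public static string StripTags(string source)
    {
        var result = new StringBuilder(source.Length);
        bool inside = false;
        bool pendingSpace = false;

        foreach (char c in source)
        {
            if (c == '<') { inside = true; continue; }
            if (c == '>')
            {
                inside = false;
                // Only ask for a separator once some text has been written.
                pendingSpace = result.Length > 0;
                continue;
            }
            if (inside) continue;

            if (pendingSpace)
            {
                result.Append(' ');
                pendingSpace = false;
            }
            result.Append(c);
        }
        return result.ToString();
    }
}
```

The result can then be split into the individual values with StripTags(source).Split(' '). Note that a value which itself contains a space would be split as well.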

Application that indents an unindented code in C#

My application should read an unindented C# code sample and then indent the code programmatically. The way I am doing it may not be correct, but I could still achieve partial results.
I could set white spaces when a { is found then continue with the same amount of space for rest of the lines being read. When another { is found again add spaces and continue with this new space for rest of lines. For that this is what I did:
private void btn_format_Click(object sender, EventArgs e)
{
string lineInfo = "";
string fl = "";
string ctab= char.ConvertFromUtf32(32)+char.ConvertFromUtf32(32)+char.ConvertFromUtf32(32);
foreach (string line in txt_codepage.Lines) // text_codepage is a textbox with code
{
if (line.Contains("{"))
{
string l = line.Replace("{", ctab+"{");
lineInfo = lineInfo + (l + "\n");
fl = fl + ctab;
ctab = ctab + ctab;
}
else
{
lineInfo = lineInfo + (char.ConvertFromUtf32(32)+fl+ line + "\n");
}
I could achieve the proper indentation that I want till here. Now when I find a } I should do the reverse process but unfortunately that is not possible with strings. The reverse process that I meant is this:
if (line.Contains("}"))
{
string l = line.Replace(ctab + "}", "}");
lineInfo = lineInfo + (l + "\n");
fl = fl - ctab;
ctab = ctab - ctab;
}
else
{
lineInfo = lineInfo - (char.ConvertFromUtf32(32) + fl + line + "\n");
}
}
MessageBox.Show(lineInfo.ToString());
I know the above part of the code is a complete blunder, but please let me know how to achieve this in the correct way.
If you want to parse a string, you should use a StringBuilder instead of string concatenation (concatenation is too slow). I wrote some code to demonstrate how you can parse C# or other code. It is not a full example, just the basic concepts.
If you want to learn more about parsers, you can read Compilers: Principles, Techniques, and Tools.
public static string IndentCSharpCode(string code)
{
const string INDENT_STEP = " ";
if (string.IsNullOrWhiteSpace(code))
{
return code;
}
var result = new StringBuilder();
var indent = string.Empty;
var lineContent = false;
var stringDefinition = false;
for (var i = 0; i < code.Length; i++)
{
var ch = code[i];
if (ch == '"' && !stringDefinition)
{
result.Append(ch);
stringDefinition = true;
continue;
}
if (ch == '"' && stringDefinition)
{
result.Append(ch);
stringDefinition = false;
continue;
}
if (stringDefinition)
{
result.Append(ch);
continue;
}
if (ch == '{' && !stringDefinition)
{
if (lineContent)
{
result.AppendLine();
}
result.Append(indent).Append("{");
if (lineContent)
{
result.AppendLine();
}
indent += INDENT_STEP;
lineContent = false;
continue;
}
if (ch == '}' && !stringDefinition)
{
if (indent.Length != 0)
{
indent = indent.Substring(0, indent.Length - INDENT_STEP.Length);
}
if (lineContent)
{
result.AppendLine();
}
result.Append(indent).Append("}");
if (lineContent)
{
result.AppendLine();
}
lineContent = false;
continue;
}
if (ch == '\r')
{
continue;
}
if ((ch == ' ' || ch == '\t') && !lineContent)
{
continue;
}
if (ch == '\n')
{
lineContent = false;
result.AppendLine();
continue;
}
if (!lineContent)
{
result.Append(indent);
lineContent = true;
}
result.Append(ch);
}
return result.ToString();
}
You can check out CodeMaid, an open source Visual Studio add-in for cleaning up code.
Remove all of the whitespace from the line using String.Trim() and then add just the tabs you want. Also, your code would be much more readable if you could avoid char.ConvertFromUtf32(32) - why write that instead of " " or ' '?
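A minimal sketch of that trim-and-re-indent idea, keeping a depth counter instead of trying to "subtract" strings. LineIndenter, Indent and the indentStep parameter are hypothetical names, and the sketch assumes braces never appear inside strings or comments.

```csharp
using System;
using System.Text;

public static class LineIndenter
{
    public static string Indent(string code, string indentStep = "    ")
    {
        var result = new StringBuilder();
        int depth = 0;

        foreach (string rawLine in code.Replace("\r\n", "\n").Split('\n'))
        {
            string line = rawLine.Trim();
            if (line.Length == 0)
            {
                result.Append('\n');
                continue;
            }

            // A line that starts with '}' closes a block, so it is written
            // one level further out than the lines it closes.
            bool startsWithClose = line[0] == '}';
            if (startsWithClose)
                depth = Math.Max(0, depth - 1);

            for (int i = 0; i < depth; i++)
                result.Append(indentStep);
            result.Append(line).Append('\n');

            // Apply the remaining braces on this line to the following lines.
            foreach (char c in line.Substring(startsWithClose ? 1 : 0))
            {
                if (c == '{') depth++;
                else if (c == '}') depth = Math.Max(0, depth - 1);
            }
        }
        return result.ToString();
    }
}
```

With the text box from the question this could be called as LineIndenter.Indent(txt_codepage.Text).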

Wrap text to the next line when it exceeds a certain length?

I need to write different paragraphs of text within a certain area. For instance, I have drawn a box to the console that looks like this:
/----------------------\
|                      |
|                      |
|                      |
|                      |
\----------------------/
How would I write text within it, but wrap it to the next line if it gets too long?
Split on last space before your row length?
int myLimit = 10;
string sentence = "this is a long sentence that needs splitting to fit";
string[] words = sentence.Split(new char[] { ' ' });
IList<string> sentenceParts = new List<string>();
sentenceParts.Add(string.Empty);
int partCounter = 0;
foreach (string word in words)
{
if ((sentenceParts[partCounter] + word).Length > myLimit)
{
partCounter++;
sentenceParts.Add(string.Empty);
}
sentenceParts[partCounter] += word + " ";
}
foreach (string x in sentenceParts)
Console.WriteLine(x);
UPDATE (the solution above lost the last word in some cases):
int myLimit = 10;
string sentence = "this is a long sentence that needs splitting to fit";
string[] words = sentence.Split(' ');
StringBuilder newSentence = new StringBuilder();
string line = "";
foreach (string word in words)
{
if ((line + word).Length > myLimit)
{
newSentence.AppendLine(line);
line = "";
}
line += string.Format("{0} ", word);
}
if (line.Length > 0)
newSentence.AppendLine(line);
Console.WriteLine(newSentence.ToString());
Here's one that is lightly tested and uses LastIndexOf to speed things along (a guess):
private static string Wrap(string v, int size)
{
v = v.TrimStart();
if (v.Length <= size) return v;
var nextspace = v.LastIndexOf(' ', size);
if (-1 == nextspace) nextspace = Math.Min(v.Length, size);
return v.Substring(0, nextspace) + ((nextspace >= v.Length) ?
"" : "\n" + Wrap(v.Substring(nextspace), size));
}
I started with Jim H.'s solution and ended up with this method. The only problem is when the text contains a word that is longer than the limit. Otherwise it works well.
public static List<string> GetWordGroups(string text, int limit)
{
var words = text.Split(new string[] { " ", "\r\n", "\n" }, StringSplitOptions.None);
List<string> wordList = new List<string>();
string line = "";
foreach (string word in words)
{
if (!string.IsNullOrWhiteSpace(word))
{
var newLine = string.Join(" ", line, word).Trim();
if (newLine.Length >= limit)
{
wordList.Add(line);
line = word;
}
else
{
line = newLine;
}
}
}
if (line.Length > 0)
wordList.Add(line);
return wordList;
}
I modified the version of Jim H such that it supports some special cases.
For example the case when the sentence does not contain any whitespace character; I also noted that there is a problem when a line has a space at the last position; then the space is added at the end and you end up with one character too much.
Here is my version just in case someone is interested:
public static List<string> WordWrap(string input, int maxCharacters)
{
List<string> lines = new List<string>();
if (!input.Contains(" "))
{
int start = 0;
while (start < input.Length)
{
lines.Add(input.Substring(start, Math.Min(maxCharacters, input.Length - start)));
start += maxCharacters;
}
}
else
{
string[] words = input.Split(' ');
string line = "";
foreach (string word in words)
{
if ((line + word).Length > maxCharacters)
{
lines.Add(line.Trim());
line = "";
}
line += string.Format("{0} ", word);
}
if (line.Length > 0)
{
lines.Add(line.Trim());
}
}
return lines;
}
This is a more complete and tested solution.
The bool overflow parameter controls what happens to words longer than width: when it is false they are chunked into width-sized pieces, and when it is true they are left whole and overflow the line.
Consecutive whitespace characters, including \r and \n, are collapsed into a single space.
Edge cases are thoroughly tested.
public static string WrapText(string text, int width, bool overflow)
{
StringBuilder result = new StringBuilder();
int index = 0;
int column = 0;
while (index < text.Length)
{
int spaceIndex = text.IndexOfAny(new[] { ' ', '\t', '\r', '\n' }, index);
if (spaceIndex == -1)
{
break;
}
else if (spaceIndex == index)
{
index++;
}
else
{
AddWord(text.Substring(index, spaceIndex - index));
index = spaceIndex + 1;
}
}
if (index < text.Length) AddWord(text.Substring(index));
void AddWord(string word)
{
if (!overflow && word.Length > width)
{
int wordIndex = 0;
while (wordIndex < word.Length)
{
string subWord = word.Substring(wordIndex, Math.Min(width, word.Length - wordIndex));
AddWord(subWord);
wordIndex += subWord.Length;
}
}
else
{
if (column + word.Length >= width)
{
if (column > 0)
{
result.AppendLine();
column = 0;
}
}
else if (column > 0)
{
result.Append(" ");
column++;
}
result.Append(word);
column += word.Length;
}
}
return result.ToString();
}
I modified Manfred's version. If you put a string with the '\n' character in it, it will wrap the text strangely because it will count it as another character. With this minor change all will go smoothly.
public static List<string> WordWrap(string input, int maxCharacters)
{
List<string> lines = new List<string>();
if (!input.Contains(" ") && !input.Contains("\n"))
{
int start = 0;
while (start < input.Length)
{
lines.Add(input.Substring(start, Math.Min(maxCharacters, input.Length - start)));
start += maxCharacters;
}
}
else
{
string[] paragraphs = input.Split('\n');
foreach (string paragraph in paragraphs)
{
string[] words = paragraph.Split(' ');
string line = "";
foreach (string word in words)
{
if ((line + word).Length > maxCharacters)
{
lines.Add(line.Trim());
line = "";
}
line += string.Format("{0} ", word);
}
if (line.Length > 0)
{
lines.Add(line.Trim());
}
}
}
return lines;
}
The other answers don't consider East Asian languages, which don't use spaces to separate words.
In general, a sentence in an East Asian language can be wrapped at any position between characters, except around certain punctuation marks (and it is not a big problem even if you ignore the punctuation rules). This is much simpler than for European languages, but when the text mixes different languages you have to detect the script of each character by checking the Unicode table, and then apply the break-at-spaces algorithm only to the European-language parts.
References:
https://en.wikipedia.org/wiki/Line_wrap_and_word_wrap
https://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages
https://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF
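As a very rough sketch of that idea: break anywhere next to a CJK character, but only at spaces inside other text. The names MixedWrap and IsCjk are made up, the ranges checked are a simplified assumption (CJK Unified Ideographs, kana and Hangul syllables in the BMP only), punctuation rules are ignored, and width counts characters, not console columns (East Asian characters are usually drawn two columns wide).

```csharp
using System;
using System.Collections.Generic;

public static class MixedWrap
{
    // Simplified script check; see the lead-in for the assumed ranges.
    private static bool IsCjk(char c)
    {
        return (c >= '\u4E00' && c <= '\u9FFF')
            || (c >= '\u3040' && c <= '\u30FF')
            || (c >= '\uAC00' && c <= '\uD7AF');
    }

    public static List<string> Wrap(string text, int width)
    {
        var lines = new List<string>();
        int start = 0;
        while (text.Length - start > width)
        {
            // Look backwards from the limit for the last legal break point.
            int breakAt = -1;
            for (int i = start + width; i > start; i--)
            {
                // Breaking before text[i] is allowed at a space, or when
                // either neighbouring character is CJK.
                if (text[i] == ' ' || IsCjk(text[i]) || IsCjk(text[i - 1]))
                {
                    breakAt = i;
                    break;
                }
            }
            if (breakAt < 0)
                breakAt = start + width; // no break point: cut the word

            lines.Add(text.Substring(start, breakAt - start).TrimEnd());
            start = breakAt;
            while (start < text.Length && text[start] == ' ')
                start++; // do not start the next line with spaces
        }
        if (start < text.Length)
            lines.Add(text.Substring(start));
        return lines;
    }
}
```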
This code will wrap the paragraph text. It will break the paragraph text into lines. If it encounters any word which is even larger than the line length, it will break the word into multiple lines too.
private const int max_line_length = 25;
private string wrapLinesToFormattedText(string p_actual_string) {
string formatted_string = "";
int available_length = max_line_length;
string[] word_arr = p_actual_string.Trim().Split(' ');
foreach (string w in word_arr) {
string word = w;
if (word == "") {
continue;
}
int word_length = word.Length;
//if the word is longer than the maximum line length,
//it is broken across lines, and the following words continue after it
if (word_length >= max_line_length)
{
if (available_length > 0)
{
formatted_string += word.Substring(0, available_length) + "\n";
word = word.Substring(available_length);
}
else
{
formatted_string += "\n";
}
word = word + " ";
available_length = max_line_length;
for (var count = 0;count<word.Length;count++) {
char ch = word.ElementAt(count);
if (available_length==0) {
formatted_string += "\n";
available_length = max_line_length;
}
formatted_string += ch;
available_length--;
}
continue;
}
if ((word_length+1) <= available_length)
{
formatted_string += word+" ";
available_length -= (word_length+1);
continue;
}
else {
available_length = max_line_length;
formatted_string += "\n"+word+" " ;
available_length -= (word_length + 1);
continue;
}
}//end of foreach loop
return formatted_string;
}
//end of function wrapLinesToFormattedText
Here is a small piece of optimized code, written in Visual Basic 9, for wrapping text to an adjustable segment length limit.
Dim stringString = "Great code! I wish I could found that when searching for Print Word Wrap VB.Net and other variations when searching on google. I’d never heard of MeasureString until you guys mentioned it. In my defense, I’m not a UI programmer either, so I don’t feel bad for not knowing"
Dim newstring = ""
Dim t As Integer = 1
Dim f As Integer = 0
Dim z As Integer = 0
Dim p As Integer = stringString.Length
Dim myArray As New ArrayList
Dim endOfText As Boolean = False REM to exit loop after finding the last words
Dim segmentLimit As Integer = 45
For t = z To p Step segmentLimit REM you can adjust this variable to fit your needs
newstring = String.Empty
newstring += Strings.Mid(stringString, 1, 45)
If Strings.Left(newstring, 1) = " " Then REM Chr(13) doesn't work, that's why I have put a physical space
newstring = Strings.Right(newstring, newstring.Length - 1)
End If
If stringString.Length < 45 Then
endOfText = True
newstring = stringString
myArray.Add(newstring) REM fills the last entry then exits
myArray.TrimToSize()
Exit For
Else
stringString = Strings.Right(stringString, stringString.Length - 45)
End If
z += 44 + f
If Not Strings.Right(newstring, 1) = Chr(32) Then REM to detect space
Do Until Strings.Right(newstring, z + 1) = " "
If Strings.Right(newstring, z + f) = " " OrElse Strings.Left(stringString, 1) = " " Then
Exit Do
End If
newstring += Strings.Left(stringString, 1)
stringString = Strings.Right(stringString, stringString.Length - 1) REM crops the original
p = stringString.Length REM string from left by 45 characters and additional characters
t += f
f += 1
Loop
myArray.Add(newstring) REM puts the resulting segments of text in an array
myArray.TrimToSize()
newstring = String.Empty REM empties the string to load the next 45 characters
End If
t = 1
f = 1
Next
For Each item In myArray
MsgBox(item)
'txtSegmentedText.Text &= vbCrLf & item
Next
I know I am a bit late, but I managed to get a solution going by using recursion.
I think it's one of the cleanest solutions proposed here.
Recursive Function:
public StringBuilder TextArea { get; set; } = new StringBuilder();
public void GenerateMultiLineTextArea(string value, int length)
{
// first call - first length values -> append first length values, remove first length values from value, make second call
// second call - second length values -> append second length values, remove first length values from value, make third call
// third call - value length is less than length, just append as it is
if (value.Length <= length && value.Length != 0)
{
TextArea.Append($"|{value.PadRight(length)}" + "|");
}
else
{
TextArea.Append($"|{value.Substring(0, length).ToString()}".PadLeft(length) + "|\r\n");
value = value.Substring(length, (value.Length) - (length));
GenerateMultiLineTextArea(value, length);
}
}
Usage:
string LongString =
"This is a really long string that needs to break after it reaches a certain limit. " +
"This is a really long string that needs to break after it reaches a certain limit." + "This is a really long string that needs to break after it reaches a certain limit.";
GenerateMultiLineTextArea(LongString, 22);
Console.WriteLine("/----------------------\\");
Console.WriteLine(TextArea.ToString());
Console.WriteLine("\\----------------------/");
Outputs:
/----------------------\
|This is a really long |
|string that needs to b|
|reak after it reaches |
|a certain limit. This |
|is a really long strin|
|g that needs to break |
|after it reaches a cer|
|tain limit.This is a r|
|eally long string that|
| needs to break after |
|it reaches a certain l|
|imit. |
\----------------------/
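As the output above shows, fixed-length chunks cut words in half. Here is a minimal sketch that combines a greedy word wrap with the box drawing from the question; BoxWriter, Wrap and Box are hypothetical names, and the width is counted in characters. Words are only cut when a single word is longer than the box is wide.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BoxWriter
{
    // Greedy word wrap: words are kept whole unless a single word is longer
    // than the width, in which case it is cut into width-sized pieces.
    public static List<string> Wrap(string text, int width)
    {
        var lines = new List<string>();
        var line = new StringBuilder();
        var separators = new[] { ' ', '\t', '\r', '\n' };

        foreach (string w in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            string word = w;
            while (word.Length > width)
            {
                if (line.Length > 0) { lines.Add(line.ToString()); line.Clear(); }
                lines.Add(word.Substring(0, width));
                word = word.Substring(width);
            }
            // Start a new line if the word (plus a space) no longer fits.
            if (line.Length > 0 && line.Length + 1 + word.Length > width)
            {
                lines.Add(line.ToString());
                line.Clear();
            }
            if (line.Length > 0) line.Append(' ');
            line.Append(word);
        }
        if (line.Length > 0) lines.Add(line.ToString());
        return lines;
    }

    // Draws the wrapped lines inside a box like the one in the question.
    public static string Box(string text, int width)
    {
        var box = new StringBuilder();
        box.Append('/').Append('-', width).Append("\\\n");
        foreach (string line in Wrap(text, width))
            box.Append('|').Append(line.PadRight(width)).Append("|\n");
        box.Append('\\').Append('-', width).Append("/\n");
        return box.ToString();
    }
}
```

Usage would be something like Console.Write(BoxWriter.Box(LongString, 22)).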
