Wrap text to the next line when it exceeds a certain length? - c#

I need to write different paragraphs of text within a certain area. For instance, I have drawn a box to the console that looks like this:
/----------------------\
| |
| |
| |
| |
\----------------------/
How would I write text within it, but wrap it to the next line if it gets too long?

Split on last space before your row length?
int myLimit = 10;
string sentence = "this is a long sentence that needs splitting to fit";
string[] words = sentence.Split(new char[] { ' ' });
IList<string> sentenceParts = new List<string>();
sentenceParts.Add(string.Empty);
int partCounter = 0;
foreach (string word in words)
{
if ((sentenceParts[partCounter] + word).Length > myLimit)
{
partCounter++;
sentenceParts.Add(string.Empty);
}
sentenceParts[partCounter] += word + " ";
}
foreach (string x in sentenceParts)
Console.WriteLine(x);
UPDATE (the solution above lost the last word in some cases):
int myLimit = 10;
string sentence = "this is a long sentence that needs splitting to fit";
string[] words = sentence.Split(' ');
StringBuilder newSentence = new StringBuilder();
string line = "";
foreach (string word in words)
{
if ((line + word).Length > myLimit)
{
newSentence.AppendLine(line);
line = "";
}
line += string.Format("{0} ", word);
}
if (line.Length > 0)
newSentence.AppendLine(line);
Console.WriteLine(newSentence.ToString());
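To tie this back to the box in the question, here is a minimal sketch of how the wrapped lines could be framed. The names BoxWriter, WrapWords and BuildBox are made up for illustration, and the sketch assumes that a single word longer than the inner width is simply left unbroken on its own line.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BoxWriter
{
    // Greedy word wrap: each returned line is at most width characters,
    // except that a single word longer than width is left unbroken.
    public static List<string> WrapWords(string text, int width)
    {
        var lines = new List<string>();
        var line = new StringBuilder();
        foreach (string word in text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // The +1 accounts for the separating space on a non-empty line.
            if (line.Length > 0 && line.Length + 1 + word.Length > width)
            {
                lines.Add(line.ToString());
                line.Clear();
            }
            if (line.Length > 0) line.Append(' ');
            line.Append(word);
        }
        if (line.Length > 0) lines.Add(line.ToString());
        return lines;
    }

    // Frames every wrapped line with '|' and pads it to the inner width.
    public static List<string> BuildBox(string text, int innerWidth)
    {
        var rows = new List<string>();
        rows.Add("/" + new string('-', innerWidth) + "\\");
        foreach (string l in WrapWords(text, innerWidth))
            rows.Add("|" + l.PadRight(innerWidth) + "|");
        rows.Add("\\" + new string('-', innerWidth) + "/");
        return rows;
    }
}
```

Each row returned by BuildBox can then be passed to Console.WriteLine; an inner width of 22 matches the box drawn in the question.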

Here's one that is lightly tested and uses LastIndexOf to speed things along (a guess):
private static string Wrap(string v, int size)
{
v = v.TrimStart();
if (v.Length <= size) return v;
var nextspace = v.LastIndexOf(' ', size);
if (-1 == nextspace) nextspace = Math.Min(v.Length, size);
return v.Substring(0, nextspace) + ((nextspace >= v.Length) ?
"" : "\n" + Wrap(v.Substring(nextspace), size));
}

I started with Jim H.'s solution and ended up with this method. The only problem is text that contains a word longer than the limit; otherwise it works well.
public static List<string> GetWordGroups(string text, int limit)
{
var words = text.Split(new string[] { " ", "\r\n", "\n" }, StringSplitOptions.None);
List<string> wordList = new List<string>();
string line = "";
foreach (string word in words)
{
if (!string.IsNullOrWhiteSpace(word))
{
var newLine = string.Join(" ", line, word).Trim();
if (newLine.Length >= limit)
{
wordList.Add(line);
line = word;
}
else
{
line = newLine;
}
}
}
if (line.Length > 0)
wordList.Add(line);
return wordList;
}
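One possible way around the long-word problem is to cut oversized words into pieces before they reach the wrapping loop. This is only a sketch; the LongWordSplitter class and its Chunk method are hypothetical helpers, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;

public static class LongWordSplitter
{
    // Splits a single word into pieces of at most limit characters.
    public static IEnumerable<string> Chunk(string word, int limit)
    {
        if (limit < 1) throw new ArgumentOutOfRangeException("limit");
        for (int i = 0; i < word.Length; i += limit)
            yield return word.Substring(i, Math.Min(limit, word.Length - i));
    }
}
```

Running every word through Chunk first (for example with SelectMany) means the loop in GetWordGroups only ever sees words that fit within the limit.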

I modified Jim H.'s version so that it handles some special cases.
One is a sentence that contains no whitespace at all. I also noticed a problem when a line ends with a space: the space is appended to the line, so you end up with one character too many.
Here is my version in case anyone is interested:
public static List<string> WordWrap(string input, int maxCharacters)
{
List<string> lines = new List<string>();
if (!input.Contains(" "))
{
int start = 0;
while (start < input.Length)
{
lines.Add(input.Substring(start, Math.Min(maxCharacters, input.Length - start)));
start += maxCharacters;
}
}
else
{
string[] words = input.Split(' ');
string line = "";
foreach (string word in words)
{
if ((line + word).Length > maxCharacters)
{
lines.Add(line.Trim());
line = "";
}
line += string.Format("{0} ", word);
}
if (line.Length > 0)
{
lines.Add(line.Trim());
}
}
return lines;
}

This is a more complete and tested solution.
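The bool overflow parameter controls how words longer than width are handled: when it is false they are chunked into width-sized pieces in addition to splitting on spaces; when it is true they are left whole and overflow the line.
Consecutive whitespace characters, including \t, \r and \n, are collapsed into a single space.
Edge cases are thoroughly tested.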
public static string WrapText(string text, int width, bool overflow)
{
StringBuilder result = new StringBuilder();
int index = 0;
int column = 0;
while (index < text.Length)
{
int spaceIndex = text.IndexOfAny(new[] { ' ', '\t', '\r', '\n' }, index);
if (spaceIndex == -1)
{
break;
}
else if (spaceIndex == index)
{
index++;
}
else
{
AddWord(text.Substring(index, spaceIndex - index));
index = spaceIndex + 1;
}
}
if (index < text.Length) AddWord(text.Substring(index));
void AddWord(string word)
{
if (!overflow && word.Length > width)
{
int wordIndex = 0;
while (wordIndex < word.Length)
{
string subWord = word.Substring(wordIndex, Math.Min(width, word.Length - wordIndex));
AddWord(subWord);
wordIndex += subWord.Length;
}
}
else
{
if (column + word.Length >= width)
{
if (column > 0)
{
result.AppendLine();
column = 0;
}
}
else if (column > 0)
{
result.Append(" ");
column++;
}
result.Append(word);
column += word.Length;
}
}
return result.ToString();
}

I modified Manfred's version. If you pass it a string containing the '\n' character, it wraps the text strangely because the newline is counted as just another character. With this minor change each '\n' starts a new paragraph and everything works smoothly.
public static List<string> WordWrap(string input, int maxCharacters)
{
List<string> lines = new List<string>();
if (!input.Contains(" ") && !input.Contains("\n"))
{
int start = 0;
while (start < input.Length)
{
lines.Add(input.Substring(start, Math.Min(maxCharacters, input.Length - start)));
start += maxCharacters;
}
}
else
{
string[] paragraphs = input.Split('\n');
foreach (string paragraph in paragraphs)
{
string[] words = paragraph.Split(' ');
string line = "";
foreach (string word in words)
{
if ((line + word).Length > maxCharacters)
{
lines.Add(line.Trim());
line = "";
}
line += string.Format("{0} ", word);
}
if (line.Length > 0)
{
lines.Add(line.Trim());
}
}
}
return lines;
}

The other answers don't consider East Asian languages, which don't use spaces to separate words.
In general, a sentence in an East Asian language can be wrapped at any position between characters, except around certain punctuation marks (and ignoring the punctuation rules is not a big problem). That is much simpler than for European languages, but when languages are mixed you have to detect the script of each character by checking the Unicode tables, and apply the break-at-spaces algorithm only to the European-language parts.
References:
https://en.wikipedia.org/wiki/Line_wrap_and_word_wrap
https://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages
https://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF
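A rough sketch of the mixed-language idea described above. It rests on several assumptions: only a few common CJK blocks are recognized, punctuation rules are ignored, width is counted in characters rather than console columns, and spaces next to CJK characters are dropped. The class and method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MixedScriptWrap
{
    // Rough check against a few common blocks (an assumption, not the full Unicode rules).
    public static bool IsCjk(char c)
    {
        return (c >= '\u4E00' && c <= '\u9FFF')   // CJK Unified Ideographs
            || (c >= '\u3040' && c <= '\u30FF')   // Hiragana and Katakana
            || (c >= '\uAC00' && c <= '\uD7AF');  // Hangul syllables
    }

    // Each CJK character is its own unit; each run of other non-space
    // characters is one unit; spaces only separate units.
    public static List<string> Units(string text)
    {
        var units = new List<string>();
        var run = new StringBuilder();
        foreach (char c in text)
        {
            if (c == ' ' || IsCjk(c))
            {
                if (run.Length > 0) { units.Add(run.ToString()); run.Clear(); }
                if (c != ' ') units.Add(c.ToString());
            }
            else run.Append(c);
        }
        if (run.Length > 0) units.Add(run.ToString());
        return units;
    }

    public static List<string> Wrap(string text, int width)
    {
        var lines = new List<string>();
        var line = new StringBuilder();
        foreach (string unit in Units(text))
        {
            // A separating space is only needed between two non-CJK units.
            bool needSpace = line.Length > 0 && !IsCjk(line[line.Length - 1]) && !IsCjk(unit[0]);
            int extra = (needSpace ? 1 : 0) + unit.Length;
            if (line.Length > 0 && line.Length + extra > width)
            {
                lines.Add(line.ToString());
                line.Clear();
                needSpace = false;
            }
            if (needSpace) line.Append(' ');
            line.Append(unit);
        }
        if (line.Length > 0) lines.Add(line.ToString());
        return lines;
    }
}
```

For real console output the column width matters too, since East Asian characters usually occupy two columns; that is left out here to keep the sketch short.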

This code will wrap the paragraph text. It will break the paragraph text into lines. If it encounters any word which is even larger than the line length, it will break the word into multiple lines too.
private const int max_line_length = 25;
private string wrapLinesToFormattedText(string p_actual_string) {
string formatted_string = "";
int available_length = max_line_length;
string[] word_arr = p_actual_string.Trim().Split(' ');
foreach (string w in word_arr) {
string word = w;
if (word == "") {
continue;
}
int word_length = word.Length;
//if the word is longer than the maximum line length,
//it is broken across several lines and the following words continue after it
if (word_length >= max_line_length)
{
if (available_length > 0)
{
formatted_string += word.Substring(0, available_length) + "\n";
word = word.Substring(available_length);
}
else
{
formatted_string += "\n";
}
word = word + " ";
available_length = max_line_length;
for (var count = 0;count<word.Length;count++) {
char ch = word.ElementAt(count);
if (available_length==0) {
formatted_string += "\n";
available_length = max_line_length;
}
formatted_string += ch;
available_length--;
}
continue;
}
if ((word_length+1) <= available_length)
{
formatted_string += word+" ";
available_length -= (word_length+1);
continue;
}
else {
available_length = max_line_length;
formatted_string += "\n"+word+" " ;
available_length -= (word_length + 1);
continue;
}
}//end of foreach loop
return formatted_string;
}
//end of function wrapLinesToFormattedText

Here is a small piece of optimized code, written in Visual Basic 9, for wrapping text to a sentence length limit.
Dim stringString = "Great code! I wish I could found that when searching for Print Word Wrap VB.Net and other variations when searching on google. I’d never heard of MeasureString until you guys mentioned it. In my defense, I’m not a UI programmer either, so I don’t feel bad for not knowing"
Dim newstring = ""
Dim t As Integer = 1
Dim f As Integer = 0
Dim z As Integer = 0
Dim p As Integer = stringString.Length
Dim myArray As New ArrayList
Dim endOfText As Boolean = False REM to exit loop after finding the last words
Dim segmentLimit As Integer = 45
For t = z To p Step segmentLimit REM you can adjust this variable to fit your needs
newstring = String.Empty
newstring += Strings.Mid(stringString, 1, 45)
If Strings.Left(newstring, 1) = " " Then REM Chr(13) doesn't work, that's why I have put a physical space
newstring = Strings.Right(newstring, newstring.Length - 1)
End If
If stringString.Length < 45 Then
endOfText = True
newstring = stringString
myArray.Add(newstring) REM fills the last entry then exits
myArray.TrimToSize()
Exit For
Else
stringString = Strings.Right(stringString, stringString.Length - 45)
End If
z += 44 + f
If Not Strings.Right(newstring, 1) = Chr(32) Then REM to detect space
Do Until Strings.Right(newstring, z + 1) = " "
If Strings.Right(newstring, z + f) = " " OrElse Strings.Left(stringString, 1) = " " Then
Exit Do
End If
newstring += Strings.Left(stringString, 1)
stringString = Strings.Right(stringString, stringString.Length - 1) REM crops the original
p = stringString.Length REM string from left by 45 characters and additional characters
t += f
f += 1
Loop
myArray.Add(newstring) REM puts the resulting segments of text in an array
myArray.TrimToSize()
newstring = String.Empty REM empties the string to load the next 45 characters
End If
t = 1
f = 1
Next
For Each item In myArray
MsgBox(item)
'txtSegmentedText.Text &= vbCrLf & item
Next

I know I am a bit late, but I managed to get a solution going by using recursion.
I think it's one of the cleanest solutions proposed here.
Recursive Function:
public StringBuilder TextArea { get; set; } = new StringBuilder();
public void GenerateMultiLineTextArea(string value, int length)
{
// first call - first length values -> append first length values, remove first length values from value, make second call
// second call - second length values -> append second length values, remove first length values from value, make third call
// third call - value length is less than length, so just append it as it is
if (value.Length <= length && value.Length != 0)
{
TextArea.Append($"|{value.PadRight(length)}" + "|");
}
else
{
TextArea.Append($"|{value.Substring(0, length).ToString()}".PadLeft(length) + "|\r\n");
value = value.Substring(length, (value.Length) - (length));
GenerateMultiLineTextArea(value, length);
}
}
Usage:
string LongString =
"This is a really long string that needs to break after it reaches a certain limit. " +
"This is a really long string that needs to break after it reaches a certain limit." + "This is a really long string that needs to break after it reaches a certain limit.";
GenerateMultiLineTextArea(LongString, 22);
Console.WriteLine("/----------------------\\");
Console.WriteLine(TextArea.ToString());
Console.WriteLine("\\----------------------/");
Outputs:
/----------------------\
|This is a really long |
|string that needs to b|
|reak after it reaches |
|a certain limit. This |
|is a really long strin|
|g that needs to break |
|after it reaches a cer|
|tain limit.This is a r|
|eally long string that|
| needs to break after |
|it reaches a certain l|
|imit. |
\----------------------/

Related

C# Console Word Wrap

I have a string with newline characters and I want to wrap the words. I want to keep the newline characters so that when I display the text it looks like separate paragraphs. Does anyone have a good function to do this? My current function and code are below (not my own function). The WordWrap function seems to be stripping out \n characters.
static void Main(string[] args){
StreamReader streamReader = new StreamReader("E:/Adventure Story/Intro.txt");
string intro = "";
string line;
while ((line = streamReader.ReadLine()) != null)
{
intro += line;
if(line == "")
{
intro += "\n\n";
}
}
WordWrap(intro);
}
public static void WordWrap(string paragraph)
{
paragraph = new Regex(@" {2,}").Replace(paragraph.Trim(), @" ");
var left = Console.CursorLeft; var top = Console.CursorTop; var lines = new List<string>();
for (var i = 0; paragraph.Length > 0; i++)
{
lines.Add(paragraph.Substring(0, Math.Min(Console.WindowWidth, paragraph.Length)));
var length = lines[i].LastIndexOf(" ", StringComparison.Ordinal);
if (length > 0) lines[i] = lines[i].Remove(length);
paragraph = paragraph.Substring(Math.Min(lines[i].Length + 1, paragraph.Length));
Console.SetCursorPosition(left, top + i); Console.WriteLine(lines[i]);
}
}
Here is a word wrap function that works by using regular expressions to find the places that it's ok to break and places where it must break. Then it returns pieces of the original text based on the "break zones". It even allows for breaks at hyphens (and other characters) without removing the hyphens (since the regex uses a zero-width positive lookbehind assertion).
IEnumerable<string> WordWrap(string text, int width)
{
const string forcedBreakZonePattern = @"\n";
const string normalBreakZonePattern = @"\s+|(?<=[-,.;])|$";
var forcedZones = Regex.Matches(text, forcedBreakZonePattern).Cast<Match>().ToList();
var normalZones = Regex.Matches(text, normalBreakZonePattern).Cast<Match>().ToList();
int start = 0;
while (start < text.Length)
{
var zone =
forcedZones.Find(z => z.Index >= start && z.Index <= start + width) ??
normalZones.FindLast(z => z.Index >= start && z.Index <= start + width);
if (zone == null)
{
yield return text.Substring(start, width);
start += width;
}
else
{
yield return text.Substring(start, zone.Index - start);
start = zone.Index + zone.Length;
}
}
}
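To illustrate why the lookbehind keeps the hyphen: a zero-width match consumes no characters, so the break position falls right after the hyphen and nothing is removed. A minimal sketch (BreakZoneDemo and FirstBreakAfterHyphen are names made up for this example):

```csharp
using System;
using System.Text.RegularExpressions;

public static class BreakZoneDemo
{
    // Returns the index right after the first hyphen, or -1 if there is none.
    // The lookbehind (?<=-) matches the empty position that follows a '-'.
    public static int FirstBreakAfterHyphen(string text)
    {
        Match m = Regex.Match(text, @"(?<=-)");
        return m.Success ? m.Index : -1;
    }
}
```

Because the match has length zero, taking the substring up to that index keeps the hyphen on the first line, whereas a whitespace match has a positive length and is skipped when the next line starts.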
If you want an extra newline so that the text looks like paragraphs, just use the Replace method of your string object.
var str =
"Line 1\n" +
"Line 2\n" +
"Line 3\n";
Console.WriteLine("Before:\n" + str);
str = str.Replace("\n", "\n\n");
Console.WriteLine("After:\n" + str);
Recently I've been working on creating some abstractions that imitate window-like features in a performance- and memory-sensitive console context.
To this end I had to implement word-wrapping functionality without any unnecessary string allocations.
The following is what I managed to simplify it into. This method:
preserves new-lines in the input string,
allows you to specify what characters it should break on (space, hyphen, etc.),
returns the start indices and lengths of the lines via Microsoft.Extensions.Primitives.StringSegment struct instances (but it's very simple to replace this struct with your own, or append directly to a StringBuilder).
public static IEnumerable<StringSegment> WordWrap(string input, int maxLineLength, char[] breakableCharacters)
{
int lastBreakIndex = 0;
while (true)
{
var nextForcedLineBreak = lastBreakIndex + maxLineLength;
// If the remainder is shorter than the allowed line-length, return the remainder. Short-circuits instantly for strings shorter than line-length.
if (nextForcedLineBreak >= input.Length)
{
yield return new StringSegment(input, lastBreakIndex, input.Length - lastBreakIndex);
yield break;
}
// If there are native new lines before the next forced break position, use the last native new line as the starting position of our next line.
int nativeNewlineIndex = input.LastIndexOf(Environment.NewLine, nextForcedLineBreak, maxLineLength);
if (nativeNewlineIndex > -1)
{
nextForcedLineBreak = nativeNewlineIndex + Environment.NewLine.Length + maxLineLength;
}
// Find the last breakable point preceding the next forced break position (and include the breakable character, which might be a hyphen).
var nextBreakIndex = input.LastIndexOfAny(breakableCharacters, nextForcedLineBreak, maxLineLength) + 1;
// If there is no breakable point, which means a word is longer than line length, force-break it.
if (nextBreakIndex == 0)
{
nextBreakIndex = nextForcedLineBreak;
}
yield return new StringSegment(input, lastBreakIndex, nextBreakIndex - lastBreakIndex);
lastBreakIndex = nextBreakIndex;
}
}

Function to return the acronym of a string

How can I write a function which given an input string, passes back the acronym for the string using only If/Then/Else, simple String functions, and Looping syntax (not use the Split( ) function or its equivalent)?
String s_input, s_acronym
s_input = "Mothers against drunk driving"
s_acronym = f_get_acronym(s_input)
print "acronym = " + s_acronym
/* acronym = MADD */
My code is below; I'm just looking to see whether there is a better solution.
static string f_get_acronym(string s_input)
{
string s_acronym = "";
for (int i = 0; i < s_input.Length; i++)
{
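if (i == 0)
{
// Index 0 has no previous character to inspect (this avoids reading s_input[-1] when the input starts with a space).
if (s_input[i] != ' ') s_acronym += s_input[i];
continue;
}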
if (s_input[i - 1].ToString() == " " && s_input[i].ToString() != " ")
{
s_acronym += s_input[i];
}
}
return s_acronym.ToUpper();
}
Regex is the way to go in C#. I know you only wanted simple functions, but I'm putting this here to point any future readers in the right direction. ;)
var regex = new Regex(@"(?:\s*(?<first>\w))\w+");
var matches = regex.Matches("Mothers against drunk driving");
foreach (Match match in matches)
{
Console.Write(match.Groups["first"].Value);
}
private static string f_get_acronym(string s_input)
{
if (string.IsNullOrWhiteSpace(s_input))
return string.Empty;
string accr = string.Empty;
accr += s_input[0];
while (s_input.Length > 0)
{
int index = s_input.IndexOf(' ');
if (index > 0)
{
s_input = s_input.Substring(index + 1);
if (s_input.Length == 0)
break;
accr += s_input[0];
}
else
{
break;
}
}
return accr.ToUpper();
}
Keep it simple:
public static string Acronym(string input)
{
string result = string.Empty;
char last = ' ';
foreach(var c in input)
{
if(char.IsWhiteSpace(last))
{
result += c;
}
last = c;
}
return result.ToUpper();
}
Best practice says you should use a StringBuilder when adding to a string in a loop though. Don't know how long your acronyms are going to be.
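A sketch of the same loop using a StringBuilder, as suggested. The class name is made up, and unlike the version above it also skips repeated whitespace so that double spaces do not add a blank to the result.

```csharp
using System;
using System.Text;

public static class AcronymBuilder
{
    public static string Acronym(string input)
    {
        var result = new StringBuilder();
        char last = ' ';
        foreach (char c in input)
        {
            // A word starts where the previous char was whitespace and this one is not.
            if (char.IsWhiteSpace(last) && !char.IsWhiteSpace(c))
                result.Append(char.ToUpperInvariant(c));
            last = c;
        }
        return result.ToString();
    }
}
```

For a handful of words the difference is negligible; the StringBuilder mainly pays off when the input is long.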
Your best way to do so would be to set up a loop to loop over every letter. If it is the first letter in the string, OR the first letter after a space, add that letter to a temp string, which is returned at the end.
E.g. (basic C++):
string toAcronym(string sentence)
{
string acronym = "";
bool wasSpace = true;
for (int i = 0; i < sentence.length(); i++)
{
if (wasSpace == true)
{
if (sentence[i] != ' ')
{
wasSpace = false;
acronym += toupper(sentence[i]);
}
else
{
wasSpace = true;
}
}
else if (sentence[i] == ' ')
{
wasSpace = true;
}
}
return acronym;
}
This could be further improved by checking to make sure the letter to add to the acronym is a letter, and not a number / symbol. OR...
If it is the first letter in the string, add it to the acronym. Then, run a loop for "find next of" a space. Then, add the next character. Continuously loop the "find next of" space until it returns null / eof / end of string, then return.
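The second, "find next space" variant could look roughly like this in C#. It is a sketch under the assumption that IndexOf counts as a simple string function; the class name is hypothetical.

```csharp
using System;
using System.Text;

public static class AcronymByIndexOf
{
    public static string Acronym(string input)
    {
        var result = new StringBuilder();
        int pos = 0;
        while (pos < input.Length)
        {
            if (input[pos] != ' ')
            {
                // First character of a word: keep it, then jump to the next space.
                result.Append(char.ToUpperInvariant(input[pos]));
                int space = input.IndexOf(' ', pos);
                if (space < 0) break; // no more spaces, so this was the last word
                pos = space;
            }
            pos++; // step past the space (or past a repeated space)
        }
        return result.ToString();
    }
}
```

The loop ends either when IndexOf finds no further space or when the end of the string is reached after trailing spaces.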

c# tab wrapped string

I am writing out a large string (around 100 lines) to a text file, and would like the entire block of text tabbed.
WriteToOutput("\t" + strErrorOutput);
The line I am using above only tabs the first line of the text. How can I indent/tab the entire string?
Replace all linebreaks by linebreak followed by a tab:
WriteToOutput("\t" + strErrorOutput.Replace("\n", "\n\t"));
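// "\r\n" is listed first so that a Windows line ending is not split into two empty lines.
File.WriteAllLines(FILEPATH, input.Split(new string[] {"\r\n", "\n", "\r"}, StringSplitOptions.None)
.Select(x => "\t" + x));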
To do so you would have to have a limited line length (ie <100 characters) at which point this issue becomes easy.
public string ConvertToBlock(string text, int lineLength)
{
string output = "\t";
int currentLineLength = 0;
for (int index = 0; index < text.Length; index++)
{
if (currentLineLength < lineLength)
{
output += text[index];
currentLineLength++;
}
else
{
if (index != text.Length - 1)
{
if (text[index + 1] != ' ')
{
int reverse = 0;
while (text[index - reverse] != ' ')
{
output.Remove(index - reverse - 1, 1);
reverse++;
}
index -= reverse;
output += "\n\t";
currentLineLength = 0;
}
}
}
}
return output;
}
This will convert any text into a block of text that is broken up into lines of length lineLength and that all start with a tab and end with a newline.
You could make a copy of your string for output that replaces CRLF with CRLF + TAB, and then write that string to the output (still prefixed with the initial TAB).
strErrorOutput = strErrorOutput.Replace("\r\n", "\r\n\t");
WriteToOutput("\t" + strErrorOutput);
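If the text might contain a mix of "\r\n" and "\n" line endings, replacing "\n" alone covers both cases, since "\r\n" also ends in "\n". A minimal sketch (Indenter and IndentBlock are illustrative names, not an existing API):

```csharp
using System;

public static class Indenter
{
    // Prefixes every line with a tab. The "\r" of a Windows line ending
    // stays in place because only the "\n" is extended.
    public static string IndentBlock(string text)
    {
        return "\t" + text.Replace("\n", "\n\t");
    }
}
```

Note that a string ending in a newline will end in a trailing tab after this, which may or may not matter for a log file.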
If you're here looking for a way to word-wrap a string to a certain width, and have each line indented (as I was), here's a solution as an extension method. Roughly based on the answer above, but uses regular expressions to split the original text into word and spaces pairs, then rejoins them, adding line breaks and indentation as needed. (Does not sanity check inputs, so you'll need to add that if needed)
public static string ToBlock(this string text, int lineLength, string indent="")
{
var r = new Regex("([^ ]+)?([ ]+)?");
var matches = r.Match(text);
if (!matches.Success)
{
return text;
}
string output = indent;
int currentLineLength = indent.Length;
while (matches.Success)
{
var groups = matches.Groups;
var nextLength = groups[0].Length;
if (currentLineLength + nextLength <= lineLength)
{
output += groups[0];
currentLineLength += groups[0].Length;
}
else
{
if (currentLineLength + groups[1].Length > lineLength)
{
output += "\n" + indent + groups[0];
currentLineLength = indent.Length + groups[0].Length;
}
else
{
output += groups[1] + "\n" + indent;
currentLineLength = indent.Length;
}
}
matches = matches.NextMatch();
}
return output;
}

Concatenate neighboring characters of a special character "-"

I am developing an application in C#/.NET. If the input entered by the user contains the character '-' (hyphen), I want the immediate neighbors of each hyphen to be concatenated. For example, if a user enters
A-B-C then I want it to be replaced with ABC
AB-CD then I want it to be replaced with BC
ABC-D-E then I want it to be replaced with CDE
AB-CD-K then I want it to be replaced with BC and DK, both separated by the keyword and
After getting this I have to prepare my database query.
I hope I have made the problem clear; if you need more clarification, let me know.
Any help will be much appreciated.
Thanks,
Devjosh
Use:
string[] input = {
"A-B-C",
"AB-CD",
"ABC-D-E",
"AB-CD-K"
};
var regex = new Regex(@"\w(?=-)|(?<=-)\w", RegexOptions.Compiled);
var result = input.Select(s => string.Concat(regex.Matches(s)
.Cast<Match>().Select(m => m.Value)));
foreach (var s in result)
{
Console.WriteLine(s);
}
Output:
ABC
BC
CDE
BCDK
Untested, but this should do the trick, or at the very least lead you in the right direction.
private string Prepare(string input)
{
StringBuilder output = new StringBuilder();
char[] chars = input.ToCharArray();
for (int i = 0; i < chars.Length; i++)
{
if (chars[i] == '-')
{
if (i > 0)
{
output.Append(chars[i - 1]);
}
if (++i < chars.Length)
{
output.Append(chars[i]);
}
else
{
break;
}
}
}
return output.ToString();
}
If you want each pair to form a separate object in an array, try the following code:
private string[] Prepare(string input)
{
List<string> output = new List<string>();
char[] chars = input.ToCharArray();
for (int i = 0; i < chars.Length; i++)
{
if (chars[i] == '-')
{
string o = string.Empty;
if (i > 0)
{
o += chars[i - 1];
}
if (++i < chars.Length)
{
o += chars[i];
}
output.Add(o);
}
}
return output.ToArray();
}
Correct me if I am wrong but surely all you need to do is remove the '-'?
like this:
"A-B-C".Replace("-","");
You can even solve this with a one-liner (although a bit ugly):
String.Join(String.Empty, input.Split('-').Select(q => (q.Length == 0 ? String.Empty : (q.Length > 1 ? (q.First() + q.Last()).ToString() : q.First().ToString())))).Substring(((input[0] + input[1]).ToString().Contains('-') ? 0 : 1), input.Length - ((input[0] + input[1]).ToString().Contains('-') ? 0 : 1) - ((input[input.Length - 1] + input[input.Length - 2]).ToString().Contains('-') ? 0 : 1));
first it splits the string to an array on each '-', then it concatenates only the first and the last character of each string (or just the only character if there's only one, and it leaves the empty string if there's nothing there), and then it concatenates the resulting enumerable to a String. Finally we strip the first and the last letter, if they are not in the needed range.
I know, it's ugly, I'm just saying that it's possible..
It's probably much better to just use a simple
new Regex(@"\w(?=-)|(?<=-)\w", RegexOptions.Compiled)
and then work with that.
EDIT: @Kirill Polishchuk was faster; his solution should work.
EDIT 2
After the Question has been updated, here's a snippet that should do the trick:
string input = "A-B-C";
string s2;
string s3 = "";
string s4 = "";
var splitted = input.Split('-');
foreach(string s in splitted) {
if (s.Length == 0)
s2 = String.Empty;
else
if (s.Length > 1)
s2 = s.First().ToString() + s.Last(); // char + char would add the character codes as integers
else
s2 = s.First().ToString();
s3 += s4 + s2;
s4 = " and ";
}
int beginning;
int end;
if (input.Length > 1)
{
if (input.Substring(0, 2).Contains('-'))
beginning = 0;
else
beginning = 1;
if (input.Substring(input.Length - 2).Contains('-'))
end = 0;
else
end = 1;
}
else
{
if ((input[0]).ToString().Contains('-'))
beginning = 0;
else
beginning = 1;
if ((input[input.Length - 1]).ToString().Contains('-'))
end = 0;
else
end = 1;
}
string result = s3.Substring(beginning, s3.Length - beginning - end);
It's not very elegant, but it should work (not tested though..). it works nearly the same as the one-liner above...
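Another possible reading of the four examples in the question is that each chain of single characters linked by hyphens forms one group, and separate groups are joined with " and ". Under that assumption, a regex-based sketch could look like this (HyphenNeighbors is a made-up name; the Prepare name mirrors the answers above):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HyphenNeighbors
{
    // Each match is a chain such as "B-C" or "C-D-E": one word character
    // followed by one or more "-x" pairs. Dropping the hyphens gives the group.
    public static string Prepare(string input)
    {
        var groups = Regex.Matches(input, @"\w(?:-\w)+")
                          .Cast<Match>()
                          .Select(m => m.Value.Replace("-", ""));
        return string.Join(" and ", groups);
    }
}
```

With the sample inputs this gives ABC, BC, CDE, and BC and DK respectively. Whether " and " is the right separator for the database query is an assumption based on the wording of the question.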

Divide long string into 60 character long lines but don't break words

There has to be a better way to do this.
I just want to split long string into 60 character lines but do not break words. So it doesn't have to add up to 60 characters just has to be less than 60.
The code below is what I have and it works but I'm thinking there's a better way. Anybody?
Modified to use StringBuilder and fixed the problem of removing a repeating word.
Also don't want to use regex because I think that would be less efficient than what I have now.
public static List<String> FormatMe(String Message)
{
Int32 MAX_WIDTH = 60;
List<String> Line = new List<String>();
String[] Words;
Message = Message.Trim();
Words = Message.Split(" ".ToCharArray());
StringBuilder s = new StringBuilder();
foreach (String Word in Words)
{
s.Append(Word + " ");
if (s.Length > MAX_WIDTH)
{
s.Replace(Word, "", 0, s.Length - Word.Length);
Line.Add(s.ToString().Trim());
s = new StringBuilder(Word + " ");
}
}
if (s.Length > 0)
Line.Add(s.ToString().Trim());
return Line;
}
Thanks
Another (now TESTED) sample, very similar to Keith's approach:
static void Main(string[] args)
{
const Int32 MAX_WIDTH = 60;
int offset = 0;
string text = Regex.Replace(File.ReadAllText("oneline.txt"), @"\s{2,}", " ");
List<string> lines = new List<string>();
while (offset < text.Length)
{
int index = text.LastIndexOf(" ",
Math.Min(text.Length, offset + MAX_WIDTH));
string line = text.Substring(offset,
(index - offset <= 0 ? text.Length : index) - offset );
offset += line.Length + 1;
lines.Add(line);
}
}
I ran that on this file with all line breaks manually replaced by " ".
Try this:
const Int32 MAX_WIDTH = 60;
string text = "...";
List<string> lines = new List<string>();
StringBuilder line = new StringBuilder();
foreach(Match word in Regex.Matches(text, @"\S+", RegexOptions.ECMAScript))
{
if (word.Value.Length + line.Length + 1 > MAX_WIDTH)
{
lines.Add(line.ToString());
line.Length = 0;
}
line.Append(String.Format("{0} ", word.Value));
}
if (line.Length > 0)
lines.Add(line.ToString()); // flush the last, partially filled line
Please, also check this out: How do I use a regular expression to add linefeeds?
Inside a Regular expression, the Match Evaluator function (an anonymous method) does the grunt work and stores the newly sized lines into a StringBuilder. We don't use the return value of Regex.Replace method because we're just using its Match Evaluator function as a feature to accomplish line breaking from inside the regular expression call - just for the heck of it, because I think it's cool.
using System;
using System.Text;
using System.Text.RegularExpressions;
strInput is what you want to convert the lines of.
int MAX_LEN = 60;
StringBuilder sb = new StringBuilder();
int bmark = 0; //bookmark position
Regex.Replace(strInput, @".*?\b\w+\b.*?",
delegate(Match m) {
if (m.Index - bmark + m.Length + m.NextMatch().Length > MAX_LEN
|| m.Index == bmark && m.Length >= MAX_LEN) {
sb.Append(strInput.Substring(bmark, m.Index - bmark + m.Length).Trim() + Environment.NewLine);
bmark = m.Index + m.Length;
} return null;
}, RegexOptions.Singleline);
if (bmark != strInput.Length) // last portion
sb.Append(strInput.Substring(bmark));
string strModified = sb.ToString(); // get the real string from builder
It's also worth noting the second condition in the if expression in the Match Evaluator m.Index == bmark && m.Length >= MAX_LEN is meant as an exceptional condition in case there is a word longer than 60 chars (or longer than the set max length) - it will not be broken down here but just stored on one line by itself - I guess you might want to create a second formula for that condition in the real world to hyphenate it or something.
Another one:
public static string SplitLongWords(string text, int maxWordLength)
{
var reg = new Regex(@"\S{" + (maxWordLength + 1) + ",}");
bool replaced;
do
{
replaced = false;
text = reg.Replace(text, (m) =>
{
replaced = true;
return m.Value.Insert(maxWordLength, " ");
});
} while (replaced);
return text;
}
I would start with saving the length of the original string. Then, start backwards, and just subtract, since odds are that I would get below 60 faster by starting at the last word and going back than building up.
Once I know how long, then just use StringBuilder and build up the string for the new string.
List<string> lines = new List<string>();
while (message.Length > 60) {
int idx = message.LastIndexOf(' ', 60);
lines.Add(message.Substring(0, idx));
message = message.Substring(idx + 1, message.Length - (idx + 1));
}
lines.Add(message);
You might need to modify a bit to handle multiple spaces, words with >60 chars in them, ...
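A sketch of what those modifications might look like. It assumes that words longer than the width should simply be hard-broken and that extra spaces only need trimming at line boundaries; SafeLineSplitter is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class SafeLineSplitter
{
    public static List<string> Split(string message, int width)
    {
        if (width < 1) throw new ArgumentOutOfRangeException("width");
        var lines = new List<string>();
        message = message.Trim();
        while (message.Length > width)
        {
            // Search backwards from index width for the last space.
            int idx = message.LastIndexOf(' ', width);
            if (idx <= 0)
            {
                // No space in range: the word is longer than width, so hard-break it.
                lines.Add(message.Substring(0, width));
                message = message.Substring(width).TrimStart();
            }
            else
            {
                lines.Add(message.Substring(0, idx).TrimEnd());
                message = message.Substring(idx + 1).TrimStart();
            }
        }
        if (message.Length > 0) lines.Add(message);
        return lines;
    }
}
```

Runs of spaces in the middle of a line are kept as they are; collapsing them first (for example with a regex) would be a separate step.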
I tried the original solution and found that it didn't quite work. I've modified it slightly to make it work. It now works for me and solves a problem I had. Thanks.
Jim.
public static List<String> FormatMe(String message)
{
int maxLength = 10;
List<String> Line = new List<String>();
String[] words;
message = message.Trim();
words = message.Split(" ".ToCharArray());
StringBuilder sentence = new StringBuilder();
foreach (String word in words)
{
if((sentence.Length + word.Length) <= maxLength)
{
sentence.Append(word + " ");
}
else
{
Line.Add(sentence.ToString().Trim());
sentence = new StringBuilder(word + " ");
}
}
if (sentence.Length > 0)
Line.Add(sentence.ToString().Trim());
return Line;
}
private void btnSplitText_Click(object sender, EventArgs e)
{
List<String> Line = new List<string>();
string message = "The quick brown fox jumps over the lazy dog.";
Line = FormatMe(message);
}
