Splitting text into lines with maximum length - c#

I have a long string and I want to fit that in a small field. To achieve that, I break the string into lines on whitespace. The algorithm goes like this:
public static string BreakLine(string text, int maxCharsInLine)
{
int charsInLine = 0;
StringBuilder builder = new StringBuilder();
for (int i = 0; i < text.Length; i++)
{
char c = text[i];
builder.Append(c);
charsInLine++;
if (charsInLine >= maxCharsInLine && char.IsWhiteSpace(c))
{
builder.AppendLine();
charsInLine = 0;
}
}
return builder.ToString();
}
But this breaks when there's a short word, followed by a longer word. "foo howcomputerwork" with a max length of 16 doesn't break, but I want it to. One thought I has was looking forward to see where the next whitespace occurs, but I'm not sure whether that would result in the fewest lines possible.

Enjoy!
public static string SplitToLines(string text, char[] splitOnCharacters, int maxStringLength)
{
var sb = new StringBuilder();
var index = 0;
while (text.Length > index)
{
// start a new line, unless we've just started
if (index != 0)
sb.AppendLine();
// get the next substring, else the rest of the string if remainder is shorter than `maxStringLength`
var splitAt = index + maxStringLength <= text.Length
? text.Substring(index, maxStringLength).LastIndexOfAny(splitOnCharacters)
: text.Length - index;
// if can't find split location, take `maxStringLength` characters
splitAt = (splitAt == -1) ? maxStringLength : splitAt;
// add result to collection & increment index
sb.Append(text.Substring(index, splitAt).Trim());
index += splitAt;
}
return sb.ToString();
}
Note that splitOnCharacters and maxStringLength could be saved in user settings area of the app.

Check the contents of the character before writing to the string builder and or it with the current count:
public static string BreakLine(string text, int maxCharsInLine)
{
int charsInLine = 0;
StringBuilder builder = new StringBuilder();
for (int i = 0; i < text.Length; i++)
{
char c = text[i];
if (char.IsWhiteSpace(c) || charsInLine >= maxCharsInLine)
{
builder.AppendLine();
charsInLine = 0;
}
else
{
builder.Append(c);
charsInLine++;
}
}
return builder.ToString();
}

update a code a bit, the #dead.rabit goes to loop sometime.
public static string SplitToLines(string text,char[] splitanyOf, int maxStringLength)
{
var sb = new System.Text.StringBuilder();
var index = 0;
var loop = 0;
while (text.Length > index)
{
// start a new line, unless we've just started
if (loop != 0)
{
sb.AppendLine();
}
// get the next substring, else the rest of the string if remainder is shorter than `maxStringLength`
var splitAt = 0;
if (index + maxStringLength <= text.Length)
{
splitAt = text.Substring(index, maxStringLength).LastIndexOfAny(splitanyOf);
}
else
{
splitAt = text.Length - index;
}
// if can't find split location, take `maxStringLength` characters
if (splitAt == -1 || splitAt == 0)
{
splitAt = text.IndexOfAny(splitanyOf, maxStringLength);
}
// add result to collection & increment index
sb.Append(text.Substring(index, splitAt).Trim());
if(text.Length > splitAt)
{
text = text.Substring(splitAt + 1).Trim();
}
else
{
text = string.Empty;
}
loop = loop + 1;
}
return sb.ToString();
}

Related

Take only letters from the string and reverse them

I'm preparing for my interview, faced the problem with the task. The case is that we're having a string:
test12pop90java989python
I need to return new string where words will be reversed and numbers will stay in the same place:
test12pop90java989python ==> tset12pop90avaj989nohtyp
What I started with:
Transferring string to char array
Use for loop + Char.IsNumber
??
var charArray = test.ToCharArray();
for (int i = 0; i < charArray.Length; i++)
{
if (!Char.IsNumber(charArray[i]))
{
....
}
}
but currently I'm stuck and don't know how to proceed, any tips how it can be done?
You can't reverse a run of letters until you've observed the entire run; until then, you need to keep track of the pending letters to be reversed and appended to the final output upon encountering a number or the end of the string. By storing these pending characters in a Stack<> they are naturally returned in the reverse order they were added.
static string Transform(string input)
{
StringBuilder outputBuilder = new StringBuilder(input.Length);
Stack<char> pending = new Stack<char>();
foreach (char c in input)
if (char.IsNumber(c))
{
// In the reverse order of which they were added, consume
// and append pending characters as long as they are available
while (pending.Count > 0)
outputBuilder.Append(pending.Pop());
// Alternatively...
//foreach (char p in pending)
// outputBuilder.Append(p);
//pending.Clear();
outputBuilder.Append(c);
}
else
pending.Push(c);
// Handle pending characters when input does not end with a number
while (pending.Count > 0)
outputBuilder.Append(pending.Pop());
return outputBuilder.ToString();
}
A similar but buffer-free way is to do it is to store the index of the start of the current run of letters, then walk back through and append each character when a number is found...
static string Transform(string input)
{
StringBuilder outputBuilder = new StringBuilder(input.Length);
int lettersStartIndex = -1;
for (int i = 0; i < input.Length; i++)
{
char c = input[i];
if (char.IsNumber(c))
{
if (lettersStartIndex >= 0)
{
// Iterate backwards from the previous character to the start of the run
for (int j = i - 1; j >= lettersStartIndex; j--)
outputBuilder.Append(input[j]);
lettersStartIndex = -1;
}
outputBuilder.Append(c);
}
else if (lettersStartIndex < 0)
lettersStartIndex = i;
}
// Handle remaining characters when input does not end with a number
if (lettersStartIndex >= 0)
for (int j = input.Length - 1; j >= lettersStartIndex; j--)
outputBuilder.Append(input[j]);
return outputBuilder.ToString();
}
For both implementations, calling Transform() with...
string[] inputs = new string[] {
"test12pop90java989python",
"123test12pop90java989python321",
"This text contains no numbers",
"1a2b3c"
};
for (int i = 0; i < inputs.Length; i++)
{
string input = inputs[i];
string output = Transform(input);
Console.WriteLine($" Input[{i}]: \"{input }\"");
Console.WriteLine($"Output[{i}]: \"{output}\"");
Console.WriteLine();
}
...produces this output...
Input[0]: "test12pop90java989python"
Output[0]: "tset12pop90avaj989nohtyp"
Input[1]: "123test12pop90java989python321"
Output[1]: "123tset12pop90avaj989nohtyp321"
Input[2]: "This text contains no numbers"
Output[2]: "srebmun on sniatnoc txet sihT"
Input[3]: "1a2b3c"
Output[3]: "1a2b3c"
A possible solution using Regex and Linq:
using System;
using System.Text.RegularExpressions;
using System.Linq;
public class Program
{
public static void Main()
{
var result = "";
var matchList = Regex.Matches("test12pop90java989python", "([a-zA-Z]*)(\\d*)");
var list = matchList.Cast<Match>().SelectMany(o =>o.Groups.Cast<Capture>().Skip(1).Select(c => c.Value));
foreach (var el in list)
{
if (el.All(char.IsDigit))
{
result += el;
}
else
{
result += new string(el.Reverse().ToArray());
}
}
Console.WriteLine(result);
}
}
I've used code from stackoverflow.com/a/21123574/1037948 to create a list of Regex matches on line 11:
var list = matchList.Cast<Match>().SelectMany(o =>o.Groups.Cast<Capture>().Skip(1).Select(c => c.Value));
Hey you can do something like:
string test = "test12pop90java989python", tempStr = "", finalstr = "";
var charArray = test.ToCharArray();
for (int i = 0; i < charArray.Length; i++)
{
if (!Char.IsNumber(charArray[i]))
{
tempStr += charArray[i];
}
else
{
char[] ReverseString = tempStr.Reverse().ToArray();
foreach (char charItem in ReverseString)
{
finalstr += charItem;
}
tempStr = "";
finalstr += charArray[i];
}
}
if(tempStr != "" && tempStr != null)
{
char[] ReverseString = tempStr.Reverse().ToArray();
foreach (char charItem in ReverseString)
{
finalstr += charItem;
}
tempStr = "";
}
I hope this helps

How to implement GetLineStartPosition for a WPF TextBox UI Control?

How to implement the RichTextBox API, GetLineStartPosition, for a WPF TextBox UI Control?
var charpos_x1 = GetLineStartPosition(0); // get charpos start of the current line
var charpos_x2 = GetLineStartPosition(1); // get charpos start of the next line
var charpos_x3 = GetLineStartPosition(-1); // get charpos start of the prev line
A TextPointer is just a reference to a location in the content of the rich text. TextBox doesn’t expose that but you can use an integer representing the offset of the character in the text. So based on the docs for GetLineStartPosition I think something like the following may work:
public int? GetLineStartPosition(this TextBox tb, int charIndex, int count)
{
/// get the line # for character idx then adjust the line
/// number by the specified # of lines
var l = tb.GetLineIndexFromCharacterIndex(charIndex) + count;
/// if its not valid return null
if (l < 0 || l >= tb.LineCount) return null;
/// otherwise return the character index of the start
/// of the adjusted line
return tb.GetCharacterIndexFromLineIndex(l);
}
Then just use it like this:
tb.GetLineStartPosition(tb.CaretIndex, 0);
public struct LinePos_t
{
public int indexz; // Multiline TextBox.Line End-Position
public int index0; // n-Line Begin Position
public int index1; // n-Line End Position
public int len; // n-Line Length
public string str; // n-line String
};
private static LinePos_t TextBoxMultiline_GetLinePosition(
TextBox tb,
int n
)
{
// (n == 0): current Line
// (n == 1): next line
// (n == -1): prev line
// Notes: end of line marked by: "\r\n" for Textbox.Text
// User can never postion Cursor on '\n' character...
// but can position it on \r character
LinePos_t linepos = new LinePos_t();
int anchor;
int repeat;
string c;
tb.Focus();
if (tb.Text.Length == 0)
{
linepos.indexz = 0;
linepos.index0 = 0;
linepos.index1 = 0;
linepos.len = 0;
linepos.str = "";
return linepos;
}
// Position End of Text
linepos.indexz = tb.Text.Length;
if (linepos.indexz != 0)
linepos.indexz--;
// Position Init Carrot
if (tb.CaretIndex > linepos.indexz)
anchor = linepos.indexz;
else
anchor = tb.CaretIndex;
// Rewind when n is negative
if (n < 0)
{
repeat = -n;
for (int i = 0; i < repeat; i++)
{
anchor = tb.Text.LastIndexOfAny(new char[] { '\r' }, anchor);
if (anchor == -1)
{
anchor = 0;
break;
}
}
//Select Current Line
n = 0;
}
repeat = n + 1;
for (int i = 0; i < repeat; i++)
{
linepos.index0 = tb.Text.LastIndexOfAny(new char[] { '\n' }, anchor);
if (linepos.index0 == -1)
{
linepos.index0 = 0;
break;
}
else
{
linepos.index0++;
}
}
linepos.index1 = tb.Text.IndexOfAny(new char[] { '\n' }, anchor);
if (linepos.index1 == -1)
linepos.index1 = linepos.indexz;
// Length of Line
linepos.len = linepos.index1 - linepos.index0 + 1;
linepos.str = tb.Text.Substring(linepos.index0, linepos.len);
return linepos;
}

How to find biggest substring from string1 into string2

Lets suppose i have two strings string1 and string2.
var string1 = "images of canadian geese goslings";
var string2 = "Canadian geese with goslings pictures to choose from, with no signup needed";
I need to find biggest substring of string1 which matches in string2.
here biggest substring will be "canadian geese" which is matching in string2.
How can I find it? I tried breaking string1 into char[] and find words then merged matched words but that failed my objective.
classy loop approach - the result includes te space after geese "canadian geese "
var string1 = "images of canadian geese goslings";
var string2 = "Canadian geese with goslings pictures to choose from, with no signup needed";
string result = "";
for (int i = 0; i < string1.Length; i++)
{
for (int j = 0; j < string1.Length - i; j++)
{
//add .Trim() here if you want to ignore space characters
string searchpattern = string1.Substring(i, j);
if (string2.IndexOf(searchpattern, StringComparison.OrdinalIgnoreCase) > -1 && searchpattern.Length > result.Length)
{
result = searchpattern;
}
}
}
https://dotnetfiddle.net/q3rHjI
Side note:
canadian and Canadian are not equal so you have to use StringComparison.OrdinalIgnoreCase if you want to search case insensitive
Have a look at the following code https://dotnetfiddle.net/aPyw3o
public class Program {
static IEnumerable<string> substrings(string s, int length) {
for (int i = 0 ; i + length <= s.Length; i++) {
var ss = s.Substring(i, length);
if (!(ss.StartsWith(" ") || ss.EndsWith(" ")))
yield return ss;
}
}
public static void Main()
{
int count = 0;
var string1 = "images of canadian geese goslings";
var string2 = "Canadian geese with goslings pictures to choose from, with no signup needed";
string result = null;
for (int i = string1.Length; i>0 && string.IsNullOrEmpty(result); i--) {
foreach (string s in substrings(string1, i)) {
count++;
if (string2.IndexOf(s, StringComparison.CurrentCultureIgnoreCase) >= 0) {
result = s;
break;
}
}
}
if (string.IsNullOrEmpty(result))
Console.WriteLine("no common substrings found");
else
Console.WriteLine("'" + result + "'");
Console.WriteLine(count);
}
}
The substrings method returns all substrings of the string s with the length of length (for the yield have a look at the documentation https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/yield) We skip substrings that start or end with a space, as we don't want whitespaces to make a substring longer than it really is)
The outer loop iterates through all the possible length values for the substrings, from the longest (ie string1.Length) to the shortest (ie 1). Then for each of the found substrings of length i is checked, if it's also a substring of string2. If that's the case, we can stop, as there cannot be any longer common substring, because we checked all longer substrings in previous iterations. But of course there may be other common substrings with length i
I'm going to add one more, using span/readonlymemory so you can avoid allocating all the strings that the current answers create. Note I didn't do any check for starting space or ending space as that does not seem to be a requirement for the question. This does do a case insensitive search, if you don't want that you can make it more efficient by using the built in indexof and dropping the case insensitive compares.
static void Main(string[] _)
{
var string1 = "images of canadian geese goslings";
var string2 = "Canadian geese with goslings pictures to choose from, with no signup needed";
var longest = FindLongestMatchingSubstring(string1, string2);
Console.WriteLine(longest);
}
static string FindLongestMatchingSubstring(string lhs, string rhs)
{
var left = lhs.AsMemory();
var right = rhs.AsMemory();
ReadOnlyMemory<char> longest = ReadOnlyMemory<char>.Empty;
for (int i = 0; i < left.Length; ++i)
{
foreach (var block in FindMatchingSubSpans(left, i, right))
{
if (block.Length > longest.Length)
longest = block;
}
}
if (longest.IsEmpty)
return string.Empty;
return longest.ToString();
}
static IEnumerable<ReadOnlyMemory<char>> FindMatchingSubSpans(ReadOnlyMemory<char> source, int pos, ReadOnlyMemory<char> matchFrom)
{
int lastMatch = 0;
for (int i = pos; i < source.Length; ++i)
{
var ch = source.Span[i];
int match = IndexOfChar(matchFrom, lastMatch, ch);
if (-1 != match)
{
lastMatch = match + 1;
int end = i;
while (++end < source.Length && ++match < matchFrom.Length)
{
char lhs = source.Span[end];
char rhs = matchFrom.Span[match];
if (lhs != rhs && lhs != (char.IsUpper(rhs) ? char.ToLower(rhs) : char.ToUpper(rhs)))
{
break;
}
}
yield return source.Slice(i, end - i);
}
}
}
static int IndexOfChar(ReadOnlyMemory<char> source, int pos, char ch)
{
char alt = char.IsUpper(ch) ? char.ToLower(ch) : char.ToUpper(ch);
for (int i = pos; i < source.Length; ++i)
{
char m = source.Span[i];
if (m == ch || m == alt)
return i;
}
return -1;
}

trying to parse a linktext in c#

How to parse a link as example: 'a/b/c' ?
How could I fix this code that returns: 1. 'a' 2. 'b/c' 3. empty
int getSizeOfParser(string links, char c)
{
int size = 0;
if (!string.IsNullOrEmpty(links))
{
for (int i = 0; i < links.Length; i++)
{
if (links[i] == c)
size++;
}
return size + 1;
}
return -1;
}
string[] parsedLink(string links, char c)
{
int size = getSizeOfParser(links, c);
if (size == -1)
return null;
string[] parsed = new string[size];
int i = 0, index = 0, tmp = 0;
while (i < links.Length)
{
if (links[i] == c)
{
parsed[index++] = links.Substring(tmp, i++);
tmp = i;
}
else
i++;
}
return parsed;
}
According to documentation, the second argument of SubString is its length from the index int he first argument:
Substring(Int32, Int32)
Retrieves a substring from this instance. The substring starts at a specified character position and has a specified length.
So what you want to do is:
if (links[i] == c)
{
parsed[index++] = links.Substring(tmp, i-tmp);
tmp = i+1;
}
i++;
instead of:
if (links[i] == c)
{
parsed[index++] = links.Substring(tmp, i++);
tmp = i;
}
else
i++;

Fix string for JavaScript in C#

I have a function that fixed non-printable characters in C# for JavaScript. But it works very slow! How to increase speed of this function?
private static string JsStringFixNonPrintable(string Source)
{
string Result = "";
for (int Position = 0; Position < Source.Length; ++Position)
{
int i = Position;
var CharCat = char.GetUnicodeCategory(Source, i);
if (Char.IsWhiteSpace(Source[i]) ||
CharCat == System.Globalization.UnicodeCategory.LineSeparator ||
CharCat == System.Globalization.UnicodeCategory.SpaceSeparator) { Result += " "; continue; }
if (Char.IsControl(Source[i]) && Source[i] != 10 && Source[i] != 13) continue;
Result += Source[i];
}
return Result;
}
I have recoded your snippet of code using StringBuilder class, with predefined buffer size... that is much faster than your sample.
private static string JsStringFixNonPrintable(string Source)
{
StringBuilder builder = new StringBuilder(Source.Length); // predefine size to be the same as input
for (int it = 0; it < Source.Length; ++it)
{
var ch = Source[it];
var CharCat = char.GetUnicodeCategory(Source, it);
if (Char.IsWhiteSpace(ch) ||
CharCat == System.Globalization.UnicodeCategory.LineSeparator ||
CharCat == System.Globalization.UnicodeCategory.SpaceSeparator) { builder.Append(' '); continue; }
if (Char.IsControl(ch) && ch != 10 && ch != 13) continue;
builder.Append(ch);
}
return builder.ToString();
}
Instead of concatenating to the string, try using System.Text.StringBuilder which internally maintains a character buffer and does not create a new object every time you append.
Example:
StringBuilder sb = new StringBuilder();
sb.Append('a');
sb.Append('b');
sb.Append('c');
string result = sb.ToString();
Console.WriteLine(result); // prints 'abc'
Use Stringbuilder
http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.aspx
and replace characters in-place, that should speed up things

Categories