Indent multiple lines of text - c#

I need to indent multiple lines of text (in contrast to this question for a single line of text).
Let's say this is my input text:
First line
Second line
Last line
What I need is this result:
    First line
    Second line
    Last line
Notice the indentation in each line.
This is what I have so far:
var textToIndent = @"First line
Second line
Last line.";
var splittedText = textToIndent.Split(new string[] {Environment.NewLine}, StringSplitOptions.None);
var indentAmount = 4;
var indent = new string(' ', indentAmount);
var sb = new StringBuilder();
foreach (var line in splittedText) {
sb.Append(indent);
sb.AppendLine(line);
}
var result = sb.ToString();
Is there a safer/simpler way to do it?
My concern is the Split call, which might be tricky if text from Linux, Mac or Windows is transferred: the lines might not be split correctly on the target machine.

Since you are indenting all the lines, how about doing something like:
var result = indent + textToIndent.Replace("\n", "\n" + indent);
Which should cover both Windows \r\n and Unix \n end of lines.
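A minimal sketch of why this works on mixed input, assuming a hypothetical helper named IndentLines (not part of the original answer). The '\r' of a Windows line ending stays attached to the end of the previous line, so only '\n' needs to be matched. Two caveats: a lone '\r' (old Mac style) is not treated as a line break, and a trailing newline leaves indentation at the very end of the result.

```csharp
using System;

static class IndentDemo
{
    // Hypothetical helper: prefixes every line of text with `amount` spaces.
    public static string IndentLines(string text, int amount)
    {
        var indent = new string(' ', amount);
        // Insert the indent after every '\n'. For "\r\n" the '\r' simply
        // remains at the end of the preceding line, so both styles work.
        return indent + text.Replace("\n", "\n" + indent);
    }
}
```

For example, IndentDemo.IndentLines("First line\r\nSecond line\nLast line", 4) indents all three lines and leaves each original line ending untouched.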

Just replace your newline with newline + indent:
var indentAmount = 4;
var indent = new string(' ', indentAmount);
textToIndent = indent + textToIndent.Replace(Environment.NewLine, Environment.NewLine + indent);

The following solution may seem long-winded compared to other solutions posted here; but it has a few distinct advantages:
It will preserve line separators / terminators exactly as they are in the input string.
It will not append superfluous indentation characters at the end of the string.
It might run faster, as it uses only very primitive operations (character comparisons and copying; no substring searches, nor regular expressions). (But that's just my expectation; I haven't actually measured.)
static string Indent(this string str, int count = 1, char indentChar = ' ')
{
var indented = new StringBuilder();
var i = 0;
while (i < str.Length)
{
indented.Append(indentChar, count);
var j = str.IndexOf('\n', i + 1);
if (j > i)
{
indented.Append(str, i, j - i + 1);
i = j + 1;
}
else
{
break;
}
}
indented.Append(str, i, str.Length - i);
return indented.ToString();
}

Stakx's answer got me thinking about not appending superfluous indentation characters. I think it is best to avoid those characters not only at the end, but also in the middle and at the beginning of the string (when a line contains nothing else).
I used a Regex to replace new lines only if they are not followed by another new line, and another Regex to avoid adding the first indent in case the string begins with a new line:
Regex regexForReplace = new Regex(@"(\n)(?![\r\n])");
Regex regexForFirst = new Regex(@"^([\r\n]|$)");
string Indent(string textToIndent, int indentAmount = 1, char indentChar = ' ')
{
var indent = new string(indentChar, indentAmount);
string firstIndent = regexForFirst.Match(textToIndent).Success ? "" : indent;
return firstIndent + regexForReplace.Replace(textToIndent, @"$1" + indent);
}
I create the Regex objects outside the method in order to speed up multiple replacements.
This solution can be tested at: https://ideone.com/9yu5Ih

If you need a string extension that adds a generic indent to a multi line string you can use:
public static string Indent(this string input, string indent)
{
return string.Join(Environment.NewLine, input.Split(Environment.NewLine).Select(item => string.IsNullOrEmpty(item.Trim()) ? item : indent + item));
}
This extension skips empty lines.
This solution is really simple to understand if you know LINQ, and it is simpler to debug and change if you need to adapt it to different scopes.
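Note that the extension above splits on Environment.NewLine only, so text produced on another platform may not be split as expected. The following is a sketch of a variant that accepts all three common line-break conventions; the names MultiNewlineIndent and IndentAnyNewline are hypothetical. It relies on "\r\n" being listed before "\n" and "\r" so that a Windows line break is consumed as one separator.

```csharp
using System;
using System.Linq;

static class MultiNewlineIndent
{
    // "\r\n" comes first so it is matched as a unit rather than as '\r' then '\n'.
    private static readonly string[] LineBreaks = { "\r\n", "\n", "\r" };

    // Hypothetical variant of the extension above: indents non-blank lines
    // regardless of which line-break style the input uses.
    public static string IndentAnyNewline(this string input, string indent)
    {
        var lines = input
            .Split(LineBreaks, StringSplitOptions.None)
            .Select(line => string.IsNullOrWhiteSpace(line) ? line : indent + line);

        // The result is normalized to the current platform's line break.
        return string.Join(Environment.NewLine, lines);
    }
}
```

The trade-off is that the original line breaks are not preserved: every break in the output becomes Environment.NewLine.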

Related

C# Console Word Wrap

I have a string with newline characters and I want to wrap the words. I want to keep the newline characters so that when I display the text it looks like separate paragraphs. Does anyone have a good function to do this? The current function and code are below (not my own function). The WordWrap function seems to be stripping out \n characters.
static void Main(string[] args){
StreamReader streamReader = new StreamReader("E:/Adventure Story/Intro.txt");
string intro = "";
string line;
while ((line = streamReader.ReadLine()) != null)
{
intro += line;
if(line == "")
{
intro += "\n\n";
}
}
WordWrap(intro);
}
public static void WordWrap(string paragraph)
{
paragraph = new Regex(@" {2,}").Replace(paragraph.Trim(), @" ");
var left = Console.CursorLeft; var top = Console.CursorTop; var lines = new List<string>();
for (var i = 0; paragraph.Length > 0; i++)
{
lines.Add(paragraph.Substring(0, Math.Min(Console.WindowWidth, paragraph.Length)));
var length = lines[i].LastIndexOf(" ", StringComparison.Ordinal);
if (length > 0) lines[i] = lines[i].Remove(length);
paragraph = paragraph.Substring(Math.Min(lines[i].Length + 1, paragraph.Length));
Console.SetCursorPosition(left, top + i); Console.WriteLine(lines[i]);
}
}
Here is a word wrap function that works by using regular expressions to find the places that it's ok to break and places where it must break. Then it returns pieces of the original text based on the "break zones". It even allows for breaks at hyphens (and other characters) without removing the hyphens (since the regex uses a zero-width positive lookbehind assertion).
IEnumerable<string> WordWrap(string text, int width)
{
const string forcedBreakZonePattern = @"\n";
const string normalBreakZonePattern = @"\s+|(?<=[-,.;])|$";
var forcedZones = Regex.Matches(text, forcedBreakZonePattern).Cast<Match>().ToList();
var normalZones = Regex.Matches(text, normalBreakZonePattern).Cast<Match>().ToList();
int start = 0;
while (start < text.Length)
{
var zone =
forcedZones.Find(z => z.Index >= start && z.Index <= start + width) ??
normalZones.FindLast(z => z.Index >= start && z.Index <= start + width);
if (zone == null)
{
yield return text.Substring(start, width);
start += width;
}
else
{
yield return text.Substring(start, zone.Index - start);
start = zone.Index + zone.Length;
}
}
}
If you want an extra newline so that the text looks like paragraphs, just use the Replace method of your string object.
var str =
"Line 1\n" +
"Line 2\n" +
"Line 3\n";
Console.WriteLine("Before:\n" + str);
str = str.Replace("\n", "\n\n");
Console.WriteLine("After:\n" + str);
Recently I've been working on creating some abstractions that imitate window-like features in a performance- and memory-sensitive console context.
To this end I had to implement word-wrapping functionality without any unnecessary string allocations.
The following is what I managed to simplify it into. This method:
preserves new-lines in the input string,
allows you to specify what characters it should break on (space, hyphen, etc.),
returns the start indices and lengths of the lines via Microsoft.Extensions.Primitives.StringSegment struct instances (but it's very simple to replace this struct with your own, or append directly to a StringBuilder).
public static IEnumerable<StringSegment> WordWrap(string input, int maxLineLength, char[] breakableCharacters)
{
int lastBreakIndex = 0;
while (true)
{
var nextForcedLineBreak = lastBreakIndex + maxLineLength;
// If the remainder is shorter than the allowed line-length, return the remainder. Short-circuits instantly for strings shorter than line-length.
if (nextForcedLineBreak >= input.Length)
{
yield return new StringSegment(input, lastBreakIndex, input.Length - lastBreakIndex);
yield break;
}
// If there are native new lines before the next forced break position, use the last native new line as the starting position of our next line.
int nativeNewlineIndex = input.LastIndexOf(Environment.NewLine, nextForcedLineBreak, maxLineLength);
if (nativeNewlineIndex > -1)
{
nextForcedLineBreak = nativeNewlineIndex + Environment.NewLine.Length + maxLineLength;
}
// Find the last breakable point preceding the next forced break position (and include the breakable character, which might be a hyphen).
var nextBreakIndex = input.LastIndexOfAny(breakableCharacters, nextForcedLineBreak, maxLineLength) + 1;
// If there is no breakable point, which means a word is longer than line length, force-break it.
if (nextBreakIndex == 0)
{
nextBreakIndex = nextForcedLineBreak;
}
yield return new StringSegment(input, lastBreakIndex, nextBreakIndex - lastBreakIndex);
lastBreakIndex = nextBreakIndex;
}
}

How to bring back removed spaces

I have a string with multiple spaces between the words. I want to remove these spaces and then bring them back. The problem is bringing the spaces back after deleting them: I tried to code my idea, but it runs without any output and sometimes throws an "Index was outside of array bounds" exception. Any help?
string pt = "My name is Code"
int ptLenght = pt.Length;
char[] buffer1 = pt.ToCharArray();
spaceHolder = new int[pt.Length];
for (int m = 0; m < pt.Length; m++)
{
if (buffer1[m] == ' ')
{
hold = m;
spaceHolder[m] = hold;
}
}
char[] buffer = pt.Replace(" ", string.Empty).ToCharArray();
int stringRemovedSpaces = pt.Length;
char[] buffer = pt.ToCharArray(); // source
char[] buffer2 = new char[ptLenght]; // destination
for (int i = 0; i < pt.Length; i++)
{
buffer2[i] = buffer[i];
}
for (int i = 0; i < buffer2.Length; i++)
{
if (i == spaceHolder[i])
{
for (int m = stringRemovedSpaces; m <= i; m--)
{
buffer2[m-1] = buffer2[m];
}
buffer2[i] = ' ';
}
}
return new string(buffer2);
I suspect that you want to replace multiple spaces with a single one. The fastest and easiest way to do this is with a simple regular expression that replaces runs of whitespace with a single space, e.g.:
var newString = Regex.Replace("A   B   C", @"\s+", " ");
or
var newString = Regex.Replace("A   B   C", @"[ ]+", " ");
The reason this is faster than other methods like repeated string replacements is that a regular expression does not generate temporary strings. Internally it generates a list of indexes to matching items. A string is generated only when the code asks for a string value, like a match or replacement.
A regular expression object is thread-safe as well. You can create it once, store it in a static field and reuse it :
static Regex _spacer = new Regex(@"\s+");
public void MyMethod(string someInput)
{
...
var newString=_spacer.Replace(someInput, " ");
...
}
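The two patterns shown in this answer are not interchangeable: \s+ matches any run of whitespace (spaces, tabs, line breaks), while [ ]+ matches runs of the space character only. A small sketch of the difference, using hypothetical helper names CollapseAll and CollapseSpaces:

```csharp
using System;
using System.Text.RegularExpressions;

static class WhitespaceDemo
{
    // Created once and reused, as suggested above.
    private static readonly Regex AnyWhitespace = new Regex(@"\s+");
    private static readonly Regex SpacesOnly = new Regex(@"[ ]+");

    // Collapses every run of whitespace (including tabs and newlines) to one space.
    public static string CollapseAll(string s) => AnyWhitespace.Replace(s, " ");

    // Collapses runs of the space character only; tabs and newlines survive.
    public static string CollapseSpaces(string s) => SpacesOnly.Replace(s, " ");
}
```

Which one is appropriate depends on whether tabs and line breaks in the input carry meaning that should be kept.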
Deleting all the spaces in a string is really just as simple as calling the Replace function on a string. Please note that when you call Replace on a string, it does not modify the original string, it creates and returns a new string. I only bring this up because you set two different integer variables to pt.Length, and at no point is the string pt actually modified. Also, I would imagine you are getting the "Index was outside of array bounds" from the nested for loop. You have the loop set up in a way that m will decrement forever. However, once m equals zero, you will be trying to access an index of -1 from buffer2, which is a no no. If you want to delete the spaces while retaining an original copy of the string you could simplify your code to this:
string pt = "My name is Code";
string newpt = pt.Replace(" ", string.Empty);
pt is a string with spaces, newpt is a new string with spaces removed. However, if you want to replace multiple consecutive spaces with a single space, I would recommend following the answer Panagiotis Kanavos gave.
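If the goal really is to strip the spaces and later put them back in the same places without keeping the original string, one option is to record the index of each removed space and re-insert in ascending order. This is only a sketch under that assumption; SpaceRestorer, RemoveSpaces and RestoreSpaces are hypothetical names, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class SpaceRestorer
{
    // Returns the text without spaces and records the original index of each space.
    public static string RemoveSpaces(string text, out List<int> spacePositions)
    {
        spacePositions = new List<int>();
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == ' ')
                spacePositions.Add(i);   // remember where the space was
            else
                sb.Append(text[i]);
        }
        return sb.ToString();
    }

    // Re-inserts spaces at the recorded positions. The positions must be in
    // ascending order so that each insert lands at its original index.
    public static string RestoreSpaces(string stripped, List<int> spacePositions)
    {
        var sb = new StringBuilder(stripped);
        foreach (int pos in spacePositions)
            sb.Insert(pos, ' ');
        return sb.ToString();
    }
}
```

Because the positions refer to the original string, inserting them from lowest to highest rebuilds the original prefix step by step, so no index ever falls outside the buffer.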
I followed the advice of @maccettura and I did it like this and it worked as I wished :)
string orignalString = "";
orignalString = pt;
char[] buffer = pt.ToCharArray();
orignalString = new string(buffer);
return orignalString.Replace(" ", string.Empty);
// Here I got the string without spaces , now I have the same string with and without spaces

C# More intuitive way to split a string into tokens?

I have a method which takes in a string, which contains various characters, but I'm only concerned about underscores '_' and dollar signs '$'. I want to split up the string into tokens by underscores as each piece b/w the underscores contains important information.
However, if a $ is contained in an area between underscores, then a token should be created from the last occurrence of an underscore to the end (ignoring any underscores in this last section).
Example
input: Hello_To_The$Great_World
expected tokens: Hello, To, The$Great_World
Question
I have a solution below, but I'm wondering is there a cleaner/more intuitive way of doing this than what I have below?
var aTokens = new List<string>();
var aPos = 0;
for (var aNum = 0; aNum < item.Length; aNum++)
{
if (aNum == item.Length - 1)
{
aTokens.Add(item.Substring(aPos, item.Length - aPos));
break;
}
if (item[aNum] == '$')
{
aTokens.Add(item.Substring(aPos, item.Length - aPos));
break;
}
if (item[aNum] == '_')
{
aTokens.Add(item.Substring(aPos, aNum - aPos));
aPos = aNum + 1;
}
}
You can split string by _ not having $ before them.
For that you can use the following regex:
(?<!\$.*)_
Sample code:
string input = "Hello_To_The$Great_World";
string[] output = Regex.Split(input, @"(?<!\$.*)_");
You also can do the task without regex and without loops, but with the help of 2 splits:
string input = "Hello_To_The$Great_World";
string[] temp = input.Split(new[] { '$' }, 2);
string[] output = temp[0].Split('_');
if (temp.Length > 1)
output[output.Length - 1] = output[output.Length - 1] + "$" + temp[1];
This method is not efficient or clean, but it gives you a general idea of how to do this:
Split your string into tokens
Find the index of the first string to contain $
Return a new array with the first n tokens and the final token is the remaining strings concatenated.
It's probably more useful to take advantage of IEnumerable or do things over a for loop instead of all this Array.Copy stuff... but you get the gist of it.
private string[] SomeMethod(string arg)
{
var strings = arg.Split(new[] { '_' });
var indexedValue = strings.Select((v, i) => new { Value = v, Index = i }).FirstOrDefault(x => x.Value.Contains("$"));
if (indexedValue != null)
{
var count = indexedValue.Index + 1;
string[] final = new string[count];
Array.Copy(strings, 0, final, 0, indexedValue.Index);
final[indexedValue.Index] = String.Join("_", strings, indexedValue.Index, strings.Length - indexedValue.Index);
return final;
}
return strings;
}
Here's my version (loops are so last year...)
const char dollar = '$';
const char underscore = '_';
var item = "Hello_To_The$Great_World";
var aTokens = new List<string>();
int dollarIndex = item.IndexOf(dollar);
if (dollarIndex >= 0)
{
int lastUnderscoreIndex = item.LastIndexOf(underscore, dollarIndex);
if (lastUnderscoreIndex >= 0)
{
aTokens.AddRange(item.Substring(0, lastUnderscoreIndex).Split(underscore));
aTokens.Add(item.Substring(lastUnderscoreIndex + 1));
}
else
{
aTokens.Add(item);
}
}
else
{
aTokens.AddRange(item.Split(underscore));
}
Edit:
I should have added, cleaner/more intuitive is very subjective, as you have found out by the variety of answers provided. From a maintainability point of view, it's much more important that the method you write to do the parsing is unit tested!
It's also an interesting exercise to test the performance of the various methods posted here - it quickly becomes apparent that your original version is much faster than using regular expressions! (Although in a real life situation, it's probably quite unlikely that the performance of this method will make any difference to your application!)
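As a sketch of what such a check could look like without committing to a particular test framework, the class below wraps the regex-based split from the earlier answer and compares its output with the expected tokens. TokenizerChecks, Tokenize and Matches are hypothetical names; any of the other implementations could be substituted for the body of Tokenize.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class TokenizerChecks
{
    // Splits on underscores that have no '$' anywhere before them.
    public static string[] Tokenize(string item) =>
        Regex.Split(item, @"(?<!\$.*)_");

    // Returns true when tokenizing the input yields exactly the expected tokens.
    public static bool Matches(string input, params string[] expected) =>
        Tokenize(input).SequenceEqual(expected);
}
```

Cases worth covering include the example from the question, an input with no dollar sign, and an input that starts with a dollar sign, e.g. TokenizerChecks.Matches("Hello_To_The$Great_World", "Hello", "To", "The$Great_World").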

Splitting a string in C#, why is this not working?

I have the following string:
string myString = " The objective for test.\vVision\v* Deliver a test
goals\v** Comprehensive\v** Control\v* Alignment with cross-Equities
strategy\vApproach\v*An acceleration "
and I am trying to split on "\v"
I tried this but it doesn't seem to work:
char[] delimiters = new char[] { '\v' };
string[] split = myString.Split(delimiters);
for (int i = 0; i < split.Length; i++) {
}
split.Length shows up as 1. Any suggestions?
"\v" is two characters, not one, in your original string (which is not counting the \ as an escape character as a literal C# string does).
You need to be splitting on literal "\v" which means you need to specify the overload of Split that takes a string:
string[] split = narrative.Split(new string[] {"\\v"}, StringSplitOptions.None);
Note how I had to escape the "\" character with "\\"
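A short sketch of the distinction, assuming a hypothetical helper CountParts that just reports how many pieces a split produces. In a regular literal "\v" is one vertical-tab character, while "\\v" and the verbatim literal @"\v" are both a backslash followed by the letter v.

```csharp
using System;

static class EscapeDemo
{
    // Hypothetical helper: number of pieces produced by splitting on a string separator.
    public static int CountParts(string text, string separator) =>
        text.Split(new[] { separator }, StringSplitOptions.None).Length;

    // Lengths of the three spellings discussed above.
    public static int ControlCharLength() => "\v".Length;     // vertical tab
    public static int EscapedLength() => "\\v".Length;        // backslash + 'v'
    public static int VerbatimLength() => @"\v".Length;       // backslash + 'v'
}
```

If the text contains a literal backslash followed by v, splitting on "\v" finds nothing, whereas splitting on "\\v" or @"\v" finds every occurrence.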
Your '\v' is a single control character, not two characters.
I think your question itself is slightly misleading...
Your example string, if entered into C# as a regular (non-verbatim) string literal, will actually work like you expected, because \v in such a literal is an escape sequence for a single special character (vertical tab):
string test = " The objective for test.\vVision\v* Deliver a test goals\v** Comprehensive\v** Control\v* Alignment with cross-Equities strategy\vApproach\v*An acceleration ";
char[] delimiters = new char[] { '\v' };
Console.WriteLine(test.Split(delimiters).Length); // Prints 8
However, I think your actual string really does have backslash-v in it rather than escaped \v:
string test = " The objective for test.\\vVision\\v* Deliver a test goals\\v** Comprehensive\\v** Control\\v* Alignment with cross-Equities strategy\\vApproach\\v*An acceleration ";
char[] delimiters = new char[] { '\v' };
Console.WriteLine(test.Split(delimiters).Length); // Prints 1, like you say you see.
So you can fix it as described above by using an array of strings to split the string:
string test = " The objective for test.\\vVision\\v* Deliver a test goals\\v** Comprehensive\\v** Control\\v* Alignment with cross-Equities strategy\\vApproach\\v*An acceleration ";
string[] delimiters = new [] { "\\v" };
Console.WriteLine(test.Split(delimiters, StringSplitOptions.None).Length); // Prints 8
Use something like this
string[] separators = {@"\v"};
string value = @"The objective for test.\vVision\v* Deliver a test goals\v** Comprehensive\v** Control\v* Alignment with cross-Equities strategy\vApproach\v*An acceleration";
string[] words = value.Split(separators, StringSplitOptions.RemoveEmptyEntries);
If you don't need the resulting array for later use, you can combine the split and loop into one call.
foreach (string s in myString.Split(new[] { "\v" }, StringSplitOptions.RemoveEmptyEntries))
{
//s is the string you can now use in your loop
}
Use it like this:
string myString = " The objective for test.\vVision\v* Deliver a test goals\v** Comprehensive\v** Control\v* Alignment with cross-Equities strategy\vApproach\v*An acceleration";
string[] delimiters = new string[] { "\v" };
string[] split = myString.Split(delimiters, StringSplitOptions.None);
for (int i = 1; i < split.Length; i++)
{
}
Try running it like this.
public static string FirstName(string fullName)
{
if (fullName == null)
return null;
var split = fullName.Split(',');
return split.Length > 0 ? split[0] : string.Empty;
}

Add to string until hits length (noobie C# guy)

I am trying to read a text file, break it into a string array, and then compile new strings out of the words, but I don't want it to exceed 120 characters in length.
What I am doing is making it write PML to create a macro for some software I use, and the text can't exceed 120 characters. To take it even further, I need to wrap each string of 120 characters or less (broken at the nearest word) with "BTEXT |the string here|", which is the command.
Here is the code:
static void Main(string[] args)
{
int BIGSTRINGLEN = 120;
string readit = File.ReadAllText("C:\\stringtest.txt");
string finish = readit.Replace("\r\n", " ").Replace("\t", "");
string[] seeit = finish.Split(' ');
StringBuilder builder = new StringBuilder(BIGSTRINGLEN);
foreach(string word in seeit)
{
while (builder.Length + " " + word.Length <= BIGSTRINGLEN)
{
builder.Append(word)
}
}
}
Try using an if instead of the while; otherwise you will keep appending the same word for as long as the condition holds.
Rather than read the entire file into memory, you can read it a line at a time. That will reduce your memory requirements and also prevent you having to replace the newlines.
StringBuilder builder = new StringBuilder(BIGSTRINGLEN);
foreach (var line in File.ReadLines(filename))
{
// clean up the line.
// Do you really want to replace tabs with nothing?
// if you want to treat tabs like spaces, change the call to Split
// and include '\t' in the character array.
string finish = line.Replace("\t", string.Empty);
string[] seeit = finish.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
foreach (string word in seeit)
{
if (builder.Length + word.Length + 1 <= BIGSTRINGLEN)
{
if (builder.Length != 0)
builder.Append(' ');
builder.Append(word);
}
else
{
// output line
Console.WriteLine(builder.ToString());
// and reset the builder
builder.Length = 0;
// start the next line with the word that did not fit
builder.Append(word);
}
}
}
// and write the last line
if (builder.Length > 0)
Console.WriteLine(builder.ToString());
That code is going to fail if a word is longer than BIGSTRINGLEN. Long words will end up outputting a blank line. I think you can figure out how to handle that case if it becomes a problem.
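One possible way to handle that case is sketched below, assuming it is acceptable to emit an over-long word on a line of its own rather than splitting it. LineFiller and Fill are hypothetical names, and the sketch assumes the length limit applies to the text between the bars, not to the whole BTEXT command.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class LineFiller
{
    // Packs words into lines of at most maxLength characters, separated by
    // single spaces. A word longer than maxLength is emitted alone, unsplit.
    public static IEnumerable<string> Fill(IEnumerable<string> words, int maxLength)
    {
        var builder = new StringBuilder(maxLength);
        foreach (var word in words)
        {
            // A separating space is only needed when the line already has text.
            int needed = builder.Length == 0
                ? word.Length
                : builder.Length + 1 + word.Length;

            // Flush the current line first if the word does not fit on it.
            if (needed > maxLength && builder.Length > 0)
            {
                yield return builder.ToString();
                builder.Clear();
            }

            if (builder.Length > 0)
                builder.Append(' ');
            builder.Append(word);
        }

        // Emit whatever is left over as the last line.
        if (builder.Length > 0)
            yield return builder.ToString();
    }
}
```

Each returned line can then be wrapped in the command from the question, for example "BTEXT |" + line + "|". Because a line is only flushed when it is non-empty, no blank lines are produced and no word is dropped.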
Matthew Moon is right - your while loop is not going to work as currently placed.
But that aside, you have some problems in this line
while (builder.Length + " " + word.Length <= BIGSTRINGLEN)
builder.Length and word.Length are integers - the number of characters in each word. " " is not an integer, it's a string. You can't correctly add 10 + " " + 5. You probably want
while (builder.Length + (" ").Length + word.Length <= BIGSTRINGLEN)
// or
while (builder.Length + 1 + word.Length <= BIGSTRINGLEN)
