Between two strings, but first string must be last occurrence

public static string Between(this string value, string a, string b)
{
int posA = value.IndexOf(a);
int posB = value.LastIndexOf(b);
if (posA == -1)
{
return "";
}
if (posB == -1)
{
return "";
}
int adjustedPosA = posA + a.Length;
if (adjustedPosA >= posB)
{
return "";
}
return value.Substring(adjustedPosA, posB - adjustedPosA);
}
//Button1 Click
MessageBox.Show(Between("Abcdefgh- 50- 25------------ 37,50-#", "- ", "-#"));
The result is: 50- 25------------ 37,50. But I want the match to start after the last "- ", so the result should be 37,50.
Can anyone help me?

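I'd use Regex (requires using System.Text.RegularExpressions;).
public static string Between(this string value, string a, string b)
{
// Match a, then capture characters that contain no further a, then b.
// Without the leading {0}, a multi-character a such as "- " leaves its tail (here a space) in the result.
return Regex.Match(value, string.Format("{0}((?:(?!{0}).)*){1}", Regex.Escape(a), Regex.Escape(b))).Groups[1].Value;
}
The regex matches the last occurrence of a before b and captures the characters between them.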
Adapted from: https://stackoverflow.com/a/18264730/134330
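If a regex feels heavy for this, the same idea can be sketched with LastIndexOf alone. This is only a sketch under stated assumptions: the name BetweenLast is hypothetical (not from the question or the answer), and it assumes b should be taken at its last occurrence, as in the question's original code.

```csharp
using System;

public static class StringBetweenExtensions
{
    // Hypothetical helper name; not part of the original answer.
    public static string BetweenLast(this string value, string a, string b)
    {
        // Locate the closing delimiter first (last occurrence, as in the question's code).
        int posB = value.LastIndexOf(b, StringComparison.Ordinal);
        if (posB < 0) return "";

        // Look for the opening delimiter only in the text before the closing one,
        // so the match found is the last a that precedes b.
        int posA = value.Substring(0, posB).LastIndexOf(a, StringComparison.Ordinal);
        if (posA < 0) return "";

        int start = posA + a.Length;
        return value.Substring(start, posB - start);
    }
}
```

With the input from the question, "Abcdefgh- 50- 25------------ 37,50-#".BetweenLast("- ", "-#") should give 37,50.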

Related

function complex_decode( string str) that takes a non-simple repeated encoded string, and returns the original un-encoded string

I'm trying to write a function complex_decode(string str) in C# that takes a non-simple repeated encoded string and returns the original un-encoded string.
For example, "t11h12e14" would return "ttttttttttthhhhhhhhhhhheeeeeeeeeeeeee". I have been successful in decoding strings where each repeat count is less than 10, but I can't handle counts of 10 or more. I am not allowed to use regex, libraries or loops, only recursion.
This is my code for the simple decode, which works when each count is less than 10.
public string decode(string str)
{
if (str.Length < 1)
return "";
if(str.Length==2)
return repeat_char(str[0], char_to_int(str[1]));
else
return repeat_char(str[0], char_to_int(str[1]))+decode(str.Substring(2));
}
public int char_to_int(char c)
{
return (int)(c-48);
}
public string repeat_char(char c, int n)
{
if (n < 1)
return "";
if (n == 1)
return ""+c;
else
return c + repeat_char(c, n - 1);
}
This works as intended, for example input "a5" returns "aaaaa", "t1h1e1" returns "the"
Any help is appreciated.

Here is another way of doing this, assuming the repeating string is always one character long and using only recursion (and a StringBuilder object):
private static string decode(string value)
{
var position = 0;
var result = decode_char(value, ref position);
return result;
}
private static string decode_char(string value, ref int position)
{
var next = value[position++];
var countBuilder = new StringBuilder();
get_number(value, ref position, countBuilder);
var result = new string(next, Convert.ToInt32(countBuilder.ToString()));
if (position < value.Length)
result += decode_char(value, ref position);
return result;
}
private static void get_number(string value, ref int position, StringBuilder countBuilder)
{
if (position < value.Length && char.IsNumber(value[position]))
{
countBuilder.Append(value[position++]);
get_number(value, ref position, countBuilder);
}
}

I've refactored your code a bit and removed two methods that you don't actually need. The logic is simple and works like this:
Example input: t3h2e4
1. Find the first digit (which is 3 and has an index of 1).
2. Find the first letter that comes after that index, which is our next letter (which is "h" and has an index of 2).
3. Slice the string from index 1 up to index 2 to get the repeat count (which is 3).
4. Repeat the first letter of the string repeat-count times and combine it with the result you get from step 5.
5. Slice the string from the next letter's index (found in step 2) to the very end and pass this to the recursive method.
public static string Decode(string input)
{
// If the string is empty or has only 1 character, return the string itself to not proceed.
if (input.Length <= 1)
{
return input;
}
// Convert string into character list.
var characters = new List<char>();
characters.AddRange(input);
var firstDigitIndex = characters.FindIndex(c => char.IsDigit(c)); // Get first digit
var nextLetterIndex = characters.FindIndex(firstDigitIndex, c => char.IsLetter(c)); // Get the next letter after that digit
if (nextLetterIndex == -1)
{
// This has only one reason. Let's say you are in the last recursion and you have c2
// There is no letter after the digit, so the index will -1, which means "not found"
// So, it will raise an exception, since we try to use the -1 in slicing part
// Instead, if it's not found, we set the next letter index to length of the string
// With doing that, you either get repeatCount correctly (since remaining part is only digits)
// or you will get empty string in the next recursion, which will stop the recursion.
nextLetterIndex = input.Length;
}
// Let's say the first digit's index is 1 and the next letter's index is 2
// input[1..2] starts slicing at index 1 and stops before index 2
// So, it will basically return us the repeat count.
var repeatCount = int.Parse(input[firstDigitIndex..nextLetterIndex]);
// string(character, repeatCount) constructor will repeat the "character" you passed to it for "repeatCount" times
return new string(input[0], repeatCount) + Decode(input[nextLetterIndex..]);
}
Examples:
Console.WriteLine(Decode("t1h1e1")); // the
Console.WriteLine(Decode("t2h3e4")); // tthhheeee
Console.WriteLine(Decode("t3h3e3")); // ttthhheee
Console.WriteLine(Decode("t2h10e2")); // tthhhhhhhhhhee
Console.WriteLine(Decode("t2h10e10")); // tthhhhhhhhhheeeeeeeeee

First you can simplify your repeat_char function, you have to have a clear stop condition:
public static string repeat_char(char c, int resultLength)
{
if(resultLength < 1) throw new ArgumentOutOfRangeException("resultLength");
if(resultLength == 1) return c.ToString();
return c + repeat_char(c, resultLength - 1);
}
Think of the parameter as the equivalent of a loop counter.
You can do something similar in the main function: add a parameter that tracks how many characters after the letter still parse as an int.
public static string decode(string str, int repeatNumberLength = 1)
{
if(repeatNumberLength < 1) throw new ArgumentOutOfRangeException("length");
//stop condition
if(string.IsNullOrWhiteSpace(str)) return str;
if(repeatNumberLength >= str.Length) repeatNumberLength = str.Length; //Some validation, just to be safe
//keep going until str[1...repeatNumberLength] is not an int
int charLength;
if(repeatNumberLength < str.Length && int.TryParse(str.Substring(1, repeatNumberLength), out charLength))
{
return decode(str, repeatNumberLength + 1);
}
repeatNumberLength--;
//Get the repeated Char.
charLength = int.Parse(str.Substring(1, repeatNumberLength));
var repeatedChar = repeat_char(str[0], charLength);
//decode the substring
var decodedSubstring = decode(str.Substring(repeatNumberLength + 1));
return repeatedChar + decodedSubstring;
}
I used a default parameter, but you can easily change it to a more traditional style.
This also assumes that the original str is in a correct format.
An excellent exercise is to change the function so that you can have a word, instead of a char before the number. Then you could, for example, have "the3" as the parameter (resulting in "thethethe").

I took more of a Lisp-style head and tail approach (car and cdr if you speak Lisp) and created a State class to carry around the current state of the parsing.
First the State class:
internal class State
{
public State()
{
LastLetter = string.Empty;
CurrentCount = 0;
HasStarted = false;
CurrentValue = string.Empty;
}
public string LastLetter { get; private set; }
public int CurrentCount { get; private set; }
public bool HasStarted { get; private set; }
public string CurrentValue { get; private set; }
public override string ToString()
{
return $"LastLetter: {LastLetter}, CurrentCount: {CurrentCount}, HasStarted: {HasStarted}, CurrentValue: {CurrentValue}";
}
public void AddLetter(string letter)
{
CurrentCount = 0;
LastLetter = letter;
HasStarted = true;
}
public int AddDigit(string digit)
{
if (!HasStarted)
{
throw new InvalidOperationException($"The input must start with a letter, not a digit");
}
if (!int.TryParse(digit, out var num))
{
throw new InvalidOperationException($"Digit passed to {nameof(AddDigit)} ({digit}) is not a number");
}
CurrentCount = CurrentCount * 10 + num;
return CurrentCount;
}
public string GetValue()
{
if (string.IsNullOrEmpty(LastLetter))
{
return string.Empty;
}
CurrentValue = new string(LastLetter[0], CurrentCount);
return CurrentValue;
}
}
You'll notice it's got some stuff in there for debugging (example, the ToString override and the CurrentValue property)
Once you have that, the decoder is easy, it just recurses over the string it's given (along with (initially) a freshly constructed State instance):
private string Decode(string input, State state)
{
if (input.Length == 0)
{
_buffer.Append(state.GetValue());
return _buffer.ToString();
}
var head = input[0];
var tail = input.Substring(1);
var headString = head.ToString();
if (char.IsDigit(head))
{
state.AddDigit(headString);
}
else // it's a character
{
_buffer.Append(state.GetValue());
state.AddLetter(headString);
}
Decode(tail, state);
return _buffer.ToString();
}
I did this in a simple Windows Forms app, with a text box for input, a label for output and a button to crank her up:
const string NotAllowedPattern = @"[^a-zA-Z0-9]";
private static Regex NotAllowedRegex = new Regex(NotAllowedPattern);
private StringBuilder _buffer = new StringBuilder();
private void button1_Click(object sender, EventArgs e)
{
if (textBox1.Text.Length == 0 || NotAllowedRegex.IsMatch(textBox1.Text))
{
MessageBox.Show(this, "Only Letters and Digits Allowed", "Bad Input", MessageBoxButtons.OK, MessageBoxIcon.Error);
return;
}
label1.Text = string.Empty;
_buffer.Clear();
var result = Decode(textBox1.Text, new State());
label1.Text = result;
}
Yeah, there's a Regex there, but it's just to make sure that the input is valid; it's not involved in calculating the output.

Finding multiple brackets through loop

I have a piece of code that I use to separate out the data when a '(' bracket is found, but I want it to work when multiple brackets occur.
private void button1_Click(object sender, EventArgs e)
{
string str = "A+(B*C)*D";
string a=getBetween(str, "(", ")");
str = str.Replace(a, "()");
MessageBox.Show("string is=" + a);
MessageBox.Show("string is=" + str);
}
public string getBetween(string strSource, string strStart, string strEnd)
{
int Start, End;
if (strSource.Contains(strStart) && strSource.Contains(strEnd))
{
Start = strSource.IndexOf(strStart, 0) + strStart.Length;
End = strSource.IndexOf(strEnd, Start);
return "(" +strSource.Substring(Start, End - Start)+")";
}
else
{
return "";
}
}
I want to do it for multiple brackets, like string = "A+(B+(C*D))".

This will extract all the strings inside brackets.
public IEnumerable<string> Extract(string sourceStr, string startStr, string endStr)
{
var startIndices = IndexOfAll(sourceStr, startStr).ToArray();
var endIndices = IndexOfAll(sourceStr, endStr).ToArray();
if(startIndices.Length != endIndices.Length)
throw new InvalidOperationException("Mismatch");
for (int i = 0; i < startIndices.Length; i++)
{
var start = startIndices[i];
var end = endIndices[endIndices.Length - 1 - i];
yield return sourceStr.Substring(start, end - start + 1);
}
}
public static IEnumerable<int> IndexOfAll(string source, string subString)
{
return Regex.Matches(source, Regex.Escape(subString)).Cast<Match>().Select(m => m.Index);
}
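So A+(B+(C*D)) would return two strings: (B+(C*D)) and (C*D). Note that this pairs the i-th opening bracket with the i-th closing bracket counted from the end, so it assumes the brackets are strictly nested; sibling groups such as (A)+(B) are not paired correctly.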
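For input that may contain sibling groups as well as nested ones, a stack of open-bracket positions is a common alternative. The sketch below is not the answer's method; the class and method names are hypothetical, and it assumes single-character delimiters.

```csharp
using System;
using System.Collections.Generic;

public static class BracketExtractor
{
    // Hypothetical helper: returns every balanced group, in the order the groups close
    // (so inner groups come before the groups that contain them).
    public static List<string> ExtractGroups(string source, char open = '(', char close = ')')
    {
        var groups = new List<string>();
        var openPositions = new Stack<int>();

        for (int i = 0; i < source.Length; i++)
        {
            if (source[i] == open)
            {
                // Remember where this group starts.
                openPositions.Push(i);
            }
            else if (source[i] == close)
            {
                if (openPositions.Count == 0)
                    throw new InvalidOperationException("Unmatched closing bracket at index " + i);

                // The most recent unmatched opening bracket belongs to this closing bracket.
                int start = openPositions.Pop();
                groups.Add(source.Substring(start, i - start + 1));
            }
        }

        if (openPositions.Count != 0)
            throw new InvalidOperationException("Unmatched opening bracket");

        return groups;
    }
}
```

For A+(B+(C*D)) this yields (C*D) followed by (B+(C*D)); for (A)+(B) it yields (A) and (B).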

Matching strings with wildcard

I would like to match strings with a wildcard (*), where the wildcard means "any". For example:
*X = string must end with X
X* = string must start with X
*X* = string must contain X
Also, some compound uses such as:
*X*YZ* = string contains X and contains YZ
X*YZ*P = string starts with X, contains YZ and ends with P.
Is there a simple algorithm to do this? I'm unsure about using regex (though it is a possibility).
To clarify, the users will type in the above to a filter box (as simple a filter as possible), I don't want them to have to write regular expressions themselves. So something I can easily transform from the above notation would be good.

Wildcards usually come in two kinds:
? - any character (one and only one)
* - any characters (zero or more)
so you can easily convert these rules into an appropriate regular expression:
// If you want to implement both "*" and "?"
private static String WildCardToRegular(String value) {
return "^" + Regex.Escape(value).Replace("\\?", ".").Replace("\\*", ".*") + "$";
}
// If you want to implement "*" only
private static String WildCardToRegular(String value) {
return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
}
And then you can use Regex as usual:
String test = "Some Data X";
Boolean endsWithEx = Regex.IsMatch(test, WildCardToRegular("*X"));
Boolean startsWithS = Regex.IsMatch(test, WildCardToRegular("S*"));
Boolean containsD = Regex.IsMatch(test, WildCardToRegular("*D*"));
// Starts with S, ends with X, contains "me" and "a" (in that order)
Boolean complex = Regex.IsMatch(test, WildCardToRegular("S*me*a*X"));

You could use the VB.NET Like-Operator:
string text = "x is not the same as X and yz not the same as YZ";
bool contains = LikeOperator.LikeString(text,"*X*YZ*", Microsoft.VisualBasic.CompareMethod.Binary);
Use CompareMethod.Text if you want to ignore the case.
You need to add using Microsoft.VisualBasic.CompilerServices; and add a reference to the Microsoft.VisualBasic.dll.
Since it's part of the .NET framework and will always be, it's not a problem to use this class.

For those using .NET Core 2.1+ or .NET 5+, you can use the FileSystemName.MatchesSimpleExpression method in the System.IO.Enumeration namespace.
string text = "X is a string with ZY in the middle and at the end is P";
bool isMatch = FileSystemName.MatchesSimpleExpression("X*ZY*P", text);
Both parameters are actually ReadOnlySpan<char>, but you can pass string arguments too. There is also an optional ignoreCase parameter if you want to turn case matching on or off; matching is case insensitive by default.
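As a small illustration of the case-sensitivity switch, the sketch below wraps the call in a hypothetical helper (SimpleExpressionDemo.Matches is an assumed name, not part of the framework) and passes the third argument explicitly. It assumes a runtime where System.IO.Enumeration is available.

```csharp
using System;
using System.IO.Enumeration;

public static class SimpleExpressionDemo
{
    // Hypothetical wrapper around FileSystemName.MatchesSimpleExpression.
    // The strings convert implicitly to ReadOnlySpan<char>.
    public static bool Matches(string pattern, string text, bool ignoreCase)
    {
        return FileSystemName.MatchesSimpleExpression(pattern, text, ignoreCase);
    }
}
```

With the text from the answer above, the lowercase pattern x*zy*p should match only when ignoreCase is true.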

WildcardPattern from System.Management.Automation may also be an option.
var pattern = new WildcardPattern(patternString);
bool isMatch = pattern.IsMatch(stringToMatch);
The Visual Studio UI may not let you add the System.Management.Automation assembly to your project's References; in that case, add the reference manually.

A wildcard * can be translated as the .* or .*? regex pattern.
You might need single-line mode to match newline characters; in that case, you can use (?s) as part of the regex pattern.
You can set it for the whole pattern or for part of it:
X* => @"X(?s:.*)"
*X => @"(?s:.*)X"
*X* => @"(?s).*X.*"
*X*YZ* => @"(?s).*X.*YZ.*"
X*YZ*P => @"(?s:X.*YZ.*P)"
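The same effect can be had by passing RegexOptions.Singleline instead of writing (?s) inline. The following is a minimal sketch, assuming that only * is treated as a wildcard and that the whole input must match; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class WildcardSingleline
{
    // Hypothetical helper: '*' matches any run of characters, including newlines.
    public static bool IsWildcardMatch(string input, string wildcard)
    {
        // Escape everything, then turn the escaped '*' back into ".*".
        string pattern = "^" + Regex.Escape(wildcard).Replace("\\*", ".*") + "$";

        // Singleline makes '.' match '\n' as well, like the inline (?s) flag.
        return Regex.IsMatch(input, pattern, RegexOptions.Singleline);
    }
}
```

A multi-line value such as "X\nYZ\nP" matches X*YZ*P here, whereas the same pattern without Singleline would not, because . stops at newlines by default.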

*X*YZ* = string contains X and contains YZ
@".*X.*YZ"
X*YZ*P = string starts with X, contains YZ and ends with P.
@"^X.*YZ.*P$"

Keep in mind that Regex.IsMatch returns true for XYZ when it is checked against Y*, because an unanchored pattern can match anywhere in the string. To avoid this, I use the "^" anchor:
isMatch(str1, "^" + str2.Replace("*", ".*?"));
So the full code to solve your problem is:
bool isMatchStr(string str1, string str2)
{
string s1 = str1.Replace("*", ".*?");
string s2 = str2.Replace("*", ".*?");
bool r1 = Regex.IsMatch(s1, "^" + s2);
bool r2 = Regex.IsMatch(s2, "^" + s1);
return r1 || r2;
}

This is an improvement on the popular answer from @Dmitry Bychenko above (https://stackoverflow.com/a/30300521/4491768). To match ? and * as literal characters, they have to be escaped in the wildcard pattern: use \\? or \\* (as written in a regular C# string literal).
Also, a precompiled regex improves performance when the pattern is reused.
public class WildcardPattern
{
private readonly string _expression;
private readonly Regex _regex;
public WildcardPattern(string pattern)
{
if (string.IsNullOrEmpty(pattern)) throw new ArgumentNullException(nameof(pattern));
_expression = "^" + Regex.Escape(pattern)
.Replace("\\\\\\?","??").Replace("\\?", ".").Replace("??","\\?")
.Replace("\\\\\\*","**").Replace("\\*", ".*").Replace("**","\\*") + "$";
_regex = new Regex(_expression, RegexOptions.Compiled);
}
public bool IsMatch(string value)
{
return _regex.IsMatch(value);
}
}
Usage:
new WildcardPattern("Hello *\\**\\?").IsMatch("Hello W*rld?");
new WildcardPattern(@"Hello *\**\?").IsMatch("Hello W*rld?");

For those working with C# and Excel where the worksheet (WS) name is only partially known, here is my code using a wildcard (ddd*).
Briefly: the code goes through all worksheet names; if today's abbreviated weekday (ddd) matches the first three letters of a worksheet name, that name is stored in a string that is used after the loop.
using System;
using Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;
using Range = Microsoft.Office.Interop.Excel.Range;
using System.Diagnostics;
using System.Reflection;
using System.IO;
using System.Text.RegularExpressions;
...
string weekDay = DateTime.Now.ToString("ddd*");
Workbook sourceWorkbook4 = xlApp.Workbooks.Open(LrsIdWorkbook, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
Workbook destinationWorkbook = xlApp.Workbooks.Open(masterWB, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
static String WildCardToRegular(String value)
{
return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
}
string wsName = null;
foreach (Worksheet works in sourceWorkbook4.Worksheets)
{
Boolean startsWithddd = Regex.IsMatch(works.Name, WildCardToRegular(weekDay + "*"));
if (startsWithddd == true)
{
wsName = works.Name.ToString();
}
}
Worksheet sourceWorksheet4 = (Worksheet)sourceWorkbook4.Worksheets.get_Item(wsName);
...

public class Wildcard
{
private readonly string _pattern;
public Wildcard(string pattern)
{
_pattern = pattern;
}
public static bool Match(string value, string pattern)
{
int start = -1;
int end = -1;
return Match(value, pattern, ref start, ref end);
}
public static bool Match(string value, string pattern, char[] toLowerTable)
{
int start = -1;
int end = -1;
return Match(value, pattern, ref start, ref end, toLowerTable);
}
public static bool Match(string value, string pattern, ref int start, ref int end)
{
return new Wildcard(pattern).IsMatch(value, ref start, ref end);
}
public static bool Match(string value, string pattern, ref int start, ref int end, char[] toLowerTable)
{
return new Wildcard(pattern).IsMatch(value, ref start, ref end, toLowerTable);
}
public bool IsMatch(string str)
{
int start = -1;
int end = -1;
return IsMatch(str, ref start, ref end);
}
public bool IsMatch(string str, char[] toLowerTable)
{
int start = -1;
int end = -1;
return IsMatch(str, ref start, ref end, toLowerTable);
}
public bool IsMatch(string str, ref int start, ref int end)
{
if (_pattern.Length == 0) return false;
int pindex = 0;
int sindex = 0;
int pattern_len = _pattern.Length;
int str_len = str.Length;
start = -1;
while (true)
{
bool star = false;
if (_pattern[pindex] == '*')
{
star = true;
do
{
pindex++;
}
while (pindex < pattern_len && _pattern[pindex] == '*');
}
end = sindex;
int i;
while (true)
{
int si = 0;
bool breakLoops = false;
for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
{
si = sindex + i;
if (si == str_len)
{
return false;
}
if (str[si] == _pattern[pindex + i])
{
continue;
}
if (si == str_len)
{
return false;
}
if (_pattern[pindex + i] == '?' && str[si] != '.')
{
continue;
}
breakLoops = true;
break;
}
if (breakLoops)
{
if (!star)
{
return false;
}
sindex++;
if (si == str_len)
{
return false;
}
}
else
{
if (start == -1)
{
start = sindex;
}
if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
{
break;
}
if (sindex + i == str_len)
{
if (end <= start)
{
end = str_len;
}
return true;
}
if (i != 0 && _pattern[pindex + i - 1] == '*')
{
return true;
}
if (!star)
{
return false;
}
sindex++;
}
}
sindex += i;
pindex += i;
if (start == -1)
{
start = sindex;
}
}
}
public bool IsMatch(string str, ref int start, ref int end, char[] toLowerTable)
{
if (_pattern.Length == 0) return false;
int pindex = 0;
int sindex = 0;
int pattern_len = _pattern.Length;
int str_len = str.Length;
start = -1;
while (true)
{
bool star = false;
if (_pattern[pindex] == '*')
{
star = true;
do
{
pindex++;
}
while (pindex < pattern_len && _pattern[pindex] == '*');
}
end = sindex;
int i;
while (true)
{
int si = 0;
bool breakLoops = false;
for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
{
si = sindex + i;
if (si == str_len)
{
return false;
}
char c = toLowerTable[str[si]];
if (c == _pattern[pindex + i])
{
continue;
}
if (si == str_len)
{
return false;
}
if (_pattern[pindex + i] == '?' && c != '.')
{
continue;
}
breakLoops = true;
break;
}
if (breakLoops)
{
if (!star)
{
return false;
}
sindex++;
if (si == str_len)
{
return false;
}
}
else
{
if (start == -1)
{
start = sindex;
}
if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
{
break;
}
if (sindex + i == str_len)
{
if (end <= start)
{
end = str_len;
}
return true;
}
if (i != 0 && _pattern[pindex + i - 1] == '*')
{
return true;
}
if (!star)
{
return false;
}
sindex++;
continue;
}
}
sindex += i;
pindex += i;
if (start == -1)
{
start = sindex;
}
}
}
}

C# Console application sample
Command line Sample:
C:/> App_Exe -Opy PythonFile.py 1 2 3
Console output:
Argument list: -Opy PythonFile.py 1 2 3
Found python filename: PythonFile.py
using System;
using System.Text.RegularExpressions; //Regex
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
string cmdLine = String.Join(" ", args);
bool bFileExtFlag = false;
int argIndex = 0;
Regex regex;
foreach (string s in args)
{
//Search for the 1st occurrence of the "*.py" pattern
regex = new Regex(@"(?s:.*)\056py", RegexOptions.IgnoreCase);
bFileExtFlag = regex.IsMatch(s);
if (bFileExtFlag == true)
break;
argIndex++;
};
Console.WriteLine("Argument list: " + cmdLine);
if (bFileExtFlag == true)
Console.WriteLine("Found python filename: " + args[argIndex]);
else
Console.WriteLine("Python file with extension <.py> not found!");
}
}
}

String Utilities in C#

I'm learning about string utilities in C#, and I have a method that replaces parts of a string.
Using the replace method I need to get an output such as
"Old file name: file00"
"New file name: file01"
Depending on what the user wants to change it to.
I am looking for help on making the method (NextImageName) replace only the digits, but not the file name.
class BuildingBlock
{
public static string ReplaceOnce(string word, string characters, int position)
{
word = word.Remove(position, characters.Length);
word = word.Insert(position, characters);
return word;
}
public static string GetLastName(string name)
{
string result = "";
int posn = name.LastIndexOf(' ');
if (posn >= 0) result = name.Substring(posn + 1);
return result;
}
public static string NextImageName(string filename, int newNumber)
{
if (newNumber > 9)
{
return ReplaceOnce(filename, newNumber, (filename.Length - 2))
}
if (newNumber < 10)
{
}
if (newNumber == 0)
{
}
}
The other "if" statements are empty for now until I find out how to do the first one.

The correct way to do this would be to use Regular Expressions.
Ideally you would separate "file" from "00" in "file00". Then take "00", convert it to an Int32 (using Int32.Parse()) and then rebuild your string with String.Format().
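A minimal sketch of that idea follows. It assumes the number is the run of digits at the very end of the name and that the new number should keep the old width (so file00 becomes file01). The class name ImageNames is hypothetical, and because the new number is passed in as an int, this version pads with ToString instead of parsing the old digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImageNames
{
    public static string NextImageName(string filename, int newNumber)
    {
        // Group 1: everything before the trailing digits. Group 2: the trailing digits.
        Match m = Regex.Match(filename, @"^(.*?)(\d+)$");

        // Assumption: names without trailing digits are returned unchanged.
        if (!m.Success) return filename;

        string prefix = m.Groups[1].Value;
        int width = m.Groups[2].Value.Length;

        // "D" + width (for example "D2") pads the number with leading zeros to the old width.
        return prefix + newNumber.ToString("D" + width);
    }
}
```

Numbers wider than the original digits are written out in full, so NextImageName("file00", 123) gives file123.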

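public static string NextImageName(string filename, int newNumber)
{
// Collect the digits at the end of the name (requires using System.Linq; for Reverse()).
string oldnumber = "";
foreach (var item in filename.ToCharArray().Reverse())
if (char.IsDigit(item))
oldnumber = item + oldnumber ;
else
break;
// Cut off only the trailing digits: string.Replace would also change any earlier
// occurrence of the same digits and throws when oldnumber is empty.
return filename.Substring(0, filename.Length - oldnumber.Length) + newNumber.ToString();
}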

public static string NextImageName(string filename, int newNumber)
{
int i = 0;
foreach (char c in filename) // get index of first number
{
if (char.IsNumber(c))
break;
else
i++;
}
string s = filename.Substring(0,i); // remove original number
s = s + newNumber.ToString(); // add new number
return s;
}

Truncate string on whole words in .NET C#

I am trying to truncate some long text in C#, but I don't want my string to be cut off part way through a word. Does anyone have a function that I can use to truncate my string at the end of a word?
E.g:
"This was a long string..."
Not:
"This was a long st..."

Try the following. It is pretty rudimentary. Just finds the first space starting at the desired length.
public static string TruncateAtWord(this string value, int length) {
if (value == null || value.Length < length || value.IndexOf(" ", length) == -1)
return value;
return value.Substring(0, value.IndexOf(" ", length));
}

Thanks for your answer Dave. I've tweaked the function a bit and this is what I'm using ... unless there are any more comments ;)
public static string TruncateAtWord(this string input, int length)
{
if (input == null || input.Length < length)
return input;
int iNextSpace = input.LastIndexOf(" ", length, StringComparison.Ordinal);
return string.Format("{0}…", input.Substring(0, (iNextSpace > 0) ? iNextSpace : length).Trim());
}

My contribution:
public static string TruncateAtWord(string text, int maxCharacters, string trailingStringIfTextCut = "…")
{
if (text == null || (text = text.Trim()).Length <= maxCharacters)
return text;
int trailLength = trailingStringIfTextCut.StartsWith("&") ? 1
: trailingStringIfTextCut.Length;
maxCharacters = maxCharacters - trailLength >= 0 ? maxCharacters - trailLength
: 0;
int pos = text.LastIndexOf(" ", maxCharacters);
if (pos >= 0)
return text.Substring(0, pos) + trailingStringIfTextCut;
return string.Empty;
}
This is what I use in my projects, with optional trailing. Text will never exceed the maxCharacters + trailing text length.

If you are using windows forms, in the Graphics.DrawString method, there is an option in StringFormat to specify if the string should be truncated, if it does not fit into the area specified. This will handle adding the ellipsis as necessary.
http://msdn.microsoft.com/en-us/library/system.drawing.stringtrimming.aspx

I took your approach a little further:
public string TruncateAtWord(string value, int length)
{
if (value == null || value.Trim().Length <= length)
return value;
int index = value.Trim().LastIndexOf(" ");
while ((index + 3) > length)
index = value.Substring(0, index).Trim().LastIndexOf(" ");
if (index > 0)
return value.Substring(0, index) + "...";
return value.Substring(0, length - 3) + "...";
}
I'm using this to truncate tweets.

This solution works too (takes first 10 words from myString):
String.Join(" ", myString.Split(' ').Take(10))
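The one-liner above never signals that text was cut. A small sketch that appends a suffix only when words were actually dropped could look like this; the names WordLimiter and TakeWords and the default "..." suffix are assumptions for illustration, and only spaces are treated as separators.

```csharp
using System;
using System.Linq;

public static class WordLimiter
{
    // Hypothetical helper: keeps at most maxWords words and marks the cut with a suffix.
    public static string TakeWords(string text, int maxWords, string suffix = "...")
    {
        string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Nothing to cut: return the text untouched.
        if (words.Length <= maxWords) return text;

        return string.Join(" ", words.Take(maxWords)) + suffix;
    }
}
```

For example, TakeWords("This was a long string to cut", 5) should give "This was a long string...".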

Taking into account more than just a blank space separator (e.g. words can be separated by periods followed by newlines, followed by tabs, etc.), and several other edge cases, here is an appropriate extension method:
public static string GetMaxWords(this string input, int maxWords, string truncateWith = "...", string additionalSeparators = ",-_:")
{
int words = 1;
bool IsSeparator(char c) => Char.IsSeparator(c) || additionalSeparators.Contains(c);
IEnumerable<char> IterateChars()
{
yield return input[0];
for (int i = 1; i < input.Length; i++)
{
if (IsSeparator(input[i]) && !IsSeparator(input[i - 1]))
if (words == maxWords)
{
foreach (char c in truncateWith)
yield return c;
break;
}
else
words++;
yield return input[i];
}
}
// string has no instance IsNullOrEmpty method; use the static one.
return !string.IsNullOrEmpty(input)
? new String(IterateChars().ToArray())
: String.Empty;
}

Simplified, added a set of truncation characters, and made it an extension method.
public static string TruncateAtWord(this string value, int maxLength)
{
if (value == null || value.Trim().Length <= maxLength)
return value;
string ellipse = "...";
char[] truncateChars = new char[] { ' ', ',' };
int index = value.Trim().LastIndexOfAny(truncateChars);
while ((index + ellipse.Length) > maxLength)
index = value.Substring(0, index).Trim().LastIndexOfAny(truncateChars);
if (index > 0)
return value.Substring(0, index) + ellipse;
return value.Substring(0, maxLength - ellipse.Length) + ellipse;
}

Here's what I came up with. It also returns the rest of the sentence, split into chunks.
public static List<string> SplitTheSentenceAtWord(this string originalString, int length)
{
try
{
List<string> truncatedStrings = new List<string>();
if (originalString == null || originalString.Trim().Length <= length)
{
truncatedStrings.Add(originalString);
return truncatedStrings;
}
int index = originalString.Trim().LastIndexOf(" ");
while ((index + 3) > length)
index = originalString.Substring(0, index).Trim().LastIndexOf(" ");
if (index > 0)
{
string retValue = originalString.Substring(0, index) + "...";
truncatedStrings.Add(retValue);
string shortWord2 = originalString;
if (retValue.EndsWith("..."))
{
shortWord2 = retValue.Replace("...", "");
}
shortWord2 = originalString.Substring(shortWord2.Length);
if (shortWord2.Length > length) //truncate it further
{
List<string> retValues = SplitTheSentenceAtWord(shortWord2.TrimStart(), length);
truncatedStrings.AddRange(retValues);
}
else
{
truncatedStrings.Add(shortWord2.TrimStart());
}
return truncatedStrings;
}
var retVal_Last = originalString.Substring(0, length - 3);
truncatedStrings.Add(retVal_Last + "...");
if (originalString.Length > length)//truncate it further
{
string shortWord3 = originalString;
if (originalString.EndsWith("..."))
{
shortWord3 = originalString.Replace("...", "");
}
shortWord3 = originalString.Substring(retVal_Last.Length);
List<string> retValues = SplitTheSentenceAtWord(shortWord3.TrimStart(), length);
truncatedStrings.AddRange(retValues);
}
else
{
truncatedStrings.Add(retVal_Last + "...");
}
return truncatedStrings;
}
catch
{
return new List<string> { originalString };
}
}

I use this
public string Truncate(string content, int length)
{
try
{
return content.Substring(0,content.IndexOf(" ",length)) + "...";
}
catch
{
return content;
}
}
