Trying to figure out how to use recursion c# - c#

I'm learning about recursion and I want to create a function to type something and give me the same string but inverse but can't find the logic for it.
public static string Reverse(string text, int textlength)
{
if (textlength== 0)
{
return text.ElementAt(textlength).ToString();
}
else
{
return Reverse(text, textlength - 1);
}
}
How could I do that using recursion?

Recursion is always about two cases, the base case and the recursive case.
The base case is when the string is 0 or 1 characters - the reverse is just what was passed in, so we return text.
The recursive case is when text is longer than 1 character, we must define the operation in terms of itself. In this case, the reverse of the string is the reverse of the rest of the string followed by the first character.
public static string Reverse(string text) {
if (text.Length <= 1) // base case
return text;
else // recursive case
return Reverse(text.Substring(1))+text.Substring(0, 1);
}

Related

How to check for variable character in string and match it with another string of same length?

I have a rather complex issue that I'am unable to figure out.
I'm getting a set of string every 10 seconds from another process in which the first set has first 5 characters constant, next 3 are variable and can change. And then another set of string in which first 3 are variable and next 3 are constant.
I want to compare these values to a fixed string to check if the first 5 char matches in 1st set of string (ABCDE*** == ABCDEFGH) and ignore the last 3 variable characters while making sure the length is the same. Eg : if (ABCDE*** == ABCDEDEF) then condition is true, but if (ABCDE*** == ABCDDEFG) then the condition is false because the first 5 char is not same, also if (ABCDE*** == ABCDEFV) the condition should be false as one char is missing.
I'm using the * in fixed string to try to make the length same while comparing.
Does this solve your requirements?
private static bool MatchesPattern(string input)
{
const string fixedString = "ABCDExyz";
return fixedString.Length == input.Length && fixedString.Substring(0, 5).Equals(input.Substring(0, 5));
}
In last versions of C# you can also use ranges:
private static bool MatchesPattern(string input)
{
const string fixedString = "ABCDExyz";
return fixedString.Length == input.Length && fixedString[..5].Equals(input[..5]);
}
See this fiddle.
BTW: You could probably achieve the same using regex.
It's always a good idea to make an abstraction. Here I've made a simple function that takes the pattern and the value and makes a check:
bool PatternMatches(string pattern, string value)
{
// The null string doesn't match any pattern
if (value == null)
{
return false;
}
// If the value has a different length than the pattern, it doesn't match.
if (pattern.Length != value.Length)
{
return false;
}
// If both strings are zero-length, it's considered a match
bool result = true;
// Check every character against the pattern
for (int i = 0; i< pattern.Length; i++)
{
// Logical and the result, * matches everything
result&= (pattern[i]== '*') ? true: value[i] == pattern[i];
}
return result;
}
You can then call it like this:
bool b1 = PatternMatches("ABCDE***", "ABCDEFGH");
bool b2 = PatternMatches("ABC***", "ABCDEF");
You could use regular expressions, but this is fairly readable, RegExes aren't always.
Here is a link to a dotnetfiddle: https://dotnetfiddle.net/4x1U1E
If the string you match against is known at compile time, your best bet is probably using regular expressions. In the first case, match against ^ABCDE...$. In the second case, match against ^...DEF$.
Another way, probably better if the match string is unknown, uses Length, StartsWith and EndsWith:
String prefix = "ABCDE";
if (str.Length == 8 && str.StartsWith(prefix)) {
// do something
}
Then similarly for the second case, but using EndsWith instead of StartsWith.
check this
public bool Comparing(string str1, string str2)
=> str2.StartWith(str1.replace("*","")) && str1.length == str2.Length;

How to parse string and prepare it for math

I want to parse latex string to math expression with following code
static string ParseString(List<string> startItems, string input)
{
foreach (var item in startItems)
{
int charLocation = input.IndexOf(item, StringComparison.Ordinal);
if (charLocation > 0)
{
switch (input.Substring(0, charLocation).Last())
{
case '{':
case '+':
case '-':
case '*':
break;
default:
input = input.Replace(item, "*" + item);
break;
}
}
}
return input;
}
static void Main(string[] args)
{
string ToParse = #"\sqrt{3}\sqrt{3}*4\sqrt{3}";
ToParse = ParseString(new System.Collections.Generic.List<string> { #"\frac", #"\sqrt" }, ToParse);
Console.WriteLine(ToParse);
}
I should get before every \sqrt word multiply char.But my program breaks in first \sqrt and cant parse others.
Result is
\sqrt{3}\sqrt{3}*4\sqrt{3}
And i want to get
\sqrt{3}*\sqrt{3}*4*\sqrt{3}
Sorry for bad English.
Based on provided example you can do as simple as input = input.Replace(item, "*" + item).TrimStart('*'); (Replace replaces all occurrences of string/char).
As for your code, you have next issues:
First inclusion of \sqrt returns charLocation set to 0 (arrays are xero-based in C#), so your check charLocation > 0 evaluates to false, so nothing happens
input.Substring(i, l) accepts 2 parameters - start index and length, so it is incorrect to use your charLocation as second parameter(even if you remove previous check, input.Substring(0, charLocation).Last() will end in exception Sequence contains no elements).
So if you want to go through all occurrences and insert symbol based on some logic then you'll need to have some cycle and move start index and calculate length(and also use something else, not Replace). Or you can dive into Regular Expressions.

how write C# Recursion method?

i want to write following for loop by using C# Recursion please guide me . Thank you !
StringMobileNo = value.ToString();
for (int i = 0; i < 3; i++)
{
if (StringMobileNo.StartsWith("0"))
{
StringMobileNo = StringMobileNo.Remove(0, 1);
}
}
If you want to recursively remove leading zeroes, you can do this:
public string RemoveZeroes(string input)
{
if (!input.StartsWith("0"))
return input;
return RemoveZeroes(input.Remove(0, 1));
}
An explanation:
Check if there are leading zeroes.
If not, simply return the input string.
If so, remove the first zero and then recur, calling the same method with the first character removed.
This will cause the method to recur until, finally, there are no leading zeroes, at which point the resulting string - with all leading zeroes removed - will be returned all the way up the call stack.
Then call like so:
var myString = RemoveZeroes(StringMobileNo);
However, the same can be achieved by simply doing:
StringMobileNo.TrimStart('0');
Note that I have assumed here that the i < 3 condition is an arbitrary exit condition and you actually want to remove all leading zeroes. Here's one that will let you specify how many to remove:
public string RemoveZeroes(string input, int count)
{
if (count < 1 || !input.StartsWith("0"))
return input;
return RemoveZeroes(input.Remove(0, 1), count - 1);
}
You don't need recursion at all for that.
Instead, use TrimStart to remove all leading 0
StringMobileNo.TrimStart('0')
I think you don't need to use Recursion function here.
you can Use String.TrimStart("0") but if you want to use Recursion function
Try This:
class Program
{
static void Main(string[] args)
{
Recursion("000012345",0);
}
static void Recursion(string value,int c)
{
String MobileNo = value;
int count=c;
if (MobileNo.StartsWith("0") && count<3)
{
count++;
Recursion(MobileNo.Remove(0, 1),count);
}
}
}

Using unicode characters bigger than 2 bytes with .Net

I'm using this code to generate U+10FFFC
var s = Encoding.UTF8.GetString(new byte[] {0xF4,0x8F,0xBF,0xBC});
I know it's for private-use and such, but it does display a single character as I'd expect when displaying it. The problems come when manipulating this unicode character.
If I later do this:
foreach(var ch in s)
{
Console.WriteLine(ch);
}
Instead of it printing just the single character, it prints two characters (i.e. the string is apparently composed of two characters). If I alter my loop to add these characters back to an empty string like so:
string tmp="";
foreach(var ch in s)
{
Console.WriteLine(ch);
tmp += ch;
}
At the end of this, tmp will print just a single character.
What exactly is going on here? I thought that char contains a single unicode character and I never had to worry about how many bytes a character is unless I'm doing conversion to bytes. My real use case is I need to be able to detect when very large unicode characters are used in a string. Currently I have something like this:
foreach(var ch in s)
{
if(ch>=0x100000 && ch<=0x10FFFF)
{
Console.WriteLine("special character!");
}
}
However, because of this splitting of very large characters, this doesn't work. How can I modify this to make it work?
U+10FFFC is one Unicode code point, but string's interface does not expose a sequence of Unicode code points directly. Its interface exposes a sequence of UTF-16 code units. That is a very low-level view of text. It is quite unfortunate that such a low-level view of text was grafted onto the most obvious and intuitive interface available... I'll try not to rant much about how I don't like this design, and just say that not matter how unfortunate, it is just a (sad) fact you have to live with.
First off, I will suggest using char.ConvertFromUtf32 to get your initial string. Much simpler, much more readable:
var s = char.ConvertFromUtf32(0x10FFFC);
So, this string's Length is not 1, because, as I said, the interface deals in UTF-16 code units, not Unicode code points. U+10FFFC uses two UTF-16 code units, so s.Length is 2. All code points above U+FFFF require two UTF-16 code units for their representation.
You should note that ConvertFromUtf32 doesn't return a char: char is a UTF-16 code unit, not a Unicode code point. To be able to return all Unicode code points, that method cannot return a single char. Sometimes it needs to return two, and that's why it makes it a string. Sometimes you will find some APIs dealing in ints instead of char because int can be used to handle all code points too (that's what ConvertFromUtf32 takes as argument, and what ConvertToUtf32 produces as result).
string implements IEnumerable<char>, which means that when you iterate over a string you get one UTF-16 code unit per iteration. That's why iterating your string and printing it out yields some broken output with two "things" in it. Those are the two UTF-16 code units that make up the representation of U+10FFFC. They are called "surrogates". The first one is a high/lead surrogate and the second one is a low/trail surrogate. When you print them individually they do not produce meaningful output because lone surrogates are not even valid in UTF-16, and they are not considered Unicode characters either.
When you append those two surrogates to the string in the loop, you effectively reconstruct the surrogate pair, and printing that pair later as one gets you the right output.
And in the ranting front, note how nothing complains that you used a malformed UTF-16 sequence in that loop. It creates a string with a lone surrogate, and yet everything carries on as if nothing happened: the string type is not even the type of well-formed UTF-16 code unit sequences, but the type of any UTF-16 code unit sequence.
The char structure provides static methods to deal with surrogates: IsHighSurrogate, IsLowSurrogate, IsSurrogatePair, ConvertToUtf32, and ConvertFromUtf32. If you want you can write an iterator that iterates over Unicode characters instead of UTF-16 code units:
static IEnumerable<int> AsCodePoints(this string s)
{
for(int i = 0; i < s.Length; ++i)
{
yield return char.ConvertToUtf32(s, i);
if(char.IsHighSurrogate(s, i))
i++;
}
}
Then you can iterate like:
foreach(int codePoint in s.AsCodePoints())
{
// do stuff. codePoint will be an int will value 0x10FFFC in your example
}
If you prefer to get each code point as a string instead change the return type to IEnumerable<string> and the yield line to:
yield return char.ConvertFromUtf32(char.ConvertToUtf32(s, i));
With that version, the following works as-is:
foreach(string codePoint in s.AsCodePoints())
{
Console.WriteLine(codePoint);
}
As posted already by Martinho, it is much easier to create the string with this private codepoint that way:
var s = char.ConvertFromUtf32(0x10FFFC);
But to loop through the two char elements of that string is senseless:
foreach(var ch in s)
{
Console.WriteLine(ch);
}
What for? You will just get the high and low surrogate that encode the codepoint. Remember a char is a 16 bit type so it can hold just a max value of 0xFFFF. Your codepoint doesn't fit into a 16 bit type, indeed for the highest codepoint you'll need 21 bits (0x10FFFF) so the next wider type would just be a 32 bit type. The two char elements are not characters, but a surrogate pair. The value of 0x10FFFC is encoded into the two surrogates.
While #R. Martinho Fernandes's answer is correct, his AsCodePoints extension method has two issues:
It will throw an ArgumentException on invalid code points (high surrogate without low surrogate or vice versa).
You can't use char static methods that take (char) or (string, int) (such as char.IsNumber()) if you only have int code points.
I've split the code into two methods, one similar to the original but returns the Unicode Replacement Character on invalid code points. The second method returns a struct IEnumerable with more useful fields:
StringCodePointExtensions.cs
public static class StringCodePointExtensions {
const char ReplacementCharacter = '\ufffd';
public static IEnumerable<CodePointIndex> CodePointIndexes(this string s) {
for (int i = 0; i < s.Length; i++) {
if (char.IsHighSurrogate(s, i)) {
if (i + 1 < s.Length && char.IsLowSurrogate(s, i + 1)) {
yield return CodePointIndex.Create(i, true, true);
i++;
continue;
} else {
// High surrogate without low surrogate
yield return CodePointIndex.Create(i, false, false);
continue;
}
} else if (char.IsLowSurrogate(s, i)) {
// Low surrogate without high surrogate
yield return CodePointIndex.Create(i, false, false);
continue;
}
yield return CodePointIndex.Create(i, true, false);
}
}
public static IEnumerable<int> CodePointInts(this string s) {
return s
.CodePointIndexes()
.Select(
cpi => {
if (cpi.Valid) {
return char.ConvertToUtf32(s, cpi.Index);
} else {
return (int)ReplacementCharacter;
}
});
}
}
CodePointIndex.cs:
public struct CodePointIndex {
public int Index;
public bool Valid;
public bool IsSurrogatePair;
public static CodePointIndex Create(int index, bool valid, bool isSurrogatePair) {
return new CodePointIndex {
Index = index,
Valid = valid,
IsSurrogatePair = isSurrogatePair,
};
}
}
To the extent possible under law, the person who associated CC0 with this work has waived all copyright and related or neighboring rights to this work.
Yet another alternative to enumerate the UTF32 characters in a C# string is to use the System.Globalization.StringInfo.GetTextElementEnumerator method, as in the code below.
public static class StringExtensions
{
public static System.Collections.Generic.IEnumerable<UTF32Char> GetUTF32Chars(this string s)
{
var tee = System.Globalization.StringInfo.GetTextElementEnumerator(s);
while (tee.MoveNext())
{
yield return new UTF32Char(s, tee.ElementIndex);
}
}
}
public struct UTF32Char
{
private string s;
private int index;
public UTF32Char(string s, int index)
{
this.s = s;
this.index = index;
}
public override string ToString()
{
return char.ConvertFromUtf32(this.UTF32Code);
}
public int UTF32Code { get { return char.ConvertToUtf32(s, index); } }
public double NumericValue { get { return char.GetNumericValue(s, index); } }
public UnicodeCategory UnicodeCategory { get { return char.GetUnicodeCategory(s, index); } }
public bool IsControl { get { return char.IsControl(s, index); } }
public bool IsDigit { get { return char.IsDigit(s, index); } }
public bool IsLetter { get { return char.IsLetter(s, index); } }
public bool IsLetterOrDigit { get { return char.IsLetterOrDigit(s, index); } }
public bool IsLower { get { return char.IsLower(s, index); } }
public bool IsNumber { get { return char.IsNumber(s, index); } }
public bool IsPunctuation { get { return char.IsPunctuation(s, index); } }
public bool IsSeparator { get { return char.IsSeparator(s, index); } }
public bool IsSurrogatePair { get { return char.IsSurrogatePair(s, index); } }
public bool IsSymbol { get { return char.IsSymbol(s, index); } }
public bool IsUpper { get { return char.IsUpper(s, index); } }
public bool IsWhiteSpace { get { return char.IsWhiteSpace(s, index); } }
}

Fastest way to trim a string and convert it to lower case

I've written a class for processing strings and I have the following problem: the string passed in can come with spaces at the beginning and at the end of the string.
I need to trim the spaces from the strings and convert them to lower case letters. My code so far:
var searchStr = wordToSearchReplacemntsFor.ToLower();
searchStr = searchStr.Trim();
I couldn't find any function to help me in StringBuilder. The problem is that this class is supposed to process a lot of strings as quickly as possible. So I don't want to be creating 2 new strings for each string the class processes.
If this isn't possible, I'll go deeper into the processing algorithm.
Try method chaining.
Ex:
var s = " YoUr StRiNg".Trim().ToLower();
Cyberdrew has the right idea. With string being immutable, you'll be allocating memory during both of those calls regardless. One thing I'd like to suggest, if you're going to call string.Trim().ToLower() in many locations in your code, is to simplify your calls with extension methods. For example:
public static class MyExtensions
{
public static string TrimAndLower(this String str)
{
return str.Trim().ToLower();
}
}
Here's my attempt. But before I would check this in, I would ask two very important questions.
Are sequential "String.Trim" and "String.ToLower" calls really impacting the performance of my app? Would anyone notice if this algorithm was twice as slow or twice as fast? The only way to know is to measure the performance of my code and compare against pre-set performance goals. Otherwise, micro-optimizations will generate micro-performance gains.
Just because I wrote an implementation that appears faster, doesn't mean that it really is. The compiler and run-time may have optimizations around common operations that I don't know about. I should compare the running time of my code to what already exists.
static public string TrimAndLower(string str)
{
if (str == null)
{
return null;
}
int i = 0;
int j = str.Length - 1;
StringBuilder sb;
while (i < str.Length)
{
if (Char.IsWhiteSpace(str[i])) // or say "if (str[i] == ' ')" if you only care about spaces
{
i++;
}
else
{
break;
}
}
while (j > i)
{
if (Char.IsWhiteSpace(str[j])) // or say "if (str[j] == ' ')" if you only care about spaces
{
j--;
}
else
{
break;
}
}
if (i > j)
{
return "";
}
sb = new StringBuilder(j - i + 1);
while (i <= j)
{
// I was originally check for IsUpper before calling ToLower, probably not needed
sb.Append(Char.ToLower(str[i]));
i++;
}
return sb.ToString();
}
If the strings use only ASCII characters, you can look at the C# ToLower Optimization. You could also try a lookup table if you know the character set ahead of time
So first of all, trim first and replace second, so you have to iterate over a smaller string with your ToLower()
other than that, i think your best algorithm would look like this:
Iterate over the string once, and check
whether there's any upper case characters
whether there's whitespace in beginning and end (and count how many chars you're talking about)
if none of the above, return the original string
if upper case but no whitespace: do ToLower and return
if whitespace:
allocate a new string with the right size (original length - number of white chars)
fill it in while doing the ToLower
You can try this:
public static void Main (string[] args) {
var str = "fr, En, gB";
Console.WriteLine(str.Replace(" ","").ToLower());
}

Categories