An unexpected behavior of LastIndexOf()? - c#

While using LastIndexOf to search for a short string in a longer string I came across behavior that I find somewhat counterintuitive:
If I have a haystack and a needle:
var h = "abcabcabc";
var n = "abc";
If I tell LastIndexOf to start searching at index 4, I would expect it to start looking there and proceed towards the start of the string, and hence return 3:
012345678
abcabcabc
    abc <- try index 4, no
   abc  <- found at index 3
...but it actually locates the first abc and returns 0. It behaves as if there were an assumption: "the user wants to search for a string of length 3 starting at index 4; the string couldn't possibly occur at any index higher than 2, so the search will begin from index 2... found at index 0". That would make sense when starting from the very end of the string, since a needle of length 3 couldn't possibly be found any later than haystack.Length-3, but I don't find it logical to adopt the same approach in the middle of a string.
Another way of looking at it is "the haystack is truncated so that it ends at startIndex, and then the truncated haystack is searched". But again, I don't find it reasonable to chop a document and remove a potential match.
While I can reason the search logic out this way and try to remember it, operating in such a manner seems illogical to me, so I'm asking whether there is some underlying reason for this behavior that would make it easier to reason about.
Note: it's also fine to say "no, your logic of 'start at 4, find at 3' is unreasonable because..."; it would help adjust my mental model of how I think LastIndexOf should work.

The documentation says:
Reports the zero-based index position of the last occurrence of a
specified string within this instance. The search starts at a
specified character position and proceeds backward toward the
beginning of the string.
So it searches from the given position back to the beginning of the string. That is a different starting point from the one you expected.
Update: as Matthew Watson mentioned in his comment, the source code says:
For LastIndexOf specifically, overloads which take a 'startIndex' and 'count' behave differently than their IndexOf counterparts. 'startIndex' is the index of the last char element that should be considered when performing the search. For example, if startIndex = 4, then the caller is indicating "when finding the match I want you to include the char element at index 4, but not any char elements past that point."
var h = "abcabcabc";
//index 012345678
//         ^ last element that will be considered: "abca"
var n = "abc";
int result = h.LastIndexOf(n,3); //0
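A short sketch of the behavior described above, written as a top-level program and using the ordinal overload so the result does not depend on culture. The helper LastIndexAtOrBefore is hypothetical (it is not part of the framework); it assumes a non-empty needle and turns "the match must start at or before this index" into the startIndex that LastIndexOf expects.

```csharp
using System;

var h = "abcabcabc";
var n = "abc";

// startIndex is the last character LastIndexOf may look at,
// so a match has to fit entirely inside h[0..startIndex].
Console.WriteLine(h.LastIndexOf(n, 3, StringComparison.Ordinal)); // 0 (searches "abca")
Console.WriteLine(h.LastIndexOf(n, 4, StringComparison.Ordinal)); // 0 (searches "abcab")
Console.WriteLine(h.LastIndexOf(n, 5, StringComparison.Ordinal)); // 3 (searches "abcabc")

// Hypothetical helper: last match that STARTS at or before 'start'.
// Assumes a non-empty needle and 0 <= start < haystack.Length.
static int LastIndexAtOrBefore(string haystack, string needle, int start)
{
    int lastCharAllowed = Math.Min(start + needle.Length - 1, haystack.Length - 1);
    return haystack.LastIndexOf(needle, lastCharAllowed, StringComparison.Ordinal);
}

Console.WriteLine(LastIndexAtOrBefore(h, n, 4)); // 3
```

With the helper, "start at 4, find at 3" works as the question expected, because the search window is widened by the needle's length minus one.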

Related

Why does the new C# 8 Index type start at the end from 1 instead of 0? [duplicate]

C# 8.0 introduces a convenient way to slice arrays - see official C# 8.0 blogpost.
The syntax to access the last element of an array is
var value = new[] { 10, 11, 12, 13 };
int a = value[^1]; // 13
int b = value[^2]; // 12
I'm wondering why the indexing for accessing the elements backwards starts at 1 instead of 0? Is there a technical reason for this?
Official answer
Here is a comment from Mads Torgersen explaining this design decision from the C# 8 blog post:
We decided to follow Python when it comes to the from-beginning and from-end arithmetic. 0 designates the first element (as always), and ^0 the “length’th” element, i.e. the one right off the end. That way you get a simple relationship, where an element's position from beginning plus its position from end equals the length. the x in ^x is what you would have subtracted from the length if you’d done the math yourself.
Why not use the minus (-) instead of the new hat (^) operator? This primarily has to do with ranges. Again in keeping with Python and most of the industry, we want our ranges to be inclusive at the beginning, exclusive at the end. What is the index you pass to say that a range should go all the way to the end? In C# the answer is simple: x..^0 goes from x to the end. In Python, there is no explicit index you can give: -0 doesn’t work, because it is equal to 0, the first element! So in Python, you have to leave the end index off completely to express a range that goes to the end: x... If the end of the range is computed, then you need to remember to have special logic in case it comes out to 0. As in x..-y, where y was computed and came out to 0. This is a common nuisance and source of bugs.
Finally, note that indices and ranges are first class types in .NET/C#. Their behavior is not tied to what they are applied to, or even to be used in an indexer. You can totally define your own indexer that takes Index and another one that takes Range – and we’re going to add such indexers to e.g. Span. But you can also have methods that take ranges, for instance.
My answer
I think this is to match the classic syntax we are used to:
value[^1] == value[value.Length - 1]
If it used 0, it would be confusing when the two syntaxes were used side-by-side. This way it has lower cognitive load.
Other languages like Python also use the same convention.
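As a small illustration of the relationship described in the quoted comment, the sketch below assumes a runtime with Index and Range support for arrays (.NET Core 3.0 or later) and uses variable names chosen only for this example.

```csharp
using System;

int[] value = { 10, 11, 12, 13 };

// ^x means "Length - x", so ^1 is the last element.
Index last = ^1;
Console.WriteLine(value[last]);                   // 13
Console.WriteLine(last.GetOffset(value.Length));  // 3, i.e. value.Length - 1

// ^0 is "one past the end": not a valid element index,
// but exactly what an exclusive range end needs.
int[] tail = value[1..^0];
Console.WriteLine(string.Join(",", tail));        // 11,12,13

// A computed from-end offset of 0 still means "up to the end".
int y = 0;
int[] all = value[..^y];
Console.WriteLine(all.Length);                    // 4
```

The last two lines show the case the comment calls a common nuisance in Python: a computed offset of zero needs no special handling here.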

Why is string.Substring(length,0) allowed? [closed]

Can somebody tell me why it is allowed to call string.Substring(someIndex, 0)?
string a = "abc";
var result = a.Substring(1,0);
Console.WriteLine(result);
This code will be compiled and will write nothing to console.
What is the reason that this is allowed?
In which case can this be used?
UPDATE
To clarify: I know what this method does, and that in this case it returns an empty string. I am NOT asking why the result is empty. I am asking why it's allowed to do this.
This code will be compiled and will write nothing to console.
First of all, technically speaking, this statement is wrong: it writes a new line to the console. That's where the Line in WriteLine comes in. But let's not be picky.
What is the reason that this is allowed?
There is no reason to disable it. Say for instance you want to make a string insertion method:
public static string StringInsert(String original, String toInsert, int index) {
    return original.Substring(0, index) + toInsert + original.Substring(index);
}
Our StringInsert method cannot know whether the first or the second part will be empty (we could decide to insert at index 0, for example). If we had to take into account that the first substring could have zero length, or the second, or both, we would have to implement a lot of if-logic. As it is, we can use a one-liner.
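To make the boundary cases concrete, here is a sketch of the same method written as a local function in a top-level program, with the three situations where one or both Substring calls return an empty string. The sample inputs are invented for illustration; the last line uses the framework's own String.Insert for comparison.

```csharp
using System;

static string StringInsert(string original, string toInsert, int index)
{
    // Either Substring call may legitimately return "".
    return original.Substring(0, index) + toInsert + original.Substring(index);
}

Console.WriteLine(StringInsert("abc", "X", 0)); // Xabc (first part is empty)
Console.WriteLine(StringInsert("abc", "X", 3)); // abcX (second part is empty)
Console.WriteLine(StringInsert("", "X", 0));    // X    (both parts are empty)
Console.WriteLine("abc".Insert(3, "X"));        // abcX, the built-in equivalent
```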
Usually one considers a string s to be a sequence of characters s = s_0 s_1 ... s_{n-1}. A substring from i with length j is the string t = s_i s_{i+1} ... s_{i+j-1}. There is no ambiguity here: it is clear that if j is 0, the result is the empty string. Usually you only raise an exception if something is exceptional: the input does not make any sense, or is not allowed.
Many things are allowed because there is no good reason to prohibit them.
Substrings of length zero are one such thing: in situations when the desired length is computed, this saves programmers who use your library from having to zero-check the length before making a call.
For example, let's say the task is to find the substring between the first and the last hash mark # in a string. The current library lets you do this:
var s = "aaa#bbb"; // <<== Only one # here
var start = s.IndexOf('#');
var end = s.LastIndexOf('#');
var len = end-start;
var substr = s.Substring(start, len); // Zero length
If zero length were prohibited, you would be forced to add a conditional:
var len = end-start;
var substr = len != 0 ? s.Substring(start, len) : "";
Checking fewer prerequisites makes your library easier to use. In a way, the Pythagorean theorem is useful in no small part because it works for the degenerate case, when the length of all three sides is zero.
The method you use has the following signature:
public string Substring(
int startIndex,
int length
)
where startIndex is
The zero-based starting character position of a substring in this
instance.
and length is
The number of characters in the substring.
That being said, the following call is perfectly valid
var result = a.Substring(1,0);
but it is meaningless, since the number of characters in the substring you want to create is 0. This is why you don't see any output in the console.
In other words, calling Substring with 0 as the second argument does nothing useful.
From the documentation:
public string Substring(startIndex, length)
A string that is equivalent to the substring of length length that
begins at startIndex in this instance, or Empty if startIndex is equal
to the length of this instance and length is zero.
Basically, when you do someString.Substring(n, 0);, what you're getting back is a string that starts at n and has length 0.
The length parameter represents the total number of characters to extract from the current string instance.
That's why nothing is printed to the console. The returned string is empty (has length 0).
EDIT:
Well, there is a limitation in place: the method throws an ArgumentOutOfRangeException if:
startIndex plus length indicates a position not within this instance.
-or-
startIndex or length is less than zero.
The reason they made the exception be thrown if length is less than zero and not if it is equal is most likely because, though pointless in most situations, requesting a string of 0 length is not an invalid request.
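The rules quoted above can be checked with a few calls. This is a sketch written as a top-level program; Throws is a hypothetical helper introduced only to report whether a call raises ArgumentOutOfRangeException.

```csharp
using System;

string a = "abc";

Console.WriteLine(a.Substring(1, 0).Length);          // 0
Console.WriteLine(a.Substring(3, 0) == string.Empty); // True: startIndex == Length is allowed

// Hypothetical helper: true if the call throws ArgumentOutOfRangeException.
static bool Throws(Action call)
{
    try { call(); return false; }
    catch (ArgumentOutOfRangeException) { return true; }
}

Console.WriteLine(Throws(() => a.Substring(4, 0)));  // True: startIndex is past the end
Console.WriteLine(Throws(() => a.Substring(1, -1))); // True: negative length
Console.WriteLine(Throws(() => a.Substring(2, 2)));  // True: startIndex + length runs past the end
```

A zero length is accepted at every valid start position, including the position just after the last character; only positions outside the string or negative arguments are rejected.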

C# LastIndexOf not giving correct result

In this particular line of code:
correct = Array.LastIndexOf(turns.ToArray(), false, 4, 0);
I get correct = -1. How is this even possible?
turns[0] through turns[3] are false, turns[4] is true, and turns[5] is false. Could this be because the last index I want searched is 4, and it holds a value different from the one I am looking for?
The issue is with the last argument (count). This restricts the number of elements searched. You are restricting it to search 0 elements starting at index 4. Thus, it doesn't find anything.
Your count indicates searching 0 elements in the section.
correct = Array.LastIndexOf(turns.ToArray(), false, 4, 2);
Try this out:
var items = turns.ToArray();
correct = Array.LastIndexOf(items, false, items.Length - 1, items.Length);
What you were doing wrong:
never hard-code the array length (especially in your case, when the array is filled with values)
the first integer argument is the index at which the backward search starts, and the second is the count, i.e. how many items to search (see the Array.LastIndexOf documentation)
Update 1:
I made a mistake with the starting index and the count. I have updated the answer; thank you @Steve for pointing it out.
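The effect of the count argument can be seen in a short sketch. It assumes turns holds exactly the six values described in the question and is already an array, so no ToArray() call is needed.

```csharp
using System;

bool[] turns = { false, false, false, false, true, false };

// count = 0: nothing is examined, so the result is always -1.
Console.WriteLine(Array.LastIndexOf(turns, false, 4, 0)); // -1

// count = 2: examines indices 4 and 3, walking backward.
Console.WriteLine(Array.LastIndexOf(turns, false, 4, 2)); // 3

// Whole array: start at the last valid index and examine every element.
Console.WriteLine(Array.LastIndexOf(turns, false, turns.Length - 1, turns.Length)); // 5
```

Note that the start index must be a valid index, so the whole-array search starts at turns.Length - 1 rather than turns.Length.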

C# Find the Next X and Previous Numbers in a sequence

I have a list of numbers, {1,2,3,4,...,End}, where End is specified. I want to display the X closest numbers around a given number Find within the list. If X is odd, I want the extra number to go on the greater-than side.
Example (Base Case)
End: 6
X: 2
Find: 3
The result should be: {2,3,4}
Another Example (Bound Case):
End: 6
X: 4
Find: 5
The result should be: {2,3,4,5,6}
Yet Another Example (Odd Case):
End: 6
X: 3
Find: 3
The result should be: {2,3,4,5}
I'm assuming it would be easier to simply find a start and stop value, rather than actually generating the list, but I don't really care one way or another.
I'm using C# 4.0 if that matters.
Edit: I can think of a way to do it, but it involves way too many if, else if cases.
if (Find == 1)
{
    Start = Find;
    Stop = (Find + X < End ? Find + X : End);
}
else if (Find == 2)
{
    if (X == 1)
    {
        Start = Find;
        Stop = (Find + 1 < End ? Find + 1 : End);
    }
    ...
}
You can hopefully see where this is going. I'm assuming I'll have to use (X % 2 == 0) for odd/even checking, and then some bounds like less = Find - X/2 and more = Find + X/2. I just can't figure out the path with the fewest if cases.
Edit II: I should also clarify that I don't actually create a list of {1,2,3,4...End}, but maybe I need to just start at Find-X/2.
I realise that you are learning, and out of respect for that I will not provide you with the full solution. I will, however, do my best to nudge you in the right direction.
From looking at your attempted solution, I think you need to figure out the algorithm before trying to code up something that may or may not solve your problem. As you say yourself, writing one if statement for every possible permutation of the input is not a manageable solution. You need to find an algorithm that is general enough to use for any input you get and still produce the right results.
Basically, there are two questions you need to answer before you'll be able to code up a working solution.
How do I find the lower bound of the list I want to return?
How do I find the upper bound of the list I want to return?
Considering the example base case, you know that the given parameter X contains a number that tells you how many numbers around Find you should display. Therefore you need to divide X equally on both sides of Find.
Thus:
If I get an input X = 4 and Find = 3, the lower bound will be 3 - 4/2 or Find - X/2.
The higher bound will be 3 + 4/2 or Find + X/2.
Start by writing a program that runs and works for the base case. Once that is done, sit down and figure out how you would find the higher and lower bounds for a more complicated case.
Good luck!
You can look at the LINQ extension methods Skip and Take.
x.Skip(3).Take(4);
This should help with what you are trying to do.
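For readers who want to compare against a worked version, here is one possible sketch of the bounds calculation. Window is a hypothetical helper, and the rule that a window overshooting one edge is shifted back inside the list is an assumption inferred from the bound-case example (End 6, X 4, Find 5 gives {2,3,4,5,6}). It uses tuple syntax that is newer than the C# 4.0 mentioned in the question; out parameters would serve the same purpose there.

```csharp
using System;
using System.Linq;

// Hypothetical helper: inclusive bounds of the window around 'find' in 1..end.
static (int Start, int Stop) Window(int find, int x, int end)
{
    int below = x / 2;       // integer division rounds down
    int above = x - below;   // for odd x the extra number goes above
    int start = find - below;
    int stop = find + above;

    // Assumption: a window that overshoots one edge is shifted back inside.
    if (start < 1) { stop += 1 - start; start = 1; }
    if (stop > end) { start -= stop - end; stop = end; }
    return (Math.Max(start, 1), stop);
}

var (lo, hi) = Window(find: 5, x: 4, end: 6);
Console.WriteLine(string.Join(",", Enumerable.Range(lo, hi - lo + 1))); // 2,3,4,5,6
```

Only the two bounds are computed; the list itself is generated at the end with Enumerable.Range, which is equivalent to applying Skip and Take to the full sequence.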

Need help understanding code

I am taking a C# class and I need help understanding the following code.
The code has an array which represents responses to a survey, with values 1 thru 10.
The output displays these ratings and the frequency of how many times a value was selected.
The following code is from my book, but I have modified it to just a basic example.
int[] responses = { 3, 2, 5, 6, 3, 5 , 4, 5, 5, 5};
int[] frequency = new int[7];
for (int answer = 0; answer < responses.Length; answer++)
++frequency[responses[answer]];
for (int rating = 1; rating < frequency.Length; rating++)
Console.WriteLine(rating + ", " + frequency[rating]);
Console.Read();
How does the line ++frequency[responses[answer]]; work? Looking at this, if I take responses[answer] the first time through the loop, this would represent responses[0], which would be a 3, correct? This is where I get confused: what does the ++frequency part of this line do?
frequency[responses[answer]] = frequency[responses[answer]] + 1;
EDIT: I think it's pretty unclear to write it like that. As a personal preference, I don't like using unary operations (++x, x++, etc) on elements that have lots of indexes present.
It adds one to the frequency at that location in the array.
For example, the frequency at position 3 (from your example) will be increased by one after that line executes.
EDIT: So, in more detail, when answer = 0, responses[0] = 3, so frequency[3] gets one added to it.
The ++ could very easily be at the end of the command as well. In other words,
++frequency[responses[answer]];
is the same thing (IN THIS CASE) as using
frequency[responses[answer]]++;
Let's break it down: As you point out, on the first pass responses[answer] will evaluate to "3"
So this then looks like ++frequency[3]
The ++ is incrementing the value of the array at index 3 by 1
Simple enough?
I should also point out that applying the ++ before the array rather than after it does affect how the incrementing is executed (although it doesn't affect the results of this code).
For instance:
int n = 2;
int j = ++n;
int k = n++;
What are j and k?
j will be 3, and k will also be 3 (n ends up as 4). This is because if you place the ++ before, the increment happens first and the new value is used. If you place it at the end, the old value is used in the expression and the increment happens afterwards.
If it helps, think of ++frequency[i] as "frequency[i] = frequency[i] + 1".
If the ++ operator comes before the variable, then the increment is applied before the statement is executed. If the ++ comes afterwards, then the statement is executed and then the variable is incremented.
In this case, it doesn't matter, since incrementing before or after doesn't impact the logic.
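The difference only shows up when the value of the expression is actually used. A minimal sketch on an array element, with variable names chosen for this example:

```csharp
using System;

int[] frequency = new int[7];

int pre = ++frequency[3];   // increments first: frequency[3] is 1, pre is 1
int post = frequency[3]++;  // yields the old value: post is 1, frequency[3] is now 2

Console.WriteLine(pre);          // 1
Console.WriteLine(post);         // 1
Console.WriteLine(frequency[3]); // 2
```

When the result is discarded, as in the loop from the question, both forms leave the array in the same state.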
Since "responses[answer]" evaluates to a number, that line of code is incrementing the frequency entry at that array index. So the first time through, answer is 0, so responses[answer] is 3, so the frequency[3] box is getting incremented by 1. The next time through, the frequency[2] box is incremented... etc. etc. etc.
frequency is an array, where all elements are initialized to 0 (the default value for an int). The line ++frequency[responses[answer]] will increment the frequency element pointed out by the integer found at responses[answer]. By putting the ++ in front of frequency, the array element will be incremented before the resulting value is returned.
You can read more about the ++ operator in the C# language reference.
In cases like this it's often useful to rewrite the code as you walk it.
When answer = 0
++frequency[responses[0]]
++frequency[3] since responses[0] = 3
frequency now looks like { 0, 0, 0, 1, 0, 0, 0 }
When answer = 1
++frequency[responses[1]]
++frequency[2] since responses[1] = 2
frequency now looks like { 0, 0, 1, 1, 0, 0, 0 }
And so on.
It means increment the value at frequency[ 3 ]. Where 3 is the return result from responses[answer]. Similarly the next iteration would increment the value at frequency[ 2 ].
The ++ operator in C# when applied to an integer will increment it by one.
In the specific line you're looking at, frequency is an array of integers with 7 elements. That is somewhat confusing because, given the way you described the survey, this code would break for any value above 6 in the responses array.
That issue aside, basically it's incrementing whichever index of the array it's accessing. So in your example responses[0] would be 3. So this line would find the value of frequency[3] and increment it by 1. Since integer arrays are initialized with all values at zero, then after the first iteration, frequency[3] would be 1. Then if there was another 3 later in your responses array, frequency[3] would be incremented again (i.e. responses[4]).
I hope this helps you.
The goal of the code snippet seems to be to determine the number of times each response appears in the 'responses' array. So, for your example set, frequency[3] should be 2, frequency[5] should be 5, etc.
So, the line you are asking about takes the current element from the responses array, and increments the associated value in the frequency array by 1, to indicate that the particular value has been observed in responses.
Once the entire code snippet executes, the frequency array contains the number of times each value from 0 to 6 was observed in the responses array.
It is using the frequency array to count how many times each response was entered. You could have a counter for each answer:
int numberOfOnes = 0;
int numberOfTwos = 0;
// Etc...
But that would be ugly programming and not as easy or efficient. Using the frequency array lets you avoid an if/else if block or a switch and makes your code easier to read.
Another thing about that frequency array.
int[] frequency = new int[7];
This initializes all the integers in the array to 0, that's why you can just start off by incrementing it instead of seeing if it was the first time for that specific response and then initializing it with 1 or something of that nature.
Good luck with all the fun C# you have ahead of you.
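Putting the explanations together, here is a sketch of the whole count as a top-level program. Unlike the snippet in the question, it sizes the frequency array from an assumed maximum rating of 10 (the constant MaxRating is introduced for this example), so a response of 7 or higher would not run past the end of the array.

```csharp
using System;

const int MaxRating = 10; // assumption: ratings run from 1 to 10, as in the survey
int[] responses = { 3, 2, 5, 6, 3, 5, 4, 5, 5, 5 };

// One slot per possible rating; slot 0 is unused. Every slot starts at 0.
int[] frequency = new int[MaxRating + 1];

foreach (int response in responses)
    ++frequency[response]; // count one more occurrence of this rating

for (int rating = 1; rating <= MaxRating; rating++)
    Console.WriteLine(rating + ", " + frequency[rating]);
// Prints 1, 0 / 2, 1 / 3, 2 / 4, 1 / 5, 5 / 6, 1 and then 0 for ratings 7 to 10.
```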
