LINQ select should break on first match - c#

I want to convert the following structured code into a more readable LINQ call:
foreach (string line in this.HeaderTexts)
{
Match match = dimensionsSearcher.Match(line);
if (match.Success)
{
// Do something
return;
}
}
I came up with the following code:
Match foundMatch = this.HeaderTexts
.Select(text => dimensionsSearcher.Match(text))
.Where(match => match.Success)
.FirstOrDefault();
if (foundMatch != null)
{
// Do something
return;
}
However, from my understanding, this will run the Regex check for every header text, while my first version breaks as soon as it hits the first match. Is there a way to optimize the LINQ version of that code, or should I rather stick to the structured code?

Let's say you have a list of integers, you need to add 2 to each number, then find the first one that is even.
var input = new[] { 1, 2, 3, 4, 5, 6 };
var firstEvenNumber = input
.Select(x => x + 2)
.Where(x => x % 2 == 0)
.First();
// firstEvenNumber is 4, which is the input "2" plus two
Now, does the Select evaluate x + 2 on every input before First gets run? Let's find out. We can replace the code in Select with a multi-line lambda that prints to the console whenever it is evaluated.
var input = new[] { 1, 2, 3, 4, 5, 6 };
var firstEvenNumber = input
.Select(x => {
Console.WriteLine($"Processing {x}");
return x + 2;
})
.Where(x => x % 2 == 0)
.First();
Console.WriteLine("First even number is " + firstEvenNumber);
This prints:
Processing 1
Processing 2
First even number is 4
So it looks like LINQ only evaluated the minimum number of entries needed to satisfy Where and First.
Where and First don't need all the processed records up front in order to pass results to the next step, unlike Reverse(), ToList(), OrderBy(), etc., which have to consume the whole input first.
If you instead stuck a ToList() before First, it would be a different story.
var input = new[] { 1, 2, 3, 4, 5, 6 };
var firstEvenNumber = input
.Select(x => {
Console.WriteLine($"Processing {x}");
return x + 2;
})
.Where(x => x % 2 == 0)
.ToList() // same thing if you put it before Where instead
.First();
Console.WriteLine("First even number is " + firstEvenNumber);
This prints:
Processing 1
Processing 2
Processing 3
Processing 4
Processing 5
Processing 6
First even number is 4

Your LINQ query does what you hope it does. It will only execute the regex until one header matches. So it has the same behavior as your loop. That's ensured with FirstOrDefault (or First). You could rewrite it to:
Match foundMatch = this.HeaderTexts
.Select(text => dimensionsSearcher.Match(text))
.FirstOrDefault(m => m.Success);
// ...
Note that Single and SingleOrDefault ensure that there is at most one match (otherwise they throw an InvalidOperationException), so they may need to enumerate the whole sequence because they have to check whether there is a second match.
Read this blog if you want to understand how lazy evaluation (deferred execution) works:
https://codeblog.jonskeet.uk/category/edulinq/
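To check the claim for the regex case directly, here is a minimal sketch that counts how often Regex.Match is actually called. The class name, the sample headers and the pattern are made up for illustration; only the Select/FirstOrDefault shape comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LazyMatchDemo
{
    // Returns the first successful match (or null) together with the number
    // of header texts that were actually passed to Regex.Match.
    public static (Match Found, int Calls) FindFirst(string[] headerTexts, Regex searcher)
    {
        int calls = 0;
        Match found = headerTexts
            .Select(text => { calls++; return searcher.Match(text); })
            .FirstOrDefault(m => m.Success);
        return (found, calls);
    }

    public static void Main()
    {
        // Hypothetical sample data, not taken from the question.
        var headers = new[] { "Title", "Size: 640x480", "Author", "Date" };
        var searcher = new Regex(@"(\d+)x(\d+)");

        var result = FindFirst(headers, searcher);
        Console.WriteLine(result.Found.Value); // 640x480
        Console.WriteLine(result.Calls);       // 2, the last two headers are never tested
    }
}
```

If no header matches, Found is null and Calls equals the number of headers, which is the same amount of work the foreach loop would do.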

Related

How to check if a List<T> contains another List<T> [duplicate]

This question already has answers here:
How to check if list contains another list in same order
(2 answers)
Closed 4 years ago.
Is there any elegant way in c# to check whether a List<T> contains a sub-List<T> similar to string.Contains(string)?
Let's say, for example, I want to test whether List B is contained in List A:
List<int> A = new List<int>{ 1, 2, 3, 4, 3, 4, 5 };
List<int> B = new List<int>{ 3, 4, 5 };
It is important that all elements match in exactly that order.
I know I could possibly do something like
bool Contains(List<Sampletype> source, List<Sampletype> sample)
{
    // sample has to be smaller or equal length
    if (sample.Count > source.Count) return false;
    // doesn't even contain first element
    if (!source.Contains(sample[0])) return false;
    // get possible starts
    // see https://stackoverflow.com/a/10443540/7111561
    int[] possibleStartIndexes = source.Select((b, i) => b == sample[0] ? i : -1).Where(i => i != -1).ToArray();
    foreach (int possibleStartIndex in possibleStartIndexes)
    {
        // start is too late -> neither this nor any later start can match
        if (possibleStartIndex + sample.Count > source.Count) return false;
        bool allMatch = true;
        for (int offset = 0; offset < sample.Count; offset++)
        {
            // index the sample by offset; on a mismatch try the next possible start
            if (source[possibleStartIndex + offset] != sample[offset])
            {
                allMatch = false;
                break;
            }
        }
        if (!allMatch) continue;
        // if this is reached all elements of the sample matched
        Debug.Log("Match found starting at index " + possibleStartIndex);
        return true;
    }
    return false;
}
But I hope there is a better way to do so.
Here's a oneliner:
var result = A.Select(a => $"{a}").Aggregate((c, n) => $"{c};{n}").Contains(B.Select(b => $"{b}").Aggregate((c, n) => $"{c};{n}"));
It basically creates a string from each list, and checks whether the A string contains the B string. This way you won't just get a method like string.Contains, you actually get to use just that.
EDIT
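Added a separator to the string aggregations, as {1, 2, 3} would otherwise result in the same string as {1, 23}. Note that the separator alone does not rule out false positives at the edges: {11, 2} becomes "11;2", which still contains "1;2".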
EDIT 2
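Re-adding my first approach, which checks whether every element of list B is present somewhere in list A. Note that Intersect yields the distinct elements of B in B's own order, so this does not verify the order within A, and it returns false when B contains duplicates:
var result = B.Intersect(A).SequenceEqual(B);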
Essentially you want to slide a window over A and compare each element of that window with B. The last part is really what SequenceEqual does, and I recommend using it; this is just an alternative to explain the point:
bool equal = Enumerable.Range(0, A.Count() - B.Count() + 1)
.Select(i => A.Skip(i).Take(B.Count))
.Any(w => w.Select((item, i) => item.Equals(B[i])).All(item => item));
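The sliding-window idea can also be written with SequenceEqual, as the answer recommends. The following is only a sketch: the class and method names are hypothetical, and the length guard is an addition so that Enumerable.Range never receives a negative count.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceSearch
{
    // Hypothetical helper: true if 'sample' occurs in 'source' as a contiguous run.
    public static bool ContainsSequence<T>(IReadOnlyList<T> source, IReadOnlyList<T> sample)
    {
        // A longer sample can never fit; this also keeps the Range count non-negative.
        if (sample.Count > source.Count) return false;

        return Enumerable.Range(0, source.Count - sample.Count + 1)
            .Any(start => source.Skip(start).Take(sample.Count).SequenceEqual(sample));
    }

    public static void Main()
    {
        var a = new List<int> { 1, 2, 3, 4, 3, 4, 5 };
        var b = new List<int> { 3, 4, 5 };

        Console.WriteLine(ContainsSequence(a, b));                         // True
        Console.WriteLine(ContainsSequence(a, new List<int> { 4, 5, 3 })); // False
    }
}
```

Skip and Take re-enumerate the source for every window, so this is quadratic in the worst case; for short lists that is usually acceptable. Like string.Contains with an empty string, an empty sample is reported as contained.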

Prepare default new line as replacement of empty values

I'm working on a report using RDLC where the report footer needs to stay at the bottom, but the visibility of each of the four table footers depends on certain conditions.
For that I use Union and Take(4) so that any table footer that is not visible is replaced with \r\n at the bottom.
Should be:
12345
\r\n
\r\n
\r\n
Not like this:
\r\n
\r\n
\r\n
12345
Here is my code.
var footerValues = new[]
{
salesOrder.Subtotal.ToString("N0"),
salesOrder.Discount.ToString("N0"),
salesOrder.PPN.ToString("N0"),
salesOrder.Total.ToString("N0")
};
var stats = new[] {
salesOrder.Discount >= 1 || salesOrder.PPN >= 1, // combined visibility
salesOrder.Discount >= 1, // visible only if the value >= 1
salesOrder.PPN >= 1, // visible only if the value >= 1
true // total always visible
};
var textValues = stats
.Select((v, i) => v ? footerValues[i] : null).OfType<string>()
.Union(Enumerable.Range(0, 4).Select(x => string.Empty).ToArray())
.Take(4)
.ToArray();
var footerValue = string.Join(Environment.NewLine, textValues);
If the stats produces
false, false, false, true
The expected footerValue would be
"12345\r\n\r\n\r\n"
But actual result is
"12345\r\n"
What's wrong with the code? Or can it be simplified?
This part:
.Union(Enumerable.Range(0, 4).Select(x => string.Empty).ToArray())
is pointless. It will only contribute a single string.Empty, because Union removes duplicates. I think you want Concat instead. By the way, you can also replace Enumerable.Range with Enumerable.Repeat:
var textValues = stats
.Select((v, i) => v ? footerValues[i] : null).OfType<string>()
.Concat(Enumerable.Repeat(string.Empty, 4))
.Take(4)
.ToArray();
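The difference between Union and Concat is easy to see on a small input. This sketch assumes a single visible value "12345"; the class and the Pad helper are hypothetical names, not part of the original report code.

```csharp
using System;
using System.Linq;

public static class FooterPadding
{
    // Hypothetical helper: keeps the visible values first and pads with
    // empty strings until exactly 'rows' entries exist.
    public static string[] Pad(string[] visibleValues, int rows)
    {
        return visibleValues
            .Concat(Enumerable.Repeat(string.Empty, rows))
            .Take(rows)
            .ToArray();
    }

    public static void Main()
    {
        var visible = new[] { "12345" };

        // Union treats its result as a set, so the four empty strings collapse into one.
        var withUnion = visible.Union(Enumerable.Repeat(string.Empty, 4)).Take(4).ToArray();
        Console.WriteLine(withUnion.Length);       // 2

        // Concat keeps duplicates, so Take(4) really yields four entries.
        Console.WriteLine(Pad(visible, 4).Length); // 4
    }
}
```

Joining the four padded entries with "\r\n" produces three separators, because string.Join only places a separator between elements.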

LINQ non-linear order by string length

I'm trying to get a list of string ordered such that the longest are on either end of the list and the shortest are in the middle. For example:
A
BB
CCC
DDDD
EEEEE
FFFFFF
would get sorted as:
FFFFFF
DDDD
BB
A
CCC
EEEEE
EDIT: To clarify, I was specifically looking for a LINQ implementation to achieve the desired results because I wasn't sure how/if it was possible to do using LINQ.
You could create two ordered groups, then order the first group descending (already done) and the second group ascending:
var strings = new List<string> {
"A",
"BB",
"CCC",
"DDDD",
"EEEEE",
"FFFFFF"};
var two = strings.OrderByDescending(str => str.Length)
.Select((str, index) => new { str, index })
.GroupBy(x => x.index % 2)
.ToList(); // two groups, ToList to prevent double execution in following query
List<string> ordered = two.First()
.Concat(two.Last().OrderBy(x => x.str.Length))
.Select(x => x.str)
.ToList();
Result:
[0] "FFFFFF" string
[1] "DDDD" string
[2] "BB" string
[3] "A" string
[4] "CCC" string
[5] "EEEEE" string
Don't ask how and why... ^^
list.Sort((a, b) => a.Length.CompareTo(b.Length)); // Sort by length in case the list is not already sorted.
var length = list.Count;
var result = Enumerable.Range(0, length)
.Select(i => length - 1 - 2 * i)
.Select(i => list[Math.Abs(i - (i >> 31))])
.ToList();
Okay, before I forget how it works, here you go.
A list with 6 items for example has to be reordered to this; the longest string is at index 5, the shortest one at index 0 of the presorted list.
5 3 1 0 2 4
We start with Enumerable.Range(0, length) yielding
0 1 2 3 4 5
then we apply i => length - 1 - 2 * i yielding
5 3 1 -1 -3 -5
and we have the non-negative part correct. Now note that i >> 31 is an arithmetic right shift and will copy the sign bit into all bits. Therefore non-negative numbers yield 0 while negative numbers yield -1. That in turn means subtracting i >> 31 will not change non-negative numbers but will add 1 to negative numbers, yielding
5 3 1 0 -2 -4
and now we finally apply Math.Abs() and get
5 3 1 0 2 4
which is the desired result. It works similarly for lists of odd length.
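The index mapping can be checked in isolation. The sketch below computes only the indexes, once with the shift trick from the answer and once with an explicit branch that is assumed to be equivalent for 32-bit int values; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class OutsideInOrder
{
    // Index mapping from the answer, using the sign-bit trick.
    public static int[] WithShift(int length) =>
        Enumerable.Range(0, length)
            .Select(i => length - 1 - 2 * i)
            .Select(i => Math.Abs(i - (i >> 31)))
            .ToArray();

    // Same mapping with an explicit branch: -1, -3, -5 become 0, 2, 4.
    public static int[] WithBranch(int length) =>
        Enumerable.Range(0, length)
            .Select(i => length - 1 - 2 * i)
            .Select(i => i >= 0 ? i : -i - 1)
            .ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", WithShift(6)));  // 5 3 1 0 2 4
        Console.WriteLine(string.Join(" ", WithBranch(6))); // 5 3 1 0 2 4
        Console.WriteLine(string.Join(" ", WithBranch(5))); // 4 2 0 1 3
    }
}
```

The branch version trades the bit trick for readability; both produce every index of the presorted list exactly once.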
Just another option, which I find more readable and easy to follow:
You have an ordered list:
var strings = new List<string> {
"A",
"BB",
"CCC",
"DDDD",
"EEEEE",
"FFFFFF"};
Create a new list and simply alternate where you add items:
var new_list = new List<string>(); // This will hold your results
bool start = true; // Insert at head or tail
foreach (var s in strings)
{
if (start)
new_list.Insert(0,s);
else
new_list.Add(s);
start = !start; // Flip the insert location
}
Sweet and simple :)
Regarding Daniel Bruckner's comment: if you care about which string comes first, you could also change the start condition to:
// This makes sure the longest string comes first
bool start = strings.Count() % 2 == 1;

How can I ignore / skip 'n' elements while sorting string in a lexicographical order

I have the following C# code that sorts the suffixes of a string in lexicographical (alphabetical) order.
string str = "ACGGACGAACT";
IEnumerable<string> sortedSubstrings =
Enumerable.Range(0, str.Length)
.Select(i => str.Substring(i))
.OrderBy(s => s);
Result:
0 AACT
1 ACGAACT
2 ACGGACGAACT
3 ACT
4 CGAACT
5 CGGACGAACT
6 CT
7 GAACT
8 GACGAACT
9 GGACGAACT
10 T
However, I want to enhance this sort by skipping the 3rd and 4th characters during the lexicographical sort.
In this case the result will differ from the one above.
result:
0 AA[CT]
1 AC[T]
2 AC[GG]ACGAACT
3 AC[GA]ACT
4 CG[GA]CGAACT
5 CG[AA]CT
6 CT
7 GA[CG]AACT
8 GA[AC]T
9 GG[AC]GAACT
10 T
How can I achieve this?
This can be done by tweaking the lambda passed to OrderBy. Something like this should do it:
var sortedSubstrings =
Enumerable.Range(0, str.Length)
.Select(i => str.Substring(i))
.OrderBy(s => s.Length < 3 ? s : s.Remove(2, Math.Min(s.Length - 2, 2)));
Edit: Corrected off-by-one error.
You can change the lambda passed to OrderBy to one which will remove the 3rd and 4th symbols from the string.
var sorted = source.OrderBy(s => new string(s.Where((ch, n) => n != 2 && n != 3).ToArray()));
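To see what the sort actually compares, it can help to print the key next to each substring. This is a sketch only: the class and method names are hypothetical, and StringComparer.Ordinal is an assumption chosen to keep the ordering independent of the current culture.

```csharp
using System;
using System.Linq;

public static class SkipCharsSort
{
    // Sort key: the string without its 3rd and 4th characters (indexes 2 and 3), if present.
    public static string SortKey(string s) =>
        s.Length <= 2 ? s : s.Remove(2, Math.Min(2, s.Length - 2));

    // All suffixes of 'str', ordered by their sort key.
    public static string[] SortSuffixes(string str) =>
        Enumerable.Range(0, str.Length)
            .Select(i => str.Substring(i))
            .OrderBy(s => SortKey(s), StringComparer.Ordinal)
            .ToArray();

    public static void Main()
    {
        // Prints each key next to the suffix it was derived from.
        foreach (var s in SortSuffixes("ACGGACGAACT"))
            Console.WriteLine($"{SortKey(s),-10} <- {s}");
    }
}
```

For example, the key of "ACGGACGAACT" is "ACACGAACT", which is why it sorts after "ACT" (key "AC") in the expected result from the question.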

LINQ - is SkipWhile broken?

I'm a bit surprised to find the results of the following code, where I simply want to remove all 3s from a sequence of ints:
var sequence = new [] { 1, 1, 2, 3 };
var result = sequence.SkipWhile(i => i == 3); // Oh noes! Returns { 1, 1, 2, 3 }
Why isn't 3 skipped?
My next thought was, OK, the Except operator will do the trick:
var sequence = new [] { 1, 1, 2, 3 };
var result = sequence.Except(new[] { 3 }); // Oh noes! Returns { 1, 2 }
In summary:
Except removes the 3, but also removes non-distinct elements. Grr.
SkipWhile doesn't skip the last element, even if it matches the condition. Grr.
Can someone explain why SkipWhile doesn't skip the last element? And can anyone suggest what LINQ operator I can use to remove the '3' from the sequence above?
It's not broken. SkipWhile will only skip items in the beginning of the IEnumerable<T>. Once that condition isn't met it will happily take the rest of the elements. Other elements that later match it down the road won't be skipped.
int[] sequence = { 3, 3, 1, 1, 2, 3 };
var result = sequence.SkipWhile(i => i == 3);
// Result: 1, 1, 2, 3
// To remove every 3, filter with Where instead:
var filtered = sequence.Where(i => i != 3);
// Result: 1, 1, 2
The SkipWhile and TakeWhile operators skip or return elements from a sequence while a predicate function passes (returns true). The first element that doesn’t pass the predicate function ends the evaluation.
SkipWhile: bypasses elements in a sequence as long as a specified condition is true and then returns the remaining elements.
One solution you may find useful is the List<T>.FindAll method.
List<int> aggregator = new List<int> { 1, 2, 3, 3, 3, 4 };
List<int> result = aggregator.FindAll(b => b != 3);
Ahmad already answered your question, but here's another option:
var result = from i in sequence where i != 3 select i;
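The three operators discussed in this thread can be compared side by side on one input. This is a small sketch; the class and the WithoutValue helper are hypothetical names used only for the comparison.

```csharp
using System;
using System.Linq;

public static class RemoveValueDemo
{
    // Hypothetical helper: removes every occurrence of 'value' and keeps duplicates of the rest.
    public static int[] WithoutValue(int[] sequence, int value) =>
        sequence.Where(i => i != value).ToArray();

    public static void Main()
    {
        var sequence = new[] { 3, 3, 1, 1, 2, 3 };

        // SkipWhile only drops the leading run of 3s.
        Console.WriteLine(string.Join(", ", sequence.SkipWhile(i => i == 3))); // 1, 1, 2, 3

        // Except is a set operation, so it also drops the duplicate 1.
        Console.WriteLine(string.Join(", ", sequence.Except(new[] { 3 })));    // 1, 2

        // Where filters every element independently.
        Console.WriteLine(string.Join(", ", WithoutValue(sequence, 3)));       // 1, 1, 2
    }
}
```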
