Conditionally sum values in a list of tuples - c#

Hello what I'm trying to do is hopefully simple, I have a List<tuple>(string, decimal) and I'm trying to grab the sum of the decimal values if there are two similar strings.
My list contains the values:
("q", .5)
("w", 1.5)
("e", .7)
("r", .8)
("q", .5)
The sum of all the values would therefore be .5 + 1.5 + .7 + .8 + .5 = 4.0.
I assume the algorithm would be something like this:
newlist.select(item1 , item2).where(get the sum of first "q" up to the next "q" in list)
As for existing code, I don't have much only the declaration of the list and it's values.
**I want the sum between 'q' and 'q' + the values of 'q' and 'q', not just in-between, the answer should be '4' and not 3, I want everything between q and q including q's values, thank you.

You could use simple Linq extensions, SkipWhile and TakeWhile
List<Tuple<string,double>> items = new List<Tuple<string,double>>()
{
new Tuple<string,double>("q", .5),
new Tuple<string,double>("w", 1.5),
new Tuple<string,double>("e", .7),
new Tuple<string,double>("r", .8),
new Tuple<string,double>("q", .5)
};
var sumvalue = items.Sum(c=>c.Item2); // Calculates sum of all values
var betweensum = items.SkipWhile(x=>x.Item1 == "q") // Skip until matching item1
.TakeWhile(x=>x.Item1 != "q") // take until matching item1
.Sum(x=>x.Item2); // Sum
As asked in the comments, in case if you have multiple such sets and you want count in between those matching strings for multiple sets, do this.
int gid = 0;
items.Select(c => new { Tuple = c, gid = c.Item1=="q"? ++gid : gid })
.GroupBy(x=>x.gid)
.Where(x=>x.Key%2==1)
.SelectMany(x=>x.Skip(1))
.Sum(x=>x.Tuple.Item2);
Working Demo

You can add a new IEnumerable extension to get all values between two conditions. Note that if there is no second occurrence of the condition an empty list is returned, this can be edited based on your requirements:
public static class Extensions
{
public static IEnumerable<T> TakeBetween<T>(this IEnumerable<T> e, Func<T, bool> f)
{
bool first = false;
List<T> ret = new List<T>();
foreach (var item in e)
{
if(f(item) && !first)
{
first = true;
ret.Add(item);
continue;
}
if(first)
{
ret.Add(item);
if(f(item))
return ret;
}
}
return new List<T>();
}
}
Then it is just a matter of using this to get the sum:
var sum = list.TakeBetween(x => x.Item1 == "q").Sum(x => x.Item2);

Related

How to get every possible combination base on the ranges in brackets?

Looking for the best way to take something like 1[a-C]3[1-6]07[R,E-G] and have it output a log that would look like the following — basically every possible combination base on the ranges in brackets.
1a3107R
1a3107E
1a3107F
1a3107G
1b3107R
1b3107E
1b3107F
1b3107G
1c3107R
1c3107E
1c3107F
1c3107G
all the way to 1C3607G.
Sorry for not being more technical about what I looking for, just not sure on the correct terms to explain.
Normally what we'd do to get all combinations is to put all our ranges into arrays, then use nested loops to loop through each array, and create a new item in the inner loop that gets added to our results.
But in order to do that here, we'd first need to write a method that can parse your range string and return a list of char values defined by the range. I've written a rudimentary one here, which works with your sample input but should have some validation added to ensure the input string is in the proper format:
public static List<char> GetRange(string input)
{
input = input.Replace("[", "").Replace("]", "");
var parts = input.Split(',');
var range = new List<char>();
foreach (var part in parts)
{
var ends = part.Split('-');
if (ends.Length == 1)
{
range.Add(ends[0][0]);
}
else if (char.IsDigit(ends[0][0]))
{
var start = Convert.ToInt32(ends[0][0]);
var end = Convert.ToInt32(ends[1][0]);
var count = end - start + 1;
range.AddRange(Enumerable.Range(start, count).Select(c => (char) c));
}
else
{
var start = (int) ends[0][0];
var last = (int) ends[1][0];
var end = last < start ? 'z' : last;
range.AddRange(Enumerable.Range(start, end - start + 1)
.Select(c => (char) c));
if (last < start)
{
range.AddRange(Enumerable.Range('A', last - 'A' + 1)
.Select(c => (char) c));
}
}
}
return range;
}
Now that we can get a range of values from a string like "[a-C]", we need a way to create nested loops for each range, and to build our list of values based on the input string.
One way to do this is to replace our input string with one that contains placeholders for each range, and then we can create a loop for each range, and on each iteration we can replace the placeholder for that range with a character from the range.
So we'll take an input like this: "1[a-C]3[1-6]07[R,E-G]", and turn it into this: "1{0}3{1}07{2}". Now we can create loops where we take the characters from the first range and create a new string for each one of them, replacing the {0} with the character. Then, for each one of those strings, we iterate over the second range and create a new string that replaces the {1} placeholder with a character from the second range, and so on and so on until we've created new strings for every possible combination.
public static List<string> GetCombinatins(string input)
{
// Sample input = "1[a-C]3[1-6]07[R,E-G]"
var inputWithPlaceholders = string.Empty; // This will become "1{0}3{1}07{2}"
var placeholder = 0;
var ranges = new List<List<char>>();
for (int i = 0; i < input.Length; i++)
{
// We've found a range start, so replace this with our
// placeholder '{n}' and add the range to our list of ranges
if (input[i] == '[')
{
inputWithPlaceholders += $"{{{placeholder++}}}";
var rangeEndIndex = input.IndexOf("]", i);
ranges.Add(GetRange(input.Substring(i, rangeEndIndex - i)));
i = rangeEndIndex;
}
else
{
inputWithPlaceholders += input[i];
}
}
if (ranges.Count == 0) return new List<string> {input};
// Add strings for the first range
var values = ranges.First().Select(chr =>
inputWithPlaceholders.Replace("{0}", chr.ToString())).ToList();
// Then continually add all combinations of other ranges
for (int i = 1; i < ranges.Count; i++)
{
values = values.SelectMany(value =>
ranges[i].Select(chr =>
value.Replace($"{{{i}}}", chr.ToString()))).ToList();
}
return values;
}
Now with these methods out of the way, we can create output of all our ranges quite easily:
static void Main()
{
Console.WriteLine(string.Join(", ", GetCombinatins("1[a-C]3[1-6]07[R,E-G]")));
GetKeyFromUser("\nPress any key to exit...");
}
Output
I would approach this problem in three stages. The first stage is to transform the source string to an IEnumerable of IEnumerable<string>.
static IEnumerable<IEnumerable<string>> ParseSourceToEnumerables(string source);
For example the source "1[A-C]3[1-6]07[R,E-G]" should be transformed to the 6 enumerables below:
"1"
"A", "B", "C"
"3"
"1", "2", "3", "4", "5", "6"
"07"
"R", "E", "F", "G"
Each literal inside the source has been transformed to an IEnumerable<string> containing a single string.
The second stage would be to create the Cartesian product of these enumerables.
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
IEnumerable<IEnumerable<T>> sequences)
The final (and easiest) stage would be to concatenate each one of the inner IEnumerable<string> of the Cartesian product to a single string. For example
the sequence "1", "A", "3", "1", "07", "R" to the string "1A3107R"
The hardest stage is the first one, because it involves parsing. Below is a partial implementation:
static IEnumerable<IEnumerable<string>> ParseSourceToEnumerables(string source)
{
var matches = Regex.Matches(source, #"\[(.*?)\]", RegexOptions.Singleline);
int previousIndex = 0;
foreach (Match match in matches)
{
var previousLiteral = source.Substring(
previousIndex, match.Index - previousIndex);
if (previousLiteral.Length > 0)
yield return Enumerable.Repeat(previousLiteral, 1);
yield return SinglePatternToEnumerable(match.Groups[1].Value);
previousIndex = match.Index + match.Length;
}
var lastLiteral = source.Substring(previousIndex, source.Length - previousIndex);
if (lastLiteral.Length > 0) yield return Enumerable.Repeat(lastLiteral, 1);
}
static IEnumerable<string> SinglePatternToEnumerable(string pattern)
{
// TODO
// Should transform the pattern "X,A-C,YZ"
// to the sequence ["X", "A", "B", "C", "YZ"]
}
The second stage is hard too, but solved. I just grabbed the implementation from Eric Lippert's blog.
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
accumulator.SelectMany(_ => sequence,
(accseq, item) => accseq.Append(item)) // .NET Framework 4.7.1
);
}
The final stage is just a call to String.Join.
var source = "1[A-C]3[1-6]07[R,E-G]";
var enumerables = ParseSourceToEnumerables(source);
var combinations = CartesianProduct(enumerables);
foreach (var combination in combinations)
{
Console.WriteLine($"Combination: {String.Join("", combination)}");
}

How to find the placement of a List within another List?

I am working with two lists. The first contains a large sequence of strings. The second contains a smaller list of strings. I need to find where the second list exists in the first list.
I worked with enumeration, and due to the large size of the data, this is very slow, I was hoping for a faster way.
List<string> first = new List<string>() { "AAA","BBB","CCC","DDD","EEE","FFF" };
List<string> second = new List<string>() { "CCC","DDD","EEE" };
int x = SomeMagic(first,second);
And I would need x to = 2.
Ok, here is my variant with old-good-for-each-loop:
private int SomeMagic(IEnumerable<string> source, IEnumerable<string> target)
{
/* Some obvious checks for `source` and `target` lenght / nullity are ommited */
// searched pattern
var pattern = target.ToArray();
// candidates in form `candidate index` -> `checked length`
var candidates = new Dictionary<int, int>();
// iteration index
var index = 0;
// so, lets the magic begin
foreach (var value in source)
{
// check candidates
foreach (var candidate in candidates.Keys.ToArray()) // <- we are going to change this collection
{
var checkedLength = candidates[candidate];
if (value == pattern[checkedLength]) // <- here `checkedLength` is used in sense `nextPositionToCheck`
{
// candidate has match next value
checkedLength += 1;
// check if we are done here
if (checkedLength == pattern.Length) return candidate; // <- exit point
candidates[candidate] = checkedLength;
}
else
// candidate has failed
candidates.Remove(candidate);
}
// check for new candidate
if (value == pattern[0])
candidates.Add(index, 1);
index++;
}
// we did everything we could
return -1;
}
We use dictionary of candidates to handle situations like:
var first = new List<string> { "AAA","BBB","CCC","CCC","CCC","CCC","EEE","FFF" };
var second = new List<string> { "CCC","CCC","CCC","EEE" };
If you are willing to use MoreLinq then consider using Window:
var windows = first.Window(second.Count);
var result = windows
.Select((subset, index) => new { subset, index = (int?)index })
.Where(z => Enumerable.SequenceEqual(second, z.subset))
.Select(z => z.index)
.FirstOrDefault();
Console.WriteLine(result);
Console.ReadLine();
Window will allow you to look at 'slices' of the data in chunks (based on the length of your second list). Then SequenceEqual can be used to see if the slice is equal to second. If it is, the index can be returned. If it doesn't find a match, null will be returned.
Implemented SomeMagic method as below, this will return -1 if no match found, else it will return the index of start element in first list.
private int SomeMagic(List<string> first, List<string> second)
{
if (first.Count < second.Count)
{
return -1;
}
for (int i = 0; i <= first.Count - second.Count; i++)
{
List<string> partialFirst = first.GetRange(i, second.Count);
if (Enumerable.SequenceEqual(partialFirst, second))
return i;
}
return -1;
}
you can use intersect extension method using the namepace System.Linq
var CommonList = Listfirst.Intersect(Listsecond)

C# array for finding all the 2's

Basically what I have to do is find a certain number, which in this case is 2, and see how many times I have that number in my program, I assumed that I would have to use a .GetValue(42) but it's not doing it right, the code I am using is
static int count2(int[] input)
{
return input.GetValue(2);
}
input is from a separate method, but it contains the values that I'm working with which is
int [] input = {1,2,3,4,5};
Not sure if you count specifically the number 2, or any number that contains the number 2.
For the later here's the easy way:
public int count2(int[] input) {
int counter = 0;
foreach(var i in input) {
if (i.ToString().Contains("2"))
{
++counter;
}
}
return counter;
}
You can do it with LINQ
input.Count(x=>x==2);
Array.GetValue() "gets the value at the specified position in the one-dimensional Array" which is not what you want. (in your example it will return 3 because that's the value at index 2 of your array).
You want to count the number of times a specific item is in the array. That's a matter of looping and checking each item:
var counter = 0;
foreach(var item in input)
{
if(item == 2)
{
counter++;
}
}
return counter;
to get a count do this
int [] inputDupes = {1,2,3,4,5,2};
var duplicates = inputDupes
.Select(w => inputDupes.Contains(2))
.GroupBy(q => q)
.Where(gb => gb.Count() > 1)
.Select(gb => gb.Key).Count();//returns an Int32 value
to see if there are duplicates of the number 2 then do the following
int [] inputDupes = {1,2,3,4,5,2};
var duplicates = inputDupes
.Select(w => inputDupes.Contains(2))
.GroupBy(q => q)
.Where(gb => gb.Count() > 1)
.Select(gb => gb.Key)
.ToList(); //returns true | false
if you want to do this based on any number then create a method and pass a param in where .Contains() extension method is being called
if you want to capture user input from Console you can do it this way as well
int [] inputDupes = {1,2,3,4,5,2};
Console.WriteLine("Enter a number to check for duplicates: ");
string input = Console.ReadLine();
int number;
Int32.TryParse(input, out number);
var dupeCount = inputDupes.Count(x => x == number);
Console.WriteLine(dupeCount);
Console.Read();
Yields 2 for the duplicate Count
static int count2(int[] input)
{
return input.Count(i => i == 2);
}
You could use a Func like this:
public Func<int[], int, int> GetNumberCount =
(numbers, numberToSearchFor) =>
numbers.Count(num => num.Equals(numberToSearchFor));
...
var count = GetNumberCount(input, 2);
Gotta' love a Func :)

How do I compare items from a list to all others without repetition?

I have a collection of objects (lets call them MyItem) and each MyItem has a method called IsCompatibleWith which returns a boolean saying whether it's compatible with another MyItem.
public class MyItem
{
...
public bool IsCompatibleWith(MyItem other) { ... }
...
}
A.IsCompatibleWith(B) will always be the same as B.IsCompatibleWith(A). If for example I have a collection containing 4 of these, I am trying to find a LINQ query that will run the method on each distinct pair of items in the same collection. So if my collection contains A, B, C and D I wish to do the equivalent of:
A.IsCompatibleWith(B); // A & B
A.IsCompatibleWith(C); // A & C
A.IsCompatibleWith(D); // A & D
B.IsCompatibleWith(C); // B & C
B.IsCompatibleWith(D); // B & D
C.IsCompatibleWith(D); // C & D
The code initially used was:
var result = from item in myItems
from other in myItems
where item != other &&
item.IsCompatibleWith(other)
select item;
but of course this will still do both A & B and B & A (which is not required and not efficient). Also it's probably worth noting that in reality these lists will be a lot bigger than 4 items, hence the desire for an optimal solution.
Hopefully this makes sense... any ideas?
Edit:
One possible query -
MyItem[] items = myItems.ToArray();
bool compatible = (from item in items
from other in items
where
Array.IndexOf(items, item) < Array.IndexOf(items, other) &&
!item.IsCompatibleWith(other)
select item).FirstOrDefault() == null;
Edit2: In the end switched to using the custom solution from LukeH as it was more efficient for bigger lists.
public bool AreAllCompatible()
{
using (var e = myItems.GetEnumerator())
{
var buffer = new List<MyItem>();
while (e.MoveNext())
{
if (buffer.Any(item => !item.IsCompatibleWith(e.Current)))
return false;
buffer.Add(e.Current);
}
}
return true;
}
Edit...
Judging by the "final query" added to your question, you need a method to determine if all the items in the collection are compatible with each other. Here's how to do it reasonably efficiently:
bool compatible = myItems.AreAllItemsCompatible();
// ...
public static bool AreAllItemsCompatible(this IEnumerable<MyItem> source)
{
using (var e = source.GetEnumerator())
{
var buffer = new List<MyItem>();
while (e.MoveNext())
{
foreach (MyItem item in buffer)
{
if (!item.IsCompatibleWith(e.Current))
return false;
}
buffer.Add(e.Current);
}
}
return true;
}
Original Answer...
I don't think there's an efficient way to do this using only the built-in LINQ methods.
It's easy enough to build your own though. Here's an example of the sort of code you'll need. I'm not sure exactly what results you're trying to return so I'm just writing a message to the console for each compatible pair. It should be easy enough to change it to yield the results that you need.
using (var e = myItems.GetEnumerator())
{
var buffer = new List<MyItem>();
while (e.MoveNext())
{
foreach (MyItem item in buffer)
{
if (item.IsCompatibleWith(e.Current))
{
Console.WriteLine(item + " is compatible with " + e.Current);
}
}
buffer.Add(e.Current);
}
}
(Note that although this is reasonably efficient, it does not preserve the original ordering of the collection. Is that an issue in your situation?)
this should do it:
var result = from item in myItems
from other in myItems
where item != other &&
myItems.indexOf(item) < myItems.indexOf(other) &&
item.IsCompatibleWith(other)
select item;
But i dont know if it makes it faster, because in the query has to check the indices of the rows each row.
Edit:
if you have an index in myItem you should use that one instead of indexOf. And you can remove the "item != other" from the where clause, little bit redundant now
Here's an idea:
Implement IComparable so that your MyItem becomes sortable, then run this linq-query:
var result = from item in myItems
from other in myItems
where item.CompareTo(other) < 0 &&
item.IsCompatibleWith(other)
select item;
If your MyItem collection is small enough, you can storage the results of item.IsCompatibleWith(otherItem) in a boolean array:
var itemCount = myItems.Count();
var compatibilityTable = new bool[itemCount, itemCount];
var itemsToCompare = new List<MyItem>();
var i = 0;
var j = 0;
foreach (var item in myItems)
{
j = 0;
foreach (var other in itemsToCompare)
{
compatibilityTable[i,j] = item.IsCompatibleWith(other);
compatibilityTable[j,i] = compatibilityTable[i,j];
j++;
}
itemsToCompare.Add(item);
i++;
}
var result = myItems.Where((item, i) =>
{
var compatible = true;
var j = 0;
while (compatible && j < itemCount)
{
compatible = compatibilityTable[i,j];
}
j++;
return compatible;
}
So, we have
IEnumerable<MyItem> MyItems;
To get all the combinations we could use a function like this.
//returns all the k sized combinations from a list
public static IEnumerable<IEnumerable<T>> Combinations<T>(IEnumerable<T> list,
int k)
{
if (k == 0) return new[] {new T[0]};
return list.SelectMany((l, i) =>
Combinations(list.Skip(i + 1), k - 1).Select(c => (new[] {l}).Concat(c))
);
}
We can then apply this function to our problem like this.
var combinations = Combinations(MyItems, 2).Select(c => c.ToList<MyItem>());
var result = combinations.Where(c => c[0].IsCompatibleWith(c[1]))
This will perform IsCompatableWith on all the combinations without repetition.
You could of course perform the the checking inside the Combinations functions. For further work you could make the Combinations function into an extention that takes a delegate with a variable number of parameters for several lengths of k.
EDIT: As I suggested above, if you wrote these extension method
public static class Extenesions
{
IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> list, int k)
{
if (k == 0) return new[] { new T[0] };
return list.SelectMany((l, i) =>
list.Skip(i + 1).Combinations<T>(k - 1)
.Select(c => (new[] { l }).Concat(c)));
}
IEnumerable<Tuple<T, T>> Combinations<T> (this IEnumerable<T> list,
Func<T, T, bool> filter)
{
return list.Combinations(2).Where(c =>
filter(c.First(), c.Last())).Select(c =>
Tuple.Create<T, T>(c.First(), c.Last()));
}
}
Then in your code you could do the rather more elegant (IMO)
var compatibleTuples = myItems.Combinations(a, b) => a.IsCompatibleWith(b)))
then get at the compatible items with
foreach(var t in compatibleTuples)
{
t.Item1 // or T.item2
}

Detecting sequence of at least 3 sequential numbers from a given list

I have a list of numbers e.g. 21,4,7,9,12,22,17,8,2,20,23
I want to be able to pick out sequences of sequential numbers (minimum 3 items in length), so from the example above it would be 7,8,9 and 20,21,22,23.
I have played around with a few ugly sprawling functions but I am wondering if there is a neat LINQ-ish way to do it.
Any suggestions?
UPDATE:
Many thanks for all the responses, much appriciated. Im am currently having a play with them all to see which would best integrate into our project.
It strikes me that the first thing you should do is order the list. Then it's just a matter of walking through it, remembering the length of your current sequence and detecting when it's ended. To be honest, I suspect that a simple foreach loop is going to be the simplest way of doing that - I can't immediately think of any wonderfully neat LINQ-like ways of doing it. You could certainly do it in an iterator block if you really wanted to, but bear in mind that ordering the list to start with means you've got a reasonably "up-front" cost anyway. So my solution would look something like this:
var ordered = list.OrderBy(x => x);
int count = 0;
int firstItem = 0; // Irrelevant to start with
foreach (int x in ordered)
{
// First value in the ordered list: start of a sequence
if (count == 0)
{
firstItem = x;
count = 1;
}
// Skip duplicate values
else if (x == firstItem + count - 1)
{
// No need to do anything
}
// New value contributes to sequence
else if (x == firstItem + count)
{
count++;
}
// End of one sequence, start of another
else
{
if (count >= 3)
{
Console.WriteLine("Found sequence of length {0} starting at {1}",
count, firstItem);
}
count = 1;
firstItem = x;
}
}
if (count >= 3)
{
Console.WriteLine("Found sequence of length {0} starting at {1}",
count, firstItem);
}
EDIT: Okay, I've just thought of a rather more LINQ-ish way of doing things. I don't have the time to fully implement it now, but:
Order the sequence
Use something like SelectWithPrevious (probably better named SelectConsecutive) to get consecutive pairs of elements
Use the overload of Select which includes the index to get tuples of (index, current, previous)
Filter out any items where (current = previous + 1) to get anywhere that counts as the start of a sequence (special-case index=0)
Use SelectWithPrevious on the result to get the length of the sequence between two starting points (subtract one index from the previous)
Filter out any sequence with length less than 3
I suspect you need to concat int.MinValue on the ordered sequence, to guarantee the final item is used properly.
EDIT: Okay, I've implemented this. It's about the LINQiest way I can think of to do this... I used null values as "sentinel" values to force start and end sequences - see comments for more details.
Overall, I wouldn't recommend this solution. It's hard to get your head round, and although I'm reasonably confident it's correct, it took me a while thinking of possible off-by-one errors etc. It's an interesting voyage into what you can do with LINQ... and also what you probably shouldn't.
Oh, and note that I've pushed the "minimum length of 3" part up to the caller - when you have a sequence of tuples like this, it's cleaner to filter it out separately, IMO.
using System;
using System.Collections.Generic;
using System.Linq;
static class Extensions
{
public static IEnumerable<TResult> SelectConsecutive<TSource, TResult>
(this IEnumerable<TSource> source,
Func<TSource, TSource, TResult> selector)
{
using (IEnumerator<TSource> iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
TSource prev = iterator.Current;
while (iterator.MoveNext())
{
TSource current = iterator.Current;
yield return selector(prev, current);
prev = current;
}
}
}
}
class Test
{
static void Main()
{
var list = new List<int> { 21,4,7,9,12,22,17,8,2,20,23 };
foreach (var sequence in FindSequences(list).Where(x => x.Item1 >= 3))
{
Console.WriteLine("Found sequence of length {0} starting at {1}",
sequence.Item1, sequence.Item2);
}
}
private static readonly int?[] End = { null };
// Each tuple in the returned sequence is (length, first element)
public static IEnumerable<Tuple<int, int>> FindSequences
(IEnumerable<int> input)
{
// Use null values at the start and end of the ordered sequence
// so that the first pair always starts a new sequence starting
// with the lowest actual element, and the final pair always
// starts a new one starting with null. That "sequence at the end"
// is used to compute the length of the *real* final element.
return End.Concat(input.OrderBy(x => x)
.Select(x => (int?) x))
.Concat(End)
// Work out consecutive pairs of items
.SelectConsecutive((x, y) => Tuple.Create(x, y))
// Remove duplicates
.Where(z => z.Item1 != z.Item2)
// Keep the index so we can tell sequence length
.Select((z, index) => new { z, index })
// Find sequence starting points
.Where(both => both.z.Item2 != both.z.Item1 + 1)
.SelectConsecutive((start1, start2) =>
Tuple.Create(start2.index - start1.index,
start1.z.Item2.Value));
}
}
Jon Skeet's / Timwi's solutions are the way to go.
For fun, here's a LINQ query that does the job (very inefficiently):
var sequences = input.Distinct()
.GroupBy(num => Enumerable.Range(num, int.MaxValue - num + 1)
.TakeWhile(input.Contains)
.Last()) //use the last member of the consecutive sequence as the key
.Where(seq => seq.Count() >= 3)
.Select(seq => seq.OrderBy(num => num)); // not necessary unless ordering is desirable inside each sequence.
The query's performance can be improved slightly by loading the input into a HashSet (to improve Contains), but that will still not produce a solution that is anywhere close to efficient.
The only bug I am aware of is the possibility of an arithmetic overflow if the sequence contains negative numbers of large magnitude (we cannot represent the count parameter for Range). This would be easy to fix with a custom static IEnumerable<int> To(this int start, int end) extension-method. If anyone can think of any other simple technique of dodging the overflow, please let me know.
EDIT:
Here's a slightly more verbose (but equally inefficient) variant without the overflow issue.
var sequences = input.GroupBy(num => input.Where(candidate => candidate >= num)
.OrderBy(candidate => candidate)
.TakeWhile((candidate, index) => candidate == num + index)
.Last())
.Where(seq => seq.Count() >= 3)
.Select(seq => seq.OrderBy(num => num));
I think my solution is more elegant and simple, and therefore easier to verify as correct:
/// <summary>Returns a collection containing all consecutive sequences of
/// integers in the input collection.</summary>
/// <param name="input">The collection of integers in which to find
/// consecutive sequences.</param>
/// <param name="minLength">Minimum length that a sequence should have
/// to be returned.</param>
static IEnumerable<IEnumerable<int>> ConsecutiveSequences(
IEnumerable<int> input, int minLength = 1)
{
var results = new List<List<int>>();
foreach (var i in input.OrderBy(x => x))
{
var existing = results.FirstOrDefault(lst => lst.Last() + 1 == i);
if (existing == null)
results.Add(new List<int> { i });
else
existing.Add(i);
}
return minLength <= 1 ? results :
results.Where(lst => lst.Count >= minLength);
}
Benefits over the other solutions:
It can find sequences that overlap.
It’s properly reusable and documented.
I have not found any bugs ;-)
Here is how to solve the problem in a "LINQish" way:
int[] arr = new int[]{ 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
IOrderedEnumerable<int> sorted = arr.OrderBy(x => x);
int cnt = sorted.Count();
int[] sortedArr = sorted.ToArray();
IEnumerable<int> selected = sortedArr.Where((x, idx) =>
idx <= cnt - 3 && sortedArr[idx + 1] == x + 1 && sortedArr[idx + 2] == x + 2);
IEnumerable<int> result = selected.SelectMany(x => new int[] { x, x + 1, x + 2 }).Distinct();
Console.WriteLine(string.Join(",", result.Select(x=>x.ToString()).ToArray()));
Due to the array copying and reconstruction, this solution - of course - is not as efficient as the traditional solution with loops.
Not 100% Linq but here's a generic variant:
static IEnumerable<IEnumerable<TItem>> GetSequences<TItem>(
int minSequenceLength,
Func<TItem, TItem, bool> areSequential,
IEnumerable<TItem> items)
where TItem : IComparable<TItem>
{
items = items
.OrderBy(n => n)
.Distinct().ToArray();
var lastSelected = default(TItem);
var sequences =
from startItem in items
where startItem.Equals(items.First())
|| startItem.CompareTo(lastSelected) > 0
let sequence =
from item in items
where item.Equals(startItem) || areSequential(lastSelected, item)
select (lastSelected = item)
where sequence.Count() >= minSequenceLength
select sequence;
return sequences;
}
static void UsageInt()
{
var sequences = GetSequences(
3,
(a, b) => a + 1 == b,
new[] { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 });
foreach (var sequence in sequences)
Console.WriteLine(string.Join(", ", sequence.ToArray()));
}
static void UsageChar()
{
var list = new List<char>(
"abcdefghijklmnopqrstuvwxyz".ToCharArray());
var sequences = GetSequences(
3,
(a, b) => (list.IndexOf(a) + 1 == list.IndexOf(b)),
"PleaseBeGentleWithMe".ToLower().ToCharArray());
foreach (var sequence in sequences)
Console.WriteLine(string.Join(", ", sequence.ToArray()));
}
Here's my shot at it:
public static class SequenceDetector
{
public static IEnumerable<IEnumerable<T>> DetectSequenceWhere<T>(this IEnumerable<T> sequence, Func<T, T, bool> inSequenceSelector)
{
List<T> subsequence = null;
// We can only have a sequence with 2 or more items
T last = sequence.FirstOrDefault();
foreach (var item in sequence.Skip(1))
{
if (inSequenceSelector(last, item))
{
// These form part of a sequence
if (subsequence == null)
{
subsequence = new List<T>();
subsequence.Add(last);
}
subsequence.Add(item);
}
else if (subsequence != null)
{
// We have a previous seq to return
yield return subsequence;
subsequence = null;
}
last = item;
}
if (subsequence != null)
{
// Return any trailing seq
yield return subsequence;
}
}
}
public class test
{
public static void run()
{
var list = new List<int> { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
foreach (var subsequence in list
.OrderBy(i => i)
.Distinct()
.DetectSequenceWhere((first, second) => first + 1 == second)
.Where(seq => seq.Count() >= 3))
{
Console.WriteLine("Found subsequence {0}",
string.Join(", ", subsequence.Select(i => i.ToString()).ToArray()));
}
}
}
This returns the specific items that form the sub-sequences and permits any type of item and any definition of criteria so long as it can be determined by comparing adjacent items.
What about sorting the array then create another array that is the difference between each element the previous one
sortedArray = 8, 9, 10, 21, 22, 23, 24, 27, 30, 31, 32
diffArray = 1, 1, 11, 1, 1, 1, 3, 3, 1, 1
Now iterate through the difference array; if the difference equlas 1, increase the count of a variable, sequenceLength, by 1. If the difference is > 1, check the sequenceLength if it is >=2 then you have a sequence of at at least 3 consecutive elements. Then reset sequenceLenght to 0 and continue your loop on the difference array.
Here is a solution I knocked up in F#, it should be fairly easy to translate this into a C# LINQ query since fold is pretty much equivalent to the LINQ aggregate operator.
#light
let nums = [21;4;7;9;12;22;17;8;2;20;23]
let scanFunc (mainSeqLength, mainCounter, lastNum:int, subSequenceCounter:int, subSequence:'a list, foundSequences:'a list list) (num:'a) =
(mainSeqLength, mainCounter + 1,
num,
(if num <> lastNum + 1 then 1 else subSequenceCounter+1),
(if num <> lastNum + 1 then [num] else subSequence#[num]),
if subSequenceCounter >= 3 then
if mainSeqLength = mainCounter+1
then foundSequences # [subSequence#[num]]
elif num <> lastNum + 1
then foundSequences # [subSequence]
else foundSequences
else foundSequences)
let subSequences = nums |> Seq.sort |> Seq.fold scanFunc (nums |> Seq.length, 0, 0, 0, [], []) |> fun (_,_,_,_,_,results) -> results
Linq isn't the solution for everything, sometimes you're better of with a simple loop. Here's a solution, with just a bit of Linq to order the original sequences and filter the results
void Main()
{
var numbers = new[] { 21,4,7,9,12,22,17,8,2,20,23 };
var sequences =
GetSequences(numbers, (prev, curr) => curr == prev + 1);
.Where(s => s.Count() >= 3);
sequences.Dump();
}
public static IEnumerable<IEnumerable<T>> GetSequences<T>(
IEnumerable<T> source,
Func<T, T, bool> areConsecutive)
{
bool first = true;
T prev = default(T);
List<T> seq = new List<T>();
foreach (var i in source.OrderBy(i => i))
{
if (!first && !areConsecutive(prev, i))
{
yield return seq.ToArray();
seq.Clear();
}
first = false;
seq.Add(i);
prev = i;
}
if (seq.Any())
yield return seq.ToArray();
}
I thought of the same thing as Jon: to represent a range of consecutive integers all you really need are two measly integers! So I'd start there:
struct Range : IEnumerable<int>
{
readonly int _start;
readonly int _count;
public Range(int start, int count)
{
_start = start;
_count = count;
}
public int Start
{
get { return _start; }
}
public int Count
{
get { return _count; }
}
public int End
{
get { return _start + _count - 1; }
}
public IEnumerator<int> GetEnumerator()
{
for (int i = 0; i < _count; ++i)
{
yield return _start + i;
}
}
// Heck, why not?
public static Range operator +(Range x, int y)
{
return new Range(x.Start, x.Count + y);
}
// skipping the explicit IEnumerable.GetEnumerator implementation
}
From there, you can write a static method to return a bunch of these Range values corresponding to the consecutive numbers of your sequence.
static IEnumerable<Range> FindRanges(IEnumerable<int> source, int minCount)
{
// throw exceptions on invalid arguments, maybe...
var ordered = source.OrderBy(x => x);
Range r = default(Range);
foreach (int value in ordered)
{
// In "real" code I would've overridden the Equals method
// and overloaded the == operator to write something like
// if (r == Range.Empty) here... but this works well enough
// for now, since the only time r.Count will be 0 is on the
// first item.
if (r.Count == 0)
{
r = new Range(value, 1);
continue;
}
if (value == r.End)
{
// skip duplicates
continue;
}
else if (value == r.End + 1)
{
// "append" consecutive values to the range
r += 1;
}
else
{
// return what we've got so far
if (r.Count >= minCount)
{
yield return r;
}
// start over
r = new Range(value, 1);
}
}
// return whatever we ended up with
if (r.Count >= minCount)
{
yield return r;
}
}
Demo:
int[] numbers = new[] { 21, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
foreach (Range r in FindConsecutiveRanges(numbers, 3))
{
// Using .NET 3.5 here, don't have the much nicer string.Join overloads.
Console.WriteLine(string.Join(", ", r.Select(x => x.ToString()).ToArray()));
}
Output:
7, 8, 9
20, 21, 22, 23
Here's my LINQ-y take on the problem:
static IEnumerable<IEnumerable<int>>
ConsecutiveSequences(this IEnumerable<int> input, int minLength = 3)
{
int order = 0;
var inorder = new SortedSet<int>(input);
return from item in new[] { new { order = 0, val = inorder.First() } }
.Concat(
inorder.Zip(inorder.Skip(1), (x, val) =>
new { order = x + 1 == val ? order : ++order, val }))
group item.val by item.order into list
where list.Count() >= minLength
select list;
}
uses no explicit loops, but should still be O(n lg n)
uses SortedSet instead of .OrderBy().Distinct()
combines consecutive element with list.Zip(list.Skip(1))
Here's a solution using a Dictionary instead of a sort...
It adds the items to a Dictionary, and then for each value increments above and below to find the longest sequence.
It is not strictly LINQ, though it does make use of some LINQ functions, and I think it is more readable than a pure LINQ solution..
static void Main(string[] args)
{
var items = new[] { -1, 0, 1, 21, -2, 4, 7, 9, 12, 22, 17, 8, 2, 20, 23 };
IEnumerable<IEnumerable<int>> sequences = FindSequences(items, 3);
foreach (var sequence in sequences)
{ //print results to consol
Console.Out.WriteLine(sequence.Select(num => num.ToString()).Aggregate((a, b) => a + "," + b));
}
Console.ReadLine();
}
private static IEnumerable<IEnumerable<int>> FindSequences(IEnumerable<int> items, int minSequenceLength)
{
//Convert item list to dictionary
var itemDict = new Dictionary<int, int>();
foreach (int val in items)
{
itemDict[val] = val;
}
var allSequences = new List<List<int>>();
//for each val in items, find longest sequence including that value
foreach (var item in items)
{
var sequence = FindLongestSequenceIncludingValue(itemDict, item);
allSequences.Add(sequence);
//remove items from dict to prevent duplicate sequences
sequence.ForEach(i => itemDict.Remove(i));
}
//return only sequences longer than 3
return allSequences.Where(sequence => sequence.Count >= minSequenceLength).ToList();
}
//Find sequence around start param value
private static List<int> FindLongestSequenceIncludingValue(Dictionary<int, int> itemDict, int value)
{
var result = new List<int>();
//check if num exists in dictionary
if (!itemDict.ContainsKey(value))
return result;
//initialize sequence list
result.Add(value);
//find values greater than starting value
//and add to end of sequence
var indexUp = value + 1;
while (itemDict.ContainsKey(indexUp))
{
result.Add(itemDict[indexUp]);
indexUp++;
}
//find values lower than starting value
//and add to start of sequence
var indexDown = value - 1;
while (itemDict.ContainsKey(indexDown))
{
result.Insert(0, itemDict[indexDown]);
indexDown--;
}
return result;
}

Categories