Using LINQ to partition data into arrays - C#

I have an array of elements where the element has a Flagged boolean value.
1 flagged
2 not flagged
3 not flagged
4 flagged
5 not flagged
6 not flagged
7 not flagged
8 flagged
9 not flagged
I want to break it into arrays based on the Flagged indicator, starting a new array at each flagged element.
Expected output:
array 1 {1,2,3}
array 2 {4,5,6,7}
array 3 {8,9}

LINQ doesn't have an operator for this, but I've written an extension method that you may be able to use (I'm in the process of submitting it to MoreLinq, which you should also check out).
Using the operator below, you would write:
var result =
items.Segment( (item,prevItem,idx) => item.Flagged )
.Select( seq => seq.ToArray() ) // converts each sequence to an array
.ToList();
Here's the code of the extension method:
// note: an extension method must be declared inside a static class
public static IEnumerable<IEnumerable<T>> Segment<T>(this IEnumerable<T> sequence, Func<T, T, int, bool> newSegmentIdentifier)
{
var index = -1;
using (var iter = sequence.GetEnumerator())
{
var segment = new List<T>();
var prevItem = default(T);
// ensure that the first item is always part
// of the first segment. This is an intentional
// behavior. Segmentation always begins with
// the second element in the sequence.
if (iter.MoveNext())
{
++index;
segment.Add(iter.Current);
prevItem = iter.Current;
}
while (iter.MoveNext())
{
++index;
// check if the item represents the start of a new segment
var isNewSegment = newSegmentIdentifier(iter.Current, prevItem, index);
prevItem = iter.Current;
if (!isNewSegment)
{
// if not a new segment, append and continue
segment.Add(iter.Current);
continue;
}
yield return segment; // yield the completed segment
// start a new segment...
segment = new List<T> { iter.Current };
}
// handle the case of the sequence ending before new segment is detected
if (segment.Count > 0)
yield return segment;
}
}

I had a similar problem with this, and solved it using GroupBy and closure.
//sample data
var arrayOfElements = new[] {
new { Id = 1, Flagged = true },
new { Id = 2, Flagged = false },
new { Id = 3, Flagged = false },
new { Id = 4, Flagged = true },
new { Id = 5, Flagged = false },
new { Id = 6, Flagged = false },
new { Id = 7, Flagged = false },
new { Id = 8, Flagged = true },
new { Id = 9, Flagged = false }
};
//this is the closure which will increase each time I see a flagged
int flagCounter = 0;
var query =
arrayOfElements.GroupBy(e =>
{
if (e.Flagged)
flagCounter++;
return flagCounter;
});
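This groups on an int (flagCounter), which is incremented each time a Flagged element is found.
Please note this won't work with AsParallel(). Also, because the query is lazily evaluated and flagCounter is captured, enumerating it a second time continues counting from the previous value and produces different keys; call ToList() on the query if you need to iterate more than once.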
Testing the results:
foreach(var group in query)
{
Console.Write("\r\nGroup: ");
foreach (var element in group)
Console.Write(element.Id);
}
Outputs:
Group: 123
Group: 4567
Group: 89

Considering:
var arrayOfElements = new[] {
new { Id = 1, Flagged = true },
new { Id = 2, Flagged = false },
new { Id = 3, Flagged = false },
new { Id = 4, Flagged = true },
new { Id = 5, Flagged = false },
new { Id = 6, Flagged = false },
new { Id = 7, Flagged = false },
new { Id = 8, Flagged = true },
new { Id = 9, Flagged = false }
};
You can write:
var grouped =
from i in arrayOfElements
where i.Flagged
select
(new[] { i.Id })
.Union(arrayOfElements.Where(i2 => i2.Id > i.Id).TakeWhile(i2 => !i2.Flagged).Select(i2 => i2.Id))
.ToArray();
This works if your elements are ordered by the Id attribute. If they aren't, you'll have to attach a sequence number to each element of your original array, which should be easy to do with LINQ as well.
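If the elements are not ordered by Id, one way to attach a position is the Select overload that also passes the index. The following is only a sketch: the class PositionalSlicing, the method SliceByPosition and the Pos property are illustrative names, and value tuples stand in for the anonymous type used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PositionalSlicing
{
    // Hypothetical helper: builds one array of ids per flagged element,
    // comparing positions instead of relying on Id ordering.
    public static List<int[]> SliceByPosition(IEnumerable<(int Id, bool Flagged)> source)
    {
        // Select((item, pos) => ...) pairs every element with its zero-based position.
        var indexed = source
            .Select((item, pos) => new { item.Id, item.Flagged, Pos = pos })
            .ToList();

        return indexed
            .Where(i => i.Flagged)
            .Select(i => new[] { i.Id }
                .Concat(indexed.Where(i2 => i2.Pos > i.Pos)
                               .TakeWhile(i2 => !i2.Flagged)
                               .Select(i2 => i2.Id))
                .ToArray())
            .ToList();
    }
}
```

As with the query above, elements that come before the first flagged element are not returned, and the cost is quadratic because the list is rescanned for every flagged element.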
Also, a better alternative should be:
// for each flagged element, slice the array,
// starting on the flagged element until the next flagged element
var grouped =
from i in arrayOfElements
where i.Flagged
select
arrayOfElements
.SkipWhile(i2 => i2 != i)
.TakeWhile(i2 => i2 == i || !i2.Flagged)
.Select(i2 => i2.Id)
.ToArray();
Note that both versions use only LINQ operators.

I don't think LINQ is the right tool for this task. What about this:
public static List<List<T>> PartitionData<T>(T[] arr, Func<T, bool> flagSelector){
List<List<T>> output = new List<List<T>>();
List<T> partition = null;
bool first = true;
foreach(T obj in arr){
if(flagSelector(obj) || first){
partition = new List<T>();
output.Add(partition);
first = false;
}
partition.Add(obj);
}
return output;
}
A small example, with the data from Fábio Batista's post:
var arrayOfElements = new[] {
new { Id = 1, Flagged = true },
new { Id = 2, Flagged = false },
new { Id = 3, Flagged = false },
new { Id = 4, Flagged = true },
new { Id = 5, Flagged = false },
new { Id = 6, Flagged = false },
new { Id = 7, Flagged = false },
new { Id = 8, Flagged = true },
new { Id = 9, Flagged = false }
};
var partitioned = PartitionData(arrayOfElements, x => x.Flagged);

I don't think LINQ is suited to this very well. It could be done with Aggregate(), but I think you'd be better off just looping with a foreach, building up the result.
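For comparison, here is roughly what the Aggregate() variant could look like. This is a sketch, not code from any of the answers; the class and method names (AggregatePartition, Partition) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregatePartition
{
    // Folds the sequence into a list of partitions. A new partition is opened
    // when the item is flagged, or when no partition exists yet (first item).
    public static List<List<T>> Partition<T>(IEnumerable<T> source, Func<T, bool> isFlagged)
    {
        return source.Aggregate(
            new List<List<T>>(),
            (partitions, item) =>
            {
                if (partitions.Count == 0 || isFlagged(item))
                    partitions.Add(new List<T>());

                // Append the item to the most recently opened partition.
                partitions[partitions.Count - 1].Add(item);
                return partitions;
            });
    }
}
```

The accumulator is mutated in place, so this is really a foreach loop in disguise, which supports the point that a plain loop reads more clearly here.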

Related

Using Group by with x amount of elements

Here's a list; think of it as rows and columns, where rows run down and columns run sideways. The column count will always be the same for all rows.
var dataValues = new List<List<string>>()
{
//row 1
new List<string>(){"A","12","X","P8" },
//row 2
new List<string>(){"B","13","Y","P7" },
//row 3
new List<string>(){"C","12","Y","P6" },
//row 4
new List<string>(){"A","14","X","P5" },
//....
new List<string>(){"D","15","Z","P4" },
new List<string>(){"A","13","X","P3" },
new List<string>(){"B","14","Y","P2" },
new List<string>(){"C","13","Z","P1" },
};
The user provides a list of indexes to group by.
var userParam= new List<int>() { 0, 2 };
My question is how to dynamically group dataValues by userParam, where userParam contains n indexes. In the example above it will group by the first and third columns. However, the indexes can change, and so can the number of indexes.
For example:
var userParam2 = new List<int>() { 0, 2};
var userParam3 = new List<int>() { 0};
var userParam4 = new List<int>() { 0,1,2};
I know how to group when I know how many indexes there will be (in the case below, two index parameters); however, when the number is dynamic (x indexes) I do not know how to do this:
var result = dataValues.GroupBy(e => new { G1 = e[userParam2 [0]], G2 = e[userParam2 [1]] });
You could use a custom comparer to achieve this:
1 - Declare a GroupByComparer that implements IEqualityComparer:
public class GroupByComparer : IEqualityComparer<List<string>>
{
// instance field: a static field would be shared by every comparer instance
private readonly List<int> _intList;
public GroupByComparer(List<int> intList)
{
_intList = intList;
}
public bool Equals(List<string> x, List<string> y)
{
foreach (int item in _intList)
{
if (x[item] != y[item])
return false;
}
return true;
}
public int GetHashCode(List<string> obj)
{
int hashCode = 0;
foreach (int item in _intList)
{
hashCode ^= obj[item].GetHashCode() + item;
}
return hashCode;
}
}
2 - Call GroupBy with the equality comparer:
var userParam = new List<int>() { 0, 2 };
var result = dataValues.GroupBy(e => e, new GroupByComparer(userParam));
I hope you find this helpful.
I believe I have something, but it looks slow; please let me know if there is a better way of doing this.
var userParams = new List<int>() { 0, 2 };
var dataValues = new List<List<string>>()
{
new List<string>(){"A","12","X","P8" },
new List<string>(){"B","13","Y","P7" },
new List<string>(){"C","12","Y","P6" },
new List<string>(){"A","14","X","P5" },
new List<string>(){"D","15","Z","P4" },
new List<string>(){"A","13","X","P3" },
new List<string>(){"B","14","Y","P2" },
new List<string>(){"C","13","Z","P1" },
};
var result = new List<(List<string> Key, List<List<string>> Values)>();
result.Add((new List<string>(), dataValues));
for (int index = 0; index < userParams.Count; index++)
{
var currentResult = new List<(List<string> Key, List<List<string>> Values)>();
foreach (var item in result)
{
foreach (var newGroup in item.Values.GroupBy(e => e[userParams[index]]))
{
var newKey = item.Key.ToList();
newKey.Add(newGroup.Key);
currentResult.Add((newKey, newGroup.ToList()));
}
}
result = currentResult;
}
foreach(var res in result)
{
Console.WriteLine($"Key: {string.Join(@"\", res.Key)}, Values: {string.Join(" | ", res.Values.Select(e=> string.Join(",",e)))}");
}
final result
Key: A\X, Values: A,12,X,P8 | A,14,X,P5 | A,13,X,P3
Key: B\Y, Values: B,13,Y,P7 | B,14,Y,P2
Key: C\Y, Values: C,12,Y,P6
Key: C\Z, Values: C,13,Z,P1
Key: D\Z, Values: D,15,Z,P4
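Another possibility, offered as a sketch rather than a tested answer, is to project the selected columns into one composite string key and group on that in a single pass. The names DynamicGrouping and GroupByColumns are hypothetical, and the default separator (the ASCII unit separator) is an assumption: it must not occur in the data, otherwise different rows could produce the same key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DynamicGrouping
{
    // Groups rows by the values found at the given column indexes.
    // The composite key is the selected values joined with the separator.
    public static Dictionary<string, List<List<string>>> GroupByColumns(
        IEnumerable<List<string>> rows,
        IReadOnlyList<int> columnIndexes,
        string separator = "\u001F")
    {
        return rows
            .GroupBy(row => string.Join(separator, columnIndexes.Select(i => row[i])))
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

With the sample data and indexes { 0, 2 }, calling DynamicGrouping.GroupByColumns(dataValues, userParams) would give five groups, the same grouping as the nested loop above but without rebuilding the intermediate lists for each index.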

Sort a list based on another list's values

I have two lists: finalAnswers and draftAnswers, where finalAnswers[i] is related to draftAnswers[i] and they both have the same size.
I want to sort finalAnswers so that the elements at positions i where draftAnswers[i].ID != 0 appear at the top, followed by the others.
Assume that finalAnswers is:
finalAnswers[1].Name = "a";
finalAnswers[2].Name = "b";
finalAnswers[3].Name = "c";
And corresponding elements in draftAnswers:
draftAnswers[1].ID = 1;
draftAnswers[2].ID = 0;
draftAnswers[3].ID = 2;
Once sorted, the finalAnswers is:
finalAnswers[1].Name = "a";
finalAnswers[2].Name = "c";
finalAnswers[3].Name = "b";
I tried using the usual OrderBy, but it's not straightforward in this case. Any suggestions are appreciated.
UPDATE:
Class:
class A
{
public int id;
public int tID;
public int cID;
public string Name;
}
Values:
var finalAnswers = new List<A>() { new A() { id = 7, tID = 10, cID = 50, Name="Q1" },
new A() { id = 8, tID = 20, cID = 30, Name="Q2" },
new A() { id = 9, tID = 30, cID = 20, Name="Q3" }
};
var draftAnswers = new List<A>() { new A() { id = 1, tID = 10, cID = 50, Name="Q5" },
new A() { id = 0, tID = 20, cID = 30, Name="Q2" },
new A() { id = 1, tID = 30, cID = 20, Name="Q3" }
};
Sorting:
draftAnswers = draftAnswers.OrderBy(d=>d.id).ToList();
finalAnswers = finalAnswers.OrderBy(b => draftAnswers.FindIndex(a => a.tID == b.tID && a.cID == b.cID)).ToList();
OUTPUT (finalAnswer IDs):
8
7
9
EXPECTED:
7
9
8
The OrderBy is not sorting in ascending order - dotNetFiddle
So, you're sorting your finalAnswers list using the following line [1]:
finalAnswers = finalAnswers.OrderBy(b => draftAnswers.FindIndex(a => a.tID == b.tID && a.cID == b.cID)).ToList();
Which does what it's supposed to, except it has two problems:
1. It doesn't take into account the condition draftAnswers[i].id != 0.
2. You are matching by the property values, not by index as your original requirements state.
If the second point isn't an issue for you, then you can keep that line as is, and simply change the way you sort the draftAnswers list using the ThenBy() method to take into account excluding the items with .id == 0:
draftAnswers = draftAnswers.OrderBy(d => d.id == 0).ThenBy(d => d.id).ToList();
finalAnswers = finalAnswers.OrderBy(b => draftAnswers.FindIndex(a => a.tID == b.tID &&
a.cID == b.cID)).ToList();
However, I see no reason for you to keep using two separate lists. You could simply merge them into a List<KeyValuePair> where the Keys are your draftAnswers items and the Values are the finalAnswers items. And then you can use ThenBy() the same way as above, in order to sort the list based on the two conditions you have.
Something like the following would work just fine:
var mixed = new List<KeyValuePair<A, A>>()
{ new KeyValuePair<A, A>(new A() { id = 1, tID = 10, cID = 50, Name = "Q5" }, // draft.
new A() { id = 7, tID = 10, cID = 50, Name = "Q1" }), // final.
new KeyValuePair<A, A>(new A() { id = 0, tID = 20, cID = 30, Name = "Q2" }, // draft.
new A() { id = 8, tID = 20, cID = 30, Name = "Q2" }), // final.
new KeyValuePair<A, A>(new A() { id = 1, tID = 30, cID = 20, Name = "Q3" }, // draft.
new A() { id = 9, tID = 30, cID = 20, Name = "Q3" })};
mixed = mixed.OrderBy(a => a.Key.id == 0).ThenBy(a => a.Key.id).ToList();
Console.WriteLine("finalAnswers IDs:");
// Print the IDs of the values (finalAnswers).
foreach (var item in mixed)
Console.WriteLine(item.Value.id);
Console.WriteLine("\ndraftAnswers IDs:");
// Print the IDs of the keys (draftAnswers).
foreach (var item in mixed)
Console.WriteLine(item.Key.id);
This will have your desired output for both lists. Try it online.
Hope that helps.
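If the two lists are related purely by position, as the question states, Enumerable.Zip can pair them without building the combined list by hand. A minimal sketch, assuming the class A from the question; the helper names AnswerSorting and SortByDraft are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int id;
    public int tID;
    public int cID;
    public string Name;
}

public static class AnswerSorting
{
    // Pairs finalAnswers[i] with draftAnswers[i], sorts the pairs by the draft's id
    // (entries with id == 0 last), and returns the final answers in that order.
    public static List<A> SortByDraft(List<A> finalAnswers, List<A> draftAnswers)
    {
        return finalAnswers
            .Zip(draftAnswers, (final, draft) => new { Final = final, Draft = draft })
            .OrderBy(p => p.Draft.id == 0)   // false sorts before true
            .ThenBy(p => p.Draft.id)
            .Select(p => p.Final)
            .ToList();
    }
}
```

Because OrderBy and ThenBy are stable, pairs whose drafts share the same id keep their original relative order.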
[1] You have what seems to be a typo in your fiddle, where you wrote .. b => finalAnswers.FindIndex .. instead of .. b => draftAnswers.FindIndex .., which produces a totally different result. Just wanted to clear that up so no one else gets confused.
In your comment you mentioned that the OrderBy should produce an order of 1,1,0; however, that would be OrderByDescending. OrderBy is ascending, so it orders them 0,1,1, and the resulting finalAnswers order is (using tID) 20,10,30. Note that both OrderBy and OrderByDescending are stable: if two entries have the same value, their relative positions remain as they were. That's why you have the result you do.
Sorting by id in descending order gives 10,30,20 for this data, but only because the two entries with id == 1 already happen to be in the order you want. If you'd rather not rely on that, you can use a ThenBy to further sort on the cID field:
draftAnswers = draftAnswers
.OrderByDescending(d => d.id)
.ThenByDescending(d => d.cID)
.ToList();
finalAnswers = finalAnswers
.OrderBy(b => draftAnswers.FindIndex(a => a.tID == b.tID && a.cID == b.cID))
.ToList();

Cartesian Product of an arbitrary number of objects [duplicate]

This question already has answers here:
Is there a good LINQ way to do a cartesian product?
(3 answers)
Closed 4 years ago.
I'm looking to get the Cartesian Product of an arbitrary number of objects in c#. My situation is slightly unusual - my inputs are not lists of base types, but objects which have a property that's a list of base types.
My input and output objects are as follows:
public class Input
{
public string Label;
public List<int> Ids;
}
public class Result
{
public string Label;
public int Id;
}
Some sample input data:
var inputs = new List<Input>
{
new Input { Label = "List1", Ids = new List<int>{ 1, 2 } },
new Input { Label = "List2", Ids = new List<int>{ 2, 3 } },
new Input { Label = "List3", Ids = new List<int>{ 4 } }
};
And my expected output object:
var expectedResult = new List<List<Result>>
{
new List<Result>
{
new Result{Label = "List1", Id = 1},
new Result{Label = "List2", Id = 2},
new Result{Label = "List3", Id = 4}
},
new List<Result>
{
new Result{Label = "List1", Id = 1},
new Result{Label = "List2", Id = 3},
new Result{Label = "List3", Id = 4}
},
new List<Result>
{
new Result{Label = "List1", Id = 2},
new Result{Label = "List2", Id = 2},
new Result{Label = "List3", Id = 4}
},
new List<Result>
{
new Result{Label = "List1", Id = 2},
new Result{Label = "List2", Id = 3},
new Result{Label = "List3", Id = 4}
}
};
If I knew the number of items in 'inputs' in advance I could do this:
var knownInputResult =
from id1 in inputs[0].Ids
from id2 in inputs[1].Ids
from id3 in inputs[2].Ids
select
new List<Result>
{
new Result { Id = id1, Label = inputs[0].Label },
new Result { Id = id2, Label = inputs[1].Label },
new Result { Id = id3, Label = inputs[2].Label },
};
I'm struggling to adapt this to an arbitrary number of inputs - is there a possible way to do this?
I consider this a duplicate of the question linked in the comments, but since it was reopened and you are struggling to adapt that question to your case, here is how.
First, take Eric Lippert's function from the duplicate question as is (how it works is explained there):
public static class Extensions {
public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item })
);
}
}
Then flatten your input. Basically just attach corresponding label to each id:
var flatten = inputs.Select(c => c.Ids.Select(r => new Result {Label = c.Label, Id = r}));
Then run cartesian product and done:
// your expected result
var result = flatten.CartesianProduct().Select(r => r.ToList()).ToList();
I'm not proud of the amount of time I spent messing with this, but it works.
It's basically black magic, and I would replace it the first chance you get.
public static List<List<Result>> Permutate(IEnumerable<Input> inputs)
{
List<List<Result>> results = new List<List<Result>>();
var inputList = inputs.ToList();
// the number of combinations is the product of the list sizes
var size = inputList.Select(inp => inp.Ids.Count).Aggregate(1, (carry, count) => carry * count);
// an empty Ids list means there are no combinations at all
if (size == 0) return results;
for (int i = 0; i < size; i++) results.Add(new List<Result>());
// number of consecutive rows that share the same id for the current input
var repeat = size;
foreach (var input in inputList)
{
repeat /= input.Ids.Count;
for (int i = 0; i < size; i++)
{
// the id changes every 'repeat' rows and wraps around the list
var id = input.Ids[(i / repeat) % input.Ids.Count];
results[i].Add(new Result() { Label = input.Label, Id = id });
}
}
return results;
}

Using LINQ, how would you filter out all but one item of a particular criteria from a list?

I realize my title probably isn't very clear so here's an example:
I have a list of objects with two properties, A and B.
public class Item
{
public int A { get; set; }
public int B { get; set; }
}
var list = new List<Item>
{
new Item() { A = 0, B = 0 },
new Item() { A = 0, B = 1 },
new Item() { A = 1, B = 0 },
new Item() { A = 2, B = 0 },
new Item() { A = 2, B = 1 },
new Item() { A = 2, B = 2 },
new Item() { A = 3, B = 0 },
new Item() { A = 3, B = 1 },
};
Using LINQ, what's the most elegant way to collapse all the A = 2 items into the first A = 2 item and return along with all the other items? This would be the expected result.
var list = new List<Item>
{
new Item() { A = 0, B = 0 },
new Item() { A = 0, B = 1 },
new Item() { A = 1, B = 0 },
new Item() { A = 2, B = 0 },
new Item() { A = 3, B = 0 },
new Item() { A = 3, B = 1 },
};
I'm not a LINQ expert and already have a "manual" solution but I really like the expressiveness of LINQ and was curious to see if it could be done better.
How about:
var collapsed = list.GroupBy(i => i.A)
.SelectMany(g => g.Key == 2 ? g.Take(1) : g);
The idea is to first group them by A and then select those again (flattening it with .SelectMany) but in the case of the Key being the one we want to collapse, we just take the first entry with Take(1).
One way you can accomplish this is with GroupBy. Group the items by A, and use a SelectMany to project each group into a flat list again. In the SelectMany, check if A is 2 and if so Take(1), otherwise return all results for that group. We're using Take instead of First because the result has to be IEnumerable.
var grouped = list.GroupBy(g => g.A);
var collapsed = grouped.SelectMany(g =>
{
if (g.Key == 2)
{
return g.Take(1);
}
return g;
});
One possible solution (if you insist on LINQ):
int a = 2;
var output = list.GroupBy(o => o.A == a ? a.ToString() : Guid.NewGuid().ToString())
.Select(g => g.First())
.ToList();
This groups all items with A = 2 under a key equal to 2, while every other item gets a unique group key (a new GUID), so you end up with many single-item groups. Then we take the first item from each group.
Yet another way:
var newlist = list.Where (l => l.A != 2 ).ToList();
newlist.Add( list.First (l => l.A == 2) );
An alternative to other answers based on GroupBy can be Aggregate:
// Aggregate lets iterate a sequence and accumulate a result (the first arg)
var list2 = list.Aggregate(new List<Item>(), (result, next) => {
// This will add the item in the source sequence either
// if A != 2 or, if it's A == 2, it will check that there's no A == 2
// already in the resulting sequence!
if(next.A != 2 || !result.Any(item => item.A == 2)) result.Add(next);
return result;
});
What about this:
list.RemoveAll(l => l.A == 2 && l != list.FirstOrDefault(i => i.A == 2));
If you would like a more efficient way, it would be:
var first = list.FirstOrDefault(i => i.A == 2);
list.RemoveAll(l => l.A == 2 && l != first);
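If the original order must be kept and the source list should not be mutated, a single pass that remembers whether a match has already been seen is another option. This is only a sketch; the extension method KeepFirstWhere and its class are hypothetical names, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollapseExtensions
{
    // Yields every item that does not match the predicate, plus only the
    // first item that does. Order is preserved and the source is untouched.
    public static IEnumerable<T> KeepFirstWhere<T>(this IEnumerable<T> source, Func<T, bool> match)
    {
        var seen = false; // local to each enumeration, so re-enumerating is safe
        foreach (var item in source)
        {
            if (!match(item))
            {
                yield return item;
            }
            else if (!seen)
            {
                seen = true;
                yield return item;
            }
        }
    }
}
```

For the list in the question this would be called as list.KeepFirstWhere(i => i.A == 2).ToList().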

Using LINQ to count value frequency

I have a table
ID|VALUE
VALUE is an integer field with possible values between 0 and 4. How can I query the count of each value?
Ideally the result should be an array with 6 elements, one for the count of each value and the last one is the total number of rows.
This simple program does just that:
class Record
{
public int Id { get; set; }
public int Value { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<Record> records = new List<Record>()
{
new Record() { Id = 1, Value = 0},
new Record() { Id = 2, Value = 1 },
new Record() { Id = 3, Value = 2 },
new Record() { Id = 4, Value = 3 },
new Record() { Id = 5, Value = 4 },
new Record() { Id = 6, Value = 2 },
new Record() { Id = 7, Value = 3 },
new Record() { Id = 8, Value = 1 },
new Record() { Id = 9, Value = 0 },
new Record() { Id = 10, Value = 4 }
};
var query = from r in records
group r by r.Value into g
select new {Count = g.Count(), Value = g.Key};
foreach (var v in query)
{
Console.WriteLine("Value = {0}, Count = {1}", v.Value, v.Count);
}
}
}
Output:
Value = 0, Count = 2
Value = 1, Count = 2
Value = 2, Count = 2
Value = 3, Count = 2
Value = 4, Count = 2
Slightly modified version to return an Array with only the count of values:
int[] valuesCounted = (from r in records
group r by r.Value
into g
select g.Count()).ToArray();
Adding the rows count in the end:
valuesCounted = valuesCounted.Concat(new[] { records.Count()}).ToArray();
Here is how you would get the number of rows for each value of VALUE, in order:
var counts =
from row in db.Table
group row by row.VALUE into rowsByValue
orderby rowsByValue.Key
select rowsByValue.Count();
To get the total number of rows in the table, you can add all of the counts together. You don't want the original sequence to be iterated twice, though; that would cause the query to be executed twice. Instead, you should make an intermediate list first:
var countsList = counts.ToList();
var countsWithTotal = countsList.Concat(new[] { countsList.Sum() });
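Both queries above only return groups that actually occur, so a value with no rows produces no element and the array may end up shorter than six. A sketch that always yields the six-element layout described in the question, assuming VALUE is guaranteed to lie between 0 and 4 (the names ValueFrequency and CountValues are illustrative, and this runs in memory rather than in the database):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ValueFrequency
{
    // Returns an array of maxValue + 2 elements: one count per value
    // 0..maxValue, followed by the total number of rows.
    public static int[] CountValues(IEnumerable<int> values, int maxValue = 4)
    {
        var counts = new int[maxValue + 2];
        foreach (var v in values)
        {
            if (v < 0 || v > maxValue)
                throw new ArgumentOutOfRangeException(nameof(values), v, "Value outside the expected range.");

            counts[v]++;            // per-value count
            counts[maxValue + 1]++; // running total in the last slot
        }
        return counts;
    }
}
```

With the Record list above this could be called as ValueFrequency.CountValues(records.Select(r => r.Value)).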
