I have a small problem finding the most efficient solution. I have a number of student ids, for example 10, and some of those ids are related to each other (siblings). For each group of siblings, only one id should remain for identification; it doesn't matter which one, so the first is fine.
For example, the student ids
original
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
where 1, 2, 3 are siblings of one family, and 8, 9 of another. At the end I should have:
expected
1, 4, 5, 6, 7, 8, 10
I am currently doing it with a loop.
UPDATE:
I just stopped implementing it, because it kept getting bigger and bigger. This is the big picture of what I have in mind: I gather all sibling ids for each given id, row by row, and then iterate over each of them. But like I said, it wastes time.
Code (conceptual)
static string Trimsiblings(string ppl) {
    string[] pids = ppl.Split(',');
    Stack<string> personId = new Stack<string>();
    foreach (string pid in pids) {
        // Access the database and fetch all siblings for this pid,
        // as in the example above:
        // row 1: field1=1, field2=2, field3=3
        // row 2: field1=8, field2=9
        query = Select .. where .. id = pid; // this line is pseudocode
        for (int i = 0; i < query.Length; i++) {
            foreach (string p in pids) {
                if (query.field1 == p) {
                    personId.Push(p);
                }
            }
        }
    }
}
For efficient code, it is essential to notice that one member (e.g., the first) of each family of siblings is irrelevant, because it will stay in the output. That is, we simply have to:
Create a list of items that must not appear in the output
Actually remove them
Of course, this only works under the assumption that every sibling actually appears in the original list of ids.
In code:
int[] ids = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int[][] families = new int[2][] {
    new int[] { 1, 2, 3 },
    new int[] { 8, 9 }
};
var itemsToOmit = families.
    Select(family => family.Skip(1)).
    Aggregate((family1, family2) => family1.Concat(family2));
var cleanedIds = ids.Except(itemsToOmit);
Edit: Since you mention that you are not too familiar with the syntax, I will give some further explanations
The expressions I've used are extension methods that are part of the System.Linq namespace
The Select method transforms one sequence into another sequence. Since families is a sequence of sequences, family will be a sequence of siblings in the same family (i.e., 1, 2, 3 and 8, 9 in this particular case)
The Skip method skips a number of elements of a sequence. Here, I've decided to always skip the first element (for reasons, see above)
The Aggregate method combines the elements of a sequence into a single element. Here, all families of siblings are just concatenated to each other (except for the first sibling of each family, which has been omitted via Skip)
The Except method returns all elements of a sequence that are not in the sequence that is given as an argument.
I hope this clarifies things a bit.
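As a footnote to the explanation above, the Select/Skip/Aggregate chain can also be collapsed into a single SelectMany call. A minimal self-contained sketch with the sample ids from the question (the variable names are just illustrative):

```csharp
using System;
using System.Linq;

int[] ids = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int[][] families =
{
    new[] { 1, 2, 3 },
    new[] { 8, 9 }
};

// Keep the first sibling of each family; every later sibling must be omitted.
var itemsToOmit = families.SelectMany(family => family.Skip(1));
var cleanedIds = ids.Except(itemsToOmit).ToArray();

Console.WriteLine(string.Join(", ", cleanedIds)); // 1, 4, 5, 6, 7, 8, 10
```

SelectMany flattens the per-family tails in one step, so no Aggregate is needed; the result matches the expected output from the question.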
Here's how
public static String Trimsiblings(String ppl) {
    var table = GetSiblingTable();
    var pids = ppl.Split(',');
    return String.Join(", ", (
        from id in pids.Select(x => int.Parse(x))
        where (
            from row in table.AsEnumerable()
            select
                from DataColumn column in table.Columns
                let data = row[column.ColumnName]
                where DBNull.Value != data
                select int.Parse((String)data)
        ).All(x => false == x.Contains(id) || x.First() == id)
        select id.ToString()).ToArray()
    );
}
// emulation of getting the table from the database
public static DataTable GetSiblingTable() {
    var dt = new DataTable();
    // define field1, ..., fieldn
    for (int i = 1; i <= 3; i++)
        dt.Columns.Add("field" + i);
    dt.Rows.Add(new object[] { 1, 2, 3 });
    dt.Rows.Add(new object[] { 8, 9 });
    return dt;
}
public static void TestMethod() {
Console.WriteLine("{0}", Trimsiblings("1, 2, 3, 4, 5, 6, 7, 8, 9, 10"));
}
Leave a comment if you want me to explain why.
Related
Sorry if the title made you confused.
This thread has a similar-looking title, but it is actually different: Selecting some lists from a list of lists.
I want to select, from many lists, the lists that match a given list.
Sample:
// data source
List<List<int>> sources = new List<List<int>>();
sources.Add(new List<int>(){1, 2, 3, 4});
sources.Add(new List<int>(){1, 2, 3, 4, 5});
sources.Add(new List<int>(){1, 2, 3, 4, 5, 6});
sources.Add(new List<int>(){1, 2, 99, 3, 4, 5, 6});
sources.Add(new List<int>(){1, 3, 99, 2, 4, 5});
sources.Add(new List<int>(){5, 4, 3, 2, 1});
sources.Add(new List<int>(){1, 2, 4, 5, 6});
sources.Add(new List<int>(){1, 2, 69, 3, 4, 5});
// the list we want to find similar lists to
List<int> current = new List<int>() {1, 2, 3, 4, 5};
There is a list of unimportant elements that can be ignored. Updated! This only applies when its elements do not appear in current:
List<int> flexible = new List<int>() {99, 66, 123123, 2};// <= updated!
The function I want to write:
void FilterA(List<int> current, List<List<int>> sources, List<int> flexible) {}
How can FilterA output these lists (Lists chosen)? Printing functions are not required.
Lists chosen
1 2 3 4 5 // exactly the same !
1 2 3 4 5 6 // same first 5 elements, the rests are not important
1 2 99 3 4 5 6 // 99 is in flexible list, after ignored that is 1 2 3 4 5 6
// Updated! Ignore 99 because it is not in list current
Lists ignored
1 2 3 4 // missing 5 in current
1 3 99 2 4 5 // 99 is in flexible list, after ignored that is 1 3 2 4 5
5 4 3 2 1 // wrong order
1 2 4 5 6 // missing 3 in current
1 2 69 3 4 5 // 69 is not in flexible list
Thank you very much!
--- Updated ---
If elements in list flexible appear in list current, they must not be excluded.
#Sweeper's answer handles this nicely.
p/s: In the case where no element of flexible appears in current, #TheGeneral's answer is great and performs very well.
Update after clarification
The premise is: remove the flexible elements with Except, Take the first current.Count elements, then compare with SequenceEqual.
Note: all three methods run in linear time, O(n). Keep in mind that Except is a set operation, so this assumes no element of flexible appears in current (see the update above) and that the remaining elements contain no duplicates.
var results = sources.Where(x =>
x.Except(flexible)
.Take(current.Count)
.SequenceEqual(current));
Output
1, 2, 3, 4, 5
1, 2, 3, 4, 5, 6
1, 2, 99, 3, 4, 5, 6
Additional Resources
Enumerable.Except
Produces the set difference of two sequences.
Enumerable.Take
Returns a specified number of contiguous elements from the start of a
sequence.
Enumerable.SequenceEqual
Determines whether two sequences are equal according to an equality
comparer.
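Putting those three methods together on the sample data from the question: as noted in the update, this approach assumes no element of flexible occurs in current, so the hypothetical flexible list below drops the 2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var sources = new List<List<int>>
{
    new List<int> { 1, 2, 3, 4 },
    new List<int> { 1, 2, 3, 4, 5 },
    new List<int> { 1, 2, 3, 4, 5, 6 },
    new List<int> { 1, 2, 99, 3, 4, 5, 6 },
    new List<int> { 1, 3, 99, 2, 4, 5 },
    new List<int> { 5, 4, 3, 2, 1 },
    new List<int> { 1, 2, 4, 5, 6 },
    new List<int> { 1, 2, 69, 3, 4, 5 },
};
var current = new List<int> { 1, 2, 3, 4, 5 };
var flexible = new List<int> { 99, 66, 123123 }; // assumed disjoint from current

// Remove flexible elements, keep the first current.Count survivors, compare.
var results = sources.Where(x =>
    x.Except(flexible)
     .Take(current.Count)
     .SequenceEqual(current))
    .ToList();

foreach (var list in results)
    Console.WriteLine(string.Join(", ", list));
// 1, 2, 3, 4, 5
// 1, 2, 3, 4, 5, 6
// 1, 2, 99, 3, 4, 5, 6
```

The three surviving lists are exactly the "Lists chosen" from the question.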
You should write a method that determines whether one list (candidate) should be chosen:
public static bool ShouldChoose(List<int> candidate, List<int> current, List<int> flexible) {
    int candidateIndex = 0;
    foreach (int element in current) {
        if (candidateIndex >= candidate.Count) {
            return false;
        }
        // This loop looks for the next index in "candidate" where "element" matches,
        // ignoring the elements in "flexible".
        while (candidate[candidateIndex] != element) {
            if (!flexible.Contains(candidate[candidateIndex])) {
                return false;
            }
            candidateIndex++;
            if (candidateIndex >= candidate.Count) {
                return false; // ran off the end of "candidate" while skipping
            }
        }
        candidateIndex++;
    }
    return true;
}
Then you can do a Where filter:
var chosenLists = sources.Where(x => ShouldChoose(x, current, flexible)).ToList();
foreach (var list in chosenLists) {
Console.WriteLine(string.Join(", ", list));
}
This works for me:
var results =
sources
.Where(source => source.Except(flexible).Count() >= current.Count())
.Where(source => source.Except(flexible).Zip(current, (s, c) => s == c).All(x => x))
.ToList();
Let's say I have the following nested array:
[
[1, 2, 3],
[4, 7, 9, 13],
[1, 2],
[2, 3],
[12, 15, 16]
]
I only need the arrays with the most occurrences of the same numbers. In the above example this would be:
[
[1, 2, 3],
[4, 7, 9, 13],
[12, 15, 16]
]
How can I do this efficiently with C#?
EDIT
Indeed my question was really confusing. What I wanted to ask is: how can I eliminate a sub-array when some bigger sub-array already contains all the elements of the smaller one?
My current implementation of the problem is the following:
var allItems = new List<List<int>>{
new List<int>{1, 2, 3},
new List<int>{4, 7, 9, 13},
new List<int>{1, 2},
new List<int>{2, 3},
new List<int>{12, 15, 16}
};
var itemsToEliminate = new List<List<int>>();
for(var i = 0; i < allItems.ToList().Count; i++){
var current = allItems[i];
var itemsToVerify = allItems.Where(item => item != current).ToList();
foreach(var item in itemsToVerify){
bool containsSameNumbers = item.Intersect(current).Any();
if(containsSameNumbers && item.Count > current.Count){
itemsToEliminate.Add(current);
}
}
}
allItems.RemoveAll(item => itemsToEliminate.Contains(item));
foreach(var item in allItems){
Console.WriteLine(string.Join(", ", item));
}
This does work, but the nested loops for(var i = 0; i < allItems.ToList().Count; i++) and foreach(var item in itemsToVerify) give it bad performance, especially since the allItems array can contain about 10,000,000 rows.
I would remember the items that have already been accepted.
First sort your lists by decreasing length, then check for each item whether it is already present.
Given your algorithm, an array is not added if even a single one of its integers is already in the set of known integers.
Therefore I would use the following algorithm:
List<List<int>> allItems = new List<List<int>>{
new List<int>{1, 2, 3},
new List<int>{4, 7, 9, 13},
new List<int>{1, 2},
new List<int>{2, 3},
new List<int>{12, 15, 16}
};
allItems = allItems.OrderByDescending(x => x.Count()).ToList(); // order by length, decreasing order
List<List<int>> result = new List<List<int>>();
SortedSet<int> knownItems = new SortedSet<int>(); // keep track of numbers, so you don't have to loop arrays
// https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.sortedset-1?view=netframework-4.7.2
foreach (List<int> l in allItems)
{
    bool allUnique = true;
    foreach (int elem in l)
    {
        if (knownItems.Contains(elem))
        {
            allUnique = false;
            break;
        }
    }
    // Only commit a list's numbers to the known set once the whole list has
    // been accepted; otherwise a discarded list could poison the set with
    // numbers that never made it into the result.
    if (allUnique)
    {
        result.Add(l);
        foreach (int elem in l)
        {
            knownItems.Add(elem);
        }
    }
}
// output
foreach(List<int> item in result){
Console.WriteLine(string.Join(", ", item));
}
Instead of looping over your original array twice, nested (O(n²)), you walk it only once and do a lookup in the set of known numbers for each element (a binary search tree lookup costs O(log n), so the whole pass is O(n·log n)).
Instead of removing from the array, you add to a new one, which uses more memory. The reordering is done because it is more likely that a subsequent array contains numbers that were already processed. However, sorting a large number of lists may cost more than it gains if you have many small lists; if you have even a few long ones, it may pay off.
Sorting your list of lists by length is valid because of this clarification from the comments:
What is to happen if a list has items from different lists? Say instead of new List{2, 3} it was new List{2, 4}?
That is unexpected behavior. You can see the ints as ids of persons; each group of ints forms, for example, a family. If the algorithm created [2, 4], we would be creating, for example, an extramarital relationship, which is not desirable.
From this I gather that each array contains subsets of at most one other array, or is unique; therefore the order is irrelevant. This also assumes that at least one array contains all elements of its subsets (and is therefore the longest one and comes first).
The sorting could be removed if that were not the case, and should probably be removed if in doubt.
For example:
{1, 2, 3, 4, 5} - contains all elements that future arrays will have subsets of
{1, 4, 5} - must contain no element that {1,2,3,4,5} does not contain
{1, 2, 6} - illegal in this case
{7, 8 ,9} - OK
{8, 9} - OK (will be ignored)
{7, 9} - OK (will be ignored, is only subset in {7,8,9})
{1, 7} - illegal, but it would be legal if {1,2,3,4,5,7,8,9} were in this list; being longer, it would have come earlier, making it valid to ignore this one.
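Under the constraints spelled out above, the whole approach condenses into a self-contained sketch (here using a HashSet<int> for the known numbers instead of a SortedSet, and committing a list's numbers only after the whole list has been accepted; note the result keeps the sorted order, so the longest list comes first):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var allItems = new List<List<int>>
{
    new List<int> { 1, 2, 3 },
    new List<int> { 4, 7, 9, 13 },
    new List<int> { 1, 2 },
    new List<int> { 2, 3 },
    new List<int> { 12, 15, 16 },
};

var result = new List<List<int>>();
var knownItems = new HashSet<int>();

// Longest lists first, so supersets are seen before their subsets.
foreach (var list in allItems.OrderByDescending(x => x.Count))
{
    if (list.Any(knownItems.Contains))
        continue; // overlaps a longer, already accepted list
    result.Add(list);
    foreach (int elem in list)
        knownItems.Add(elem);
}

foreach (var list in result)
    Console.WriteLine(string.Join(", ", list));
// 4, 7, 9, 13
// 1, 2, 3
// 12, 15, 16
```

The surviving lists match the expected output from the question, just reordered by length.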
I need to process an outbound SMS queue and create batches of messages. The queued list might contain multiple messages to the same person. Batches do not allow this, so I need to run through the main outbound queue and create as many batches as necessary to ensure they contain unique entries.
Example:
Outbound queue = (1,2,3,3,4,5,6,7,7,7,8,8,8,8,9)
results in...
batch 1 = (1,2,3,4,5,6,7,8,9)
batch 2 = (3,7,8)
batch 3 = (7,8)
batch 4 = (8)
I can easily check for duplicates but I'm looking for a slick way to generate the additional batches.
Thanks!
Have a look at this approach using Enumerable.ToLookup and other LINQ methods:
var queues = new int[] { 1, 2, 3, 3, 4, 5, 6, 7, 7, 7, 8, 8, 8, 8, 9 };
var lookup = queues.ToLookup(i => i);
int maxCount = lookup.Max(g => g.Count());
List<List<int>> allbatches = Enumerable.Range(1, maxCount)
    .Select(count => lookup.Where(x => x.Count() >= count).Select(x => x.Key).ToList())
    .ToList();
Result is a list which contains four other List<int>:
foreach (List<int> list in allbatches)
    Console.WriteLine(string.Join(",", list));
1, 2, 3, 4, 5, 6, 7, 8, 9
3, 7, 8
7, 8
8
Depending on the specific data structures used, the LINQ GroupBy extension method could be used (provided that the queue implements IEnumerable<T> for some type T) to group the messages by user; afterwards, the groups can be iterated separately.
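One hedged sketch of that GroupBy idea: number each duplicate within its group, then regroup by that occurrence index, so the first occurrence of every recipient lands in batch 1, the second in batch 2, and so on.

```csharp
using System;
using System.Linq;

var queue = new[] { 1, 2, 3, 3, 4, 5, 6, 7, 7, 7, 8, 8, 8, 8, 9 };

var batches = queue
    .GroupBy(id => id)                             // all messages per recipient
    .SelectMany(g => g.Select((id, i) => (id, i))) // i = occurrence number within the group
    .GroupBy(t => t.i, t => t.id)                  // occurrence 0 -> batch 1, 1 -> batch 2, ...
    .OrderBy(g => g.Key)
    .Select(g => g.ToList())
    .ToList();

foreach (var batch in batches)
    Console.WriteLine(string.Join(",", batch));
// 1,2,3,4,5,6,7,8,9
// 3,7,8
// 7,8
// 8
```

Each batch contains each recipient at most once, matching the expected batches in the question.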
A naive approach would be to walk over the input, creating and filling the batches as you go:
private static List<List<int>> CreateUniqueBatches(List<int> source)
{
var batches = new List<List<int>>();
int currentBatch = 0;
foreach (var i in source)
{
// Find the index for the batch that can contain the number `i`
while (currentBatch < batches.Count && batches[currentBatch].Contains(i))
{
currentBatch++;
}
if (currentBatch == batches.Count)
{
batches.Add(new List<int>());
}
batches[currentBatch].Add(i);
currentBatch = 0;
}
return batches;
}
Output:
1, 2, 3, 4, 5, 6, 7, 8, 9
3, 7, 8
7, 8
8
I'm sure this can be shortened or written in a more functional way. I tried using GroupBy, Distinct and Except, but couldn't figure it out quickly.
I was wondering if it is possible to do a ranged sort using LINQ. For example, I have a list of numbers:
List<int> numbers = new List<int>
{
    1,
    2,
    3,
    15, // <-- sort
    11, // <-- sort
    13, // <-- sort
    10, // <-- sort
    6,
    7,
    // etc.
};
Simply using numbers.Skip(3).Take(4).OrderBy(blabla) will work, but it returns a new list containing only those 4 numbers. Is it somehow possible to make LINQ operate on the list itself without returning a new "partial" list, or to receive the complete list with the sorted part?
Thanks for any answer!
Try something like this:
var partiallySorted = list.Where(x => x < 11)
    .Concat(list.Where(x => x >= 11 && x <= 15).OrderBy(/*blah*/))
    .Concat(list.Where(x => x > 15));
List<int> list = new List<int>() {1,2,3,15,11,13,10,6,7};
list.Sort(3, 4,Comparer<int>.Default);
Simply get the required range based on some criteria and apply the sort to it using LINQ:
List<int> numbers = new List<int>() { 15, 4, 1, 3, 2, 11, 7, 6, 12, 13 };
var range = numbers.Skip(3).Take(4).OrderBy(n => n);
// output: 2, 3, 7, 11
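To get the complete list back with only that slice sorted, the untouched head and tail can be concatenated around the sorted range. A sketch using the same sample list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<int> numbers = new List<int>() { 15, 4, 1, 3, 2, 11, 7, 6, 12, 13 };
int start = 3, count = 4; // the slice to sort

// head (unchanged) + sorted slice + tail (unchanged)
var sorted = numbers.Take(start)
    .Concat(numbers.Skip(start).Take(count).OrderBy(n => n))
    .Concat(numbers.Skip(start + count))
    .ToList();

Console.WriteLine(string.Join(", ", sorted));
// 15, 4, 1, 2, 3, 7, 11, 6, 12, 13
```

This still builds a new list rather than sorting in place; for a true in-place ranged sort, List<T>.Sort(index, count, comparer) as shown elsewhere on this page is the simpler option.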
No, the LINQ extension methods will never modify the underlying list. You can use the method List<T>.Sort(int index, int count, IComparer<T> comparer) to do an in-place sort:
var list = new List<int> {1, 2, 3, 15, 11, 13, 10, 6, 7};
list.Sort(3, 4, null);
Use this for default inline List Sort:
Syntax: List.Sort(start index, number of elements, Default Comparer)
List<int> numbers = new List<int> { 1, 2, 3, 15, 11, 13, 10, 6, 7 };
numbers.Sort(3, 6, Comparer<int>.Default);
If you want to sort by properties/attributes of the elements, or by some other criterion, use the method below.
Here I sort the strings by number of characters, from the 2nd element to the end of the list.
Syntax: List.Sort(start index, number of elements, Custom Comparer)
List<string> str = new List<string> { "123", "123456789", "12", "1234567" };
str.Sort(1, str.Count - 1, Comparer<string>.Create((x, y) => x.Length.CompareTo(y.Length)));
Let's say that I have an array of strings like this:
1, 2, 3, 4, 5, 6, 7, 8
and I want to shift the elements of the array such that
The first element always remains fixed
Only the remaining elements get shifted like so ...
On each pass, the last element of the array moves into the 2nd position and the elements in between shift one place toward the end.
Pass #1: 1, 2, 3, 4, 5, 6, 7, 8
Pass #2: 1, 8, 2, 3, 4, 5, 6, 7
Pass #3: 1, 7, 8, 2, 3, 4, 5, 6
Pass #4: 1, 6, 7, 8, 2, 3, 4, 5
Any assistance would be greatly appreciated.
Because this looks like homework, I'm posting an unnecessary complex, but very hip LINQ solution:
int[] array = new int[] { 1, 2, 3, 4, 5, 6, 7, 8 };
int[] result = array.Take(1)
.Concat(array.Reverse().Take(1))
.Concat(array.Skip(1).Reverse().Skip(1).Reverse())
.ToArray();
Probably the fastest way to do this in C# is to use Array.Copy. I don't know much about pointers in C#, so there may be an even faster way that avoids the array bounds checks, but the following should work. It makes several assumptions and doesn't check for errors, but you can fix that up.
void Shift<T>(T[] array) {
T last = array[array.Length-1];
Array.Copy(array, 1, array, 2, array.Length-2);
array[1]=last;
}
EDIT
Optionally, there is Buffer.BlockCopy which according to this post performs fewer validations but internally copies the block the same way.
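A quick self-contained check of the Shift method above against the first few passes from the question:

```csharp
using System;

static void Shift<T>(T[] array)
{
    T last = array[array.Length - 1];
    // Move elements 1..n-2 one slot to the right, keeping element 0 fixed.
    Array.Copy(array, 1, array, 2, array.Length - 2);
    array[1] = last;
}

int[] data = { 1, 2, 3, 4, 5, 6, 7, 8 };
Shift(data); // pass #2
Console.WriteLine(string.Join(", ", data)); // 1, 8, 2, 3, 4, 5, 6, 7
Shift(data); // pass #3
Console.WriteLine(string.Join(", ", data)); // 1, 7, 8, 2, 3, 4, 5, 6
```

Note that the source and destination ranges overlap; Array.Copy handles overlapping copies within the same array correctly, which is what makes this one-liner-per-shift safe.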
Because this looks like homework, I'm not going to solve it for you, but I have a couple of suggestions:
Remember to not overwrite data if it isn't somewhere else already. You're going to need a temporary variable.
Try traversing the array from the end to the beginning. The problem is probably simpler that way, though it can be done from front-to-back.
Make sure your algorithm works for an arbitrary-length array, not just one that's of size 8, as your example gave.
Although this sounds like homework, as others suggest, if you change to a List<> you can get what you want with the following...
List<int> Nums2 = new List<int>();
for (int i = 1; i < 9; i++)
    Nums2.Add(i);
// passes 2-4 of the example: each iteration moves the last element into position 1
for (int pass = 0; pass < 3; pass++)
{
    Nums2.Insert(1, Nums2[Nums2.Count - 1]);
    Nums2.RemoveAt(Nums2.Count - 1);
}
Define this:
public static class Extensions
{
public static IEnumerable<T> Rotate<T>(this IEnumerable<T> enuml)
{
var count = enuml.Count();
return enuml
.Skip(count - 1)
.Concat(enuml.Take(count - 1));
}
public static IEnumerable<T> SkipAndRotate<T>(this IEnumerable<T> enuml)
{
return enuml
.Take(1)
.Concat(
enuml.Skip(1).Rotate()
);
}
}
Then call it like so:
var array = new [] { 1, 2, 3, 4, 5, 6, 7, 8 };
var pass1 = array.SkipAndRotate().ToArray();
var pass2 = pass1.SkipAndRotate().ToArray();
var pass3 = pass2.SkipAndRotate().ToArray();
var pass4 = pass3.SkipAndRotate().ToArray();
There's some repeated code there that you might want to refactor. And of course, I haven't compiled this so caveat emptor!
This is similar to Josh Einstein's answer, but it does the shift manually and lets you specify how many elements to preserve at the beginning.
static void ShiftArray<T>(T[] array, int elementsToPreserve)
{
T temp = array[array.Length - 1];
for (int i = array.Length - 1; i > elementsToPreserve; i--)
{
array[i] = array[i - 1];
}
array[elementsToPreserve] = temp;
}
Consumed:
int[] array = { 1, 2, 3, 4, 5, 6, 7, 8 };
ShiftArray(array, 2);
First pass: 1 2 8 3 4 5 6 7