Search missing numbers in sequence - c#

Suppose I have the following array (my sequences are all sorted in ascending order, and contain positive integers)
var tabSequence = new[] { 1, 2, 3, 7, 8, 9, 12, 15, 16, 17, 22, 23, 32 };
I wrote some code using LINQ and a loop to find the missing numbers:
List<Int32> lstSearch = new List<int>();
var lstGroup = tabSequence
.Select((val, ind) => new { val, group = val - ind })
.GroupBy(v => v.group, v => v.val)
.Select(group => new{ GroupNumber = group.Key, Min = group.Min(), Max = group.Max() }).ToList();
for (int number = 0; number < lstGroup.Count; number++)
{
if (number < lstGroup.Count-1)
{
for (int missingNumber = lstGroup[number].Max+1; missingNumber < lstGroup[number+1].Min; missingNumber++)
lstSearch.Add(missingNumber);
}
}
var tabSequence2 = lstSearch.ToArray();
// Same result as var tabSequence2 = new[] {4, 5, 6, 10, 11, 13, 14, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29, 30, 31 };
This code works, but I'd like to know if there is a better way to do the same thing using only LINQ.

Maybe I am just not understanding the problem. Your code seems very complicated, you could make this a lot simpler:
int[] tabSequence = new[] { 1, 2, 3, 7, 8, 9, 12, 15, 16, 17, 22, 23, 32 };
var results = Enumerable.Range(1, tabSequence.Max()).Except(tabSequence);
//results is: 4, 5, 6, 10, 11, 13, 14, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29, 30, 31
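One caveat with the Range/Except approach: starting the range at 1 reports everything below the first element as missing when a sequence starts at a higher number. A minimal sketch that starts at the smallest element instead could look like this (the class and method names are made up for illustration, and the input is assumed to be non-empty):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RangeGaps
{
    // Hypothetical helper: returns the integers between the smallest and
    // largest element that do not occur in 'values'.
    // Assumes 'values' is non-empty.
    public static IEnumerable<int> Missing(IReadOnlyCollection<int> values)
    {
        int min = values.Min();
        int max = values.Max();
        // Build the full range [min, max] and remove everything that is present.
        return Enumerable.Range(min, max - min + 1).Except(values);
    }
}
```

For the input { 3, 4, 7, 9 } this yields 5, 6, 8 rather than also reporting 1 and 2.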

You can use Enumerable.Aggregate to your advantage. The overload I chose takes an accumulator seed (an empty List<IEnumerable<int>>) and then iterates over each item in your array.
A variable lastNr, declared before the Aggregate call, is set to the first number we iterate over. On each following iteration we compare the current number nr against lastNr.
If nr directly follows lastNr, we just set lastNr to nr.
If not, we generate the missing numbers between lastNr and nr via Enumerable.Range(start, count)
and add them to our accumulator list. Then we set lastNr to nr and continue.
public static List<IEnumerable<int>> GetMissingSeq(int[] seq)
{
var lastNr = int.MinValue;
var missing = seq.Aggregate(
new List<IEnumerable<int>>(),
(acc, nr) =>
{
if (lastNr == int.MinValue || lastNr == nr - 1)
{
lastNr = nr; // first ever or in sequence
return acc; // nothing to do
}
// not in sequence, add the missing numbers to our accumulator list
acc.Add(Enumerable.Range(lastNr + 1, nr - lastNr - 1));
lastNr = nr; // that's the new lastNr to compare against in the next iteration
return acc;
}
);
return missing;
}
Tested by:
public static void Main(string[] args)
{
    var tabSequence = new[] { 1, 2, 3, 7, 8, 9, 12, 15, 16, 17, 22, 23, 32 };
    // print the original sequence, then one line per missing run
    Console.WriteLine(string.Join(", ", tabSequence));
    foreach (var inner in GetMissingSeq(tabSequence))
        Console.WriteLine(string.Join(", ", inner));
    Console.ReadLine();
}
Output:
1, 2, 3, 7, 8, 9, 12, 15, 16, 17, 22, 23, 32 // original followed by missing sequences
4, 5, 6
10, 11
13, 14
18, 19, 20, 21
24, 25, 26, 27, 28, 29, 30, 31
If you are not interested in the subsequences you can use GetMissingSeq(tabSequence).SelectMany(i => i) to flatten them into one IEnumerable.
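If you want something that uses only LINQ operators and no captured state, one possible sketch pairs each element with its successor via Zip and expands every gap with Enumerable.Range. The names below are invented for illustration, and the input is assumed to be strictly ascending (a duplicate would make Enumerable.Range throw because the count becomes negative):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SequenceGaps
{
    // Hypothetical helper: yields every integer missing between
    // neighbouring elements of a strictly ascending sequence.
    public static IEnumerable<int> Missing(IReadOnlyList<int> sorted) =>
        sorted
            // Pair each element with the one that follows it.
            .Zip(sorted.Skip(1),
                 (prev, next) => Enumerable.Range(prev + 1, next - prev - 1))
            // Flatten the per-gap ranges into one sequence.
            .SelectMany(gap => gap);
}
```

Called with the tabSequence from the question, this should produce the same 19 numbers as the loop version, starting 4, 5, 6, 10, 11.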

Related

Equivalent of `shift` from pandas

Initial DataFrame in Pandas
Let's suppose we have the following in Python with pandas:
import pandas as pd
df = pd.DataFrame({
"Col1": [10, 20, 15, 30, 45],
"Col2": [13, 23, 18, 33, 48],
"Col3": [17, 27, 22, 37, 52] },
index=pd.date_range("2020-01-01", "2020-01-05"))
df
Here's what we get in Jupyter:
Shifting columns
Now let's shift Col1 by 2 and store it in Col4.
We'll also store df['Col1'] / df['Col1'].shift(2) in Col5:
df_2 = df.copy(deep=True)
df_2['Col4'] = df['Col1'].shift(2)
df_2['Col5'] = df['Col1'] / df['Col1'].shift(2)
df_2
The result:
C# version
Now let's set up a similar DataFrame in C#:
#r "nuget:Microsoft.Data.Analysis"
using Microsoft.Data.Analysis;
var df = new DataFrame(
new PrimitiveDataFrameColumn<DateTime>("DateTime",
Enumerable.Range(0, 5).Select(i => new DateTime(2020, 1, 1).Add(new TimeSpan(i, 0, 0, 0)))),
new PrimitiveDataFrameColumn<int>("Col1", new []{ 10, 20, 15, 30, 45 }),
new PrimitiveDataFrameColumn<int>("Col2", new []{ 13, 23, 18, 33, 48 }),
new PrimitiveDataFrameColumn<int>("Col3", new []{ 17, 27, 22, 37, 52 })
);
df
The result in .NET Interactive:
Question
What's a good way to perform the equivalent column shifts as demonstrated in the Pandas version?
Notes
The above example is from the documentation for pandas.DataFrame.shift:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.shift.html
Update
It does indeed look like there isn't currently a built-in shift in Microsoft.Data.Analysis. I've posted an issue for this here:
https://github.com/dotnet/machinelearning/issues/6008
Helper functions
Perform a column shift.
PrimitiveDataFrameColumn<double> ShiftIntColumn(PrimitiveDataFrameColumn<int> col, int n, string name)
{
return
new PrimitiveDataFrameColumn<double>(
name,
Enumerable.Repeat((double?) null, n)
.Concat(col.Select(item => (double?) item))
.Take(col.Count()));
}
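For comparison, here is a library-free sketch of the same idea that also handles negative shifts, in the way pandas' shift moves values up for negative periods. It works on a plain list of nullable doubles rather than on Microsoft.Data.Analysis types, and the names are hypothetical:

```csharp
using System.Collections.Generic;

public static class ColumnShift
{
    // Hypothetical helper: shifts 'values' by n positions.
    // Positive n moves values toward the end, negative n toward the start.
    // Slots with no source value are left as null.
    public static double?[] Shift(IReadOnlyList<double?> values, int n)
    {
        var result = new double?[values.Count]; // all elements start as null
        for (int i = 0; i < values.Count; i++)
        {
            int source = i - n; // index the value at position i comes from
            if (source >= 0 && source < values.Count)
                result[i] = values[source];
        }
        return result;
    }
}
```

Shifting { 10, 20, 15, 30, 45 } by 2 gives { null, null, 10, 20, 15 }, which matches the Col4 values described above.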
Carry out division, taking care of null values in divisor.
PrimitiveDataFrameColumn<double> DivAlt3(PrimitiveDataFrameColumn<int> a, PrimitiveDataFrameColumn<double> b, string name)
{
return
new PrimitiveDataFrameColumn<double>(name, a.Zip(b, (x, y) => y == null ? null : x / y));
}
Then the following:
var df = new DataFrame(
new PrimitiveDataFrameColumn<DateTime>("DateTime",
Enumerable.Range(0, 5).Select(i =>
new DateTime(2020, 1, 1).Add(new TimeSpan(i, 0, 0, 0)))),
new PrimitiveDataFrameColumn<int>("Col1", new []{ 10, 20, 15, 30, 45 }),
new PrimitiveDataFrameColumn<int>("Col2", new []{ 13, 23, 18, 33, 48 }),
new PrimitiveDataFrameColumn<int>("Col3", new []{ 17, 27, 22, 37, 52 })
);
df.Columns.Add(ShiftIntColumn((PrimitiveDataFrameColumn<int>)df["Col1"], 2, "Col4"));
df.Columns.Add(DivAlt3((PrimitiveDataFrameColumn<int>) df["Col1"], (PrimitiveDataFrameColumn<double>) df["Col4"], "Col5"));
results in:
Complete notebook
See the following notebook for a full demonstration of the above:
https://github.com/dharmatech/dataframe-shift-example-cs/blob/003/dataframe-shift-example-cs.ipynb
Notes
It would be great if Microsoft.Data.Analysis came with column shift functionality.
It would also be great if column division handled nulls natively.
Would love to see other perhaps more idiomatic approaches to this.

C# datetime array

I have two arrays: array1 holds DateTime values at minute resolution from 8 am to 2 pm, and array2 holds hourly DateTime values for the same date from 8 am to 1 pm.
I want to output the index pairs of the two arrays whose entries share the same DateTime.Hour. Every entry of array1 that is later than the last entry of array2 should be paired with the last available index of array2.
For example, if I have two DateTime arrays like this:
DateTime[] dateTimes1 = new DateTime[]
{
new DateTime(2010, 10, 1, 8, 15, 0),
new DateTime(2010, 10, 1, 8, 30, 1),
new DateTime(2010, 10, 1, 8, 45, 2),
new DateTime(2010, 10, 1, 9, 15, 3),
new DateTime(2010, 10, 1, 9, 30, 4),
new DateTime(2010, 10, 1, 9, 45, 5),
new DateTime(2010, 10, 1, 10, 15, 6),
new DateTime(2010, 10, 1, 10, 30, 7),
new DateTime(2010, 10, 1, 10, 45, 8),
new DateTime(2010, 10, 1, 11, 15, 9),
new DateTime(2010, 10, 1, 11, 30, 10),
new DateTime(2010, 10, 1, 11, 45, 11),
new DateTime(2010, 10, 1, 12, 15, 12),
new DateTime(2010, 10, 1, 12, 30, 13),
new DateTime(2010, 10, 1, 12, 45, 14),
new DateTime(2010, 10, 1, 13, 15, 15),
new DateTime(2010, 10, 1, 13, 30, 16),
new DateTime(2010, 10, 1, 13, 45, 17),
new DateTime(2010, 10, 1, 14, 15, 18),
new DateTime(2010, 10, 1, 14, 30, 19),
new DateTime(2010, 10, 1, 14, 45, 20),
};
DateTime[] dateTimes2 = new DateTime[]
{
new DateTime(2010, 10, 1, 8, 0, 0),
new DateTime(2010, 10, 1, 9, 0, 1),
new DateTime(2010, 10, 1, 10, 0, 2),
new DateTime(2010, 10, 1, 11, 0, 3),
new DateTime(2010, 10, 1, 12, 0, 4),
new DateTime(2010, 10, 1, 13, 0, 5),
};
it should give me this output:
0, 0
1, 0
2, 0
3, 1
4, 1
5, 1
6, 2
7, 2
8, 2
9, 3
10, 3
11, 3
12, 4
13, 4
14, 4
15, 5
16, 5
17, 5
18, 5
19, 5
20, 5
This is what I have tried:
int i = 0;
int j = 0;
while (i < dateTimes1.Length && j < dateTimes2.Length)
{
if (dateTimes1[i].Date == dateTimes2[j].Date && dateTimes1[i].Hour == dateTimes2[j].Hour)
{
list.Add(i);
list2.Add(j);
i++;
}
else if (dateTimes1[i] < dateTimes2[j])
{
i++;
}
else if (dateTimes1[i] > dateTimes2[j])
{
j++;
}
}
for (int k = 0; k < list.Count; k++)
{
Console.WriteLine(list[k] + " , " + list2[k]);
}
But it doesn't output the index pairs after 1 pm.
Your two lists are not the same length. In your while statement you are trying to iterate two different length lists at the same time.
If I understand your requirements properly you should be doing something like this by using an inner loop:
DateTime[] dateTimes1 = new DateTime[]
{
new DateTime(2010, 10, 1, 8, 15, 0),
new DateTime(2010, 10, 1, 8, 30, 1),
new DateTime(2010, 10, 1, 8, 45, 2),
new DateTime(2010, 10, 1, 9, 15, 3),
new DateTime(2010, 10, 1, 9, 30, 4),
new DateTime(2010, 10, 1, 9, 45, 5),
new DateTime(2010, 10, 1, 10, 15, 6),
new DateTime(2010, 10, 1, 10, 30, 7),
new DateTime(2010, 10, 1, 10, 45, 8),
new DateTime(2010, 10, 1, 11, 15, 9),
new DateTime(2010, 10, 1, 11, 30, 10),
new DateTime(2010, 10, 1, 11, 45, 11),
new DateTime(2010, 10, 1, 12, 15, 12),
new DateTime(2010, 10, 1, 12, 30, 13),
new DateTime(2010, 10, 1, 12, 45, 14),
new DateTime(2010, 10, 1, 13, 15, 15),
new DateTime(2010, 10, 1, 13, 30, 16),
new DateTime(2010, 10, 1, 13, 45, 17),
new DateTime(2010, 10, 1, 14, 15, 18),
new DateTime(2010, 10, 1, 14, 30, 19),
new DateTime(2010, 10, 1, 14, 45, 20),
};
DateTime[] dateTimes2 = new DateTime[]
{
new DateTime(2010, 10, 1, 8, 0, 0),
new DateTime(2010, 10, 1, 9, 0, 1),
new DateTime(2010, 10, 1, 10, 0, 2),
new DateTime(2010, 10, 1, 11, 0, 3),
new DateTime(2010, 10, 1, 12, 0, 4),
new DateTime(2010, 10, 1, 13, 0, 5),
};
var list = new List<int>();
var list2 = new List<int>();
for (int i = 0; i < dateTimes1.Length; i++)
{
    // Fall back to the last index of dateTimes2 when no hour matches.
    // This assumes unmatched entries are later than everything in dateTimes2.
    int match = dateTimes2.Length - 1;
    for (int j = 0; j < dateTimes2.Length; j++)
    {
        if (dateTimes1[i].Date == dateTimes2[j].Date && dateTimes1[i].Hour == dateTimes2[j].Hour)
        {
            match = j;
            break;
        }
    }
    list.Add(i);
    list2.Add(match);
}
for (int k = 0; k < list.Count; k++)
{
    Console.WriteLine(list[k] + " , " + list2[k]);
}
Here's a pretty basic method using Array.FindIndex and foreach:
EDIT: Updated this answer to handle the "matchup the last available index number of array2 for all of the datetime data from array1 that later than array2." issue.
foreach (DateTime dt in dateTimes1)
{
int currentHour = dt.Hour;
int lastHour = dateTimes2[dateTimes2.GetUpperBound(0)].Hour; //GetUpperBound(0) is the last index
int dt1index = Array.FindIndex(dateTimes1, a => a == dt); //get the index of the current item in dateTimes1
int dt2index = Array.FindIndex(dateTimes2, x => x.Hour == currentHour); //get the index of the item in dateTimes2 matching dateTimes1 hour field
if (currentHour > lastHour)
{
Console.WriteLine("{0}, {1}", dt1index, dateTimes2.GetUpperBound(0));
}
else
{
Console.WriteLine("{0}, {1}", dt1index, dt2index);
}
}
This simply looks at each of the values in dateTimes1 and dateTimes2 and returns the first match it finds (very similar to your loop).
To determine dt1index, we look through dateTimes1 and return the first match where a => a == dt holds (a is the lambda parameter, representing the "current" value in dateTimes1; think of i = 0, 1, 2, etc. in a regular loop).
Similarly, to determine dt2index, we look for the first match on x => x.Hour == currentHour, that is, where the "current" dt's hour field matches the hour field in dateTimes2.
In both cases, the index of the first match is returned; if no match is found, -1 is returned.
When we write to the console, we check whether currentHour is greater than the last hour in dateTimes2. If so, we write the current index of dateTimes1 and the last index of dateTimes2. Otherwise, we write the current index of dateTimes1 and the index where the hour matches in dateTimes2.
Using Linq:
var hour = new TimeSpan(1, 0, 0);
var dt2MaxValue = dateTimes2.Max();
for (int i = 0; i < dateTimes1.Length; i++)
{
var output = string.Format("{0}, {1}",
i,
dateTimes2
.Select((o, index) => new { index = index, value = o })
.Where(dt2 => (dateTimes1[i] - dt2.value) < hour
|| dt2.value == dt2MaxValue)
.Select(dt2 => dt2.index)
.FirstOrDefault());
Console.WriteLine(output);
}
What the above Linq statement does:
The first Select uses that method's overload which also passes the index of the item. This simply allows that info to cascade through. It uses an anonymous object with both index and the collection item being the index and value properties, respectively.
The Where clause filters these anonymous objects by comparing their value with dateTimes1[i]. It keeps every item whose value lies less than 1 hour before dateTimes1[i] (later values pass this test too, but because dateTimes2 is sorted, the earliest passing item is the one we want), as well as the item holding the maximum value in the whole collection.
The second Select simply gets the indexes of the items that Where let through.
And FirstOrDefault returns just that (i.e., the first or default, which is the index of the item selected or 0 if no item was selected).
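Since both arrays are sorted, the pairing can also be done in a single pass. The sketch below uses a slightly different rule than the answers above: it pairs each entry of the first array with the latest entry of the second array that is not later than it. For the sample data, where dateTimes2 has one entry near the start of every hour, that should give the same pairs as matching on the hour, including the trailing 14:xx entries. The type and member names are invented for illustration, and the second array is assumed to be non-empty:

```csharp
using System;
using System.Collections.Generic;

public static class HourMatcher
{
    // Hypothetical helper: for every index of 'fine', finds the index of the
    // last entry in 'coarse' that is not later than it (clamped to 0 when
    // 'fine' starts before 'coarse'). Both arrays must be sorted ascending.
    public static List<(int FineIndex, int CoarseIndex)> Match(DateTime[] fine, DateTime[] coarse)
    {
        var pairs = new List<(int FineIndex, int CoarseIndex)>();
        int j = 0;
        for (int i = 0; i < fine.Length; i++)
        {
            // Advance while the next coarse entry is still at or before fine[i].
            while (j + 1 < coarse.Length && coarse[j + 1] <= fine[i])
                j++;
            pairs.Add((i, j));
        }
        return pairs;
    }
}
```

Because j never moves backwards, the whole pairing takes one pass over each array instead of a search per element.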

Take the first five elements and the last five elements from an array by one query using LINQ

I was recently asked by a co-worker: is it possible to take just the first five elements and the last five elements from an array with one query?
int[] someArray = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 };
What I've tried:
int[] someArray = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 };
var firstFiveResults = someArray.Take(5);
var lastFiveResults = someArray.Skip(someArray.Count() - 5).Take(5);
var result = firstFiveResults;
result = result.Concat(lastFiveResults);
Is it possible to just take the first five elements and the last five elements by one query?
You can use a .Where method with lambda that accepts the element index as its second parameter:
int[] someArray = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 };
int[] newArray = someArray.Where((e, i) => i < 5 || i >= someArray.Length - 5).ToArray();
Console.WriteLine(string.Join(", ", newArray));
Output:
0, 1, 2, 3, 4, 14, 15, 16, 17, 18
A solution with ArraySegment<> (requires .NET 4.5 (2012) or later):
var result = new ArraySegment<int>(someArray, 0, 5)
.Concat(new ArraySegment<int>(someArray, someArray.Length - 5, 5));
And a solution with Enumerable.Range:
var result = Enumerable.Range(0, 5).Concat(Enumerable.Range(someArray.Length - 5, 5))
.Select(idx => someArray[idx]);
Both of these solutions avoid iterating through the "middle" of the array (indices 5 through 13).
In case you are not playing code puzzles with your co-workers, but just want to create a new array matching your criteria, I wouldn't do this with queries at all, but use Array.Copy.
There are three distinct cases to consider:
the source array has fewer than 5 items
the source array has 5 to 9 items
the source array has 10 or more items
The third one is the simple case, as the first and last 5 elements are distinct and well defined.
The other two require more thought. I'm going to assume you want the following, check those assumptions:
If the source array has fewer than 5 items, you will want to have an array of 2 * (array length) items, for example [1, 2, 3] becomes [1, 2, 3, 1, 2, 3]
If the source array has between 5 and 9 items, you will want to have an array of exactly 10 items, for example [1, 2, 3, 4, 5, 6] becomes [1, 2, 3, 4, 5, 2, 3, 4, 5, 6]
A demonstration program is
public static void Main()
{
Console.WriteLine(String.Join(", ", headandtail(new int[]{1, 2, 3})));
Console.WriteLine(String.Join(", ", headandtail(new int[]{1, 2, 3, 4, 5, 6})));
Console.WriteLine(String.Join(", ", headandtail(new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11})));
}
private static T[] headandtail<T>(T[] src) {
int runlen = Math.Min(src.Length, 5);
T[] result = new T[2 * runlen];
Array.Copy(src, 0, result, 0, runlen);
Array.Copy(src, src.Length - runlen, result, result.Length - runlen, runlen);
return result;
}
which runs in O(1) with respect to the source length, since at most ten elements are copied.
If you are playing code puzzles with your co-workers, well all the fun is in the puzzle, isn't it?
It's trivial though.
src.Take(5).Concat(src.Reverse().Take(5).Reverse()).ToArray();
This runs in O(n).
Try this:
var result = someArray.Where((a, i) => i < 5 || i >= someArray.Length - 5);
This should work
someArray.Take(5).Concat(someArray.Skip(someArray.Count() - 5).Take(5));
Try this:
int[] someArray = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 };
var firstFiveResults = someArray.Take(5);
var lastFiveResults = someArray.Reverse().Take(5).Reverse();
var result = firstFiveResults;
result = result.Concat(lastFiveResults);
The second Reverse() reorders the numbers so you won't get 18,17,16,15,14
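Please try this:
var result = someArray.Take(5).Union(someArray.Skip(someArray.Count() - 5).Take(5));
Note that Union removes duplicate values, so the result only matches the Concat versions when the selected elements are all distinct.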
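On C# 8 or later (with a runtime that supports System.Index and System.Range, such as .NET Core 3.0+), array range syntax offers another short option. A minimal sketch, with hypothetical names, that clamps the count the same way as the Array.Copy answer does for short arrays:

```csharp
using System;
using System.Linq;

public static class HeadTail
{
    // Hypothetical helper: returns the first and last 'count' elements.
    // For arrays shorter than 'count' the whole array is used for both
    // halves, so short inputs produce overlapping output.
    public static int[] FirstAndLast(int[] source, int count)
    {
        int n = Math.Min(count, source.Length);
        // source[..n] copies the first n elements, source[^n..] the last n.
        return source[..n].Concat(source[^n..]).ToArray();
    }
}
```

Like the ArraySegment and Enumerable.Range answers, this only touches the head and tail of the array.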

How to find a dense region in a List<int>

Hi, I need to find the largest dense region in a list of values, based on a given range.
Example:
var radius = 5; // term edited
var list = new List<int>{0,1,2,3,4,5,12,15,16,22,23,24,26,27,28,29};
//the following dense regions exist in the list above
var region1 = new List<int> { 0, 1, 2, 3, 4, 5 }; // exist 6 times (0, 1, 2, 3, 4, 5)
var region2 = new List<int> { 12, 15, 16}; // exist 3 times (12, 15, 16)
var region3 = new List<int> { 22, 23, 24, 26, 27}; // exist 1 times (22)
var region4 = new List<int> { 22, 23, 24, 26, 27, 28}; // exist 1 times (23)
var region5 = new List<int> { 22, 23, 24, 26, 27, 28, 29 }; // exist 3 times (24, 26, 27)
var region6 = new List<int> { 23, 24, 26, 27, 28, 29 }; // exist 1 times (28)
var region7 = new List<int> { 24, 26, 27, 28, 29 }; // exist 1 times (29)
//var result{22,23,24,26,27,28,29}
The solution doesn't really need to be fast because the maximum number of values is 21.
Is there a way to achieve this with fluent LINQ syntax?
I only know how to get the closest value
int closest = list.Aggregate((x,y) => Math.Abs(x-number) < Math.Abs(y-number) ? x : y);
and how to get the values between 2 numbers
var between = list.Where(value=> min < value && value < max);
Edit
Additional information
OK, "range" is maybe the wrong term; "radius" would be a better word.
I define the dense region as the largest set of values lying between currentValue - radius and currentValue + radius, taken over all values in the list.
A rather cryptic (but short) way would be:
int w = 5; // window size
var list = new List<int> { 0, 1, 2, 3, 4, 5, 12, 15, 16, 22,
23, 24, 26, 27, 28, 29 };
var result = list.Select(x => list.Where(y => y >= x - w && y <= x + w))
.Aggregate((a, b) => (a.Count() > b.Count()) ? a : b);
Console.WriteLine(string.Join(",", result.ToArray()));
Prints
22,23,24,26,27,28,29
This code consists of 3 steps:
For a given x the snippet list.Where(y => y >= x - w && y <= x + w) gives all elements from the list that are in the cluster around x.
list.Select(x => ...) computes that cluster for every element of the list.
.Aggregate((a, b) => (a.Count() > b.Count()) ? a : b) takes the cluster of maximum size.
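The three steps can also be written as an explicit loop that materializes each cluster once, which avoids re-running the lazy Where query every time Count() is called. This is only a sketch with made-up names; on ties it keeps the first cluster found, which may differ from what the Aggregate version picks:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DenseRegion
{
    // Hypothetical helper: returns the largest group of values lying within
    // 'radius' of some element of 'values'. Assumes the differences between
    // values do not overflow int.
    public static List<int> Largest(IReadOnlyList<int> values, int radius)
    {
        var best = new List<int>();
        foreach (int center in values)
        {
            // All values inside [center - radius, center + radius].
            var cluster = values.Where(v => Math.Abs(v - center) <= radius).ToList();
            if (cluster.Count > best.Count)
                best = cluster;
        }
        return best;
    }
}
```

With the list from the question and a radius of 5 this returns 22, 23, 24, 26, 27, 28, 29.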

How to add a value to the various position of a int list?

For example
List contains integer values 34, 78, 20, 10, 17, 99, 101, 24, 50, 13
and the value to put is 11 at position 1, 4 and 5
Position is the index value which starts from 0
so the final result is => 34, 11, 78, 20, 10, 11, 17, 11, 99, 101, 24, 50, 13
My current code is as follows:
List<int> list_iNumbers = new List<int>();
list_iNumbers.Add(34);
list_iNumbers.Add(78);
list_iNumbers.Add(20);
list_iNumbers.Add(10);
list_iNumbers.Add(17);
list_iNumbers.Add(99);
list_iNumbers.Add(101);
list_iNumbers.Add(24);
list_iNumbers.Add(50);
list_iNumbers.Add(13);
List<int> list_iPosition = new List<int>();
list_iPosition.Add(1);
list_iPosition.Add(4);
list_iPosition.Add(5);
int iValueToInsert = 11;
Now, how do I insert the value at these positions and get the correct result?
Use the Insert(index, element) method instead of Add. Something like this:
foreach(var pos in list_iPosition.OrderByDescending(x => x))
list_iNumbers.Insert(pos, iValueToInsert);
You have to insert starting from the highest index so that earlier insertions don't shift the later positions. That's why I used OrderByDescending first.
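Non-LINQ solution:
list_iPosition.Sort();
for (int i = 0; i < list_iPosition.Count; i++)
{
    // Each earlier insertion shifts the remaining positions right by one.
    list_iNumbers.Insert(list_iPosition[i] + i, iValueToInsert);
}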
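Each List<T>.Insert call shifts all later elements, so for long lists it may be cheaper to build a new list in one pass. A rough sketch of that idea follows; the names are hypothetical, positions refer to indexes in the original list, and repeated positions are treated as one:

```csharp
using System.Collections.Generic;

public static class ListInsertion
{
    // Hypothetical helper: returns a copy of 'source' with 'value' placed
    // before every original index listed in 'positions'.
    public static List<int> InsertAt(IReadOnlyList<int> source, IEnumerable<int> positions, int value)
    {
        var targets = new HashSet<int>(positions);
        var result = new List<int>(source.Count + targets.Count);
        for (int i = 0; i < source.Count; i++)
        {
            if (targets.Contains(i))
                result.Add(value); // goes in front of the original element i
            result.Add(source[i]);
        }
        // A position equal to source.Count appends the value at the end.
        if (targets.Contains(source.Count))
            result.Add(value);
        return result;
    }
}
```

For the numbers and positions in the question this gives 34, 11, 78, 20, 10, 11, 17, 11, 99, 101, 24, 50, 13.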
