C#: is there an appropriate collection for fast range-related search?

I have data like this:
Time(seconds from start) Value
15 2
16 4
19 2
25 9
There are a lot of entries (10,000+), and I need a fast way to find the sum over any time range, for example the range 16-25 seconds (which would be 4+2+9=15). The data changes dynamically many times, always by adding new entries at the bottom of the list.
I am thinking about using a sorted list plus binary search to determine the positions and then just summing the values, but that can take too much time to calculate. Is there a more appropriate way to do this? NuGet packages or algorithm references would be appreciated.

Just calculate cumulative sum:
Time Value CumulativeSum
15 2 2
16 4 6
19 2 8
25 9 17
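Then for the range [16, 25], binary search for the left border (the first entry with time >= 16) and the right border (the last entry with time <= 25). The answer is the cumulative sum at the right border minus the cumulative sum just before the left border: 17 - 2 = 15.
Complexity: O(log(n)) per query, where n is the size of the list. Appending a new entry is O(1), since its cumulative sum is the previous cumulative sum plus the new value.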
Binary search implementation for lower/upper bound can be found in my repo - https://github.com/eocron/Algorithm/blob/master/Algorithm/Sorted/BinarySearchExtensions.cs
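For illustration, here is a minimal sketch of the cumulative-sum idea, assuming times are appended in non-decreasing order. The class name RangeSumSeries and its members are hypothetical and are not taken from the linked repo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: append-only series with prefix sums for O(log n) range sums.
public class RangeSumSeries
{
    private readonly List<int> _times = new List<int>();     // non-decreasing
    private readonly List<long> _prefix = new List<long>();  // _prefix[i] = sum of values 0..i

    public void Add(int time, long value)
    {
        if (_times.Count > 0 && time < _times[_times.Count - 1])
            throw new ArgumentException("Times must be appended in non-decreasing order.");
        long previous = _prefix.Count > 0 ? _prefix[_prefix.Count - 1] : 0;
        _times.Add(time);
        _prefix.Add(previous + value);
    }

    // Sum of all values whose time lies in [from, to], both ends inclusive.
    public long Sum(int from, int to)
    {
        int left = FirstIndexWhere(t => t >= from);     // lower bound
        int right = FirstIndexWhere(t => t > to) - 1;   // upper bound minus one
        if (left > right) return 0;
        long before = left > 0 ? _prefix[left - 1] : 0;
        return _prefix[right] - before;
    }

    // Binary search for the first index whose time satisfies a monotone predicate.
    private int FirstIndexWhere(Func<int, bool> predicate)
    {
        int lo = 0, hi = _times.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (predicate(_times[mid])) hi = mid; else lo = mid + 1;
        }
        return lo;
    }
}
```

With the four sample rows added in order, Sum(16, 25) subtracts the prefix sum before time 16 (which is 2) from the prefix sum at time 25 (which is 17).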

Related

Algorithm to divide a list of numbers and summed up >= X million

I have 9 numbers that I want to divide into two lists, and both lists need to reach a certain amount when summed up. For example, I have a list of ints:
List<int> test = new List<int>
{
    1963000, 1963000, 393000, 86000,
    393000, 393000, 176000, 420000,
    3193000
};
And I want to have 2 lists of numbers that each sum to over 4 million.
It doesn't matter if the 2 lists don't contain the same number of items. If one list needs only 2 numbers to reach 4 million and the other has 7 numbers that together reach 7 million, that is fine.
As long as both lists sum to 4 million or higher.
Is this target sum low enough to be reached easily?
If yes, then your algorithm may be as simple as: iterate i from 1 to the number of items and sum up the first i numbers. As soon as the sum reaches your target (e.g. 4 million), the first list is finished; the remaining numbers form the second list, and you only need to check that they also reach the target.
BUT: if your target sums are high and finding the partition is not that trivial, then you have the famous Partition Problem (https://en.wikipedia.org/wiki/Partition_problem). It is not that simple, but there are some algorithms. Read the Wikipedia article or search for "Partition problem solution" or similar.
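As a rough sketch of the simple case only: fill the first list until it reaches the target, then put everything else in the second list. The GreedySplit class and TrySplit method are hypothetical names. A false result only means this particular greedy split failed, not that no valid split exists.

```csharp
using System.Collections.Generic;

public static class GreedySplit
{
    // Fills 'first' until it reaches the target; all remaining numbers go to 'second'.
    // Returns true only if both lists end up at or above the target.
    public static bool TrySplit(IReadOnlyList<long> numbers, long target,
                                out List<long> first, out List<long> second)
    {
        first = new List<long>();
        second = new List<long>();
        long firstSum = 0, secondSum = 0;
        foreach (long n in numbers)
        {
            if (firstSum < target) { first.Add(n); firstSum += n; }
            else { second.Add(n); secondSum += n; }
        }
        return firstSum >= target && secondSum >= target;
    }
}
```

For the nine numbers in the question and a target of 4,000,000, the first three numbers sum to 4,319,000 and the remaining six sum to 4,661,000, so this simple split is enough.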

Find key or value in Dictionary<double, double> with linear interpolation

I am developing an application with .NET 4.5 / C# 6.0 which gets measurement data from a hardware device that is also configured by my application. E.g. I configure my device with the following:
start frequency: 10 kHz
stop frequency: 20 kHz
measurement points: 11
and at the end of the processing pipeline of my raw data I get a dictionary like the following, where the key is the frequency and the value e.g. the magnitude in dB:
key => value
10k => -3
11k => -3
12k => -3
13k => -3.5
14k => -4
15k => -5
16k => -6
17k => -7
18k => -8
19k => -10
20k => -12
This dictionary is updated "on the fly" as the device is continuously sweeping these 11 points in a loop.
For one, these values are displayed in a chart, which is no problem as I simply update the chart data whenever a new point is ready (I get an event for each new point). But I also have a data grid where the user can display the values for manually entered points ([value] means manually entered), e.g.:
| A: f | B: mag(db)
Point 1 | [11 kHz] | -3 dB
Point 2 | [16.5 kHz] | -6.5 dB
Point 3 | 18.5 kHz | [-9.5 dB]
Point 1 is easy as it falls exactly on a measured point, but for e.g. Point 2 the user manually enters 16.5 kHz in column A, which means I need to interpolate the value of column B from the two measurement points next to it.
Point 3 is about the same, but the other way around: the user manually entered -9.5 dB in column B, so I need to find the interpolated frequency at which this value would be the interpolated result.
Note - the following constraints apply for Point 3:
If the value occurs twice or more, the first occurrence is used, e.g. for -3 dB it should return 10 kHz
The frequency for the entered value is only searched once and then it behaves the same as Point 2
If the given value is not found, the closest one is returned (also valid for Point 2), e.g. values > -3 dB return 10 kHz and values < -12 dB return 20 kHz
Is there some fast/optimized way to get to these values for Point 2 and Point 3?
I could only think of iterating over all points every time a value is manually entered and interpolating between every two points until the given value is found, then updating column B whenever one of the neighboring frequencies is updated.
Note: The device delivers up to several hundred points per second.
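To make the two lookups concrete, here is a hedged sketch that assumes the sweep is kept as two parallel arrays sorted by frequency instead of a Dictionary. SweepInterpolation, ValueAt and FirstFrequencyFor are hypothetical names. Point 2 uses a binary search on the frequencies; Point 3 scans the segments in order, so the first match wins, and falls back to the closest measured value.

```csharp
using System;

public static class SweepInterpolation
{
    // Point 2: magnitude at frequency f; freqs must be sorted ascending.
    // Frequencies outside the sweep are clamped to the nearest end point.
    public static double ValueAt(double[] freqs, double[] mags, double f)
    {
        int last = freqs.Length - 1;
        if (f <= freqs[0]) return mags[0];
        if (f >= freqs[last]) return mags[last];
        int i = Array.BinarySearch(freqs, f);
        if (i >= 0) return mags[i];           // exact hit on a measured point
        int hi = ~i, lo = hi - 1;             // neighbours around f
        double t = (f - freqs[lo]) / (freqs[hi] - freqs[lo]);
        return mags[lo] + t * (mags[hi] - mags[lo]);
    }

    // Point 3: first frequency at which the piecewise-linear curve equals m.
    public static double FirstFrequencyFor(double[] freqs, double[] mags, double m)
    {
        for (int i = 0; i < mags.Length; i++)
        {
            if (mags[i] == m) return freqs[i];
            if (i + 1 < mags.Length)
            {
                double a = mags[i], b = mags[i + 1];
                if ((a < m && m < b) || (b < m && m < a))
                {
                    double t = (m - a) / (b - a);
                    return freqs[i] + t * (freqs[i + 1] - freqs[i]);
                }
            }
        }
        // Value never reached: return the frequency of the closest measured value.
        int closest = 0;
        for (int i = 1; i < mags.Length; i++)
            if (Math.Abs(mags[i] - m) < Math.Abs(mags[closest] - m)) closest = i;
        return freqs[closest];
    }
}
```

With the 11-point table above, ValueAt gives -6.5 dB at 16.5 kHz. Under straight linear interpolation, -9 dB lands at 18.5 kHz (between the -8 dB and -10 dB points). The inverse lookup is O(n) per query, which is cheap for 11 points but is only needed once per manually entered value.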

How is an integer stored in memory?

This is most probably the dumbest question anyone would ask, but regardless I hope I will find a clear answer for this.
My question is - How is an integer stored in computer memory?
In C# an integer is 32 bits in size. MSDN says we can store numbers from -2,147,483,648 to 2,147,483,647 inside an integer variable.
As per my understanding a bit can store only 2 values i.e 0 & 1. If I can store only 0 or 1 in a bit, how will I be able to store numbers 2 to 9 inside a bit?
More precisely, say I have this code int x = 5; How will this be represented in memory or in other words how is 5 converted into 0's and 1's, and what is the convention behind it?
It's represented in binary (base 2). Read more about number bases. In base 2 you only need 2 different symbols to represent a number. We usually use the symbols 0 and 1. In our usual base we use 10 different symbols to represent all the numbers, 0, 1, 2, ... 8, and 9.
For comparison, think about a number that doesn't fit in our usual system. Like 14. We don't have a symbol for 14, so how do we represent it? Easy, we just combine two of our symbols, 1 and 4. 14 in base 10 means 1*10^1 + 4*10^0.
1110 in base 2 (binary) means 1*2^3 + 1*2^2 + 1*2^1 + 0*2^0 = 8 + 4 + 2 + 0 = 14. So despite not having enough symbols in either base to represent 14 with a single symbol, we can still represent it in both bases.
In another commonly used base, base 16, which is also known as hexadecimal, we have enough symbols to represent 14 using only one of them. You'll usually see 14 written using the symbol e in hexadecimal.
For negative integers we use a convenient representation called two's complement: to negate a number, take its complement (all 1s flipped to 0s and all 0s flipped to 1s) and add one to it.
There are two main reasons this is so convenient:
We know immediately whether a number is positive or negative by looking at a single bit, the most significant bit of the 32 we use.
It's mathematically correct in that x - y = x + -y using regular addition the same way you learnt in grade school. This means that processors don't need to do anything special to implement subtraction if they already have addition. They can simply find the two's complement of y (recall: flip the bits and add one) and then add x and that result using the addition circuit they already have, rather than having a special circuit for subtraction.
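A small sketch of "flip the bits and add one" in C#. The helper names are made up for illustration; Convert.ToString(value, 2) is the standard way to get the binary digits of an int.

```csharp
using System;

public static class TwosComplementDemo
{
    // Negate by flipping every bit and adding one.
    public static int Negate(int y) => unchecked(~y + 1);

    // Subtraction done with nothing but negation and addition.
    public static int Subtract(int x, int y) => unchecked(x + Negate(y));

    // All 32 bits of an int, most significant bit first.
    public static string Bits(int value) => Convert.ToString(value, 2).PadLeft(32, '0');
}
```

For example, Bits(5) ends in 101 with 29 leading zeros, Negate(5) is -5, and Bits(-1) is 32 ones.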
This is not a dumb question at all.
Let's start with uint because it's slightly easier. The convention is:
You have 32 bits in a uint. Each bit is assigned a number ranging from 0 to 31. By convention the rightmost bit is 0 and the leftmost bit is 31.
Take each bit number and raise 2 to that power, and then multiply it by the value of the bit. So if bit number three is one, that's 1 x 2^3. If bit number twelve is zero, that's 0 x 2^12.
Add up all those numbers. That's the value.
So five would be 00000000000000000000000000000101, because 5 = 1 x 2^0 + 0 x 2^1 + 1 x 2^2 + ... the rest are all zero.
That's a uint. The convention for ints is:
Compute the value as a uint.
If the value is greater than or equal to 0 and strictly less than 2^31 then you're done. The int and uint values are the same.
Otherwise, subtract 2^32 from the uint value and that's the int value.
This might seem like an odd convention. We use it because it turns out that it is easy to build chips that perform arithmetic in this format extremely quickly.
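The convention can be checked with a few lines of C#. This is only an illustrative sketch; IntLayoutDemo and its methods are hypothetical names. It uses long for the intermediate arithmetic so that 2^32 fits.

```csharp
using System;

public static class IntLayoutDemo
{
    // Adds up bit * 2^position; the rightmost character of the string is bit 0.
    public static uint ValueFromBits(string bits)
    {
        uint value = 0;
        for (int i = 0; i < bits.Length; i++)
        {
            int position = bits.Length - 1 - i;
            if (bits[i] == '1') value += 1u << position;
        }
        return value;
    }

    // The int convention: values below 2^31 stay as they are,
    // everything else has 2^32 subtracted.
    public static long AsSigned(uint u) =>
        u < (1L << 31) ? (long)u : (long)u - (1L << 32);
}
```

AsSigned agrees with the built-in reinterpretation unchecked((int)u) for every uint, for example uint.MaxValue maps to -1.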
Binary works as follows (for your 32 bits):
1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1
2^ 31 30 29 28 | 27 26 25 24 | 23 22 21 20 | 19 18 17 16 | ... | 3 2 1 0
The leftmost bit (the 2^31 position) is the sign bit: if it is 1 the number is negative, if it is 0 the number is positive.
So the highest number is 0111...1 (all ones except the sign bit), which is 2^30 + 2^29 + 2^28 + ... + 2^1 + 2^0 or 2,147,483,647.
The lowest is 1000...0, meaning -2^31 or -2,147,483,648.
Is this what high level languages lead to!? Eeek!
As other people have said it's a base 2 counting system. Humans are naturally base 10 counters mostly, though time for some reason is base 60, and 6 x 9 = 42 in base 13. Alan Turing was apparently adept at base 17 mental arithmetic.
Computers operate in base 2 because it's easy for the electronics to be either on or off, representing 1 and 0, which is all you need for base 2. You could build the electronics in such a way that it was on, off or somewhere in between. That'd be 3 states, allowing you to do ternary math (as opposed to binary math). However, the reliability is reduced because it's harder to tell the difference between those three states, and the electronics are much more complicated. Even more levels lead to worse reliability.
Despite that, it is done in multi-level cell flash memory. In these, each memory cell represents on, off and a number of intermediate values. This improves the capacity (each cell can store several bits), but it is bad news for reliability. This sort of chip is used in solid state drives, and these operate on the very edge of total unreliability in order to maximise capacity.

Finding a maximum weight clique in a weighted graph C# implementation

Is there a freely available implementation of finding a maximum weight clique in weighted graph in C#?
You could read the paper "A fast algorithm for the maximum clique problem", which proposes an effective maximum clique algorithm. In addition, a maximum weighted clique algorithm can be found in "A new algorithm for the maximum weighted clique problem". Here is the pseudocode:
 1 FUNCTION CLIQUE(U, size)
 2     if |U| = 0 then
 3         if size > max then
 4             max ← size
 5             New record; save it.
 6             found ← true
 7         end
 8         return
 9     end
10     while U ≠ ∅ do
11         if size + weight(U) <= max then
12             return
13         end
14         i ← min{ j | vj ∈ U }
15         if size + c[i] <= max then
16             return
17         end
18         U ← U ∖ {vi}
19         CLIQUE(U ∩ N(vi), size + weight(vi))
20         if found = true then
21             return
22         end
23     end
24     return
25 FUNCTION NEW()
26     max ← 0
27     for i ← n downto 1 do
28         found ← false
29         CLIQUE(Si ∩ N(vi), weight(vi))
30         c[i] ← max
31     end
32     return
We assume Si denotes the vertices with index i or larger, i.e. {vi, vi+1, ..., vn}. N(vi) is the set of vertices adjacent to vi. The global variable max holds the largest clique size (total weight) found so far, and the global variable found marks whether a larger clique has just been found. The array c[] records the maximum clique size within Si, and size is the size of the clique built so far in the current recursion.
Several pruning strategies avoid useless search, in particular those in line 11 and line 15.
You could use a hash table to implement this algorithm.
Finding a maximum clique is an NP-hard problem. You can find something useful in Clique problem (Wikipedia).
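Here is a compact C# sketch along the lines of the pseudocode, not a tuned implementation. Assumptions: vertices are numbered from 0, weights are positive, and only the upper triangle adjacency[i, j] with i < j is read. It keeps the two bounds from line 11 and line 15 but leaves out the found shortcut, which is safe to drop since it only affects speed. The class name MaxWeightClique is made up.

```csharp
using System;
using System.Collections.Generic;

public class MaxWeightClique
{
    private readonly bool[,] _adj;
    private readonly long[] _w;
    private readonly long[] _c;   // _c[i] = best clique weight within vertices i..n-1
    private long _max;
    private List<int> _best = new List<int>();
    private readonly List<int> _current = new List<int>();

    public MaxWeightClique(bool[,] adjacency, long[] weights)
    {
        _adj = adjacency;
        _w = weights;
        _c = new long[weights.Length];
    }

    public (long Weight, List<int> Vertices) Solve()
    {
        int n = _w.Length;
        _max = 0;
        for (int i = n - 1; i >= 0; i--)
        {
            var candidates = new List<int>();   // neighbours of i with a larger index
            for (int j = i + 1; j < n; j++)
                if (_adj[i, j]) candidates.Add(j);
            _current.Add(i);
            Clique(candidates, _w[i]);
            _current.RemoveAt(_current.Count - 1);
            _c[i] = _max;
        }
        return (_max, _best);
    }

    // u is sorted ascending; size is the weight of the clique built so far.
    private void Clique(List<int> u, long size)
    {
        if (u.Count == 0)
        {
            if (size > _max) { _max = size; _best = new List<int>(_current); }
            return;
        }
        long remaining = 0;
        foreach (int v in u) remaining += _w[v];
        for (int start = 0; start < u.Count; start++)
        {
            if (size + remaining <= _max) return;   // bound from line 11
            int i = u[start];
            if (size + _c[i] <= _max) return;       // bound from line 15
            remaining -= _w[i];
            var next = new List<int>();
            for (int k = start + 1; k < u.Count; k++)
                if (_adj[i, u[k]]) next.Add(u[k]);
            _current.Add(i);
            Clique(next, size + _w[i]);
            _current.RemoveAt(_current.Count - 1);
        }
    }
}
```

Usage: fill a bool[n, n] matrix with the edges, pass it with the weight array to the constructor, and call Solve() once to get the best weight and the vertex indices of one clique that achieves it.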

How to manage AI actions based on percentages

I have been looking for some time now at how a programmer can simulate an AI decision based on percentages of actions, for Final Fantasy Tactics-like games (strategy games).
Say for example that the AI character has the following actions:
Attack 1: 10%
Attack 2: 9%
Magic : 4%
Move : 1%
All of this is far from adding up to 100%.
Now at first I thought about having an array with 100 empty slots: Attack 1 would have 10 slots, Attack 2 would have 9 slots, and so on. Combined with a random number, I could then get the action to perform. My problem here is that it is not really efficient, or doesn't seem to be. Also, importantly, what do I do if I land on an empty slot? Do I have to calculate all actions for each character based on 100%, or maybe define a "default" action for everyone?
Or maybe there is a more efficient way to look at all of this? I think that percentages are the easiest way to implement an AI.
The best answer I can come up with is to make a list of all the possible moves you want the character to have, give each a relative value, then scale all of them to total 100%.
EDIT:
For example, here are three moves I have. I want attack and magic to be equally likely, and fleeing to be half as likely as attacking or using magic:
attack = 20
magic = 20
flee = 10
This adds up to 50, so dividing each by this total gives me a fractional value (multiply by 100 for percentage):
attack = 0.4
magic = 0.4
flee = 0.2
Then, I would make from this a list of cumulative values (i.e. each entry is a sum of that entry and all that came before it):
attack = 0.4
magic = 0.8
flee = 1
Now, generate a random number between 0 and 1 and find the first entry in the list that is greater than or equal to that number. That is the move you make.
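A minimal sketch of the cumulative-list approach, assuming the moves and their relative values come in as two parallel arrays. WeightedPicker is a hypothetical name. The roll is passed in explicitly so the lookup is easy to test; the second overload draws it from System.Random.

```csharp
using System;

public class WeightedPicker
{
    private readonly string[] _names;
    private readonly double[] _cumulative;   // running total divided by the grand total

    public WeightedPicker(string[] names, double[] weights)
    {
        double total = 0;
        foreach (double w in weights) total += w;
        _names = names;
        _cumulative = new double[weights.Length];
        double running = 0;
        for (int i = 0; i < weights.Length; i++)
        {
            running += weights[i];
            _cumulative[i] = running / total;
        }
    }

    // roll is expected to be in [0, 1]; returns the first entry whose
    // cumulative value is greater than or equal to the roll.
    public string Pick(double roll)
    {
        for (int i = 0; i < _cumulative.Length; i++)
            if (_cumulative[i] >= roll) return _names[i];
        return _names[_names.Length - 1];
    }

    public string Pick(Random rng) => Pick(rng.NextDouble());
}
```

With the values 20, 20 and 10 from the example, the cumulative list is 0.4, 0.8, 1, so a roll of 0.5 picks magic. For a long list of moves the linear scan could be replaced by a binary search over the cumulative array.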
No, you just create thresholds. One simple way is:
0 - 9 -> Attack 1
10 - 18 -> Attack 2
19 - 22 -> Magic
23 -> Move
24 - 99 -> Something else (you need to add up to 100)
Now create a random number and mod it by 100 (so num = randomNumber % 100) to define your action. The better the random number generator, the closer you will get to the intended distribution. Then take the result and see which category it falls into. You can make this even more efficient, but it is a good start.
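The threshold table might look like this in C#. It is a sketch only: the name "Default" for the 24-99 band is an assumption, standing in for whatever fallback action you choose. Random.Next(100) already returns a value from 0 to 99, so no separate modulo step is needed.

```csharp
using System;

public static class PercentTable
{
    // roll is expected to be in 0..99; each band is as wide as the action's percentage.
    public static string Choose(int roll)
    {
        if (roll < 10) return "Attack 1";   // 0 - 9   (10%)
        if (roll < 19) return "Attack 2";   // 10 - 18 (9%)
        if (roll < 23) return "Magic";      // 19 - 22 (4%)
        if (roll < 24) return "Move";       // 23      (1%)
        return "Default";                   // 24 - 99 (remaining 76%)
    }

    public static string Choose(Random rng) => Choose(rng.Next(100));
}
```

This also answers the "empty slot" question: the unassigned 76% simply becomes one more band with its own action.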
Well, if they don't all add up to 100 they aren't really percentages. This doesn't matter, though. You just need to figure out the relative probability of each action. To do this, use the following formula:
prob = value_of_action / total_value_of_all_actions
This gives you a number between 0 and 1. If you really want a percentage rather than a fraction, multiply it by 100.
Here is an example:
prob_attack = 10 / (10 + 9 + 4 + 1)
= 10 / 24
= 0.4167
This equates to attack being chosen 41.67% of the time.
You can then generate thresholds as mentioned in the other answers, and use a random number between 0 and 1 to choose your action.
