I have the following:
public class Interval
{
DateTime Start;
DateTime End;
}
I have a List<Interval> object containing multiple intervals. I am trying to achieve the following (I used numbers to make it easy to understand):
[(1, 5), (2, 4), (3, 6)] ---> [(1,6)]
[(1, 3), (2, 4), (5, 8)] ---> [(1, 4), (5,8)]
I currently do this in Python as follows:
def merge(times):
saved = list(times[0])
for st, en in sorted([sorted(t) for t in times]):
if st <= saved[1]:
saved[1] = max(saved[1], en)
else:
yield tuple(saved)
saved[0] = st
saved[1] = en
yield tuple(saved)
but am trying to achieve the same in C# (LINQ would be best but optional). Any suggestions on how to do this efficiently?
Here's a version using yield return - I find it easier to read than doing an Aggregate query, although it's still lazy evaluated. This assumes you've ordered the list already, if not, just add that step.
IEnumerable<Interval> MergeOverlappingIntervals(IEnumerable<Interval> intervals)
{
var accumulator = intervals.First();
intervals = intervals.Skip(1);
foreach(var interval in intervals)
{
if ( interval.Start <= accumulator.End )
{
accumulator = Combine(accumulator, interval);
}
else
{
yield return accumulator;
accumulator = interval;
}
}
yield return accumulator;
}
Interval Combine(Interval start, Interval end)
{
return new Interval
{
Start = start.Start,
End = Max(start.End, end.End),
};
}
private static DateTime Max(DateTime left, DateTime right)
{
return (left > right) ? left : right;
}
I was beset by "Not Created Here" syndrome tonight, so here's mine. Using an Enumerator directly saved me a couple lines of code, made it clearer (IMO), and handled the case with no records. I suppose it might run a smidge faster as well if you care about that...
public IEnumerable<Tuple<DateTime, DateTime>> Merge(IEnumerable<Tuple<DateTime, DateTime>> ranges)
{
DateTime extentStart, extentEnd;
using (var enumerator = ranges.OrderBy(r => r.Item1).GetEnumerator()) {
bool recordsRemain = enumerator.MoveNext();
while (recordsRemain)
{
extentStart = enumerator.Current.Item1;
extentEnd = enumerator.Current.Item2;
while ((recordsRemain = enumerator.MoveNext()) && enumerator.Current.Item1 < extentEnd)
{
if (enumerator.Current.Item2 > extentEnd)
{
extentEnd = enumerator.Current.Item2;
}
}
yield return Tuple.Create(extentStart, extentEnd);
}
}
}
In my own implementation, I use a TimeRange type to store each Tuple<DateTime, DateTime>, as other here do. I didn't include that here simply to stay focused / on-topic.
This may not be the prettiest solution, but it may work as well
public static List<Interval> Merge(List<Interval> intervals)
{
var mergedIntervals = new List<Interval>();
var orderedIntervals = intervals.OrderBy<Interval, DateTime>(x => x.Start).ToList<Interval>();
DateTime start = orderedIntervals.First().Start;
DateTime end = orderedIntervals.First().End;
Interval currentInterval;
for (int i = 1; i < orderedIntervals.Count; i++)
{
currentInterval = orderedIntervals[i];
if (currentInterval.Start < end)
{
end = currentInterval.End;
}
else
{
mergedIntervals.Add(new Interval()
{
Start = start,
End = end
});
start = currentInterval.Start;
end = currentInterval.End;
}
}
mergedIntervals.Add(new Interval()
{
Start = start,
End = end
});
return mergedIntervals;
}
Any feedback will be appreciated.
Regards
This kind of merging would typically be considered as a fold in functional languages. The LINQ equivalent is Aggregate.
IEnumerable<Interval<T>> Merge<T>(IEnumerable<Interval<T>> intervals)
where T : IComparable<T>
{
//error check parameters
var ret = new List<Interval<T>>(intervals);
int lastCount
do
{
lastCount = ret.Count;
ret = ret.Aggregate(new List<Interval<T>>(),
(agg, cur) =>
{
for (int i = 0; i < agg.Count; i++)
{
var a = agg[i];
if (a.Contains(cur.Start))
{
if (a.End.CompareTo(cur.End) <= 0)
{
agg[i] = new Interval<T>(a.Start, cur.End);
}
return agg;
}
else if (a.Contains(cur.End))
{
if (a.Start.CompareTo(cur.Start) >= 0)
{
agg[i] = new Interval<T>(cur.Start, a.End);
}
return agg;
}
}
agg.Add(cur);
return agg;
});
} while (ret.Count != lastCount);
return ret;
}
I made the Interval class generic (Interval<T> where T : IComparable<T>), added a bool Contains(T value) method, and made it immutable, but you should not need to change it much if you want to use the class definition as you have it now.
I used TimeRange as a container storing the ranges:
public class TimeRange
{
public TimeRange(DateTime s, DateTime e) { start = s; end = e; }
public DateTime start;
public DateTime end;
}
It divides the problem in combining two time ranges. Therefor, the current time range (work) is matched with the time ranges previously merged. If one of the previously added time ranges is outdated, it is dropped and the new time range (combined from work and the matching time range) is used.
The cases I figured out for two ranges () and [] are as follows:
[] ()
([])
[(])
[()]
([)]
()[]
public static IEnumerable<TimeRange> Merge(IEnumerable<TimeRange> timeRanges)
{
List<TimeRange> mergedData = new List<TimeRange>();
foreach (var work in timeRanges)
{
Debug.Assert(work.start <= work.end, "start date has to be smaller or equal to end date to be a valid TimeRange");
var tr = new TimeRange(work.start, work.end);
int idx = -1;
for (int i = 0; i < mergedData.Count; i++)
{
if (tr.start < mergedData[i].start)
{
if (tr.end < mergedData[i].start)
continue;
if (tr.end < mergedData[i].end)
tr.end = mergedData[i].end;
}
else if (tr.start < mergedData[i].end)
{
tr.start = mergedData[i].start;
if (tr.end < mergedData[i].end)
tr.end = mergedData[i].end;
}
else
continue;
idx = i;
mergedData.RemoveAt(i);
i--;
}
if (idx < 0)
idx = mergedData.Count;
mergedData.Insert(idx, tr);
}
return mergedData;
}
Related
For my problem I have a list of a count larger then 6+. From that list I would like to make a list containing every possible combination of the original cards that is exactly 6 cards long. (they have to be unique and order doesn't matter)
so object
01,02,03,04,05,06
is the same for me as
06,05,04,03,02,01
//STARTER list with more then 6 value's
List < ClassicCard > lowCardsToRemove = FrenchTarotUtil.checkCountLowCardForDiscardChien(handCards);
The solution i found and used:
public static List generateAllSubsetCombinations(object[] fullSet, ulong subsetSize) {
if (fullSet == null) {
throw new ArgumentException("Value cannot be null.", "fullSet");
}
else if (subsetSize < 1) {
throw new ArgumentException("Subset size must be 1 or greater.", "subsetSize");
}
else if ((ulong)fullSet.LongLength < subsetSize) {
throw new ArgumentException("Subset size cannot be greater than the total number of entries in the full set.", "subsetSize");
}
// All possible subsets will be stored here
List<object[]> allSubsets = new List<object[]>();
// Initialize current pick; will always be the leftmost consecutive x where x is subset size
ulong[] currentPick = new ulong[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
currentPick[i] = i;
}
while (true) {
// Add this subset's values to list of all subsets based on current pick
object[] subset = new object[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
subset[i] = fullSet[currentPick[i]];
}
allSubsets.Add(subset);
if (currentPick[0] + subsetSize >= (ulong)fullSet.LongLength) {
// Last pick must have been the final 3; end of subset generation
break;
}
// Update current pick for next subset
ulong shiftAfter = (ulong)currentPick.LongLength - 1;
bool loop;
do {
loop = false;
// Move current picker right
currentPick[shiftAfter]++;
// If we've gotten to the end of the full set, move left one picker
if (currentPick[shiftAfter] > (ulong)fullSet.LongLength - (subsetSize - shiftAfter)) {
if (shiftAfter > 0) {
shiftAfter--;
loop = true;
}
}
else {
// Update pickers to be consecutive
for (ulong i = shiftAfter+1; i < (ulong)currentPick.LongLength; i++) {
currentPick[i] = currentPick[i-1] + 1;
}
}
} while (loop);
}
return allSubsets;
}
This one is not from me, but it does the job!
List <ClassicCard> lowCardsToRemove = FrenchTarotUtil.checkCountLowCardForDiscardChien(handCards);
var result = Combinator.Combinations(lowCardsToRemove, 6);
public static class Combinator
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int k)
{
return k == 0 ? new[] { new T[0] } :
elements.SelectMany((e, i) =>
elements.Skip(i + 1).Combinations(k - 1).Select(c => (new[] { e }).Concat(c)));
}
}
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
Improve this question
Not sure why question is being marked as offtopic, where as so called desired behaviour is included within the question post!
I am trying to write this program that takes two inputs:
• a set of include intervals
• and a set of exclude intervals
The sets of intervals can be given in any order, and they may be empty or overlapping. The program should output the result of taking all the includes and “remove” the excludes. The output should be given as non-overlapping intervals in a sorted order.
Intervals will contain Integers only
Example :
Includes: 50-600, 10-100
Excludes: (empty)
Output: 10-600
Includes: 10-100, 200-300, 400-600
Excludes: 95-205, 410-420
Output: 10-94, 206-300, 400-409, 421-600
I tried to populate two Enumerable Range from include and excludes (after splitting,parsing ), but didn't find any efficient way of implementing this afterwards.
string[] _break = _string.Split(',');
string[] _breakB = _stringB.Split(',');
string[] res = new string[_break.Length + 1];
string[] _items, _itemsB;
List < int > _back = new List < int > ();
int count = 0;
foreach(var _item in _break) {
_items = _item.Split('-');
var a = Enumerable.Range(int.Parse(_items[0]), (int.Parse(_items[1]) - int.Parse(_items[0]) + 1)).ToList();
foreach(var _itemB in _breakB) {
_itemsB = _itemB.Split('-');
var b = Enumerable.Range(int.Parse((_itemsB[0])), (int.Parse(_itemsB[1]) - int.Parse((_itemsB[0])) + 1)).ToList();
var c = a.Except < int > (b).ToList();
/// different things tried here, but they are not good
res[count] = c.Min().ToString() + "-" + c.Max().ToString();
count++;
}
}
return res;
Any input will be of great help
You can use the Built-in SortedSet<T> collection to do most of the work for you like this:
The SortedSet<T> collection implements the useful UnionWith and ExceptWith methods which at least makes the code quite easy to follow:
private void button1_Click(object sender, EventArgs e)
{
string[] includeRanges = _string.Text.Replace(" ", "").Split(',');
string[] excludeRanges = _stringB.Text.Replace(" ", "").Split(',');
string[] includeRange, excludeRange;
SortedSet<int> includeSet = new SortedSet<int>();
SortedSet<int> excludeSet = new SortedSet<int>();
// Create a UNION of all the include ranges
foreach (string item in includeRanges)
{
includeRange = item.Split('-');
includeSet.UnionWith(Enumerable.Range(int.Parse(includeRange[0]), (int.Parse(includeRange[1]) - int.Parse(includeRange[0]) + 1)).ToList());
}
// Create a UNION of all the exclude ranges
foreach (string item in excludeRanges)
{
excludeRange = item.Split('-');
excludeSet.UnionWith(Enumerable.Range(int.Parse(excludeRange[0]), (int.Parse(excludeRange[1]) - int.Parse(excludeRange[0]) + 1)).ToList());
}
// Exclude the excludeSet from the includeSet
includeSet.ExceptWith(excludeSet);
//Format the output using a stringbuilder
StringBuilder sb = new StringBuilder();
int lastValue = -1;
foreach (int included in includeSet)
{
if (lastValue == -1)
{
sb.Append(included + "-");
lastValue = included;
}
else
{
if (lastValue == included - 1)
{
lastValue = included;
}
else
{
sb.Append(lastValue + ",");
sb.Append(included + "-");
lastValue = included;
}
}
}
sb.Append(lastValue);
result.Text = sb.ToString();
}
This should work faster than SortedSet trick, at least for large intervals. Idea is like:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
namespace Test
{
using Pair = Tuple<int, int>; //for brevity
struct Point //point of an interval
{
public enum Border { Left, Right };
public enum Interval { Including, Excluding };
public int Val;
public int Brdr;
public int Intr;
public Point(int value, Border border, Interval interval)
{
Val = value;
Brdr = (border == Border.Left) ? 1 : -1;
Intr = (int)interval;
}
public override string ToString() =>
(Brdr == 1 ? "L" : "R") + (Intr == 0 ? "+ " : "- ") + Val;
}
class Program
{
static IEnumerable<Pair> GetInterval(string strIn, string strEx)
{
//a func to get interval border points from string:
Func<string, Point.Interval, IEnumerable<Point>> parse = (str, intr) =>
Regex.Matches(str, "[0-9]+").Cast<Match>().Select((s, idx) =>
new Point(int.Parse(s.Value), (Point.Border)(idx % 2), intr));
var INs = parse(strIn, Point.Interval.Including);
var EXs = parse(strEx, Point.Interval.Excluding);
var intrs = new int[2]; //current interval border control IN[0], EX[1]
int start = 0; //left border of a new resulting interval
//put all points in a line and loop:
foreach (var p in INs.Union(EXs).OrderBy(x => x.Val))
{
//check for start (close) of a new (cur) interval:
var change = (intrs[p.Intr] == 0) ^ (intrs[p.Intr] + p.Brdr == 0);
intrs[p.Intr] += p.Brdr;
if (!change) continue;
var In = p.Intr == 0 && intrs[1] == 0; //w no Ex
var Ex = p.Intr == 1 && intrs[0] > 0; //breaks In
var Open = intrs[p.Intr] > 0;
var Close = !Open;
if (In && Open || Ex && Close)
{
start = p.Val + p.Intr; //exclude point if Ex
}
else if (In && Close || Ex && Open)
{
yield return new Pair(start, p.Val - p.Intr);
}
}
}
static void Main(string[] args)
{
var strIN = "10-100, 200-300, 400-500, 420-480";
var strEX = "95-205, 410-420";
foreach (var i in GetInterval(strIN, strEX))
Console.WriteLine(i.Item1 + "-" + i.Item2);
Console.ReadLine();
}
}
}
So, you task could be separated to the list of subtasks:
Parse a source line of intervals to the list of objects
Concatinate intervals if they cross each over
Excludes intervals 'excludes' from 'includes'
I published my result code here: http://rextester.com/OBXQ56769
The code could be optimized as well, but I wanted it to be quite simple. Hope it will help you.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
namespace ConsoleApplication
{
public class Program
{
private const string Includes = "10-100, 200-300, 400-500 ";
private const string Excludes = "95-205, 410-420";
private const string Pattern = #"(\d*)-(\d*)";
public static void Main(string[] args)
{
var includes = ParseIntevals(Includes);
var excludes = ParseIntevals(Excludes);
includes = ConcatinateIntervals(includes);
excludes = ConcatinateIntervals(excludes);
// The Result
var result = ExcludeFromInclude(includes, excludes);
foreach (var interval in result)
{
Console.WriteLine(interval.Min + "-" + interval.Max);
}
}
/// <summary>
/// Excludes intervals 'excludes' from 'includes'
/// </summary>
public static List<Interval> ExcludeFromInclude(List<Interval> includes, List<Interval> excludes)
{
var result = new List<Interval>();
if (!excludes.Any())
{
return includes.Select(x => x.Clone()).ToList();
}
for (int i = 0; i < includes.Count; i++)
{
for (int j = 0; j < excludes.Count; j++)
{
if (includes[i].Max < excludes[j].Min || includes[i].Min > excludes[j].Max)
continue; // no crossing
//1 Example: includes[i]=(10-20) excludes[j]=(15-25)
if (includes[i].Min < excludes[j].Min && includes[i].Max <= excludes[j].Max)
{
var interval = new Interval(includes[i].Min, excludes[j].Min - 1);
result.Add(interval);
break;
}
//2 Example: includes[i]=(10-25) excludes[j]=(15-20)
if (includes[i].Min <= excludes[j].Min && includes[i].Max >= excludes[j].Max)
{
if (includes[i].Min < excludes[j].Min)
{
var interval1 = new Interval(includes[i].Min, excludes[j].Min - 1);
result.Add(interval1);
}
if (includes[i].Max > excludes[j].Max)
{
var interval2 = new Interval(excludes[j].Max + 1, includes[i].Max);
result.Add(interval2);
}
break;
}
//3 Example: includes[i]=(15-25) excludes[j]=(10-20)
if (includes[i].Min < excludes[j].Max && includes[i].Max > excludes[j].Max)
{
var interval = new Interval(excludes[j].Max + 1, includes[i].Max);
result.Add(interval);
break;
}
}
}
return result;
}
/// <summary>
/// Concatinates intervals if they cross each over
/// </summary>
public static List<Interval> ConcatinateIntervals(List<Interval> intervals)
{
var result = new List<Interval>();
for (int i = 0; i < intervals.Count; i++)
{
for (int j = 0; j < intervals.Count; j++)
{
if (i == j)
continue;
if (intervals[i].Max < intervals[j].Min || intervals[i].Min > intervals[j].Max)
{
Interval interval = intervals[i].Clone();
result.Add(interval);
continue; // no crossing
}
//1
if (intervals[i].Min < intervals[j].Min && intervals[i].Max < intervals[j].Max)
{
var interval = new Interval(intervals[i].Min, intervals[j].Max);
result.Add(interval);
break;
}
//2
if (intervals[i].Min < intervals[j].Min && intervals[i].Max > intervals[j].Max)
{
Interval interval = intervals[i].Clone();
result.Add(interval);
break;
}
//3
if (intervals[i].Min < intervals[j].Max && intervals[i].Max > intervals[j].Max)
{
var interval = new Interval(intervals[j].Min, intervals[i].Max);
result.Add(interval);
break;
}
//4
if (intervals[i].Min > intervals[j].Min && intervals[i].Max < intervals[j].Max)
{
var interval = new Interval(intervals[j].Min, intervals[j].Max);
result.Add(interval);
break;
}
}
}
return result.Distinct().ToList();
}
/// <summary>
/// Parses a source line of intervals to the list of objects
/// </summary>
public static List<Interval> ParseIntevals(string intervals)
{
var matches = Regex.Matches(intervals, Pattern, RegexOptions.IgnoreCase);
var list = new List<Interval>();
foreach (Match match in matches)
{
var min = int.Parse(match.Groups[1].Value);
var max = int.Parse(match.Groups[2].Value);
list.Add(new Interval(min, max));
}
return list.OrderBy(x => x.Min).ToList();
}
/// <summary>
/// Interval
/// </summary>
public class Interval
{
public int Min { get; set; }
public int Max { get; set; }
public Interval()
{
}
public Interval(int min, int max)
{
Min = min;
Max = max;
}
public override bool Equals(object obj)
{
var obj2 = obj as Interval;
if (obj2 == null) return false;
return obj2.Min == Min && obj2.Max == Max;
}
public override int GetHashCode()
{
return this.ToString().GetHashCode();
}
public override string ToString()
{
return string.Format("{0}-{1}", Min, Max);
}
public Interval Clone()
{
return (Interval) this.MemberwiseClone();
}
}
}
}
Lots of ways to solve this. The LINQ approach hasn't been discussed yet - this is how I would do it:
// declaring a lambda fn because it's gonna be used by both include/exclude
// list
Func<string, IEnumerable<int>> rangeFn =
baseInput =>
{
return baseInput.Split (new []{ ',', ' ' },
StringSplitOptions.RemoveEmptyEntries)
.SelectMany (rng =>
{
var range = rng.Split (new []{ '-' },
StringSplitOptions.RemoveEmptyEntries)
.Select(i => Convert.ToInt32(i));
// just in case someone types in
// a reverse range (e.g. 10-5), LOL...
var start = range.Min ();
var end = range.Max ();
return Enumerable.Range (start, (end - start + 1));
});
};
var includes = rangeFn (_string);
var excludes = rangeFn (_stringB);
var result = includes.Except (excludes).Distinct().OrderBy(r => r);
I have a c# class like so
internal class QueuedMinimumNumberFinder : ConcurrentQueue<int>
{
private readonly string _minString;
public QueuedMinimumNumberFinder(string number, int takeOutAmount)
{
if (number.Length < takeOutAmount)
{
throw new Exception("Error *");
}
var queueIndex = 0;
var queueAmount = number.Length - takeOutAmount;
var numQueue = new ConcurrentQueue<int>(number.ToCharArray().Where(m => (int) Char.GetNumericValue(m) != 0).Select(m=>(int)Char.GetNumericValue(m)).OrderBy(m=>m));
var zeroes = number.Length - numQueue.Count;
while (queueIndex < queueAmount)
{
int next;
if (queueIndex == 0)
{
numQueue.TryDequeue(out next);
Enqueue(next);
} else
{
if (zeroes > 0)
{
Enqueue(0);
zeroes--;
} else
{
numQueue.TryDequeue(out next);
Enqueue(next);
}
}
queueIndex++;
}
var builder = new StringBuilder();
while (Count > 0)
{
int next = 0;
TryDequeue(out next);
builder.Append(next.ToString());
}
_minString = builder.ToString();
}
public override string ToString() { return _minString; }
}
The point of the program is to find the minimum possible integer that can be made by taking out any x amount of characters from a string(example 100023 is string, if you take out any 3 letters, the minimum int created would be 100). My question is, is this the correct way to do this? Is there a better data structure that can be used for this problem?
First Edit:
Here's how it looks now
internal class QueuedMinimumNumberFinder
{
private readonly string _minString;
public QueuedMinimumNumberFinder(string number, int takeOutAmount)
{
var queue = new Queue<int>();
if (number.Length < takeOutAmount)
{
throw new Exception("Error *");
}
var queueIndex = 0;
var queueAmount = number.Length - takeOutAmount;
var numQueue = new List<int>(number.Where(m=>(int)Char.GetNumericValue(m)!=0).Select(m=>(int)Char.GetNumericValue(m))).ToList();
var zeroes = number.Length - numQueue.Count;
while (queueIndex < queueAmount)
{
if (queueIndex == 0)
{
var nextMin = numQueue.Min();
numQueue.Remove(nextMin);
queue.Enqueue(nextMin);
} else
{
if (zeroes > 1)
{
queue.Enqueue(0);
zeroes--;
} else
{
var nextMin = numQueue.Min();
numQueue.Remove(nextMin);
queue.Enqueue(nextMin);
}
}
queueIndex++;
}
var builder = new StringBuilder();
while (queue.Count > 0)
{
builder.Append(queue.Dequeue().ToString());
}
_minString = builder.ToString();
}
public override string ToString() { return _minString; }
}
A pretty simple and efficient implementation can be made, once you realize that your input string digits map to the domain of only 10 possible values: '0' .. '9'.
This can be encoded as the number of occurrences of a specific digit in your input string using a simple array of 10 integers: var digit_count = new int[10];
#MasterGillBates describes this idea in his answer.
You can then regard this array as your priority queue from which you can dequeue the characters you need by iteratively removing the lowest available character (decreasing its occurrence count in the array).
The code sample below provides an example implementation for this idea.
public static class MinNumberSolver
{
public static string GetMinString(string number, int takeOutAmount)
{
// "Add" the string by simply counting digit occurrance frequency.
var digit_count = new int[10];
foreach (var c in number)
if (char.IsDigit(c))
digit_count[c - '0']++;
// Now remove them one by one in lowest to highest order.
// For the first character we skip any potential leading 0s
var selected = new char[takeOutAmount];
var start_index = 1;
selected[0] = TakeLowest(digit_count, ref start_index);
// For the rest we start in digit order at '0' first.
start_index = 0;
for (var i = 0; i < takeOutAmount - 1; i++)
selected[1 + i] = TakeLowest(digit_count, ref start_index);
// And return the result.
return new string(selected);
}
private static char TakeLowest(int[] digit_count, ref int start_index)
{
for (var i = start_index; i < digit_count.Length; i++)
{
if (digit_count[i] > 0)
{
start_index = ((--digit_count[i] > 0) ? i : i + 1);
return (char)('0' + i);
}
}
throw new InvalidDataException("Input string does not have sufficient digits");
}
}
Just keep a count of how many times each digit appears. An array of size 10 will do. Count[i] gives the count of digit i.
Then pick the smallest non-zero i first, then pick the smallest etc and form your number.
Here's my solution using LINQ:
public string MinimumNumberFinder(string number, int takeOutAmount)
{
var ordered = number.OrderBy(n => n);
var nonZero = ordered.SkipWhile(n => n == '0');
var zero = ordered.TakeWhile(n => n == '0');
var result = nonZero.Take(1)
.Concat(zero)
.Concat(nonZero.Skip(1))
.Take(number.Length - takeOutAmount);
return new string(result.ToArray());
}
You could place every integer into a list and find all possible sequences of these values. From the list of sequences, you could sort through taking only the sets which have the number of integers you want. From there, you can write a quick function which parses a sequence into an integer. Next, you could store all of your parsed sequences into an array or an other data structure and sort based on value, which will allow you to select the minimum number from the data structure. There may be simpler ways to do this, but this will definitely work and gives you options as far as how many digits you want your number to have.
If I'm understanding this correctly, why don't you just pick out your numbers starting with the smallest number greater than zero. Then pick out all zeroes, then any remaining number if all the zeroes are picked up. This is all depending on the length of your ending result
In your example you have a 6 digit number and you want to pick out 3 digits. This means you'll only have 3 digits left. If it was a 10 digit number, then you would end up with a 7 digit number, etc...
So have an algorithm that knows the length of your starting number, how many digits you plan on removing, and the length of your ending number. Then just pick out the numbers.
This is just quick and dirty code:
string startingNumber = "9999903040404"; // "100023";
int numberOfCharactersToRemove = 3;
string endingNumber = string.Empty;
int endingNumberLength = startingNumber.Length - numberOfCharactersToRemove;
while (endingNumber.Length < endingNumberLength)
{
if (string.IsNullOrEmpty(endingNumber))
{
// Find the smallest digit in the starting number
for (int i = 1; i <= 9; i++)
{
if (startingNumber.Contains(i.ToString()))
{
endingNumber += i.ToString();
startingNumber = startingNumber.Remove(startingNumber.IndexOf(i.ToString()), 1);
break;
}
}
}
else if (startingNumber.Contains("0"))
{
// Add any zeroes
endingNumber += "0";
startingNumber = startingNumber.Remove(startingNumber.IndexOf("0"), 1);
}
else
{
// Add any remaining numbers from least to greatest
for (int i = 1; i <= 9; i++)
{
if (startingNumber.Contains(i.ToString()))
{
endingNumber += i.ToString();
startingNumber = startingNumber.Remove(startingNumber.IndexOf(i.ToString()), 1);
break;
}
}
}
}
Console.WriteLine(endingNumber);
100023 starting number resulted in 100 being the end result
9999903040404 starting number resulted in 3000044499 being the end result
Here's my version to fix this problem:
DESIGN:
You can sort your list using a binary tree , there are a lot of
implementations , I picked this one
Then you can keep track of the number of the Zeros you have in your
string Finally you will end up with two lists: I named one
SortedDigitsList and the other one ZeroDigitsList
perform a switch case to determine which last 3 digits should be
returned
Here's the complete code:
class MainProgram2
{
static void Main()
{
Tree theTree = new Tree();
Console.WriteLine("Please Enter the string you want to process:");
string input = Console.ReadLine();
foreach (char c in input)
{
// Check if it's a digit or not
if (c >= '0' && c <= '9')
{
theTree.Insert((int)Char.GetNumericValue(c));
}
}
//End of for each (char c in input)
Console.WriteLine("Inorder traversal resulting Tree Sort without the zeros");
theTree.Inorder(theTree.ReturnRoot());
Console.WriteLine(" ");
//Format the output depending on how many zeros you have
Console.WriteLine("The final 3 digits are");
switch (theTree.ZeroDigitsList.Count)
{
case 0:
{
Console.WriteLine("{0}{1}{2}", theTree.SortedDigitsList[0], theTree.SortedDigitsList[1], theTree.SortedDigitsList[2]);
break;
}
case 1:
{
Console.WriteLine("{0}{1}{2}", theTree.SortedDigitsList[0], 0, theTree.SortedDigitsList[2]);
break;
}
default:
{
Console.WriteLine("{0}{1}{2}", theTree.SortedDigitsList[0], 0, 0);
break;
}
}
Console.ReadLine();
}
}//End of main()
}
class Node
{
public int item;
public Node leftChild;
public Node rightChild;
public void displayNode()
{
Console.Write("[");
Console.Write(item);
Console.Write("]");
}
}
class Tree
{
public List<int> SortedDigitsList { get; set; }
public List<int> ZeroDigitsList { get; set; }
public Node root;
public Tree()
{
root = null;
SortedDigitsList = new List<int>();
ZeroDigitsList = new List<int>();
}
public Node ReturnRoot()
{
return root;
}
public void Insert(int id)
{
Node newNode = new Node();
newNode.item = id;
if (root == null)
root = newNode;
else
{
Node current = root;
Node parent;
while (true)
{
parent = current;
if (id < current.item)
{
current = current.leftChild;
if (current == null)
{
parent.leftChild = newNode;
return;
}
}
else
{
current = current.rightChild;
if (current == null)
{
parent.rightChild = newNode;
return;
}
}
}
}
}
//public void Preorder(Node Root)
//{
// if (Root != null)
// {
// Console.Write(Root.item + " ");
// Preorder(Root.leftChild);
// Preorder(Root.rightChild);
// }
//}
public void Inorder(Node Root)
{
if (Root != null)
{
Inorder(Root.leftChild);
if (Root.item > 0)
{
SortedDigitsList.Add(Root.item);
Console.Write(Root.item + " ");
}
else
{
ZeroDigitsList.Add(Root.item);
}
Inorder(Root.rightChild);
}
}
I am trying this problem using dynamic programming
Problem:
Given a meeting room and a list of intervals (represent the meeting), for e.g.:
interval 1: 1.00-2.00
interval 2: 2.00-4.00
interval 3: 14.00-16.00
...
etc.
Question:
How to schedule the meeting to maximize the room utilization, and NO meeting should overlap with each other?
Attempted solution
Below is my initial attempt in C# (knowing it is a modified Knapsack problem with constraints). However I had difficulty in getting the result correctly.
bool ContainsOverlapped(List<Interval> list)
{
var sortedList = list.OrderBy(x => x.Start).ToList();
for (int i = 0; i < sortedList.Count; i++)
{
for (int j = i + 1; j < sortedList.Count; j++)
{
if (sortedList[i].IsOverlap(sortedList[j]))
return true;
}
}
return false;
}
public bool Optimize(List<Interval> intervals, int limit, List<Interval> itemSoFar){
if (intervals == null || intervals.Count == 0)
return true; //no more choice
if (Sum(itemSoFar) > limit) //over limit
return false;
var arrInterval = intervals.ToArray();
//try all choices
for (int i = 0; i < arrInterval.Length; i++){
List<Interval> remaining = new List<Interval>();
for (int j = i + 1; j < arrInterval.Length; j++) {
remaining.Add(arrInterval[j]);
}
var partialChoice = new List<Interval>();
partialChoice.AddRange(itemSoFar);
partialChoice.Add(arrInterval[i]);
//should not schedule overlap
if (ContainsOverlapped(partialChoice))
partialChoice.Remove(arrInterval[i]);
if (Optimize(remaining, limit, partialChoice))
return true;
else
partialChoice.Remove(arrInterval[i]); //undo
}
//try all solution
return false;
}
public class Interval
{
public bool IsOverlap(Interval other)
{
return (other.Start < this.Start && this.Start < other.End) || //other < this
(this.Start < other.Start && other.End < this.End) || // this covers other
(other.Start < this.Start && this.End < other.End) || // other covers this
(this.Start < other.Start && other.Start < this.End); //this < other
}
public override bool Equals(object obj){
var i = (Interval)obj;
return base.Equals(obj) && i.Start == this.Start && i.End == this.End;
}
public int Start { get; set; }
public int End { get; set; }
public Interval(int start, int end){
Start = start;
End = end;
}
public int Duration{
get{
return End - Start;
}
}
}
Edit 1
Room utilization = amount of time the room is occupied. Sorry for confusion.
Edit 2
for simplicity: the duration of each interval is integer, and the start/end time start at whole hour (1,2,3..24)
I'm not sure how you are relating this to a knapsack problem. To me it seems more of a vertex cover problem.
First sort the intervals as per their start times and form a graph representation in the form of adjacency matrix or list.
The vertices shall be the interval numbers. There shall be an edge between two vertices if the corresponding intervals overlap with each other. Also, each vertex shall be associated with a value equal to the interval's duration.
The problem then becomes choosing the independent vertices in such a way that the total value is maximum.
This can be done through dynamic programming. The recurrence relation for each vertex shall be as follows:
V[i] = max{ V[j] | j < i and i->j is an edge,
V[k] + value[i] | k < i and there is no edge between i and k }
Base Case V[1] = value[1]
Note:
The vertices should be numbered in increasing order of their start times. Then if there are three vertices:
i < j < k, and if there is no edge between vertex i and vertex j, then there cannot be any edge between vertex i and vertex k.
Good approach is to create class that can easily handle for you.
First I create helper class for easily storing intervals
public class FromToDateTime
{
private DateTime _start;
public DateTime Start
{
get
{
return _start;
}
set
{
_start = value;
}
}
private DateTime _end;
public DateTime End
{
get
{
return _end;
}
set
{
_end = value;
}
}
public FromToDateTime(DateTime start, DateTime end)
{
Start = start;
End = end;
}
}
And then here is class Room, where all intervals are and which has method "addInterval", which returns true, if interval is ok and was added and false, if it does not.
btw : I got a checking condition for overlapping here : Algorithm to detect overlapping periods
public class Room
{
private List<FromToDateTime> _intervals;
public List<FromToDateTime> Intervals
{
get
{
return _intervals;
}
set
{
_intervals = value;
}
}
public Room()
{
Intervals = new List<FromToDateTime>();
}
public bool addInterval(FromToDateTime newInterval)
{
foreach (FromToDateTime interval in Intervals)
{
if (newInterval.Start < interval.End && interval.Start < newInterval.End)
{
return false;
}
}
Intervals.Add(newInterval);
return true;
}
}
While the more general problem (if you have multiple number of meeting rooms) is indeed NP-Hard, and is known as the interval scheduling problem.
Optimal solution for 1-d problem with one classroom:
For the 1-d problem, choosing the (still valid) earliest deadline first solves the problem optimally.
Proof: by induction, the base clause is the void clause - the algorithm optimally solves a problem with zero meetings.
The induction hypothesis is the algorithm solves the problem optimally for any number of k tasks.
The step: Given a problem with n meetings, hose the earliest deadline, and remove all invalid meetings after choosing it. Let the chosen earliest deadline task be T.
You will get a new problem of smaller size, and by invoking the algorithm on the reminder, you will get the optimal solution for them (induction hypothesis).
Now, note that given that optimal solution, you can add at most one of the discarded tasks, since you can either add T, or another discarded task - but all of them overlaps T - otherwise they wouldn't have been discarded), thus, you can add at most one from all discarded tasks, same as the suggested algorithm.
Conclusion: For 1 meeting room, this algorithm is optimal.
QED
high level pseudo code of the solution:
findOptimal(list<tasks>):
res = [] //empty list
sort(list) //according to deadline/meeting end
while (list.IsEmpty() == false):
res = res.append(list.first())
end = list.first().endTime()
//remove all overlaps with the chosen meeting
while (list.first().startTine() < end):
list.removeFirst()
return res
Clarification: This answer assumes "Room Utilization" means maximize number of meetings placed in the room.
Thanks all, here is my solution based on this Princeton note on dynamic programming.
Algorithm:
Sort all events by end time.
For each event, find p[n] - the latest event (by end time) which does not overlap with it.
Compute the optimization values: choose the best between including/not including the event.
Optimize(n) {
opt(0) = 0;
for j = 1 to n-th {
opt(j) = max(length(j) + opt[p(j)], opt[j-1]);
}
}
The complete source-code:
namespace CommonProblems.Algorithm.DynamicProgramming {
public class Scheduler {
#region init & test
public List<Event> _events { get; set; }
public List<Event> Init() {
if (_events == null) {
_events = new List<Event>();
_events.Add(new Event(8, 11));
_events.Add(new Event(6, 10));
_events.Add(new Event(5, 9));
_events.Add(new Event(3, 8));
_events.Add(new Event(4, 7));
_events.Add(new Event(0, 6));
_events.Add(new Event(3, 5));
_events.Add(new Event(1, 4));
}
return _events;
}
public void DemoOptimize() {
this.Init();
this.DynamicOptimize(this._events);
}
#endregion
#region Dynamic Programming
public void DynamicOptimize(List<Event> events) {
events.Add(new Event(0, 0));
events = events.SortByEndTime();
int[] eventIndexes = getCompatibleEvent(events);
int[] utilization = getBestUtilization(events, eventIndexes);
List<Event> schedule = getOptimizeSchedule(events, events.Count - 1, utilization, eventIndexes);
foreach (var e in schedule) {
Console.WriteLine("Event: [{0}- {1}]", e.Start, e.End);
}
}
/*
Algo to get optimization value:
1) Sort all events by end time, give each of the an index.
2) For each event, find p[n] - the latest event (by end time) which does not overlap with it.
3) Compute the optimization values: choose the best between including/not including the event.
Optimize(n) {
opt(0) = 0;
for j = 1 to n-th {
opt(j) = max(length(j) + opt[p(j)], opt[j-1]);
}
display opt();
}
*/
int[] getBestUtilization(List<Event> sortedEvents, int[] compatibleEvents) {
int[] optimal = new int[sortedEvents.Count];
int n = optimal.Length;
optimal[0] = 0;
for (int j = 1; j < n; j++) {
var thisEvent = sortedEvents[j];
//pick between 2 choices:
optimal[j] = Math.Max(thisEvent.Duration + optimal[compatibleEvents[j]], //Include this event
optimal[j - 1]); //Not include
}
return optimal;
}
/*
Show the optimized events:
sortedEvents: events sorted by end time.
index: event index to start with.
optimal: optimal[n] = the optimized schedule at n-th event.
compatibleEvents: compatibleEvents[n] = the latest event before n-th
*/
List<Event> getOptimizeSchedule(List<Event> sortedEvents, int index, int[] optimal, int[] compatibleEvents) {
List<Event> output = new List<Event>();
if (index == 0) {
//base case: no more event
return output;
}
//it's better to choose this event
else if (sortedEvents[index].Duration + optimal[compatibleEvents[index]] >= optimal[index]) {
output.Add(sortedEvents[index]);
//recursive go back
output.AddRange(getOptimizeSchedule(sortedEvents, compatibleEvents[index], optimal, compatibleEvents));
return output;
}
//it's better NOT choose this event
else {
output.AddRange(getOptimizeSchedule(sortedEvents, index - 1, optimal, compatibleEvents));
return output;
}
}
//compatibleEvents[n] = the latest event which do not overlap with n-th.
int[] getCompatibleEvent(List<Event> sortedEvents) {
int[] compatibleEvents = new int[sortedEvents.Count];
for (int i = 0; i < sortedEvents.Count; i++) {
for (int j = 0; j <= i; j++) {
if (!sortedEvents[j].IsOverlap(sortedEvents[i])) {
compatibleEvents[i] = j;
}
}
}
return compatibleEvents;
}
#endregion
}
public class Event {
public int EventId { get; set; }
public bool IsOverlap(Event other) {
return !(this.End <= other.Start ||
this.Start >= other.End);
}
public override bool Equals(object obj) {
var i = (Event)obj;
return base.Equals(obj) && i.Start == this.Start && i.End == this.End;
}
public int Start { get; set; }
public int End { get; set; }
public Event(int start, int end) {
Start = start;
End = end;
}
public int Duration {
get {
return End - Start;
}
}
}
public static class ListExtension {
public static bool ContainsOverlapped(this List<Event> list) {
var sortedList = list.OrderBy(x => x.Start).ToList();
for (int i = 0; i < sortedList.Count; i++) {
for (int j = i + 1; j < sortedList.Count; j++) {
if (sortedList[i].IsOverlap(sortedList[j]))
return true;
}
}
return false;
}
public static List<Event> SortByEndTime(this List<Event> events) {
if (events == null) return new List<Event>();
return events.OrderBy(x => x.End).ToList();
}
}
}
Is there any simple algorithm to determine the likeliness of 2 names representing the same person?
I'm not asking for something of the level that Custom department might be using. Just a simple algorithm that would tell me if 'James T. Clark' is most likely the same name as 'J. Thomas Clark' or 'James Clerk'.
If there is an algorithm in C# that would be great, but I can translate from any language.
Sounds like you're looking for a phonetic-based algorithms, such as soundex, NYSIIS, or double metaphone. The first actually is what several government departments use, and is trivial to implement (with many implementations readily available). The second is a slightly more complicated and more precise version of the first. The latter-most works with some non-English names and alphabets.
Levenshtein distance is a definition of distance between two arbitrary strings. It gives you a distance of 0 between identical strings and non-zero between different strings, which might also be useful if you decide to make a custom algorithm.
Levenshtein is close, although maybe not exactly what you want.
I've faced similar problem and tried to use Levenstein distance first, but it did not work well for me. I came up with an algorithm that gives you "similarity" value between two strings (higher value means more similar strings, "1" for identical strings). This value is not very meaningful by itself (if not "1", always 0.5 or less), but works quite well when you throw in Hungarian Matrix to find matching pairs from two lists of strings.
Use like this:
PartialStringComparer cmp = new PartialStringComparer();
tbResult.Text = cmp.Compare(textBox1.Text, textBox2.Text).ToString();
The code behind:
public class SubstringRange {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
int start;
public int Start {
get { return start; }
set { start = value; }
}
int end;
public int End {
get { return end; }
set { end = value; }
}
public int Length {
get { return End - Start; }
set { End = Start + value;}
}
public bool IsValid {
get { return MasterString.Length >= End && End >= Start && Start >= 0; }
}
public string Contents {
get {
if(IsValid) {
return MasterString.Substring(Start, Length);
} else {
return "";
}
}
}
public bool OverlapsRange(SubstringRange range) {
return !(End < range.Start || Start > range.End);
}
public bool ContainsRange(SubstringRange range) {
return range.Start >= Start && range.End <= End;
}
public bool ExpandTo(string newContents) {
if(MasterString.Substring(Start).StartsWith(newContents, StringComparison.InvariantCultureIgnoreCase) && newContents.Length > Length) {
Length = newContents.Length;
return true;
} else {
return false;
}
}
}
public class SubstringRangeList: List<SubstringRange> {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
public SubstringRangeList(string masterString) {
this.MasterString = masterString;
}
public SubstringRange FindString(string s){
foreach(SubstringRange r in this){
if(r.Contents.Equals(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public SubstringRange FindSubstring(string s){
foreach(SubstringRange r in this){
if(r.Contents.StartsWith(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public bool ContainsRange(SubstringRange range) {
foreach(SubstringRange r in this) {
if(r.ContainsRange(range))
return true;
}
return false;
}
public bool AddSubstring(string substring) {
bool result = false;
foreach(SubstringRange r in this) {
if(r.ExpandTo(substring)) {
result = true;
}
}
if(FindSubstring(substring) == null) {
bool patternfound = true;
int start = 0;
while(patternfound){
patternfound = false;
start = MasterString.IndexOf(substring, start, StringComparison.InvariantCultureIgnoreCase);
patternfound = start != -1;
if(patternfound) {
SubstringRange r = new SubstringRange();
r.MasterString = this.MasterString;
r.Start = start++;
r.Length = substring.Length;
if(!ContainsRange(r)) {
this.Add(r);
result = true;
}
}
}
}
return result;
}
private static bool SubstringRangeMoreThanOneChar(SubstringRange range) {
return range.Length > 1;
}
public float Weight {
get {
if(MasterString.Length == 0 || Count == 0)
return 0;
float numerator = 0;
int denominator = 0;
foreach(SubstringRange r in this.FindAll(SubstringRangeMoreThanOneChar)) {
numerator += r.Length;
denominator++;
}
if(denominator == 0)
return 0;
return numerator / denominator / MasterString.Length;
}
}
public void RemoveOverlappingRanges() {
SubstringRangeList l = new SubstringRangeList(this.MasterString);
l.AddRange(this);//create a copy of this list
foreach(SubstringRange r in l) {
if(this.Contains(r) && this.ContainsRange(r)) {
Remove(r);//try to remove the range
if(!ContainsRange(r)) {//see if the list still contains "superset" of this range
Add(r);//if not, add it back
}
}
}
}
public void AddStringToCompare(string s) {
for(int start = 0; start < s.Length; start++) {
for(int len = 1; start + len <= s.Length; len++) {
string part = s.Substring(start, len);
if(!AddSubstring(part))
break;
}
}
RemoveOverlappingRanges();
}
}
public class PartialStringComparer {
public float Compare(string s1, string s2) {
SubstringRangeList srl1 = new SubstringRangeList(s1);
srl1.AddStringToCompare(s2);
SubstringRangeList srl2 = new SubstringRangeList(s2);
srl2.AddStringToCompare(s1);
return (srl1.Weight + srl2.Weight) / 2;
}
}
Levenstein distance one is much simpler (adapted from http://www.merriampark.com/ld.htm):
public class Distance {
/// <summary>
/// Compute Levenshtein distance
/// </summary>
/// <param name="s">String 1</param>
/// <param name="t">String 2</param>
/// <returns>Distance between the two strings.
/// The larger the number, the bigger the difference.
/// </returns>
public static int LD(string s, string t) {
int n = s.Length; //length of s
int m = t.Length; //length of t
int[,] d = new int[n + 1, m + 1]; // matrix
int cost; // cost
// Step 1
if(n == 0) return m;
if(m == 0) return n;
// Step 2
for(int i = 0; i <= n; d[i, 0] = i++) ;
for(int j = 0; j <= m; d[0, j] = j++) ;
// Step 3
for(int i = 1; i <= n; i++) {
//Step 4
for(int j = 1; j <= m; j++) {
// Step 5
cost = (t.Substring(j - 1, 1) == s.Substring(i - 1, 1) ? 0 : 1);
// Step 6
d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
}
I doubt there is, considering even the Customs Department doesn't seem to have a satisfactory answer...
If there is a solution to this problem I seriously doubt it's a part of core C#. Off the top of my head, it would require a database of first, middle and last name frequencies, as well as account for initials, as in your example. This is fairly complex logic that relies on a database of information.
Second to Levenshtein distance, what language do you want? I was able to find an implementation in C# on codeproject pretty easily.
In an application I worked on, the Last name field was considered reliable.
So presented all the all the records with the same last name to the user.
User could sort by the other fields to look for similar names.
This solution was good enough to greatly reduce the issue of users creating duplicate records.
Basically looks like the issue will require human judgement.