Implementing Custom Int+Range List Solution - c#

I'm wondering if anyone can come up with a way to implement an array of numbers in a more memory efficient manner that will auto-organise itself into ranges. Example;
List<int> testList = new List<int>{1,2,3,4,5,6,7...};
vs
List<Range> testList = new List<Range>{1-3000,3002,4000-5000...};
Previously, I asked a question ("Index Array Storage Memory") to confirm whether this would in fact be a more memory-efficient alternative. This question, however, pertains to the actual application: how to implement this range list solution.
I imagine this would perhaps need to be a custom list solution that would be a mix of ints and ranges. I'm picturing being able to .Add([int]) to the list, at which point it would determine if the value would cause a range to be added or to simply add the int value to the list.
Example
RangeList rangeList = new RangeList{1, 4, 7-9};
rangeList.Add(2);
//rangeList -> 1-2, 4, 7-9
rangeList.Add(3);
//rangeList -> 1-3, 4, 7-9
Details specific to my implementation
In my particular case, I'm analysing a very large document, line by line. Lines that meet certain criteria need to be identified, and then the overall list of line indexes needs to be presented to the user.
Obviously displaying "Lines 33-32019 identified" is preferable to "Lines 33,34,35...etc". For this case, numbers will always be positive.

The first thing I would do is make a class which represents your range. You can provide some convenience like formatting as a string, and having an implicit cast from an int (This helps later implementation of the range list)
public class Range
{
public int Start{get; private set;}
public int End{get; private set;}
public Range(int startEnd) : this(startEnd,startEnd)
{
}
public Range(int start, int end)
{
this.Start = start;
this.End = end;
}
public static implicit operator Range(int i)
{
return new Range(i);
}
public override string ToString()
{
if(Start == End)
return Start.ToString();
return String.Format("{0}-{1}",Start,End);
}
}
You can then begin a simple implementation of the RangeList. By providing an Add method you can use a list initializer similar to List<T>:
public class RangeList : IEnumerable<Range>
{
private List<Range> ranges = new List<Range>();
public void Add(Range range)
{
this.ranges.Add(range);
}
public IEnumerator<Range> GetEnumerator()
{
return this.ranges.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator(){
return this.GetEnumerator();
}
}
At this point you can write some test code:
var rangeList = new RangeList(){
new Range(1,10),
15
};
foreach(var range in rangeList)
Console.WriteLine(range);
// Outputs:
// 1-10
// 15
Live example at this point: http://rextester.com/NCZSA71850
The next thing to do is provide an overload of Add which takes an int and either finds the right range or adds a new one. A naive implementation might look like the below (assuming the addition of an Update method on Range):
public void Add(int i)
{
// is it within or contiguous to an existing range
foreach(var range in ranges)
{
if(i>=range.Start && i<=range.End)
return; // already in a range
if(i == range.Start-1)
{
range.Update(i,range.End);
return;
}
if(i == range.End + 1)
{
range.Update(range.Start,i);
return;
}
}
// not in any ranges
ranges.Add(i);
}
Live example at this point: http://rextester.com/CHX64125
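The Update method is assumed above but not shown in the text. A minimal sketch of what it could look like follows; the validation in Update is my own assumption and is not taken from the linked example.

```csharp
using System;

public class Range
{
    public int Start { get; private set; }
    public int End { get; private set; }

    public Range(int startEnd) : this(startEnd, startEnd) { }

    public Range(int start, int end)
    {
        Update(start, end);
    }

    // Assumed helper: replaces both bounds in one call so the range
    // can grow when a neighbouring value is added.
    public void Update(int start, int end)
    {
        if (start > end)
            throw new ArgumentException("start must not be greater than end");
        Start = start;
        End = end;
    }

    public static implicit operator Range(int i)
    {
        return new Range(i);
    }
}
```

With this in place, range.Update(i, range.End) extends a range downwards and range.Update(range.Start, i) extends it upwards.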
However, this suffers from a few deficiencies:
Does not merge ranges (say you already have 1-10 and 12-20 and you Add(11))
Does not re-order so if you have 1-5 and 20-25 and Add(7) this will be at the end not in the middle.
You can solve both problems by applying a sort after each addition, and some logic to determine if you should merge ranges
private void SortAndMerge()
{
ranges.Sort((a,b) => a.Start - b.Start);
var i = ranges.Count-1;
do
{
var start = ranges[i].Start;
var end = ranges[i-1].End;
if(end == start-1)
{
// merge and remove
ranges[i-1].Update(ranges[i-1].Start,ranges[i].End);
ranges.RemoveAt(i);
}
} while(i-- >1);
}
This needs to be called after every change to the list.
public void Add(Range range)
{
this.ranges.Add(range);
SortAndMerge();
}
public void Add(int value)
{
// is it within or contiguous to an existing range
foreach(var range in ranges)
{
if(value>=range.Start && value<=range.End)
return; // already in a range
if(value == range.Start-1)
{
range.Update(value,range.End);
SortAndMerge();
return;
}
if(value == range.End + 1)
{
range.Update(range.Start,value);
SortAndMerge();
return;
}
}
// not in any ranges
ranges.Add(value);
SortAndMerge();
}
Live example here: http://rextester.com/SYLARF47057
There are still some possible edge cases with this, which I urge you to work through.
UPDATE
The below will get this working as expected. It merges any added ranges/ints as you would expect and returns them correctly sorted. I've only changed the Add(Range) method; I think this is a fairly clean way of doing it.
public void Add(Range rangeToAdd)
{
var mergableRange = new List<Range>();
foreach (var range in ranges)
{
if (rangeToAdd.Start == range.Start && rangeToAdd.End == range.End)
return; // already exists
if (mergableRange.Any())
{
if (rangeToAdd.End >= range.Start - 1)
{
mergableRange.Add(range);
continue;
}
}
else
{
if (rangeToAdd.Start >= range.Start - 1
&& rangeToAdd.Start <= range.End + 1)
{
mergableRange.Add(range);
continue;
}
if (range.Start >= rangeToAdd.Start
&& range.End <= rangeToAdd.End)
{
mergableRange.Add(range);
continue;
}
}
}
if (!mergableRange.Any()) //Standalone range
{
ranges.Add(rangeToAdd);
}
else //merge overlapping ranges
{
mergableRange.Add(rangeToAdd);
var min = mergableRange.Min(x => x.Start);
var max = mergableRange.Max(x => x.End);
foreach (var range in mergableRange) ranges.Remove(range);
ranges.Add(new Range(min, max));
}
SortAndMerge();
}
Finally, we need if (ranges.Count > 1) in the SortAndMerge() method to prevent an index error when the first range is added.
And with that, I think this fully satisfies my question.
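For comparison, here is a compact sketch that folds the pieces discussed above (sorting, merging adjacent or overlapping ranges, and the guard for lists with fewer than two entries) into one class. The name CompactRangeList and the use of value tuples are my own choices for illustration, not part of the answers above, and the sketch re-sorts on every Add, so it favours simplicity over speed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical consolidated sketch; names are illustrative only.
public sealed class CompactRangeList
{
    // Inclusive [Start, End] pairs, kept sorted and never touching each other.
    private readonly List<(int Start, int End)> ranges = new List<(int Start, int End)>();

    public void Add(int value)
    {
        Add(value, value);
    }

    public void Add(int start, int end)
    {
        if (start > end) throw new ArgumentException("start must not exceed end");
        ranges.Add((start, end));
        SortAndMerge();
    }

    private void SortAndMerge()
    {
        if (ranges.Count < 2) return; // nothing to merge, and avoids indexing ranges[-1]

        ranges.Sort((a, b) => a.Start.CompareTo(b.Start));
        var merged = new List<(int Start, int End)> { ranges[0] };
        for (var i = 1; i < ranges.Count; i++)
        {
            var last = merged[merged.Count - 1];
            var current = ranges[i];
            // Overlapping or directly adjacent: extend the previous range.
            // The cast to long avoids overflow when last.End is int.MaxValue.
            if (current.Start <= (long)last.End + 1)
                merged[merged.Count - 1] = (last.Start, Math.Max(last.End, current.End));
            else
                merged.Add(current);
        }
        ranges.Clear();
        ranges.AddRange(merged);
    }

    public override string ToString()
    {
        var parts = new List<string>();
        foreach (var r in ranges)
            parts.Add(r.Start == r.End ? r.Start.ToString() : r.Start + "-" + r.End);
        return string.Join(", ", parts);
    }
}
```

Note that this sketch treats adjacent values as one range, so adding 3 to 1-2, 4, 7-9 gives 1-4, 7-9 rather than 1-3, 4, 7-9.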

Related

How to get a direction on the same train line?

Can you help me with the step-by-step logic I need to get a direction on the same train line? I already have a common train line with the functions Next and Previous.
public IStation Next(IStation s)
{
if (!_stations.Contains(s))
{
throw new ArgumentException();
}
var index = _stations.IndexOf(s);
var isLast = index == _stations.Count -1;
if (isLast)
{
return null;
}
return _stations[index + 1];
}
public IStation Previous(IStation s)
{
if (!_stations.Contains(s))
{
throw new ArgumentException();
}
var index = _stations.IndexOf(s);
var isFirst = index == 0;
if (isFirst)
{
return null;
}
return _stations[index - 1];
}
And my function where I look for direction.
public string GetLineDirectiom(Station from, Station to, Line commonLine)
{
bool fromNextTo = true;
//to.Lines.Count();
//to.ToString();
var Final = commonLine.Next(from);
while (Final != null)
{
}
if (fromNextTo)
return "next";
else return "previous";
}
It looks like you are trying to "visit the stations along commonLine", starting at the from station.
The loop you have started is a valid start to that end; you need a variable to store the station you are currently visiting. Maybe the current variable name Final is a bit confusing to yourself here, because it is not the "final" station of the line, just the one you are currently visiting.
Therefore, let's name the variable currentStation. Then, you want to go to the next station until you have found to (and thereby know the direction), or until you have reached the end of the line:
var currentStation = from;
while (currentStation != null)
{
if (currentStation == to)
{
return "next";
}
currentStation = commonLine.Next(currentStation);
}
Now, this checks whether to is "ahead". If it wasn't, you can proceed to checking whether it can be found in the other direction, again starting at from:
currentStation = from;
while (currentStation != null)
{
if (currentStation == to)
{
return "previous";
}
currentStation = commonLine.Previous(currentStation);
}
If this loop doesn't find to either, apparently to is not on the line. Treat this case according to your preference.
Some remarks:
Indicating the direction as "next" or "previous" may be a bit misleading. If it really is the direction of the line, think about something like "forward" and "backward", as "next" and "previous" indeed imply the direct next/previous elements in a list.
While the above works, I do note that your Line object already has the stations in an indexed list. Therefore, a simpler way of achieving your goal might be to just determine the indices of the from and to stations on commonLine and compare which one is greater than the other.
It's not clear what you want to do or why you are returning the strings "next" and "previous" as a direction, but in general you can get the direction from the indices of the two stations:
public int GetStationIndex(IStation s)
{
var index = _stations.IndexOf(s);
if (index == -1)
{
throw new ArgumentException();
}
return index ;
}
public string GetLineDirection(Station from, Station to, Line commonLine)
{
    // "next" if 'to' comes after 'from' on the line, otherwise "previous"
    var direction = commonLine.GetStationIndex(from) < commonLine.GetStationIndex(to)
        ? "next"
        : "previous";
    return direction;
}
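Both answers leave two cases open: a station that is not on the line, and from and to being the same station. A small sketch that makes those cases explicit is below. The Direction enum and the SimpleLine type (which identifies stations by name) are hypothetical stand-ins for the asker's Line and IStation types.

```csharp
using System;
using System.Collections.Generic;

public enum Direction { Forward, Backward, SameStation }

// Hypothetical minimal line type: stations are identified by name here.
public sealed class SimpleLine
{
    private readonly List<string> stations;

    public SimpleLine(IEnumerable<string> stationNames)
    {
        stations = new List<string>(stationNames);
    }

    public Direction GetDirection(string from, string to)
    {
        var fromIndex = stations.IndexOf(from);
        var toIndex = stations.IndexOf(to);
        if (fromIndex < 0 || toIndex < 0)
            throw new ArgumentException("Both stations must be on this line.");

        if (fromIndex == toIndex) return Direction.SameStation;
        return fromIndex < toIndex ? Direction.Forward : Direction.Backward;
    }
}
```

Returning an enum instead of a string also sidesteps the naming concern raised in the remarks above, since callers cannot mistype a direction.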

How to push 0 item to the stack of int?

I have a simple stack implementation, but I can't work out how programmers solve the following problem: it seems impossible to push a 0 onto the stack. How do I do that? I mean, how do I tell whether a slot holds a pushed 0 or just marks the end of the stack? Or is this not a problem in my implementation?
public class Stack: IStack
{
private int[] s;
private int N = 0;
public Stack(int N)
{
s = new int[N];
}
public void push(int x)
{
s[N++] = x;
if (N >= s.Length)
{
Array.Resize(ref s, s.Length*2);
}
}
public int pop()
{
s[N] = 0;
return s[--N];
}
}
You are already tracking the last element of the stack with N (or rather, N - 1). You don't need to verify whether the element is 0, and your implementation actually doesn't distinguish between zeroes and other numbers.
In the implementation you provided, it is perfectly possible to push a 0 into the stack.
By the way, I would reimplement your pop() method like this:
public int? pop()
{
if (N != 0)
{
return s[--N];
}
else
{
return null;
}
}
This way, it returns null in case the stack is empty.
You should realize that it doesn't matter what the values S[N], S[N+1], ... are since you are only using the values S[0..N-1] for your implementation. You consider the part S[N...] as uninitialized and adding a new element, even 0, causes S[N] to become initialized as the new value.
You can push 0; nothing prevents it. N is equal to the number of elements, and it is also the index of the next item to push: N == (index of last element + 1). The problem I see is that if you run pop() too many times you will get an IndexOutOfRangeException.
You can add IsEmpty property like this:
public bool IsEmpty
{
get { return N < 1; }
}
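Putting these suggestions together, a minimal sketch of an array-backed stack might look like the following. The class name IntStack, the default capacity, and the choice to throw on an empty pop (rather than return null) are my own assumptions. The point is that only the element count decides where the stack ends, so a stored 0 is an ordinary value.

```csharp
using System;

public sealed class IntStack
{
    private int[] items;
    private int count; // number of stored elements; items[count..] is unused

    public IntStack(int capacity = 4)
    {
        // Guard against a zero-length array, which doubling could never grow.
        items = new int[Math.Max(1, capacity)];
    }

    public bool IsEmpty
    {
        get { return count == 0; }
    }

    public void Push(int x)
    {
        // Grow before writing so the write below is always in bounds.
        if (count == items.Length)
            Array.Resize(ref items, items.Length * 2);
        items[count++] = x;
    }

    public int Pop()
    {
        if (count == 0)
            throw new InvalidOperationException("Stack is empty.");
        return items[--count];
    }
}
```

There is no need to overwrite the popped slot with 0: anything at index count or above is simply ignored until the next Push replaces it.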

How would I return the lowest value from the list without using shortcuts?

I was given a list of numbers, { 1, 2, 3, 4, 5, 6, 7 }. I was asked to return the lowest value without using shortcuts, e.g. .Min() etc.
I am not going to give you the answer to your homework question directly; however, to get you started, just loop over the list and keep track of the smallest value you have found. When you are done, the smallest you found is the smallest in the list.
For your second part, it is just basic problem solving. Look at the problem, break it into smaller pieces. For example, for your problem you could break it in to:
How do I loop over a list
How do I remember a value between loops
How do I compare if the remembered value is smaller than the current loop value
How do I replace my remembered value if the current loop value is smaller
Then solve each piece individually.
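For readers who want to check their own attempt, the four pieces above map onto a loop roughly like this. It is a sketch for int lists only; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LowestValue
{
    public static int Find(IReadOnlyList<int> numbers)
    {
        if (numbers == null) throw new ArgumentNullException(nameof(numbers));
        if (numbers.Count == 0) throw new InvalidOperationException("The list is empty.");

        var smallest = numbers[0];              // the remembered value
        for (var i = 1; i < numbers.Count; i++) // loop over the rest of the list
        {
            if (numbers[i] < smallest)          // compare with the remembered value
                smallest = numbers[i];          // replace it when the current one is smaller
        }
        return smallest;
    }
}
```

Starting from the first element, rather than from 0 or int.MaxValue, means the result is always a value that actually occurs in the list.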
You can do this in the old-fashioned imperative way. It works with all comparable types:
public static class MyEnumerableExtensions
{
public static T Min<T>(this IEnumerable<T> list) where T : IComparable<T>
{
if (list == null)
{
throw new ArgumentNullException("list");
}
T min = default (T);
bool initialized = false;
foreach (T elem in list)
{
if (!initialized)
{
min = elem;
initialized = true;
}
else if (min == null) // Do not compare with null, reset min
{
min = elem;
}
else if (elem != null && min.CompareTo(elem) > 0) // Compare only when elem is not null
{
min = elem;
}
}
if (!initialized)
{
throw new InvalidOperationException("list is empty");
}
return min;
}
}
Usage:
var min1 = list.Min();
var min2 = MyEnumerableExtensions.Min(list);
Also, if only the Min method is off-limits in LINQ, additional tricks are possible. These work for numbers only:
var minViaMax = -list.Select(x => -x).Max();
var minViaAggregate = list.Aggregate(Math.Min);

Formatting the string by rewriting the delimiters

I'm dealing with some legacy data, where they store each record in one huge/large string (one string = one record)
In each string, they split the data in some sort of delimiters, but each of them actually defines a meaning, for example: \vToyota\cBlue\cRed\cWhite\s200mph\oAndrew\oJohn
\v means vehicle, \c is color, \s is speed, \o is owner... something like that
My task requires me to reformat the data so that if there are multiple fields of one characteristic, I have to rewrite it as: (for example) \vToyota\cBlue\c2Red\c3White\s200mph\oAndrew\o2John
Edited: Alright, @DarrenYoung's suggestion works! Now I have an array of vToyota cBlue cRed cWhite s200mph oAndrew oJohn. I tested the same method on other data and it works too. Now I just need help finding a way to rewrite the first letter of each string whenever it is repeated.
Thank you!
I found this an interesting little puzzle to see what I could do with LINQ. The following seems to work:
private string FixIt(string foo)
{
var newFoo = "\\" + string.Join("\\",
foo.Split(new[] {'\\'}, StringSplitOptions.RemoveEmptyEntries)
.GroupBy(s => s[0],
(c, g) =>
{
var cnt = 0;
return g.Select(x => cnt++ == 0
? x
: x[0] + cnt.ToString() + x.Substring(1));
})
.SelectMany(g => g));
return newFoo;
}
Input: \vToyota\cBlue\cRed\cWhite\s200mph\oAndrew\oJohn
Output: \vToyota\cBlue\c2Red\c3White\s200mph\oAndrew\o2John
That SelectMany is a handy thing to remember.
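One thing to be aware of with the GroupBy approach: it gathers all fields with the same tag together, so a record whose tags are interleaved (for example \c, \o, \c) would come out reordered. If the original order matters, a plain loop with a per-tag counter is an alternative. The sketch below assumes, as the question does, that each tag is a single character after the backslash; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TagNumbering
{
    public static string Renumber(string record)
    {
        var counts = new Dictionary<char, int>(); // how often each tag has been seen
        var result = new StringBuilder();

        foreach (var field in record.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var tag = field[0];
            int seen;
            counts.TryGetValue(tag, out seen); // seen stays 0 when the tag is new
            counts[tag] = ++seen;

            result.Append('\\').Append(tag);
            if (seen > 1) result.Append(seen); // number only the repeats: c, c2, c3...
            result.Append(field, 1, field.Length - 1);
        }
        return result.ToString();
    }
}
```

For the sample input above this produces the same output as FixIt, and fields stay in their original positions when tags are interleaved.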
Because I thought this question was interesting, I wrote up a program to do what I believe to be a reasonable solution. I started with a few principal assumptions:
In "old data" situations you probably don't know every single option that is going to show up in the records. Consequently whatever approach is taken needs to quickly and easily accommodate new types of delimiters and tags. For that reason I did not use a string.split approach (even though this is easier to read). Instead all tokens are declared at the beginning of the file. Anything can be a token whether or not it has a "\" in front of it.
The solution needs to gracefully handle records that don't conform to the standards
The option of parsing integers for multiple records needs to be able to be disabled per record type. Speed, for example, doesn't (seem) to be able to appear multiple times per record. So, setting the value for speed to false in the "ALLOW_MULTIPLE" variable turns this parsing off, ensuring the correct output value.
In my solution I also created separate classes for readability and so the code could be quickly investigated. Although I would not suggest that this is production ready, the following should go a long ways towards solving the issue. Best of luck!
// Just paste the rest of this into a new console application to see it work!
public class Program
{
private static readonly List<string> TOKENS = new List<string> {@"\v", @"\c", @"\o", @"\s"};
private static readonly List<string> DISPLAY = new List<string> {"Vehicle", "Color", "Owner", "Speed"};
private static readonly List<bool> ALLOW_MULTIPLE = new List<bool> {false, true, true, false};
private class RecordEntry
{
public string Value { get; set; }
public int Index { get; set; }
public string DataType { get; set; }
public override string ToString() { return DataType + ": " + Value; }
}
private class ParsedRecord
{
private List<RecordEntry> entries = new List<RecordEntry>();
public List<RecordEntry> Entries { get { return entries; } }
}
public static void Main(string[] args)
{
// sample records (second has a \m which is ignored since it isn't a recognized token)
var records = new[] {@"\vToyota\cBlue\c2Red\c3White\s200mph\oAndrew\o2John",
@"\vChevy\c2Orange\cGreen\s50mph\o2Bob\mWhite"};
var parsedData = new List<ParsedRecord>();
foreach (var record in records)
{
// character by character parsing
var currentParseRecord = new ParsedRecord();
parsedData.Add(currentParseRecord);
var currentRecord = new StringBuilder(record);
var currentToken = new StringBuilder();
for (var parseIdx = 0; parseIdx < currentRecord.Length; parseIdx++)
{
currentToken.Append(currentRecord[parseIdx]);
var recordIdx = 0;
var index = TOKENS.IndexOf(currentToken.ToString());
if (index < 0) continue;
// current char is used up now (was part of the token)
parseIdx++;
if (ALLOW_MULTIPLE[index] && currentRecord.Length > parseIdx + 1)
{
// assuming less than 10 records max - if more, would need to pull multiple numeric values here
if (!Int32.TryParse(currentRecord[parseIdx] + "", out recordIdx)) recordIdx = 0;
else parseIdx++;
}
// find the next token or end of string
int valueLength = FindNextToken(currentRecord, parseIdx) - parseIdx;
if (valueLength <= 0) valueLength = currentRecord.Length - parseIdx;
currentParseRecord.Entries.Add(new RecordEntry
{
DataType = DISPLAY[index],
Index = recordIdx,
Value = currentRecord.ToString(parseIdx, valueLength)
});
parseIdx += valueLength - 1;
currentToken.Clear();
}
}
}
private static int FindNextToken(StringBuilder value, int currentIndex)
{
for (var searchIdx = currentIndex; searchIdx < value.Length; searchIdx++) {
if (TOKENS.Any(checkToken => value.Length > searchIdx + checkToken.Length &&
value.ToString(searchIdx, checkToken.Length) == checkToken)) {
return searchIdx;
}
}
return -1;
}
}

Array sorting by two parameters

I'm having a little difficulty with Array.Sort. I have a class with two fields: one is a random string, the other a random number. If I want to sort by one parameter, it works fine, but I would like to sort by two parameters: first by the SUM of the numbers (from low to high), and THEN, if the sums are equal, by the random string that is given to them (from low to high).
Can you give me some hints and tips on how I can "merge" these two kinds of sort?
Array.Sort(Phonebook, delegate(PBook user1, PBook user2)
{ return user1.Sum().CompareTo(user2.Sum()); });
Console.WriteLine("ORDER");
foreach (PBook user in Phonebook)
{
Console.WriteLine(user.name);
}
That's how I order it with one parameter.
I think this is what you are after:
sourcearray.OrderBy(a=> a.sum).ThenBy(a => a.random)
Here is the general algorithm that you'll use for comparing multiple fields in a CompareTo method:
public int compare(MyClass first, MyClass second)
{
int firstComparison = first.FirstValue.CompareTo(second.FirstValue);
if (firstComparison != 0)
{
return firstComparison;
}
else
{
return first.SecondValue.CompareTo(second.SecondValue);
}
}
However, LINQ does make the syntax for doing this much easier, allowing you to only write:
Phonebook = Phonebook.OrderBy(book=> book.Sum())
.ThenBy(book => book.OtherProperty)
.ToArray();
You can do this in-place by using a custom IComparer<PBook>. The following should order your array as per your original code, but if two sums are equal it should fall back on the random string (which I've called RandomString):
public class PBookComparer : IComparer<PBook>
{
public int Compare(PBook x, PBook y)
{
// Sort null items to the top; you can drop this
// if you don't care about null items.
if (x == null)
return y == null ? 0 : -1;
else if (y == null)
return 1;
// Comparison of sums.
var sumCompare = x.Sum().CompareTo(y.Sum());
if (sumCompare != 0)
return sumCompare;
// Sums are the same; return comparison of strings
return String.Compare(x.RandomString, y.RandomString);
}
}
You call this as
Array.Sort(Phonebook, new PBookComparer());
You could just do this inline but it gets a bit hard to follow:
Array.Sort(Phonebook, (x, y) => {
int sc = x.Sum().CompareTo(y.Sum());
return sc != 0 ? sc : string.Compare(x.RandomString, y.RandomString); });
... Actually, that isn't too bad, although I have dropped the null checks.
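If you are on C# 7 or later, another way to write the inline version is to compare value tuples, which compare their elements in order. This is a sketch under assumptions: the Entry type and its Name and Total members are stand-ins for PBook, RandomString and Sum(), and strings inside the tuple are compared with the default string comparer.

```csharp
using System;

// Hypothetical stand-in for PBook.
public sealed class Entry
{
    public string Name { get; private set; }
    public int Total { get; private set; }

    public Entry(string name, int total)
    {
        Name = name;
        Total = total;
    }
}

public static class EntrySorting
{
    public static void SortByTotalThenName(Entry[] entries)
    {
        // The tuple compares Total first and falls back to Name on a tie.
        Array.Sort(entries, (x, y) => (x.Total, x.Name).CompareTo((y.Total, y.Name)));
    }
}
```

As with the lambda above, this drops the null checks, so it assumes the array contains no null items.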
