Creating a power set of a Sequence - c#

I am trying to create a program that serves as a base for generating the possible combinations of a sequence, string or number. It is part of some sort of encryption / decryption program. I am using Visual Studio 2013 and C#. What I am trying to do is generate a power set out of a sequence, but I am a little bit confused and can't proceed any further. Here is the code.
public static void randomSeq()
{
int temp = 0;
string seq = "1234";
var sb = new StringBuilder();
char[] bits = seq.Select((char c) => c).ToArray();
Console.Write("Given Sequence: ");
Console.Write(seq);
Console.WriteLine();
Console.WriteLine("Generated possibilities");
foreach (char item in bits)
Console.WriteLine(item);
do
{
if (temp <= 2)
{
for (int i = temp + 1; i < bits.Length; i++)
{
sb.Append(bits[temp]);
sb.Append(bits[i]);
Console.WriteLine(sb);
sb.Clear();
}
}
else if (temp > 2)
{
for (int k = 0; k < temp; k++)
sb.Append(bits[k]);
for (int l = temp + 1; l < bits.Length; l++)
{
sb.Append(bits[temp]);
sb.Append(bits[l]);
Console.WriteLine(sb);
sb.Clear();
}
}
temp++;
}
while (temp != bits.Length);
}
I want this code to be generic, i.e. I pass any sequence and it generates a power set for me. Then I want to reuse it further in my program. I can do the rest; I am simply stuck on generating the power set. Can someone help me?

A power set is easy to generate if one is familiar with bits. For a set of N elements, there will be 2^N subsets in the power set (including the empty set and the initial set). Each element is either IN or OUT of a given subset (1 or 0, in other words).
Taking this into consideration, it is easy to represent subsets of the set as bit masks. By enumerating all possible bit masks, it is possible to construct the whole power set. To do this we examine each bit in the bit mask and take the corresponding element of the input set if there is a 1 in that place. Below is an example for a string (a collection of chars) as input. It can easily be rewritten to work for a collection of values of any type.
private static List<string> PowerSet(string input)
{
int n = input.Length;
// Power set contains 2^N subsets.
int powerSetCount = 1 << n;
var ans = new List<string>();
for (int setMask = 0; setMask < powerSetCount; setMask++)
{
var s = new StringBuilder();
for (int i = 0; i < n; i++)
{
// Checking whether i'th element of input collection should go to the current subset.
if ((setMask & (1 << i)) > 0)
{
s.Append(input[i]);
}
}
ans.Add(s.ToString());
}
return ans;
}
Example
Suppose you have the string "xyz" as input. It contains 3 elements, so there will be 2^3 == 8 elements in the power set. Iterating from 0 to 7 gives the following table. Columns: base-10 integer; bit representation (base 2); subset of the initial set.
0 000 ...
1 001 ..z
2 010 .y.
3 011 .yz
4 100 x..
5 101 x.z
6 110 xy.
7 111 xyz
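Notice that the third column contains all subsets of the initial string "xyz". The table aligns each bit with the character written below it for readability; the code above maps bit i to input[i], so it produces the same eight subsets in a different order (mask 1 yields "x" and mask 4 yields "z").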
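The remark that the bit-mask approach can be rewritten for a collection of any type could look roughly like the sketch below. The names PowerSetSketch and PowerSetByMask are hypothetical, and the limit of 30 items is an assumption made so that the mask fits in an int.

```csharp
using System;
using System.Collections.Generic;

public static class PowerSetSketch
{
    // Hypothetical generic counterpart of PowerSet(string): bit i of the mask
    // decides whether items[i] belongs to the current subset.
    public static List<List<T>> PowerSetByMask<T>(IReadOnlyList<T> items)
    {
        // Assumption: an int mask is used, so the input is limited to 30 items.
        if (items.Count > 30)
            throw new ArgumentException("Too many items for an int bit mask.");

        int subsetCount = 1 << items.Count;
        var result = new List<List<T>>(subsetCount);
        for (int mask = 0; mask < subsetCount; mask++)
        {
            var subset = new List<T>();
            for (int i = 0; i < items.Count; i++)
            {
                if ((mask & (1 << i)) != 0)
                    subset.Add(items[i]);
            }
            result.Add(subset);
        }
        return result;
    }
}
```

Calling PowerSetSketch.PowerSetByMask<char>("xyz".ToCharArray()) should give eight lists, starting with the empty one.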
Another approach (twice as fast) and generic implementation
Inspired by Eric's idea, I have implemented another variant of this algorithm (without bits this time) and made it generic. I believe this code is close to the fastest that can be written for power set calculation. Its complexity is the same as for the bit approach, O(n * 2^n), but the constant factor is halved.
public static T[][] FastPowerSet<T>(T[] seq)
{
var powerSet = new T[1 << seq.Length][];
powerSet[0] = new T[0]; // starting only with empty set
for (int i = 0; i < seq.Length; i++)
{
var cur = seq[i];
int count = 1 << i; // doubling list each time
for (int j = 0; j < count; j++)
{
var source = powerSet[j];
var destination = powerSet[count + j] = new T[source.Length + 1];
for (int q = 0; q < source.Length; q++)
destination[q] = source[q];
destination[source.Length] = cur;
}
}
return powerSet;
}

SergeyS's approach is perfectly reasonable. Here's an alternative way to think about it.
For the purposes of this answer I'm going to assume that "sets" are finite sequences.
We define the function P recursively as follows.
A set is either empty, or a single item H followed by a set T.
P(empty) --> { empty }
P(H : T) --> the union of P(T) and every element of P(T) prepended with H.
Let's try that out. What's the power set of {Apple, Banana, Cherry}?
It's not an empty set, so the power set of {Apple, Banana, Cherry} is the power set of {Banana, Cherry}, plus the sets formed by prepending Apple to each.
So we need to know the power set of {Banana, Cherry}. It's the power set of {Cherry} plus the sets formed by prepending Banana to each.
So we need to know the power set of {Cherry}. It's the power set of the empty set, plus the sets formed by prepending Cherry to each.
So we need to know the power set of the empty set. It's the set containing the empty set. { {} }
Now prepend each element with Cherry and take the union. That's { {Cherry}, {} }. That gives us the power set of { Cherry }. Remember we needed that to find the power set of {Banana, Cherry}, so we union it with Banana prepended to each and get { {Banana, Cherry}, {Banana}, {Cherry}, {}} and that's the power set of {Banana, Cherry}.
Now we needed that to get the power set of {Apple, Banana, Cherry}, so union it with Apple prepended to each and we have { {Apple, Banana, Cherry}, {Apple, Banana}, {Apple, Cherry}, {Apple}, {Banana, Cherry}, {Banana}, {Cherry}, {}} and we're done.
The code should be straightforward to write. First we'll need a helper method:
static IEnumerable<T> Prepend<T>(this IEnumerable<T> tail, T head)
{
yield return head;
foreach(T item in tail) yield return item;
}
And now the code is a straightforward translation of the description of the algorithm:
static IEnumerable<IEnumerable<T>> PowerSet<T>(this IEnumerable<T> items)
{
if (!items.Any())
yield return items; // { { } }
else
{
var head = items.First();
var powerset = items.Skip(1).PowerSet().ToList();
foreach(var set in powerset) yield return set.Prepend(head);
foreach(var set in powerset) yield return set;
}
}
Make sense?
----------- UPDATE ----------------
Sergey points out correctly that my code has a Schlemiel the Painter algorithm and therefore consumes huge amounts of time and memory; good catch Sergey. Here's an efficient version that uses an immutable stack:
class ImmutableList<T>
{
public static readonly ImmutableList<T> Empty = new ImmutableList<T>(null, default(T));
private ImmutableList(ImmutableList<T> tail, T head)
{
this.Head = head;
this.Tail = tail;
}
public T Head { get; private set; }
public ImmutableList<T> Tail { get; private set; }
public ImmutableList<T> Push(T head)
{
return new ImmutableList<T>(this, head);
}
public IEnumerable<ImmutableList<T>> PowerSet()
{
if (this == Empty)
yield return this;
else
{
var powerset = Tail.PowerSet();
foreach (var set in powerset) yield return set.Push(Head);
foreach (var set in powerset) yield return set;
}
}
}

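The same algorithm SergeyS mentions, using LINQ (where inputSet is the input and outputPowerSet is the output). Note that IndexOf returns the first matching position, so this assumes inputSet contains no duplicate values: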
int setLength = inputSet.Count;
int powerSetLength = 1 << setLength;
for (int bitMask = 0; bitMask < powerSetLength; bitMask++)
{
var subSet = from x in inputSet
where ((1 << inputSet.IndexOf(x)) & bitMask) != 0
select x;
outputPowerSet.Add(subSet.ToList());
}

Very late to the game, but why not the approach below? It seems significantly simpler than the suggestions posted here:
/*
Description for a sample set {1, 2, 2, 3}:
Step 1 - Start with {}:
{}
Step 2 - "Expand" previous set by adding 1:
{}
---
{1}
Step 3 - Expand previous set by adding the first 2:
{}
{1}
---
{2}
{1,2}
Step 4 - Expand previous set by adding the second 2:
{}
{1}
{2}
{1,2}
---
{2}
{1,2}
{2,2}
{1,2,2}
Step 5 - Expand previous set by adding 3:
{}
{1}
{2}
{1,2}
{2}
{1,2}
{2,2}
{1,2,2}
---
{3}
{1,3}
{2,3}
{1,2,3}
{2,3}
{1,2,3}
{2,2,3}
{1,2,2,3}
Total elements = 16 (i.e. 2^4), as expected.
*/
private static void PowerSet(IList<int> nums, ref IList<IList<int>> output)
{
// ToDo: validate args
output.Add(new List<int>());
ExpandSet(nums, 0, ref output);
}
private static void ExpandSet(IList<int> nums, int pos, ref IList<IList<int>> output)
{
if (pos == nums.Count)
{
return;
}
List<int> tmp;
int item = nums[pos];
for (int i = 0, n = output.Count; i < n; i++)
{
tmp = new List<int>();
tmp.AddRange(output[i]);
tmp.Add(item);
output.Add(tmp);
}
ExpandSet(nums, pos + 1, ref output);
}

Related

Implementing Levenshtein distance for reversed string combination?

I have an employees list in my application. Every employee has name and surname, so I have a list of elements like:
["Jim Carry", "Uma Turman", "Bill Gates", "John Skeet"]
I want my customers to be able to search employees by name with a fuzzy-searching algorithm. For example, if the user enters "Yuma Turmon", the closest element, "Uma Turman", should be returned. I use a Levenshtein distance algorithm that I found here.
static class LevenshteinDistance
{
/// <summary>
/// Compute the distance between two strings.
/// </summary>
public static int Compute(string s, string t)
{
int n = s.Length;
int m = t.Length;
int[,] d = new int[n + 1, m + 1];
// Step 1
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
// Step 2
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
// Step 3
for (int i = 1; i <= n; i++)
{
//Step 4
for (int j = 1; j <= m; j++)
{
// Step 5
int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = Math.Min(
Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
}
I compare the user's input (full name) against each name in the list of employees and check the distance. If it is below 3, for example, I return the employee found.
Now I want to allow users to search by reversed names. For example, if the user inputs "Turmon Uma" it should return "Uma Turman", because the real distance is 1 once you ignore whether the first name or the last name comes first. My algorithm currently treats them as different strings that are far apart. How can I modify it so that names are found regardless of order?
You can create a reversed version of employee names with LINQ. For example, if you have a list of employees like
x = ["Jim Carry", "Uma Turman", "Bill Gates", "John Skeet"]
you can write the following code:
var reversedNames = x.Select(p => $"{p.Split(' ')[1]} {p.Split(' ')[0]}");
It will return the reversed version, like:
xReversed = ["Carry Jim", "Turman Uma", "Gates Bill", "Skeet John"]
Then repeat your algorithm with this data too.
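Instead of keeping a second, reversed list, the same idea can be applied per comparison: score the query against both orders and keep the smaller distance. The following is a minimal sketch under that assumption; NameOrderSketch, Distance and DistanceEitherOrder are hypothetical names, and the compact two-row Levenshtein is included only so the example is self-contained.

```csharp
using System;

public static class NameOrderSketch
{
    // Compact two-row Levenshtein distance, here only to keep the sketch self-contained.
    public static int Distance(string s, string t)
    {
        var prev = new int[t.Length + 1];
        var cur = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) prev[j] = j;
        for (int i = 1; i <= s.Length; i++)
        {
            cur[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                cur[j] = Math.Min(Math.Min(prev[j] + 1, cur[j - 1] + 1), prev[j - 1] + cost);
            }
            var tmp = prev; prev = cur; cur = tmp;
        }
        return prev[t.Length];
    }

    // Hypothetical helper: compare the query with "First Last" and "Last First"
    // and keep the smaller distance. Names without exactly two parts are compared as-is.
    public static int DistanceEitherOrder(string query, string employeeName)
    {
        int direct = Distance(query, employeeName);
        string[] parts = employeeName.Split(' ');
        if (parts.Length != 2) return direct;
        string reversed = parts[1] + " " + parts[0];
        return Math.Min(direct, Distance(query, reversed));
    }
}
```

This only covers names made of exactly two words; longer names need the per-word matching described in the other answer.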
A few thoughts, as this is a potentially complicated problem to get right:
1) Split each employee name into a list of strings. Personally, I'd probably discard anything with 2 or fewer letters, unless that's all the name is composed of. This should help with surnames like "De La Cruz" which might get searched as "dela cruz". Store the list of names for each employee in a dictionary that points back to that employee.
2) Split the search terms in the same way you split the names in the list. For each search term find the names with the lowest Levenshtein distance, then for each one, starting at the lowest, repeat the search with the rest of the search terms against the other names for that employee. Repeat this step with each word in the query. For example, if the query is John Smith, find the best single word name matches for John, then match remaining names for those "best match" employees on Smith, and get a sum of the distances. Then find the best matches for Smith and match remaining names on John, and sum the distances. The best match is the one with the lowest total distance. You can provide a list of best matches by returning the top 10, say, sorted by total distance. And it won't matter which way around the names in the database or the search terms are. In fact they could be completely out of order and it wouldn't matter.
3) Consider how to handle hyphenated names. I'd probably split them as if they were not hyphenated.
4) Consider upper/lower case characters, if you haven't already. You should store lookups in one case and convert the search terms to the same case before comparison.
5) Be careful of accented letters such as á; many people have them in their names. Your algorithm won't work correctly with them. Be even more careful if you expect to ever have non-alpha double byte characters, eg. Chinese, Japanese, Arabic, etc.
Two more benefits of splitting the names of each employee:
"Unused" names won't count against the total, so if I only search using the last name, it won't count against me in finding the shortest distance.
Along the same lines, you could apply some extra rules to help with finding non-standard names. For example, hyphenated names could be stored both as hyphenated (eg. Wells-Harvey), compound (WellsHarvey) and individual names (Wells and Harvey separate), all against the same employee. A low-distance match on any one name is a low-distance match on the employee, again extra names don't count against the total.
Here's some basic code that seems to work, however it only really takes into account points 1, 2 and 4:
using System;
using System.Collections.Generic;
using System.Linq;
namespace EmployeeSearch
{
static class Program
{
static List<string> EmployeesList = new List<string>() { "Jim Carrey", "Uma Thurman", "Bill Gates", "Jon Skeet" };
static Dictionary<int, List<string>> employeesById = new Dictionary<int, List<string>>();
static Dictionary<string, List<int>> employeeIdsByName = new Dictionary<string, List<int>>();
static void Main()
{
Init();
var results = FindEmployeeByNameFuzzy("Umaa Thurrmin");
// Returns:
// (1) Uma Thurman Distance: 3
// (0) Jim Carrey Distance: 10
// (3) Jon Skeet Distance: 11
// (2) Bill Gates Distance: 12
Console.WriteLine(string.Join("\r\n", results.Select(r => $"({r.Id}) {r.Name} Distance: {r.Distance}")));
results = FindEmployeeByNameFuzzy("Tormin Oma"); // reuse the variable declared above
// Returns:
// (1) Uma Thurman Distance: 4
// (3) Jon Skeet Distance: 7
// (0) Jim Carrey Distance: 8
// (2) Bill Gates Distance: 9
Console.WriteLine(string.Join("\r\n", results.Select(r => $"({r.Id}) {r.Name} Distance: {r.Distance}")));
Console.Read();
}
private static void Init() // prepare our lists
{
for (int i = 0; i < EmployeesList.Count; i++)
{
// Preparing the list of names for each employee - add special cases such as hyphenation here as well
var names = EmployeesList[i].ToLower().Split(new char[] { ' ' }).ToList();
employeesById.Add(i, names);
// This is not used here, but could come in handy if you want a unique index of names pointing to employee ids for optimisation:
foreach (var name in names)
{
if (employeeIdsByName.ContainsKey(name))
{
employeeIdsByName[name].Add(i);
}
else
{
employeeIdsByName.Add(name, new List<int>() { i });
}
}
}
}
private static List<SearchResult> FindEmployeeByNameFuzzy(string query)
{
var results = new List<SearchResult>();
// Notice we're splitting the search terms the same way as we split the employee names above (could be refactored out into a helper method)
var searchterms = query.ToLower().Split(new char[] { ' ' });
// Comparison with each employee
for (int i = 0; i < employeesById.Count; i++)
{
var r = new SearchResult() { Id = i, Name = EmployeesList[i] };
var employeenames = employeesById[i];
foreach (var searchterm in searchterms)
{
int min = searchterm.Length;
// for each search term get the min distance for all names for this employee
foreach (var name in employeenames)
{
var distance = LevenshteinDistance.Compute(searchterm, name);
min = Math.Min(min, distance);
}
// Sum the minimums for all search terms
r.Distance += min;
}
results.Add(r);
}
// Order by lowest distance first
return results.OrderBy(e => e.Distance).ToList();
}
}
public class SearchResult
{
public int Distance { get; set; }
public int Id { get; set; }
public string Name { get; set; }
}
public static class LevenshteinDistance
{
/// <summary>
/// Compute the distance between two strings.
/// </summary>
public static int Compute(string s, string t)
{
int n = s.Length;
int m = t.Length;
int[,] d = new int[n + 1, m + 1];
// Step 1
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
// Step 2
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
// Step 3
for (int i = 1; i <= n; i++)
{
//Step 4
for (int j = 1; j <= m; j++)
{
// Step 5
int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = Math.Min(
Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
}
}
Simply call Init() when you start, then call
var results = FindEmployeeByNameFuzzy(userquery);
to return an ordered list of the best matches.
Disclaimers: This code is not optimal and has only been briefly tested, doesn't check for nulls, could explode and kill a kitten, etc, etc. If you have a large number of employees then this could be very slow. There are several improvements that could be made, for example when looping over the Levenshtein algorithm you could drop out if the distance gets above the current minimum distance.
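The early-exit improvement mentioned in the disclaimer could be sketched as follows. This is not the code from the answer: BoundedLevenshtein and its maxDistance parameter are hypothetical, and the sketch assumes maxDistance is a small non-negative number such as the current best distance.

```csharp
using System;

public static class BoundedLevenshtein
{
    // Returns the Levenshtein distance between s and t, or maxDistance + 1
    // as soon as the distance is known to exceed maxDistance.
    public static int Compute(string s, string t, int maxDistance)
    {
        // The distance is at least the difference in lengths.
        if (Math.Abs(s.Length - t.Length) > maxDistance) return maxDistance + 1;

        var prev = new int[t.Length + 1];
        var cur = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) prev[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            cur[0] = i;
            int rowMin = cur[0];
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                cur[j] = Math.Min(Math.Min(prev[j] + 1, cur[j - 1] + 1), prev[j - 1] + cost);
                rowMin = Math.Min(rowMin, cur[j]);
            }
            // Every value in later rows is at least the minimum of this row,
            // so the final distance cannot drop back below rowMin.
            if (rowMin > maxDistance) return maxDistance + 1;
            var tmp = prev; prev = cur; cur = tmp;
        }
        return Math.Min(prev[t.Length], maxDistance + 1);
    }
}
```

In the inner loop of FindEmployeeByNameFuzzy, the current value of min could be passed as maxDistance, so that names which cannot beat the best match so far are abandoned early.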

Generating a unique, incremental char array [duplicate]

I need assistance with Combinations with Repetition. I have searched all over the net and, although I found a few examples, I can't understand them completely. My goal is simple: a function (CombinationsWithRepetition) receives a list of items (in this case integer values) and a length (how long each combination should be) and returns a list containing the result.
List<int> input = new List<int>() {1, 2, 3};
CombinationsWithRepetition(input, length);
result:
length = 1: 1, 2, 3
length = 2: 11,12,13,21,22,23,31,32,33
length = 3: 111,112 ....
I hope someone helps me and thank you in advance!
recursion
Ok,
here is the C# version - I walk you through it
static IEnumerable<String> CombinationsWithRepetition(IEnumerable<int> input, int length)
{
if (length <= 0)
yield return "";
else
{
foreach(var i in input)
foreach(var c in CombinationsWithRepetition(input, length-1))
yield return i.ToString() + c;
}
}
First you check the border case for the recursion (here, length <= 0). In that case the answer is the empty string (by the way, I chose to return strings as you did not say what you really needed; it should be easy to change).
In any other case you look at each input i, recursively take the next-smaller combinations and just join them together (with string concatenation, because I wanted strings).
I hope you understand the IEnumerable/yield stuff. If not, please say so in the comments.
Here is a sample output:
foreach (var c in CombinationsWithRepetition(new int[]{1,2,3}, 3))
Console.WriteLine (c);
111
112
113
...
332
333
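Since the answer notes that returning strings should be easy to change, here is a rough sketch of the same recursion that yields lists of items instead. CombinationsSketch is a hypothetical name and the choice of List<T> for each result is an assumption, not something the question specified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombinationsSketch
{
    // Hypothetical generic variant: each result is a list of items rather than a string.
    public static IEnumerable<List<T>> CombinationsWithRepetition<T>(IReadOnlyList<T> input, int length)
    {
        if (length <= 0)
        {
            yield return new List<T>();   // base case: one empty combination
            yield break;
        }
        foreach (T item in input)
        {
            foreach (List<T> rest in CombinationsWithRepetition(input, length - 1))
            {
                // Put the current item in front of every shorter combination.
                var combination = new List<T>(length) { item };
                combination.AddRange(rest);
                yield return combination;
            }
        }
    }
}
```

The results come out in the same order as the string version, so for the input {1, 2, 3} and length 2 the first list is {1, 1} and the last is {3, 3}.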
converting numbers
The following uses the idea I sketched in the comment below and has no problems with stack-overflow exceptions (recursion might for a big length). This too assumes strings, as they are easier to work with (and I can do a simple PadLeft to simplify things).
static String Convert(string symbols, int number, int totalLen)
{
var result = "";
var len = symbols.Length;
var nullSym = symbols [0];
while (number > 0)
{
var index = number % len;
number = number / len;
result = symbols [index] + result;
}
return result.PadLeft (totalLen, nullSym);
}
static IEnumerable<String> CombinationsWithRepetition(string symbols, int len)
{
for (var i = 0; i < Math.Pow(symbols.Length,len); i++)
yield return Convert (symbols, i, len);
}
string[] items = {"1", "2", "3"};
var query = from i1 in items
from i2 in items
from i3 in items
select i1 + i2 + i3 ;
foreach(var result in query)
Console.WriteLine(result);
Console.ReadKey();

Compare current and last value in list

Suppose there is a moving average: 2, 4, 6, 8, 10, ... n.
Then I add the current value (10) to a list:
List<int> numHold = new List<int>();
numHold.Add(currentvalue);
Inside the list:
the current value is added
10
and so on
20
30
40 etc
by using
var lastdigit = numHold[numHold.Count - 1];
I can get the last digit but the output is
current: 10 last: 10
current: 20 last: 20
the output should be
current: 20 last: 10
Thanks
Typically, C# indexers start from 0, so the first element has index 0. On the other hand, Count/Length will use 1 for one element. So your
numHold[numHold.Count - 1]
actually takes the last element in the list. If you need the one before that, you need to use - 2, though be careful not to reach outside the bounds of the list (something like Math.Max(0, numHold.Count - 2) might be appropriate).
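The index arithmetic described above can be sketched as a small helper. LastTwoSketch and PreviousValue are hypothetical names; falling back to the only element when the list has a single value is an assumption about what the caller wants.

```csharp
using System;
using System.Collections.Generic;

public static class LastTwoSketch
{
    // Hypothetical helper: returns the element before the last one,
    // or the only element when the list holds a single value.
    public static int PreviousValue(IReadOnlyList<int> values)
    {
        if (values.Count == 0)
            throw new InvalidOperationException("The list is empty.");
        return values[Math.Max(0, values.Count - 2)];
    }
}
```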
You can also store the values in separate variables:
List<int> nums = new List<int> { 1 };
int current = 1;
int last = current;
for (int i = 0; i < 10; i++)
{
last = current;
current = i * 2;
nums.Add(current);
}
Console.WriteLine("Current: {0}", current);
Console.WriteLine("Last: {0}", last);
The question is rather unclear, but if you are using the moving average to draw a line graph 📈, you could use a circular buffer. You can implement one yourself with an object that holds an array of a fixed size and the next available position. You could also download a NuGet package that already provides one.
A relatively simple way to calculate a moving average is to use a circular buffer to hold the last N values (where N is the number of values for which to compute a moving average).
For example:
public sealed class MovingAverage
{
private readonly int _max;
private readonly double[] _numbers;
private double _total;
private int _front;
private int _count;
public MovingAverage(int max)
{
_max = max;
_numbers = new double[max];
}
public double Average
{
get { return _total / _count; }
}
public void Add(double value)
{
_total += value;
if (_count == _max)
_total -= _numbers[_front];
else
++_count;
_numbers[_front] = value;
_front = (_front+1)%_max;
}
};
which you might use like this:
var test = new MovingAverage(11);
for (int i = 0; i < 25; ++i)
{
test.Add(i);
Console.WriteLine(test.Average);
}
Note that this code is optimised for speed. After a large number of iterations, you might start to get rounding errors. You can avoid this by adding to class MovingAverage a slower method to calculate the average instead of using the Average property:
public double AccurateAverage()
{
double total = 0;
for (int i = 0, j = _front; i < _count; ++i)
{
// _front is the next write slot, so step back before reading.
if (--j < 0)
j = _max - 1;
total += _numbers[j];
}
return total/_count;
}
The first item you add will always be at position 0, and the most recent one at Count - 1.
List<int> numHold = new List<int>();
numHold.Add(currentvalue); // Adding 10
numHold[0]; // will contain 10
numHold.Add(currentvalue); // Adding 20
numHold[0]; // will contain 10
numHold[numHold.Count - 1]; // will contain 20
A more readable way to get the first and last values (requires System.Linq) is
numHold.First(); // oldest value (10)
numHold.Last(); // most recent value (20)

Find the contiguous sequence with the largest product in an integer array

I have come up with the code below but that doesn't satisfy all cases, e.g.:
Array consisting all 0's
Array having negative values (this is a bit tricky, since the product of two negative ints is positive)
public static int LargestProduct(int[] arr)
{
//returning arr[0] if it has only one element
if (arr.Length == 1) return arr[0];
int product = 1;
int maxProduct = Int32.MinValue;
for (int i = 0; i < arr.Length; i++)
{
//this block store the largest product so far when it finds 0
if (arr[i] == 0)
{
if (maxProduct < product)
{
maxProduct = product;
}
product = 1;
}
else
{
product *= arr[i];
}
}
if (maxProduct > product)
return maxProduct;
else
return product;
}
How can I incorporate the above cases and correct the code? Please suggest.
I am basing my answer on the assumption that if you have more than 1 element in the array, you would want to multiply at least 2 contiguous integers for checking the output (i.e. in an array of {-1, 15}, the output that you want is -15 and not 15).
The problem that we need to solve is to look at all possible multiplication combinations and find out the max product out of them.
The total number of products in an array of n integers would be nC2 i.e. if there are 2 elements, then the total multiplication combinations would be 1, for 3, it would be 3, for 4, it would be 6 and so on.
For each number that we have in the incoming array, it has to multiply with all the multiplications that we did with the last element and keep the max product till now and if we do it for all the elements, at the end we would be left with the maximum product.
This should work for negatives and zeros.
public static long LargestProduct(int[] arr)
{
if (arr.Length == 1)
return arr[0];
int lastNumber = 1;
List<long> latestProducts = new List<long>();
long maxProduct = Int64.MinValue;
for (int i = 0; i < arr.Length; i++)
{
var item = arr[i];
long latest = (long)lastNumber * item; // widen first so the int multiplication cannot overflow
var temp = new long[latestProducts.Count];
latestProducts.CopyTo(temp);
latestProducts.Clear();
foreach (var p in temp)
{
var product = p * item;
if (product > maxProduct)
maxProduct = product;
latestProducts.Add(product);
}
if (i != 0)
{
if (latest > maxProduct)
maxProduct = latest;
latestProducts.Add(latest);
}
lastNumber = item;
}
return maxProduct;
}
If you want the maximum product to also consider a single element of the array, i.e. {-1, 15} should return 15, then you can compare the max product with the element of the array being processed. That gives you the right answer when a single element is the maximum.
This can be achieved by adding the following code inside the for loop at the end.
if (item > maxProduct)
maxProduct = item;
Your basic problem is 2 parts. Break them down and solving it becomes easier.
1) Find all contiguous subsets.
Since your source sequence can have negative values, you are not really equipped to make any value judgments until you've found each subset, as a negative can later be "cancelled" by another. So let the first phase be to only find the subsets.
An example of how you might do this is the following code
// will contain all contiguous subsets
var sequences = new List<Tuple<bool, List<int>>>();
// build subsets
foreach (int item in source)
{
var deadCopies = new List<Tuple<bool, List<int>>>();
foreach (var record in sequences.Where(r => r.Item1 && !r.Item2.Contains(0)))
{
// make a copy that is "dead"
var deadCopy = new Tuple<bool, List<int>>(false, record.Item2.ToList());
deadCopies.Add(deadCopy);
record.Item2.Add(item);
}
sequences.Add(new Tuple<bool, List<int>>(true, new List<int> { item }));
sequences.AddRange(deadCopies);
}
In the above code, I'm building all my contiguous subsets, while taking the liberty of not adding anything to a given subset that already has a 0 value. You can omit that particular behavior if you wish.
2) Calculate each subset's product and compare that to a max value.
Once you have found all of your qualifying subsets, the next part is easy.
// find subset with highest product
int maxProduct = int.MinValue;
IEnumerable<int> maxSequence = Enumerable.Empty<int>();
foreach (var record in sequences)
{
int product = record.Item2.Aggregate((a, b) => a * b);
if (product > maxProduct)
{
maxProduct = product;
maxSequence = record.Item2;
}
}
Add whatever logic you wish to restrict the length of the original source or the subset candidates or product values. For example, if you wish to enforce minimum length requirements on either, or if a subset product of 0 is allowed if a non-zero product is available.
Also, I make no claims as to the performance of the code, it is merely to illustrate breaking the problem down into its parts.
I think you should have 2 products at the same time - they will differ in signs.
About case, when all values are zero - you can check at the end if maxProduct is still Int32.MinValue (if Int32.MinValue is really not possible)
My variant:
int maxProduct = Int32.MinValue;
int? productWithPositiveStart = null;
int? productWithNegativeStart = null;
for (int i = 0; i < arr.Length; i++)
{
if (arr[i] == 0)
{
productWithPositiveStart = null;
productWithNegativeStart = null;
}
else
{
if (arr[i] > 0 && productWithPositiveStart == null)
{
productWithPositiveStart = arr[i];
}
else if (productWithPositiveStart != null)
{
productWithPositiveStart *= arr[i];
maxProduct = Math.Max(maxProduct, productWithPositiveStart.Value);
}
if (arr[i] < 0 && productWithNegativeStart == null)
{
productWithNegativeStart = arr[i];
}
else if (productWithNegativeStart != null)
{
productWithNegativeStart *= arr[i];
maxProduct = Math.Max(maxProduct, productWithNegativeStart.Value);
}
maxProduct = Math.Max(arr[i], maxProduct);
}
}
if (maxProduct == Int32.MinValue)
{
maxProduct = 0;
}
At a high level, your current algorithm splits the array upon a 0 and returns the largest contiguous product of these sub-arrays. Any further iterations will be on the process of finding the largest contiguous product of a sub-array where no elements are 0.
To take into account negative numbers, we obviously first need to test if the product of one of these sub-arrays is negative, and take some special action if it is.
The negative result comes from an odd number of negative values, so we need to remove one of these negative values to make the result positive again. To do this we remove either all elements up to and including the first negative number, or the last negative number and all elements after it, whichever results in the higher product.
To take into account an array of all 0's, simply use 0 as your starting maxProduct. If the array is a single negative value, your special-case handling of a single element means that value is returned. Otherwise, there will always be a positive sub-sequence product, or else the whole array is 0 and it should return 0 anyway.
It can be done in O(N). It is based on a simple idea: keep track of the minimum (minCurrent) and maximum (maxCurrent) product ending at index i. This can easily be changed to handle inputs like {0,0,-2,0}, {-2,-3,-8} or {0,0}.
a[] = {6, -3, 2, 0, 3, -2, -4, -2, 4, 5}
steps of the algorithm given below for the above array a :
private static int getMaxProduct(int[] a) {
if (a.length == 0) {
throw new IllegalArgumentException();
}
int minCurrent = 1, maxCurrent = 1, max = Integer.MIN_VALUE;
for (int current : a) {
if (current > 0) {
maxCurrent = maxCurrent * current;
minCurrent = Math.min(minCurrent * current, 1);
} else if (current == 0) {
maxCurrent = 1;
minCurrent = 1;
} else {
int x = maxCurrent;
maxCurrent = Math.max(minCurrent * current, 1);
minCurrent = x * current;
}
if (max < maxCurrent) {
max = maxCurrent;
}
}
//System.out.println(minCurrent);
return max;
}
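For readers following the thread in C#, the min/max tracking idea can be sketched as below. This is not a port of the Java code above but a common variant of the same idea that also handles single negative values and arrays of zeros; MaxProductSketch is a hypothetical name, and the sketch assumes the products fit in a long.

```csharp
using System;

public static class MaxProductSketch
{
    // Largest product of any non-empty contiguous run of values.
    // Assumption: intermediate products fit in a long.
    public static long LargestProduct(int[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));

        long maxEndingHere = values[0];
        long minEndingHere = values[0];
        long best = values[0];

        for (int i = 1; i < values.Length; i++)
        {
            long x = values[i];
            if (x < 0)
            {
                // A negative factor turns the smallest product into the largest and vice versa.
                long tmp = maxEndingHere;
                maxEndingHere = minEndingHere;
                minEndingHere = tmp;
            }
            // Either extend the run ending at the previous element or start a new run at x.
            maxEndingHere = Math.Max(x, maxEndingHere * x);
            minEndingHere = Math.Min(x, minEndingHere * x);
            best = Math.Max(best, maxEndingHere);
        }
        return best;
    }
}
```

For the example array {6, -3, 2, 0, 3, -2, -4, -2, 4, 5} this gives 160, which comes from the run -4, -2, 4, 5.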

C#: Cleanest way to divide a string array into N instances N items long

I know how to do this in an ugly way, but am wondering if there is a more elegant and succinct method.
I have a string array of e-mail addresses. Assume the string array is of arbitrary length -- it could have a few items or it could have a great many items. I want to build another string consisting of say, 50 email addresses from the string array, until the end of the array, and invoke a send operation after each 50, using the string of 50 addresses in the Send() method.
The question more generally is what's the cleanest/clearest way to do this kind of thing. I have a solution that's a legacy of my VBScript learnings, but I'm betting there's a better way in C#.
You want elegant and succinct, I'll give you elegant and succinct:
var fifties = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/50;
foreach(var fifty in fifties)
Send(string.Join(";", fifty.ToArray()));
Why mess around with all that awful looping code when you don't have to? You want to group things by fifties, then group them by fifties.
That's what the group operator is for!
UPDATE: commenter MoreCoffee asks how this works. Let's suppose we wanted to group by threes, because that's easier to type.
var threes = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/3;
Let's suppose that there are nine addresses, indexed zero through eight.
What does this query mean?
The Enumerable.Range is a range of nine numbers starting at zero, so 0, 1, 2, 3, 4, 5, 6, 7, 8.
Range variable index takes on each of these values in turn.
We then go over each corresponding addresses[index] and assign it to a group.
What group do we assign it to? To group index/3. Integer arithmetic rounds towards zero in C#, so indexes 0, 1 and 2 become 0 when divided by 3. Indexes 3, 4, 5 become 1 when divided by 3. Indexes 6, 7, 8 become 2.
So we assign addresses[0], addresses[1] and addresses[2] to group 0, addresses[3], addresses[4] and addresses[5] to group 1, and so on.
The result of the query is a sequence of three groups, and each group is a sequence of three items.
Does that make sense?
Remember also that the result of the query expression is a query which represents this operation. It does not perform the operation until the foreach loop executes.
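To make the walkthrough concrete, here is a minimal sketch that wraps the same query in a helper. The class and method names (`GroupingDemo`, `JoinInBatches`) are made up for illustration, and the semicolon separator is assumed from the first snippet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Hypothetical helper: groups array items into batches of `size`
    // by integer-dividing each index, then joins every batch with ';'.
    public static List<string> JoinInBatches(string[] addresses, int size)
    {
        // This only builds the query; nothing is grouped yet.
        var batches = from index in Enumerable.Range(0, addresses.Length)
                      group addresses[index] by index / size;

        // The grouping actually runs when the foreach enumerates the query.
        var result = new List<string>();
        foreach (var batch in batches)
            result.Add(string.Join(";", batch.ToArray()));
        return result;
    }
}
```

With eight addresses `a0` through `a7` and a size of 3, this should give `a0;a1;a2`, `a3;a4;a5` and `a6;a7`. The last group is simply shorter, so no special handling is needed for leftovers.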
Seems similar to this question: Split a collection into n parts with LINQ?
A modified version of Hasan Khan's answer there should do the trick:
public static IEnumerable<IEnumerable<T>> Chunk<T>(
    this IEnumerable<T> list, int chunkSize)
{
    int i = 0;
    var chunks = from name in list
                 group name by i++ / chunkSize into part
                 select part.AsEnumerable();
    return chunks;
}
Usage example:
var addresses = new[] { "a@example.com", "b@example.org", ...... };
foreach (var chunk in Chunk(addresses, 50))
{
    SendEmail(chunk.ToArray(), "Buy V14gr4");
}
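One caveat with the `Chunk` method above, as far as I can tell: the counter `i` is captured by the query, and the query is deferred, so enumerating the result a second time keeps counting from where the first pass stopped and can produce different chunk boundaries. A possible variation, sketched here under the hypothetical name `ChunkByIndex`, takes the index from the indexed `Select` overload instead, so it starts at zero on every enumeration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkExtensions
{
    // Hypothetical name. Same grouping idea as Chunk, but without a
    // captured counter that survives between enumerations.
    public static IEnumerable<IEnumerable<T>> ChunkByIndex<T>(
        this IEnumerable<T> source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        return source
            .Select((item, index) => new { item, index })          // pair each item with its position
            .GroupBy(pair => pair.index / chunkSize, pair => pair.item)
            .Select(g => g.AsEnumerable());
    }
}
```

Like the original, this still buffers the whole input inside `GroupBy` before the first chunk comes out, so it is a sketch for in-memory lists rather than for very large streams.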
It sounds like the input consists of separate email address strings in a large array, not several email addresses in one string, right? And in the output, each batch is a single combined string.
string[] allAddresses = GetLongArrayOfAddresses();
const int batchSize = 50;
for (int n = 0; n < allAddresses.Length; n += batchSize)
{
    string batch = string.Join(";", allAddresses, n,
        Math.Min(batchSize, allAddresses.Length - n));
    // use batch somehow
}
Assuming you are using .NET 3.5 and C# 3, something like this should work nicely:
string[] s = new string[] {"1", "2", "3", "4"....};
for (int i = 0; i < s.Length; i = i + 50)
{
    // Named "batch" because a second variable called "s" would not compile.
    string batch = string.Join(";", s.Skip(i).Take(50).ToArray());
    DoSomething(batch);
}
I would just loop through the array and use a StringBuilder to create the list (I'm assuming it's separated by ; like you would for email). Just send when you hit mod 50 or the end.
void Foo(string[] addresses)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < addresses.Length; i++)
    {
        sb.Append(addresses[i]);
        if ((i + 1) % 50 == 0 || i == addresses.Length - 1)
        {
            Send(sb.ToString());
            sb = new StringBuilder();
        }
        else
        {
            sb.Append("; ");
        }
    }
}
void Send(string addresses)
{
}
I think we need to have a little bit more context on what exactly this list looks like to give a definitive answer. For now I'm assuming that it's a semicolon-delimited list of email addresses. If so, you can do the following to get a chunked-up list.
public IEnumerable<string> DivideEmailList(string list) {
    var last = 0;
    var cur = list.IndexOf(';');
    while ( cur >= 0 ) {
        yield return list.Substring(last, cur-last);
        last = cur + 1;
        cur = list.IndexOf(';', last);
    }
    // Return the final address, which has no trailing semicolon.
    if ( last < list.Length ) {
        yield return list.Substring(last);
    }
}
public IEnumerable<List<string>> ChunkEmails(string list) {
    using ( var e = DivideEmailList(list).GetEnumerator() ) {
        // Named "chunk" so it does not clash with the "list" parameter.
        var chunk = new List<string>();
        while ( e.MoveNext() ) {
            chunk.Add(e.Current);
            if ( chunk.Count == 50 ) {
                yield return chunk;
                chunk = new List<string>();
            }
        }
        if ( chunk.Count != 0 ) {
            yield return chunk;
        }
    }
}
I think this is simple and fast enough. The example below divides the long sentence into parts of 15 characters each, but you can pass the batch size as a parameter to make it dynamic. Here I simply use "/n" as the separator between parts.
private static string Concatenated(string longsentence)
{
    const int batchSize = 15;
    string concatenated = "";
    int chunks = longsentence.Length / batchSize;
    int currentIndex = 0;
    while (chunks > 0)
    {
        var sub = longsentence.Substring(currentIndex, batchSize);
        concatenated += sub + "/n";
        chunks -= 1;
        currentIndex += batchSize;
    }
    if (currentIndex < longsentence.Length)
    {
        int start = currentIndex;
        var finalsub = longsentence.Substring(start);
        concatenated += finalsub;
    }
    return concatenated;
}
This shows the result of the split operation:
var parts = Concatenated(longsentence).Split(new string[] { "/n" }, StringSplitOptions.None);
Extension methods based on Eric's answer:
public static IEnumerable<IEnumerable<T>> SplitIntoChunks<T>(this T[] source, int chunkSize)
{
    var chunks = from index in Enumerable.Range(0, source.Length)
                 group source[index] by index / chunkSize;
    return chunks;
}
public static T[][] SplitIntoArrayChunks<T>(this T[] source, int chunkSize)
{
    var chunks = from index in Enumerable.Range(0, source.Length)
                 group source[index] by index / chunkSize;
    return chunks.Select(e => e.ToArray()).ToArray();
}
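For anyone on a newer runtime: .NET 6 and later ship `Enumerable.Chunk` in System.Linq, which splits a sequence into arrays of at most the given size, so a hand-written helper may not be needed there. A minimal sketch, assuming .NET 6 or later and a semicolon separator as in the answers above (`BuiltInChunkDemo` and `JoinAddressBatches` are made-up names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInChunkDemo
{
    // Hypothetical helper: joins each batch of up to `size` addresses
    // into one ';'-separated string. Requires .NET 6+ for Enumerable.Chunk.
    public static List<string> JoinAddressBatches(IEnumerable<string> addresses, int size)
    {
        return addresses
            .Chunk(size)                               // each batch is a string[]
            .Select(batch => string.Join(";", batch))
            .ToList();
    }
}
```

Each resulting string could then be passed to whatever send operation is in use, one call per batch.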
