Set subtraction while keeping duplicates - c#

I need to get the set subtraction of two string arrays while considering duplicates.
Ex:
var a = new string[] {"1", "2", "2", "3", "4", "4"};
var b = new string[] {"2", "3"};
(a - b) => expected output => string[] {"1", "2", "4", "4"}
I already tried Enumerable.Except() which returns the unique values after subtract: { "1", "4" } which is not what I'm looking for.
Is there a straightforward way of achieving this without a custom implementation?

You can try GroupBy, and work with groups e.g.
var a = new string[] {"1", "2", "2", "3", "4", "4"};
var b = new string[] {"2", "3"};
...
var subtract = b
    .GroupBy(item => item)
    .ToDictionary(chunk => chunk.Key, chunk => chunk.Count());
var result = a
    .GroupBy(item => item)
    .Select(chunk => new {
        value = chunk.Key,
        count = chunk.Count() - (subtract.TryGetValue(chunk.Key, out var v) ? v : 0)
    })
    .Where(item => item.count > 0)
    .SelectMany(item => Enumerable.Repeat(item.value, item.count));
// Let's have a look at the result
Console.Write(string.Join(", ", result));
Outcome:
1, 2, 4, 4

By leveraging the underused Enumerable.ToLookup (which creates a dictionary-like structure with multiple values per key) you can do this quite efficiently. Because indexing an ILookup with a non-existent key returns an empty sequence (rather than null or an error), you can avoid a whole bunch of null-checks/TryGet... boilerplate. And because Enumerable.Take with a negative count behaves like Take(0), we don't have to clamp our arithmetic either.
var aLookup = a.ToLookup(x => x);
var bLookup = b.ToLookup(x => x);
var filtered = aLookup
.SelectMany(aItem => aItem.Take(aItem.Count() - bLookup[aItem.Key].Count()));
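The two properties the ToLookup answer relies on can be checked with a small program. This is only a sketch: the class and method names are hypothetical, and the second array is deliberately extended with a surplus "3" and an unknown "5" to exercise the negative-count and missing-key cases.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupSubtraction
{
    // Hypothetical helper name; wraps the ToLookup approach shown above.
    public static List<string> Subtract(IEnumerable<string> a, IEnumerable<string> b)
    {
        var aLookup = a.ToLookup(x => x);
        var bLookup = b.ToLookup(x => x);

        return aLookup
            // bLookup[key] is an empty sequence for a missing key, so Count() is 0.
            // Take with a zero or negative count yields nothing.
            .SelectMany(g => g.Take(g.Count() - bLookup[g.Key].Count()))
            .ToList();
    }

    public static void Main()
    {
        var a = new[] { "1", "2", "2", "3", "4", "4" };
        var b = new[] { "2", "3", "3", "5" }; // assumed input with surplus items
        Console.WriteLine(string.Join(", ", Subtract(a, b))); // prints "1, 2, 4, 4"
    }
}
```

Note that the result is grouped by key, so equal items end up next to each other even if they were not adjacent in the input.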

Try the following:
var a = new string[] { "1", "2", "2", "3", "4", "4" }.ToList();
var b = new string[] { "2", "3" };
foreach (var element in b)
{
a.Remove(element);
}
Has been tested.
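List<T>.Remove deletes the first matching occurrence and scans the list on every call, which is fine for small inputs. For larger inputs, a sketch along the following lines keeps the same "remove one occurrence per item" behaviour and the original order in a single pass. The extension method name is hypothetical (it is not part of LINQ), and the items are assumed to be non-null because they are used as dictionary keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultisetExtensions
{
    // Hypothetical extension method; assumes non-null items (dictionary keys).
    public static IEnumerable<T> ExceptWithDuplicates<T>(
        this IEnumerable<T> source, IEnumerable<T> toRemove)
    {
        // Count how many occurrences of each item still have to be removed.
        var pending = new Dictionary<T, int>();
        foreach (var item in toRemove)
        {
            pending.TryGetValue(item, out var count); // count stays 0 when missing
            pending[item] = count + 1;
        }

        foreach (var item in source)
        {
            if (pending.TryGetValue(item, out var count) && count > 0)
                pending[item] = count - 1; // swallow one occurrence
            else
                yield return item;         // keep everything else, in order
        }
    }
}

public static class MultisetDemo
{
    public static void Main()
    {
        var a = new[] { "1", "2", "2", "3", "4", "4" };
        var b = new[] { "2", "3" };
        Console.WriteLine(string.Join(", ", a.ExceptWithDuplicates(b))); // prints "1, 2, 4, 4"
    }
}
```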

Related

How linq between 2 lists by StartsWith

I have two list of string.
var list1 = new List<string> { "1", "12", "21", "34", "22" };
var list2 = new List<string> { "1", "2" };
I need to select the items of list1 that start with any of the items in list2: "1", "12", "21", "22"
//foreach solution : "1", "12", "21", "22"
var result1 = new List<string>();
foreach (var item in list2)
result1.AddRange(list1.Where(x => x.StartsWith(item)).ToList());
//linq solution : "1"
var result2 = list1.Where(x => list2.Contains(x)).ToList();
How can I get result1 by linq solution?
You can use a combination of Where with Any like:
var query = list1.Where(s1 => list2.Any(s2 => s1.StartsWith(s2))).ToList();
and you will end up with:
{"1","12","21","22"}
another option is doing the cross join and then query like:
var query = from s1 in list1
from s2 in list2
where s1.StartsWith(s2)
select s1;
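The Where/Any form and the cross join give the same result for the lists in the question, but they can differ when one item matches more than one prefix: Any reports each item at most once, while the cross join yields it once per matching prefix. A small sketch of the difference follows; the class name, method names, and the overlapping prefix list are made up for illustration, and StringComparison.Ordinal is an assumption about the intended comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixFilter
{
    // Each item appears at most once, however many prefixes it matches.
    public static List<string> WithAny(List<string> items, List<string> prefixes) =>
        items.Where(s => prefixes.Any(p => s.StartsWith(p, StringComparison.Ordinal)))
             .ToList();

    // Each item appears once per matching prefix.
    public static List<string> WithCrossJoin(List<string> items, List<string> prefixes) =>
        (from s in items
         from p in prefixes
         where s.StartsWith(p, StringComparison.Ordinal)
         select s).ToList();

    public static void Main()
    {
        var list1 = new List<string> { "1", "12", "21", "34", "22" };
        var overlapping = new List<string> { "1", "12" }; // hypothetical prefixes

        Console.WriteLine(string.Join(",", WithAny(list1, overlapping)));       // prints "1,12"
        Console.WriteLine(string.Join(",", WithCrossJoin(list1, overlapping))); // prints "1,12,12"
    }
}
```

If the cross join is preferred, appending Distinct() removes the repeats, at the cost of also collapsing genuine duplicates in list1.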
var result = list1.Where(x => list2.Any(y => x.StartsWith(y))).ToList();
Using NinjaNye.SearchExtensions you can do something like the following
using NinjaNye.SearchExtensions;
var query = list1.Search(x => x).StartsWith(list2);

How to remove duplicate record using linq?

I have a list of lists and want to remove duplicates from it.
The data is stored as IEnumerable<IEnumerable<string>> tableData.
If we think of it as a table,
the parent list holds the rows and each child list holds the column values of one row.
Now I want to delete all duplicate rows. In the table below, the value A is duplicated.
List<List<string>> ls = new List<List<string>>();
ls.Add(new List<string>() { "1", "A" });
ls.Add(new List<string>() { "2", "B" });
ls.Add(new List<string>() { "3", "C" });
ls.Add(new List<string>() { "4", "A" });
ls.Add(new List<string>() { "5", "A" });
ls.Add(new List<string>() { "6", "D" });
IEnumerable<IEnumerable<string>> tableData = ls;
var abc = tableData.SelectMany(p => p).Distinct(); // does not work
After the operation, abc should have exactly the same format as tableData:
ls.Add(new List<string>() { "1", "A" });
ls.Add(new List<string>() { "2", "B" });
ls.Add(new List<string>() { "3", "C" });
ls.Add(new List<string>() { "6", "D" });
tableData.GroupBy(q => q.Skip(1).First()).Select(q => q.First())
You can use the overload of Distinct passing in an IEqualityComparer assuming you actually have an IEnumerable<IEnumerable<PropertyData>>.
For example:
var items = tableData.SelectMany(x => x).Distinct(new TableComparer());
And the comparer:
public class TableComparer : IEqualityComparer<PropertyData>
{
public bool Equals(PropertyData x, PropertyData y)
{
return x.id == y.id;
}
public int GetHashCode(PropertyData pData)
{
return pData.id.GetHashCode();
}
}
If it's just an IEnumerable<IEnumerable<string>>, you can use Distinct() without the overload:
var items = tableData.SelectMany(x => x).Distinct();
Though your question lacks clarity..
var distinctValues = tableData.SelectMany(x => x).Distinct();
This will flatten your list of lists and select the distinct set of strings.
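Flattening with SelectMany produces a sequence of individual strings, so the row structure the question asks for is lost. To keep whole rows and drop those whose value in one column has already been seen, grouping on that column (as in the GroupBy one-liner above) is one option. The following is a sketch under that reading of the question; the class name, method name, and column parameter are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowDedup
{
    // Keeps the first row for each distinct value found in the given column.
    // Assumes every row has at least column + 1 entries.
    public static List<List<string>> DistinctByColumn(
        IEnumerable<IEnumerable<string>> rows, int column)
    {
        return rows
            .GroupBy(row => row.ElementAt(column))
            .Select(group => group.First().ToList())
            .ToList();
    }

    public static void Main()
    {
        var ls = new List<List<string>>
        {
            new List<string> { "1", "A" },
            new List<string> { "2", "B" },
            new List<string> { "3", "C" },
            new List<string> { "4", "A" },
            new List<string> { "5", "A" },
            new List<string> { "6", "D" }
        };

        // Prints the rows "1 A", "2 B", "3 C", "6 D", one per line.
        foreach (var row in DistinctByColumn(ls, 1))
            Console.WriteLine(string.Join(" ", row));
    }
}
```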
You can use the code below:
List<List<string>> ls=new List<List<string>>();
ls.Add(new List<string>(){"Hello"});
ls.Add(new List<string>(){"Hello"});
ls.Add(new List<string>() { "He" });
IEnumerable<IEnumerable<string>> tableData = ls;
var abc = tableData.SelectMany(p => p).Distinct();
Output:
Hello
He

How to convert string[] to int in C# [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I'm a beginner C# programmer and I have a problem in my code with one string.
I want to know how I can convert this string:
x = {"1","2","0",",","1","2","1",",","1","2","2"}
into something like this:
x = {"120","121","122"}
The variable x is declared as a string array and I want it as int. The purpose is to count how many numbers fall between, let's say, 120 and 130, which for my data would be 3.
Many Thanks.
string[] x = { "1", "2", "0", ",", "1", "2", "1", ",", "1", "2", "2" };
int[] y = string.Join(string.Empty, x)
.Split(',')
.Select(s => int.Parse(s))
.ToArray();
I think you can do this in three lines as follows:
var x = new []{"1", "2", "0", ",", "1", "2", "1", ",", "1", "2", "2"};
var fullString = String.Join("", x, 0, x.Length);
// get as a string array:
// x = fullString.Split(new[] {','});
// get as an integer array:
var intArray = (fullString.Split(new[] {','}))
.Select(_ => Int32.Parse(_)).ToArray();
In steps this is (1) create the array, (2) join it into one string, (3) split the string on the separator (which appears to be the comma in your case).
Best of luck!
Not sure why you want to do this, but nevertheless it can be achieved quite simply using basic String methods and LINQ to Objects.
var x = new string[] { "1", "2", "0", ",", "1", "2", "1", ",", "1", "2", "2" };
var y = String.Join(String.Empty, x).Split(',');
foreach (var s in y)
Console.WriteLine(s);
Update 23/05/2014
As per the comments, here is some code that will do what you want (i.e. count the numbers between the range 120-130 inclusive).
var min = 120;
var max = 130;
var count = y
.Select(o => Int32.Parse(o))
.Count(o => (min <= o) && (o <= max));
Console.WriteLine(count);
You should first join the strings together to get "120,121,122" as one string, then split it on the commas to get "120", "121", "122".
Then you could use a for loop or something equivalent to call int.Parse on each value.
EDIT:
Changing a list from string to int while keeping the same variable seems a bit unnecessary to me.
string[] x = { "1", "2", "0", ",", "1", "2", "1", ",", "1", "2", "2" };
StringBuilder sb = new StringBuilder();
foreach (var item in x)
{
sb.Append(item);
}
string str = sb.ToString();
string[] myArr = str.Split(',');
int[] numArr = new int[myArr.Length];
for (int i = 0; i < myArr.Length; i++)
{
numArr[i] = int.Parse(myArr[i]);
}
With a bit of effort you'll be able to find this (you are not supposed to just post a question; next time do some research, please), but while the question is here anyway:
string[] items = new string[5];
int[] intitems = new int[items.Length];
for (int i = 0; i < items.Length; i++)
{
    // TryParse writes 0 into intitems[i] when items[i] is null or not a valid integer
    int.TryParse(items[i], out intitems[i]);
}
An easy and understandable way of doing it. The loop pretty much explains itself.
Here is an approach using several string methods and LINQ:
string[] x = { "1", "2", "0", ",", "1", "2", "1", ",", "1", "2", "2" };
string all = string.Join("", x); // "120,121,122"
string[] parts = all.Split(','); // { "120", "121", "122" }
int i = int.MinValue; // variable which is used in below query
int[] y = parts // LINQ query
.Where(s => int.TryParse(s.Trim(), out i)) // filters out invalid strings
.Select(s => i) // selects the parsed ints
.ToArray(); // creates the final array
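The query above works because Where and Select are evaluated element by element, so the shared variable i still holds the value parsed for the current item when Select runs. A variant that does not depend on that ordering keeps the parse result inside the projection. This is only a sketch: the class and method names are hypothetical, and the range count mirrors the 120 to 130 example from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParseHelpers
{
    // Parses every valid integer and silently drops strings that do not parse.
    public static int[] ParseValid(IEnumerable<string> parts)
    {
        return parts
            .Select(s => int.TryParse(s.Trim(), out var n) ? (int?)n : null)
            .Where(n => n.HasValue)
            .Select(n => n.Value)
            .ToArray();
    }

    // Counts the values in the inclusive range [min, max].
    public static int CountInRange(IEnumerable<int> values, int min, int max) =>
        values.Count(v => min <= v && v <= max);

    public static void Main()
    {
        string[] x = { "1", "2", "0", ",", "1", "2", "1", ",", "1", "2", "2" };
        string[] parts = string.Join("", x).Split(',');   // { "120", "121", "122" }

        int[] y = ParseValid(parts);
        Console.WriteLine(string.Join(", ", y));          // prints "120, 121, 122"
        Console.WriteLine(CountInRange(y, 120, 130));     // prints "3"
    }
}
```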

Sort a list by alpha values first, numeric values next

class ListSort
{
static void Main(string[] args)
{
string[] partNumbers = new string[]
{
"India", "US","UK", "Australia","Germany", "1", "7", "9"
};
var result = partNumbers.OrderBy(x => x).ToList();
}
}
I tried the above code, and expected the following output:
Australia
Germany
India
UK
US
1
7
9
EDIT:
Numbers should be ordered numerically (1, 7, 9, 70, ...), while the non-numbers should always be ordered lexically, even if there's a number inside ("A3stralia", "Australia", "Germany").
Try this:
string[] partNumbers = new string[]
{
"India", "US","UK", "Australia","Germany", "1", "7", "9"
};
var result = partNumbers.OrderBy(x => char.IsNumber(x.FirstOrDefault())).ThenBy(x => x).ToList().Dump();
Note that this will only work if your data is either numeric or text, not if there's a value like "U2S". I can change this to work for those cases too, if you need that, though. Also, the numeric strings are still getting sorted as strings, so "10" comes before "2".
How would you want the result to be when you add A3stralia and 70 to the list?
EDIT: Changed for the new constraints:
string[] partNumbers = new string[]
{
"India", "US","UK", "Australia","Germany", "1", "7", "9", "70", "A3stralia"
};
var result =
    partNumbers
        .Select(x =>
        {
            int p;
            bool isNumber = int.TryParse(x, out p);
            return new { IsNumber = isNumber, NumericValue = isNumber ? p : int.MinValue, StringValue = x };
        })
        .OrderBy(x => x.IsNumber)
        .ThenBy(x => x.NumericValue)
        .ThenBy(x => x.StringValue)
        .Select(x => x.StringValue)
        .ToList();
If it doesn't matter that the digit part is ordered lexicographically:
var result = partNumbers.OrderBy(s => s.All(Char.IsDigit)).ThenBy(s => s).ToList();
This just simply checks whether all characters are digits or not. If you want the digits first use Enumerable.OrderByDescending instead.
As you commented on one of the answers, you want to order the digits numerically; in that case you need to parse them first:
result = partNumbers
.Select(s => new { s, num = s.TryGetInt() } )
.GroupBy(x => x.num.HasValue) // two groups: one can be parsed to int the other not
.OrderBy(xg => xg.Key) // non-numeric group (false) first, whatever the input order
.SelectMany (xg =>
{
if (xg.Key) // can be parsed to int, then order by int-value
return xg.OrderBy(x => x.num.Value).Select(x => x.s);
else // can not be parsed to int, order by the string
return xg.OrderBy(x => x.s).Select(x => x.s);
})
.ToList();
I'm using this extension to parse strings to Nullable<int> in LINQ queries:
public static class NumericExtensions
{
public static int? TryGetInt(this string item)
{
int i;
bool success = int.TryParse(item, out i);
return success ? (int?)i : (int?)null;
}
}
So far it seems all the responses are roughly the same, but all too complicated. My attempt:
string[] partNumbers = { "US", "1", "UK", "Australia", "Germany", "70", "9" };
partNumbers.OrderBy(x =>
{
int parseResult;
return int.TryParse(x, out parseResult)
? parseResult
: null as int?;
})
.ThenBy(x => x);
Or, with extracted helper method:
partNumbers.OrderBy(TryParseNullableInt).ThenBy(x => x);
private static int? TryParseNullableInt(string source)
{
int parseResult;
return int.TryParse(source, out parseResult)
? parseResult
: null as int?;
}
string[] partNumbers = new string[]
{
"India", "US","UK", "Australia","Germany", "1", "7", "9"
};
var result = partNumbers.OrderBy(x =>
{
int i;
return int.TryParse(x, out i);
}).ThenBy(x=>x);
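The answers above all sort by a derived key. A different technique, not taken from any of them, is to put the rule "text first in lexical order, then numbers in numeric order" into a custom IComparer<string>. The comparer below is a sketch: the class name is hypothetical, "number" is assumed to mean anything int.TryParse accepts, and ordinal comparison is assumed for the text part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: non-numeric strings first (ordinal order),
// then numeric strings ordered by their integer value.
public sealed class TextThenNumberComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        bool xIsNum = int.TryParse(x, out int xn);
        bool yIsNum = int.TryParse(y, out int yn);

        if (xIsNum && yIsNum) return xn.CompareTo(yn);  // both numbers: numeric order
        if (xIsNum != yIsNum) return xIsNum ? 1 : -1;   // numbers sort after text
        return string.CompareOrdinal(x, y);             // both text: ordinal order
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        string[] partNumbers =
        {
            "India", "US", "UK", "Australia", "Germany", "1", "7", "9", "70", "A3stralia"
        };

        var result = partNumbers.OrderBy(s => s, new TextThenNumberComparer()).ToList();

        // prints "A3stralia, Australia, Germany, India, UK, US, 1, 7, 9, 70"
        Console.WriteLine(string.Join(", ", result));
    }
}
```

Because the rule lives in one class, the same comparer can also be passed to Array.Sort or List<T>.Sort.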
The following solution projects the string sequence into an anonymous type { isNumber, s, value }, which contains the current item, a flag for whether the item is an integer, and the parse result if any. The new sequence is then grouped into two groups, one for numbers and one for other strings. Each group is sorted: numbers by numeric comparison, strings alphabetically. Finally the original items are selected from the flattened groups:
string[] partNumbers = { "US", "1", "UK", "Australia", "Germany", "70", "9" };
int value;
var result = partNumbers
.Select(s => new { isNumber = Int32.TryParse(s, out value), s, value })
.GroupBy(x => x.isNumber)
.OrderBy(g => g.Key)
.SelectMany(g => g.Key ? g.OrderBy(x => x.value) : g.OrderBy(x => x.s))
.Select(x => x.s)
.ToList();
Returns:
"Australia",
"Germany",
"UK",
"US",
"1",
"9",
"70"

Zip N IEnumerable<T>s together? Iterate over them simultaneously?

I have:-
IEnumerable<IEnumerable<T>> items;
and I'd like to create:-
IEnumerable<IEnumerable<T>> results;
where the first item in "results" is an IEnumerable of the first item of each of the IEnumerables of "items", the second item in "results" is an IEnumerable of the second item of each of "items", etc.
The IEnumerables aren't necessarily the same lengths. If some of the IEnumerables in items don't have an element at a particular index, then I'd expect the matching IEnumerable in results to have fewer items in it.
For example:-
items = { "1", "2", "3", "4" } , { "a", "b", "c" };
results = { "1", "a" } , { "2", "b" }, { "3", "c" }, { "4" };
Edit: Another example (requested in comments):-
items = { "1", "2", "3", "4" } , { "a", "b", "c" }, { "p", "q", "r", "s", "t" };
results = { "1", "a", "p" } , { "2", "b", "q" }, { "3", "c", "r" }, { "4", "s" }, { "t" };
I don't know in advance how many sequences there are, nor how many elements are in each sequence. I might have 1,000 sequences with 1,000,000 elements in each, and I might only need the first ~10, so I'd like to use the (lazy) enumeration of the source sequences if I can. In particular I don't want to create a new data structure if I can help it.
Is there a built-in method (similar to Enumerable.Zip) that can do this?
Is there another way?
Now lightly tested and with working disposal.
public static class Extensions
{
public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
this IEnumerable<IEnumerable<T>> source)
{
List<IEnumerator<T>> originalEnumerators = source
.Select(x => x.GetEnumerator())
.ToList();
try
{
List<IEnumerator<T>> enumerators = originalEnumerators
.Where(x => x.MoveNext()).ToList();
while (enumerators.Any())
{
List<T> result = enumerators.Select(x => x.Current).ToList();
yield return result;
enumerators = enumerators.Where(x => x.MoveNext()).ToList();
}
}
finally
{
originalEnumerators.ForEach(x => x.Dispose());
}
}
}
public class TestExtensions
{
public void Test1()
{
IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
{
Enumerable.Range(1, 20).ToList(),
Enumerable.Range(21, 5).ToList(),
Enumerable.Range(26, 15).ToList()
};
foreach(IEnumerable<int> x in myInts.JaggedPivot().Take(10))
{
foreach(int i in x)
{
Console.Write("{0} ", i);
}
Console.WriteLine();
}
}
}
It's reasonably straightforward to do if you can guarantee how the results are going to be used. However, if the results might be used in an arbitrary order, you may need to buffer everything. Consider this:
var results = MethodToBeImplemented(sequences);
var iterator = results.GetEnumerator();
iterator.MoveNext();
var first = iterator.Current;
iterator.MoveNext();
var second = iterator.Current;
foreach (var x in second)
{
// Do something
}
foreach (var x in first)
{
// Do something
}
In order to get at the items in "second" you'll have to iterate over all of the subsequences, past the first items. If you then want it to be valid to iterate over the items in first you either need to remember the items or be prepared to re-evaluate the subsequences.
Likewise you'll either need to buffer the subsequences as IEnumerable<T> values or reread the whole lot each time.
Basically it's a whole can of worms which is difficult to do elegantly in a way which will work pleasantly for all situations :( If you have a specific situation in mind with appropriate constraints, we may be able to help more.
Based on David B's answer, this code should perform better:
public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
this IEnumerable<IEnumerable<T>> source)
{
var originalEnumerators = source.Select(x => x.GetEnumerator()).ToList();
try
{
var enumerators =
new List<IEnumerator<T>>(originalEnumerators.Where(x => x.MoveNext()));
while (enumerators.Any())
{
yield return enumerators.Select(x => x.Current).ToList();
enumerators.RemoveAll(x => !x.MoveNext());
}
}
finally
{
originalEnumerators.ForEach(x => x.Dispose());
}
}
The difference is that the enumerators variable isn't re-created all the time.
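To see how this behaves on the data from the question, the method can be dropped into a small program. The sketch below repeats the JaggedPivot implementation from this answer so that it compiles on its own; the demo class and the Endless helper are hypothetical and only show that rows are produced on demand, so taking a few rows from very long inputs does not read the rest.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PivotExtensions
{
    public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
        this IEnumerable<IEnumerable<T>> source)
    {
        var originalEnumerators = source.Select(x => x.GetEnumerator()).ToList();
        try
        {
            // Keep only the enumerators that still have an element.
            var enumerators =
                new List<IEnumerator<T>>(originalEnumerators.Where(x => x.MoveNext()));
            while (enumerators.Any())
            {
                // Each row is materialised, so it stays valid after moving on.
                yield return enumerators.Select(x => x.Current).ToList();
                enumerators.RemoveAll(x => !x.MoveNext());
            }
        }
        finally
        {
            originalEnumerators.ForEach(x => x.Dispose());
        }
    }
}

public static class PivotDemo
{
    // Hypothetical helper: a sequence that never ends.
    public static IEnumerable<string> Endless(string prefix)
    {
        for (int i = 0; ; i++) yield return prefix + i;
    }

    public static void Main()
    {
        var items = new List<IEnumerable<string>>
        {
            new[] { "1", "2", "3", "4" },
            new[] { "a", "b", "c" },
            new[] { "p", "q", "r", "s", "t" }
        };

        // Prints "1 a p", "2 b q", "3 c r", "4 s", "t", one row per line.
        foreach (var row in items.JaggedPivot())
            Console.WriteLine(string.Join(" ", row));

        // Only two elements are pulled from each endless sequence.
        var endless = new List<IEnumerable<string>> { Endless("x"), Endless("y") };
        foreach (var row in endless.JaggedPivot().Take(2))
            Console.WriteLine(string.Join(" ", row)); // prints "x0 y0", then "x1 y1"
    }
}
```

The outer sequence is still enumerated completely up front (the ToList on the enumerators), so the number of sequences must be finite even though their lengths need not be.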
Here's one that is a bit shorter, but no doubt less efficient:
Enumerable.Range(0,items.Select(x => x.Count()).Max())
.Select(x => items.SelectMany(y => y.Skip(x).Take(1)));
What about this?
List<string[]> items = new List<string[]>()
{
new string[] { "a", "b", "c" },
new string[] { "1", "2", "3" },
new string[] { "x", "y" },
new string[] { "y", "z", "w" }
};
var x = from i in Enumerable.Range(0, items.Max(a => a.Length))
select from z in items
where z.Length > i
select z[i];
You could compose existing operators like this,
IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
{
Enumerable.Range(1, 20).ToList(),
Enumerable.Range(21, 5).ToList(),
Enumerable.Range(26, 15).ToList()
};
myInts.SelectMany(item => item.Select((number, index) => Tuple.Create(index, number)))
.GroupBy(item => item.Item1)
.Select(group => group.Select(tuple => tuple.Item2));
