Getting the number of matching elements in two arrays - C#

I have 3 arrays.
String[] arrFirst={"a","b","c","d","e"};
String[] arrSecond={"a","b","f","d","g"};
String[] arrThird={"a","f","g","h","e"};
For arrFirst and arrSecond, I want the result to be 3.
For arrFirst and arrThird, the result should be 2.
All the code I have found compares two arrays and returns whether they are exactly the same or not.
But what I want is the number of elements that match.
I could do it with a loop.
But I think that will take too much time, and I am wondering whether there is a faster way.
Thanks.

You can use the LINQ Intersect method.
String[] arrFirst={"a","b","c","d","e"};
String[] arrSecond={"a","b","f","d","g"};
String[] arrThird={"a","f","g","h","e"};
arrFirst.Intersect(arrSecond).Count(); // 3
arrFirst.Intersect(arrThird).Count(); //2

arrFirst.Join(arrSecond,f=>f,s=>s,(f,s)=>f).Count();

arrFirst.Zip(arrSecond, (a, b) => a.Equals(b)).Count(a => a);
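The three one-liners above agree on the arrays in the question, but they do not count the same thing once an array contains duplicates or the same values in a different order. The following is only a sketch to make that difference visible; the class and method names (MatchCounts, Common, Pairs, SamePosition) are made up for illustration and do not come from any of the answers.

```csharp
using System;
using System.Linq;

public static class MatchCounts
{
    // Number of distinct values that occur in both arrays (set semantics).
    public static int Common(string[] a, string[] b) =>
        a.Intersect(b).Count();

    // Number of matching (a[i], b[j]) pairs; duplicates multiply the count.
    public static int Pairs(string[] a, string[] b) =>
        a.Join(b, x => x, y => y, (x, y) => x).Count();

    // Number of indexes i at which a[i] equals b[i]; stops at the shorter array.
    public static int SamePosition(string[] a, string[] b) =>
        a.Zip(b, (x, y) => x == y).Count(equal => equal);
}
```

For arrFirst and arrSecond all three return 3, and for arrFirst and arrThird all three return 2. For { "a", "a", "b" } and { "a", "b", "a" }, however, Common returns 2, Pairs returns 5 and SamePosition returns 1, so which one is right depends on what "matched" is supposed to mean.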

Related

Array.Exists() only recognizing last element of an array

I apologize if this has an obvious answer, but I couldn't find anything related to this behavior on StackOverflow or on Google. I'm teaching myself C# and am writing a program that involves username/password pairs. In this case, I am using Array.Exists() to determine whether a username is already in my array of usernames.
static bool LookUpS(string targ, string[] arr)
{
if (Array.Exists(arr, e => e == targ))
{
return true;
}
else
{
return false;
}
}
The argument targ comes from the user, and the argument arr is written into the program to specify which array should be searched (this is so that I can use LookUpS on different arrays). The array that I'm testing right now is derived from a text document. I've tested to confirm that the transition from .txt to array is working properly.
Now to the actual issue. I've filled the text document in question with a few values: "Jupiter", "Neptune", "Saturn", "Mars". If I pass any of the first three into LookUpS, it returns false despite the fact that they do exist. The part I don't understand, however, is that if I pass "Mars" (or whatever the last value happens to be) into the function, it returns true like it should. If I were to remove the "Mars" value, LookUpS would say that "Saturn" is true but not the others. Can anyone offer some explanation for this? I can post a more complete picture of the program if that would help in identifying the problem.
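In the comments you mention that you split the lines with \n as the separator. But on Windows, lines end with \r\n. Hence your split string array will contain
xxxx\r
yyyy\r
lastUser
The last line has no trailing newline, so it is the only entry without a stray \r and the only one that matches.
That explains why your search finds only the last user in your array. Instead of \n, pass Environment.NewLine.ToCharArray() to the split operation so that both newline characters are removed from the entries. Because \r and \n then act as two separate delimiters, also pass StringSplitOptions.RemoveEmptyEntries to drop the empty strings that would otherwise appear between them.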
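To see the effect in isolation, here is a small sketch that assumes the file content arrives as a single string with Windows line endings. LookUpS mirrors the helper from the question; SplitNaive, SplitClean and the LineSplitting class are hypothetical names used only for this illustration.

```csharp
using System;

public static class LineSplitting
{
    // Stand-in for the question's lookup helper.
    public static bool LookUpS(string targ, string[] arr) =>
        Array.Exists(arr, e => e == targ);

    // Splitting on '\n' alone leaves a trailing '\r' on every line but the last.
    public static string[] SplitNaive(string text) =>
        text.Split('\n');

    // Splitting on both characters and dropping empty entries yields clean names.
    public static string[] SplitClean(string text) =>
        text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
}
```

With the text "Jupiter\r\nNeptune\r\nSaturn\r\nMars", LookUpS("Jupiter", SplitNaive(text)) is false because the stored entry is "Jupiter\r", while LookUpS("Mars", SplitNaive(text)) is true. With SplitClean(text) all four names are found.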
Arrays can use the LINQ extension method Contains() (add using System.Linq;), which should do exactly what you want...
string[] array = { "Jupiter", "Neptune", "Saturn", "Mars" };
if (array.Contains("Neptune")) {
return true;
}
You could simplify it to something like this.
You can use Exists() instead of Any() if you like.
static bool LookUpS(string targ, string[] arr)
{
return arr.Any(s => s == targ);
}
Or if you want it to be case-insensitive:
static bool LookUpS(string targ, string[] arr)
{
return arr.Any(e => String.Equals(e, targ, StringComparison.CurrentCultureIgnoreCase));
}
Or, as Karl suggests, you can use Contains()
static bool LookUpS(string targ, string[] arr)
{
return arr.Contains(targ);
}
Sometimes simplifying your code can solve some problems :)
The reason you don't need an if .. else to return true or false is that Any(), Exists() and Contains() all return a bool, so you can just return the result of the method call, as shown in the examples.

Performing function on each array element, returning results to new array

I'm a complete Linq newbie here, so forgive me for a probably quite simple question.
I want to perform an operation on every element in an array, and return the result of each of these operations to a new array.
For example, say I have an array of numbers and a function ToWords() that converts a number to its word equivalent. I want to be able to pass in the numbers array, perform the ToWords() operation on each element, and get back a string[].
I know it's entirely possible in a slightly more verbose way, but in my Linq adventures I'm wondering if it's doable in a nice one-liner.
You can use Select() to transform one sequence into another one, and ToArray() to create an array from the result:
int[] numbers = { 1, 2, 3 };
string[] strings = numbers.Select(x => ToWords(x)).ToArray();
It's pretty straightforward. Just use the Select method:
var results = array.Select(ToWords).ToArray();
Note that unless you need an array you don't have to call ToArray. Most of the time you can use lazy evaluation on an IEnumerable<string> just as easily.
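A rough sketch of what lazy evaluation means here: Select only describes the conversion, and the converter runs when the result is enumerated. The ToWords below is a hypothetical stand-in for the question's function (it only knows 1 to 3), and the Calls counter and LazySelect class exist purely to make the timing observable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySelect
{
    // Counts how often the converter runs, to make deferred execution visible.
    public static int Calls;

    // Hypothetical stand-in for the question's ToWords(); knows 1 to 3 only.
    public static string ToWords(int n)
    {
        Calls++;
        switch (n)
        {
            case 1: return "one";
            case 2: return "two";
            case 3: return "three";
            default: return n.ToString();
        }
    }

    // Builds the query; ToWords is not called until the result is enumerated.
    public static IEnumerable<string> Convert(int[] numbers) =>
        numbers.Select(ToWords);
}
```

After var query = LazySelect.Convert(new[] { 1, 2, 3 }); the counter is still 0. Calling query.ToArray() runs ToWords three times and produces { "one", "two", "three" }. Enumerating the same query again runs the conversion again, which is one reason to call ToArray() or ToList() when the results are needed more than once.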
There are two different approaches: you can use the Select extension method or the select clause of a query expression:
var numbers = new List<int>();
var strings1 = from num in numbers select ToWords(num);
var strings2 = numbers.Select(ToWords);
Both of them return an IEnumerable<string>, which you can convert as needed (for example, with .ToArray() or .ToList()).
You could do something like this :
public static string[] ProcessStrings(int[] intList)
{
return Array.ConvertAll<int, string>(intList, new Converter<int, string>(
delegate(int number)
{
return ToWords(number);
}));
}
If it is a list then :
public static List<string> ProcessStrings(List<int> intList)
{
return intList.ConvertAll<string>(new Converter<int, string>(
delegate(int number)
{
return ToWords(number);
}));
}
Straight simple:
string[] wordsArray = array.ToList().ConvertAll(n => ToWords(n)).ToArray();
If you are OK with Lists, rather than arrays, you can skip ToList() and ToArray().
Lists are much more flexible than arrays, I see no reason on earth not to use them, except for specific cases.

Parallel For Loop

I am trying to use the parallel for loop in .NET Framework 4.0. However, I noticed that some elements are missing from the result set.
I have the snippet of code below. lhs.ListData and rhs.ListData are both lists of nullable double.
int recordCount = lhs.ListData.Count > rhs.ListData.Count ? rhs.ListData.Count : lhs.ListData.Count;
List<double?> listResult = new List<double?>(recordCount);
var rangePartitioner = Partitioner.Create(0, recordCount);
Parallel.ForEach(rangePartitioner, range =>
{
for (int index = range.Item1; index < range.Item2; index++)
{
double? result = lhs.ListData[index] * rhs.ListData[index];
listResult.Add(result);
}
});
lhs.ListData has a length of 7964 and rhs.ListData has a length of 7962. When I perform the "*" operation, listResult contains only 7867 elements. There are null elements in both input lists.
I am not sure what is happening during the execution. Is there any reason why I am seeing fewer elements in the result set? Please advise.
The correct way to do this is to use the AsParallel() extension method from PLINQ. It does all of the partitioning for you and merges the results itself, so you never add to a shared list from several threads. There is another LINQ extension called Zip that combines two collections into one, based on a function that you give it. Note that Zip stops at the end of the shorter of the two lists, which matches the recordCount computed in your code. If you ever need the length of the longer list instead, first pad the shorter one with null at the end.
IEnumerable<double?> lhs, rhs; // Assume these are filled with your numbers.
double?[] result = lhs.AsParallel().AsOrdered().Zip(rhs.AsParallel().AsOrdered(), (a, b) => a * b).ToArray();
Here's the MSDN page on Zip:
http://msdn.microsoft.com/en-us/library/dd267698%28VS.100%29.aspx
That's probably because the operations on a List<T> (e.g. Add) are not thread safe - your results may vary. As a workaround you could use a lock, but that would very much reduce performance.
It looks like you just want each item in the result list to be the product of the items at the corresponding index in the two input lists, how about this instead using PLINQ:
// AsOrdered() asks PLINQ to preserve the source order in the results.
var listResult = lhs.AsParallel().AsOrdered()
.Zip(rhs.AsParallel().AsOrdered(), (a,b) => a*b)
.ToList();
Not sure why you chose parallelism here, I would benchmark if this is even necessary - is this truly the bottleneck in your application?
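You are using a List<double?> to store the results, but its Add method is not thread-safe.
You can store each result at an explicit index instead of calling Add. For this to work the list must already contain recordCount elements (setting the capacity in the constructor is not enough), or you can use a double?[] array:
listResult[index] = result;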
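A minimal sketch of the index-based variant, assuming the two inputs are available as IList<double?> (the question's lhs.ListData and rhs.ListData would fit). The ParallelProduct class and Multiply method are names invented for this example. It writes into a pre-sized array, because indexing into an empty List<double?> throws even when its capacity has been set.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelProduct
{
    // Multiplies the lists element by element, up to the shorter length.
    public static double?[] Multiply(IList<double?> lhs, IList<double?> rhs)
    {
        int recordCount = Math.Min(lhs.Count, rhs.Count);

        // The array already has recordCount slots, so each index can be assigned.
        var result = new double?[recordCount];
        if (recordCount == 0) return result; // Partitioner.Create rejects an empty range.

        var rangePartitioner = Partitioner.Create(0, recordCount);
        Parallel.ForEach(rangePartitioner, range =>
        {
            // Ranges do not overlap, so no two threads write the same slot.
            for (int index = range.Item1; index < range.Item2; index++)
            {
                result[index] = lhs[index] * rhs[index];
            }
        });
        return result;
    }
}
```

The result always has exactly recordCount entries, and a null in either input produces a null at that position. Whether this is actually faster than a plain loop for a few thousand multiplications is worth measuring before keeping it.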

Difference of two lists C#

I have two lists of strings, both of which are ~300,000 lines. List 1 has a few lines more than List 2. What I'm trying to do is find the strings that are in List 1 but not in List 2.
Considering how many strings I have to compare, is Except() good enough or is there something better (faster)?
Internally, the Enumerable.Except extension method uses a hash-based Set<T> to perform the computation. It's going to be at least as fast as any other method.
Go with list1.Except(list2).
It'll give you the best performance and the simplest code.
My suggestion:
HashSet<String> hash1 = new HashSet<String>(new string[] { "a", "b", "c", "d" });
HashSet<String> hash2 = new HashSet<String>(new string[] { "a", "b" });
List<String> result = hash1.Except(hash2).ToList();
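One detail worth knowing before relying on Except: it computes a set difference, so a string that occurs several times in List 1 is reported only once. If every occurrence should be kept, a HashSet built from List 2 can serve as a filter instead. This is only a sketch; ListDifference, DistinctOnly and KeepDuplicates are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListDifference
{
    // Set difference: each string in first but not in second, reported once.
    public static List<string> DistinctOnly(IEnumerable<string> first, IEnumerable<string> second) =>
        first.Except(second).ToList();

    // Keeps every occurrence from first whose value is absent from second.
    public static List<string> KeepDuplicates(IEnumerable<string> first, IEnumerable<string> second)
    {
        var exclude = new HashSet<string>(second);
        return first.Where(s => !exclude.Contains(s)).ToList();
    }
}
```

For first = { "a", "a", "b", "c" } and second = { "b" }, DistinctOnly returns { "a", "c" } and KeepDuplicates returns { "a", "a", "c" }. Both build one hash set from the second list and make a single pass over the first.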

IEnumerable<IEnumerable<int>> - no duplicate IEnumerable<int>s

I'm trying to find a solution to this problem:
Given an IEnumerable<IEnumerable<int>>, I need a method/algorithm that returns the input, but where several IEnumerable<int> contain the same elements, only one per group is returned.
Example:
IEnumerable<IEnumerable<int>> seqs = new[]
{
new[]{2,3,4}, // #0
new[]{1,2,4}, // #1 - equals #3
new[]{3,1,4}, // #2
new[]{4,1,2} // #3 - equals #1
};
"foreach seq in seqs" .. yields {#0,#1,#2} or {#0,#2,#3}
Should I go with ..
.. some clever IEqualityComparer
.. some clever LINQ combination I haven't figured out - GroupBy, SequenceEqual ..?
.. some seq->HashSet stuff
.. what not. Anything will help
I'll be able to solve it with good old-fashioned programming, but inspiration is always appreciated.
Here's a slightly simpler version of digEmAll's answer:
var result = seqs.Select(x => new HashSet<int>(x))
.Distinct(HashSet<int>.CreateSetComparer());
Given that you want to treat the elements as sets, you should have them that way to start with, IMO.
Of course this won't help if you want to maintain order within the sequences that are returned and just don't mind which of the equal sets is returned. The above code will return an IEnumerable<HashSet<int>>, which will no longer have any ordering within each sequence. (The order in which the sets are returned isn't guaranteed either, although it would be odd for them not to be returned on a first-seen-first-returned basis.)
It feels unlikely that this wouldn't be enough, but if you could give more details of what you really need to achieve, that would make it easier to help.
As noted in comments, this will also assume that there are no duplicates within each original source array... or at least, that they're irrelevant, so you're happy to treat { 1 } and { 1, 1, 1, 1 } as equal.
Use the correct collection type for the job. What you really want is ISet<IEnumerable<int>> with an equality comparer that will ignore the ordering of the IEnumerables.
EDITED:
You can get what you want by building your own IEqualityComparer<IEnumerable<int>> e.g.:
public class MyEqualityComparer : IEqualityComparer<IEnumerable<int>>
{
public bool Equals(IEnumerable<int> x, IEnumerable<int> y)
{
return x.OrderBy(el1 => el1).SequenceEqual(y.OrderBy(el2 => el2));
}
public int GetHashCode(IEnumerable<int> elements)
{
int hash = 0;
foreach (var el in elements)
{
hash = hash ^ el.GetHashCode();
}
return hash;
}
}
Usage:
var values = seqs.Distinct(new MyEqualityComparer()).ToList();
N.B.
This solution is slightly different from the one given by Jon Skeet.
His answer treats the sublists as sets, so two lists like [1,2] and [1,1,1,2,2] are equal.
This solution doesn't, i.e.:
[1,2,1,1] is equal to [2,1,1,1] but not to [2,2,1,1]; the two lists have to contain the same elements with the same number of occurrences.
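The difference between the two answers can be checked side by side. The sketch below restates both approaches in compact form; MultisetComparer follows the idea of MyEqualityComparer above, and the DistinctSequences helper with its two counting methods is a hypothetical wrapper added only for the comparison.

```csharp
using System.Collections.Generic;
using System.Linq;

// Treats two sequences as equal when they hold the same elements with the
// same number of occurrences, regardless of order.
public class MultisetComparer : IEqualityComparer<IEnumerable<int>>
{
    public bool Equals(IEnumerable<int> x, IEnumerable<int> y) =>
        x.OrderBy(e => e).SequenceEqual(y.OrderBy(e => e));

    // XOR is order-independent, so equal multisets always hash alike.
    public int GetHashCode(IEnumerable<int> elements) =>
        elements.Aggregate(0, (hash, e) => hash ^ e.GetHashCode());
}

public static class DistinctSequences
{
    // Set semantics: duplicates inside a sequence are ignored.
    public static int CountAsSets(IEnumerable<IEnumerable<int>> seqs) =>
        seqs.Select(s => new HashSet<int>(s))
            .Distinct(HashSet<int>.CreateSetComparer())
            .Count();

    // Multiset semantics: occurrence counts must match as well.
    public static int CountAsMultisets(IEnumerable<IEnumerable<int>> seqs) =>
        seqs.Distinct(new MultisetComparer()).Count();
}
```

For the four sequences in the question, both methods return 3. For the input { 1, 2 } and { 1, 1, 2 }, CountAsSets returns 1 while CountAsMultisets returns 2, which is exactly the distinction described in the note above.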
