How to find the duplicates in a given string in C#

I want to find the duplicates in a given string. I tried it for a collection of words and it works fine, but I don't know how to do it for the characters of a string.
Here is the code I tried for collections:
string name = "this is a a program program";
string[] arr = name.Split(' ');
var myList = new List<string>();
var duplicates = new List<string>();
foreach(string res in arr)
{
if (!myList.Contains(res))
{
myList.Add(res);
}
else
{
duplicates.Add(res);
}
}
foreach(string result in duplicates)
{
Console.WriteLine(result);
}
Console.ReadLine();
But I want to find the duplicates in the string below and store them in an array. How can I do that?
e.g. string aa = "elements";
In the above string I want to find the duplicate characters and store them in an array.
Can anyone help me?

Linq solution:
string name = "this is a a program program";
String[] result = name.Split(' ')
.GroupBy(word => word)
.Where(chunk => chunk.Count() > 1)
.Select(chunk => chunk.Key)
.ToArray();
Console.Write(String.Join(Environment.NewLine, result));
The same principle works for duplicate characters within a string:
String source = "elements";
Char[] result = source
.GroupBy(c => c)
.Where(chunk => chunk.Count() > 1)
.Select(chunk => chunk.Key)
.ToArray();
// result = ['e']
Console.Write(String.Join(Environment.NewLine, result));

string name = "elements";
var myList = new List<char>();
var duplicates = new List<char>();
foreach (char res in name)
{
if (!myList.Contains(res))
{
myList.Add(res);
}
else if (!duplicates.Contains(res))
{
duplicates.Add(res);
}
}
foreach (char result in duplicates)
{
Console.WriteLine(result);
}
Console.ReadLine();

A string is a sequence of chars, so you can use your collection approach.
However, I would recommend a typed HashSet. Load it with the string and you get the set of distinct chars (note that HashSet<T> does not guarantee enumeration order, so don't rely on it).
Take a look:
string s = "aaabbcdaaee";
HashSet<char> hash = new HashSet<char>(s);
HashSet<char> hashDup = new HashSet<char>();
foreach (var c in s)
if (hash.Contains(c))
hash.Remove(c);
else
hashDup.Add(c);
foreach (var x in hashDup)
Console.WriteLine(x);
Console.ReadKey();

Instead of a List<> I'd use a HashSet<> because it doesn't allow duplicates: Add returns false when the item is already present, and lookups are more efficient than List<>.Contains. I'd also use a Dictionary<TKey, TValue> instead of the list to track the count of each char:
string text = "elements";
var duplicates = new HashSet<char>();
var duplicateCounts = new Dictionary<char, int>();
foreach (char c in text)
{
int charCount = 0;
bool isDuplicate = duplicateCounts.TryGetValue(c, out charCount);
duplicateCounts[c] = ++charCount;
if (isDuplicate)
duplicates.Add(c);
}
Now you have all unique duplicate chars in the HashSet and the count of each unique char in the dictionary. In this example the set only contains e because it appears three times in the string.
You could output it like this:
foreach(char dup in duplicates)
Console.WriteLine("Duplicate char {0} appears {1} times in the text."
, dup
, duplicateCounts[dup]);
For what it's worth, here's a LINQ one-liner which also creates a Dictionary that only contains the duplicate chars and their count:
Dictionary<char, int> duplicateCounts = text
.GroupBy(c => c)
.Where(g => g.Count() > 1)
.ToDictionary(g => g.Key, g => g.Count());
I've shown it as second approach because you should first understand the standard way.
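The point about HashSet<T>.Add returning false can also be used directly to detect a repeat, without a separate count dictionary. The following is only a minimal sketch; the class name DuplicateChars and the method name Find are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateChars
{
    // Hypothetical helper: returns each character that occurs more than once,
    // in the order in which its first repeat is encountered.
    public static char[] Find(string text)
    {
        var seen = new HashSet<char>();
        var duplicates = new List<char>();
        foreach (char c in text)
        {
            // Add returns false when c is already in the set, i.e. c is a repeat.
            if (!seen.Add(c) && !duplicates.Contains(c))
            {
                duplicates.Add(c);
            }
        }
        return duplicates.ToArray();
    }
}
```

Calling DuplicateChars.Find("elements") yields an array containing only 'e'. A List<char> is used for the result so that the order of discovery is kept, which a HashSet<char> would not guarantee.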

string name = "this is a a program program";
var arr = name.Split(' ').ToArray();
var dup = arr.Where(p => arr.Count(q => q == p) > 1).Select(p => p);
HashSet<string> hash = new HashSet<string>(dup);
string duplicate = string.Join(" ", hash);

You can do this with LINQ:
string name = "this is a a program program";
var d = name.Split(' ').GroupBy(x => x).Select(y => new { word = y.Key, Wordcount = y.Count() }).Where(z => z.Wordcount > 1).ToList();

Use LINQ to group values:
public static IEnumerable<T> GetDuplicates<T>(this IEnumerable<T> list)
{
return list.GroupBy(item => item).SelectMany(group => group.Skip(1));
}
public static bool HasDuplicates<T>(this IEnumerable<T> list)
{
// Any() is the standard LINQ check for a non-empty sequence
return list.GetDuplicates().Any();
}
Then you use these extensions like this:
var list = new List<string> { "a", "b", "b", "c" };
var duplicatedValues = list.GetDuplicates();
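Two details are worth spelling out here. Extension methods have to be declared in a non-generic static class, and GetDuplicates as written yields every extra occurrence, so a value that appears three times is returned twice. The sketch below shows a variant that returns each duplicated value once; the names DuplicateExtensions and GetDistinctDuplicates are hypothetical and chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Extension methods must live in a non-generic static class.
public static class DuplicateExtensions
{
    // Hypothetical variant: yields each value that occurs more than once, exactly once.
    public static IEnumerable<T> GetDistinctDuplicates<T>(this IEnumerable<T> list)
    {
        return list
            .GroupBy(item => item)
            .Where(group => group.Skip(1).Any()) // at least two occurrences
            .Select(group => group.Key);
    }
}
```

With new List<string> { "a", "b", "b", "b", "c" }, GetDistinctDuplicates() yields just "b", whereas the Skip(1) version above yields "b" twice.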

Related

Removing strings with duplicate letters from string array

I have array of strings like
string[] A = { "abc", "cccc", "fgaeg", "def" };
I would like to obtain a list or array of the strings in which every letter appears only once. This means that "cccc" and "fgaeg" will be removed from the input array.
I managed to do this, but I feel that my way is messy, unnecessarily complicated, and inefficient.
Do you have any ideas to improve this algorithm (possibly replacing it with a single LINQ query)?
My code:
var goodStrings = new List<string>();
int i = 0;
foreach (var str in A)
{
var tempArr = str.GroupBy(x => x)
.Select(x => new
{
Cnt = x.Count(),
Str = x.Key
}).ToArray();
var resultArr = tempArr.Where(g => g.Cnt > 1).Select(f => f.Str).ToArray();
if(resultArr.Length==0) goodStrings.Add(A[i]);
i++;
}
You can call Distinct on each string and keep the strings whose count of distinct characters equals the original string length:
string[] A = { "abc", "cccc", "fgaeg", "def" };
var result = A.Where(a => a.Distinct().Count() == a.Length).ToList();
You'll get a list with the values abc and def, as expected.

Splitting text and putting it into dictionary

I have a text of 600 words and I'm supposed to remove all quotation marks, numbers (years, dates, ...), digits, and so on. Only words should remain, and I have to put them into a dictionary.
So I tried to go through the text with a foreach loop, get the first letter of each word, and save it in a list. Then I split every row into words.
e.g.:
You are pretty.
You
are
pretty
The problem is that some words in a row are still treated as the same word although they shouldn't be. I've tried to fix it but couldn't find a solution.
public Dictionary<string, int> words = new Dictionary<string, int>();
public Dictionary<char, List<string>> firstletter = new Dictionary<char, List<string>>();
public Aufgabe(string filename)
{
string filler = "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ";
foreach (char f in filler)
{
firstletter[f] = new List<string>();
}
Load(filename);
}
public void Load(string filename)
{
List<string> w = new List<string>();
StreamReader r = new StreamReader(filename);
while (!r.EndOfStream)
{
string row = r.ReadLine();
string[] parts = row.Split(' ');
string[] sonderzeichen = new string[] { "#", ",", ".", ";", "'", "1", "2", "3", "4", "5", "6", "7", "8", "9", "0", "(", ")", "{",
"}", "!", "?", "/", "\"", "&", "+", "-", "–" };
string[] list = new string[parts.Length];
for (int i = 0; i < parts.Length; i++)
{
string a = parts[i];
foreach (string s in sonderzeichen)
{
if (s != "-")
{
a = a.Replace(s, string.Empty);
}
else
{
if (a.Length == 1)
{
a = string.Empty;
}
}
}
list[i] = a;
}
parts = list;
foreach (string a in parts)
{
if (words.ContainsKey(a))
{
words[a] += 1;
}
else
{
words.Add(a, 1);
}
string b = a.ToUpper();
if (b == "")
continue;
List<string> letter = firstletter[b[0]];
if (!letter.Contains(a))
{
letter.Add(a);
}
}
}
}
There are some things missing in the other answers:
No validation is done to check if the text is a word
Comparison should not be case-sensitive (i.e. spain, Spain and SPAIN should be considered the same word)
My solution:
StringComparer comparer = StringComparer.OrdinalIgnoreCase;
string text = "The 'rain' in spain falls mainly on the plain. 07 November 2018 20:02:07 - 20180520 I said the Plain in SPAIN. 12345";
var dictionary = Regex.Split(text, @"\W+")
.Where(IsValidWord)
.GroupBy(m => m, comparer)
.ToDictionary(m => m.Key, m => m.Count(), comparer);
Method IsValidWord:
// logic to validate word goes here
private static bool IsValidWord(string text)
{
double value;
bool isNumeric = double.TryParse(text, out value);
// add more validation rules here
return !isNumeric;
}
EDIT
I noticed in your code that you have a Dictionary with the words grouped by first letter. This can be achieved like this (using the previous dictionary):
var lettersDictionary = dictionary.Keys.GroupBy(x => x.Substring(0, 1),
(alphabet, subList) => new {
Alphabet = alphabet,
SubList = subList.OrderBy(x => x, comparer).ToList()
})
.ToDictionary(m => m.Alphabet, m => m.SubList, comparer);
You can just split with a regex, then use LINQ to create your dictionary:
var dictionary = Regex.Split(text, @"\W+")
.GroupBy(m => m, StringComparer.OrdinalIgnoreCase) // Case-insensitive
.ToDictionary(m => m.Key, m => m.Count());
UPDATE
In applying to your example code, your task class could become something like this to build both dictionaries (and to consider case insensitive):
public class Aufgabe
{
const string ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ";
public Dictionary<string, int> words;
public Dictionary<char, List<string>> firstletter;
public Aufgabe(string filename)
{
var text = File.ReadAllText(filename);
words = Regex.Split(text, @"\W+")
.Where(m => m.Length > 0) // Regex.Split can yield empty strings; m[0] below would throw on them
.GroupBy(m => m, StringComparer.OrdinalIgnoreCase)
.ToDictionary(m => m.Key, m => m.Count());
firstletter = ALPHABET.ToDictionary(a => a, // First-letter key
a => words.Keys.Where(m => a == char.ToUpper(m[0])).ToList()); // Words
}
}
Here is one way with Regex, note that case sensitivity has not been addressed
var text = "The 'rain' in spain falls mainly on the plain. I said the plain in spain";
var result = new Dictionary<string,string>();
Regex.Matches(text, @"[^\s]+")
.OfType<Match>()
.Select(m => Regex.Replace(m.Value, @"\W", string.Empty))
.ToList()
.ForEach(word =>
{
if (!result.ContainsKey(word))
result.Add(word, word);
});
result
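This is almost certainly a job for regular expressions. \W+ matches one or more non-word characters, so splitting on it breaks the input into words, where a word is any run of letters, digits, or underscores. Note that a hyphenated term such as State-of-the-art is split into its parts, and trailing punctuation produces an empty string, which the loop below skips. See the documentation.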
string sentence = "You are pretty. State-of-the-art.";
string[] words = Regex.Split(sentence, @"\W+");
foreach (string word in words)
{
if (word != "")
{
Console.WriteLine(word);
}
}

Sorting a list of numeric strings with numerous decimal points

I've got a situation where I need to sort a list of strings that contain three decimal parts in descending order from left-to-right. The real code is a dictionary of <string, object>, but I've simplified it here as I'm in the same predicament either way.
Straight to the code:
using System;
using System.Collections.Generic;
class Program
{
static void Main()
{
List<string> perlVersions = new List<string>();
perlVersions.Add("5.26.1_32");
perlVersions.Add("5.24.1_32");
perlVersions.Add("5.8.1_64");
perlVersions.Add("5.24.2_64");
perlVersions.Sort();
perlVersions.Reverse();
foreach (string str in perlVersions) Console.WriteLine(str);
}
}
Output:
5.8.1_64
5.26.1_32
5.24.2_64
5.24.1_32
Now, Everything works well, except that the 5.8.1_64, due to the second part of the number being lower than all others, should be at the bottom.
Is there a special sorting trick I'm missing, or is there a way to further break apart the strings and sort on each individual element?
You could, for example, split the string, treat the different parts as integers, and then sort by these using some LINQ:
static void Main()
{
List<string> perlVersions = new List<string>();
perlVersions.Add("5.26.1_32");
perlVersions.Add("5.24.1_32");
perlVersions.Add("5.8.1_64");
perlVersions.Add("5.24.2_64");
perlVersions = perlVersions
.Select(x => x.Split(new char[] { '.' }))
.Select(x =>
{
string[] lastParts = x[2].Split(new char[] { '_' });
return new { a = Convert.ToInt32(x[0]), b = Convert.ToInt32(x[1]), c = Convert.ToInt32(lastParts[0]), d = Convert.ToInt32(lastParts[1]) };
})
.OrderBy(x => x.a).ThenBy(x => x.b).ThenBy(x => x.c).ThenBy(x => x.d)
.Select(x => string.Format("{0}.{1}.{2}_{3}", x.a, x.b, x.c, x.d))
.ToList();
perlVersions.Reverse();
foreach (string str in perlVersions) Console.WriteLine(str);
}
Try this one:
string[] separator = new string[] { "." };
var result = perlVersions
.OrderByDescending(s => int.Parse(s.Split(separator, StringSplitOptions.None)[0]))
.ThenByDescending(s => int.Parse(s.Split(separator, StringSplitOptions.None)[1]))
.ToList();
Or a full query syntax version:
var b = from v in perlVersions
let ii = v.Split('.')
.Take(2)
.Select(i => int.Parse(i)).ToArray()
orderby ii[0] descending, ii[1] descending // one orderby clause: major first, then minor
select v;
You can do your custom sort using LINQ.
To do so, split your string by '.' and then left-pad each part with '0':
List<string> perlVersions = new List<string>();
perlVersions.Add("5.26.1_32");
perlVersions.Add("5.24.1_32");
perlVersions.Add("5.8.1_64");
perlVersions.Add("5.24.2_64");
perlVersions = perlVersions
.OrderByDescending(v => string.Concat(v.Split('.').Select(x => x.PadLeft(5, '0'))))
.ToList();
This will (temporarily) convert "8" to "00008" and "24" to "00024", which makes the string sort work as expected.
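As an alternative to padding, the parts can be parsed into integers and used only as sort keys, so the original strings are returned unchanged. This is a rough sketch under the assumption that every entry has exactly the shape major.minor.patch_bits with numeric parts; the names VersionSort and SortDescending are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VersionSort
{
    // Hypothetical helper: sorts strings such as "5.24.1_32" in descending order,
    // comparing each numeric part from left to right.
    // Assumes exactly four numeric parts separated by '.' and '_'.
    public static List<string> SortDescending(IEnumerable<string> versions)
    {
        return versions
            .Select(v => new
            {
                Text = v,
                Parts = v.Split('.', '_').Select(p => int.Parse(p)).ToArray()
            })
            .OrderByDescending(x => x.Parts[0])
            .ThenByDescending(x => x.Parts[1])
            .ThenByDescending(x => x.Parts[2])
            .ThenByDescending(x => x.Parts[3])
            .Select(x => x.Text)
            .ToList();
    }
}
```

For the four versions in the question this gives 5.26.1_32, 5.24.2_64, 5.24.1_32, 5.8.1_64. Input that does not match the assumed shape would throw in int.Parse or when indexing Parts, so real code would need validation first.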

String array elements OrderBy itself

If I have a string array like this:
string[] str = new string[]{"abc", "bacd", "pacds"};
Then I need output like below using LINQ:
output: abc, abcd, acdps
This should be what you want:
string[] str = new string[] { "abc", "bacd", "pacds" };
var result = str.Select(c => String.Concat(c.OrderBy(d => d)));
The result is an IEnumerable<string>, but if you want the result in a string array, add .ToArray():
var result = str.Select(c => String.Concat(c.OrderBy(d => d))).ToArray();
The result: abc, abcd, acdps
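You can use String.Concat(st.OrderBy(c => c)) to order a string by its characters. Assigning to the lambda parameter inside ForEach does not change the array, so project the result into a new array instead:
str = str.Select(val => String.Concat(val.OrderBy(c => c))).ToArray();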
str.Select(x => x.ToCharArray().OrderBy(c => c).Aggregate("", (s,c)=>s+c))
This just replaces the strings in the array with their character-ordered versions:
static void orderChars(string[] str)
{
for (int i = 0; i < str.Length; i++)
str[i] = new string(str[i].OrderBy(c => c).ToArray());
}
As bassfader commented, show your code and say where you are stuck, then we can guide you. Anyway, to sort the array elements themselves you can write it like this:
string[] a = new string[]
{"Indonesian","Korean","Japanese","English","German"};
var sort = from s in a orderby s select s;

Split string to Dictionary<string, int>

I have a string like this: "content;123 contents;456 contentss;789 " etc.
I would like to split this string to get a Dictionary, but I don't know how to do it. I tried to split the string but I only got a List.
The content (before the semicolon) is always a unique string.
After the semicolon there is always a number, up to the next space.
The number is always an int (no floats needed).
Could someone help me, please?
You can use the following LINQ expression:
"content;123 contents;456 contentss;789"
.Split(' ')
.Select(x => x.Split(';'))
.ToDictionary(x => x[0], x => int.Parse(x[1]));
string input = "content1;123 content2;456 content3;789";
var dict = Regex.Matches(input, @"(\S+?);(\d+)").Cast<Match>() // \S keeps the separating space out of the key
.ToDictionary(m => m.Groups[1].Value, m => int.Parse(m.Groups[2].Value));
You can do something like this:
string value = "content;123 contents;456 contentss;789";
Dictionary<string, int> data = new Dictionary<string,int>();
foreach(string line in value.Split(' '))
{
string[] values = line.Split(';');
if (!data.ContainsKey(values[0]))
{
data.Add(values[0], Convert.ToInt32(values[1]));
}
}
var myList = "content1;1 content2;2 content3;3"; // the part after ';' must be a valid integer for int.Parse
var myDictionary = myList.Split(' ').Select(pair => pair.Split(';')).ToDictionary(splitPair => splitPair[0], splitPair => int.Parse(splitPair[1]));
static void Main(string[] args)
{
string content = "content;123 contents;456 contentss;789";
Dictionary<string, int> result = new Dictionary<string, int>();
content.Split(' ').ToList().ForEach(x =>
{
var items = x.Split(';');
result.Add(items[0], int.Parse(items[1]));
});
foreach(var item in result)
{
Console.WriteLine("{0} -> {1}" , item.Key, item.Value);
}
}
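The string in the question ends with a space, so a plain Split(' ') produces an empty last token, and indexing its second part would throw. A more defensive variant might look like the sketch below. It assumes that malformed tokens should simply be skipped and that a repeated key should keep its last value; the names PairParser and Parse are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class PairParser
{
    // Hypothetical helper: parses "key;number key;number ..." into a dictionary.
    // Tokens that are empty or not of the form key;int are skipped (an assumption).
    public static Dictionary<string, int> Parse(string input)
    {
        var result = new Dictionary<string, int>();
        string[] tokens = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            string[] parts = token.Split(';');
            int number;
            if (parts.Length == 2 && int.TryParse(parts[1], out number))
            {
                result[parts[0]] = number; // indexer overwrites instead of throwing on a repeated key
            }
        }
        return result;
    }
}
```

PairParser.Parse("content;123 contents;456 contentss;789 ") returns three entries, with "contents" mapped to 456.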
