How to store every 2 characters in c# - c#

Is there a way to store every 2 characters in a string?
e.g.
1+2-3-2-3+
So it would be "1+", "2-", "3-", "2-", "3+" as separate strings or in an array.

The simplest way would be to walk your string with a loop, and take two-character substrings from the current position:
var res = new List<string>();
for (int i = 0 ; i < str.Length ; i += 2)
res.Add(str.Substring(i, 2));
An advanced solution would do the same thing with LINQ, and avoid an explicit loop:
var res = Enumerable
.Range(0, str.Length/2)
.Select(i => str.Substring(2*i, 2))
.ToList();
The second solution is somewhat more compact, but it is harder to understand, at least to someone not closely familiar with LINQ.

This is a good problem for a regular expressio. You could try:
\d[+-]
Just find how to compile that regular expression (HINT) and call a method that returns all occurrences.

Use a for loop, and extract the characters using the string.Substring() method, ensuring you do not go over the length of the string.
e.g.
string x = "1+2-3-2-3+";
const int LENGTH_OF_SPLIT = 2;
for(int i = 0; i < x.Length(); i += LENGTH_OF_SPLIT)
{
string temp = null; // temporary storage, that will contain the characters
// if index (i) + the length of the split is less than the
// length of the string, then we will go out of bounds (i.e.
// there is more characters to extract)
if((LENGTH_OF_SPLIT + i) < x.Length())
{
temp = x.Substring(i, LENGTH_OF_SPLIT);
}
// otherwise, we'll break out of the loop
// or just extract the rest of the string, or do something else
else
{
// you can possibly just make temp equal to the rest of the characters
// i.e.
// temp = x.Substring(i);
break; // break out of the loop, since we're over the length of the string
}
// use temp
// e.g.
// Print it out, or put it in a list
// Console.WriteLine(temp);
}

Related

Filling a character array with characters from a string

I'm trying to fill an array with characters from a string inputted via console. I've tried the code bellow but it doesnt seem to work. I get Index out Of Range exception in the for loop part, and i didn't understand why it occured. Is the for loop range incorrect? Any insight would be greatly appreciated
Console.WriteLine("Enter a string: ");
var name = Console.ReadLine();
var intoarray = new char[name.Length];
for (var i = 0; i <= intoarray.Length; i++)
{
intoarray[i] = name[i];
}
foreach (var n in intoarray)
Console.WriteLine(intoarray[n]);
using ToCharArray() strings can be converted into character arrays.
Console.WriteLine("Enter a string: ");
var name = Console.ReadLine();
var intoarray= name.ToCharArray();
foreach (var n in intoarray)
Console.WriteLine(n);
if you are using foreach, you should wait for the index to behave as if you were taking the value.
Console.WriteLine(n);
Since arrays start at 0 and you are counting inclusive of length, the last iteration will exceed the bounds.
Just update the loop conditional to be less than length rather than less than or equal to..
I like snn bm's answer, but to answer you question directly, you're exceeding the length of the input by one. It should be:
for (var i = 0; i <= intoarray.Length - 1; i++)
(Since strings are zero-indexed, the last character in the underlying array is always going to be in the position of arrayLength - 1.)
1: the iteration should be for (var i = 0; i < intoarray.Length; i++)
2: the code
foreach (var n in intoarray)
Console.WriteLine(intoarray[n]);
also throws an exception, for "n" is a character in the array while it's used as array index.
3: In addition there's a much easier way to convert string to char array
var intoarray = name.ToCharArray();
Here's the result
Here is your mistake. There are so many options to represent i < intoarray.Length.
for (var i = 0; i < intoarray.Length; i++) // original was i <= intoarray.Length in your code
{
intoarray[i] = name[i];
}
With linq:
// Select all chars
IEnumerable<char> intoarray =
from ch in name
select ch; // can use var instead of IEnumerable<char>
// Execute the query
foreach (char temp in intoarray)
Console.WriteLine(temp);

Finding string in the beginning of a larger string

I'm trying to find, in an array of strings, which one starts with a particular substring. One of the strings in the array is guaranteed to start with the particular substring.
I tried to use:
int index = Array.BinarySearch (lines, "^"+subString);
Where lines is the array of strings and I'm looking for the index of the array that starts with subString. However, I'm either using regex improperly or there's a much better way to go about this?
BinarySearch can be used only to find a complete string, so you cannot use it for a substring match. You also have to ensure that the array is ordered in the first place to use BinarySearch.
You can use Array.FindIndex:
int index = Array.FindIndex(lines, line => line.TrimStart().StartsWith(subString));
Do you need to find the index of the (first) occurence, or do you need to find the actual strings that match that criterium?
myString.StartsWith(myPrefix); //returns bool
That should do the trick. Or a little more verbose:
var matchedLines = lines.Where(line => line.StartsWith(substring)).ToList();
If you need the index of the first occurence, I'd address it as an array:
var firstOccurence = String.Empty;
var firstOccurenceIndex = -1;
for(int i = 0; i < lines.Length; i++)
{
if(lines[i].StartsWith(substring))
{
firstOccurence = lines[i];
firstOccurenceIndex = i;
break;
}
}
Note: you don't HAVE to use an array. It could just as well have been done with a foreach, manual counter incrementor and a break statement. I just prefer to work with arrays if I'm looking for an index.
Yet, another solution:
int index = lines.ToList().FindIndex(line => line.TrimStart().StartsWith(subString));

Optimizing counting characters within a string

I just created a simple method to count occurences of each character within a string, without taking caps into account.
static List<int> charactercount(string input)
{
char[] characters = "abcdefghijklmnopqrstuvwxyz".ToCharArray();
input = input.ToLower();
List<int> counts = new List<int>();
foreach (char c in characters)
{
int count = 0;
foreach (char c2 in input) if (c2 == c)
{
count++;
}
counts.Add(count);
}
return counts;
}
Is there a cleaner way to do this (i.e. without creating a character array to hold every character in the alphabet) that would also take into account numbers, other characters, caps, etc?
Conceptually, I would prefer to return a Dictionary<string,int> of counts. I'll assume that it's ok to know by omission rather than an explicit count of 0 that a character occurs zero times, you can do it via LINQ. #Oded's given you a good start on how to do that. All you would need to do is replace the Select() with ToDictionary( k => k.Key, v => v.Count() ). See my comment on his answer about doing the case insensitive grouping. Note: you should decide if you care about cultural differences in characters or not and adjust the ToLower method accordingly.
You can also do this without LINQ;
public static Dictionary<string,int> CountCharacters(string input)
{
var counts = new Dictionary<char,int>(StringComparer.OrdinalIgnoreCase);
foreach (var c in input)
{
int count = 0;
if (counts.ContainsKey(c))
{
count = counts[c];
}
counts[c] = counts + 1;
}
return counts;
}
Note if you wanted a Dictionary<char,int>, you could easily do that by creating a case invariant character comparer and using that as the IEqualityComparer<T> for a dictionary of the required type. I've used string for simplicity in the example.
Again, adjust the type of the comparer to be consistent with how you want to handle culture.
Using GroupBy and Select:
aString.GroupBy(c => c).Select(g => new { Character = g.Key, Num = g.Count() })
The returned anonymous type list will contain each character and the number of times it appears in the string.
You can then filter it in any way you wish, using the static methods defined on Char.
Your code is kind of slow because you are looping through the range a-z instead of just looping through the input.
If you only need to count letters (like your code suggests), the fastest way to do it would be:
int[] CountCharacters(string text)
{
var counts = new int[26];
for (var i = 0; i < text.Length; i++)
{
var charIndex - text[index] - (int)'a';
counts[charIndex] = counts[charindex] + 1;
}
return counts;
}
Note that you need to add some thing like verify the character is in the range, and convert it to lowercase when needed, or this code might throw exceptions. I'll leave those for you to add. :)
Based on +Ran's answer to avoiding IndexOutOfRangeException:
static readonly int differ = 'a';
int[] CountCharacters(string text) {
text = text.ToLower();
var counts = new int[26];
for (var i = 0; i < text.Length; i++) {
var charIndex = text[i] - differ;
// to counting chars between 'a' and 'z' we have to do this:
if(charIndex >= 0 && charIndex < 26)
counts[charIndex] += 1;
}
return counts;
}
Actually using Dictionary and/or LINQ is not optimized enough as counting chars and working with a low level array.

How to get the a string that is most repeated in a list

I have a lot of lists like the following:
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[1]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[2]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[2]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[3]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[3]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[4]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[4]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[5]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[5]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[6]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[7]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[8]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[8]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[9]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[9]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[10]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[10]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[11]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[2]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[12]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[12]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[13]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[13]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[14]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[14]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[15]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[15]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[16]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[16]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[17]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[18]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[18]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[19]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[19]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[20]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[20]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[21]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[2]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[22]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[22]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[23]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[23]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[24]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[24]/div[2]/div[4]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[25]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[25]/div[2]/div[4]
And I need to extract the portion that is most repeated in each line, which in this case is
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li
What's the best way to do this?
I'm using C#/.net
thanks!
If I understand your question correctly, what you want is the longest common prefix of all lines. You could obtain it by doing something like that:
void Main()
{
string path = #"D:\tmp\so5670107.txt";
string[] lines = File.ReadAllLines(path);
string prefix = LongestCommonPrefix(lines);
Console.WriteLine(prefix);
}
static string LongestCommonPrefix(string a, string b)
{
int length = 0;
for (int i = 0; i < a.Length && i < b.Length; i++)
{
if (a[i] == b[i])
length++;
else
break;
}
return a.Substring(0, length);
}
static string LongestCommonPrefix(IEnumerable<string> strings)
{
return strings.Aggregate(LongestCommonPrefix);
}
The result is:
/html[1]/body[1]/div[5]/div[1]/div[2]/div[
(the expected result you give in the question seems incorrect, since there are lines that don't match it)
I chose a naive approach for the sake of simplicity, but of course there are more efficient ways of finding the longest common prefix between two strings (using a dichotomic search for instance)
You could do this with a loop. Assumption is that your list of strings is in a collection called paths:
var countByPath = new Dictionary<string, int>();
foreach (var path in paths)
{
if (!countByPath.ContainsKey(path))
{
countByPath[path] = 1;
}
else
{
countByPath[path]++;
}
}
The longest substring that is repeated in the list? Assumption is that your list of strings is in a collection called paths:
var currentChoice = "";
foreach (var path in paths)
{
for (int i = path.Length; i > 0; i--)
{
var candidate = path.Substring(0, i);
if (i > currentChoice.Length &&
paths.Count(p => p.StartsWith(candidate)) > 1)
currentChoice = candidate;
else
break;
}
}
Console.WriteLine(currentChoice);
The result is then
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[10]
since it is repeated twice
There is already an algorithm for this. I can't remember what it's called, but if you are interested in language independent implementation. It works in the following way:
Read first line
Read second line. If second line is the same as first line, than increase counter by one, otherwise keep counter at zero.
Carry on reading lines, if three lines are the same (i.e. repeat), than your counter will be 2. If next line is different to the previous three, than decrease counter by 1.
E.g.
String1 - Counter: 0
String1 - Counter: 1 (Store String1 in a variable)
String1 - Counter: 2 (Store String1 in same variable)
String2 - Counter: 1 (Still store String1 in variable)
I hope this makese sense. I did this at uni few years ago. Can't remember mathematician who came up with algorithm, but it's fairly old.

Testing for repeated characters in a string

I'm doing some work with strings, and I have a scenario where I need to determine if a string (usually a small one < 10 characters) contains repeated characters.
`ABCDE` // does not contain repeats
`AABCD` // does contain repeats, ie A is repeated
I can loop through the string.ToCharArray() and test each character against every other character in the char[], but I feel like I am missing something obvious.... maybe I just need coffee. Can anyone help?
EDIT:
The string will be sorted, so order is not important so ABCDA => AABCD
The frequency of repeats is also important, so I need to know if the repeat is pair or triplet etc.
If the string is sorted, you could just remember each character in turn and check to make sure the next character is never identical to the last character.
Other than that, for strings under ten characters, just testing each character against all the rest is probably as fast or faster than most other things. A bit vector, as suggested by another commenter, may be faster (helps if you have a small set of legal characters.)
Bonus: here's a slick LINQ solution to implement Jon's functionality:
int longestRun =
s.Select((c, i) => s.Substring(i).TakeWhile(x => x == c).Count()).Max();
So, OK, it's not very fast! You got a problem with that?!
:-)
If the string is short, then just looping and testing may well be the simplest and most efficient way. I mean you could create a hash set (in whatever platform you're using) and iterate through the characters, failing if the character is already in the set and adding it to the set otherwise - but that's only likely to provide any benefit when the strings are longer.
EDIT: Now that we know it's sorted, mquander's answer is the best one IMO. Here's an implementation:
public static bool IsSortedNoRepeats(string text)
{
if (text.Length == 0)
{
return true;
}
char current = text[0];
for (int i=1; i < text.Length; i++)
{
char next = text[i];
if (next <= current)
{
return false;
}
current = next;
}
return true;
}
A shorter alternative if you don't mind repeating the indexer use:
public static bool IsSortedNoRepeats(string text)
{
for (int i=1; i < text.Length; i++)
{
if (text[i] <= text[i-1])
{
return false;
}
}
return true;
}
EDIT: Okay, with the "frequency" side, I'll turn the problem round a bit. I'm still going to assume that the string is sorted, so what we want to know is the length of the longest run. When there are no repeats, the longest run length will be 0 (for an empty string) or 1 (for a non-empty string). Otherwise, it'll be 2 or more.
First a string-specific version:
public static int LongestRun(string text)
{
if (text.Length == 0)
{
return 0;
}
char current = text[0];
int currentRun = 1;
int bestRun = 0;
for (int i=1; i < text.Length; i++)
{
if (current != text[i])
{
bestRun = Math.Max(currentRun, bestRun);
currentRun = 0;
current = text[i];
}
currentRun++;
}
// It's possible that the final run is the best one
return Math.Max(currentRun, bestRun);
}
Now we can also do this as a general extension method on IEnumerable<T>:
public static int LongestRun(this IEnumerable<T> source)
{
bool first = true;
T current = default(T);
int currentRun = 0;
int bestRun = 0;
foreach (T element in source)
{
if (first || !EqualityComparer<T>.Default(element, current))
{
first = false;
bestRun = Math.Max(currentRun, bestRun);
currentRun = 0;
current = element;
}
}
// It's possible that the final run is the best one
return Math.Max(currentRun, bestRun);
}
Then you can call "AABCD".LongestRun() for example.
This will tell you very quickly if a string contains duplicates:
bool containsDups = "ABCDEA".Length != s.Distinct().Count();
It just checks the number of distinct characters against the original length. If they're different, you've got duplicates...
Edit: I guess this doesn't take care of the frequency of dups you noted in your edit though... but some other suggestions here already take care of that, so I won't post the code as I note a number of them already give you a reasonably elegant solution. I particularly like Joe's implementation using LINQ extensions.
Since you're using 3.5, you could do this in one LINQ query:
var results = stringInput
.ToCharArray() // not actually needed, I've left it here to show what's actually happening
.GroupBy(c=>c)
.Where(g=>g.Count()>1)
.Select(g=>new {Letter=g.First(),Count=g.Count()})
;
For each character that appears more than once in the input, this will give you the character and the count of occurances.
I think the easiest way to achieve that is to use this simple regex
bool foundMatch = false;
foundMatch = Regex.IsMatch(yourString, #"(\w)\1");
If you need more information about the match (start, length etc)
Match match = null;
string testString = "ABCDE AABCD";
match = Regex.Match(testString, #"(\w)\1+?");
if (match.Success)
{
string matchText = match.Value; // AA
int matchIndnex = match.Index; // 6
int matchLength = match.Length; // 2
}
How about something like:
string strString = "AA BRA KA DABRA";
var grp = from c in strString.ToCharArray()
group c by c into m
select new { Key = m.Key, Count = m.Count() };
foreach (var item in grp)
{
Console.WriteLine(
string.Format("Character:{0} Appears {1} times",
item.Key.ToString(), item.Count));
}
Update Now, you'd need an array of counters to maintain a count.
Keep a bit array, with one bit representing a unique character. Turn the bit on when you encounter a character, and run over the string once. A mapping of the bit array index and the character set is upto you to decide. Break if you see that a particular bit is on already.
/(.).*\1/
(or whatever the equivalent is in your regex library's syntax)
Not the most efficient, since it will probably backtrack to every character in the string and then scan forward again. And I don't usually advocate regular expressions. But if you want brevity...
I started looking for some info on the net and I got to the following solution.
string input = "aaaaabbcbbbcccddefgg";
char[] chars = input.ToCharArray();
Dictionary<char, int> dictionary = new Dictionary<char,int>();
foreach (char c in chars)
{
if (!dictionary.ContainsKey(c))
{
dictionary[c] = 1; //
}
else
{
dictionary[c]++;
}
}
foreach (KeyValuePair<char, int> combo in dictionary)
{
if (combo.Value > 1) //If the vale of the key is greater than 1 it means the letter is repeated
{
Console.WriteLine("Letter " + combo.Key + " " + "is repeated " + combo.Value.ToString() + " times");
}
}
I hope it helps, I had a job interview in which the interviewer asked me to solve this and I understand it is a common question.
When there is no order to work on you could use a dictionary to keep the counts:
String input = "AABCD";
var result = new Dictionary<Char, int>(26);
var chars = input.ToCharArray();
foreach (var c in chars)
{
if (!result.ContainsKey(c))
{
result[c] = 0; // initialize the counter in the result
}
result[c]++;
}
foreach (var charCombo in result)
{
Console.WriteLine("{0}: {1}",charCombo.Key, charCombo.Value);
}
The hash solution Jon was describing is probably the best. You could use a HybridDictionary since that works well with small and large data sets. Where the letter is the key and the value is the frequency. (Update the frequency every time the add fails or the HybridDictionary returns true for .Contains(key))

Categories