Comparing a part of a string. C#

I want to compare and check if a string is a part of another string. For example:
String1 = "ACGTAAG"
String2 = "TAA"
I want to check whether String1 contains String2. I'm using this code, but it does not work.
public bool ContainsSequence(string input, string toBeChecked)
{
for (int i = 0; i < input.Length; i++)
{
Char x = input[i];
String y = Convert.ToString(x);
for (int j = 0; j < toBeChecked.Length; j++)
{
Char a = toBeChecked[j];
String b = Convert.ToString(a);
if (b.Equals(y))
{
j = toBeChecked.Length;
return true;
}
}
}
return false;
}
Here input = String1 and toBeChecked = String2.
I'm new to C#, so some of my terms may be confusing.

Try using String.Contains().
Check it out here:
http://msdn.microsoft.com/en-us/library/dy85x1sa%28v=vs.110%29.aspx
Good luck.

Use
if (mainString.Contains(searchedString))
{
//do stuff
}

It looks to me like you're using this to compare DNA sequences :).
Maybe string.IndexOf(string value) or one of its overloads is better for you, because it can help
you search for further occurrences of the same string (stuff like "how many times does it contain the string"): http://msdn.microsoft.com/en-us/library/k8b1470s(v=vs.110).aspx
If indeed all you want is just to see whether the string is contained, I'd also go for the versions provided by the others.
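To make the IndexOf suggestion concrete, here is a minimal sketch of counting occurrences. The class and method names (SequenceSearch, CountOccurrences) are made up for illustration, and the choice of an ordinal comparison and of counting overlapping hits are assumptions, not something the question specifies.

```csharp
using System;

public static class SequenceSearch
{
    // Counts how many times pattern occurs in text, including overlapping hits.
    // StringComparison.Ordinal (an assumption) compares raw characters,
    // which suits plain nucleotide letters such as A, C, G and T.
    public static int CountOccurrences(string text, string pattern)
    {
        if (string.IsNullOrEmpty(pattern))
        {
            return 0; // an empty pattern would otherwise "match" everywhere
        }

        int count = 0;
        int index = text.IndexOf(pattern, StringComparison.Ordinal);
        while (index >= 0)
        {
            count++;
            // Resume one character past the last hit so overlaps are counted.
            index = text.IndexOf(pattern, index + 1, StringComparison.Ordinal);
        }
        return count;
    }
}
```

With the strings from the question, SequenceSearch.CountOccurrences("ACGTAAG", "TAA") finds a single occurrence.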

For nucleotide sequences you probably want a sequence alignment algorithm that tolerates some differences between the sequences, for example the Smith–Waterman algorithm.
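As a rough illustration of what such an algorithm computes, the sketch below fills the Smith–Waterman scoring matrix and returns only the best local alignment score. The scoring values (match 2, mismatch -1, linear gap -1) and the names are assumptions chosen for the example; real tools use tuned substitution matrices and also trace back the alignment itself.

```csharp
using System;

public static class LocalAlignment
{
    // Returns the best local alignment score between a and b.
    // The default scores are illustrative assumptions, not standard values.
    public static int SmithWatermanScore(string a, string b,
        int match = 2, int mismatch = -1, int gap = -1)
    {
        // h[i, j] = best score of an alignment ending at a[i-1] and b[j-1].
        int[,] h = new int[a.Length + 1, b.Length + 1];
        int best = 0;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int diagonal = h[i - 1, j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
                int up = h[i - 1, j] + gap;    // gap in b
                int left = h[i, j - 1] + gap;  // gap in a

                // A local alignment may restart anywhere, hence the floor of 0.
                int score = Math.Max(0, Math.Max(diagonal, Math.Max(up, left)));
                h[i, j] = score;
                best = Math.Max(best, score);
            }
        }
        return best;
    }
}
```

With these scores an exact occurrence of a three-letter pattern scores 6, so SmithWatermanScore("ACGTAAG", "TAA") is 6, while a near match would score lower rather than failing outright.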

Related

compare different combinations of a string [closed]

I have a dictionary with key value pairs something like below
dict("A-B-C", "This is abc");
dict("X-Y-Z", "This is xyz");
and so on.
Now, when I receive an input such as "B-A-C", the system should be intelligent enough to know that it means "A-B-C", and "Z-Y-X" should map to "X-Y-Z". One way is to put every combination into the dictionary, for example dict("B-A-C", "A-B-C"), but that is harder to maintain, so I am wondering whether anything in the .NET Framework can address this more easily.
Is there an easy way to compare strings as described below?
A string that is either A-B or B-A equals A-B.
A string that is any of A-B-C, B-C-A, A-C-B, C-A-B or C-B-A equals A-B-C.
Likewise, the different combinations of A-B-C-D should equal A-B-C-D,
and the same goes for A-B-C-D-E,
and so on.
A, B, C, D and E in the example above are two-letter non-numeric words, something like
Su-Ma-Ju-Ve
UPDATE
For now I have solved this with the code below, which returns the string tokens in the order recognized by the dictionary I maintain. I am not sure it is the best way, but it solves the problem for now.
string sortPls(string pls)
{
Dictionary<string, int> dctPls = new Dictionary<string, int>();
dctPls["Su"] = 1;
dctPls["Mo"] = 2;
dctPls["Ju"] = 3;
dctPls["Me"] = 4;
dctPls["Ve"] = 5;
dctPls["Ma"] = 6;
dctPls["Sa"] = 7;
string[] arrPls = pls.Split('-');
int j = 0;
string sortPls = string.Empty;
for(int i = 0; i < arrPls.Length; i++)
{
for (j = i + 1; j < arrPls.Length; j++)
{
if (dctPls[arrPls[j]] < dctPls[arrPls[i]])
{
string tmp = arrPls[i];
arrPls[i] = arrPls[j];
arrPls[j] = tmp;
}
}
}
for (int k = 0; k < arrPls.Length; k++)
sortPls += arrPls[k] + "-";
return sortPls.Remove(sortPls.Length - 1) ;
}

Yes. If looking for set equality, then HashSet<T> gives you the tools.
"ABBA".ToHashSet().SetEquals("BA")
Otherwise for an "anagram" type comparison, order the characters and compare sequences:
"CDAB".OrderBy(x=>x).SequenceEqual("DCBA".OrderBy(x=>x))

Split the string into individual letters, add the letters to a Set, and compare the sets using their Equals methods

void Main()
{
string check = "CADB";
string equalTo = new string(check.ToCharArray().OrderBy(x => x).ToArray());
Console.WriteLine(equalTo);
}
is just one way to do that.

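I think this will be the fastest way to do it. Note that it compares the sets of characters, so both directions have to be checked:
bool StringEquals(string string1, string string2)
{
// Every character of string1 must occur in string2 ...
foreach (char ch in string1)
{
if (!string2.Contains(ch))
{
return false;
}
}
// ... and every character of string2 must occur in string1.
foreach (char ch in string2)
{
if (!string1.Contains(ch))
{
return false;
}
}
return true;
}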

Sort the characters in each string; any two strings that have the same characters will have the same sorted version. Stick the results into a Dictionary, with the sorted version as the key.
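Since the question uses two-letter tokens separated by '-' rather than single characters, the same idea can be applied per token. The sketch below is one possible reading, with hypothetical names (TokenKey, Normalize); it sorts alphabetically, which differs from the custom order in the asker's sortPls, so it only works if the dictionary keys are normalized the same way.

```csharp
using System;

public static class TokenKey
{
    // Splits the key on '-', sorts the tokens with an ordinal comparison
    // and joins them again, so every permutation maps to the same string.
    public static string Normalize(string key)
    {
        string[] tokens = key.Split('-');
        Array.Sort(tokens, StringComparer.Ordinal);
        return string.Join("-", tokens);
    }
}
```

Store each entry under Normalize(key) and look inputs up with Normalize(input); "B-A-C" and "C-B-A" then both resolve to the entry stored under "A-B-C".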

PadRight in string of arrays doesn't add chars

I created array of strings which includes strings with Length from 4 to 6. I am trying to PadRight 0's to get length for every element in array to 6.
string[] array1 =
{
"aabc", "aabaaa", "Abac", "abba", "acaaaa"
};
for (var i = 0; i <= array1.Length-1; i++)
{
if (array1[i].Length < 6)
{
for (var j = array1[i].Length; j <= 6; j++)
{
array1[i] = array1[i].PadRight(6 - array1[i].Length, '0');
}
}
Console.WriteLine(array1[i]);
}
Right now the program writes out exactly the same strings I have in the array, without adding 0's at the end. I did a little research and found some information saying that strings are immutable, yet there are still examples that appear to change strings. I couldn't find any with PadRight or PadLeft, and I feel like there must be a way to do it, but I just can't figure it out.
Any ideas on how to fix that issue?

The first argument to PadRight is the total length you want. You've specified 6 - array1[i].Length, and as all your strings start off with at least 4 characters, you're asking for a total length of at most 2 characters, which is shorter than the string itself, so it's not doing anything.
You don't need your inner loop, and your outer loop condition is more conventionally written as <. This is one way I'd write that code:
using System;
public class Test
{
static void Main()
{
string[] array =
{
"aabc", "aabaaa", "Abac", "abba", "acaaaa"
};
for (var i = 0; i < array.Length; i++)
{
array[i] = array[i].PadRight(6, '0');
Console.WriteLine(array[i]);
}
}
}
In fact I'd probably use foreach, or even Select, but that's a different matter. I've left this using an array to be a bit closer to your original code.
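For comparison, a Select-based version could look roughly like the sketch below. The helper name PadAll and its parameters are invented for the example; the only library calls are String.PadRight and the LINQ Select and ToArray methods.

```csharp
using System;
using System.Linq;

public static class Padding
{
    // Returns a new array in which every string is right-padded to totalWidth.
    // Strings already at least totalWidth long are returned unchanged.
    public static string[] PadAll(string[] items, int totalWidth, char paddingChar)
    {
        return items
            .Select(s => s.PadRight(totalWidth, paddingChar))
            .ToArray();
    }
}
```

Calling Padding.PadAll(array, 6, '0') leaves the original array untouched and produces a padded copy, which fits the fact that strings are immutable.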

Replace strings in C#

This might be a very basic question. I need to write code that works like a string replace algorithm.
static string stringReplace(string s, string stringOld, string stringNew)
{
string newWord = "";
int oldMax = stringOld.Length;
int index = 0;
for (int i = 0; i < s.Length; i++)
{
if (index != oldMax && s[i] == stringOld[index])
{
if (stringOld[index] < stringNew[index])
{
newWord = newWord + stringNew[index];
index++;
}
else
{
newWord = newWord + stringNew[index];
}
}
else
{
newWord = newWord + s[i];
}
}
return newWord;
}
Since it's 3am, the code above is probably buggy. When the new word is shorter than the old one, it goes wrong. The same happens when it's longer. When the index variable is equal for both stringOld and stringNew, it will do the swap. I think... Please don't post "use string.Replace()"; I have to write that algorithm myself...

I don't know what you're trying to do with your code, but the problem is not a small one.
Think logically about what you are trying to do.
It is a two-step process:
Find the starting index of stringOld in s.
If found, replace stringOld with stringNew.
Step 1:
There are many rather complex (and elegant) efficient string search algorithms; you can search for them online or look at the popular 'Introduction to Algorithms' by Cormen, Leiserson, Rivest & Stein. The naive approach involves two loops and is pretty simple. It is also described in that book (and online).
Step 2:
If a match is found at index i, simply copy characters 0 to i-1 of s to newWord, followed by stringNew and then the rest of the characters in s starting at index i + stringOld.Length.
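The two steps could be sketched as follows. This is only one way to do it: the names FindFrom and ManualReplace are invented, the search is the naive two-loop version, and the sketch repeats the steps so that every non-overlapping occurrence is replaced, which goes slightly beyond the single match described above.

```csharp
using System;
using System.Text;

public static class ManualReplace
{
    // Step 1: naive search. Returns the first index >= start at which
    // stringOld occurs in s, or -1 if there is none.
    public static int FindFrom(string s, string stringOld, int start)
    {
        for (int i = start; i + stringOld.Length <= s.Length; i++)
        {
            int j = 0;
            while (j < stringOld.Length && s[i + j] == stringOld[j])
            {
                j++;
            }
            if (j == stringOld.Length)
            {
                return i; // all characters of stringOld matched at position i
            }
        }
        return -1;
    }

    // Step 2: copy the text before each match, then stringNew,
    // then continue after the match.
    public static string StringReplace(string s, string stringOld, string stringNew)
    {
        if (stringOld.Length == 0)
        {
            return s; // nothing sensible to replace
        }

        var newWord = new StringBuilder();
        int position = 0;
        int match = FindFrom(s, stringOld, position);
        while (match >= 0)
        {
            newWord.Append(s, position, match - position);
            newWord.Append(stringNew);
            position = match + stringOld.Length;
            match = FindFrom(s, stringOld, position);
        }
        newWord.Append(s, position, s.Length - position); // the remaining tail
        return newWord.ToString();
    }
}
```

Because the result is built by appending, it does not matter whether stringNew is shorter or longer than stringOld.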

How to get the string that is most repeated in a list

I have a lot of lists like the following:
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[1]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[2]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[2]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[3]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[3]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[4]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[4]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[5]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[5]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[6]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[7]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[8]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[8]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[9]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[9]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[10]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[10]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[11]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[2]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[12]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[12]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[13]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[13]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[14]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[14]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[15]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[15]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[16]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[16]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[17]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[18]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[18]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[19]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[19]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[20]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[20]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[21]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[2]/div[1]/div[6]/div[1]/div[2]/ul[2]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[22]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[22]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[23]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[23]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[24]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[24]/div[2]/div[4]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[25]/div[2]/h4[1]
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[25]/div[2]/div[4]
And I need to extract the portion that is most repeated in each line, which in this case is
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li
What's the best way to do this?
I'm using C#/.net
thanks!

If I understand your question correctly, what you want is the longest common prefix of all lines. You could obtain it by doing something like this:
void Main()
{
string path = @"D:\tmp\so5670107.txt";
string[] lines = File.ReadAllLines(path);
string prefix = LongestCommonPrefix(lines);
Console.WriteLine(prefix);
}
static string LongestCommonPrefix(string a, string b)
{
int length = 0;
for (int i = 0; i < a.Length && i < b.Length; i++)
{
if (a[i] == b[i])
length++;
else
break;
}
return a.Substring(0, length);
}
static string LongestCommonPrefix(IEnumerable<string> strings)
{
return strings.Aggregate(LongestCommonPrefix);
}
The result is:
/html[1]/body[1]/div[5]/div[1]/div[2]/div[
(the expected result you give in the question seems incorrect, since there are lines that don't match it)
I chose a naive approach for the sake of simplicity, but of course there are more efficient ways of finding the longest common prefix of two strings (using a binary search, for instance).

You could do this with a loop. Assumption is that your list of strings is in a collection called paths:
var countByPath = new Dictionary<string, int>();
foreach (var path in paths)
{
if (!countByPath.ContainsKey(path))
{
countByPath[path] = 1;
}
else
{
countByPath[path]++;
}
}
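The loop above only builds the counts; picking the winner still has to be done. A compact LINQ sketch that does both is shown below, assuming (as above) that the strings are in a collection called paths. The class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PathCounts
{
    // Returns the string that occurs most often in paths, or null if
    // paths is empty. On a tie, the string seen first wins, because
    // GroupBy keeps first-seen order and OrderByDescending is a stable sort.
    public static string MostFrequent(IEnumerable<string> paths)
    {
        return paths
            .GroupBy(p => p)
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key)
            .FirstOrDefault();
    }
}
```

Note that this counts whole strings that are exactly equal; it does not look for a common portion inside different strings.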

The longest substring that is repeated in the list? Assumption is that your list of strings is in a collection called paths:
var currentChoice = "";
foreach (var path in paths)
{
for (int i = path.Length; i > 0; i--)
{
var candidate = path.Substring(0, i);
if (i > currentChoice.Length &&
paths.Count(p => p.StartsWith(candidate)) > 1)
currentChoice = candidate;
else
break;
}
}
Console.WriteLine(currentChoice);
The result is then
/html[1]/body[1]/div[5]/div[1]/div[2]/div[3]/div[1]/div[3]/div[1]/div[2]/div[3]/ul[1]/li[10]
since it is repeated twice

There is already an algorithm for this. I can't remember what it's called, but in case you are interested in a language-independent implementation, it works in the following way:
Read the first line.
Read the second line. If the second line is the same as the first line, then increase the counter by one; otherwise keep the counter at zero.
Carry on reading lines. If three lines are the same (i.e. repeat), then your counter will be 2. If the next line is different from the previous three, then decrease the counter by 1.
E.g.
String1 - Counter: 0
String1 - Counter: 1 (Store String1 in a variable)
String1 - Counter: 2 (Store String1 in same variable)
String2 - Counter: 1 (Still store String1 in variable)
I hope this makes sense. I did this at uni a few years ago. I can't remember the mathematician who came up with the algorithm, but it's fairly old.
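The description resembles the Boyer–Moore majority vote algorithm, although that identification is an assumption on my part. A sketch of that algorithm is below; unlike the walkthrough above, its counter starts at 1 for the first line, and it adds a second pass, because the vote is only reliable when one line makes up more than half of the input. The names are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class MajorityVote
{
    // Returns the line that makes up more than half of lines, or null
    // if no such line exists.
    public static string FindMajority(IReadOnlyList<string> lines)
    {
        string candidate = null;
        int counter = 0;

        // First pass: matching lines raise the counter, others lower it.
        foreach (string line in lines)
        {
            if (counter == 0)
            {
                candidate = line; // adopt a new candidate
                counter = 1;
            }
            else if (line == candidate)
            {
                counter++;
            }
            else
            {
                counter--;
            }
        }

        // Second pass: confirm that the candidate really is a majority.
        int occurrences = 0;
        foreach (string line in lines)
        {
            if (line == candidate)
            {
                occurrences++;
            }
        }
        return occurrences * 2 > lines.Count ? candidate : null;
    }
}
```

This finds a strict majority element only; for the most frequent string in general, counting with a dictionary is the safer choice.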

Testing for repeated characters in a string

I'm doing some work with strings, and I have a scenario where I need to determine if a string (usually a small one < 10 characters) contains repeated characters.
`ABCDE` // does not contain repeats
`AABCD` // does contain repeats, ie A is repeated
I can loop through the string.ToCharArray() and test each character against every other character in the char[], but I feel like I am missing something obvious.... maybe I just need coffee. Can anyone help?
EDIT:
The string will be sorted, so order is not important so ABCDA => AABCD
The frequency of repeats is also important, so I need to know if the repeat is pair or triplet etc.

If the string is sorted, you could just remember each character in turn and check to make sure the next character is never identical to the last character.
Other than that, for strings under ten characters, just testing each character against all the rest is probably as fast or faster than most other things. A bit vector, as suggested by another commenter, may be faster (helps if you have a small set of legal characters.)
Bonus: here's a slick LINQ solution to implement Jon's functionality:
int longestRun =
s.Select((c, i) => s.Substring(i).TakeWhile(x => x == c).Count()).Max();
So, OK, it's not very fast! You got a problem with that?!
:-)

If the string is short, then just looping and testing may well be the simplest and most efficient way. I mean you could create a hash set (in whatever platform you're using) and iterate through the characters, failing if the character is already in the set and adding it to the set otherwise - but that's only likely to provide any benefit when the strings are longer.
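A minimal sketch of that hash set idea in C#, with invented names, could look like this. It relies on HashSet<T>.Add returning false when the element is already present.

```csharp
using System;
using System.Collections.Generic;

public static class RepeatCheck
{
    // Returns true as soon as a character is seen for the second time.
    public static bool HasRepeatedCharacter(string text)
    {
        var seen = new HashSet<char>();
        foreach (char c in text)
        {
            if (!seen.Add(c)) // Add returns false if c was already in the set
            {
                return true;
            }
        }
        return false;
    }
}
```

This version does not need the string to be sorted, but it only answers yes or no and says nothing about how often a character repeats.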
EDIT: Now that we know it's sorted, mquander's answer is the best one IMO. Here's an implementation:
public static bool IsSortedNoRepeats(string text)
{
if (text.Length == 0)
{
return true;
}
char current = text[0];
for (int i=1; i < text.Length; i++)
{
char next = text[i];
if (next <= current)
{
return false;
}
current = next;
}
return true;
}
A shorter alternative if you don't mind repeating the indexer use:
public static bool IsSortedNoRepeats(string text)
{
for (int i=1; i < text.Length; i++)
{
if (text[i] <= text[i-1])
{
return false;
}
}
return true;
}
EDIT: Okay, with the "frequency" side, I'll turn the problem round a bit. I'm still going to assume that the string is sorted, so what we want to know is the length of the longest run. When there are no repeats, the longest run length will be 0 (for an empty string) or 1 (for a non-empty string). Otherwise, it'll be 2 or more.
First a string-specific version:
public static int LongestRun(string text)
{
if (text.Length == 0)
{
return 0;
}
char current = text[0];
int currentRun = 1;
int bestRun = 0;
for (int i=1; i < text.Length; i++)
{
if (current != text[i])
{
bestRun = Math.Max(currentRun, bestRun);
currentRun = 0;
current = text[i];
}
currentRun++;
}
// It's possible that the final run is the best one
return Math.Max(currentRun, bestRun);
}
Now we can also do this as a general extension method on IEnumerable<T>:
public static int LongestRun<T>(this IEnumerable<T> source)
{
bool first = true;
T current = default(T);
int currentRun = 0;
int bestRun = 0;
foreach (T element in source)
{
// A new run starts at the first element or whenever the value changes
if (first || !EqualityComparer<T>.Default.Equals(element, current))
{
first = false;
bestRun = Math.Max(currentRun, bestRun);
currentRun = 0;
current = element;
}
currentRun++;
}
// It's possible that the final run is the best one
return Math.Max(currentRun, bestRun);
}
Then you can call "AABCD".LongestRun() for example.

This will tell you very quickly if a string contains duplicates:
string s = "ABCDEA";
bool containsDups = s.Length != s.Distinct().Count();
It just checks the number of distinct characters against the original length. If they're different, you've got duplicates...
Edit: I guess this doesn't take care of the frequency of dups you noted in your edit though... but some other suggestions here already take care of that, so I won't post the code as I note a number of them already give you a reasonably elegant solution. I particularly like Joe's implementation using LINQ extensions.

Since you're using 3.5, you could do this in one LINQ query:
var results = stringInput
.ToCharArray() // not actually needed, I've left it here to show what's actually happening
.GroupBy(c=>c)
.Where(g=>g.Count()>1)
.Select(g=>new {Letter=g.First(),Count=g.Count()})
;
For each character that appears more than once in the input, this will give you the character and the count of occurrences.

I think the easiest way to achieve that is to use this simple regex:
bool foundMatch = Regex.IsMatch(yourString, @"(\w)\1");
If you need more information about the match (start, length etc.):
string testString = "ABCDE AABCD";
Match match = Regex.Match(testString, @"(\w)\1+?");
if (match.Success)
{
string matchText = match.Value; // AA
int matchIndex = match.Index; // 6
int matchLength = match.Length; // 2
}

How about something like:
string strString = "AA BRA KA DABRA";
var grp = from c in strString.ToCharArray()
group c by c into m
select new { Key = m.Key, Count = m.Count() };
foreach (var item in grp)
{
Console.WriteLine(
string.Format("Character:{0} Appears {1} times",
item.Key.ToString(), item.Count));
}

Update: Now you'd need an array of counters to maintain a count.
Keep a bit array, with one bit representing a unique character. Turn the bit on when you encounter a character, and run over the string once. The mapping between the bit array index and the character set is up to you to decide. Break if you see that a particular bit is on already.
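As a sketch of the bit array idea, the version below assumes the legal characters are the uppercase letters A to Z, so that a single 32-bit integer is enough. Both that character set and the names are assumptions made for the example.

```csharp
using System;

public static class BitVectorCheck
{
    // Returns true if any letter A-Z occurs more than once in text.
    // Bit 0 stands for 'A', bit 1 for 'B', and so on up to bit 25 for 'Z'.
    public static bool HasRepeatedLetter(string text)
    {
        int seen = 0;
        foreach (char c in text)
        {
            if (c < 'A' || c > 'Z')
            {
                throw new ArgumentException("Only the letters A-Z are supported.", nameof(text));
            }

            int bit = 1 << (c - 'A');
            if ((seen & bit) != 0)
            {
                return true; // this letter's bit is already on
            }
            seen |= bit;
        }
        return false;
    }
}
```

To track frequencies rather than mere presence, the integer would be replaced by an array of counters, one per letter.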

/(.).*\1/
(or whatever the equivalent is in your regex library's syntax)
Not the most efficient, since it will probably backtrack to every character in the string and then scan forward again. And I don't usually advocate regular expressions. But if you want brevity...
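In .NET the equivalent might be written as in the sketch below, using Regex.IsMatch with a verbatim string so the backslash needs no escaping. The class and method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexRepeatCheck
{
    // Returns true if some character occurs again later in the string.
    // (.) captures one character, .* skips any characters in between,
    // and \1 requires the captured character to appear again.
    public static bool HasRepeat(string text)
    {
        return Regex.IsMatch(text, @"(.).*\1");
    }
}
```

By default the dot does not match a newline, so this sketch is meant for single-line input such as the short strings in the question.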

I started looking for some info on the net and I got to the following solution.
string input = "aaaaabbcbbbcccddefgg";
char[] chars = input.ToCharArray();
Dictionary<char, int> dictionary = new Dictionary<char,int>();
foreach (char c in chars)
{
if (!dictionary.ContainsKey(c))
{
dictionary[c] = 1; // first time this character is seen
}
else
{
dictionary[c]++;
}
}
foreach (KeyValuePair<char, int> combo in dictionary)
{
if (combo.Value > 1) // If the value for the key is greater than 1, the letter is repeated
{
Console.WriteLine("Letter " + combo.Key + " " + "is repeated " + combo.Value.ToString() + " times");
}
}
I hope it helps, I had a job interview in which the interviewer asked me to solve this and I understand it is a common question.

When there is no order to work on you could use a dictionary to keep the counts:
String input = "AABCD";
var result = new Dictionary<Char, int>(26);
var chars = input.ToCharArray();
foreach (var c in chars)
{
if (!result.ContainsKey(c))
{
result[c] = 0; // initialize the counter in the result
}
result[c]++;
}
foreach (var charCombo in result)
{
Console.WriteLine("{0}: {1}",charCombo.Key, charCombo.Value);
}

The hash solution Jon was describing is probably the best. You could use a HybridDictionary since that works well with small and large data sets. Where the letter is the key and the value is the frequency. (Update the frequency every time the add fails or the HybridDictionary returns true for .Contains(key))
