Extracting character-number pairs from a string - c#

Character-value pairs are received continuously from a serial port in the following format:
h135v48s167
where h has the value 135, v has 48 and s has 167 (the numeric values range from 0 to 2000).
I am using an if-else statement to perform specific actions based on the values of h, v and s.
I tried the regex v(\d+) to get the value of v, which gives v48 as the result.
How can I get the numeric value only?
I have to use the regex 3 times to get the values of h, v and s. Can a single regex statement work?
Is there a better way that does not use regex?
Following is the section of the code where I am using this -
if (port.IsOpen) {
if (port.BytesToRead > 0) {
// port.WriteLine ("p");
string data = port.ReadExisting ();
if (!string.IsNullOrEmpty (data)) {
h = Regex.Match (data, @"h\d+").Value;
v = Regex.Match (data, @"v\d+").Value;
s = Regex.Match (data, @"s\d+").Value;
if (h > 150) {
// do something
}
if (v < 30) {
// do something
}
} else {
// default
}
}
}

Using Regex is fine. To allow for arbitrary letters, use [a-z] (or use [hvs] instead if you only want to catch these letters). You can capture both the letter and the number with parentheses and refer to them through the Match.Groups collection.
var data = "h135v48s167";
foreach (Match m in Regex.Matches(data, @"([a-z])(\d+)"))
{
var variable = m.Groups[1].Value[0];
var value = Convert.ToInt32(m.Groups[2].Value);
Console.WriteLine("{0}={1}", variable, value);
switch (variable)
{
case 'h':
// do something
break;
case 'v':
// do something
break;
case 's':
// do something
break;
}
}
gives:
h=135
v=48
s=167
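For the "without regex" part of the question, a hand-written scan is one option. The following is a minimal sketch, assuming the input consists only of single letters each followed by decimal digits; the names PairParser and Parse are hypothetical and not part of any answer above.

```csharp
using System;
using System.Collections.Generic;

public static class PairParser
{
    // Parses input such as "h135v48s167" into a letter -> number lookup.
    public static Dictionary<char, int> Parse(string data)
    {
        var values = new Dictionary<char, int>();
        int i = 0;
        while (i < data.Length)
        {
            char key = data[i++];
            if (!char.IsLetter(key)) continue; // skip stray characters

            // Advance over the run of digits that follows the letter.
            int start = i;
            while (i < data.Length && char.IsDigit(data[i])) i++;

            // Record the key only when at least one digit followed it.
            if (i > start && int.TryParse(data.Substring(start, i - start), out int value))
                values[key] = value;
        }
        return values;
    }
}
```

The result can then be queried with TryGetValue, for example `if (values.TryGetValue('h', out int h) && h > 150) { /* do something */ }`, which also covers the case where a letter is missing from the received chunk.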

Use Regex :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication11
{
class Program
{
const string FILENAME = @"c:\temp\test.txt";
static void Main(string[] args)
{
string input = "h135v48s167";
string pattern = "h(?'h'[^v]+)v(?'v'[^s]+)s(?'s'.*)";
Match match = Regex.Match(input,pattern);
string h = match.Groups["h"].Value;
string v = match.Groups["v"].Value;
string s = match.Groups["s"].Value;
}
}
}

Related

Reverse complement of a sequence

The problem is below (I need to write a ReverseComplement method in C#):
The reverse complement of a sequence is formed by exchanging all of its nucleobases with their base complements, and then reversing the resulting sequence. The reverse complement of a DNA sequence is formed by exchanging all instances of:
A with T
T with A
G with C
C with G
Then reversing the resulting sequence.
For example:
Given the DNA sequence AAGCT the reverse complement is AGCTT
This method, ReverseComplement(), must take the following parameter:
Reference to a DNA sequence
This method should return void and mutate the referenced DNA sequence to its reverse complement.
Currently, here is my code,
string result = z.Replace('A', 'T').Replace('T', 'A').Replace('G', 'C').Replace('C', 'G');
string before = (result);
return before;
I'm stuck and wondering how I do this? Any help would be greatly appreciated. When I run this I get AAGGA and not AGCTT
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp8
{
class Program
{
static void Main(string[] args)
{
var dict = new Dictionary<char, char>()
{
['A'] = 'T',
['T'] = 'A',
['G'] = 'C',
['C'] = 'G',
};
var input = "AAGCT";
var output = string.Concat(input.Select(c => dict[c]).Reverse()); // AGCTT
Console.WriteLine(input);
Console.WriteLine(output);
}
}
}
When I run this I get AAGGA and not AGCTT
Because you are looking at it as a single replace, not multiple replaces:
z.Replace('A', 'T').Replace('T', 'A').Replace('G', 'C').Replace('C', 'G');
AAGCT
Replace('A', 'T')
TTGCT
Replace('T', 'A')
AAGCA
Replace('G', 'C')
AACCA
.Replace('C', 'G')
AAGGA
Instead what I would recommend is intermediary replace:
var z = "AAGCT";
var chars = z.Replace('A', '1')
.Replace('T', 'A')
.Replace('1', 'T')
.Replace('G', '2')
.Replace('C', 'G')
.Replace('2', 'C')
.Reverse()
.ToArray();
var result = new string(chars);
Console.WriteLine(result);
Yields:
AGCTT
Now if you're doing this millions of times, you may want to consider using a StringBuilder instead.
Recommended reading: The Sad Tragedy of Micro-Optimization Theater
A little trick to Replace-version:
using System;
using System.Linq;
namespace DNA
{
public class Program
{
public static void Main()
{
var dna = "AAGCT";
var reversed = new String(dna
.ToLower()
.Replace('a', 'T')
.Replace('t', 'A')
.Replace('g', 'C')
.Replace('c', 'G')
.Reverse()
.ToArray());
Console.WriteLine(reversed);
}
}
}
Or good old StringBuilder:
using System;
using System.Text;
namespace DNA
{
public class Program
{
public static void Main()
{
var dna = "AAGCT";
var sb = new StringBuilder(dna.Length);
for(var i = dna.Length - 1; i > -1; i--)
{
switch(dna[i])
{
case 'A':
sb.Append('T');
break;
case 'T':
sb.Append('A');
break;
case 'G':
sb.Append('C');
break;
case 'C':
sb.Append('G');
break;
}
}
var reversed = sb.ToString();
Console.WriteLine(reversed);
}
}
}
Instead of replacing each char, it's easier to implement this using LINQ:
void Translate(ref string dna)
{
var map = new string[] {"AGTC", "TCAG"};
dna = string.Join("", dna.Select(c => map[1][map[0].IndexOf(c)]).Reverse());
}
You start with a string array that represents the mappings. Then you select the mapped char for each char of the string, reverse the IEnumerable<char> you get from the Select, and use string.Join to convert it back to a string.
The code in the question first converts A to T and then converts T to A, so everything that was an A ends up as an A again, and everything that was a T ends up as an A as well (the same goes for G and C).
And here is a non-LINQ solution based on a for loop and a StringBuilder (the translation logic is the same):
void Translate(ref string dna)
{
var map = new string[] {"AGTC", "TCAG"};
var sb = new StringBuilder(dna.Length);
for(int i = dna.Length-1; i > -1; i--)
{
sb.Append(map[1][map[0].IndexOf(dna[i])]);
}
dna = sb.ToString();
}
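One thing to be aware of with the map-based versions above: IndexOf returns -1 for any character outside "AGTC", so indexing the second string with that result fails with an out-of-range error that does not say which character was wrong. The following is a minimal sketch of a void method that mutates the referenced sequence, as the task requires, and reports unexpected characters explicitly. The class name Dna and the helper Complement are hypothetical, and rejecting lowercase or ambiguous bases is an assumption rather than something the task states.

```csharp
using System;

public static class Dna
{
    // Replaces the referenced sequence with its reverse complement.
    public static void ReverseComplement(ref string dna)
    {
        var result = new char[dna.Length];
        for (int i = 0; i < dna.Length; i++)
        {
            // Write the complement of character i at the mirrored position,
            // which complements and reverses in a single pass.
            result[dna.Length - 1 - i] = Complement(dna[i]);
        }
        dna = new string(result);
    }

    private static char Complement(char nucleobase)
    {
        switch (nucleobase)
        {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'G': return 'C';
            case 'C': return 'G';
            default:
                throw new ArgumentException("Unexpected nucleobase: " + nucleobase);
        }
    }
}
```

Because the new string is assigned only after the loop finishes, the caller's sequence is left untouched if an invalid character is found.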

How can I remove the last two numbers from a number that's a double before making it a string?

I have this number in C#
Double a = 1.2345678;
What I would like is for it to look like this after it's made into a string:
1.23456
First, convert it to a string, for example into a variable named S.
Then call Remove and keep the result, since strings are immutable and Remove returns a new string:
S = S.Remove(S.Length - 2);
You can achieve that in many ways:
Code:
using System;
using System.Linq;
using System.Text;
namespace RemovingLastTwoNumber
{
class Program
{
static void Main(string[] args)
{
// Method 1: drop the last two characters of the string form.
// Note: ToString() is culture-sensitive, so the decimal separator may vary.
double number = 1.2345678;
string numberInStringFormat = number.ToString();
string TargetNumber = numberInStringFormat.Substring(0, numberInStringFormat.Length - 2);
Console.WriteLine(number);
Console.WriteLine(TargetNumber);
// Method 2: round to 5 decimal places. This rounds rather than truncates,
// so 1.2345678 becomes 1.23457, not 1.23456.
string _TargetNumber = Math.Round(number, 5).ToString();
Console.WriteLine(_TargetNumber);
// Method 3: keep only the first 7 characters of the string form ("1.23456").
var characters = number.ToString().ToArray();
var __Characters = characters.Take(7);
StringBuilder __targetNumber = new StringBuilder();
foreach (var character in __Characters)
{
__targetNumber.Append(character);
}
Console.WriteLine(__targetNumber);
}
}
}
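The string-based methods above depend on how many digits the number happens to have. If the actual goal is "keep 5 decimal places without rounding", truncating the value numerically is another option. The following is a minimal sketch, assuming the value fits in a decimal (the conversion throws for very large doubles); the names NumberFormatting and TruncateToString are hypothetical.

```csharp
using System;
using System.Globalization;

public static class NumberFormatting
{
    // Cuts the value off after the given number of decimal places (no rounding)
    // and formats it with '.' as the decimal separator.
    public static string TruncateToString(double value, int decimals)
    {
        // Build 10^decimals as a decimal to avoid binary floating-point artifacts.
        decimal factor = 1m;
        for (int i = 0; i < decimals; i++) factor *= 10m;

        decimal truncated = Math.Truncate((decimal)value * factor) / factor;
        return truncated.ToString(CultureInfo.InvariantCulture);
    }
}
```

For the number in the question, `NumberFormatting.TruncateToString(1.2345678, 5)` produces "1.23456", whereas Math.Round(number, 5) would give 1.23457.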

Split string into multiple alpha and numeric segments

I have a string like "ABCD232ERE44RR". How can I split it into separate segments of letters and numbers? I need:
Segment1: ABCD
Segment2: 232
Segment3: ERE
Segment4: 44
There could be any number of segments. I am thinking of using Regex but don't understand how to write it properly.
You can do it like this:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
var substrings = Regex.Split("ABCD232ERE44RR", @"[^A-Z0-9]+|(?<=[A-Z])(?=[0-9])|(?<=[0-9])(?=[A-Z])");
Console.WriteLine(string.Join(",",substrings));
}
}
Output : ABCD,232,ERE,44,RR
I suggest thinking of this as finding matches to a target pattern rather than splitting into the parts you want. Splitting gives significance to the delimiters whereas matching gives significance to the tokens.
You can use Regex.Matches:
Searches the specified input string for all occurrences of a specified regular expression.
var matches = Regex.Matches("ABCD232ERE44RR", "[A-Z]+|[0-9]+");
foreach (Match match in matches) {
Console.WriteLine("Found '{0}' at position {1}", match.Value, match.Index);
}
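Building on the matching approach, named groups can also tell you which kind of segment each match is. The following is a minimal sketch; the names Segmenter and Segments and the group names letters and digits are hypothetical, and accepting lowercase letters is an assumption beyond the original pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Segmenter
{
    // Returns each run of letters or digits, flagged as numeric or not.
    public static List<(string Text, bool IsNumeric)> Segments(string input)
    {
        var result = new List<(string Text, bool IsNumeric)>();
        foreach (Match m in Regex.Matches(input, @"(?<letters>[A-Za-z]+)|(?<digits>[0-9]+)"))
        {
            // Exactly one of the two named groups succeeds for each match.
            result.Add((m.Value, m.Groups["digits"].Success));
        }
        return result;
    }
}
```

For "ABCD232ERE44RR" this yields five segments: ABCD, 232, ERE, 44 and RR, with the flag set for 232 and 44.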
Try something like:
([A-Z]+(\d)*)+
If you decide not to use regex, you can always go the manual route.
const string str = "ABCD232ERE44RR1SGGSG3333GSDGSDG";
var result = new List<StringBuilder>
{
new StringBuilder()
};
char last = str[0];
result.Last().Append(last);
bool isLastNum = Char.IsDigit(last);
for (int i = 1; i < str.Length; i++)
{
char ch = str[i];
if (!((Char.IsDigit(ch) && isLastNum) || (Char.IsLetter(ch) && !isLastNum)))
{
result.Add(new StringBuilder());
}
result.Last().Append(ch);
last = ch;
isLastNum = Char.IsDigit(ch);
}

How do I replace specific characters from a c# string?

If I have a string along the lines of: "user:jim;id:23;group:49st;"
how can I replace the group code (49st) with something else, so that it shows: "user:jim;id:23;group:76pm;"
Sorry if the question is easy, but I haven't found a specific answer, just cases different from mine.
You can use the index of "group" like this
string s = "user:jim;id:23;group:49st;";
string newS = s.Substring(0,s.IndexOf("group:") + 6);
string restOfS = s.IndexOf(";",s.IndexOf("group:") + 6) + 1 == s.Length
? ""
: s.Substring(s.IndexOf(";",s.IndexOf("group:") + 6) + 1);
newS += "76pm;";
s = newS + restOfS;
The assignment to restOfS uses the conditional (ternary) operator, condition ? valueIfTrue : valueIfFalse, which is essentially an if/else written as a single expression.
Alternatively, if you know what text is there already and what it should be replaced with, you can just use Replace:
s = s.Replace("49st","76pm");
As an added precaution, if you are not always going to have this "group:" part in the string, to avoid errors put this inside an if which checks first
if(s.Contains("group:"))
{
//Code
}
Find the match using a regex and replace it with the new value in the original string, as shown below:
string str = "user:jim;id:23;group:49st;";
// [^;]* stops at the first ';', whereas .* would greedily match up to the last ';'.
var match = Regex.Match(str, "group:[^;]*;").ToString();
var newGroup = "group:76pm;";
str = str.Replace(match, newGroup);
This solution should work no matter where the group appears in the string:
string input = "user:jim;id:23;group:49st;";
string newGroup = "76pm";
string output = Regex.Replace(input, "(group:)([^;]*)", "${1}"+newGroup);
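The same idea can be wrapped in a small helper that works for any key. The following is a minimal sketch, assuming fields are separated by ';' and each has the form key:value; the names FieldEditor and ReplaceField are hypothetical. It uses a match evaluator rather than a replacement string so that a '$' in the new value is not interpreted as a substitution, and a lookbehind so that "group" does not match inside a longer key such as "subgroup".

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldEditor
{
    // Replaces the value of the given key in a "key:value;key:value;" string.
    public static string ReplaceField(string input, string key, string newValue)
    {
        // (?<=^|;) requires the key to start the string or follow a ';'.
        // [^;]* matches the old value up to, but not including, the next ';'.
        string pattern = "(?<=^|;)" + Regex.Escape(key) + ":[^;]*";
        return Regex.Replace(input, pattern, m => key + ":" + newValue);
    }
}
```

For example, `FieldEditor.ReplaceField("user:jim;id:23;group:49st;", "group", "76pm")` returns "user:jim;id:23;group:76pm;". If the key is absent, the input comes back unchanged.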
Here is a very generic method for splitting your input, changing items, then rejoining items to a string. It is not meant for single replacement in your example, but is meant to show how to split and join items in string.
I used Regex to split the items and then put results into a dictionary.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string pattern = "(?'name'[^:]+):(?'value'.*)";
string input = "user:jim;id:23;group:49st";
Dictionary<string,string> dict = input.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries).Select(x => new
{
name = Regex.Match(x, pattern).Groups["name"].Value,
value = Regex.Match(x, pattern).Groups["value"].Value
}).GroupBy(x => x.name, y => y.value)
.ToDictionary(x => x.Key, y => y.FirstOrDefault());
dict["group"] = "76pm";
string output = string.Join(";",dict.AsEnumerable().Select(x => string.Join(":", new string[] {x.Key, x.Value})).ToArray());
}
}
}
That is just one way to do it. I hope it will help you.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace stringi
{
class Program
{
static void Main(string[] args)
{
//this is your original string
string s = "user:jim;id:23;group:49st";
//string with replace characters
string s2 = "76pm";
//convert string to char array so you can rewrite character
char[] c = s.ToCharArray(0, s.Length);
//assign characters to the right place
c[21] = s2[0];
c[22] = s2[1];
c[23] = s2[2];
c[24] = s2[3];
//this is your new string
string new_s = new string(c);
//output your new string
Console.WriteLine(new_s);
Console.ReadLine();
}
}
}
string a = "user:jim;id:23;group:49st";
string b = a.Replace("49st", "76pm");
Console.Write(b);

C# Split string into array based on prior character

I need to take a string and split it into an array wherever the type of a character does not match the one preceding it.
So if you have "asd fds 1.4#3" this would split into array as follows
stringArray[0] = "asd";
stringArray[1] = " ";
stringArray[2] = "fds";
stringArray[3] = " ";
stringArray[4] = "1";
stringArray[5] = ".";
stringArray[6] = "4";
stringArray[7] = "#";
stringArray[8] = "3";
Any recommendations on the best way to achieve this? Of course I could create a loop based on .ToCharArray(), but I was looking for a better way.
Thank you
Using a combination of regular expressions and LINQ, you can do the following.
using System.Text.RegularExpressions;
using System.Linq;
var str="asd fds 1.4#3";
var regex=new Regex("([A-Za-z]+)|([0-9]+)|([.#]+)|(.+?)");
var result=regex.Matches(str).OfType<Match>().Select(x=>x.Value).ToArray();
Add additional capture groups to capture other differences. The last capture, (.+?), is a non-greedy match for everything else, so every character that falls into it is treated as a separate item (including the same character twice in a row).
Update - new revision of the regex:
var regex=new Regex(@"(?:[A-Za-z]+)|(?:[0-9]+)|(?:[#.]+)|(?:(?:(.)\1*)+?)");
This now uses non-capturing groups so that \1 can be used in the final alternative. This means that repeated occurrences of the same character are grouped together if they fall into the catch-all group.
For example, before this change the string "asd  fsd" (two spaces) would produce 4 strings, because each space was treated as a separate item; now the result is 3 strings, because the 2 adjacent spaces are combined.
Use regex:
var mc = Regex.Matches("asd fds 1.4#3", @"([a-zA-Z]+)|.");
var res = new string[mc.Count];
for (var i = 0; i < mc.Count; i++)
{
res[i] = mc[i].Value;
}
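This program produces exactly the output you want, but I am not sure whether it is generic enough for your goal.
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;
class Program
{
private static void Main(string[] args)
{
var splited = Split("asd fds 1.4#3").ToArray();
}
public static IEnumerable<string> Split(string text)
{
StringBuilder result = new StringBuilder();
foreach (var ch in text)
{
if (char.IsLetter(ch))
{
result.Append(ch);
}
else
{
// Emit the pending run of letters, if any, before the separator.
if (result.Length > 0)
{
yield return result.ToString();
result.Clear();
}
yield return ch.ToString(CultureInfo.InvariantCulture);
}
}
// Emit any letters left over at the end of the input.
if (result.Length > 0)
{
yield return result.ToString();
}
}
}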
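A more general take on the loop approach is to start a new segment whenever the character's class changes. The following is a minimal sketch, assuming four classes (letter, digit, whitespace, everything else); the names RunSplitter and Classify are hypothetical, and treating all other symbols as one class is an assumption, so adjacent symbols such as ".#" would be grouped together.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RunSplitter
{
    // Maps a character to a coarse class: 0 letter, 1 digit, 2 whitespace, 3 other.
    private static int Classify(char c)
    {
        if (char.IsLetter(c)) return 0;
        if (char.IsDigit(c)) return 1;
        if (char.IsWhiteSpace(c)) return 2;
        return 3;
    }

    // Splits text into runs of consecutive characters that share a class.
    public static List<string> Split(string text)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        foreach (char c in text)
        {
            // Close the current run when the class differs from the previous character's.
            if (current.Length > 0 && Classify(c) != Classify(current[current.Length - 1]))
            {
                parts.Add(current.ToString());
                current.Clear();
            }
            current.Append(c);
        }
        if (current.Length > 0) parts.Add(current.ToString());
        return parts;
    }
}
```

Unlike the letter-only version, this keeps multi-digit numbers such as "12" together as one element.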
