Remove spaces in the string array - c#

I have a text file that contains numbers in this format :
84 152 100
86 149 101
83 149 99
86 142 101
How can I remove the extra spaces and bring it into this shape:
84 152 100
86 149 101
83 149 99
86 142 101
This is what I have tried so far :
string path = Directory.GetCurrentDirectory();
string[] lines = System.IO.File.ReadAllLines(@"data_1_2.txt");
string[] line = lines[0].Trim().Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
But the result of this input is :
84
152
100

Use a bit of LINQ magic:
lines = lines.Select(l => String.Join(" ", l.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))).ToArray();
It will split each line using space as a separator, remove empty entries and join them back using space as a separator again.
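As a small worked example of the split-and-join approach, here is a sketch that runs it over two hard-coded lines. The `Normalize` helper name and the sample strings are made up for illustration; in the real program the lines would come from `File.ReadAllLines`.

```csharp
using System;
using System.Linq;

static class WhitespaceDemo
{
    // Split on spaces, drop the empty entries produced by repeated spaces,
    // then join the remaining tokens with exactly one space.
    public static string Normalize(string line) =>
        string.Join(" ", line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));

    static void Main()
    {
        // Hypothetical input with irregular spacing, standing in for the file contents.
        string[] lines = { "84   152  100", "  86 149    101 " };

        foreach (string line in lines.Select(Normalize))
            Console.WriteLine(line);
        // prints:
        // 84 152 100
        // 86 149 101
    }
}
```

Leading and trailing spaces disappear as well, because the empty entries at both ends are removed before joining.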

You can use a simple regular expression:
lines = lines.Select(line => Regex.Replace(line, @"\s+", " ")).ToArray();

Related

How to convert String string dataReturned = "62 07 00 00 04 05 00 01 A0" to List<byte> in C#

I am new to C# and I am trying to convert a string into a List<byte>. I was only able to do so when my dataReturned variable below doesn't contain a letter. My code is below:
List<byte> response = new List<byte>();
string dataReturned = "62 07 00 00 04 05 00 01 A0";
response = dataReturned.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Select(s => Byte.Parse(s)).ToList();
Once I have a letter, such as A0, in the dataReturned variable above, I keep getting an error on the response line because I am calling Byte.Parse on A0. Is there an equivalent conversion that handles both digits and letters? Thanks
Try this
response =
dataReturned
.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.Select(s => Byte.Parse(s, NumberStyles.HexNumber))
.ToList();
Here we added NumberStyles.HexNumber (from the System.Globalization namespace), which allows you to parse hex strings.
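A self-contained sketch of the same idea, with the `using` directives it needs. The `ParseHexBytes` helper name is invented for this example, and passing `CultureInfo.InvariantCulture` is an extra precaution that the answer above does not include.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

static class HexDemo
{
    // Parse space-separated two-digit hex tokens into bytes.
    public static List<byte> ParseHexBytes(string text) =>
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => byte.Parse(s, NumberStyles.HexNumber, CultureInfo.InvariantCulture))
            .ToList();

    static void Main()
    {
        List<byte> response = ParseHexBytes("62 07 00 00 04 05 00 01 A0");
        Console.WriteLine(string.Join(", ", response));
        // prints: 98, 7, 0, 0, 4, 5, 0, 1, 160
    }
}
```

Note that every token is now read as hexadecimal, so "62" becomes 98 rather than 62. That is presumably what is wanted for a byte dump like this one, but it changes the result for tokens that previously parsed as decimal.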

How to find strings between two strings in c#

I'm getting a string with some patterns, like:
A 11 A 222222 B 333 A 44444 B 55 A 66666 B
How can I get all the strings between A and B, taking the smallest enclosing span each time?
For example, "A 11 A 222222 B" should result in " 222222 "
And the first example should result in:
222222
333
44444
55
66666
We can try searching for all regex matches in your input string which are situated between A and B, or vice-versa. Here is a regex pattern which uses lookarounds to do this:
(?<=\bA )\d+(?= B\b)|(?<=\bB )\d+(?= A\b)
Sample script:
string input = "A 11 A 222222 B 333 A 44444 B 55 A 66666 B";
var vals = Regex.Matches(input, @"(?<=\bA )\d+(?= B\b)|(?<=\bB )\d+(?= A\b)")
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
foreach (string val in vals)
{
Console.WriteLine(val);
}
This prints:
222222
333
44444
55
66666

Regex if condition c#

Text from txt file:
10 25
32 44
56 88
102 127
135 145
...
For the first line, prepend 0; for every other line, prepend the last number of the previous line. Is it possible to do this with a regex alone, or do I need to loop through the lines after the regex parse?
0 10 25
25 32 44
44 56 88
88 102 127
127 135 145
(?<Middle>\d+)\s(?<End>\d+) //(?<Start>...)
I would advise against using regex for readability reasons but this will work:
var input = ReadFromFile();
var regex = @"(?<num>\d*)[\n\r]+";
var replace = "${num}\n${num} ";
var output = Regex.Replace(input, regex, replace);
That will do everything apart from the first 0.
Note that regex is not a good fit for a task like this. It can work for small input strings; for larger ones, it is better to write some more logic and parse the text line by line.
So, more from academic interest, here is a regex solution showing how to replace with different replacement patterns based on whether the line matched is first or not:
var pat = @"(?m)(?:(\A)|^(?!\A))(.*\b\s+(\d+)\r?\n)";
var s = "10 25\n32 44\n56 88\n102 127\n135 14510 25\n32 44\n56 88\n102 127\n135 145";
var res = Regex.Replace(s, pat, m => m.Groups[1].Success ?
$"0 {m.Groups[2].Value}{m.Groups[3].Value} " : $"{m.Groups[2].Value}{m.Groups[3].Value} ");
Result of the C# demo:
0 10 25
25 32 44
44 56 88
88 102 127
127 135 14510 25
25 32 44
44 56 88
88 102 127
127 135 145
Note the \n line breaks are hardcoded, but it is still just an illustration of regex capabilities.
Pattern details
(?m) - an inline RegexOptions.Multiline modifier
(?:(\A)|^(?!\A)) - a non-capturing group matching either
(\A) - start of string capturing it to Group 1
| - or
^(?!\A) - start of a line (but not string due to the (?!\A) negative lookahead)
(.*\b\s+(\d+)\r?\n) - Group 2:
.*\b - 0+ chars other than newline up to the last word boundary on a line followed with...
\s+ - 1+ whitespaces (may be replaced with [\p{Zs}\t]+ to only match horizontal whitespaces)
(\d+) - Group 3: one or more digits
\r?\n - a CRLF or LF line break.
The replacement logic is inside the match evaluator: if Group 1 matched (m.Groups[1].Success ?) replace with 0 and Group 2 + Group 3 values + space. Else, replace with Group 2 + Group 3 + space.
With C#.
var lines = File.ReadLines(fileName);
var st = new StringBuilder(); // or a StreamWriter writing directly to disk, etc.
var last = "0";
foreach (var line in lines)
{
st.AppendLine(last + " " + line );
last = line.Split().LastOrDefault();
}
var lines2 = st.ToString();

How do I cut a line using C#?

How do I cut the following line?
19 02 2000 01:53:36 System Line [12345] ----> filename.txt
I need output to be
12345
Could you help me?
var result = myString.Split(new char[] { '[', ']' } )[1];
should do it.
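The one-liner throws an IndexOutOfRangeException when the line contains no bracket at all, because Split then returns a single-element array. Below is a more defensive sketch using IndexOf; the `TryGetBracketed` helper is a hypothetical name introduced here, not part of the answer above.

```csharp
using System;

static class BracketDemo
{
    // Tries to return the text between the first '[' and the next ']'.
    // Returns false (and an empty value) if either bracket is missing.
    public static bool TryGetBracketed(string s, out string value)
    {
        value = "";
        int open = s.IndexOf('[');
        if (open < 0) return false;

        int close = s.IndexOf(']', open + 1);
        if (close < 0) return false;

        value = s.Substring(open + 1, close - open - 1);
        return true;
    }

    static void Main()
    {
        string line = "19 02 2000 01:53:36 System Line [12345] ----> filename.txt";

        if (TryGetBracketed(line, out string id))
            Console.WriteLine(id);              // prints: 12345
        else
            Console.WriteLine("no bracketed value found");
    }
}
```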

How to ignore groups

I have the following node(s), which I read with a StreamReader. There could be many of these. I am only interested in retrieving a few fields from each node, for instance REPLICATE_ID, ASSAY_NUMBER, and a few date fields.
The order of the fields within the node can vary, and sometimes new fields are present as well, but the fields I want to extract will not change.
So far, the regex I have matches the entire node, so it breaks whenever the node has new fields or the order is different. Is it possible to match only the groups I am interested in?
TEST_REPLICATE
{
REPLICATE_ID 453w
ASSAY_NUMBER 334
ASSAY_VERSION 4
ASSAY_STATUS test
DILUTION_ID 1
SAMPLE_ID "NC_dede"
SAMPLE_TYPE Specimen
TEST_ORDER_DATE 05.23.2012
TEST_ORDER_TIME 04:25:07
TEST_INITIATION_DATE 05.23.2012
TEST_INITIATION_TIME 05:19:43
TEST_COMPLETION_DATE 05.23.2012
TEST_COMPLETION_TIME 05:48:01
ASSAY_CALIBRATION_DATE NA
ASSAY_CALIBRATION_TIME NA
TRACK 1
PROCESSING_LANE 1
MODULE_SN "EP004"
LOAD_LIST_NAME C:\BwedwQwedw_SCC\edwLoadlist2RACKSB.json
OPERATOR_ID "Q_dwe"
DARK_SUBREADS 16 23 19 20 16 18 21 16 17 18 19 19 20 22 19 20 19 20 18 20 17 20 21 16 19 23 20 22 19 20
SIGNAL_SUBREADS 18 17 20 21 42 61 41 31 30 30 26 26 25 22 24
DARK_COUNT 577
SIGNAL_COUNT 781
CORRECTED_COUNT 204
STD_BAK 1.95965044971226
AVG_BAK 19.2333333333333
STD_FOR 8.67212471810898
AVG_FOR 26.0333333333333
SHAPE NA
EXCEPTION_STRING TestException - Parameters:Unable to process test, background read failure.
RESULT NA
REPORTED_RESULT NA
REPORTED_RESULT_UNITS NA
REAGENT_MASTER_LOT 13600LI02
REAGENT_SERIAL_NUMBER 25022
RESULT_FLAGS RUO
RESULT_INTERPRETATION NA
DILUTION_PROTOCOL UNDILUTED
RESULT_COMMENT frer 1 LANE A
DATA_MANAGEMENT_FIELD_1 NA
DATA_MANAGEMENT_FIELD_2 NA
DATA_MANAGEMENT_FIELD_3 NA
DATA_MANAGEMENT_FIELD_4 NA
}
string pat = @"TEST_REPLICATE\s*{\s*REPLICATE_ID\s*([^}]*?)\s+ASSAY_NUMBER\s*([^}]*?)\s+ASSAY_VERSION\s*([^}]*?)\s+DILUTION_ID\s*([^}]*?)\s+SAMPLE_ID\s*([^}]*?)\s+SAMPLE_TYPE\s*([^}]*?)\s+TEST_ORDER_DATE\s*([^}]*?)\s+TEST_ORDER_TIME\s*([^}]*?)\s+TEST_INITIATION_DATE\s*([^}]*?)\s+TEST_INITIATION_TIME\s*([^}]*?)\s+TEST_COMPLETION_DATE\s*([^}]*?)\s+TEST_COMPLETION_TIME\s*([^}]*?)\s+ASSAY_CALIBRATION_DATE\s*([^}]*?)\s+ASSAY_CALIBRATION_TIME\s*([^}]*?)\s+TRACK\s*([^}]*?)\s+PROCESSING_LANE\s*([^}]*?)\s+MODULE_SN\s*([^}]*?)\s+LOAD_LIST_NAME\s*([^}]*?)\s+OPERATOR_ID\s*([^}]*?)\s+DARK_SUBREADS\s*([^}]*?)\s+SIGNAL_SUBREADS\s*([^}]*?)\s+DARK_COUNT\s*([^}]*?)\s+SIGNAL_COUNT\s*([^}]*?)\s+CORRECTED_COUNT\s*([^}]*?)\s+STD_BAK\s*([^}]*?)\s+AVG_BAK\s*([^}]*?)\s+STD_FOR\s*([^}]*?)\s+AVG_FOR\s*([^}]*?)\s+SHAPE\s*([^}]*?)\s+EXCEPTION_STRING\s*([^}]*?)\s+RESULT\s*([^}]*?)\s+REPORTED_RESULT\s*([^}]*?)\s+REPORTED_RESULT_UNITS\s*([^}]*?)\s+REAGENT_MASTER_LOT\s*([^}]*?)\s+REAGENT_SERIAL_NUMBER\s*([^}]*?)\s+RESULT_FLAGS\s*([^}]*?)\s+RESULT_INTERPRETATION\s*([^}]*?)\s+DILUTION_PROTOCOL\s*([^}]*?)\s+RESULT_COMMENT\s*([^}]*?)\s+DATA_MANAGEMENT_FIELD_1\s*([^}]*?)\s+DATA_MANAGEMENT_FIELD_2\s*([^}]*?)\s+DATA_MANAGEMENT_FIELD_3\s*([^}]*?)\s+DATA_MANAGEMENT_FIELD_4\s*([^}]*?)\s*}";
Yeah, you probably should just parse the record for key-value pairs.
Here is a code sample if you want to extract key-value pairs from a record.
When a match is found, the keys you're looking for can be tested against those in the capture collection.
You can also alter the regex as to how the begin/end of record are allowed.
But don't alter the core, it protects from catastrophic backtracking.
Regex alternatives:
# Record starts on a new line, closing brace can be anywhere
^ [^\S\n]*TEST_REPLICATE\s*\{
(?>
\s* (?<key> [^\s{}]+ ) [^\S\n]* (?<val> [^\n{}]*? ) [^\S\n]* (?:$|(?=\}))
)*
\s*\}
# Record starts anywhere, closing brace is on a new line
TEST_REPLICATE\s*\{
(?>
\s* (?<key> [^\s{}]+ ) [^\S\n]* (?<val> [^\n{}]*? ) [^\S\n]* $
)*
\s*\}
C# test code:
Regex testRx = new Regex(
@"
^ [^\S\n]* TEST_REPLICATE # Record, starts on a newline
\s* # Optional whitespaces (trims blank lines)
\{ # Record opening brace
(?> # Atomic group
\s* # Optional many whitespace (trims blank lines)
# Line in record to be recorded
(?<key> [^\s{}]+) # required <key>, not whitespaces nor braces
[^\S\n]* # trim whitespaces (don't include newline)
(?<val> [^\n{}]*?) # optional <value>, not newlines nor braces
[^\S\n]* # trim whitespaces (don't include newline)
(?:$|(?=\})) # End of line, or next char is a closing brace
)* # End atomic group, do many times (optional)
\s* # Optional whitespaces (trims blank lines)
\} # Record closing brace
", RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);
string testdata = @"
TEST_REPLICATE{}
TEST_REPLICATE{
REPLICATE_ID 1asdf985
ASSAY_NUMBER 123sdg
ASSAY_VERSION 4sdgn
ASSAY_TYPE unknown
}
TEST_REPLICATE
{
REPLICATE_ID
ASSAY_NUMBER 123
ASSAY_VERSION 4
ASSAY_TYPE unknown
DILUTION_ID 1
SAMPLE_ID ""NC_HIV1""
SAMPLE_TYPE Specimen
TEST_ORDER_DATE 05.21.2012
TEST_ORDER_TIME 03:44:01
TEST_INITIATION_DATE 05.21.2012
TEST_INITIATION_TIME 04:03:36
TEST_COMPLETION_DATE 05.21.2012
TEST_COMPLETION_TIME 04:29:32
ASSAY_CALIBRATION_DATE NA
ASSAY_CALIBRATION_TIME NA
TRACK 1
PROCESSING_LANE 1
MODULE_SN ""EP004""
LOAD_LIST_NAME C:\sdddd
OPERATOR_ID ""Q_SI""
DARK_SUBREADS NA
SIGNAL_SUBREADS NA
DARK_COUNT NA
SIGNAL_COUNT NA
CORRECTED_COUNT NA
STD_BAK NA
AVG_BAK NA
STD_FOR NA
AVG_FOR NA
SHAPE NA
EXCEPTION_STRING Test execution was stopped.
RESULT NA
REPORTED_RESULT NA
REPORTED_RESULT_UNITS NA
REAGENT_MASTER_LOT 2345
REAGENT_SERIAL_NUMBER 25022
RESULT_FLAGS NA
RESULT_INTERPRETATION NA
DILUTION_PROTOCOL UNDILUTED
RESULT_COMMENT HIV NC 1
DATA_MANAGEMENT_FIELD_1 NA
DATA_MANAGEMENT_FIELD_2 NA
DATA_MANAGEMENT_FIELD_3 NA
DATA_MANAGEMENT_FIELD_4 NA
}
";
Match m_testrec = testRx.Match(testdata);
// Each match contains a single record
//
while (m_testrec.Success)
{
Console.WriteLine("New Record\n------------------------");
CaptureCollection cc_key = m_testrec.Groups["key"].Captures;
CaptureCollection cc_val = m_testrec.Groups["val"].Captures;
for (int i = 0; i < cc_key.Count; i++)
{
Console.WriteLine("'{0}' = '{1}'", cc_key[i].Value, cc_val[i].Value);
//
// Test specific keys here
// if (cc_key[i].Value == "REAGENT_SERIAL_NUMBER") ...
}
Console.WriteLine("------------------------");
// Get next record
m_testrec = m_testrec.NextMatch();
}
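To keep only the fields of interest, the key test can be done against a set of wanted names while walking the captures. The following is a minimal sketch under stated assumptions: it reuses the first regex alternative shown earlier, while the `Extract` helper, the `wanted` set, and the shortened sample record are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RecordDemo
{
    // Same shape as the record regex above: one match per record,
    // one <key>/<val> capture pair per line inside the braces.
    static readonly Regex RecordRx = new Regex(@"
        ^ [^\S\n]* TEST_REPLICATE \s* \{
        (?>
            \s* (?<key> [^\s{}]+ ) [^\S\n]* (?<val> [^\n{}]*? ) [^\S\n]* (?:$|(?=\}))
        )*
        \s* \}",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);

    // Returns one dictionary per record, holding only the wanted keys.
    public static List<Dictionary<string, string>> Extract(string text, ISet<string> wanted)
    {
        var records = new List<Dictionary<string, string>>();
        foreach (Match m in RecordRx.Matches(text))
        {
            CaptureCollection keys = m.Groups["key"].Captures;
            CaptureCollection vals = m.Groups["val"].Captures;
            var record = new Dictionary<string, string>();
            for (int i = 0; i < keys.Count; i++)
            {
                // Field order and unknown extra fields do not matter here.
                if (wanted.Contains(keys[i].Value))
                    record[keys[i].Value] = vals[i].Value;
            }
            records.Add(record);
        }
        return records;
    }

    static void Main()
    {
        // Shortened, hypothetical record; real data would come from the file.
        string data = "TEST_REPLICATE\n{\nREPLICATE_ID 453w\nASSAY_NUMBER 334\nASSAY_VERSION 4\nTRACK 1\n}\n";
        var wanted = new HashSet<string> { "REPLICATE_ID", "ASSAY_NUMBER" };

        foreach (var record in Extract(data, wanted))
            foreach (var pair in record)
                Console.WriteLine($"{pair.Key} = {pair.Value}");
    }
}
```

For the sample record this should yield the two pairs REPLICATE_ID = 453w and ASSAY_NUMBER = 334, while ASSAY_VERSION and TRACK are skipped. A key that appears twice in one record would keep its last value, which may or may not be the desired behaviour.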
