How to cut multiple strings from a txt file in C#

I tried a lot of possible solutions to this problem but it never seems to work. My problem is the following: I have a txt file with several lines. Each line has something like:
xxxxx yyyyyy
xxxxx yyyyyy
xxxxx yyyyyy
xxxxx yyyyyy
...
I want to store in one array of strings the xxxxx and in another array the yyyyy, for each line on the txt file, something like
string[] x;
string[] y;
string[1] x = xxxxx; // the x from the first line of the txt
string[2] x = xxxxx; // the x from the second line of the txt
string[3] x = xxxxx; // the x from the third line of the txt
...
and the same for string[] y;
... but I have no idea how to do it.
I would very much appreciate it if someone showed me how to write the loop for this problem.

You can use linq for this:
string test = "xxxxx yyyyyy xxxxx yyyyyy xxxxx yyyyyy xxxxx yyyyyy";
string[] testarray = test.Split(' ');
string[] arrayx= testarray.Where((c, i) => i % 2 == 0).ToArray<string>();
string[] arrayy = testarray.Where((c, i) => i % 2 != 0).ToArray<string>();
Basically, this code splits the string on spaces, then puts the strings at even positions in one array and those at odd positions in another.
Edit
You say in the comments that you don't understand this: Where((c, i) => i % 2 == 0). It takes the position of each string (i) and computes it modulo 2: it divides the position by 2 and checks whether the remainder equals 0. That is how you tell whether a number is even or odd.
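As a small illustration of that index-aware Where overload, here is a minimal sketch. The sample words and variable names are made up for the example, and it assumes a compiler that supports top-level statements (C# 9 or later):

```csharp
using System;
using System.Linq;

// Words alternate x, y, x, y, ... so the position decides the target array.
string[] words = { "x1", "y1", "x2", "y2" };

// This Where overload passes the element AND its zero-based index to the lambda.
string[] evens = words.Where((word, index) => index % 2 == 0).ToArray();
string[] odds = words.Where((word, index) => index % 2 != 0).ToArray();

Console.WriteLine(string.Join(",", evens)); // x1,x2
Console.WriteLine(string.Join(",", odds));  // y1,y2
```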
Edit2
My first answer only works for one line. For several lines (your input source is a file with several lines), you'll need a foreach loop. Alternatively, you can do something like the following sample: read all the lines, join them into a single string, and then run the previously shown code on the result:
string[] file = File.ReadAllLines(@"yourfile.txt");
string allLines = string.Join(" ", file); //this joins all the lines into one
//Alternate way of joining the lines
//string allLines=file.Aggregate((i, j) => i + " " + j);
string[] testarray = allLines.Split(' ');
string[] arrayx= testarray.Where((c, i) => i % 2 == 0).ToArray<string>();
string[] arrayy = testarray.Where((c, i) => i % 2 != 0).ToArray<string>();

If I understand your question correctly, xxxxx and yyyyyy appear repeatedly, in which case the data looks something like 11111 222222 11111 222222 11111 222222
There is a space (' ') between them, so
1. split the file into lines and process them one by one in a loop
2. use ' ' as the delimiter when splitting each line
3. use a counter to tell whether each string's position is odd or even, and store the strings separately in an inner loop
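The three steps above could be sketched roughly like this. SplitAlternating is a hypothetical helper name, the sample lines are made up, and the sketch assumes top-level statements (C# 9 or later); in practice the lines would come from something like File.ReadLines:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits each line on spaces and uses a counter to
// send alternating tokens to the x list and the y list.
static (List<string> Xs, List<string> Ys) SplitAlternating(IEnumerable<string> lines)
{
    var xs = new List<string>();
    var ys = new List<string>();
    foreach (string line in lines) // step 1: one line at a time
    {
        // step 2: a space is the delimiter; empty entries are dropped
        string[] tokens = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int counter = 0;
        foreach (string token in tokens) // step 3: even position -> x, odd -> y
        {
            if (counter % 2 == 0) xs.Add(token); else ys.Add(token);
            counter++;
        }
    }
    return (xs, ys);
}

// Made-up sample data standing in for the file contents.
var (xValues, yValues) = SplitAlternating(new[] { "11111 222222", "33333 444444" });
Console.WriteLine(string.Join(",", xValues)); // 11111,33333
Console.WriteLine(string.Join(",", yValues)); // 222222,444444
```

Lists are used instead of arrays so the number of lines does not have to be known up front; call ToArray() on the results if arrays are required.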

If I understood correctly, you have multiple lines, each line with two strings. Then, here is an answer that uses a plain old for:
public static void Main()
{
// This is just an example. In your case you would read the text from a file
const string text = @"x y
xx yy
xxx yyy";
var lines = text.Split(new[]{'\n', '\r'}, StringSplitOptions.RemoveEmptyEntries);
var xs = new string[lines.Length];
var ys = new string[lines.Length];
for(int i = 0; i < lines.Length; i++)
{
var parts = lines[i].Split(' ');
xs[i] = parts[0];
ys[i] = parts[1];
}
}

Related

How To Repeat Split Method After X Times?

I have a .txt file which I would like to split using the split method. My current code is:
string[] alltext = File.ReadAllText(fullPath).Split(new[] { ',' }, 3);
The problem I now have is that I want it to loop through the whole file so that it always splits the text into three pieces that belong together. If I have a text with:
testing, testing,
buenooo diasssss
testing, testing,
buenooo diasssss
testing, testing,
buenooo diasssss
(the format here is hard to display, but want to show that they are on different lines, so reading line by line will most likely not be possible)
I want "testing", "testing", "buenooo diasssss" to be displayed on my console although they are on different lines.
If I would do it with lines I would simply loop through each line, but this does not work in this case.
You can first remove "\r\n"(new line) from the text, then split and select the first three items.
var alltext = File.ReadAllText(fullPath).Replace("\r\n","").Split(',').ToList().Take(3);
foreach(var item in alltext)
Console.WriteLine(item);
Edit
If you want all three items to be displayed in one line in the console:
int lineNumber = 0;
var alltext = File.ReadAllText(fullPath).Split(new string[] { "\r\n", "," }, StringSplitOptions.None).ToList();
alltext.RemoveAll(item => item == "");
while (lineNumber * 3 < alltext.Count)
{
var tempList = alltext.Skip(lineNumber * 3).Take(3).ToList();
lineNumber++;
Console.WriteLine("line {0} => {1}, {2}, {3}",lineNumber, tempList[0], tempList[1], tempList[2]);
}
result:
Try this:
var data =
File.ReadLines(fullPath)
.Select((x, n) => (line: x, group: n / 3))
.GroupBy(x => x.group, x => x.line)
.Select(x =>
String
.Concat(x)
.Split(',', StringSplitOptions.RemoveEmptyEntries)
.Select(s => s.Trim()));
That gives me:

Splitting an element of an array

In my C# program (I'm new to C# so I hope that I'm doing things correctly), I'm trying to read in all of the lines from a text file, which will look something along the lines of this, but with more entries (these are fictional people so don't worry about privacy):
Logan Babbleton ID #: 0000011 108 Crest Circle Mr. Logan M. Babbleton
Pittsburgh PA 15668 SSN: XXX-XX-XXXX
Current Program(s): Bachelor of Science in Cybersecurity
Mr. Carter J. Bairn ID #: 0000012 21340 North Drive Mr. Carter Joseph Bairn
Pittsburgh PA 15668 SSN: XXX-XX-XXXX
Current Program(s): Bachelor of Science in Computer Science
I have these lines read into an array, concentrationArray and want to find the lines that contain the word "Current", split them at the "(s): " in "Program(s): " and print the words that follow. I've done this earlier in my program, but splitting at an ID instead, like this:
nameLine = nameIDLine.Split(new string[] { "ID" }, StringSplitOptions.None)[1];
However, whenever I attempt to do this, I get an error that my index is out of the bounds of my split array (not my concentrationArray). Here's what I currently have:
for (int i = 0; i < concentrationArray.Length; i++)
{
if (concentrationArray[i].Contains("Current"))
{
lstTest.Items.Add(concentrationArray[i].Split(new string[] { "(s): " }, StringSplitOptions.None)[1]);
}
}
Where I'm confused is that if I change the index to 0 instead of 1, it will print everything out perfectly, but it will print out the first half, instead of the second half, which is what I want. What am I doing wrong? Any feedback is greatly appreciated since I'm fairly new at C# and would love to learn what I can. Thanks!
Edit - The only thing that I could think of was that maybe sometimes there wasn't anything after the string that I used to separate each element, but when I checked my text file, I found that was not the case and there is always something following the string used to separate.
You should check the result of Split before trying to read index 1.
If a line doesn't contain "(s): ", your code will throw the exception you describe:
for (int i = 0; i < concentrationArray.Length; i++)
{
if (concentrationArray[i].Contains("Current"))
{
string[] result = concentrationArray[i].Split(new string[] { "(s): " }, StringSplitOptions.None);
if(result.Length > 1)
lstTest.Items.Add(result[1]);
else
Console.WriteLine($"Line {i} has no (s): followed by a space");
}
}
To complete the answer: if you always use index 0, there is no error, because when the separator is not present in the input string, the output is an array with a single element containing the whole unsplit string.
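A quick way to see that behaviour is the sketch below. The two input strings are made up for illustration (the second one lacks the space after the colon), and top-level statements (C# 9 or later) are assumed:

```csharp
using System;

string[] separator = { "(s): " };

// Separator present: two elements, the second one is the text we want.
string[] found = "Current Program(s): Cybersecurity".Split(separator, StringSplitOptions.None);

// Separator absent (no space after the colon): a single element is returned.
string[] missing = "Current Program(s):Cybersecurity".Split(separator, StringSplitOptions.None);

Console.WriteLine(found.Length);   // 2
Console.WriteLine(found[1]);       // Cybersecurity
Console.WriteLine(missing.Length); // 1, so missing[1] would throw IndexOutOfRangeException
Console.WriteLine(missing[0]);     // Current Program(s):Cybersecurity
```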
If the line always starts with
Current Program(s):
then why don't you just replace it with an empty string, like this:
concentrationArray[i].Replace("Current Program(s): ", "")
It is perhaps a little easier to understand and more reusable if you separate the concerns. It will also be easier to test. An example might be...
var allLines = File.ReadLines(@"C:\your\file\path\data.txt");
// Materialize once so the file is not re-read, and so AddRange receives an array
var currentPrograms = ExtractCurrentPrograms(allLines).ToArray();
if (currentPrograms.Length > 0)
{
lstTest.Items.AddRange(currentPrograms);
}
...
private static IEnumerable<string> ExtractCurrentPrograms(IEnumerable<string> lines)
{
const string targetPhrase = "Current Program(s):";
foreach (var line in lines.Where(l => !string.IsNullOrWhiteSpace(l)))
{
var index = line.IndexOf(targetPhrase);
if (index >= 0)
{
var programIndex = index + targetPhrase.Length;
var text = line.Substring(programIndex).Trim();
if (!string.IsNullOrWhiteSpace(text))
{
yield return text;
}
}
}
}
Here is a slightly different approach:
List<string> test = new List<string>();
string pattern = "Current Program(s):";
string[] allLines = File.ReadAllLines(@"C:\Users\xyz\Source\demo.txt");
foreach (var line in allLines)
{
if (line.Contains(pattern))
{
test.Add(line.Substring(line.IndexOf(pattern) + pattern.Length));
}
}
or
string pattern = "Current Program(s):";
lstTest.Items.AddRange(File.ReadLines(@"C:\Users\ODuritsyn\Source\demo.xml")
.Where(line => line.Contains(pattern))
.Select(line => line.Substring(line.IndexOf(pattern) + pattern.Length))
.ToArray()); // AddRange expects an array, not an IEnumerable<string>

Reading into list and picking specific numbers

A request came in for an app written a long time ago: they wanted another column on an existing datagridview with current drilling times. These times come from a file that machines read and run. Below is an example of a file.
The numbers they're looking for are the ones marked with an S. The very first number on each line is the line number. The line number needs to be grabbed too, because the times need to match up with the lines. So I have to read the file into a list, grab the line number and then the associated "s" time. I will be adding these "s times" to an existing table which already has the line numbers, so I'll add them to a new column next to the corresponding line number. I'm a bit stuck here.
Any help is appreciated.
I know I can read everything in like this
string f = ("//dnc/WJ/MTI-WJ/" + cboPartProgram.Text);
List<string> lines = new List<string>();
using (StreamReader r = new StreamReader(f))
{
// Use while != null pattern for loop
string line;
while ((line = r.ReadLine()) != null)
{
lines.Add(line);
}
}
But once in a list I need to ignore lines that start with an apostrophe " ' ", then take only the first line number "1, 2, 3" etc for however many there are and then the "s" number of that line.
So that I'm left with:
1 2.75
2 2.5
3 2
...etc
Now, in the database there is a program table with these columns:
Program Line Time
Program1 1
Program1 2
Program1 3
Program1 4
etc
So the "s" times will be added to the time column of the existing table. But the "s" times taken from line 1,2,3,4 and so on will coincide with the line column of the table.
So, the final end result in this example would be
Program Line Time
Program1 1 2.75
Program1 2 2.5
Program1 3 2
UPDATE
I'm not sure yet whether the edited code that covers the lines without an sTime works, because I recently ran into an issue. There is a datagridview whose columns are populated from a list filled by a SQL statement. I've added a column for the sTime (it turns out I may not need the line number, so you'll notice I removed it). However, I need to incorporate my sTime loop into the loop that runs over the list populating the dgv. I've tried this in a couple of different ways. Either the very last sTime is used for all 80 entries of the dgv (as the example below does), or, if I move the " } " brackets around and combine the loops, the first sTime is used to create 80 dgv entries, then the second sTime to create another 80, and so on. So instead of 80 entries in the dgv, each with its own sTime next to it, I end up with 6,400 entries (80x80). I'm not sure how to combine these two loops so that they work.
var sLines = File.ReadAllLines("//dnc/WJ/MTI-WJ/" + cboPartProgram.Text)
.Where(s => !s.StartsWith("'"))
.Select(s => new
{
SValue = Regex.Match(s, "(?<=S)[\\d.]*").Value
})
.ToArray();
string lastSValue = "";
foreach (var line in sLines)
{
string val = line.SValue == string.Empty ? lastSValue : line.SValue;
lastSValue = val;
}
foreach (WjDwellOffsets offset in dwellTimes)
{
origionalDwellOffsets.Add(offset.dwellOffset);
dgvDwellTimes.Rows.Add(new object[] { offset.positionId, lastSValue, offset.dwellOffset, 0, offset.dwellOffset, "Update", (offset.dateTime > new DateTime(1900, 1, 1)) ? offset.dateTime.ToString() : "" });
DataGridViewDisableButtonCell btnCell = ((DataGridViewDisableButtonCell)dgvDwellTimes.Rows[dgvDwellTimes.Rows.Count - 1].Cells[5]);
btnCell.Enabled = false;
}
If your only condition is to keep the lines that have an "S" in them, and if that data sample is indicative of the overall line pattern, then this is quite simple (and you can use Regex to further filter your results):
var sLines = File.ReadAllLines("filepath.txt")
.Where(s => !s.StartsWith("'") && s.Contains("S"))
.Select(s => new
{
LineNumber = Regex.Match(s, "^\\d*").Value,
SValue = Regex.Match(s, "(?<=S)[\\d.]*").Value
})
.ToArray();
// Use like this
foreach (var line in sLines)
{
string num = line.LineNumber;
string val = line.SValue;
}
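To see what the two patterns pick out, here is a minimal sketch. The sample file is not shown in the question, so the lines below are invented purely for illustration, and top-level statements (C# 9 or later) are assumed:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical line: the real file format is not shown in the question.
string sample = "3 X1.5 Y4.0 S2.75";

// ^\d* takes the digits at the very start of the line.
string lineNumber = Regex.Match(sample, @"^\d*").Value;

// (?<=S) is a lookbehind: match digits and dots only when an 'S' comes right before them.
string sValue = Regex.Match(sample, @"(?<=S)[\d.]*").Value;

Console.WriteLine(lineNumber); // 3
Console.WriteLine(sValue);     // 2.75

// A line without an 'S' yields an empty string rather than an exception.
string none = Regex.Match("4 X2.0 Y1.0", @"(?<=S)[\d.]*").Value;
Console.WriteLine(none == ""); // True
```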
To keep all lines and have the relevant information only on some of them, this method can be tweaked a bit, but it will also require some processing outside the query.
var sLines = File.ReadAllLines("filepath.txt")
.Where(s => !s.StartsWith("'"))
.Select(s => new
{
LineNumber = Regex.Match(s, "^\\d*").Value,
SValue = Regex.Match(s, "(?<=S)[\\d.]*").Value
})
.ToArray();
// Use like this
string lastSValue = "";
foreach (var line in sLines)
{
string num = line.LineNumber;
string val = line.SValue == string.Empty ? lastSValue : line.SValue;
// Do the stuff
lastSValue = val;
}

Find the certain text in the line and then return that line in C#

I have a big file and for the simplicity I am just showing a small part of it. The data looks like following:
NPSER NASER NQSER
10 5 3
TSSR MPSER JDNSR
15 10 6
What I need to do is to find for example NPSER and NASER and then assign the values NPSER as 10, NASER as 5 and NQSER as 3. For this small data set I could do as following:
TextReader infile = new StreamReader(fileName);
string line;
int NPSER, NASER, NQSER;
line = infile.ReadLine();
string[] words = line.Split('\t');
NPSER = Convert.ToInt32(words[0]);
NASER = Convert.ToInt32(words[1]);
NQSER = Convert.ToInt32(words[2]);
infile.Close();
Instead of reading each line and assigning values, I want to write a function that automatically fetches the line when I search for up to three words in a line, which would be easier and more efficient for a longer application.
I would appreciate other methods as well.
It would be easier if you can use LINQ:
var line = File.ReadLines("path")
.SkipWhile(l => !l.Contains("NPSER")) // change this condition to suit your needs
.Skip(1)
.First();
var values = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
.Select(int.Parse)
.ToArray();
int NPSER = values[0];
int NASER = values[1];
int NQSER = values[2];
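If you want this wrapped in a reusable function, one possible shape is sketched below. ReadNamedValues is a hypothetical helper name, the sample lines stand in for the file contents, and the sketch assumes the values always sit on the line directly below the header, separated by spaces or tabs. Top-level statements (C# 9 or later) are assumed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: finds the first header line that contains every
// requested name and pairs each header name with the value on the next line.
static Dictionary<string, int> ReadNamedValues(IEnumerable<string> lines, params string[] names)
{
    char[] separators = { ' ', '\t' };
    List<string> all = lines.ToList();
    for (int i = 0; i + 1 < all.Count; i++)
    {
        string[] keys = all[i].Split(separators, StringSplitOptions.RemoveEmptyEntries);
        if (!names.All(n => keys.Contains(n))) continue;

        // Assumption: the values are on the line directly below the header.
        string[] values = all[i + 1].Split(separators, StringSplitOptions.RemoveEmptyEntries);
        var result = new Dictionary<string, int>();
        for (int k = 0; k < keys.Length && k < values.Length; k++)
            result[keys[k]] = int.Parse(values[k]);
        return result;
    }
    throw new InvalidOperationException("No header line contains all requested names.");
}

// Sample data from the question; in practice pass File.ReadLines(fileName).
string[] sample = { "NPSER NASER NQSER", "10 5 3", "TSSR MPSER JDNSR", "15 10 6" };
Dictionary<string, int> found = ReadNamedValues(sample, "NPSER", "NASER");
Console.WriteLine(found["NPSER"]); // 10
Console.WriteLine(found["NQSER"]); // 3
```

Matching on whole header tokens rather than with Contains on the raw line avoids accidental hits when one name is a substring of another.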

C# how to split a string backwards?

What I'm trying to do is split a string backwards, meaning right to left.
string startingString = "<span class=\"address\">Hoopeston,, IL 60942</span><br>";
What I would do normally is this.
string[] splitStarting = startingString.Split('>');
so my splitStarting[1] would = "Hoopeston,, IL 60942</span"
then I would do
string[] splitAgain = splitStarting[1].Split('<');
so splitAgain[0] would = "Hoopeston,, IL 60942"
Now this is what I want to do, I want to split by ' ' (a space) reversed for the last 2 instances of ' '.
For example my array would come back like so:
[0]="60942"
[1]="IL"
[2] = "Hoopeston,,"
To make this even harder I only ever want the first two reverse splits, so normally I would do something like this
string[] splitCityZip = splitAgain[0].Split(new char[] { ' ' }, 3);
but how would you do that backwards? The reason is that it could be a two-word city name, so an extra ' ' would break the city name.
A regular expression with named groups makes this much simpler. There is no need to reverse strings; just pluck out what you want.
var pattern = @">(?<city>.*) (?<state>.*) (?<zip>.*?)<";
var expression = new Regex(pattern);
Match m = expression.Match(startingString);
if (m.Success) {
Console.WriteLine("Zip: " + m.Groups["zip"].Value);
Console.WriteLine("State: " + m.Groups["state"].Value);
Console.WriteLine("City: " + m.Groups["city"].Value);
}
Should give the following results:
Found 1 match:
1. >Las Vegas,, IL 60942< has 3 groups:
1. Las Vegas,, (city)
2. IL (state)
3. 60942 (zip)
String literals for use in programs:
C#
@">(?<city>.*) (?<state>.*) (?<zip>.*?)<"
One possible solution - not optimal but easy to code - is to reverse the string, then to split that string using the "normal" function, then to reverse each of the individual split parts.
Another possible solution is to use regular expressions instead.
I think you should do it like this:
var s = splitAgain[0];
var zipCodeStart = s.LastIndexOf(' ');
var zipCode = s.Substring(zipCodeStart + 1);
s = s.Substring(0, zipCodeStart);
var stateStart = s.LastIndexOf(' ');
var state = s.Substring(stateStart + 1);
var city = s.Substring(0, stateStart );
var result = new [] {zipCode, state, city};
Result will contain what you requested.
If Split could do everything there would be so many overloads that it would become confusing.
Don't use split, just custom code it with substrings and lastIndexOf.
string str = "Hoopeston,, IL 60942";
string[] parts = new string[3];
int place = str.LastIndexOf(' ');
parts[0] = str.Substring(place+1);
int place2 = str.LastIndexOf(' ',place-1);
parts[1] = str.Substring(place2 + 1, place - place2 -1);
parts[2] = str.Substring(0, place2);
You can use a regular expression to get the three parts of the string inside the tag, and use LINQ extensions to get the strings in the right order.
Example:
string startingString = "<span class=\"address\">East St Louis,, IL 60942</span><br>";
string[] city =
Regex.Match(startingString, @"^.+>(.+) (\S+) (\S+?)<.+$")
.Groups.Cast<Group>().Skip(1)
.Select(g => g.Value)
.Reverse().ToArray();
Console.WriteLine(city[0]);
Console.WriteLine(city[1]);
Console.WriteLine(city[2]);
Output:
60942
IL
East St Louis,,
How about
using System.Linq;
...
splitAgain[0].Split(' ').Reverse().ToArray()
-edit-
OK, I missed the last part about multi-word cities. You can still use LINQ though:
splitAgain[0].Split(' ').Reverse().Take(2).ToArray()
would get you the
[0]="60942"
[1]="IL"
The city would not be included here though, you could still do the whole thing in one statement but it would be a little messy:
var elements = splitAgain[0].Split(' ');
var result = elements
.Reverse()
.Take(2)
.Concat( new[ ] { String.Join( " " , elements.Take( elements.Length - 2 ).ToArray( ) ) } )
.ToArray();
So we are:
Splitting the string,
Reversing it,
Taking the first two elements (originally the last two),
Then making a new array with a single string element, built from the original array of elements minus the last two (state and zip code).
As I said, it's a little messy, but it will get you the array you want. If you don't need an array in that format, you could obviously simplify the above code a bit.
you could also do:
var result = new[ ]{
elements[elements.Length - 1], //last element
elements[elements.Length - 2], //second to last
String.Join( " " , elements.Take( elements.Length - 2 ).ToArray( ) ) //rebuild original string - 2 last elements
};
At first I thought you should use Array.Reverse() method, but I see now that it is the splitting on the ' ' (space) that is the issue.
Your first value could have a space in it (e.g. "New York"), so you don't want to split on every space.
If you know the string is only ever going to have 3 values in it, then you could use String.LastIndexOf(" ") and String.Substring() to trim off the last value, do the same again to find the middle value, and you will be left with the first value, with or without spaces.
I was facing a similar issue with audio file name conventions.
I did it this way: convert the string to a char array, reverse it, split it, and reverse each part back to normal.
char[] addressInCharArray = fullAddress.ToCharArray();
Array.Reverse(addressInCharArray);
string[] parts = (new string(addressInCharArray)).Split(new char[] { ' ' }, 3);
string[] subAddress = new string[parts.Length];
int j = 0;
foreach (string part in parts)
{
addressInCharArray = part.ToCharArray();
Array.Reverse(addressInCharArray);
subAddress[j++] = new string(addressInCharArray);
}
