Reading into list and picking specific numbers - c#

A request came in for an app written a long time ago: they want another column on an existing DataGridView showing current drilling times. These times come from a file that the machines read and run. Below is an example of a file.
The numbers they're looking for are the ones prefixed with an S. The very first number on each line is the line number. The line number needs to be grabbed too, as the times need to match up with the line. So I have to read this into a list, grab the line number and then the associated "s" time. I will be adding these "s times" to an existing table which already has the line numbers, so I'll add them to a new column next to the corresponding line number. I'm a bit stuck here.
Any help is appreciated.
I know I can read everything in like this:
string f = ("//dnc/WJ/MTI-WJ/" + cboPartProgram.Text);
List<string> lines = new List<string>();
using (StreamReader r = new StreamReader(f))
{
// Use while != null pattern for loop
string line;
while ((line = r.ReadLine()) != null)
{
lines.Add(line);
}
}
But once it's in a list I need to ignore lines that start with an apostrophe ('), then take only the leading line number (1, 2, 3, etc., for however many there are) and the "s" number of that line.
So that I'm left with:
1 2.75
2 2.5
3 2
...etc
Now, in the database there is a program table with these columns:
Program Line Time
Program1 1
Program1 2
Program1 3
Program1 4
etc
So the "s" times will be added to the time column of the existing table. But the "s" times taken from line 1,2,3,4 and so on will coincide with the line column of the table.
So, the final end result in this example would be
Program Line Time
Program1 1 2.75
Program1 2 2.5
Program1 3 2
UPDATE
I'm not sure yet whether the edited code that covers the lines without an sTime works, because I ran into another issue. There is a DataGridView whose columns are populated from a list filled by a SQL statement. I've added a column for the sTime (it turns out I may not need the line number, so you'll notice I removed that). However, I need to incorporate my sTime loop into the loop that populates the dgv from that list. I've tried this a couple of different ways. Either the very last sTime is used for all 80 entries of the dgv (as the example below does), or, if I move the closing braces around and nest the loops, the first sTime creates 80 dgv entries, then the second sTime creates another 80, and so on. So instead of 80 dgv entries with each individual sTime next to it, I end up with 6,400 entries (80x80). I'm not sure how to combine these two loops so that they work.
var sLines = File.ReadAllLines("//dnc/WJ/MTI-WJ/" + cboPartProgram.Text)
.Where(s => !s.StartsWith("'"))
.Select(s => new
{
SValue = Regex.Match(s, "(?<=S)[\\d.]*").Value
})
.ToArray();
string lastSValue = "";
foreach (var line in sLines)
{
string val = line.SValue == string.Empty ? lastSValue : line.SValue;
lastSValue = val;
}
foreach (WjDwellOffsets offset in dwellTimes)
{
origionalDwellOffsets.Add(offset.dwellOffset);
dgvDwellTimes.Rows.Add(new object[] { offset.positionId, lastSValue, offset.dwellOffset, 0, offset.dwellOffset, "Update", (offset.dateTime > new DateTime(1900, 1, 1)) ? offset.dateTime.ToString() : "" });
DataGridViewDisableButtonCell btnCell = ((DataGridViewDisableButtonCell)dgvDwellTimes.Rows[dgvDwellTimes.Rows.Count - 1].Cells[5]);
btnCell.Enabled = false;
}

If your only condition is to keep the lines that have an "S" in them, and if that data sample is indicative of the overall line pattern, then this is quite simple (and you can use Regex to further filter your results):
var sLines = File.ReadAllLines("filepath.txt")
.Where(s => !s.StartsWith("'") && s.Contains("S"))
.Select(s => new
{
LineNumber = Regex.Match(s, "^\\d*").Value,
SValue = Regex.Match(s, "(?<=S)[\\d.]*").Value
})
.ToArray();
// Use like this
foreach (var line in sLines)
{
string num = line.LineNumber;
string val = line.SValue;
}
To keep all lines and have an S value on only some of them, this method can be tweaked slightly, but it then requires a bit of processing outside the query:
var sLines = File.ReadAllLines("filepath.txt")
.Where(s => !s.StartsWith("'"))
.Select(s => new
{
LineNumber = Regex.Match(s, "^\\d*").Value,
SValue = Regex.Match(s, "(?<=S)[\\d.]*").Value
})
.ToArray();
// Use like this
string lastSValue = "";
foreach (var line in sLines)
{
string num = line.LineNumber;
string val = line.SValue == string.Empty ? lastSValue : line.SValue;
// Do the stuff
lastSValue = val;
}
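The carry-forward idea above can be wrapped in a small helper, shown here as a minimal sketch. The class name STimeSketch, the method Extract, and the sample lines are hypothetical (the original example file is not reproduced here, so the real line format may differ); the two regular expressions are the ones from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class STimeSketch
{
    // Returns (line number, S value) pairs, skipping comment lines and
    // carrying the previous S value forward when a line has none.
    public static List<KeyValuePair<string, string>> Extract(IEnumerable<string> lines)
    {
        var result = new List<KeyValuePair<string, string>>();
        string lastSValue = "";
        foreach (string s in lines.Where(l => !l.StartsWith("'")))
        {
            string number = Regex.Match(s, @"^\d*").Value;
            string sValue = Regex.Match(s, @"(?<=S)[\d.]*").Value;
            if (sValue.Length == 0)
                sValue = lastSValue; // no S on this line: reuse the previous one
            result.Add(new KeyValuePair<string, string>(number, sValue));
            lastSValue = sValue;
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample lines; the real machine file may look different.
        var sample = new[] { "' comment", "1 X1 Y2 S2.75", "2 X3 Y4", "3 X5 Y6 S2" };
        foreach (var pair in Extract(sample))
            Console.WriteLine(pair.Key + " " + pair.Value);
    }
}
```

If the grid rows arrive in the same order as the file lines (an assumption, not something the question confirms), a single indexed loop could read the value at position i for row i, instead of nesting the two loops or reusing the last value.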

Related

reading in text file and splitting by comma in c#

I have a text file whose format is like this
Number,Name,Age
I want to read "Number", the first column of this text file, into an array to find duplicates. Here are the two ways I tried to read in the file:
string[] account = File.ReadAllLines(path);
string readtext = File.ReadAllText(path);
But every time I try to split the array to get just what's to the left of the first comma, I fail. Any ideas? Thanks.
You need to explicitly split the data to access its various parts. How would your program otherwise be able to decide that it is separated by commas?
The easiest approach to access the number that comes to my mind goes something like this:
var lines = File.ReadAllLines(path);
var firstLine = lines[0];
var fields = firstLine.Split(',');
var number = fields[0]; // Voilà!
You could go further by parsing the number as an int or another numeric type (if it really is a number). On the other hand, if you just want to test for uniqueness, this is not really necessary.
If you want all duplicate lines according to the Number:
var numDuplicates = File.ReadLines(path)
.Select(l => l.Trim().Split(','))
.Where(arr => arr.Length >= 3)
.Select(arr => new {
Number = arr[0].Trim(),
Name = arr[1].Trim(),
Age = arr[2].Trim()
})
.GroupBy(x => x.Number)
.Where(g => g.Count() > 1);
foreach(var dupNumGroup in numDuplicates)
Console.WriteLine("Number:{0} Names:{1} Ages:{2}"
, dupNumGroup.Key
, string.Join(",", dupNumGroup.Select(x => x.Name))
, string.Join(",", dupNumGroup.Select(x => x.Age)));
If you are looking specifically for a string.Split solution, here is a really simple method of doing what you are looking for:
List<int> importedNumbers = new List<int>();
// Read our file in to an array of strings
var fileContents = System.IO.File.ReadAllLines(path);
// Iterate over the strings and split them in to their respective columns
foreach (string line in fileContents)
{
var fields = line.Split(',');
if (fields.Length < 3)
throw new Exception("We need at least 3 fields per line."); // You would REALLY do something else here...
// You would probably want to be more careful about your int parsing... (use TryParse)
var number = int.Parse(fields[0]);
var name = fields[1];
var age = int.Parse(fields[2]);
// if we already imported this number, continue on to the next record
if (importedNumbers.Contains(number))
continue; // You might also update the existing record at this point instead of just skipping...
importedNumbers.Add(number); // Keep track of numbers we have imported
}
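The comment above about TryParse can be made concrete. This is a minimal sketch, assuming only the first field matters for duplicate detection; the class and method names are hypothetical. It relies on HashSet<int>.Add returning false when the value is already present.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateNumberSketch
{
    // Returns the numbers (first comma-separated field) that occur more than once.
    public static List<int> FindDuplicates(IEnumerable<string> lines)
    {
        var seen = new HashSet<int>();
        var duplicates = new SortedSet<int>(); // sorted so the result order is predictable
        foreach (string line in lines)
        {
            string[] fields = line.Split(',');
            int number;
            // Skip lines whose first field is not an integer (for example a header row).
            if (!int.TryParse(fields[0].Trim(), out number))
                continue;
            // Add returns false if the number was seen before.
            if (!seen.Add(number))
                duplicates.Add(number);
        }
        return new List<int>(duplicates);
    }

    public static void Main()
    {
        var sample = new[] { "Number,Name,Age", "1,Ann,30", "2,Bob,41", "1,Cy,25" };
        foreach (int n in FindDuplicates(sample))
            Console.WriteLine("Duplicate number: " + n);
    }
}
```

A HashSet lookup stays fast as the file grows, whereas List.Contains scans the whole list on every line.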

Read text file with fixed columns in C#

Is there any way to read text files with fixed columns in C# without using regex and Substring?
I want to read a file with fixed columns and transfer the column to an excel file (.xlsx)
example 1
POPULACAO
MUNICIPIO UF CENSO 2010
AC 78.507
AC 15.100
Rio Branco AC 336.038
Sena Madureira AC 38.029
example 2
POPULACAO
MUNICIPIO UF CENSO 2010
AC 78.507
Epitaciolândia AC 15.100
Rio Branco AC 336.038
Sena Madureira AC 38.029
Remember that I have a case, as in the second example, where a column is blank. I can get the columns and the values using regex and/or Substring, but when the file looks like Example 2, the regex ignores the line with the blank column, and so does Substring.
Assuming you mean "fixed columns" extremely literally, so that every non-terminal column is exactly the same width and each column is separated by exactly one space, then yes, you can get away with using neither regex nor Substring. If that's the case (and bear in mind that this also implies every single person in the database has a name that's exactly four letters long), then you can just read the file in by lines. Id would be line[0].ToString(), name would be new string(new char[] { line[2], line[3], line[4], line[5] }), etc.
Or, for any given value:
var str = new StringBuilder();
for (int i = firstIndex; i < lastIndex; i++)
{
str.Append(line[i]);
}
But this is basically just performing the exact function of Substring. Substring isn't your problem; handling empty values in the first (city) column is. So, for any given line, you need to check whether the city cell is empty:
foreach (var line in yourLines)
{
    // Substring takes a start index and a length, not an end index.
    if (string.IsNullOrWhiteSpace(line.Substring(cityStartIndex, cityEndIndex - cityStartIndex)))
    {
        continue;
    }
}
Alternately, if you're sure the city name will always be at the very first index of the line:
foreach (var line in yourLines)
{
if (line[0] == ' ') { continue; }
}
And if the value you got from the city cell was valid, you'd store that value and continue on to using Substring with the indices of the rest of the values in the row.
If for whatever reason you don't want to use a regular expression or Substring(), you have a couple of other options:
String.Split, e.g. var columns = line.Split(' ');
String.Chars, using the known widths of each column to build your output;
Why not just use string.Split()?
Something like:
using (StreamReader stream = new StreamReader(file)) {
while (!stream.EndOfStream) {
string line = stream.ReadLine();
if (string.IsNullOrWhiteSpace(line))
continue;
string[] fields = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
int ID = -1, age = -1;
string name = null, training = null;
ID = int.Parse(fields[0]);
if (fields.Length > 1)
name = fields[1];
if (fields.Length > 2)
age = int.Parse(fields[2]);
if (fields.Length > 3)
training = fields[3];
// do stuff
}
}
Only downside to this is that it will allow fields of arbitrary length. And spaces in fields will break the fields.
As for regular expressions being ignored in the last case, try something like:
Match m = Regex.Match(line, @"^(.{2}) (.{4}) (.{2})( +.+?)?$");
First - define a variable for each column in the file. Then go through the file line by line and assign each column to the correct variable. Substitute the correct start positions and lengths. This should be enough information to get you started parsing your file.
string id;
string name;
string age;
string training;
string line;
int counter = 0;
while ((line = file.ReadLine()) != null)
{
    // Substring(start, length): the positions and lengths below are placeholders.
    id = line.Substring(0, 3);
    name = line.Substring(3, 10);
    age = line.Substring(13, 2);
    training = line.Substring(15, 10);
    // ...
    if (string.IsNullOrWhiteSpace(name))
    {
        // ignore this line if the name is blank
    }
    else
    {
        // do something useful
    }
    counter++;
}
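One caveat with the Substring calls above is that they throw when a line is shorter than the expected width. A minimal sketch of a tolerant slice follows; the helper name Slice and the column layout (city in 0-19, UF in 20-22, population from 23) are assumptions chosen for illustration, not the real file's layout.

```csharp
using System;

public static class FixedWidthSketch
{
    // Returns the trimmed text of one column, tolerating lines that end early.
    public static string Slice(string line, int start, int length)
    {
        if (start >= line.Length)
            return "";
        int available = Math.Min(length, line.Length - start);
        return line.Substring(start, available).Trim();
    }

    public static void Main()
    {
        // Hypothetical layout: city (20 chars), UF (3 chars), then population.
        string full = "Rio Branco".PadRight(20) + "AC".PadRight(3) + "336.038";
        string blankCity = "".PadRight(20) + "AC".PadRight(3) + "78.507";
        foreach (string line in new[] { full, blankCity })
        {
            string city = Slice(line, 0, 20);
            string uf = Slice(line, 20, 3);
            string population = Slice(line, 23, 10);
            // A blank city comes back as an empty string instead of shifting the other columns.
            Console.WriteLine("[" + city + "] [" + uf + "] [" + population + "]");
        }
    }
}
```

Because each column is read by position, a blank city does not move the UF or population values, which is the failure mode of whitespace-based splitting.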

c# 10 textbox to display array items, 30 lines in each textbox, separated with comma

I have 10 different textbox controls and a text file upload script on a button click. When a user uploads a text file, which will probably consist of 300 lines, I want it divided across the 10 textboxes, 30 lines in each, with the lines separated by commas. I have used an array to store the text file items.
private void button9_Click(object sender, EventArgs e)
{
OpenFileDialog openFileDialog1 = new OpenFileDialog();
openFileDialog1.Filter = "Text Files|*.txt";
openFileDialog1.Title = "Select a Text file";
openFileDialog1.FileName = "";
DialogResult result = openFileDialog1.ShowDialog();
if (result == DialogResult.OK)
{
string file = openFileDialog1.FileName;
string[] text = System.IO.File.ReadAllLines(file);
button9.Text = textBox13.Text.ToString();
textBox1.Text = string.Join("," + Environment.NewLine, text.Take(30));
if (text.Length > 30)
textBox2.Text = string.Join("," + Environment.NewLine, text[0]);
}
}
Here is some code you can use for that, an explanation follows:
string[] text = System.IO.File.ReadAllLines(file);
var thirtyLineSections = text
.Select((line, index) => new { line, group = index / 30 })
.GroupBy(item => item.group)
.ToArray();
int textboxIndex = 0;
foreach (var section in thirtyLineSections)
{
string textForSection = string.Join(",",
section.Select(item => item.line).ToArray()); // see note below
textboxes[textboxIndex].Text = textForSection;
textboxIndex++;
}
Note: If you're using .NET 4.0 or above you can remove the call to .ToArray(), and instead use this line for the one with the comment:
section.Select(item => item.line));
So, what will this code do?
First, it'll take each line from the original file and run that through a .Select(...) method. This method will be given a 0-based index and the actual element (line) from the original collection. In other words, the delegate to the Select method will be passed the value 0,"first line", 1,"second line", 2,"third line", and so on. We divide this by 30 to get a "group number", where the first group will be number 0, and so on. Then we group on that group number to put all the lines with the same group number into the same group.
In other words, you got this:
original file    with index      after dividing by 30
line 1           0,line 1        0,line 1
line 2           1,line 2        0,line 2
line 3           2,line 3        0,line 3
some text        3,some text     0,some text
line 5           4,line 5        0,line 5
...
line 30          29,line 30      0,line 30
line 31          30,line 31      1,line 31
So out of that LINQ query we will get an array of elements, where each element is a group that contains 30 lines of text from the original file, in the order they occurred in.
Then we loop on that array, handling 30 elements at a time, and combining them into one string using string.Join, assigning the result to a textbox.
Before executing this code you need to do this:
var textboxes = new[]
{
textbox1,
textbox2,
...
textboxN
};
to create an array of the textboxes you want to assign those strings to.
Note: This code does not ensure that you have enough textboxes. If you've dropped 10 textboxes on the form, capable of handling 300 elements, and got more than 300 lines in that file, the code will throw an exception.
OK, as was pointed out in a comment, the LINQ query "looks good", but may be hard to understand for new programmers. I totally agree, so here is a different way to accomplish the same thing:
string[] text = System.IO.File.ReadAllLines(file);
var thirtyLineSections = new List<List<string>>();
List<string> currentList = null;
foreach (string line in text)
{
if (currentList == null)
{
currentList = new List<string>();
thirtyLineSections.Add(currentList);
}
currentList.Add(line);
if (currentList.Count == 30)
currentList = null;
}
foreach (var section in thirtyLineSections)
{
// Dump() is a LINQPad extension method; elsewhere, assign the string to a textbox instead.
string.Join(",", section).Dump();
}
So what will this code do?
First it'll create the data structure, which in this case will be a "list of 30-line lists", ie. the List<List<string>> declaration.
Then it will loop through all the lines in the file. For each line it will check if we're currently in a group, and we start as "not in a group" so the answer is no, so then we'll create a new group and add this to our list.
Then we keep filling this list with items, until it hits 30 items, and then we simply say "ok, so this group is done, we're no longer in that group". The next line this loop processes will go through that if-statement again adding a fresh group for the next (and following) items.
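Neither version above guards against a file with more sections than there are textboxes. A minimal sketch of one way to cap the number of sections follows; the method name JoinInSections and the maxSections parameter are hypothetical, and the strings are returned rather than assigned to controls so the logic can be tried without a form.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkSketch
{
    // Joins each run of `size` lines with commas and stops after `maxSections` strings.
    public static List<string> JoinInSections(IList<string> lines, int size, int maxSections)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size");
        var sections = new List<string>();
        for (int start = 0; start < lines.Count && sections.Count < maxSections; start += size)
        {
            // The last section may hold fewer than `size` lines.
            int count = Math.Min(size, lines.Count - start);
            var chunk = new string[count];
            for (int i = 0; i < count; i++)
                chunk[i] = lines[start + i];
            sections.Add(string.Join(",", chunk));
        }
        return sections;
    }

    public static void Main()
    {
        var lines = new[] { "a", "b", "c", "d", "e" };
        foreach (string section in JoinInSections(lines, 2, 10))
            Console.WriteLine(section);
    }
}
```

Passing the number of textboxes as maxSections means extra lines are silently dropped rather than causing an index exception; whether dropping or warning the user is preferable depends on the application.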

LINQ conditional aggregation based on next elements' values

What's a good LINQ equivalent of this pseudo-code: "given a list of strings, for each string that doesn't contain a tab character, concatenate it (with a pipe delimiter) to the end of the previous string, and return the resulting sequence"?
More Info:
I have a List<string> representing lines in a tab-delimited text file. The last field in each line is always a multiline text field, and the file was generated by a buggy system that mishandles fields with embedded newlines. So I end up with a list like this:
1235 \t This is Record 1
7897 \t This is Record 2
8977 \t This is Record 3
continued on the next line
and still continued more
8375 \t This is Record 4
I'd like to coalesce this list by concatenating all the orphan lines (lines with no tab characters) to the end of the previous line. Like this:
1235 \t This is Record 1
7897 \t This is Record 2
8977 \t This is Record 3|continued on the next line|and still continued more
8375 \t This is Record 4
Solving this with a for() loop would be easy, but I'm trying to improve my LINQ skills and I was wondering if there is a reasonably efficient LINQ solution to this problem. Is there?
This is not a problem that should be solved with LINQ. LINQ is designed for enumeration, whereas this is best solved by iteration.
Enumerating a sequence properly means no item has knowledge of the other items, which obviously won't work in your case. Use a for loop so you can cleanly go through the strings one by one and in order.
I just did this out of curiosity.
var originalList = new List<string>
{
"1235 \t This is Record 1",
"7897 \t This is Record 2",
"8977 \t This is Record 3",
"continued on the next line",
"and still continued more",
"8375 \t This is Record 4"
};
var resultList = new List<string>();
resultList.Add(originalList.Aggregate((workingSentence, next)
=> {
if (next.Contains("\t"))
{
resultList.Add(workingSentence);
return next;
}
else
{
workingSentence += "|" + next;
return workingSentence;
}
}));
The resultList should contain what you want.
Please note that this is not an optimal solution. The line workingSentence += "|" + next; may create lots of temp objects depending on your data pattern.
An optimal solution may involve keeping multiple index variables to look ahead in the list and concatenating the pending strings in one go when the next string contains a tab character, instead of concatenating one by one as shown above. However, it will be more complex than the one above because of the boundary checking and the extra index variables :).
Update: The following solution will not create temporary string objects for concatenation.
var resultList = new List<string>();
var tempList = new List<string>();
tempList.Add(originalList.Aggregate((cur, next)
=> {
tempList.Add(cur);
if (next.Contains("\t"))
{
resultList.Add(string.Join("|", tempList));
tempList.Clear();
}
return next;
}));
resultList.Add(string.Join("|", tempList));
The following is a solution using for loop.
var resultList = new List<string>();
var temp = new List<string>();
for(int i = 0, j = 1; j < originalList.Count; i++, j++)
{
temp.Add(originalList[i]);
if (j != originalList.Count - 1)
{
if (originalList[j].Contains("\t"))
{
resultList.Add(string.Join("|", temp));
temp.Clear();
}
}
else // when originalList[j] is the last item
{
if (originalList[j].Contains("\t"))
{
resultList.Add(string.Join("|", temp));
resultList.Add(originalList[j]);
}
else
{
temp.Add(originalList[j]);
resultList.Add(string.Join("|", temp));
}
}
}
After trying a for() solution, I tried a LINQ solution and came up with the one below. For my reasonably small (10K lines) file it was fast enough that I didn't care about the efficiency, and I found it much more readable than the equivalent for() solution.
var lines = new List<string>
{
"1235 \t This is Record 1",
"7897 \t This is Record 2",
"8977 \t This is Record 3",
"continued on the next line",
"and still continued more",
"8375 \t This is Record 4"
};
var fixedLines = lines
.Select((s, i) => new
{
Line = s,
Orphans = lines.Skip(i + 1).TakeWhile(s2 => !s2.Contains('\t'))
})
.Where(s => s.Line.Contains('\t'))
.Select(s => string.Join("|", (new string[] { s.Line }).Concat(s.Orphans).ToArray()));
You could do something like this:
string result = records.Aggregate("", (current, s) => current + (s.Contains("\t") ? "\n" + s : "|" + s));
I cheated and got ReSharper to generate this for me. This is close, although it leaves a blank line at the top.
However, as you can see, this is not very readable. I realize you're looking for a learning exercise but I'd take a nice readable foreach loop over this any day.
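For comparison, the foreach approach mentioned above might look like the following minimal sketch. The class and method names are hypothetical, and an orphan line that appears before any record is kept as its own entry, which is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;

public static class CoalesceSketch
{
    // Appends every line without a tab to the previous result entry, separated by a pipe.
    public static List<string> Coalesce(IEnumerable<string> lines)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            if (line.Contains("\t") || result.Count == 0)
                result.Add(line);                        // start of a new record
            else
                result[result.Count - 1] += "|" + line;  // orphan: attach to previous record
        }
        return result;
    }

    public static void Main()
    {
        var lines = new List<string>
        {
            "8977 \t This is Record 3",
            "continued on the next line",
            "and still continued more",
            "8375 \t This is Record 4"
        };
        foreach (string record in Coalesce(lines))
            Console.WriteLine(record);
    }
}
```

Like the first Aggregate version, this concatenates strings one at a time, so a StringBuilder per record would likely be worth considering for records with very many orphan lines.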

split a string from a text file into another list

Hi, I know the title might sound a little confusing, but I'm reading in a text file with many lines of data.
Example
12345 Test
34567 Test2
I read in the text one line at a time and add each line to a list:
using (StreamReader reader = new StreamReader("Test.txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
list.Add(line);
}
}
How do I then separate the 12345 from the Test so I can pull only the first column of data if I need to? For example, list(1).pars[1] would be 12345 and list(2).pars[2] would be test2.
I know this sounds foggy, but I hope someone out there understands.
Maybe something like this:
string test="12345 Test";
var ls= test.Split(' ');
This will get you an array of strings. You can access them with ls[0] and ls[1].
If you just want the 12345, then ls[0] is the one to choose.
If you're ok with having a list of string[]'s you can simply do this:
var list = new List<string[]>();
using (StreamReader reader = new StreamReader("Test.txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
list.Add(line.Split(' '));
}
}
string firstWord = list[0][0]; //12345
string secondWord = list[0][1]; //Test
When you have a string of text you can use the Split() method to split it in many parts. If you're sure every word (separated by one or more spaces) is a column you can simply write:
string[] columns = line.Split(' ');
There are several overloads of that function. You can specify whether blank fields are skipped (for example, columns[1] would be empty in a line composed of two words separated by two spaces). If you're sure about the number of columns you can also set that limit (so that any text after the last column is treated as a single field).
In your case (add to the list only the first column) you may write:
if (String.IsNullOrWhiteSpace(line))
    continue;
string[] columns = line.TrimStart().Split(new char[] { ' ' }, 2);
list.Add(columns[0]);
The first check skips empty lines and lines composed only of spaces. TrimStart() removes spaces from the beginning of the line (if any). The first column can't be empty (because of the TrimStart()), so you do not even need StringSplitOptions.RemoveEmptyEntries with an additional if (columns.Length > 1). Finally, if the file is small enough you can read it into memory with a single call to File.ReadAllLines() and simplify everything with a little LINQ:
list.AddRange(
    File.ReadAllLines("test.txt")
        .Where(x => !String.IsNullOrWhiteSpace(x))
        .Select(x => x.TrimStart().Split(new char[] { ' ' }, 2)[0]));
Note that with the first parameter you can specify more than one valid separator.
When you have multiple spaces:
Regex r = new Regex(" +");
string [] splitString = r.Split(stringWithMultipleSpaces);
var splitted = System.IO.File.ReadAllLines("Test.txt")
.Select(line => line.Split(' ')).ToArray();
var list1 = splitted.Select(split_line => split_line[0]).ToArray();
var list2 = splitted.Select(split_line => split_line[1]).ToArray();
