Get the last line that starts with a given number - C#

I have to process a file, but I only need its last line, and I need to check whether that line begins with the number 9. How can I do this using LINQ?
The record that begins with 9 is sometimes not the very last line of the file, because the last line can be an empty line (just \r\n).
I wrote a simple loop to do this:
var lines = File.ReadAllLines(file);
for (int i = 0; i < lines.Length; i++)
{
if (lines[i].StartsWith("9"))
{
//...
}
}
But I want to know whether there is a faster or better way to do it, ideally using LINQ. :)

string output=File.ReadAllLines(path)
.Last(x=>!Regex.IsMatch(x,@"^[\r\n]*$"));
if(output.StartsWith("9"))//found

The other answers are fine, but the following is more intuitive to me (I love self-documenting code):
Edit: misinterpreted your question, updating my example code to be more appropriate
var nonEmptyLines =
from line in File.ReadAllLines(path)
where !String.IsNullOrEmpty(line.Trim())
select line;
if (nonEmptyLines.Any())
{
var lastLine = nonEmptyLines.Last();
if (lastLine.StartsWith("9")) // or char.IsDigit(lastLine.First()) for 'any number'
{
// Your logic here
}
}

You don't need LINQ; something like the following should work:
var fileLines = File.ReadAllLines("yourpath");
if (char.IsDigit(fileLines[fileLines.Length - 1][0]))
{
//last line starts with a digit.
}
Or, to check for the specific digit 9, you can do:
if(fileLines.Last().StartsWith("9"))

if(list.Last(x =>!string.IsNullOrWhiteSpace(x)).StartsWith("9"))
{
}

Since you need to check the last two lines (in case the last line is a newline), you can do this. You can change lines to however many last lines you want to check.
int lines = 2;
if(File.ReadLines(file).Reverse().Take(lines).Any(x => x.StartsWith("9")))
{
//one of the last X lines starts with 9
}
else
{
//none of the last X lines start with 9
}
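The answers above can be combined into a single check that skips trailing blank lines and copes with an empty file. This is only a sketch: the class and method names (LastLineCheck, LastNonBlankStartsWith) are made up for illustration, and the method takes any sequence of lines so that it can be fed from File.ReadLines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastLineCheck
{
    // Returns true when the last non-blank line starts with the given prefix.
    // Returns false when there is no non-blank line at all.
    public static bool LastNonBlankStartsWith(IEnumerable<string> lines, string prefix)
    {
        // LastOrDefault yields null instead of throwing when nothing matches.
        string last = lines.LastOrDefault(l => !string.IsNullOrWhiteSpace(l));
        return last != null && last.StartsWith(prefix, StringComparison.Ordinal);
    }
}
```

Assuming a file path in a variable named path, it could be called as LastLineCheck.LastNonBlankStartsWith(File.ReadLines(path), "9"). File.ReadLines streams the file rather than loading it into an array, but the whole file is still scanned once.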

Related

Search and Replace inside a List with c#

I am new to programming. At this stage I am using automation software that supports C# and JS.
Is it possible to search each line and replace a word?
Example data
List name: A
Sample data a
Sample data b
Sample data c
I want to write C# code so that wherever there is "a", it changes to "x1".
The code below is the closest I have found, but it replaces the whole line instead. My goal is to replace only a particular word.
If there is a way to define multiple matches, that would be great.
a > x1
b > x2
c > x3
The code below, which I found in this forum, replaces every whole line that contains a number with 1:
var sourceList = project.Lists["A"]; // define list name.
var parserRegex = new Regex("\\d{1,2}"); // matches any run of one or two digits
lock(SyncObjects.ListSyncer)
{
for(int i=0; i < sourceList.Count; i++) // loop through each line.
{
if (parserRegex.IsMatch(sourceList[i])) // to check if there is a match
{
sourceList[i]="1"; // This does the replacing, but it replaces the whole line, not just the matched text.
}
}
}
With your help, I think I have got what I wanted.
Here is my final code, which works for now. It is not perfect, but it serves my purpose.
One question remains: how do I make the replacement of "a" apply only to the standalone word, so that it
turns "Sample data a" into "Sample data x1"
but not into "Sx1mple dx1tx1 x1"?
Code:
var sourceList = project.Lists["A-Source"]; // define list name.
var parserRegex = new Regex("data"); // matches the word "data" (only used by the commented-out check below)
lock(SyncObjects.ListSyncer)
{
for(int i=0; i < sourceList.Count; i++) // loop through each line.
{
// if (parserRegex.IsMatch(sourceList[i])) // to check if there is a match. This line is commented out and it still works.
{
sourceList[i]= sourceList[i].Replace("a", "x1")
.Replace("b","x2")
.Replace("c","x3")
.Replace("<p>","")
.Replace("<strong>",""); // I added two more calls to remove tags like <p> and <strong>; it works!
}
}
}
In each Replace pair, the left part is the target and the right part is the replacement text.
In real examples the target can't just be "a", "b" or "c", because "a" will match not only the standalone "a" but also the letter "a" inside a word like "data".
C# is powerful, thanks for the generous input!
As Johannes mentioned in the comments, in C# you can use String.Replace():
sourceList[i]= sourceList[i].Replace("a", "x1");
This should work:
var sourceList = project.Lists["A"]; // define list name.
string pattern = @"(?'matched'\d*)";
for (int i = 0; i < sourceList.Count; i++) // loop through each line.
{
foreach (Match m in Regex.Matches(sourceList[i], pattern))
{
Group g = m.Groups["matched"];
if (!string.IsNullOrEmpty(g.Value))
{
sourceList[i] = sourceList[i].Replace(g.Value, "newvalue");
}
}
}
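For the remaining question (replace the standalone word "a" without touching the "a" inside "data"), one possible approach is a regular expression with \b word boundaries and a lookup table for the a > x1, b > x2, c > x3 pairs. The sketch below is not taken from any of the answers; the names WordReplacer and ReplaceWholeWords are hypothetical, and it only suits word-like targets (letters, digits, underscore), not tags such as <p>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordReplacer
{
    // Replaces every whole-word occurrence of a key in `map` with its value.
    // The input is scanned once, so replacement text is never re-replaced.
    public static string ReplaceWholeWords(string input, IDictionary<string, string> map)
    {
        // Builds a pattern such as \b(a|b|c)\b; keys are escaped so they are taken literally.
        string pattern = @"\b(" + string.Join("|", map.Keys.Select(Regex.Escape)) + @")\b";
        return Regex.Replace(input, pattern, m => map[m.Value]);
    }
}
```

Inside the loop from the question, this could be used as sourceList[i] = WordReplacer.ReplaceWholeWords(sourceList[i], map), where map is a Dictionary<string, string> holding the pairs. Matching is case-sensitive here, so "A" would be left alone.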

Merging CSV lines in huge file

I have a CSV that looks like this
783582893T,2014-01-01 00:00,0,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582893T,2014-01-01 00:15,1,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582893T,2014-01-01 00:30,2,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582855T,2014-01-01 00:00,0,128,35.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582855T,2014-01-01 00:15,1,128,35.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582855T,2014-01-01 00:30,2,128,35.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
...
783582893T,2014-01-02 00:00,0,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582893T,2014-01-02 00:15,1,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582893T,2014-01-02 00:30,2,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
although there are 5 billion records. Notice that, going by the first column and part of the second column (the day), three of the records are 'grouped' together; they are just a breakdown, in 15-minute intervals, of the first 30 minutes of that day.
I want the output to look like
783582893T,2014-01-01 00:00,0,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
783582855T,2014-01-01 00:00,0,128,35.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
...
783582893T,2014-01-02 00:00,0,124,29.1,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y,40.0,0.0,40,40,5,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,40,Y
Here the first 4 columns of the repeating rows are omitted and the rest of the columns are appended to the first record of its kind. Basically, I am converting the file from one line per 15 minutes to one line per day.
Since I will be processing 5 billion records, I think the best approach is to use regular expressions (and EmEditor) or some tool made for this (multithreaded, optimized) rather than a custom-programmed solution. That said, I am open to ideas in Node.js or C# that are relatively simple and very quick.
How can this be done?
If there's always a set number of records and they're in order, it'd be fairly easy to just read a few lines at a time, parse them, and output them. Trying to run a regex over billions of records would take forever. Using StreamReader and StreamWriter should make it possible to read and write these large files, since they read and write one line at a time.
using (StreamReader sr = new StreamReader("inputFile.txt"))
using (StreamWriter sw = new StreamWriter("outputFile.txt"))
{
string line1;
var lineCountToGroup = 3; //change to 96
while ((line1 = sr.ReadLine()) != null)
{
var lines = new List<string>();
lines.Add(line1);
for(int i = 0; i < lineCountToGroup - 1; i++) //less 1 because we already added line1
lines.Add(sr.ReadLine());
// Example grouping logic (needs System.Linq): keep the first line whole, then append
// every later line without its first 4 columns. Adjust to whatever your grouping logic is.
var rest = lines.Skip(1).Where(l => l != null).Select(l => string.Join(",", l.Split(',').Skip(4)));
var groupedLine = string.Join(",", new[] { lines[0] }.Concat(rest));
sw.WriteLine(groupedLine);
}
}
Disclaimer: untested code with no error handling, assuming that the correct number of lines is indeed repeated, etc. You'd obviously need to do some tweaks for your exact scenario.
You could do something like this (untested code without any error handling - but should give you the general gist of it):
using (var sin = new StreamReader("yourfile.csv"))
using (var sout = new StreamWriter("outfile.csv"))
{
var line = sin.ReadLine(); // note: should add error handling for empty files
var cells = line.Split(','); // note: you should probably check the length too!
var key = cells[0]; // use this to match other rows
StringBuilder output = new StringBuilder(line); // this is the output line we build
while ((line = sin.ReadLine()) != null) // if we have more lines
{
cells = line.Split(','); // split so we can get the first column
if (cells[0] == key) // if the first column matches the current key
{
output.Append(",").Append(String.Join(",", cells.Skip(4))); // add this row to our output line
}
else // once the key changes
{
sout.WriteLine(output.ToString()); // write out the line we've built up
output.Clear();
output.Append(line); // update the new line to build
key = cells[0]; // and update the key
}
}
// once all lines have been processed
sout.WriteLine(output.ToString()); // We'll have just the last line to write out
}
The idea is to loop through each line in turn and keep track of the current value of the first column. When that value changes, you write out the output line you've been building up and update the key. This way you don't have to worry about exactly how many matches you have or if you might be missing a few points.
One note: output is a StringBuilder rather than a String because that is more efficient when you are going to concatenate 96 rows.
Define ProcessOutputLine to store the merged lines.
Call ProcessInputLine after each ReadLine, and call ProcessOutputLine with the remaining outputLine at the end of the file.
string curKey = "";
int keyLength = ... ; // set to the total length of the first 4 columns
string outputLine = "";
private void ProcessInputLine(string line)
{
string newKey = line.Substring(0, keyLength);
if (newKey == curKey) outputLine += line.Substring(keyLength);
else
{
if (outputLine != "") ProcessOutputLine(outputLine);
curKey = newKey;
outputLine = line;
}
}
EDIT: this solution is very similar to Matt Burland's; the only noticeable difference is that I don't use the Split function.
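The key-change idea in the answers above can be sketched so that the key is the first column plus the day, as the question describes. Several things here are assumptions: the names CsvDayMerger, KeyOf and Merge are invented, the date is assumed to be the part of the second column before the space, rows for one key are assumed to be contiguous, and the number of leading columns to drop is left as a parameter because the question's text says 4 while its sample output appears to drop 5.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class CsvDayMerger
{
    // Hypothetical key: first column plus the date part of the second column.
    static string KeyOf(string[] cells)
    {
        return cells[0] + "|" + cells[1].Split(' ')[0];
    }

    // Streams the input line by line; only one merged output line is held in memory.
    public static void Merge(TextReader input, TextWriter output, int columnsToSkip)
    {
        string currentKey = null;
        var merged = new StringBuilder();
        string line;
        while ((line = input.ReadLine()) != null)
        {
            var cells = line.Split(',');
            if (cells.Length < 2) continue; // skip blank or malformed lines

            string key = KeyOf(cells);
            if (key == currentKey)
            {
                // Same group: append the row without its leading columns.
                merged.Append(',').Append(string.Join(",", cells.Skip(columnsToSkip)));
            }
            else
            {
                // Key changed: flush the finished group and start a new one.
                if (currentKey != null) output.WriteLine(merged.ToString());
                merged.Clear().Append(line);
                currentKey = key;
            }
        }
        if (currentKey != null) output.WriteLine(merged.ToString()); // flush the last group
    }
}
```

With files, it could be driven by a StreamReader and StreamWriter inside using blocks, for example CsvDayMerger.Merge(reader, writer, 4). Taking TextReader and TextWriter rather than file names keeps the sketch easy to try on a small in-memory sample first.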

Next textbox line

I have a multi-line textbox of sequences, which the game will play one after the other. For example, the textbox may contain this:
RGBY
YGBR
RGBB
I understand that to read the first line of a multi-line textbox, I must write this:
First sequence:
textBox1.Lines[0].Length //Reads first line only for sequence 1
But how can I make it read the next line in a general sense? n+1 where n is the previous line.
New sequence:
textBox1.Lines[0 + 1].Length //Go to next line for future sequences
Any help is appreciated. Thank you in advance!
You need to store the current index in a variable, a field or property in your class.
private int CurrentIndex { get; set; }
Now you can iterate all lines, for example in a button-click event handler where you want to advance to the next line until end:
if (CurrentIndex + 1 < textBox1.Lines.Length)
{
string currentLine = textBox1.Lines[++CurrentIndex];
}
for(int i=0; i < textBox1.Lines.Count(); i++)
{
var currentLine = textBox1.Lines[i];
// do what you want with current line
}

Reading every second line of a file - C#

Suppose this is my txt file:
line1
line2
line3
line4
line5
I'm reading the content of this file with:
string line;
List<string> stdList = new List<string>();
StreamReader file = new StreamReader(myfile);
try
{
while ((line = file.ReadLine()) != null)
{
stdList.Add(line);
}
}
finally
{//need help here
}
Now I want to read the data in stdList, but only the value on every second line (in this case I have to read "line2" and "line4").
Can anyone point me in the right direction?
Even shorter than Yuck's approach and it doesn't need to read the whole file into memory in one go :)
var list = File.ReadLines(filename)
.Where((ignored, index) => index % 2 == 1)
.ToList();
Admittedly it does require .NET 4. The key part is the overload of Where which provides the index as well as the value for the predicate to act on. We don't really care about the value (which is why I've named the parameter ignored) - we just want odd indexes. Obviously we care about the value when we build the list, but that's fine - it's only ignored for the predicate.
You can simplify your file read logic into one line, and just loop through every other line this way:
var lines = File.ReadAllLines(myFile);
for (var i = 1; i < lines.Length; i += 2) {
// do something
}
EDIT: Starting at i = 1 which is line2 in your example.
Add a conditional block and a tracking counter to the loop. (Declare the counter once, before the loop; the rest is the body of the loop.)
int linesProcessed = 0; // declare this before the loop, not inside it
if( linesProcessed % 2 == 1 ){
// Read the line.
stdList.Add(line);
}
else{
// Don't read the line (Do nothing.)
}
linesProcessed++;
The line linesProcessed % 2 == 1 says: take the number of lines we have processed already, and find the mod 2 of this number. (The remainder when you divide that integer by 2.) That will check to see if the number of lines processed is even or odd.
If you have processed no lines, it will be skipped (such as line 1, your first line.) If you have processed one line or any odd number of lines already, go ahead and process this current line (such as line 2.)
If modular math gives you any trouble, see the question: https://stackoverflow.com/a/90247/758446
Try this:
string line;
List<string> stdList = new List<string>();
StreamReader file = new StreamReader(myfile);
try
{
while (file.ReadLine() != null) // reads and discards line1, line3, ...
{
line = file.ReadLine(); // line2, line4, ... (null at end of file)
if (line != null)
stdList.Add(line);
}
}
finally
{
file.Dispose();
}

how to skip lines in txt file

Hey guys I've been having some trouble skipping some unnecessary lines from the txt file that I am reading into my program. The data has the following format:
Line 1
Line 2
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
I want to read line 1, trim lines 3, 4 and the white space and then read line 5, trim lines 7 and 8. I have read something similar to this here on this website, however, that particular case was skipping the first 5 lines of the text file. This is what I have tried so far:
string TextLine;
System.IO.StreamReader file =
new System.IO.StreamReader("C://log.txt");
while ((TextLine = file.ReadLine()) != null)
{
foreach (var i in Enumerable.Range(2, 3)) file.ReadLine();
Console.WriteLine(TextLine);
}
As you guys can see, for the range, I have specified the start as line 2 and then skip 3 lines, which includes the white space. However, that first parameter of Enumerable.Range does not seem to matter. I can put a 0 and it will yield the same results. As I have it right now, the program trims from the first line, until the number specified in the second parameter of the .Range function. Does anyone know of a way to get around this problem? Thanks
Why not read all the lines into an array and then just index the ones you want?
var lines = File.ReadAllLines("C://log.txt");
Console.WriteLine(lines[0]);
Console.WriteLine(lines[5]);
If it's a really big file with consistent repeating sections you can create a read method and do:
while (!file.EndOfStream)
{
yield return file.ReadLine();
yield return file.ReadLine();
file.ReadLine();
file.ReadLine();
file.ReadLine();
}
or similar for whatever block format you need.
Here's an expanded version of the solution provided here at the OP's request.
public static IEnumerable<string> getMeaningfulLines(string filename)
{
System.IO.StreamReader file =
new System.IO.StreamReader(filename);
while (!file.EndOfStream)
{
//keep two lines that we care about
yield return file.ReadLine();
yield return file.ReadLine();
//discard three lines that we don't need
file.ReadLine();
file.ReadLine();
file.ReadLine();
}
}
public static void Main()
{
foreach(string line in getMeaningfulLines(@"C:/log.txt"))
{
//or do whatever else you want with the "meaningful" lines.
Console.WriteLine(line);
}
}
Here is another version that's going to be a little bit less fragile if the input file ends abruptly.
//Just get all lines from a file as an IEnumerable; handy helper method in general.
public static IEnumerable<string> GetAllLines(string filename)
{
System.IO.StreamReader file =
new System.IO.StreamReader(filename);
while (!file.EndOfStream)
{
yield return file.ReadLine();
}
}
public static IEnumerable<string> getMeaningfulLines2(string filename)
{
int counter = 0;
//This will yield when counter is 0 or 1, and not when it's 2, 3, or 4.
//The result is yield two, skip 3, repeat.
foreach(string line in GetAllLines(filename))
{
if(counter < 2)
yield return line;
//add one to the counter and have it wrap,
//so it is always between 0 and 4 (inclusive).
counter = (counter + 1) % 5;
}
}
Of course the start of the range doesn't matter: what you're doing is skipping 3 lines inside every while-loop iteration, and the values produced by the range (2, 3, 4) have no effect on the file reader's position. I would suggest you just keep a counter telling you which line you are on, and skip the line if its number is one of those you'd like to skip, e.g.
int currentLine = 1;
while ((TextLine = file.ReadLine()) != null)
{
if ( LineEnabled( currentLine )){
Console.WriteLine(TextLine);
}
currentLine++;
}
private bool LineEnabled( int lineNumber )
{
if ( lineNumber == 2 || lineNumber == 3 || lineNumber == 4 ){ return false; }
return true;
}
I don't think you want to read lines in two places (once in the loop condition and then again inside the loop body). I would take this approach:
while ((TextLine = file.ReadLine()) != null)
{
if (string.IsNullOrWhiteSpace(TextLine)) // Or any other conditions
continue;
Console.WriteLine(TextLine);
}
The documentation for Enumerable.Range states:
public static IEnumerable<int> Range(
int start,
int count
)
Parameters
start
Type: System.Int32
The value of the first integer in the sequence.
count
Type: System.Int32
The number of sequential integers to generate.
So changing the first parameter won't change the logic of your program.
However, this is an odd way to do this. A for loop would be much simpler, easier to understand and far more efficient.
Also, your code currently reads the first line, skips three lines, outputs the first line, and then repeats.
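To see why the first argument makes no difference to the loop in the question, it can help to count the iterations directly. The snippet below is only an illustration; RangeDemo and CountIterations are made-up names.

```csharp
using System;
using System.Linq;

public static class RangeDemo
{
    // Counts how many times a foreach over Enumerable.Range(start, count) runs.
    public static int CountIterations(int start, int count)
    {
        int iterations = 0;
        foreach (var unused in Enumerable.Range(start, count))
        {
            iterations++; // the generated value itself is never used, just as in the question
        }
        return iterations;
    }
}
```

Both RangeDemo.CountIterations(2, 3) and RangeDemo.CountIterations(0, 3) run the loop body three times; only the generated values differ (2, 3, 4 versus 0, 1, 2). Since the question's loop body ignores those values and just calls file.ReadLine(), three lines are discarded either way.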
Have you tried something like this?
using (var file = new StreamReader("C://log.txt"))
{
var lineCt = 0;
string line;
while ((line = file.ReadLine()) != null)
{
lineCt++;
//logic for lines to keep
if (lineCt == 1 || lineCt == 5)
{
Console.WriteLine(line);
}
}
}
Although unless this is an extremely fixed format input file I'd find a different way to figure out what to do with each line rather than a fixed line number.
