I have a text file and need the total count of lines that start with "3". That count is already stored in the file, on the line that starts with "7200", at position 05 with length 6. Similarly, the total amount is stored on the same "7200" line, at position 21 with length 12.
211 87236486287346872837468724682871238483XYZ BANK
1200ABCDEF 8128361287AXTAKJ COLL
3270210000893281012870095628 00002500 8981273687jhgsjhdg
3270210000896281712870095628 00002500 1231273687jhgajhdj
3270210000891286712870095628 00002500 4561273687cxvnmbal
3270210000899283612870095628 00002500 7891273687nmkdkjhk
720000000400021000080000000100000000000000008128361287
9000001000001000000010002100008000000010000000000000000
For example, in my file the total count of lines starting with "3" is on the line that starts with "7", i.e. "000004".
The total amount is on the same line, i.e. "000000010000".
Currently I use the C# code below to loop over the entire file, navigate to the line that starts with "7", and read the values at the positions mentioned above. It takes too much time because the file can be large, around 200K records.
foreach (var line in FileLines)
{
//// If line length is zero, then do nothing
if (line.Length == 0)
{
continue;
}
switch (line.Substring(1, 1))
{
case "7":
totalCount = int.Parse(line.Substring(4, 6));
TotalAmount = line.Substring(20, 12);
break;
default:
throw new Exception();
}
}
Is there any way to rewrite my code using LINQ so that I get somewhat better performance?
Here is the LINQ statement. What makes it more efficient is that it uses Reverse, since you mention that the information you're looking for is in the footer.
static void Main(string[] args)
{
var path = Path.Combine(
Path.GetDirectoryName(Assembly.GetEntryAssembly().Location),
"TextFile.txt");
try
{
var count =
int.Parse(
File.ReadAllLines(path)
.Reverse()
.First(line => line.Any() && (line.First() == '7'))
.Substring(4, 6));
Console.WriteLine($"Count = {count}");
}
catch (Exception ex)
{
System.Diagnostics.Debug.Assert(false, ex.Message);
}
}
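The question also asks for the total amount, which the statement above does not read. As a minimal sketch, assuming the footer layout described in the question (count at position 5 with length 6, amount at position 21 with length 12), both fields could be pulled from the footer line like this. The names FooterReader, ParseFooter and FindFooter are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FooterReader
{
    // Hypothetical helper: extracts the two fixed-width fields from a "7" record.
    // The positions follow the layout described in the question (1-based 05 and 21).
    public static (int Count, string Amount) ParseFooter(string line)
    {
        if (line == null || line.Length < 32 || line[0] != '7')
            throw new FormatException("Not a footer record.");

        int count = int.Parse(line.Substring(4, 6));   // position 05, length 6
        string amount = line.Substring(20, 12);        // position 21, length 12
        return (count, amount);
    }

    // Picks the last non-empty line that starts with '7' and parses it.
    public static (int Count, string Amount) FindFooter(IEnumerable<string> lines)
    {
        string footer = lines.Last(l => l.Length > 0 && l[0] == '7');
        return ParseFooter(footer);
    }
}
```

The amount is returned as the raw 12-character string, because the question does not say how it should be interpreted (for example, whether decimals are implied). FindFooter accepts any sequence of lines, so it can be fed from File.ReadLines(path) or from an array that is already in memory.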
EDIT
You have asked a great question about the performance. The great thing is that we don't have to speculate or guess! There is always a way to measure performance.
Here's the benchmark I just put together. And look, I did it really quickly so if anyone spots something I missed please point it out. But here's what I get:
static void Main(string[] args)
{
var path = Path.Combine(
Path.GetDirectoryName(Assembly.GetEntryAssembly().Location),
"TextFile.txt");
try
{
// 200K lines of random guids
List<string> builder =
Enumerable.Range(0, 200000)
.Select(n => $"{{{System.Guid.NewGuid().ToString()}}}")
.ToList();
var footer =
File.ReadAllLines(path);
builder.AddRange(footer);
var FileLines = builder.ToArray();
var benchmark = new System.Diagnostics.Stopwatch();
benchmark.Start();
int totalCount = int.MinValue;
foreach (var line in FileLines)
{
//// If line length is zero, then do nothing
if (line.Length == 0)
{
continue;
}
// Original code from post
// switch (line.Substring(1, 1))
// Should be:
switch (line.Substring(0, 1))
{
case "7":
totalCount = int.Parse(line.Substring(4, 6));
// This is another issue!! Breaking from the switch DOESN'T break from the loop
break;
// SHOULD BE: goto breakFromInner;
// One of the few good reasons to use a goto statement!!
}
}
benchmark.Stop();
Console.WriteLine($"200K lines using Original code: Elapsed = {benchmark.Elapsed}");
Console.WriteLine($"Count = {totalCount}");
benchmark.Restart();
for (int i = FileLines.Length - 1; i >= 0; i--)
{
var line = FileLines[i];
//// If line length is zero, then do nothing
if (line.Length == 0)
{
continue;
}
// Original code from post
// switch (line.Substring(1, 1))
// Should be:
switch (line.Substring(0, 1))
{
case "7":
totalCount = int.Parse(line.Substring(4, 6));
// One of the few good reasons to use a goto statement!!
goto breakFromInner;
}
}
// See note
breakFromInner:
benchmark.Stop();
Console.WriteLine($"200K lines using Original code with reverse: Elapsed = {benchmark.Elapsed}");
Console.WriteLine($"Count = {totalCount}");
benchmark.Restart();
var count =
int.Parse(
FileLines
.Reverse()
.First(line => line.Any() && (line.First() == '7'))
.Substring(4, 6));
benchmark.Stop();
Console.WriteLine($"200K lines using Linq with Reverse: Elapsed = {benchmark.Elapsed}");
Console.WriteLine($"Count = {count}");
}
catch (Exception ex)
{
System.Diagnostics.Debug.Assert(false, ex.Message);
}
}
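The goto in the reverse loop is only needed because break leaves the switch, not the loop. One way to avoid it, sketched here under the assumption that the lines are already in memory, is to move the scan into a small method so that return exits the loop directly. ReverseScan and FindLastStartingWith are hypothetical names, not part of the benchmark above.

```csharp
using System;
using System.Collections.Generic;

public static class ReverseScan
{
    // Hypothetical helper: walks the list backwards and returns the last
    // non-empty line whose first character is `marker`, or null if none exists.
    public static string FindLastStartingWith(IReadOnlyList<string> lines, char marker)
    {
        for (int i = lines.Count - 1; i >= 0; i--)
        {
            string line = lines[i];
            if (line.Length > 0 && line[0] == marker)
            {
                return line; // return leaves the loop immediately, no goto needed
            }
        }
        return null;
    }
}
```

Comparing line[0] to a char also avoids allocating a one-character string per line, which Substring(0, 1) does.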
I have a text file that is divided up into many sections, each about 10 or so lines long. I'm reading in the file using File.ReadAllLines into an array, one line per element of the array, and then I'm trying to parse each section of the file to bring back just some of the data. I'm storing the results in a list, and hoping to export the list to csv ultimately.
My for loop is giving me trouble, as it loops the right number of times, but only pulls the data from the first section of the text file each time, rather than pulling the data from the first section and then moving on to the next section. I'm sure I'm doing something wrong either in my for loop or in my foreach loop. Any clues to help me solve this would be much appreciated! Thanks
David
My code so far:
namespace ParseAndExport
{
class Program
{
static readonly string sourcefile = @"Path";
static void Main(string[] args)
{
string[] readInLines = File.ReadAllLines(sourcefile);
int counter = 0;
int holderCPStart = counter + 3;//Changed Paths will be a different number of lines each time, but will always start 3 lines after the startDiv
/*Need to find the start of the section and the end of the section and parse the bit in between.
* Also need to identify the blank line that occurs in each section as it is essentially a divider too.*/
int startDiv = Array.FindIndex(readInLines, counter, hyphens72);
int blankLine = Array.FindIndex(readInLines, startDiv, emptyElement);
int endDiv = Array.FindIndex(readInLines, counter + 1, hyphens72);
List<string> results = new List<string>();
//Test to see if FindIndexes work. Results should be 0, 7, 9 for 1st section of sourcefile
/*Console.WriteLine(startDiv);
Console.WriteLine(blankLine);
Console.WriteLine(endDiv);*/
//Check how long the file is so that for testing we know how long the while loop should run for
//Console.WriteLine(readInLines.Length);
//sourcefile has 5255 lines (elements) in the array
for (int i = 0; i <= readInLines.Length; i++)
{
if (i == startDiv)
{
results = (readInLines[i + 1].Split('|').Select(p => p.Trim()).ToList());
string holderCP = string.Join(Environment.NewLine, readInLines, holderCPStart, (blankLine - holderCPStart - 1)).Trim();
results.Add(holderCP);
string comment = string.Join(" ", readInLines, blankLine + 1, (endDiv - (blankLine + 1)));//in case the comment is more than one line long
results.Add(comment);
i = i + 1;
}
else
{
i = i + 1;
}
foreach (string result in results)
{
Console.WriteLine(result);
}
//csvcontent.AppendLine("Revision Number, Author, Date, Time, Count of Lines, Changed Paths, Comments");
/* foreach (string result in results)
{
for (int x = 0; x <= results.Count(); x++)
{
StringBuilder csvcontent = new StringBuilder();
csvcontent.AppendLine(results[x] + "," + results[x + 1] + "," + results[x + 2] + "," + results[x + 3] + "," + results[x + 4] + "," + results[x + 5]);
x = x + 6;
string csvpath = @"addressforcsvfile";
File.AppendAllText(csvpath, csvcontent.ToString());
}
}*/
}
Console.ReadKey();
}
private static bool hyphens72(String h)
{
if (h == "------------------------------------------------------------------------")
{
return true;
}
else
{
return false;
}
}
private static bool emptyElement(String ee)
{
if (ee == "")
{
return true;
}
else
{
return false;
}
}
}
}
It looks like you are trying to grab all of the lines in the file that are not "------" and put them into a list of strings.
You can try this:
var linesWithoutDashes = readInLines.Where(x => !hyphens72(x)).ToList();
Now you can take this list and split each line on '|' to extract the fields you want.
The logic seems wrong, and there are also issues with the code itself. I am unsure what precisely you're trying to do. Anyway, here are a few hints that I hope will help:
1. The if (i == startDiv) check tests whether i equals startDiv. I assume the logic that runs when this condition is met is what you refer to as "pulls the data from the first section". That is expected, given that you only run this code when i equals startDiv.
2. You increase the counter i inside the body of the for loop, and the for statement itself also increases i, so every second line is skipped.
3. Even if the issue in 2. did not exist, I'd suggest not repeating the same operation "i = i + 1" in both the true and false branches of the if (i == startDiv).
4. Given that I assume this file might actually be massive, it's probably a good idea not to store it all in memory, but to read and process the file line by line. There's currently no obvious reason to consume this much memory, other than the convenience of the File.ReadAllLines(sourcefile) API. I wouldn't be too scared to read the file like this:
using (var reader = new StreamReader(sourcefile))
{
string line;
while ((line = reader.ReadLine()) != null)
{
// process the line.
}
}
You can skip the lines until you've passed where the line equals hyphens72.
Then for each line, you process the line with the code you provided in the true case of (i == startDiv), or at least, from what you described, this is what I assume you are trying to do.
int startDiv will return the line number that contains hyphens72.
So your current for loop will only copy to results for the single line that matches the calculated line number.
I guess you want to search for the position of startDiv in the current line?
string hyphens72 = new string('-', 72);
// loop over lines
for (var lineNumber = 0; lineNumber < readInLines.Length; lineNumber++) {
string currentLine = readInLines[lineNumber];
int startDiv = currentLine.IndexOf(hyphens72);
// loop over characters in line
for (var charIndex = 0; charIndex < currentLine.Length; charIndex++) {
if (charIndex == startDiv) {
var currentCharacter = currentLine[charIndex];
// write to result ...
}
else {
continue; // skip this character
}
}
}
There are several things which could be improved.
I would use File.ReadLines over File.ReadAllLines, because ReadAllLines reads all the lines at once, while ReadLines streams them.
With the line results = (readInLines[i + 1].Split('|').Select(p => p.Trim()).ToList()); you're overwriting the previous results list. You'd better use results.AddRange() to add new results.
for (int i = 0; i <= readInLines.Length; i++) means that when the length is 10 it will do 11 iterations (one too many). Remove the =.
Array.FindIndex(readInLines, counter, hyphens72); will do a scan. On large files it will take ages to read them completely and search through them. Try to touch each line only once.
I cannot test what you are doing, but here's a hint:
IEnumerable<string> readInLines = File.ReadLines(sourcefile);
bool started = false;
List<string> results = new List<string>();
foreach(var line in readInLines)
{
// skip empty lines
if(emptyElement(line))
continue;
// when dashes are found, flip a boolean to activate the reading mode.
if(hyphens72(line))
{
// flip state.. (start/end)
started = !started;
}
if(started)
{
// I don't know what you are doing here precisely, do what you gotta do. ;-)
results.AddRange((line.Split('|').Select(p => p.Trim()).ToList()));
string holderCP = string.Join(Environment.NewLine, readInLines, holderCPStart, (blankLine - holderCPStart - 1)).Trim();
results.Add(holderCP);
string comment = string.Join(" ", readInLines, blankLine + 1, (endDiv - (blankLine + 1)));//in case the comment is more than one line long
results.Add(comment);
}
}
foreach (string result in results)
{
Console.WriteLine(result);
}
You might want to start with a class like this. I don't know whether each section begins with a row of hyphens, or if it's just in between. This should handle either scenario.
What this is going to do is take your giant list of strings (the lines in the file) and break it into chunks - each chunk is a set of lines (10 or so lines, according to your OP.)
The reason is that it's unnecessarily complicated to try to read the file, looking for the hyphens, and process the contents of the file at the same time. Instead, one class takes the input and breaks it into chunks. That's all it does.
Another class might read the file and pass its contents to this class to break them up. Then the output is the individual chunks of text.
Another class can then process those individual sections of 10 or so lines without having to worry about hyphens or what separates one chunk from another.
Now that each of these classes is doing its own thing, it's easier to write unit tests for each of them separately. You can test that your "processing" class receives an array of 10 or so lines and does whatever it's supposed to do with them.
public class TextSectionsParser
{
private readonly string _delimiter;
public TextSectionsParser(string delimiter)
{
_delimiter = delimiter;
}
public IEnumerable<IEnumerable<string>> ParseSections(IEnumerable<string> lines)
{
var result = new List<List<string>>();
var currentList = new List<string>();
foreach (var line in lines)
{
if (line == _delimiter)
{
if(currentList.Any())
result.Add(currentList);
currentList = new List<string>();
}
else
{
currentList.Add(line);
}
}
if (currentList.Any() && !result.Contains(currentList))
{
result.Add(currentList);
}
return result;
}
}
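If the file is large, the same chunking idea can be written lazily so that only one section is held in memory at a time. The following is a minimal sketch of that variation, not the class above: SectionSplitter and Split are hypothetical names, and the delimiter is assumed to be a line that matches exactly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SectionSplitter
{
    // Hypothetical lazy variant: yields each section as soon as its closing
    // delimiter (or the end of the input) is reached. Empty sections are skipped.
    public static IEnumerable<List<string>> Split(IEnumerable<string> lines, string delimiter)
    {
        var current = new List<string>();
        foreach (var line in lines)
        {
            if (line == delimiter)
            {
                if (current.Count > 0)
                    yield return current;
                current = new List<string>();
            }
            else
            {
                current.Add(line);
            }
        }

        // Emit a trailing section that was not closed by a delimiter.
        if (current.Count > 0)
            yield return current;
    }
}
```

A caller could then write foreach (var section in SectionSplitter.Split(File.ReadLines(sourcefile), new string('-', 72))) and hand each section to whatever class does the parsing.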
I have a console app that takes hundreds of small files, puts them into a temporary DataTable and then copies the data to a single StreamWriter. That works fine.
However, the console output continually adds "..." during the StreamWriter copy process, which is a bit annoying.
Is there any way to turn this off, or just replace it with something else, such as a blinking "."?
Here's a cut down version of the code being used:
Console.WriteLine("Writing to TA_{0}", fileType);
var streamMaster = new StreamWriter(Settings.WorkingDirectory + "TA_" + fileType, true);
streamMaster.Flush();
foreach (var tempFile in filesList)
{
var isZipped = tempFile.Contains(".gz");
var dtTempFile = InternalUtils.GetTable(tempFile, isZipped);
foreach (DataRow row in dtTempFile.Rows)
{
if(dtTempFile.Rows.IndexOf(row) != 0) streamMaster.WriteLine(String.Join(",", row.ItemArray));
}
streamMaster.Write(dtTempFile.Copy());
dtTempFile.Dispose();
}
streamMaster.Close();
streamMaster.Dispose();
Console.WriteLine("TA_{0} Complete", fileType);
The output looks a lot like this:
Console Output
Any ideas?
To show progress you could use a console animation. It cycles through a few symbols such as |, / and \, and resets the cursor position after each write so that the symbol appears to spin.
Console.WriteLine("Writing to TA_{0}", fileType);
using(var streamMaster = new StreamWriter(Settings.WorkingDirectory + "TA_" + fileType, true))
{
streamMaster.Flush();
int counter = 0;
foreach (var tempFile in filesList)
{
ShowAnimation(++counter);
var isZipped = tempFile.Contains(".gz");
var dtTempFile = InternalUtils.GetTable(tempFile, isZipped);
foreach (DataRow row in dtTempFile.Rows)
{
if(dtTempFile.Rows.IndexOf(row) != 0)
streamMaster.WriteLine(String.Join(",", row.ItemArray));
}
streamMaster.Write(dtTempFile.Copy());
dtTempFile.Dispose();
}
}
Console.WriteLine("TA_{0} Complete", fileType);
ShowAnimation Method:
public void ShowAnimation (int counter)
{
switch (counter % 4)
{
case 0: Console.Write("/"); break;
case 1: Console.Write("-"); break;
case 2: Console.Write("\\"); break;
case 3: Console.Write("|"); break;
}
Console.SetCursorPosition(Console.CursorLeft - 1, Console.CursorTop);
}
Turns out there was a function several layers in that did this...
If m_intRecord Mod 1000 = 0 Then
Console.Write(".")
End If
I must have completely overlooked it! Whoops!
I am working on an assignment which stores data from a .csv file into arrays. I have used for(int i = 0; i < data.Length; i++), but i++ is reported as unreachable. Have a look at the code and you will see. The problem is perhaps only in the storing. Help me if you can.
Thanks
static void Load(string[] EmployeeNumbers, string[] EmployeeNames, string[] RegistrationNumbers, float[] EngineCapacityArray,
int[] StartKilometresArray, int[] EndKilometresArray, string[] TripDescriptions, bool[] PassengerCarriedArray,
ref int NextAvailablePosition, ref int RecordCount, ref int CurrentRecord)
{
string str = "";
FileStream fin;
string[] data;
bool tval = false;
// Open the input file
try
{
fin = new FileStream("carallowance.csv", FileMode.Open);
}
catch (IOException exc)
{
Console.WriteLine(exc.Message);
return;
}
// Read each line of the file
StreamReader fstr_in = new StreamReader(fin);
try
{
while ((str = fstr_in.ReadLine()) != null)
{
// Separate the line into the name and age
data = str.Split(';');
if (data.Length == 8)
{
Console.WriteLine("Error: Could not load data from the file. Possibly incorrect format.");
}
for (int i = 0; i < data.Length; i++)
{
EmployeeNumbers[NextAvailablePosition] = data[0];
EmployeeNames[NextAvailablePosition] = data[1];
RegistrationNumbers[NextAvailablePosition] = data[2];
tval = float.TryParse(data[3], out EngineCapacityArray[NextAvailablePosition]);
tval = int.TryParse(data[4], out StartKilometresArray[NextAvailablePosition]);
tval = int.TryParse(data[5], out EndKilometresArray[NextAvailablePosition]);
TripDescriptions[NextAvailablePosition] = data[6];
tval = bool.TryParse(data[7], out PassengerCarriedArray[NextAvailablePosition]);
CurrentRecord = NextAvailablePosition;
NextAvailablePosition++;
RecordCount++;
Console.WriteLine("Your file is sucessfully loaded.");
break;
}
}
}
catch (IOException exc)
{
Console.WriteLine(exc.Message);
}
// Close the file
fstr_in.Close();
}
It's unreachable because of the break; at the end of the loop. That forces the for loop to stop executing after the first time around. If you run this in a console project, it'll only put out a 0.
private static void Main(string[] args)
{
for (int i = 0; i < 2; i++)
{
Console.WriteLine(i.ToString());
break;
}
}
Perhaps the Code Review Stack Exchange would be better. There are a number of issues here.
First we can simplify using a framework call to File.ReadAllLines(...). That will give you all the lines in the file. Then you want to transform that into a sequence of arrays (split on ','). That's straightforward:
var splitLines = File.ReadAllLines(@"\path")
.Select(line => line.Split(new char[] { ',' }));
Now you can just iterate over splitLines with a foreach.
(I do notice that you seem to be setting values into the arrays that are passed in. Try not to get into the habit of doing that. These kinds of side effects and abuse of reference params are prone to becoming very brittle.)
Then this seems very odd:
if (data.Length == 8)
{
Console.WriteLine("...");
}
I suspect that you just have a typo in your comparison operator (should be !=). If you don't care about writing to the console on bad data, you can simply just filter out the bad data after the transformation. That looks like:
var splitLines = File.ReadAllLines(@"\path")
.Select(line => line.Split(new char[] { ',' }))
.Where(data => data.Length == 8);
Now recall that [int/float].TryParse(s, out v) will set v to be the value that was parsed, or the default value for the type, and return true if the parse was successful. That "or default" is important here. That means that you're stuffing bad/invalid values if they can't be parsed, and you're doing nothing with tval.
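To make the "or default" point concrete, here is a small sketch that turns a failed parse into a visible null instead of a silent 0. TryParseDemo and ParseKilometres are hypothetical names used only for this illustration.

```csharp
using System;

public static class TryParseDemo
{
    // Hypothetical helper: returns null when the text is not a valid integer,
    // instead of silently keeping the default value 0 that TryParse writes to `out`.
    public static int? ParseKilometres(string text)
    {
        return int.TryParse(text, out int value) ? value : (int?)null;
    }
}
```

With this, ParseKilometres("abc") yields null, whereas the original code would have stored 0 in StartKilometresArray and carried on as if the record were valid.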
Instead of all of that, consider an object/type that represents a record from your dataset. It looks like you're trying to track employee mileage from a csv table. That looks something like:
public class MileageRecord
{
public string Name { get; set; }
/* More properties */
public static MileageRecord FromCSV(string[] data)
{
/* try parsing, if not then log errs to file and return null */
}
}
Now you've gotten rid of all of your side effects and the whole thing is cleaner. Loading all this data from file is as straightforward as this:
public static IEnumerable<MileageRecord> Load()
{
return File.ReadAllLines(@"\path")
//.Skip(1) // if first line of the file is column headers
.Select(line => line.Split(new char[] { ',' }))
.Where(data => data.Length == 8)
.Select(data => MileageRecord.FromCSV(data))
.Where(mileage => mileage != null);
}
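For illustration, the record type could be fleshed out roughly as follows. This is a sketch under assumptions: the property names are inferred from the array names in the question, the field order is taken from the question's indices 0 to 7, numbers are assumed to use invariant-culture formatting, and the error logging mentioned above is left out.

```csharp
using System;
using System.Globalization;

public class MileageRecord
{
    // Property names are assumptions derived from the question's array names.
    public string EmployeeNumber { get; private set; }
    public string EmployeeName { get; private set; }
    public string RegistrationNumber { get; private set; }
    public float EngineCapacity { get; private set; }
    public int StartKilometres { get; private set; }
    public int EndKilometres { get; private set; }
    public string TripDescription { get; private set; }
    public bool PassengerCarried { get; private set; }

    // Returns null when the row does not have 8 fields or a field fails to parse.
    public static MileageRecord FromCsv(string[] data)
    {
        if (data == null || data.Length != 8) return null;

        var inv = CultureInfo.InvariantCulture;
        if (!float.TryParse(data[3], NumberStyles.Float, inv, out float capacity)) return null;
        if (!int.TryParse(data[4], NumberStyles.Integer, inv, out int start)) return null;
        if (!int.TryParse(data[5], NumberStyles.Integer, inv, out int end)) return null;
        if (!bool.TryParse(data[7], out bool passenger)) return null;

        return new MileageRecord
        {
            EmployeeNumber = data[0],
            EmployeeName = data[1],
            RegistrationNumber = data[2],
            EngineCapacity = capacity,
            StartKilometres = start,
            EndKilometres = end,
            TripDescription = data[6],
            PassengerCarried = passenger
        };
    }
}
```

Because every TryParse result is checked, a malformed row is rejected as a whole rather than stored with default values in some fields.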
This piece of code:
NextAvailablePosition++;
RecordCount++;
Console.WriteLine("Your file is sucessfully loaded.");
break; // <-- this instruction
}
Takes you out of the for loop without the possibility to increment i value.
Problem: you were supposed to add the break statement inside the if condition (which is inside the while loop), so that the code leaves the loop when the data length does not equal 8. Instead you mistakenly added the break inside the for loop. That's why the loop body executes only once and then exits.
Solution: move the break statement from the for loop to the if block inside the while loop.
Try This:
Step 1: Remove the break statement from for-loop.
CurrentRecord = NextAvailablePosition;
NextAvailablePosition++;
RecordCount++;
Console.WriteLine("Your file is sucessfully loaded.");
// break; //move this statement to inside the if block
Step 2: Place the break statement in the if block inside the while loop.
if (data.Length != 8)
{
Console.WriteLine("Error: Could not load data from the file. Possibly incorrect format.");
break;
}
Suggestion: you can rewrite your code using the File.ReadAllLines() method to reduce the complexity, as below:
static void Load(string[] EmployeeNumbers, string[] EmployeeNames, string[] RegistrationNumbers, float[] EngineCapacityArray,
int[] StartKilometresArray, int[] EndKilometresArray, string[] TripDescriptions, bool[] PassengerCarriedArray,
ref int NextAvailablePosition, ref int RecordCount, ref int CurrentRecord)
{
string str = "";
string[] data;
bool tval = false;
String [] strLines=File.ReadAllLines("carallowance.csv");
for(int i=0;i<strLines.Length;i++)
{
str=strLines[i];
data = str.Split(';');
if (data.Length != 8)
{
Console.WriteLine("Error: Could not load data from the file. Possibly incorrect format.");
break;
}//End of if block
else
{
EmployeeNumbers[NextAvailablePosition] = data[0];
EmployeeNames[NextAvailablePosition] = data[1];
RegistrationNumbers[NextAvailablePosition] = data[2];
tval = float.TryParse(data[3], out EngineCapacityArray[NextAvailablePosition]);
tval = int.TryParse(data[4], out StartKilometresArray[NextAvailablePosition]);
tval = int.TryParse(data[5], out EndKilometresArray[NextAvailablePosition]);
TripDescriptions[NextAvailablePosition] = data[6];
tval = bool.TryParse(data[7], out PassengerCarriedArray[NextAvailablePosition]);
CurrentRecord = NextAvailablePosition;
NextAvailablePosition++;
RecordCount++;
} //End of else block
} //End of for loop
Console.WriteLine("Your file is sucessfully loaded.");
} //End of function
Today I found out how this problem occurs when reading a text file line by line using C# ReadLine().
Problem:
Assume there are 3 lines in the text file, each of which has a length of 400 (manually counted).
While reading lines with C# ReadLine() and checking the length with
Console.WriteLine(str.Length);
I found out that it prints:
Line 1 => 400
Line 2 => 362
Line 3 => 38
Line 4 => 400
I was confused: the text file has only 3 lines, so why is it printing 4, and with changed lengths? I quickly checked for "\n", "\r" or the combination "\r\n" but didn't find any. What I did find was 2 double quotes, e.g. "abcd", in the second line.
Then I changed my code to print the lines themselves and, boy, was I amazed. I was getting output in the console like:
Line 1 > blahblahblabablabhlabhlabhlbhaabahbbhabhblablahblhablhablahb
Line 2 > blablabbablablababalbalbablabal"blabablhabh
Line 3 > "albhalbahblablab
Line 4 > blahblahblabablabhlabhlabhlbhaabahbbhabhblablahblhablhablahb
Now I tried removing the double quotes using the Replace function, but I got the same 4-line result, just without double quotes.
Please let me know of any solution other than manual editing to overcome this scenario.
Here is my simple code:
static void Main(string[] args)
{
FileStream fin;
string s;
string fileIn = @"D:\Testing\CursedtextFile\testfile.txt";
try
{
fin = new FileStream(fileIn, FileMode.Open);
}
catch (FileNotFoundException exc)
{
Console.WriteLine(exc.Message + "Cannot open file.");
return;
}
StreamReader fstr_in = new StreamReader(fin, Encoding.Default, true);
int cnt = 0;
while ((s = fstr_in.ReadLine()) != null)
{
s = s.Replace("\""," ");
cnt = cnt + 1;
//Console.WriteLine("Line "+cnt+" => "+s.Length);
Console.WriteLine("Line " + cnt + " => " + s);
}
Console.ReadLine();
fstr_in.Close();
fin.Close();
}
Note: I was trying to read and upload 37 text files of 500 MB each from the finance domain, where I always face this issue and have to make the changes manually. :(
If the problem is that:
Proper line breaks should be a combination of newline (10) and carriage return (13)
Lone newlines and/or carriage returns are incorrectly being interpreted as line breaks
Then you can fix this, but the best and probably most correct way to fix this problem is to go to the source, fix the program that writes out this incorrectly formatted file in the first place.
However, here's a LINQPad program that replaces lone newlines or carriage returns with spaces:
void Main()
{
string input = "this\ris\non\ra\nsingle\rline\r\nThis is on the next line";
string output = ReplaceLoneLineBreaks(input);
output.Dump();
}
public static string ReplaceLoneLineBreaks(string input)
{
if (string.IsNullOrEmpty(input))
return input;
var result = new StringBuilder();
int index = 0;
while (index < input.Length)
{
switch (input[index])
{
case '\n':
if (index == input.Length - 1 || input[index+1] != '\r')
{
result.Append(' ');
index++;
}
else
{
result.Append(input[index]);
result.Append(input[index + 1]);
index += 2;
}
break;
case '\r':
if (index == input.Length - 1 || input[index+1] != '\n')
{
result.Append(' ');
index++;
}
else
{
result.Append(input[index]);
result.Append(input[index + 1]);
index += 2;
}
break;
default:
result.Append(input[index]);
index++;
break;
}
}
return result.ToString();
}
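The same replacement can also be expressed as a regular expression. This is a sketch of an alternative, not the program above, and it differs in one respect: it treats only "\r\n" as a proper line break, whereas the hand-written loop also keeps a "\n\r" pair. LineBreakCleaner is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineBreakCleaner
{
    // Matches a CR not followed by LF, or an LF not preceded by CR.
    private static readonly Regex LoneBreak =
        new Regex(@"\r(?!\n)|(?<!\r)\n", RegexOptions.Compiled);

    // Replaces every lone CR or LF with a space and leaves CR LF pairs untouched.
    public static string ReplaceLoneLineBreaks(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;
        return LoneBreak.Replace(input, " ");
    }
}
```

Both versions work on a string that is already in memory, so for 500 MB files the input would have to be processed in pieces or the fix applied at the source.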
If the lines are all of the same length, split the lines by their length instead of watching for end of lines.
const int EndOfLine = 2; // CR LF or = 1 if only LF.
const int LineLength = 400;
string text = File.ReadAllText(path);
for (int i = 0; i < text.Length - EndOfLine; i += LineLength + EndOfLine) {
string line = text.Substring(i, Math.Min(LineLength, text.Length - i - EndOfLine));
// TODO Process line
}
If the last line is not terminated by end of line characters, remove the two - EndOfLine.
Also the Math.Min part is only a safety measure. It might not be necessary if no line is shorter than 400.
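Since the question mentions files of 500 MB each, File.ReadAllText may be heavier than needed. A streaming variation of the same fixed-length idea is sketched below, assuming every record has the same length followed by a terminator of known length. FixedRecordReader and ReadFixedRecords are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FixedRecordReader
{
    // Reads records of `recordLength` characters, each followed by a terminator
    // of `terminatorLength` characters, without inspecting the record content.
    // Stray CR or LF characters inside a record therefore cannot split it.
    public static IEnumerable<string> ReadFixedRecords(TextReader reader, int recordLength, int terminatorLength)
    {
        var buffer = new char[recordLength + terminatorLength];
        int read;
        // ReadBlock keeps reading until the buffer is full or the input ends.
        while ((read = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
        {
            // Drop the terminator; a short final record is returned as-is.
            int payload = Math.Min(read, recordLength);
            yield return new string(buffer, 0, payload);
        }
    }
}
```

For the file in the question this might be called as ReadFixedRecords(new StreamReader(fileIn), 400, 2), with the StreamReader wrapped in a using block by the caller.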
As part of an assignment:
1. The user selects a file extension (.txt, .bat, or .xyz)
2. A list of files from a folder with that extension is shown
3. The user then selects a file from the list and is shown the first 40 characters of each of its first four lines (or as many lines as present if less than four lines are recorded in the file). If there are more lines left in the file, output a string: “xx more lines are not shown.” (substitute xx with the correct number).
I can't seem to wrap my head around number 3. Any help or pointers are greatly appreciated.
namespace unit9Assignment
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
//add the extensions to the c box.
comboBox1.Items.Add(".txt");
comboBox1.Items.Add(".xyz");
comboBox1.Items.Add(".bat");
//make .txt the default selection
comboBox1.SelectedItem = ".txt";
tabControl1.SelectedIndexChanged += tabControl1_SelectedIndexChanged;
}
/******Tab Click Event********/
private void tabControl1_SelectedIndexChanged(Object sender, EventArgs e)
{
switch ((sender as TabControl).SelectedIndex)
{
case 0:
break;
case 1:
fileName(comboBox1.Text);
break;
case 2:
fileContent(Files.SelectedItem.ToString());
break;
}
}
/******Get Files Based on Selection*******/
public void fileName(string fileExt)
{
List<string> listOfFiles = new List<string>();
string[] fileExtArray = Directory.GetFiles(@"C:\Users\Public", "*" + fileExt);
foreach (string fileExtFile in fileExtArray)
{
listOfFiles.Add(fileExtFile);
}
Files.DataSource = listOfFiles;
}
/******Display 4 Lines @ 40 Characters Per Line*********/
public void fileContent(string fileName)
{
int numberOfLines = File.ReadLines(@fileName).Count(),
remainingLines = numberOfLines - 4;
//THIS PRINTS OUT 4 LINES @ 40 CHARACTERS PER LINE IF A FILE HAS LESS THAN 5 LINES
if (numberOfLines < 5)
{
foreach (string line in File.ReadLines(fileName))
{
richTextBox1.AppendText(line.Substring(0, 40) + Environment.NewLine);
Console.WriteLine(line.Substring(0, 40));
}
}
// NO CLUE WHAT TO DO
else
{
}
}
}
}
Rather than checking the number of lines in the file, why don't you just go ahead and start printing, and stop after 4 lines? Something like this:
StreamReader fileIn = new StreamReader(fileName);
for(int i=0; i<4 && !fileIn.EndOfStream; ++i)
{
string line = fileIn.ReadLine();
if(line.Length > 40)
richTextBox1.AppendText(line.Substring(0,40) + Environment.NewLine);
else
richTextBox1.AppendText(line + Environment.NewLine);
}
int j;
for(j=0; !fileIn.EndOfStream; ++j)
fileIn.ReadLine();
if(j>0)
richTextBox1.AppendText(j.ToString() + " more lines are not shown.");
fileIn.Close();
... To clarify, this would be your entire fileContent method. You actually do not need to know the number of lines in the file. Of course, this method won't work if you have more lines in your file than an int variable can hold, but I assume you're not working with such long files.
How about this:
public void fileContent(string fileName)
{
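var lines = File.ReadLines(fileName);
foreach (string line in lines.Take(4))
{
// Guard against lines shorter than 40 characters, where Substring(0, 40) would throw
richTextBox1.AppendText(line.Substring(0, Math.Min(40, line.Length)) + Environment.NewLine);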
}
var remaining = lines.Count() - 4;
if (remaining > 0)
richTextBox1.AppendText(remaining + " more line(s) are not shown.");
}
The Take() documentation is here.
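The version above enumerates the file twice, once for Take(4) and once for Count(). A single-pass sketch that also tolerates short lines could look like this. FilePreview and Preview are hypothetical names, and the UI call is left to the caller.

```csharp
using System;
using System.Collections.Generic;

public static class FilePreview
{
    // Returns up to `maxLines` lines, each cut to `maxChars` characters,
    // plus the number of lines that were not shown. Reads the input once.
    public static (List<string> Shown, int Remaining) Preview(IEnumerable<string> lines, int maxLines, int maxChars)
    {
        var shown = new List<string>();
        int remaining = 0;
        foreach (var line in lines)
        {
            if (shown.Count < maxLines)
                shown.Add(line.Length > maxChars ? line.Substring(0, maxChars) : line);
            else
                remaining++;
        }
        return (shown, remaining);
    }
}
```

In fileContent this might be used as var result = FilePreview.Preview(File.ReadLines(fileName), 4, 40); followed by appending result.Shown to richTextBox1 and, when result.Remaining is greater than 0, the "more lines are not shown." message.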
Giving answers to homework is bad practice. Instead here are some pointers to help you wrap your head around your problem:
//read a file
var lines = File.ReadLines("myfile");
//get the first 4 lines of your file
var first4 = lines.Take(4);
//get the first 40 characters of the first line of your file
var first40Chars = lines.FirstOrDefault().Take(40);
//get the remaining number of lines
var remainingCount = lines.Count() - 4;
Pulling up a dialog to show files is quite easy also. The WinForms FileDialog can help you there.