Unable to read large file of Network shared folder - c#

I am trying to read a text file to check whether all rows have the same number of columns. Locally the code works fine, but on a network shared folder (with Everyone permission) it only works for small files (around 5 MB). When I select a 10 MB or 500 MB file, the same code does not work: it runs for a few minutes and then the page simply refreshes. No error or message is shown. Below is the code that reads the file and gets the column counts:
LinesLst = File.ReadLines(_fileName, Encoding.UTF8)
    .Select((line, index) =>
    {
        var count = line.Split(Delimiter).Length;
        if (NumberOfColumns < 0)
            NumberOfColumns = count;
        return new
        {
            line = line,
            count = count,
            index = index
        };
    })
    .Where(colCount => colCount.count != NumberOfColumns)
    .Select(colCount => colCount.line)
    .ToList();

Perhaps you get an OutOfMemoryException on large files. The code creates many objects on each iteration: the string array from line.Split and an anonymous object. Meanwhile, the anonymous object is in fact not needed. I would rewrite the code like this:
LinesLst = File.ReadLines(_fileName, Encoding.UTF8)
    .Where(line =>
    {
        var count = line.Split(Delimiter).Length;
        if (NumberOfColumns < 0)
            NumberOfColumns = count;
        return count != NumberOfColumns;
    })
    .ToList();
In addition, you can try to get rid of the creation of the string array when you call line.Split. Try replacing the line
var count = line.Split(Delimiter).Length;
with one of these (both rely on LINQ, so System.Linq must be imported):
// If Delimiter is char[]
var count = line.Count(c => Delimiter.Contains(c)) + 1;
// If Delimiter is a single char
var count = line.Count(c => Delimiter == c) + 1;

I have added AsyncPostBackTimeout="36000" which solved my problem.
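For context, AsyncPostBackTimeout lives on the ASP.NET ScriptManager control and is measured in seconds. A minimal sketch of setting it from code-behind (the page class name is hypothetical):
using System;
using System.Web.UI;

public partial class UploadPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Equivalent to AsyncPostBackTimeout="36000" in the page markup:
        // allow async postbacks to run for up to 36,000 seconds.
        ScriptManager sm = ScriptManager.GetCurrent(this);
        if (sm != null)
            sm.AsyncPostBackTimeout = 36000;
    }
}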


Updating line of a file if array element matches text box

I have a method that currently reads all lines of a directory file (3 fields per line) and updates a directory array with a record of text box entries if the extension code entered matches an extension code field in the file.
I had the updated directory array displaying in a list view; as soon as I attempted to update the directory file with the updated array, it all went downhill! Edit to clarify: with the latest version of the code below, the array no longer displays in the list view, and the file is not updated. No errors are thrown.
public void updateName()
{
    int count = 0;
    string[] lines = File.ReadAllLines(directoryFile);
    // Set size of directory array equal to number of lines in file
    int lineCount = lineCounter();
    directory = new record[lineCount];
    record currentRecord = new record();
    // Iterate through each line in file
    foreach (string line in lines)
    {
        // Split current line into three fields
        string[] fields = line.Split(',');
        // Save current line as new record with surname, forename and extCode fields
        currentRecord.surname = fields[0];
        currentRecord.forename = fields[1];
        currentRecord.extCode = Convert.ToInt32(fields[2]);
        // If extension code in current record matches text box entry
        if (Convert.ToInt32(fields[2]) == Convert.ToInt32(txtExtCode.Text))
        {
            // Change surname and forename fields to match text box entries
            currentRecord.surname = txtForename.Text;
            currentRecord.forename = txtSurname.Text;
            using (StreamWriter writer = new StreamWriter(directoryFile))
            {
                for (int currentLine = 1; currentLine <= lines.Length; ++currentLine)
                {
                    if (currentLine == count)
                        writer.WriteLine(currentRecord);
                    else
                        writer.WriteLine(lines[currentLine - 1]);
                }
            }
        }
        // Save currentRecord as next element in directory array, then increment
        directory[count] = currentRecord;
        count++;
    }
}
You don't need a lineCounter(). The number of lines is lines.Length.
But why do you need this directory array? You are filling it, but you are not using it anywhere.
Another major problem is that you are creating a StreamWriter inside the foreach loop. You should open the file before the loop and close it after the loop to make it work.
Also, you are mixing writing currentRecord, which is of type record, with writing lines of type string to the output file. This cannot work.
You are also putting txtForename.Text into currentRecord.surname instead of currentRecord.forename, and vice versa.
I suggest first applying the change in the lines array and then writing this lines array back to the file with File.WriteAllLines, which is the symmetric operation to File.ReadAllLines.
I'm applying the change directly to the fields array, so that I can convert it back to a string with String.Join (the symmetric operation to String.Split).
public void updateName()
{
    // Do this conversion before the loop. We need to do it only once.
    int selectedCode = Convert.ToInt32(txtExtCode.Text);
    string[] lines = File.ReadAllLines(directoryFile);
    for (int i = 0; i < lines.Length; i++)
    {
        // Split current line into three fields
        string[] fields = lines[i].Split(',');
        int extCode = Convert.ToInt32(fields[2]);
        if (extCode == selectedCode)
        {
            fields[0] = txtSurname.Text;
            fields[1] = txtForename.Text;
            lines[i] = String.Join(",", fields);
            // If the extension code is unique, leave the for-loop
            break;
        }
    }
    File.WriteAllLines(directoryFile, lines);
}
I also use for instead of foreach in order to have an index i, so that I can replace a single line in the lines array at a specific index.
I don't know if the extension code in the directory file is unique. If it is, you can exit the for loop prematurely with break.

C# Array of List Index out of bounds

I've made a program that extracts some info from a file, does some operations on it, and stores it back in a list.
Following this link:
Are 2 dimensional Lists possible in c#?
I was able to create a class with a list that suited my needs. But after some debugging I found that I was overwriting the list on each loop iteration.
Then I decided to make an array of lists, following this link:
How to create an array of List<int> in C#?
I created an array of lists, initialized it and added elements. But when it needs to move to the next list position, it throws the out-of-bounds exception.
I've tried a few things (read about race conditions) but none of them worked.
The problem happens only when I open more than one file with my code; otherwise it works perfectly.
The exception is thrown at xmldata, in the last iteration of the current file.
Example: I selected two files, each one adding five elements. On the last element of the first file the exception is thrown, even though there is still data to be added at the last element's position.
Additional information: Index was outside the bounds of the array. (Exception thrown.)
Any help will be appreciated. Thanks a lot.
Code:
List<xmldata>[] finalcontent = new List<xmldata>[9999];
finalcontent[listpos] = new List<xmldata>(); // Initializing a list for each filename
foreach (Match m in matches)
{
    Double[] numbers;
    string aux;
    aux = m.Groups[1].ToString();
    aux = Regex.Replace(aux, @"\s+", "|");
    string[] numbers_str = aux.Split(new[] { "|" }, StringSplitOptions.RemoveEmptyEntries);
    numbers = new Double[numbers_str.Length];
    for (int j = 0; j < numbers.Length; j++)
    {
        numbers[j] = Double.Parse(numbers_str[j], CultureInfo.InvariantCulture);
        // Converts each number in the string to a Double, storing it in a position
        // in the Double array
        numbers[j] = numbers[j] / 100;          // Needed calculus
        numbers[j] = Math.Round(numbers[j], 3); // Storing numbers rounded
    }
    string values = String.Join(" ", numbers.Select(f => f.ToString()));
    if (i <= colors_str.Length)
    {
        finalcontent[listpos].Add(new xmldata // The exception is thrown right here
        {
            colorname = colors_str[i],
            colorvalues = values,
        }); // Closing list add declaration
    } // Closing if
    i++;
} // Closing foreach loop
Link to the file: https://drive.google.com/file/d/0BwU9_GrFRYrTT0ZTS2dRMUhIWms/view?usp=sharing
Arrays are fixed size, but Lists automatically resize as new items are added.
So instead, and since you're using Lists anyway, why not use a list of lists?
List<List<int>> ListOfListsOfInt = new List<List<int>>();
Then, if you really absolutely must have an array, you can get one like this:
ListOfListsOfInt.ToArray();
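Adapted to the question, a minimal sketch (xmldata is the question's type; selectedFiles is a hypothetical list of input file names):
var finalcontent = new List<List<xmldata>>(); // outer list grows automatically
foreach (var file in selectedFiles)
{
    var fileItems = new List<xmldata>();
    // ... parse this file's matches and fileItems.Add(...) each entry ...
    finalcontent.Add(fileItems); // no fixed size, no index bookkeeping
}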
// Convert non-ASCII characters to '.'
for (int jx = 0; jx < cnt; ++jx)
    if (line[jx] < 0x20 || line[jx] > 0x7f) line[jx] = (byte)'.';
This is a rough example, but check this one: are you incrementing 'jx' before entering the statement, possibly exceeding the boundary of cnt?
Try changing the following:
if (i <= colors_str.Length)
to
if (i < colors_str.Length)
In fact, I'm convinced that this is the problem.
This is because indexes begin at 0, so the last valid index is length - 1, not length.
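A one-line illustration of the boundary:
var arr = new[] { "a", "b", "c" };
Console.WriteLine(arr[arr.Length - 1]); // "c": the last valid index is Length - 1
// Console.WriteLine(arr[arr.Length]); // would throw IndexOutOfRangeException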
When using a list, it is better to use its native functions:
List<List<xmldata>> finalcontent = new List<List<xmldata>>();
......
var _tmpVariable = new List<xmldata>(); // instead of finalcontent[listpos] = new List<xmldata>(); initializing a list for each filename
......
_tmpVariable.Add(new xmldata
{
    colorname = colors_str[i],
    colorvalues = values,
}); // Closing list add declaration
fs.Close(); // closing current file
finalcontent.Add(_tmpVariable); // add the file's list into the outer list; no listpos counter needed
As there are no exception details it is hard to tell where the exception is thrown.
It could be a list issue, a string issue or something else (even a file reading issue),
so please update the question with the current exception details.

alternative to a foreach loop and a string builder

I have a function that gets a collection of entities and then appends quotes and commas to a string to update the collection in the DB. This takes an insane amount of time; it's very inefficient, but I can't think of an alternative:
IEntityCollection c = Transactions.EvalToEntityCollection<ITransactions>(Store, key, item);
int max = transes.Count <= 750 ? transes.Count : 750; // DB times out if there are more than 750, so 750 is the limit
int i = 0;
int t = transes.Count;
StringBuilder sb = new StringBuilder();
foreach (ITransactions trans in transes)
{
    sb.Append("'");
    sb.Append(trans.GUID);
    sb.Append("',");
    i++;
    t--;
    if (i == max || t == 0)
    {
        sb.Remove(sb.Length - 1, 1);
        // in here, code updates a bunch of transactions (if <= 750 transactions)
        i = 0;
        sb = new StringBuilder();
    }
}
Something like this perhaps?
var str = String.Join(",", transes.Select(t => string.Format("'{0}'", t.GUID)));
But since you have the comment in your code that it times out with > 750 records, your "insane amount of time" might be from the database, not your code.
String.Join is a really handy method when you want to concatenate a list of stuff together because it automatically handles the ends for you (so you don't end up with leading or trailing delimiters).
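For example, a quick sketch of what it produces:
var str = String.Join(",", new[] { "'a'", "'b'", "'c'" });
Console.WriteLine(str); // 'a','b','c' (no leading or trailing comma)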
Seems you want to do this:
Group the transaction numbers into batches of maximum 750 entities
Put all those transaction numbers in one group into one string delimited by comma and surrounded by single quotes
If so then here's the code to build the batches:
const int batchSize = 750;
List<List<Transaction>> batches =
    transes
        .Select((transaction, index) => new { transaction, index })
        .GroupBy(indexedTransaction => indexedTransaction.index / batchSize)
        .Select(group => group.Select(indexedTransaction => indexedTransaction.transaction).ToList())
        .ToList();
foreach (var batch in batches)
{
    // batch here is List<Transaction>, not just the GUIDs
    var guids = string.Join(", ", batch.Select(transaction => "'" + transaction.GUID + "'"));
    // process transaction or guids here
}
StringBuilder is efficient. Doing it 750 times (which is your max) will definitely not take measurably longer than any other available technique.
Please comment out the StringBuilder part and run the project:
sb.Append("'");
sb.Append("',");
I bet it will take exactly the same time to complete.
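To check that claim yourself, here is a minimal sketch using Stopwatch (the 750-item loop mirrors the batch size above; all names are illustrative):
using System;
using System.Diagnostics;
using System.Text;

class StringBuilderTiming
{
    static void Main()
    {
        var sw = Stopwatch.StartNew();
        var sb = new StringBuilder();
        for (int i = 0; i < 750; i++)
        {
            sb.Append("'");
            sb.Append(Guid.NewGuid()); // stand-in for trans.GUID
            sb.Append("',");
        }
        sw.Stop();
        Console.WriteLine("750 appends took {0} ms", sw.Elapsed.TotalMilliseconds);
    }
}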

How can I copy an Array to another minus user specified Elements AND write and replace the new list to a .txt file

Here is what I have so far. Obviously you can't subtract arrays the way I did. I also need to know how to write the new list to a .txt file that I already have ("records.txt").
public static int deleteRecord(string num)
{
    int amount;
    int.TryParse(num, out amount);
    string[] arrayRecords = File.ReadAllLines("Records.txt").ToArray();
    string[] newArrayRecords = arrayRecords - arrayRecords[amount];
    for (int i = 0; i < amount; i++)
    {
        Console.WriteLine(newArrayRecords[amount]);
    }
    Console.WriteLine(amount);
    return amount;
}
I assume that you want to delete a particular value from a file and that is why you have chosen the "num" parameter to be a string.
If so then this will work:
public static void deleteRecord(string num)
{
    var lines = File.ReadAllLines("Records.txt").ToList();
    if (lines.Remove(num) == true)
    {
        File.WriteAllLines("Records.txt", lines.ToArray<string>());
    }
}
There are a couple of things to point out in your code. Firstly, in your example, if you couldn't convert num to an int then you would be trying to remove the record at index 0 from your file, which you may not want.
Secondly, File.ReadAllLines already returns an array of strings, so you don't need the .ToArray() at the end; in fact, that just creates an unnecessary copy of the array.
I've converted it to a List as they are easier to work with. I only save the file if the item has been removed.
Hope that helps...
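If, on the other hand, you meant to remove the line at a user-specified index (which is what arrayRecords[amount] suggests), here is a minimal sketch (the method name is hypothetical):
public static void deleteRecordAt(string num)
{
    int index;
    if (!int.TryParse(num, out index))
        return; // leave the file alone if the input is not a number

    var lines = File.ReadAllLines("Records.txt").ToList();
    if (index >= 0 && index < lines.Count)
    {
        lines.RemoveAt(index);
        File.WriteAllLines("Records.txt", lines);
    }
}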
I presume that you want to remove the line that contains the specified amount; if so, you can try this:
var lines = File.ReadLines("Records.txt")
    .Where(x => !x.Contains(amount.ToString()))
    .ToList(); // materialize before overwriting the same file
// this will replace all prev. lines with the new ones
File.WriteAllLines("Records.txt", lines);
If you want to remove all lines that come before this line, then you can try:
var allLines = File.ReadLines("Records.txt").ToList();
var lineIndex = allLines.FindIndex(x => x.Contains(amount.ToString()));
File.WriteAllLines("Records.txt", allLines.GetRange(lineIndex, allLines.Count - lineIndex));
Of course, this assumes there is a line that contains amount; if there isn't, the second snippet will throw an exception.

Read last 30,000 lines of a file [duplicate]

This question already has answers here:
How to read last "n" lines of log file [duplicate]
(9 answers)
Closed 9 years ago.
I have a csv file whose data will increase from time to time. What I need to do is read the last 30,000 lines.
Code:
string[] lines = File.ReadAllLines(Filename).Where(r => r.ToString() != "").ToArray();
int count = lines.Count();
int loopCount = count > 30000 ? count - 30000 : 0;
for (int i = loopCount; i < lines.Count(); i++)
{
    string[] columns = lines[i].Split(',');
    orderList.Add(columns[2]);
}
It is working fine, but the problem is that
File.ReadAllLines(Filename)
reads the complete file, which hurts performance. I want something that reads only the last 30,000 lines without iterating through the complete file.
PS: I am using .NET 3.5. File.ReadLines() does not exist in .NET 3.5.
You can use the File.ReadLines() method instead of File.ReadAllLines().
From MSDN: File.ReadLines()
The ReadLines and ReadAllLines methods differ as follows: When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings to be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.
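In other words, a minimal sketch ("data.csv" is a placeholder path):
// Starts yielding immediately; only one line needs to be held at a time.
foreach (var line in File.ReadLines("data.csv"))
    Console.WriteLine(line);

// Blocks until the whole file has been read into a string[].
string[] all = File.ReadAllLines("data.csv");
Console.WriteLine(all.Length);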
Solution 1:
string[] lines = File.ReadAllLines(FileName).Where(r => r.ToString() != "").ToArray();
int count = lines.Count();
List<String> orderList = new List<String>();
int loopCount = count > 30000 ? count - 30000 : 0;
for (int i = count - 1; i >= loopCount; i--)
{
    string[] columns = lines[i].Split(',');
    orderList.Add(columns[2]); // note: adds in reverse file order
}
Solution 2: if you are using .NET Framework 3.5, as you said in the comments below, you cannot use the File.ReadLines() method, as it is available only since .NET 4.0.
You can use StreamReader as below:
List<string> lines = new List<string>();
List<String> orderList = new List<String>();
String line;
int count = 0;
using (StreamReader reader = new StreamReader("c:\\Bethlehem-Deployment.txt"))
{
    while ((line = reader.ReadLine()) != null)
    {
        lines.Add(line);
        count++;
    }
}
int loopCount = (count > 30000) ? count - 30000 : 0;
for (int i = count - 1; i >= loopCount; i--)
{
    string[] columns = lines[i].Split(',');
    orderList.Add(columns[2]); // note: adds in reverse file order
}
You can use File.ReadLines, with which you can start enumerating the collection of strings before the whole collection is returned.
After that you can use LINQ to make things a lot easier. Reverse will reverse the order of the collection and Take will take the first n items; applying Reverse again puts the last n lines back in their original order.
var lines = File.ReadLines(Filename).Reverse().Take(30000).Reverse();
If you are using .NET 3.5 or earlier, you can create your own method that works the same as File.ReadLines, like this. Here is the code for the method, originally written by @Jon:
public IEnumerable<string> ReadLines(string file)
{
    using (TextReader reader = File.OpenText(file))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
Now you can use LINQ over this function as well, just like the statement above:
var lines = ReadLines(Filename).Reverse().Take(30000).Reverse();
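Note, however, that Reverse() has to buffer the entire sequence before it can yield anything, so the whole file still ends up in memory; this chain saves code rather than RAM.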
The problem is that you do not know where to start reading the file to get the last 30,000 lines. Unless you want to maintain a separate index of line offsets, you can either read the file from the start, counting lines and retaining only the last 30,000, or start from the end counting lines backwards. The latter approach can be efficient if the file is very large and you only want a few lines. However, 30,000 does not seem like "a few lines", so here is an approach that reads the file from the start and uses a queue to keep the last 30,000 lines:
var fileName = @" ... ";
var linesToRead = 30000;
var queue = new Queue<String>();
using (var streamReader = File.OpenText(fileName)) {
    while (!streamReader.EndOfStream) {
        queue.Enqueue(streamReader.ReadLine());
        if (queue.Count > linesToRead)
            queue.Dequeue();
    }
}
Now you can access the lines stored in queue. This class implements IEnumerable<String>, allowing you to use foreach to iterate the lines. However, if you want random access you will have to use the ToArray method to convert the queue into an array, which adds some overhead to the computation.
This solution is efficient in terms of memory because at most 30,000 lines have to be kept in memory, and the garbage collector can free any extra lines when required. Using File.ReadAllLines will pull all the lines into memory at once, possibly increasing the memory required by the process.
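For example, a quick sketch of consuming the queue:
foreach (var line in queue)
    Console.WriteLine(line);     // iterates the kept lines in file order

var lastLines = queue.ToArray(); // or copy them out for random access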
Or, I have a different idea for this.
Try splitting the csv into categories like A-D, E-G, ...
and access whichever first letter you need.
Or you can split the data by entity count. Every file will contain 15,000 entities, for example, plus a text file holding a tiny index of the entities and their locations, like:
Txt File:
entityID | inWhich.Csv
....
