How can I add a huge string to a textbox efficiently? - c#

I have a massive string (we're talking 1,696,108 characters in length) that I read very quickly from a text file. When I add it to my textbox (C#), it takes ages. A program like Notepad++ (unmanaged code, I know) can do it almost instantly, although Notepad also takes a long time. How can I add this huge string efficiently, and how does something like Notepad++ do it so quickly?

If this is Windows Forms, I would suggest trying RichTextBox as a drop-in replacement for your TextBox. In the past I've found it to be much more efficient at handling large text. Also, when making in-place modifications, be sure to use the time-tested SelectionStart/SelectedText approach instead of manipulating the Text property.
rtb.SelectionStart = rtb.TextLength;
rtb.SelectedText = "inserted text"; // faster
rtb.Text += "inserted text"; // slower

Notepad and the Windows TextBox class are optimized for around 64K of text. You should use a RichTextBox.

You could, initially, just render the first n characters that are viewable in the UI (assuming you have a scrolling textbox). Then, start a separate thread to render successive blocks asynchronously.
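For example, here is a rough sketch of that idea using async/await rather than a raw thread (the method name and chunk size are illustrative, not from the original answer): show the first block immediately, then append the remaining blocks while yielding to the UI between appends.
private async Task ShowProgressivelyAsync(string hugeText, TextBox textBox)
{
    const int chunkSize = 64 * 1024; // roughly "a screenful plus some slack"; tune as needed

    // First block appears right away.
    textBox.Text = hugeText.Substring(0, Math.Min(chunkSize, hugeText.Length));

    // Remaining blocks are appended afterwards, yielding to the message loop
    // between appends so the UI stays responsive.
    for (int offset = chunkSize; offset < hugeText.Length; offset += chunkSize)
    {
        int length = Math.Min(chunkSize, hugeText.Length - offset);
        textBox.AppendText(hugeText.Substring(offset, length));
        await Task.Delay(1);
    }
}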
Alternatively, you could combine it with your input stream from the file. Read a chunk and immediately append it to the text box. Example (not thorough, but you get the idea) ...
private void PopulateTextBoxWithFileContents(string path, TextBox textBox)
{
    using (var fs = File.OpenRead(path))
    using (var sr = new StreamReader(fs))
    {
        // Append each line as it is read; AppendText avoids rebuilding the
        // whole Text string the way += does.
        while (!sr.EndOfStream)
            textBox.AppendText(sr.ReadLine() + Environment.NewLine);
    }
}

Related

Quickest way to Update Multiline Textbox with Large Amount of Text

I have a .NET 4.5 WinForm program that queries a text-based database using ODBC. I then want to display every result in a multiline textbox and I want to do it in the quickest way possible.
The GUI does not have to be usable during the time the textbox is being updated/populated. However, it'd be nice if I could update a progress bar to let the user know that something is happening - I believe a background worker or new thread/task is necessary for this but I've never implemented one.
I initially went with this code and it was slow, as it redrew the textbox for every result line before continuing to the next one.
OdbcDataReader dbReader = com.ExecuteReader();
while (dbReader.Read())
{
    txtDatabaseResults.AppendText(dbReader[0].ToString());
}
This was significantly faster.
string resultString = "";
while (dbReader.Read())
{
    resultString += dbReader[0].ToString();
}
txtDatabaseResults.Text = resultString;
But there is a generous wait time before the textbox comes to life so I want to know if the operation can be even faster. Right now I'm fetching about 7,000 lines from the file and I don't think it's necessary to switch to AvalonEdit (correct me if my way of thinking is wrong, but I would like to keep it simple and use the built-in textbox).
You can make this far faster by using a StringBuilder instead of using string concatenation.
var results = new StringBuilder();
while (dbReader.Read())
{
    results.Append(dbReader[0].ToString());
}
txtDatabaseResults.Text = results.ToString();
Using string and concatenation creates a lot of pressure on the GC, especially if you're appending 7000 lines of text. Each time you use string +=, the CLR creates a new string instance, which means the older one (which is progressively larger and larger) needs to be garbage collected. StringBuilder avoids that issue.
Note that there will still be a delay when you assign the text to the TextBox, as it needs to refresh and display that text. The TextBox control isn't optimized for that amount of text, so that may be a bottleneck.
As for pushing this into a background thread - since you're using .NET 4.5, you could use the new async support to handle this. This would work via marking the method containing this code as async, and using code such as:
string resultString = await Task.Run(() =>
{
    var results = new StringBuilder();
    while (dbReader.Read())
    {
        results.Append(dbReader[0].ToString());
    }
    return results.ToString();
});
txtDatabaseResults.Text = resultString;
Use a StringBuilder:
StringBuilder e = new StringBuilder();
while (dbReader.Read())
{
    e.Append(dbReader[0].ToString());
}
txtDatabaseResults.Text = e.ToString();
Although a parallel thread has been recommended, the way you extract the lines is also flawed. Because string is immutable, every time you concatenate resultString you actually create another (bigger) string. Here, StringBuilder comes in very useful:
StringBuilder resultString = new StringBuilder();
while (dbReader.Read())
{
    resultString.Append(dbReader[0].ToString());
}
txtDatabaseResults.Text = resultString.ToString();
I am filling a regular TextBox (Multiline = true) in a single call with a very long string (more than 200 kB, loaded from a file); I just assign my string to the TextBox's Text property.
It's very slow (> 1 second).
The TextBox does nothing other than display the huge string.
I used a very simple trick to improve performance: I replaced the multiline TextBox with a RichTextBox (a native control).
Now the same loads are instantaneous, and the RichTextBox has exactly the same appearance and behavior as a TextBox with raw text (as long as you haven't tweaked it). The most obvious difference is that a RichTextBox does not have a context menu by default.
Of course, it's not a solution in every case, and it doesn't directly address the OP's question, but it works perfectly for me, so I hope it helps other people facing the same performance problems with TextBox and big strings.

Highlighting in a RichTextBox is taking too long

I have a large list of offsets which I need to highlight in my RichTextBox. However this process is taking too long. I am using the following code:
foreach (int offset in offsets)
{
    richTextBox.Select(offset, searchString.Length);
    richTextBox.SelectionBackColor = Color.Yellow;
}
Is there a more efficient way to do so?
UPDATE:
Tried using this method but it doesn't highlight anything:
richTextBox.SelectionBackColor = Color.Yellow;
foreach (int offset in offsets)
{
richTextBox.Select(offset, searchString.Length);
}
I've googled your issue and found that a RichTextBox gets very slow when it holds many lines. In my opinion, you can either buy a third-party control whose performance satisfies you, or use threads to divide up the whole selection task; I think they can speed things up.
Hope it helps!
I've had this same problem before. I ended up disregarding all of the methods they give you and manipulating the underlying RTF data. Also, the reason your second block of code doesn't work is that RTF applies formatting as it goes, so if you call a function (or property, in this case) to change the selection color, it will only apply to the currently selected block. Any changes made to the selection after that call become irrelevant.
You can play around with the RGB values, or here is a great source on how to do different things within the RTF control. Pop this function into your code and see how well it works. I use it to provide real-time syntax highlighting for SQL code.
public void HighlightText(int offset, int length)
{
    String sText = richTextBox.Text.Trim();
    sText = sText.Insert(offset + length - 1, @" \highlight0");
    sText = sText.Insert(offset, @" \highlight1");
    String s = @"{\rtf1\ansi\deff0{\fonttbl{\f0\fnil\fcharset0 Courier New;}}
{\colortbl ;\red255\green255\blue0;}\viewkind4\uc1\pard";
    s += sText;
    s += @"\par}";
    richTextBox.Rtf = s;
}
Does it make any difference if you set the SelectionBackColor outside of the loop?
Looking into the RichTextBox with Reflector shows that a window message is sent to the control every time the color is set. With a large number of offsets this might lead to highlighting the already-highlighted words again and again, leading to O(n^2) behavior.
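One way to cut down on those messages (a sketch, not from the answers above): suspend the control's painting with the standard Win32 WM_SETREDRAW message while the highlighting loop runs, then repaint once at the end. This assumes the usual WinForms usings plus System.Runtime.InteropServices; the method name is illustrative.
[DllImport("user32.dll")]
private static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam);

private const int WM_SETREDRAW = 0x000B;

private void HighlightAll(RichTextBox rtb, IEnumerable<int> offsets, int length)
{
    SendMessage(rtb.Handle, WM_SETREDRAW, IntPtr.Zero, IntPtr.Zero); // stop repaints
    try
    {
        foreach (int offset in offsets)
        {
            rtb.Select(offset, length);
            rtb.SelectionBackColor = Color.Yellow;
        }
    }
    finally
    {
        SendMessage(rtb.Handle, WM_SETREDRAW, (IntPtr)1, IntPtr.Zero); // allow painting again
        rtb.Invalidate();                                              // one final redraw
    }
}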

WinForms Textbox Changing its margins?

So I've been working on cobbling together a game and decided I'd like to have a little program to show a file with each character replaced by its byte equivalent, for working with coded saves and whatnot. Figured it'd be a layup. Three hours later, I've been wracking my brain trying to figure this out.
When I load a small (or perhaps short is the better term) file it looks like the window on top. When I load a larger file, it looks like the window on the bottom.
http://dl.dropbox.com/u/16985121/Images/ViewAsBytes.PNG
That's 10pt Courier New, but it seems to happen with any font I try. There's always that extra column, and if there wasn't enough room for the column, it'd just squeeze in whatever it could in that space that it previously didn't use. I've tried tweaking all kinds of variables, as well as comparing the textbox before and after it adds the text from the file (which is read in just as bytes from a FileStream and then fed into a StringBuilder) but nothing seems to change even though something is clearly different.
I can think of a bunch of different workarounds for this, but now I'm just more interested in what TextBox thinks it's doing exactly than getting my program done. Anyone got any idea?
Here's the code that reads in the data and puts that to the textbox:
FileStream stream = new FileStream(files[0], FileMode.Open);
StringBuilder sb = new StringBuilder();
int byteIn = stream.ReadByte();
while (byteIn != -1)
{
    sb.Append('[');
    if (byteIn < 100)
        sb.Append('0');
    if (byteIn < 10)
        sb.Append('0');
    sb.Append(byteIn.ToString());
    sb.Append(']');
    byteIn = stream.ReadByte();
}
txtView.Text = sb.ToString();
stream.Close();
This is because you set the WordWrap property to True. Set it to False, set Multiline to True, and set ScrollBars to Both. Append Environment.NewLine to the string you generate; every 16 bytes is the norm for hex viewers. Use byte.ToString("X2") to generate a hex string instead of a decimal string.
You now have a fully scrollable view of the data; any amount is supported. Allow the user to resize the window so she won't have to scroll horizontally. Or just make it big enough.
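A short sketch of that advice (the method and control names are illustrative): hex-format the bytes sixteen per line, with WordWrap off so the columns stay aligned.
private void LoadFileAsHex(string path, TextBox txtView)
{
    txtView.Multiline = true;
    txtView.WordWrap = false;
    txtView.ScrollBars = ScrollBars.Both;

    byte[] data = File.ReadAllBytes(path);
    var sb = new StringBuilder();
    for (int i = 0; i < data.Length; i++)
    {
        sb.Append(data[i].ToString("X2")).Append(' ');
        if ((i + 1) % 16 == 0)
            sb.Append(Environment.NewLine); // 16 bytes per line, the usual hex-viewer layout
    }
    txtView.Text = sb.ToString();
}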

How could I read a very large text file using StreamReader?

I want to read a huge .txt file and I'm getting a memory overflow because of its sheer size.
Any help?
private void button1_Click(object sender, EventArgs e)
{
    using (var Reader = new StreamReader(@"C:\Test.txt"))
    {
        textBox1.Text += Reader.ReadLine();
    }
}
Text file is just:
Line1
Line2
Line3
Literally like that.
I want to load the text file to a multiline textbox just as it is, 100% copy.
Firstly, the code you posted will only put the first line of the file into the TextBox. What you want is this:
using (var reader = new StreamReader(@"C:\Test.txt"))
{
    while (!reader.EndOfStream)
        textBox1.Text += reader.ReadLine();
}
Now as for the OutOfMemoryException: I haven't tested this, but have you tried the TextBox.AppendText method instead of using +=? The latter will certainly be allocating a ton of strings, most of which are going to be nearly the length of the entire file by the time you near the end of the file.
For all I know, AppendText does this as well; but its existence leads me to suspect it's put there to deal with this scenario. I could be wrong -- like I said, haven't tested personally.
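For reference, that suggestion sketched out (untested here, as noted above): AppendText avoids rebuilding the whole Text string on every iteration the way += does.
using (var reader = new StreamReader(@"C:\Test.txt"))
{
    while (!reader.EndOfStream)
        textBox1.AppendText(reader.ReadLine() + Environment.NewLine);
}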
You'll get much faster performance with the following:
textBox1.Text = File.ReadAllText(#"C:\Test.txt");
It might also help with your memory problem, since you're wasting an enormous amount of memory by allocating successively larger strings with each line read.
Granted, the GC should be collecting the older strings before you see an OutOfMemoryException, but I'd give the above a shot anyway.
First, use a RichTextBox instead of a regular TextBox. They're much better equipped for the large amounts of data you're using. However, you still need to read the data in.
// use a string builder; the += on that many strings increasing in size
// is causing massive memory hoggage and very well could be part of your problem
StringBuilder sb = new StringBuilder();

// open a stream reader
using (var reader = new StreamReader(@"C:\Test.txt"))
{
    // read through the stream, loading up the string builder
    while (!reader.EndOfStream)
    {
        sb.Append(reader.ReadLine());
    }
}

// set the text and null the string builder for GC
textBox1.Text = sb.ToString();
sb = null;
Read and process it one line at a time, or break it into chunks and deal with the chunks individually. You can also show us the code you have, and tell us what you are trying to accomplish with it.
Here is an example: C# Read Text File Containing Data Delimited By Tabs. Notice the ReadLine() and WriteLine() statements.
A TextBox is severely limited in the number of characters it can hold. You can try using the AppendText() method on a RichTextBox instead.

.NET C# - Random access in text files - no easy way?

I've got a text file that contains several 'records' inside of it. Each record contains a name and a collection of numbers as data.
I'm trying to build a class that will read through the file, present only the names of all the records, and then allow the user to select which record data he/she wants.
The first time I go through the file, I only read header names, but I can keep track of the 'position' in the file where the header is. I need random access to the text file to seek to the beginning of each record after a user asks for it.
I have to do it this way because the file is too large to be read in completely in memory (1GB+) with the other memory demands of the application.
I've tried using the .NET StreamReader class to accomplish this (it provides very easy-to-use ReadLine functionality), but there is no way to capture the true position in the file (the position reported by the BaseStream property is skewed because of the buffer the class uses).
Is there no easy way to do this in .NET?
There are some good answers provided, but I couldn't find some source code that would work in my very simplistic case. Here it is, with the hope that it'll save someone else the hour that I spent searching around.
The "very simplistic case" that I refer to is: the text encoding is fixed-width, and the line ending characters are the same throughout the file. This code works well in my case (where I'm parsing a log file, and I sometime have to seek ahead in the file, and then come back. I implemented just enough to do what I needed to do (ex: only one constructor, and only override ReadLine()), so most likely you'll need to add code... but I think it's a reasonable starting point.
public class PositionableStreamReader : StreamReader
{
    public PositionableStreamReader(string path)
        : base(path)
    { }

    private int myLineEndingCharacterLength = Environment.NewLine.Length;

    public int LineEndingCharacterLength
    {
        get { return myLineEndingCharacterLength; }
        set { myLineEndingCharacterLength = value; }
    }

    public override string ReadLine()
    {
        string line = base.ReadLine();
        if (null != line)
            myStreamPosition += line.Length + myLineEndingCharacterLength;
        return line;
    }

    private long myStreamPosition = 0;

    public long Position
    {
        get { return myStreamPosition; }
        set
        {
            myStreamPosition = value;
            this.BaseStream.Position = value;
            this.DiscardBufferedData();
        }
    }
}
Here's an example of how to use the PositionableStreamReader:
PositionableStreamReader sr = new PositionableStreamReader("somepath.txt");
// read some lines
while (something)
    sr.ReadLine();
// bookmark the current position
long streamPosition = sr.Position;
// read some lines
while (something)
    sr.ReadLine();
// go back to the bookmarked position
sr.Position = streamPosition;
// read some lines
while (something)
    sr.ReadLine();
FileStream has the Seek() method.
You can use a System.IO.FileStream instead of a StreamReader. If you know exactly what the file contains (the encoding, for example), you can do all the same operations as with a StreamReader.
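A rough sketch of that suggestion (the buffer size and encoding are assumptions): a FileStream's Position is exact, so the byte offset of each record can be remembered and seeked back to later.
using (var fs = new FileStream("data.txt", FileMode.Open, FileAccess.Read))
{
    long recordStart = fs.Position;          // exact byte offset, unlike StreamReader's BaseStream.Position
    byte[] buffer = new byte[4096];
    int read = fs.Read(buffer, 0, buffer.Length);
    string text = System.Text.Encoding.ASCII.GetString(buffer, 0, read);
    // ...parse the record name out of 'text' and keep recordStart for later...
    fs.Seek(recordStart, SeekOrigin.Begin);  // jump straight back to the record when asked
}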
If you're flexible with how the data file is written and don't mind it being a little less text editor-friendly, you could write your records with a BinaryWriter:
using (BinaryWriter writer =
    new BinaryWriter(File.Open("data.txt", FileMode.Create)))
{
    writer.Write("one,1,1,1,1");
    writer.Write("two,2,2,2,2");
    writer.Write("three,3,3,3,3");
}
Then, initially reading each record is simple because you can use the BinaryReader's ReadString method:
using (BinaryReader reader = new BinaryReader(File.OpenRead("data.txt")))
{
    string line = null;
    long position = reader.BaseStream.Position;
    while (reader.PeekChar() > -1)
    {
        line = reader.ReadString();
        // parse the name out of the line here...
        Console.WriteLine("{0},{1}", position, line);
        position = reader.BaseStream.Position;
    }
}
The BinaryReader isn't buffered so you get the proper position to store and use later. The only hassle is parsing the name out of the line, which you may have to do with a StreamReader anyway.
Is the encoding a fixed-size one (e.g. ASCII or UCS-2)? If so, you could keep track of the character index (based on the number of characters you've seen) and find the binary index based on that.
Otherwise, no - you'd basically need to write your own StreamReader implementation which lets you peek at the binary index. It's a shame that StreamReader doesn't implement this, I agree.
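A minimal sketch of that idea (the helper name is illustrative; it assumes a consistent line terminator and no BOM): with a fixed-size encoding, a line's byte offset is just the number of characters seen so far times the bytes per character (1 for ASCII, 2 for UCS-2).
static List<long> IndexLineOffsets(string path, int bytesPerChar)
{
    var offsets = new List<long>();
    long charactersSeen = 0;
    using (var reader = new StreamReader(path))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            offsets.Add(charactersSeen * bytesPerChar); // byte offset of this line
            charactersSeen += line.Length + Environment.NewLine.Length;
        }
    }
    return offsets; // each entry can be passed to FileStream.Seek(offset, SeekOrigin.Begin)
}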
I think that the FileHelpers library's runtime records feature might help you: http://filehelpers.sourceforge.net/runtime_classes.html
A couple of items that may be of interest.
1) If the lines are a fixed number of characters in length, that is not necessarily useful information if the character set has variable-size characters (like UTF-8). So check your character set.
2) You can ascertain the exact position of the file cursor from StreamReader by using the BaseStream.Position value IF you Flush() the buffers first (which will force the current position to be where the next read will begin - one byte after the last byte read).
3) If you know in advance that the exact length of each record will be the same number of characters, and the character set uses fixed-width characters (so each line is the same number of bytes long), then you can use a FileStream with a fixed buffer size matching the size of a line, and the position of the cursor at the end of each read will be, perforce, the beginning of the next line.
4) Is there any particular reason why, if the lines are the same length (assuming in bytes here), you don't simply use line numbers and calculate the byte offset in the file based on line size x line number?
Are you sure that the file is "too large"? Have you tried it that way and has it caused a problem?
If you allocate a large amount of memory, and you aren't using it right now, Windows will just swap it out to disk. Hence, by accessing it from "memory", you will have accomplished what you want -- random access to the file on disk.
This exact question was asked in 2006 here: http://www.devnewsgroups.net/group/microsoft.public.dotnet.framework/topic40275.aspx
Summary:
"The problem is that the StreamReader buffers data, so the value returned in
BaseStream.Position property is always ahead of the actual processed line."
However, "if the file is encoded in a text encoding which is fixed-width, you could keep track of how much text has been read and multiply that by the width"
and if not, you can just use the FileStream and read a char at a time and then the BaseStream.Position property should be correct
Starting with .NET 6, the methods in the System.IO.RandomAccess class are the official and supported way to randomly read from and write to a file. These APIs work with Microsoft.Win32.SafeHandles.SafeFileHandle, which can be obtained with the new System.IO.File.OpenHandle function, also introduced in .NET 6.
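A short sketch of that API (the path and offset are illustrative): read 100 bytes starting at byte offset 4096 without creating a FileStream.
using Microsoft.Win32.SafeHandles;

using SafeFileHandle handle = File.OpenHandle(@"C:\Test.txt");
byte[] buffer = new byte[100];
int bytesRead = RandomAccess.Read(handle, buffer, fileOffset: 4096);
string text = System.Text.Encoding.UTF8.GetString(buffer, 0, bytesRead);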
