WPF TextBox binding to StringBuilder to simulate Console Performance


I want to redirect my Console output to a TextBox to simulate an in-GUI console. I saw this post:
https://stackoverflow.com/a/18727100/6871623
which helps with the redirection.
However, since string is immutable, a new string is allocated on every write, which isn't very efficient.
Therefore, I thought of using a StringBuilder, like this:
public class ControlWriter : TextWriter
{
private Control textbox;
private StringBuilder builder;
public ControlWriter(Control textbox)
{
this.textbox = textbox;
builder = new StringBuilder();
}
public override void Write(char value)
{
builder.Append(value);
textbox.Text = builder.ToString();
}
public override void Write(string value)
{
builder.Append(value);
textbox.Text = builder.ToString();
}
public override Encoding Encoding
{
get { return Encoding.ASCII; }
}
}
Looking at the code, it doesn't seem to improve performance much, since a new string is still allocated every time we call builder.ToString(). The main gain is in the Append step, because we no longer use string concatenation on every write.
Is there a way to bind the TextBox Text directly to the StringBuilder, so that appending to or resetting the StringBuilder is automatically reflected in the GUI?
If that isn't possible for TextBox, is there another way around this?
Finally, is there a way to improve the performance of the code above?

Since this is all hypothetical, I'll answer with what I can.
The Console isn't a string or a StringBuilder; it's actually a buffer, a multi-dimensional array of char (term used loosely).
A TextBox is backed by a string.
A StringBuilder is a single-dimensional array of char that can be converted to a string with ToString().
So, updating a string with an endless stream of chars is going to bottom out sooner or later. At some point you will be splattering a TextBox with endless amounts of characters that have no concept of lines.
If you want to make this more realistic, maybe you want a List or array of lines instead of a single string, but that brings its own problems. However, doing so will allow you to at least buffer a certain amount of data, or virtualise it in some way.
Also, you can't get away from the allocations here: every time you do something with a string, you are allocating again.
Then you talk about binding... Binding is not going to help with the allocations.
To answer your questions, you need to ask yourself more questions. What is the best way to visualise the data? Can you virtualise it? How are you going to deal with lines?
For a simple exercise the textbox and what you are doing is fine. However, suggesting anything else on top of this is exceedingly hard and too broad.
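As a rough sketch of the buffering idea above (not the linked answer's code): instead of reassigning the whole Text on every character, a writer can collect output and hand only the newly written text to a callback whenever a newline arrives. The class name LineBufferedWriter, the sink callback, and the flush-on-newline policy are all assumptions made for illustration. In WPF the callback would typically call textBox.AppendText on the UI thread (for example through the Dispatcher).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not from the original post: collects console output and
// passes only the newly written text to a callback whenever a newline arrives.
public sealed class LineBufferedWriter : TextWriter
{
    private readonly StringBuilder pending = new StringBuilder();
    private readonly Action<string> sink; // e.g. s => textBox.AppendText(s)

    public LineBufferedWriter(Action<string> sink)
    {
        this.sink = sink ?? throw new ArgumentNullException(nameof(sink));
    }

    public override Encoding Encoding => Encoding.UTF8;

    public override void Write(char value)
    {
        pending.Append(value);
        if (value == '\n') Flush();
    }

    public override void Write(string value)
    {
        if (string.IsNullOrEmpty(value)) return;
        pending.Append(value);
        if (value.IndexOf('\n') >= 0) Flush();
    }

    // Sends whatever has accumulated since the last flush, then resets the buffer.
    public override void Flush()
    {
        if (pending.Length == 0) return;
        sink(pending.ToString());
        pending.Clear();
    }
}
```

It would be installed with Console.SetOut(new LineBufferedWriter(...)). Each flush still allocates one string, but only for the new text, so the cost no longer grows with the total amount of output.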

Related

C# big string array to string

I have a string array of about 20,000,000 values.
And I need to convert it to a single string.
I've tried:
string data = "";
foreach (var i in tm)
{
data = data + i;
}
But that takes too long.
Does anyone know a faster way?

Try StringBuilder:
StringBuilder sb = new StringBuilder();
foreach (var i in tm)
{
sb.Append(i);
}
To get the resulting String use ToString():
string result = sb.ToString();

The answer is going to depend on the size of the output string and the amount of memory you have available and usable. The theoretical limit on string length is 2^31-1 (int.MaxValue) characters, which at two bytes per character is about 4GB of memory. Whether you can actually allocate that depends on your framework version, etc.; in practice the CLR's per-object size limit puts the real ceiling well below that. If you're going to be producing a larger output then you can't put it into a single string anyway.
You've already discovered that naive concatenation is going to be tragically slow. The problem is that every pass through the loop creates a new string, then immediately discards it on the next iteration. This is going to fill up memory pretty quickly, forcing the Garbage Collector to work overtime finding old strings to clear out of memory, not to mention the amount of memory fragmentation and all that stuff that modern programmers don't pay much attention to.
A StringBuilder is a reasonable solution. Internally it allocates blocks of characters that it then stitches together at the end using pointers and memory copies. That saves a lot of hassle and is quite speedy.
As for String.Join... it uses a StringBuilder. So does String.Concat although it is certainly quicker when not inserting separator characters.
For simplicity I would use String.Concat and be done with it.
But then I'm not much for simplicity.
Here's an untested and possibly horribly slow answer using LINQ. When I get time I'll test it and see how it performs, but for now:
string result = new String(lines.SelectMany(l => (IEnumerable<char>)l).ToArray());
Obviously there is a potential overflow here since the ToArray call can potentially create an array larger than the String constructor can handle. Try it out and see if it's as quick as String.Concat.
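For what it's worth, here is a minimal sketch of the StringBuilder route with one extra step: summing the lengths first so the builder can be sized once up front. The names Joiner and ConcatAll are made up for this example, and the presizing is an assumption about what helps here, so measure it against String.Concat before relying on it.

```csharp
using System;
using System.Text;

public static class Joiner
{
    // Hypothetical helper: sums the lengths first so the builder is sized once.
    public static string ConcatAll(string[] parts)
    {
        long total = 0;
        foreach (string p in parts)
            total += p?.Length ?? 0;

        // A single string cannot hold more than int.MaxValue characters.
        if (total > int.MaxValue)
            throw new InvalidOperationException("Result would not fit in a single string.");

        var sb = new StringBuilder((int)total);
        foreach (string p in parts)
            sb.Append(p); // appending null is a no-op
        return sb.ToString();
    }
}
```

Usage would be string data = Joiner.ConcatAll(tm);.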

You can do it in LINQ, like this (note that this still concatenates on every step, just like the loop in the question):
string data = tm.Aggregate("", (current, i) => current + i);
Or you can use the string.Join function
string data = string.Join("", tm);

I can't check it right now, but I'm curious how this option would perform:
var data = String.Join(string.Empty, tm);
Is Join optimized so that it skips the separator work when the separator is String.Empty?

For this big data unfortunately memory based methods will fail and this will be a real headache for GC. For this operation create a file and put every string in it. Like this:
using (StreamWriter sw = new StreamWriter("some_file_to_write.txt")){
for (int i=0; i<tm.Length;i++)
sw.Write(tm[i]);
}
Try to avoid using "var" in this performance-demanding approach. Correction: "var" does not affect performance. "dynamic" does.

how to convert part of the string to int/float/vector3 etc. without creating a temp string?

in C#, I have a string like this:
"1 3.14 (23, 23.2, 43,88) 8.27"
I need to convert parts of this string to other types such as int/float/vector3, depending on the value. Right now I have code like this:
public static int ReadInt(this string s, ref string op)
{
s = s.Trim();
string ss = "";
int idx = s.IndexOf(" ");
if (idx > 0)
{
ss = s.Substring(0, idx);
op = s.Substring(idx);
}
else
{
ss = s;
op = "";
}
return Convert.ToInt32(ss);
}
This reads the first int value out. I have similar functions to read float, vector3, etc. The problem is that in my application I have to do this a lot: I receive the string from some plugin and need to parse it every single frame. That creates a lot of strings, and the resulting GC work impacts performance. Is there a way I can do similar things without creating temp strings?

Generation 0 objects such as those created here may well not impact performance too much, as they are relatively cheap to collect. I would change from using Convert to calling int.Parse() with the invariant culture before I started worrying about the GC overhead of the extra strings.
Also, you don't really need to create a new string to accomplish the Trim() behavior. After all, you're scanning and indexing the string anyway. Just do your initial scan for whitespace, and then for the space delimiter between ss and op, so you get just the substrings you need. Right now you're creating 50% more string instances than you really need.
All that said, no...there's not anything built into the basic .NET framework that would parse a substring without actually creating a new string instance. You would have to write your own parsing routines to accomplish that.
You should measure the actual real-world performance impact first, to make sure these substrings really are a significant issue.
I don't know what the "some plugin" is or how you have to handle the input from it, but I would not be surprised to hear that the overhead in acquiring the original input string(s) for this scenario swamps the overhead of the substrings for parsing.
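To give an idea of what "write your own parsing routines" could look like, here is a minimal sketch that reads an int by walking the characters in place and advancing a position, so neither Trim() nor Substring() is needed. The class name InPlaceParser and the ref int pos convention are assumptions for illustration. It only handles plain base-10 digits with an optional sign, and it throws OverflowException for values outside the int range (including int.MinValue itself).

```csharp
using System;

public static class InPlaceParser
{
    // Hypothetical helper: reads a base-10 int starting at pos and advances pos
    // past the last digit, without creating any substring.
    public static int ReadInt(string s, ref int pos)
    {
        // Skip leading whitespace instead of calling Trim().
        while (pos < s.Length && char.IsWhiteSpace(s[pos])) pos++;

        bool negative = false;
        if (pos < s.Length && (s[pos] == '-' || s[pos] == '+'))
        {
            negative = s[pos] == '-';
            pos++;
        }

        int start = pos;
        int value = 0;
        while (pos < s.Length && s[pos] >= '0' && s[pos] <= '9')
        {
            // checked makes overflow throw instead of silently wrapping.
            value = checked(value * 10 + (s[pos] - '0'));
            pos++;
        }

        if (pos == start)
            throw new FormatException("No digits found at position " + start + ".");
        return negative ? -value : value;
    }
}
```

The caller keeps one int pos per input string and passes it to each Read call in turn; float and vector3 readers could follow the same pattern.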

stringbuilder vs list string when searching line by line c#

I have a WinForms app that gets output from serial/telnet terminals.
Because of historical decisions, all output goes to a list like this:
static List<string> BufferLog = new List<string>();
serialInputData += serialPort.ReadExisting();
BufferLog.Add(serialInputData);
Now I want to add another function that blocks the thread until a given sentence (a single word is also possible) arrives.
What I had in mind was something like:
if (IsWaitForCustomMessage)
{
while(IsNotTimeout)
{
List<string> waiterList = serialInputData.Split('\n').ToList();
if (waiterList.Exists(x => x.Contains("SomeSentenc"))) return true;
}
return false;
}
This assumes that serialInputData contains not one line but many lines.
What I want to know is: is there any faster way to check those lines?
The only other fairly simple way I can think of is with StringBuilder, but I am more than willing to try other ways.
Also, from your experience, should I change BufferLog from List<string> to some other type?

Last question first: yes, I'd use StringBuilder instead of List<string> because it's a closer fit to what you are doing (building a string from incremental inputs). That is tidier rather than necessarily faster.
I think you are asking how to wait until the accumulated text contains a specific sequence of chars? Instead of breaking it into lines, is there any reason you couldn't just use IndexOf? (StringBuilder has no IndexOf of its own, so call it on the string, e.g. serialInputData.) This avoids moving the strings around in memory and will be pretty fast.
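One way the IndexOf idea could be sketched, assuming the data arrives in chunks: search only the new chunk plus a short tail of the previous text, so a sentence split across two reads is still found and old data is never rescanned. TokenWatcher and its members are hypothetical names, not part of any library.

```csharp
using System;

// Hypothetical helper: tracks incoming chunks and reports whether the token
// has appeared, searching only the new chunk plus a short overlap.
public sealed class TokenWatcher
{
    private readonly string token;
    private string tail = ""; // last (token.Length - 1) characters seen so far

    public TokenWatcher(string token)
    {
        if (string.IsNullOrEmpty(token))
            throw new ArgumentException("Token must not be empty.", nameof(token));
        this.token = token;
    }

    public bool Found { get; private set; }

    public bool Append(string chunk)
    {
        if (Found) return true;

        // The overlap lets a token that straddles two chunks still match.
        string window = tail + chunk;
        if (window.IndexOf(token, StringComparison.Ordinal) >= 0)
        {
            Found = true;
            return true;
        }

        int keep = Math.Min(token.Length - 1, window.Length);
        tail = window.Substring(window.Length - keep);
        return false;
    }
}
```

The waiting loop would call watcher.Append(serialPort.ReadExisting()) on each read and return once it yields true or the timeout expires.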

Quickest way to Update Multiline Textbox with Large Amount of Text

I have a .NET 4.5 WinForm program that queries a text-based database using ODBC. I then want to display every result in a multiline textbox and I want to do it in the quickest way possible.
The GUI does not have to be usable during the time the textbox is being updated/populated. However, it'd be nice if I could update a progress bar to let the user know that something is happening - I believe a background worker or new thread/task is necessary for this but I've never implemented one.
I initially went with this code and it was slow, as it drew out the result every line before continuing to the next one.
OdbcDataReader dbReader = com.ExecuteReader();
while (dbReader.Read())
{
txtDatabaseResults.AppendText(dbReader[0].ToString());
}
This was significantly faster.
string resultString = "";
while (dbReader.Read())
{
resultString += dbReader[0].ToString();
}
txtDatabaseResults.Text = resultString;
But there is a generous wait time before the textbox comes to life so I want to know if the operation can be even faster. Right now I'm fetching about 7,000 lines from the file and I don't think it's necessary to switch to AvalonEdit (correct me if my way of thinking is wrong, but I would like to keep it simple and use the built-in textbox).

You can make this far faster by using a StringBuilder instead of using string concatenation.
var results = new StringBuilder();
while (dbReader.Read())
{
results.Append(dbReader[0].ToString());
}
txtDatabaseResults.Text = results.ToString();
Using string and concatenation creates a lot of pressure on the GC, especially if you're appending 7000 lines of text. Each time you use string +=, the CLR creates a new string instance, which means the older one (which is progressively larger and larger) needs to be garbage collected. StringBuilder avoids that issue.
Note that there will still be a delay when you assign the text to the TextBox, as it needs to refresh and display that text. The TextBox control isn't optimized for that amount of text, so that may be a bottleneck.
As for pushing this into a background thread - since you're using .NET 4.5, you could use the new async support to handle this. This would work via marking the method containing this code as async, and using code such as:
string resultString = await Task.Run(()=>
{
var results = new StringBuilder();
while (dbReader.Read())
{
results.Append(dbReader[0].ToString());
}
return results.ToString();
});
txtDatabaseResults.Text = resultString;
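Since the question also asks about a progress bar, here is a hedged sketch of how the background build could report progress through IProgress<int>. ResultLoader, BuildAsync, and reportEvery are invented names, and rows stands in for the dbReader loop so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

public static class ResultLoader
{
    // Hypothetical helper: builds the text on a thread-pool thread and reports
    // the number of rows processed so far. "rows" stands in for the dbReader loop.
    public static Task<string> BuildAsync(
        IEnumerable<string> rows, IProgress<int> progress, int reportEvery = 500)
    {
        return Task.Run(() =>
        {
            var sb = new StringBuilder();
            int count = 0;
            foreach (string row in rows)
            {
                sb.AppendLine(row);
                // Report only occasionally so the UI is not flooded with updates.
                if (++count % reportEvery == 0) progress?.Report(count);
            }
            progress?.Report(count); // final count
            return sb.ToString();
        });
    }
}
```

On the UI side, a Progress<int> created on the UI thread (for example new Progress<int>(n => progressBar.Value = n), assuming the bar's Maximum is set appropriately) delivers the callbacks back on that thread, and the result is assigned with txtDatabaseResults.Text = await ResultLoader.BuildAsync(rows, progress);.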

Use a StringBuilder:
StringBuilder e = new StringBuilder();
while (dbReader.Read())
{
e.Append(dbReader[0].ToString());
}
txtDatabaseResults.Text = e.ToString();

Although a parallel thread is recommended, the way you extract the lines from the file is somewhat flawed. Because string is immutable, every time you concatenate to resultString you actually create another (bigger) string. Here, StringBuilder comes in very useful:
StringBuilder resultString = new StringBuilder();
while (dbReader.Read())
{
    // Append modifies the builder in place, so no reassignment is needed.
    resultString.Append(dbReader[0].ToString());
}
// Text is a string, so convert the builder once at the end.
txtDatabaseResults.Text = resultString.ToString();

I am filling a regular TextBox (multiline=true) in a single call with a very long string (more than 200kB, loaded from a file; I just assign my string to the TextBox's Text property).
It's very slow (> 1 second).
The TextBox does nothing other than display the huge string.
I used a very simple trick to improve performance: I replaced the multiline TextBox with a RichTextBox (native control).
Now the same loads are instantaneous, and the RichTextBox has exactly the same appearance and behavior as a TextBox with raw text (as long as you haven't tweaked it). The most obvious difference is that the RTB does not have a context menu by default.
Of course, it's not a solution in every case, and it doesn't directly address the OP's question, but for me it works perfectly, so I hope it helps other people facing the same problems with TextBox and performance with big strings.

C# byte[] substring? (design)

I'm downloading some files asynchronously into a large byte array, and I have a callback that fires off periodically whenever some data is added to that array. If I want to give developers the ability to use the last chunk of data that was added to array, then... well how would I do that? In C++ I could give them a pointer to somewhere in the middle, and then perhaps tell them the number of bytes that were added in the last operation so they at least know the chunk they should be looking at... I don't really want to give them a 2nd copy of that data, that's just wasteful.
I'm just thinking if people want to process this data before the file has completed downloading. Would anyone actually want to do that? Or is it a useless feature anyway? I already have a callback for when the buffer (entire byte array) is full, and then they can dump the whole thing without worrying about start and end points...

.NET has a struct that does exactly what you want:
System.ArraySegment<T>.
In any case, it's easy to implement it yourself too: just write a constructor that takes a base array, an offset, and a length. Then implement an indexer that offsets indexes behind the scenes, so your segment type can be used seamlessly in place of an array.
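As a small illustration of how a consumer would use such a segment, the sketch below walks the shared array from Offset for Count elements without copying anything. ChunkDemo and Sum are made-up names for this example.

```csharp
using System;

public static class ChunkDemo
{
    // Hypothetical consumer: sums the bytes of a segment by walking the shared
    // array from Offset for Count elements. No copy of the data is made.
    public static int Sum(ArraySegment<byte> chunk)
    {
        int sum = 0;
        for (int i = chunk.Offset; i < chunk.Offset + chunk.Count; i++)
            sum += chunk.Array[i];
        return sum;
    }
}
```

The download callback could then pass new ArraySegment<byte>(buffer, start, bytesAdded) for the most recent chunk, where buffer, start, and bytesAdded are whatever the downloader already tracks.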

You can't give them a pointer into the array, but you could give them the array and start index and length of the new data.
But I have to wonder what someone would use this for. Is this a known need, or are you just guessing that someone might want this someday? And if so, is there any reason why you couldn't wait to add the capability until someone actually needs it?

Whether this is needed or not depends on whether you can afford to accumulate all the data from a file before processing it, or whether you need to provide a streaming mode where you process each chunk as it arrives. This depends on two things: how much data there is (you probably would not want to accumulate a multi-gigabyte file), and how long it takes the file to completely arrive (if you are getting the data over a slow link you might not want your client to wait till it had all arrived). So it is a reasonable feature to add, depending on how the library is to be used. Streaming mode is usually a desirable attribute, so I would vote for implementing the feature. However, the idea of putting the data into an array seems wrong, because it fundamentally implies a non-streaming design, and because it requires an additional copy. What you could do instead is to keep each chunk of arriving data as a discrete piece. These could be stored in a container for which adding at the end and removing from the front is efficient.

Copying a chunk of a byte array may seem "wasteful," but then again, object-oriented languages like C# tend to be a little more wasteful than procedural languages anyway. A few extra CPU cycles and a little extra memory consumption can greatly reduce complexity and increase flexibility in the development process. In fact, copying bytes to a new location in memory to me sounds like good design, as opposed to the pointer approach which will give other classes access to private data.
But if you do want to use pointers, C# does support them. Here is a decent-looking tutorial. The author is correct when he states, "...pointers are only really needed in C# where execution speed is highly important."

I agree with the OP: sometimes you just plain need to pay some attention to efficiency. I don't think the example of providing an API is the best, because that certainly calls for leaning toward safety and simplicity over efficiency.
However, a simple example is when processing large numbers of huge binary files that have zillions of records in them, such as when writing a parser. Without using a mechanism such as System.ArraySegment, the parser becomes a big memory hog, and is greatly slowed down by creating a zillion new data elements, copying all the memory over, and fragmenting the heck out of the heap. It's a very real performance issue. I write these kinds of parsers all the time for telecommunications stuff which generate millions of records per day in each of several categories from each of many switches with variable length binary structures that need to be parsed into databases.
Using the System.ArraySegment mechanism versus creating new structure copies for each record tremendously speeds up the parsing, and greatly reduces the peak memory consumption of the parser. These are very real advantages because the servers run multiple parsers, run them frequently, and speed and memory conservation = very real cost savings in not having to have so many processors dedicated to the parsing.
System.ArraySegment is very easy to use. Here's a simple example of a basic way to track the individual records in a typical big binary file full of records with a fixed-length header and a variable-length data section (obvious exception handling deleted):
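using System;
using System.Collections.Generic;
using System.IO;

public struct MyRecord
{
    // Public so the parser can assign them; both are views into the shared file buffer.
    public ArraySegment<byte> header;
    public ArraySegment<byte> data;
}
public class Parser
{
    const int HEADER_SIZE = 10;
    const int HDR_OFS_REC_TYPE = 0;
    const int HDR_OFS_REC_LEN = 4;
    byte[] m_fileData;
    List<MyRecord> records = new List<MyRecord>();
    bool Parse(FileStream fs)
    {
        int fileLen = (int)fs.Length;
        m_fileData = new byte[fileLen];
        // Read may return fewer bytes than requested, so loop until the buffer is full.
        int read = 0;
        while (read < fileLen)
        {
            int n = fs.Read(m_fileData, read, fileLen - read);
            if (n == 0) break;
            read += n;
        }
        fs.Dispose(); // Dispose also closes the stream
        int offset = 0;
        while (offset + HEADER_SIZE < fileLen)
        {
            int recType = (int)m_fileData[offset + HDR_OFS_REC_TYPE];
            switch (recType) { /*puke if not a recognized type*/ }
            // The data length is stored as two bytes, high byte first.
            int varDataLen = ((int)m_fileData[offset + HDR_OFS_REC_LEN]) * 256
                + (int)m_fileData[offset + HDR_OFS_REC_LEN + 1];
            // The record (header plus data) must fit inside the file.
            if (offset + HEADER_SIZE + varDataLen > fileLen) { return false; /*puke as file has odd bytes at end*/ }
            MyRecord rec = new MyRecord();
            rec.header = new ArraySegment<byte>(m_fileData, offset, HEADER_SIZE);
            rec.data = new ArraySegment<byte>(m_fileData, offset + HEADER_SIZE,
                varDataLen);
            records.Add(rec);
            offset += HEADER_SIZE + varDataLen;
        }
        return true;
    }
}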
The above example gives you a list with ArraySegments for each record in the file while leaving all the actual data in place in one big array per file. The only overhead is the two array segments in the MyRecord struct per record. When processing the records, each segment's Array property returns the shared underlying array, so you use it together with Offset and Count to operate on the elements of a record as if they were in their own byte[] copy.

I think you shouldn't bother.
Why on earth would anyone want to use it?

That sounds like you want an event.
public class ArrayChangedEventArgs : EventArgs {
public ArrayChangedEventArgs(byte[] array, int start, int length) {
Array = array;
Start = start;
Length = length;
}
public byte[] Array { get; private set; }
public int Start { get; private set; }
public int Length { get; private set; }
}
// ...
// and in your class:
public event EventHandler<ArrayChangedEventArgs> ArrayChanged;
protected virtual void OnArrayChanged(ArrayChangedEventArgs e)
{
// using a temporary variable avoids a common potential multithreading issue
// where the multicast delegate changes midstream.
// Best practice is to grab a copy first, then test for null
EventHandler<ArrayChangedEventArgs> handler = ArrayChanged;
if (handler != null)
{
handler(this, e);
}
}
// finally, your code that downloads a chunk just needs to call OnArrayChanged()
// with the appropriate args
Clients hook into the event and get called when things change. This is what most client code in .NET expects to have in an API ("call me when something happens"). They can hook into the code with something as simple as:
yourDownloader.ArrayChanged += (sender, e) =>
Console.WriteLine(String.Format("Just downloaded {0} byte{1} at position {2}.",
e.Length, e.Length == 1 ? "" : "s", e.Start));
