What is the best way to pass a stream around - c#

I have tried passing a stream as an argument, but I am not sure which way is "the best", so I would like to hear your opinions and suggestions on my code sample.
I personally prefer Option 3, but I have never seen it done this way anywhere else.
Option 1 is good for small streams (and streams with a known size)
Options 2_1 and 2_2 would always leave the "Handler" in doubt about who is responsible for disposing / closing.
public interface ISomeStreamHandler
{
// Option 1
void HandleStream(byte[] streamBytes);
// Option 2
void HandleStream(Stream stream);
// Option 3
void HandleStream(Func<Stream> openStream);
}
public interface IStreamProducer
{
Stream GetStream();
}
public class SomeTestClass
{
private readonly ISomeStreamHandler _streamHandler;
private readonly IStreamProducer _streamProducer;
public SomeTestClass(ISomeStreamHandler streamHandler, IStreamProducer streamProducer)
{
_streamHandler = streamHandler;
_streamProducer = streamProducer;
}
public void DoOption1()
{
var buffer = new byte[16 * 1024];
using (var input = _streamProducer.GetStream())
{
using (var ms = new MemoryStream())
{
int read;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
ms.Write(buffer, 0, read);
}
_streamHandler.HandleStream(ms.ToArray());
}
}
}
public void DoOption2_1()
{
_streamHandler.HandleStream(_streamProducer.GetStream());
}
public void DoOption2_2()
{
using (var stream = _streamProducer.GetStream())
{
_streamHandler.HandleStream(stream);
}
}
public void DoOption3()
{
_streamHandler.HandleStream(_streamProducer.GetStream);
}
}

Option 2_2 is the standard way of dealing with disposable resources.
Your SomeTestClass instance asks the producer for a stream - then SomeTestClass owns a stream and is responsible for cleaning up.
Options 3 and 2_1 rely on a different object to clean up the resource owned by SomeTestClass - this expectation might not be met.
Option 1 is just copying a stream's content to another stream - I don't see any benefit in doing that.
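The ownership rule from this answer can be sketched roughly as follows. This is only an illustration: LengthCountingHandler and OwnershipDemo are hypothetical names, and a MemoryStream stands in for whatever the producer returns. The caller that obtained the stream disposes it, and the handler only reads.

```csharp
using System;
using System.IO;

public interface ISomeStreamHandler
{
    void HandleStream(Stream stream);
}

// Hypothetical handler for illustration: reads the stream but never disposes it.
public sealed class LengthCountingHandler : ISomeStreamHandler
{
    public long BytesSeen { get; private set; }

    public void HandleStream(Stream stream)
    {
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            BytesSeen += read;
        }
    }
}

public static class OwnershipDemo
{
    // Returns whether the stream was still readable right after the handler returned.
    public static bool Run(ISomeStreamHandler handler)
    {
        bool openAfterHandler;
        // The MemoryStream is an assumption standing in for _streamProducer.GetStream().
        using (var stream = new MemoryStream(new byte[] { 1, 2, 3 }))
        {
            handler.HandleStream(stream);
            openAfterHandler = stream.CanRead; // the handler did not dispose it
        } // the caller, who obtained the stream, disposes it here
        return openAfterHandler;
    }
}
```

With this split, a handler never has to guess whether it should close the stream it was given.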

You may not realize it, but you are attempting to implement the pipeline design pattern. As a starting point, consider taking a look at:
MSDN: Pipelines
Steve's Blog: Pipeline Design Pattern
(other good references??)
With regards to your implementation, I recommend that you go with option #2:
public interface IStreamHandler
{
void Process(Stream stream);
}
With regards to object lifetime, it is my belief that:
the implementation should be consistent in how it handles calling Dispose
your solution will be more flexible if IStreamHandler did not call Dispose (now you can chain handlers together much like you would in Unix pipes)
THIRD-PARTY SOLUTIONS
Building a pipeline solution can be fun, but it is also worth noting that there are existing products on the market:
Yahoo: Pipes
Microsoft: BizTalk
IBM: Cast Iron
StackOverflow: Alternatives to Yahoo Pipes
ADDITIONAL NOTES
There is a design issue related to your proposed Option 2:
void Process(Stream stream);
In Unix pipes you can chain a number of applications together by taking the output of one program and making it the input of another. If you build a similar solution using Option 2, you will run into problems if you are using multiple handlers and your data Stream is forward-only (i.e. stream.CanSeek == false).
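One possible workaround for the forward-only problem, sketched under assumptions: HandlerChain and ByteCounter are hypothetical names, and buffering the whole stream in memory is only reasonable when the data is small enough (the same caveat the question raises for Option 1).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public interface IStreamHandler
{
    void Process(Stream stream);
}

// Hypothetical handler for illustration: counts the bytes it reads.
public sealed class ByteCounter : IStreamHandler
{
    public long Count { get; private set; }

    public void Process(Stream stream)
    {
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            Count += read;
        }
    }
}

public static class HandlerChain
{
    // Runs every handler over the same data. The caller still owns 'source'.
    public static void RunAll(Stream source, IEnumerable<IStreamHandler> handlers)
    {
        Stream replayable = source;
        MemoryStream copy = null;
        if (!source.CanSeek)
        {
            // Forward-only source: buffer it once so it can be read again.
            copy = new MemoryStream();
            source.CopyTo(copy);
            replayable = copy;
        }
        try
        {
            foreach (var handler in handlers)
            {
                replayable.Position = 0; // rewind to the start before each handler
                handler.Process(replayable);
            }
        }
        finally
        {
            if (copy != null) copy.Dispose(); // only dispose the buffer we created
        }
    }
}
```

Note that a seekable source is rewound to position 0 here, which may not be what you want if the caller had already advanced it.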


Is there a way to avoid using side effects to process this data

I have an application I'm writing that runs script plugins to automate what a user used to have to do manually through a serial terminal. So, I am basically implementing the serial terminal's functionality in code. One of the functions of the terminal was to send a command which kicked off the terminal receiving continuously streamed data from a device until the user pressed space bar, which would then stop the streaming of the data. While the data was streaming, the user would then set some values in another application on some other devices and watch the data streamed in the terminal change.
Now, the streamed data can take different shapes, depending on the particular command that's sent. For instance, one response may look like:
---RESPONSE HEADER---
HERE: 1
ARE: 2 SOME:3
VALUES: 4
---RESPONSE HEADER---
HERE: 5
ARE: 6 SOME:7
VALUES: 8
....
another may look like:
here are some values
in cols and rows
....
So, my idea is to have a different parser based on the command I send. So, I have done the following:
public class Terminal
{
private SerialPort port;
private IResponseHandler pollingResponseHandler;
private object locker = new object();
private List<Response1Clazz> response1;
private List<Response2Clazz> response2;
//setter omitted for brevity
//get snapshot of data at any point in time while response is polling.
public List<Response1Clazz> Response1 { get { lock (locker) return new List<Response1Clazz>(response1); } }
//setter omitted for brevity
public List<Response2Clazz> Response2 { get { lock (locker) return new List<Response2Clazz>(response2); } }
public Terminal()
{
port = new SerialPort(){/*initialize data*/}; //open port etc etc
}
void StartResponse1Polling()
{
Response1 = new List<Response1Clazz>();
Parser<List<Response1Clazz>> parser = new KeyValueParser(Response1); //parser is of type T
pollingResponseHandler = new PollingResponseHandler(parser);
//write command to start polling response 1 in a task
}
void StartResponse2Polling()
{
Response2 = new List<Response2Clazz>();
Parser<List<Response2Clazz>> parser = new RowColumnParser(Response2); //parser is of type T
pollingResponseHandler = new PollingResponseHandler(parser); // this accepts a parser of type T
//write command to start polling response 2
}
void OnSerialDataReceived(object sender, Args a)
{
lock(locker){
//do some processing yada yada
//we pass in the serial data to the handler, which in turn delegates to the parser.
pollingResponseHandler.Handle(processedSerialData);
}
}
}
the caller of the class would then be something like
public class Plugin : BasePlugin
{
public override void PluginMain()
{
Terminal terminal = new Terminal();
terminal.StartResponse1Polling();
//update some other data;
List<Response1Clazz> response = terminal.Response1;
//process response
//update more data
response = terminal.Response1;
//process response
//terminal1.StopPolling();
}
}
My question is quite general, but I'm wondering if this is the best way to handle the situation. Right now I am required to pass in an object/List that I want modified, and it's modified via a side effect. For some reason this feels a little ugly because there is really no indication in code that this is what is happening. I am purely doing it because the "Start" method is the location that knows which parser to create and which data to update. Maybe this is Kosher, but I figured it is worth asking if there is another/better way. Or at least a better way to indicate that the "Handle" method produces side effects.
Thanks!
I don't see problems in modifying List<>s that are received as a parameter. It isn't the most beautiful thing in the world but it is quite common. Sadly C# doesn't have a const modifier for parameters (compare this with C/C++, where unless you declare a parameter to be const, it is ok for the method to modify it). You only have to give the parameter a self-explaining name (like outputList), and put a comment on the method (you know, an xml-comment block, like /// <param name="outputList">This list will receive...</param>).
To give a more complete response, I would need to see the whole code. You have omitted an example of Parser and an example of Handler.
However, I do see a problem with your lock in { lock (locker) return new List<Response1Clazz>(response1); }. It also seems to make no sense, considering that you then do Response1 = new List<Response1Clazz>();, but Response1 only has a getter.
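To illustrate the naming and xml-comment suggestion above, here is a minimal sketch. KeyValueLines and ParseInto are hypothetical names, not the asker's actual parser, and the method only handles a single "KEY: value" pair per line with an integer value.

```csharp
using System;
using System.Collections.Generic;

public static class KeyValueLines
{
    /// <summary>Parses one "KEY: value" line of terminal output.</summary>
    /// <param name="line">The line to parse.</param>
    /// <param name="outputList">This list will receive the parsed pair; it is modified in place.</param>
    public static void ParseInto(string line, List<KeyValuePair<string, int>> outputList)
    {
        int colon = line.IndexOf(':');
        if (colon <= 0)
        {
            return; // header or malformed line: leave outputList untouched
        }

        string key = line.Substring(0, colon).Trim();
        int value;
        if (int.TryParse(line.Substring(colon + 1).Trim(), out value))
        {
            outputList.Add(new KeyValuePair<string, int>(key, value));
        }
    }
}
```

The parameter name and the doc comment are the only things telling the caller about the side effect, which is exactly the trade-off discussed above.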

Is Marshal.Copy too processor-intensive in this situation?

I am working on a realtime simulation model. The models are written in unmanaged code, but the models are controlled by C# managed code, called the ExecutiveManager. An ExecutiveManager runs multiple models at a time, and controls the timing of the running models (for example, if a model has a "framerate" of 20 per second, the executive will tell the model when to start its next frame).
We are seeing a consistently high load on the CPU when running the simulation, it can get up to 100% and stay there on a machine that should be totally appropriate. I have used a processor profiler to determine where the issues are, and it pointed me to two methods: WriteMemoryRegion and ReadMemoryRegion. The ExecutiveManager makes the calls to these methods. Models have shared memory regions, and the ExecutiveManager is used to read and write these regions using these Methods. Both read and write make calls to Marshal.Copy, and my gut tells me that's where the issue is, but I don't want to trust my gut! We are going to do further testing to narrow things down more, but I wanted to do a quick sanity check on Marshal.Copy. WriteMemoryRegion and ReadMemoryRegion are called each frame, and furthermore they're called by each model in the ExecutiveManager, and each model typically has 6 shared regions. So for 10 models each with 6 regions running at 20 frames per second calling both WriteMemoryRegion and ReadMemoryRegion, that's 2400 calls of Marshal.Copy per second. Is this unreasonable, or could my problem lie elsewhere?
public async Task ReadMemoryRegion(MemoryRegionDefinition g) {
if (!cache.ContainsKey(g.Name)) {
cache.Add(g.Name, mmff.CreateOrOpen(g.Name, g.Size));
}
var mmf = cache[g.Name];
using (var stream = mmf.CreateViewStream())
using (var reader = brf.Create(stream)) {
var buffer = reader.ReadBytes(g.Size);
await WriteIcBuffer(g, buffer).ConfigureAwait(false);
}
}
private Task WriteIcBuffer(MemoryRegionDefinition g, byte[] buffer) {
Marshal.Copy(buffer, 0, new IntPtr(g.BaseAddress),
buffer.Length);
return Task.FromResult(0);
}
public async Task WriteMemoryRegion(MemoryRegionDefinition g) {
if (!cache.ContainsKey(g.Name)) {
if (g.Size > 0) {
cache.Add(g.Name, mmff.CreateOrOpen(g.Name, g.Size));
} else if (g.Size == 0){
throw new EmptyGlobalException($"Global {g.Name} not created as it does not contain any variables.");
} else {
throw new NegativeSizeGlobalException($"Global {g.Name} not created as it has a negative size.");
}
}
var mmf = cache[g.Name];
using (var stream = mmf.CreateViewStream())
using (var writer = bwf.Create(stream)) {
var buffer = await ReadIcBuffer(g);
writer.Write(buffer);
}
}
private Task<byte[]> ReadIcBuffer(MemoryRegionDefinition g) {
var buffer = new byte[g.Size];
Marshal.Copy(new IntPtr(g.BaseAddress), buffer, 0, g.Size);
return Task.FromResult(buffer);
}
I need to come up with a solution so that my processor isn't catching on fire. I'm very green in this area so all ideas are welcome. Again, I'm not sure Marshal.Copy is the issue, but it seems possible. Please let me know if you see other issues that could contribute to the processor problem.
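One way to sanity-check the 2400-calls-per-second figure in isolation is to time Marshal.Copy on its own. The sketch below is only a rough probe: MarshalCopyProbe is a hypothetical name and the region size passed in is an assumption, since the real sizes are not given. It deliberately leaves out the view-stream creation and the per-call byte[] allocations that ReadMemoryRegion and WriteMemoryRegion also perform, so comparing its result with the profiler numbers may hint at whether the copy or the surrounding work dominates.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class MarshalCopyProbe
{
    // Times 'roundTrips' copies managed -> unmanaged -> managed of 'regionSize' bytes each.
    public static TimeSpan Measure(int roundTrips, int regionSize)
    {
        var managed = new byte[regionSize];
        IntPtr unmanaged = Marshal.AllocHGlobal(regionSize);
        try
        {
            var stopwatch = Stopwatch.StartNew();
            for (int i = 0; i < roundTrips; i++)
            {
                Marshal.Copy(managed, 0, unmanaged, regionSize); // like WriteIcBuffer
                Marshal.Copy(unmanaged, managed, 0, regionSize); // like ReadIcBuffer
            }
            stopwatch.Stop();
            return stopwatch.Elapsed;
        }
        finally
        {
            Marshal.FreeHGlobal(unmanaged); // always release the unmanaged block
        }
    }
}
```

For example, MarshalCopyProbe.Measure(1200, 64 * 1024) performs 2400 copies, one second's worth for 10 models with 6 regions at 20 frames per second; the 64 KB region size is a guess to be replaced with the real value.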

How to work around the fact that Image needs to keep its stream open

I'm building a class library for various document types. One such type is an image which contains our custom business logic for dealing with images, including converting to PDF. I'm running into the problem described in many posts -- e.g. here and here -- where the System.Drawing.Image.Save method is throwing a System.Runtime.InteropServices.ExternalException exception with "A generic error occurred in GDI+".
The answers I've seen say that the input stream needs to be kept open throughout the lifetime of the Image. I get that. The issue I have is that my class library doesn't control the input stream or even whether an input stream is used since I have two constructors. Here is some code:
public sealed class MyImage
{
private System.Drawing.Image _wrappedImage;
public MyImage(System.IO.Stream input)
{
_wrappedImage = System.Drawing.Image.FromStream(input);
}
public MyImage(System.Drawing.Image input)
{
_wrappedImage = input;
}
public MyPdf ConvertToPdf()
{
//no 'using' block because ms needs to be kept open due
// to third-party PDF conversion technology.
var ms = new System.IO.MemoryStream();
//System.Runtime.InteropServices.ExternalException occurs here:
//"A generic error occurred in GDI+"
_wrappedImage.Save(ms, System.Drawing.Imaging.ImageFormat.Bmp);
return MyPdf.CreateFromImage(ms);
}
}
public sealed class MyPdf
{
internal static MyPdf CreateFromImage(System.IO.Stream input)
{
//implementation details not important.
return null;
}
}
My question is this: should I keep a copy of the input stream just to avoid the possibility that the client closes the stream before my image is saved? I.e., I could add this to my class:
private System.IO.Stream _streamCopy = new System.IO.MemoryStream();
and change the constructor to this:
public MyImage(System.IO.Stream input)
{
input.CopyTo(_streamCopy);
_wrappedImage = System.Drawing.Image.FromStream(_streamCopy);
}
This would of course add the overhead of copying the stream which is not ideal. Is there a better way to do it?
You could create another Bitmap instance:
public MyImage(System.IO.Stream input)
{
var image = System.Drawing.Image.FromStream(input);
_wrappedImage = new System.Drawing.Bitmap(image);
// input stream may now be closed
}
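A slightly fuller sketch of the same idea, which also disposes the stream-bound image once its pixels have been copied. This assumes System.Drawing is available, and note that the copy may not preserve the original pixel format or metadata.

```csharp
using System.Drawing;
using System.IO;

public sealed class MyImage
{
    private readonly Image _wrappedImage;

    public MyImage(Stream input)
    {
        // The image returned by FromStream depends on 'input' staying open,
        // so copy it into a new Bitmap and dispose the stream-bound instance.
        using (var streamBound = Image.FromStream(input))
        {
            _wrappedImage = new Bitmap(streamBound);
        }
        // The caller may now close 'input' without affecting _wrappedImage.
    }
}
```

Whether this is cheaper than copying the stream depends on the image: the Bitmap copy holds decoded pixels, while a stream copy holds the compressed bytes.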

log4net implementation detail - custom appender

I've implemented a custom log4net appender that writes to an http service... works well, but I am suffering some premature optimization in my head. Specifically, is there a better way to do it? I guess I can make sure that only critical classes have that particular appender, but it feels like there could be a lot of appenders and a liability even with conservative logging options.
Does anyone have experience that they would like to share? I've looked at http://geekswithblogs.net/michaelstephenson/archive/2014/01/02/155044.aspx which is essentially what I am doing... (see code) How well does something like this scale? I like the factory for the singleton... what about implementing a concurrent queue to buffer the writes?
Hopefully I won't get spanked too hard by the admin for asking a (potentially opinion-based) best-practice question.
(adding code from article for clarification)
public class ServiceBusAppender : AppenderSkeleton
{
public string ConnectionStringKey { get; set; }
public string MessagingEntity { get; set; }
public string ApplicationName { get; set; }
public string EventType { get; set; }
public bool Synchronous { get; set; }
public string CorrelationIdPropertyName { get; set; }
protected override void Append(log4net.Core.LoggingEvent loggingEvent)
{
var myLogEvent = new AzureLoggingEvent(loggingEvent);
myLogEvent.ApplicationName = ApplicationName;
myLogEvent.EventType = EventType;
myLogEvent.CorrelationId = loggingEvent.LookupProperty(CorrelationIdPropertyName) as string;
if (Synchronous)
AppendInternal(myLogEvent, 0);
else
{
Task.Run(() => AppendInternal(myLogEvent, 0));
}
}
protected void AppendInternal(AzureLoggingEvent myLogEvent, int attemptNo)
{
try
{
//Convert event to JSON
var stream = new MemoryStream();
var json = Newtonsoft.Json.JsonConvert.SerializeObject(myLogEvent);
var writer = new StreamWriter(stream);
writer.Write(json);
writer.Flush();
stream.Seek(0, SeekOrigin.Begin);
//Setup service bus message
var message = new BrokeredMessage(stream, true);
message.ContentType = "application/json";
message.Label = myLogEvent.MessageType;
message.Properties.Add(new KeyValuePair<string, object>("ApplicationName", myLogEvent.ApplicationName));
message.Properties.Add(new KeyValuePair<string, object>("UserName", myLogEvent.UserName));
message.Properties.Add(new KeyValuePair<string, object>("MachineName", myLogEvent.MachineName));
message.Properties.Add(new KeyValuePair<string, object>("MessageType", myLogEvent.MessageType));
message.Properties.Add(new KeyValuePair<string, object>("Level", myLogEvent.Level));
message.Properties.Add(new KeyValuePair<string, object>("EventType", myLogEvent.EventType));
//Setup Service Bus Connection
var connection = ConfigurationManager.ConnectionStrings[ConnectionStringKey];
if (connection == null || string.IsNullOrEmpty(connection.ConnectionString))
{
ErrorHandler.Error("Cant publish the error, the connection string does not exist");
return;
}
var factory = MessagingFactoryManager.Instance.GetMessagingFactory(connection.ConnectionString);
var sender = factory.CreateMessageSender(MessagingEntity);
//Publish
sender.Send(message);
}
catch (Exception ex)
{
if (ex.Message.Contains("The operation cannot be performed because the entity has been closed or aborted"))
{
if (attemptNo < 3)
AppendInternal(myLogEvent, attemptNo + 1); // attemptNo++ would pass the old value, so the retry limit would never be reached
else
ErrorHandler.Error("Error occured while publishing error", ex);
}
else
ErrorHandler.Error("Error occured while publishing error", ex);
}
}
protected override void Append(log4net.Core.LoggingEvent[] loggingEvents)
{
foreach(var loggingEvent in loggingEvents)
{
Append(loggingEvent);
}
}
}
Thx,
Chris
The cure for premature optimisation is to test and measure, then test and measure again. Write an integration test that logs to a thousand loggers, and see how that goes.
If that does show a problem, then rather than implement your own queue, inherit from BufferingAppenderSkeleton instead:
This base class should be used by appenders that need to buffer a
number of events before logging them. For example the AdoNetAppender
buffers events and then submits the entire contents of the buffer to
the underlying database in one go.
Subclasses should override the SendBuffer method to deliver the
buffered events.
The BufferingAppenderSkeleton maintains a fixed size cyclic buffer of events. The size of the buffer is set using the BufferSize property.
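A minimal sketch of what such a subclass could look like, assuming the log4net package is referenced. BufferedHttpAppender is a hypothetical name, and the actual HTTP call is left out because it depends on your service.

```csharp
using System.Collections.Generic;
using log4net.Appender;
using log4net.Core;

// Hypothetical appender for illustration only.
public class BufferedHttpAppender : BufferingAppenderSkeleton
{
    // log4net calls this with the buffered events when the buffer is sent.
    protected override void SendBuffer(LoggingEvent[] events)
    {
        var batch = new List<string>(events.Length);
        foreach (var loggingEvent in events)
        {
            batch.Add(loggingEvent.RenderedMessage);
        }

        // Post 'batch' to the service in a single request here (not shown).
    }
}
```

The batch size is then controlled through the BufferSize property mentioned above, rather than through a hand-written queue.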
(As an aside, what is up with the log4net documentation, there seem to be more '½ï¿' characters every time I look at it?)
I see that your code involves JSON serialization. If you're looking for log4net JSON, why redo what has been done already? See log4net.ext.json. I'm the developer. The wiki covers the first steps on how to get it up and running. It is used in place of a layout so it can be plugged into any log4net appender that takes a layout.
As part of my project I have also created a load testing GUI for log4net. It is not released, but it should compile easily from source. You can use that to discover how different configurations scale in your conditions.
Finally, I'd advise giving LOCALHOST UDP delivery a shot if performance is a priority. Projects like nxlog or logstash can swallow that easily. Again, why write new code?
Let me know if you need some clarification. Kind regards and good luck, Rob

How to freeze a popsicle in .NET (make a class immutable)

I'm designing a class that I wish to make readonly after a main thread is done configuring it, i.e. "freeze" it. Eric Lippert calls this popsicle immutability. After it is frozen, it can be accessed by multiple threads concurrently for reading.
My question is how to write this in a thread safe way that is realistically efficient, i.e. without trying to be unnecessarily clever.
Attempt 1:
public class Foobar
{
private Boolean _isFrozen;
public void Freeze() { _isFrozen = true; }
// Only intended to be called by main thread, so checks if class is frozen. If it is the operation is invalid.
public void WriteValue(Object val)
{
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
public Object ReadSomething()
{
return it;
}
}
Eric Lippert seems to suggest this would be OK in this post.
I know writes have release semantics, but as far as I understand this only pertains to ordering, and it doesn't necessarily mean that all threads will see the value immediately after the write. Can anyone confirm this? This would mean this solution is not thread safe (this may not be the only reason of course).
Attempt 2:
The above, but using Interlocked.Exchange to ensure the value is actually published:
public class Foobar
{
private Int32 _isFrozen;
public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void WriteValue(Object val)
{
if (_isFrozen == 1)
throw new InvalidOperationException();
// write ...
}
}
Advantage here would be that we ensure the value is published without suffering the overhead on every read. If none of the reads are moved before the write to _isFrozen as the Interlocked method uses a full memory barrier I would guess this is thread safe. However, who knows what the compiler will do (and according to section 3.10 of the C# spec that seems like quite a lot), so I don't know if this is threadsafe.
Attempt 3:
Also do the read using Interlocked.
public class Foobar
{
private Int32 _isFrozen;
public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void WriteValue(Object val)
{
if (Interlocked.CompareExchange(ref _isFrozen, 0, 0) == 1)
throw new InvalidOperationException();
// write ...
}
}
Definitely thread safe, but it seems a little wasteful to have to do the compare exchange for every read. I know this overhead is probably minimal, but I'm looking for a reasonably efficient method (although perhaps this is it).
Attempt 4:
Using volatile:
public class Foobar
{
private volatile Boolean _isFrozen;
public void Freeze() { _isFrozen = true; }
public void WriteValue(Object val)
{
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
}
But Joe Duffy declared "sayonara volatile", so I won't consider this a solution.
Attempt 5:
Lock everything, seems a bit overkill:
public class Foobar
{
private readonly Object _syncRoot = new Object();
private Boolean _isFrozen;
public void Freeze() { lock(_syncRoot) _isFrozen = true; }
public void WriteValue(Object val)
{
lock(_syncRoot) // as above we could include an attempt that reads *without* this lock
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
}
Also seems definitely thread safe, but has more overhead than using the Interlocked approach above, so I would favour attempt 3 over this one.
And then I can come up with at least some more (I'm sure there are many more):
Attempt 6: use Thread.VolatileWrite and Thread.VolatileRead, but these are supposedly a little on the heavy side.
Attempt 7: use Thread.MemoryBarrier, seems a little too internal.
Attempt 8: create an immutable copy - don't want to do this
Summarising:
which attempt would you use and why (or how would you do it if entirely different)? (i.e. what is the best way for publishing a value once that is then read concurrently, while being reasonably efficient without being overly "clever"?)
does .NET's memory model "release" semantics of writes imply that all other threads see updates (cache coherency etc.)? I generally don't want to think too much about this, but it's nice to have an understanding.
EDIT:
Perhaps my question wasn't clear, but I am looking in particular for reasons as to why the above attempts are good or bad. Note that I am talking here about a scenario of one single writer that writes then freezes before any concurrent reads. I believe attempt 1 is OK but I'd like to know exactly why (as I wonder if reads could be optimized away somehow, for example).
I care less about whether or not this is good design practice but more about the actual threading aspect of it.
Many thanks for the response the question received, but I have chosen to mark this as an answer myself because I feel that the answers given do not quite answer my question and I do not want to give the impression to anyone visiting the site that the marked answer is correct simply because it was automatically marked as such due to the bounty expiring.
Furthermore I do not think the answer with the highest number of votes was overwhelmingly voted for, not enough to mark it automatically as an answer.
I am still leaning to attempt #1 being correct, however, I would have liked some authoritative answers. I understand x86 has a strong model, but I don't want to (and shouldn't) code for a particular architecture, after all that's one of the nice things about .NET.
If you are in doubt about the answer, go for one of the locking approaches, perhaps with the optimizations shown here to avoid a lot of contention on the lock.
Maybe slightly off topic but just out of curiosity :) Why don't you use "real" immutability? e.g. making Freeze() return an immutable copy (without "write methods" or any other possibility to change the inner state) and using this copy instead of the original object. You could even go without changing the state and return a new copy (with the changed state) on each write operation instead (afaik the string class works this). "Real immutability" is inherently thread safe.
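A rough sketch of what this "real" immutability could look like. FoobarBuilder and FrozenFoobar are hypothetical names, and the stored objects themselves are assumed not to be mutated after they are written.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Mutable side: used only by the configuring thread.
public sealed class FoobarBuilder
{
    private readonly List<object> _values = new List<object>();

    public void WriteValue(object val)
    {
        _values.Add(val);
    }

    // Copies the current state, so later writes to the builder do not affect the result.
    public FrozenFoobar Freeze()
    {
        return new FrozenFoobar(_values);
    }
}

// Immutable side: has no write methods at all.
public sealed class FrozenFoobar
{
    private readonly ReadOnlyCollection<object> _values;

    internal FrozenFoobar(IEnumerable<object> values)
    {
        _values = new List<object>(values).AsReadOnly(); // private copy
    }

    public int Count
    {
        get { return _values.Count; }
    }

    public object ReadValue(int index)
    {
        return _values[index];
    }
}
```

The cost is the copy made in Freeze(), which is the price the question's Attempt 8 was trying to avoid.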
I vote for Attempt 5, use the lock(this) implementation.
This is the most reliable means of making this work. Reader/writer locks could be employed, but to very little gain. Just go with using a normal lock.
If necessary you could improve the 'frozen' performance by first checking _isFrozen and then locking:
void Freeze() { lock (this) _isFrozen = true; }
object ReadValue()
{
if (_isFrozen)
return Read();
else
lock (this) return Read();
}
void WriteValue(object value)
{
lock (this)
{
if (_isFrozen) throw new InvalidOperationException();
Write(value);
}
}
If you really create, fill and freeze the object before showing it to other threads, then you don't need anything special to deal with thread-safety (the strong memory model of .NET is already your guarantee), so solution 1 is valid.
But if you give the unfrozen object to another thread (or if you are simply creating your class without knowing how users will use it), then using the solution that returns a new fully immutable instance is probably better. In this case, the mutable instance is like the StringBuilder and the immutable instance is like the string. If you need an extra guarantee, the mutable instance may check its creator thread and throw exceptions if it is used from any other thread (in all methods... to avoid possible partial reads).
Attempt 2 is thread safe on x86 and other processors that have a strong memory model, but how I would do it is to make thread safety the consumer's problem, because there is no way for you to do it efficiently within the consumed code. Consider:
if(!foo.frozen)
{
foo.apropery = "avalue";
}
The thread safety of the frozen property and the guard code in apropery's setter doesn't really matter, because even if they are perfectly thread safe you still have a race condition. Instead I would write it like
lock(foo)
{
if(!foo.frozen)
{
foo.apropery = "avalue";
}
}
and have neither of the properties inherently thread safe.
#1 - reader not threadsafe - I believe problem would be in reader side, not writer (code not shown)
#2 - reader not threadsafe - same as #1
#3 - promising, read check can be optimized out for most cases (when CPU caches are in sync)
Attempt 3:
Also do the read using Interlocked.
public class Foobar {
private object _syncRoot = new object();
private object _val; // the value guarded by _isFrozen and _syncRoot
private int _isFrozen = 0; // perf compiler warning, but training code, so show defaults
// Why Exchange to 1 then throw away result. Best to just increment.
//public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void Freeze() { Interlocked.Increment(ref _isFrozen); }
public void WriteValue(Object val) {
// if this core can see _isFrozen then no special lock or sync needed
if (_isFrozen != 0)
throw new InvalidOperationException();
lock(_syncRoot) {
if (_isFrozen != 0)
throw new InvalidOperationException(); // the 'throw' is 100x-1000x more costly than the lock, just eat it
_val = val;
}
}
public object Read() {
// frozen is one-way, if one-way state has been published
// to my local CPU cache then just read _val.
// There are very strange corner cases when _isFrozen and _val fields are in
// different cache lines, but should be nearly impossible to hit unless
// dealing with very large structs (make it more likely to cross
// 4k cache line).
if (_isFrozen != 0)
return _val;
// else
lock(_syncRoot) { // _isFrozen is 0 here
if (_isFrozen != 0) // if _isFrozen is 1 here we just collided with writer using lock on other thread, or our CPU cache was out of sync and lock() forced the dirty cache line to be read from main memory
return _val;
throw new InvalidOperationException(); // throw is 100x-1000x more expensive than lock, eat the cost of lock
}
}
}
Joe Duffy's post about 'volatile is dead' is, I think, in the context of his next-gen CLR/OS architecture and for CLR on ARM. For those of us doing multi-core x64/x86, I think volatile is fine. If perf is the primary concern I suggest you measure the code above and compare it to volatile.
Unlike other folks posting answers I wouldn't jump straight to lock() if you have lots of readers (3 or more threads likely to read the same object at the same time). But in your sample you mix perf-sensitive question with exceptions when a collision happens, which doesn't make much sense. If you're using exceptions, then you can also use other higher-level constructs.
If you want complete safety but need to optimize for lots of concurrent readers change lock()/Monitor to ReaderWriterLockSlim.
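A minimal sketch of that swap, under the assumption that the class holds a single value. RwLockedFoobar is a hypothetical name, and this is not a measurement of whether it beats a plain lock in your scenario.

```csharp
using System;
using System.Threading;

public class RwLockedFoobar
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private bool _isFrozen;
    private object _val;

    public void Freeze()
    {
        _lock.EnterWriteLock();
        try { _isFrozen = true; }
        finally { _lock.ExitWriteLock(); }
    }

    public void WriteValue(object val)
    {
        _lock.EnterWriteLock(); // exclusive: blocks readers and other writers
        try
        {
            if (_isFrozen) throw new InvalidOperationException();
            _val = val;
        }
        finally { _lock.ExitWriteLock(); }
    }

    public object Read()
    {
        _lock.EnterReadLock(); // shared: many readers may hold this at once
        try { return _val; }
        finally { _lock.ExitReadLock(); }
    }
}
```

The try/finally blocks make sure the lock is released even when WriteValue throws on a frozen instance.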
.NET has new primitives to handle publishing values. Take a look at Rx. It can be very fast and lockless for some cases (I think they use optimizations similar to above).
If written multiple times but only one value is kept - in Rx that is "new ReplaySubject(bufferSize: 1)". If you try it you might be surprised how fast it is. At the same time I applaud your attempt to learn this level of detail.
If you want to go lockless, get over your distaste for Thread.MemoryBarrier(). It is extremely important. But it has the same gotchas as volatile as described by Joe Duffy - it was designed as a hint to the compiler & CPU to prevent reordering of memory reads (which take a long time in CPU terms, so they are aggressively reordered when there are no hints present). When this reordering is combined with CLR constructs like auto-inlining of functions, you can see very surprising behavior at the memory & register level. MemoryBarrier() just disables those single-threaded memory access assumptions that the CPU and CLR use most of the time.
Perhaps my question wasn't clear, but I am looking in particular for reasons as to why the above attempts are good or bad. Note that I am talking here about a scenario of one single writer that writes then freezes before any concurrent reads. I believe attempt 1 is OK but I'd like to know exactly why (as I wonder if reads could be optimized away somehow, for example). I care less about whether or not this is good design practice but more about the actual threading aspect of it.
Ok, now I better understand what you are doing and looking for in a response. Allow me to elaborate on my previous answer promoting the use of locks by first addressing each of your attempts.
Attempt 1:
The approach of using a simple class that has no synchronization primitives of any form is entirely viable in your example. Since the 'authoring' thread is the only thread having access to this class during its mutating state, this should be safe. If and only if another thread has the potential to access it before the class is 'frozen' would you need to provide synchronization. Essentially, it's not possible for a thread to have a cache of something it has never seen.
Aside from a thread having a cached copy of the internal state of this list, there is one other concurrency issue that you should be concerned with. You should consider write reordering by the authoring thread. Your example solution doesn't have enough code for me to address this, but the process of handing this 'frozen' list to another thread is the heart of the issue. Are you using Interlocked.Exchange or writing to a volatile state?
I still maintain that this is not the best approach, simply because there is no guarantee that another thread has not seen the instance while it's mutating.
Attempt 2:
Attempt 2 should not be used. If you are using atomic writes to a member, you should also use atomic reads. I would never recommend one without the other, as without both reads and writes being atomic you haven't gained anything. The correct application of atomic reads and writes is your 'Attempt 3'.
Attempt 3:
This will guarantee an exception is thrown if a thread has attempted to mutate a frozen list. However, it makes no assertion that a read is only acceptable on a frozen instance. This, IMHO, is just as bad as accessing our _isFrozen variable with atomic and non-atomic accessors. If you are going to say that it's important to safeguard writes, then you should always safeguard reads. One without the other is just 'odd'.
Overlooking my own feelings towards writing code that guards writes but not reads, this is an acceptable approach given your specific use. I have one writer, I write, I freeze, then I make it available to readers. Under this scenario your code works correctly. You rely on the atomic operation on the set of _isFrozen to provide the required memory barrier prior to handing the class to another thread.
In a nutshell this approach works, but again if a thread has an instance that is not frozen it's going to break.
Attempt 4:
While at heart this is nearly the same as attempt 3 (given one writer) there is one big difference. In this example, if you check _isFrozen in the reader then every access will require a memory barrier. This is unnecessary overhead once the list is frozen.
Still this has the same issue as Attempt 3 in that no assertions are made about the state of _isFrozen during the read so the performance should be identical in your example usage.
Attempt 5:
As I said this is my preference given the modification to read as appears in my other answer.
Attempt 6:
Is essentially the same as #4.
Attempt 7:
You could solve your specific needs with a Thread.MemoryBarrier. Essentially using the code from Attempt 1, you create the instance, call Freeze(), add your Thread.MemoryBarrier, and then share the instance (or share it within a lock). This should work great, again only under your limited use case.
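As a rough illustration of that sequence, the sketch below creates an instance, freezes it, issues Thread.MemoryBarrier, and then shares it within a lock. Settings, Attempt7Sketch and Gate are hypothetical names, and the class is only an assumed stand-in for the Attempt 1 code.

```csharp
using System;
using System.Threading;

// Hypothetical unsynchronized class in the spirit of Attempt 1.
public sealed class Settings
{
    private int _value;
    private bool _isFrozen;

    public int Value
    {
        get { return _value; }
        set
        {
            if (_isFrozen) throw new InvalidOperationException("Frozen.");
            _value = value;
        }
    }

    public void Freeze() { _isFrozen = true; }
}

public static class Attempt7Sketch
{
    private static readonly object Gate = new object();
    private static Settings _shared;

    public static void Publish()
    {
        var s = new Settings();
        s.Value = 42;
        s.Freeze();

        // Full fence: the writes above are not reordered past this point.
        Thread.MemoryBarrier();

        // Sharing within a lock, the alternative mentioned above.
        lock (Gate) { _shared = s; }
    }

    public static Settings Get()
    {
        lock (Gate) { return _shared; }
    }
}
```

The lock on its own already implies the needed barriers, so the explicit Thread.MemoryBarrier matters mostly when the reference is shared through a plain field instead.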
Attempt 8:
Without knowing more about this, I can't advise on the cost of the copy.
Summary
Again I prefer using a class that has some threading guarantee or none at all. Creating a class that is only 'partially' thread safe is, IMO, dangerous.
In the words of a famous jedi master:
Either do or do not there is no try.
The same goes for thread safety. The class should either be thread safe or not. Taking this approach you are left with either using my augmentation of Attempt 5, or using Attempt 7. Given the choice, I would never recommend #7.
So my recommendation stands firmly behind a completely thread-safe version. The performance difference between the two is so small it's almost non-existent. The reader threads will never contend for the lock, simply because of your usage scenario of having a single writer. Yet, if they do, proper behavior is still a certainty. Thus, as your code changes over time and suddenly your instance is being shared prior to being frozen, you don't wind up with a race condition that crashes your program. Thread safe, or not; don't be half-in, or you will wind up with a nasty surprise someday.
My preference is all classes shared by more than one thread are one of two types:
Completely immutable.
Completely Thread-safe.
Since a popsicle list is not immutable by design it does not fit #1. Therefore if you are going to share the object across threads it should fit #2.
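For illustration only, a fully synchronized popsicle list might look something like the sketch below. It is not the code from my other answer: SafePopsicleList and _gate are hypothetical names, and this version takes the lock on every read, which is the simplest way to get the guarantee.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fully synchronized popsicle list.
public sealed class SafePopsicleList<T>
{
    private readonly object _gate = new object();
    private readonly List<T> _items = new List<T>();
    private bool _isFrozen;

    public void Add(T item)
    {
        lock (_gate)
        {
            // The frozen check and the write happen under the same lock.
            if (_isFrozen) throw new InvalidOperationException("List is frozen.");
            _items.Add(item);
        }
    }

    public void Freeze()
    {
        lock (_gate) { _isFrozen = true; }
    }

    public bool IsFrozen
    {
        get { lock (_gate) { return _isFrozen; } }
    }

    public T this[int index]
    {
        get { lock (_gate) { return _items[index]; } }
    }

    public int Count
    {
        get { lock (_gate) { return _items.Count; } }
    }
}
```

With a single writer and no contention, each lock acquisition here is uncontended, which is why I expect the cost to be negligible in your scenario.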
Hopefully all this ranting further explains my reasoning :)
_syncRoot
Many people have noticed that I skipped the use of a _syncRoot on my locking implementation. While the reasons to use _syncRoot are valid they are not always necessary. In your example usage where you have a single writer the use of lock(this) should suffice nicely without adding another heap allocation for _syncRoot.
Is the thing constructed and written to, then permanently frozen and read multiple times?
Or do you freeze and unfreeze and refreeze it multiple times?
If it's the former, then perhaps the "is frozen" check should be in the reader method not the writer method (to prevent it reading before it's frozen).
Or, if it's the latter, then the use case you need to beware of is:
Main thread invokes the writer method, finds that it's not frozen, and therefore begins to write
Before the write has finished, someone tries to freeze the object and then reads from it, while the other (main) thread is still writing
In the latter case, Google shows a lot of results for multiple reader single writer which you might find interesting.
In general, each mutable object should have precisely one clearly-defined "owner"; shared objects should be immutable. Popsicles should not be accessible by multiple threads until after they are frozen.
Personally, I don't like forms of popsicle immutability with an exposed "freeze" method. I think a cleaner approach is to have AsMutable and AsImmutable methods (each of which would simply return the object unmodified when appropriate). Such an approach can allow for more robust promises about immutability. For example, if an "unshared mutable object" is being mutated while its AsImmutable member is being called (behavior which would be contrary to the object being "unshared"), the state of the data in the copy may be indeterminate, but whatever was returned would be immutable. By contrast, if one thread froze an object and then assumed it was immutable while another thread was writing to it, the "immutable" object could end up changing after it was frozen and its values were read.
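A minimal sketch of the AsMutable/AsImmutable idea follows. MutableBag and ImmutableBag are hypothetical types invented for this example; the only point is that each method returns the same object when it is already in the requested form and a copy otherwise.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable flavor; meant to have a single owner.
public sealed class MutableBag
{
    private readonly List<int> _items;

    public MutableBag() { _items = new List<int>(); }
    internal MutableBag(IEnumerable<int> items) { _items = new List<int>(items); }

    public void Add(int value) { _items.Add(value); }
    public int Count { get { return _items.Count; } }

    public MutableBag AsMutable() { return this; }                          // already mutable
    public ImmutableBag AsImmutable() { return new ImmutableBag(_items); }  // snapshot copy
}

// Hypothetical immutable flavor; safe to share once constructed.
public sealed class ImmutableBag
{
    private readonly int[] _items;

    internal ImmutableBag(IEnumerable<int> items)
    {
        _items = new List<int>(items).ToArray();   // private copy, never modified
    }

    public int Count { get { return _items.Length; } }
    public int this[int index] { get { return _items[index]; } }

    public ImmutableBag AsImmutable() { return this; }                      // already immutable
    public MutableBag AsMutable() { return new MutableBag(_items); }        // private copy
}
```

Later changes to the mutable object do not affect a snapshot that was taken earlier, which is the promise a plain Freeze method cannot make.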
Edit
Based on further description, I would suggest having code which writes to the object do so within a monitor lock, and having the freeze routine look something like:
public Thingie Freeze() // Returns the object in question
{
    if (isFrozen) // Private field
        return this;
    else
        return DoFreeze();
}
Thingie DoFreeze()
{
    // 'whatever' is the lock object that writers also use.
    // The lock is deliberately never released once the object is frozen.
    if (Monitor.TryEnter(whatever))
    {
        isFrozen = true;
        return this;
    }
    else if (isFrozen)
        return this;
    else
        throw new InvalidOperationException("Object in use by writer");
}
The Freeze method may be called any number of times by any number of threads; it should be short enough to be inlined (though I haven't profiled it), and should thus take almost no time to execute. If the first access of the object in any thread is via the Freeze method, that should guarantee proper visibility under any reasonable memory model (even if the thread didn't see the updates to the object performed by the thread which created and originally froze it, it would perform the TryEnter, which would guarantee a memory barrier, and after that failed it would notice that the object was frozen and return it).
If code which is going to write the object acquires the lock first, an attempt to write to a frozen object could deadlock. If one would rather have such code throw an exception, one could use TryEnter and throw an exception if it can't get the lock.
The object used for locking should be something which is exclusively held by the object to be frozen. If the object to be frozen doesn't hold a purely-private reference to anything, one could either lock on this or create a private object purely for locking purposes. Note that it is safe to abandon 'entered' monitor locks without cleanup; the GC will simply forget about them, since if no references exist to a lock there's no way anybody will ever care (or could even ask) whether the lock was entered at the time it was abandoned.
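The writer side described above could be sketched roughly as follows. This is an assumption-laden illustration, not tested production code: the Write method, the _gate field and the volatile modifier on _isFrozen are my additions for the example. Because monitors are reentrant, the thread that froze the object would still get past TryEnter, so the sketch re-checks the flag inside the lock.

```csharp
using System;
using System.Threading;

public sealed class Thingie
{
    private readonly object _gate = new object(); // hypothetical private lock object
    private volatile bool _isFrozen;
    private int _value;

    public int Value { get { return _value; } }

    public void Write(int value)
    {
        // Fail fast instead of blocking if the freezer or another writer holds the lock.
        if (!Monitor.TryEnter(_gate))
            throw new InvalidOperationException("Object is frozen or in use.");
        try
        {
            // Reentrancy lets the freezing thread in, so check the flag as well.
            if (_isFrozen)
                throw new InvalidOperationException("Object is frozen.");
            _value = value;
        }
        finally
        {
            Monitor.Exit(_gate);
        }
    }

    public Thingie Freeze()
    {
        if (_isFrozen) return this;
        if (Monitor.TryEnter(_gate))
        {
            _isFrozen = true;   // lock intentionally left held, as described above
            return this;
        }
        if (_isFrozen) return this;
        throw new InvalidOperationException("Object in use by writer");
    }
}
```

After Freeze, a write from any other thread fails at TryEnter, and a write from the freezing thread fails at the flag check.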
I am not sure in terms of cost how the following approach will do, but it is a bit different. Only initially if there are multiple threads trying to write value simultaneously will they encounter locks. Once it is frozen all later calls will get the exception directly.
Attempt 9:
public class Foobar
{
private readonly Object _syncRoot = new Object();
private object _val;
private Boolean _isFrozen;
private Action<object> WriteValInternal;
public void Freeze() { _isFrozen = true; }
public Foobar()
{
WriteValInternal = BeforeFreeze;
}
private void BeforeFreeze(object val)
{
lock (_syncRoot)
{
if (_isFrozen == false)
{
//Write the values....
_val = val;
//...
//...
//...
//and then modify the write value function
WriteValInternal = AfterFreeze;
Freeze();
}
else
{
throw new InvalidOperationException();
}
}
}
private void AfterFreeze(object val)
{
throw new InvalidOperationException();
}
public void WriteValue(Object val)
{
WriteValInternal(val);
}
public Object ReadSomething()
{
return _val;
}
}
Have you checked out Lazy<T>?
http://msdn.microsoft.com/en-us/library/dd642331.aspx
See also ThreadLocal<T>:
http://msdn.microsoft.com/en-us/library/dd642243.aspx
And actually looking further there is a Freezable class...
http://msdn.microsoft.com/en-us/library/vstudio/ms602734(v=vs.100).aspx
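As a small sketch of how Lazy<T> might apply here, the example below builds a read-only list once and hands the same instance to every caller. LazyDemo and the sample data are hypothetical; whether this fits depends on whether the data can be produced by a single factory call rather than written incrementally.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class LazyDemo
{
    // Hypothetical example data; the factory delegate runs at most once.
    private static readonly Lazy<IReadOnlyList<int>> Numbers =
        new Lazy<IReadOnlyList<int>>(
            () => new List<int> { 1, 2, 3 }.AsReadOnly(),
            LazyThreadSafetyMode.ExecutionAndPublication);

    public static IReadOnlyList<int> Get()
    {
        // Every thread receives the same fully constructed instance.
        return Numbers.Value;
    }
}
```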
You may achieve this using PostSharp.
Start with an interface:
public interface IPseudoImmutable
{
bool IsFrozen { get; }
bool Freeze();
}
Then derive your attribute from InstanceLevelAspect like this:
/// <summary>
/// implement by divyang
/// </summary>
[Serializable]
[IntroduceInterface(typeof(IPseudoImmutable),
AncestorOverrideAction = InterfaceOverrideAction.Ignore, OverrideAction = InterfaceOverrideAction.Fail)]
public class PseudoImmutableAttribute : InstanceLevelAspect, IPseudoImmutable
{
private volatile bool isFrozen;
#region "IPseudoImmutable"
[IntroduceMember]
public bool IsFrozen
{
get
{
return this.isFrozen;
}
}
[IntroduceMember(IsVirtual = true, OverrideAction = MemberOverrideAction.Fail)]
public bool Freeze()
{
if (!this.isFrozen)
{
this.isFrozen = true;
}
return this.IsFrozen;
}
#endregion
[OnLocationSetValueAdvice]
[MulticastPointcut(Targets = MulticastTargets.Property | MulticastTargets.Field)]
public void OnValueChange(LocationInterceptionArgs args)
{
if (!this.IsFrozen)
{
args.ProceedSetValue();
}
}
}
public class ImmutableException : Exception
{
/// <summary>
/// The location name.
/// </summary>
private readonly string locationName;
/// <summary>
/// Initializes a new instance of the <see cref="ImmutableException"/> class.
/// </summary>
/// <param name="message">
/// The message.
/// </param>
public ImmutableException(string message)
: base(message)
{
}
public ImmutableException(string message, string locationName)
: base(message)
{
this.locationName = locationName;
}
public string LocationName
{
get
{
return this.locationName;
}
}
}
Then apply it to your class like this:
[PseudoImmutableAttribute]
public class MyTestClass
{
public string MyString { get; set; }
public int MyInitval { get; set; }
}
Then run it from multiple threads:
/// <summary>
/// The program.
/// </summary>
public class Program
{
/// <summary>
/// The main.
/// </summary>
/// <param name="args">
/// The args.
/// </param>
public static void Main(string[] args)
{
Console.Title = "Divyang Demo ";
var w = new Worker();
w.Run();
Console.ReadLine();
}
}
internal class Worker
{
private object SyncObject = new object();
public Worker()
{
var r = new Random();
this.ObjectOfMyTestClass = new MyTestClass { MyInitval = r.Next(500) };
}
public MyTestClass ObjectOfMyTestClass { get; set; }
public void Run()
{
Task readWork;
readWork = Task.Factory.StartNew(
action: () =>
{
for (;;)
{
Task.Delay(1000).Wait(); // block for one second; an unobserved Task.Delay returns immediately
try
{
this.DoReadWork();
}
catch (Exception exception)
{
// Console.SetCursorPosition(80,80);
// Console.SetBufferSize(100,100);
Console.WriteLine("Read Exception : {0}", exception.Message);
}
}
// ReSharper disable FunctionNeverReturns
});
Task writeWork;
writeWork = Task.Factory.StartNew(
action: () =>
{
for (int i = 0; i < int.MaxValue; i++)
{
Task.Delay(1000).Wait(); // block for one second; an unobserved Task.Delay returns immediately
try
{
this.DoWriteWork();
}
catch (Exception exception)
{
Console.SetCursorPosition(80, 80);
Console.SetBufferSize(100, 100);
Console.WriteLine("write Exception : {0}", exception.Message);
}
if (i == 5000)
{
((IPseudoImmutable)this.ObjectOfMyTestClass).Freeze();
}
}
});
Task.WaitAll(readWork, writeWork); // with no arguments, WaitAll returns immediately
}
/// <summary>
/// The do read work.
/// </summary>
public void DoReadWork()
{
// ThreadId where reading is done
var threadId = System.Threading.Thread.CurrentThread.ManagedThreadId;
// printing on screen
lock (this.SyncObject)
{
Console.SetCursorPosition(0, 0);
Console.SetBufferSize(290, 290);
Console.WriteLine("\n");
Console.WriteLine("Read Start");
Console.WriteLine("Read => Thread Id: {0} ", threadId);
Console.WriteLine("Read => this.objectOfMyTestClass.MyInitval: {0} ", this.ObjectOfMyTestClass.MyInitval);
Console.WriteLine("Read => this.objectOfMyTestClass.MyString: {0} ", this.ObjectOfMyTestClass.MyString);
Console.WriteLine("Read End");
Console.WriteLine("\n");
}
}
/// <summary>
/// The do write work.
/// </summary>
public void DoWriteWork()
{
// ThreadId where writing is done
var threadId = System.Threading.Thread.CurrentThread.ManagedThreadId;
// random number generator
var r = new Random();
var count = r.Next(15);
// new value for Int property
var tempInt = r.Next(5000);
this.ObjectOfMyTestClass.MyInitval = tempInt;
// new value for string Property
var tempString = "Randome" + r.Next(500).ToString(CultureInfo.InvariantCulture);
this.ObjectOfMyTestClass.MyString = tempString;
// printing on screen
lock (this.SyncObject)
{
Console.SetBufferSize(290, 290);
Console.SetCursorPosition(125, 25);
Console.WriteLine("\n");
Console.WriteLine("Write Start");
Console.WriteLine("Write => Thread Id: {0} ", threadId);
Console.WriteLine("Write => this.objectOfMyTestClass.MyInitval: {0} and New Value :{1} ", this.ObjectOfMyTestClass.MyInitval, tempInt);
Console.WriteLine("Write => this.objectOfMyTestClass.MyString: {0} and New Value :{1} ", this.ObjectOfMyTestClass.MyString, tempString);
Console.WriteLine("Write End");
Console.WriteLine("\n");
}
}
}
However, this will still allow you to change the contents of properties such as arrays and lists. If you add more logic to the aspect, it may work for all types of properties and fields.
I'd do something like this, inspired by C++ movable types. Just remember not to access the object after Freeze/Thaw.
Of course, you can add a _data != null check/throw if you want to be clear about why the user gets an NRE if accessing after thaw/freeze.
public class Data
{
public string _foo;
public int _bar;
}
public class Mutable
{
private Data _data = new Data();
public Mutable() {}
public Mutable(Data data) => _data = data; // used by Frozen.Thaw()
public string Foo { get => _data._foo; set => _data._foo = value; }
public int Bar { get => _data._bar; set => _data._bar = value; }
public Frozen Freeze()
{
var f = new Frozen(_data);
_data = null;
return f;
}
}
public class Frozen
{
private Data _data;
public Frozen(Data data) => _data = data;
public string Foo => _data._foo;
public int Bar => _data._bar;
public Mutable Thaw()
{
var m = new Mutable(_data);
_data = null;
return m;
}
}
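The optional null check mentioned above could be sketched like this. Payload, MutableView, FrozenView and the Live helper are hypothetical names chosen for this example; the idea is only that a stale wrapper throws a descriptive exception instead of a NullReferenceException.

```csharp
using System;

public class Payload
{
    public string Foo;
    public int Bar;
}

public class MutableView
{
    private Payload _data;

    public MutableView() : this(new Payload()) { }
    public MutableView(Payload data) { _data = data; }

    // Central guard: explains why the wrapper is no longer usable.
    private Payload Live
    {
        get
        {
            if (_data == null)
                throw new InvalidOperationException(
                    "This instance was frozen; use the frozen object instead.");
            return _data;
        }
    }

    public string Foo { get { return Live.Foo; } set { Live.Foo = value; } }
    public int Bar { get { return Live.Bar; } set { Live.Bar = value; } }

    public FrozenView Freeze()
    {
        var frozen = new FrozenView(Live);
        _data = null;   // ownership moves to the frozen view
        return frozen;
    }
}

public class FrozenView
{
    private readonly Payload _data;

    public FrozenView(Payload data) { _data = data; }

    public string Foo { get { return _data.Foo; } }
    public int Bar { get { return _data.Bar; } }
}
```

Note that this guard only improves the error message; it adds no synchronization, so the hand-off between threads still has to be done safely.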
