The documentation for DataReader's DetachBuffer and DetachStream methods is very vague. For DetachBuffer it just says "Detaches a buffer that was previously attached to the reader".
In short
When should reader.DetachBuffer(); be used?
Background
Reading
An example read method for a SerialDevice could look something like this:
using (var reader = new DataReader(inputStream))
{
var bytesReceived = await reader.LoadAsync(EXPECTED_RESPONSE_LENGTH);
var receivedBuffer = new byte[bytesReceived];
reader.ReadBytes(receivedBuffer);
reader.DetachStream();
return receivedBuffer;
}
This code works and seems to be stable, but since I write and read multiple times a second to an embedded device, I want to avoid allocating a new receivedBuffer on every call. I modified my method to something like the code below.
byte[] _receivedBuffer = new byte[EXPECTED_RESPONSE_LENGTH];
private async Task<byte[]> ReadOnceAsync(IInputStream inputStream)
{
using (var reader = new DataReader(inputStream))
{
reader.InputStreamOptions = InputStreamOptions.Partial;
uint bytesReceived = await reader.LoadAsync(EXPECTED_RESPONSE_LENGTH);
var isExpectedLength = (bytesReceived == EXPECTED_RESPONSE_LENGTH);
if (isExpectedLength)
{
reader.ReadBytes(_receivedBuffer);
}
reader.DetachStream();
return isExpectedLength ? _receivedBuffer : null;
}
}
This code crashes my application, sometimes with an Access Violation message, within minutes of starting, or within seconds if the connected device stops responding.
After I added reader.DetachBuffer(); the code is stable again, but I still don't know whether DetachBuffer should be called always, sometimes, or not at all.
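For reference, the version that has been stable for me looks roughly like this. I placed the DetachBuffer call right after the read, but whether that is the correct placement is exactly what I'm unsure about:
private async Task<byte[]> ReadOnceAsync(IInputStream inputStream)
{
    using (var reader = new DataReader(inputStream))
    {
        reader.InputStreamOptions = InputStreamOptions.Partial;
        uint bytesReceived = await reader.LoadAsync(EXPECTED_RESPONSE_LENGTH);
        var isExpectedLength = (bytesReceived == EXPECTED_RESPONSE_LENGTH);
        if (isExpectedLength)
        {
            reader.ReadBytes(_receivedBuffer);
        }
        reader.DetachBuffer(); // added this call; the crashes stopped, but I don't know why it is needed
        reader.DetachStream();
        return isExpectedLength ? _receivedBuffer : null;
    }
}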
Writing
My write method does not call writer.DetachBuffer(), but I don't know whether it should. The code is:
using (var writer = new DataWriter(outputStream))
{
writer.WriteBytes(toSend);
var bytesWritten = await writer.StoreAsync();
//Should writer.DetachBuffer(); be called?
writer.DetachStream();
return bytesWritten;
}
Related
I am working with two C# stream APIs, one of which is a data source and the other a data sink.
Neither API actually exposes a stream object; both expect you to pass a stream into them and they handle writing/reading from the stream.
Is there a way to link these APIs together such that the output of the source is streamed into the sink without having to buffer the entire source in a MemoryStream? This is a very RAM-sensitive application.
Here's an example that uses the MemoryStream approach that I'm trying to avoid, since it buffers the entire stream in RAM before writing it out to S3:
using (var buffer = new MemoryStream())
using (var transferUtil = new TransferUtility(s3client))
{
// Disposing the ParquetWriter finishes the file and transferUtil closes
// the stream, so we need this weird using nesting to keep everyone happy.
using (var parquetWriter = new ParquetWriter(schema, buffer))
using (var rowGroupWriter = parquetWriter.CreateRowGroup())
{
rowGroupWriter.WriteColumn(...);
...
}
transferUtil.Upload(buffer, _bucketName, _key.Replace(".gz", "") + ".parquet");
}
You are looking for a stream that can be passed to both the data source and the sink, and that can 'transfer' the data between the two asynchronously. There are a number of possible solutions; in the past I might have considered a producer-consumer pattern around a BlockingCollection, as sketched below.
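Just to illustrate that older pattern, here is a rough sketch of a producer-consumer around a BlockingCollection (the chunk size, element type and method name are assumptions for the example, not tied to the Parquet or S3 APIs):
using System;
using System.Collections.Concurrent;
using System.Text;
using System.Threading.Tasks;

async Task PumpWithBlockingCollectionAsync()
{
    // Bounded capacity gives you back pressure: Add blocks once the consumer falls behind.
    var chunks = new BlockingCollection<byte[]>(boundedCapacity: 16);

    // Producer: writes chunks into the collection, then signals that no more data is coming.
    var producer = Task.Run(() =>
    {
        for (var i = 0; i < 100; i++)
        {
            chunks.Add(Encoding.UTF8.GetBytes("chunk " + i));
        }
        chunks.CompleteAdding();
    });

    // Consumer: blocks until chunks arrive and exits once adding completes.
    var consumer = Task.Run(() =>
    {
        foreach (var chunk in chunks.GetConsumingEnumerable())
        {
            Console.WriteLine(Encoding.UTF8.GetString(chunk));
        }
    });

    await Task.WhenAll(producer, consumer);
}
The downside is that you still have to adapt both stream-based APIs to push and pull byte[] chunks yourself, which is exactly what the pipe-based approach below avoids.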
More recently, however, the System.IO.Pipelines, Span and Memory types were added with a strong focus on high-performance IO, and I think they would be a good fit here. The Pipe class, with its associated Reader and Writer, automatically handles the flow control, back pressure and IO between the two ends whilst utilising all the new Span and Memory related types.
I have uploaded a Gist at PipeStream that will give you a custom stream with an internal Pipe implementation that you can pass to both of your API classes. Whatever is written to the WriteAsync (or Write) method is made available to the ReadAsync (or Read) method without requiring any further byte[] or MemoryStream allocations.
In your case you would simply substitute the MemoryStream for this new class and it should work out of the box. I haven't got a full S3 test working, but reading directly from the Parquet stream and dumping it to the console window shows that it works asynchronously.
// Create some very badly 'mocked' data
var idColumn = new DataColumn(
new DataField<int>("id"),
Enumerable.Range(0, 10000).Select(i => i).ToArray());
var cityColumn = new DataColumn(
new DataField<string>("city"),
Enumerable.Range(0, 10000).Select(i => i % 2 == 0 ? "London" : "Grimsby").ToArray());
var schema = new Schema(idColumn.Field, cityColumn.Field);
using (var pipeStream = new PipeStream())
{
var buffer = new byte[4096];
int read = 0;
var readTask = Task.Run(async () =>
{
//transferUtil.Upload(readStream, "bucketName", "key"); // Execute this in a Task / Thread
while ((read = await pipeStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
{
var incoming = Encoding.ASCII.GetString(buffer, 0, read);
Console.WriteLine(incoming);
// await Task.Delay(5000); uncomment this to simulate very slow consumer
}
});
using (var parquetWriter = new ParquetWriter(schema, pipeStream)) // Disposing the ParquetWriter finishes the file and transferUtil closes the stream, so we need this weird using nesting to keep everyone happy.
using (var rowGroupWriter = parquetWriter.CreateRowGroup())
{
rowGroupWriter.WriteColumn(idColumn); // Step through both these statements to see data read before the parquetWriter completes
rowGroupWriter.WriteColumn(cityColumn);
}
}
The implementation is not completely finished, but I think it shows a nice approach. In the console 'readTask' you can uncomment the Task.Delay to simulate a slow read (transferUtil), and you should see the pipe automatically throttle the write task.
You need to be using C# 7.2 or later (VS 2017 -> Project Properties -> Build -> Advanced -> Language Version) for one of the Span extension methods, but it should be compatible with any .NET Framework version. You may also need the System.Memory NuGet package.
The stream is readable and writable (obviously!) but not seekable, which should be fine for this scenario but wouldn't work for reading with the Parquet SDK, which requires seekable streams.
Hope it helps
Using System.IO.Pipelines it would look something like this:
var pipe = new System.IO.Pipelines.Pipe();
using (var buffer = pipe.Writer.AsStream())
using (var transferUtil = new TransferUtility(s3client))
{
// we can start the consumer first because it will just block
// on the stream until data is available
Task consumer = transferUtil.UploadAsync(pipe.Reader.AsStream(), _bucketName, _key.Replace(".gz", "") + ".parquet");
// start a task to produce data
Task producer = WriteParquetAsync(buffer, ..);
// start pumping data; we can wait here because the producer will
// necessarily finish before the consumer does
await producer;
// this is key; disposing of the buffer early here causes the consumer stream
// to terminate, else it will just hang waiting on the stream to finish.
// see the documentation for Writer.AsStream(bool leaveOpen = false)
buffer.Dispose();
// wait for the upload to finish
await consumer;
}
I'm having some trouble with a simple TCP read/write application where I need to write a command to a device/host. Normally I can do this using a stream.Write() call; however, this particular device seems to send an initial welcome message back (PJLINK 0) before any command can be sent to it. I can send the commands fine using PuTTY, but when using C# I think my connection is closing before I can get my command through.
So my question is: how can I adjust my code below to receive that welcome message and then send my command (I don't need to read a response) without the TcpClient closing the connection early?
Any help would be greatly appreciated.
using (tcpClientA = new TcpClient())
{
int portA = 4352;
if (!tcpClientA.BeginConnect("10.0.2.201", portA, null, null).AsyncWaitHandle.WaitOne(TimeSpan.FromSeconds(1.0)))
{
throw new Exception("Failed to connect.");
}
while (tcpClientA.Connected)
{
using (streamA = tcpClientA.GetStream())
{
if (type == "raw")
{
// Buffer to store the response bytes.
byte[] writeBufferC = Encoding.ASCII.GetBytes("%1 INPT 32$0D"); //Command I need to send
byte[] readBufferC = new byte[tcpClientA.ReceiveBufferSize];
string fullServerReply = null;
using (var writer = new MemoryStream())
{
do
{
int numberOfBytesRead = streamA.Read(readBufferC, 0, readBufferC.Length);
if (numberOfBytesRead <= 0)
{
break;
}
writer.Write(writeBufferC, 0, writeBufferC.Length);
} while (streamA.DataAvailable);
fullServerReply = Encoding.UTF8.GetString(writer.ToArray());
Console.WriteLine(fullServerReply.Trim());
}
}
}
}
}
Update 1
Removed the BeginConnect and Async methods.
using (tcpClientA = new TcpClient())
{
int portA = 4352;
tcpClientA.Connect("10.0.2.201", portA);
while (tcpClientA.Connected)
{
using (streamA = tcpClientA.GetStream())
{
if (type == "raw")
{
byte[] readBufferC = new byte[tcpClientA.ReceiveBufferSize];
byte[] writeBufferC = Encoding.ASCII.GetBytes("%1 INPT 31$0D"); //Command I need to send
string fullServerReply = null;
using (var writer = new MemoryStream())
{
do
{
streamA.Read(readBufferC, 0, readBufferC.Length); //First read
writer.Write(writeBufferC, 0, writeBufferC.Length); //Send command
} while (streamA.DataAvailable);
fullServerReply = Encoding.UTF8.GetString(readBufferC.ToArray());
Console.WriteLine(fullServerReply.Trim());
tcpClientA.Close();
}
}
}
}
}
DataAvailable does not tell you how much data will be sent in the future by the remote side. Its use is almost always a bug. Here, it causes you to randomly exit the loop early.
Read until you have all the bytes you expect, or until the stream is closed.
Is this a line-based protocol? Then instantiate a StreamReader and read entire lines from the stream.
while (tcpClientA.Connected) accomplishes nothing. Even if it returns true, the connection could be lost one nanosecond later; your code has to deal with that anyway. It should be while (true). This is not a bug, but it shows a weak understanding of TCP, so I point it out.
Remove all usages of ReceiveBufferSize; this value means nothing of significance. Instead, use a fixed buffer size. I find that 4096 works well for connections that don't have very high throughput.
numberOfBytesRead <= 0 should be == 0. Again, this is not a bug, but it suggests you don't understand exactly what the API does, which is dangerous.
In the updated code you're not using the return value of streamA.Read, which is a bug. You have tried to work around it by trimming off the resulting \0 chars, but that only treats the symptom and is not a true fix.
You need a socket tutorial. This carnage comes because you are not relying on best practices. Socket reading loops are actually rather simple if done right. This code is a collection of what can go wrong.
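To make that concrete, here is a minimal sketch that follows the points above (fixed 4096-byte buffer, read until the expected terminator arrives or the stream closes). It assumes the device terminates its messages with a carriage return, which is also why the literal $0D in the question's command string is replaced by an actual CR; the address, port and command are taken from the question:
// requires System.Net.Sockets and System.Text
using (var client = new TcpClient())
{
    client.Connect("10.0.2.201", 4352);
    using (var stream = client.GetStream())
    {
        // Read until the welcome message's terminating carriage return has arrived,
        // or until the remote side closes the connection.
        var buffer = new byte[4096];
        var welcome = new StringBuilder();
        while (!welcome.ToString().Contains("\r"))
        {
            int read = stream.Read(buffer, 0, buffer.Length);
            if (read == 0)
            {
                break; // connection closed by the device
            }
            welcome.Append(Encoding.ASCII.GetString(buffer, 0, read));
        }
        Console.WriteLine(welcome.ToString().Trim());

        // Now send the command, terminated by an actual carriage return.
        byte[] command = Encoding.ASCII.GetBytes("%1 INPT 32\r");
        stream.Write(command, 0, command.Length);
    }
}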
I am developing a game in which I need to retrieve data from a stream (one that has no end).
I have a class called StreamingChannel which creates the streaming channel
public StreamingChannel (){
//stuff to set the stream
webResponse = (HttpWebResponse) webRequest.GetResponse();
responseStream = new StreamReader (webResponse.GetResponseStream (), encode);
}
and to read from it I have this method:
public string Read(){
try{
string jsonText = responseStream.ReadLine();
return jsonText;
}catch(ObjectDisposedException){
return null;
}
}
I perform the read every few seconds with InvokeRepeating, and I do that for the whole game.
It works great except for the fact that my stream only lasts for about a couple of minutes. After that it throws an ObjectDisposedException.
At first I wanted to restore the connection, but I didn't manage to do that without re-instantiating the whole connection, and in that case the problem is that the game lags for about a second.
So how can I tell the StreamReader that it has to leave the channel open?
PS: I cannot use the constructor
public StreamReader(
Stream stream,
Encoding encoding,
bool detectEncodingFromByteOrderMarks,
int bufferSize,
bool leaveOpen)
because it was introduced in version 4.5 of the .NET Framework, and Unity doesn't support that.
A streaming API expects your code to pull data out of the Stream pretty aggressively, and you may not be able to wait for Unity to schedule your ReadLine calls. I think a better model is to use a separate thread that pulls data from the Stream as fast as possible and stores it in a buffer (I believe this is possible in Unity). Then you can pull the data out of that buffer on the standard Unity thread without worrying about the pull rate. A ConcurrentQueue would be a great buffer, but Unity doesn't support it, so I've used a locked List.
Using a separate thread also allows you to restart after failures without blocking the main game.
using System.Collections.Generic;
using System.Threading;
public class StreamingChannel
{
private List<string> backgroundLinesList;
private readonly object listLock = new object();
private Thread streamReaderThread;
public StreamingChannel()
{
streamReaderThread = new Thread(this.ReadWebStream);
streamReaderThread.Start();
}
public List<string> Read()
{
if (!streamReaderThread.IsAlive)
{
streamReaderThread = new Thread(this.ReadWebStream);
streamReaderThread.Start();
}
List<string> lines = null;
lock (listLock)
{
if (backgroundLinesList != null)
{
lines = backgroundLinesList;
backgroundLinesList = null;
}
}
return lines;
}
private void ReadWebStream()
{
try
{
//stuff to set the stream
HttpWebRequest webRequest;
HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse();
StreamReader responseStream = new StreamReader(webResponse.GetResponseStream(), encode);
while (!responseStream.EndOfStream)
{
var line = responseStream.ReadLine();
lock (listLock)
{
if (backgroundLinesList == null)
{
backgroundLinesList = new List<string>();
}
backgroundLinesList.Add(line);
}
}
log.Debug("Stream closed");
}
catch (Exception e)
{
log.Debug("WebStream thread failure: " + e + " Stack: " + e.StackTrace);
}
}
}
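For completeness, a rough sketch of how the game side might poll this from the main Unity thread (the MonoBehaviour name and the polling interval are placeholders I made up for the example):
using UnityEngine;

public class StreamConsumer : MonoBehaviour
{
    private StreamingChannel channel;

    void Start()
    {
        channel = new StreamingChannel();
        // Poll the buffered lines once per second; the background thread keeps draining the stream in the meantime.
        InvokeRepeating("PollStream", 1f, 1f);
    }

    void PollStream()
    {
        var lines = channel.Read();
        if (lines == null)
        {
            return;
        }
        foreach (var line in lines)
        {
            Debug.Log(line); // parse the JSON line here instead of just logging it
        }
    }
}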
I'm trying to get a basic interface working for Windows Store using the Windows.Networking.Sockets API. So far I have this:
public async void Test()
{
using (var socket = new StreamSocket())
{
socket.Control.KeepAlive = false;
socket.Control.NoDelay = false;
await socket.ConnectAsync(new HostName("192.168.1.1"), "5555", SocketProtectionLevel.PlainSocket);
using (var writer = new DataWriter(socket.OutputStream))
{
writer.UnicodeEncoding = UnicodeEncoding.Utf8;
writer.WriteString("yea!");
//writer.WriteByte(0x50); //this doesn't work either to send raw ASCII
var t = writer.FlushAsync();
while (t.Status != AsyncStatus.Completed) ; //just in case?
}
}
}
So far, I do appear to get a successful connect and disconnect. However, I never get any text received.
My netcat command (running under an OpenBSD router)
$ nc -lv 5555
If I don't have netcat running when I run the Test function, it throws an exception as expected. What am I doing wrong here?
This makes absolutely no sense to me, but apparently StoreAsync is required on the DataWriter. I would've thought that Flush should've called that, but apparently not. Yet another fun part about the WinRT APIs. My fixed code:
using (var socket = new StreamSocket())
{
socket.Control.KeepAlive = false;
socket.Control.NoDelay = false;
await socket.ConnectAsync(new HostName("192.168.1.1"), "5555", SocketProtectionLevel.PlainSocket);
using (var writer = new DataWriter(socket.OutputStream))
{
writer.UnicodeEncoding = UnicodeEncoding.Utf8;
writer.WriteString("yea!");
await writer.StoreAsync();
}
}
I'm trying to create a collection of FTP web requests to download a collection of files.
This was working correctly in a single thread, but now that I'm trying to do it with multiple threads I'm getting a timeout exception. I think I'm missing something pretty simple but cannot seem to work it out.
Here is the code:
internal static void DownloadLogFiles(IEnumerable<string> ftpFileNames, string localLogsFolder)
{
BotFinder.DeleteAllFilesFromDirectory(localLogsFolder);
var ftpWebRequests = new Collection<FtpWebRequest>();
// Create web request for each log filename
foreach (var ftpWebRequest in ftpFileNames.Select(filename => (FtpWebRequest) WebRequest.Create(filename)))
{
ftpWebRequest.Credentials = new NetworkCredential(BotFinderSettings.FtpUserId, BotFinderSettings.FtpPassword);
ftpWebRequest.KeepAlive = false;
ftpWebRequest.UseBinary = true;
ftpWebRequest.CachePolicy = NoCachePolicy;
ftpWebRequest.Method = WebRequestMethods.Ftp.DownloadFile;
ftpWebRequests.Add(ftpWebRequest);
}
var threadDoneEvents = new ManualResetEvent[ftpWebRequests.Count];
for (var x = 0; x < ftpWebRequests.Count; x++)
{
var ftpWebRequest = ftpWebRequests[x];
threadDoneEvents[x] = new ManualResetEvent(false);
var threadedFtpDownloader = new ThreadedFtpDownloader(ftpWebRequest, threadDoneEvents[x]);
ThreadPool.QueueUserWorkItem(threadedFtpDownloader.PerformFtpRequest, localLogsFolder);
}
WaitHandle.WaitAll(threadDoneEvents);
}
class ThreadedFtpDownloader
{
private ManualResetEvent threadDoneEvent;
private readonly FtpWebRequest ftpWebRequest;
/// <summary>
///
/// </summary>
public ThreadedFtpDownloader(FtpWebRequest ftpWebRequest, ManualResetEvent threadDoneEvent)
{
this.threadDoneEvent = threadDoneEvent;
this.ftpWebRequest = ftpWebRequest;
}
/// <summary>
///
/// </summary>
/// <param name="localLogsFolder">
///
/// </param>
internal void PerformFtpRequest(object localLogsFolder)
{
try
{
// TIMEOUT IS HAPPENING ON LINE BELOW
using (var response = ftpWebRequest.GetResponse())
{
using (var responseStream = response.GetResponseStream())
{
const int length = 1024*10;
var buffer = new Byte[length];
var bytesRead = responseStream.Read(buffer, 0, length);
var logFileToCreate = string.Format("{0}{1}{2}", localLogsFolder,
ftpWebRequest.RequestUri.Segments[3].Replace("/", "-"),
ftpWebRequest.RequestUri.Segments[4]);
using (var writeStream = new FileStream(logFileToCreate, FileMode.OpenOrCreate))
{
while (bytesRead > 0)
{
writeStream.Write(buffer, 0, bytesRead);
bytesRead = responseStream.Read(buffer, 0, length);
}
}
}
}
threadDoneEvent.Set();
}
catch (Exception exception)
{
BotFinder.HandleExceptionAndExit(exception);
}
}
}
It seems to download the first two files (using two threads, I'm assuming), but then a timeout seems to occur when these complete and the application tries to move on to the next file.
I can confirm that the FtpWebRequest which is timing out is valid and the file exists; I think I may have left a connection open or something.
I was going to post a comment, but it's probably easier to read as an answer:
Firstly, if I set the ftpRequest.Timeout property to Timeout.Infinite, the timeout issue disappears; however, having an infinite timeout is probably not best practice, so I'd prefer to solve this another way...
Debugging the code, I can see that when it gets to:
ThreadPool.QueueUserWorkItem(threadedFtpDownloader.PerformFtpRequest, localLogsFolder);
It enters the PerformFtpRequest method for each FTP web request and calls ftpWebRequest.GetResponse(), but only the first two requests progress any further. The rest stay active but don't go any further until the first two finish, so they are basically left open while waiting for other requests to complete before they can even start.
I think the solution would be either to allow all the requests to execute at once (the ConnectionLimit property is having no effect here) or to prevent the code from calling GetResponse until it's actually ready to use the response.
Any good ideas on best way to solve this? At the moment all I can seem to think of are hacky solutions which I'd like to avoid :)
Thanks!
You should get the ServicePoint for the request and set its ConnectionLimit:
ServicePoint sp = ftpRequest.ServicePoint;
sp.ConnectionLimit = 10;
The default ConnectionLimit is 2; that's why you're seeing that behavior.
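In the loop from the question, that would mean raising the limit as each request is created, something like the sketch below. The value of 10 is just an example; since the requests presumably all target the same host they share one ServicePoint, so setting it once is enough, but setting it per request is harmless:
foreach (var ftpWebRequest in ftpFileNames.Select(filename => (FtpWebRequest) WebRequest.Create(filename)))
{
    ftpWebRequest.Credentials = new NetworkCredential(BotFinderSettings.FtpUserId, BotFinderSettings.FtpPassword);
    ftpWebRequest.KeepAlive = false;
    ftpWebRequest.UseBinary = true;
    ftpWebRequest.CachePolicy = NoCachePolicy;
    ftpWebRequest.Method = WebRequestMethods.Ftp.DownloadFile;

    // Allow more than the default two simultaneous connections to this FTP server.
    ftpWebRequest.ServicePoint.ConnectionLimit = 10;

    ftpWebRequests.Add(ftpWebRequest);
}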
UPDATE: See this answer for a more thorough explanation:
How to improve the Performance of FtpWebRequest?