C# and Rx: testing a timeout

I've implemented an observer for a FileSystemWatcher. The idea is to track copies into a folder and wait until the copy is finished. Once it is done, I want to do something with the copied files.
I've tried my implementation in a small program and it works. Now I want to make it more formal with a unit test:
[TestMethod]
public void TestIsFileWritingFinished()
{
try
{
var dirName = Path.GetTempPath()+Path.DirectorySeparatorChar+DateTime.Now.ToString("MMddyyyy");
if (Directory.Exists(dirName))
{
Directory.Delete(dirName, true);
}
var dir = Directory.CreateDirectory(dirName);
var observer = new FileSystemObserver(dirName, "*.*", true)
.ChangedFiles
.Where(x => (new FileInfo(x.FullPath)).Length > 0)
.Select(x => x.Name);
var timeout = observer.Timeout(/*DateTimeOffset.UtcNow.AddSeconds(1)*/TimeSpan.FromSeconds(1));
var filesChanged = new List<string>();
var terminated = false;
timeout.Subscribe(Console.WriteLine, Console.WriteLine, ()=>terminated=true);
Thread.Sleep(100);
var origin = @"C:\Users\David\ResultA";
CopyDirectory(origin, dirName, true);
Thread.Sleep(100);
Console.WriteLine("nap for 5s");
Thread.Sleep(5000);
//Directory.Delete(dirName, true);
Assert.IsTrue(terminated);
}
catch(Exception ex)
{
Assert.Fail(ex.Message);
}
}
So, when the timeout happens, I expect the boolean to be true. But it looks like it isn't.
Any idea what's wrong with my test?
Thanks in advance, your suggestions will be appreciated,
Kind regards,

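Even though there is a Timeout operator, I find it a bit of an anti-pattern. Timeout reports expiry by calling OnError with a TimeoutException, not OnCompleted (which is why your completion callback never sets the flag), so it's like programming with exceptions. I find the following pattern more useful: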
IObservable<bool> query =
    Observable.Amb(
        source.LastAsync().Select(x => true),
        Observable.Timer(TimeSpan.FromSeconds(seconds)).Select(x => false));
This effectively runs source to completion and then returns true, but if the Timer fires first it returns false.
So, in your code, I'd try something like this:
var dirName = Path.GetTempPath() + Path.DirectorySeparatorChar + DateTime.Now.ToString("MMddyyyy");
if (Directory.Exists(dirName))
{
    Directory.Delete(dirName, true);
}
var dir = Directory.CreateDirectory(dirName);
var observer =
    new FileSystemObserver(dirName, "*.*", true)
        .ChangedFiles
        .Where(x => (new FileInfo(x.FullPath)).Length > 0)
        .Select(x => x.Name);
var result =
    Observable.Amb(
        observer.ToArray().Select(x => true),
        Observable.Timer(TimeSpan.FromSeconds(1.0)).Select(x => false));
var query =
    result.Zip(Observable.Start(() =>
    {
        var origin = @"C:\Users\David\ResultA";
        CopyDirectory(origin, dirName, true);
    }), (r, s) => r);
var terminated = query.Wait();
Assert.IsTrue(terminated);
That totally avoids any pesky sleeps.
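To make the difference between the two approaches concrete, here is a minimal sketch, assuming the System.Reactive package is referenced. The names `TimeoutSketch`, `TimeoutOutcome` and `CompletesWithin` are illustrative and not part of Rx or of the code above. The first helper shows how `Timeout` ends a silent source; the second wraps the `Amb` pattern, using `ToArray` so that an empty source still counts as completed.

```csharp
using System;
using System.Reactive;
using System.Reactive.Linq;

public static class TimeoutSketch
{
    // Reports how a source that never emits ends once Timeout is applied to it.
    public static Notification<string> TimeoutOutcome(TimeSpan limit) =>
        Observable.Never<string>()
            .Timeout(limit)
            .Materialize()   // turn OnError/OnCompleted into ordinary values
            .Wait();         // block until the sequence ends, return its last value

    // True if `source` completes within `limit`, false if the timer fires first.
    public static IObservable<bool> CompletesWithin<T>(this IObservable<T> source, TimeSpan limit) =>
        Observable.Amb(
            source.ToArray().Select(_ => true),
            Observable.Timer(limit).Select(_ => false));
}
```

Calling `TimeoutSketch.TimeoutOutcome(TimeSpan.FromMilliseconds(100))` should give a notification whose `Kind` is `NotificationKind.OnError` and whose `Exception` is a `TimeoutException`. `Observable.Return(1).CompletesWithin(TimeSpan.FromSeconds(5)).Wait()` yields true, while the same call on `Observable.Never<int>()` yields false. Note that the helper only reports true when the source actually completes, so a watcher-backed stream that never calls OnCompleted would let the timer win.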


Writing to two standard input pipes from C#

I am using FFMPEG from my C# application to build out the video stream from raw unencoded frames. For just one input stream this is fairly straightforward:
var argumentBuilder = new List<string>();
argumentBuilder.Add("-loglevel panic");
argumentBuilder.Add("-f h264");
argumentBuilder.Add("-i pipe:");
argumentBuilder.Add("-c:v libx264");
argumentBuilder.Add("-bf 0");
argumentBuilder.Add("-pix_fmt yuv420p");
argumentBuilder.Add("-an");
argumentBuilder.Add(filename);
startInfo.Arguments = string.Join(" ", argumentBuilder.ToArray());
var _ffMpegProcess = new Process();
_ffMpegProcess.EnableRaisingEvents = true;
_ffMpegProcess.OutputDataReceived += (s, e) => { Debug.WriteLine(e.Data); };
_ffMpegProcess.ErrorDataReceived += (s, e) => { Debug.WriteLine(e.Data); };
_ffMpegProcess.StartInfo = startInfo;
Console.WriteLine($"[log] Starting write to {filename}...");
_ffMpegProcess.Start();
_ffMpegProcess.BeginOutputReadLine();
_ffMpegProcess.BeginErrorReadLine();
for (int i = 0; i < videoBuffer.Count; i++)
{
_ffMpegProcess.StandardInput.BaseStream.Write(videoBuffer[i], 0, videoBuffer[i].Length);
}
_ffMpegProcess.StandardInput.BaseStream.Close();
One of the challenges that I am trying to address is writing to two input pipes, similar to how I could do that from, say, Node.js, by referring to pipe:4 or pipe:5. It seems that I can only write to standard input directly but not split it into "channels".
What's the approach to do this in C#?
Based on what is written here and on a good night of sleep (where I dreamt that I could use Stream.CopyToAsync), this is the skeleton of the solution:
string pathToFFmpeg = @"C:\ffmpeg\bin\ffmpeg.exe";
string[] inputs = new[] { "video.m4v", "audio.mp3" };
string output = "output2.mp4";
var npsss = new NamedPipeServerStream[inputs.Length];
var fss = new FileStream[inputs.Length];
try
{
    for (int i = 0; i < fss.Length; i++)
    {
        fss[i] = File.OpenRead(inputs[i]);
    }
    // We use Guid for pipeNames
    var pipeNames = Array.ConvertAll(inputs, x => Guid.NewGuid().ToString("N"));
    for (int i = 0; i < npsss.Length; i++)
    {
        npsss[i] = new NamedPipeServerStream(pipeNames[i], PipeDirection.Out, 1, PipeTransmissionMode.Byte, PipeOptions.Asynchronous);
    }
    string pipeNamesFFmpeg = string.Join(" ", pipeNames.Select(x => $@"-i \\.\pipe\{x}"));
    using (var proc = new Process
    {
        StartInfo = new ProcessStartInfo
        {
            FileName = pathToFFmpeg,
            Arguments = $@"-loglevel debug -y {pipeNamesFFmpeg} -c:v copy -c:a copy ""{output}""",
            UseShellExecute = false,
        }
    })
    {
        Console.WriteLine($"FFMpeg path: {pathToFFmpeg}");
        Console.WriteLine($"Arguments: {proc.StartInfo.Arguments}");
        proc.EnableRaisingEvents = false;
        proc.Start();
        var tasks = new Task[npsss.Length];
        for (int i = 0; i < npsss.Length; i++)
        {
            var pipe = npsss[i];
            var fs = fss[i];
            pipe.WaitForConnection();
            tasks[i] = fs.CopyToAsync(pipe)
                // .ContinueWith(_ => pipe.FlushAsync()) // Flush does nothing on Pipes
                .ContinueWith(x => {
                    pipe.WaitForPipeDrain();
                    pipe.Disconnect();
                });
        }
        Task.WaitAll(tasks);
        proc.WaitForExit();
    }
}
finally
{
    foreach (var fs in fss)
    {
        fs?.Dispose();
    }
    foreach (var npss in npsss)
    {
        npss?.Dispose();
    }
}
A few points to watch:
Not all formats are compatible with pipes. Many .mp4 files aren't, for example, because their moov atom sits towards the end of the file while ffmpeg needs it immediately, and pipes aren't seekable (ffmpeg can't jump to the end of the pipe, read the moov atom and then go back to the beginning). See here for an example.
I receive an error at the end of the streaming. The output file seems to be correct, and I don't know why the error appears. Other people have reported it, but I haven't seen any explanation:
\\.\pipe\55afc0c8e95f4a4c9cec5ae492bc518a: Invalid argument
\\.\pipe\92205c79c26a410aa46b9b35eb3bbff6: Invalid argument
I don't normally use Task and async, so I'm not 100% sure that what I wrote is correct. This code doesn't work, for example:
tasks[i] = pipe.WaitForConnectionAsync().ContinueWith(x => fs.CopyToAsync(pipe, 4096)).ContinueWith(...);
Hmm, perhaps that last one can be solved:
tasks[i] = ConnectAndCopyToPipe(fs, pipe);
and then
public static async Task ConnectAndCopyToPipe(FileStream fs, NamedPipeServerStream pipe)
{
    await pipe.WaitForConnectionAsync();
    await fs.CopyToAsync(pipe);
    // await fs.FlushAsync(); // Does nothing
    pipe.WaitForPipeDrain();
    pipe.Disconnect();
}
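A likely reason the chained `ContinueWith` version misbehaves is that a continuation whose lambda returns a Task produces a `Task<Task>`: the outer task finishes as soon as the inner copy has been started, so the next continuation can run before the copy is done. The sketch below illustrates this with a stand-in for the copy; `ContinueWithSketch` and `SlowCopyAsync` are hypothetical names and no pipes are involved.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinueWithSketch
{
    // Stand-in for an asynchronous copy: the flag flips only when the work is really done.
    private static async Task SlowCopyAsync(bool[] done)
    {
        await Task.Delay(200);
        done[0] = true;
    }

    // The lambda returns a Task, so ContinueWith yields Task<Task>.
    // Waiting on the outer task only waits until the inner task was started.
    public static bool FinishedWithoutUnwrap()
    {
        var done = new bool[1];
        Task<Task> outer = Task.CompletedTask.ContinueWith(_ => SlowCopyAsync(done));
        outer.Wait();
        return done[0];
    }

    // Unwrap() returns a proxy task that completes when the inner task completes.
    public static bool FinishedWithUnwrap()
    {
        var done = new bool[1];
        Task inner = Task.CompletedTask.ContinueWith(_ => SlowCopyAsync(done)).Unwrap();
        inner.Wait();
        return done[0];
    }
}
```

`FinishedWithoutUnwrap()` is expected to return false and `FinishedWithUnwrap()` true. The async/await version in `ConnectAndCopyToPipe` avoids the issue altogether, because each `await` waits for the inner operation.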

Properly using multiple resources with Rx

I need to use multiple disposable resources with Rx. This is how I have nested the Observable.Using statements (the inner source is just for testing).
var obs = Observable.Using(
() => new FileStream("file.txt", FileMode.Open),
fs => Observable.Using(
() => new StreamReader(fs),
sr => Observable.Create<string>(o =>
TaskPoolScheduler.Default.ScheduleAsync(async (sch, ct) =>
{
while (!ct.IsCancellationRequested)
{
var s = await sr.ReadLineAsync().ConfigureAwait(false);
if (s is null) break;
o.OnNext(s);
}
o.OnCompleted();
}))));
obs.Subscribe(Console.WriteLine);
Is there a more concise way to use multiple disposable resources?
I can't think of a general way to use an unlimited number of resources, but at least you could make helper methods for the common cases of 2-3 resources. Here is an implementation for two:
public static IObservable<TResult> Using<TResult, TResource1, TResource2>(
    Func<TResource1> resourceFactory1,
    Func<TResource1, TResource2> resourceFactory2,
    Func<TResource1, TResource2, IObservable<TResult>> observableFactory)
    where TResource1 : IDisposable
    where TResource2 : IDisposable
{
    return Observable.Using(resourceFactory1, resource1 => Observable.Using(
        () => resourceFactory2(resource1),
        resource2 => observableFactory(resource1, resource2)));
}
Usage example:
var obs = Using(
    () => new FileStream("file.txt", FileMode.Open),
    (fs) => new StreamReader(fs),
    (fs, sr) => Observable.Create<string>(o =>
        TaskPoolScheduler.Default.ScheduleAsync(async (sch, ct) =>
        {
            while (!ct.IsCancellationRequested)
            {
                var s = await sr.ReadLineAsync().ConfigureAwait(false);
                if (s is null) break;
                o.OnNext(s);
            }
            o.OnCompleted();
        })));
Usually when I have a set of resources to be used in a cold observable, I stick with the create pattern which is much easier to read.
var obs = Observable.Create<string>(observer =>
{
    var fs = new FileStream("file.txt", FileMode.Open);
    var sr = new StreamReader(fs);
    var task = TaskPoolScheduler.Default.ScheduleAsync(async (_, ct) =>
    {
        while (!ct.IsCancellationRequested)
        {
            var s = await sr.ReadLineAsync().ConfigureAwait(false);
            if (s is null) break;
            observer.OnNext(s);
        }
        observer.OnCompleted();
    });
    return new CompositeDisposable(task, fs, sr);
});
I'd just continue to use the double Using pattern, but I'd clean up the internals of how you're reading the lines. Try this:
var obs =
    Observable.Using(() => new FileStream("file.txt", FileMode.Open),
        fs => Observable.Using(() => new StreamReader(fs),
            sr =>
                Observable
                    .Defer(() => Observable.FromAsync(() => sr.ReadLineAsync()))
                    .Repeat()
                    .TakeWhile(x => x != null)));
You should try to avoid Observable.Create where possible.
Cleaner observable queries mean that the Using operators don't look so out of place.
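Another way to keep a single `Observable.Using`, sketched here as an assumption and not taken from any of the answers, is to bundle the resources into one disposable type. `LineSource` and `UsingSketch.ReadLines` are hypothetical names; the line-reading query is the `FromAsync`/`Repeat`/`TakeWhile` pattern shown above, and the System.Reactive package is assumed.

```csharp
using System;
using System.IO;
using System.Reactive.Linq;

// Hypothetical aggregate: owns both resources and disposes them together.
public sealed class LineSource : IDisposable
{
    public FileStream Stream { get; }
    public StreamReader Reader { get; }

    public LineSource(string path)
    {
        Stream = new FileStream(path, FileMode.Open, FileAccess.Read);
        Reader = new StreamReader(Stream);
    }

    public void Dispose()
    {
        // Reader first, then the underlying stream.
        Reader.Dispose();
        Stream.Dispose();
    }
}

public static class UsingSketch
{
    // One Using call; the aggregate is created per subscription and
    // disposed when the sequence ends or the subscriber unsubscribes.
    public static IObservable<string> ReadLines(string path) =>
        Observable.Using(
            () => new LineSource(path),
            src => Observable
                .FromAsync(() => src.Reader.ReadLineAsync())
                .Repeat()                          // read the next line after each result
                .TakeWhile(line => line != null)); // null marks end of file
}
```

A call such as `UsingSketch.ReadLines("file.txt").Subscribe(Console.WriteLine)` then reads like the original. The trade-off is one small class per resource combination in exchange for flat query code.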

Execution Timeout Expired errors using Entity Framework

My code below starts out great and seems to run pretty quickly, but after a while I start getting this error message. I realize that Entity Framework's DbContext is not thread safe, and that is probably causing the issue. Is there a way to change this code so that it keeps the same performance without leaving threads unclosed (which is probably the cause), or is there another way to speed up this process? I have over 9000 symbols to download and insert into a database. I tried basic for loops with await, but that was extremely slow and took more than 10 times longer to achieve the same results.
public static async Task startInitialMarketSymbolsDownload(string market)
{
try
{
List<string> symbolList = new List<string>();
symbolList = getStockSymbols(market);
var historicalGroups = symbolList.Select((x, i) => new { x, i })
.GroupBy(x => x.i / 50)
.Select(g => g.Select(x => x.x).ToArray());
await Task.WhenAll(historicalGroups.Select(g => Task.Run(() => getLocalHistoricalStockData(g, market))));
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
public static async Task getLocalHistoricalStockData(string[] symbols, string market)
{
// download data for list of symbols and then upload to db tables
string symbolInfo = null;
try
{
using (financeEntities context = new financeEntities())
{
foreach (string symbol in symbols)
{
symbolInfo = symbol;
List<HistoryPrice> hList = Get(symbol, new DateTime(1900, 1, 1), DateTime.UtcNow);
var backDates = context.DailyAmexDatas.Where(c => c.Symbol == symbol).Select(d => d.Date).ToList();
List<HistoryPrice> newHList = hList.Where(c => backDates.Contains(c.Date) == false).ToList<HistoryPrice>();
foreach (HistoryPrice h in newHList)
{
DailyAmexData amexData = new DailyAmexData();
// set the data then add it
amexData.Symbol = symbol;
amexData.Open = h.Open;
amexData.High = h.High;
amexData.Low = h.Low;
amexData.Close = h.Close;
amexData.Volume = h.Volume;
amexData.AdjustedClose = h.AdjClose;
amexData.Date = h.Date;
context.DailyAmexDatas.Add(amexData);
}
// now save everything
await context.SaveChangesAsync();
Console.WriteLine(symbol + " added to the " + market + " database!");
}
}
}
catch (Exception ex)
{
Console.WriteLine(ex.InnerException.Message);
}
}
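The post shows no fix. One direction that may be worth trying, offered as an assumption and not as a diagnosis: with over 9000 symbols in groups of 50, the code above starts at least 180 batches at once, each with its own context and database connection. The sketch below caps how many batches run together using `SemaphoreSlim`. `ThrottleSketch` and `RunThrottledAsync` are hypothetical names, and the returned peak count exists only so the behaviour can be checked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs `work` for every batch, but never more than `maxParallel` batches at a time.
    // Returns the highest number of batches that were observed running together.
    public static async Task<int> RunThrottledAsync<T>(
        IEnumerable<T> batches, int maxParallel, Func<T, Task> work)
    {
        using var gate = new SemaphoreSlim(maxParallel);
        int running = 0, peak = 0;
        var sync = new object();

        var tasks = batches.Select(async batch =>
        {
            await gate.WaitAsync();            // wait for a free slot
            try
            {
                lock (sync) { running++; peak = Math.Max(peak, running); }
                await work(batch);
            }
            finally
            {
                lock (sync) { running--; }
                gate.Release();                // hand the slot to the next batch
            }
        }).ToList();                           // ToList starts all the tasks

        await Task.WhenAll(tasks);
        return peak;
    }
}
```

In the code above this could look like `await ThrottleSketch.RunThrottledAsync(historicalGroups, 8, g => getLocalHistoricalStockData(g, market));`, where 8 is an arbitrary starting value to tune against the database.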

Polling SSIS execution status

I have an SSIS package that's launching another SSIS package in a Foreach container; because the container reports completion as soon as it launched all the packages it had to launch, I need a way to make it wait until all "child" packages have completed.
So I implemented a little sleep-wait loop that basically pulls the Execution objects off the SSISDB for the IDs I'm interested in.
The problem I'm facing, is that a grand total of 0 Dts.Events.FireProgress events get fired, and if I uncomment the Dts.Events.FireInformation call in the do loop, then every second I get a message reported saying 23 packages are still running... except if I check in SSISDB's Active Operations window I see that most have completed already and 3 or 4 are actually running.
What am I doing wrong, why wouldn't runningCount contain the number of actually running executions?
using ssis = Microsoft.SqlServer.Management.IntegrationServices;
public void Main()
{
const string serverName = "REDACTED";
const string catalogName = "SSISDB";
var ssisConnectionString = $"Data Source={serverName};Initial Catalog=msdb;Integrated Security=SSPI;";
var ids = GetExecutionIDs(serverName);
var idCount = ids.Count();
var previousCount = -1;
var iterations = 0;
try
{
var fireAgain = true;
const int secondsToSleep = 1;
var sleepTime = TimeSpan.FromSeconds(secondsToSleep);
var maxIterations = TimeSpan.FromHours(1).TotalSeconds / sleepTime.TotalSeconds;
IDictionary<long, ssis.Operation.ServerOperationStatus> catalogExecutions;
using (var connection = new SqlConnection(ssisConnectionString))
{
var server = new ssis.IntegrationServices(connection);
var catalog = server.Catalogs[catalogName];
do
{
catalogExecutions = catalog.Executions
.Where(execution => ids.Contains(execution.Id))
.ToDictionary(execution => execution.Id, execution => execution.Status);
var runningCount = catalogExecutions.Count(kvp => kvp.Value == ssis.Operation.ServerOperationStatus.Running);
System.Threading.Thread.Sleep(sleepTime);
//Dts.Events.FireInformation(0, "ScriptMain", $"{runningCount} packages still running.", string.Empty, 0, ref fireAgain);
if (runningCount != previousCount)
{
previousCount = runningCount;
decimal completed = idCount - runningCount;
decimal percentCompleted = completed / idCount;
Dts.Events.FireProgress($"Waiting... {completed}/{idCount} completed", Convert.ToInt32(100 * percentCompleted), 0, 0, "", ref fireAgain);
}
iterations++;
if (iterations >= maxIterations)
{
Dts.Events.FireWarning(0, "ScriptMain", $"Timeout expired, requesting cancellation.", string.Empty, 0);
Dts.Events.FireQueryCancel();
Dts.TaskResult = (int)Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Canceled;
return;
}
}
while (catalogExecutions.Any(kvp => kvp.Value == ssis.Operation.ServerOperationStatus.Running));
}
}
catch (Exception exception)
{
if (exception.InnerException != null)
{
Dts.Events.FireError(0, "ScriptMain", exception.InnerException.ToString(), string.Empty, 0);
}
Dts.Events.FireError(0, "ScriptMain", exception.ToString(), string.Empty, 0);
Dts.Log(exception.ToString(), 0, new byte[0]);
Dts.TaskResult = (int)ScriptResults.Failure;
return;
}
Dts.TaskResult = (int)ScriptResults.Success;
}
The GetExecutionIDs function simply returns all execution IDs for the child packages from my metadata database.
The problem is that you're re-using the same connection at every iteration. Turn this:
using (var connection = new SqlConnection(ssisConnectionString))
{
    var server = new ssis.IntegrationServices(connection);
    var catalog = server.Catalogs[catalogName];
    do
    {
        catalogExecutions = catalog.Executions
            .Where(execution => ids.Contains(execution.Id))
            .ToDictionary(execution => execution.Id, execution => execution.Status);
Into this:
do
{
    using (var connection = new SqlConnection(ssisConnectionString))
    {
        var server = new ssis.IntegrationServices(connection);
        var catalog = server.Catalogs[catalogName];
        catalogExecutions = catalog.Executions
            .Where(execution => ids.Contains(execution.Id))
            .ToDictionary(execution => execution.Id, execution => execution.Status);
    }
And you'll get the correct execution status every time. I'm not sure why the connection can't be reused, but keeping connections as short-lived as possible is always a good idea, and this is further proof.
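The shape of the fixed loop can be sketched independently of the SSIS object model. In the sketch below, `PollingSketch` and `WaitUntilIdle` are hypothetical names, and the `countRunning` delegate stands for "open a new connection, query the catalog, count the running executions, dispose the connection".

```csharp
using System;
using System.Threading;

public static class PollingSketch
{
    // Calls `countRunning` once per iteration until it returns 0 or the budget is spent.
    // The delegate is expected to open and dispose its own connection on every call,
    // so each reading is taken through a new connection, as the answer suggests.
    public static bool WaitUntilIdle(Func<int> countRunning, TimeSpan interval, int maxIterations)
    {
        for (var i = 0; i < maxIterations; i++)
        {
            if (countRunning() == 0)
            {
                return true;      // nothing is running any more
            }
            Thread.Sleep(interval);
        }
        return false;             // budget exhausted: treat as a timeout
    }
}
```

The caller decides what a timeout means (in the script task above, firing a warning and setting the Canceled result), which keeps the loop itself free of reporting code.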

How to observe dependent events in Reactive Extensions (Rx)?

What is the best way to handle dependent events such as the following?
There is an object for which I need to test whether the connection succeeded or failed.
But the object first needs to pass an initialization step, which I test for success or failure before continuing to the connection step.
If initialization fails, the result is "connection failed".
If initialization succeeds, the result is the result of the connection step.
My code is below. Is there a better way to handle these dependent events? As written, I'm subscribing to the connection inside the initialization subscription.
If I have more dependent events like this, will I have to keep nesting the subscriptions?
public static void Test()
{
const int maxValue = 501;
var random = new Random(BitConverter.ToInt32(Guid.NewGuid().ToByteArray(), 0));
var initOk = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => true);
var initKo = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => false);
var connectOk = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => true);
var connectKo = Observable.Interval(TimeSpan.FromMilliseconds(random.Next(maxValue))).Select(i => false);
var initResult = initOk.Amb(initKo).Take(1);
var connectResult = connectOk.Amb(connectKo).Take(1);
var id =
initResult.Subscribe(ir =>
{
if (ir)
{
var cd =
connectResult.Subscribe(cr =>
{
Console.WriteLine(cr
? "Connection succeeded."
: "Connection failed.");
});
}
else
{
Console.WriteLine("Initialization failed thus connection failed.");
}
});
}
You can normally avoid nesting by using Rx operators to chain the calls.
Your example could be tidied up using SelectMany:
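initResult.SelectMany(ir =>
{
    // ir is a bool, so test it directly (a comparison with null is always true)
    if (ir)
    {
        return connectResult;
    }
    Console.WriteLine("Initialization failed thus connection failed.");
    // Throw<bool> keeps both branches typed as IObservable<bool>
    return Observable.Throw<bool>(new Exception("Some Exception"));
})
.Subscribe(
    cr => Console.WriteLine(cr ? "Connection succeeded." : "Connection failed."),
    ex => { /* the initialization failure was already reported above */ });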
You could use this:
var finalResult =
    initResult
        .Select(ir =>
            Observable.If(() => ir, connectResult, Observable.Return(false)))
        .Merge();
To get your messages out, you could change it like this:
var initResultText =
    initResult
        .Select(ir =>
            ir ? (string)null : "Initialization failed thus connection failed.");
var connectResultText =
    connectResult
        .Select(cr =>
            String.Format("Connection {0}.", cr ? "succeeded" : "failed"));
var finalResult =
    initResultText
        .Select(irt =>
            Observable.If(() =>
                irt == null, connectResultText, Observable.Return(irt)))
        .Merge();
If you need to nest further than this, consider writing an extension method that wraps up the complexity; composition then becomes much easier.
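Such an extension method might look like the following minimal sketch, assuming the System.Reactive package. `Then` is a hypothetical name, and it assumes every step reports a single bool the way `initResult` and `connectResult` do.

```csharp
using System;
using System.Reactive.Linq;

public static class DependentStepsSketch
{
    // Runs `next` only if `step` reported success; otherwise the failure (false)
    // is passed through and `next` is never subscribed to.
    public static IObservable<bool> Then(this IObservable<bool> step, IObservable<bool> next) =>
        step.SelectMany(ok => ok ? next : Observable.Return(false));
}
```

With it, the example becomes `initResult.Then(connectResult).Subscribe(...)`, and a third step chains as `initResult.Then(connectResult).Then(loginResult)` (where `loginResult` is an imagined further step) without any nested subscriptions.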
