I am having trouble getting my head around how I would support timeseries/temporal data in DDD and how it would be handled on the write side using CQRS. Ultimately I would like to find a solution that also plays nice with event sourcing.
Using temperature forecasts as an example, a change in temperature could also affect the forecast energy demand for a region/location. Since temperature forecasts can stretch far into the future (based on historic data), I think loading all the forecasts into a Location aggregate would be impractical without applying some limit to the amount of data loaded.
What is a good/recommended approach for synchronising/storing this kind of data to be used on the write side in CQRS when keeping event sourcing in mind?
Are any of my attempts below (Option A or B) considered as suitable DDD/CQRS solutions?
Option A:
Allow temperature to be updated independently and subscribe to the resulting events using a process manager/saga that recalculates the demand. This solution keeps the aggregate size small; however, it feels like the aggregate boundary could be wrong, since demand depends on temperature but the behaviour is now spread across commands/events.
// OverrideTemperatureForecastCommandHandler.cs
public void Handle(OverrideTemperatureForecast cmd)
{
var from = cmd.TemperatureOverrides.Min(t => t.DateTime);
var to = cmd.TemperatureOverrides.Max(t => t.DateTime);
TemperatureForecasts forecasts = temperatureForecastRepository.GetByLocation(cmd.LocationId, from, to);
forecasts.Override(cmd.TemperatureOverrides);
temperatureForecastRepository.Save(forecasts);
// raises
// TemperatureForecastsOverridden(locationId, overrides)
}
// TemperatureForecastsOverriddenProcessManager.cs
public void Handle(TemperatureForecastsOverridden @event)
{
var from = @event.Overrides.Min(t => t.DateTime);
var to = @event.Overrides.Max(t => t.DateTime);
// issue a command to recalculate the energy demand now that temperature has changed...
commandBus.Send(new RecalculateEnergyDemand
{
LocationId = @event.LocationId,
From = from,
To = to
});
}
// RecalculateEnergyDemandCommandHandler.cs
public void Handle(RecalculateEnergyDemand cmd)
{
EnergyDemand demandForecasts = energyDemandForecastRepository.GetByLocation(cmd.LocationId, cmd.From, cmd.To);
// have to fetch temperature forecasts again...
TemperatureForecasts temperatureForecasts = temperatureForecastRepository.GetByLocation(cmd.LocationId, cmd.From, cmd.To);
demandForecasts.AdjustForTemperature(temperatureForecasts);
energyDemandForecastRepository.Save(demandForecasts);
// raises
// ForecastDemandChanged(locationId, demandforecasts)
}
Option B:
Create a single aggregate 'Location' and pre-load forecast data internally based on a given date range. This feels cleaner from a DDD behaviour perspective, however loading an aggregate constrained to a time range feels a bit awkward to me (or is it just me?). Without limiting the size of the forecast values, the 'Location' aggregate could get huge.
// OverrideTemperatureForecastCommandHandler.cs
public void Handle(OverrideTemperatureForecast cmd)
{
var from = cmd.TemperatureOverrides.Min(t => t.DateTime);
var to = cmd.TemperatureOverrides.Max(t => t.DateTime);
// use from/to to limit internally the range of temperature and demand forecasts that get loaded into the aggregate.
Location location = locationRepository.Get(cmd.LocationId, from, to);
location.OverrideTemperatureForecasts(cmd.TemperatureOverrides);
locationRepository.Save(location);
// raises
// TemperatureForecastsOverridden(locationId, overrides)
// ForecastDemandChanged(locationId, demandforecasts)
}
For either option A or B, denormalisers on the read side could look something like:
// TemperatureDenormaliser.cs
public void Handle(TemperatureForecastsOverridden @event)
{
var from = @event.Overrides.Min(t => t.DateTime);
var to = @event.Overrides.Max(t => t.DateTime);
var temperatureDTOs = storage.GetByLocation(@event.LocationId, from, to);
// TODO ... (Add or update)
storage.Save(temperatureDTOs);
}
// EnergyDemandDenormalizer.cs
public void Handle(ForecastDemandChanged @event)
{
var from = @event.DemandForecasts.Min(t => t.DateTime);
var to = @event.DemandForecasts.Max(t => t.DateTime);
var demandDTOs = storage.GetByLocation(@event.LocationId, from, to);
// TODO ... (Add or update)
storage.Save(demandDTOs);
}
Event sourcing would not be an option with either of your examples.
As new events come in, the older ones become irrelevant. These do not necessarily need to be in one aggregate; there are no invariants to protect across the whole history of readings.
A series of events could instead be managed in a saga that keeps only a limited amount of knowledge and cascades into result events.
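As a rough sketch of that idea (IEventBus, TemperatureReading, DemandForecast and EnergyDemandRecalculated are illustrative names here, not part of your model):
// DemandRecalculationSaga.cs -- a minimal sketch, not a definitive implementation
using System.Collections.Generic;
using System.Linq;

public class DemandRecalculationSaga
{
    // keep only a bounded window of recent readings; older ones fall away
    private readonly Queue<TemperatureReading> window = new Queue<TemperatureReading>();
    private readonly int maxWindowSize;
    private readonly IEventBus eventBus;

    public DemandRecalculationSaga(IEventBus eventBus, int maxWindowSize)
    {
        this.eventBus = eventBus;
        this.maxWindowSize = maxWindowSize;
    }

    public void Handle(TemperatureForecastsOverridden @event)
    {
        foreach (var reading in @event.Overrides)
        {
            window.Enqueue(reading);
            if (window.Count > maxWindowSize)
                window.Dequeue(); // older readings become irrelevant
        }
        // cascade into a result event instead of loading one big aggregate
        var demand = window.Select(r => CalculateDemand(r)).ToList();
        eventBus.Publish(new EnergyDemandRecalculated(@event.LocationId, demand));
    }

    private DemandForecast CalculateDemand(TemperatureReading reading)
    {
        // domain-specific temperature-to-demand calculation goes here
        throw new System.NotImplementedException();
    }
}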
Related
How do I get the number of students in this school at any given point in time using the Rx idiom and without having to maintain state in the School class myself?
using System;
using System.Reactive.Subjects;
using System.Threading; // for Interlocked.Increment
namespace SchoolManagementSystem
{
public class School
{
private ISubject<Student> _subject = null;
private int _maxNumberOfSeats;
private int _numberOfStudentsAdmitted;
public string Name { get; set; }
public School(string name, int maxNumberOfSeats)
{
Name = name;
_maxNumberOfSeats = maxNumberOfSeats;
_numberOfStudentsAdmitted = 0;
_subject = new ReplaySubject<Student>();
}
public void AdmitStudent(Student student)
{
try
{
if (student == null)
throw new ArgumentNullException("student");
if (_numberOfStudentsAdmitted == _maxNumberOfSeats)
{
_subject.OnCompleted();
}
// Obviously can't do this because this will
// create a kind of dead lock in that it will
// wait for the _subject to complete, but I am
// using the same _subject to issue notifications.
// _numberOfStudentsAdmitted = _subject.Count().Wait();
// OR to keep track of state myself
Interlocked.Increment(ref _numberOfStudentsAdmitted);
_subject?.OnNext(student);
}
catch(Exception ex)
{
_subject.OnError(ex);
}
}
public IObservable<Student> Students
{
get
{
return _subject;
}
}
}
}
Or is this just not in keeping with the principles of components designed using Rx?
Is this something that should be the responsibility of the client (to get the count and perform all side effects in the OnNext handler)? Should observables simply act as stateless signal sources or gates, much like the hardware interrupt routines that simply signal to the CPU that something of interest has happened?
In that case, we lose the criteria for the observable to signal completion. How then is it supposed to know when to complete?
You can use the Count() method on your _subject sequence. It will itself create an observable sequence where each value produced represents the latest total number of students in _subject.
You could then react to this sequence of student count values. The Zip() operation could be useful in that regard, since it has the advantage of completing the resulting sequence when any of its inner sequences completes, which you can force with a TakeWhile.
The result looks something like this
Observable.Zip(
_subject.Select(student => student ?? throw new ArgumentNullException(nameof(student))),
_subject.Count().TakeWhile(studentCount => studentCount < _maxNumberOfSeats),
(student, count) => student
);
All that would be left to do in the AdmitStudent method body would simply be to push any new student to the sequence with _subject?.OnNext(student) (like you already do), but without the extra logic. You could also modify this a bit to make sure that _subject itself also gets completed once the maximum student count is reached, but I'm not certain about your business rules, so I'll leave that for you to decide.
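With that in place, AdmitStudent would shrink to something like this sketch:
public void AdmitStudent(Student student)
{
    // null checks and capacity tracking now live in the composed
    // sequence above, so we only push the new student
    _subject?.OnNext(student);
}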
One last thing I can recommend is to play with the extensions for Rx types and to have a look around this website, which uses them liberally.
I am currently using the Change Notifications in Active Directory Domain Services in .NET as described in this blog. This will return all events that happen on a selected object (or in the subtree of that object). I now want to filter the list of events for creation and deletion (and maybe undeletion) events.
I would like to tell the ChangeNotifier class to only observe create-/delete-/undelete-events. The other solution is to receive all events and filter them on my side. I know that in case of the deletion of an object, the attribute list that is returned will contain the attribute isDeleted with the value True. But is there a way to see if the event represents the creation of an object? In my tests the value for usnchanged is always usncreated+1 in the case of user objects, and both are equal for OUs, but can this be assured in high-frequency ADs? It is also possible to compare the changed and modified timestamps. And how can I tell if an object has been undeleted?
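To make the filter-on-my-side option concrete, the checks I have in mind would look roughly like this (using the heuristics described above, untested against high-frequency ADs; attribute values may also come back as byte[] rather than string):
static bool IsDeletion(SearchResultEntry entry)
{
    // deleted objects are returned with isDeleted = TRUE
    return entry.Attributes.Contains("isDeleted")
        && string.Equals(entry.Attributes["isDeleted"][0] as string, "TRUE",
                         StringComparison.OrdinalIgnoreCase);
}

static bool LooksLikeCreation(SearchResultEntry entry)
{
    // heuristic only: usnChanged == usnCreated for OUs, usnCreated + 1 for users
    long created = long.Parse((string)entry.Attributes["uSNCreated"][0]);
    long changed = long.Parse((string)entry.Attributes["uSNChanged"][0]);
    return changed == created || changed == created + 1;
}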
Just for the record, here is the main part of the code from the blog:
using System;
using System.Collections.Generic;
using System.DirectoryServices.Protocols;
public class ChangeNotifier : IDisposable
{
static void Main(string[] args)
{
using (LdapConnection connect = CreateConnection("localhost"))
{
using (ChangeNotifier notifier = new ChangeNotifier(connect))
{
//register some objects for notifications (limit 5)
notifier.Register("dc=dunnry,dc=net", SearchScope.OneLevel);
notifier.Register("cn=testuser1,ou=users,dc=dunnry,dc=net", SearchScope.Base);
notifier.ObjectChanged += new EventHandler<ObjectChangedEventArgs>(notifier_ObjectChanged);
Console.WriteLine("Waiting for changes...");
Console.WriteLine();
Console.ReadLine();
}
}
}
static void notifier_ObjectChanged(object sender, ObjectChangedEventArgs e)
{
Console.WriteLine(e.Result.DistinguishedName);
foreach (string attrib in e.Result.Attributes.AttributeNames)
{
foreach (var item in e.Result.Attributes[attrib].GetValues(typeof(string)))
{
Console.WriteLine("\t{0}: {1}", attrib, item);
}
}
Console.WriteLine();
Console.WriteLine("====================");
Console.WriteLine();
}
LdapConnection _connection;
HashSet<IAsyncResult> _results = new HashSet<IAsyncResult>();
public ChangeNotifier(LdapConnection connection)
{
_connection = connection;
_connection.AutoBind = true;
}
public void Register(string dn, SearchScope scope)
{
SearchRequest request = new SearchRequest(
dn, //root the search here
"(objectClass=*)", //very inclusive
scope, //any scope works
null //we are interested in all attributes
);
//register our search
request.Controls.Add(new DirectoryNotificationControl());
//we will send this async and register our callback
//note how we would like to have partial results
IAsyncResult result = _connection.BeginSendRequest(
request,
TimeSpan.FromDays(1), //set timeout to a day...
PartialResultProcessing.ReturnPartialResultsAndNotifyCallback,
Notify,
request
);
//store the hash for disposal later
_results.Add(result);
}
private void Notify(IAsyncResult result)
{
//since our search is long running, we don't want to use EndSendRequest
PartialResultsCollection prc = _connection.GetPartialResults(result);
foreach (SearchResultEntry entry in prc)
{
OnObjectChanged(new ObjectChangedEventArgs(entry));
}
}
private void OnObjectChanged(ObjectChangedEventArgs args)
{
if (ObjectChanged != null)
{
ObjectChanged(this, args);
}
}
public event EventHandler<ObjectChangedEventArgs> ObjectChanged;
#region IDisposable Members
public void Dispose()
{
foreach (var result in _results)
{
//end each async search
_connection.Abort(result);
}
}
#endregion
}
public class ObjectChangedEventArgs : EventArgs
{
public ObjectChangedEventArgs(SearchResultEntry entry)
{
Result = entry;
}
public SearchResultEntry Result { get; set; }
}
I participated in a design review about five years back on a project that started out using AD change notification. Very similar questions to yours were asked. I can share what I remember, and I don't think things have changed much since then. We ended up switching to DirSync.
It didn't seem possible to get just creates & deletes from AD change notifications. We also found that monitoring a large directory produced enough events that notification processing could bottleneck and fall behind. This API is not designed for scale, but as I recall performance/latency were not the primary reason we switched.
Yes, the usn relationship for new objects generally holds, although I think there are multi-DC scenarios where you can get usncreated == usnchanged for a new user. We didn't test that extensively, because...
The important thing for us was that change notification only gives you reliable object-creation detection under the unrealistic assumption that your machine is up 100% of the time! In production systems there are always cases where you need to reboot and catch up or re-synchronize, and we switched to DirSync because it has a robust way to handle those scenarios.
In our case a missed object create could block email to a new user for an indeterminate time, which obviously wouldn't be good; we needed to be sure. For AD change notifications, getting that resync right would have taken more work and been hard to test. With DirSync it's more natural, and there's a fast-path resume mechanism that usually avoids a full resync. For safety I think we triggered a full re-synchronize every day.
DirSync is not as real-time as change notification, but it's possible to get ~30-second average latency by issuing the DirSync query once a minute.
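For reference, the core of a DirSync pass with System.DirectoryServices.Protocols looks roughly like this sketch (the base DN is illustrative; error handling and persisting the cookie across restarts are omitted):
static byte[] RunDirSyncPass(LdapConnection connection, byte[] cookie)
{
    // DirSync must be rooted at a naming context head
    SearchRequest request = new SearchRequest(
        "dc=example,dc=com", "(objectClass=*)", SearchScope.Subtree, null);
    request.Controls.Add(new DirSyncRequestControl(cookie));

    SearchResponse response = (SearchResponse)connection.SendRequest(request);
    foreach (SearchResultEntry entry in response.Entries)
    {
        // only changed objects (and only their changed attributes) come back
        Console.WriteLine(entry.DistinguishedName);
    }

    // hand the new cookie to the next pass (e.g. run once a minute)
    foreach (DirectoryControl control in response.Controls)
    {
        if (control is DirSyncResponseControl dirSync)
            return dirSync.Cookie;
    }
    return cookie;
}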
I'm trying to expose an observable sequence that gives observers all existing records in a database table plus any future items. For the sake of argument, let's say it's log entries. Therefore, I'd have something like this:
public class LogService
{
private readonly Subject<LogEntry> entries;
public LogService()
{
this.entries = new Subject<LogEntry>();
this.entries
.Buffer(...)
.Subscribe(async x => await WriteLogEntriesToDatabaseAsync(x));
}
public IObservable<LogEntry> Entries
{
get { return this.entries; }
}
public IObservable<LogEntry> AllLogEntries
{
get
{
// how the heck?
}
}
public void Log(string message)
{
this.entries.OnNext(new LogEntry(message));
}
private async Task<IEnumerable<LogEntry>> GetLogEntriesAsync()
{
// reads existing entries from DB table and returns them
}
private async Task WriteLogEntriesToDatabaseAsync(IList<LogEntry> entries)
{
// writes entries to the database
}
}
My initial thought for the implementation of AllLogEntries was something like this:
return Observable.Create<LogEntry>(
async observer =>
{
var existingEntries = await this.GetLogEntriesAsync();
foreach (var existingEntry in existingEntries)
{
observer.OnNext(existingEntry);
}
return this.entries.Subscribe(observer);
});
But the problem with this is that there could be log entries that have been buffered and not yet written to the database. Hence, those entries will be missed because they are not yet in the database and have already passed through the entries observable.
My next thought was to separate the buffered entries from the non-buffered and use the buffered when implementing AllLogEntries:
return Observable.Create<LogEntry>(
async observer =>
{
var existingEntries = await this.GetLogEntriesAsync();
foreach (var existingEntry in existingEntries)
{
observer.OnNext(existingEntry);
}
return this.bufferedEntries
.SelectMany(x => x)
.Subscribe(observer);
});
There are two problems with this:
It means clients of AllLogEntries also have to wait for the buffer timespan to pass before they receive their log entries. I want them to see log entries instantaneously.
There is still a race condition in that log entries could be written to the database between the point at which I finish reading the existing ones and the point at which I return the future entries.
So my question is: how would I actually go about achieving my requirements here with no possibility of race conditions, and avoiding any major performance penalties?
To do this from the client code, you will probably have to implement a polling solution and then look for differences between calls. Combining
Observable.Interval() : http://rxwiki.wikidot.com/101samples#toc28 , and
Observable.DistinctUntilChanged()
will likely give you a sufficient solution.
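A rough sketch of that polling approach, assuming LogEntry exposes a monotonically increasing Id and GetLogEntriesAfterAsync is your own data-access method (both names invented here):
using System;
using System.Reactive.Linq;
using System.Reactive.Threading.Tasks;

public IObservable<LogEntry> PollLogEntries(TimeSpan interval)
{
    long lastSeenId = 0;
    return Observable
        .Interval(interval)
        .SelectMany(_ => GetLogEntriesAfterAsync(lastSeenId).ToObservable())
        .SelectMany(batch => batch)                  // flatten each polled batch
        .Do(entry => lastSeenId = entry.Id)          // remember the high-water mark
        .DistinctUntilChanged(entry => entry.Id);    // drop repeats between polls
}
Merging this polled stream with the live entries subject would then give subscribers both history and future items, at the cost of polling latency.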
Alternatively, I'd suggest you try to find a solution where the clients are notified when the DB/table is updated. In a web application, you could use something like SignalR to do this.
For example: http://techbrij.com/database-change-notifications-asp-net-signalr-sqldependency
If it's not a web application, a similar update mechanism via sockets may work.
See these links (these came from the accepted answer of SignalR polling database for updates):
http://xsockets.net/api/net-c#snippet61
https://github.com/codeplanner/XSocketsPollingLegacyDB
What is the most efficient way to sync documents in a RavenDB?
From an external source I get an IEnumerable of BlogPosts that I want to do the following with:
Add new objects that are new to RavenDB
Update existing objects
Remove objects that were removed in the external source
The code that needs implementation:
public void SyncIntoRaven(IEnumerable<BlogPost> postsToSync, IDocumentStore store) {
// TODO: Implement
// AddNewItems(postsToSync);
// TODO: Implement
// RemoveDeletedItems(postsToSync);
// TODO: Implement
// UpdateExistingItems(postsToSync);
}
One could just pull out all BlogPosts from RavenDB, sync locally, and then push all the changes back, but I want to minimize traffic to RavenDB. But maybe that's not the right approach either?
If you are sharing the same ID between your external source and RavenDB, you can do this quite easily, in an ACID fashion, and within one transaction.
Keep track of IDs that changed between sync operations, and once you have that list of IDs you can easily do this:
Open a session, add the new documents using session.Store(), load all the documents that need updating or deleting using session.Load(string[]) or session.Load().Lazily, make the updates (and the deletions using the Defer option), and once you are done call session.SaveChanges().
That should get you covered, and happen in only one roundtrip to the server.
Either way, you never want to do a complete sync every time. You always want to use deltas.
With the help of synhershko's description I figured it out and wanted to share the code, simplified to show the concepts.
private void RefreshBlogPosts(IDocumentSession session, IList<BlogPost> parsedPosts) {
var parsedPostsIds = parsedPosts.Select(x => x.Id).ToArray();
var storePosts = session.Load<BlogPost>(parsedPostsIds);
// Update existing or create new posts
for(int i = 0; i < storePosts.Count(); i++) {
var parsedPost = parsedPosts[i];
var storePost = storePosts[i];
if(storePost == null) {
storePost = parsedPost;
session.Store(storePost);
} else {
// Update post's properties
}
}
// Find IDs of posts in the store that are no longer in the external source
var removedPostIds = session.Query<BlogPost>().Select(x => x.Id)
.Where(postId => !parsedPostsIds.Contains(postId));
foreach(var removedPostId in removedPostIds) {
session.Advanced.Defer(new DeleteCommandData() { Key = removedPostId });
}
session.SaveChanges();
}
I have two streams of objects that each have a Timestamp value. Both streams are in order, so for example the timestamps might be Ta = 1,3,6,6,7 in one stream and Tb = 1,2,5,5,6,8 in the other. Objects in both streams are of the same type.
What I'd like to be able to do is to put each of these events on the bus in order of timestamp, i.e., put A1, then B1, B2, A3 and so on. Furthermore, since some streams have several (sequential) elements with the same timestamp, I want those elements grouped so that each new event is an array. So we would put [A3] on the bus, followed by [A15,A25] and so on.
I've tried to implement this by making two ConcurrentQueue structures, putting each event at the back of its queue, then looking at the front of each queue, choosing the earlier event first, and then traversing that queue so that all events with the same timestamp are taken together.
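The merge step of that attempt, simplified, looks something like this (Event here is my own type carrying a Timestamp; a single consumer thread is assumed):
using System.Collections.Concurrent;
using System.Collections.Generic;

IReadOnlyList<Event> DequeueNextGroup(ConcurrentQueue<Event> a, ConcurrentQueue<Event> b)
{
    a.TryPeek(out Event headA);
    b.TryPeek(out Event headB);
    // pick the queue whose head carries the earlier timestamp
    var source = headB == null || (headA != null && headA.Timestamp <= headB.Timestamp) ? a : b;

    var group = new List<Event>();
    if (!source.TryPeek(out Event first))
        return group;
    // drain every consecutive event sharing that timestamp
    var ts = first.Timestamp;
    while (source.TryPeek(out Event next) && next.Timestamp == ts && source.TryDequeue(out Event item))
        group.Add(item);
    return group;
}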
However, I've encountered two problems:
If I leave these queues unbounded, I quickly run out of memory as the read op is a lot faster than the handlers receiving the events. (I've got a few gigabytes of data).
I sometimes end up with a situation where I handle the event, say, A15 before A25 has arrived. I somehow need to guard against this.
I'm thinking that Rx can help in this regard but I don't see an obvious combinator(s) to make this possible. Thus, any advice is much appreciated.
Rx is indeed a good fit for this problem IMO.
IObservables can't 'OrderBy' for obvious reasons (you would have to observe the entire stream first to guarantee the correct output order), so my answer below makes the assumption (that you stated) that your 2 source event streams are in order.
It was an interesting problem in the end. The standard Rx operators are missing a GroupByUntilChanged that would have solved this easily, as long as it called OnCompleted on the previous group observable when the first element of the next group was observed. However, looking at the implementation of DistinctUntilChanged, it doesn't follow this pattern and only calls OnCompleted when the source observable completes (even though it knows there will be no more elements after the first non-distinct element... weird???). Anyway, for those reasons I decided against a GroupByUntilChanged method (to not break Rx conventions) and went instead for a ToEnumerableUntilChanged.
Disclaimer: This is my first Rx extension, so I would appreciate feedback on the choices made. Also, one main concern of mine is the anonymous observable holding on to the distinctElements list.
Firstly, your application code is quite simple:
public class Event
{
public DateTime Timestamp { get; set; }
}
private IObservable<Event> eventStream1;
private IObservable<Event> eventStream2;
public IObservable<IEnumerable<Event>> CombineAndGroup()
{
return eventStream1.CombineLatest(eventStream2, (e1, e2) => e1.Timestamp < e2.Timestamp ? e1 : e2)
.ToEnumerableUntilChanged(e => e.Timestamp);
}
Now for the ToEnumerableUntilChanged implementation (wall of code warning):
public static IObservable<IEnumerable<TSource>> ToEnumerableUntilChanged<TSource,TKey>(this IObservable<TSource> source, Func<TSource,TKey> keySelector)
{
// TODO: Follow Rx conventions and create a superset overload that takes the IComparer as a parameter
var comparer = EqualityComparer<TKey>.Default;
return Observable.Create<IEnumerable<TSource>>(observer =>
{
var currentKey = default(TKey);
var hasCurrentKey = false;
var distinctElements = new List<TSource>();
return source.Subscribe((value =>
{
TKey elementKey;
try
{
elementKey = keySelector(value);
}
catch (Exception ex)
{
observer.OnError(ex);
return;
}
if (!hasCurrentKey)
{
hasCurrentKey = true;
currentKey = elementKey;
distinctElements.Add(value);
return;
}
bool keysMatch;
try
{
keysMatch = comparer.Equals(currentKey, elementKey);
}
catch (Exception ex)
{
observer.OnError(ex);
return;
}
if (keysMatch)
{
distinctElements.Add(value);
return;
}
// emit a snapshot so clearing the buffer below cannot mutate what observers received
observer.OnNext(new List<TSource>(distinctElements));
distinctElements.Clear();
distinctElements.Add(value);
currentKey = elementKey;
}), observer.OnError, () =>
{
if (distinctElements.Count > 0)
observer.OnNext(distinctElements);
observer.OnCompleted();
});
});
}