I've been trying to wrap my head around unit testing, and I'm stuck on how to unit test a function whose return value depends on a bunch of parameters. There's a lot of information out there, and it's a bit overwhelming.
Consider the following:
I have a class Article, which has a collection of prices. It has a method GetCurrentPrice which determines the current price based on a few rules:
public class Article
{
public string Id { get; set; }
public string Description { get; set; }
public List<Price> Prices { get; set; }
public Article()
{
Prices = new List<Price>();
}
public Price GetCurrentPrice()
{
if (Prices == null)
return null;
return (
from
price in Prices
where
price.Active &&
DateTime.Now >= price.Start &&
DateTime.Now <= price.End
select price)
.OrderByDescending(p => p.Type)
.FirstOrDefault();
}
}
The PriceType enum and Price class:
public enum PriceType
{
Normal = 0,
Action = 1
}
public class Price
{
public string Id { get; set; }
public string Description { get; set; }
public decimal Amount { get; set; }
public PriceType Type { get; set; }
public DateTime Start { get; set; }
public DateTime End { get; set; }
public bool Active { get; set; }
}
I want to create a unit test for the GetCurrentPrice method. Basically I want to test all combinations of rules that could possibly occur, so I would have to create multiple articles to contain various combinations of prices to get full coverage.
I'm thinking of a unit test such as this (pseudo):
[TestMethod()]
public void GetCurrentPriceTest()
{
var articles = getTestArticles();
foreach (var article in articles)
{
var price = article.GetCurrentPrice();
// somehow compare the gotten price to a predefined value
}
}
I've read that 'multiple asserts are evil', but don't I need
them to test all conditions here? Or would I need a separate unit
test per condition?
How would I go about providing the unit test with a set of test data?
Should I mock a repository? And should that data also include the
expected values?
You are not using a repository in this example so there's no need to mock anything. What you could do is to create multiple unit tests for the different possible inputs:
[TestMethod]
public void Foo()
{
// arrange
var article = new Article();
// TODO: go ahead and populate the Prices collection with dummy data
// act
var actual = article.GetCurrentPrice();
// assert
// TODO: assert on the actual price returned by the method
// depending on what you put in the arrange phase you know
}
And so on: you could add further unit tests in which only the arrange and assert phases change for each possible input.
You do not need multiple asserts. You need multiple tests with only a single assert each.
Write a new test for each starting condition, with a single assert each, for example:
[Test]
public void GetCurrentPrice_PricesCollection1_ShouldReturnNormalPrice(){...}
[Test]
public void GetCurrentPrice_PricesCollection2_ShouldReturnActionPrice(){...}
Also test the boundaries.
For unit tests I use the naming pattern
MethodName_UsedData_ExpectedResult()
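Boundary cases are awkward to hit exactly while GetCurrentPrice reads DateTime.Now itself. Here is a minimal sketch of one way around that, assuming you are free to add an overload that receives the current time as a parameter. The GetCurrentPrice(DateTime now) overload and the trimmed-down Article and Price types below are hypothetical stand-ins, not part of the original code, and NUnit is assumed as the test framework:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

public enum PriceType { Normal = 0, Action = 1 }

// Trimmed-down copy of the question's Price class.
public class Price
{
    public PriceType Type { get; set; }
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public bool Active { get; set; }
}

public class Article
{
    public List<Price> Prices { get; set; } = new List<Price>();

    // Production callers keep using the parameterless version.
    public Price GetCurrentPrice() => GetCurrentPrice(DateTime.Now);

    // Hypothetical overload: same rules, but "now" is supplied by the caller.
    public Price GetCurrentPrice(DateTime now)
    {
        if (Prices == null)
            return null;

        return Prices
            .Where(p => p.Active && now >= p.Start && now <= p.End)
            .OrderByDescending(p => p.Type)
            .FirstOrDefault();
    }
}

[TestFixture]
public class GetCurrentPriceBoundaryTests
{
    private static readonly DateTime Start = new DateTime(2020, 1, 1);
    private static readonly DateTime End = new DateTime(2020, 1, 31);

    // One active normal price valid from Start to End (inclusive).
    private static Article ArticleWithOnePrice()
    {
        var article = new Article();
        article.Prices.Add(new Price { Type = PriceType.Normal, Start = Start, End = End, Active = true });
        return article;
    }

    [Test]
    public void GetCurrentPrice_NowEqualsStart_ShouldReturnPrice()
    {
        Assert.That(ArticleWithOnePrice().GetCurrentPrice(Start), Is.Not.Null);
    }

    [Test]
    public void GetCurrentPrice_NowEqualsEnd_ShouldReturnPrice()
    {
        Assert.That(ArticleWithOnePrice().GetCurrentPrice(End), Is.Not.Null);
    }

    [Test]
    public void GetCurrentPrice_NowOneTickAfterEnd_ShouldReturnNull()
    {
        Assert.That(ArticleWithOnePrice().GetCurrentPrice(End.AddTicks(1)), Is.Null);
    }
}
```

Each test still has a single assert and follows the MethodName_UsedData_ExpectedResult naming pattern.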
I think you need data-driven testing. In VSTS (MSTest) there is an attribute called DataSource; using it, you can send multiple test cases to a single test method. Make sure you don't use multiple asserts. Here is an MSDN link: http://msdn.microsoft.com/en-us/library/ms182527.aspx
Hope this will help you.
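If an external data source feels heavy for a handful of cases, MSTest v2 and later also accept inline rows through the DataTestMethod and DataRow attributes. The sketch below is only an illustration of that alternative, not the DataSource approach from the link. It assumes condensed copies of the question's Article and Price types, and it expresses dates as day offsets from the current time because attribute arguments must be constants:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Condensed copies of the question's types so the sketch compiles on its own.
public enum PriceType { Normal = 0, Action = 1 }

public class Price
{
    public PriceType Type { get; set; }
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public bool Active { get; set; }
}

public class Article
{
    public List<Price> Prices { get; set; } = new List<Price>();

    public Price GetCurrentPrice()
    {
        var now = DateTime.Now;
        return Prices
            .Where(p => p.Active && now >= p.Start && now <= p.End)
            .OrderByDescending(p => p.Type)
            .FirstOrDefault();
    }
}

[TestClass]
public class GetCurrentPriceDataDrivenTests
{
    // Row layout: start offset (days), end offset (days), active flag, is a price expected?
    [DataTestMethod]
    [DataRow(-1, 1, true, true)]    // active, inside its window
    [DataRow(-1, 1, false, false)]  // inside its window but inactive
    [DataRow(1, 2, true, false)]    // not started yet
    [DataRow(-2, -1, true, false)]  // already expired
    public void GetCurrentPrice_SinglePrice_ReturnsItOnlyWhenCurrent(
        int startOffsetDays, int endOffsetDays, bool active, bool priceExpected)
    {
        // arrange
        var article = new Article();
        article.Prices.Add(new Price
        {
            Type = PriceType.Normal,
            Start = DateTime.Now.AddDays(startOffsetDays),
            End = DateTime.Now.AddDays(endOffsetDays),
            Active = active
        });

        // act
        var actual = article.GetCurrentPrice();

        // assert
        Assert.AreEqual(priceExpected, actual != null);
    }
}
```

The test runner reports every row as its own test case, so a failing combination is still easy to identify even though there is only one method and one assert.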
I currently have a class with around 40 injected dependencies. It is hard to maintain and unit test, and I am not sure of a good way around it.
The code handles any type of application that needs to be processed (New License, License Renewal, Student Registration, ...). There are around 80 different application types, and which sections are associated with each application type is determined by a database table.
I have a class with all of the possible properties; there are several more than listed, but you should get the idea. Each of the properties has its own set of properties, which are basic data types or objects pointing to other classes.
class Application
{
[JsonProperty(PropertyName = "accounting")]
public Accounting Accounting { get; set; }
[JsonProperty(PropertyName = "application")]
public Application Application { get; set; }
[JsonProperty(PropertyName = "applicationType")]
public ApplicationType ApplicationType { get; set; }
[JsonProperty(PropertyName = "document")]
public List<Attachment> Document { get; set; }
[JsonProperty(PropertyName = "employment")]
public List<Employment> Employment { get; set; }
[JsonProperty(PropertyName = "enrollment")]
public Enrollment Enrollment { get; set; }
[JsonProperty(PropertyName = "individualAddressContact")]
public IndividualAddressContact IndividualAddressContact { get; set; }
[JsonProperty(PropertyName = "instructors")]
public List<Instructor> Instructors { get; set; }
[JsonProperty(PropertyName = "license")]
public License License { get; set; }
[JsonProperty(PropertyName = "licenseRenewal")]
public LicenseRenewal LicenseRenewal { get; set; }
[JsonProperty(PropertyName = "MilitaryService")]
public List<MilitaryService> MilitaryService { get; set; }
[JsonProperty(PropertyName = "paymentDetail")]
public PaymentDetail PaymentDetail { get; set; }
[JsonProperty(PropertyName = "photo")]
public List<Attachment> Photo { get; set; }
[JsonProperty(PropertyName = "portal")]
public Portal Portal { get; set; }
[JsonProperty(PropertyName = "section")]
public List<Section> Section { get; set; }
[JsonProperty(PropertyName = "testingCalendar")]
public TestingCalendar TestingCalendar { get; set; }
[JsonProperty(PropertyName = "testingScore")]
public List<TestingScore> TestingScore { get; set; }
[JsonProperty(PropertyName = "USCitizen")]
public USCitizen USCitizen { get; set; }
}
This class is sent to and received from an Angular 10 front end using Web APIs.
When an application is requested, the sections and the different properties are initialized, and if the application has already been started, the progress is reloaded. So it is possible that some of the properties will be pulled from the database and sent to the Angular app.
So I have something such as
Load(applicationTypeId, applicationId)
{
Get the sections for the application type
For each section in the sections
switch sectionid
case Documents
Load all of the documents required for the application type and get any documents uploaded
case Accounting
Load the payment details, if no payment made calculate the payment
case IndividualAddressContact
Load the person name/address/contact and set a few defaults if the person hasn't started.
.....
next
}
Save()
{
Save the application
switch current section
case Documents
Save all of the documents for the application
case Accounting
Save the payment details for the application
case IndividualAddressContact
Save the person name/address/contact for the application
.....
get the next section
Update the application current section
}
I have put all of the items in the switch into their own classes, but in the end I still have one point for serialization/deserialization and still end up with too many dependencies injected. Creating a unit test with over 40 dependencies seems hard to maintain, given that I won't know which properties will or won't be used until an application is requested and loaded from the database. I am unsure how to get around the switch without at some point having to have all of the dependencies injected into one class.
I would appreciate some ideas of how to get around this.
"I currently have a class with around 40 dependency injection..." - Oh my gosh!
"It is a hard to maintain and unit test..." - I don't doubt that in the least!
SUGGESTED REFACTORING:
Create a class that manages "Applications" (e.g. "ApplicationManager").
Create an abstract class "Application".
One advantage of "abstract class" over "interface" here is that you can put "common code" in the abstract base class.
Create a concrete subclass for each "Application" : public class NewLicense : Application, public class LicenseRenewal : Application, etc. etc.
... AND ...
Use DI primarily for those "services" that each concrete class needs.
I'll bet the constructors for your individual concrete classes will only need to inject three or four services ... instead of 40. Who knows - maybe your base class won't need any DI at all.
This is a design we're actually using in one of our production systems. It's simple; it's robust; it's flexible. It's working well for us :)
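A rough sketch of what that refactoring could look like. The service interfaces, the properties, and the FixedFeePaymentService test double are all hypothetical names chosen for illustration; only Application, NewLicense, and LicenseRenewal come from the suggestion above:

```csharp
using System.Collections.Generic;

// Hypothetical services; substitute whatever your sections really depend on.
public interface IPaymentService { decimal CalculateFee(string applicationId); }
public interface IDocumentService { IReadOnlyList<string> GetRequiredDocuments(string applicationId); }

public abstract class Application
{
    public string Id { get; set; }

    // Common code shared by every application type lives in the base class.
    public void Load()
    {
        LoadSections();
    }

    // Each concrete type loads only the sections it actually has.
    protected abstract void LoadSections();
}

public class LicenseRenewal : Application
{
    private readonly IPaymentService _payments;

    public decimal Fee { get; private set; }

    public LicenseRenewal(IPaymentService payments)
    {
        _payments = payments;
    }

    protected override void LoadSections()
    {
        Fee = _payments.CalculateFee(Id);
    }
}

public class NewLicense : Application
{
    private readonly IPaymentService _payments;
    private readonly IDocumentService _documents;

    public decimal Fee { get; private set; }
    public IReadOnlyList<string> RequiredDocuments { get; private set; }

    public NewLicense(IPaymentService payments, IDocumentService documents)
    {
        _payments = payments;
        _documents = documents;
    }

    protected override void LoadSections()
    {
        Fee = _payments.CalculateFee(Id);
        RequiredDocuments = _documents.GetRequiredDocuments(Id);
    }
}

// Trivial test double: a unit test for LicenseRenewal needs one fake, not forty.
public class FixedFeePaymentService : IPaymentService
{
    private readonly decimal _fee;
    public FixedFeePaymentService(decimal fee) { _fee = fee; }
    public decimal CalculateFee(string applicationId) => _fee;
}
```

With this shape, a test for LicenseRenewal constructs it with a single fake payment service, calls Load(), and asserts on Fee, without touching any of the other application types.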
I would recommend using convention over configuration principle, with the Service Locator.
Declare handler interfaces in your program, e.g.
public interface IApplicationQueryHandler
{
Application Populate(Application application);
}
public interface IApplicationSaveHandler
{
bool Save(Application application);
}
Then, write pieces of your code, with dependencies and such, e.g.
public class AccountingApplicationQueryHandler : IApplicationQueryHandler
{
public Application Populate(Application application) {
//// Load the payment details, if no payment made calculate the payment
return application;
}
}
public class AccountingApplicationSaveHandler : IApplicationSaveHandler
{
public bool Save(Application application) {
//// Save the payment details for the application
return true; // this just flags for validation
}
}
// repeat for all other properties
Then in your controller, do something like
// Needs: using Microsoft.Extensions.DependencyInjection; (for the generic GetServices<T>() extension)
public class ApplicationController : Controller
{
    private readonly IServiceProvider _serviceProvider;

    public ApplicationController(IServiceProvider sp)
    {
        _serviceProvider = sp;
    }

    public Application Load(string applicationTypeId, string applicationId)
    {
        var application = new Application(); // or get from db or whatever
        // The generic overload returns typed handlers; GetServices(typeof(...)) would return plain objects
        var queryHandlers = _serviceProvider.GetServices<IApplicationQueryHandler>();
        foreach (var handler in queryHandlers)
        {
            application = handler.Populate(application);
        }
        return application;
    }

    [HttpPost]
    public bool Save(Application application)
    {
        var result = true;
        var saveHandlers = _serviceProvider.GetServices<IApplicationSaveHandler>();
        foreach (var handler in saveHandlers)
        {
            // &= keeps an earlier failure from being overwritten by a later success
            result &= handler.Save(application);
        }
        return result;
    }
}
You would need to register your handlers, which you can do e.g. like so:
var queryHandlers = Assembly.GetAssembly(typeof(IApplicationQueryHandler)).GetExportedTypes()
    .Where(x => x.GetInterfaces().Any(y => y == typeof(IApplicationQueryHandler)));
foreach (var queryHandler in queryHandlers)
{
    services.AddTransient(typeof(IApplicationQueryHandler), queryHandler);
}
// repeat the same for IApplicationSaveHandler
Now finally, you can write unit tests for part of the code like so
[TestClass]
public class AccountingApplicationQueryHandlerTests
{
[TestMethod]
public void TestPopulate()
{
// arrange
var application = new Application();
var handler = new AccountingApplicationQueryHandler(); // inject mocks here
// act
var result = handler.Populate(application);
// Assert
Assert.AreEqual("whatever", result.PaymentDetail); // expected value goes first; "whatever" is a placeholder
}
}
And you can test that your controller calls the right things by mocking IServiceProvider and injecting that with a couple of dummy handlers to confirm they are called correctly.
Following zaitsman's answer, you could also create an AggregatedApplicationQueryHandler and an AggregatedApplicationSaveHandler and pass the collection of concrete implementations of IApplicationQueryHandler and IApplicationSaveHandler to their constructors.
Then you don't need the foreach loop inside the controller (you loop over the handlers inside the aggregated handler), and only one handler of each kind is ever passed to the controller. Passing it as a constructor parameter shouldn't be painful.
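A minimal sketch of the aggregated query handler, under a few assumptions: the Application class here is a tiny stand-in with a hypothetical LoadedSections list, and the two sample handlers exist only to show the fan-out. The aggregate deliberately does not implement IApplicationQueryHandler itself, so a DI container that resolves IEnumerable<IApplicationQueryHandler> will not try to inject the aggregate into its own constructor:

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the real Application class; LoadedSections is hypothetical.
public class Application
{
    public List<string> LoadedSections { get; } = new List<string>();
}

public interface IApplicationQueryHandler
{
    Application Populate(Application application);
}

// The controller depends on this one class instead of looping over handlers itself.
public class AggregatedApplicationQueryHandler
{
    private readonly IReadOnlyList<IApplicationQueryHandler> _handlers;

    public AggregatedApplicationQueryHandler(IEnumerable<IApplicationQueryHandler> handlers)
    {
        _handlers = handlers.ToList();
    }

    // Runs every handler in registration order and returns the populated application.
    public Application Populate(Application application)
    {
        foreach (var handler in _handlers)
        {
            application = handler.Populate(application);
        }
        return application;
    }
}

// Two sample handlers used only to demonstrate the aggregate.
public class AccountingQueryHandler : IApplicationQueryHandler
{
    public Application Populate(Application application)
    {
        application.LoadedSections.Add("Accounting");
        return application;
    }
}

public class DocumentsQueryHandler : IApplicationQueryHandler
{
    public Application Populate(Application application)
    {
        application.LoadedSections.Add("Documents");
        return application;
    }
}
```

In a unit test the aggregate can be built by hand with two or three fake handlers, which also makes it easy to verify that all of them are called and in which order.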
You could also create facades over groups of small services, aggregating their functions into one bigger facade service.
I'm new to unit testing and need some help. This example is only for me to learn; I'm not actually counting the number of users in a static variable when I clearly could just use the Count property on the List data structure. Help me figure out how to make my original assertion, that there are 3 users, pass. Here is the code:
Class User
namespace TestStatic
{
public class User
{
public string Name { get; set; }
public int Dollars { get; set; }
public static int Num_users { get; set; }
public User(string name)
{
this.Name = name;
Num_users++;
}
public int CalculateInterest(int interestRate)
{
return Dollars * interestRate;
}
}
}
Test using MSTest
namespace TestStaticUnitTest
{
[TestClass]
public class CalcInterest
{
[TestMethod]
public void UserMoney()
{
// arrange
User bob = new User("Bob");
bob.Dollars = 24;
// act
int result = bob.CalculateInterest(6);
// assert
Assert.AreEqual(144, result);
//cleanup?
}
[TestMethod]
public void UserCount()
{
// arrange
List<User> users = new List<User>(){ new User("Joe"), new User("Bob"), new User("Greg") };
// act
int userCount = User.Num_users;
// assert
Assert.AreEqual(3, userCount);
}
}
}
The UserCount test fails because a fourth user exists. The user from the UserMoney test is still in memory. What should I do to get three users? Should I garbage collect the first Bob?
Also, I would think that a test that reaches into another test wouldn't be a good unit test. I know that could be an argument, but I'll take any advice from the community on this code. Thanks for the help.
The obvious solution would be to remove the static counter. As you see, when you enter the second unit test method UserCount() the value of that counter is still 1 from the execution of the first unit test method UserMoney() before.
If you want to keep the counter (for learning purposes to see what's going on), you can use cleanup methods which will "reset" the environment before all or each unit test method. In this case you want to reset the counter to 0 for every unit test method execution. You do so by writing a method with the [TestInitialize] attribute:
[TestInitialize]
public void _Initialize() {
User.Num_users = 0;
}
That way, each unit test runs with a "clean" state where the counter will be reset to 0 before the actual unit test method is executed.
You might want to look at Why does TestInitialize get fired for every test in my Visual Studio unit tests? to see how these attributes work.
I am making a booking form for a restaurant, which asks for the name of the restaurant, the date of the meal, and the number of people.
I have a Booking class, which has an ID, the ID of the restaurant, a date, and a number of people:
public class Booking
{
public int Id { get; set; }
public int IDRestaurant{ get; set; }
[CustomPlaceValidator]
public int Nbpeople { get; set; }
[CustomDateValidator]
public DateTime Date { get; set; }
}
As well as a Resto class, which has an ID, a name, a phone number, and a number of tables:
public class Resto
{
public int Id { get; set; }
[Required(ErrorMessage = "Le nom du restaurant doit être saisi")]
public string Nom { get; set; }
[Display(Name = "Téléphone")]
[RegularExpression(@"^0[0-9]{9}$", ErrorMessage = "Le numéro de téléphone est incorrect")]
public string Telephone { get; set; }
[Range(0, 9999)]
public int Size { get; set; }
}
I would like to make a validation to check with each new reservation, that the restaurant is not full.
To do this, when validating the "Number of persons" field of the Booking, I need the value of the "restaurant name" field and the value of the "date" field, and then retrieve all the bookings on this Restaurant at that date, and check whether the sum of the number of persons is much lower than the capacity of the restaurant.
public class CustomPlaceValidator : ValidationAttribute
{
private IDal dal = new Dal();
protected override ValidationResult IsValid(object value, ValidationContext validationContext)
{
int nb = 0;
if (dal.GetAllBooking() != null)
{
foreach (var booking in dal.GetAllBooking())
nb += booking.Nbpeople;
if (nb ..... ) return ValidationResult.Success;
return new ValidationResult("The restaurant is full for this date.");
}
return ValidationResult.Success;
}
}
(It's a draft, the tests are not finished obviously)
How can I get the values of the other properties for my validation?
This is not appropriate for a validation attribute. First, a validation attribute should be independent, or at least self-contained. Since the logic here depends on two different properties (the number of people and the date of the booking) a validation attribute would require too much knowledge of the domain in order to perform the necessary validation. In other words, it's not reusable, and if it's not reusable, then there's no point in using an attribute.
Second, a validation attribute should not do something like make a database query. The controller alone should be responsible for working with your DAL. When you start littering database access across your application, you're going to start running into all sorts of issues in very short order. If you use a DI container to inject your DAL where it needs to go, it's less problematic to use it outside of the controller, but importantly, attributes really don't play well with dependency injection. You can make it work with some DI containers, but it's never easy and you're probably going to regret it later. So, again, this really shouldn't be something a validation attribute handles.
The best approach in my opinion is to simply create a private/protected method on your controller to handle this validation. Something like:
private void ValidateCapacity(Booking booking)
{
var restaurant = dal.GetRestaurant(booking.IDRestaurant);
var existingBookings = dal.GetBookings(booking.IDRestaurant, booking.Date);
var available = restaurant.Size - existingBookings.Sum(b => b.Nbpeople);
if (booking.Nbpeople > available)
{
ModelState.AddModelError("Nbpeople", "There is not enough capacity at the restaurant for this many people on the date you've selected");
}
}
Then, in your post action for the booking, simply call this before checking ModelState.IsValid.
I'm looking at this question: Group validation messages for multiple properties together into one message asp.net mvc
My guess is something like:
public class Booking
{
public int Id { get; set; }
public int IDRestaurant{ get; set; }
[CustomPlace("IDRestaurant", "Date", ErrorMessage = "the restaurant is full")]
public int Nbpeople { get; set; }
[CustomDateValidator]
public DateTime Date { get; set; }
}
and the custom validation:
public class CustomPlaceAttribute : ValidationAttribute
{
private readonly string[] _others;
public CustomPlaceAttribute(params string[] others)
{
_others= others;
}
protected override ValidationResult IsValid(object value, ValidationContext validationContext)
{
// TODO: validate the length of _others to ensure you have all required inputs
var property = validationContext.ObjectType.GetProperty(_others[0]);
if (property == null)
{
return new ValidationResult(
string.Format("Unknown property: {0}", _others[0])
);
}
// This is to get one of the other value information.
var otherValue = property.GetValue(validationContext.ObjectInstance, null);
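// TODO: get the other value again for the date -- and then apply your business logic of determining the capacity
return ValidationResult.Success; // placeholder so the method compiles until the capacity check is written
}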
}
However, it feels a bit messy to make a database call from a validation attribute.
What you are asking for is cross-property validation. If you are not strongly opposed to implementing an interface on your data objects you should take a look at the following:
https://msdn.microsoft.com/en-us/library/system.componentmodel.dataannotations.ivalidatableobject.aspx
A simple example implementation for a small rectangle class where we want its area not to exceed 37 (whatever that unit is).
public class SmallRectangle : IValidatableObject
{
public uint Width { get; set; }
public uint Height { get; set; }
public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
{
var area = Width * Height;
if (area > 37)
{
yield return new ValidationResult("The rectangle is too large.");
}
}
}
Alternatives
The second parameter of the IsValid function in your ValidationAttribute provides you with the ValidationContext, which has the property ObjectInstance; you can cast it to your object type and access its other members. That, however, will make your validation attribute specific to your class. I would generally advise against that.
You could also opt for a different validation approach altogether, such as a validation library like FluentValidation, see:
https://github.com/JeremySkinner/FluentValidation
A different perspective
Last but not least, I would like to note that validation should usually be used to validate the integrity of the data. A booking request that asks for more seats than are available is not invalid. It cannot be granted, but it is a valid request which will, unfortunately, be answered with a negative result. Giving that negative result is, in my opinion, not the responsibility of the validation but of the business logic.
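To illustrate that last point, here is a small sketch of the capacity rule written as plain business logic rather than as a validator. BookingOutcome and BookingRules are hypothetical names, Booking is a trimmed copy of the question's class, and the existing bookings and the capacity are passed in as parameters, so the rule can be tested without a database:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copy of the question's Booking class (validation attributes omitted).
public class Booking
{
    public int IDRestaurant { get; set; }
    public int Nbpeople { get; set; }
    public DateTime Date { get; set; }
}

// The answer to a valid request can still be negative.
public enum BookingOutcome { Accepted, RestaurantFull }

public static class BookingRules
{
    // Decides whether the request fits, given the restaurant's capacity and the bookings already made.
    public static BookingOutcome Evaluate(Booking request, int capacity, IEnumerable<Booking> existingBookings)
    {
        // Only bookings for the same restaurant on the same calendar day count.
        int alreadyBooked = existingBookings
            .Where(b => b.IDRestaurant == request.IDRestaurant && b.Date.Date == request.Date.Date)
            .Sum(b => b.Nbpeople);

        return alreadyBooked + request.Nbpeople <= capacity
            ? BookingOutcome.Accepted
            : BookingOutcome.RestaurantFull;
    }
}
```

The caller (a controller or a service) fetches the capacity and the existing bookings, calls Evaluate, and turns a RestaurantFull outcome into whatever message the user should see.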
I'm trying to become better at unit testing and one of my biggest uncertainties is writing unit tests for methods that require quite a bit of setup code, and I haven't found a good answer. The answers that I find are generally along the lines of "break your tests down into smaller units of work" or "use mocks". I'm trying to follow all of those best practices. However, even with mocking (I'm using Moq) and trying to break down everything into the smallest unit of work, I eventually run into a method that has several inputs, makes calls to several mock services, and requires me to specify return values for those mock method calls.
Here's an example of the code under test:
public class Order
{
public string CustomerId { get; set; }
public string OrderNumber { get; set; }
public List<OrderLine> Lines { get; set; }
public decimal Value { get { /* return the order's calculated value */ } }
public Order()
{
this.Lines = new List<OrderLine>();
}
}
public class OrderLine
{
public string ItemId { get; set; }
public int QuantityOrdered { get; set; }
public decimal UnitPrice { get; set; }
}
public class OrderManager
{
private ICustomerService customerService;
private IInventoryService inventoryService;
public OrderManager(ICustomerService customerService, IInventoryService inventoryService)
{
// Guard clauses omitted to make example smaller
this.customerService = customerService;
this.inventoryService = inventoryService;
}
// This is the method being tested.
// Return false if this order's value is greater than the customer's credit limit.
// Return false if there is insufficient inventory for any of the items on the order.
// Return false if any of the items on the order are on hold.
public bool IsOrderShippable(Order order)
{
// Return false if the order's value is greater than the customer's credit limit
decimal creditLimit = this.customerService.GetCreditLimit(order.CustomerId);
if (creditLimit < order.Value)
{
return false;
}
// Return false if there is insufficient inventory for any of this order's items
foreach (OrderLine orderLine in order.Lines)
{
if (orderLine.QuantityOrdered > this.inventoryService.GetInventoryQuantity(orderLine.ItemId))
{
return false;
}
}
// Return false if any of the items on this order are on hold
foreach (OrderLine orderLine in order.Lines)
{
if (this.inventoryService.IsItemOnHold(orderLine.ItemId))
{
return false;
}
}
// If we are here, then the order is shippable
return true;
}
}
Here's a test:
[TestClass]
public class OrderManagerTests
{
[TestMethod]
public void IsOrderShippable_OrderIsShippable_ShouldReturnTrue()
{
// Setup inventory on-hand quantities for this test
Mock<IInventoryService> inventoryService = new Mock<IInventoryService>();
inventoryService.Setup(e => e.GetInventoryQuantity("ITEM-1")).Returns(10);
inventoryService.Setup(e => e.GetInventoryQuantity("ITEM-2")).Returns(20);
inventoryService.Setup(e => e.GetInventoryQuantity("ITEM-3")).Returns(30);
// Configure each item to be not on hold
inventoryService.Setup(e => e.IsItemOnHold("ITEM-1")).Returns(false);
inventoryService.Setup(e => e.IsItemOnHold("ITEM-2")).Returns(false);
inventoryService.Setup(e => e.IsItemOnHold("ITEM-3")).Returns(false);
// Setup the customer's credit limit
Mock<ICustomerService> customerService = new Mock<ICustomerService>();
customerService.Setup(e => e.GetCreditLimit("CUSTOMER-1")).Returns(1000m);
// Create the order being tested
Order order = new Order { CustomerId = "CUSTOMER-1" };
order.Lines.Add(new OrderLine { ItemId = "ITEM-1", QuantityOrdered = 10, UnitPrice = 1.00m });
order.Lines.Add(new OrderLine { ItemId = "ITEM-2", QuantityOrdered = 20, UnitPrice = 2.00m });
order.Lines.Add(new OrderLine { ItemId = "ITEM-3", QuantityOrdered = 30, UnitPrice = 3.00m });
OrderManager orderManager = new OrderManager(
customerService: customerService.Object,
inventoryService: inventoryService.Object);
bool isShippable = orderManager.IsOrderShippable(order);
Assert.IsTrue(isShippable);
}
}
This is an abbreviated example. My actual methods that I'm testing are similar in their structure, but they often have a few more service methods that they're calling or they have more setup code for the models (for instance, the Order object requires more properties to be assigned in order for the test to work).
Given that some of my methods have to do several things at once like this example (such as methods that are behind button-click events), is this the best way of dealing with writing unit tests for those methods?
You are already on the right path. At some point, if the method under test is big (but not complex), its unit test is bound to be big (but not complex) as well. I tend to differentiate between code that is 'big' and code that is 'complex'. A complex code snippet needs to be simplified; a big code snippet can still be clear and simple.
In your case, your code is just big, not complex. Hence it is not a big deal if your unit tests are big as well.
Having said that, here is how we can make it crisper and more readable.
Option #1
The target code under test seems to be:
public bool IsOrderShippable(Order order)
As far as I can see, there are at least 4 unit test scenarios straight away:
// Scenario 1: Return false if the order's value is
// greater than the customer's credit limit
[TestMethod]
public void IsOrderShippable_OrderValueGreaterThanCustomerCreditLimit_ShouldReturnFalse()
{
// Setup the customer's credit limit
var customerService = new Mock<ICustomerService>();
customerService.Setup(e => e.GetCreditLimit(It.IsAny<string>())).Returns(1000m);
// Create the order with value greater than credit limit
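// Value has no setter, so build a line that pushes it over the limit
// (assumes Value is the sum of QuantityOrdered * UnitPrice over the lines)
var order = new Order();
order.Lines.Add(new OrderLine { ItemId = "ITEM-1", QuantityOrdered = 1, UnitPrice = 1001m });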
var orderManager = new OrderManager(
customerService: customerService.Object,
inventoryService: new Mock<IInventoryService>().Object);
bool isShippable = orderManager.IsOrderShippable(order);
Assert.IsFalse(isShippable);
}
As you can see, this test is pretty compact. It doesn't bother to set up a lot of mocks that you don't expect your scenario code to hit.
Similarly, you can write compact tests for the other 2 failure scenarios.
Finally, for the last scenario (the happy path), you have the unit test you already wrote.
The only thing I would do is extract some private helper methods to make the actual unit test crisp and readable, as follows:
[TestMethod]
public void IsOrderShippable_OrderIsShippable_ShouldReturnTrue()
{
// you can parametrize this helper method as needed
var inventoryService = GetMockInventoryServiceWithItemsNotOnHold();
// You can parametrize this helper method with credit line, etc.
var customerService = GetMockCustomerService(1000m);
// parametrize this method with number of items and total price etc.
Order order = GetTestOrderWithItems();
OrderManager orderManager = new OrderManager(
customerService: customerService.Object,
inventoryService: inventoryService.Object);
bool isShippable = orderManager.IsOrderShippable(order);
Assert.IsTrue(isShippable);
}
As you can see, by using helper methods you make the test smaller and crisper, but you do lose some readability in terms of which parameters are being set up.
However, I tend to be very explicit about helper method names and parameter names, so that by reading the method name and parameters a reader is clear about what sort of data is being arranged.
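The helper methods named in the test above are not defined anywhere in this answer, so here is one possible shape for them, using Moq. The interface signatures are inferred from how the question's OrderManager uses them, and the parameters (quantityOnHand, lineCount, and so on) and the trimmed Order type are assumptions made for the sketch:

```csharp
using System.Collections.Generic;
using Moq;

// Signatures inferred from the question's usage; the real interfaces may differ.
public interface ICustomerService { decimal GetCreditLimit(string customerId); }
public interface IInventoryService
{
    int GetInventoryQuantity(string itemId);
    bool IsItemOnHold(string itemId);
}

// Trimmed copies of the question's models (the computed Value property is omitted).
public class OrderLine
{
    public string ItemId { get; set; }
    public int QuantityOrdered { get; set; }
    public decimal UnitPrice { get; set; }
}

public class Order
{
    public string CustomerId { get; set; }
    public List<OrderLine> Lines { get; set; } = new List<OrderLine>();
}

public static class OrderTestHelpers
{
    // Every item reports the given stock level and is not on hold.
    public static Mock<IInventoryService> GetMockInventoryServiceWithItemsNotOnHold(int quantityOnHand = 100)
    {
        var inventoryService = new Mock<IInventoryService>();
        inventoryService.Setup(s => s.GetInventoryQuantity(It.IsAny<string>())).Returns(quantityOnHand);
        inventoryService.Setup(s => s.IsItemOnHold(It.IsAny<string>())).Returns(false);
        return inventoryService;
    }

    // Every customer gets the same credit limit.
    public static Mock<ICustomerService> GetMockCustomerService(decimal creditLimit)
    {
        var customerService = new Mock<ICustomerService>();
        customerService.Setup(s => s.GetCreditLimit(It.IsAny<string>())).Returns(creditLimit);
        return customerService;
    }

    // Builds an order with lineCount lines named ITEM-1, ITEM-2, ...
    public static Order GetTestOrderWithItems(int lineCount = 3, int quantityPerLine = 10, decimal unitPrice = 1.00m)
    {
        var order = new Order { CustomerId = "CUSTOMER-1" };
        for (int i = 1; i <= lineCount; i++)
        {
            order.Lines.Add(new OrderLine { ItemId = "ITEM-" + i, QuantityOrdered = quantityPerLine, UnitPrice = unitPrice });
        }
        return order;
    }
}
```

Using It.IsAny<string>() keeps the mocks independent of the item and customer IDs in the order, and the named optional parameters let a test state only the value it cares about, for example GetMockInventoryServiceWithItemsNotOnHold(quantityOnHand: 5).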
Most of the time, the happy path scenarios end up requiring the most setup code, since they need all the mocks set up properly with all the correlated items, quantities, prices, etc. In those cases, I sometimes prefer to put all the setup code in the test setup method (the one marked [TestInitialize] in MSTest), so that it is available by default to every test method.
The upside is that the tests get good mock values out of the box (your happy path unit test can literally be just 2 lines, since you can keep a valid Order ready in the setup method).
The downside is that the happy path scenario is typically a single unit test, but putting that code in the setup method runs it for every unit test, even those that never need it.
Option #2
Here is another way.
You could break down your IsOrderShippable method into 4 private methods, one for each of the 4 scenarios. You can make these private methods internal and then have your unit tests work on those methods (via InternalsVisibleTo, etc.). It is still a bit clunky, since you are making private methods internal, and you still need to unit test your public method, which brings us more or less back to the original problem.
I am posing this question as it relates to a C# solution, however, I faced the same quandary in a RoR solution and simply opted to use Map-Reduce to its fullest, abandoning all hope of abstracting the data store.
MongoDB Map-Reduce seems to be THE way to perform pivots as well as other reporting queries. An alternative, which is the typical document repository manner, such as is encouraged by typical EntityFramework (EF) folks, is to move the logic to the application layer.
Without getting deep into arguments of the relative advantages of each approach, the amount of data within the data store is proven to be too large to fetch it all up into the application layer.
The following code is a proof-of-concept (POC), which yields results, but begs the question I am asking here, is there a way to reduce the impact of using Map-Reduce within a C# (any .NET) solution?
Data Models used throughout:
public class Call
{
[BsonId]
[BsonRepresentation(BsonType.ObjectId)]
public string Id { get; set; }
public DateTime? StartTime { get; set; }
public DateTime? EndTime { get; set; }
public Agent Agent { get; set; }
public Caller Caller { get; set; }
}
public class Agent : Person
{
public DateTime JoinedCompany { get; set; }
}
public class Caller : Person
{
}
Data Models used within the Map-Reduce POC:
public class AgentCallSummary
{
public ObjectId _id;
public AgentCallAggregateValues value;
public class AgentCallAggregateValues
{
public int count;
public int totalTimeOnCall;
}
}
The following code depends upon CreateCollection() and an extension method Dump(this T, string) which are being used to represent abstractly that a document collection can be obtained from whatever document store, and any document may be dumped (like LINQPad provides):
private void DemonstrateMapReduce()
{
var calls = CreateCollection<Call>();
calls.Count().Dump("Call Count");
const string mapJavascript =
@"function(){
var call = this;
/* averageCallTime should be fetched, simplified here, averageCallTime is used as the timeOnCall for calls that are in progress */
var averageCallTime = 15.0;
var calculateTotalTimeOnCall = function(startTime, endTime) {
if ((!endTime) || (!startTime)) {
return averageCallTime;
}
var diffMs = endTime - startTime;
return (diffMs / 1000) * 60;
};
emit(call.Agent._id, { count: 1, totalTimeOnCall: 1 });
}";
const string reduceJavascript =
@"function(key, values) {
var result = { count: 0, totalTimeOnCall: 0 };
values.forEach(function(value) {
result.count += value.count;
result.totalTimeOnCall += value.totalTimeOnCall;
});
return result;
}";
var mapReduceResult = calls.MapReduce(mapJavascript, reduceJavascript, MapReduceOptions.SetOutput(MapReduceOutput.Inline));
foreach (var item in mapReduceResult.GetInlineResultsAs<AgentCallSummary>())
{
item.Dump();
}
}