Enforcing invariants for child entities in concurrent editing environments - c#

Given the invariant that a child collection cannot exceed a certain number of items, how can the domain guarantee that invariant is enforced in a concurrent/web environment? Let's look at a (classic) example:
We have a Manager with Employees. The (hypothetical) invariant states that a Manager cannot have more than seven direct reports (Employees). We might implement this (naively) like so:
public class Manager {
    // Let us assume that the employee list is mapped (somehow) from a persistence layer
    public IList<Employee> employees { get; private set; }

    public Manager(...) {
        ...
    }

    public void AddEmployee(Employee employee) {
        if (employees.Count() < 7) {
            employees.Add(employee);
        } else {
            throw new OverworkedManagerException();
        }
    }
}
Until recently, I had considered this approach to be good enough. However, it seems there is an edge-case that makes it possible for the database to store more than seven employees and thus break the invariant. Consider this series of events:
Person A goes to edit Manager in UI
(6 employees in memory, 6 employees in database)
Person B goes to edit Manager in UI
(6 employees in memory, 6 employees in database)
Person B adds Employee and saves changes
(7 employees in memory, 7 employees in database)
Person A adds Employee and saves changes
(7 employees in memory, 8 employees in database)
When the domain object is once again pulled from the database, the Manager constructor may (or may not) reinforce the Employee count invariant on the collection, but either way we now have a discrepancy between our data and what our invariant expects. How do we prevent this situation from happening? How do we recover from this cleanly?

The simplest approach is to implement the database writes as a compare and swap operation. All writes are working with a stale copy of the aggregate (after all, we're looking at the aggregate in memory, but the book of record is the durable copy on disk). The key idea is that when we actually perform the write, we are also checking that the stale copy we were working with is still the live copy in the book of record.
(For instance, in an event sourced system, you don't append to the stream, but append to a specific position in the stream -- ie, where you expect the tail pointer to be. So in a race, only one write gets to commit to the tail position; the other fails on a concurrency conflict and starts over.)
The analog to this in a web environment might be to use an ETag, and verify that the ETag is still valid when you perform the write. The winner gets a successful response; the loser gets a 412 Precondition Failed.
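As a rough sketch of that idea (the Managers table, its Version column and the helper below are made-up names, not a prescribed schema), a version number stored with the row can play the role of the ETag; the UPDATE only succeeds if the copy we read is still the current one:

using System;
using System.Data;

public static class ManagerWriter
{
    // Compare-and-swap save: returns true if we won the race, false if someone saved first.
    public static bool TrySave(IDbConnection connection, Guid id, int expectedVersion, string employeesJson)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText =
                @"UPDATE Managers
                  SET    EmployeesJson = @employees,
                         Version       = @version + 1
                  WHERE  Id = @id
                    AND  Version = @version"; // only matches if our stale copy is still the live copy

            AddParameter(command, "@id", id);
            AddParameter(command, "@version", expectedVersion);
            AddParameter(command, "@employees", employeesJson);

            // 0 rows affected means a concurrent writer got there first: map that to a retry or a 412.
            return command.ExecuteNonQuery() == 1;
        }
    }

    private static void AddParameter(IDbCommand command, string name, object value)
    {
        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}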
An improvement on this is to use a better model for your domain. Udi Dahan wrote:
A microsecond difference in timing shouldn’t make a difference to core business behaviors
Specifically, if your model ends up in a different state just because commands A and B happen to be processed in a different order, your model probably doesn't match your business very well.
The analog in your example would be that both commands should succeed, but the second of the two should also set a flag that notes that the aggregate is currently out of compliance. This approach prevents idiocies when an addEmployee command and a removeEmployee command happen to get ordered the wrong way around in the transport layer.
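A bare-bones sketch of that shape (building on the Manager/Employee types from the question; the flag name is made up for illustration):

using System.Collections.Generic;

public class Manager
{
    private readonly List<Employee> employees = new List<Employee>();

    // Set instead of throwing: the command always succeeds, but the business is told
    // when the aggregate has drifted out of compliance so it can react.
    public bool IsOverCapacity { get; private set; }

    public void AddEmployee(Employee employee)
    {
        employees.Add(employee);
        IsOverCapacity = employees.Count > 7;
    }

    public void RemoveEmployee(Employee employee)
    {
        employees.Remove(employee);
        IsOverCapacity = employees.Count > 7;
    }
}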
The (hypothetical) invariant states that a Manager cannot have more than seven direct reports
A thing to be wary of, even in hypothetical examples, is whether or not the database is the book of record. The database seldom gets veto power over the real world. If the real world is the book of record, you probably shouldn't be rejecting changes.

How do we prevent this situation from happening?
You implement this behavior in your Repository implementation: when you load the Aggregate, you also keep track of the Aggregate's version. The version can be implemented as a unique key constraint over the Aggregate's Id and an integer sequence number. Every Aggregate has its own sequence number (initially 0). Before the Repository tries to persist the Aggregate, it increments the sequence number; if a concurrent persist has occurred, the database behind the Repository will throw a "unique key constraint violated" kind of exception and the persist will not happen.
Then (if you have designed the Aggregate as a pure, side-effect-free object, as you should in DDD!), you can transparently retry the command execution, re-running all the Aggregate's domain code and thus re-checking the invariants. Please note that the operation must be retried only if a "unique constraint violation" infrastructure exception occurs, not when the Aggregate throws a domain exception.
How do we recover from this cleanly?
You could retry the command execution until no "unique constraint violation" is thrown.
I've implemented this retrying in PHP here: https://github.com/xprt64/cqrs-es/blob/master/src/Gica/Cqrs/Command/CommandDispatcher/ConcurrentProofFunctionCaller.php
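A rough C# sketch of the same retry loop (ConcurrencyException here stands in for whatever "unique constraint violated" error your persistence layer surfaces; it is not a specific library type):

using System;

public class ConcurrencyException : Exception { }

public static class RetryingDispatcher
{
    public static void Execute(Action command, int maxAttempts = 5)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                // 'command' loads the Aggregate, runs the domain logic (re-checking the
                // invariants) and persists it with an incremented sequence number.
                command();
                return;
            }
            catch (ConcurrencyException) when (attempt < maxAttempts)
            {
                // Another writer got there first: loop around, reload and try again.
                // Domain exceptions (e.g. OverworkedManagerException) are deliberately not caught here.
            }
        }
    }
}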

This is not so much a DDD problem as a persistence layer problem. There are multiple ways to look at this.
From a traditional ACID/strong consistency perspective
You need to have a look at your particular database's available concurrency and isolation strategies, possibly reflected in your ORM capabilities. Some of them will allow you to detect such conflicts and throw an exception as Person A saves their changes at step 4.
As I said in my comment, in a typical web application that uses the Unit of Work pattern (via an ORM or otherwise), this shouldn't happen quite as often as you seem to imply. Entities don't stay in memory, tracked by the UoW, all along steps 1 to 4; they are reloaded at steps 3 and 4. Transactions 3 and 4 would have to be concurrent for the problem to occur.
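For instance, if the ORM happened to be Entity Framework Code First (just one example, not something the question requires), a rowversion concurrency token is enough to turn the stale save at step 4 into a detectable exception:

using System.ComponentModel.DataAnnotations;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;

public class ManagerRow
{
    public int Id { get; set; }

    [Timestamp] // EF includes the originally loaded value in the UPDATE's WHERE clause
    public byte[] RowVersion { get; set; }
}

public static class SaveHelper
{
    public static bool TrySave(DbContext db)
    {
        try
        {
            db.SaveChanges();
            return true;
        }
        catch (DbUpdateConcurrencyException)
        {
            // The losing writer from step 4 lands here: reload, re-check the invariant,
            // then retry or report the conflict to the user.
            return false;
        }
    }
}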
Weaker, lock-free consistency
You have a few options here.
Last-one-wins, where the 7 employees from Person A will erase those from Person B. This can be viable in certain business contexts. You can do it by persisting the change as a whole-collection replacement (employees = <new list>) instead of an employees.Add.
Relying on version numbers, as @VoiceOfUnreason described.
Eventual consistency with compensation, where something else in the application checks the invariant (employees.Count() < 7) after the fact, outside of Person A and B's transactions. A compensating action has to be taken if a violation of the rule is detected, such as rolling back the last operation and notifying Person A that the manager would have been overworked.

Related

Event sourcing incremental int id

I looked at a lot of event sourcing tutorials, and all of them use simple demos to focus on the tutorial's topic (event sourcing).
That's fine until you hit something in a real-world application that is not covered by one of these tutorials :)
I hit something like this.
I have two databases, one event store and one projection store (read models).
All aggregates have a GUID Id, which was 100% fine until now.
Now I created a new JobAggregate and a Job projection.
And it's required by my company to have a unique incremental int64 Job Id.
Now I'm looking stupid :)
An additional issue is that a job is created multiple times per second!
That means the method to get the next number has to be really safe.
In the past (without ES) I had a table, defined the PK as an auto-increment int64, saved the Job, and the DB did the job of giving me the next number. Done.
But how can I do this within my aggregate or command handler?
Normally the Job projection is created by the event handler, but that's too late in the process, because the aggregate should have the int64 already. (For replaying the aggregate on an empty DB and getting the same Aggregate Id -> Job Id relation.)
How should I solve this issue?
Kind regards
In the past (without ES) I had a table, defined the PK as an auto-increment int64, saved the Job, and the DB did the job of giving me the next number. Done.
There's one important thing to notice in this sequence, which is that the generation of the unique identifier and the persistence of the data into the book of record both share a single transaction.
When you separate those ideas, you are fundamentally looking at two transactions -- one that consumes the id, so that no other aggregate tries to share it, and another to write that id into the store.
The best answer is to arrange that both parts are part of the same transaction -- for example, if you were using a relational database as your event store, then you could create an entry in your "aggregate_id to long" table in the same transaction as the events are saved.
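A loose sketch of keeping the number reservation and the event append inside one transaction (SQL-Server-flavored and entirely hypothetical: the JobNumbers sequence and the Events table are illustration names, not a prescribed schema):

using System;
using System.Data;

public static class JobCreator
{
    public static long CreateJob(IDbConnection connection, Guid aggregateId, string eventPayload)
    {
        using (IDbTransaction transaction = connection.BeginTransaction())
        {
            long jobNumber = NextJobNumber(connection, transaction);
            AppendEvent(connection, transaction, aggregateId, jobNumber, eventPayload);

            // Either the reserved number and the event both commit, or neither does.
            transaction.Commit();
            return jobNumber;
        }
    }

    private static long NextJobNumber(IDbConnection connection, IDbTransaction transaction)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.Transaction = transaction;
            command.CommandText = "SELECT NEXT VALUE FOR JobNumbers"; // or a counter row, or your database's equivalent
            return (long)command.ExecuteScalar();
        }
    }

    private static void AppendEvent(IDbConnection connection, IDbTransaction transaction,
                                    Guid aggregateId, long jobNumber, string payload)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.Transaction = transaction;
            command.CommandText =
                "INSERT INTO Events (AggregateId, JobNumber, Payload) VALUES (@id, @number, @payload)";
            AddParameter(command, "@id", aggregateId);
            AddParameter(command, "@number", jobNumber);
            AddParameter(command, "@payload", payload);
            command.ExecuteNonQuery();
        }
    }

    private static void AddParameter(IDbCommand command, string name, object value)
    {
        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}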
Another possibility is to treat the "create" of the aggregate as a Prepare followed by a Created; with an event handler that responds to the prepare event by reserving the long identifier post facto, and then sends a new command to the aggregate to assign the long identifier to it. So all of the consumers of Created see the aggregate with the long assigned to it.
It's worth noting that you are assigning what is effectively a random long to each aggregate you are creating, so you better dig in to understand what benefit the company thinks it is getting from this -- if they have expectations that the identifiers are going to provide ordering guarantees, or completeness guarantees, then you had best understand that going in.
There's nothing particularly wrong with reserving the long first; depending on how frequently the save of the aggregate fails, you may end up with gaps. For the most part, you should expect to be able to maintain a small failure rate (i.e. you check that you expect the command to succeed before you actually run it).
In a real sense, the generation of unique identifiers falls under the umbrella of set validation; we usually "cheat" with UUIDs by abandoning any pretense of ordering and pretending that the risk of collision is zero. Relational databases are great for set validation; event stores maybe not so much. If you need unique sequential identifiers controlled by the model, then your "set of assigned identifiers" needs to be within an aggregate.
The key phrase to follow is "cost to the business" -- make sure you understand why the long identifiers are valuable.
Here's how I'd approach it.
I agree with the idea of an Id generator that supplies the "business Id" but not the "technical Id".
The core here is to have an application-level JobService that deals with all the infrastructure services to orchestrate what is to be done.
Controllers (web controllers or command lines) will directly consume the JobService of the application level to control/command the state change.
It's in PHP-like pseudocode, but here we are talking about the architecture and the processes, not the syntax. Adapt it to C# syntax and the approach is the same.
Application level
class MyNiceWebController
{
    public function createNewJob( string $jobDescription, xxxx $otherData, ApplicationJobService $jobService )
    {
        $projectedJob = $jobService->createNewJobAndProjectIt( $jobDescription, $otherData );
        $this->doWhateverYouWantWithYourAlreadyExistingJobLikeForExample301RedirectToDisplayIt( $projectedJob );
    }
}

class MyNiceCommandLineCommand
{
    private $jobService;

    public function __construct( ApplicationJobService $jobService )
    {
        $this->jobService = $jobService;
    }

    public function createNewJob()
    {
        $jobDescription = // Get it from the command line parameters
        $otherData = // Get it from the command line parameters
        $projectedJob = $this->jobService->createNewJobAndProjectIt( $jobDescription, $otherData );
        // print, echo, console->output... confirmation with the Id, or print the full object... whatever with ( $projectedJob );
    }
}
class ApplicationJobService
{
    // In the application level because it just serves the first-level requests
    // from controllers, commands, etc. but does not add "domain" logic.
    private $application;
    private $jobIdGenerator;
    private $jobEventFactory;
    private $jobEventStore;
    private $jobProjector;

    public function __construct( Application $application, JobBusinessIdGeneratorService $jobIdGenerator, JobEventFactory $jobEventFactory, JobEventStoreService $jobEventStore, JobProjectorService $jobProjector )
    {
        $this->application = $application; // I like to log which "application execution run" is responsible for all domain effects; I can then trace IPs, cookies, etc. by crossing data from another data lake.
        $this->jobIdGenerator = $jobIdGenerator;
        $this->jobEventFactory = $jobEventFactory;
        $this->jobEventStore = $jobEventStore;
        $this->jobProjector = $jobProjector;
    }

    public function createNewJobAndProjectIt( string $jobDescription, xxxx $otherData ) : Job
    {
        $applicationExecutionId = $this->application->getExecutionId();
        $businessId = $this->jobIdGenerator->getNextJobId();
        $jobCreatedEvent = $this->jobEventFactory->createNewJobCreatedEvent( $applicationExecutionId, $businessId, $jobDescription, $otherData );

        $this->jobEventStore->storeEvent( $jobCreatedEvent ); // Throws an exception if it fails, so no projector will be invoked if the event was not stored.

        $entityId = $jobCreatedEvent->getId();
        $projectedJob = $this->jobProjector->project( $entityId );

        return $projectedJob;
    }
}
Note: if projecting is too expensive for a synchronous projection, just return the Id instead:
        // ...
        $entityId = $jobCreatedEvent->getId();
        $this->jobProjector->enqueueProjection( $entityId );
        return $entityId;
    }
}
Infrastructure level (common to various applications)
class JobBusinessIdGenerator implements DomainLevelJobBusinessIdGeneratorInterface
{
    // In infrastructure because it accesses persistence layers.
    // In the constructor, get the persistence objects and so on... database, files, whatever.

    public function getNextJobId() : int
    {
        $this->lockGlobalCounterMaybeAtDatabaseLevel();

        $current = $this->persistence->getCurrentJobCounter();
        $next = $current + 1;
        $this->persistence->setCurrentJobCounter( $next );

        $this->unlockGlobalCounterMaybeAtDatabaseLevel();

        return $next;
    }
}
Domain Level
class JobEventFactory
{
    // It's in this factory that we create the entity Id.
    private $idGenerator;

    public function __construct( EntityIdGenerator $idGenerator )
    {
        $this->idGenerator = $idGenerator;
    }

    public function createNewJobCreatedEvent( Id $applicationExecutionId, int $businessId, string $jobDescription, xxxx $otherData ) : JobCreatedEvent
    {
        $eventId = $this->idGenerator->createNewId();
        $entityId = $this->idGenerator->createNewId();

        // The only place where we allow "new" is in the factories. No other place should ever do a "new".
        $event = new JobCreatedEvent( $eventId, $entityId, $applicationExecutionId, $businessId, $jobDescription, $otherData );

        return $event;
    }
}
If you do not like the factory creating the entityId (it may seem ugly to some eyes), just pass it in as a parameter with a specific type, and push the responsibility for creating a fresh one (and not reusing an existing one) onto some other intermediate service (never the application service).
Nevertheless, if you do so, take care: what if a "silly" service creates two JobCreatedEvents with the same entity Id? That would really be ugly. In the end, creation only occurs once, and the Id is created at the very core of the creation of the JobCreatedEvent (redundant redundancy intended). Your choice anyway.
Other classes...
class JobCreatedEvent;
class JobEventStoreService;
class JobProjectorService;
Things that do not matter in this post
We could discuss at length whether the projectors should live in the infrastructure level, global to the multiple applications calling them... or even in the domain (as I need "at least" one way to read the model), or whether they belong more to the application (maybe the same model can be read in 4 different ways by 4 different applications, each with its own projectors)...
We could discuss at length where the side-effects are triggered, whether implicitly in the event store or in the application level (I've not called any side-effects processor == event listener). I think of side-effects as being in the application layer, as they depend on infrastructure...
But all this... is not the topic of this question.
I don't care about all those things for this post. Of course they are not negligible topics and you will have your own strategy for them, and you have to design all this very carefully. But here the question was where to create the auto-incremental Id coming from a business requirement, and doing all those projectors (sometimes called calculators) and side-effects (sometimes called reactors) in a "clean-code" way here would blur the focus of this answer. You get the idea.
Things I care about in this post
What I care about is that:
If the experts want an "autonumeric", then it's a "domain requirement" and therefore it's a property at the same level of definition as "description" or "other data".
The fact that they want this property does not conflict with the fact that all entities have an "internal Id" in whatever format the coder chooses, be it a UUID, a SHA1 or whatever.
If you need sequential ids for that property, you need a "supplier of values", AKA JobBusinessIdGeneratorService, which has nothing to do with the "entity Id" itself.
That Id generator is responsible for ensuring that once the number has been auto-incremented, it is synchronously persisted before it is returned to the caller, so it is impossible to return the same id twice after a failure.
Drawbacks
There's a sequence-leak you'll have to deal with:
If the Id generator points to 4007, the next call to getNextJobId() will increment it to 4008, persist the pointer as "current = 4008" and then return.
If for some reason the creation and persistence of the Job then fails, the next call will give 4009. We will then have a sequence of [ 4006, 4007, 4009, 4010 ], with 4008 missing.
That is because, from the generator's point of view, 4008 was "actually used"; as a generator, it does not know what you did with it, just as it wouldn't if you had a dummy loop that extracted 100 numbers.
Never compensate with a ->rollback() in the catch of a try / catch block, because that can create concurrency problems: if you get 4008, another process gets 4009, and then your process fails, the rollback would corrupt the sequence. Just assume that on failure the Id was simply consumed, and do not blame the generator. Blame whatever failed.
I hope it helps!
@SharpNoizy, very simple.
Create your own Id generator. Say an alphanumeric string, for example "DB3U8DD12X", that gives you billions of possibilities. Now, what you want to do is generate these ids in sequential order by giving each character an ordered value...
0 - 0
1 - 1
2 - 2
.....
10 - A
11 - B
Get the idea? So, what you do next is to create your function that will increment each index of your "D74ERT3E4" string using that matrix.
So, "R43E4D", "R43E4E", "R43E4F", "R43E4G"... get the idea?
Then, when your application loads, you look at the database and find the latest Id generated. Then you load into memory the next 50,000 combinations (in case you want super speed) and create a static class/method that will give you that value back.
Aggregate.Id = IdentityGenerator.Next();
This way you have control over the generation of your IDs, because that's the only class that has that power.
I like this approach because it is more "readable" when used in your web API, for example. GUIDs are hard (and tedious) to read, remember, etc.
GET api/job/DF73 is way better to remember than api/job/XXXX-XXXX-XXXXX-XXXX-XXXX
Does that make sense?
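A loose C# sketch of that generator (using plain longs instead of the alphanumeric encoding, with a made-up Initialize step standing in for "look at the database and find the latest Id generated"):

using System;

public static class IdentityGenerator
{
    private static readonly object gate = new object();
    private static long current;   // last id handed out
    private static long blockEnd;  // upper bound of the block reserved at startup

    // Call once at startup, after reading/reserving the block from the database.
    public static void Initialize(long lastPersistedId, int blockSize = 50000)
    {
        current = lastPersistedId;
        blockEnd = lastPersistedId + blockSize;
    }

    public static long Next()
    {
        lock (gate)
        {
            if (current >= blockEnd)
                throw new InvalidOperationException("Block exhausted; reserve a new range first.");
            return ++current;
        }
    }
}

// Aggregate.Id = IdentityGenerator.Next();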

Converting Object.GetHashCode() to Guid

I need to assign a GUID to objects for managing state at app startup & shutdown.
It looks like I can store the lookup values in a dictionary using
Dictionary<int, Guid>.Add(instance.GetHashCode(), myGUID());
Are there any potential issues to be aware of here?
NOTE
This does NOT need to persist between execution runs; only the GUID does, like so:
create the object
GetHashCode(), associate with a new or old GUID
before the app terminates, GetHashCode() and look up the GUID to update() or insert() into the persistence engine USING the GUID
the only assumption is that GetHashCode() remains consistent while the process is running
also GetHashCode() is called on the same object type (derived from Window)
Update 2 - here is the bigger picture
create a state machine to store info about WPF user controls (later ref as UC) between runs
the types of user controls can change over time (added / removed)
in the very 1st run, there is no prior state; the user interacts with a subset of UCs and modifies their state, which needs to be recreated when the app restarts
this state snapshot is taken when the app has a normal shutdown
also there can be multiple instances of a UC type
at shutdown, each instance is assigned a guid and saved along with the type info and the state info
all these guids are also stored in a collection
at restart, for each guid, create object, store ref/guid, restore state per instance so the app looks exactly as before
the user may add or remove UC instances/types and otherwise interact with the system
at shutdown, the state is saved again
choices at this time are to remove / delete all prior state and insert new state info to the persistence layer (sql db)
with observation/analysis over time, it turns out that a lot of instances remain consistent/static and do not change - so their state need not be deleted/inserted again as the state info is now quite large and stored over a non local db
so only the change delta is persisted
to compute the delta, need to track reference lifetimes
currently stored as List<WeakReference> at startup
on shutdown, iterate through this list and the actual UCs present on screen, and add / update / delete keys accordingly
send delta over to persistence
Hope the above makes it clear.
So now the question is: why not just store the HashCode (of the user control only) instead of a WeakReference, and eliminate the test for a null reference while iterating through the list?
update 3 - thanks all, going to use weakreference finally
Use GetHashCode to balance a hash table. That's what it's for. Do not use it for some other purpose that it was not designed for; that's very dangerous.
You appear to be assuming that a hash code will be unique. Hash codes don't work like that. See Eric Lippert's blog post on Guidelines and rules for GetHashCode for more details, but basically you should only ever make the assumptions which are guaranteed for well-behaving types - namely that if two objects have different hash codes, they're definitely unequal; if they have the same hash code, they may or may not be equal.
EDIT: As noted, you also shouldn't persist hash codes between execution runs. There's no guarantee they'll be stable in the face of restarts. It's not really clear exactly what you're doing, but it doesn't sound like a good idea.
EDIT: Okay, you've now noted that it won't be persistent, so that's a good start - but you still haven't dealt with the possibility of hash code collisions. Why do you want to call GetHashCode() at all? Why not just add the reference to the dictionary?
The quick and easy fix seems to be
var dict = new Dictionary<InstanceType, Guid>();
dict.Add(instance, myGUID());
Of course you need to implement InstanceType.Equals correctly if it isn't yet. (Or implement IEquatable<InstanceType>.)
Possible issues I can think of:
Hash code collisions could give you duplicate dictionary keys
Different objects' hash algorithms could give you the same hash code for two functionally different objects; you wouldn't know which object you're working with
This implementation is prone to ambiguity (as described above); you may need to store more information about your objects than just their hash codes.
Note - Jon said this more elegantly (see above)
Since this is for WPF controls, why not just add the Guid as a dependency property? You seem to already be iterating through the user controls in order to get their hash codes, so this would probably be a simpler method.
If you want to capture that a control was removed and which Guid it had, some manager object that subscribes to closing/removed events and stores the Guid and a few other details would be a good idea. Then you would also have an easier time capturing more details for analysis if you need them.
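One way that could look (sketched as an attached property so existing controls don't need to be subclassed; the class and property names are made up):

using System;
using System.Windows;

public static class PersistenceId
{
    public static readonly DependencyProperty IdProperty =
        DependencyProperty.RegisterAttached(
            "Id",
            typeof(Guid),
            typeof(PersistenceId),
            new PropertyMetadata(Guid.Empty));

    public static Guid GetId(DependencyObject obj)
    {
        return (Guid)obj.GetValue(IdProperty);
    }

    public static void SetId(DependencyObject obj, Guid value)
    {
        obj.SetValue(IdProperty, value);
    }
}

// At save time:
//   if (PersistenceId.GetId(userControl) == Guid.Empty)
//       PersistenceId.SetId(userControl, Guid.NewGuid());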

new objects added during long loop

We currently have a production application that runs as a windows service. Many times this application will end up in a loop that can take several hours to complete. We are using Entity Framework for .net 4.0 for our data access.
I'm looking for confirmation that if we load new data into the system, after this loop is initialized, it will not result in items being added to the loop itself. When the loop is initialized we are looking for data "as of" that moment. Although I'm relatively certain that this will work exactly like using ADO and doing a loop on the data (the loop only cycles through data that was present at the time of initialization), I am looking for confirmation for co-workers.
Thanks in advance for your help.
//update : here's some sample code in c# - question is the same, will the enumeration change if new items are added to the table that EF is querying?
IEnumerable<myobject> myobjects = (from o in db.theobjects where o.id == myID select o);
foreach (myobject obj in myobjects)
{
    //perform action on obj here
}
It depends on your precise implementation.
Once a query has been executed against the database then the results of the query will not change (assuming you aren't using lazy loading). To ensure this you can dispose of the context after retrieving query results--this effectively "cuts the cord" between the retrieved data and that database.
Lazy loading can result in a mix of "initial" and "new" data; however once the data has been retrieved it will become a fixed snapshot and not susceptible to updates.
You mention this is a long running process; which implies that there may be a very large amount of data involved. If you aren't able to fully retrieve all data to be processed (due to memory limitations, or other bottlenecks) then you likely can't ensure that you are working against the original data. The results are not fixed until a query is executed, and any updates prior to query execution will appear in results.
I think your best bet is to change the logic of your application so that when the "loop" logic is determining whether it should do another iteration or exit, you take the opportunity to load any newly added items into the list. See the pseudo code below:
var repo = new Repository();
while (repo.HasMoreItemsToProcess())
{
    var entity = repo.GetNextItem();
}
Let me know if this makes sense.
The easiest way to assure that this happens - if the data itself isn't too big - is to convert the data you retrieve from the database to a List<>, e.g., something like this (pulled at random from my current project):
var sessionIds = room.Sessions.Select(s => s.SessionId).ToList();
And then iterate through the list, not through the IEnumerable<> that would otherwise be returned. Converting it to a list triggers the enumeration, and then throws all the results into memory.
If there's too much data to fit into memory, and you need to stick with an IEnumerable<>, then the answer to your question depends on various database and connection settings.
I'd take a snapshot of ID's to be processed -- quickly and as a transaction -- then work that list in the fashion you're doing today.
In addition to accomplishing the goal of not changing the sample mid-stream, this also gives you the ability to extend your solution to track status on each item as it's processed. For a long-running process, this can be very helpful for progress reporting, restart / retry capabilities, etc.
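A rough sketch of that approach (db.theobjects, o.id and myID come from the question's snippet; MyContext and RowId are placeholder names used only for illustration):

using System.Collections.Generic;
using System.Linq;

public static class SnapshotProcessor
{
    public static void ProcessAsOfNow(MyContext db, int myID)
    {
        // ToList() executes the query immediately, so rows inserted later never join the loop.
        List<int> snapshot = (from o in db.theobjects
                              where o.id == myID
                              select o.RowId).ToList();

        foreach (int rowId in snapshot)
        {
            var obj = db.theobjects.Single(o => o.RowId == rowId);
            // perform action on obj here, and optionally record per-item status
            // so a restart can resume from where it left off
        }
    }
}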

using the db to prevent errors in a UI presentation

I am going through this MSDN article by noted DDD expert Udi Dahan, where he makes a great observation that he said took him years to realize: "Bringing all e-mail addresses into memory would probably get you locked up by the performance police. Even having the domain model call some service, which calls the database, to see if the e-mail address is there is unnecessary. A unique constraint in the database would suffice."
In a LOB presentation that captures some add or edit scenario, you wouldn't enable the Save-type action until all edits were considered valid, so the first trade-off of doing the above is that you need to enable Save and be prepared to notify the user if the uniqueness constraint is violated. But how best to do that, say with NHibernate?
I figure it needs to follow the lines of the pseudo-code below. Does anyone do something along these lines now?
Cheers,
Berryl
try {}
catch (GenericADOException)
{
    // "Abort due to constraint violation\r\ncolumn {0} is not unique", columnName
    // (1) determine which db column violated uniqueness
    // (2) potentially map the column name to something in context for the user
    // (3) throw something that can be translated into a BrokenRule for the UI presentation
    // (4) reset the NHibernate session
}
The pessimistic approach is to check for unique-ness before saving; the optimistic approach is to attempt the save and handle the exception. The biggest problem with the optimistic approach is that you have to be able to parse the exception returned by the database to know that it's a specific unique constraint violation rather than the myriad of other things that can go wrong.
For this reason, it's much easier to check uniqueness before saving. It's a trivial database call to make this check: select 1 where email = 'newuser@somewhere.com'. It's also a better user experience to notify the user that the value is a duplicate (perhaps they already registered with the site?) before making them fill out the rest of the form and click save.
The unique constraint should definitely be in place, but the UI should check that the address is unique at the time it is entered on the form.
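A minimal sketch of that pre-check with NHibernate (the Person entity and its EmailAddress property are assumptions used only for illustration):

using NHibernate;

public static class EmailChecker
{
    public static bool IsEmailTaken(ISession session, string email)
    {
        // Counts matching rows without loading any entities into memory.
        long count = session
            .CreateQuery("select count(p) from Person p where p.EmailAddress = :email")
            .SetParameter("email", email)
            .UniqueResult<long>();

        return count > 0;
    }
}

The unique constraint stays in place as the final guard; this check only improves the user experience.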

Is it a good idea to create a custom type for the primary key of each data table?

We have a lot of code that passes about “Ids” of data rows; these are mostly ints or guids. I could make this code safer by creating a different struct for the id of each database table. Then the type checker will help to find cases when the wrong ID is passed.
E.g. the Person table has a column called PersonId and we have code like:
DeletePerson(int personId)
DeleteCar(int carId)
Would it be better to have:
struct PersonId
{
    private int id;
    // GetHashCode etc....
}
DeletePerson(PersonId personId)
DeleteCar(CarId carId)
Has anyone got real-life experience of doing this?
Is it worth the overhead?
Or more pain than it is worth?
(It would also make it easier to change the data type of the primary key in the database; that is why I thought of this idea in the first place.)
Please don't say "use an ORM" or some other big change to the system design, as I know an ORM would be a better option, but that is not within my power at present. However, I can make minor changes like the above to the module I am working on at present.
Update:
Note this is not a web application; the Ids are kept in memory and passed about with WCF, so there is no conversion to/from strings at the edge. There is no reason the WCF interface can't use the PersonId type etc. The PersonId type etc. could even be used in the WPF/WinForms UI code.
The only inherently "untyped" bit of the system is the database.
This seems to be down to the cost/benefit of spending time writing code that the compiler can check better, or spending the time writing more unit tests. I am coming down more on the side of spending the time on testing, as I would like to see at least some unit tests in the code base.
It's hard to see how it could be worth it: I recommend doing it only as a last resort and only if people are actually mixing identifiers during development or reporting difficulty keeping them straight.
In web applications in particular it won't even offer the safety you're hoping for: typically you'll be converting strings into integers anyway. There are just too many cases where you'll find yourself writing silly code like this:
int personId;
if (Int32.TryParse(Request["personId"], out personId)) {
    this.person = this.PersonRepository.Get(new PersonId(personId));
}
Dealing with complex state in memory certainly improves the case for strongly-typed IDs, but I think Arthur's idea is even better: to avoid confusion, demand an entity instance instead of an identifier. In some situations, performance and memory considerations could make that impractical, but even those should be rare enough that code review would be just as effective without the negative side-effects (quite the reverse!).
I've worked on a system that did this, and it didn't really provide any value. We didn't have ambiguities like the ones you're describing, and in terms of future-proofing, it made it slightly harder to implement new features without any payoff. (No ID's data type changed in two years, at any rate; it could certainly happen at some point, but as far as I know, the return on investment for that is currently negative.)
I wouldn't make a special id for this. This is mostly a testing issue. You can test the code and make sure it does what it is supposed to.
You can create a standard way of doing things in your system that helps future maintenance (similar to what you mention) by passing in the whole object to be manipulated. Of course, if you named your parameter (int personID) and had documentation, then any non-malicious programmer should be able to use the code effectively when calling that method. Passing a whole object will do the type matching that you are looking for, and that should be enough of a standardized way.
I just see having a special structure made to guard against this as adding more work for little benefit. Even if you did this, someone could come along and find a convenient way to make a 'helper' method and bypass whatever structure you put in place anyway so it really isn't a guarantee.
You can just opt for GUIDs, like you suggested yourself. Then, you won't have to worry about passing a person ID of "42" to DeleteCar() and accidentally delete the car with ID of 42. GUIDs are unique; if you pass a person GUID to DeleteCar in your code because of a programming typo, that GUID will not be a PK of any car in the database.
You could create a simple Id class which can help differentiate in code between the two:
public class Id<T>
{
    private int RawValue
    {
        get;
        set;
    }

    public Id(int value)
    {
        this.RawValue = value;
    }

    public static explicit operator int (Id<T> id) { return id.RawValue; }

    // this cast is optional and can be excluded for further strictness
    public static implicit operator Id<T> (int value) { return new Id<T>(value); }
}
Used like so:
class SomeClass
{
    public Id<Person> PersonId { get; set; }
    public Id<Car> CarId { get; set; }
}
Assuming your values would only be retrieved from the database, unless you explicitly cast the value to an integer, it is not possible to use the two in each other's place.
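If you prefer the struct form from the question, a sketch along those lines (equality members added so it behaves sensibly as a dictionary key; for WCF you would decorate it with data contract attributes as needed):

using System;

public struct PersonId : IEquatable<PersonId>
{
    private readonly int id;

    public PersonId(int id)
    {
        this.id = id;
    }

    public int Value { get { return id; } }

    public bool Equals(PersonId other) { return id == other.id; }

    public override bool Equals(object obj) { return obj is PersonId && Equals((PersonId)obj); }

    public override int GetHashCode() { return id; }

    public override string ToString() { return id.ToString(); }
}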
I don't see much value in custom checking in this case. You might want to beef up your testing suite to check that two things are happening:
Your data access code always works as you expect (i.e., you aren't loading inconsistent Key information into your classes and getting misuse because of that).
That your "round trip" code is working as expected (i.e., that loading a record, making a change and saving it back isn't somehow corrupting your business logic objects).
Having a data access (and business logic) layer you can trust is crucial to being able to address the bigger-picture problems you will encounter when implementing the actual business requirements. If your data layer is unreliable, you will spend a lot of effort tracking (or worse, working around) problems at that level that surface when you put load on the subsystem.
If instead your data access code is robust in the face of incorrect usage (what your test suite should be proving to you) then you can relax a bit on the higher levels and trust they will throw exceptions (or however you are dealing with it) when abused.
The reason you hear people suggesting an ORM is that many of these issues are dealt with in a reliable way by such tools. If your implementation is far enough along that such a switch would be painful, just keep in mind that your low-level data access layer needs to be as robust as a good ORM if you really want to be able to trust (and thus, to a certain extent, forget about) your data access.
Instead of custom validation, your testing suite could inject code (via dependency injection) that does robust tests of your Keys (hitting the database to verify each change) as the tests run and that injects production code that omits or restricts such tests for performance reasons. Your data layer will throw errors on failed keys (if you have your foreign keys set up correctly there) so you should also be able to handle those exceptions.
My gut says this just isn't worth the hassle. My first question to you would be whether you actually have found bugs where the wrong int was being passed (a Car ID instead of a Person ID in your example). If so, it is probably more of a case of worse overall architecture in that your Domain objects have too much coupling, and are passing too many arguments around in method parameters rather than acting on internal variables.
