Accounting Database - storing a transaction


You make a gaming website where users can buy gaming credits, and the funds are deposited/credited into the user's virtual account to play some game, etc.
Option #1
If you got an accountant to record the transaction, it would be recorded like this (maybe a bit more complex, but you get the point):
TRANSACTION
PK_ID1 Cash - $10 (System)
PK_ID2 Deposit $10 (System)
TRANSACTION
PK_ID3 Bank Account - $10 (John)
PK_ID4 Deposit $10 (John)
Option #2
As a developer, do you really need to waste 2 extra records? Why not just record it like this (you might then store where the funds came from, the status, and so on in other columns of the same deposit record)?
TRANSACTION
PK_ID1 Cash - $10 (system)
PK_ID2 Deposit $10 (John)
Is there any real advantage of option #1 over option #2, or vice versa?
EDIT: modified question, removed CR, DR and replaced with a sign.

(Answering your question, but also responding some points raised in paxdiablo's answer.)
It has nothing to do with the accountant looking inside your database. With double entry, errors are easy to trace; it is an Accounting and IRS requirement, so really, you do not have a choice: you need double entry for any system that deals with public funds.
(Please do not try to tell me what "double entry" is; I have written double entry systems for banks, to Audit requirements.) Double entry is an accounting method, based on a set of accounts. Every financial transaction is a Journal Entry; if all the transactions were re-applied from the beginning, all the accounts would be at exactly the same balance as they are today.
Double Entry means every transaction has a "To" and a "From" account; money never leaves the system or enters the system. Every Credit has a Debit attached to it.
Therefore (1) is not the "double entry" version of (2); they cannot be readily compared. The double entry version of John's transaction (one financial transaction) is, in logical accounting terms:
From: JohnAccount To: SystemAccount Amount: 10.00 (dollars)
That may well be two rows in a table, one a credit and the other a debit, the two inserts wrapped in an SQL Transaction.
That is it for the Accounting system, which is internal, and deals with money. We are done.
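As a minimal sketch of "two rows, one transaction", the snippet below uses Python's built-in sqlite3 module as a stand-in for whatever database you actually run. The table name ledger_entry, its columns, and the convention that the "From" side is stored as a negative amount in cents are all assumptions made for illustration, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ledger_entry (
        entry_id   INTEGER PRIMARY KEY,
        journal_id INTEGER NOT NULL,   -- groups the rows of one financial transaction
        account    TEXT    NOT NULL,
        amount     INTEGER NOT NULL    -- signed cents; "From" side negative (assumed convention)
    )
""")

def transfer(conn, journal_id, from_account, to_account, cents):
    """Record one financial transaction as two rows that sum to zero."""
    with conn:  # both inserts commit together, or both roll back on error
        conn.execute(
            "INSERT INTO ledger_entry (journal_id, account, amount) VALUES (?, ?, ?)",
            (journal_id, from_account, -cents),
        )
        conn.execute(
            "INSERT INTO ledger_entry (journal_id, account, amount) VALUES (?, ?, ?)",
            (journal_id, to_account, cents),
        )

# From: JohnAccount  To: SystemAccount  Amount: 10.00 (dollars)
transfer(conn, 1, "JohnAccount", "SystemAccount", 1000)

balances = dict(
    conn.execute("SELECT account, SUM(amount) FROM ledger_entry GROUP BY account")
)
```

Because every journal entry nets to zero, the sum over all accounts stays zero no matter how many transfers are recorded.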
But you are additionally marrying the accounting system to a purchase/sale system (without having explicitly declared it). Of course for the ten bucks you took from John, you need to give him whatever he purchased for it, and record that. John bought ten bucks worth of gaming credits, if you are tracking that, then yes, you also need:
From: SystemGamingAccount To: JohnGamingAccount Amount: 100 (credits)
or, expressed in dollars:
From: SystemGamingAccount To: JohnGamingAccount Amount: 10.00 (dollars)
That, too, may well be two rows in a table, one a credit and the other a debit, the four inserts wrapped in an SQL Transaction.
To be clear, if you were selling widgets instead of gaming credits, the second (widget tracking) transaction would be:
From: Warehouse To: PublicSale Amount: 1 (widgets)
and since you are tracking units in the warehouse but not how many widgets John Q Public has in his pocket, that is two inserts plus one update (UPDATE Part SET QtyInStock = QtyInStock - 1 WHERE PartCode = 'Widget'), all wrapped in an SQL transaction.
And there IS an Account for each user, right? Virtual, esoteric or physical, it is a Legal Entity, against which transactions are made. So let's not pretend it does not exist because it is virtual. For gaming, one dollar Account plus one gaming (credit) Account.
Credit/Debit
I would put the CR/DR back in; not as CHAR(2), but as a boolean. It will help you later when the table is large:
WHERE IsCredit = 1
is much faster than
WHERE Amount >= 0.
Note that with ">=" you have to ensure that every code segment is coded the same way, not ">" sometimes. Boolean or char does not have that problem.

In terms of the data (which is what you're asking), no. You should store it as a signed value. Double-entry bookkeeping is not something the mob does so it can hide the real profits from the IRS :-)
It means transactions have to be balanced (value is never created or destroyed, just transformed). And it'll be a lot easier to balance transactions (and the books) if you just store them in one column with a sign.
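To illustrate why a single signed column makes balancing easy, here is a small sketch using Python's sqlite3 module. The table entry and its columns are hypothetical; the point is only that "does every transaction balance?" becomes one GROUP BY with a HAVING clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (txn_id INTEGER, account TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        (1, "John", -1000), (1, "System", 1000),  # balanced: nets to zero
        (2, "John", -500),  (2, "System", 400),   # unbalanced: 100 cents missing
    ],
)

# With signed amounts, a balanced transaction is one whose rows sum to zero.
unbalanced = [
    row[0]
    for row in conn.execute(
        "SELECT txn_id FROM entry GROUP BY txn_id "
        "HAVING SUM(amount) <> 0 ORDER BY txn_id"
    )
]
```

With separate credit and debit columns the same check needs two sums and a comparison; the result is the same, only the query is longer.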
In terms of visual presentation, some accountant may like them in separate columns but the vast majority will generate reports with the "negatives" simply indicated differently (such as surrounding them with parentheses).
It may well be that (like many other accounting things) the dual columns are carried forward from many moons ago. It would be easier to add up two columns and then subtract the negative total from the positive total to get the current position (as opposed to adding and subtracting in an intermixed fashion). But that's supposition on my part.

Related

SQL Server: does the 'sum' function work fine on large tables?

I'm building a point-of-sale system for a market using C# WinForms and SQL Server 2014.
My problem is how to store products and their quantities.
The program should display the available quantity of a product when a new sales bill is created, so that before making a sale the user knows whether there is enough stock for the order.
I created this table for items:
Item_ID, Item, Buy_Price, Sell_Price, Quantity
But this causes confusion when the user edits bills, whether sales bills or purchase bills.
For example, when the user edits the quantity in a purchase bill (invoice), the Quantity cell in the items table no longer matches the quantity in the buy_bills table.
So I want to make it a little more dynamic: remove the Quantity column from the items table and use the bills tables to get the quantity of any item.
That means that when the user searches for an item (product) to sell, the system uses the SUM function in an SQL statement:
quantity = sum(item_quantity) in buy_bills - sum(item_quantity) in sell_bills
Does this work? Or will it become too heavy as the database size and record count grow?

The problem here is not the SUM function as such (it will sum whatever you put in your query, and in general rather quickly) but the concept of determining available quantities. Assuming there is more than one selling point, it is impossible to get an accurate available quantity for any product unless you provide some form of concurrency control.
So, you may use a sum over the table that holds the history of purchases and sales. But before you accept any change, you need to take a lock that prevents other sellers from selling the same quantity. After you have committed one sale, you can let the other sellers continue, making sure you refresh the item quantities they are currently shown.
Keep in mind that this scenario is one of the basic ones that occur in any multiuser business application.
So, to conclude, do provide:
1) One point in time at which you are sure that existing stock cannot be lowered
2) SUM in the way you see fit for your application
3) A refresh of any pending sale with the new quantities
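The three points above can be sketched roughly as follows. This uses Python's sqlite3 module purely as a stand-in: BEGIN IMMEDIATE is SQLite's way of taking the write lock up front, and on SQL Server you would reach for a transaction with a suitable isolation level or lock hints instead. The tables buy_bills and sell_bills follow the question; the function names available and sell are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/COMMIT statements below control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE buy_bills  (item_id INTEGER, item_quantity INTEGER);
    CREATE TABLE sell_bills (item_id INTEGER, item_quantity INTEGER);
    INSERT INTO buy_bills VALUES (1, 10);
""")

def available(conn, item_id):
    """Stock on hand = everything bought minus everything sold."""
    row = conn.execute(
        """SELECT (SELECT COALESCE(SUM(item_quantity), 0)
                   FROM buy_bills WHERE item_id = :i)
                - (SELECT COALESCE(SUM(item_quantity), 0)
                   FROM sell_bills WHERE item_id = :i)""",
        {"i": item_id},
    ).fetchone()
    return row[0]

def sell(conn, item_id, quantity):
    """Record a sale only if enough stock exists at the moment of the lock."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading stock
    try:
        if available(conn, item_id) < quantity:
            conn.execute("ROLLBACK")
            return False
        conn.execute("INSERT INTO sell_bills VALUES (?, ?)", (item_id, quantity))
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise

first = sell(conn, 1, 7)       # enough stock: 10 on hand
second = sell(conn, 1, 5)      # refused: only 3 left
remaining = available(conn, 1)
```

The check and the insert happen under the same lock, which is the "one point in time" from the list; a client whose sale was refused would then re-read available() before retrying.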

I want to make it a little more dynamic and remove the quantity cell from the items table, using the bills table to get the quantity of any item. That means when the user searches for an item (product) to sell, the system uses the SUM function in an SQL statement: quantity = sum(item_quantity) in buy_bills - sum(item_quantity) in sell_bills. Does it work?
Assuming your syntax is correct, you can sum a column using the SUM function, and you can subtract two sums in a single statement, i.e. SUM(Column1) - SUM(Column2) AS Column3. This assumes that you have everything else set up correctly.
Or will this be too heavy when the database size and record count get larger?
This depends entirely on your database setup and is not answerable here. It is possible that it will slow down when the database gets larger. But if your database is set up correctly, it should not be by much.
It is also possible, later on, to get around that issue by pre-loading the data.

Should the user's Account balance be stored in the database or calculated dynamically?

Should the user's Account balance be stored in the database or calculated dynamically?
For accurate results, calculating it dynamically makes sense, but might that become a problem when there are many users and the database grows very large?
Transaction
    Id (PK)
    AccountId
    Type
    DateTime
    Amount
    etc.
AccountBalance
    TransactionId (PK/FK)
    BalanceAmount

In order to keep accurate auditing, you should make a record of every transaction that affects the user's account balance. This means you can calculate the balance dynamically; however, for performance reasons I would store the balance as well. To ensure the stored balance is correct, though, I would have a daily job that recalculates the balance from scratch.
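A rough sketch of such a recalculation job, using Python's sqlite3 module as a stand-in database. The tables txn and account_balance and the function reconcile are hypothetical names; the sample data deliberately includes one stored balance that has drifted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn (id INTEGER PRIMARY KEY, account_id INTEGER, amount INTEGER);
    CREATE TABLE account_balance (account_id INTEGER PRIMARY KEY, balance INTEGER);
    INSERT INTO txn (account_id, amount) VALUES (1, 1000), (1, -250), (2, 500);
    INSERT INTO account_balance VALUES (1, 750), (2, 450);  -- account 2 has drifted
""")

def reconcile(conn):
    """Recompute balances from the transactions, fix any stored balance that
    differs, and return the (account_id, stored, actual) rows that were fixed."""
    drifted = conn.execute("""
        SELECT b.account_id, b.balance, COALESCE(SUM(t.amount), 0) AS actual
        FROM account_balance AS b
        LEFT JOIN txn AS t ON t.account_id = b.account_id
        GROUP BY b.account_id, b.balance
        HAVING b.balance <> COALESCE(SUM(t.amount), 0)
        ORDER BY b.account_id
    """).fetchall()
    with conn:  # apply all corrections atomically
        conn.executemany(
            "UPDATE account_balance SET balance = ? WHERE account_id = ?",
            [(actual, account_id) for account_id, _stored, actual in drifted],
        )
    return drifted

corrections = reconcile(conn)
stored_after = dict(conn.execute("SELECT account_id, balance FROM account_balance"))
```

In practice the list of corrections is worth logging, since every entry in it points at a bug or a race somewhere in the code that maintains the stored balance.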

You need to ask yourself a few questions:
1) Who will OWN the calculation?
2) Who will NEED the result?
Now if the owner of the calculation is the only one who will need it - or if anyone else who needs it will get it from the owner, then there is no need to store the calculation.
However, in most applications that actually run for a long time, the calculated result will probably end up being needed somewhere else. For instance, a reporting application like SQL Server Reporting Services will need the result of the calculation, so if the owner of the calculation is a web application, you have a problem. How will Reporting Services (which talks to the database only) get the result?
The solution in that case: either store the calculation OR make the database the owner OF the calculation and have an SQL function that returns the result.
Personally, I tend to go for the non-purist approach - I store calculated results in the database. Space is cheap, and response time is faster on a read than on a read+function call.

I think this is a good question. Calculating every time is obviously easy to do but would probably result in a lot of needless calculations with the resultant performance hit.
But storing the current balance in some other table can lead to data concurrency issues, where the data that builds the aggregate is modified out of sync with the aggregate.
Perhaps a happy medium is to have an SQL trigger on the transaction table that updates the aggregate value on an insert or update for that user.
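A minimal sketch of the trigger idea, again with Python's sqlite3 module standing in for the real database (trigger syntax differs on SQL Server). The table and trigger names are assumptions, and only the insert case is shown; updates and deletes of transactions would need their own triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_balance (
        account_id INTEGER PRIMARY KEY,
        balance    INTEGER NOT NULL
    );
    CREATE TABLE txn (
        id         INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL,
        amount     INTEGER NOT NULL
    );

    -- Keep the aggregate in step with every inserted transaction.
    CREATE TRIGGER txn_after_insert AFTER INSERT ON txn
    BEGIN
        INSERT OR IGNORE INTO account_balance (account_id, balance)
        VALUES (NEW.account_id, 0);
        UPDATE account_balance
        SET balance = balance + NEW.amount
        WHERE account_id = NEW.account_id;
    END;
""")

with conn:
    conn.executemany(
        "INSERT INTO txn (account_id, amount) VALUES (?, ?)",
        [(1, 1000), (1, -300), (2, 50)],
    )

stored = dict(conn.execute("SELECT account_id, balance FROM account_balance"))
```

Because the trigger runs inside the same transaction as the insert, the aggregate and the rows it summarizes are committed or rolled back together.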

The current balance is already available!
It is the balance in the last transaction for the account:
select top 1 [Balance]
from dbo.Trans
where [AccountID] = @AccountID
order by [TranID] desc
The balance has to be calculated and stored as part of every transaction; otherwise the system won't scale.
Also, if you don't store the balance, you have no checks and balances (as in: balance must equal previous balance plus new credits less new debits).
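The running-balance approach might look like the sketch below, written with Python's sqlite3 module (so LIMIT 1 replaces TOP 1). The table trans mirrors dbo.Trans from the query above; the functions post and is_consistent are hypothetical. Note that this demo uses a single connection: with concurrent writers, the read of the previous balance and the insert would need to happen under a lock.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE trans (
        tran_id    INTEGER PRIMARY KEY AUTOINCREMENT,
        account_id INTEGER NOT NULL,
        amount     INTEGER NOT NULL,   -- signed: credits positive, debits negative
        balance    INTEGER NOT NULL    -- balance after this transaction
    )
""")

def current_balance(conn, account_id):
    """The balance is simply the one stored on the latest transaction."""
    row = conn.execute(
        "SELECT balance FROM trans WHERE account_id = ? "
        "ORDER BY tran_id DESC LIMIT 1",
        (account_id,),
    ).fetchone()
    return row[0] if row else 0

def post(conn, account_id, amount):
    """Insert a transaction together with the resulting balance."""
    with conn:
        new_balance = current_balance(conn, account_id) + amount
        conn.execute(
            "INSERT INTO trans (account_id, amount, balance) VALUES (?, ?, ?)",
            (account_id, amount, new_balance),
        )
    return new_balance

def is_consistent(conn, account_id):
    """Check: each balance equals the previous balance plus the new amount."""
    running = 0
    for amount, balance in conn.execute(
        "SELECT amount, balance FROM trans WHERE account_id = ? ORDER BY tran_id",
        (account_id,),
    ):
        running += amount
        if balance != running:
            return False
    return True

post(conn, 1, 1000)
post(conn, 1, -250)
latest = current_balance(conn, 1)
consistent = is_consistent(conn, 1)
```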

If your application does not otherwise retrieve the underlying data from the database at the point where it needs the balance, I would suggest calculating the balance in the database, or else storing it there.
If you need an up-to-date balance frequently and it changes dynamically based on more than one table, then you should use a view instead of a trigger.

DataTables, DataSets, Entity Framework, LINQ & Lambda Expression for Financial Technical Analysis C#

This is more of an architectural question than a specific code problem, as I've hit a major block in how I am going to proceed with this project.
I'm building financial scanning software that filters stock picks on specific criteria. For example, if, out of 8000 stocks, a stock's closing price today is above the SMA 100 and its closing price 10 days ago was below the SMA 100, then return that stock's symbol to me.
However, note that the SMA (Simple Moving Average) is calculated with the last 100 days of data in the above example, but we could change the 100 days to, let's say, another value, 105 or 56; it could be anything.
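To make the filter concrete, here is a small pure-Python sketch with the window and lookback as parameters. It assumes each symbol's closing prices are already loaded into a list ordered oldest to newest, and that "SMA 10 days ago" means the average of the window ending on that day; the function names and the sample data are made up for illustration.

```python
def sma(closes, window, end):
    """Simple moving average of the `window` closes ending at index `end` (inclusive)."""
    start = end - window + 1
    if start < 0:
        raise ValueError("not enough history for this window")
    return sum(closes[start:end + 1]) / window

def crossed_above(closes, window, lookback):
    """True if the close is above its SMA today and was below its SMA `lookback` days ago."""
    today = len(closes) - 1
    past = today - lookback
    return (closes[today] > sma(closes, window, today)
            and closes[past] < sma(closes, window, past))

def scan(history, window=100, lookback=10):
    """Return the symbols in `history` (symbol -> list of closes) that pass the filter."""
    need = window + lookback  # enough rows to compute the SMA at the earlier date
    return sorted(
        symbol
        for symbol, closes in history.items()
        if len(closes) >= need and crossed_above(closes, window, lookback)
    )

# Tiny made-up data set, scanned with a 3-day SMA and a 2-day lookback.
history = {
    "UP":    [10, 10, 10, 8, 9, 12],   # below its SMA 2 days ago, above it today
    "FLAT":  [10, 10, 10, 10, 10, 10], # never above its SMA
    "SHORT": [1, 2],                   # not enough history
}
hits = scan(history, window=3, lookback=2)
```

Since the window is just an argument, changing 100 to 105 or 56 needs no schema change; the cost is recomputing the averages on every scan.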
In my Database I have a table definition called EODData with a few columns, here is the definition below;
EODData
sSymbol nvarchar(6)
mOpen money
mClose money
mHigh money
mLow money
Date datetime
The table will hold 3 years of End Of Day Data for the American Stock Exchange so that is approximately 6,264,000 rows, no problem for MS SQL 2008 R2.
Now, I'm currently using Entity Framework to retrieve data from my database. However, what would be the best way to run or create my filter, given that the SMA must be calculated for each symbol (underlying stock ticker) every time a scan is performed, because the 100-day variable can change?
Should I convert from entity objects to a DataSet for in-memory filtering, etc.?
I've not worked with DataSets or DataTables much so I am looking for pointers.
Note that the SMA is just one of the filters, I have another algorithm that calculates the EMA (Exponential Moving Average, which is a much more complicated formula) and MACD (Moving Average Convergence Divergence).
Any opinions?

What about putting the calculations in the database? You have your EODData table, which is great. Create another table that is your SummaryData, something like:
SummaryData
stockSymbol varchar(6) -- don't need nvarchar, since AMEX doesn't have characters outside the normal English alphabet.
SMA decimal
MACD decimal
EMA decimal
Then you can write some stored procedures that run on close of day and update this one table based on the data in your EODData table. Or you could write a trigger so that each insert of the EODData table updates the summary data in your database.
One downside to this is that you're putting some business logic in the database. Another downside is that you will be updating statistical data on a stock symbol that you might not need to do. For example, if nobody ever wants to see what XYZZ did, then the calculation on it is pointless.
However, this second issue is mitigated by the fact that 1. you're running SPs on the server which MSSQL can optimize and 2. You can run these after hours when everyone is at home, so if it takes a little bit of time you're not affected. (To be honest, I'd assume if they're calculations like rolling averages, min/max etc, SQL won't be that slow.)
One upside is that your queries should be wicked fast, because you could write
select stockSymbol from SummaryData where SMA > 10
since you've already done the calculation.
Another upside is that the data will only change once per day (at the close of the business day) but you might be querying several times throughout the day. For example, you want to run several different queries today for all the data up to and including yesterday. If you run 10 queries, and your partner runs the same 10 queries, you've just done the same calculation over. (Essentially, write once, read many.)

History of records in lookup tables

I hope the term 'lookup table' is well chosen. What I mean is, for example, a rate table (lookup) with the following rates:
cheap: $15,-
medium: $30,-
expensive: $45,-
Our situation is that for a given entity (we call it a 'fault'; it is a malfunction of a device: air conditioner, elevator, crane, toilet, etc.) a contractor is hired to fix that device.
That contractor has these three (made-up) rates: cheap, medium and expensive.
When the contractor fixes the fault, he enters the hours worked and the rate ('expensive' when a senior has done the job, 'cheap' when a junior has).
Technically, we then add a FK from the Fault table to the Rates table.
So when the invoice has to be printed, we get the rate via the FK and the hours worked from the fault record.
The problem is that when the contractor changes his rates and you recalculate an old invoice months later, different amounts are calculated for the invoice because the rate record has changed.
So we have to construct some kind of history, and that's the question: how is that done?
What I've come up with is two different approaches, and the question is: is one of these a good one, or are there better ways?
1 Add valid-from and valid-until fields to the rate table, so that when you edit a value, you in fact create a new record with new valid dates. The downside is that you always have to fetch rates with a specific date in mind, which is not necessary for the current situation (the actual rate at this moment).
2 Don't put a FK from fault to rate; instead, when you set a rate on a fault, just copy the VALUE from rate to fault. The downside is that while the fault is still editable, editing the rate does not update the fault's rate. Also, when you edit a fault, you get a dropdown box with 3 values to choose from, none of which match the fault's current value.
Thanks already for reading this entire post!

I don't like #2; I never like replacing relationships with actual values (denormalizing) if I can help it. Also, it makes auditing a lot harder; if there's a weird value in for the rate, where did it come from?
The problem with #1, though, is that if for some reason you change the date of the invoice, it should probably still have the same rate that it had when it was originally created.
For these reasons, I'd recommend doing the part of #1 where a rate change always creates a new row, but then linking from each fault to the rate that was actually applied (i.e. rather than relying on the date to join to a rate, actually store a rate id with the fault).
One approach to finding the current rate is just to look for the one that has no end date. Or alternately, don't use end dates at all (the start date of the next rate is treated as the end date of the previous rate), and just sort by date and take the last one.
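A compact sketch of that recommendation, using Python's sqlite3 module as a stand-in. The column names, the dates, and the new price of 20 are made up for illustration: a rate change inserts a new row, each fault keeps the id of the rate row that was applied, and the current rate is the latest row by start date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rate (
        rate_id    INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,   -- 'cheap', 'medium', 'expensive'
        amount     INTEGER NOT NULL,   -- price per hour
        valid_from TEXT    NOT NULL    -- ISO date; no end date is stored
    );
    CREATE TABLE fault (
        fault_id INTEGER PRIMARY KEY,
        hours    INTEGER NOT NULL,
        rate_id  INTEGER NOT NULL REFERENCES rate(rate_id)  -- rate actually applied
    );
    INSERT INTO rate (name, amount, valid_from) VALUES ('cheap', 15, '2023-01-01');
""")

def current_rate_id(conn, name):
    """The current rate is the row with the latest start date for that name."""
    return conn.execute(
        "SELECT rate_id FROM rate WHERE name = ? "
        "ORDER BY valid_from DESC, rate_id DESC LIMIT 1",
        (name,),
    ).fetchone()[0]

def invoice_total(conn, fault_id):
    """Hours times the rate row the fault points at, not today's rate."""
    return conn.execute(
        "SELECT f.hours * r.amount FROM fault AS f "
        "JOIN rate AS r ON r.rate_id = f.rate_id WHERE f.fault_id = ?",
        (fault_id,),
    ).fetchone()[0]

with conn:
    # Fault 1 is billed at the rate in force when it was recorded.
    conn.execute("INSERT INTO fault VALUES (1, 4, ?)", (current_rate_id(conn, "cheap"),))
    # The rate changes: add a row instead of editing the old one.
    conn.execute("INSERT INTO rate (name, amount, valid_from) VALUES ('cheap', 20, '2024-01-01')")
    # Fault 2 picks up the new rate row.
    conn.execute("INSERT INTO fault VALUES (2, 4, ?)", (current_rate_id(conn, "cheap"),))

old_invoice = invoice_total(conn, 1)
new_invoice = invoice_total(conn, 2)
```

Reprinting the old invoice months later still yields the original amount, because the row it references is never updated.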

There was a good discussion of this over on Programmers.SE
How to Store Prices That Have Effective Dates
It's a well-known problem and using effective dates is the best way to do it.

I'd suggest keeping a table of contractor rates, ordered by date. When a contractor's rates change, instead of changing the existing rate add a new entry. When you need to get the current rate, sort by the timestamp descending and limit 1. Add the date entry for the current rate entry to each job record and then you can perform a simple join to get all the information at once.

Ratings for items based on user clicks

I am trying to implement a rating strategy for a hotel search website. Ratings are based on the number of clicks by users who view different hotels, and I assign a number of points to each click.
Suppose I have a POINTS column in the hotels table that increases with the number of clicks on each hotel. The problem I face is this: hotel1, introduced in week 1, gains a considerable number of points by week 2. If I introduce a new hotel2 in week 2, then even though it gains points at a competitive rate, it cannot easily catch up with hotel1 because of the difference in when they were introduced.
I thought of solving this by introducing a DAYS column, so that I can divide the POINTS of each hotel by its number of days and get a fair rating for each hotel.
I need help with how to get the number of days that have passed into the DAYS column after a new hotel is added to the database table.

It would probably be better to have a CreateDate column, and then in client-side code do something along the lines of:
int days = DateTime.Now.Subtract(hotel.CreateDate).Days;
This will cause fewer updates to your database too, as the date only needs to be set on create.
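The same idea, extended to the points-per-day figure the question is after, might look like this in Python. The function name and the choice to count a hotel added today as one day (to avoid dividing by zero) are assumptions for illustration.

```python
from datetime import date

def points_per_day(points, create_date, today):
    """Average points per day since the hotel was added."""
    # A hotel created today counts as one day, which avoids division by zero.
    days = max((today - create_date).days, 1)
    return points / days

today = date(2024, 1, 8)
older = points_per_day(70, date(2024, 1, 1), today)  # 70 points over 7 days
newer = points_per_day(5, today, today)              # 5 points on its first day
```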

I'm not going to go into the statistical theory of what you're doing, but let me state for the record that I think your stats are going to be misleading.
Be that as it may, just to achieve what you say you want to achieve, I would do what @Andy just said you should do. (Beat me to it!)

As I understand it, you are asking people to "vote" on the hotels, but the only possible votes are "yes" and "no vote".
Whether this is statistically valid will depend on when people are asked to vote. If every time someone visits a hotel, you ask him whether he was satisfied with his stay, and people consistently do this, I suppose it would be generally valid. But if the scoring system is such that users do not perceive a need to re-vote on a hotel that they have already voted on, then hotels that are on the list longer will see their ratings tend to sink. If it reached a point where every user who had visited a hotel has voted (or not), and no one saw a need to vote again, then that hotel's score would gradually sink.
Also, this system would be biased in favor of big hotels. If hotel A has 500 rooms and hotel B has 10 rooms, it would be very tough for hotel B to ever get as many votes as hotel A.
I think you'd be better to ask people to rate the hotel on some scale -- 1 to 5 stars or whatever -- and then present the average score. Probably along with the number of ratings, as people can probably figure out that if there's only one rating and it's the highest possible, that might be the owner rating his own hotel.

An alternative to calculating the days in code would be to use a computed column in the database (assuming by the sql tag you meant sql server). As the other posters have said, add a CreateDate column for the hotel and then add a computed column to return the date diff.
