Best way of acquiring information from several database tables - c#

I have a medical database that keeps different types of data on patients: examinations, lab results, x-rays... each type of record lives in a separate table. I need to present this data in a single table to show the patient's history with a particular clinic.
My question: what is the best way to do it? Should I do a SELECT from each table where the patient ID matches, then keep the results in some artificial list-like structure ordered by date? Or is there a better way of doing this?
I'm using WPF and SQL Server 2008 for this app.

As others have said, JOIN is the way you'd normally do this. However, if there are multiple rows in one table for a patient then there's a chance you'll get data in some columns repeated across multiple rows, which often you don't want. In that case it's sometimes easier to use UNION or UNION ALL.
Let's say you have two tables, examinations and xrays, each with a PatientID, a Date and some extra details. You could combine them like this:
SELECT PatientID, ExamDate [Date], ExamResults [Details]
FROM examinations
WHERE PatientID = @patient
UNION ALL
SELECT PatientID, XrayDate [Date], XrayComments [Details]
FROM xrays
WHERE PatientID = @patient
Now you have one big result set with PatientID, Date and Details columns. I've found this handy for "merging" multiple tables with similar, but not identical, data.
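For completeness, here is a minimal ADO.NET sketch of running that merged query from C#. The connection string and patient ID come from the caller; everything else matches the example above, with an ORDER BY added so the history comes back chronologically:

using System;
using System.Data.SqlClient;

static void PrintHistory(string connectionString, int patientId)
{
    // Merge both tables into one chronological result set, as in the query above.
    const string sql = @"
        SELECT PatientID, ExamDate [Date], ExamResults [Details]
          FROM examinations
         WHERE PatientID = @patient
        UNION ALL
        SELECT PatientID, XrayDate [Date], XrayComments [Details]
          FROM xrays
         WHERE PatientID = @patient
         ORDER BY [Date]";   // the ORDER BY applies to the whole UNION

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        command.Parameters.AddWithValue("@patient", patientId);
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                Console.WriteLine("{0:d}  {1}",
                    reader.GetDateTime(reader.GetOrdinal("Date")),
                    reader.GetString(reader.GetOrdinal("Details")));
            }
        }
    }
}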

If this is something you're going to be doing often, I'd be tempted to create a denormalized view over all of the patient data (joining the appropriate tables) and index the appropriate column(s) in the view. Then use the appropriate method (stored procedure, etc.) to retrieve the data for a passed-in patient ID.
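For example, retrieval against such a view could look like the sketch below. The view name vwPatientHistory is hypothetical; the view itself (and its index on PatientID) would be created separately:

using System.Data;
using System.Data.SqlClient;

static DataTable GetPatientHistory(string connectionString, int patientId)
{
    // vwPatientHistory is a placeholder for the denormalized view described
    // above; indexing its PatientID column keeps this lookup cheap.
    const string sql =
        "SELECT * FROM dbo.vwPatientHistory WHERE PatientID = @patient ORDER BY [Date]";

    using (var connection = new SqlConnection(connectionString))
    using (var adapter = new SqlDataAdapter(sql, connection))
    {
        adapter.SelectCommand.Parameters.AddWithValue("@patient", patientId);
        var history = new DataTable();
        adapter.Fill(history);   // Fill opens and closes the connection itself
        return history;
    }
}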

Use a JOIN to get data from several tables.

You can use a join (can't remember which type exactly) to get all the records from each table for a specific patient. The way this works depends on your database design.

I'd do it with separate SELECT statements, since a simple JOIN probably won't do: some tables might have more than one row for the patient.
So I would retrieve multiple result sets into a simple DataSet, add a DataRelation, cache the object and query it down the line (by date, by exam type, subsets, ...).
The main point is that you have all the data handy, even cached if needed, in a structure which is easily queried and filtered.
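A minimal sketch of that approach, assuming a patients and an examinations table (the names are illustrative); both result sets come back in one round trip and are related in memory:

using System.Data;
using System.Data.SqlClient;

static DataSet LoadPatient(string connectionString, int patientId)
{
    // Two result sets in one round trip; table and column names are illustrative.
    const string sql = @"
        SELECT * FROM patients     WHERE PatientID = @patient;
        SELECT * FROM examinations WHERE PatientID = @patient;";

    var ds = new DataSet();
    using (var connection = new SqlConnection(connectionString))
    using (var adapter = new SqlDataAdapter(sql, connection))
    {
        adapter.SelectCommand.Parameters.AddWithValue("@patient", patientId);
        adapter.TableMappings.Add("Table", "Patient");        // first result set
        adapter.TableMappings.Add("Table1", "Examinations");  // second result set
        adapter.Fill(ds);
    }

    // Relate the cached tables so they can be navigated and filtered in memory.
    ds.Relations.Add("PatientExams",
        ds.Tables["Patient"].Columns["PatientID"],
        ds.Tables["Examinations"].Columns["PatientID"]);
    return ds;
}

From there, ds.Tables["Examinations"].Select(...) can filter and sort the cached data without touching the database again.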

Related

Migrating an Access multi-valued field column to C#

I am attempting to use the Microsoft.ACE.OLEDB.12.0 driver to read data from an Access database and came upon an odd situation: one of the columns in the Access database shows as a comma-delimited list of ids.
Wells
________
345,456,7
6,387
When I looked at the column definition in Access I thought it would say string, but it does not; it says number. So I guess it is storing an array of integers in a single column?
I'm having a tough time getting a data reader to pick this up.
using
var w = DB_Reader.GetValue(DB_Reader.GetOrdinal("Wells"));
results in the error
The provider could not determine the Object value. For example, the
row was just created, the default for the Object column was not
available, and the consumer had not yet set a new Object value.
Well, at the end of the day, you can think of the multi-value column as in fact a child table.
So, if you are looking to migrate a master and child table, then in YOUR database you need a relational set of tables to re-create what Access is doing behind the scenes.
So, let's take a multi-value example and query.
Say we have this SQL query in Access:
SELECT ID, Person_Name, FavorateColors FROM tPerson;
But "FavorateColors" is one of those MV columns. (And I should point out that, with the HUGE movement towards no-sql databases, they also often work this way, and the same goes for XML or JSON data. Be it XML, JSON or the Access multi-value feature, you need that child table if you are going to adopt a relational data model to represent this data.)
Ok, so we run the above query, and the output shows one row per person, with the multi-value colors displayed together in a single column.
In fact, when I used the lookup wizard, I picked a child table called tblColors.
But how can we explode the above query to dig out the data?
Change the above query to this:
SELECT ID, Person_Name, FavorateColors.Value FROM tPerson
Note how we added ".Value" after the MV column name. Now, when you run the query, you get the SAME result as if you had two tables and did a left join: as in any relational database, the parent table rows simply repeat for each child table value. The PK value and the rest of the row now repeat for each child MV value.
So you are quite free to query as per above - you get what amounts to a left-joined table, and of course the parent record repeats.
So, just like XML, JSON, or in fact any query or table of data with repeating parent rows and child rows, you are pretty much forced to write code to split out this data, or to re-normalize it. This is of course far more common when receiving, say, JSON/XML data, or data from an Excel sheet.
So you have to process out the child record data and create a relation for that data.
And thus our question becomes how to import JSON/XML/Excel data that really should have used two relational database tables.
So, assuming we want to process this data, you process it the same as any data you have that should have been two related tables in the first place.
It really depends on whether this is a one-time import, or something you have to do all the time.
If it is a one-time deal, then I would use Access and a make-table query based on the above query. You would in fact have to pluck up the PK ID from the child table. In the above there is a child table called tblColors - we are just missing the "junction" table in between that Access automatically created. The hidden tables are not exposed, so I would simply use a make-table query in Access and then add a FK column that is the PK value from tblColors.
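If the processing happens on the C# side instead, a sketch like the following reads the exploded query through OleDb and re-normalizes it in memory. The .Value syntax comes from the answer above; the table name (tWells, echoing the question's "Wells" column) and the int typing of the ids are assumptions:

using System.Collections.Generic;
using System.Data.OleDb;

static Dictionary<int, List<int>> LoadWellIds(string accessConnectionString)
{
    // Wells.Value explodes the multi-value column into one row per value,
    // so each parent ID arrives once per well id.
    const string sql = "SELECT ID, Wells.Value AS WellID FROM tWells";

    var wellsByRow = new Dictionary<int, List<int>>();
    using (var connection = new OleDbConnection(accessConnectionString))
    using (var command = new OleDbCommand(sql, connection))
    {
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                int id = reader.GetInt32(reader.GetOrdinal("ID"));
                List<int> wells;
                if (!wellsByRow.TryGetValue(id, out wells))
                {
                    wells = new List<int>();
                    wellsByRow[id] = wells;
                }
                // Parents with no child values come back as NULL - skip those.
                int valueOrdinal = reader.GetOrdinal("WellID");
                if (!reader.IsDBNull(valueOrdinal))
                    wells.Add(reader.GetInt32(valueOrdinal));
            }
        }
    }
    return wellsByRow;
}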

Best way to check if multiple records exist in database

I am creating an application that takes data from a text file containing sales data from the Amazon marketplace. The marketplace has items with different names compared to the data in our main database. The application accepts the text file as input and needs to check whether each item exists in our database. If an item is not present, I should offer the option to save it to a Master table, or to a Sub-item table mapped to a master item. My question: if the text file has 100+ items, should I hit the database each time to check whether the data exists there? Is there any better way of doing this so that we can minimize the database hits?
I have two options that I have used earlier:
Hit the database and check if the item exists in the table.
Fill the data into a DataTable and use DataTable.Select to check if it exists.
Can someone tell me the best way to do this? I have to check two tables (master table, sub-item table), maybe one at a time. Thanks.
Update:
@Downvoters, add a comment.
I am not asking how to check whether an item exists in the database. I just want to know the best way of doing that. Should I be hitting the database 1000 times if a file has 1000 items? That's my question.
The current query I use:
if exists (select * from [table] where itemname= [itemname] )
select 'True'
else
select 'False'
return
(From Chat)
I would create a stored procedure which takes a table-valued parameter of all the items that you want to check. You can then use a join (a couple of options here)* to return a result set of items and whether each one exists or not. TVPs can be passed from ADO.NET, as shown in the sketch below.
It will certainly handle the 100 to 1000 row range mentioned in your post. To be honest, I haven't used it in the 1M+ range.
In newer versions of SQL Server, I would prefer TVPs over an XML input parameter, as it is really quite cumbersome to pack the XML in your .NET code and then unpack it again in your sproc.
(*) Re joins: with the result set, you can either inner join the TVP to your items/product table and check in .NET which rows are missing, or you can do a left outer join with the TVP as the left table and, e.g., ISNULL() missing items to 0/'false', etc.
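A sketch of the TVP call from ADO.NET; the table type, stored procedure and column names are illustrative, and the proc body would implement one of the joins described above (e.g. LEFT JOIN the TVP to the items table and ISNULL() the misses to 0):

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static DataTable CheckItemsExist(string connectionString, IEnumerable<string> itemNames)
{
    // The table type must exist in the database, created once with e.g.:
    //   CREATE TYPE dbo.ItemNameList AS TABLE (ItemName nvarchar(200) PRIMARY KEY);
    // Type, proc and column names here are illustrative.
    var tvp = new DataTable();
    tvp.Columns.Add("ItemName", typeof(string));
    foreach (var name in itemNames)
        tvp.Rows.Add(name);

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("dbo.usp_CheckItemsExist", connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        var parameter = command.Parameters.AddWithValue("@Items", tvp);
        parameter.SqlDbType = SqlDbType.Structured;
        parameter.TypeName = "dbo.ItemNameList";

        connection.Open();
        var results = new DataTable();   // e.g. one row per input item plus an exists flag
        results.Load(command.ExecuteReader());
        return results;
    }
}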
Send the items to the database in batches of 100. A stored procedure will probably help, since repetitive queries have to be fired. If the data does not change frequently, you can consider caching. I assume you will be making service calls from your .NET application, so send XML to the back end in batches, and consider increasing the batch size based on the file size.
If your entire application is local, the batch size may be very high, as there is no network overhead, but still don't make 100 separate calls to the database. A simple batching helper is sketched below.
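For illustration, a tiny helper that slices the parsed file into batches (100 here, per the suggestion above); each batch can then be sent in a single call, e.g. via the TVP approach from the previous answer:

using System;
using System.Collections.Generic;

static IEnumerable<List<string>> Batch(List<string> items, int size)
{
    // Yield consecutive slices of the input; the last batch may be smaller.
    for (int i = 0; i < items.Count; i += size)
        yield return items.GetRange(i, Math.Min(size, items.Count - i));
}

// Usage: foreach (var batch in Batch(itemNames, 100)) CheckItemsExist(connStr, batch);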
Try like this (note: SELECT EXISTS(...) is MySQL syntax; the SQL Server equivalent would be SELECT CASE WHEN EXISTS(...) THEN 1 ELSE 0 END):
SELECT EXISTS(SELECT * FROM table1 WHERE itemname = [itemname])
SELECT EXISTS(SELECT 1 FROM table1 WHERE itemname = [itemname])

Database structure, Users + User Types where Users can be of more than one Type

I currently have a User table, tblUser and a User Types table, tblUserTypes.
The two are linked by means of a foreign key link in tblUser... fkUserTypeID.
Hence at the moment a user can be of only one type.
BUT, there are circumstances where the user can be of multiple types... say for example, a Customer as well as a Supplier.
The obvious solution to me is to create a new table in between tblUser and tblUserTypes, tblUser_UserTypes which is a bridging table:
[tblUser] ---< [tblUser_UserTypes] >--- [tblUserTypes]
BUT, I can see complexities arising from this... for example, when exporting a list of users joined onto their user types, a straightforward join means I'm going to end up with multiple rows for those users. It could be possible to bring each user record back to a single row using a PIVOT query perhaps? (more below on this)
Importing users into the system also seems problematic... I currently use BCP (Bulk Copy Program) to import users from a file directly into the user table. The import file contains a single "user type" field, which works in the existing model because each user can currently only be of one type. BUT, with multiple user types I can't see how a direct BCP into the user table could work.
Adding to the complexity is that user types are not currently fixed... the table tblUserTypes is dynamic: part of the system is to allow creation of any number of user types. However, there are some types of users that I need to know about to be able to define business logic at a higher level... e.g. "only allow users of type=x in this area"... so it has been suggested that the user types table carry a series of flags defining what kind of type each user type is (e.g. IsCustomer, IsSupplier).
This is feeling like an over-complicated mess and I'm losing sleep over how to move forward.
I would love to bring the user types back into tblUser and do away with the other two tables entirely... a series of checkboxes in the user table (e.g. IsCustomer, IsSupplier)... because that makes importing and exporting straightforward. BUT then the user types wouldn't be dynamic. Interestingly, though, the user types are not COMPLETELY dynamic... because, as mentioned above, there are some user types I need to know about when it comes to business logic.
Hmmm, should it be a hybrid of the two? Am I trying to squash two features into one? Perhaps I could have checkbox/boolean fields in the user table for the types that correlate to business logic (e.g. IsCustomer, IsSupplier) and rename the context of the "User Types" to "User Groups" or something like that.
A major concern for me is the impact on importing, exporting and search results with a structure where a straightforward join will replicate users... one row for each user type they belong to. I would have to do a PIVOT query to bring this back to one record per user, with a column for each user type, wouldn't I? A realistic example is a User table with 3 million records and importing 10,000 records at a time... or exporting 10,000 records at a time... or searching across those 3 million records to retrieve 3,000 matches and having that rendered on a web page in a paginated fashion where users can flick through the search result pages (I use ROW_NUMBER() in my search query to handle pagination, so I don't return the whole lot every time).
This is my first question on Stack Overflow, I'm sorry if it's a bit convoluted or there are already answers listed... I tried to search but couldn't come up with examples handling the complexities of working with Users that can be of multiple Types.
Oh, in case it matters... this is a C# ASP.NET application working with SQL Server.
After thinking it through and reading the responses, I'm going to go all the way and use the bridging table... the requirements say that users can be of multiple types, so that's how it will be. The consequences for existing code are dramatic, but better now than down the track.
I played around with the table structure, and the queries required to get the data out in a flat structure are a bit fiddly, ultimately requiring dynamic SQL (because the list of user types is dynamic), which I'm not a fan of, but I can't see another way to do it.
In the examples below companies fetched are filtered by an 'Event ID' i.e. fkEventID
If there is a better way to do the 'flattening' I would be very appreciative of any help :-)
Straightforward join (multiple rows per company if they are of more than one type)
select * from tblCompany
left join tblCompany_CompanyType on fkCompanyID = pkCompanyID
left join tblCompanyType on fkCompanyTypeID = pkCompanyTypeID
where tblCompany.fkEventID = 1
Hard-coded pivot query (a single row per company even if they are of more than one type, but the company types are not dynamic)
select * from (
select tblCompany.*,tblCompanyType.CompanyType from tblCompany left join
tblCompany_CompanyType on fkCompanyID = pkCompanyID
left join tblCompanyType on fkCompanyTypeID = pkCompanyTypeID
where tblCompany.fkEventID = 1
) AS sourcequery
Pivot (count(CompanyType) for CompanyType IN ([Customer],[Supplier],[Something Else])) as CompanyTypeName
Dynamic pivot query (a single row per company, and handles dynamic company types)
DECLARE @cols AS NVARCHAR(MAX)
DECLARE @sql AS NVARCHAR(MAX)
SET @cols = STUFF(
(SELECT N',' + QUOTENAME(CompanyType) AS [text()]
FROM (
select CompanyType from tblCompanyType
where fkEventID = 1
) AS Y
FOR XML PATH('')),
1, 1, N'');
SET @sql = N'SELECT * FROM (
select tblCompany.*,tblCompanyType.CompanyType from tblCompany left join tblCompany_CompanyType on fkCompanyID = pkCompanyID
left join tblCompanyType on fkCompanyTypeID = pkCompanyTypeID
where tblCompany.fkEventID = 1
) AS sourcequery
Pivot (count(CompanyType) for CompanyType IN (' + @cols + ')) as CompanyTypeName
order by pkCompanyID'
EXEC sp_executesql @sql;
You truly do have a many to many relationship between users and user types, and I suggest you go ahead and implement it that way.
If you have a need to see it flattened out in some instances, you can accommodate that with a view or stored procedure.
If you want to continue to import using BCP, you can always BCP into a staging table and then use a stored proc to fill out your three tables. It's probably safer to do it that way anyway; a sketch of the equivalent from .NET follows below.
Fully implementing the many-to-many relationship will give you the most flexibility in your app, and will prevent you from needing to continually modify your user table as you get new requirements for new security roles.
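For reference, SqlBulkCopy is the ADO.NET counterpart of BCP, so the staging-table import could be sketched like this; the staging table and stored procedure names are illustrative, and the proc would INSERT from staging into tblUser, tblUserTypes and tblUser_UserTypes:

using System.Data;
using System.Data.SqlClient;

static void ImportUsers(string connectionString, DataTable parsedUsers)
{
    // BCP-style bulk load into a staging table, then one proc call to split
    // the rows across the three user tables.
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var bulk = new SqlBulkCopy(connection))
        {
            bulk.DestinationTableName = "dbo.stgUserImport";
            bulk.WriteToServer(parsedUsers);
        }
        using (var command = new SqlCommand("dbo.usp_ProcessUserImport", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.ExecuteNonQuery();
        }
    }
}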

Which approach is better to retrieve data from a database

I am confused about choosing between two approaches.
Scenario
There are two tables, Table 1 and Table 2. Table 1 contains users' data, for example first name, last name, etc.
Table 2 contains each user's cars with their descriptions, i.e. color, registration no, etc.
Now, if I want all the information for all users, which approach completes in the minimum time?
Approach 1.
Query all rows in Table 1 and store them in a list.
Then loop through the list and, for each user saved in the first step, query Table 2 for that user's data.
Approach 2
Query for all rows and, while saving each row, get all its values from Table 2 and save them too.
If I think about the system's processing, I suspect they might be the same, because the same number of records is processed in both approaches.
If there is any other better idea, please let me know.
Your two approaches will have about the same performance (slow because of N+1 queries). It would be faster to do a single query like this:
select *
from T1
left join T2 on ...
order by T1.PrimaryKey
Your client app can then interpret the results and have all the data from a single query. An alternative would be:
select *, 1 as Tag
from T1
union all
select *, 2 as Tag
from T2
order by T1.PrimaryKey, Tag
This is just pseudo code but you could make it work.
The union-all query will have surprisingly good performance because SQL Server will do a "merge union", which works like a merge join. This pattern also works for multi-level parent-child relationships, although not as well.
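One way to make the pseudo code concrete is to give both halves of the union an explicit, shared column list; the client can then split the interleaved rows by the Tag column. The table and column names below are illustrative:

using System;
using System.Data.SqlClient;

static void PrintUsersWithCars(string connectionString)
{
    // Both halves project the same columns, so the UNION ALL is valid and the
    // ORDER BY interleaves each user (Tag = 1) with their cars (Tag = 2).
    const string sql = @"
        SELECT UserID, 1 AS Tag, FirstName      AS Payload FROM T1
        UNION ALL
        SELECT UserID, 2 AS Tag, RegistrationNo AS Payload FROM T2
        ORDER BY UserID, Tag";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                int tag = reader.GetInt32(1);
                string prefix = tag == 1 ? "User: " : "  Car: ";
                Console.WriteLine(prefix + reader.GetString(2));
            }
        }
    }
}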

How to read the result of SELECT * from joined tables with duplicate column names in .NET

I am a PHP/MySQL developer slowly venturing into the realm of C#/SQL Server, and I am having a problem in C# when it comes to reading a SQL Server query that joins two tables.
Given the two tables:
TableA:
int:id
VARCHAR(50):name
int:b_id
TableB:
int:id
VARCHAR(50):name
And given the query
SELECT * FROM TableA,TableB WHERE TableA.b_id = TableB.id;
Now in C# I normally read query data in the following fashion:
SqlDataReader data_reader= sql_command.ExecuteReader();
data_reader["Field"];
Except in this case I need to differentiate from TableA's name column, and TableB's name column.
In PHP I would simply ask for the field "TableA.name" or "TableB.name" accordingly but when I try something like
data_reader["TableB.name"];
in C#, my code errors out.
How can I fix this? And how can I read a query on multiple tables in C#?
The result set only sees the returned data/column names, not the underlying table. Change your query to something like
SELECT TableA.Name as Name_TA, TableB.Name as Name_TB from ...
Then you can refer to the fields like this:
data_reader["Name_TA"];
To those posting that it is wrong to use "SELECT *": I strongly disagree with you. There are many real-world cases where a SELECT * is necessary. Absolute statements about its "wrong" use may lead someone astray from what is a legitimate solution.
The problem here does not lie with the use of SELECT *, but with a constraint in ADO.NET.
As the OP points out, in PHP you can index a data row via the "TABLE.COLUMN" syntax, which is also how raw SQL handles column name conflicts:
SELECT table1.ID, table2.ID FROM table1, table2;
Why DataReader is not implemented this way I do not know...
That said, a workable solution is to build your SQL statement dynamically by:
querying the schema of the tables you're selecting from
building your SELECT clause by iterating through the column names in the schema
In this way you can build a query like the following (see the sketch after it) without having to know which columns currently exist in the schema for the tables you're selecting from:
SELECT TableA.Name as Name_TA, TableB.Name as Name_TB from ...
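A sketch of that idea, reading column names from INFORMATION_SCHEMA and aliasing each one as Table_Column; the FROM/JOIN clause is left to the caller, and the naming scheme is just one choice:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static string BuildAliasedSelect(string connectionString, params string[] tables)
{
    // Builds "SELECT TableA.name AS TableA_name, ..." so no column name has
    // to be hard-coded and no two output columns can collide.
    var columns = new List<string>();
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(
        @"SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS
          WHERE TABLE_NAME = @table ORDER BY ORDINAL_POSITION", connection))
    {
        var tableParam = command.Parameters.Add("@table", SqlDbType.NVarChar, 128);
        connection.Open();
        foreach (var table in tables)
        {
            tableParam.Value = table;
            using (var reader = command.ExecuteReader())
                while (reader.Read())
                    columns.Add(string.Format("{0}.{1} AS {0}_{1}",
                        reader.GetString(0), reader.GetString(1)));
        }
    }
    return "SELECT " + string.Join(", ", columns);
}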
You could try reading the values by index (a number) rather than by key.
name = data_reader[4];
Rather than experimenting, you can list the columns once to see how the numbers correspond (see the sketch below).
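For instance, a small helper that prints each ordinal alongside its column name, so the indexes don't have to be guessed:

using System;
using System.Data.SqlClient;

static void DumpColumns(SqlDataReader reader)
{
    // Print "ordinal: column name" for every column in the result set.
    for (int i = 0; i < reader.FieldCount; i++)
        Console.WriteLine("{0}: {1}", i, reader.GetName(i));
}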
Welcome to the real world. In the real world, we don't use "SELECT *". Specify which columns you want, from which tables, and with which alias, if required.
Although it is better to use a column list to remove duplicate columns, if for any reason you want SELECT *, then just use
rdr["duplicate_column_name"] (C# indexer syntax; rdr.Item("duplicate_column_name") in VB)
This will return the first column's value, and since the inner join puts the same value in both identical (joined-on) columns, this will accomplish the task.
Ideally, you should never have duplicate column names across a database schema, so if you can, rename your columns so they don't conflict.
That rule exists for this very situation: once you've done your join, it is just a new recordset, and generally the table names don't travel with it.
