So, I have recently been tasked with migrating an old flat-file scheduling system to a SQL database using C#.
The main issue I am finding is that the tasks (see snippet below) use lists of strings (created from GUIDs), which has made me unsure of how to structure the database.
public class Task
{
private string TaskID;
private string TaskName;
private string TaskDescription;
private bool IsComplete;
private DateTime EstimatedStartDate;
private DateTime ActualStartDate;
private DateTime EstimatedCompletionDate;
private DateTime ActualCompletionDate;
private string TeamLead;
private List<string> TeamMembers = new List<string>();
private TaskType TaskType;
private string ParentID;
private List<string> ChildIDs = new List<string>();
}
When it comes to SQL, I know that lists stuffed into a single cell are generally a no-no.
The real question is: should I keep everything in one table, where a query only needs the TaskID or ParentID to find the requested task, OR should I split it into a different table for each category in the system (it works with 4 different categories) and then, depending on the task's type and TaskID, choose the correct table to query for its children?
It helps if you define the problem domain more clearly, using a semi-formal syntax. Interpreting your code snippet, I think it boils down to the following:
A task is identified by TaskID.
A task has attributes name, description, etc.
A task has exactly one person, in the role "TeamLead".
A task has 0 or more persons, in the role "team member".
A task has exactly one type, selected from a collection of valid types.
A task may or may not have a relationship to another task, in the role "ParentTask".
A task has a relationship with 0 or more other tasks, in the relationship "childTask".
If this is true, you can see the relational model emerging.
In general, any relationship where you have "x..n" connections leads to a bridging table. In your case, that's "TeamMembers", with TaskID and PersonID as foreign keys. ChildTasks is a similar relationship.
In the case where there's "has exactly one", or "may have one", it's a foreign key. TeamLead and TaskType are examples.
There is absolutely no reason to create different tables for task type - the relational model encourages you to group similar things together, and distinguish them by data, rather than by structure.
Having multiple tables with the same structure and the same meaning is a strong antipattern: you have to modify all your queries to access the correct table (which gets especially complex if you want to summarize data from multiple categories), and you would have to modify the database structure whenever the set of possible categories changes. There is unlikely to be any measurable (let alone noticeable) performance difference.
In other words, never put data into the table name.
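A minimal sketch of what that model might look like as C# entity classes (the names `Person`, `TaskRecord`, and `TaskTeamMember` are my assumptions, not part of your code):

```csharp
// One row per person; referenced by the TeamLead foreign key and by the bridge table.
public class Person
{
    public string PersonId { get; set; }   // GUID string, matching the original design
    public string Name { get; set; }
}

// The single Task table: one-valued relationships become plain foreign-key columns.
public class TaskRecord
{
    public string TaskId { get; set; }
    public string TaskName { get; set; }
    public string TaskDescription { get; set; }
    public bool IsComplete { get; set; }
    public string TeamLeadId { get; set; }   // FK -> Person ("has exactly one")
    public string TaskTypeId { get; set; }   // FK -> a TaskType lookup table
    public string ParentId { get; set; }     // nullable FK -> TaskRecord ("may have one")
}

// Bridging table for the 0..n "team member" relationship.
public class TaskTeamMember
{
    public string TaskId { get; set; }     // FK -> TaskRecord
    public string PersonId { get; set; }   // FK -> Person
}
```

Note that with a single `ParentId` column you get the `ChildIDs` list for free: the children of a task are simply the rows whose `ParentId` equals its `TaskId`, so no separate child bridge table is needed unless a task can have several parents.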
Is a DataSet parent-child nested relation available in Blazor?
If yes, how do I apply or use it? Thanks.
What I'm trying to do is build something like a nested repeater, but in Blazor.
At its simplest you'd perhaps have a DB table pair, one Company and many Employee
class Company
{
    public int Id { get; set; }
    public string Name { get; set; }
    public ICollection<Employee> Employees { get; set; }
}
class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int CompanyId { get; set; }
    public Company Company { get; set; }
}
You'd get EFC to download them for you and fix them up into each Company with its related Employee details
protected override async Task OnInitializedAsync()
{
    var context = new YourdbContext();
    _companies = await context.Companies.Include(c => c.Employees).ToListAsync();
}
And this would be on a razor page like
@page "/someurl/here"

@foreach (var c in _companies)
{
    <h3>@c.Name</h3>
    @foreach (var e in c.Employees)
    {
        <div>@e.Name</div>
    }
}

@code {
    List<Company> _companies;
    // OnInitializedAsync here
}
This isn't intended to be a production grade example. In the fullness of time I'm sure you'd fill it out to more, maybe map the db entities to view models and enumerate those, make components per employee and company etc ..
..but as a basic dump we use something like Entity Framework Core, which understands the relationship between our objects and the relationship between our tables, and will download all the companies and all the employees and figure out which goes with which.
What used to be a data relation is now an object parent with a list of child objects inside. If you modify any object, EFC detects it and will persist the change (like DataRow row state and the adapter). If you clear the children collection or remove the parent, EFC will treat the related data according to its configured cascading-delete behavior, just as a data relation used to.
Further reading
Read https://learn.microsoft.com/en-us/aspnet/core/blazor/blazor-server-ef-core?view=aspnetcore-6.0 if you plan on injecting a context.
If you want to lower the bar for getting EFC up and running, I would genuinely consider creating the db first, installing EFCorePowerTools and reverse engineering the db into your code; what used to be a quick "make two tables, 5 columns, one db diagram and drag a relationship between them, like we did in a dataset" then becomes a fully set up context, with entity classes and properties all done in a few seconds; no typing of C# required
Henk's given a link to a good blog, there are loads of component libraries out there with table controls that can do funky "click the row to show the child" (I use MudBlazor and Blazorise)..
..this is just intended to be an absolute bare bones "here's how we behave like a repeater; we put loops that repeat the emitting of markup with context relevant data changing each pass of the loop" to introduce you to notions of how we might code in razor
I have no affiliation with any software mentioned
When I try to select some items, they come back with their related objects included, even though I did not Include those objects in my LINQ query.
public List<Institution> GetListWithCities(Expression<Func<Institution,bool>> filter = null)
{
using (var context = new DbContext())
{
return filter == null
? context.Set<Institution>()
.Include(x => x.City)
.ToList()
: context.Set<Institution>()
.Include(x => x.City)
.Where(filter)
.ToList();
}
}
[Table("Institution")]
public class Institution
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int CITY_ID { get; set; }
    public int RESPONSIBLE_INSTUTION_ID { get; set; }
    public virtual City City { get; set; }
    public virtual Institution ResponsibleInstution { get; set; }
}
I expect the result to include the city of the institution, but my method returns the city and the responsible institution, and it continues recursively.
People tend to use Include instead of Select even when they don't plan to use the functionality that Include gives, and so waste the processing power that Include costs.
In Entity Framework, always use Select to fetch data. Only use Include if you plan to update the included items.
One of the slower parts of a database query is the transport of the fetched data from the database management system to your local process. Hence it is wise to Select only those properties that you really plan to use.
Apparently your Institution is in exactly one City, namely the City that the foreign key (CityId?) is referring to. If Institution [10] is located in City [15], then Institution.CityId will have a value 15, equal to City.Id. So you are transferring this value twice.
using (var dbContext = new MyDbContext())
{
    IQueryable<Institution> filteredInstitutions = (filter == null)
        ? dbContext.Institutions
        : dbContext.Institutions.Where(filter);

    return filteredInstitutions.Select(institution => new Institution
    {
        // Select only the Institution properties that you actually plan to use:
        Id = institution.Id,
        Name = institution.Name,
        City = new City
        {
            Id = institution.City.Id,
            Name = institution.City.Name,
            // ... any other City properties you need
        }
        // not needed: you already know the value:
        // CityId = institution.City.Id,
    }).ToList();
}
Possible improvement
Apparently you chose to add a layer between Entity Framework and the users of your functions: although they use your functions, they don't really have to know that you use Entity Framework to access the database. This gives you the freedom to use SQL instead of Entity Framework. Hell, it even gives you the freedom to get rid of your database and use an XML file instead of a DBMS: your users won't know the difference, which is nice if you want to write unit tests.
Although you chose to separate the method you use to persist the data, you chose to expose your database layout, including foreign keys, to the outside world. This makes it more difficult to change your database in the future: your users would have to change as well.
Consider writing repository classes for Institution and City that only expose those properties that the users of your persistence layer really need. If people only query "some properties of institutions with some properties of the City in which they are located", or the other way round, "several properties of Cities with several properties of the Institutions located in these Cities", then they won't need the foreign keys.
The intermediate repository classes give you more freedom to change your database. Apart from that, they give you the freedom to hide certain properties from certain users.
For instance: suppose you add the possibility to delete an institution, but you don't want to immediately delete all information about it, for instance because that allows you to restore it if someone accidentally deletes the institution. You might add a nullable property ObsoleteDate.
Most people that query institutions don't want the obsolete ones. If you had an intermediate repository institution class that omitted ObsoleteDate, and all queries filtered out the Institutions that have a non-null ObsoleteDate, then to your users it would be as if an obsolete institution had been deleted from the database.
Only one user will need access to the ObsoleteDate: a cleanup task that every now and then deletes all Institutions that have been obsolete for a considerable time.
A third improvement of an intermediate repository class is that you can give different users access to the same data through different interfaces: some users can only query information about institutions, some are also allowed to change some data, while others are allowed to change other data. If you only give them an interface, they can break this by casting it back to the original Institution.
With separate repository classes, you have the possibility to give each of these users their own view of the data, and nothing more than that data.
The disadvantage of a repository pattern is that you have to think about the different users and create different query functions. The advantage is that a repository is easier to change and easier to test, and thus easier to keep bug free after future changes.
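A bare-bones sketch of that idea (all names here, such as `InstitutionDto` and `GetActive`, are hypothetical and only for illustration): the repository exposes a trimmed-down type without `ObsoleteDate` and filters out obsolete rows, so callers never see them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// What callers get to see: no foreign keys, no ObsoleteDate.
public class InstitutionDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string CityName { get; set; }
}

// Internal row shape, standing in for the database entity.
public record InstitutionRow(int Id, string Name, string CityName, DateTime? ObsoleteDate);

public class InstitutionRepository
{
    private readonly List<InstitutionRow> _rows;

    // In the real code the constructor would take a DbContext instead of a list.
    public InstitutionRepository(List<InstitutionRow> rows) => _rows = rows;

    // Obsolete institutions behave as if they had been deleted.
    public List<InstitutionDto> GetActive() =>
        _rows.Where(r => r.ObsoleteDate == null)
             .Select(r => new InstitutionDto { Id = r.Id, Name = r.Name, CityName = r.CityName })
             .ToList();
}
```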
I am new to NHibernate and am not sure if what I am asking makes sense.
I am trying to rewrite some code I currently have:
public IEnumerable<Order> GetByQueue(OrderStatus orderStatus, Queue queue)
{
var criteria = NHibernateSession.CreateCriteria(typeof (TaskDevice), "TaskDevice");
//Pull up all Tasks where a Task's TaskDevice's SourceSiteID or DestinationSiteID are represented in a Queue's QueueLocations.
foreach(QueueLocation queueLocation in queue.QueueLocations)
{
criteria.Add(
Expression.Disjunction()
.Add(Restrictions.Eq("OriginalLocationID", queueLocation.ComponentID))
.Add(Restrictions.Eq("LocationID", queueLocation.ComponentID))
);
}
//Get a hold on all the Tasks returned from TaskDevices.
List<Task> tasks = criteria.List<TaskDevice>().Select(taskDevice => taskDevice.Task).ToList();
//Return all Orders of the given Tasks whose OrderStatus matched the provided orderStatus.
return tasks.Where(task => task.Order.OrderStatus == orderStatus).Select(task => task.Order);
}
This code currently depends on a Queue object. I would like to change this code such that a queueID is provided instead of a Queue object. The table QueueLocation contains 'QueueID' for one of its columns.
This means that I now need to interact with another table in my database, QueueLocation, load the QueueLocation who has a QueueID matching the provided QueueID, and then emulate the adding of restrictions without iterating over a Queue object.
Task does not know of Queue and Queue does not know of Task. They are related by the fact that a Queue may contain a QueueLocation whose ComponentID matches a Task's OriginalLocationID or LocationID.
If I change my initial criteria declaration to:
var criteria = NHibernateSession
.CreateCriteria(typeof (TaskDevice), "TaskDevice")
.CreateCriteria("QueueLocation", "QueueLocation");
then an exception is generated indicating that NHibernate could not find a property QueueLocation on TaskDevice. This is a valid exception -- TaskDevice does not know of QueueLocation.
I am wondering how to load two non-related tables using NHibernate such that I may filter my restrictions fully through NHibernate in one query. Is this possible?
Criteria is not a good API for queries with entities that are not related in the model.
Use HQL instead.
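For example (a sketch only, assuming `TaskDevice` and `QueueLocation` are both mapped entities and the property names match your mappings): HQL allows a theta-style join between entities that have no mapped association, so the whole filter can run in one query.

```csharp
// Hypothetical HQL rewrite: find TaskDevices whose source or destination
// location matches any QueueLocation of the given queue, even though no
// association is mapped between the two entities.
var taskDevices = NHibernateSession
    .CreateQuery(@"select td
                   from TaskDevice td, QueueLocation ql
                   where ql.QueueID = :queueId
                     and (td.OriginalLocationID = ql.ComponentID
                          or td.LocationID = ql.ComponentID)")
    .SetParameter("queueId", queueId)
    .List<TaskDevice>();
```

From there you can follow each `TaskDevice` to its `Task` and filter on the order status as before.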
I want to display a bunch of different data objects that I have using WPF. These data objects vary in data. They have common properties, but they also have some differing properties. There will be a "master" class that references these entities. Each master class can have one of each of these data types.
I'm finding it difficult to design them even on a database level. Should I have one table per data object, thereby making it easy to get the data using NHibernate (just reference one of these entities). This makes it quite difficult to consume using WCF though. If I'm wanting to display this data in WPF, I'll probably need a collection of some variety, and that's what I don't have.
I could put all data types into the same table and have a multi-column unique constraint on the owner id and data type id. But then I may have null properties in my entities, and it would also be hard to display in the UI. It would also complicate editing the entities, as I'd have to be mindful of which properties the user can and can't edit.
I guess visually, the entities would look like this in the first way:
public class Master
{
    public int Id { get; set; }
    public DataType1 Data1 { get; set; }
    public DataType2 Data2 { get; set; }
}
public class DataType1
{
    public int Id { get; set; }
    public string SomeString { get; set; }
    public string AnotherString { get; set; }
}
public class DataType2
{
    public int Id { get; set; }
    public string SomeString { get; set; }
    public string DifferentString { get; set; }
}
And this in the second way:
public class Master
{
    public int Id { get; set; }
    public List<DataType> Types { get; set; }
}
public class DataType
{
    public int Id { get; set; }
    public string SomeString { get; set; }
    public string AnotherString { get; set; }
    public string DifferentString { get; set; }
}
So which would be the best way? Or is there a different way that's better than both (there probably is)?
It really depends on your business case; it is not so much an architectural issue. If you have a known number of DataTypes, use a static (one-to-one) reference for each (first example).
If you have an unknown or dynamic number of DataTypes, you have no option but to model the DataTypes as a list in your Master object (second example).
So, I'd love some feedback on the best way to design the classes and store the data for the following situation:
I have an interface called ITask that looks like this:
interface ITask
{
int ID{ get; set;}
string Title {get; set;}
string Description{get; set;}
}
I would like the ability to create different types of Tasks depending on who is using the application...for example:
public class SoftwareTask: ITask
{
//ITask Implementation
string BuildVersion {get; set;}
bool IsBug {get; set;}
}
public class SalesTask: ITask
{
//ITask Implementation
int AccountID {get; set;}
int SalesPersonID {get; set;}
}
So the way I see it I can create a Tasks table in the database with columns that match the ITask interface and a column that shoves all of the properties of more specific tasks in a single column (or maybe even serialize the task object into a single column)
OR
Create a table for each task type to store the properties that are unique to that type.
I really don't like either solution right now. I need to be able to create different types of Tasks ( or any other class) that all share a common core set of properties and methods through a base interface, but have the ability to store their unique properties in a fashion that is easy to search and filter against without having to create a bunch of database tables for each type.
I've starting looking into Plug-In architecture and the strategy pattern, but I don't see where either would address my problem with storing and accessing the data.
Any help or push in the right direction is greatly appreciated!!!
Your second approach (one table per type) is the canonical way to solve this problem - while it requires a bit more effort to implement it fits better with the relational model of most databases and preserves a consistent and cohesive representation of the data. The approach of using one table per concrete type works well, and is compatible with most ORM libraries (like EntityFramework and NHibernate).
There are, however, a couple of alternative approaches sometimes used when the number of subtypes is very large, or subtypes are created on the fly.
Alternative #1: The Key-Value extension table. This is a table with one row per additional field of data you wish to store, a foreign key back to the core table (Task), and a column that specifies what kind of field this is. Its structure is typically something like:
TaskExt Table
=================
TaskID : Number (foreign key back to Task)
FieldType : Number or String (this would be AccountID, SalesPersonID, etc)
FieldValue : String (this would be the value of the associated field)
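To see why querying this shape gets awkward, here is a small in-memory sketch (the names `TaskExtRow` and `FieldsFor` are made up for illustration) of pivoting one task's key-value rows back into fields; in SQL the same reshaping needs a join or PIVOT clause per field:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory mirror of the TaskExt key-value table.
public record TaskExtRow(int TaskId, string FieldType, string FieldValue);

public static class TaskExtPivot
{
    // Collect the extension fields of one task into a field-name -> value map.
    public static Dictionary<string, string> FieldsFor(IEnumerable<TaskExtRow> rows, int taskId) =>
        rows.Where(r => r.TaskId == taskId)
            .ToDictionary(r => r.FieldType, r => r.FieldValue);
}
```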
Alternative #2: The Type-Mapped Extension Table. In this alternative, you create a table with a bunch of nullable columns of different data types (numbers, strings, date/time, etc.) with names like DATA01, DATA02, DATA03 ... and so on. For each kind of Task, you select a subset of the columns and map them to particular fields. So, DATA01 may end up being the BuildVersion for a SoftwareTask and an AccountName for a SalesTask. In this approach, you must manage some metadata somewhere that controls which columns specific fields map to. A type-mapped table will often look something like:
TaskExt Table
=================
TaskID : Number (foreign key back to task)
Data01 : String
Data02 : String
Data03 : String
Data04 : String
Data05 : Number
Data06 : Number
Data07 : Number
Data08 : Number
Data09 : Date
Data10 : Date
Data11 : Date
Data12 : Date
// etc...
The main benefit of option #1 is that you can dynamically add as many different fields as you need, and you can even support a level of backward compatibility. A significant downside, however, is that even simple queries can become challenging because fields of the objects are pivoted into rows in the table. Unpivoting turns out to be an operation that is both complicated and often poorly performing.
The benefit of option #2 is that it's easy to implement and preserves a 1-to-1 correspondence between rows, making queries easy. Unfortunately, there are some downsides to this as well. The first is that the column names are completely uninformative, and you have to refer to some metadata dictionary to understand which column maps to which field for which type of task. The second downside is that most databases limit the number of columns on a table to a relatively small number (usually 50 - 300 columns). As a result, you can only have so many numeric, string, datetime, etc. columns available to use. So if your type ends up having more DateTime fields than the table supports, you have to either use string fields to store dates or create multiple extension tables.
Be forewarned, most ORM libraries do not provide built-in support for either of these modeling patterns.
You should probably take a lead from how ORMs deal with this: TPH (table per hierarchy), TPC (table per concrete type), and TPT (table per type).
Given that ITask is an interface, you should probably go for TPC (table per concrete type). If you make it a base class, TPT and TPH are also options.
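As a sketch of what TPC looks like in EF Core (this assumes EF Core 7 or later, where `UseTpcMappingStrategy` was introduced; with earlier versions or other ORMs the mapping syntax differs): the shared ITask members move into an abstract base class, and each concrete type gets its own table carrying a copy of the shared columns.

```csharp
using Microsoft.EntityFrameworkCore;

public abstract class TaskBase   // the shared ITask members as a mapped base class
{
    public int ID { get; set; }
    public string Title { get; set; }
    public string Description { get; set; }
}

public class SoftwareTask : TaskBase
{
    public string BuildVersion { get; set; }
    public bool IsBug { get; set; }
}

public class SalesTask : TaskBase
{
    public int AccountID { get; set; }
    public int SalesPersonID { get; set; }
}

public class TaskContext : DbContext
{
    public DbSet<TaskBase> Tasks { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // TPC: EF Core creates one table per concrete type (SoftwareTask, SalesTask),
        // each with its own copy of the ID/Title/Description columns.
        modelBuilder.Entity<TaskBase>().UseTpcMappingStrategy();
    }
}
```

Querying `context.Tasks` then transparently unions the per-type tables, so you can still filter across all task types in one query.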