I am writing a program that generates a single large table of information. The table is made up of two types of columns: columns that contain quantities and columns that contain properties. Quantities are numeric values that can be summed; properties are qualitative values that describe the quantities.
If the data is in a table in a database I can write a query that selects specific properties and quantities and sums the quantities that have the same value for the selected properties.
Example:
Table:
Quantity1  Quantity2  Quantity3  Property1  Property2  Property3
12         43         12         Red        Long       Rough
43         23         23         Blue       Short      Smooth
43         90         34         Red        Fat        Bumpy
Query:
SELECT SUM(Quantity1), SUM(Quantity2), Property1 FROM Table GROUP BY Property1
Result:
Quantity1  Quantity2  Property1
43         23         Blue
55         133        Red
What I want to do is give the user a graphical interface to do this without knowing how to write SQL queries, or any code for that matter. For example, a set of list boxes where they select the properties and quantities they want to view, and a table is displayed that shows the selected fields with the quantities summed. I may also later want to add the ability for the user to perform other SQL-like actions, such as filtering based on certain conditions. I also know I'll later need to be able to generate nice-looking reports based on these user queries.
I'm very new to ADO and .NET in general. But I'm thinking the best way to do this is to export my data into a System.Data.DataTable and then create an interface for the user to create a System.Data.DataView by generating a string for its RowFilter property. However, it's not obvious to me how I can not only filter and sort a DataTable, but also generate another table or view that contains only specific columns from the big master table.
Overall, does this sound like the best option, or is there another method I should consider? Does anyone have any specific tips or suggestions on how I should implement this? I was also wondering whether any of this would be made easier with LINQ.
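To make it concrete, here is roughly the kind of thing I'm picturing, using the column names from my example above (this is just a sketch of what I think LINQ over a DataTable would let me do, not working code I already have):

// using System.Data; using System.Linq; (needs a reference to System.Data.DataSetExtensions)
// masterTable is the big DataTable; the user picked Property1, Quantity1 and Quantity2.
DataTable BuildSummary(DataTable masterTable)
{
    var result = new DataTable();
    result.Columns.Add("Property1", typeof(string));
    result.Columns.Add("Quantity1", typeof(int));
    result.Columns.Add("Quantity2", typeof(int));

    // Group the rows by the selected property and sum the selected quantities.
    foreach (var g in masterTable.AsEnumerable().GroupBy(r => r.Field<string>("Property1")))
    {
        result.Rows.Add(g.Key,
                        g.Sum(r => r.Field<int>("Quantity1")),
                        g.Sum(r => r.Field<int>("Quantity2")));
    }
    return result;
}

For the column-subset part, it also looks like DataView.ToTable(true, columnNames) can project a filtered DataView into a new DataTable containing only the selected columns, though that alone doesn't do the summing.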
Update
I appreciate the suggestion of using Access or another available tool, but it's really not an option. Access is way too complicated for users here to try to figure out, and much more than I actually need. I'd always leave Access as an option for advanced users. But I would still like to set up a basic querying feature where the user selects the columns they want and the software automatically creates the view/query that selects and sums the appropriate columns.
Aside from being too complex, the other issue with Access is that there are too many clicks between changing something in my data structure and seeing the change in a report. I don't want the user to have to change something, re-export to Access, open another program, and then open the report to see the effect of their change.
Consider buying an off-the-shelf query tool rather than re-inventing the wheel. The cheapest ones that could do this sort of thing are MS Access or MS Query in Excel. More elaborately, you could use Report Builder (it comes free with SQL Server, if that's what your database runs on) or a third-party tool such as Business Objects or Brio.
If you can live without tight integration this is far easier than trying to build your own ad-hoc query tool.
I also strongly recommend off-the-shelf - especially early on. If it becomes apparent later on that the users really need you to write a custom solution, then by all means go for it. But this early on I don't think it will be worth the time and effort you will spend.
Related
I am writing a report in C# that will generate an SQL statement to call data in SAP. In SAP ABAP, there is a command, SELECT-OPTIONS, which automatically places a field on the screen that offers a number of different ways to input data. For example, if you wanted to query a customer master database, you could enter a single customer number, multiple customer numbers, or multiple ranges of customer numbers, and set criteria to include the customer numbers, exclude them, etc.
It is really nice functionality that users are asking me to duplicate but with a C# front end.
I am trying to replicate a portion of this functionality using lookup buttons, DataGridViews, internal lists, etc.
I was wondering if anyone has done anything similar, or if there is an existing custom class that does the equivalent.
You probably need to understand both SAP ABAP and C# to fully understand the question, as it is hard to explain without showing a lot of pictures and using a lot of words.
Thanks
Stephen
Most likely there is no generic finished product that will do it. In ABAP, this relies on the fact that a select-option is bound to a variable, data element and domain, which, in turn, has either a valid-values list (fixed or via a table) and/or various search helps. So if you need to enter an employee number, you will be able to select the number by name or by email or by department or other criteria. So basically, for each “type of object” that you want to enter there is some sort of input help that has intrinsic knowledge of the data being entered.
If you are only interested in an “input field” that can take an arbitrary number of the following kinds of input at the same time (without value-help dialogs):
include/exclude single values
include/exclude range (for sortable values) (42-50 or Bob-Mike)
include/exclude open ranges (>= 42)
include/exclude values by pattern (ash*)
Then: I never saw anything like that in any UI other than SAP's Dynpro or Web Dynpro.
In the end, you end up with a so-called range table, which has four values per line:
include/exclude
operation (equals, not equals, less than, between, etc)
value1
value2 (only relevant for operations like “between”)
So if you build a UI for that, the user will need to enter something which will end up in this construct.
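If you do end up building that UI yourself, the backing structure can be as simple as this (a minimal C# sketch; the type and member names are just illustrative, not an existing API):

// One line of a SAP-style range table (SIGN / OPTION / LOW / HIGH).
enum RangeSign { Include, Exclude }
enum RangeOption { Equal, NotEqual, LessThan, GreaterThan, Between, MatchesPattern }

class RangeLine
{
    public RangeSign Sign { get; set; }      // include/exclude
    public RangeOption Option { get; set; }  // equals, between, pattern, ...
    public string Low { get; set; }          // value1
    public string High { get; set; }         // value2, only used for Between
}

// The UI collects a List<RangeLine>, which you then translate into a WHERE
// clause for your generated SQL (or pass on to SAP as a real range table).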
Try ERPConnect from Theobald Software:
https://theobald-software.com/en/erpconnect/
I didn't find a mention of a SELECT-OPTIONS control in the brochures, but they claim to have a .NET API for core SAP/ABAP tools and interfaces, so you could give it a try.
I've been tasked with an enhancement to our order system that will require importing segmented GL account codes for assignment on individual line items of an order.
I need to support querying the codes by segment1, segment2, etc in order to load cascading dropdown boxes for assignment by the user. The GL codes will have one or more segments delimited by a character. An example of a code is "1010.1034001.99.01".
I've loaded several thousand codes into a table for testing where the entire string value exists in one column (delimited by a character). I've created two variations of functions that return rows where segment1 value is equal to some parameter. The query also supports further querying by providing additional parameters for other segment values.
I intend to support these queries from the table using Entity Framework 6, but I used SQL functions to get a feel for what the performance might be when the GL account codes are stored in one column. Performance was not as good as I had hoped.
Does anyone have recommendations on how best to store this data (there may be 200,000 codes)? Do you feel that I can query using EF and expect performant results?
Would a hierarchical organization make more sense for this data? Our team was hoping to store the delimited values in one column.
Thanks in advance.
If you used a table with three columns you could store the values in cascading fashion, making your queries a lot easier and probably faster. Why would your team want to store it in one column; what advantage does that have?
If you have:
ID
Code
ParentCodeId
where ID is a unique key and ParentCodeId is a nullable reference to that unique ID, you can split your example code as follows:
ID  Code     Parent
1   1010     null
2   1034001  1
3   99       2
4   01       3
By applying some logic when importing your codes, you can check whether a code already exists as a parent at the needed level so you don't have to repeat them; that way you could get all codes that start with 1010 by selecting on ParentCodeId 1.
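A rough sketch of how the cascading dropdowns could then be filled with Entity Framework 6 (the entity class and the GlCodeSegments DbSet are made-up names for illustration):

// Hypothetical entity mapped to the three-column table above.
public class GlCodeSegment
{
    public int Id { get; set; }
    public string Code { get; set; }
    public int? ParentCodeId { get; set; }
}

// First dropdown: the top-level segments (no parent).
var segment1Values = context.GlCodeSegments
    .Where(s => s.ParentCodeId == null)
    .OrderBy(s => s.Code)
    .ToList();

// When the user picks a value, the next dropdown is just that row's children.
var segment2Values = context.GlCodeSegments
    .Where(s => s.ParentCodeId == selectedSegment1.Id)
    .OrderBy(s => s.Code)
    .ToList();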
Looking for examples/tutorials for custom user fields, not via EAV.
EAV is going to be problematic for various reasons such as performance
there are many base entities/tables with over 100000 records each
there will likely be over a dozen attributes
the records are to be displayed in a flat UI grid, including custom fields, so flattening them while maintaining performance would be an issue
Looking at enabling this via DDL where all custom fields would go into a matching table such as
<tablename>_custom_<userid>
and each user attribute would map to its own column, with all their metadata stored in a metadata table
the retrieval would be simpler where the query would simply be
select *
from <tablename> A, tableName_custom_userid B
where B.KeyField = A.KeyField --( perhaps using outer join, haven't gone that far yet )
Wondering if there are any gotchas down the road that I need to be aware of?
Of course, any samples/pointers would be helpful to kick-start the effort.
Specifically, I would appreciate any advice on using DDL with SQL Server Compact 4.
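To make the DDL side concrete, what I'm picturing is just issuing the statements from code, roughly like this (the table, column and method names are only placeholders for illustration):

// using System.Data.SqlServerCe;
// Create the per-user custom table, then add one column per user-defined attribute.
void CreateCustomTable(string connectionString, string tableName, string userId)
{
    string customTable = tableName + "_custom_" + userId;
    using (var conn = new SqlCeConnection(connectionString))
    {
        conn.Open();
        new SqlCeCommand(
            "CREATE TABLE [" + customTable + "] (KeyField int NOT NULL PRIMARY KEY)",
            conn).ExecuteNonQuery();

        // Later, when the user defines a new attribute (its metadata lives in the metadata table):
        new SqlCeCommand(
            "ALTER TABLE [" + customTable + "] ADD [MyCustomField] nvarchar(100) NULL",
            conn).ExecuteNonQuery();
    }
}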
One technique I have seen used is to use a sort of 'hard-coded' EAV pattern. Don't hang up! It worked well with the dataset sizes you were talking about and didn't actually use EAV - it was only EAV-esque.
The idea is to have a set of tables to store these custom attributes in, with some triggers (described below) on them. The custom attribute tables store metadata about each attribute (what table it goes with, data type, constraints, etc.). You can get very fancy with this, but I did not have the need.
The triggers on your meta-tables are there to re-generate views that roll up base + extension into first class objects within the DB. So instead of a person table + employee extension table, you have an employee view that includes both. When you drop a new value into the custom attributes tables, the triggers re-roll the views and include the new stuff. If you wanted to go nuts, you could also have the triggers re-write stored procedures. Depending on how your mid-tier code is structured, you would still be forced to re-code some of it; however, this would be the case anyway if you apply rules that read the data.
In testing, I found that for the relatively small # of records you're talking about, performance was somewhat slower but followed roughly the same pattern of degradation (2x the number of records, ~2x as slow).
-- edits --
How I saw it done: you had a table that represented your first class objects, so a row for 'person' and a row for 'employee,' etc. We'll call that FCO. Then you had a secondary table that stored which tables represented each FCO. We'll call that Srcs. For Person, there would be one row, the person table. For Employee, there would be two rows, the person table and the employee extension. There is a third table, called Attribs, which stores the columns from the tables that constitute the FCO. For simplicity, we'll say Person has ID, Name and Address, and Employee adds Hire Date and Department, plus obviously a PersonID referring back to the Person table. So: 2 rows in the FCO table (person and employee), 3 rows in the Srcs table, 8 rows in Attribs.
The view, we'll call it vw_Employee, selects PersonID, Name, Address, Hire Date, Department from the two tables. It is built by a SQL stored procedure we'll call OnMetadataChange.
This SP is fired (by trigger or batch process), and its purpose is to generate the CREATE VIEW statements. It iterates through every First Class Object, collects which fields from which tables constitute the view, and issues a CREATE statement based on that. So OnMetadataChange produces a DROP and a CREATE for each view; it generates a dynamic SQL statement that is executed once per entry in the FCO table. It is preferable to do this with triggers, but not necessary. Hopefully your FCO definitions won't change too often, and when they do, there will probably be a code release as well. You can run your OnMetadataChange SP at that time.
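To make the shape of that concrete, here is a rough sketch of the regeneration step, written here from C# rather than as the stored procedure itself (FCO/Srcs/Attribs are the illustrative names from this example, and BuildFromClause is only a placeholder for the code that joins the Srcs tables):

// using System.Data; using System.Data.SqlClient; using System.Linq;
// Rebuild one view per First Class Object from the metadata tables.
void RebuildViews(SqlConnection connection)
{
    var fcos = new DataTable();
    new SqlDataAdapter("SELECT FcoId, FcoName FROM FCO", connection).Fill(fcos);

    foreach (DataRow fco in fcos.Rows)
    {
        // Attribs lists which column of which source table belongs to this FCO.
        var attribs = new DataTable();
        new SqlDataAdapter("SELECT TableName, ColumnName FROM Attribs WHERE FcoId = "
            + fco["FcoId"], connection).Fill(attribs);

        string columns = string.Join(", ", attribs.Rows.Cast<DataRow>()
            .Select(a => a["TableName"] + "." + a["ColumnName"]));
        string fromClause = BuildFromClause(fco);   // placeholder: joins the Srcs tables for this FCO

        string viewName = "vw_" + fco["FcoName"];
        new SqlCommand("IF OBJECT_ID('" + viewName + "') IS NOT NULL DROP VIEW " + viewName,
                       connection).ExecuteNonQuery();
        new SqlCommand("CREATE VIEW " + viewName + " AS SELECT " + columns + " FROM " + fromClause,
                       connection).ExecuteNonQuery();
    }
}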
The end result is a 2-layer database. The views constitute the First Class Object layer, which is meaningful to the application. The application only uses views. The tables constitute the 'physical' layer, which the application shouldn't care about. The meta-tables are essentially your mapping between the FCO layer and the physical layer. It takes some time to set it up, but it's quite effective, and gives you many of the benefits of EAV, while at the same time giving you the concrete benefits of 3NF tables (indexability, etc.).
If you'd like I can throw some sample SQL out there.
Part of the problem you are having is that you are trying to store schema-less data in a SQL database, which is not its strength. There are three approaches that would make your life far easier:
1) Have a column which stores the serialized custom fields, in whatever format is most convenient. For example, this column could store XML. Upsides are that you can use SQL Server Compact and pulling back a record is trivial. Downsides are that you always have to pull/push the entire XML blob to do an update, and it is difficult or impossible to query on any custom fields.
2) Upgrade to SQL Server Express, and use XML columns. This is nearly the same as the first suggestion, except that the full versions of SQL Server (Express included) have native support for XML data. These columns can have indexes added, and fields within the data can be used in queries.
3) Use a Schema-less Database, like MongoDB or CouchDB. These databases are all about storing schemaless data, so your custom fields will be no different than any other field. As such, you can index and query custom fields. Upsides are that custom data is incredibly easy to work with, downsides are that you would have to spend some time rethinking how you store data to fit within their model.
If you do not need to query based on custom fields, or if you can query custom fields within business logic, then the first option can work for you. In any other case, I would err towards something with more capabilities than compact. If cost is the deciding factor, both SQL Server Express and MongoDB are free.
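To give a feel for option 2, a query against a custom field stored in an XML column might look roughly like this from C# (the Widgets table, the CustomFields column and the element names are invented for the example):

// using System.Data.SqlClient; 'connection' is an open SqlConnection.
// CustomFields holds e.g. <fields><color>red</color><size>42</size></fields>
string sql =
    "SELECT Id, Name FROM Widgets " +
    "WHERE CustomFields.value('(/fields/color)[1]', 'varchar(50)') = @color";

using (var cmd = new SqlCommand(sql, connection))
{
    cmd.Parameters.AddWithValue("@color", "red");
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
            Console.WriteLine(reader["Name"]);   // each record whose custom color is red
    }
}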
I'm developing a tax calculation system that applies various taxes based on a set of supplied criteria.
The information frequently changes, so I'm trying to create a way to store all these logic rules in the database.
As you can imagine, there is a lot of compound logic involved in applying taxes.
For example, a tax might only apply if A is true, B is less than 100, and C equals 7.
My current design is terrible.
I have a few database columns for very common criteria filtering, such as location and tax year.
For more complex logic, I have a column that holds JavaScript, and in code, I run an interpreter to filter the results. Performance and maintainability suck.
I'd like to improve this design by making the logic entirely data-driven, but I'm having trouble figuring out how to correctly represent this logic within a relational database. What is a good way to model this logic in the database?
I have worked on a similar issue for over a year now, for a manufacturing cost generation application. It takes in loads of product design data and, based on the design and other inventory considerations such as quantity, bulk purchase options, part supplier, electrical ratings, etc., produces a list of direct materials, labour and costs.
I knew from the outset that what I needed was some kind of query language rather than a computational one, and that it had to be scripted, not compiled. But I have yet to find a perfect solution:
METHOD 1 - SQL
I created tables that represent my objects and columns that represent properties, and then manually typed all the required SQL SELECT statements into an item_rules table. What I did was to first save the object into the database, and then:
// Pseudocode made concrete (ADO.NET); 'connection' is an open SqlConnection.
var rules = new DataTable();
new SqlDataAdapter("SELECT * FROM item_rules", connection).Fill(rules);
foreach (DataRow rule in rules.Rows)
{
    // Run the rule's stored SELECT against the object that was just saved.
    var cmd = new SqlCommand("SELECT COUNT(*) FROM (" + rule["select_statement"] + ") AS T1", connection);
    if ((int)cmd.ExecuteScalar() > 0)           // a positive count means the rule matched
        itemList.Add(rule["item_that_satisfy_rule"]);
}
What it does is take each rule in the item_rules table and run it against my object that is now in the tables, e.g. SELECT * FROM my_object WHERE A=5 AND B>10. If the rule picks it up, I get a positive count, and then I know I should include the corresponding rule item in my items list.
METHOD 2 - NCALC
Instead of storing the queries in SQL format, I found the NCalc open-source expression parsing library. NCalc takes a string expression and optional variables and computes a result. The string expressions can be stored as plain text on the filesystem.
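For example, a rule stored as plain text can be evaluated at runtime along these lines (a minimal sketch, quoting the NCalc API from memory, using the A/B/C criteria from the tax example above):

// using NCalc;
// Rule text loaded from a file (or a database column):
// "apply the tax if A is true, B is less than 100, and C equals 7"
string ruleText = "A = true and B < 100 and C = 7";

var expression = new Expression(ruleText);
expression.Parameters["A"] = true;
expression.Parameters["B"] = 42;
expression.Parameters["C"] = 7;

bool applies = (bool)expression.Evaluate();   // true for these sample inputs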
METHOD 3 - EXCEL
Excel is actually a very good piece of software for doing data lookups. You can create the formulas in Excel, feed data from your application into Excel, and then let Excel run the formulas to give you the results. The advantage is that many people know how to use Excel, so different people can maintain it.
But like I say, none of these are perfect for me. I am just sharing, and hopefully we can get better recommendations.
If you go with Jake's approach, you can use dynamic SQL too.
Let me first describe the situation. We host many Alumni events over the course of each year and provide online registration forms for each event. There is a large chunk of data that is common for each event:
An Event with dates, times, managers, internal billing info, etc.
A Registration record with info about the payment and total amount charged per form submission
Bio/Demographic and alumni data about the 1 or more attendees (name, address, degree, etc.)
We store all of the above data within columns in tables as you would expect.
The trouble comes with the 'extra' fields we are asked to put on the forms. Maybe it is a dinner and there is a Veggie or Carnivore option, perhaps there is lodging with bed or smoking options, or perhaps there is an optional transportation choice. There are tons of weird little "can you add this to the form?" types of requests we receive.
Currently, we JSONify any non-standard data and store it all in one column (per attendee) called 'extras'. We can read this data out in code but it is not well suited to querying. Our internal staff would like to generate a quick report on Veggie dinners needed for instance.
Other than creating a separate table for each form that holds the specific 'extra' data items, are there any other approaches that could make my life (and reporting) easier? Is anyone working in a similar environment?
This is actually one of the toughest problems to solve efficiently. The SQL Server Customer Advisory Team has dedicated a white paper to the topic, which I highly recommend you read: Best Practices for Semantic Data Modeling for Performance and Scalability.
You basically have 3 options:
semantic database (entity-attribute-value)
XML column
sparse columns
Each solution comes with ups and downs. Off the top of my head I'd say XML is probably the one that gives you the best balance of power and flexibility, but the optimal solution really depends on lots of factors: data set sizes, the frequency at which new attributes are created, the actual process (human operators) that creates, populates and uses these attributes, and not least your team's skill set (some might fare better with an EAV solution, some with an XML solution). If the attributes are created/managed under a central authority and adding new attributes is a reasonably rare event, then sparse columns may be a better answer.
Well you could also have the following db structure:
Have a table to store custom attributes
AttributeID
AttributeName
Have a mapping table between events and attributes with:
AttributeID
EventID
AttributeValue
This means you will be able to store custom information per event, and you will be able to reuse your attributes. You can include some metadata, such as
AttributeType
AllowBlankValue
on the attribute, to make it easier to handle afterwards (a rough sketch follows below).
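A rough sketch of what that looks like from C# (hypothetical EF-style classes and context; "DinnerChoice" and the event id are invented for the example):

// Hypothetical entities for the attribute table and the event/attribute mapping table.
public class EventAttribute
{
    public int AttributeId { get; set; }
    public string AttributeName { get; set; }
}

public class EventAttributeValue
{
    public int AttributeId { get; set; }
    public int EventId { get; set; }
    public string AttributeValue { get; set; }
}

// "How many Veggie dinners does event 42 need?" becomes an ordinary query:
int veggieCount =
    (from v in context.EventAttributeValues
     join a in context.EventAttributes on v.AttributeId equals a.AttributeId
     where v.EventId == 42
        && a.AttributeName == "DinnerChoice"
        && v.AttributeValue == "Veggie"
     select v).Count();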
Have you considered using XML instead of JSON? The difference: XML is supported by SQL Server (it has a dedicated data type) and has query integration ;)
Quick and dirty, but actually nice for querying: simply add new columns. It's not like the empty entries in the previous table should cost a lot.
A more databasy solution: you'll have something like an event ID in your table. You can link this to an n:m table connecting events to additional fields, and then store the additional field data in a table with additional_field_id, record_id (from the original table) and the actual value. Probably creates ugly queries, but seems politically correct in terms of database design.
I understand "NoSQL" (not only sql ;) databases like couchdb let you store arbitrary fields per record, but since you're already with SQL Server, I guess that's not an option.
This is the solution that we first proposed in the ASP.NET Forums (which later became Community Server), and a similar version of which the ASP.NET team built into the ASP.NET 2.0 Membership when they released it:
Property Bags on your domain objects
For example:
Event.Profile() or in your case, Event.Extras().
Basically, a property bag is a serialized collection of data stored as name/value pairs in a column (or columns). The ASP.NET 2.0 Membership went the route of storing names in a semicolon-delimited list, and values the same way:
Table: aspnet_Profile
Column: PropertyNames (separated by semi-colons, and has start index and end index)
Column: PropertyValues (separated by semi-colons, and only stores the string value)
The downside of that approach is that it is all strings and has to be parsed manually (though the membership system does it for you automatically).
More recently, my method has been to build FormCollection and NameValueCollection C# extension methods that automatically serialize the collections to an XML result. I store that XML in the table, in its own column associated with that entity. I also have a C# extension method on XElement that deserializes the data back into the collection at runtime.
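A minimal sketch of what such extension methods can look like (the names are illustrative, not the actual Community Server or ASP.NET code):

// using System.Collections.Specialized; using System.Linq; using System.Xml.Linq;
public static class PropertyBagExtensions
{
    // Serialize a name/value collection into an XML property bag.
    public static XElement ToXmlPropertyBag(this NameValueCollection extras)
    {
        return new XElement("extras",
            extras.AllKeys.Select(key =>
                new XElement("property", new XAttribute("name", key), extras[key])));
    }

    // Deserialize the XML column back into a collection at runtime.
    public static NameValueCollection ToNameValueCollection(this XElement bag)
    {
        var result = new NameValueCollection();
        foreach (var property in bag.Elements("property"))
            result.Add((string)property.Attribute("name"), property.Value);
        return result;
    }
}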
This gives you the power of actually querying those properties in the XML via SQL (though that can be slow - always flatten out your read-only data).
The final note is runtime querying: the general rule we follow is, if you are going to query a property of an entity in normal application logic, then you move that property to an actual column on the table and create the appropriate indexes. If that data will never be queried directly (for example, via LINQ-to-SQL or EF), then leave it in the XML property bag.
Property Bags gives you the power of extending your domain models however you like, without having to modify the db schema.