I am working on a large project that contains many reference/lookup-type tables. This may not be the right place to ask, but I would like to know what name English speakers give to this kind of table:
The tables contain data such as status_code, status_type. They are preloaded and the data in them will probably never change.
Please excuse my not knowing this, but English is not my first language and I need to give a presentation about this kind of table.
People use different names (LookUpData, ReferenceData, StaticData, etc.) depending on the properties of the data. Does this answer your question? If not, you probably need to elaborate.
The English name I've encountered most often is 'static data'. This indicates that the data does not change very often.
Aside: one interesting aspect of static data is that it is a good candidate for caching.
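To illustrate the caching point, here is a minimal sketch in Python using the standard sqlite3 module. The `status` table, its rows, and the function names are hypothetical; the idea is only that a table which never changes can be read once and served from memory afterwards.

```python
import sqlite3
from functools import lru_cache

# Hypothetical in-memory database with a small, preloaded status table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status (status_code TEXT PRIMARY KEY, status_type TEXT)")
conn.executemany(
    "INSERT INTO status VALUES (?, ?)",
    [("A", "Active"), ("C", "Closed"), ("P", "Pending")],
)


@lru_cache(maxsize=None)
def load_statuses():
    """Read the whole table once; later calls return the cached dict."""
    rows = conn.execute("SELECT status_code, status_type FROM status").fetchall()
    return dict(rows)


def status_type(code):
    """Look up a status type without touching the database again."""
    return load_statuses()[code]


print(status_type("A"))
print(status_type("C"))
```

If the data does change occasionally, `load_statuses.cache_clear()` forces a reload on the next call.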
Where the tables contain data that's interesting to the application, but do not reflect the outside world - status codes, for instance - I tend to call them enumeration tables.
Where the table contains values that change rarely, but do reflect the outside world - cities, currencies, etc. - I tend to call them lookup tables.
Where tables contain "friendly" values - e.g. translations of ISO country/locale codes - I tend to call them reference tables.
We tend to use the term "Reference Data". But there isn't a standard term, it varies a lot in my experience. I haven't heard "Static Data" before, but I like it.
Related
I have built a CRUD app using C# which is used to input cases against employees.
I'm trying to figure out a way to link cases together if necessary.
For example, a big fight breaks out:
Tom was involved and has a case raised against him (caseID: 1)
Mark was also involved and has a case raised against him (caseID: 2)
Steve was also involved and has a case raised against him (caseID: 3)
(All the above stored in a single table)
As this was the same fight we want to link all the cases together.
How would you suggest I store this in another table?
Yes! You apparently need an "incident id" of some sort. This would be a different table with one row per incident. If the only incidents are fights, then this would be a fights table.
The incident id would then be related to each of the "cases" that you have.
Information about the incident would be in the incident table. That would probably include information such as type, date/time, location, and so on. In fact, what you are calling "cases" might simply be incidentParticipants or something like that.
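As a rough sketch of that layout, the following Python snippet (standard sqlite3 module) creates an incident table and links each case to it. All table names, column names, and sample values are made up for illustration; the real schema would follow the existing cases table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per incident (hypothetical columns).
CREATE TABLE incident (
    incident_id   INTEGER PRIMARY KEY,
    incident_type TEXT NOT NULL,
    occurred_at   TEXT NOT NULL,
    location      TEXT
);
-- One row per case; incident_id is NULL for cases that stand alone.
CREATE TABLE employee_case (
    case_id     INTEGER PRIMARY KEY,
    employee    TEXT NOT NULL,
    incident_id INTEGER REFERENCES incident(incident_id)
);
INSERT INTO incident VALUES (1, 'fight', '2020-01-15 14:30', 'warehouse');
INSERT INTO employee_case VALUES (1, 'Tom', 1), (2, 'Mark', 1), (3, 'Steve', 1);
INSERT INTO employee_case VALUES (4, 'Anna', NULL);
""")


def linked_cases(case_id):
    """Return the other cases that share an incident with the given case."""
    return conn.execute(
        """
        SELECT other.case_id, other.employee
        FROM employee_case AS mine
        JOIN employee_case AS other ON other.incident_id = mine.incident_id
        WHERE mine.case_id = ? AND other.case_id <> mine.case_id
        ORDER BY other.case_id
        """,
        (case_id,),
    ).fetchall()


print(linked_cases(1))
```

A case with no incident (Anna's here) simply has no linked cases, because a NULL incident_id never matches in the join.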
I want to use ML.NET multi-class classification in my current project, which collects error logs from one of my company's systems.
The point is to add tags to errors and, at some point in the future, train a model to predict and assign tags to incoming logs.
I'm using Model Builder and I can't see my table relations. I store all logs in one table, tags in another, and all relations in a third one.
|Logs| <-- |LogId|TagId| --> |Tags|
My goal is to predict the TagId column from the Logs table. Is that possible, or do I have to have everything in one table?
Generally speaking, machine learning algorithms work with fully 'denormalized' and 'prepared' data: every training example is a vector of floats ('features') plus one 'ground truth' value.
ML.NET helps with some typical pre-processing tasks, such as text featurization, one-hot encoding, and rescaling/normalization, but it provides almost no 'relational' functionality (no JOINs).
So, you should de-normalize / 'flatten' your data before you pass it to ML.NET.
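The flattening step can be done with a plain JOIN before the data reaches the training tool. Below is a small sketch in Python with sqlite3; the table names follow the question, but the columns `Message` and `Name`, the link table name `LogTags`, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Logs    (LogId INTEGER PRIMARY KEY, Message TEXT);
CREATE TABLE Tags    (TagId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE LogTags (LogId INTEGER, TagId INTEGER);  -- hypothetical link table name
INSERT INTO Logs VALUES (1, 'Timeout connecting to database'),
                        (2, 'Null reference in parser');
INSERT INTO Tags VALUES (10, 'infrastructure'), (20, 'bug');
INSERT INTO LogTags VALUES (1, 10), (2, 20);
""")

# One flat row per (log, tag) pair: the text is the feature, the tag is the label.
flat_rows = conn.execute(
    """
    SELECT l.Message, t.Name
    FROM Logs AS l
    JOIN LogTags AS lt ON lt.LogId = l.LogId
    JOIN Tags AS t ON t.TagId = lt.TagId
    ORDER BY l.LogId
    """
).fetchall()

for message, tag in flat_rows:
    print(f"{tag}\t{message}")
```

Note that a log with several tags produces several rows here. Multi-class classification expects one label per example, so that case would need a decision about which tag to keep.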
I am writing a report in C# that will generate an SQL statement to retrieve data from SAP. In SAP ABAP, there is a command "SELECT-OPTIONS" which automatically places a field on the screen that offers a number of different options for entering data. For example, if you wanted to query a customer master database, you could enter a single customer number, multiple customer numbers, or multiple ranges of customer numbers. You can also set criteria to include the customer numbers, exclude them, etc.
It is really nice functionality that users are asking me to duplicate but with a C# front end.
I am trying to replicate a portion of this functionality by using lookup buttons, DataGridViews, internal lists, etc.
I was wondering if anyone has done anything similar or if there is a custom class that already exists that does the equivalent.
You probably need to understand both SAP ABAP and C# to fully understand the question, as it is hard to explain without showing a lot of pictures and using a lot of words.
Thanks
Stephen
Most likely there is no generic finished product that will do it. In ABAP, this relies on the fact that a select-option is bound to a variable, data element and domain, which, in turn, has a valid-values list (fixed or via a table) and/or various search helps. So if you need to enter an employee number, you will be able to select the number by name or by email or by department or other criteria. So basically, for each "type of object" that you want to enter there is some sort of input help that has intrinsic knowledge of the entered data.
If you are only interested in an "input field" that is able to accept an arbitrary number of the following inputs at the same time (without value help dialogs):
include/exclude single values
include/exclude ranges (for sortable values) (42-50 or Bob-Mike)
include/exclude open ranges (>= 42)
include/exclude values by pattern (ash*)
then I have never seen anything like that in any UI other than SAP's Dynpro or Web Dynpro.
In the end, you end up with a so-called range table, which has four values per line:
include/exclude
operation (equals, not equals, less than, between, etc)
value1
value2 (only relevant for operations like “between”)
So if you build a UI for that, the user will need to enter something which will end up in this construct.
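To make the four-column construct concrete, here is a hedged Python sketch of a range table and a function that tests a value against it. The short codes ("I"/"E", "EQ", "BT", "CP", and so on) are modelled on SAP's sign/option/low/high convention as I understand it; the class and function names are invented, and pattern matching uses Python's glob-style fnmatch rather than ABAP's exact pattern rules.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Any, Optional


@dataclass(frozen=True)
class RangeRow:
    sign: str                   # "I" = include, "E" = exclude
    option: str                 # "EQ", "NE", "LT", "LE", "GT", "GE", "BT", "CP"
    low: Any                    # value1
    high: Optional[Any] = None  # value2, only used by "BT" (between)


def row_matches(row, value):
    """Check a single row's operation against the value, ignoring its sign."""
    if row.option == "EQ":
        return value == row.low
    if row.option == "NE":
        return value != row.low
    if row.option == "LT":
        return value < row.low
    if row.option == "LE":
        return value <= row.low
    if row.option == "GT":
        return value > row.low
    if row.option == "GE":
        return value >= row.low
    if row.option == "BT":
        return row.low <= value <= row.high
    if row.option == "CP":  # pattern, e.g. "ash*"
        return fnmatchcase(str(value), str(row.low))
    raise ValueError(f"unknown option: {row.option}")


def in_range(rows, value):
    """A value is selected if no exclude row matches it and either
    there are no include rows or at least one include row matches."""
    includes = [r for r in rows if r.sign == "I"]
    excludes = [r for r in rows if r.sign == "E"]
    if any(row_matches(r, value) for r in excludes):
        return False
    return not includes or any(row_matches(r, value) for r in includes)


rows = [RangeRow("I", "BT", 42, 50), RangeRow("E", "EQ", 45), RangeRow("I", "GE", 100)]
print([n for n in (41, 43, 45, 60, 150) if in_range(rows, n)])
```

The same row list could just as well be translated into a SQL WHERE clause; evaluating it in memory keeps the sketch short.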
Try ERPConnect from Theobald Software:
https://theobald-software.com/en/erpconnect/
I didn't find a mention of a SELECT-OPTIONS control in the brochures, but they claim to have a .NET API for core SAP/ABAP tools and interfaces, so you could give it a try.
Scenario
I'm parsing emails and inserting them into a database using an ORM (NHibernate, to be exact). While my current approach does technically work, I'm not very fond of it but can't think of a better solution. Each email contains ~50 fields, is sent from a third party, and looks like this (obviously a very short dummy sample).
Field #1: Value 1 Field #2: Value 2
Field #3: Value 3 Field #4: Value 4 Field #5: Value 5
Problem
My problem is that with parsing this many fields the database table is an absolute monster. I can't create proper models employing any kind of relationships either AFAIK because each email sent is all static data and doesn't rely on any other sources.
The only idea I have is to find commonalities between the fields and split them into more manageable chunks, say ~10 fields per entity, so 5 entities total. However, I'm not terribly in love with that idea either, seeing as all I'd be doing is creating one-to-one relationships.
What is a good way of managing large number of properties that are out of your control?
Any thoughts?
Create 2 tables: 1 for the main object, and the other for the fields. That way you can programmatically access each field as necessary, and the object model doesn't look too nasty.
But this is just off the top of my head; you have a weird problem.
If the data is coming back in a file that you can parse easily, then you might be able to get away with creating a command-line application that produces scripts and C# that you can then execute, or copy and paste into your program. I've done that when creating properties out of tables from HTML pages (Like this one I had to do recently)
If the 50 properties are actually unique and discrete pieces of data regarding this one entity, I don't see a problem with having those 50 properties (even though that sounds like a lot) on one object. For example, the Type class has a large number of boolean properties relating to its data (IsPublic, etc).
Alternatives:
Well, one option that comes to mind immediately is using a dynamic object and overriding TryGetMember to look up the 'property' name as a key in a dictionary of key/value pairs (where your real set of 50 key/value pairs lives). Of course, figuring out how to map that from your ORM into your entity is the other problem, and you'd lose IntelliSense support.
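The same idea can be sketched outside C#. The Python snippet below is only an analogue: `__getattr__` plays the role that a TryGetMember override would play, and the class name and field names are hypothetical.

```python
class ParsedEmail:
    """Expose dictionary entries as attributes, resolved at run time."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so real
        # attributes and methods are unaffected.
        fields = self.__dict__.get("_fields", {})
        try:
            return fields[name]
        except KeyError:
            raise AttributeError(name) from None


email = ParsedEmail({"customer_name": "Ada", "order_total": "12.50"})
print(email.customer_name)
print(email.order_total)
```

As with the dynamic-object approach, the trade-off is that nothing checks the member names until the code runs.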
However, just throwing the idea out there.
Use a dictionary instead of separate fields. In the database, you just have a table for the field name and its value (and what object it belongs to).
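A minimal sketch of that name/value layout, in Python with sqlite3, might look like the following. The table and column names are assumptions; the point is that the ~50 fields become rows, not columns, and come back as a dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Main object: one row per parsed email (hypothetical columns).
CREATE TABLE email (
    email_id    INTEGER PRIMARY KEY,
    received_at TEXT
);
-- One row per field of an email.
CREATE TABLE email_field (
    email_id INTEGER NOT NULL REFERENCES email(email_id),
    name     TEXT NOT NULL,
    value    TEXT,
    PRIMARY KEY (email_id, name)
);
""")


def save_email(received_at, fields):
    """Insert the email and one row per field; return the new email id."""
    cur = conn.execute("INSERT INTO email (received_at) VALUES (?)", (received_at,))
    email_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO email_field (email_id, name, value) VALUES (?, ?, ?)",
        [(email_id, name, value) for name, value in fields.items()],
    )
    return email_id


def load_fields(email_id):
    """Rebuild the dictionary of fields for one email."""
    rows = conn.execute(
        "SELECT name, value FROM email_field WHERE email_id = ?", (email_id,)
    ).fetchall()
    return dict(rows)


email_id = save_email("2020-01-01", {"Field #1": "Value 1", "Field #2": "Value 2"})
print(load_fields(email_id))
```

The cost of this design is that every value is stored as text and typed queries need casts or extra metadata.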
Let me first describe the situation. We host many Alumni events over the course of each year and provide online registration forms for each event. There is a large chunk of data that is common for each event:
An Event with dates, times, managers, internal billing info, etc.
A Registration record with info about the payment and total amount charged per form submission
Bio/Demographic and alumni data about the 1 or more attendees (name, address, degree, etc.)
We store all of the above data within columns in tables as you would expect.
The trouble comes with the 'extra' fields we are asked to put on the forms. Maybe it is a dinner and there is a Veggie or Carnivore option, perhaps there is lodging and there are bed or smoking options, or perhaps there is an optional transportation option. There are tons of weird little "can you add this to the form?" types of requests we receive.
Currently, we JSONify any non-standard data and store it all in one column (per attendee) called 'extras'. We can read this data out in code but it is not well suited to querying. Our internal staff would like to generate a quick report on Veggie dinners needed for instance.
Other than creating a separate table for each form that holds the specific 'extra' data items, are there any other approaches that could make my life (and reporting) easier? Is anyone working in a similar environment?
This is actually one of the toughest problems to solve efficiently. The SQL Server Customer Advisory Team has dedicated a white paper to the topic, which I highly recommend you read: Best Practices for Semantic Data Modeling for Performance and Scalability.
You basically have 3 options:
semantic database (entity-attribute-value)
XML column
sparse columns
Each solution comes with ups and downs. Off the top of my head I'd say XML is probably the one that gives you the best balance of power and flexibility, but the optimal solution really depends on lots of factors, like data set sizes, the frequency at which new attributes are created, the actual process (human operators) that creates, populates and uses these attributes, and, not least, your team's skill set (some might fare better with an EAV solution, some with an XML solution). If the attributes are created and managed under a central authority and adding new attributes is a reasonably rare event, then sparse columns may be a better answer.
Well you could also have the following db structure:
Have a table to store custom attributes
AttributeID
AttributeName
Have a mapping table between events and attributes with:
AttributeID
EventID
AttributeValue
This means you will be able to store custom information per event, and you will be able to reuse your attributes. You can also add some metadata to the attribute, such as
AttributeType
AllowBlankValue
to make it easier to handle afterwards.
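Here is one way the two tables could look, sketched in Python with sqlite3, together with the kind of report the question asks for. The table names, sample data, and function name are made up, and the `AttendeeID` column is my own addition (the answer only lists AttributeID, EventID and AttributeValue) so that values can be stored per attendee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Attribute (
    AttributeID     INTEGER PRIMARY KEY,
    AttributeName   TEXT NOT NULL,
    AttributeType   TEXT NOT NULL,     -- metadata
    AllowBlankValue INTEGER NOT NULL   -- metadata, 0 or 1
);
CREATE TABLE EventAttribute (
    AttributeID    INTEGER NOT NULL REFERENCES Attribute(AttributeID),
    EventID        INTEGER NOT NULL,
    AttendeeID     INTEGER NOT NULL,   -- assumption: not part of the answer
    AttributeValue TEXT
);
INSERT INTO Attribute VALUES (1, 'Dinner', 'text', 0), (2, 'Smoking', 'bool', 1);
INSERT INTO EventAttribute VALUES
    (1, 100, 1, 'Veggie'),
    (1, 100, 2, 'Carnivore'),
    (1, 100, 3, 'Veggie'),
    (2, 100, 1, 'no');
""")


def count_by_value(event_id, attribute_name):
    """Count attendees per value of one custom attribute for one event."""
    rows = conn.execute(
        """
        SELECT ea.AttributeValue, COUNT(*)
        FROM EventAttribute AS ea
        JOIN Attribute AS a ON a.AttributeID = ea.AttributeID
        WHERE ea.EventID = ? AND a.AttributeName = ?
        GROUP BY ea.AttributeValue
        """,
        (event_id, attribute_name),
    ).fetchall()
    return dict(rows)


print(count_by_value(100, "Dinner"))
```

With this layout the "how many Veggie dinners" report is an ordinary GROUP BY, with no need to parse a serialized column.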
Have you considered using XML instead of JSON? Difference: XML is supported (special data type) and has query integration ;)
Quick and dirty, but actually nice for querying: simply add new columns. It's not as if the empty entries in the existing rows should cost a lot.
A more "databasy" solution: you'll have something like an event ID in your table. You can link this to an n:m table connecting events to additional fields, and then store the additional field data in a table with additional_field_id, record_id (from the original table) and the actual value. This probably creates ugly queries, but seems politically correct in terms of database design.
I understand "NoSQL" (not only sql ;) databases like CouchDB let you store arbitrary fields per record, but since you're already on SQL Server, I guess that's not an option.
This is the solution that we first proposed in ASP.NET Forums (that later became Community Server), and that the ASP.NET team built a similar version of in the ASP.NET 2.0 Membership when they released it:
Property Bags on your domain objects
For example:
Event.Profile() or in your case, Event.Extras().
Basically, a property bag is a serialized collection of data stored in a name/value pair in a column (or columns). The ASP.NET 2.0 Membership went the route of storing names in a semi-colon delimited list, and values in the same:
Table: aspnet_Profile
Column: PropertyNames (separated by semi-colons, and has start index and end index)
Column: PropertyValues (separated by semi-colons, and only stores the string value)
The downside to that approach is that it is all strings and has to be parsed manually (even though the membership system does it for you automatically).
More recently, I've built FormCollection and NameValueCollection C# extension methods that automatically serialize the collections to XML, and I store that XML in the table in its own column associated with that entity. I also have a deserializer C# extension on XElement that deserializes that data back to the collection at runtime.
This gives you the power of actually querying those properties in XML, via SQL (though that can be slow - always flatten out your read-only data).
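The serialize/deserialize round trip described above can be sketched as follows. This is a Python analogue using the standard xml.etree.ElementTree module, not the author's C# extension methods; the element and attribute names (`properties`, `property`, `name`) are invented for the example.

```python
import xml.etree.ElementTree as ET


def bag_to_xml(bag):
    """Serialize a name/value collection to an XML string for one column."""
    root = ET.Element("properties")
    for name, value in bag.items():
        item = ET.SubElement(root, "property", {"name": name})
        item.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_bag(xml_text):
    """Deserialize the stored XML back into a dictionary of strings."""
    root = ET.fromstring(xml_text)
    return {item.get("name"): item.text or "" for item in root.findall("property")}


extras = {"Dinner": "Veggie", "Lodging": "Twin bed"}
stored = bag_to_xml(extras)
print(stored)
print(xml_to_bag(stored))
```

All values come back as strings, so any typed property needs converting after loading.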
The final note is runtime querying: The general rule we follow is, if you are going to query a property of an entity in normal application logic, then you move that property to an actual column on the table - and create the appropriate indexes. If that data will never be queried directly (for example, Linq-to-Sql or EF), then leave it in the XML Property Bag.
Property Bags give you the power of extending your domain models however you like, without having to modify the db schema.