I am thinking a library already exists for this, but I need to allow my users to create a numbering format for their documents.
For example, let's say we have an RFI form, and the user has a specific format the numbering sequence needs to be in. A typical RFI number in their system looks like this: R0000100. The next RFI in line would be R0000101.
Before I set out to create a formatting engine for numbers such as these, does something already exist that can accommodate this?
Update:
I failed to save the edit to this question. Anyway, I also want to give the users the ability to create their own formats. So, I may have a form where they can input the format (R#######) and also allow them to specify the starting integer, in this case 100. I may also want to allow them to specify how they want to increment, maybe only by 100s, so the next number would be R0000200. I know this may sound ridiculous, but you never know. That is why I asked if something like this already exists.
If you keep value and format separated, you won't need a library or such a thing.
The numbers would be simple integers i, e.g. 100, 101, 102, that you manage/store however you see fit. The formatting part would simply be a matter of "R" + i.ToString("0000000"), or if you want to have the format as a single string literal, string.Format("R{0:0000000}", i).
I know, this might only be an example, but as your question stands, the formatting options that .NET provides out of the box seem to suffice.
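To illustrate, a minimal sketch that also covers the user-defined format from your update, assuming the format string is always a literal prefix followed by a run of # placeholders:

using System;

class SequenceFormatter
{
    private readonly string prefix;
    private readonly int digits;

    public SequenceFormatter(string userFormat)
    {
        // Split "R#######" into the literal prefix and the run of '#' placeholders.
        int firstHash = userFormat.IndexOf('#');
        prefix = userFormat.Substring(0, firstHash);
        digits = userFormat.Length - firstHash;
    }

    public string Format(int value)
    {
        // e.g. "R" + 100 formatted as "0000000" yields "R0000100".
        return prefix + value.ToString(new string('0', digits));
    }
}

new SequenceFormatter("R#######").Format(100) would then produce "R0000100".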
The incrementing of identity field values is most often handled in an RDBMS-style database. This comes with a few benefits, such as built-in concurrency handling. If you want to generate the values yourself, a simple class to get the last-issued value and increment by one would be very easy to create. Make it thread-safe so you don't get any duplicates or gaps and you'll be good to go.
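If you do generate the values yourself, a minimal sketch of such a class might look like the following (keeping the last-issued value in memory is an assumption here; a real system would persist it):

using System.Threading;

class SequenceGenerator
{
    private int last;
    private readonly int step;

    public SequenceGenerator(int start, int step = 1)
    {
        // Store start - step so the first call to Next() returns start.
        last = start - step;
        this.step = step;
    }

    // Interlocked.Add makes the read-increment-write atomic, so concurrent
    // callers never see duplicates or gaps.
    public int Next()
    {
        return Interlocked.Add(ref last, step);
    }
}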
I'm learning how to write custom workflows and am trying to figure out where and in which format all the values I need are stored. I have noticed that I can access Entity instance data in both the Attributes and FormattedValues properties. How do I know when to use which one?
I have noticed MSDN's remark: "Entity formatted values are only available on a retrieve operation, not on an update operation."
For testing I've made two foreach-blocks iterating through both collections. Attributes gives me 65 lines and FormattedValues gives me 39. I can see that, yes, the output from FormattedValues is indeed formatted.
For example, where Attributes gives the output "Microsoft.Xrm.Sdk.OptionSetValue", FormattedValues gives me a string with the actual value.
Which values/attributes are generally excluded from the FormattedValues collection and why?
I'm not 100% sure about this, but the formatted values are the values you will be able to see on the form. In that list you will be able to find money types with the $ symbol, or the labels of the option sets. A text field shouldn't be shown since it is already human-readable.
https://community.dynamics.com/crm/b/crmmitchmilam/archive/2013/04/18/crm-sdk-nugget-entity-formattedvalues-property.aspx
Refer to this article to know a little bit more about it. I rarely use that attribute list since the data is in string format, but I found it really useful for retrieving OptionSet labels.
After a quick check, it would appear that the difference between an attribute and a formatted value is that the former refers to the actual value stored in the DB (or at least the value that was stored there when it was fetched), while the latter serves what's shown to the user.
I haven't used formatted values, but until proven otherwise, I'd say that an entity's attribute will provide you with the typed value that the regarded field is based on (be that int, DateTime and such), whereas its formatted value is the rendered, stringified representation of that value (dependent on e.g. which form you're referring to, which language, etc.).
By that logic, I'd expect the set of formatted values to be a subset of the set of attributes. Also, it should consist exclusively of String-typed values, while its counterpart can hold any of the types in the type conversion table.
An example of the difference I can think of is an option set called picky with the currently selected option named "hazaa" and an ID of 1234. The sample below is written from memory, so feel free to correct it. It exemplifies the point, though: plainValue will be an integer equal to 1234, while formattedValue will be "hazaa".
int plainValue = ((OptionSetValue)entity["picky"]).Value; // the raw attribute is an OptionSetValue, not an int
String formattedValue = (String)entity.FormattedValues["picky"];
I'd say that the attribute approach is more reliable as it renders the actual values, while the alternative can lead to unexpected outcomes. However, there's a certain convenience to it, I must add.
Personally, I'd recommend looking into the method GetAttributeValue<T>(String) or, as every cocky CRM developer would, have your own class with extension methods and use the method Get<T>(T,String) in there. That provides you with a sense of control and offers the best predictability and readability, IMAO.
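To illustrate, a minimal sketch of that extension-method idea, written against the standard Entity type; the Get<T> name is just the hypothetical one mentioned above:

using Microsoft.Xrm.Sdk;

public static class EntityExtensions
{
    public static T Get<T>(this Entity entity, string attributeName)
    {
        // Unlike the indexer, GetAttributeValue<T> returns default(T)
        // when the attribute is missing instead of throwing.
        return entity.GetAttributeValue<T>(attributeName);
    }
}

// e.g. entity.Get<OptionSetValue>("picky")?.Value for the raw option set value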
I have a C# program that lets me use my microphone, and when I speak, it runs commands and talks back. For example, when I say "What's the weather tomorrow?" it will reply with tomorrow's weather.
The only problem is, I have to type out every phrase I want to say and have it pre-recorded. So if I want to ask for the weather, I HAVE to say it exactly like I coded it, with no variations. I am wondering if there is code to change this?
I want to be able to say "What's the weather for tomorrow", "What's tomorrow's weather" or "Can you tell me tomorrow's weather" and have it tell me the next day's weather, but I don't want to have to type each phrase into code. I've seen something out there about e.Result.Alternates; is that what I need to use?
This cannot be done without involving linguistic resources. Let me explain what I mean by this.
As you may have noticed, your C# program only recognizes pre-recorded phrases, and only if you say the exact same words. (As an aside, this is quite an achievement in itself, because you can hardly say a sentence twice without altering it a bit. Small changes, e.g. in sound frequency or length, might not be noticeable to your colleagues, but they matter to your program.)
Therefore, you need to incorporate a kind of linguistic resource into your program. In other words, make it "understand" facts about human language. Two suggestions with increasing complexity follow. Both approaches assume that your tool is capable of tokenizing an audio input stream in a sensible way, i.e. extracting words from it.
Pattern matching
To avoid hard-coding the sentences like
Tell me about the weather.
What's the weather tomorrow?
Weather report!
you can instead define a pattern that matches any of those sentences:
if a sentence contains "weather", then output a weather report
This can be further refined in manifold ways, e.g.:
if a sentence contains "weather" and "tomorrow", output tomorrow's forecast.
if a sentence contains "weather" and "Bristol", output a forecast for Bristol
This kind of knowledge must be put into your program explicitly, for instance in the form of a dictionary or lookup table.
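A minimal sketch of such a lookup table in C#, assuming the recognizer already hands you the utterance as plain text (the rules and actions are illustrative):

using System;
using System.Collections.Generic;
using System.Linq;

class CommandMatcher
{
    // Each rule maps a set of required keywords to an action; more specific
    // rules are listed first so they win.
    private readonly List<(string[] Keywords, Action Action)> rules =
        new List<(string[] Keywords, Action Action)>
        {
            (new[] { "weather", "tomorrow" }, () => Console.WriteLine("Tomorrow's forecast...")),
            (new[] { "weather" }, () => Console.WriteLine("Today's weather...")),
        };

    public void Handle(string sentence)
    {
        var words = sentence.ToLowerInvariant().Split(' ');
        // The first rule whose keywords all occur in the sentence fires.
        var match = rules.FirstOrDefault(r => r.Keywords.All(words.Contains));
        match.Action?.Invoke();
    }
}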
Measuring Similarity
If you plan to spend more time on this, you could implement a means of measuring the similarity between input sentences. There are many approaches to this as well, but a prominent one is the bag of words, represented as a vector.
In this model, each sentence is represented as a vector, with each word present in it as a dimension of the vector. For example, the sentence "I hate green apples" could be represented as
I = 1
hate = 1
green = 1
apples = 1
red = 0
you = 0
Note that the words that do not occur in this particular sentence, but in other phrases the program is likely to encounter, also represent dimensions (for example the red = 0).
The big advantage of this approach is that the similarity of vectors can be easily computed, no matter how multi-dimensional they are. There are several techniques that estimate similarity, one of them is cosine similarity (see for example http://en.wikipedia.org/wiki/Cosine_similarity).
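For a concrete picture, here is a minimal sketch of bag-of-words cosine similarity over two tokenized sentences:

using System;
using System.Linq;

static class Similarity
{
    public static double Cosine(string a, string b)
    {
        var wordsA = a.ToLowerInvariant().Split(' ');
        var wordsB = b.ToLowerInvariant().Split(' ');

        // The vocabulary spans both sentences, so words absent from one
        // of them show up as zero-valued dimensions (like red = 0 above).
        var vocab = wordsA.Union(wordsB).ToArray();
        var va = vocab.Select(w => (double)wordsA.Count(x => x == w)).ToArray();
        var vb = vocab.Select(w => (double)wordsB.Count(x => x == w)).ToArray();

        double dot = va.Zip(vb, (x, y) => x * y).Sum();
        double magA = Math.Sqrt(va.Sum(x => x * x));
        double magB = Math.Sqrt(vb.Sum(x => x * x));
        return dot / (magA * magB);
    }
}

Two phrasings of the weather question would score close to 1, an unrelated sentence close to 0.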
On a more general note, there are many other considerations to be made of course.
For example, some words might be utterly irrelevant to the message you want to convey, as in the following sentence:
I want you to output a weather report.
Here, at least "I", "you", "to" and "a" could be done away with without damaging the basic semantics of the sentence. Such words are called stop words and are discarded early in many tools that perform speech-to-text analysis.
Also note that we started out assuming that your program reliably identifies sound input. In reality, no tool is capable of infallibly identifying speech.
Humans tend to forget that the sound signal itself carries no cues as to where word or sentence boundaries are. This makes so-called disambiguation of input a gargantuan task that is easily underestimated - and ambiguity is one of the hardest problems of computational linguistics in general.
For that, the code won't be able to judge it by itself! You need to split the command into a text array, such as:
Tomorrow
Weather
What
This way, you will compare it with the text that is present in your computer! Let's say, match the command (what) with the type (weather) and with the time (tomorrow).
It is better to read and understand each word than to guess; it will work the way Google does! Google uses the same approach: they break down the string and compare the pieces.
I am looking for the best data structure for the following case:
In my case I will have thousands of strings; however, for this example I am going to use two for obvious reasons. So let's say I have the strings "Water" and "Walter": what I need is for both strings to be found when the letter "W" is entered, and for "Water" to be the only result when "Wat" is entered. I did some research, but I am still not quite sure which data structure is correct for this case, and I don't want to implement one without being sure, as that would waste time. So basically what I am thinking right now is either a "Trie" or a "Suffix Tree". It seems that the "Trie" will do the trick, but as I said, I need to be sure. Additionally, the implementation should not be a problem, so I just need to know the correct structure. Also feel free to let me know if there is a better choice. As you can guess, normal structures such as Dictionary/MultiDictionary would not work, as that would be a memory killer. I am also planning to implement a cache to limit the memory consumption. I am sorry there is no code, but I hope I will get an answer. Thank you in advance.
You should use a Trie. Tries are the foundation of one of the fastest known sorting algorithms (burstsort); they are also used for spell checking and in applications that provide text completion. You can see details here.
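For reference, a minimal prefix-search trie sketch in C#, written to match the Water/Walter example (case handling and other hardening left out):

using System.Collections.Generic;

class TrieNode
{
    public Dictionary<char, TrieNode> Children = new Dictionary<char, TrieNode>();
    public bool IsWord;
}

class Trie
{
    private readonly TrieNode root = new TrieNode();

    public void Insert(string word)
    {
        var node = root;
        foreach (char c in word)
        {
            if (!node.Children.TryGetValue(c, out TrieNode next))
                node.Children[c] = next = new TrieNode();
            node = next;
        }
        node.IsWord = true;
    }

    // Walks down to the prefix node, then collects every word below it, so
    // StartingWith("W") yields Water and Walter, StartingWith("Wat") only Water.
    public IEnumerable<string> StartingWith(string prefix)
    {
        var node = root;
        foreach (char c in prefix)
            if (!node.Children.TryGetValue(c, out node))
                yield break; // no stored word has this prefix
        foreach (var word in Collect(node, prefix))
            yield return word;
    }

    private static IEnumerable<string> Collect(TrieNode node, string soFar)
    {
        if (node.IsWord) yield return soFar;
        foreach (var kv in node.Children)
            foreach (var w in Collect(kv.Value, soFar + kv.Key))
                yield return w;
    }
}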
Practically, if you want to do auto-suggest, then storing up to 3-4 chars should suffice.
I mean, suggest as and when the user types "a" or "ab" or "abc", and the moment he types "abcd" or more characters, you can look up the map keys starting with "abcd" using C# lambda expressions.
Hence, I suggest creating a map like:
Dictionary<char, Dictionary<char, Dictionary<char, HashSet<string>>>> map;
So, if the user enters "a", you look up map['a'] and find all its children.
I'm looking for some suggestions on better approaches to handling a scenario with reading a file in C#; the specific scenario is something that most people wouldn't be familiar with unless you are involved in health care, so I'm going to give a quick explanation first.
I work for a health plan, and we receive claims from doctors in several ways (EDI, paper, etc.). The paper form for standard medical claims is the "HCFA" or "CMS 1500" form. Some of our contracted doctors use software that allows their claims to be generated and saved in a HCFA "layout", but in a text file (so you could think of it as the paper form, but without the background/boxes/etc.). I've attached an image of a dummy claim file that shows what this would look like.
The claim information is currently extracted from the text files and converted to XML. The whole process works ok, but I'd like to make it better and easier to maintain. There is one major challenge that applies to the scenario: each doctor's office may submit these text files to us in slightly different layouts. Meaning, Doctor A might have the patient's name on line 10, starting at character 3, while Doctor B might send a file where the name starts on line 11 at character 4, and so on. Yes, what we should be doing is enforcing a standard layout that must be adhered to by any doctors that wish to submit in this manner. However, management said that we (the developers) had to handle the different possibilities ourselves and that we may not ask them to do anything special, as they want to maintain good relationships.
Currently, there is a "mapping table" set up with one row for each different doctor's office. The table has columns for each field (e.g. patient name, Member ID number, date of birth etc). Each of these gets a value based on the first file that we received from the doctor (we manually set up the map). So, the column PATIENT_NAME might be defined in the mapping table as "10,3,25" meaning that the name starts on line 10, at character 3, and can be up to 25 characters long. This has been a painful process, both in terms of (a) creating the map for each doctor - it is tedious, and (b) maintainability, as they sometimes suddenly change their layout and then we have to remap the whole thing for that doctor.
The file is read in, line by line, and each line is added to a List<string>.
Once this is done, we do the following, where we get the map data and read through the list of file lines and get the field values (recall that each mapped field is a value like "10,3,25" (without the quotes)):
ClaimMap M = ClaimMap.GetMapForDoctor(17);
List<HCFA_Claim> ClaimSet = new List<HCFA_Claim>();
// Claims is List<List<string>>, with one List<string> per claim in the text file
// (a file can contain more than one claim; it is split into separate claims earlier in the process).
foreach (List<string> cl in Claims)
{
    HCFA_Claim c = new HCFA_Claim();
    c.Patient = new Patient();
    c.Patient.FullName = cl[Int32.Parse(M.Name.Split(',')[0]) - 1].Substring(Int32.Parse(M.Name.Split(',')[1]) - 1, Int32.Parse(M.Name.Split(',')[2])).Trim();
    //...and so on...
    ClaimSet.Add(c);
}
Sorry this is so long...but I felt that some background/explanation was necessary. Are there any better/more creative ways of doing something like this?
Given the lack of standardization, I think your current solution, although not ideal, may be the best you can do. In this situation, I would at least isolate concerns (file read, file parsing, conversion to standard XML, mapping table access, etc.) into simple components, employing obvious patterns (DI, strategies, factories, repositories, etc.) where needed to decouple the system from its underlying dependency on the mapping table and the current parsing algorithms.
You need to work on the DRY (Don't Repeat Yourself) principle by separating concerns.
For example, the code you posted appears to have an explicit knowledge of:
how to parse the claim map, and
how to use the claim map to parse a list of claims.
So there are at least two responsibilities directly relegated to this one method. I'd recommend changing your ClaimMap class to be more representative of what it's actually supposed to represent:
public class ClaimMap
{
    public ClaimMapField Name { get; set; }
    ...
}

public class ClaimMapField
{
    public int StartingLine { get; set; }
    // I would have the parser subtract one when creating this, to make it 0-based.
    public int StartingCharacter { get; set; }
    public int MaxLength { get; set; }
}
Note that the ClaimMapField represents in code what you spent considerable time explaining in English. This reduces the need for lengthy documentation. Now all the M.Name.Split calls can actually be consolidated into a single method that knows how to create ClaimMapFields out of the original text file. If you ever need to change the way your ClaimMaps are represented in the text file, you only have to change one point in code.
Now your code could look more like this:
c.Patient.FullName = cl[map.Name.StartingLine].Substring(map.Name.StartingCharacter, map.Name.MaxLength).Trim();
c.Patient.Address = cl[map.Address.StartingLine].Substring(map.Address.StartingCharacter, map.Address.MaxLength).Trim();
...
But wait, there's more! Any time you see repetition in your code, that's a code smell. Why not extract out a method here:
public string ParseMapField(ClaimMapField field, List<string> claim)
{
    return claim[field.StartingLine].Substring(field.StartingCharacter, field.MaxLength).Trim();
}
Now your code can look more like this:
HCFA_Claim c = new HCFA_Claim
{
    Patient = new Patient
    {
        FullName = ParseMapField(map.Name, cl),
        Address = ParseMapField(map.Address, cl),
    }
};
By breaking the code up into smaller logical pieces, you can see how each piece becomes very easy to understand and validate visually. You greatly reduce the risk of copy/paste errors, and when there is a bug or a new requirement, you typically only have to change one place in code instead of every line.
If you are only getting unstructured text, you have to parse it. If the text content changes you have to fix your parser. There's no way around this. You could probably find a 3rd party application to do some kind of visual parsing where you highlight the string of text you want and it does all the substring'ing for you but still unstructured text == parsing == fragile. A visual parser would at least make it easier to see mistakes/changed layouts and fix them.
As for parsing it yourself, I'm not sure about the line-by-line approach. What if something you're looking for spans multiple lines? You could bring the whole thing in as a single string and use IndexOf to substring it, with different indices for each piece of data you're looking for.
You could always use RegEx instead of Substring if you know how to do that.
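For example, a sketch of the Regex idea, assuming the whole claim has been joined into one string; the PATIENT: landmark and the named group are purely illustrative, not a real claim layout:

using System.Text.RegularExpressions;

// cl is the List<string> of lines for one claim, as in the question.
string wholeClaim = string.Join("\n", cl);

// Anchor on a textual landmark instead of a line/column offset.
var namePattern = new Regex(@"^PATIENT:\s*(?<name>.{1,25})", RegexOptions.Multiline);
Match m = namePattern.Match(wholeClaim);
if (m.Success)
    c.Patient.FullName = m.Groups["name"].Value.Trim();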
While the basic approach you're taking seems appropriate for your situation, there are definitely ways you could clean up the code to make it easier to read and maintain. By separating out the functionality that you're currently doing all within your main loop, you could change this:
c.Patient.FullName = cl[Int32.Parse(M.Name.Split(',')[0]) - 1].Substring(Int32.Parse(M.Name.Split(',')[1]) - 1, Int32.Parse(M.Name.Split(',')[2])).Trim();
to something like this:
var parser = new FormParser(cl, M);
c.Patient.FullName = parser.GetName();
c.Patient.Address = parser.GetAddress();
// etc
So, in your new class, FormParser, you pass the List that represents your form and the claim map for the provider into the constructor. You then have a getter for each property on the form, and inside each getter you perform your parsing/substring logic like you're doing now. Like I said, you're not really changing the method by which you're doing it, but it certainly would be easier to read and maintain, and it might reduce your overall stress level.
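A minimal sketch of what that FormParser might look like, assuming the map fields are still the raw "line,start,length" strings from the question:

using System;
using System.Collections.Generic;

public class FormParser
{
    private readonly List<string> lines;
    private readonly ClaimMap map;

    public FormParser(List<string> claimLines, ClaimMap claimMap)
    {
        lines = claimLines;
        map = claimMap;
    }

    // Translates an entry like "10,3,25" into a trimmed substring,
    // clamping the length so short lines don't throw.
    private string Extract(string mapEntry)
    {
        var parts = mapEntry.Split(',');
        int line = int.Parse(parts[0]) - 1;
        int start = int.Parse(parts[1]) - 1;
        int length = Math.Min(int.Parse(parts[2]), lines[line].Length - start);
        return lines[line].Substring(start, length).Trim();
    }

    public string GetName() { return Extract(map.Name); }
    public string GetAddress() { return Extract(map.Address); }
}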
Background
We currently have an excel-based system for the creation of specifications for a sound and lighting rental company.
Part of this is a column in the Excel sheet called 'autospec', which is made up of Excel formulas for individual stock items (e.g., you specify a loudspeaker, and then the formulas in the autospec column calculate the cables you need and add them onto the specification automatically).
Sample formula
A 10m microphone cable might have a formula like the following:
=IF([loudspeakers]>0,[loudspeakers]*2,0)+([mixing desk]*4)
We're now moving over to a proper database with a C# front end.
Question
What I would like is to be able to store the autospec formulas for each stock item in a table, and when the user specs an item in the front end, the program should find the relevant formula, execute it, and change the spec quantity as appropriate.
Bottom line: I need to execute code contained within a string.
Am I going about this the wrong way?
Is there a better way?
I asked a similar question once: How can I evaluate a C# expression dynamically?
You could probably use that to evaluate those expressions. This is not really a safe thing to do unless you can guarantee nobody will be adding junk (read: evil code) to your database.
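As a present-day option (not part of the original answer), the Roslyn scripting package (Microsoft.CodeAnalysis.CSharp.Scripting) can evaluate such strings, assuming the stored formula has already been rewritten from Excel syntax into C#; the same code-injection caveat applies:

using System;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;

public class SpecGlobals
{
    // Quantities already on the spec, exposed to the formula by name.
    public int loudspeakers;
    public int mixingDesk;
}

class Demo
{
    static async Task Main()
    {
        // The stored formula, translated from the Excel original.
        string formula = "(loudspeakers > 0 ? loudspeakers * 2 : 0) + mixingDesk * 4";
        int qty = await CSharpScript.EvaluateAsync<int>(
            formula, globals: new SpecGlobals { loudspeakers = 4, mixingDesk = 1 });
        Console.WriteLine(qty); // 12
    }
}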
If you can break down the formulas into "families", such that each entry in your formula column is a member of a small (5-10) set of formulas with just different parameters, you could try something like this:
[ItemTable]<-[ItemFormulaParameters(param1, param2, param3)]->[FormulaTable(name)]
And have a factory method for instantiating the formula objects by name. Each such formula object has a Calculate(param1, param2, param3) method...
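A minimal sketch of that factory, with the family name and parameter meanings as placeholders (the MicCable10m entry mirrors the sample formula above):

using System;
using System.Collections.Generic;

static class FormulaFactory
{
    // Each family is a function of the parameters stored per item.
    private static readonly Dictionary<string, Func<double[], double>> families =
        new Dictionary<string, Func<double[], double>>
        {
            // IF(loudspeakers > 0, loudspeakers * 2, 0) + mixingDesk * 4,
            // with p[0] = loudspeakers and p[1] = mixing desks.
            ["MicCable10m"] = p => (p[0] > 0 ? p[0] * 2 : 0) + p[1] * 4,
        };

    public static double Calculate(string family, params double[] parameters)
    {
        return families[family](parameters);
    }
}

FormulaFactory.Calculate("MicCable10m", 4, 1) would then return 12 (4*2 + 1*4).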
Either build your own code interpreter, or build a 'dynamic rule engine' that creates some binary representation of C# rules that can be executed by a 'rule engine' you'd have to write.