Customer Name Formatter - C#

I have a feature request to implement automatic capitalisation of the name fields.
rachel mcMillan -> Rachel Mc'Millan
dara obriain -> Dara O'Briain
bill gates -> Bill Gates
etc...
Seemed like an innocent request at first, no?
Unfortunately, with such generic search terms, I'm struggling to find any help. If I have to implement this myself then I'd need a list of "double" last names (Mc' Mac' O' ... etc ...) or something to work from, but it occurs to me that this must have been done before.
So I was wondering if someone could point me in the right direction?
Thanks,
D.R

I think the best way to approach this is to write a first approximation of a solution, i.e. turn everything into lowercase, capitalise first letters and handle extra cases you might think of.
Try going for an extensible solution and then just wait for requirement changes. It's the customer's job to provide you with the exact requirements. It will be their issue to differentiate the "O'Brian" and "Oblivious" cases.
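A first approximation along these lines could look like the sketch below. It is only an illustration: the class name NameFormatter and the prefix list are assumptions, not an established library. It lowercases each word, capitalises the first letter, and then re-capitalises the letter after any prefix in an extensible list. Note that it cannot turn "obriain" into "O'Briain", since nothing in the input distinguishes that name from a word like "oblivious"; that rule has to come from the customer.

```csharp
using System;
using System.Linq;

public static class NameFormatter
{
    // Hypothetical, extensible prefix list. "Mac" is deliberately left out
    // because it would also turn "Macy" into "MacY".
    private static readonly string[] Prefixes = { "Mc", "O'" };

    public static string Format(string name)
    {
        if (string.IsNullOrWhiteSpace(name)) return string.Empty;

        // Split on spaces, format each word, and join with single spaces.
        var words = name.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", words.Select(FormatWord));
    }

    private static string FormatWord(string word)
    {
        // Step 1: everything lowercase, then capitalise the first letter.
        string result = Capitalise(word.ToLowerInvariant());

        // Step 2: handle the extra cases (capital letter after a known prefix).
        foreach (string prefix in Prefixes)
        {
            if (result.Length > prefix.Length &&
                result.StartsWith(prefix, StringComparison.Ordinal))
            {
                return prefix + Capitalise(result.Substring(prefix.Length));
            }
        }
        return result;
    }

    private static string Capitalise(string s)
    {
        return s.Length == 0 ? s : char.ToUpperInvariant(s[0]) + s.Substring(1);
    }
}
```

Calling NameFormatter.Format("rachel mcMillan") under these assumptions gives "Rachel McMillan"; new rules can be added by extending the prefix list or adding an exceptions table once the requirements are pinned down.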

Related

Convert text to sentence case but with added constraints and grammar

I have a task and I am lost on how to begin. I have to convert a sentence that is always in upper case to sentence case. While that is easy and I have done it, there are some constraints.
For example, if there is a common short form such as VAT, it has to remain in capitals.
If there is something like City Tax, it has to remain as it is. Similarly for some other keywords.
Is there any AI approach I can take? Performance is important here since this is for a backend API.
We have data to train on if needed, but it will require some manual labor. I would love it if we could take a logical approach to it as well.
I code in C#.
I would love advice on how to approach this problem :)

Filter elements in Revit and set parameter

Hello everybody,
around two or three months ago I started to learn Dynamo for Revit... finally :)
After learning and testing a lot, I got a few own scripts working. Then I learned Python, because I couldn't create the next script only with Dynamo-Nodes.
Then I thought "Let's see how difficult it is to get something done as a PlugIn".
I watched some Videos and read a lot of stuff.
Finally I got the Revit-AddIn-Wizard installed and made my first small Test-PlugIn.
Great...
Now I have a few problems which I do not understand... so I thought I will try my luck here... because I got so much information and help, reading through this site.
My goal was/is the following: (I tell you what I have now)
A form with a few buttons, comboboxes and a DataGridView.
I can load an Excelfile, click on "Show" to show it in the DataGridView.
The header of each row will be automatically put into 3 comboboxes.
In the first combobox you select the first search-parameter, in the second you CAN select another search-parameter and in the third combobox you select the parameter you want to set.
I have a checkbox to switch from type- to instance-parameter for the search- and the set-operation.
There is also a button which shows another small form with a list of categories (I won't search for ALL, only nearly all modelcategories).
It took me a lot of "watching Videos, reading through the internet, testing, testing and testing".
Thanks to this site here and a few others... I managed to get this whole PlugIn nearly 100% working.
But now I have a few strange issues and I have absolutely no clue on how to fix them or if it is possible. And I really hope that someone can help me.
First... I just tell you my problems and perhaps someone can say "this really IS an issue!" or that it is possible to get it done. Then I would post some code.
So... what do I do?!
1. I have a FilteredElementCollector which filters ALL elements.
2. Depending on my "Type/Instance-Checkboxes" I do .WhereElementIsElementType OR .WhereElementIsNotElementType.
3. Then it passes a MultiCategoryFilter to get the big list down to only the modelcategories.
4. Next, the collection passes one of ten different "methods" depending on all settings. There I filter this collection depending on the searchlists-comboboxes. When the combobox says "Familie" or "Typ" then it filters for ".BuiltInParameter.SymbolFamilyName" or ".Name" otherwise it just uses ".LookupParameter".
After that I have a collection with only the elements of selected categories which contains the values from the Excellist.
5. Depending on what my search- and set-settings are (e.g. search for type and set instance) I have to get the instances from the collected types or the other way around.
6. Then I pass it down to another method where I finally set the parameter.
So... Excelheader goes into comboboxes, depending on what you select in there it creates lists with the values of the selected rows.
I hope you all understand.
Now... where are my problems?
When I search for type-familynames or instance-parameter and set a typeparameter it works for ALL categories without any error.
1. When I try to set an instance parameter (it doesn't matter what my search settings are) it works for all "normal" families but not for the system families (e.g. walls, floors, pipes etc.). No error, just nothing happens. Why? It seems that I cannot set an instance parameter for system families.
2. Roofs, Stairs, CurtainPanels and GenericModel cause problems when I search for a type parameter. The error is something like "The object reference was not set to an object instance". It happens only with these 4 categories and it doesn't matter what I want to set... but when I search for family-/typeNAME or Instance-Parameter, then I can set type or instance and it works (except instance for sysfam).
3. When I try to search AND set an instance parameter it works for ALL categories EXCEPT if one wall does not contain a search value... it really is enough that ONE wall does not have a search-param-value for everything to be cancelled.
I have a few other small problems... but I hope someone can help me with these problems... I would be extremely thankful
greetings and have a nice day or night :)
Philipp
Tl; dr.
The three problems you describe sound like your own. I have not heard of anybody else running into those. Ask three separate questions and provide three separate minimal code snippets describing how they arise. I suggest that you create three separate, independent, minimal reproducible cases to demonstrate all three issues. Chances are, when you simplify and minimalise your code, the problem will go away. If it does not, it might just possibly be in a small and manageable enough state for other people to help you take a look at it. Given the long-winded description above, nobody in the world can help you.
Thank you for your answer Jeremy,
as I said, as a first start it is ok for me if you don't say "With these categories, there are indeed some issues!"
I think I've managed to create 3 small examples of my problems.
For each problem I made a zip-file containing the complete visual-studio folder, a small exampleproject and a readme.txt with (I hope) enough information to understand everything in detail.
Problem1
Problem3
You only need to compile them or copy the .addin and .dll files into the Revit AddIn folder. Then you get the new ribbons.
Short problem summary = I get problems when searching for parametervalues and setting values to another parameter.
Edit: I just solved the 2. problem when searching for familynames and setting system-families-parameter.
I used:
ElementClassFilter ecf = new ElementClassFilter(typeof(FamilyInstance));
FilteredElementCollector colle2 = new FilteredElementCollector(doc);
colle2.WherePasses(ecf);
I simply deleted the ClassFilter and do it now like in the other cases where I need instances.
FilteredElementCollector colle2 = new FilteredElementCollector(doc);
colle2.WhereElementIsNotElementType();
The 1. and 3. problem still exist :/
I would be thankful for any help someone can provide :)

Data structure for searching strings

I am looking for the best data structure for the following case:
In my case I will have thousands of strings; however, for this example I am going to use two for obvious reasons. So let's say I have the strings "Water" and "Walter". What I need is for both strings to be found when the letter "W" is entered, and for "Water" to be the only result when "Wat" is entered. I did some research, but I am still not quite sure which is the correct data structure for this case, and I don't want to implement it if I am not sure, as this would waste time. So basically what I am thinking right now is either a "Trie" or a "Suffix Tree". It seems that the "Trie" will do the trick, but as I said I need to be sure. Additionally, the implementation should not be a problem, so I just need to know the correct structure. Also feel free to let me know if there is a better choice. As you can guess, normal structures such as Dictionary/MultiDictionary would not work, as that would be a memory killer. I am also planning to implement a cache to limit the memory consumption. I am sorry there is no code, but I hope I will get an answer. Thank you in advance.
You should use a trie. Tries are the foundation for one of the fastest known string sorting algorithms (burstsort); they are also used for spell checking and in applications that use text completion. You can see details here.
Practically, if you want to do auto-suggest, then storing up to 3-4 chars should suffice.
I mean: suggest as and when the user types "a" or "ab" or "abc", and the moment he types "abcd" or more characters, you can filter the map keys starting with "abcd" using C# lambda expressions.
Hence, I suggest, create a map like:
Map<char, Map<char, Map<char, Set<string>>>> map;
So, if the user enters "a", you look up map[a] and find all its children.
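For comparison, the trie recommended above can be sketched in a few lines of C#. This is a minimal illustration, not a tuned implementation: the names PrefixTrie, Add and FindByPrefix are made up for this example, matching is case-sensitive, and no caching or memory optimisation is attempted.

```csharp
using System;
using System.Collections.Generic;

public sealed class PrefixTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsWord; // true if a stored string ends at this node
    }

    private readonly Node _root = new Node();

    // Inserts a string, creating one node per character as needed.
    public void Add(string word)
    {
        Node node = _root;
        foreach (char c in word)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    // Returns every stored string that starts with the given prefix.
    public List<string> FindByPrefix(string prefix)
    {
        var results = new List<string>();
        Node node = _root;
        foreach (char c in prefix)
        {
            // No stored string has this prefix: return the empty list.
            if (!node.Children.TryGetValue(c, out node)) return results;
        }
        Collect(node, prefix, results);
        return results;
    }

    // Depth-first walk below the prefix node, rebuilding each stored string.
    private static void Collect(Node node, string current, List<string> results)
    {
        if (node.IsWord) results.Add(current);
        foreach (KeyValuePair<char, Node> pair in node.Children)
        {
            Collect(pair.Value, current + pair.Key, results);
        }
    }
}
```

With "Water" and "Walter" added, FindByPrefix("W") returns both strings and FindByPrefix("Wat") returns only "Water". The order of the results is not guaranteed, because it follows the dictionary's enumeration order.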

Efficient algorithm for finding related submissions

I recently launched my humble side project and would like to add a "related submissions" section when viewing a submission. Exactly like what SO is doing here - see right column, titled "Related"
Considering that each submission has a title and a set of tags, what is most effective (optimum result), most efficient (fast, memory friendly) way to query the database for related submissions?
I can think of one way to do this (which I'll post as an answer) but I'm very interested to see what others have to say. Or perhaps there's already a standard way of achieving this?
Here's my two cent solution:
To achieve the best output, we need to put “weight” on the query results.
To start with, each submission in the database is assumed to have a weight of zero.
Then, if a submission in the "pool" shares one tag with the current submission, we'd add +3 to the found submission. Hence, if another submission is found that shares two tags with the current submission, we add +6 to the weight.
Next, we split/tokenize the title of the current submission and remove “stop words”.
I’ve seen a list of stop words from google, but for now I’ll define my stop words to be: [“of”, “a”, “the”, “in”]
Example:
Title “The Best Submission of All Times”
Resulting array: [“The”, “Best”, “Submission”, “of”, “All”, “Times”]
Remove stop words: [“Best”, “Submission”, “All”, “Times”]
Then we query the database for submissions whose titles contain any of the remaining words, and for each result we add the weight: +2
And finally sort the list descending by weight and take the top N results.
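The weighting steps can be sketched in memory as follows. This is only an illustration under some assumptions: the Submission and RelatedSubmissions types are hypothetical, the pool is a plain collection rather than a database query, the +2 is applied once per submission that shares at least one title word, and submissions with a weight of zero are dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory model of a submission.
public sealed class Submission
{
    public string Title { get; set; } = "";
    public HashSet<string> Tags { get; set; } = new HashSet<string>();
}

public static class RelatedSubmissions
{
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "of", "a", "the", "in" };

    // Splits a title on spaces and removes the stop words.
    public static HashSet<string> Keywords(string title)
    {
        return new HashSet<string>(
            title.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                 .Where(w => !StopWords.Contains(w)),
            StringComparer.OrdinalIgnoreCase);
    }

    // +3 per shared tag, +2 if the titles share at least one keyword.
    public static int Weight(Submission current, Submission candidate)
    {
        int sharedTags = candidate.Tags.Count(t => current.Tags.Contains(t));
        HashSet<string> candidateWords = Keywords(candidate.Title);
        bool sharesWord = Keywords(current.Title).Any(w => candidateWords.Contains(w));
        return 3 * sharedTags + (sharesWord ? 2 : 0);
    }

    // Sorts the pool descending by weight and takes the top n results.
    public static List<Submission> Top(Submission current, IEnumerable<Submission> pool, int n)
    {
        return pool.Where(s => !ReferenceEquals(s, current))
                   .Select(s => new { Submission = s, Weight = Weight(current, s) })
                   .Where(x => x.Weight > 0) // assumption: weight 0 means unrelated
                   .OrderByDescending(x => x.Weight)
                   .Take(n)
                   .Select(x => x.Submission)
                   .ToList();
    }
}
```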
What do you think? (be gentle!)
If I understand correctly, you need a technique to find whether two posts are "similar" to each other. You may want to use a probabilistic model for that:
http://en.wikipedia.org/wiki/Mutual_information
The idea would be to say that if two posts share a lot of "uncommon" words, they are probably speaking on the same topic. For detecting uncommon words, depending on your application, you may use a general table of frequencies, or maybe better, build it yourself on the universe of the words of your posts (but you will need to have enough of them to have something relevant).
I would not limit myself to the title and tags, but I would weight them more heavily in the search.
This kind of idea is very common in spam filtering. I unfortunately don't have the time to make a full review, but a quick Google search gives:
http://www.aclweb.org/anthology/P/P04/P04-3024.pdf
karlmicha.googlepages.com/acl2004_poster.pdf

Adding words to SQL Server Full Text Stemmer

I've dug around for a few hours now and cannot find an option to do this. What I would like to do is to add words to the stemmer used by Full Text in SQL Server. I work for an agency that would like to search on variations of names. In other words, if an officer enters the name of "Bill" I would also get a hit on "Will" or "William". Anyone know if this is possible?
I did look at implementing a custom IStemmable interface but that seems a bit of an overblown solution to this problem. Does anyone know of an easier way or have an off-the-shelf solution that will do this?
Thanks...
In SQL Server 2005 or 2008 it is called the "Thesaurus". It is well documented on MSDN.
It handles things like these:
<expansion>
<sub>Internet Explorer</sub>
<sub>IE</sub>
<sub>IE5</sub>
</expansion>
<replacement>
<pat>NT5</pat>
<pat>W2K</pat>
<sub>Windows 2000</sub>
</replacement>
<expansion>
<sub>run</sub>
<sub>jog</sub>
</expansion>
Sigh....
Thanks. One of those times I was definitely trying to make it more difficult than it needed to be. I think I found stemmer early on and kept using it in my searches.
Thanks again.
