Word translator from XML file - c#

I have to write a program that translates a word from one language to another. For example, translate("Hello", "FR") must return "Bonjour".
The data is held in a .NET dictionary that lives in a memory cache.
First, I have to write the translations to an XML file, but I don't know how to organize the file or how to read it.
I'll have one dictionary per language. For example, I'll have
EN, which contains 3 keys, "Bonjour", "Hola" and "Guten Tag", all with the same value, "Hello".
So when I receive ("Bonjour", "EN"), I'll look in the EN dictionary and return the value for the key Bonjour.
But I really don't see how to organize this in XML so that I can set up the whole system.
Would this work?
<dico>
<en>
<traduction id="bonjour" name="hello"/>
<traduction id="hola" name="hello"/>
<traduction id="dormir" name="to sleep"/>
<traduction id="geld" name="argent"/>
<traduction id="por favor" name="please"/>
</en>
<fr>
...
</fr>
</dico>
Can you help me please?

This looks fine to me.
For your question about how to read such a file, see the question "How to read xml file in C#".
To read the id and name values, use node.Attributes["id"].Value and node.Attributes["name"].Value.
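As a rough sketch of how that attribute access could feed the per-language dictionaries described in the question: the class name TranslationLoader and the methods Load and Translate are made up for illustration, and the code assumes every traduction element carries both an id and a name attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class TranslationLoader
{
    // Builds one dictionary per target language, e.g. result["en"]["bonjour"] == "hello".
    // Assumes the <dico>/<lang>/<traduction id name> layout from the question.
    public static Dictionary<string, Dictionary<string, string>> Load(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // use doc.Load(path) to read from a file instead

        var result = new Dictionary<string, Dictionary<string, string>>(StringComparer.OrdinalIgnoreCase);
        foreach (XmlNode languageNode in doc.DocumentElement.ChildNodes)
        {
            if (languageNode.NodeType != XmlNodeType.Element) continue;

            var words = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (XmlNode node in languageNode.SelectNodes("traduction"))
            {
                // Throws if an attribute is missing; a real loader should validate.
                words[node.Attributes["id"].Value] = node.Attributes["name"].Value;
            }
            result[languageNode.Name] = words;
        }
        return result;
    }

    // Returns null when the language or the word is unknown.
    public static string Translate(
        Dictionary<string, Dictionary<string, string>> all, string word, string language)
    {
        Dictionary<string, string> words;
        string translation;
        if (all.TryGetValue(language, out words) && words.TryGetValue(word, out translation))
            return translation;
        return null;
    }

    static void Main()
    {
        const string xml =
            "<dico><en>" +
            "<traduction id=\"bonjour\" name=\"hello\"/>" +
            "<traduction id=\"hola\" name=\"hello\"/>" +
            "</en></dico>";

        var all = Load(xml);
        Console.WriteLine(Translate(all, "Bonjour", "EN")); // prints "hello"
    }
}
```

The dictionaries use a case-insensitive comparer so that "Bonjour"/"EN" finds the "bonjour"/"en" entries; drop the comparer if exact matching is wanted.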

Following on from my comment above, a better model might be to put the dictionary key above the language elements, e.g.
<dico>
<lex id="hello">
<en>hello</en>
<fr>bonjour</fr>
...
</lex>
...
</dico>
Although that might not work with the way that you need to query it, particularly when going from another language to English (or the language you use for the key).
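One way around that caveat, sketched under the assumption that each lex element holds one child element per language code: scan the lex entries for a match in the source language and read the sibling element for the target language. LexLookup and Translate are hypothetical names, and the lookup is a linear scan, so it would need caching for a large file.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class LexLookup
{
    // Finds the first <lex> whose <from> child equals the word and
    // returns the text of its <to> child, or null if there is none.
    public static string Translate(XDocument dico, string word, string from, string to)
    {
        return dico.Root
            .Elements("lex")
            .Where(lex => string.Equals((string)lex.Element(from), word,
                                        StringComparison.OrdinalIgnoreCase))
            .Select(lex => (string)lex.Element(to))
            .FirstOrDefault();
    }

    static void Main()
    {
        var dico = XDocument.Parse(
            "<dico><lex id=\"hello\"><en>hello</en><fr>bonjour</fr></lex></dico>");

        Console.WriteLine(Translate(dico, "bonjour", "fr", "en")); // prints "hello"
    }
}
```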

Related

How to find newest document in Mongo Collection (C#)

How do I find the most recent document in a MongoCollection? Currently I'm doing the following, but it seems to be returning the same value regardless:
_collection.FindAllAs<Game>().SetSortOrder(SortBy.Descending("When")).FirstOrDefault<Game>();
The documents are structured in pseudocode as follows:
Game
{
DateTime When;
List<Score> Scores;
...other variables...
}
The games are always stored sequentially via Update.PushWrapped<Score>(Score s)
How could I improve this?
One possible solution is to create a collection that stores the last inserted _id value for each of your collections, and to query for this value when you need the newest document.
As I said, that is one possible solution, and I'm sure it will work, but it may not be the best one; that depends on your document structure, etc.
I use this approach to implement auto-increment fields.

Strategies for modeling large (50~) number of properties

Scenario
I'm parsing emails and inserting them into a database using an ORM (NHibernate, to be exact). While my current approach technically works, I'm not very fond of it, but I can't think of a better solution. Each email contains ~50 fields, is sent by a third party, and looks like this (obviously a very short dummy sample).
Field #1: Value 1 Field #2: Value 2
Field #3: Value 3 Field #4: Value 4 Field #5: Value 5
Problem
My problem is that with parsing this many fields the database table is an absolute monster. I can't create proper models employing any kind of relationships either AFAIK because each email sent is all static data and doesn't rely on any other sources.
The only idea I have is to find commonalities between the fields and split them into more manageable chunks, say ~10 fields per entity, so 5 entities in total. However, I'm not terribly in love with that idea either, seeing as all I'd be doing is creating one-to-one relationships.
What is a good way of managing a large number of properties that are out of your control?
Any thoughts?
Create 2 tables: one for the main object and one for the fields. That way you can programmatically access each field as necessary, and the object model doesn't look too nasty.
But this is just off the top of my head; you have a weird problem.
If the data comes back in a file that you can parse easily, then you might be able to get away with writing a command-line application that generates scripts and C# code that you can then execute or copy and paste into your program. I've done that when creating properties out of tables from HTML pages (like this one I had to do recently).
If the 50 properties are actually unique and discrete pieces of data about this one entity, I don't see a problem with having those 50 properties (even though that sounds like a lot) on one object. For example, the Type class has a large number of boolean properties relating to its data (IsPublic, etc.).
Alternatives:
Well, one option that comes to mind immediately is deriving from DynamicObject and overriding TryGetMember to look up the 'property' name as a key in a dictionary of key-value pairs (where your real set of 50 key-value pairs lives). Of course, figuring out how to map that from your ORM into your entity is the other problem, and you'd lose IntelliSense support.
However, just throwing the idea out there.
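A minimal sketch of that idea, with FieldBag as a made-up class name. It only covers the in-memory side; how an ORM would populate it is left open, as the answer notes.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical dictionary-backed object: member access is resolved at run time.
class FieldBag : DynamicObject
{
    private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

    // Called for reads such as email.Sender; returning false makes the
    // runtime binder throw for unknown names.
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        return _values.TryGetValue(binder.Name, out result);
    }

    // Called for writes such as email.Sender = "...".
    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        _values[binder.Name] = value;
        return true;
    }
}

static class FieldBagDemo
{
    static void Main()
    {
        dynamic email = new FieldBag();
        email.Sender = "someone@example.com";
        Console.WriteLine(email.Sender); // prints "someone@example.com"
    }
}
```

Because the members are resolved through the dictionary, a typo in a member name only fails at run time, which is the loss of compile-time support mentioned above.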
Use a dictionary instead of separate fields. In the database, you just have a table for the field name and its value (and what object it belongs to).
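To illustrate the dictionary approach on the sample email from the question, here is a sketch that assumes every label really has the form "Field #N:" (the real labels are not shown in the question, so the pattern would have to be adapted). EmailFieldParser is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class EmailFieldParser
{
    // A label, a colon, then the value up to the next label or the end of the line.
    static readonly Regex FieldPattern = new Regex(
        @"(Field #\d+):[ \t]*(.*?)(?=\s+Field #\d+:|$)", RegexOptions.Multiline);

    // Returns field name -> value; each pair maps to one row of a name/value table.
    public static Dictionary<string, string> Parse(string body)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in FieldPattern.Matches(body))
        {
            fields[m.Groups[1].Value] = m.Groups[2].Value.Trim();
        }
        return fields;
    }

    static void Main()
    {
        string body =
            "Field #1: Value 1 Field #2: Value 2\n" +
            "Field #3: Value 3 Field #4: Value 4 Field #5: Value 5";

        var fields = Parse(body);
        Console.WriteLine(fields["Field #4"]); // prints "Value 4"
    }
}
```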

how to insert data into db as a serialized object

My basic question is how to insert data into a database as a serialized object, and how to extract and use it afterwards. Any suggestions?
For example:
{id:1, userId:1, type:PHOTO, time:2008-10-15 12:00:00, data:{photoId:2089, photoName:A trip to the beach}}
As you can see, the question is how I could insert data into the Data column and then use it.
Another question: if I store photoName inside Data instead of using a JOIN to get the name from its own table (photos) by its Id, the stored copy won't reflect later updates to photoName, right? Besides that, I won't be able to create a relation between the photos table and the current table (Id => photoId) if I store the data like that. Part of the problem is that I don't know exactly what kind of information will be stored in the Data column, so I can't create a separate column for every type of information.
Typically I see two options for you here.
1. You can store an XML-serialized object in the database and simply use standard XML serialization; here is an example that you can adapt for your needs.
2. You can create a true table for this object and do things the "standard" way.
With option 1, filtering, joining or searching on the information in the "data" column, although still technically possible, is NOT something I would recommend; in my opinion it is better suited to static storage, such as a user settings entity or some other item that is VERY unlikely to be needed in a backend query.
With option 2, yes, you have to do more work, but if you define the object well, it will be possible.
Clarification
Regarding my example in #1 above: you would serialize to a memory stream or similar rather than to a file.
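A sketch of what in-memory XML serialization could look like with XmlSerializer, writing to a StringWriter so the result can be passed straight to a database parameter. PhotoData and XmlColumn are invented names that mirror the example in the question; they are not from the linked example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical payload type; XmlSerializer needs a public type
// with a public parameterless constructor.
public class PhotoData
{
    public int PhotoId { get; set; }
    public string PhotoName { get; set; }
}

public static class XmlColumn
{
    // Serializes to an in-memory string instead of a file.
    public static string Serialize<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Rebuilds the object from the string read back from the column.
    public static T Deserialize<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }

    static void Main()
    {
        string xml = Serialize(new PhotoData { PhotoId = 2089, PhotoName = "A trip to the beach" });
        PhotoData copy = Deserialize<PhotoData>(xml);
        Console.WriteLine(copy.PhotoName); // prints "A trip to the beach"
    }
}
```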
If you don't want to store the data relationally, you're really better off not using a relational database. Several object databases speak JSON and would be able to handle this kind of problem pretty easily.
You can store it as a JSON string and use JSONSerializer from Json-lib
http://json-lib.sourceforge.net/apidocs/index.html
to convert a JavaBean into a JSON string/object and vice versa.
We generally use this to store configuration where the number of config parameters is unknown.
Regarding saving an object in your database: you can represent your object as XML, convert it to a string with XDocument.ToString(), and save it in a column of the database's xml data type.
cmd.Parameters.AddWithValue("@Value", xmldoc.ToString());
See also "Work with XML Data Type in SQL Server".

Creating a list of checkboxes using ASP.NET MVC for a table without a primary key

Long story short, the database I'm using needs to get looked at. Until that happens, I need to make do with what I've been given (I know, I should fix it..).
I have a table that gets populated from an external text file. I am not sure of the exact process as I'm relatively new to the company.
The table does not have a primary key as the entire table is dumped and re-loaded every quarter when there is a new text file.
Enter ASP.NET MVC. I need to display that table with checkboxes in a grid so the user can select some rows and send them back to the server. It sounds relatively easy, but I'm really not sure what to use as the value for the checkboxes, as I'm pretty sure I'll need to combine multiple columns to create a unique key. Yep, I know, I know :).
OldTable
- Field1
- Field2
- Field3
...
- FieldN
The View
...
<input type="checkbox"
name="bunchOfStuff"
value="Field1Value,Field2Value,Field3Value"/>
...
Would something like this work? If I can create a key with a few fields, can I use those fields as the value in the checkbox? I realize my action will be a bit ugly as I'll have to split and parse each value in the array of values.
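For what it's worth, the split-and-parse step on the server could be sketched like this, assuming the posted values arrive as a string array named after the checkbox and that no field value ever contains the separator (otherwise a different separator or encoding is needed). CompositeKey is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CompositeKey
{
    // Joins the key columns into one checkbox value.
    public static string Format(params string[] parts)
    {
        return string.Join(",", parts);
    }

    // Splits each posted checkbox value back into its key columns.
    public static List<string[]> Parse(IEnumerable<string> postedValues)
    {
        return (postedValues ?? Enumerable.Empty<string>())
            .Select(value => value.Split(','))
            .ToList();
    }

    static void Main()
    {
        // Stand-in for the array bound to the "bunchOfStuff" checkboxes.
        string[] bunchOfStuff = { Format("A1", "B1", "C1"), Format("A2", "B2", "C2") };

        foreach (string[] key in Parse(bunchOfStuff))
        {
            Console.WriteLine(key[0] + " / " + key[1] + " / " + key[2]);
        }
    }
}
```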
Wow, good luck with this!
I think your solution will certainly work, and I can't think of a more elegant one.
However, I think you're going to be in deep trouble down the line. What I would do is simply add an auto-incrementing unique identifier column to the table.
That shouldn't affect any of your processes or even your bulk insert application, unless you insert using ordinal field offsets rather than column names.
Sorry this answer isn't exactly what you're looking for, but the DB is just so bad that I think any answer presented will be equally bad.

I need to parse an HTML formatted country list into SQL inserts. Is there an easier way to do this?

There are about 2000 lines of this, so doing it manually would probably take more work than figuring out a way to do this programmatically. It only needs to work once, so I'm not concerned with performance or anything.
<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>
<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>
<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>
Basically it's formatted like this, and I need to divide it into 4 parts: Country Name, Country Abbreviation, Division Name and Division Abbreviation.
In keeping with my complete lack of efficiency, I was planning just to do a string.Replace on the HTML tags after I broke them up, and then find the index of the opening brackets and grab the space-delimited strings that remain. Then I realized I have no way of keeping track of which is the country and which is the division, or of grouping them by country.
So is there a better way to do this? Or better yet, an easier way to populate a database with countries and provinces/states? I looked around SO, and the only readily available databases I can find don't provide the full names of the countries or the provinces/states, or use IPs instead of geographic names.
Paste it into a spreadsheet. Some spreadsheets will parse the HTML table for you.
Save it as a .CSV file and process it that way. Or add a column to the spreadsheet that says something like the following:
="INSERT INTO COUNTRY(CODE,NAME) VALUES ('" & A1 & "','" & B1 & "');"
Then you have a column of INSERT statements that you can cut, paste and execute.
Edit
Be sure to include the <table> tag when pasting into a spreadsheet.
<table><tr><th>country</th><th>name</th></tr>
<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>
<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>
<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>
</table>
Processing a CSV file requires almost no parsing. It's got quotes and commas. Much easier to live with than XML/HTML.
/<tr><td>([^(<]+?)\s\(([^)]+)\)<\/td><td>([^(<]+?)\s\(([^)]+)\)<\/td><\/tr>/
Then you should have 4 captures with the 4 pieces of data from any PCRE engine :)
Alternatively, something like http://jacksleight.com/assets/blog/really-shiny/scripts/table-extractor.txt provides more completeness.
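A sketch of the same four-capture idea in C#, assuming every row follows the "Name (CODE)" layout shown in the question. The Division table and its column names are placeholders, and CountryRowParser is a made-up class name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CountryRowParser
{
    // Four captures: country name, country code, division name, division code.
    static readonly Regex Row = new Regex(
        @"<tr><td>([^(<]+?)\s\(([^)]+)\)</td><td>([^(<]+?)\s\(([^)]+)\)</td></tr>");

    // Yields one INSERT statement per matching row.
    public static IEnumerable<string> ToInserts(string html)
    {
        foreach (Match m in Row.Matches(html))
        {
            yield return string.Format(
                "INSERT INTO Division (CountryName, CountryCode, DivisionName, DivisionCode) " +
                "VALUES ('{0}', '{1}', '{2}', '{3}');",
                Escape(m.Groups[1].Value), Escape(m.Groups[2].Value),
                Escape(m.Groups[3].Value), Escape(m.Groups[4].Value));
        }
    }

    // Doubles single quotes so names containing an apostrophe stay valid SQL literals.
    static string Escape(string value)
    {
        return value.Replace("'", "''");
    }

    static void Main()
    {
        string html =
            "<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>\n" +
            "<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>";

        foreach (string sql in ToInserts(html))
        {
            Console.WriteLine(sql);
        }
    }
}
```

For a one-off import, generating the statements as text and reviewing them before running them is probably enough; anything reused should go through parameterized commands instead.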
Sounds like a problem easily solved by a Regex.
I recently learned that if you open a url from Excel it will try and parse out the table data.
If you are able to see this table in the browser (Internet Explorer), you can select the entire table, right-click and choose "Export to Microsoft Excel".
That should help you get data into separate columns, I guess.
Do you have to do this programmatically? If not, may I suggest just copying and pasting the table (from the browser) into MS Excel and then clearing all formats? This way you get a nice table that can then be imported into your database without problems.
just a suggestion... hth
An assembly exists for .NET called System.Xml. You can reference the assembly and load your HTML document into a System.Xml.XmlDocument, then pinpoint the HTML node that contains the data you need and use its child nodes to add rows to your data. This requires little string parsing on your part.
Load the HTML data as XElements, use LINQ to grab the values you need, and then INSERT.
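Roughly, that could look like the sketch below. It assumes the rows are well-formed XML (which holds for the sample in the question but not for HTML in general) and wraps them in a table element so there is a single root. TableRows is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class TableRows
{
    // Splits "Canada (CA)" into { "Canada", "CA" }.
    static string[] SplitCell(string cell)
    {
        int open = cell.LastIndexOf('(');
        if (open < 0) return new[] { cell.Trim(), "" };

        string name = cell.Substring(0, open).Trim();
        string code = cell.Substring(open + 1).Trim().TrimEnd(')');
        return new[] { name, code };
    }

    // Returns one array per row: country name, country code, division name, division code.
    public static List<string[]> Parse(string rowsHtml)
    {
        XElement table = XElement.Parse("<table>" + rowsHtml + "</table>");
        return table.Elements("tr")
            .Select(tr => tr.Elements("td").SelectMany(td => SplitCell(td.Value)).ToArray())
            .ToList();
    }

    static void Main()
    {
        string rows =
            "<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>\n" +
            "<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>";

        foreach (string[] row in Parse(rows))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

Grouping by country is then a matter of calling GroupBy on the second array element (the country code) before generating the INSERT statements.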
Blowing my own trumpet here but my FOSS tool CSVfix will do it with a combination of the read_xml and sql_insert commands.
