How to validate field length - C#

I have an Abc field in the database and I want to make sure the user does not exceed the field's limit, so I use validation that says "You cannot enter more than 50 characters".
Now the question is: because Chinese, Indic and other such scripts take more than 1 byte per character to store, how should I validate the length?
What happens right now is that a user enters only 30 characters and still gets the message "You cannot enter more than 50 characters".
The solutions I know of are to increase the size of the database column or to check the byte length, but neither seems like a good idea.
What approach would you follow to get this done?

If your GUI is in WinForms or WPF then you should be able to set a maximum length on a text field, which nicely stops people entering more than the specified number of characters, e.g.
myTextBox.MaxLength = 50;
You can also set this from the VS Designer - select the textbox and go to its MaxLength property.

Use an nvarchar type in your DB; its limit is a number of characters, not a number of bytes.

If you're using .NET and SQL Server's nvarchar type, you should be fine.
Nvarchar stores 2 bytes per character anyway, so it accommodates the Unicode string .NET passes to it without a problem. In other words, length in .NET and in SQL Server is measured in characters, not bytes, if you're using nvarchar in the database.
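If you also want a server-side check in C#, here is a minimal sketch (names like abcInput and errorLabel are hypothetical) that relies on string.Length counting UTF-16 code units, the same unit an nvarchar(50) limit is measured in:
const int MaxAbcLength = 50; // matches the hypothetical nvarchar(50) Abc column

// string.Length counts UTF-16 code units, not bytes, so Chinese or Indic
// text is measured the same way SQL Server measures nvarchar length.
if (abcInput.Length > MaxAbcLength)
{
    errorLabel.Text = "You cannot enter more than 50 characters.";
}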

Related

Custom Kendo numeric max value of more than 16 digits

I use NumericTextBoxFor to show amounts in my application, and I need to show 20 digits.
Can I accomplish this using NumericTextBoxFor?
I've been trying Max, but Kendo only gives me 16 digits:
@(Html.Kendo().NumericTextBoxFor(model => model.To)
    .Max(999999999999999999) // just 16 digits
    .Spinners(false)
)
I tried to set 88888888888888888888 but it shows me 88888888888888885248.
Since the actual logic takes place client side, you are most likely getting funny results because the largest integer a JavaScript Number can represent exactly is 9,007,199,254,740,992 (2^53). Beyond that, values are rounded to the nearest representable double and precision is lost.
I tried this max test on the Kendo web site and did see the value you were talking about, but when you click in the input box it turns into 88888888888888890000, which is consistent with the value being rounded and redisplayed.
See this and this for more information.
As an alternative to a max value, you could also try a Kendo masked textbox (docs here).
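The same rounding can be reproduced in C#, since a JavaScript Number is the same IEEE 754 double; a quick sketch, not Kendo-specific:
// 2^53 = 9,007,199,254,740,992 is the last integer a double can count exactly.
double d = 88888888888888888888d; // 20 digits: not exactly representable
Console.WriteLine(d);             // prints a rounded value, e.g. 8.888888888888889E+19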

Storing an int to SQL, but keep leading zeros

I have a field in my SQL Server 2012 table defined as int, but I convert the value from a textbox in C# using Convert.ToInt32(textbox.Text). Let's say the textbox contains the number 0032: it will be saved to the database as 32, dropping the 00.
Any solutions other than changing the field's data type?
Numeric datatypes do not retain leading zeros, as they are insignificant to the number you want to store. Char or varchar is more appropriate, and you could add a check constraint to ensure only numeric characters are stored.
If you absolutely cannot change the data type, then another alternative is to store the number of leading zeros in another int field, as sketched below.
So in your example you would store:
Value: 32
Leading zeros: 2
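A minimal sketch of reassembling the display value from those two columns (the variable names are hypothetical):
int value = 32;        // the int column
int leadingZeros = 2;  // the extra "leading zeros" column
string display = new string('0', leadingZeros) + value; // "0032"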
So save to the db in a numeric format - ignoring the leading zero's (as the others have mentioned) and then format like this example:
int i = 32;
string format = "0000";
Console.WriteLine(i.ToString(format));
A datatype is defined by a set of possible values and the operations that can be done on them.
So the question here is: what operations will you do on these values? Summing, adding, multiplying...?
If the answer is 'no', then change the column's type to varchar() or char() and store the value as-is, with the leading zeroes.
If it's 'yes', then store a proper number and leave the formatting to the client.
In any case, always try to use a proper datatype in the database; domain integrity is a nice thing to have.

Storing stories in SQL Server 2008?

I am going to store stories in nvarchar(MAX) fields in SQL Server, but I know the stories will be much longer than MAX allows, so what approach should I take? Should I split the story across multiple rows or should I skip using a database and use text files?
I believe the confusion stems from a misunderstanding of terms here.
nvarchar(n) is a data type where n can be a number from 1 to 4000. The maximum n of 4000 adds up to 8000 bytes (2 bytes per character).
nvarchar(MAX) is a different data type altogether - the keyword MAX is a literal, not a synonym for any potential value of n in my example above. Fields of this type hold up to 2^31-1 bytes (about 2 GB) of data, which is over 1 billion characters at 2 bytes per character.
The same principles apply to varchar(n) and varchar(MAX), except each character may take only 1 byte, in which case the number of characters that can be stored is doubled. Whether it is only 1 byte depends on the collation, as Martin Smith notes in a comment!
Store them in chapters.
This is not a technical limit - it is pretty much impossible to have a story of 1 billion nvarchar characters (and nvarchar(max) is the "new" TEXT data type).
But loading and processing them will be painful.
Store them as chapters, and store a start/end page number for every chapter where it makes sense, so you can navigate a little more easily.
Btw., you posted that you thought the limit was 800 chars - that was never the case. The limit would be 8000 bytes - if it applied - and that would be 4000 Unicode characters.
I'd probably suggest looking into document oriented databases for something like this.
You could also try storing it as LONGTEXT (MySQL) or TEXT (SQL Server); if you want to store binary objects, you can use a BLOB type.

What type shall I use to store a 12-digit value in a SQL DB: decimal or nvarchar?

I need to store a card ID number in the database. There is no calculation on it, just a lookup of the ID, and the value is then put in Session as a property of a class.
The ID is always numeric and always 12 digits,
e.g. 123456789012, and I would like to show it on the screen in this format: 123.456.789.012 (a dot every 3 digits).
I ran a test with Decimal(12,0) defined in the database and put this value in it: 555666777888.
Then I try to display it on the screen with this code (CardID is a decimal):
lblCardID.Text = ent.CardID.ToString("0:#,###")
but it shows on the screen like this: 555,666,77:7,888
where is the colon (:) coming from?
Additional question:
- What type shall I use in MS SQL to store this value: decimal(12,0) or nvarchar(12)?
nvarchar is definitely not needed. If it's always 12 digits, char(12) would be fine, but I think a 64-bit integer would be most appropriate.
Try writing
lblCardID.Text = ent.CardID.ToString("#,###")
You can use the decimal(12,0) or the bigint datatype; bigint requires one byte less (8 bytes total) per stored value.
The colon is coming from the colon in your format string. The "0:" at the beginning is needed when you are using string.Format() - as the placeholder identifying which argument to format - but not when you are using ToString(), since there is only one value being formatted.
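A small sketch of the difference (output shown for an en-US culture):
decimal cardId = 555666777888m;
string viaFormat   = string.Format("{0:#,###}", cardId); // "555,666,777,888"
string viaToString = cardId.ToString("#,###");           // "555,666,777,888"
// Passing "0:#,###" straight to ToString() makes '0' and ':' part of a custom
// numeric format, which is where "555,666,77:7,888" gets its stray colon.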
I would use bigint because it needs only 8 bytes per value.
decimal(12,0) needs 9 bytes, and varchar or nvarchar even more (12 or 24 bytes respectively for 12 digits).
Smaller columns make indexes smaller, which makes indexes faster in use.
Formatting numbers can be done in the application.
It's also much easier to change formatting in the app if requirements change.
If you need to store the formatting and it's just a numeric value, use varchar; don't waste space with nvarchar, as it doubles your storage size and won't do you any good unless you expect special (international) characters.
If it's never going to be calculated on, I would store it as char(12).
Then in your code, format it with something like this, using Replace to convert the commas to dots:
lblCardID.Text = ent.CardID.ToString("#,###").Replace(",", ".")
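One caveat: "#,###" uses the current culture's group separator, which is not always a comma, so the Replace can silently do nothing. A sketch of a culture-proof variant:
using System.Globalization;

// Format against the invariant culture so the separator is always ',' before replacing.
lblCardID.Text = ent.CardID.ToString("#,###", CultureInfo.InvariantCulture)
                           .Replace(",", ".");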
If it's an ID number, store it as a string datatype: you're not going to be doing sums on it, and you won't have problems losing leading zeros. You could also then store the card ID with the embedded dots, sorting out your formatting problems.
Does your identifier's domain have mathematical properties, other than being composed of digits? If not, your value is a fixed-width string, so use CHAR(12). Do not forget to add appropriate domain checks (no characters other than digits, no leading zero, etc.), e.g.
CREATE TABLE Cards
(
    card_ID CHAR(12) NOT NULL
        UNIQUE
        CONSTRAINT card_ID__all_digits
            CHECK (card_ID NOT LIKE '%[^0-9]%')
        CONSTRAINT card_ID__no_leading_zero
            CHECK (card_ID LIKE '[1-9]%')
);

Reading a Cobol generated file

I'm currently on the task of writing a C# application which is going to sit between two existing apps. All I know about the second application is that it processes files generated by the first one. The first application is written in COBOL.
Steps:
1) The COBOL application writes some files and copies them to a directory.
2) The second application picks these files up and processes them.
My C# app would sit between 1) and 2). It would have to pick up the file generated by 1), read it, modify it and save it, so that application 2) wouldn't know I had even been there.
I have a few problems.
First of all, if I open a file generated by 1) in Notepad, most of it is unreadable, while other parts are.
If I read the file, modify it and save it, I must save the file in the same notation used by the COBOL application, so that app 2) doesn't know I've been there.
I've tried reading the file this way, but it's still unreadable:
Code:
string ss = #"filename";
using (FileStream fs = new FileStream(ss, FileMode.Open))
{
StreamReader sr = new StreamReader(fs);
string gg = sr.ReadToEnd();
}
Also, if I find a way of making it readable (using some sort of encoding technique), I'm afraid that when I save the file again I may change its original format.
Any thoughts? Suggestions?
To read the COBOL-generated file, you'll need a few things.
First, you'll need the record layout (copybook) for the file. A COBOL record layout will look something like this:
01 PATIENT-TREATMENTS.
   05 PATIENT-NAME             PIC X(30).
   05 PATIENT-SS-NUMBER        PIC 9(9).
   05 NUMBER-OF-TREATMENTS     PIC 99 COMP-3.
   05 TREATMENT-HISTORY        OCCURS 0 TO 50 TIMES
                               DEPENDING ON NUMBER-OF-TREATMENTS
                               INDEXED BY TREATMENT-POINTER.
      10 TREATMENT-DATE.
         15 TREATMENT-DAY      PIC 99.
         15 TREATMENT-MONTH    PIC 99.
         15 TREATMENT-YEAR     PIC 9(4).
      10 TREATING-PHYSICIAN    PIC X(30).
      10 TREATMENT-CODE        PIC 99.
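As a rough illustration (not from the original answer), here are the byte offsets a layout like this implies, assuming no alignment padding - see the note on word boundaries at the end of this answer:
// Hypothetical offsets for the copybook above, assuming no alignment padding.
const int PatientNameOffset  = 0;   // PIC X(30)     -> 30 character bytes
const int PatientSsnOffset   = 30;  // PIC 9(9)      -> 9 zoned-decimal bytes
const int TreatmentCntOffset = 39;  // PIC 99 COMP-3 -> 2 packed bytes
const int TreatmentsOffset   = 41;  // each entry: 8 (date) + 30 (physician) + 2 (code) = 40 bytes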
You'll also need a copy of IBM's Principles of Operation (S/360, S/370, z/OS - it doesn't really matter which for our purposes). The latest is available from IBM at
http://www-01.ibm.com/support/docview.wss?uid=isg2b9de5f05a9d57819852571c500428f9a (but you'll need an IBM account).
An older edition is available, gratis, at http://www.hack.org/mc/texts/principles-of-operation.pdf
Chapters 8 (Decimal Instructions) and 9 (Floating Point Overview and Support Instructions) are the interesting bits for our purposes.
Without that, you're pretty much lost.
Then, you need to understand COBOL data types. For instance:
PIC defines an alphameric formatted field (PIC 9(4), for example, is 4 decimal digits that might be filled with space characters if missing). PIC 999V99 is 5 decimal digits with an implied decimal point. So on and so forth.
BINARY is [usually] a signed fixed point binary integer. Usual sizes are halfword (2 octets) and fullword (4 octets).
COMP-1 is single precision floating point.
COMP-2 is double precision floating point.
If the data source is an IBM mainframe, COMP-1 and COMP-2 likely won't be IEEE floating point: they will be IBM's base-16 excess-64 floating point format. You'll need something like the S/370 Principles of Operation to help you understand it.
COMP-3 is 'packed decimal', of varying lengths. Packed decimal is a compact way of representing a decimal number. The declaration will look something like this: PIC S9999V99 COMP-3. This says that it is signed and consists of 6 decimal digits with an implied decimal point. Packed decimal represents each decimal digit as a nibble of an octet (hex values 0-9). The high-order digit is the upper nibble of the leftmost octet. The low nibble of the rightmost octet is a hex value A-F representing the sign. So the above PIC clause will require ceil( (6+1)/2 ) or 4 octets. The value -345.67, as represented by the above PIC clause, will look like 0x0034567D. The actual sign value may vary (the default is C/positive, D/negative, but A, C, E and F are treated as positive, while only B and D are treated as negative). Again, see the S/370 Principles of Operation for details on the representation.
Related to COMP-3 is zoned decimal. This might be declared as PIC S9999V99 (signed, 6 decimal digits, with an implied decimal point). Decimal digits, in EBCDIC, are the hex values 0xF0-0xF9. 'Unpack' (a mainframe machine instruction) takes a packed decimal field and turns it into a character field. The process is:
Start with the rightmost octet. Swap its nibbles, so the sign nibble is on top, and place it into the rightmost octet of the destination field.
Working from right to left (in both source and target), strip off each remaining nibble of the packed decimal field and place it into the low nibble of the next available octet in the destination, filling the high nibble with a hex F.
The operation ends when either the source or destination field is exhausted.
If the destination field is not exhausted, it is left-padded with zeroes by filling the remaining octets with decimal '0' (0xF0).
So our example value, -345.67, stored with the default negative sign value (hex D) and unpacked into an 8-digit field, would come out as 0xF0F0F0F3F4F5F6D7 ('0003456P' in EBCDIC).
[There you go. There's a quiz later]
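To make the packed-decimal description concrete, here is a minimal C# sketch of unpacking a COMP-3 field, assuming you already know its bytes and implied decimal places from the copybook:
// Decode a COMP-3 (packed decimal) field: every nibble is a digit except the
// low nibble of the last octet, which holds the sign (B/D negative, rest positive).
static decimal UnpackComp3(byte[] field, int impliedDecimals)
{
    long digits = 0;
    for (int i = 0; i < field.Length; i++)
    {
        digits = digits * 10 + (field[i] >> 4);        // high nibble: always a digit
        if (i < field.Length - 1)
            digits = digits * 10 + (field[i] & 0x0F);  // low nibble: digit, except last octet
    }
    int sign = field[field.Length - 1] & 0x0F;
    decimal scale = 1m;
    for (int k = 0; k < impliedDecimals; k++) scale *= 10m;
    decimal value = digits / scale;
    return (sign == 0x0B || sign == 0x0D) ? -value : value;
}

// UnpackComp3(new byte[] { 0x00, 0x34, 0x56, 0x7D }, 2) == -345.67m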
If the COBOL app lives on an IBM mainframe, has the file been converted from its native EBCDIC to ASCII? If not, you'll have to do the mapping yourself. (Hint: it's not necessarily as straightforward as it might seem, since this can be a selective process - only character fields get converted, while COMP-1, COMP-2, COMP-3 and BINARY are excluded because they are sequences of binary octets. Worse, there are multiple flavors of EBCDIC, due to varying national implementations and the varying print chains in use on different printers.)
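If you do have to convert character fields yourself, .NET's code-page encodings can help. A sketch assuming US EBCDIC (code page 37) - an assumption you must verify - where record is a hypothetical byte[] holding one record:
using System.Text;

// On .NET Core / .NET 5+, the code-page encodings ship separately:
// Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Encoding ebcdic = Encoding.GetEncoding(37);            // IBM037, US EBCDIC
string patientName = ebcdic.GetString(record, 0, 30);  // decode ONLY character fields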
Oh... one last thing. The mainframe hardware tends to like things aligned on halfword, word or doubleword boundaries, so the record layout may not map directly to the octets in the file: there may be padding octets inserted between fields to maintain the needed alignment.
Good Luck.
I see from comments attached to your question that you are dealing with the “classic” COBOL batch file structure: Header record, detail records and trailer record.
This is probably bad news if you are responsible for creating the trailer record! The typical “trailer” record is used to identify the end-of-file and provides control information such as the number of records that precede it and various check sums and/or grand totals for “detail” records. In other words, you may need to read and summarize the entire file in order to create the trailer. Add to this the possibility that much of the data in the file is in Packed Decimal, Zoned Decimal or other COBOLish numeric data types, and you could be in for a rough time.
You might want to question why you are adding trailer records to these files. Typically the “trailer” is produced by the same program or application that created the “detail” records. The trailer is supposed to act as a verification that the sending application/program wrote all of the data it was supposed to. The summary totals, counts etc. are used by the receiving application to verify that the detail records tally with the preceding details. This is supposed to serve as another verification that the sending application didn't muff up the data or that it was not corrupted en-route (no that wasn't a joke – but maybe it should be). When a "man in the middle" creates the trailers it kind of defeats the entire purpose of the exercise (no matter how flawed it might have been to begin with).
It would be useful to know which COBOL dialect you are dealing with, because there is no single COBOL file format. Some COBOL compilers (Micro Focus) put a "File Description" at the front of files (for Micro Focus VB / indexed files).
Have a look at the RecordEditor (http://record-editor.sourceforge.net/). It has a File Wizard which could be very useful for you.
In the File Wizard, set the file as a Fixed-Width File (most common in COBOL). The program lets you try out different record lengths; when you get the correct record length, the text fields should line up.
Later on in the Wizard there is a field search which can look for Binary, COMP-3 and Text fields.
There are some notes on using the RecordEditor's Wizard with an unknown file here:
http://record-editor.sourceforge.net/Unkown.htm
Unless the file is coming from a mainframe / AS400, it is unlikely to use EBCDIC (cp037 - Code Page 37 - is US EBCDIC); any text is most likely ASCII.
The file probably contains Packed-Decimal (COMP-3) and binary-integer data. Most COBOLs use big-endian COMP integers, even on Intel (little-endian) hardware.
One thing to remember: a COBOL PIC s9(6)V99 COMP field is stored as a binary integer, with x'0001' representing 0.01. So unless you have the COBOL definition, you cannot tell whether a binary 1 means 1, 0.1, 0.01, etc.
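Putting those two points together, a sketch of reading one such field (record is a hypothetical byte[]; the scale must come from the copybook):
// COBOL COMP fields are big-endian two's complement, so assemble the bytes
// explicitly rather than trusting the host's (little-endian) byte order.
static int ReadBigEndianInt32(byte[] buf, int offset) =>
    (buf[offset] << 24) | (buf[offset + 1] << 16) |
    (buf[offset + 2] << 8) | buf[offset + 3];

// PIC s9(6)V99 COMP: x'00000001' means 0.01, so apply the implied scale of 100.
decimal amount = ReadBigEndianInt32(record, 0) / 100m;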
