Having noticed a small mistake in my C# code (end of line 4):
Domain.Models.Patient patient = new Domain.Models.Patient
{
PatientId = patientId,
StudyID = studyId,
};
I don't get any build errors or runtime errors when there is an errant comma after studyId.
Why is this, and does it really matter?
It doesn't matter, and it will not cause a compile-time error either. The same is true of other constructs, such as enums. The trailing comma is probably allowed to signal that further items may be added to the list.
enum Test
{
Value1,
Value2,
//Value3, May be to comment out easily
}
Found a reference in the C# Language Specification:
(Section 24.2) Like Standard C++, C# allows a trailing comma at the end of an
array-initializer. This syntax provides flexibility in adding or
deleting members from such a list, and simplifies machine generation
of such lists.
And
(Section - 21.1) - C# allows a trailing comma in an enum-body, just like it allows one in
an array-initializer
The last comma is ignored by the compiler.
Domain.Models.Patient patient = new Domain.Models.Patient
{
PatientId = patientId,
StudyID = studyId, // this comma is ignored by compiler.
};
This is very convenient when you copy/paste, rearrange, or comment your initializers in and out. You don't have to worry about adding or removing a comma in the process.
I have a file that is formatted this way --
{2000}000000012199{3100}123456789*{3320}110009558*{3400}9876
54321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX
78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLAS
TX 73920**
Basically, the number in curly brackets denotes field, followed by the value for that field. For example, {2000} is the field for "Amount", and the value for it is 121.99 (implied decimal). {3100} is the field for "AccountNumber" and the value for it is 123456789*.
I am trying to figure out a way to split the file into "records" and each record would contain the record type (the value in the curly brackets) and record value, but I don't see how.
How do I do this without a loop going through each character in the input?
A different way to look at it.... The { character is a record delimiter, and the } character is a field delimiter. You can just use Split().
var input = @"{2000}000000012199{3100}123456789*{3320}110009558*{3400}987654321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**";
var rows = input.Split( new [] {"{"} , StringSplitOptions.RemoveEmptyEntries);
foreach (var row in rows)
{
var fields = row.Split(new [] { "}"}, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine("{0} = {1}", fields[0], fields[1]);
}
Output:
2000 = 000000012199
3100 = 123456789*
3320 = 110009558*
3400 = 987654321*
3600 = CTR
4200 = D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**
5000 = D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**
This regular expression should get you going:
Match a literal {
Match 1 or more digits ("a number")
Match a literal }
Match all characters that are not an opening {
\{\d+\}[^{]+
It assumes that the values themselves cannot contain an opening curly brace. If they can (for example, escaped as \{), you need to be more clever, e.g. @"\{\d+\}(?:\\{|[^{])+" (there are likely better ways).
Create a Regex instance and have it match against the text. Each "field" will be a separate match:
var text = @"{123}abc{456}xyz";
var regex = new Regex(@"\{\d+\}[^{]+", RegexOptions.Compiled);
// Declare the loop variable as Match; with var it would be typed as object.
foreach (Match match in regex.Matches(text)) {
Console.WriteLine(match.Groups[0].Value);
}
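If the field number and the value are needed separately, named capture groups avoid a second split. The following is only a sketch built on the pattern above; the `FieldParser` class and the group names `id` and `value` are made up for illustration, and it still assumes values never contain an opening brace.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldParser
{
    // {digits} followed by one or more characters that are not '{'.
    // The named groups capture the field number and its raw value.
    private static readonly Regex FieldPattern =
        new Regex(@"\{(?<id>\d+)\}(?<value>[^{]+)", RegexOptions.Compiled);

    // Returns (field id, raw value) pairs in the order they appear in the text.
    public static List<KeyValuePair<string, string>> Parse(string text)
    {
        var fields = new List<KeyValuePair<string, string>>();
        foreach (Match match in FieldPattern.Matches(text))
        {
            fields.Add(new KeyValuePair<string, string>(
                match.Groups["id"].Value,
                match.Groups["value"].Value));
        }
        return fields;
    }
}
```

Calling `FieldParser.Parse("{123}abc{456}xyz")` yields two pairs, 123/abc and 456/xyz. Because the value group uses `+`, a field with an empty value is skipped rather than returned.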
This doesn't fully answer the question, but it was getting too long to be a comment, so I'm leaving it here in Community Wiki mode. It does, at least, present a better strategy that may lead to a solution:
The main thing to understand here is it's rare — like, REALLY rare — to genuinely encounter a whole new kind of a file format for which an existing parser doesn't already exist. Even custom applications with custom file types will still typically build the basic structure of their file around a generic format like JSON or XML, or sometimes an industry-specific format like HL7 or MARC.
The strategy you should follow, then, is to first determine exactly what you're dealing with. Look at the software that generates the file; is there an existing SDK, reference, or package for the format? Or look at the industry surrounding this data; is there a special set of formats related to that industry?
Once you know this, you will almost always find an existing parser ready and waiting, and it's usually as easy as adding a NuGet package. These parsers are genuinely faster, need less code, and will be less susceptible to bugs (because most will have already been found by someone else). It's just an all-around better way to address the issue.
Now what I see in the question isn't something I recognize, so it's just possible you genuinely do have a custom format for which you'll need to write a parser from scratch... but even so, it doesn't seem like we're to that point yet.
Here is how to do it with LINQ, without a slow regex:
string x = "{2000}000000012199{3100}123456789*{3320}110009558*{3400}987654321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**";
var result =
    x.Split('{', StringSplitOptions.RemoveEmptyEntries)
     .Aggregate(new List<Tuple<string, string>>(),
         (l, z) =>
         {
             // Split "2000}000000012199" into the field number and its value.
             var az = z.Split('}');
             l.Add(new Tuple<string, string>(az[0], az[1]));
             return l;
         });
I am getting an incorrect syntax error:
Syntax error: Unclosed quotation mark after the character string ' AND ID=''.'
Parameters, basically. They solve problems with quotes and other special symbols, SQL injection problems, and a range of i18n/l10n problems. They're also more efficient due to query plan reuse.
Now, ADO.NET doesn't make it trivial to add parameters, so that's where tools like "Dapper" come in. You also probably want to use a "reader" rather than ExecuteScalar, which can only read one column and one row, but that's a separate issue.
If we did this with "Dapper":
int x = (int)connselect.ExecuteScalar(@"
SELECT * FROM PrintedCards
WHERE Card_Id=@cardId AND Name=@name
-- etc, only 2 shown here
", new { cardId = r["Card_Id"], name = r["Name"] } // <== the parameters
);
If you actually intended to read objects, this would be .Query<T> (for some T) rather than .ExecuteScalar
I am generating SQL code for different types of databases. To do that dynamically, certain parameters of the SQL script are stored in variables.
One such stored parameter is the comparison expression for certain queries.
Let's say I have a Dogs table with Name, DateOfBirth and Gender columns; then I have comparison expressions in variables such as:
string myExpression = "Gender=1";
string myExpression2 = "Gender=1 AND Name='Bucky'";
I would build the following SQL string then:
string mySqlString = "SELECT * FROM \"dbo\".\"Dogs\" WHERE " + myExpression;
The problem is, that for Oracle syntax, I have to quote the column names (as seen at dbo.Dogs above). So I need to create a string from the stored expression which looks like:
string quotedExpression = "\"Gender\"=1";
Is there a fast way to do this? I was thinking of splitting the string at the comparison symbol, but then I would lose the symbol itself, and it wouldn't work on complex conditions either. I could iterate through the whole string, but that would involve a lot of conditions to check (the comparison symbol can be more than one character (<>) or a keyword (ANY, ALL, etc.)), and I'd rather avoid lots of loops.
IMO the problem here is the attempt to use myExpression / myExpression2 as naked SQL strings. In addition to being a massive SQL-injection hole, it causes problems like you're seeing now. When I need to do this, I treat the filter expression as a DSL, which I then parse into an AST (using something like a modified shunting yard algorithm - although there are other ways to do it). So I end up with
AND
  =
    Gender
    1
  =
    Name
    'Bucky'
Now I can walk that tree (visitor pattern), looking at each node. 1 looks like an integer (int.TryParse etc), so we can add a parameter with that value. 'Bucky' looks like a string literal (via the quotes), so we can add a string-based parameter with the value Bucky (no quotes in the actual value). The other two are non-quoted strings, so they are column names. We check them against our model (white-list), and apply any necessary SQL syntax such as escaping - and perhaps aliasing (it might be Name in the DSL, but XX_Name2_ChangeMe in the database). If the column isn't found in the model: reject it. If you can't understand an expression completely: reject it.
Yes, this is more complex, but it will keep you safe and sane.
There may be libraries that can already do the expression parsing (to AST) for you.
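As a rough illustration of the tree walk described above, here is a minimal sketch that starts from an already-parsed AST; the parsing step itself is not shown. Every type name here (`Node`, `Column`, `Literal`, `Binary`, `SqlBuilder`) is hypothetical, and the `:p0`-style placeholders and double-quoted identifiers are an assumption aimed at Oracle-like syntax.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical AST node types; a real expression parser would produce these.
public abstract class Node { }

public sealed class Column : Node
{
    public readonly string Name;
    public Column(string name) { Name = name; }
}

public sealed class Literal : Node
{
    public readonly object Value;
    public Literal(object value) { Value = value; }
}

public sealed class Binary : Node
{
    public readonly string Op;
    public readonly Node Left;
    public readonly Node Right;
    public Binary(string op, Node left, Node right) { Op = op; Left = left; Right = right; }
}

public sealed class SqlBuilder
{
    // Operators this sketch understands; anything else is rejected.
    private static readonly HashSet<string> AllowedOps =
        new HashSet<string> { "=", "<>", "<", ">", "<=", ">=", "AND", "OR" };

    private readonly HashSet<string> _columns;

    // Literal values collected in placeholder order (:p0, :p1, ...).
    public readonly List<object> Parameters = new List<object>();

    public SqlBuilder(IEnumerable<string> allowedColumns)
    {
        _columns = new HashSet<string>(allowedColumns);
    }

    public string Render(Node node)
    {
        switch (node)
        {
            case Column c:
                // White-list check: unknown columns are rejected, known ones are quoted.
                if (!_columns.Contains(c.Name))
                    throw new ArgumentException("Unknown column: " + c.Name);
                return "\"" + c.Name + "\"";
            case Literal l:
                // Literals never reach the SQL text; they become parameters.
                Parameters.Add(l.Value);
                return ":p" + (Parameters.Count - 1);
            case Binary b:
                if (!AllowedOps.Contains(b.Op))
                    throw new ArgumentException("Unsupported operator: " + b.Op);
                return "(" + Render(b.Left) + " " + b.Op + " " + Render(b.Right) + ")";
            default:
                throw new ArgumentException("Unrecognised node");
        }
    }
}
```

For the tree shown earlier, a builder created with the columns Name, DateOfBirth and Gender renders `(("Gender" = :p0) AND ("Name" = :p1))` and collects the values 1 and Bucky as parameters. A column outside the white-list throws instead of being passed through.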
I've got a bunch of enums that have been generated from an XSD. They have formats like the following (enums with names, but not numeric values):
public enum MyEnum
{
/// <remarks/>
[System.Xml.Serialization.XmlEnumAttribute("001")]
Item001,
/// <remarks/>
[System.Xml.Serialization.XmlEnumAttribute("002")]
Item002,
.... // etc.
/// <remarks/>
[System.Xml.Serialization.XmlEnumAttribute("199")]
Item199,
}
What I would like is a simple way to refactor these as follows:
public enum MyEnum
{
/// <remarks/>
[System.Xml.Serialization.XmlEnumAttribute("001")]
Item001 = 1,
/// <remarks/>
[System.Xml.Serialization.XmlEnumAttribute("002")]
Item002 = 2,
.... // etc.
/// <remarks/>
[System.Xml.Serialization.XmlEnumAttribute("199")]
Item199 = 199,
}
I need this in order to parse integer values (from a config-file or a DB) into enum values. Note that the needed int values are found both in the XmlEnumAttribute, and in the enum value name itself - just not as a numeric value.
Any ideas for performing this refactoring quickly would be greatly appreciated.
Example and extra background info:
I want to do:
var myEnumValue = (MyEnum) integerFromDb;
I realize that I could probably solve this by creating a method that appends the int value of each piece of data to the string Item, and parses it to an enum using the resulting name, but that has a couple of weaknesses:
Feels like a dirty hack
Might not work properly for names like MyEnum.Item02 and MyOtherEnum.Item002
It would not allow me to refer to enum values using the integer values that are defined outside my system (i.e. this would not be in proper compliance with the rules in the XSD that my enums are based on).
You can use Visual Studio's search-and-replace feature to do this without too much trouble (note…I am assuming VS2013 here, which uses the standard .NET regex syntax; earlier versions of VS can do this also, but they use a custom regex syntax, which you can look up yourself if needed):
Open your source file. Make sure every enum value is declared identically; in particular, put a comma after even the last one.
Press Ctrl+H to show the search-and-replace UI
Enter Item(\d+), as your text to find, and Item$1 = $1, as the replacement text.
Make sure the scope is set to "Current Document".
Press Alt+A. This will replace all matches in the file.
This actually is enough to get the code to compile as you want. But you may prefer to remove leading 0 digits. You can again use the search-and-replace to do that:
Enter "= 0" as the text to find, and "= " (with the trailing space) as the replacement text.
Press Alt+A twice (because you have at most two leading zeroes)
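The same two replacements can also be scripted, since the dialog uses standard .NET regex syntax. This is a sketch under assumptions, not part of the steps above: the `EnumRewriter` name is invented, and the second pattern uses a lookahead so that all leading zeros go in one pass while a lone 0 is kept.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EnumRewriter
{
    // Takes enum source text in which every member is followed by a comma
    // and returns it with explicit numeric values added.
    public static string AddNumericValues(string source)
    {
        // Step 1: "Item001," becomes "Item001 = 001,".
        // The attribute strings are untouched because they lack the "Item" prefix.
        string withValues = Regex.Replace(source, @"Item(\d+),", "Item$1 = $1,");

        // Step 2: drop zeros that are followed by another digit ("= 001" becomes "= 1").
        return Regex.Replace(withValues, @"= 0+(?=\d)", "= ");
    }
}
```

Given the text `Item001,`, `Item010,`, `Item199,` on separate lines, this produces `Item001 = 1,`, `Item010 = 10,`, `Item199 = 199,`. It will also rewrite members that already look like `Item(\d+),` elsewhere in the same text, so it is best run on one enum at a time.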
Finally: as far as your idea of handling it in run-time only, converting via the value name would indeed be potentially problematic, given the dependency on the exact formatting of the name. But note that you have a true parseable integer here, in the [XmlEnum] attribute.
So if you wanted to create the necessary dictionaries for converting (you wouldn't want to keep inspecting the attribute itself, as reflection is slow), you could enumerate the enum type via reflection, get the attributes for each value, parse the string found in the XmlEnumAttribute.Name property, and use that to create dictionary entries, i.e. in a Dictionary<int, MyEnum> and Dictionary<MyEnum, int> to facilitate conversion in either direction.
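A minimal sketch of that dictionary approach might look like the following. The `XmlEnumMap<TEnum>` helper and `SampleEnum` are hypothetical names used only for illustration; the reflection runs once per enum type, in the static constructor.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Xml.Serialization;

// Hypothetical helper: builds both lookup tables once per enum type.
public static class XmlEnumMap<TEnum> where TEnum : struct
{
    public static readonly Dictionary<int, TEnum> FromInt = new Dictionary<int, TEnum>();
    public static readonly Dictionary<TEnum, int> ToInt = new Dictionary<TEnum, int>();

    static XmlEnumMap()
    {
        // Enum members are exposed as public static fields of the enum type.
        foreach (FieldInfo field in typeof(TEnum).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            var attribute = field.GetCustomAttribute<XmlEnumAttribute>();

            // Skip members without the attribute or with a non-numeric name.
            if (attribute == null || !int.TryParse(attribute.Name, out int number))
                continue;

            var value = (TEnum)field.GetValue(null);
            FromInt[number] = value;
            ToInt[value] = number;
        }
    }
}

// Hypothetical stand-in for one of the generated enums.
public enum SampleEnum
{
    [XmlEnum("001")] Item001,
    [XmlEnum("002")] Item002,
    [XmlEnum("199")] Item199,
}
```

With this in place, `XmlEnumMap<SampleEnum>.FromInt[199]` returns `SampleEnum.Item199` even though the underlying value of that member is 2, and `XmlEnumMap<SampleEnum>.ToInt[SampleEnum.Item002]` returns 2.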
I've already accepted the answer by Peter Duniho, but would like to expand a little more on it here, since I used a couple of variations:
In my generated file, most of the enums were in the same file, and for some of them, the values had already been replaced. Running the "Replace All" procedure would therefore cause other problems.
Replace next
Therefore, instead of Alt + A (replace all) to replace all matches, I used Alt + R (replace next) repeatedly to loop through and replace only those matches necessary. This let me quickly loop through all my code, without messing up what was already fixed.
F3 (Find Next) can be used as in a normal search, to skip matches that should not be altered.
Leading zeros
I didn't want to remove the 0 in e.g. Item010 = 0, which would be invalid, so I used the following search term instead: = 0\d, which finds only leading zeros (i.e. zeros followed by another digit).
var cv = (new CustomValidator()
{
ErrorMessage="Formato de telefone inválido",
ControlToValidate = "txtTelefoneIni", //<-- see the extra comma
});
Because it is legal syntax.
Really, it is.
When you're constructing an object using the object initializer syntax, you're allowed to leave a trailing comma after the last item, even if you end the initializer block there and then.
The reason behind it is probably that it makes it easier to come back and edit it later, as forgetting to add a comma after the previously-last-item is the #1 compile-time-problem with object-initializers.
It's intentional. Also works on enum members. (so it's easier to add more stuff)
Why should it not compile?
This type of syntax is legal in Python as well.
It compiles because it is valid C# syntax. You can also do the same thing with arrays
string[] meh = { "a", "b", "c", };
It is part of the C# language specification: Object initializers.
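To put the cases mentioned in these answers side by side, here is a small sketch; the `Color`, `Point` and `TrailingCommaDemo` names are made up for the example. Each initializer ends with a trailing comma and compiles the same as it would without one.

```csharp
using System;
using System.Collections.Generic;

public enum Color
{
    Red,
    Green,
    Blue,   // trailing comma after the last enum member
}

public class Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

public static class TrailingCommaDemo
{
    // Array initializer with a trailing comma: still three elements.
    public static string[] Letters() => new[] { "a", "b", "c", };

    // Collection initializer with a trailing comma.
    public static List<int> Numbers() => new List<int> { 1, 2, 3, };

    // Object initializer with a trailing comma.
    public static Point Origin() => new Point { X = 0, Y = 0, };
}
```

The trailing comma adds no extra element: `Letters()` has length 3 and `Numbers()` has count 3. Note that this applies to initializers and enum bodies only; a trailing comma in a method's argument list is a compile error.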
Because it's allowed.
Generally, allowing a trailing comma in such forms (and actually using that "privilege") is useful so that programmers can add more items to the list without the risk of forgetting to add a comma at the end of the previous item.
It's valid syntax in C#.
Let's see:
var c = new List<int>() { 1, 2, 3 ,};
FWIW, it was already valid in C.
My favorite reason to use such a syntax (especially when defining enums) is that when you add an item to the list, the source control sees changes in one line only (the new one). It doesn't tag the previous one as modified (because of the added comma). Much more readable diff.
My 2-cents.