Get raw decorated name of IDiaSymbol - C#

I'm trying to improve SymbolSort library, which reads PDB files with DIA SDK. I need to match the symbols read from object files with the symbols read from PDB.
The question is: given an IDiaSymbol variable, how can I obtain its real name? I'm not interested in the undecorated or human-readable name; I need the mangled name, exactly as it appears in the object file and exactly as the linker sees it.
The undecorated name can be easily obtained via IDiaSymbol::get_undecoratedName (ref). For decorated name, I use the following code:
string rawName;
IDiaSymbolUndecoratedNameExFlags flags = Flags.UNDNAME_32_BIT_DECODE | Flags.UNDNAME_TYPE_ONLY;
diaSymbol.get_undecoratedNameEx((uint)flags, out rawName);
I found empirically that this hack works well in most cases, for no apparent reason. But sometimes it returns garbage, e.g.:
diaSymbol.undecoratedName:
"private: bool __cdecl idPhysics_Player::SlideMove(bool,bool,bool,bool) __ptr64"
rawSymbol:
" ?? :: ?? ::Z::_N_N000 & __ptr64 volatile "

I was reading only private symbols from the PDB. Private symbols generally do not provide the raw symbol name (in some cases they don't even have one).
The problem is solved by reading public symbols instead (using SymTagEnum.SymTagPublicSymbol). For them, diaSymbol.name always gives the raw name of the symbol.
All this is well documented in the article on public and private symbols.

Related

Regular Expression For File Name in C#

I am trying to find a regular expression to parse two sections out of the file name for the .resx files in my project. There is one main file called "UiText.resx" and then many translation .resx files with convention "UiText.ja-JP.resx". I need both the "UiText" and the "ja-JP" out of the latter string, as we do have other resx files that don't have to be for UiText (e.g. I have some files named "ExceptionText.resx").
The pattern I'm using right now (which works, it just requires a little extra coding after) is "(?<=\.)((.*?)(?=\.resx))". For the example above, "UiText.ja-JP.resx" gets me a match set in C# of "UiText.", "ja-JP.", "ja-JP.", ".resx"
Of course I am able to just take the first occurrence of "ja-JP." and "UiText." from this set and massage it to what I want, but I'd rather just have a cleaner "UiText" "ja-JP" and be done with it.
I figure I'll probably have to have at least two different patterns for this, so that is OK. Thank you in advance!
Since UiText seems to be constant, you can use this regex to extract just ja-JP into $1:
^UiText\.(.+?)\.resx$
https://regex101.com/r/XKvwHA/1/
If I'm understanding your needs correctly, then the main reason you need "UiText" is not because you have any value for the term itself, but rather because you need to filter your files. The real term you need to play around with is "ja-JP", which changes for the files you need.
If I'm correct, try this regex:
(?<=UiText\.).+(?=\.resx)
Used in C# as follows:
var fileName = "UiText.ja-JP.resx";
var result = new Regex(@"(?<=^UiText\.).+(?=\.resx$)").Match(fileName).Value;
A little explanation:
(?<=^UiText\.) Start of string must begin exactly with "UiText."
.+ Any number of characters (but at least one)
(?=\.resx$) End of string must end with ".resx"
Any file that doesn't meet your criteria will return an empty string for 'result'.
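If you do want both parts ("UiText" and "ja-JP") from a single pattern, named groups are one option. This is only a sketch: the ResxFileName helper and the group names are made up for illustration, and it assumes neither the base name nor the culture part contains a dot.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
static class ResxFileName
{
    // Base name, then an optional ".culture" part, then ".resx".
    // Assumes neither the base name nor the culture contains a dot.
    private static readonly Regex Pattern =
        new Regex(@"^(?<name>[^.]+)(?:\.(?<culture>[^.]+))?\.resx$");

    public static bool TryParse(string fileName, out string baseName, out string culture)
    {
        Match m = Pattern.Match(fileName);
        baseName = m.Groups["name"].Value;   // empty string when there is no match
        culture = m.Groups["culture"].Value; // empty string for the main file
        return m.Success;
    }
}

class Program
{
    static void Main()
    {
        foreach (string file in new[] { "UiText.resx", "UiText.ja-JP.resx", "ExceptionText.resx" })
        {
            string baseName, culture;
            if (ResxFileName.TryParse(file, out baseName, out culture))
                Console.WriteLine("{0} -> name={1}, culture={2}", file, baseName, culture);
        }
    }
}
```

For the main file ("UiText.resx") the culture group does not participate in the match, so culture comes back as an empty string.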

How to Handle Accented Characters in a Directory Name

I have a problem with using Directory.Exists() on a string that contains an accented character.
This is the directory path: D:\ést_test\scenery. It is coming in as a simple string in a file that I am parsing:
[Area.121]
Title=ést_test
local=D:\AITests\ést_test
Layer=121
Active=FALSE
Required=FALSE
My code is taking the local value and adding \scenery to it. I need to test that this exists (which it does) and am simply using:
if (!Directory.Exists(area.Path))
{
// some handling code
area.AreaIsValid = false;
}
This returns false. It seems that the string handling that I am doing is replacing the accented character. The text visualizer in VS2012 is showing this (directoryManager is just a wrap around System.IO.Directory):
And the warning message as displayed is showing this:
So it seems that the accented character is not being recognized. Searching for this issue does turn up results, but they are mostly about removing or replacing the accented character. I am currently using 'normal' string handling. I tried using FileInfo but the path seems to get mangled anyway.
So my first question is how do I get the path stored into a string so that it will pass the Directory.Exists test?
This raises a wider question of non-Latin characters in path names. I have users all over the world, so I can expect to see Arabic, Russian, Chinese and so on in paths. How can I handle all of these?
The problem is almost certainly that you're loading the file with the wrong encoding. The fact that it's a filename is irrelevant - the screenshots show that you've lost the relevant data before you call Directory.Exists.
You should make sure you know the file encoding (e.g. UTF-8, Cp1252 etc) and then pass that in as an argument into however you're loading the file (e.g. File.ReadAllText). If this isn't enough information to get you going, you'll need to tell us more about the file (to work out what encoding it's in) and more about your code (how you're reading it).
Once you've managed to load the correct data, I'd hope that the file aspect just handles itself automatically.
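To illustrate the encoding point, here is a small sketch. The temp file name and its contents are invented for the demo, and it assumes the file on disk is UTF-8; your real file may use a different encoding, in which case that is the one to pass in.

```csharp
using System;
using System.IO;
using System.Text;

class EncodingDemo
{
    static void Main()
    {
        // Hypothetical file, written as UTF-8 without a byte order mark.
        string path = Path.Combine(Path.GetTempPath(), "scenery_demo.cfg");
        File.WriteAllText(path, "local=D:\\AITests\\ést_test", new UTF8Encoding(false));

        // Decoding with the wrong encoding corrupts the accented character...
        string wrong = File.ReadAllText(path, Encoding.GetEncoding("iso-8859-1"));
        // ...while decoding with the encoding the file was written in keeps it.
        string right = File.ReadAllText(path, Encoding.UTF8);

        Console.WriteLine(wrong.Contains("ést_test")); // False
        Console.WriteLine(right.Contains("ést_test")); // True

        File.Delete(path);
    }
}
```

Once the string is decoded correctly, the same value can be handed to Directory.Exists unchanged.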

Odd C# path issue

My C# application writes its full path surrounded by double quotes to a file, with:
streamWriter.WriteLine("\"" + Application.ExecutablePath + "\"");
Normally it works, the written file contains
"D:\Dev\Projects\MyApp\bin\Debug\MyApp.exe"
But, if the executable path of my application contains a #, something weird happens. The output becomes:
"D:\Dev\Projects#/MyApp/bin/Debug/MyApp.exe"
The slashes after the # become forward slashes. This causes issues with the system I am developing.
Why is this happening, and is there a way to prevent it that is more elegant than string.replacing the path before writing?
I just looked into the source code of Application.ExecutablePath, and the implementation is essentially this*:
Assembly asm = Assembly.GetEntryAssembly();
string cb = asm.CodeBase;
var codeBase = new Uri(cb);
if (codeBase.IsFile)
return codeBase.LocalPath + Uri.UnescapeDataString(codeBase.Fragment);
else
return codeBase.ToString();
The property Assembly.CodeBase will return the location as a URI. Something like:
file:///C:/myfolder/myfile.exe
The # is the fragment marker in a URI; it marks the beginning of the fragment. Apparently, the Uri class alters the given uri when it's parsed and converted back to a string again.
Since Assembly.Location contains a 'normal' file path, I guess your best alternative is:
string executablePath = Assembly.GetEntryAssembly().Location;
*) The implementation is more complex than this, because it also deals with situations where there are multiple appdomains and other special situations. I simplified the code for the most common situation.
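The fragment behaviour can be checked with the Uri class alone. A minimal sketch, where the file URI is just the path from the question rewritten by hand as a URI (an assumption about what CodeBase would look like in that case):

```csharp
using System;

class UriFragmentDemo
{
    static void Main()
    {
        // Hand-written URI modelled on the path from the question.
        var codeBase = new Uri("file:///D:/Dev/Projects#/MyApp/bin/Debug/MyApp.exe");

        Console.WriteLine(codeBase.IsFile);    // the file scheme is recognised
        Console.WriteLine(codeBase.LocalPath); // only the part before the '#'
        Console.WriteLine(codeBase.Fragment);  // '#' and everything after it, slashes untouched
    }
}
```

Because the part after the # is treated as a fragment rather than as a path, it is never converted back to backslashes, which would explain the mixed output in the question.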
Odd error/bug. Other than using a replace function or extension method to always return the correct format you could try using
System.Reflection.Assembly.GetExecutingAssembly().Location
instead of ExecutablePath.

How to reference identifiers with dollar signs from C#?

I'm trying to use a DLL generated by ikvmc from a jar file compiled from Scala code (yeah my day is THAT great). The Scala compiler seems to generate identifiers containing dollar signs for operator overloads, and IKVM uses those in the generated DLL (I can see it in Reflector). The problem is, dollar signs are illegal in C# code, and so I can't reference those methods.
Any way to work around this problem?
You should be able to access the funky methods using reflection. Not a nice solution, but at least it should work. Depending on the structure of the API in the DLL it may be feasible to create a wrapper around the methods to localise the reflection code. Then from the rest of your code just call the nice wrapper.
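A rough sketch of such a wrapper follows. ReflectionWrapper and Calculator are invented for the example; since C# cannot declare a method with a dollar sign, the demo calls an ordinary method, but the name is passed as a plain string, so a name such as "$plus" (the form Scala normally uses for +) should work the same way against the IKVM-generated type.

```csharp
using System;
using System.Reflection;

// Hypothetical wrapper that localises the reflection code in one place.
static class ReflectionWrapper
{
    public static object InvokeByName(object target, string methodName, params object[] args)
    {
        // Note: GetMethod throws AmbiguousMatchException if the name is overloaded.
        MethodInfo method = target.GetType().GetMethod(
            methodName, BindingFlags.Public | BindingFlags.Instance);
        if (method == null)
            throw new MissingMethodException(target.GetType().FullName, methodName);
        return method.Invoke(target, args);
    }
}

// Stand-in for a type from the generated DLL.
class Calculator
{
    public int Add(int a, int b) { return a + b; }
}

class Program
{
    static void Main()
    {
        // With the real DLL the name would be something like "$plus".
        object result = ReflectionWrapper.InvokeByName(new Calculator(), "Add", 2, 3);
        Console.WriteLine(result); // 5
    }
}
```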
The alternative would be to hack on the IL in the target DLL and change the identifiers. Or do some post-build IL-hacking on your own code.
Perhaps you can teach IKVM to rename these identifiers such that they have no dollar sign? I'm not super familiar, but a quick search pointed me at these:
http://weblog.ikvm.net/default.aspx?date=2005-05-02
What is the format of the Remap XML file for IKVM?
String and complex data types in Map.xml for IKVM!
Good Hunting
Write synonyms for those methods:
def +(a:A,b:A) = a + b
val plus = + _
I fear that you will have to use Reflection in order to access those members. Escaping simply doesn't work in your case.
But for those of you who are interested in the escaping mechanics, I've written an explanation.
In C# you can use the @-sign in order to escape keywords and use them as identifiers. However, this does not help to escape invalid characters:
bool @bool = false;
There is a way to write identifiers differently by using a Unicode escape sequence:
int i\u0064; // '\u0064' == 'd'
id = 5;
Yes this works. However, even with this trick you can still not use the $-sign in an identifier. Trying...
int i\u0024; // '\u0024' == '$'
... gives the compiler error "Unexpected character '\u0024'". The identifier must still be a valid identifier! The C# compiler probably resolves the escape sequence in a kind of pre-processing and treats the resulting identifier as if it had been entered normally.
So what is this escaping good for? Maybe it can help you, if someone uses a foreign language character that is not on your keyboard.
int \u00E4; // German a-Umlaut
ä = 5;

C# - retrieve file path from config file - @ doesn't do its magic

I'm currently working on a web service that retrieves an XML message, archives it and then processes it further. The archive folder is read from the Web.config. This is what the archive method looks like
private void Archive(System.Xml.XmlDocument xmlDocument)
{
try
{
string directory = System.Configuration.ConfigurationManager.AppSettings.Get("ArchivePath");
ParseMessage(xmlDocument);
directory = string.Format(@"{0}\{1}\{2}", directory, _senderService, DateTime.Now.ToString("MMMyyyy"));
System.IO.Directory.CreateDirectory(directory);
string Id = _messageID;
string senderService = _senderService;
xmlDocument.Save(directory + @"\" + DateTime.Now.ToString("yyyyMMdd_") + Id + "_" + System.Guid.NewGuid().ToString().Substring(0, 13) + ".xml");
}
The path structure I retrieve is C:\Program Files\Subfolder\Subfolder. In the development, QA, UAT and PRD environments everything works fine. But on another machine I now need to install the web service on (which I cannot debug, unfortunately), the directory string is 'C:Files'.
Just to be sure, I double-checked the .NET version on the different machines (I thought perhaps the usage of @ before a string was version-dependent); all machines use 2.0.50727.
Does anyone recognize this problem?
Thanks in advance!
EDIT: I see the @ before the directory variable has caused some confusion regarding the question I asked. It was not about that @ (in fact, that should not have been there. I have removed it).
My question (rephrased) is:
when you place an @ before a quoted string, like @"c:\folder\subfolder", it ensures that the backslashes are not interpreted as escape characters, right? What could be the cause of it working on one machine, but not working on another?
(I do agree with the answers stating to use Path.Combine by the way. I'm just curious what causes this inconsistent behaviour)
You could try using Path.Combine() instead of String.Format(). A good example is here.
When a value is pulled from the configuration file, it is automatically escaped properly. The '@' symbol on your directory variable name does not make the string 'explicit' (verbatim); it only tells the compiler to treat the name as a plain identifier. For example:
public void Example(string[] args)
{
int length = args.Length;
length = @args.Length; // Same thing!
}
The '@' prefix on a variable name means to not treat that symbol as a reserved word. It allows you to make variable names with the same name as a keyword:
public static void Foo(object @class)
{
//@class exists here, even though class is a reserved keyword!
}
In addition, if the value it is getting is 'C:Files', then that is invalid, as it is missing a '\'. 'C:\Files' would be valid.
Use Path.Combine(), for instance:
strFilename = Path.Combine(Path.Combine(directory, _senderService), DateTime.Now.ToString("MMMyyyy"));
From your question I think that you've treated:
@directory
As if it performed the same function as:
@"c:\myfolder\"
The difference is that the first example allows you to use a reserved word as a variable name, like @class (don't get into the habit of using it) and the second example allows the string to contain unescaped characters such as \.
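A small sketch of both points. The folder and service names are placeholders, and the nested Path.Combine calls are used only because the question targets .NET 2.0. The verbatim prefix is resolved at compile time, so by itself it should not behave differently from one machine to the next.

```csharp
using System;
using System.IO;

class VerbatimDemo
{
    static void Main()
    {
        // Both literals compile to exactly the same string.
        string regular = "c:\\folder\\subfolder";
        string verbatim = @"c:\folder\subfolder";
        Console.WriteLine(regular == verbatim); // True

        // Placeholder for the value read from Web.config; it needs no escaping at all.
        string directory = verbatim;

        // Path.Combine inserts the separators, so no format string is needed.
        string archive = Path.Combine(
            Path.Combine(directory, "SenderService"), // hypothetical service name
            DateTime.Now.ToString("MMMyyyy"));
        Console.WriteLine(archive);
    }
}
```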
