Cannot get line number of Instruction using Mono.Cecil 0.10.2 - c#

I am trying to get the line number associated with the Instruction object in the method below. SequencePoint.StartLine is supposed to give me the information I need, but in the commented out section of the method seqPoint is always null.
/// <summary>
/// Finds all places in code where one or more methods are called.
/// </summary>
/// <param name="classOfMethods">Full name of the class that contains the methods to find.</param>
/// <param name="methodNames">Names of the methods to find in code.</param>
public MethodCall[] FindAllMethodCalls(Type classOfMethods, params string[] methodNames)
{
#region Use Mono.Cecil to find all instances where methods are called
var methodCalls = new List<MethodCall>();
var ad = AssemblyDefinition.ReadAssembly(binaryFileToSearch, new ReaderParameters { ReadSymbols = true });
foreach (var module in ad.Modules)
{
foreach (var type in module.GetTypes())
{
foreach (var method in type.Methods.Where(x => x.HasBody))
{
var instrs = method.Body.Instructions.Where(x => x.OpCode == OpCodes.Call).ToArray();
foreach (var instr in instrs)
{
var mRef = instr.Operand as MethodReference;
if (mRef != null && mRef.DeclaringType.FullName == classOfMethods.FullName && methodNames.Contains(mRef.Name))
{
// this does not work -- always returns null
//var seqPoint = method.DebugInformation.GetSequencePoint(instr);
//if (seqPoint != null)
//{
//}
methodCalls.Add(new MethodCall
{
CodeFile = method.DebugInformation.SequencePoints.First().Document.Url,
MethodRef = mRef,
});
}
}
}
}
}
...
return methodCalls.ToArray();
}
The binary files I am calling AssemblyDefinition.ReadAssembly() on have corresponding .pdb files, and I am using the ReadSymbols = true option. What am I doing wrong?

Your code is OK.
Cecil reads the PDB file when it should. During reading, it populates all the PDB functions (this is not the complete real code, just a demonstration, but the real code is similar):
private void PopulateFunctions()
{
PdbFunction[] array = PdbFile.LoadFunctions(.....);
foreach (PdbFunction pdbFunction in array)
{
this.functions.Add(pdbFunction.token, pdbFunction);
}
}
Then, when it reads a MethodBody, it populates the sequence points:
private void ReadSequencePoints(PdbFunction function, MethodDebugInformation info)
{
foreach (PdbLines lines in function.lines)
{
this.ReadLines(lines, info);
}
}
And:
private static void ReadLine(PdbLine line, Document document, MethodDebugInformation info)
{
SequencePoint sequencePoint = new SequencePoint(line.offset, document);
sequencePoint.StartLine = line.lineBegin;
sequencePoint.StartColumn = line.colBegin;
sequencePoint.EndLine = line.lineEnd;
sequencePoint.EndColumn = line.colEnd;
info.sequence_points.Add(sequencePoint);
}
If you can't find a specific sequence point, it means it does not exist in the sequence point collection of the method's debug information.
Still, I doubt that you are always getting null.
Try calling:
method.DebugInformation.GetSequencePointMapping()
and compare the result with your instrs collection.
GetSequencePointMapping() returns the sequence points Cecil knows about, and for each one you can see the instruction it maps to, so you can check which of the call instructions in your method actually have a sequence point.
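When the call instruction itself has no sequence point, a common workaround is to walk backwards to the nearest preceding instruction that does have one; that point gives the line of the statement containing the call. A minimal sketch, assuming the Mono.Cecil 0.10 API (the helper name is illustrative, not part of Cecil):

```csharp
using Mono.Cecil;
using Mono.Cecil.Cil;

static class SequencePointHelper
{
    // Walks backwards from the given instruction to the closest earlier
    // instruction that has a sequence point, and returns that point (or null
    // if the method has no sequence points at all).
    public static SequencePoint FindNearestSequencePoint(MethodDefinition method, Instruction instruction)
    {
        var debugInfo = method.DebugInformation;
        for (var current = instruction; current != null; current = current.Previous)
        {
            var sp = debugInfo.GetSequencePoint(current);
            if (sp != null)
                return sp;
        }
        return null;
    }
}
```

For reporting purposes, the line of the enclosing statement is usually exactly what you want for a call site anyway.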

Visual Studio debugger is out of sync with my code

I modified some code and set a breakpoint, but when the debugger hits that breakpoint, it goes nuts and runs the old code anyway!
Here is the original code:
/// <summary>
/// Creates a new <see cref="CommaSeparatedValue"/> for the specified values.
/// </summary>
/// <param name="values"></param>
public CommaSeparatedValue(params object[] values)
{
List<string> list = new List<string>();
foreach (var value in values)
{
if (value is IEnumerable)
{
foreach (var item in (IEnumerable)value)
{
list.Add(Scrub(item));
}
}
else
{
list.Add(Scrub(value));
}
}
_List = list;
}
And what I changed it to:
/// <summary>
/// Creates a new <see cref="CommaSeparatedValue"/> for the specified values.
/// </summary>
/// <param name="values"></param>
public CommaSeparatedValue(params object[] values)
{
List<string> list = new List<string>();
foreach (var value in values)
{
if (value is IEnumerable && !(value is string)) // !!! - I changed this line here
{
foreach (var item in (IEnumerable)value)
{
list.Add(Scrub(item));
}
}
else
{
list.Add(Scrub(value));
}
}
_List = list;
}
I set the breakpoint on the line that I modified (checking for a string value) and when the debugger hits that line it ignores the part that I added and continues running into the "if" block even when the value variable is a string.
If it matters, this code is being run from a MSTest unit test.
This can happen when for some reason your project is not being built before you run it, so that the code the debugger is running is no longer the same as the source you are looking at. Look in the Configuration Manager and make sure 'Build' is checked for the configuration you're using.

Compare file hashes every 10 seconds to evaluate if it has been modified or not

I am trying to check to see if a given file has changed by checking every ten seconds and comparing the previous file hash to the file current hash ( see FileHasChanged() method ). If the hashes are different, it would indicate that the file has been modified.
When I run my code, however, I get the following error:
System.InvalidOperationException: Collection was modified. Enumeration operation may not execute.
I'm at a loss for how to get this working. Any help would be greatly appreciated!
public class File : Subscribable
{
private Dictionary<byte[], byte[]> FileHashes;
private bool FileModified;
private DateTime FileModifiedDate;
private List<string> FileNames;
private List<Observable> Observers;
public File(List<string> fileNames)
{
FileHashes = new Dictionary<byte[], byte[]>();
FileNames = new List<string>();
Observers = new List<Observable>();
foreach (var file in fileNames)
FileNames.Add(file);
var initializeHashesTimer = new Timer(LoadFileHashes, null, 0, 10000);
var checkIfChangedTimer = new Timer(FileHasChanged, null, 10, 20000);
}
public void NotifyObservers()
{
foreach (var observer in Observers)
observer.Update(FileModified, FileModifiedDate, FileNames);
}
public void RegisterObserver(Observable o)
{
Observers.Add(o);
}
public void RemoveObserver(Observable o)
{
int i = Observers.IndexOf(o);
if (i >= 0)
Observers.RemoveAt(i);
}
private void LoadFileHashes(object state)
{
Console.WriteLine("Loading file hashes... ");
// For each file in a list of files, store the file hash.
foreach (var file in FileNames)
{
var hash = GetFileHash(file);
FileHashes.Add(hash, null);
}
}
private void FileHasChanged(object state)
{
Console.WriteLine("Checking to see if file has been modified... ");
var values = new List<byte[]>();
// In ten seconds, store the file hash again.
foreach (var file in FileNames)
{
var hash = GetFileHash(file);
values.Add(hash);
}
foreach (KeyValuePair<byte[], byte[]> entry in FileHashes)
{
if(entry.Key != null)
{
foreach (var value in values)
FileHashes[entry.Key] = value;
}
}
// If the file hash is different, the file has been modified.
foreach (KeyValuePair<byte[], byte[]> entry in FileHashes)
{
if (entry.Key != entry.Value)
{
FileModified = true;
NotifyObservers();
}
}
}
public static byte[] GetFileHash(string fileName)
{
HashAlgorithm sha1 = HashAlgorithm.Create();
using (FileStream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read))
return sha1.ComputeHash(stream);
}
}
To explain what's likely giving you the error:
When you write a foreach loop like
foreach (KeyValuePair<byte[], byte[]> entry in FileHashes)
{
// ...
}
this is really syntactic sugar for something like
IEnumerator<KeyValuePair<byte[], byte[]>> enumerator = FileHashes.GetEnumerator();
while(enumerator.MoveNext())
{
KeyValuePair<byte[], byte[]> entry = enumerator.Current;
// ...
}
enumerator.Dispose();
Here the enumerator object can loop through the underlying collection, provided that the underlying collection doesn't change. If it does, the enumerator is invalidated and the call to enumerator.MoveNext() throws an exception.
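A small, self-contained illustration (file names are made up): adding a key while a foreach over the dictionary is in progress throws, while taking a snapshot of the keys first lets you mutate freely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class EnumerationDemo
{
    static void Main()
    {
        var hashes = new Dictionary<string, string> { ["a.txt"] = "h1", ["b.txt"] = "h2" };

        try
        {
            foreach (var entry in hashes)
                hashes["c.txt"] = "h3"; // adding a new key invalidates the enumerator
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Collection was modified during enumeration.");
        }

        // Safe pattern: snapshot the keys, then mutate the dictionary.
        foreach (var key in hashes.Keys.ToList())
            hashes[key] = "updated";

        Console.WriteLine(hashes["a.txt"]); // prints "updated"
    }
}
```

This is the same failure mode as in the question: one timer adds entries to FileHashes while another timer's callback is enumerating it.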
That said, it's not really clear to me what the code is meant to do. For instance, the code
foreach (var value in values)
FileHashes[entry.Key] = value;
viewed alone, sets FileHashes[entry.Key] to whatever the last value in values is, since each iteration of the foreach loop will just overwrite the last one.
You might want to unpack what you're trying to do and make sure your code actually says what you want it to.
Some people have suggested using FileSystemWatcher, but this often won't work in practice, at least if dealing with every change is a hard requirement.
One obvious issue is that if your program stops and has to restart it obviously won't know about anything that happened when it was shut off. Another is that if your program modifies the files it's watching, it will get notifications of its own activity, meaning you have to go to some trouble to distinguish what's causing the notifications. A less obvious issue is that if you're watching a network drive and there's a blip in the network connection the files might change without you receiving a notification.
(I'm going from memory here, so some of the above may not be 100% correct, but hopefully the idea gets across.)
For these reasons I've always ended up having to do manual polling anyway.
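For what it's worth, one structural problem in the original code is that the dictionary is keyed by byte[], and arrays compare by reference in C#, so two identical hashes are never "equal" as keys. A sketch of a polling design that keys by file name instead (the class and method names are illustrative, and it assumes comparing hashes as Base64 strings is acceptable):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Keys the state by file name, so "previous hash" and "current hash"
// for the same file can actually be compared.
class FilePoller
{
    private readonly Dictionary<string, string> _lastHashes = new Dictionary<string, string>();

    public List<string> GetChangedFiles(IEnumerable<string> fileNames)
    {
        var changed = new List<string>();
        foreach (var file in fileNames)
        {
            string current = ComputeHash(file);
            string previous;
            if (_lastHashes.TryGetValue(file, out previous) && previous != current)
                changed.Add(file);
            _lastHashes[file] = current; // safe: we enumerate fileNames, not _lastHashes
        }
        return changed;
    }

    private static string ComputeHash(string fileName)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(fileName))
            return Convert.ToBase64String(sha.ComputeHash(stream));
    }
}
```

Each call compares the current hash against the one recorded on the previous call, so "changed" here means "changed since the last poll".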

Classification of instances in Weka

I'm trying to use Weka in my C# application. I've used IKVM to bring the Java parts into my .NET application. This seems to be working quite well. However, I am at a loss when it comes to Weka's API. How exactly do I classify instances if they are programmatically passed around in my application and not available as ARFF files?
Basically, I am trying to integrate a simple co-reference analysis using Weka's classifiers. I've built the classification model in Weka directly and saved it to disk, from where my .NET application opens it and uses the IKVM port of Weka to predict the class value.
Here is what I've got so far:
// This is the "entry" method for the classification method
public IEnumerable<AttributedTokenDecorator> Execute(IEnumerable<TokenPair> items)
{
TokenPair[] pairs = items.ToArray();
Classifier model = ReadModel(); // reads the Weka generated model
FastVector fv = CreateFastVector(pairs);
Instances instances = new Instances("licora", fv, pairs.Length);
CreateInstances(instances, pairs);
for(int i = 0; i < instances.numInstances(); i++)
{
Instance instance = instances.instance(i);
double classification = model.classifyInstance(instance); // array index out of bounds?
if(AsBoolean(classification))
MakeCoreferent(pairs[i]);
}
throw new NotImplementedException(); // TODO
}
// This is a helper method to create instances from the internal model files
private static void CreateInstances(Instances instances, IEnumerable<TokenPair> pairs)
{
instances.setClassIndex(instances.numAttributes() - 1);
foreach(var pair in pairs)
{
var instance = new Instance(instances.numAttributes());
instance.setDataset(instances);
for (int i = 0; i < instances.numAttributes(); i++)
{
var attribute = instances.attribute(i);
if (pair.Features.ContainsKey(attribute.name()) && pair.Features[attribute.name()] != null)
{
var value = pair.Features[attribute.name()];
if (attribute.isNumeric()) instance.setValue(attribute, Convert.ToDouble(value));
else instance.setValue(attribute, value.ToString());
}
else
{
instance.setMissing(attribute);
}
}
//instance.setClassMissing();
instances.add(instance);
}
}
// This creates the data set's attributes vector
private FastVector CreateFastVector(TokenPair[] pairs)
{
var fv = new FastVector();
foreach (var attribute in _features)
{
Attribute att;
if (attribute.Type.Equals(ArffType.Nominal))
{
var values = new FastVector();
ExtractValues(values, pairs, attribute.FeatureName);
att = new Attribute(attribute.FeatureName, values);
}
else
att = new Attribute(attribute.FeatureName);
fv.addElement(att);
}
{
var classValues = new FastVector(2);
classValues.addElement("0");
classValues.addElement("1");
var classAttribute = new Attribute("isCoref", classValues);
fv.addElement(classAttribute);
}
return fv;
}
// This extracts observed values for nominal attributes
private static void ExtractValues(FastVector values, IEnumerable<TokenPair> pairs, string featureName)
{
var strings = (from x in pairs
where x.Features.ContainsKey(featureName) && x.Features[featureName] != null
select x.Features[featureName].ToString())
.Distinct().ToArray();
foreach (var s in strings)
values.addElement(s);
}
private Classifier ReadModel()
{
return (Classifier) SerializationHelper.read(_model);
}
private static bool AsBoolean(double classifyInstance)
{
return classifyInstance >= 0.5;
}
For some reason, Weka throws an IndexOutOfRangeException when I call model.classifyInstance(instance). I have no idea why, nor can I come up with an idea how to rectify this issue.
I am hoping someone might know where I went wrong. The only documentation for Weka I found relies on ARFF files for prediction, and I don't really want to go there.
For some odd reason, this exception was raised by the DTNB classifier (I was using three in a majority vote classification model). Apparently, not using DTNB "fixed" the issue.

SmartThreadPool - Is it possible to pass delegate method with method parameters?

I have a long-running process called ImportProductInformation, called from a console app, that I'm trying to speed up. It appears to be an excellent candidate for thread pooling, so I did a little searching, came across SmartThreadPool on CodeProject, and am trying to implement it.
ImportProductInformation currently requires an "item", which is just a single Entity Framework row pulled from a list. SmartThreadPool uses a delegate called "WorkItemCallback", but if I build it like below it complains about "Method name expected" in the foreach loop on smartThreadPool.QueueWorkItem, as it appears I can't pass my params to the delegated method. What am I missing here? I'm sure it's something stupid; I'm a noob lacking experience with delegates. Any help would be appreciated:
public static void ImportProductInformation_Exec()
{
// List
List<productinformation> _list = GetProductInformation();
// Import
if (_list != null)
{
SmartThreadPool smartThreadPool = new SmartThreadPool();
foreach (var item in _list)
{
smartThreadPool.QueueWorkItem
(new WorkItemCallback
(ImportProductInformation(item)));
}
smartThreadPool.WaitForIdle();
smartThreadPool.Shutdown();
}
}
public void ImportProductInformation(productinformation item)
{
// Do work associated with "item" here
}
If I change the loop to this I get "Method is used like a Type" in the build error:
foreach (var item in _list)
{
ImportProductInformation ipi =
new ImportProductInformation(item);
smartThreadPool.QueueWorkItem(new WorkItemCallback(ipi));
}
Ended up getting it to work with this:
public class ProductInformationTaskInfo
{
public productinformation ProductInformation;
public ProductInformationTaskInfo(productinformation pi)
{
ProductInformation = pi;
}
}
public class PI
{
public static void QueueAll(SmartThreadPool smartThreadPool, List<productinformation> _list)
{
foreach (var item in _list)
{
ProductInformationTaskInfo pi =
new ProductInformationTaskInfo(item);
smartThreadPool.QueueWorkItem
(new WorkItemCallback
(ImportProductInformation), pi);
}
}
public static object ImportProductInformation(Object _pi)
{
ProductInformationTaskInfo pi = (ProductInformationTaskInfo)_pi;
var item = pi.ProductInformation;
// Do work here
return null;
}
}
I don't know or have SmartThreadPool; the following is approximate:
foreach (var item in _list)
{
var itemCopy = item;
smartThreadPool.QueueWorkItem
(dummy => ImportProductInformation(itemCopy));
}
You may have to do some fixing.
This works because the lambda captures a variable from the containing method. In C# versions before 5.0, the foreach loop variable was a single variable shared across all iterations, so without itemCopy every queued lambda would see the same (last) item; the per-iteration copy gives each work item its own value.
But note that the normal ThreadPool is not suited for long-running tasks, and the same may hold for SmartThreadPool. You should also keep a limit on the number of threads, and when ImportProductInformation does mainly I/O, threading might not help at all.
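As an aside, on .NET 4 and later a common way to get both the fan-out and a cap on parallelism, without SmartThreadPool, is Parallel.ForEach (the types here are illustrative stand-ins for the question's entity type):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

class ImportRunner
{
    public static void ImportAll(List<string> items)
    {
        Parallel.ForEach(
            items,
            new ParallelOptions { MaxDegreeOfParallelism = 4 }, // cap concurrent workers
            item => ImportProductInformation(item));
    }

    static void ImportProductInformation(string item)
    {
        // Do work associated with "item" here.
    }
}
```

Parallel.ForEach blocks until all items are processed, which replaces the WaitForIdle/Shutdown pair in the original code.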
You can use anonymous methods:
int a = 15;
String b = "hello world!";
ThreadPool.QueueUserWorkItem((state)=>SomeFunction(a,b));

Understanding the Open Closed Principle

I was refactoring some old code of a simple script file parser when I came across the following code:
StringReader reader = new StringReader(scriptTextToProcess);
StringBuilder scope = new StringBuilder();
string line = reader.ReadLine();
while (line != null)
{
switch (line[0])
{
case '$':
// Process the entire "line" as a variable,
// i.e. add it to a collection of KeyValuePair.
AddToVariables(line);
break;
case '!':
// Depending of what comes after the '!' character,
// process the entire "scope" and/or the command in "line".
if (line == "!execute")
ExecuteScope(scope);
else if (line.StartsWith("!custom_command"))
RunCustomCommand(line, scope);
else if (line == "!single_line_directive")
ProcessDirective(line);
scope = new StringBuilder();
break;
default:
// No processing directive, i.e. add the "line"
// to the current scope.
scope.Append(line);
break;
}
line = reader.ReadLine();
}
This simple script processor seems to me like a good candidate for refactoring by applying the "open closed principle". The lines beginning with a $ will probably never be handled differently. But, what if new directives beginning with a ! needs to be added? Or new processing identifiers (e.g. new switch-cases) are needed?
The problem is, I could not figure out how to easily and correctly add more directives and processors without breaking OCP. The !-case using scope and/or line makes it a bit tricky, as does the default-case.
Any suggestions?
Use a Dictionary<char, YourDelegate> to specify how a character should be handled, and call a default handler if the character key does not exist in the dictionary.
Add an Add(char key, YourDelegate handler) method allowing anyone to register a handler for a specific character.
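A minimal sketch of that delegate-based dispatch (class and method names are illustrative):

```csharp
using System;
using System.Collections.Generic;

// Dispatches each line to a handler chosen by its first character;
// unregistered characters fall through to the default handler.
class LineDispatcher
{
    private readonly Dictionary<char, Action<string>> _handlers = new Dictionary<char, Action<string>>();
    private readonly Action<string> _defaultHandler;

    public LineDispatcher(Action<string> defaultHandler)
    {
        _defaultHandler = defaultHandler;
    }

    public void Add(char key, Action<string> handler)
    {
        _handlers[key] = handler;
    }

    public void Process(string line)
    {
        Action<string> handler;
        if (string.IsNullOrEmpty(line) || !_handlers.TryGetValue(line[0], out handler))
            handler = _defaultHandler;
        handler(line);
    }
}
```

With this in place, the switch statement collapses to one Process(line) call per line, and new directives are supported by registering a handler rather than by editing the parser.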
Update
It's better to work with interfaces:
/// <summary>
/// Let anyone implement this interface.
/// </summary>
public interface IMyHandler
{
void Process(IProcessContext context, string line);
}
/// <summary>
/// Context information
/// </summary>
public interface IProcessContext
{
}
// Actual parser
public class Parser
{
private Dictionary<char, IMyHandler> _handlers = new Dictionary<char, IMyHandler>();
private IMyHandler _defaultHandler;
public void Add(char controlCharacter, IMyHandler handler)
{
_handlers.Add(controlCharacter, handler);
}
private void Parse(TextReader reader)
{
StringBuilder scope = new StringBuilder();
IProcessContext context = null; // create your context here.
string line = reader.ReadLine();
while (line != null)
{
IMyHandler handler = null;
if (!_handlers.TryGetValue(line[0], out handler))
handler = _defaultHandler;
handler.Process(context, line);
line = reader.ReadLine();
}
}
}
Note that I pass in a TextReader instead. It gives much more flexibility since the source can be anything from a simple string to a complex stream.
Update 2
I would also break up the ! handling in a similar way. i.e. Create a class that handles IMyHandler:
public interface ICommandHandler
{
void Handle(ICommandContext context, string commandName, string[] arguments);
}
public class CommandService : IMyHandler
{
public void Add(string commandName, ICommandHandler handler)
{
}
public void Handle(IProcessContext context, string line)
{
// first word on the line is the command, all other words are arguments.
// split the string properly
// then find the correct command handler and invoke it.
// take the result and add it to the `IProcessContext`
}
}
That gives more flexibility, both for handling the actual protocol and for adding more commands. You do not have to change any existing code to add more functionality, so the solution is OK regarding Open/Closed and some other SOLID principles.
