So I have this code that takes care of command acknowledgment from remote computers, sometimes (like once in 14 days or something) the following line throws a null reference exception:
computer.ProcessCommandAcknowledgment( commandType );
What really bugs me is that I check for a null reference before it, so I have no idea whats going on.
Here's the full method for what its worth:
public static void __CommandAck( PacketReader reader, SocketContext context )
{
string commandAck = reader.ReadString();
Type commandType = Type.GetType( commandAck );
Computer computer = context.Client as Computer;
if (computer == null)
{
Console.WriteLine("Client already disposed. Couldn't complete operation");
}
else
{
computer.ProcessCommandAcknowledgment( commandType );
}
}
Any clues?
Edit: ProcessCommandAcknowledgment:
public void ProcessCommandAcknowledgment( Type ackType )
{
if( m_CurrentCommand.GetType() == ackType )
{
m_CurrentCommand.Finish();
}
}
Based on the information you gave, it certainly appears impossible for a null ref to occur at that location. So the next question is "How do you know that the particular line is creating the NullReferenceException?" Are you using the debugger or stack trace information? Are you checking a retail or debug version of the code?
If it's the debugger, various setting combinations which can essentially cause the debugger to appear to report the NullRef in a different place. The main on that would do that is the Just My Code setting.
In my experience, I've found the most reliable way to determine the line an exception actually occurs on is to ...
Turn off JMC
Compile with Debug
Debugger -> Settings -> Break on Throw CLR exceptions.
Check the StackTrace property in the debugger window
I would bet money that there's a problem with your TCP framing code (if you have any!)
"PacketReader" perhaps suggests that you don't. Because, technically, it would be called "FrameReader" or something similar if you did.
If the two PC's involved are on a local LAN or something then it would probably explain the 14 days interval. If you tried this over the Internet I bet your error frequency would be much more common especially if the WAN bandwidth was contended.
Is it possible that ReadString() is returning null? This would cause GetType to fail. Perhaps you've received an empty packet? Alternatively, the string may not match a type and thus commandType would be null when used later.
EDIT:
Have you checked that m_CurrentCommand is not null when you invoke ProcessCommandAcknowledgment?
What are the other thread(s) doing?
Edit: You mention that the server is single threaded, but another comment suggests that this portion is single threaded. If that's the case, you could still have concurrency issues.
Bottom line here, I think, is that you either have a multi-thread issue or a CLR bug. You can guess which I think is more likely.
If you have optimizations turned on, it's likely pointing you to a very wrong place where it actually happens.
Something similar happened to me a few years back.
Or else a possible thread race somewhere where context gets set to null by another thread. That would also explain the uncommonness of the error.
Okay, ther are really only a few possibilities.
Somehow your computer reference is being tromped by the time you call that routine.
Something under the call is throwing the null pointer dereference error but it's being detected at that line.
Looking at it, I'm very suspicious the stack is getting corrupted, causing your computer automatic to get mangled. Check the subroutine/method/function calls around the one you have trouble with; in particular, check that what you're making into a "Computer" item really is the type you expect.
computer.ProcessCommandAcknowledgment( commandType );
Do you have debugging symbols to be able to step into this?
The null ref exception could be thrown by ProcessCommandAcknowledgement, and bubble up.
Related
I recently had a coding bug where under certain conditions a variable wasn't being initialized and I was getting a NullReferenceException . This took a while to debug as I had to find the bits of data that would generate this to recreate it the error as the exception doesn't give the variable name.
Obviously I could check every variable before use and throw an informative exception but is there a better (read less coding) way of doing this? Another thought I had was shipping with the pdb files so that the error information would contain the code line that caused the error. How do other people avoid / handle this problem?
Thanks
Firstly: don't do too much in a single statement. If you have huge numbers of dereferencing operations in one line, it's going to be much harder to find the culprit. The Law of Demeter helps with this too - if you've got something like order.SalesClerk.Manager.Address.Street.Length then you've got a lot of options to wade through when you get an exception. (I'm not dogmatic about the Law of Demeter, but everything in moderation...)
Secondly: prefer casting over using as, unless it's valid for the object to be a different type, which normally involves a null check immediately afterwards. So here:
// What if foo is actually a Control, but we expect it to be String?
string text = foo as string;
// Several lines later
int length = text.Length; // Bang!
Here we'd get a NullReferenceException and eventually trace it back to text being null - but then you wouldn't know whether that's because foo was null, or because it was an unexpected type. If it should really, really be a string, then cast instead:
string text = (string) foo;
Now you'll be able to tell the difference between the two scenarios.
Thirdly: as others have said, validate your data - typically arguments to public and potentially internal APIs. I do this in enough places in Noda Time that I've got a utility class to help me declutter the check. So for example (from Period):
internal LocalInstant AddTo(LocalInstant localInstant,
CalendarSystem calendar, int scalar)
{
Preconditions.CheckNotNull(calendar, "calendar");
...
}
You should document what can and can't be null, too.
In a lot of cases it's near impossible to plan and account for every type of exception that might happen at any given point in the execution flow of your application. Defensive coding is effective only to a certain point. The trick is to have a solid diagnostics stack incorporated into your application that can give you meaningful information about unhandled errors and crashes. Having a good top-level (last ditch) handler at the app-domain level will help a lot with that.
Yes, shipping the PDBs (even with a release build) is a good way to obtain a complete stack trace that can pinpoint the exact location and causes of errors. But whatever diagnostics approach you pick, it needs to be baked into the design of the application to begin with (ideally). Retrofitting an existing app can be tedious and time/money-intensive.
Sorry to say that I will always make a check to verify that any object I am using in a particular method is not null.
It's as simple as
if( this.SubObject == null )
{
throw new Exception("Could not perform METHOD - SubObject is null.");
}
else
{
...
}
Otherwise I can't think of any way to be thorough. Wouldn't make much sense to me not to make these checks anyway; I feel it's just good practice.
First of all you should always validate your inputs. If null is not allowed, throw an ArgumentNullException.
Now, I know how that can be painful, so you could look into assembly rewriting tools that do it for you. The idea is that you'd have a kind of attribute that would mark those arguments that can't be null:
public void Method([NotNull] string name) { ...
And the rewriter would fill in the blanks...
Or a simple extension method could make it easier
name.CheckNotNull();
If you are just looking for a more compact way to code against having null references, don't overlook the null-coalescing operator ?? MSDN
Obviously, it depends what you are doing but it can be used to avoid extra if statements.
A Technical Lead asked me the following:
He created a class, declared an object and initialized it. But in some circumstance we may get "null reference" exception.
He commented that there are 1000 possible reasons for such exception and asked me to guess a single reason.
I am unable to figure it out. What is (are) the reason(s) ,we may get such an exception?
You have used an object reference you have explicitly set to null, or
You have used an object reference you have implicitly set to null, or
Somewhere in your code, or in code called by you, there is the statement throw new NullReferenceException() (which you shouldn't do, by the way). I don't know if this counts, since it's not a real null reference.
I can't think of any of the other 997 reasons.
Edit: Thanks, Mark Byers, for point 3.
If it's a multi-threaded app, then some other thread could come along and set the object to a null reference.
Stack overflow?
{◕ ◡ ◕}
A few ways I can think of:
The constructor can throw a NullReferenceException before it completes.
When you access a property, the property can throw a NullReferenceException.
If you have a try { } finally { } around the code, if it throws an exception the finally runs and the code in the finally could throw a NullReferenceException.
There could be an implicit conversion during the assignment and the code for the conversion throws a NullReferenceException.
Here's example code for the last:
class Foo {}
class Bar
{
public static implicit operator Foo(Bar bar)
{
throw new NullReferenceException();
}
}
class Program
{
public static void Main()
{
Foo foo = new Bar(); // This causes a NullReferenceException to be thrown.
}
}
He created a class, declared an object
and initialized it. But in some
circumstance we may get "null
reference" exception. He commented
that there are 1000 possible reasons
for such exception and asked me to
guess a single reason. I am unable to
figure it out. What is (are) the
reason(s) ,we may get such an
exception?
Straightforward answer: I'd tell the interviewer that you can't debug code you can't see. Ask to see offending line of code and a debugger.
Not-so-straightforward answer: assuming your interviewer isn't an idiot, he probably feeling you out for your debugging skills. If you get a crappy bug report, do you throw your arms up and surrender right away, or do you attempt to resolve it.
Guessing is not an acceptable way to debug the error. The first step would be reproducing the bug on your machine.
Does it reproduce reliably? If yes, get your debugger out.
If no, can you reproduce it intermittently, or non-deterministically? Does the exception occur randomly in different places in code or on different threads? If yes, you probably have some sort of race condition, or maybe a corrupted pointer.
If no, ask whoever found the bug to reproduce. When you follow the same steps as the person who originally found the bug, can you reproduce? If yes, see above.
If no, is there are a difference in environments? Configuration files? Data in the databases? Is the environment updated with the latest service packs, software updates, etc?
You won't be able to give your interviewer an answer, but you can give him a list of steps you'd take to eventually get to an answer.
Not an expert, but just a wild guess, out of memory?
You can always initialize something to a null value;
public class MyClass
{
// initialized to null
private string _myString = null;
// _myString is initialized, but this throws null reference
public int StringLength { get { return _myString.Length(); } }
}
The object in question may contain other objects that are not initialized in the main object's constructor. The question doesn't specify where or when the null reference exception is occurring.
999 to go.
In Multi-threaded code the variable can be accessed after the object has been created, but before the variable has been assigned to its location.
I think the interviewer was actually looking for how you'd go about solving the problem, ie what troubleshooting steps you'd take to solve a problem that could be caused by a thousand different things.
I am getting a NullReferenceException when running my multi-threaded application, but only when I run in Release mode outside of the debugger. The stack trace gets logged, and it always points to the same function call. I put several logging statements in the function to try to determine how far it would get, and every statement gets logged, including one on the last line of the function. What is interesting is that when the NullReferenceException occurs, the statement after the function call does not get logged:
// ...
logger.Log( "one" ); // logged
Update( false );
logger.Log( "eleven" ); // not logged when exception occurs
}
private void Update( bool condition )
{
logger.Log( "one" ); // logged
// ...
logger.Log( "ten" ); // logged, even when exception occurs
}
The exception does not occur every time the function is called. Is it possible that the stack is being corrupted either before or during execution of the function such that the return address is lost, resulting in the null reference? I didn't think that sort of thing was possible under .NET, but I guess stranger things have happened.
I tried replacing the call to the function with the contents of the function, so everything happens inline, and the exception then occurs on a line that looks like this:
foreach ( ClassItem item in classItemCollection )
I have verified through logging that the "classItemCollection" is not null, and I also tried changing the foreach to a for in case the IEnumerator was doing something funny, but the exception occurs on the same line.
Any ideas on how to investigate this further?
Update: Several responders have suggested possible solutions having to do with making sure the logger isn't null. To be clear, the logging statements were added for debugging purposes after the exception started happening.
I found my null reference. Like Fredrik and micahtan suggested, I didn't provide enough information for the community to find a solution, so I figured I should post what I found just to put this to rest.
This is a representation of what was happening:
ISomething something = null;
//...
// the Add method returns a strong reference to an ISomething
// that it creates. m_object holds a weak reference, so when
// "this" no longer has a strong reference, the ISomething can
// be garbage collected.
something = m_object.Add( index );
// the Update method looks at the ISomethings held by m_object.
// it obtains strong references to any that have been added,
// and puts them in m_collection;
Update( false );
// m_collection should hold the strong reference created by
// the Update method.
// the null reference exception occurred here
something = m_collection[ index ];
return something;
The problem turned out to be my use of the "something" variable as a temporary strong reference until the Update method obtained a permanent one. The compiler, in Release mode, optimizes away the "something = m_object.Add();" assignment, since "something" isn't used until it is assigned again. This allowed the ISomething to be garbage collected, so it no longer existed in m_collection when I tried to access it.
All I had to do was ensure that I held a strong reference until after the call to Update.
I am doubtful that this will be of any use to anyone, but in case anyone was curious, I didn't want to leave this question unanswered.
The fact that it logs "ten" would make me look first at:
is logger ever assigned... is this perhaps becoming null somehow
is the bug inside Log itself
Hard to tell without enough context for either - but that is how I'd investigate it. You could also add a simple null test somewhere; as a cheeky approach, you could rename the Log method to something else, and add an extension method:
[Conditional("TRACE")]
public static void Log(this YourLoggerType logger, string message) {
if(logger==null) {
throw new ArgumentNullException("logger",
"logger was null, logging " + message);
} else {
try {
logger.LogCore(message); // the old method
} catch (Exception ex) {
throw new InvalidOperationException(
"logger failed, logging " + message, ex);
}
}
}
Your existing code should call the new Log extension method, and the exception will make it clear exactly where it barfed. Maybe change it back once fixed... or maybe leave it.
Agree w/Fredrik -- more details are necessary. One place to maybe start looking: you mention multi-threaded application and the error happening in release but not debug. You might be running into a timing issue with multiple threads accessing the same object references.
Regardless, I'd also probably put a:
Debug.Assert(classItemCollection != null);
right before the loop iteration. It won't help you in release mode, but it may help you catch the problem if (when?) it happens in Debug.
I'd look for code that's setting logger or one of its dependents to null. Are there properties of logger that, when set to null, might trigger this? Release mode sometimes speeds up application execution which can reveal synchronization problems that are masked by the performance penalty of debug mode and/or the debugger.
The fact that "eleven" isn't getting logged leads me to believe that logger is being set to null just before that call is made. Can you wrap it in a try/catch and see if it hits the catch portion of the block? Maybe you can insert a MessageBox.Show or write something to a known file when that happens.
Are you modifying classItemCollection from multiple threads? If you change the collection in another thread you may be invalidating the iterator which might lead to your exception. You may need to protect access with a lock.
edit:
Can you post more info about the types of ClassItem and classItemCollection?
Another possibility is that ClassItem is a value type and classItemCollection is a generic collection and somehow a null is getting added to the collection. The following throws a NullReferenceException:
ArrayList list=new ArrayList();
list.Add(1);
list.Add(2);
list.Add(null);
list.Add(4);
foreach (int i in list)
{
System.Diagnostics.Debug.WriteLine(i);
}
This particular problem can be resolved by int? i or Object i in the foreach or using a generic container.
I got an OutOfMemoryException earlier and couldn't figure out what it was for. It made no sense at all. Dug around in my code, and suddenly remembered that somewhere had forgotten to check for null, and in this particular case it was (and should be) exactly that. That shouldn't cause an OutOfMemoryException in my opinion, but I fixed it anywas of course. And when I did, the exception didn't appear anymore!
So I removed the check again and studied the exception I got some more. And turns out it had an InnerException of type NullReferenceException and a stack trace which of course made a lot more sense.
But why did I get an OutOfMemoryException? This has never happend to me before... makes no sense to me...
Would love to give some more context, but can't really say much without having to upload the whole project, which I can't (And which you wouldn't want to read through anyways :p). But the specific place it happend looks like this:
{
foreach (var exportParameter in exportParameters)
{
// Copy to local
var ep = exportParameter;
// Load stored values from db
...
}
int i = 1;
exportParameters
.OrderBy(ø => ø.Sequence)
.ForEach(ø => { if (!ø.Locked) ø.Sequence = i++; });
}
The fix was to put an if(exportParameters != null) before the code block. exportParameters is a List<ExportParameter>, except in the failing case in which it was null.
You might be facing the problem that Constrained Execution Regions are designed to prevent - that is, the JITting of some code that your catch clause relies on is causing the out of memory condition.
(In response to svish's comment, this is the first link when googling the phrase: http://msdn.microsoft.com/en-us/library/ms228973.aspx)
Aside from the obvious reason for getting an OOMException, you can also get it if you still have memory available, just not a big enough chunk for what is being requested. If you're getting it reliably and relatively near startup, you're probably accidentally requesting more memory than you intend to (ie. requesting a very large array). Can you post a bit of your code or at least describe your allocation pattern?
I must be going out of my mind, because my unit tests are failing because the following code is throwing a null ref exception:
int pid = 0;
if (parentCategory != null)
{
Console.WriteLine(parentCategory.Id);
pid = parentCategory.Id;
}
The line that's throwing it is:
pid = parentCategory.Id;
The console.writeline is just to debug in the NUnit GUI, but that outputs a valid int.
Edit: It's single-threaded so it can't be assigned to null from some other thread, and the fact the the Console.WriteLine successfully prints out the value shows that it shouldn't throw.
Edit: Relevant snippets of the Category class:
public class Category
{
private readonly int id;
public Category(Category parent, int id)
{
Parent = parent;
parent.PerformIfNotNull(() => parent.subcategories.AddIfNew(this));
Name = string.Empty;
this.id = id;
}
public int Id
{
get { return id; }
}
}
Well if anyone wants to look at the full code, it's on Google Code at http://code.google.com/p/chefbook/source/checkout
I think I'm going to try rebooting the computer... I've seen pretty weird things fixed by a reboot. Will update after reboot.
Update: Mystery solved. Looks like NUnit shows the error line as the last successfully executed statement... Copy/pasted test into new console app and ran in VS showed that it was the line after the if statement block (not shown) that contained a null ref. Thanks for all the ideas everyone. +1 to everyone that answered.
Based on all the info so far, I think either "the line that's throwing is" is wrong (what makes you think that), or possibly your 'sources' are out of sync with your built assemblies.
This looks impossible, so some 'taken for granted assumption' is wrong, and it's probably the assumption that "the source code you're looking at matches the process you're debugging".
When things look right on the surface then it's likely something you would dismiss. Are you 350% positive that your DLL/PDB matches your source code thus giving you the right line?
Try manually deleting your assemblies and run whatever it is you're running. Does it fail as it should?
Try cleaning out your solution and recompiling. Copy the assemblies to wherever and run it again. Does it work this time? Null ref at the same spot?
Debug the surrounding lines for values you expect. parentCategory, etc. should match what you think.
Modify the code by throwing an exception. Do the stack trace lines match exactly the lines in your code?
I've had some extremely mind-boggling experiences just like this because one of my assumptions was wrong. Typical is a stale assembly. Question all assumptions.
What doesn't make sense, though, is that the OP states that the console.writeline outputs a valid int and it calls the same property as the line where he states the error is being thrown.
A stacktrace would be helpful, or being able to look at the actual unit test.
Yes, could we see the code of the category class .. in particular the Id property? Assuming the Id property is an int (and not a nullable int), the get method must be accessing a NULL object.
Id could be a nullable int type (int?) with no value, however I feel that the property Id must be doing more than just returning an int, and something inside that is null.
EDIT Your edit reveals this is more than peculiar, could you supply us with a stack trace?
I'm not an expert on C#, and am not sure if it makes a difference, but are you debugging in Debug mode, or Release mode? If you're in release mode, the line that your IDE is pointing to and the line the problem is actually on may be different (I know this is the case in C++ when using Visual Studio)
I'd have to agree with Brian... Maybe you're using outdated .pdbs and the code you're seen on debug mode does not comply with the code actually being debugged.
Try cleaning the project, and rebuilding it in debug mode.
Looks like NUnit shows the error line as the last successfully executed statement... Copy/pasted test into new console app and ran in VS showed that it was the line after the if statement block (not shown) that contained a null ref. Thanks for all the ideas everyone. +1 to everyone that answered.