Path traversal warning when using Path.Combine - c#

I am currently using Newtonsoft.Json (Json.NET) to load some UI data from a JSON file. However, a warning says I have a path traversal vulnerability.
Here is the situation:
Any idea how to remove this security vulnerability?

The path traversal exploit is possible when a path provided by a user or other untrusted source is combined, without checking, with a parent path. The problem is that there are "path traversal" components of paths that enable navigation out of the parent folder.
For example, if you were to combine the following two paths, an absolute path and a relative one:
Base/absolute path: C:\WebData
Relative/user path: ..\windows\system32\
Then this would yield:
C:\windows\system32\
As you can imagine, letting someone read or write this directory, which is not the intended WebData directory, could be a huge problem, as it could lead to someone learning information about your system or placing exploits that compromise the system, giving control of it to malicious actors.
You can read more about this exploit.
To deal with this vulnerability properly, you need to ensure that the relative path combined with the parent path is safe. Here are some ways to ensure this:
The relative path comes from a known trusted source, such as a vetted, internal list.
The relative path has previously been checked and was stored/maintained in a way that ensures it can be distinguished from untrusted paths, and could not be modified by a user or untrusted agent in the meantime. For example, keeping the path in a string is a bad idea. Instead, you would do something like create a TrustedPath class that code could only gain an instance of by, in fact, running code that checked that the path is safe.
The resulting path is checked after combination to ensure it is in the correct location.
You could do the last item like so (all of the below):
(As good practice to avoid unnecessary exceptions), use Path.GetInvalidFileNameChars() to check the (untrusted) relative path for invalid characters.
Perform Path.Combine() as you have done.
Ensure the resulting path is still within the original, parent path. This can be done by simply ensuring the resulting path starts with the parent path, but there may be issues with that. So consider an answer like this or other code that ensures the resulting path is truly a descendant of the desired folder.
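The combine-then-check steps above could be sketched roughly as follows. This is a minimal sketch, not a vetted implementation: `SafePaths` and `TryCombineWithinBase` are hypothetical names, the check is purely lexical (it does not resolve symlinks or junctions), and `Path.GetFullPath` can throw on malformed input, so callers may want to pre-check or catch.

```csharp
using System;
using System.IO;

public static class SafePaths
{
    // Hypothetical helper (not part of the BCL): combines an untrusted relative
    // path with a trusted base directory and reports whether the result stays inside it.
    public static bool TryCombineWithinBase(string baseDir, string untrustedRelative, out string fullPath)
    {
        fullPath = string.Empty;

        // Normalize the base and force a trailing separator so that
        // "C:\WebData" does not also match "C:\WebDataEvil".
        string root = Path.GetFullPath(baseDir);
        if (!root.EndsWith(Path.DirectorySeparatorChar))
            root += Path.DirectorySeparatorChar;

        // GetFullPath collapses "." and ".." segments lexically. If the second
        // argument is rooted, Path.Combine returns it as-is and the check below rejects it.
        string candidate = Path.GetFullPath(Path.Combine(root, untrustedRelative));

        // Ordinal comparison is enough here because the candidate was built from
        // the same normalized root string. Symlinks are NOT resolved (assumption).
        if (!candidate.StartsWith(root, StringComparison.Ordinal))
            return false;

        fullPath = candidate;
        return true;
    }
}
```

A caller would use the returned `fullPath` only when the method returns true, and treat false as a rejected request rather than trying to "repair" the input.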
Once you have done all this, if the "path traversal" warning is still showing, you can use the menu options in your code quality/security checking tool to annotate this instance of path traversal as safe. You may also want to put a comment with notes explaining why you marked it as safe, which could conceivably include a link to this SO question or one of its answers.
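The TrustedPath idea from the list above could look something like the sketch below. The type name comes from the answer; everything else (the `Create` factory, the `FullPath` property, the purely lexical check) is an assumption made for illustration. The point is only that the constructor is private, so code cannot hold an instance without having gone through the check.

```csharp
using System;
using System.IO;

// Hypothetical type: an instance can only be obtained through Create,
// which performs the containment check before constructing it.
public sealed class TrustedPath
{
    public string FullPath { get; }

    private TrustedPath(string fullPath) => FullPath = fullPath;

    public static TrustedPath Create(string baseDir, string untrustedRelative)
    {
        string root = Path.GetFullPath(baseDir);
        if (!root.EndsWith(Path.DirectorySeparatorChar))
            root += Path.DirectorySeparatorChar;

        string full = Path.GetFullPath(Path.Combine(root, untrustedRelative));

        // Lexical check only; symlinks are not resolved in this sketch.
        if (!full.StartsWith(root, StringComparison.Ordinal))
            throw new ArgumentException("Path escapes the base directory.", nameof(untrustedRelative));

        return new TrustedPath(full);
    }
}
```

Methods that open files would then accept a `TrustedPath` parameter instead of a `string`, which makes an unchecked path a compile-time error rather than a code-review finding.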
Note 1
Be careful about reusing a relative path that you have proven combines okay with a specific absolute path. Consider the following:
Base/absolute path: C:\WebData\FormElements\SuperForm\windows\
Relative/user path: ..\windows\
These two paths would combine safely, however this has not proven that the relative path is always safe to use with any absolute path.
Note 2
Be wary how you go about this. Assuming that relative path traversal always starts with ..\ is a mistake. The following is a valid relative path: folder\..\..\wheeeWeGotOut.
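To illustrate Note 2, here is a small sketch showing a naive "does it start with .." check passing while the resolved path still lands outside the base folder. `TraversalDemo`, `NaiveLooksSafe`, and `Resolve` are hypothetical names used only for this demonstration.

```csharp
using System;
using System.IO;

public static class TraversalDemo
{
    // The flawed check: only looks at the beginning of the relative path.
    public static bool NaiveLooksSafe(string relative) =>
        !relative.StartsWith("..", StringComparison.Ordinal);

    // What the file system would actually be asked to open.
    public static string Resolve(string baseDir, string relative) =>
        Path.GetFullPath(Path.Combine(baseDir, relative));

    public static void Main()
    {
        string baseDir = Path.Combine(Path.GetTempPath(), "WebData");
        string relative = Path.Combine("folder", "..", "..", "wheeeWeGotOut");

        Console.WriteLine(NaiveLooksSafe(relative));      // the naive check accepts it
        Console.WriteLine(Resolve(baseDir, relative));    // but this is a sibling of WebData, not a child
    }
}
```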
Note 3
Another answer proposes that safety can be guaranteed by simply removing invalid characters, and disallowing paths that contain .. or :. This is problematic for multiple reasons:
Files in many non-Windows file systems, such as HFS or those commonly used on Linux, can legitimately have these characters in their names. For example, a:filename and another..filename are perfectly fine (I just tested them). Restricting those characters limits what users can do.
Trying to improve the filtering to allow legitimate use-cases through is not a good idea. How do you know that you've written this code correctly and didn't miss an edge-case?
But most of all: what if there is an accidental symlink inside of the user's allowed path that points to a file elsewhere in the filesystem? What if the user is able to write a file that can function as a symlink, or has another exploit to cause such a file to be written or copied? The path exploit may not stand alone. It is often a combination of small exploits that leads to large exploits (each one building on or escalating the previous one). The only safe technique to ensure a file or directory is in the proper location is, after all other filtering, passes, and combining have been done, to check that the result is still within the expected location or is a direct descendant (paying attention to symlinks).
Regarding hard links—that is another matter. Good luck with that one. Don't make hard links. It's very hard to tell that a hard-link even exists, because it's a low-level modification. Read up on it if you're interested.

A safer way of using Path.Combine could be the following:
public string PathCombine(string path1, string path2)
{
    // Reject parent-directory segments and drive (or stream) separators outright.
    if (path2.Contains("..")) return null;
    if (path2.Contains(":")) return null;
    string result = Path.Combine(path1, path2);
    // Path.Combine returns path2 unchanged when it is rooted; treat that as unsafe.
    return result.Equals(path2) ? null : result;
}
Note that this is only an example; the check for disallowed strings can be improved.
As additional information you can take a look here.
If you want more information about how this problem can be exploited, I recommend reading the OWASP documentation.
The reason for checking whether the result equals the second parameter is the behavior of the CombineInternal function, which you can check here:
private static string CombineInternal(string first, string second)
{
    if (string.IsNullOrEmpty(first))
        return second;
    if (string.IsNullOrEmpty(second))
        return first;
    if (IsPathRooted(second.AsSpan()))
        return second;
    return JoinInternal(first.AsSpan(), second.AsSpan());
}
As you can see, if the second argument is rooted (IsPathRooted), the result is simply that argument. This is a common way to exploit Path.Combine without needing any '..' characters. Consider this example:
Your website is at c:\wwwroot\web1\public\index.html
if you pass as second parameter c:\wwwroot\web1\private\secret.conf you are going to be able to access this file.
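The rooted-argument behavior is easy to confirm with a short sketch. The paths below are made up for illustration and use forward slashes so the snippet behaves the same on Windows and Unix-like systems; `RootedDemo` and `DiscardsBase` are hypothetical names.

```csharp
using System;
using System.IO;

public static class RootedDemo
{
    // True when Path.Combine would discard the base path entirely.
    public static bool DiscardsBase(string second) => Path.IsPathRooted(second);

    public static void Main()
    {
        string publicRoot = "/wwwroot/web1/public";
        string userInput = "/wwwroot/web1/private/secret.conf";

        // No ".." anywhere, yet the base path is ignored.
        Console.WriteLine(DiscardsBase(userInput));
        Console.WriteLine(Path.Combine(publicRoot, userInput));
    }
}
```

This is why checking the input for ".." alone is not sufficient: the combined result has to be validated against the base directory as well.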

Related

Best way to check if a directory is a starting point of another path

I want to know how to implement the following:
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "C:\\dir1") == true);
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "C:\\dir2") == false);
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "C:\\dir1\\dir2") == true);
//not matching from start
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "dir1") == false);
//redundant slashes are ignored
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "c:\\\\dir1") == true);
Do I have to do it myself (it won't be too hard, but there are a lot of cases to check, for example UNC paths, device paths, URLs, etc.), or is there some system routine that can do this easily?
I do not believe there is any built in capability for this in the BCL. If you are willing to use p/invoke, the shell provides a number of path functions.
See: PathIsSameRoot function
As a side note, none of the Win32 path functions will reliably tell you if two paths are equivalent. To do that, you'd need to create some kind of canonical path based on the Windows Object Manager namespace and then probably still take into account NTFS junction points, hard links and soft links (and probably more).
Even with that, you'd still have problems with network share UNC paths as a single server may have multiple names with no general way to reconcile them.
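If a purely lexical comparison is acceptable, a managed sketch of PathStartsWith might look like the following. It assumes .NET 5 or later (for `OperatingSystem.IsWindows` and `Path.IsPathFullyQualified`), treats paths as case-insensitive only on Windows, and deliberately ignores everything the answer warns about: junctions, links, and multiple names for the same server. `PathPrefix` and `WithTrailingSeparator` are hypothetical names.

```csharp
using System;
using System.IO;

public static class PathPrefix
{
    public static bool PathStartsWith(string path, string prefix)
    {
        // Reject relative inputs such as "dir1" instead of resolving them
        // against the current directory.
        if (!Path.IsPathFullyQualified(path) || !Path.IsPathFullyQualified(prefix))
            return false;

        // GetFullPath collapses redundant separators and "."/".." segments.
        string fullPath = WithTrailingSeparator(Path.GetFullPath(path));
        string fullPrefix = WithTrailingSeparator(Path.GetFullPath(prefix));

        // Assumption: case-insensitive on Windows, case-sensitive elsewhere.
        StringComparison comparison = OperatingSystem.IsWindows()
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;

        return fullPath.StartsWith(fullPrefix, comparison);
    }

    // The trailing separator stops "C:\dir1" from matching "C:\dir10".
    private static string WithTrailingSeparator(string p) =>
        p.EndsWith(Path.DirectorySeparatorChar) ? p : p + Path.DirectorySeparatorChar;
}
```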

How to validate a Unix path?

I'm trying to write a simple C# function to validate a full Unix path to a shell script entered by a user, in order to check a few things:
The path is in correct format, no invalid symbols, like spaces or semi-colons
There are no suspicious commands, e.g. rm -rf /
The path represents a full path, no relatives
It does not matter if the script actually exists
The function would take a form like,
public bool IsUnixPathValid(string path)
{
    return !path.IsEmptyOrNull()
        && path.StartsWith("/")
        && !path.ContainsCharacters(";', \"");
}
Question: Is there an existing library that would perform something like this? And if not, what would be the best approach and what little details should I look out for (think secure).
If you're not trying to verify whether or not any file actually exists at the specified path, then probably the only thing you should be doing is checking that the path starts with / (because you want only absolute paths) and that there are no embedded NUL (0) bytes (because POSIX paths can't contain those). That's it. Absolutely anything else can be a valid path. Notably, spaces and semicolons are allowed in paths.
I guess you could also check for multiple adjacent slashes because those are redundant... however they are still accepted (with each group of multiple slashes having the same meaning as a single slash) so they're not actually invalid.
Checking for suspicious strings like "rm -rf /" embedded in the path is a bad idea. If you have security issues caused by unquoted paths sent directly to system commands then you need to solve those security issues (either by quoting the paths as appropriate or, better, by not passing them through things like shells that parse them) instead of blacklisting a few chosen embedded strings. If you blacklist then you're all too likely to miss something that should have been blacklisted, and, furthermore, you're liable to reject things that are actually valid benign paths.
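Following that advice, a minimal sketch of the validator only checks for a leading slash and the absence of NUL characters, and the script path is handed to the process as a single argument rather than being spliced into a command line. `UnixPaths` and `BuildStartInfo` are hypothetical names, running the script through `/bin/sh` is an assumption, and `ProcessStartInfo.ArgumentList` requires .NET Core 2.1 or later.

```csharp
using System;
using System.Diagnostics;

public static class UnixPaths
{
    // Absolute path, no embedded NUL. Spaces, semicolons and quotes are all legal.
    public static bool IsUnixPathValid(string path) =>
        !string.IsNullOrEmpty(path)
        && path[0] == '/'
        && path.IndexOf('\0') < 0;

    // Builds (but does not start) a process description in which the path is
    // one argv element, so no shell ever parses the path text itself.
    public static ProcessStartInfo BuildStartInfo(string scriptPath)
    {
        if (!IsUnixPathValid(scriptPath))
            throw new ArgumentException("Not an absolute Unix path.", nameof(scriptPath));

        var startInfo = new ProcessStartInfo("/bin/sh") { UseShellExecute = false };
        startInfo.ArgumentList.Add(scriptPath);
        return startInfo;
    }
}
```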

Does Path.Combine ever make network calls in .net?

Someone I know is claiming that it does and I am decompiling System.IO and looking in the Path class and I can't see it making networking calls. The only suspect is in NormalizePath, where it calls to PathHelper which has calls into Win32Native.GetFullPathName. I don't know what that does.
They are also claiming that System.Uri makes networking calls when created, which I find very incredible. I just can't believe that it would do that given how unbelievably slow that would be and how intrinsic these methods are.
Can anyone enlighten me?
Edit:
It turns out that Path.Combine(p) doesn't ever call the network but Path.GetFullPath(p) can. In the scenario where you have a UNC path with a short filename ("\\server\abcdef~1.txt" for example) it will actually call out to the network and try to expand the path, which blows my mind frankly.
No, the Path.Combine method simply performs the required string manipulation to generate a legal path string, given the path separator. It explicitly does not check to see if you've given it a valid path, or a valid file name, or whatever.
The reference source code for .NET 4 is available, if you're curious, and you can see that the work is done entirely in managed code, no native method calls, and is basically:
return path1 + (path1.EndsWith("\\") ? "" : "\\") + path2;
(A lot more robust and flexible, of course, but that's the idea.)
Similarly, the constructors for the Uri class do mostly string parsing (though orders of magnitude more complex than the Path stuff) but still, no network calls that I can see.
You could also check this yourself by running a packet capture utility such as Wireshark while executing such commands in a C# app.

A question about checking for the existence of a file versus the directory being empty and reliability

I know that pretty much every programming language has a method to check the existence of a file or directory.
However, in my case, a file is created which stores program settings. If it does not exist (i.e. !File.Exists or Directory.Count == 0, where Directory is the directory containing the file), then the program prompts for some settings to be filled in.
However, is this sort of code reliable? For example, there may be no files in the directory in which case the details are requested, otherwise there may be files of other types which are irrelevant or a tampered file of the correct format. Would it be better to check for the specific file itself? Would it also be better to check for variables which make up the file as this is faster?
What is the best general way of doing this? Check a file, if the folder is there? If the folder is there and empty? Check the strings which are written to the file?
EDIT: A school of thought from a colleague was that I should check at the variable level because it would be closer to the problem (identify issues like incorrect encryption, corruption, locales, etc).
Thanks
I'd just check for the existence and validity of the specific file. If you encounter a corrupted file, get rid of it and make a new one.
For basic development, it is purely choice. In the case where the file existing is crucial to the stability of the application, checking the file directly is the safest route.
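The "check the specific file, and replace it if it is corrupt" approach could be sketched like this. The JSON format, the `Dictionary<string, string>` shape, and the `SettingsStore.TryLoad` name are all assumptions made for illustration; the question does not say how the settings are stored.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public static class SettingsStore
{
    // Returns true only if the specific settings file exists AND parses.
    // A file that fails to parse is deleted so the caller can prompt again
    // and write a fresh one.
    public static bool TryLoad(string filePath, out Dictionary<string, string> settings)
    {
        settings = new Dictionary<string, string>();

        if (!File.Exists(filePath))
            return false;

        try
        {
            var parsed = JsonSerializer.Deserialize<Dictionary<string, string>>(File.ReadAllText(filePath));
            if (parsed != null)
            {
                settings = parsed;
                return true;
            }
        }
        catch (JsonException)
        {
            // Fall through: treat unparsable content as a corrupted file.
        }

        File.Delete(filePath);
        return false;
    }
}
```

When `TryLoad` returns false, the application would prompt for the settings and write a new file. Counting files in the directory never enters into it.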

How do I determine whether two file system URI's point to the same resource?

If I have two different file paths, how can I determine whether they point to the same file ?
This could be the case if, for instance, the user has a network drive attached which points to some network resource. For example, drive S: mapped to \\servercrm\SomeFolder.
Then these paths actually point to the same file:
S:\somefile.dat
And
\\servercrm\SomeFolder\somefile.dat
How can I detect this ? I need to code it so that it works in all scenarios where there might be different ways for a path to point to the same file.
I don't know if there is an easy way to do this directly in C# but you could do an unmanaged call to GetFileInformationByHandle (pinvoke page here) which will return a BY_HANDLE_FILE_INFORMATION structure. This contains three fields which can be combined to uniquely ID a file:
dwVolumeSerialNumber:
The serial number of the volume that contains a file.
...
nFileIndexHigh:
The high-order part of a unique identifier that is associated with a
file.
nFileIndexLow:
The low-order part of a unique identifier that is associated with a
file.
The identifier (low and high parts) and the volume serial number uniquely identify a file on a single computer. To determine whether two open handles represent the same file, combine the identifier and the volume serial number for each file and compare them.
Note though that this only works if both references are declared from the same machine.
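A P/Invoke sketch of that comparison is shown below. It is Windows-only, `FileIdentity` and `IsSameFile` are hypothetical names, and both files are kept open while their identifiers are compared, as the quoted documentation suggests. It carries the same caveat as the answer: it may not give the expected result across differently defined network shares.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

public static class FileIdentity
{
    [StructLayout(LayoutKind.Sequential)]
    private struct BY_HANDLE_FILE_INFORMATION
    {
        public uint dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint dwVolumeSerialNumber;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint nNumberOfLinks;
        public uint nFileIndexHigh;
        public uint nFileIndexLow;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool GetFileInformationByHandle(
        SafeFileHandle hFile, out BY_HANDLE_FILE_INFORMATION lpFileInformation);

    private static BY_HANDLE_FILE_INFORMATION Query(FileStream stream)
    {
        if (!GetFileInformationByHandle(stream.SafeFileHandle, out var info))
            throw new IOException("GetFileInformationByHandle failed, error " + Marshal.GetLastWin32Error());
        return info;
    }

    // Windows-only: compares volume serial number plus the 64-bit file index.
    public static bool IsSameFile(string pathA, string pathB)
    {
        const FileShare share = FileShare.ReadWrite | FileShare.Delete;
        using var a = new FileStream(pathA, FileMode.Open, FileAccess.Read, share);
        using var b = new FileStream(pathB, FileMode.Open, FileAccess.Read, share);

        var infoA = Query(a);
        var infoB = Query(b);

        return infoA.dwVolumeSerialNumber == infoB.dwVolumeSerialNumber
            && infoA.nFileIndexHigh == infoB.nFileIndexHigh
            && infoA.nFileIndexLow == infoB.nFileIndexLow;
    }
}
```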
Edited to add:
As per this question, this may not work for the situation you have, since the dwVolumeSerialNumber may be different if the share definitions are different. I'd try it out first though, since I always thought that the volume serial number was drive specific, not path specific. I've never needed to actually prove this though, so I could be (and probably am) wrong.
At the very least you could take and compare the MD5 hashes of the combined file contents, file name, and metadata such as CreationTime, LastAccessTime, and LastWriteTime.
If you're only worried about local files then you can use the combination of GetFileInformationByHandle and the BY_HANDLE_FILE_INFORMATION structure. Lucian did an excellent blog post on this subject here. The code is in VB.Net but it should be easily convertible to C#
http://blogs.msdn.com/vbteam/archive/2008/09/22/to-compare-two-filenames-lucian-wischik.aspx
