How to validate a Unix path?

I'm trying to write a simple C# function that validates a full Unix path to a shell script entered by a user. It should check a few things:
The path is in the correct format, with no invalid symbols such as spaces or semicolons
There are no suspicious commands, e.g. rm -rf /
The path is a full path, not a relative one
It does not matter whether the script actually exists
The function would look something like this:
public bool IsUnixPathValid(string path)
{
    // Intended checks: non-empty, absolute, and none of the characters ; ' , space "
    return !string.IsNullOrEmpty(path)
        && path.StartsWith("/")
        && path.IndexOfAny(";', \"".ToCharArray()) < 0;
}
Question: Is there an existing library that would perform something like this? And if not, what would be the best approach and what little details should I look out for (think secure).

If you're not trying to verify whether or not any file actually exists at the specified path, then probably the only thing you should be doing is checking that the path starts with / (because you want only absolute paths) and that there are no embedded NUL (0) bytes (because POSIX paths can't contain those). That's it. Absolutely anything else can be a valid path. Notably, spaces and semicolons are allowed in paths.
I guess you could also check for multiple adjacent slashes because those are redundant... however they are still accepted (with each group of multiple slashes having the same meaning as a single slash) so they're not actually invalid.
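The check described in this answer can be sketched as follows. This is a minimal sketch, assuming the only requirements are "absolute" and "no NUL characters"; the class name UnixPathCheck is hypothetical.

```csharp
public static class UnixPathCheck
{
    // Returns true when the string is an absolute POSIX-style path:
    // non-empty, starting with '/', and containing no NUL character.
    // Spaces, semicolons, and repeated slashes are deliberately accepted.
    public static bool IsUnixPathValid(string path)
    {
        return !string.IsNullOrEmpty(path)
            && path[0] == '/'
            && path.IndexOf('\0') < 0;
    }
}
```

With this sketch, "/opt/my scripts/run;me.sh" is accepted, while "relative/run.sh" and any string containing a NUL character are rejected.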
Checking for suspicious strings like "rm -rf /" embedded in the path is a bad idea. If you have security issues caused by unquoted paths sent directly to system commands then you need to solve those security issues (either by quoting the paths as appropriate or, better, by not passing them through things like shells that parse them) instead of blacklisting a few chosen embedded strings. If you blacklist then you're all too likely to miss something that should have been blacklisted, and, furthermore, you're liable to reject things that are actually valid benign paths.
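As an illustration of "not passing them through things like shells", here is a hedged sketch that assumes the script is launched locally through System.Diagnostics.Process on .NET Core 2.1 or later (where ProcessStartInfo.ArgumentList is available), and that the script is executable in its own right. The ScriptLauncher class and BuildStartInfo method are hypothetical names.

```csharp
using System.Diagnostics;

public static class ScriptLauncher
{
    // Builds start info that executes the script directly, without a shell,
    // so spaces or ';' in the path are never interpreted as shell syntax.
    public static ProcessStartInfo BuildStartInfo(string scriptPath, params string[] args)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = scriptPath,
            UseShellExecute = false
        };
        foreach (string arg in args)
        {
            // Each entry is passed as one argument; no manual quoting is needed.
            startInfo.ArgumentList.Add(arg);
        }
        return startInfo;
    }
}
```

The result would then be handed to Process.Start. If the path has to travel through a shell anyway (for example over SSH), quoting it for that shell is still required; this sketch does not cover that case.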

Related

Path traversal warning when using Path.Combine

I am currently using Newtonsoft.Json to load some UI data from a JSON file. However, I get a warning saying that I have a path traversal vulnerability.
Here is the situation:
Any idea how to remove this security vulnerability?

The path traversal exploit is possible when a path provided by a user or other untrusted source is combined, without checking, with a parent path. The problem is that there are "path traversal" components of paths that enable navigation out of the parent folder.
For example, if you were to combine the following two paths, an absolute path and a relative one:
Base/absolute path: C:\WebData
Relative/user path: ..\windows\system32\
Then this would yield:
C:\windows\system32\
As you can imagine, letting someone read or write this directory, which is not the intended WebData directory, could be a huge problem, as it could lead to someone learning information about your system or placing exploits that compromise the system, giving control of it to malicious actors.
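To see the effect in code: Path.Combine on its own only concatenates, and the ".." segment takes effect once the path is resolved (by Path.GetFullPath or by the file system). The sketch below uses a hypothetical TraversalDemo class; any base directory can be passed in.

```csharp
using System.IO;

public static class TraversalDemo
{
    // Combines an untrusted relative path with a base directory and resolves
    // "." and ".." segments, which is where the escape becomes visible.
    public static string Resolve(string baseDirectory, string untrustedRelativePath)
    {
        return Path.GetFullPath(Path.Combine(baseDirectory, untrustedRelativePath));
    }
}
```

For example, resolving Path.Combine("..", "elsewhere") against a WebData folder inside the temp directory gives a folder named elsewhere that sits next to WebData, not inside it.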
You can read more about this exploit.
To deal with this vulnerability properly, you need to ensure that the relative path combined with the parent path is safe. Here are some ways to ensure this:
The relative path comes from a known trusted source, such as a vetted, internal list.
The relative path has previously been checked and was stored/maintained in a way that ensures it can be distinguished from untrusted paths, and could not be modified by a user or untrusted agent in the meantime. For example, keeping the path in a string is a bad idea. Instead, you would do something like create a TrustedPath class that code could only gain an instance of by, in fact, running code that checked that the path is safe.
The resulting path is checked after combination to ensure it is in the correct location.
You could do the last item like so (all of the below):
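(As good practice to avoid unnecessary exceptions), use Path.GetInvalidFileNameChars() to check the (untrusted) relative path for invalid characters (or Path.GetInvalidPathChars() if the relative path is allowed to contain subdirectories, since directory separators count as invalid file name characters).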
Perform Path.Combine() as you have done.
Ensure the resulting path is still within the original, parent path. This can be done by simply ensuring the resulting path starts with the parent path, but there may be issues with that. So consider an answer like this or other code that ensures the resulting path is truly a descendant of the desired folder.
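The steps above, together with the TrustedPath idea mentioned earlier, could be sketched as follows. This is a minimal sketch, not a vetted implementation. TrustedPath and TryCreate are hypothetical names, it uses Path.GetInvalidPathChars() so that subdirectories are allowed, and it works purely on strings, so it does not resolve symlinks or junctions.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: an instance can only be obtained through TryCreate,
// so holding a TrustedPath means the containment check has run.
public sealed class TrustedPath
{
    public string FullPath { get; }

    private TrustedPath(string fullPath)
    {
        FullPath = fullPath;
    }

    public static bool TryCreate(string parentDirectory, string untrustedRelativePath, out TrustedPath trusted)
    {
        trusted = null;
        if (string.IsNullOrEmpty(parentDirectory) || string.IsNullOrEmpty(untrustedRelativePath))
            return false;

        // Step 1: reject characters the platform cannot store in a path.
        if (untrustedRelativePath.IndexOfAny(Path.GetInvalidPathChars()) >= 0)
            return false;

        // Normalize the parent and end it with exactly one separator,
        // so that "C:\WebData" does not match "C:\WebDataEvil".
        string parentFull = Path.GetFullPath(parentDirectory)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
            + Path.DirectorySeparatorChar;

        // Step 2: combine, then resolve "." and ".." segments.
        string candidate = Path.GetFullPath(Path.Combine(parentFull, untrustedRelativePath));

        // Step 3: the resolved path must still begin with the parent.
        // Ordinal comparison is enough here because candidate was built from parentFull.
        if (!candidate.StartsWith(parentFull, StringComparison.Ordinal))
            return false;

        trusted = new TrustedPath(candidate);
        return true;
    }
}
```

Under these assumptions, a relative path such as ui/layout.json is accepted, while ../elsewhere, folder/../../out, and any rooted path (which Path.Combine returns unchanged) are rejected.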
Once you have done all this, if the "path traversal" warning is still showing, you can use the menu options in your code quality/security checking tool to annotate this instance of path traversal as safe. You may also want to put a comment with notes explaining why you marked it as safe, which could conceivably include a link to this SO question or one of its answers.
Note 1
Be careful about reusing a relative path that you have proven combines okay with a specific absolute path. Consider the following:
Base/absolute path: C:\WebData\FormElements\SuperForm\windows\
Relative/user path: ..\windows\
These two paths would combine safely, however this has not proven that the relative path is always safe to use with any absolute path.
Note 2
Be wary how you go about this. Assuming that relative path traversal always starts with ..\ is a mistake. The following is a valid relative path: folder\..\..\wheeeWeGotOut.
Note 3
Another answer proposes that safety can be guaranteed by simply removing invalid characters, and disallowing paths that contain .. or :. This is problematic for multiple reasons:
Files on many non-Windows systems, such as HFS volumes or Linux file systems, can legitimately have these characters in their names. For example, a:filename and another..filename are perfectly fine (I just tested them). Restricting those characters limits what users can do.
Trying to improve the filtering to allow legitimate use-cases through is not a good idea. How do you know that you've written this code correctly and didn't miss an edge-case?
But most of all: what if there is an accidental symlink inside the user's allowed path that points to a file elsewhere in the filesystem? What if the user is able to write a file that can function as a symlink, or has another exploit to cause such a file to be written or copied? The path exploit may not stand alone. It is often a combination of small exploits that leads to large exploits (each one intersecting with or escalating the previous one). The only safe way to ensure a file or directory is in the proper location is to check, after all other filtering, passes, and combining have been done, that the result is still within the expected location or is a direct descendant of it (paying attention to symlinks).
Regarding hard links—that is another matter. Good luck with that one. Don't make hard links. It's very hard to tell that a hard-link even exists, because it's a low-level modification. Read up on it if you're interested.

One safer way of using Path.Combine is the following:
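// Requires: using System.IO;
public string PathCombine(string path1, string path2)
{
    // Reject parent-directory segments and drive separators outright.
    if (path2.Contains("..")) return null;
    if (path2.Contains(":")) return null;
    string result = Path.Combine(path1, path2);
    // Path.Combine returns path2 unchanged when path2 is rooted; treat that as unsafe.
    return (result.Equals(path2) ? null : result);
}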
Note that this is only an example; the check for disallowed strings can be improved.
For additional information, you can take a look here.
If you want more information about how this problem can be exploited, I recommend reading the OWASP documentation.
The result is compared with the second parameter because of the behavior of the CombineInternal function, which you can check here:
private static string CombineInternal(string first, string second)
{
    if (string.IsNullOrEmpty(first))
        return second;
    if (string.IsNullOrEmpty(second))
        return first;
    if (IsPathRooted(second.AsSpan()))
        return second;
    return JoinInternal(first.AsSpan(), second.AsSpan());
}
As you can see, if the second argument is rooted (IsPathRooted), it is returned as the result. This is a common way to exploit Path.Combine without needing any '..' characters. Consider this example:
Your website is at c:\wwwroot\web1\public\index.html
If you pass c:\wwwroot\web1\private\secret.conf as the second parameter, you will be able to access that file.
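The rooted-argument behavior is easy to confirm with a small helper. This is a sketch with hypothetical names (RootedSecondArgument, DiscardsBase), not part of any library.

```csharp
using System.IO;

public static class RootedSecondArgument
{
    // Returns true when Path.Combine ignores 'first' entirely
    // because 'second' is a rooted path.
    public static bool DiscardsBase(string first, string second)
    {
        return Path.IsPathRooted(second) && Path.Combine(first, second) == second;
    }
}
```

Passing a public web folder as first and the full path of a file in a sibling private folder as second returns true: the combined result is simply the private file's path.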

URL display with proper output using System.Uri c#

I have an application in which I have stored a lot of websites without validating them. I now validate each URL as it is entered, but the URLs that were already stored remain as they are.
I want strict display code that also corrects user typos and always gives me a proper URL to work with.
The data that is already in the system has a lot of typos such as ...http://example.com or htp://example.com or ttp://example.com. I want the code to handle these and come up with the proper URL, either by removing the invalid part with a regex or by correcting it.
What is the best approach to accomplish this?

You can obviously pick out the correct ones with a regex.
However, you will need to write your own logic to fix those that are 'broken'. You could pull these out with another regex and then simply search for and replace the broken element. There are going to be limitations to this, as you can only really check the protocol prefix and not the domain part itself.
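A hedged sketch of that idea: one regex recognizes a damaged scheme prefix, the prefix is replaced, and Uri.TryCreate validates the result. The pattern only covers leading junk plus a missing "h" or "t" (the typos listed in the question); UrlRepair and TryRepair are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlRepair
{
    // Leading non-letters, then "http"/"https" with the 'h' and/or one 't'
    // possibly missing (e.g. "htp", "ttp"), then "://".
    private static readonly Regex BrokenScheme = new Regex(
        @"^[^A-Za-z]*(?:h?t{1,2}p(s?))://",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static bool TryRepair(string stored, out Uri repaired)
    {
        repaired = null;
        if (string.IsNullOrWhiteSpace(stored))
            return false;

        string candidate = stored.Trim();
        Match match = BrokenScheme.Match(candidate);
        if (match.Success)
        {
            // Keep "https" only when an 's' was present in the original prefix.
            string scheme = match.Groups[1].Length > 0 ? "https" : "http";
            candidate = scheme + "://" + candidate.Substring(match.Length);
        }

        Uri parsed;
        if (Uri.TryCreate(candidate, UriKind.Absolute, out parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp || parsed.Scheme == Uri.UriSchemeHttps))
        {
            repaired = parsed;
            return true;
        }
        return false;
    }
}
```

Entries that cannot be repaired this way (no scheme at all, or a different scheme such as ftp) return false, so they can be flagged for manual review instead of being guessed at.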

Here is my try:
http(s)?://(www\.)?[a-zA-Z0-9\-\.\\/]+
where [a-zA-Z0-9\-\.\\/] includes all the characters that you want to allow users to use.
P.S. Please be aware that if you are using Regex in C#, you must either double each backslash (\\) in a regular string literal or use a verbatim string (@"..."); otherwise your expression might not work properly.
Hope it gets you started.

Best way to check if a directory is a starting point of another path

I want to know how to implement the following:
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "C:\\dir1") == true);
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "C:\\dir2") == false);
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "C:\\dir1\\dir2") == true);
//not matching from start
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "dir1") == false);
//redundant slashes are ignored
Debug.Assert(PathStartsWith("C:\\dir1\\dir2\\dir3", "c:\\\\dir1") == true);
Do I have to do it myself (it won't be too hard, but there are a lot of cases to check, for example UNC paths, device paths, URLs, etc.), or is there a system routine that can do this easily?

I do not believe there is any built in capability for this in the BCL. If you are willing to use p/invoke, the shell provides a number of path functions.
See: PathIsSameRoot function
As a side note, none of the Win32 path functions will reliably tell if you two paths are equivalent. To do that, you'd need to create some kind of canonical path based on the Windows Object Manager namespace and then probably still take into account NTFS junction points, hard links and soft links (and probably more).
Even with that, you'd still have problems with network share UNC paths as a single server may have multiple names with no general way to reconcile them.
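If a purely lexical comparison is good enough, a managed sketch along these lines may cover the asserts in the question. It is an assumption-laden sketch, not a BCL feature: it normalizes both paths with Path.GetFullPath, compares case-insensitively only on Windows, and ignores everything mentioned above (junctions, links, UNC server aliases). The class name PathPrefix is hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class PathPrefix
{
    public static bool PathStartsWith(string path, string directory)
    {
        // A relative directory is never treated as the start of an absolute path.
        if (!Path.IsPathRooted(path) || !Path.IsPathRooted(directory))
            return false;

        StringComparison comparison = RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;

        return Normalize(path).StartsWith(Normalize(directory), comparison);
    }

    // Resolves "." / ".." segments, then appends a single trailing separator
    // so that "C:\dir1" is not treated as a prefix of "C:\dir10".
    private static string Normalize(string path)
    {
        return Path.GetFullPath(path)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
            + Path.DirectorySeparatorChar;
    }
}
```

Comparing whole normalized strings with a trailing separator is what keeps partial folder names from matching; a plain string.StartsWith on the raw inputs would not give that guarantee.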

Read structure from file c#

I'm trying to read a structure of a text file in a certain way. The text file is kind of a user-friendly configuration file.
Current structure of file (structure can be changed if necessary):
info1=exampleinfo
info2=exampleinfo2
info3="example","example2","example3"
info4="example","example2","example3"
There is no real difficulty in getting the first two lines, but the latter two are more difficult. I need to put them into two separate string arrays that I can use. I could split the string, but the problem is that in the info4 array the values can contain commas (this is all user input).
How to go about solving this?

The reason you're having trouble writing a parser is that you're not starting with a good definition of the file format. Instead of asking how you should parse it if there are commas, you should be deciding how to properly encode values with commas. Then parsing is simple.
If this file is written by non-technical users who can't be trusted with a complex format (like JSON), consider a format like:
info1=exampleinfo
info2=exampleinfo2
info3=example
      example2
      example3
info4=example
      example2
      example3
That is, don't mess around with quotes and commas. Users understand line breaks and spaces pretty well.
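Parsing such a format could look like the sketch below. The exact rule is an assumption: a line that starts with whitespace, or that contains no '=', is treated as another value for the most recent key. ConfigParser is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class ConfigParser
{
    // Parses "key=value" lines into key -> list of values.
    // Continuation lines add further values to the most recent key.
    public static Dictionary<string, List<string>> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, List<string>>();
        List<string> current = null;

        foreach (string rawLine in lines)
        {
            if (string.IsNullOrWhiteSpace(rawLine))
                continue;

            int equalsIndex = rawLine.IndexOf('=');
            bool isContinuation = char.IsWhiteSpace(rawLine[0]) || equalsIndex < 0;

            if (isContinuation)
            {
                if (current == null)
                    throw new FormatException("Value line appears before any key: " + rawLine);
                current.Add(rawLine.Trim());
            }
            else
            {
                string key = rawLine.Substring(0, equalsIndex).Trim();
                current = new List<string> { rawLine.Substring(equalsIndex + 1).Trim() };
                result[key] = current;
            }
        }
        return result;
    }
}
```

Because each value sits on its own line, commas and quotes inside a value need no special handling. The lines can come from File.ReadLines, and each list can be turned into an array with ToArray().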

I'm 100% in favor of @DavidHeffernan's solution: JSON would be great. And @ScottMermelstein's solution of a program that builds the output is probably your best bet if possible, since it does not allow the user to make a mistake even if they wanted to.
However, if you need them to build the textfile, and you're working with users who can't be trusted to put together valid JSON, since it is a picky format, maybe try a delimiter that won't be used by the user, to separate values.
For example, pipes are always good, since practically nobody uses them:
info1=exampleinfo
info2=exampleinfo2
info3=example|example2|example3
info4=example|exam,ple2|example3
All you'd need is a rule that says their data cannot contain pipes. More than likely, the users would be ok with that.
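Reading the pipe-delimited variant takes little more than two splits. A minimal sketch with hypothetical names (PipeDelimited, ParseValues):

```csharp
using System;

public static class PipeDelimited
{
    // Splits "key=a|b|c" into the key and its values.
    // Only the first '=' separates the key; commas inside values are left alone.
    public static string[] ParseValues(string line, out string key)
    {
        int equalsIndex = line.IndexOf('=');
        if (equalsIndex < 0)
            throw new FormatException("Expected key=value: " + line);

        key = line.Substring(0, equalsIndex);
        return line.Substring(equalsIndex + 1).Split('|');
    }
}
```

For the info4 line above, this yields three values, the second of which is exam,ple2 with its comma intact.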

Importing Windows registry to a file structure

I need help from someone smarter than I am to solve this puzzle.
I have a Registry branch that I want to convert into a file structure. Users make changes in the file structure, mostly because it's easier for lusers to manipulate files. Then I can write those changes back to the registry. (I realize the risk here; please don't tell me that doing this is "bad", I know. This code is for a personal project and is not going into any production models!)
So far my code works great:
This translates great to:
However, since the registry is NOT a file structure, we can have "bad" keys such as:
This makes a file named Banana, in directory ...\Branch2\SubBranch1\Apple. Obviously. I thought about replacing the '\' with something, but what?
There is also an issue with ending a key or value with a '.': the file will not have the period.
Does anyone have a solution (or suggestion) to obtain the intended result?

A simple character substitution map should accommodate you just fine.
Backslash could be the Arabic question mark: ؟
and slash could be the Greek capital letter omega with dasia and prosgegrammeni: ᾩ
Of course, you would have to pick substitutions that are legal filesystem characters as well as characters that have a near-zero chance of existing in a real registry key. But the Unicode "alphabet" is quite large, so it shouldn't be too much of a problem. The trailing period could be handled the same way.
charmap.exe is your friend.
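A minimal sketch of such a substitution map, using the two characters suggested above. The trailing-period substitute (U+2024, one dot leader) is an arbitrary, hypothetical choice, and RegistryNameCodec is a hypothetical name. The mapping is only reversible if real key names never contain the substitute characters.

```csharp
using System.Text;

public static class RegistryNameCodec
{
    private const char BackslashSubstitute = '؟';        // Arabic question mark
    private const char SlashSubstitute = 'ᾩ';            // Greek capital omega with dasia and prosgegrammeni
    private const char TrailingDotSubstitute = '\u2024'; // one dot leader (assumed unused in key names)

    // Registry key or value name -> name that is safe to use for a file.
    public static string ToFileName(string registryName)
    {
        var builder = new StringBuilder(registryName);
        builder.Replace('\\', BackslashSubstitute).Replace('/', SlashSubstitute);
        if (builder.Length > 0 && builder[builder.Length - 1] == '.')
            builder[builder.Length - 1] = TrailingDotSubstitute;
        return builder.ToString();
    }

    // File name -> original registry key or value name.
    public static string ToRegistryName(string fileName)
    {
        var builder = new StringBuilder(fileName);
        builder.Replace(BackslashSubstitute, '\\').Replace(SlashSubstitute, '/');
        if (builder.Length > 0 && builder[builder.Length - 1] == TrailingDotSubstitute)
            builder[builder.Length - 1] = '.';
        return builder.ToString();
    }
}
```

Other characters that are illegal in file names, and trailing spaces, would need entries of their own; they are left out to keep the sketch short.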

There are a couple of characters that you need to handle: /:?"<>|
If a key or value name contains one of these characters, you can't give the file the same name.
I suggest using an XML structure instead; it is very easy to maintain and manipulate.
