How to make filenames web safe using C#

This is not about encoding URLs. It has more to do with a problem I noticed: you can have a valid filename on IIS, such as "test & test.jpg", but it cannot be downloaded because the & causes an error. Other characters behave the same way: they are valid in Windows but not on the web.
My quick solution is to change the filename before saving, using the regex below:
public static string MakeFileNameWebSafe(string fileNameIn)
{
string pattern = @"[^A-Za-z0-9. ]";
string safeFilename = System.Text.RegularExpressions.Regex.Replace(fileNameIn, pattern, string.Empty);
if (safeFilename.StartsWith(".")) safeFilename = "noname" + safeFilename;
return safeFilename;
}
But I was wondering whether there is a better built-in way of doing this.

I don't know of anything built in.
What you can do is, as you say, scan the original filename and generate a web-safe version of it.
You can make such web-safe versions look like the slugs used for blog posts and categories (these are search engine-optimized):
Only lowercase characters
Numbers are allowed
Dashes are allowed
Spaces are replaced by dashes
Nothing else is allowed
Possibly you could replace "&" by "-and-"
So "test & test.jpg" would translate to "test-and-test.jpg".
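The slug rules above could be sketched roughly as follows. This is only an illustration, not a built-in API: the class and method names are made up for this example, the "noname" fallback is borrowed from the question's code, and keeping the extension (lowercased and stripped to letters, digits, and dots) is an assumption of this sketch.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class SlugSketch
{
    // Hypothetical helper: applies the slug rules listed above to a file name.
    public static string ToWebSafeFileName(string fileName)
    {
        string ext = Path.GetExtension(fileName).ToLowerInvariant();
        string name = Path.GetFileNameWithoutExtension(fileName).ToLowerInvariant();

        ext = Regex.Replace(ext, @"[^a-z0-9.]", "");          // keep the extension plain
        name = name.Replace("&", " and ");                    // "&" becomes the word "and"
        name = Regex.Replace(name, @"\s+", "-");              // runs of whitespace become one dash
        name = Regex.Replace(name, @"[^a-z0-9\-]", "");       // drop everything else
        name = Regex.Replace(name, @"-{2,}", "-").Trim('-');  // collapse and trim dashes

        // Assumption: fall back to "noname" when nothing survives, as in the question.
        return (name.Length == 0 ? "noname" : name) + ext;
    }
}
```

Calling SlugSketch.ToWebSafeFileName("test & test.jpg") should give "test-and-test.jpg", matching the example above.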

Just looking back at this question, since it's fairly popular. I thought I would post my current solution here, with various overloads, for anyone who wants it:
public static string MakeSafeFilename(string filename, string spaceReplace)
{
return MakeSafeFilename(filename, spaceReplace, false, false);
}
public static string MakeSafeUrlSegment(string text)
{
return MakeSafeUrlSegment(text, "-");
}
public static string MakeSafeUrlSegment(string text, string spaceReplace)
{
return MakeSafeFilename(text, spaceReplace, false, true);
}
public static string MakeSafeFilename(string filename, string spaceReplace, bool htmlDecode, bool forUrlSegment)
{
if (htmlDecode)
filename = HttpUtility.HtmlDecode(filename);
string pattern = forUrlSegment ? @"[^A-Za-z0-9_\- ]" : @"[^A-Za-z0-9._\- ]";
string safeFilename = Regex.Replace(filename, pattern, string.Empty);
safeFilename = safeFilename.Replace(" ", spaceReplace);
return safeFilename;
}

I think you are referring to the "A potentially dangerous Request.Path value was detected from the client (%)" error, which ASP.NET throws for paths that include characters which might indicate cross-site scripting attempts.
There is a good article on how to work around this:
http://www.hanselman.com/blog/ExperimentsInWackinessAllowingPercentsAnglebracketsAndOtherNaughtyThingsInTheASPNETIISRequestURL.aspx

Here's the one I use:
public static string MakeFileNameWebSafe(string path, string replace, string other)
{
var folder = System.IO.Path.GetDirectoryName(path);
var name = System.IO.Path.GetFileNameWithoutExtension(path);
var ext = System.IO.Path.GetExtension(path);
if (name == null) return path;
var allowed = @"a-zA-Z0-9" + replace + (other ?? string.Empty);
name = System.Text.RegularExpressions.Regex.Replace(name.Trim(), @"[^" + allowed + "]", replace);
name = System.Text.RegularExpressions.Regex.Replace(name, @"[" + replace + "]+", replace);
if (name.EndsWith(replace)) name = name.Substring(0, name.Length - replace.Length);
// GetDirectoryName drops the trailing separator, so rejoin with Path.Combine.
return System.IO.Path.Combine(folder ?? string.Empty, name + ext);
}

If you don't need to keep the original name, perhaps you could just replace it with a GUID?
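A minimal sketch of the GUID idea, assuming you still want to keep the original extension so the file is served with a sensible content type. The class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class GuidNameSketch
{
    // Hypothetical helper: discards the original name and keeps only the extension.
    public static string NewStoredFileName(string originalFileName)
    {
        string ext = Path.GetExtension(originalFileName).ToLowerInvariant();
        // "N" formats the GUID as 32 hex digits with no dashes or braces.
        return Guid.NewGuid().ToString("N") + ext;
    }
}
```

If the original name matters to users, it would have to be stored somewhere else (for example in a database column) alongside the generated name.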

Related

How to convert char @"\" to escape string \ in C#

I have grabbed some data from a website. One string in the data, named urlresult, is "http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1".
What I want to do is get rid of the first three '\' characters in urlresult. I have tried the function below:
public string ConvertDataToUrl(string urlresult )
{
var url = urlresult.Split('?')[0].Replace(@"\", "") + "?" + urlresult.Split('?')[1];
return url;
}
It returns "http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\\u5317\\u4eac\\u6c83\\u534e\\u521b\\u65b0\\u79d1\\u6280\\u6709\\u9650\\u516c\\u53f8&softwareType=1" which is incorrect.
The correct result is "http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=北京沃华创新科技有限公司&softwareType=1"
I have tried many approaches, but none has worked. I have no idea how to get the correct result.
I think you may be misled by the debugger because there's no reason that extra "\" characters should get inserted by the code you provided. Often times the debugger will show extra "\" in a quoted string so that you can tell which "\" characters are really there versus which are there to represent other special characters. I would suggest writing the string out with Debug.WriteLine or putting it in a log file. I don't think the information you provided in the question is correct.
As proof of this, I compiled and ran this code:
static void Main(string[] args)
{
var url = @"http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1";
Console.WriteLine("{0}{1}{2}", url, Environment.NewLine,
url.Split('?')[0].Replace(@"\", "") + "?" + url.Split('?')[1]);
}
The output is:
http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1
http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1
You can use the System.Text.RegularExpressions.Regex.Unescape method:
var input = @"\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8";
string escapedText = System.Text.RegularExpressions.Regex.Unescape(input);
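Putting the two steps together might look like the sketch below. It assumes that, once the "\/" pairs are turned into "/", the only backslashes left are \uXXXX sequences; Regex.Unescape throws an ArgumentException on escape sequences it does not recognize, and it would also convert sequences such as \n. The class name is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlUnescapeSketch
{
    // Hypothetical helper combining the two steps discussed above.
    public static string ConvertDataToUrl(string urlResult)
    {
        // Step 1: turn the JSON-style "\/" into a plain "/".
        string withoutEscapedSlashes = urlResult.Replace(@"\/", "/");
        // Step 2: decode the remaining \uXXXX sequences into characters.
        return Regex.Unescape(withoutEscapedSlashes);
    }
}
```

If the data is actually JSON, parsing it with a JSON library would handle both kinds of escape for you; that is an alternative, not what the answers above describe.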

Get page name from URI, substring between two characters

I want to get the page name from a URI. For instance, if I have
"/Pages/Alarm/AlarmClockPage.xaml"
I want to get AlarmClockPage.
I tried
//usage GetSubstring("/", ".", "/Pages/Alarm/AlarmClockPage.xaml")
public static string GetSubstring(string a, string b, string c)
{
string str = c.Substring((c.IndexOf(a) + a.Length),
(c.IndexOf(b) - c.IndexOf(a) - a.Length));
return str;
}
But because the string being searched may contain more than one forward slash, I don't think this method works in that case.
So how do I account for the multiple forward slashes that may be present?
Why don't you use method which is already in the framework?
System.IO.Path.GetFileNameWithoutExtension(@"/Pages/Alarm/AlarmClockPage.xaml");
If you only want to use string functions, you may try:
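var startIdx = pathString.LastIndexOf("/") + 1; // first character after the last slash
var endIdx = pathString.LastIndexOf(".");
if(endIdx > startIdx)
{
// Substring takes a start index and a length, not an end index.
fileName = pathString.Substring(startIdx, endIdx - startIdx);
}
else
{
fileName = pathString.Substring(startIdx);
}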
This gives the file name from a given file path. Note that it keeps the extension, so it returns AlarmClockPage.xaml:
string pageName = System.IO.Path.GetFileName(@"/Pages/Alarm/AlarmClockPage.xaml");

Using a for loop to replace string characters in a .NET C# Model

I am new to .NET MVC, and come from PHP/Java/ActionScript.
The problem I have come across is with the .NET Model and get{}. I do not understand why my Hyphenize string will return the value of SomeText truncated to 64 characters, but without replacing any of the characters defined in the array.
Model - This is supposed to replace certain characters in SomeText with a simple hyphen - :
public string SomeText{ get; set;} // Unmodified string
public string Hyphenize{
get {
//unwanted characters to replace
string[] replace_items = {"@", " ", "!", "?", "#", "*", ",", ".", "/", "'", @"\", "=" };
string stringbuild = SomeText.Substring(0, (SomeText.Length > 64 ? 64 : SomeText.Length));
for (int i = 0; i < replace_items.Length; i++)
{
stringbuild.Replace(replace_items[i], "-");
}
return stringbuild;
}
set { }
}
Alternatively, the method below does work correctly and will return the string with " " and "#" characters replaced. However, it bothers me that I am unable to understand why the for loop did not work.
public string Hyphenize{
get {
//Replaces unwanted characters
return SomeText.Substring(0, (SomeText.Length > 64 ? 64 : SomeText.Length)).Replace(" ", "-").Replace("#", "-");
}
set { }
}
Ultimately I ended up with
return Regex.Replace(SomeText.Substring(0, (SomeText.Length > 64 ? 64 : SomeText.Length)).Replace("'", ""), @"[^a-zA-Z0-9]", "-").Replace("--", "-");
string is immutable, from MSDN:
Strings are immutable--the contents of a string object cannot be changed after the object is created, although the syntax makes it appear as if you can do this.
So you need to assign the result back:
stringbuild = stringbuild.Replace(replace_items[i], "-");
You're not assigning the value of Replace() to anything. It returns its result and does not modify the string that it operates on. (Strings are immutable.)
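For completeness, here is the loop with the assignment in place, written as a standalone static method so it can be tried outside the model. The class name is hypothetical and the list of characters is shortened for the example.

```csharp
using System;

public static class HyphenizeSketch
{
    // Hypothetical static version of the Hyphenize getter from the question.
    public static string Hyphenize(string someText)
    {
        string[] replaceItems = { " ", "!", "?", "#", "*", ",", ".", "/", "'", "\\", "=" };
        string result = someText.Substring(0, Math.Min(someText.Length, 64));
        foreach (string item in replaceItems)
        {
            // Replace returns a new string; keep it by assigning it back.
            result = result.Replace(item, "-");
        }
        return result;
    }
}
```

The chained version in the question works for the same reason: each Replace call operates on the string returned by the previous call.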

How can I remove the part "http://" from a string?

I have this method:
private List<string> offline(string targetDirectory)
{
if (targetDirectory.Contains("http://"))
{
MessageBox.Show("true");
}
DirectoryInfo di = new DirectoryInfo(targetDirectory);
List<string> directories = new List<string>();
try
{
string[] dirs = Directory.GetDirectories(targetDirectory,"*.*",SearchOption.TopDirectoryOnly);
for (int i = 0; i < dirs.Length; i++)
{
string t = "http://" + dirs[i];
directories.Add(t);
}
}
catch
{
MessageBox.Show("hgjghj");
}
return directories;
}
This is the part:
if (targetDirectory.Contains("http://"))
{
MessageBox.Show("true");
}
I pass in a directory, get all of its subdirectories, and prepend the string "http://" to each one.
The problem is that the next time one of those directories is passed to the function, it arrives with "http://" already attached.
For example: http://c:\\ or http://c:\\windows
And then the line
DirectoryInfo di = new DirectoryInfo(targetDirectory); // throws exception.
So each time a directory is passed to the function, I want to check whether it starts with "http://", strip that part, get all the subdirectories, and then add "http://" to each of them as I do now.
How can I remove "http://"?
I would be stricter than using Contains - I'd use StartsWith, and then Substring:
if (targetDirectory.StartsWith("http://"))
{
targetDirectory = targetDirectory.Substring("http://".Length);
}
Or wrap it in a helper method:
public static string StripPrefix(string text, string prefix)
{
return text.StartsWith(prefix) ? text.Substring(prefix.Length) : text;
}
It's not clear to me why you're putting the http:// as a prefix anyway though, to be honest. I can't see how you'd expect a directory name prefixed with http:// to be a valid URL. Perhaps if you could explain why you're doing it, we could suggest a better approach.
(Also, I really hope you don't have a try/catch block like that in your real code, and that normally you follow .NET naming conventions.)
The problem is: how can I remove the http://?
You may use string.Replace and replace the substring with an empty string.
targetDirectory = targetDirectory.Replace("http://","");
or
targetDirectory = targetDirectory.Replace("http://",string.Empty);
Both are equivalent.
Try this:
if(example.StartsWith("http://"))
{
example = example.Substring(7); // "http://".Length == 7
}
You can always use String.Replace to remove or replace characters in the string.
Example:
targetDirectory = targetDirectory.Replace("http://", string.Empty);
And you can check whether the string begins with http:// by doing
if(targetDirectory.StartsWith("http://"))
You can replace characters in the string with string.Replace:
if (targetDirectory.Contains("http://"))
{
targetDirectory = targetDirectory.Replace("http://",string.Empty);
}
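The answers above differ in one detail worth seeing side by side: Replace removes every occurrence of "http://", while StartsWith plus Substring removes only a leading one. A small sketch, with a hypothetical class name and with case-insensitive matching added as an assumption of this example:

```csharp
using System;

public static class PrefixSketch
{
    // Removes the prefix only when the text actually starts with it.
    public static string StripPrefix(string text, string prefix)
    {
        // Ordinal comparison avoids culture-dependent surprises;
        // ignoring case is an assumption made for this sketch.
        return text.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            ? text.Substring(prefix.Length)
            : text;
    }
}
```

For an input such as @"http://c:\mirror\http://site", StripPrefix leaves the second "http://" alone, whereas Replace("http://", string.Empty) would remove both.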

Auto quotes around string in C# - built-in method?

Is there a built-in method that adds quotes around a string in C#?
Do you mean just adding quotes, like this?
text = "\"" + text + "\"";
I don't know of a built-in method to do that, but it would be easy to write one if you wanted to:
public static string SurroundWithDoubleQuotes(this string text)
{
return SurroundWith(text, "\"");
}
public static string SurroundWith(this string text, string ends)
{
return ends + text + ends;
}
That way it's a little more general:
text = text.SurroundWithDoubleQuotes();
or
text = text.SurroundWith("'"); // For single quotes
I can't say I've needed to do this often enough to make it worth having a method though...
string quotedString = string.Format("\"{0}\"", originalString);
Yes, using concatenation and escaped characters
myString = "\"" + myString + "\"";
Maybe an extension method
public static string Quoted(this string str)
{
return "\"" + str + "\"";
}
Usage:
var s = "Hello World";
Console.WriteLine(s.Quoted());
No but you can write your own or create an extension method
string AddQuotes(string str)
{
return string.Format("\"{0}\"", str);
}
Using Escape Characters
Just prefix the special character with a backslash, which is known as an escape character.
Simple Examples
string MyString = "Hello";
Response.Write(MyString);
This would print:
Hello
But:
string MyString = "The man said \"Hello\"";
Response.Write(MyString);
Would print:
The man said "Hello"
Alternative
You can use the @ prefix (a verbatim string literal) to avoid backslash escapes; see this link:
http://www.kowitz.net/archive/2007/03/06/the-c-string-literal
Then, for quotes, you would write two double quotes to represent a single one. For example:
string MyString = @"The man said ""Hello"" and went on his way";
Response.Write(MyString);
Outputs:
The man said "Hello" and went on his way
I'm a bit of a C# novice myself, so have at me, but I have this in a catch-all utility class 'cause I miss Perl:
// overloaded quote - if no quote chars spec'd, use ""
public static string quote(string s) {
return quote(s, "\"\"");
}
// quote a string
// q = two quote chars, like "", '', [], (), {} ...
// or another quoted string (quote-me-like-that)
public static string quote(string s, string q) {
if(q.Length == 0) // no quote chars, use ""
q = "\"\"";
else if(q.Length == 1) // one quote char, double it - your mileage may vary
q = q + q;
else if(q.Length > 2) // longer string == quote-me-like-that
q = q.Substring(0, 1) + q.Substring(q.Length - 1, 1);
if(s.Length == 0) // nothing to quote, return empty quotes
return q;
return q[0] + s + q[1];
}
Use it like this:
quote("this with default");
quote("not recommended to use one char", "/");
quote("in square brackets", "[]");
quote("quote me like that", "{like this?}");
Returns:
"this with default"
/not recommended to use one char/
[in square brackets]
{quote me like that}
In my case I wanted to add quotes only if the string was not already surrounded in quotes, so I did:
(this is slightly different to what I actually did, so it's untested)
public static string SurroundWith(this string text, string ends)
{
if (!(text.StartsWith(ends) && text.EndsWith(ends)))
{
return string.Format("{1}{0}{1}", text, ends);
}
else
{
return text;
}
}
There is no built-in method that meets your requirement.
There is a SplitQuotes method that does something like this:
Input - This is a "very long" string
Output - This, is, a, very long, string
When you get a string from a textbox or some other control, it comes with quotes.
If you still want to place quotes, you can use this kind of method:
private string PlaceQuotes(string str, int startPosition, int lastPosition)
{
string quotedString = string.Empty;
string replacedString = str.Replace(str.Substring(0, startPosition),str.Substring(0, startPosition).Insert(startPosition, "'")).Substring(0, lastPosition).Insert(lastPosition, "'");
return String.Concat(replacedString, str.Remove(0, replacedString.Length));
}
Modern C# version below. Using string.Create() we avoid unnecessary allocations:
public static class StringExtensions
{
public static string Quote(this string s) => Surround(s, '"');
public static string Surround(this string s, char c)
{
return string.Create(s.Length + 2, s, (chars, state) =>
{
chars[0] = c;
state.CopyTo(chars.Slice(1));
chars[^1] = c;
});
}
}
