How to convert a string to a "unique key" in C#?

I need a function that gives a one-to-one map from a string to another string, where the output string has the nice property of being a valid file name.
More specifically, my problem is that, given the URL of an image, I want to save the image under a unique name derived from that URL. I need code like this:
string url;
string uniqueName = UrlToName (url);
string fileName = path + uniqueName + ".png";
The problem is how to write the UrlToName function. A possible solution could be GetHashCode, but I don't know if it's correct.

Your best bet is to use a database - create a new row for each file, store the Uri in one column and a generated file name in another column (e.g. Guid.NewGuid().ToString() + ".png").
This is immune to problems with file name lengths (~255 characters in NTFS, whilst a URL could be ~2000), there is no chance of a hash collision, and you can evolve your storage scheme over time as your database grows, for example by adding directories so that you don't end up with too many files in a single directory (which makes it unusable in Explorer).
You should also be concerned about security risks if you ever create file names on a server based on external input. Much of the advice in this answer applies here too.

Try something like this:
private string UrlToName(string url)
{
foreach (char c in System.IO.Path.GetInvalidFileNameChars())
{
url = url.Replace(c, '_');
}
return url;
}
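This makes sure the result contains only valid file name characters. Distinct URLs usually produce distinct names, but two URLs that differ only in invalid characters (for example "a/b" and "a?b") map to the same name, and a long URL can still exceed the file name length limit.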
If you do need a truly unique name, even if the same URL gets passed in, try this:
private string UrlToName(string url)
{
url = url + "_" + DateTime.Now.ToString("o");
foreach (char c in System.IO.Path.GetInvalidFileNameChars())
{
url = url.Replace(c, '_');
}
return url;
}
This adds a timestamp to the end of the string (the "o" format includes fractions of a second). Unless you pass in the exact same URL at the exact same instant (highly unlikely), you will get a unique string back every time.
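If the name only has to be stable and short rather than readable, hashing the URL is another option. The sketch below swaps in SHA-256, which none of the answers above use, in place of GetHashCode, whose 32-bit result collides easily and is not guaranteed to be stable across runs. The class name UrlNaming is made up for illustration, and a hash is not strictly one-to-one, although collisions are extremely unlikely in practice.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the original answers.
public static class UrlNaming
{
    // Returns a 64-character lowercase hex string derived from the URL.
    // Hex digits are valid file name characters on common file systems.
    public static string UrlToName(string url)
    {
        if (url == null) throw new ArgumentNullException(nameof(url));

        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(url));
            var sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("x2")); // two hex digits per byte
            }
            return sb.ToString();
        }
    }
}
```

Usage would follow the question's pattern: string fileName = path + UrlNaming.UrlToName(url) + ".png"; the same URL always yields the same name, so a repeated download overwrites the earlier file rather than creating a new one.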

Related

get the name of a file without empty spaces

I'm working on an MVC project and receive a file (an HttpPostedFileBase property) in my controller via model binding. I want to remove all the spaces from the name of the file I just received, and for that purpose I use this:
var nombre_archivo = DateTime.Now.ToString("yyyyMMddHHmm.") +"_"+ (info.file.FileName.ToString()).Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries);
but the variable nombre_archivo is always 201801240942.System.String[], and what I want is 201801240942.nameOfFile. Could you please tell me where the error is?
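You are splitting the file name on dots, which returns a string array; concatenating an array to a string prints its type name, System.String[].
Use Replace instead:
var nombre_archivo = string.Format("{0}_{1}",
DateTime.Now.ToString("yyyyMMddHHmm."),
info.file.FileName.Replace(" ", "")
);
Moreover, we recommend using string.Format instead of + concatenation because it is clearer. String interpolation is more concise still: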
var name = $"{DateTime.Now.ToString("yyyyMMddHHmm.")}_{info.file.FileName.Replace(" ", "")}";
You can use String.Replace to replace a space with an empty string. The method has other issues though. It doesn't check whether FileName is valid which means someone could make a POST request with a hand-coded path like ../../ or E:\somepath\myinnocentprogram.exe to write a file to the server's disk. Or worse, ../index.htm.
Replacing spaces doesn't make much sense. It's the dots and slashes that can result in files being written outside the intended folder.
If you check Uploading a File (Or Files) With ASP.NET MVC you'll see that the author uses Path.GetFileName() to retrieve only the file's name before saving it in the proper folder. Your code should look like this:
[HttpPost]
public ActionResult Index(HttpPostedFileBase file) {
if (file.ContentLength > 0) {
var fileName = Path.GetFileName(file.FileName)
.Replace(" ","");
var finalName=String.Format("{0:yyyyMMddHHmm}._{1}",DateTime.Now,fileName);
var path = Path.Combine(Server.MapPath("~/App_Data/uploads"), finalName);
file.SaveAs(path);
}
return RedirectToAction("Index");
}
This ensures that only the file name part of the posted path is used, and that the file is saved in the appropriate folder, even if someone posted an invalid path.
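The naming step can also be pulled out of the controller so it can be checked on its own. This is a minimal sketch under stated assumptions: the class and method names (UploadNaming, BuildStoredName) are hypothetical, the timestamp is passed in rather than read from DateTime.Now so the result is predictable, and the separator is a plain underscore. Note that Path.GetFileName only treats backslashes as separators on Windows.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class UploadNaming
{
    public static string BuildStoredName(string postedFileName, DateTime timestamp)
    {
        // Keep only the last path segment so prefixes such as "../../" are discarded.
        string name = Path.GetFileName(postedFileName ?? string.Empty);

        // Drop every whitespace character, not just ' '.
        name = string.Concat(name.Where(c => !char.IsWhiteSpace(c)));

        if (name.Length == 0)
        {
            throw new ArgumentException("No usable file name.", nameof(postedFileName));
        }

        // The invariant culture keeps the digits independent of server locale.
        string stamp = timestamp.ToString("yyyyMMddHHmm", CultureInfo.InvariantCulture);
        return stamp + "_" + name;
    }
}
```

In the controller this would be called as UploadNaming.BuildStoredName(file.FileName, DateTime.Now) before Path.Combine and SaveAs.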
This is how you should be able to fix your problem:
string nombre_archivo = string.Format("{0}_{1}", DateTime.Now.ToString("yyyyMMddHHmm."), info.file.FileName.Where(c => !Char.IsWhiteSpace(c)));
EDIT:
You should use string.Replace instead of a LINQ query.
There is a format problem I did not expect: formatting the result of Where prints the iterator's type name rather than the characters.
So, the right answer is given below, but basically, it would look like:
string nombre_archivo = string.Format("{0}_{1}", DateTime.Now.ToString("yyyyMMddHHmm."), info.file.FileName.Replace(" ", ""));

Changing file name

Consider the following code snippet
public static string AppendDateTimeToFileName(this string fileName)
{
return string.Concat(
Path.GetFileNameWithoutExtension(fileName),
DateTime.Now.ToString("yyyyMMddHHmmssfff"),
Path.GetExtension(fileName));
}
This basically puts a date-time stamp on any file that is uploaded by the users. This works great if the file name is something like
MyFile.png
AnotherFile.png
Now I'm trying to change this method so if the file name is something like
MyFile - Copy(1).png
AnotherFile - Copy(1).png
I want the file name to become
MyFile-Copy-120170303131815555.png
AnotherFile-Copy-120170303131815555.png
Is there an easy solution for this with regex or similar, or do I have to rewrite the method and check each of those values one by one?
return string.Concat(
Regex.Replace(Path.GetFileNameWithoutExtension(fileName), @" - Copy\s*\(\d+\)", "-Copy-", RegexOptions.IgnoreCase),
DateTime.Now.ToString("yyyyMMddHHmmssfff"),
Path.GetExtension(fileName));
This matches any number of digits and is a global replace.
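The expected output in the question keeps the copy number (MyFile-Copy-1 followed by the timestamp), which a fixed "-Copy-" replacement drops. A possible variant, sketched here under the assumption that the number should be preserved, captures the digits and reinserts them with $1. The class name FileNameStamps and the explicit now parameter are illustrative additions, not part of the original method.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
public static class FileNameStamps
{
    public static string AppendDateTimeToFileName(string fileName, DateTime now)
    {
        // " - Copy(1)" becomes "-Copy-1"; names without that suffix are left unchanged.
        string baseName = Regex.Replace(
            Path.GetFileNameWithoutExtension(fileName),
            @"\s*-\s*Copy\s*\((\d+)\)",
            "-Copy-$1",
            RegexOptions.IgnoreCase);

        return string.Concat(
            baseName,
            now.ToString("yyyyMMddHHmmssfff", CultureInfo.InvariantCulture),
            Path.GetExtension(fileName));
    }
}
```

Passing the timestamp in makes the method easy to check; production code would pass DateTime.Now.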

How can I get a part/subdomain of my URL in C#?

I have a URL like the following
http://yellowcpd.testpace.net
How can I get yellowcpd from this? I know I can do that with string parsing, but is there a builtin way in C#?
Assuming your URLs will always be testpace.net, try this:
var subdomain = Request.Url.Host.Replace("testpace.net", "").TrimEnd('.');
It'll just give you the non-testpace.net part of the Host. If you don't have Request.Url.Host, you can do new Uri(myString).Host instead.
Try this:
string host = Request.Url.Host;
var myvalues = host.Split('.');
How can I get yellowcpd from this? I know I can do that with string
parsing, but is there a builtin way in C#?
.Net doesn't provide a built-in feature to extract specific parts from Uri.Host. You will have to use string manipulation or a regular expression yourself.
The only constant part of the domain string is the TLD. The TLD is the very last bit of the domain string, e.g. .com, .net, .uk etc. Everything else under that depends on the particular TLD for its position (so you can't assume the next-to-last part is the "domain name" as, for .co.uk, it would be .co).
This fits the bill.
Split over two lines:
string rawURL = Request.Url.Host;
string domainName = rawURL .Split(new char[] { '.', '.' })[1];
Or over one:
string rawURL = Request.Url.Host.Split(new char[] { '.', '.' })[1];
The simple answer to your question is no there isn't a built in method to extract JUST the sub-domain. With that said this is the solution that I use...
public enum GetSubDomainOption
{
ExcludeWWW,
IncludeWWW
};
public static class Extentions
{
public static string GetSubDomain(this Uri uri,
GetSubDomainOption getSubDomainOption = GetSubDomainOption.IncludeWWW)
{
var subdomain = new StringBuilder();
for (var i = 0; i < uri.Host.Split(new char[]{'.'}).Length - 2; i++)
{
//Ignore any www values if the ExcludeWWW option is set
if(getSubDomainOption == GetSubDomainOption.ExcludeWWW && uri.Host.Split(new char[]{'.'})[i].ToLowerInvariant() == "www") continue;
//I use a ternary operator here...this could easily be converted to an if/else if you are of the ternary operators are evil crowd
subdomain.Append((i < uri.Host.Split(new char[]{'.'}).Length - 3 &&
uri.Host.Split(new char[]{'.'})[i+1].ToLowerInvariant() != "www") ?
uri.Host.Split(new char[]{'.'})[i] + "." :
uri.Host.Split(new char[]{'.'})[i]);
}
return subdomain.ToString();
}
}
USAGE:
var subDomain = Request.Url.GetSubDomain(GetSubDomainOption.ExcludeWWW);
or
var subDomain = Request.Url.GetSubDomain();
I currently have the default set to include the WWW. You could easily reverse this by switching the optional parameter value in the GetSubDomain() method.
In my opinion this allows for an option that looks nice in code and without digging in appears to be 'built-in' to c#. Just to confirm your expectations...I tested three values and this method will always return just the "yellowcpd" if the exclude flag is used.
www.yellowcpd.testpace.net
yellowcpd.testpace.net
www.yellowcpd.www.testpace.net
One assumption that I use is that...splitting the hostname on a . will always result in the last two values being the domain (i.e. something.com)
As others have pointed out, you can do something like this:
var req = new HttpRequest(filename: "search", url: "http://www.yellowcpd.testpace.net", queryString: "q=alaska");
var host = req.Url.Host;
var yellow = host.Split('.')[1];
The portion of the URL you want is part of the host name. You may hope to find some method that directly addresses that portion of the name, e.g. "the subdomain (yellowcpd) within testpace.net", but this is probably not possible, because the rules for valid host names allow for any number of labels, separated by periods (see Valid Host Names). You will have to add additional restrictions to get what you want, e.g. "Separate the host name into labels, discard www if present and take the next label".
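The restriction described above can be sketched in a few lines. The code assumes that the last domainLabelCount labels form the registered domain (two for testpace.net, three for something like example.co.uk); that parameter and the HostParsing class are hypothetical, and the method does not handle IP-address hosts.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class HostParsing
{
    public static string GetSubdomain(Uri uri, int domainLabelCount = 2)
    {
        string[] labels = uri.Host.Split('.');

        // Everything before the registered domain counts as subdomain labels.
        int subdomainLabelCount = Math.Max(0, labels.Length - domainLabelCount);

        var kept = labels
            .Take(subdomainLabelCount)
            .Where(label => !label.Equals("www", StringComparison.OrdinalIgnoreCase));

        // Returns an empty string when there is no subdomain.
        return string.Join(".", kept);
    }
}
```

For example, HostParsing.GetSubdomain(new Uri("http://www.yellowcpd.testpace.net")) yields yellowcpd, and a bare http://testpace.net yields an empty string.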

How do I verify that a string supplied file path is in a valid directory format?

I have a reasonably straight-forward question here but I seem to find myself revisiting each time I have to deal with the validation of file paths and names. So I'm wondering if there is a method available in System.IO or some other library in the framework that can make my life easier!?
Let's take the contrived example of a method that takes a file path and a file name and, from these inputs, formats and returns a unique full file location.
public string MakeFileNameUnique(string filePath, string fileName)
{
return filePath + Guid.NewGuid() + fileName;
}
I know that I must do the following to get the path in a correct format so that I can append the guid and filename:
if filePath is null or empty then throw exception
if filePath does not exist then throw exception
if no valid postfixed '/' then add one
if it contains a postfixed '\' then remove and replace with a '/'
Can someone tell me if there is a framework method that can handle this repetitive logic (particularly the forward slash/backslash handling)?
Are you looking for the Path.Combine method?
public string MakeFileNameUnique(string filePath, string fileName)
{
return Path.Combine(filePath, Guid.NewGuid().ToString(), fileName);
}
But looking at the name of your method (MakeFileNameUnique), have you considered using the Path.GetRandomFileName method? Or the Path.GetTempFileName method?
Following your requirements this will do
public string MakeFileNameUnique(string filePath, string fileName)
{
// This checks for nulls, empty or not-existing folders
if(!Directory.Exists(filePath))
throw new DirectoryNotFoundException();
// This joins together the filePath (with or without backslash)
// with the Guid and the file name passed (in the same folder)
// and replaces every backslash with a forward slash
return Path.Combine(filePath, Guid.NewGuid() + "_" + fileName).Replace("\\", "/");
}
a call with
string result = MakeFileNameUnique(@"d:\temp", "myFile.txt");
Console.WriteLine(result);
will result in
d:/temp/9cdb8819-bdbc-4bf7-8116-aa901f45c563_myFile.txt
However, I would like to know why you want to replace the backslashes with forward slashes.
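The point about trailing separators can be checked directly: Path.Combine inserts a separator only when the first part does not already end with one, so no manual slash handling is needed. The sketch below is one way to write the method under that assumption; the UniquePaths class is hypothetical, and using Path.GetFileName on the supplied name is an extra precaution not present in the answers.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original answers.
public static class UniquePaths
{
    public static string MakeFileNameUnique(string directory, string fileName)
    {
        if (string.IsNullOrEmpty(directory))
        {
            throw new ArgumentException("A directory is required.", nameof(directory));
        }
        if (!Directory.Exists(directory))
        {
            throw new DirectoryNotFoundException(directory);
        }

        // "N" formats the Guid as 32 hex digits without dashes.
        string uniqueName = Guid.NewGuid().ToString("N") + "_" + Path.GetFileName(fileName);

        // Works whether or not 'directory' ends with a separator.
        return Path.Combine(directory, uniqueName);
    }
}
```

The result uses the platform's own separator; any conversion to forward slashes would be a separate, explicit step.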

Trim all chars off file name after first "_"

I'd like to trim these purchase order file names (a few examples below) so that everything after the first "_" is omitted.
INCOLOR_fc06_NEW.pdf
Keep: INCOLOR (write this to db as the VendorID) Remove: _fc06_NEW.pdf
NORTHSTAR_sc09.xls
Keep: NORTHSTAR (write this to db as the VendorID) Remove: _sc09.xls
Our scenario: The managers are uploading these files to our intranet web server, to make them available to download, view, etc. I'm using Brettle's NeatUpload, and for each file uploaded, I am writing the file's attributes into the PO table (SQL 2000). The first part of the file name will be written to the DB as a VendorID.
The naming convention for these files is consistent in that the first part of the file name is always the vendor name (or Vendor ID), followed by an "_", then other unpredictable chars used to identify the type of Purchase Order, then the file extension, which is consistently either .xls, .XLS, .PDF, or .pdf.
I tried TrimEnd - but the array of chars that you have to provide ends up being long and can conflict with the part of the file name I want to keep. I have a feeling I'm not using TrimEnd properly.
What is the best way to use string.TrimEnd (or any other string manipulation in C#) that will strip off all chars after the first "_" ?
String s = "INCOLOR_fc06_NEW.pdf";
int index = s.IndexOf("_");
return index >= 0 ? s.Substring(0,index) : s;
I'll probably offend the anti-regex lobby, but here I go (ducking):
string stripped = Regex.Replace(filename, @"(?<=[^_]*)_.*",String.Empty);
This code will strip all extra characters after the first '_', unless there is no '_' in the string (then it will just return the original string).
It's one line of code. It's slower than the more elaborate IndexOf() algorithm, but when used in a non-performance-sensitive part of the code, it's a good solution.
Get your flame-throwers out...
TrimEnd removes trailing white space by default, or every trailing character that appears in the set you pass in, so it won't help you here. Read more about TrimEnd here:
http://msdn.microsoft.com/en-us/library/system.string.trimend.aspx
Bnaffas code (with a small tweak):
String fileName = "INCOLOR_fc06_NEW.pdf";
int index = fileName.IndexOf("_");
return index >= 0 ? fileName.Substring(0, index) : fileName;
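To see why TrimEnd is a poor fit, it helps to compare it with the IndexOf approach on a name whose last segment happens to consist of extension letters. The file name INCOLOR_fdp.pdf and the VendorNames class below are made up for illustration.

```csharp
using System;

// Hypothetical helper, not part of the original answers.
public static class VendorNames
{
    // TrimEnd treats its arguments as a set of characters, not as a suffix:
    // it keeps removing trailing characters while they are in the set.
    public static string TrimExtensionChars(string fileName)
    {
        return fileName.TrimEnd('.', 'p', 'd', 'f');
    }

    // Keeps everything before the first underscore, or the whole
    // string when there is no underscore.
    public static string GetVendorId(string fileName)
    {
        int index = fileName.IndexOf('_');
        return index >= 0 ? fileName.Substring(0, index) : fileName;
    }
}
```

With these definitions, TrimExtensionChars("INCOLOR_fdp.pdf") returns "INCOLOR_" because it eats into the name itself, while GetVendorId("INCOLOR_fc06_NEW.pdf") returns "INCOLOR".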
If you want to do something with the other parts, you could use a Split
string fileName = "INCOLOR_fc06_NEW.pdf";
string[] parts = fileName.Split('_');
public string StripOffStuff(string sInput)
{
int iIndex = sInput.IndexOf("_");
return (iIndex > 0) ? sInput.Substring(0, iIndex) : sInput;
}
// Call it like:
string sNewString = StripOffStuff("INCOLOR_fc06_NEW.pdf");
I would go with the Substring approach, but to round out the available solutions, here's a LINQ approach just for fun:
string filename = "INCOLOR_fc06_NEW.pdf";
string result = new string(filename.TakeWhile(c => c != '_').ToArray());
It'll return the original string if no underscore is found.
To go with all the "alternative" solutions, here's the second one that I thought of (after substring):
string filename = "INCOLOR_fc06_NEW.pdf";
string stripped = filename.Split('_')[0];
