How to get middle part of string between definite symbols? - c#

I have strings like:
"d:\tmp\abc_list.csv"
"d:\tmp\xyzx_list.csv"
"d:\tmp\qwert_list.csv"
I need to take the first part of the filename: abc, xyzx, qwert. Currently I do it like this:
string name = filename.Substring(filename.LastIndexOf('\\') + 1 , filename.IndexOf('_') - filename.LastIndexOf('\\') - 1);
I feel there should be easier and nicer way to do it. What is it?

Use the Path class:
string fullPath = @"d:\tmp\abc_list.csv";
string fileNameWOE = Path.GetFileNameWithoutExtension(fullPath);
string firstToken = fileNameWOE.Split('_').First();

Your solution is nice, but it is going to break if another part of the file name, say, part of its directory path, has an underscore. You should change it slightly to avoid this problem:
int pos = filename.LastIndexOf('\\') + 1;
string name = filename.Substring(pos , filename.IndexOf('_', pos) - pos);
When your solution is nice, robust, and easy to understand, there's no reason to go for a shorter solution. Of course, you can use a regular expression, but the resulting one-line solution is far less readable:
var res = Regex.Matches(s, @"(?<=\\)[^_\\]*(?=_[^\\]*$)")[0].Value;
Here is a demo of this solution on ideone.
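The index-based fix above can be wrapped in a small helper. This is only a sketch: the class and method names (FileNamePrefix, Extract) are made up for illustration, and it assumes that a name without an underscore should be returned whole rather than throwing.

```csharp
using System;

public static class FileNamePrefix
{
    // Returns the text between the last path separator and the first
    // underscore that follows it. If there is no underscore after the
    // separator, the rest of the string is returned (an assumption).
    public static string Extract(string filename)
    {
        // Accept both '\' and '/' so the result does not depend on the OS.
        int start = filename.LastIndexOfAny(new[] { '\\', '/' }) + 1;
        int end = filename.IndexOf('_', start);
        if (end < 0)
        {
            end = filename.Length;
        }
        return filename.Substring(start, end - start);
    }
}
```

Searching for the underscore only from start onward is what keeps an underscore in a directory name (for example d:\my_tmp\) from affecting the result.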

You can use the Path class and its Path.GetFileNameWithoutExtension method:
http://msdn.microsoft.com/en-us/library/system.io.path.getfilenamewithoutextension.aspx

Something like this:
string str = @"d:\tmp\abc_list.csv";
// The preferred way to manipulate paths is to use the Path.* methods
string str2 = Path.GetFileNameWithoutExtension(str);
int ix = str2.LastIndexOf('_');
if (ix != -1)
{
str2 = str2.Remove(ix);
}

Related

The fastest way to trim string in C#

I need to trim paths in million strings like this:
C:\workspace\my_projects\my_app\src\my_component\my_file.cpp
to
src\my_component\my_file.cpp
I.e. remove absolute part of the path, what is the fastest way to do that?
My try using regex:
Regex.Replace(path, @"(.*?)\src", ""),
I wouldn't go with regex for this, use the plain old method.
If the path prefix is always the same:
const string partToRemove = @"C:\workspace\my_projects\my_app\";
if (path.StartsWith(partToRemove, StringComparison.OrdinalIgnoreCase))
path = path.Substring(partToRemove.Length);
If the prefix is variable, you can get the last index of \src\:
var startIndex = path.LastIndexOf(@"\src\", StringComparison.OrdinalIgnoreCase);
if (startIndex >= 0)
path = path.Substring(startIndex + 1);
If you do use a regex, create it once with new and reuse it; there is a (significant) cost to constructing a regex:
string input = "This is text with far too much " +
"whitespace.";
string pattern = "\\s+";
string replacement = " ";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
I'm not sure if you need speed here, but if you always get the full path, you could do a simple .Substring()
var path = @"C:\workspace\my_projects\my_app\src\my_component\my_file.cpp";
Console.WriteLine(path.Substring(32));
However, I think you should sanitize your input first; in this case, the Uri class could do the parsing step:
var root = @"C:\workspace\my_projects\my_app\";
var path = @"C:\workspace\my_projects\my_app\src\my_component\my_file.cpp";
var relative = new Uri(root).MakeRelativeUri(new Uri(path));
Console.WriteLine(relative.OriginalString.Replace("/", "\\"));
Note that the Uri class replaces each \ with a /, which is why the .Replace call is needed.
I can't think of anything faster than this:
path.Substring(32);
The part before src is constant, and src starts at index 32.
C:\workspace\my_projects\my_app\src\my_component\my_file.cpp
                                ^
However, if the prefix is not always constant, you can find the index once and do the rest inside the loop.
int startInd = path.IndexOf(@"\src\") + 1;
// Do this inside loop. 1 million times
path.Substring(startInd);
If your files all end in "src/filename.ext", you could use the Path class in the .NET Framework and avoid the usual pitfalls with paths and file names:
result = @"src\" + Path.GetFileName(path);
Either way, you should first double-check that this conversion is really what takes too long.
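One way to check where the time goes is to time the candidates with System.Diagnostics.Stopwatch. The following is a rough sketch, not a rigorous benchmark: the class name TrimBenchmark, its method names, and the lookahead regex are illustrative assumptions, and a single Stopwatch loop ignores warm-up and JIT effects.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class TrimBenchmark
{
    // Built once and reused. Matches everything up to (but not including)
    // the first "src\" that directly follows a backslash.
    private static readonly Regex PrefixRegex =
        new Regex(@"^.*?\\(?=src\\)", RegexOptions.Compiled);

    // Plain string search; uses the LAST "\src\" in the path.
    public static string TrimWithSubstring(string path)
    {
        int i = path.LastIndexOf(@"\src\", StringComparison.OrdinalIgnoreCase);
        return i >= 0 ? path.Substring(i + 1) : path;
    }

    // Regex variant; uses the FIRST "\src\" in the path.
    public static string TrimWithRegex(string path)
    {
        return PrefixRegex.Replace(path, "");
    }

    // Runs the given trim function repeatedly and returns the elapsed time.
    public static TimeSpan Time(Func<string, string> trim, string path, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
        {
            trim(path);
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Calling TrimBenchmark.Time(TrimBenchmark.TrimWithSubstring, path, 1000000) and the same with TrimWithRegex gives two durations to compare on your own data. The two variants only agree when the path contains a single \src\ component.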

Getting substring between two separators in an arbitrary position

I have following string:
string source = "Test/Company/Business/Department/Logs.tvs/v1";
The / character is the separator between various elements in the string. I need to get the last two elements of the string. I have following code for this purpose. This works fine. Is there any faster/simpler code for this?
CODE
static void Main()
{
string component = String.Empty;
string version = String.Empty;
string source = "Test/Company/Business/Department/Logs.tvs/v1";
if (!String.IsNullOrEmpty(source))
{
String[] partsOfSource = source.Split('/');
if (partsOfSource != null)
{
if (partsOfSource.Length > 2)
{
component = partsOfSource[partsOfSource.Length - 2];
}
if (partsOfSource.Length > 1)
{
version = partsOfSource[partsOfSource.Length - 1];
}
}
}
Console.WriteLine(component);
Console.WriteLine(version);
Console.Read();
}
Why no regular expression? This one is fairly easy:
.*/(?<component>.*)/(?<version>.*)$
You can even label your groups so for your match all you need to do is:
component = myMatch.Groups["component"].Value;
version = myMatch.Groups["version"].Value;
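For completeness, here is a minimal sketch of how the named-group pattern might be used end to end. The class name NamedGroupDemo and the TryParse signature are illustrative assumptions; only the pattern itself comes from the answer above.

```csharp
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Same pattern as above, built once and reused.
    private static readonly Regex Pattern =
        new Regex(@".*/(?<component>.*)/(?<version>.*)$");

    // Returns false (and empty strings) when the input has fewer than
    // two '/' separators, because the pattern then cannot match.
    public static bool TryParse(string source, out string component, out string version)
    {
        Match m = Pattern.Match(source);
        component = m.Success ? m.Groups["component"].Value : "";
        version = m.Success ? m.Groups["version"].Value : "";
        return m.Success;
    }
}
```

Because the leading .* is greedy, the two named groups end up holding the last two elements rather than the first two.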
The following should be faster, as it only scans as much of the string as it needs to to find two / and it doesn't bother splitting up the whole string:
string component = "";
string version = "";
string source = "Test/Company/Business/Department/Logs.tvs/v1";
int last = source.LastIndexOf('/');
if (last != -1)
{
int penultimate = source.LastIndexOf('/', last - 1);
version = source.Substring(last + 1);
component = source.Substring(penultimate + 1, last - penultimate - 1);
}
That said, as with all performance questions: profile! Try the two side-by-side with a big list of real-life inputs and see which is fastest.
(Also, this will leave empty strings rather than throw an exception if there is no slash in the input... but throw if source is null, lazy me.)
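A slightly more defensive variant of the LastIndexOf approach might look like the sketch below. The class name LastTwoSegments and the tuple return type are assumptions for illustration (tuples need C# 7 or later). It returns empty strings for null or slash-free input, and it handles a string whose only slash is at index 0, where last - 1 would otherwise be an out-of-range start index.

```csharp
using System;

public static class LastTwoSegments
{
    // Returns the last two '/'-separated elements of source.
    // Missing elements are returned as empty strings (an assumption).
    public static (string Component, string Version) Parse(string source)
    {
        if (string.IsNullOrEmpty(source))
        {
            return ("", "");
        }

        int last = source.LastIndexOf('/');
        if (last < 0)
        {
            return ("", "");
        }

        string version = source.Substring(last + 1);
        if (last == 0)
        {
            // The only slash is the first character: nothing precedes it.
            return ("", version);
        }

        // Search backwards starting just before the last slash.
        int penultimate = source.LastIndexOf('/', last - 1);
        string component = source.Substring(penultimate + 1, last - penultimate - 1);
        return (component, version);
    }
}
```

When there is only one slash, penultimate is -1, so the component is simply everything before that slash.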
Your approach is the most suitable one given that you are looking for substrings at a particular index. A LINQ expression to do the same in this case will likely not improve the code or its readability.
For reference, there is some great information from Microsoft here on working with strings and LINQ. In particular see the article here which covers some examples with both LINQ and RegEx.
EDIT: +1 For Matt's named group within RegEx approach... that's the nicest solution I've seen.
Your code mostly looks fine. A couple of points to note:
String.Split() will never return null, so you don't need the null check on it.
If the source string has fewer than two / characters, how would you deal with that? (The Original Post was updated to address this)
Do you really want to just output empty strings if your source string is null or empty (or invalid)? If you have specific expectations about the nature of the input, you may want to consider failing fast when those expectations are not met.
You could try something like this, but I doubt it would be much faster. You could take some measurements with System.Diagnostics.Stopwatch to see if you feel the need.
string source = "Test/Company/Business/Department/Logs.tvs/v1";
int index1 = source.LastIndexOf('/');
string last = source.Substring(index1 + 1);
string substring = source.Substring(0, index1);
int index2 = substring.LastIndexOf('/');
string secondLast = substring.Substring(index2 + 1);
I would try
string source = "Test/Company/Business/Department/Logs.tvs/v1";
var components = source.Split('/').Reverse().Take(2);
String last = string.Empty;
var enumerable = components as string[] ?? components.ToArray();
if (enumerable.Count() == 2)
last = enumerable.FirstOrDefault();
var secondLast = enumerable.LastOrDefault();
Hope this will help
You can retrieve the last two elements as follows:
string source = "Test/Company/Business/Department/Logs.tvs/v1";
String[] partsOfSource = source.Split('/');
if (partsOfSource.Length > 2)
{
for (int i = partsOfSource.Length - 2; i <= partsOfSource.Length - 1; i++)
Console.WriteLine(partsOfSource[i]);
}

Shorthand way to remove last forward slash and trailing characters from string

If I have the following string:
/lorem/ipsum/dolor
and I want this to become:
/lorem/ipsum
What is the short-hand way of removing the last forward slash, and all characters following it?
I know how I can do this by splitting the string into a List<> and removing the last item, and then joining, but is there a shorter way of writing this?
My question is not URL specific.
You can use Substring() and LastIndexOf():
str = str.Substring(0, str.LastIndexOf('/'));
EDIT (suggested comment)
To prevent any issues when the string may not contain a /, you could use something like:
int lastSlash = str.LastIndexOf('/');
str = (lastSlash > -1) ? str.Substring(0, lastSlash) : str;
Storing the position in a temp-variable would prevent the need to call .LastIndexOf('/') twice, but it could be dropped in favor of a one-line solution instead.
If there is '/' at the end of the url, remove it.
If not; just return the original one.
var url = this.Request.RequestUri.ToString();
url = url.EndsWith("/") ? url.Substring(0, url.Length - 1) : url;
url += @"/mycontroller";
You can do something like str.Remove(str.LastIndexOf("/")), but there is no built-in method to do what you want.
Edit: you could also use the Uri object to traverse directories, although it does not give exactly what you want:
Uri baseUri = new Uri("http://domain.com/lorem/ipsum/dolor");
Uri myUri = new Uri(baseUri, ".");
// myUri now contains http://domain.com/lorem/ipsum/
One simple way would be
String s = "domain.com/lorem/ipsum/dolor";
s = s.Substring(0, s.LastIndexOf('/'));
Console.WriteLine(s);
Another option, if you only need to strip trailing '/' characters (it leaves this particular input unchanged):
String s = "domain.com/lorem/ipsum/dolor";
s = s.TrimEnd('/');
Console.WriteLine(s);
You can use the regex /[^/]*$ and replace the match with the empty string:
var trimmed = new Regex("/[^/]*$").Replace("domain.com/lorem/ipsum/dolor", "");
But it's probably overkill here. @newfurniturey's answer of Substring with LastIndexOf is probably best.
I like to create a String Extension for stuff like this:
/// <summary>
/// Returns with suffix removed, if present
/// </summary>
public static string TrimIfEndsWith(
this string value,
string suffix)
{
return
value.EndsWith(suffix) ?
value.Substring(0, value.Length - suffix.Length) :
value;
}
You can then use like this:
var myString = "/lorem/ipsum/dolor";
var myStringClean = myString.TrimIfEndsWith("/dolor");
You now have a re-usable extension across all of your projects that can be used to remove one trailing character or multiple.
using System.IO;
// TrimEnd returns a new string, so assign the result.
mystring = mystring.TrimEnd(Path.AltDirectorySeparatorChar); // To remove "/"
mystring = mystring.TrimEnd(Path.DirectorySeparatorChar); // To remove "\" (on Windows)
while (input.Length > 0 && (input.Last() == '/' || input.Last() == '\\'))
{
input = input.Substring(0, input.Length - 1);
}
Thank you @Curt for your question.
I slightly improved @newfurniturey's code, and here is my version.
if(str.Contains('/')){
str = str.Substring(0, str.LastIndexOf('/'));
}
I'm way late to the party, but if you're using C# 8.0+, another clean approach would be to use the range operator:
if (urlStr.EndsWith("/")) urlStr = urlStr[..^1];
If you're curious as to how this works, take a look at the spec for ranges in C#:
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-8.0/ranges
tl;dr: urlStr[..^1] roughly translates to something along the lines of "give me the substring that runs from index 0 up to, but not including, the last character".
In other words, it's similar to...
urlStr.Substring(0, urlStr.Length-1)
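The range syntax can also express the original task of dropping the last slash and everything after it. This is a sketch only: the names PathSegments and RemoveLastSegment are hypothetical, and it assumes C# 8.0 or later on a runtime that provides System.Range (for example .NET Core 3.0+).

```csharp
public static class PathSegments
{
    // Removes the last '/' and everything after it.
    // Strings without a '/' are returned unchanged.
    public static string RemoveLastSegment(string s)
    {
        int lastSlash = s.LastIndexOf('/');
        // s[..lastSlash] is equivalent to s.Substring(0, lastSlash).
        return lastSlash < 0 ? s : s[..lastSlash];
    }
}
```

For the input from the question, PathSegments.RemoveLastSegment("/lorem/ipsum/dolor") gives "/lorem/ipsum".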

String functions

I want to search for a given string within another string (e.g., find whether "something" exists inside "something like this"). How can I do the following?
1. Know the position at which "something" is located (in the current example, this is 0).
2. Extract everything to the left or to the right, up to the character found (see 1).
3. Extract a substring beginning where the sought string was found and running for X characters (in Visual Basic 6/VBA I would use the Mid function).
string searched = "something like this";
1.
int pos = searched.IndexOf("something");
2.
string start = searched.Substring(0, pos);
string endstring = searched.Substring(pos);
3.
string mid = searched.Substring(pos, x);
Have you looked at the String.Substring() method? You can use the IndexOf() method to see if the substring exists first.
Take a look at the System.String member functions, in particular the IndexOf method.
Use int String.IndexOf(String).
I would do something like this:
string s = "I have something like this";
//question No. 1
int pos = s.IndexOf("something");
//question No. 2
string[] separator = {"something"};
string[] leftAndRightEntries = s.Split(separator, StringSplitOptions.None);
//question No. 3
int x = pos + 10;
string substring = s.Substring(pos, x);
I would avoid using Split, as it's designed to give you multiple results. I would stick with the code in the first example, though the second block should actually read...
string start = searched.Substring(0, pos);
string endstring;
if(pos < searched.Length - 1)
endstring = searched.Substring(pos + "something".Length);
else
endstring = string.Empty;
The key difference is accounting for the length of the string to find (hence the rather odd-looking "something".Length, as this example is designed for you to be able to plop in your own variable).
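Putting the three steps together, a minimal sketch could look like this. The class name SubstringDemo, the method Locate, and the tuple return type are illustrative assumptions (tuples need C# 7 or later); count is assumed to be non-negative and is clamped so Substring never runs past the end of the text.

```csharp
using System;

public static class SubstringDemo
{
    // Finds sought inside text and returns:
    //   Pos   - index of the first occurrence, or -1 if not found
    //   Left  - everything before the occurrence
    //   Right - everything after the occurrence
    //   Mid   - up to count characters starting at the occurrence
    public static (int Pos, string Left, string Right, string Mid) Locate(
        string text, string sought, int count)
    {
        int pos = text.IndexOf(sought, StringComparison.Ordinal);
        if (pos < 0)
        {
            return (-1, "", "", "");
        }

        string left = text.Substring(0, pos);
        string right = text.Substring(pos + sought.Length);

        // Clamp the length so the call cannot exceed the string bounds.
        int length = Math.Min(count, text.Length - pos);
        string mid = text.Substring(pos, length);

        return (pos, left, right, mid);
    }
}
```

The clamping mirrors the forgiving behavior of the VB6 Mid function, which returns whatever is available when the requested length is too long.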

C# Using Substring, how do I extract this string?

I want to extract the first folder in the URL below, in this example it is called 'extractThisFolderName' but the folder could have any name and be any length. With this in mind how can I use substring to extract the first folder name?
The string: www.somewebsite.com/extractThisFolderName/leave/this/behind
String folderName = path.Substring(path.IndexOf(@"/"),XXXXXXXXXXX);
It's the length I'm struggling with.
If you're getting a Uri, why not just use uri.Segments[1]? (For an absolute Uri, Segments[0] is the leading "/", and each segment keeps its trailing slash.)
Or even path.Split(new Char[] { '/' })[1] ?
If you're going to be using each path part, you can use:
String[] parts = path.Split('/');
At which point you can access the "extractThisFolderName" part by accessing parts[1].
Alternatively, you can do this to splice out the foldername:
int firstSlashIndex = path.IndexOf('/');
int secondSlashIndex = path.IndexOf('/', firstSlashIndex + 1);
String folderName = path.Substring(firstSlashIndex + 1, secondSlashIndex - firstSlashIndex - 1);
Daniel's answer gives you other practical ways of doing it. Another alternative using substring:
int start = path.IndexOf('/')+1; // Note that you don't need a verbatim string literal
int secondSlash = path.IndexOf('/', start);
return path.Substring(start, secondSlash-start);
You'll want to add some error checking in there, of course :)
The problem also lends itself to regular expressions. An expression like:
(?<host>.*?)/(?<folder>.*?)/
Is clear about what's going on and you can get the data out by those names.
int start = path.IndexOf('/');
int end = path.IndexOf('/', start + 1);
if (end == -1) end = path.Length;
string folderName = path.Substring(start + 1, end - start - 1);
EDIT: Daniel Schaffer's answer about using uri segments is preferable, but left this in as it may be your path is not really a valid uri.
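If the input can be turned into a valid absolute URI, the segment-based approach might be sketched as follows. The names FirstFolder and FromUrl are hypothetical, and prepending "http://" to scheme-less input is purely an assumption made here so that the example string from the question can be parsed.

```csharp
using System;

public static class FirstFolder
{
    // Returns the first path segment of the URL, or "" if there is none.
    public static string FromUrl(string url)
    {
        // Assumption: input without a scheme is treated as an http URL,
        // because new Uri(...) rejects strings like "www.example.com/x".
        if (!url.Contains("://"))
        {
            url = "http://" + url;
        }

        Uri uri = new Uri(url);

        // Segments[0] is "/"; Segments[1] is the first folder including
        // its trailing "/", which is trimmed off here.
        return uri.Segments.Length > 1 ? uri.Segments[1].TrimEnd('/') : "";
    }
}
```

Unlike the index arithmetic, this also copes with input that has no folder at all, returning an empty string instead of throwing.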
You could do:
string myStr = "www.somewebsite.com/extractThisFolderName/leave/this/behind";
int startIndex = myStr.IndexOf('/') + 1;
int length = myStr.IndexOf('/', startIndex) - startIndex;
Console.WriteLine(myStr.Substring(startIndex, length));
Also, I assume this is being done in ASP.NET; if so, I think there might be another way to get this without doing the querying.
folderName.Split('/')[1]
