Display string after certain words are found - c#

Basically what I'm trying to do is find the first string that starts with "/Game/Mods" but the problem is how do i tell the program where to end the string? here's an example what a string can look like: string example
As you can see the string starts with "/Game/Mods", i want it to end after the word "TamingSedative", the problem is that the ending word (TamingSedative)is different for every file it has to check, for example: example 2
There you can see that the ending word is now "WeapObsidianSword" (instead of TamingSedative) so basically the string has to end when it comes across the "NUL" but how do i specify that in c# code?

This a simple example using Regex.
Dim yourString As String = "/Game/Mods/TamingSedative/PrimalItemConsumable_TamingSedative"
Dim M As System.Text.RegularExpressions.Match = System.Text.RegularExpressions.Regex.Match(yourString, "/Game/Mods/(.+?)/")
MessageBox.Show(M.Groups(0).Value) 'This should show /Game/Mods/TamingSedative/
MessageBox.Show(M.Groups(1).Value) 'This should show TamingSedative

Since you need only the first occurance, this is the simplest solution I could think of:
(In case you cannot see the image, click on it to open in new tab)
EDIT:
In case the existence of a path like this is not guaranteed in the string, you can do an additional check before proceeding to use Substring, like this:
int exists = fullString.IndexOf("/Game/Mods");
if (exists == -1) return null;
Note: I have included "ENDED" in order to see in case any NULL chars have been included (white spaces)

From your comments: "the string just has to start at /Game/Mods and end when it reaches the whitespace".
In that case, you can easily get the matches using Linq, like this (assuming filePath is a string that has the path to your file):
var text = File.ReadAllText(filePath);
var matches = text.Split(null).Where(s => s.StartsWith("/Game/Mods"));
And, if you only need the first occurrence, it would be:
var firstMatch = matches.Any() ? matches.First() : null;
Check this post.

Related

trim string before character but still keep the remain part after it

So I have this string which I have to trim and manipulate a little with it.
My string example:
string test = "studentName_123.pdf";
Now, what I want to do is somehow extract only the _123 part and at the end I need to have studentName.pdf
What I have tried:
string test_extracted = test.Substring(0, test.LastIndexOf("_") )+".pdf";
This also works but the thing is that I don't want to add the ".pdf" suffix at the end of the string manually because I can have strings that are not pdf, for ex. studentName.docx , studentName.png.
So basically I just want the "_123" part removed but still keep the remain part after that.
I think this might help you:
string test = "studentName_123.pdf";
string test_extracted = test.Substring(0, test.LastIndexOf("_") )+ test.Substring(test.LastIndexOf("."),test.Length - test.LastIndexOf(".") );
Using Remove(int startIndex, int count):
string test = "studentName_123.pdf";
string test_extracted = test.Remove(test.LastIndexOf("_"), test.LastIndexOf(".") - test.LastIndexOf("_"));
Sounds like you mean something like this?
string extension = Path.GetExtension(test);
string pdfName = Path.GetFileNameWithoutExtension(test).Split('_')[0];
string fullName = pdfName + extension;
Since you know what value you will always be replacing in your strings, "_123", to base on your example, just utilize the replace method and replace it with nothing since the method expects two arguments;
string test_extracted = test.replace('_123', '');
This could be solved with a regular expression like this
(\w*)_.*(\.\w*) where the first capture group (\w*) matches everything before the underscore and the second group (\.\w*) matches the file extensions.
Lastly we just have to concat the groups without the stuff inbetween like so:
string test = "studentName_123.pdf";
var regex = Regex.Match(test, #"(\w*)_.*(\.\w*)");
string newString = regex.Groups[1].Value + regex.Groups[2].Value;

Remove escape from string c#

I have a path in my network who I can use in File Explorer without problems:
\\MyNetwork\Projects\16000
Now I want to access it using Directory.Exists as:
var normalFolderPath = #"\MyNetwork";
var number = #"\16000"
var a = Directory.Exists($#"{normalFolderPath}\{number}");
But this: $#"{normalFolderPath}\{number}" return \\MyNetwork\\16000 but if I try to access it that on File Explorer just can not found, but if I remove \ from subfolder like: \\MyNetwork\16000 it works!, how can I remove one \ from string in c#
You're complaining that your string has two slashes in the middle of it, but you put two slashes in the middle of it:
number is the literal string \16000
You asked c# to concatenate normalfolderpath with number, separated by slash: {normalFolderPath}\{number}
Naturally, you'll end up with two slashes; one from number, and one as the separator. To demo what I mean, here is an altered code:
var normalFolderPath = #"\MyNetwork";
var number = #"!16000"
var a = Directory.Exists($#"{normalFolderPath}={number}");
This will produce the string \MyNetwork=!16000: the = is the separator between the interpolated fields and ! came from the start of number
But this: $#"{normalFolderPath}\{number}" return \\MyNetwork\\16000
I disagree: it will definitely return \MyNetwork\\16000 with only one slash at the start and two in the middle. No way will that code you put, return something with two slashes at the start
As has been commented, you should use Path.Combine to combine path elements:
var normalFolderPath = #"\\MyNetwork";
var number = "16000"
var a = Directory.Exists(Path.Combine(normalFolderPath,number));

How to strip a string from the point a hyphen is found within the string C#

I'm currently trying to strip a string of data that is may contain the hyphen symbol.
E.g. Basic logic:
string stringin = "test - 9894"; OR Data could be == "test";
if (string contains a hyphen "-"){
Strip stringin;
output would be "test" deleting from the hyphen.
}
Console.WriteLine(stringin);
The current C# code i'm trying to get to work is shown below:
string Details = "hsh4a - 8989";
var regexItem = new Regex("^[^-]*-?[^-]*$");
string stringin;
stringin = Details.ToString();
if (regexItem.IsMatch(stringin)) {
stringin = stringin.Substring(0, stringin.IndexOf("-") - 1); //Strip from the ending chars and - once - is hit.
}
Details = stringin;
Console.WriteLine(Details);
But pulls in an Error when the string does not contain any hyphen's.
How about just doing this?
stringin.Split('-')[0].Trim();
You could even specify the maximum number of substrings using overloaded Split constructor.
stringin.Split('-', 1)[0].Trim();
Your regex is asking for "zero or one repetition of -", which means that it matches even if your input does NOT contain a hyphen. Thereafter you do this
stringin.Substring(0, stringin.IndexOf("-") - 1)
Which gives an index out of range exception (There is no hyphen to find).
Make a simple change to your regex and it works with or without - ask for "one or more hyphens":
var regexItem = new Regex("^[^-]*-+[^-]*$");
here -------------------------^
It seems that you want the (sub)string starting from the dash ('-') if original one contains '-' or the original string if doesn't have dash.
If it's your case:
String Details = "hsh4a - 8989";
Details = Details.Substring(Details.IndexOf('-') + 1);
I wouldn't use regex for this case if I were you, it makes the solution much more complex than it can be.
For string I am sure will have no more than a couple of dashes I would use this code, because it is one liner and very simple:
string str= entryString.Split(new [] {'-'}, StringSplitOptions.RemoveEmptyEntries)[0];
If you know that a string might contain high amount of dashes, it is not recommended to use this approach - it will create high amount of different strings, although you are looking just for the first one. So, the solution would look like something like this code:
int firstDashIndex = entryString.IndexOf("-");
string str = firstDashIndex > -1? entryString.Substring(0, firstDashIndex) : entryString;
you don't need a regex for this. A simple IndexOf function will give you the index of the hyphen, then you can clean it up from there.
This is also a great place to start writing unit tests as well. They are very good for stuff like this.
Here's what the code could look like :
string inputString = "ho-something";
string outPutString = inputString;
var hyphenIndex = inputString.IndexOf('-');
if (hyphenIndex > -1)
{
outPutString = inputString.Substring(0, hyphenIndex);
}
return outPutString;

Find and replace file lines

I have a text file with over 12,000 lines. In that file I need to replace certain lines.
Some lines begin with a ;, some have random words, some start with space. However, I am only concerned with the two types of lines I describe below.
I have a line like
SET avariable:0 ;Comments
and I need to replace it to look like
set aDIFFvariable:0 :Integer // comments
The only CASE that is necessary is in the word Integer I needs to be capitalized.
I also have
String aSTRING(7) ;Comment
that needs to look like
STRING aSTRING(7) :array [0..7] of AnsiChar; // Comments
I need to keep all the spacing the same.
Here is what I have so far
static void Main(string[] args)
{
string text = File.ReadAllText("C:\\old.txt");
text = text.Replace("old text", "new text");
File.WriteAllText("C:\\new.txt", text);
}
I think I need to use REGEX, which I have tried to make for my first example:
\s\s[set]\s*{4}.*[:0]\s*[;].* <-- I now know this is invalid - please advise
I need help with properly setting up my program to find and replace those lines. Should I read one line at a time and if it matches then do something? I am confused really as to where to start.
BRIEF pseudo code of what I want to do
//open file
//step through file
//if line == [regex] then add/replace as needed
//else, go to next line
//if EOF, close file
Taking a stab at this separately because each line is so radically different that capturing both in the same expression will be a nightmare.
To match your first example and replace it:
String input = "SET avariable:0 ;Comments";
if (Regex.IsMatch(input, #"\s?(set)\s*(\w+):?(\d)\s+;?(.*)?"))
{
input = Regex.Replace(input, #"\s?(set)\s*(\w+):?(\d)\s+;?(.*)?", "$1 $2:$3 :Integer // $4";
}
Give that a shot (Play with it here: http://regex101.com/r/zY7hV2)
To match your second example and replace it:
String input = "String aSTRING(7) ;Comments";
if (Regex.IsMatch(input, #"\s?(string)\s*(\w+)\((\d)\)\s*;(.*)"))
{
input = Regex.Replace(input, #"\s?(string)\s*(\w+)\((\d)\)\s*;(.*)", "$1 $2($3) :array [0..$3] of AnsiChar; // $4";
}
And play around with this one here: http://regex101.com/r/jO5wP5

Trim all chars off file name after first "_"

I'd like to trim these purchase order file names (a few examples below) so that everything after the first "_" is omitted.
INCOLOR_fc06_NEW.pdf
Keep: INCOLOR (write this to db as the VendorID) Remove: _fc08_NEW.pdf
NORTHSTAR_sc09.xls
Keep: NORTHSTAR (write this to db as the VendorID) Remove: _sc09.xls
Our scenario: The managers are uploading these files to our Intranet web server, to make them available to download/view ect. I'm using Brettles NeatUpload, and for each file uploaded, am writing the files attributes into the PO table (sql 2000). The first part of the file name will be written to the DB as a VendorID.
The naming convention for these files is consistent in that the the first part of the file is always the vendor name (or Vendor ID) followed by an "_" then other unpredictable chars used to identify the type of Purchase Order then the file extention - which is consistently either .xls, .XLS, .PDF, or .pdf.
I tried TrimEnd - but the array of chars that you have to provide ends up being long and can conflict with the part of the file name I want to keep. I have a feeling I'm not using TrimEnd properly.
What is the best way to use string.TrimEnd (or any other string manipulation in C#) that will strip off all chars after the first "_" ?
String s = "INCOLOR_fc06_NEW.pdf";
int index = s.IndexOf("_");
return index >= 0 ? s.Substring(0,index) : s;
I'll probably offend the anti-regex lobby, but here I go (ducking):
string stripped = Regex.Replace(filename, #"(?<=[^_]*)_.*",String.Empty);
This code will strip all extra characters after the first '_', unless there is no '_' in the string (then it will just return the original string).
It's one line of code. It's slower than the more elaborate IndexOf() algorithm, but when used in a non-performance-sensitive part of the code, it's a good solution.
Get your flame-throwers out...
TrimEnd removes white spaces and punctuation marks at the end of the String, it won't help you here. Read more about TrimEnd here:
http://msdn.microsoft.com/en-us/library/system.string.trimend.aspx
Bnaffas code (with a small tweak):
String fileName = "INCOLOR_fc06_NEW.pdf";
int index = fileName.IndexOf("_");
return index >= 0 ? fileName.Substring(0, index) : fileName;
If you want to do something with the other parts, you could use a Split
string fileName = "INCOLOR_fc06_NEW.pdf";
string[] parts = fileName.Split('_');
public string StripOffStuff(string sInput)
{
int iIndex = sInput.IndexOf("_");
return (iIndex > 0) ? sInput.Substring(0, iIndex) : sInput;
}
// Call it like:
string sNewString = StripOffStuff("INCOLOR_fc06_NEW.pdf");
I would go with the SubString approach but to round out the available solutions here's a LINQ approach just for fun:
string filename = "INCOLOR_fc06_NEW.pdf";
string result = new string(filename.TakeWhile(c => c != '_').ToArray());
It'll return the original string if no underscore is found.
To go with all the "alternative" solutions, here's the second one that I thought of (after substring):
string filename = "INCOLOR_fc06_NEW.pdf";
string stripped = filename.Split('_')[0];

Categories