Split values in arrays - c#

I have a long string from which I want to store the keywords in an array or collection. The string is formatted like this:
Title: My Test Page Title.
Desc: My page description.
Keywords: Bessel function, legendre function, Differential Equations, Bessel, Legendre, Homogenous, Assignment & Maths Homework Help.
Bessel & Legendre Function:
Homogenous Equations of the second order of the type
+ x + ( - )y = 0, v [0, ), x [0, )………………….(1)
(1 - ) - 2x + n (n + 1)y = 0, n = 1, 2 ……, x (-1, 1)…………………(2)
In this string, I want to store all the keywords in an array/collection, split on commas.
My problem is how to find the start and end points of the keyword list. I can get the start from Keywords:, but what should the end point be? There is no fixed format;
the only thing that is fixed is that a new paragraph follows the Keywords section.
Can anyone suggest a regular expression for this?

Seems like you should first split the string into lines.
And then the line that starts with Keywords: holds your keywords.
You can use the string.Split() method to split into lines as well as for breaking out the keywords.
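
The line-based approach described above could be sketched roughly as follows. The class and method names (KeywordExtractor, Extract) are made up for illustration, and the sketch assumes the whole keyword list sits on a single line that begins with "Keywords:" and may end with a full stop.

```csharp
using System;
using System.Linq;

static class KeywordExtractor
{
    // Hypothetical helper: returns the comma-separated keywords found on the
    // first line that starts with "Keywords:", or an empty array if there is none.
    public static string[] Extract(string data)
    {
        // Split on both Windows and Unix line endings.
        string[] lines = data.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        string keywordLine = lines.FirstOrDefault(
            l => l.TrimStart().StartsWith("Keywords:", StringComparison.OrdinalIgnoreCase));
        if (keywordLine == null)
            return new string[0];

        // Drop the "Keywords:" label, then split the rest on commas.
        string list = keywordLine.TrimStart().Substring("Keywords:".Length);

        return list.Split(',')
                   .Select(k => k.Trim().TrimEnd('.')) // remove padding and a trailing full stop
                   .Where(k => k.Length > 0)
                   .ToArray();
    }

    static void Main()
    {
        string data = "Title: My Test Page Title.\nDesc: My page description.\n" +
                      "Keywords: Bessel function, Legendre, Homogenous.\n" +
                      "Bessel & Legendre Function:";
        Console.WriteLine(string.Join("|", Extract(data)));
        // prints: Bessel function|Legendre|Homogenous
    }
}
```

If the keyword list can wrap over several lines, this sketch would miss the continuation lines, and searching for the next paragraph break would be the safer choice.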

It also looks like the Keywords section ends with a full stop, so you could find the next full stop, i.e. IndexOf(".") after the "Keywords:" marker.

I think this should do:
string afterKeywords = data.Substring(data.IndexOf("Keywords:") + 9);
string beforeNextPara = afterKeywords.Substring(0, afterKeywords.IndexOf(Environment.NewLine + Environment.NewLine));
var dataWeNeed = beforeNextPara.Split(',');

Related

How do I Split string only at last occurrence of special character and use both sides after split

I want to split a string only at last occurrence of special character.
I try to parse a name of a tab from browser, so my initial string looks for example like this:
Untitled - Google Chrome
That is easy to solve as there is a Split function. Here is my implementation:
var pageparts= Regex.Split(inputWindow.ToString(), " - ");
InsertWindowName(pageparts[0].ToString(), pageparts[1].ToString());//method to save string into separate columns in DB
This works, but problem occurs, when I get a page like this:
SQL injection - Wikipedia, the free encyclopedia - Mozilla Firefox
Here there are two dashes, which means that after the split there are three separate strings in the array. If I continued as before, the first database column would contain "SQL injection" and the second column "Wikipedia, the free encyclopedia". The last value would be left out completely.
What I want is for the first column in the database to have the value:
"SQL injection - Wikipedia, the free encyclopedia" and the second column to have:
"Mozilla Firefox". Is that somehow possible?
I tried Split(" - ").Last() (and LastOrDefault() too), but then I only got the last string. I need both sides of the original string, separated at the last dash.
You can use String.Substring with String.LastIndexOf:
string str = "SQL injection - Wikipedia, the free encyclopedia - Mozilla Firefox";
int lastIndex = str.LastIndexOf('-');
if (lastIndex + 1 < str.Length)
{
string firstPart = str.Substring(0, lastIndex);
string secondPart = str.Substring(lastIndex + 1);
}
Create an extension method (or a simple method) to perform that operation, and also add some error checking for lastIndex.
EDIT:
If you want to split on " - " (space, hyphen, space), then use the following to calculate lastIndex:
string str = "FirstPart - Mozzila Firefox-somethingWithoutSpace";
string delimiter = " - ";
int lastIndex = str.LastIndexOf(delimiter);
if (lastIndex + delimiter.Length < str.Length)
{
string firstPart = str.Substring(0, lastIndex);
string secondPart = str.Substring(lastIndex + delimiter.Length);
}
So for string like:
"FirstPart - Mozzila Firefox-somethingWithoutSpace"
Output would be:
FirstPart
Mozzila Firefox-somethingWithoutSpace
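
The extension method with error checking suggested above might look something like the sketch below. The names StringSplitExtensions and TrySplitAtLast are invented for this example, and it assumes the delimiter is a non-empty string.

```csharp
using System;

static class StringSplitExtensions
{
    // Hypothetical extension method: splits 'source' at the last occurrence of 'delimiter'.
    // Returns false (with the whole string in 'left' and an empty 'right')
    // when the delimiter does not occur at all.
    public static bool TrySplitAtLast(this string source, string delimiter,
                                      out string left, out string right)
    {
        int lastIndex = source.LastIndexOf(delimiter, StringComparison.Ordinal);
        if (lastIndex < 0)
        {
            left = source;
            right = string.Empty;
            return false;
        }

        left = source.Substring(0, lastIndex);
        right = source.Substring(lastIndex + delimiter.Length);
        return true;
    }
}

static class Program
{
    static void Main()
    {
        string title = "SQL injection - Wikipedia, the free encyclopedia - Mozilla Firefox";
        string page, browser;
        if (title.TrySplitAtLast(" - ", out page, out browser))
        {
            Console.WriteLine(page);    // SQL injection - Wikipedia, the free encyclopedia
            Console.WriteLine(browser); // Mozilla Firefox
        }
    }
}
```

Returning a bool keeps the "delimiter not found" case explicit, so the caller can decide what to store when a window title has no browser suffix.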
Please forgive my laziness in this solution. I'm sure there is a better approach, but here is one proposal. I'm assuming you are coding in C#.
First of all, correct me if I have misunderstood the question: you want just two columns returned, the first containing all the text before the last dash (even if it includes other dashes) and the second containing all the text after the last dash. If that's right, let's do it.
// I only use Split when I want every piece in a separate variable (array position). In your case I assumed you want just 2 values (if possible), so you can use Substring.
static void Main(string[] args)
{
string firstname = "";
string lastName = "";
string variablewithdata = "SQL injection - Wikipedia, -the free encyclopedia - Mozilla Firefox";
// LastIndexOf('-') returns the position of the last '-' character, or -1 if there is none.
int lastDash = variablewithdata.LastIndexOf('-');
// Only take the substrings when a dash was found; the -1 offset assumes a space before the dash.
if (lastDash > 0)
{
firstname = variablewithdata.Substring(0, lastDash - 1);
lastName = variablewithdata.Substring(lastDash + 1);
}
Console.WriteLine("FirstColumn: {0} \nLastColumn:{1}",firstname,lastName);
Console.ReadLine();
}
If that's not what you want, can you explain what should be returned for, say, "SQL injection - Wikipedia,- the free - encyclopedia - Mozilla Firefox"?
Forgive me for the unclean code, I'm bored today.
If you don't care about reassembling strings, you could use something like :
var pageparts= Regex.Split(inputWindow.ToString(), " - ");
var firstPart = string.Join(" - ", pageparts.Take(pageparts.Length - 1));
var secondPart = pageparts.Last();
InsertWindowName(firstPart, secondPart);

Remove a linebreak in c#

I have the following lines of text:
W&BL 15&384&320&214&1&S235JR&&&&&&&&&&S&&0.267&&&&4&&
N&214.nc
A&214&1&&15
W&BL 15&384&320&215&1&S235JR&&&&&&&&&&S&&0.267&&&&4&&
N&215.nc
A&213&2&&14
I want to remove the linebreaks so the outcome will be like this:
A&213&2&&14W&BL 15&384&320&214&1&S235JR&&&&&&&&&&S&&0.267&&&&4&&N&214.nc
A&214&1&&15W&BL 15&384&320&215&1&S235JR&&&&&&&&&&S&&0.267&&&&4&&N&215.nc
I'm doing this because I need to format these lines, and I'm reading the whole text file line by line. With the line breaks in place I can't search through the lines properly, since I need to delete everything after the S235JR, replace the & with ;, and start the line with the BL code.
If someone knows a smarter/better solution to filter these lines, you will be my hero of the day.
Edit for clarification:
This is a example and how it needs to be formatted:
H&HEA100&1712&&1001&2&S235JR&&&HEA100 - 1712&&&&&&&S&&0.96&&&&2&&&1.7&0.2&0.2
N&1001.nc
W&BL 15&384&320&215&1&S235JR&&&&&&&&&&S&&0.267&&&&4&&
N&215.ncA&214&1&&15
H&L80X8&375&&1010&1&S275JR&&&L80X8 - 375&&&&&&&S&&0.117&&&&4&&&0.4&0.1&0.1
N&1010.nc
After formatting:
H;HEA100;1712;;1001;2;S235JR;
BL 15;384;320;215;1;S235JR;
L80X8;375;;1010;1;S275JR;
The input is a text file imported with a StreamReader. The H, BL 15 and L80X8 are determined after 6 & characters. The program was originally written in DOS and I need to convert it into C#. I'm sorry for the confusion.
x = x.Replace(Environment.NewLine, String.Empty);
where x is a string. Strings are immutable, so the result of Replace has to be assigned back.
From the example in your question, it looks like you want to remove line breaks, but keep every third line break.
You can use a regular expression that matches three lines, and remove the two line breaks between them:
text = Regex.Replace(text, @"(.+)\r\n(.+)\r\n(.+)", "$1$2$3");
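
As a rough, self-contained illustration of the three-line idea, the sketch below wraps the replacement in a hypothetical JoinTriples method. It differs from the one-liner in two assumed ways: it accepts both \r\n and \n line endings, and it uses [^\r\n]+ instead of .+ so that a carriage return can never end up inside a captured line. The sample records are shortened.

```csharp
using System;
using System.Text.RegularExpressions;

static class LineJoiner
{
    // Hypothetical helper: joins each run of three consecutive lines into one,
    // keeping the line break that follows every third line.
    public static string JoinTriples(string text)
    {
        return Regex.Replace(
            text,
            @"([^\r\n]+)\r?\n([^\r\n]+)\r?\n([^\r\n]+)",
            "$1$2$3");
    }

    static void Main()
    {
        // Shortened sample records, three lines each.
        string text = "W&BL 15&384\r\nN&214.nc\r\nA&214&1&&15\r\n" +
                      "W&BL 15&385\r\nN&215.nc\r\nA&213&2&&14";
        Console.WriteLine(JoinTriples(text));
        // prints:
        // W&BL 15&384N&214.ncA&214&1&&15
        // W&BL 15&385N&215.ncA&213&2&&14
    }
}
```

This only works when every record really spans exactly three non-empty lines; a record with a missing line would shift all the groups after it.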
x = x.Replace("\r\n", "");
where x is your string object.
x = x.Replace("&", ";");
string x1 = x.Substring(x.IndexOf("H;"), x.IndexOf("S235JR;", x.IndexOf("H;")) - x.IndexOf("H;")+7);
string x2 = x.Substring(x.IndexOf("BL 15;"), x.IndexOf("S235JR;", x.IndexOf("BL 15;")) - x.IndexOf("BL 15;")+7);
string x3 = x.Substring(x.IndexOf("L80X8;"), x.IndexOf("S275JR;", x.IndexOf("L80X8;")) - x.IndexOf("L80X8;")+7);
string result = x1 + "\r\n" + x2 + "\r\n" + x3;

How to extract range of characters from a string

If I have a string such as the following:
String myString = "SET(someRandomName, \"hi\", u)";
where I know that "SET(" will always exists in the string, but the length of "someRandomName" is unknown, how would I go about deleting all the characters from "(" to the first instance of """? So to re-iterate, I would like to delete this substring: "SET(someRandomName, \"" from myString.
How would I do this in C#.Net?
EDIT: I don't want to use regex for this.
Provided the string always has this structure, the easiest way is to use String.IndexOf() to look up the index of the first occurrence of ". String.Substring() then gives you the appropriate portion of the original string.
Likewise you can use String.LastIndexOf() to find the index of the first " from the end of the string. Then you will be able to extract just the value of the second argument ("hi" in your sample).
You will end up with something like this:
int begin = myString.IndexOf('"');
int end = myString.LastIndexOf('"');
string secondArg = myString.Substring(begin, end - begin + 1);
This will yield "\"hi\"" in secondArg.
UPDATE: To remove a portion of the string, use the String.Remove() method:
int begin = myString.IndexOf('(');
int end = myString.IndexOf('"');
string altered = myString.Remove(begin + 1, end - begin - 1);
This will yield "SET(\"hi\", u)" in altered.
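
A slightly more defensive variant of the Remove() approach is sketched below. The class and method names are hypothetical, and returning the input unchanged when either character is missing is an assumption about the desired behaviour, not something the question specifies.

```csharp
using System;

static class ArgumentRemover
{
    // Hypothetical helper: removes everything between the first '(' and the
    // first '"' that follows it. Returns the input unchanged if either is missing.
    public static string RemoveFirstArgument(string input)
    {
        int open = input.IndexOf('(');
        if (open < 0) return input;

        // Start searching just after the parenthesis.
        int quote = input.IndexOf('"', open + 1);
        if (quote < 0) return input;

        // Remove the characters strictly between '(' and '"'.
        return input.Remove(open + 1, quote - open - 1);
    }

    static void Main()
    {
        Console.WriteLine(RemoveFirstArgument("SET(someRandomName, \"hi\", u)"));
        // prints: SET("hi", u)
    }
}
```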
I know it's been years, but .NET has also evolved in the meantime.
Consider using the range operator, in case anyone is looking here for an answer.
Assuming that SET( and "hi", u) are constant (4 and 8 characters respectively, not counting the escapes):
var sub = myString[4..^8];
myString = myString.Replace(sub, replaceValue);
There are more examples and a good explanation in this article or, of course, in the Microsoft docs.
This is pretty awful, but this will accomplish what you want with a simple linq statement. Just presenting as an alternative to the IndexOf answers.
string myString = "SET(someRandomName, \"hi\", 0)";
string fixedStr = new String( myString.ToCharArray().Take( 4 ).Concat( myString.ToCharArray().SkipWhile( c => c != '"' ) ).ToArray() );
yields: SET("hi", 0)
Note: the number of characters kept at the start is hard-coded to 4 (the length of SET(); you could alter it to work from an array containing the expected prefix characters instead.
I assume you want to transform
SET(someRandomName, "hi", u)
into:
SET(u)
To achieve that, you can use:
String newString = "SET(" + myString.Substring(myString.LastIndexOf(',') + 1).Trim();
To explain this bit by bit:
myString.LastIndexOf(',')
will give you the index (position) of your last , character. Increment it by 1 to get the start index of the third argument in your SET function.
myString.Substring(myString.LastIndexOf(',') + 1)
The Substring method will eliminate all characters up to the specified position. In this case, we’re eliminating everything up to (and including) the last ,. In the example above, this would eliminate the SET(someRandomName, "hi", part, and leave us with u).
The Trim is necessary simply to remove the leading space character before your u.
Finally, we prepend SET( to our substring (since we had formerly removed it due to our Substring).
Edit: Based on your comment below (which contradicts what you asked in your question), you can use:
String newString = "SET(" + myString.Substring(myString.IndexOf(',') + 1).Trim();

Extracting values from a string in C#

I have the following string which i would like to retrieve some values from:
============================
Control 127232:
map #;-
============================
Control 127235:
map $;NULL
============================
Control 127236:
I want to take only the Control numbers. Is there a way to extract them from the string above into an array like [127232, 127235, 127236]?
One way of achieving this is with regular expressions, which does introduce some complexity but will give the answer you want with a little LINQ for good measure.
Start with a regular expression to capture, within a group, the data you want:
var regex = new Regex(@"Control\s+(\d+):");
This will look for the literal string "Control" followed by one or more whitespace characters, followed by one or more numbers (within a capture group) followed by a literal string ":".
Then capture matches from your input using the regular expression defined above:
var matches = regex.Matches(inputString);
Then, using a bit of LINQ you can turn this to an array
var arr = matches.OfType<Match>()
.Select(m => long.Parse(m.Groups[1].Value))
.ToArray();
Now arr is an array of longs containing just the numbers.
Live example here: http://rextester.com/rundotnet?code=ZCMH97137
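
Put together, the regex steps above could be sketched as one small program. ControlIdParser and Parse are names chosen for this example only, and the sample input is rebuilt from the question with \n line endings assumed.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class ControlIdParser
{
    // Hypothetical helper: returns every number that follows the word "Control"
    // and is terminated by a colon.
    public static long[] Parse(string input)
    {
        return Regex.Matches(input, @"Control\s+(\d+):")
                    .Cast<Match>()
                    .Select(m => long.Parse(m.Groups[1].Value)) // group 1 holds the digits
                    .ToArray();
    }

    static void Main()
    {
        string input =
            "============================\nControl 127232:\nmap #;-\n" +
            "============================\nControl 127235:\nmap $;NULL\n" +
            "============================\nControl 127236:\n";

        Console.WriteLine(string.Join(", ", Parse(input)));
        // prints: 127232, 127235, 127236
    }
}
```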
try this (assuming your string is named s and each line is made with \n):
List<string> ret = new List<string>();
foreach (string t in s.Split('\n').Where(p => p.StartsWith("Control")))
ret.Add(t.Replace("Control ", "").Replace(":", ""));
ret.Add(...) part is not elegant, but works...
EDITED:
If you want an array use string[] arr = ret.ToArray();
SYNOPSIS:
I see you're really a newbie, so I'll try to explain:
s.Split('\n') creates a string[] (every line in your string)
.Where(...) part extracts from the array only strings starting with Control
foreach part navigates through returned array taking one string at a time
t.Replace(..) cuts unwanted string out
ret.Add(...) finally adds searched items into returning list
Off the top of my head try this (it's quick and dirty), assuming the text you want to search is in the variable 'text':
List<string> numbers = System.Text.RegularExpressions.Regex.Split(text, "[^\\d]").ToList();
numbers.RemoveAll(item => item == "");
The first line splits out all the numbers into separate items in a list, it also splits out lots of empty strings, the second line removes the empty strings leaving you with a list of the three numbers. if you want to convert that back to an array just add the following line to the end:
var numberArray = numbers.ToArray();
Yes, there is a way. I can't recall a one-line way of doing it, but the string has to be parsed to extract these values. The algorithm is as follows:
Find the word "Control" in the string, and where it ends
Find the group of digits after the word
Extract the number with int.Parse or int.TryParse
If the end of the string has not been reached, go back to step one
Implementing this algorithm is almost trivial.
This is simplest implementation (your string is str):
int i, number, index = 0;
while ((index = str.IndexOf(':', index)) != -1)
{
i = index - 1;
while (i >= 0 && char.IsDigit(str[i])) i--;
if (++i < index)
{
number = int.Parse(str.Substring(i, index - i));
Console.WriteLine("Number: " + number);
}
index ++;
}
Using LINQ for such a little operation is doubtful.

String functions

I want to search for a given string within another string (e.g. find out whether "something" exists inside "something like this"). How can I do the following?
Know the position at which "something" is located (in the current example this is 0).
Extract everything to the left or to the right of the position found (see 1).
Extract a substring beginning where the sought string was found and running for X characters (in Visual Basic 6/VBA I would use the Mid function).
string searched = "something like this";
1.
int pos = searched.IndexOf("something");
2.
string start = searched.Substring(0, pos);
string endstring = searched.Substring(pos);
3.
string mid = searched.Substring(pos, x);
Have you looked at the String.SubString() method? You can use the IndexOf() method to see if the substring exists first.
Take a look at the System.String member functions, in particular the IndexOf method.
Use int String.IndexOf(String).
I would do something like this:
string s = "I have something like this";
//question No. 1
int pos = s.IndexOf("something");
//question No. 2
string[] separator = {"something"};
string[] leftAndRightEntries = s.Split(separator, StringSplitOptions.None);
//question No. 3
int x = 10; // number of characters to extract; Substring takes a length, not an end index
string substring = s.Substring(pos, x);
I would avoid using Split, as it's designed to give you multiple results. I would stick with the code in the first example, though the second block should actually read...
string start = searched.Substring(0, pos);
string endstring;
if(pos < searched.Length - 1)
endstring = searched.Substring(pos + "something".Length);
else
endstring = string.Empty;
The key difference is accounting for the length of the string to find (hence the rather odd-looking "something".Length, as this example is designed for you to be able to plop in your own variable).
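
The three operations from the question could be collected into small helpers along these lines. The names Before, After and Mid are invented for this sketch (Mid only loosely mirrors the VB6 function and is zero-based here), and returning null when the search string is absent is an assumption. Mid also assumes a non-negative start.

```csharp
using System;

static class StringSearch
{
    // Text to the left of the first occurrence of 'value'; null when it is not found.
    public static string Before(string source, string value)
    {
        int pos = source.IndexOf(value, StringComparison.Ordinal);
        return pos < 0 ? null : source.Substring(0, pos);
    }

    // Text to the right of the first occurrence of 'value'; null when it is not found.
    public static string After(string source, string value)
    {
        int pos = source.IndexOf(value, StringComparison.Ordinal);
        return pos < 0 ? null : source.Substring(pos + value.Length);
    }

    // At most 'length' characters starting at zero-based index 'start'.
    // Unlike Substring, it does not throw when the range runs past the end.
    public static string Mid(string source, int start, int length)
    {
        if (start >= source.Length) return string.Empty;
        return source.Substring(start, Math.Min(length, source.Length - start));
    }

    static void Main()
    {
        string s = "I have something like this";
        int pos = s.IndexOf("something", StringComparison.Ordinal);

        Console.WriteLine(pos);                        // 7
        Console.WriteLine(Before(s, "something"));     // "I have " (with trailing space)
        Console.WriteLine(After(s, "something"));      // " like this" (with leading space)
        Console.WriteLine(Mid(s, pos, 14));            // something like
    }
}
```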
