String split with specified string without delimiter - C#

Update: when the searched value is in the middle
string text = "Trio charged over alleged $100m money laundering syndicate at Merrylands, Guildford West";
string searchtext = "charged over";
string[] fragments = text.Split(new string[] { searchtext }, StringSplitOptions.None);
// Fragments:
// If [0] is blank, the searched text is at the beginning: searchtext + [1]
// If [1] is blank, the searched text is at the end: [0] + searchtext
// If the searched text is in the middle, both items have a value: [0] + searchtext + [1]
// A loop over fragments runs only twice because the array holds at most 2 values. The problem
// comes when the searched value is in the middle (the loop should run 3 times), because I have to apply different logic to the searched value (such as changing the background color of the text)
// and leave the background color of the head and tail unchanged.
// How do I insert the searched value between [0] and [1]?
I have a string without a delimiter that I am trying to split based on a searched string. My requirement is to split the string into two parts: one contains the string without the search text and the other contains the search text, as below:
Original String - "Bitcoin ATMs Highlight Flaws in EU Money Laundering Rules"
String 1 - Bitcoin ATMs Highlight Flaws in EU
String 2 - Money Laundering Rules
I have written the code below. It works for the sample value above, but it fails for the following input (String 1 and String 2 are not returned; the string is empty):
string watch = " Money Laundering Rules Bitcoin ATMs Highlight Flaws in EU";
string serachetxt = "Money Laundering Rules";
This works:
List<string> matchedstr = new List<string>();
string watch = "Bitcoin ATMs Highlight Flaws in EU Money Laundering Rules";
string serachetxt = "Money Laundering Rules";
string compa = watch.Substring(0,watch.IndexOf(serachetxt)); //It returns "Bitcoin ATMs Highlight Flaws in EU"
matchedstr.Add(compa);
matchedstr.Add(serachetxt);
foreach(var itemco in matchedstr)
{
}

You could just consider "Money Laundering Rules" to be the delimiter. Then you can write
string[] result = watch.Split(new string[] { searchtext }, StringSplitOptions.None);
Then you can add the delimiter again
string result1 = result[0];
string result2 = searchtext + result[1];

Use string.Split.
string text = "Bitcoin ATMs Highlight Flaws in EU Money Laundering Rules";
string searchtext = "Money Laundering Rules";
string[] fragments = text.Split(new string[] { searchtext }, StringSplitOptions.None);
fragments will equal:
[0] "Bitcoin ATMs Highlight Flaws in EU "
[1] ""
Everywhere there is a gap between consecutive array elements, your search string appears. e.g.:
string originaltext = string.Join(searchtext, fragments);
Extended Description of String.Split Behaviour
Here is a quick table of the behaviour of string.Split when passed a string.
| Input | Split | Result Array |
+--------+-------+--------------------+
| "ABC" | "A" | { "", "BC" } |
| "ABC" | "B" | { "A", "C" } |
| "ABC" | "C" | { "AB", "" } |
| "ABC" | "D" | { "ABC" } |
| "ABC" | "ABC" | { "", "" } |
| "ABBA" | "A" | { "", "BB", "" } |
| "ABBA" | "B" | { "A", "", "A" } |
| "AAA" | "A" | { "", "", "", "" } |
| "AAA" | "AA" | { "", "A" } |
If you look at the table above, every place there is a comma in the array (between two consecutive elements) is a place where the split string was found.
If the string was not found, then the result array is only one element (the original string).
If the split string is found at the beginning of the input string, then an empty string is set as the first element of the result array to represent the beginning of the string. Similarly, if the split string is found at the end of the string, an empty string is set as the last element of the result array.
Also, an empty string is included between any consecutive occurrences of the search string in the input string.
In cases where there are ambiguous overlapping locations at which the string could be found in the input string: (e.g. splitting AAA on AA could be split as AA|A or A|AA - where AA is found at position 0 or position 1 in the input string) then the earlier location is used. (e.g. AA|A, resulting in { "", "A" } ).
Again, the invariant is that the original string can always be reconstructed by joining all the fragments and placing exactly one occurrence of the search text in between elements. The following will always be true:
string.Join(searchtext, fragments) == text
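The invariant above also answers the question's update (head, searched text, tail as separate items): walk the fragments and emit the search text in every gap. The following is a minimal sketch, not a definitive implementation; the class name `SplitKeepingMatches`, the method `Segment`, and the `IsMatch` flag are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SplitKeepingMatches
{
    // Hypothetical helper: returns every piece of `text` in order and
    // flags the pieces that are occurrences of `searchtext`.
    public static List<(string Text, bool IsMatch)> Segment(string text, string searchtext)
    {
        var segments = new List<(string Text, bool IsMatch)>();
        string[] fragments = text.Split(new string[] { searchtext }, StringSplitOptions.None);
        for (int i = 0; i < fragments.Length; i++)
        {
            // Every gap between two consecutive fragments is one occurrence of searchtext.
            if (i > 0) segments.Add((searchtext, true));
            // Empty fragments (match at the start, at the end, or back to back) are skipped.
            if (fragments[i].Length > 0) segments.Add((fragments[i], false));
        }
        return segments;
    }
}
```

A caller can then loop over the result once and apply the highlighting logic only to the items whose `IsMatch` is true.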
If you only want the first split...
You can merge all results after the first back together like this:
if (fragments.Length > 1) {
fragments = new string[] { fragments[0], string.Join(searchtext, fragments.Skip(1)) };
}
... or a more efficient way using String.IndexOf
If you just want to find the first location of the search text string then use String.IndexOf to get the position of the first occurrence of the search text in the input string.
Here's a complete function you can use
private static bool TrySplitOnce(string text, string searchtext, out string beforetext, out string aftertext)
{
int pos = text.IndexOf(searchtext);
if (pos < 0) {
// not found
beforetext = null;
aftertext = null;
return false;
} else {
// found at position `pos`
beforetext = text.Substring(0, pos); // may be ""
aftertext = text.Substring(pos + searchtext.Length); // may be ""
return true;
}
}
You can use this to produce an array, if you like.
usage:
string text = "red or white or blue";
string searchtext = "or";
if (TrySplitOnce(text, searchtext, out string before, out string after)) {
Console.WriteLine("{0}*{1}", before, after);
// output:
// red * white or blue
string[] array = new string[] { before, searchtext, after };
// array == { "red ", "or", " white or blue" };
Console.WriteLine(string.Join("|", array));
// output:
// red |or| white or blue
} else {
Console.WriteLine("Not found");
}
output:
red * white or blue
red |or| white or blue

You can write your own extension method for this:
// Splits s at sep with sep included at beginning of each part except first
// return no more than numParts parts
public static IEnumerable<string> SplitsBeforeInc(this string s, string sep, int numParts = Int32.MaxValue)
=> s.Split(new[] { sep }, numParts, StringSplitOptions.None).Select((p,i) => i > 0 ? sep+p : p);
And use it with:
foreach(var itemco in watch.SplitsBeforeInc(serachetxt, 2))
Here is the same method in a non-LINQ version:
// Splits s at sep with sep included at beginning of each part except first
// return no more than numParts parts
public static IEnumerable<string> SplitsBeforeInc(this string s, string sep, int numParts = Int32.MaxValue) {
var startPos = 0;
var searchPos = 0;
while (startPos < s.Length && --numParts > 0) {
var sepPos = s.IndexOf(sep, searchPos);
sepPos = sepPos < 0 ? s.Length : sepPos;
yield return s.Substring(startPos, sepPos - startPos);
startPos = sepPos;
searchPos = sepPos+sep.Length;
}
if (startPos < s.Length)
yield return s.Substring(startPos);
}

You can try this
string text = "Trio charged over alleged $100m money laundering syndicate at Merrylands, Guildford West";
string searchtext = "charged over";
string searchtextPattern = "(?=" + Regex.Escape(searchtext) + ")";
string[] fragments= Regex.Split(text, searchtextPattern);
// fragments will have two elements here
// fragments[0] - "Trio "
// fragments[1] - "charged over alleged $100m money laundering syndicate at Merrylands, Guildford West"
Now you can split the fragment that contains the search text again, i.e. fragments[1] in this case.
See the code below:
var stringWithoutSearchText = fragments[1].Replace(searchtext, string.Empty);
You need to check whether each fragment contains the search text. You can do that in your foreach loop over fragments by adding the check below:
foreach (var item in fragments)
{
if (item.Contains(searchtext))
{
string stringWithoutSearchText = item.Replace(searchtext, string.Empty);
}
}
Reference : https://stackoverflow.com/a/521172/8652887
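A related variant, sketched here as an assumption rather than as part of the answer above: when the pattern passed to Regex.Split is wrapped in a capturing group, the matched text is returned as its own array element, so the head, the search text, and the tail come back in one call. `RegexSplitKeep` and `SplitKeepingDelimiter` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSplitKeep
{
    // Hypothetical helper: splits on searchtext and keeps each occurrence as a separate element.
    public static string[] SplitKeepingDelimiter(string text, string searchtext)
    {
        // Regex.Escape makes characters such as '$' or '.' in searchtext match literally.
        // The capturing group makes Regex.Split include the matched text in the result.
        string pattern = "(" + Regex.Escape(searchtext) + ")";
        return Regex.Split(text, pattern);
    }
}
```

If the search text sits at the very start or end of the input, the first or last element is an empty string, mirroring the behaviour of string.Split.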

Related

String to Array, Sort by 3rd Word/Column

I have a string with numbers, words, and linebreaks that I split into an Array.
If I run Array.Sort(lines) it will sort the Array numerically by Column 1, Number.
How can I instead sort the Array alphabetically by Column 3, Color?
Note: They are not real columns, just spaces separating the words.
I cannot modify the string to change the results.
| Number | Name | Color |
|------------|------------|------------|
| 1 | Mercury | Gray |
| 2 | Venus | Yellow |
| 3 | Earth | Blue |
| 4 | Mars | Red |
C#
Example: http://rextester.com/LSP53065
string planets = "1 Mercury Gray\n"
+ "2 Venus Yellow\n"
+ "3 Earth Blue\n"
+ "4 Mars Red\n";
// Split String into Array by LineBreak
string[] lines = planets.Split(new string[] { "\n" }, StringSplitOptions.None);
// Sort
Array.Sort(lines);
// Result
foreach(var line in lines)
{
Console.WriteLine(line.ToString());
}
Desired Sorted Array Result
3 Earth Blue
1 Mercury Gray
4 Mars Red
2 Venus Yellow
Try this code:
string planets = "1 Mercury Gray \n"
+ "2 Venus Yellow \n"
+ "3 Earth Blue \n"
+ "4 Mars Red \n";
var lines = planets.Split("\n".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
.OrderBy(s => s.Split(' ')[2])
.ToArray();
foreach (var line in lines)
{
Console.WriteLine(line);
}
EDIT: Thanks @Kevin!
Aleks has got the straight-up answer - I just wanted to contribute something from another angle.
This code is fine from an academic, just learning the concepts point of view.
But if you're looking to translate this into something for business dev, you should get in the habit of structuring it like:
Develop a Planet class
Have a function that returns a Planet from a source text line
Have a function that displays a Planet how you intend it to be displayed.
There are a lot of reasons for this, but the big one is that you'll have reusable, flexible code (look at the function you're writing right now - how likely is it that you'll be able to reuse it down the line for something else?) If you're interested, look up some info on SRP (Single Responsibility Principle) to get more info on this concept.
This is a translated version of your code:
static void Main(string[] args)
{
string planetsDBStr = "1 Mercury Gray \n"
+ "2 Venus Yellow \n"
+ "3 Earth Blue \n"
+ "4 Mars Red \n";
List<Planet> planets = GetPlanetsFromDBString(planetsDBStr);
foreach (Planet p in planets.OrderBy(x => x.color))
{
Console.WriteLine(p.ToString());
}
Console.ReadKey();
}
private static List<Planet> GetPlanetsFromDBString(string dbString)
{
List<Planet> retVal = new List<Planet>();
string[] lines = dbString.Split("\n".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
foreach (string line in lines)
retVal.Add(new Planet(line));
return retVal;
}
public class Planet
{
public int orderInSystem;
public string name;
public string color;
public Planet(string databaseTextLine)
{
string[] parts = databaseTextLine.Split(' ');
this.orderInSystem = int.Parse(parts[0]);
this.name = parts[1];
this.color = parts[2];
}
public override string ToString()
{
return orderInSystem + " " + name + " " + color;
}
}
EDIT: Fixed some formatting issues
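You can use an Array.Sort overload that takes a custom comparer:
public class MyComparer : IComparer {
    int IComparer.Compare(Object x, Object y) {
        // Compare the last space-separated part of each line (the color)
        string[] partsX = ((string)x).Trim().Split(' ');
        string[] partsY = ((string)y).Trim().Split(' ');
        return string.Compare(partsX[partsX.Length - 1], partsY[partsY.Length - 1], StringComparison.Ordinal);
    }
}
// Usage: Array.Sort(lines, new MyComparer());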
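As a sketch of the same idea without a separate comparer class, Array.Sort also accepts a Comparison<string> delegate. The names `ColumnSort`, `Word`, and `SortByColumn` are hypothetical, and treating lines with too few words as having an empty key is an assumption made so that blank lines do not throw.

```csharp
using System;

public static class ColumnSort
{
    // Returns the word at `column` (0-based), or "" when the line has fewer words.
    private static string Word(string line, int column)
    {
        string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return column < parts.Length ? parts[column] : "";
    }

    // Sorts the array in place by the given column, using ordinal comparison.
    public static void SortByColumn(string[] lines, int column)
    {
        Array.Sort(lines, (a, b) => string.CompareOrdinal(Word(a, column), Word(b, column)));
    }
}
```

Calling `ColumnSort.SortByColumn(lines, 2)` on the planet lines orders them by the third word, the color.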

In C#, what is the best way to parse this WIKI markup?

I need to take data that I am reading in from a WIKI markup page and store it as a table structure. I am trying to figure out how to properly parse the below markup syntax into some table data structure in C#
Here is an example table:
|| Owner || Action || Status || Comments ||
| Bill | Fix the lobby | In Progress | This is easy |
| Joe | Fix the bathroom | In Progress | Plumbing \\
\\
Electric \\
\\
Painting \\
\\
\\ |
| Scott | Fix the roof | Complete | This is expensive |
and here is how it comes in directly:
|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|
So as you can see:
The column headers have "||" as the separator.
A row's columns have "|" as the separator.
A row might span multiple lines (as in the second data row of the example above), so I would have to keep reading until I hit the same number of "|" (columns) that I have in the header row.
I tried reading in line by line and then concatenating lines that had "\\" in between them, but that seemed a bit hacky.
I also tried to simply read in as a full string and then just parse by "||" first and then keep reading until I hit the same number of "|" and then go to the next row. This seemed to work, but it feels like there might be a more elegant way using regular expressions or something similar.
Can anyone suggest the correct way to parse this data?
I have largely replaced the previous answer, due to the fact that the format of the input after your edit is substantially different from the one posted before. This leads to a somewhat different solution.
Because there are no longer any line breaks after a row, the only way to determine for sure where a row ends, is to require that each row has the same number of columns as the table header. That is at least if you don't want to rely on some potentially fragile white space convention present in the one and only provided example string (i.e. that the row separator is the only | not preceded by a space). Your question at least does not provide this as the specification for a row delimiter.
The below "parser" provides at least the error handling validity checks that can be derived from your format specification and example string and also allows for tables that have no rows. The comments explain what it is doing in basic steps.
public class TableParser
{
const StringSplitOptions SplitOpts = StringSplitOptions.None;
const string RowColSep = "|";
static readonly string[] HeaderColSplit = { "||" };
static readonly string[] RowColSplit = { RowColSep };
static readonly string[] MLColSplit = { @"\\" };
public class TableRow
{
public List<string[]> Cells;
}
public class Table
{
public string[] Header;
public TableRow[] Rows;
}
public static Table Parse(string text)
{
// Isolate the header columns and rows remainder.
var headerSplit = text.Split(HeaderColSplit, SplitOpts);
Ensure(headerSplit.Length > 1, "At least 1 header column is required in the input");
// Need to check whether there are any rows.
var hasRows = headerSplit.Last().IndexOf(RowColSep) >= 0;
var header = headerSplit.Skip(1)
.Take(headerSplit.Length - (hasRows ? 2 : 1))
.Select(c => c.Trim())
.ToArray();
if (!hasRows) // If no rows for this table, we are done.
return new Table() { Header = header, Rows = new TableRow[0] };
// Get all row columns from the remainder.
var rowsCols = headerSplit.Last().Split(RowColSplit, SplitOpts);
// Require same amount of columns for a row as the header.
Ensure((rowsCols.Length % (header.Length + 1)) == 1,
"The number of row colums does not match the number of header columns");
var rows = new TableRow[(rowsCols.Length - 1) / (header.Length + 1)];
// Fill rows by sequentially taking # header column cells
for (int ri = 0, start = 1; ri < rows.Length; ri++, start += header.Length + 1)
{
rows[ri] = new TableRow() {
Cells = rowsCols.Skip(start).Take(header.Length)
.Select(c => c.Split(MLColSplit, SplitOpts).Select(p => p.Trim()).ToArray())
.ToList()
};
}
return new Table { Header = header, Rows = rows };
}
private static void Ensure(bool check, string errorMsg)
{
if (!check)
throw new InvalidDataException(errorMsg);
}
}
When used like this:
public static void Main(params string[] args)
{
var wikiLine = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
var table = TableParser.Parse(wikiLine);
Console.WriteLine(string.Join(", ", table.Header));
foreach (var r in table.Rows)
Console.WriteLine(string.Join(", ", r.Cells.Select(c => string.Join(Environment.NewLine + "\t# ", c))));
}
It will produce the below output:
Where "\t# " represents a newline caused by the presence of \\ in the input.
Here's a solution which populates a DataTable. It does require a little bit of data massaging (Trim), but the main parsing is Splits and Linq.
var str = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
var headerStop = str.LastIndexOf("||");
var headers = str.Substring(0, headerStop).Split(new string[1] { "||" }, StringSplitOptions.None).Skip(1).ToList();
var records = str.Substring(headerStop + 4).TrimEnd(new char[2] { ' ', '|' }).Split(new string[1] { "| |" }, StringSplitOptions.None).ToList();
var tbl = new DataTable();
headers.ForEach(h => tbl.Columns.Add(h.Trim()));
records.ForEach(r => tbl.Rows.Add(r.Split('|')));
This makes some assumptions but seems to work for your sample data. I'm sure if I worked at it I could combine the expressions and clean it up, but you'll get the idea.
It will also allow for rows that do not have the same number of cells as the header which I think is something confluence can do.
List<List<string>> table = new List<List<string>>();
var match = Regex.Match(raw, @"(?:(?:\|\|([^|]*))*\n)?");
if (match.Success)
{
var headersWithExtra = match.Groups[1].Captures.Cast<Capture>().Select(c=>c.Value);
List<String> headerRow = headersWithExtra.Take(headersWithExtra.Count()-1).ToList();
if (headerRow.Count > 0)
{
table.Add(headerRow);
}
}
match = Regex.Match(raw + "\r\n", @"[^\n]*\n" + @"(?:\|([^|]*))*");
var cellsWithExtra = match.Groups[1].Captures.Cast<Capture>().Select(c=>c.Value);
List<string> row = new List<string>();
foreach (string cell in cellsWithExtra)
{
if (cell.Trim(' ', '\t') == "\r\n")
{
if (!table.Contains(row) && row.Count > 0)
{
table.Add(row);
}
row = new List<string>();
}
else
{
row.Add(cell);
}
}
This ended up very similar to Jon Tirjan's answer, although it cuts the LINQ to a single statement (the code to replace that last one was horrifically ugly) and is a bit more extensible. For example, it will replace the Confluence line breaks \\ with a string of your choosing, you can choose to trim or not trim whitespace from around elements, etc.
private void ParseWikiTable(string input, string newLineReplacement = " ")
{
string separatorHeader = "||";
string separatorRow = "| |";
string separatorElement = "|";
input = Regex.Replace(input, @"[ \\]{2,}", newLineReplacement);
string inputHeader = input.Substring(0, input.LastIndexOf(separatorHeader));
string inputContent = input.Substring(input.LastIndexOf(separatorHeader) + separatorHeader.Length);
string[] headerArray = SimpleSplit(inputHeader, separatorHeader);
string[][] rowArray = SimpleSplit(inputContent, separatorRow).Select(r => SimpleSplit(r, separatorElement)).ToArray();
// do something with output data
TestPrint(headerArray);
foreach (var r in rowArray) { TestPrint(r); }
}
private string[] SimpleSplit(string input, string separator, bool trimWhitespace = true)
{
input = input.Trim();
if (input.StartsWith(separator)) { input = input.Substring(separator.Length); }
if (input.EndsWith(separator)) { input = input.Substring(0, input.Length - separator.Length); }
string[] segments = input.Split(new string[] { separator }, StringSplitOptions.None);
if (trimWhitespace)
{
for (int i = 0; i < segments.Length; i++)
{
segments[i] = segments[i].Trim();
}
}
return segments;
}
private void TestPrint(string[] lst)
{
string joined = "[" + String.Join("::", lst) + "]";
Console.WriteLine(joined);
}
Console output from your direct input string:
[Owner::Action::Status::Comments]
[Bill::fix the lobby::In Progress::This is eary]
[Joe::fix the bathroom::In progress::plumbing Electric Painting]
[Scott::fix the roof::Complete::this is expensive]
A generic regex solution that populates a DataTable and is a little flexible with the syntax.
var text = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
// Get Headers
var regHeaders = new Regex(@"\|\|\s*(\w[^\|]+)", RegexOptions.Compiled);
var headers = regHeaders.Matches(text);
//Get Rows, based on number of headers columns
var regLinhas = new Regex(String.Format(@"(?:\|\s*(\w[^\|]+)){{{0}}}", headers.Count));
var rows = regLinhas.Matches(text);
var tbl = new DataTable();
foreach (Match header in headers)
{
tbl.Columns.Add(header.Groups[1].Value);
}
foreach (Match row in rows)
{
tbl.Rows.Add(row.Groups[1].Captures.OfType<Capture>().Select(col => col.Value).ToArray());
}
Here's a solution involving regular expressions. It takes a single string as input and returns a List<string> of headers and a List<List<string>> of rows/columns. It also trims white space, which may or may not be the desired behavior, so be aware of that. It even prints things nicely :)
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace parseWiki
{
class Program
{
static void Main(string[] args)
{
string content = @"|| Owner || Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
content = content.Replace(@"\\", "");
string headerContent = content.Substring(0, content.LastIndexOf("||") + 2);
string cellContent = content.Substring(content.LastIndexOf("||") + 2);
MatchCollection headerMatches = new Regex(@"\|\|([^|]*)(?=\|\|)", RegexOptions.Singleline).Matches(headerContent);
MatchCollection cellMatches = new Regex(@"\|([^|]*)(?=\|)", RegexOptions.Singleline).Matches(cellContent);
List<string> headers = new List<string>();
foreach (Match match in headerMatches)
{
if (match.Groups.Count > 1)
{
headers.Add(match.Groups[1].Value.Trim());
}
}
List<List<string>> body = new List<List<string>>();
List<string> newRow = new List<string>();
foreach (Match match in cellMatches)
{
if (newRow.Count > 0 && newRow.Count % headers.Count == 0)
{
body.Add(newRow);
newRow = new List<string>();
}
else
{
newRow.Add(match.Groups[1].Value.Trim());
}
}
body.Add(newRow);
print(headers, body);
}
static void print(List<string> headers, List<List<string>> body)
{
var CELL_SIZE = 20;
for (int i = 0; i < headers.Count; i++)
{
Console.Write(headers[i].Truncate(CELL_SIZE).PadRight(CELL_SIZE) + " ");
}
Console.WriteLine("\n" + "".PadRight( (CELL_SIZE + 2) * headers.Count, '-'));
for (int r = 0; r < body.Count; r++)
{
List<string> row = body[r];
for (int c = 0; c < row.Count; c++)
{
Console.Write(row[c].Truncate(CELL_SIZE).PadRight(CELL_SIZE) + " ");
}
Console.WriteLine("");
}
Console.WriteLine("\n\n\n");
Console.ReadKey(false);
}
}
public static class StringExt
{
public static string Truncate(this string value, int maxLength)
{
if (string.IsNullOrEmpty(value) || value.Length <= maxLength) return value;
return value.Substring(0, maxLength - 3) + "...";
}
}
}
Read the input string one character at a time and use a state-machine to decide what should be done with each input character. This approach probably needs more code, but it will be easier to maintain and to extend than regular expressions.
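A minimal sketch of such a state machine follows, assuming the single-line input format shown in the question: "||" separates header cells, "|" separates row cells, every row has as many cells as the header, and `\\` marks a line break inside a cell. The class name, the state names, and the choice to turn `\\` into a newline are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class WikiTableStateMachine
{
    private enum State { Start, InHeader, InRow, BetweenRows }

    public static (List<string> Header, List<List<string>> Rows) Parse(string input)
    {
        var header = new List<string>();
        var rows = new List<List<string>>();
        var row = new List<string>();
        var buffer = new StringBuilder();
        var state = State.Start;

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            // True when the next character repeats this one ("||" or "\\").
            bool doubled = i + 1 < input.Length && input[i + 1] == c;

            switch (state)
            {
                case State.Start:
                    // Skip everything until the first "||" opens the header.
                    if (c == '|' && doubled) { state = State.InHeader; i++; }
                    break;

                case State.InHeader:
                    if (c == '|' && doubled)
                    {
                        // "||" closes the current header cell.
                        header.Add(buffer.ToString().Trim());
                        buffer.Clear();
                        i++;
                    }
                    else if (c == '|')
                    {
                        // A single "|" after the header opens the first row.
                        buffer.Clear();
                        state = State.InRow;
                    }
                    else buffer.Append(c);
                    break;

                case State.InRow:
                    if (c == '|')
                    {
                        // "|" closes the current cell; a full row ends the row.
                        row.Add(buffer.ToString().Trim());
                        buffer.Clear();
                        if (row.Count == header.Count)
                        {
                            rows.Add(row);
                            row = new List<string>();
                            state = State.BetweenRows;
                        }
                    }
                    else if (c == '\\' && doubled) { buffer.Append('\n'); i++; }
                    else buffer.Append(c);
                    break;

                case State.BetweenRows:
                    // Ignore the gap between rows; the next "|" opens a new row.
                    if (c == '|') state = State.InRow;
                    break;
            }
        }
        return (header, rows);
    }
}
```

Each new rule (for example, escaped pipes inside a cell) becomes one more branch in the relevant state rather than a change to a regular expression.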

How to use Regex to split a string AND include whitespace

I can't seem to find (or write) a simple way of splitting the following sentence into words and assigning a word to the whitespace between the letters.
(VS 2010, C#, .net4.0).
String text = "This is a test.";
Desired result:
[0] = This
[1] = " "
[2] = is
[3] = " "
[4] = a
[5] = " "
[6] = test.
The closest I have come is:
string[] words = Regex.Split(text, @"\s");
but of course, this drops the whitespace.
Suggestions are appreciated. Thanks
Edit: There may be one or more spaces between the words. I would like all the spaces between two words to be returned as a single "word" (with all the spaces placed in that "word"). For example, 5 spaces between two words would give:
String spaceword = "     "; // a string of 5 spaces
Change your pattern to (\s+):
String text = "This        is a   test.";
string[] words = Regex.Split(text, @"(\s+)");
for(int i =0; i < words.Length;i++)
{
Console.WriteLine(i.ToString() + "," + words[i].Length.ToString() + " = " + words[i]);
}
Here's the output:
0,4 = This
1,8 =
2,2 = is
3,1 =
4,1 = a
5,3 =
6,5 = test.
You can use LINQ to add spaces manually between them:
var parts = text.Split(new[]{ ' ' }, StringSplitOptions.RemoveEmptyEntries);
var result = parts.SelectMany((x,idx) => idx != parts.Length - 1
? new[] { x, " " }
: new[] { x }).ToList();
You can try this regex, \S+|\s+, which uses the alternation operator |:
var arr = Regex.Matches(text, @"\S+|\s+").Cast<Match>()
.Select(i => i.Value)
.ToArray();
It just matches both words and spaces and some LINQ stuff is being used so arr is just a String Array

C# Index of for space and next informations

Can you help me, please? I select the complete address from the DB, but it contains both the street name and the house number, and I need the street name and the house number separately.
I created two lists for this.
while (reader_org.Read())
{
string s = reader_org.GetString(0);
string ulice, cp, oc;
char mezera = ' ';
if (s.Contains(mezera))
{
Match m = Regex.Match(s, @"(\d+)");
string numStr = m.Groups[0].Value;
if (numStr.Length > 0)
{
s = s.Replace(numStr, "").Trim();
int number = Convert.ToInt32(numStr);
}
Match l = Regex.Match(s, @"(\d+)");
string numStr2 = l.Groups[0].Value;
if (numStr2.Length > 0)
{
s = s.Replace(numStr2, "").Trim();
int number = Convert.ToInt32(numStr2);
}
if (s.Contains('/'))
s = s.Replace('/', ' ').Trim();
MessageBox.Show("Adresa: " + s);
MessageBox.Show("CP:" + numStr);
MessageBox.Show("OC:" + numStr2);
}
else
{
Definitions.Ulice.Add(s);
}
}
You might find the street name consists of multiple words, or the number appears before the street name. Also potentially some houses might not have a number. Here's a way of dealing with all that.
// Extract the first number found in the address string, wherever that number is.
// An optional "/second number" part is captured as well.
Match m = Regex.Match(address, @"(\d+)(?:/(\d+))?");
string numStr = m.Value;
string streetName = address.Trim();
// If a number was found, remove it from the address string so that only
// the street name remains, then convert the number to numeric.
if (numStr.Length > 0)
{
    streetName = address.Replace(numStr, "").Trim();
    if (m.Groups[2].Success)
    {
        int num1 = Convert.ToInt32(m.Groups[1].Value);
        int num2 = Convert.ToInt32(m.Groups[2].Value);
    }
    else
    {
        int number = Convert.ToInt32(numStr);
    }
}
Use .Split on your string that results. Then you can index into the result and get the parts of your string.
var parts = s.Split(' ');
// you can get parts[0] etc to access each part;
using (SqlDataReader reader_org = select_org.ExecuteReader())
{
    while (reader_org.Read())
    {
        // Returns, for example, "Karlínkova 514"; the street name ("Karlínkova")
        // and the house number (514) are needed separately.
        string s = reader_org.GetString(0);
        var values = s.Split(' ');
        var address = values.Length > 0 ? values[0] : null;
        var number = values.Length > 1 ? int.Parse(values[1]) : 0;
        // Do whatever you want with address and number here...
    }
}
Here is a way to split the address into house number and address without regex, using only the functions of the String class.
var fullAddress = "1111 Awesome Point Way NE, WA 98122";
var index = fullAddress.IndexOf(" "); //Gets the first index of space
var houseNumber = fullAddress.Remove(index);
var address = fullAddress.Remove(0, (index + 1));
Console.WriteLine(houseNumber);
Console.WriteLine(address);
Output: 1111
Output: Awesome Point Way NE, WA 98122
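For addresses in the question's format, where the street name comes first and the house number last (for example "Karlínkova 514"), the same idea can be sketched with LastIndexOf. This is an illustration under that assumption, not a general address parser; `AddressSplitter` and its `Split` method are hypothetical names, and addresses without a trailing number are returned whole with no number.

```csharp
using System;

public static class AddressSplitter
{
    // Splits "street name + house number" at the last space.
    // Number is null when the last word is not an integer.
    public static (string Street, int? Number) Split(string address)
    {
        string trimmed = address.Trim();
        int pos = trimmed.LastIndexOf(' ');
        if (pos >= 0 && int.TryParse(trimmed.Substring(pos + 1), out int number))
        {
            return (trimmed.Substring(0, pos).TrimEnd(), number);
        }
        return (trimmed, (int?)null);
    }
}
```

Using int.TryParse instead of int.Parse avoids an exception when the last word is not numeric, which matters for street names made of several words.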

String manipulation to truncate string up to a specified expression on C#

How do I cut each line at the first run of three or more spaces, so that the string ends right there and looks like the transformed line below, using C#?
[Example1]
PO BOX XXX    OVERDUE - PAY NOW
then transform to
PO BOX XXX
[Example2]
ClientB    AMOUNT CARRI
then transform to
ClientB
[Example3]
PO BOX 400    FORWARD TO N
then transform to
PO BOX 400
var firstColumn = origString.Substring(0, origString.IndexOf("   "));
input = "PO BOX XXX    OVERDUE - PAY NOW ";
input = input.Remove(input.IndexOf("   "));
There are 3 spaces in the IndexOf parentheses.
Or you can do a split if you don't know whether there is a tab or a run of spaces:
input = input.Split(new string[] { "   ", "\t" }, StringSplitOptions.RemoveEmptyEntries)[0];
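Both IndexOf calls return -1 when a line contains no run of three spaces, and Substring or Remove then throws. A guarded variant might look like the sketch below; `LineTruncator`, `TruncateAtGap`, and the `minSpaces` parameter are hypothetical names, and returning the line unchanged (apart from trailing whitespace) when no gap is found is an assumption about the desired behaviour.

```csharp
using System;

public static class LineTruncator
{
    // Cuts the line at the first run of at least `minSpaces` spaces.
    // Lines without such a run are returned with trailing whitespace removed.
    public static string TruncateAtGap(string line, int minSpaces = 3)
    {
        int pos = line.IndexOf(new string(' ', minSpaces), StringComparison.Ordinal);
        return pos < 0 ? line.TrimEnd() : line.Substring(0, pos);
    }
}
```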
You can do this:
var input = new string[3] { "PO BOX XXX    OVERDUE - PAY NOW ",
"ClientB    AMOUNT CARRI",
"PO BOX 400    FORWARD TO N "
};
for (int x = 0, len = input.Length; x != len; x++)
{
input[x] = Regex.Replace(input[x], @"\s{3}[^\n]+", string.Empty);
}
//input is ["PO BOX XXX","ClientB","PO BOX 400"]
Using LINQ:
var output = input.Select(str => Regex.Replace(str, @"\s{3}[^\r\n]+$", string.Empty));
If you're reading this string from a file, you can do this:
var file = @"D:\file.txt";
var lines = File.ReadAllLines(file);
var output = lines.Select(str => Regex.Replace(str, @"\s{3}[^\n]+$", string.Empty)); // is ["PO BOX XXX","ClientB","PO BOX 400"]
You can use the string.Split method, which returns a string[]. Based on the array length you can take the elements you need.
string baseString = "PO BOX XXX    OVERDUE - PAY NOW";
string[] delimittedStringArray = baseString.Split(' ');
if (delimittedStringArray.Length > 3)
{
    // Take the data from the array
}
else
{
    // Do whatever
}
// Arrays expose Length; Count is the property on collections such as List<T>.
