C#: using IndexOf to split a string at a space and get the rest - c#

Can you help me, please? I select a complete address from the DB, but it contains both the street and the house number, and I need the street and the house number separately.
I created two lists for this split.
while (reader_org.Read())
{
string s = reader_org.GetString(0);
string ulice, cp, oc;
char mezera = ' ';
if (s.Contains(mezera))
{
Match m = Regex.Match(s, @"(\d+)");
string numStr = m.Groups[0].Value;
if (numStr.Length > 0)
{
s = s.Replace(numStr, "").Trim();
int number = Convert.ToInt32(numStr);
}
Match l = Regex.Match(s, @"(\d+)");
string numStr2 = l.Groups[0].Value;
if (numStr2.Length > 0)
{
s = s.Replace(numStr2, "").Trim();
int number = Convert.ToInt32(numStr2);
}
if (s.Contains('/'))
s = s.Replace('/', ' ').Trim();
MessageBox.Show("Adresa: " + s);
MessageBox.Show("CP:" + numStr);
MessageBox.Show("OC:" + numStr2);
}
else
{
Definitions.Ulice.Add(s);
}
}

You might find the street name consists of multiple words, or the number appears before the street name. Also potentially some houses might not have a number. Here's a way of dealing with all that.
//extract the first number found in the address string, wherever that number is.
//the optional second group covers numbers written as "514/12"
Match m = Regex.Match(address, @"(\d+)(?:/(\d+))?");
string numStr = m.Groups[0].Value;
//if a number was found then convert it to numeric
//also remove it from the address string, so now the address string only
//contains the street name
string streetName = address;
if (numStr.Length > 0)
{
streetName = address.Replace(numStr, "").Trim();
if (m.Groups[2].Success)
{
int num1 = Convert.ToInt32(m.Groups[1].Value);
int num2 = Convert.ToInt32(m.Groups[2].Value);
}
else
{
int number = Convert.ToInt32(numStr);
}
}

Use .Split on your string that results. Then you can index into the result and get the parts of your string.
var parts = s.Split(' ');
// you can get parts[0] etc to access each part;

using (SqlDataReader reader_org = select_org.ExecuteReader())
{
while (reader_org.Read())
{
string s = reader_org.GetString(0); // this returns, for example, "Karlínkova 514", but I need the address (Karlínkova) and the house number (514) separately, using IndexOf or a better function. I don't know how to do that.
var values = s.Split(' ');
var address = values.Length > 0 ? values[0] : null;
var number = values.Length > 1 ? int.Parse(values[1]) : 0;
//Do whatever you want with address and number here...
}
}

Here is a way to split the address into house number and address without regex, using only the functions of the String class.
var fullAddress = "1111 Awesome Point Way NE, WA 98122";
var index = fullAddress.IndexOf(" "); //Gets the first index of space
var houseNumber = fullAddress.Remove(index);
var address = fullAddress.Remove(0, (index + 1));
Console.WriteLine(houseNumber);
Console.WriteLine(address);
Output: 1111
Output: Awesome Point Way NE, WA 98122
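For the format in the question, where the street comes first and the number last ("Karlínkova 514"), the same String-only idea can be turned around with LastIndexOf. This is only a minimal sketch: AddressSplitter and Split are hypothetical names, and it assumes the house number, if present, is the last space-separated token and starts with a digit.

```csharp
using System;

public static class AddressSplitter
{
    // Hypothetical helper: splits "Karlínkova 514" into street and house number.
    // Assumption: the house number, if any, is the last space-separated token
    // and begins with a digit (so "12/3" also counts as a number).
    public static (string Street, string HouseNumber) Split(string fullAddress)
    {
        string trimmed = fullAddress.Trim();
        int index = trimmed.LastIndexOf(' ');

        // No space at all, or the last token is not numeric:
        // treat the whole string as the street name.
        if (index < 0 || !char.IsDigit(trimmed[index + 1]))
        {
            return (trimmed, "");
        }

        return (trimmed.Substring(0, index).TrimEnd(), trimmed.Substring(index + 1));
    }
}
```

The guard on index matters: IndexOf and LastIndexOf return -1 when nothing is found, and passing that straight to Remove or Substring throws.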

Related

Parse a multiline email to var

I'm attempting to parse a multi-line email so I can get at the data which is on its own newline under the heading in the body of the email.
It looks like this:
EMAIL STARTING IN APRIL
Marketing ID Local Number
------------------- ----------------------
GR332230 0000232323
Dispatch Code Logic code
----------------- -------------------
GX3472 1
Destination ID Destination details
----------------- -------------------
3411144
It appears I am getting everything on each messagebox when I use string reader readline, though all I want is the data under each ------ as shown
This is my code:
foreach (MailItem mail in publicFolder.Items)
{
if (mail != null)
{
if (mail is MailItem)
{
MessageBox.Show(mail.Body, "MailItem body");
// Creates new StringReader instance from System.IO
using (StringReader reader = new StringReader(mail.Body))
{
string line;
while ((line = reader.ReadLine()) !=null)
//Loop over the lines in the string.
if (mail.Body.Contains("Marketing ID"))
{
// var localno = mail.Body.Substring(247,15);//not correct approach
// MessageBox.Show(localrefno);
//MessageBox.Show("found");
//var conexid = mail.Body.Replace(Environment.NewLine);
var regex = new Regex("<br/>", RegexOptions.Singleline);
MessageBox.Show(line.ToString());
}
}
//var stringBuilder = new StringBuilder();
//foreach (var s in mail.Body.Split(' '))
//{
// stringBuilder.Append(s).AppendLine();
//}
//MessageBox.Show(stringBuilder.ToString());
}
else
{
MessageBox.Show("Nothing found for MailItem");
}
}
}
You can see I had numerous attempts with it, even using substring position and using regex. Please help me get the data from each line under the ---.
It is not a very good idea to do this with Regex: it is easy to miss edge cases, and the pattern is hard to understand and hard to debug. It is also easy to end up with a pattern that hangs the CPU and times out. (I cannot comment on the other answers yet, so please check at least my other two cases before you pick your final solution.)
The following Regex solution works for the example you provided, with one limitation: there must be no empty values in a column that is neither the first nor the last. In other words, if there are more than two columns and one in the middle is empty, the names and values of that line will be mismatched.
Unfortunately, I cannot give you a non-Regex solution because I don't know the spec. For example: Will there be empty spaces? Will there be TABs? Does each field have a fixed number of characters, or is the width flexible? If it is flexible and values can be empty, what rules determine which columns are empty? I assume the columns are most likely defined by the length of the column name, with only spaces as delimiters. If that's the case, there are two ways to solve it: a two-pass Regex, or your own parser. If all the fields have a fixed length, it is even easier: just use Substring to cut the lines and then trim them.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
public class Program
{
public class Record{
public string Name {get;set;}
public string Value {get;set;}
}
public static void Main()
{
var regex = new Regex(@"(?<name>((?!-)[\w]+[ ]?)*)(?>(?>[ \t]+)?(?<name>((?!-)[\w]+[ ]?)+)?)+(?:\r\n|\r|\n)(?>(?<splitters>(-+))(?>[ \t]+)?)+(?:\r\n|\r|\n)(?<value>((?!-)[\w]+[ ]?)*)(?>(?>[ \t]+)?(?<value>((?!-)[\w]+[ ]?)+)?)+", RegexOptions.Compiled);
var testingValue =
@"EMAIL STARTING IN APRIL
Marketing ID Local Number
------------------- ----------------------
GR332230 0000232323
Dispatch Code Logic code
----------------- -------------------
GX3472 1
Destination ID Destination details
----------------- -------------------
3411144";
var matches = regex.Matches(testingValue);
var rows = (
from match in matches.OfType<Match>()
let row = (
from grp in match.Groups.OfType<Group>()
select new {grp.Name, Captures = grp.Captures.OfType<Capture>().ToList()}
).ToDictionary(item=>item.Name, item=>item.Captures.OfType<Capture>().ToList())
let names = row.ContainsKey("name")? row["name"] : null
let splitters = row.ContainsKey("splitters")? row["splitters"] : null
let values = row.ContainsKey("value")? row["value"] : null
where names != null && splitters != null &&
names.Count == splitters.Count &&
(values==null || values.Count <= splitters.Count)
select new {Names = names, Values = values}
);
var records = new List<Record>();
foreach(var row in rows)
{
for(int i=0; i< row.Names.Count; i++)
{
records.Add(new Record{Name=row.Names[i].Value, Value=i < row.Values.Count ? row.Values[i].Value : ""});
}
}
foreach(var record in records)
{
Console.WriteLine(record.Name + " = " + record.Value);
}
}
}
output:
Marketing ID = GR332230
Local Number = 0000232323
Dispatch Code = GX3472
Logic code = 1
Destination ID = 3411144
Destination details =
Please note that this also works for this kind of message:
EMAIL STARTING IN APRIL
Marketing ID Local Number
------------------- ----------------------
GR332230 0000232323
Dispatch Code Logic code
----------------- -------------------
GX3472 1
Destination ID Destination details
----------------- -------------------
3411144
output:
Marketing ID = GR332230
Local Number = 0000232323
Dispatch Code = GX3472
Logic code = 1
Destination ID =
Destination details = 3411144
Or this:
EMAIL STARTING IN APRIL
Marketing ID Local Number
------------------- ----------------------
Dispatch Code Logic code
----------------- -------------------
GX3472 1
Destination ID Destination details
----------------- -------------------
3411144
output:
Marketing ID =
Local Number =
Dispatch Code = GX3472
Logic code = 1
Destination ID =
Destination details = 3411144
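The fixed-width idea mentioned above (cut each line with Substring, then trim) can be sketched without Regex if the dash line is taken as the definition of the column positions. This is a rough sketch under stated assumptions, not a tested parser for real mail bodies: DashColumnParser, ParseBlock and Cut are hypothetical names, and it assumes the email keeps its space-based column alignment, so that each header and value starts at the same offset as its run of dashes.

```csharp
using System;
using System.Collections.Generic;

public static class DashColumnParser
{
    // Hypothetical helper: cuts the header line and the value line at the
    // offsets where each run of dashes starts in the separator line.
    public static List<KeyValuePair<string, string>> ParseBlock(
        string header, string dashes, string values)
    {
        var result = new List<KeyValuePair<string, string>>();
        int pos = 0;
        while (pos < dashes.Length)
        {
            // Skip the gap between two dash runs.
            if (dashes[pos] != '-') { pos++; continue; }

            int start = pos;
            while (pos < dashes.Length && dashes[pos] == '-') pos++;

            // A column extends to the start of the next dash run;
            // the last column extends to the end of the line.
            int next = pos;
            while (next < dashes.Length && dashes[next] != '-') next++;
            int end = next < dashes.Length ? next : int.MaxValue;

            result.Add(new KeyValuePair<string, string>(
                Cut(header, start, end), Cut(values, start, end)));
        }
        return result;
    }

    // Returns the trimmed text between start and end, or "" if the line
    // is too short to reach the column (an empty trailing value).
    private static string Cut(string line, int start, int end)
    {
        if (start >= line.Length) return "";
        int length = Math.Min(end, line.Length) - start;
        return line.Substring(start, length).Trim();
    }
}
```

Calling ParseBlock once per header/dash/value triple yields name and value pairs, and an empty first or last column comes back as an empty string instead of shifting the other values.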
var dict = new Dictionary<string, string>();
try
{
var lines = email.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
int starts = 0, end = 0, length = 0;
while (!lines[starts + 1].StartsWith("-")) starts++;
for (int i = starts + 1; i < lines.Length; i += 3)
{
var mc = Regex.Matches(lines[i], @"(?:^| )-");
foreach (Match m in mc)
{
int start = m.Value.StartsWith(" ") ? m.Index + 1 : m.Index;
end = start;
while (lines[i][end++] == '-' && end < lines[i].Length - 1) ;
length = Math.Min(end - start, lines[i - 1].Length - start);
string key = length > 0 ? lines[i - 1].Substring(start, length).Trim() : "";
end = start;
while (lines[i][end++] == '-' && end < lines[i].Length) ;
length = Math.Min(end - start, lines[i + 1].Length - start);
string value = length > 0 ? lines[i + 1].Substring(start, length).Trim() : "";
dict.Add(key, value);
}
}
}
catch (Exception ex)
{
throw new Exception("Email is not in correct format");
}
Using Regular Expressions:
var dict = new Dictionary<string, string>();
try
{
var lines = email.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
int starts = 0;
while (!lines[starts + 1].StartsWith("-")) starts++;
for (int i = starts + 1; i < lines.Length; i += 3)
{
var keys = Regex.Matches(lines[i - 1], @"(?:^| )(\w+\s?)+");
var values = Regex.Matches(lines[i + 1], @"(?:^| )(\w+\s?)+");
if (keys.Count == values.Count)
for (int j = 0; j < keys.Count; j++)
dict.Add(keys[j].Value.Trim(), values[j].Value.Trim());
else // remove bug if value of first key in a line has no value
{
if (lines[i + 1].StartsWith(" "))
{
dict.Add(keys[0].Value.Trim(), "");
dict.Add(keys[1].Value.Trim(), values[0].Value.Trim());
}
else
{
dict.Add(keys[0].Value, values[0].Value.Trim());
dict.Add(keys[1].Value.Trim(), "");
}
}
}
}
catch (Exception ex)
{
throw new Exception("Email is not in correct format");
}
Here is my attempt. I don't know if the email format can change (rows, columns, etc).
I can't think of an easy way to separate the columns besides checking for a double space (my solution).
class Program
{
static void Main(string[] args)
{
var emailBody = GetEmail();
using (var reader = new StringReader(emailBody))
{
var lines = new List<string>();
const int startingRow = 2; // Starting line to read from (start at Marketing ID line)
const int sectionItems = 4; // Header row (ex. Marketing ID & Local Number Line) + Dash Row + Value Row + New Line
// Add all lines to a list
string line = "";
while ((line = reader.ReadLine()) != null)
{
lines.Add(line.Trim()); // Add each line to the list and remove any leading or trailing spaces
}
for (var i = startingRow; i < lines.Count; i += sectionItems)
{
var currentLine = lines[i];
var indexToBeginSeparatingColumns = currentLine.IndexOf("  "); // The first double space is used as the column delimiter; not the best solution, but it should work
var header1 = currentLine.Substring(0, indexToBeginSeparatingColumns);
var header2 = currentLine.Substring(indexToBeginSeparatingColumns, currentLine.Length - indexToBeginSeparatingColumns).Trim();
currentLine = lines[i+2]; //Skip dash line
indexToBeginSeparatingColumns = currentLine.IndexOf("  ");
string value1 = "", value2 = "";
if (indexToBeginSeparatingColumns == -1) // Use case of there being no value in the 2nd column, could be better
{
value1 = currentLine.Trim();
}
else
{
value1 = currentLine.Substring(0, indexToBeginSeparatingColumns);
value2 = currentLine.Substring(indexToBeginSeparatingColumns, currentLine.Length - indexToBeginSeparatingColumns).Trim();
}
Console.WriteLine(string.Format("{0},{1},{2},{3}", header1, value1, header2, value2));
}
}
}
static string GetEmail()
{
return @"EMAIL STARTING IN APRIL
Marketing ID Local Number
------------------- ----------------------
GR332230 0000232323
Dispatch Code Logic code
----------------- -------------------
GX3472 1
Destination ID Destination details
----------------- -------------------
3411144";
}
}
Output looks something like this:
Marketing ID,GR332230,Local Number,0000232323
Dispatch Code,GX3472,Logic code,1
Destination ID,3411144,Destination details,
Here is an approach, assuming you don't need the headers and that every field is mandatory and arrives in order.
This won't work for data that contains spaces or for optional fields.
foreach (MailItem mail in publicFolder.Items)
{
MessageBox.Show(mail.Body, "MailItem body");
// Split by line, remove dash lines.
var data = Regex.Split(mail.Body, @"\r?\n|\r")
.Where(l => !l.StartsWith('-'))
.ToList();
// Remove headers
for (var i = data.Count - 2; i >= 0; i -= 2)
{
data.RemoveAt(i);
}
// now data contains only the info you want in the order it was presented.
// Assuming info doesn't have spaces.
var result = data.SelectMany(d => d.Split(' '));
// WARNING: Missing info will not be present.
// {"GR332230", "0000232323", "GX3472", "1", "3411144"}
}

How to separate a first name and last name in c#?

I am looking for a way to get rid of the exception "Index was outside the bounds of the array." in case 2 below.
Aim: To separate the first name and last name (the last name may sometimes be missing)
Case 1:
Name: John Melwick
My code handles the first case.
Case 2:
Name: Kennedy
In case 2, I get an "Index was out of range" error at LastName in my code.
Case 3:
Name: Rudolph Nick Bother
In case 3, I get:
FirstName: Rudolph and LastName: Nick (whereas I need "Nick Bother" together as the last name)
I would be very thankful if anybody could help me.
Here is the code:
Match Names = Regex.Match(item[2], @"(((?<=Name:(\s)))(.{0,60})|((?<=Name:))(.{0,60}))", RegexOptions.IgnoreCase);
if (Names.Success)
{
FirstName = Names.ToString().Trim().Split(' ')[0];
LastName = Names.ToString().Trim().Split(' ')[1];
}
Split the string with a limit on the number of substrings to return. This will keep anything after the first space together as the last name:
string[] names = Names.ToString().Trim().Split(new char[]{' '}, 2);
Then check the length of the array to handle the case of only the lastname:
if (names.Length == 1) {
FirstName = "";
LastName = names[0];
} else {
FirstName = names[0];
LastName = names[1];
}
Use
String.IndexOf(" ")
and
String.LastIndexOf(" ")
If they match, there is exactly one space. If they don't, there are at least two. Both return -1 if there is no match. Hope this helps.
Edit:
If you use the indexes, you can take a substring with them and get the last name the way you want.
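The index-based suggestion can be sketched as follows. NameSplitter and Split are hypothetical names, and the choice to put a single-word name into the first name (leaving the last name empty) is an assumption taken from the question's "last name may be missing" requirement.

```csharp
using System;

public static class NameSplitter
{
    // Hypothetical helper: everything before the first space is the first
    // name, everything after it is kept together as the last name.
    public static (string FirstName, string LastName) Split(string fullName)
    {
        string name = fullName.Trim();
        int index = name.IndexOf(' ');

        // IndexOf returns -1 when there is no space: only one name was given.
        if (index < 0)
        {
            return (name, "");
        }

        return (name.Substring(0, index), name.Substring(index + 1).TrimStart());
    }
}
```

Because only the first space is used as the cut point, "Rudolph Nick Bother" keeps "Nick Bother" together, and "Kennedy" no longer indexes past the end of an array.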
Something like this works:
string name = "Mary Kay Jones" ;
Regex rxName = new Regex( @"^\s*(?<givenName>[^\s]*)(\s+(?<surname>.*))?\s*$") ;
Match m = rxName.Match( name ) ;
string givenName = m.Success ? m.Groups[ "givenName" ].Value : "" ;
string surname = m.Success ? m.Groups[ "surname" ].Value : "" ;
But it is an extremely erroneous assumption that a given name consists of only a single word. I can think of many examples to the contrary, such as (but by no means limited to):
Billy Ray (as in the earlier example of "Billy Ray Cyrus")
Mary Kay
Mary Beth
And there's no real way to know without asking the person in question. Does "Mary Beth Jones" consist of a given name, a middle name and a surname, or does it consist of a given name "Mary Beth" and a surname "Jones"?
If you are considering English-speaking cultures, the usual convention is that one may have any number of given names (forenames), followed by a family name (surname). Prince Charles, heir to the British Crown, for instance carries the rather heavy-duty Charles Philip Arthur George Mountbatten-Windsor. Strictly speaking, he has no surname. Mountbatten-Windsor is used when one is required, and his full name is just "Charles Philip Arthur George".
string fullName = "John Doe";
var names = fullName.Split(' ');
string firstName = names[0];
string lastName = names[1];
The reason you're getting an error is because you're not checking for the length of names.
names.Length == 0 //will not happen, even for empty string
names.Length == 1 //only first name provided (or blank)
names.Length == 2 //first and last names provided
names.Length > 2 //first item is the first name. last item is the last name. Everything else are middle names
See this answer for more information.
Modify the code to be something like:
Match Names = Regex.Match(item[2], @"(((?<=Name:(\s)))(.{0,60})|((?<=Name:))(.{0,60}))", RegexOptions.IgnoreCase);
if (Names.Success)
{
String[] nameParts = Names.ToString().Trim().Split(' ');
int count = 0;
foreach (String part in nameParts) {
if(count == 0) {
FirstName = part;
count++;
} else {
LastName += part + " ";
}
}
}
Here is the most generalized solution for this issue.
public class NameWrapper
{
public string FirstName { get; set; }
public string LastName { get; set; }
public NameWrapper()
{
this.FirstName = "";
this.LastName = "";
}
}
public static NameWrapper SplitName(string inputStr, char splitChar)
{
NameWrapper w = new NameWrapper();
// Check for null or empty before calling Trim(), which would throw on null.
if (string.IsNullOrEmpty(inputStr)){
return w;
}
string[] strArray = inputStr.Trim().Split(splitChar);
for (int i = 0; i < strArray.Length; i++)
{
if (i == 0)
{
w.FirstName = strArray[i];
}
else
{
w.LastName += strArray[i] + " ";
}
}
w.LastName = w.LastName.Trim();
return w;
}

Extract node value from xml resembling string C#

I am having strings like below
<ad nameId="\862094\"></ad>
or comma-separated like below
<ad nameId="\862593\"></ad>,<ad nameId="\862094\"></ad>,<ad nameId="\865599\"></ad>
How to extract nameId value and store in single string like below
string extractedValues ="862094";
or, in the case of the comma-separated string above,
string extractedMultipleValues ="862593,862094,865599";
This is what I have started trying with but not sure
string myString = "<ad nameId="\862593\"></ad>,<ad nameId="\862094\"></ad>,<ad nameId="\865599\"></ad>";
string[] myStringArray = myString .Split(',');
foreach (string str in myStringArray )
{
xd.LoadXml(str);
chkStringVal = xd.SelectSingleNode("/ad/@nameId").Value;
}
Search for:
<ad nameId="\\(\d*)\\"><\/ad>
Replace with:
$1
Note that you must search globally. Example: http://www.regex101.com/r/pL2lX1
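In C#, the same pattern can be applied with Regex.Matches, which finds every occurrence without a separate "global" flag. The sketch below is a variant of the search-and-replace idea rather than a literal translation: it collects capture group 1 from each match and joins the values with commas. NameIdExtractor and Extract are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NameIdExtractor
{
    // Hypothetical helper: returns the nameId digits from every <ad> tag,
    // joined with commas in the order they appear.
    public static string Extract(string input)
    {
        // In a verbatim string "" is a literal quote; in the regex \\ is a
        // literal backslash. Group 1 holds the digits between the backslashes.
        var ids = Regex.Matches(input, @"<ad nameId=""\\(\d*)\\""></ad>")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value);

        return string.Join(",", ids);
    }
}
```

For a single tag the result is just the one number, so the same call covers both the single and the comma-separated case from the question.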
Please see code below to extract all numbers in your example:
string value = @"<ad nameId=""\862093\""></ad>,<ad nameId=""\862094\""></ad>,<ad nameId=""\865599\""></ad>";
// Capture only the digits between the two backslashes.
var matches = Regex.Matches(value, @"\\(\d+)\\");
foreach (Match item in matches)
{
string yourMatchNumber = item.Groups[1].Value;
}
Try like this;
string s = @"<ad nameId=""\862094\""></ad>";
if (!(s.Contains(",")))
{
string extractedValues = s.Substring(s.IndexOf("\\") + 1, s.LastIndexOf("\\") - s.IndexOf("\\") - 1);
}
else
{
string[] array = s.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
string extractedMultipleValues = "";
for (int i = 0; i < array.Length; i++)
{
extractedMultipleValues += array[i].Substring(array[i].IndexOf("\\") + 1, array[i].LastIndexOf("\\") - array[i].IndexOf("\\") - 1) + ",";
}
Console.WriteLine(extractedMultipleValues.Substring(0, extractedMultipleValues.Length -1));
}
mhasan, here goes an example of what you need(well almost)
EDITED: complete code (it's a little tricky)
(Sorry for the image but i have some troubles with tags in the editor, i can send the code by email if you want :) )
A little explanation about the code: it replaces all occurrences of parsePattern in the given string, so if the given string has multiple tags separated by ",", the final result will be the numbers separated by ",", stored in the parse variable.
Hope it helps

Isolate characters from string

So I have a string called today with the value "nick_george_james".
It looks like this:
string today = "_nick__george__james_";
How can I isolate the text between the '_' characters into new strings? I want to get the 3 names into separate strings, so that in the end I have name1, name2, name3 with the values nick, george and james.
My application is written in C#.
use string.Split
string[] array = today.Split('_');
After editing your question, I realized that you have multiple _ in your string. You should try the following.
string[] array = today.Split("_".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Or
string[] array = today.Split(new []{"_"}, StringSplitOptions.RemoveEmptyEntries);
Later your array will contain:
array[0] = "nick";
array[1] = "george";
array[2] = "james";
string[] array = today.Split('_');
name1=array[0];
name2=array[1];
name3=array[2];
Thought of coming up with an idea other than string.Split.
string today = "_nick__george__james_";
//Change value nNoofwordstobeFound accordingly
int nNoofwordstobeFound = 3;
int nstartindex = 0;
int nEndindex = 0;
int i=1;
while (i <= nNoofwordstobeFound)
{
Skip:
nstartindex = today.IndexOf("_",nEndindex);
nEndindex = today.IndexOf("_", nstartindex + 1);
string sName = today.Substring(nstartindex + 1, nEndindex - (nstartindex + 1));
if (sName == "")
{
goto Skip;
}
else
{
//Do your code
//For example
string abc= sName;
}
i++;
}
I'd still prefer the string.Split method over this anytime.
string[] nameArray = today.Split('_');
Here you will get an array of names. You can get each name by specifying its index in nameArray.
That is, nameArray now contains the values below:
nameArray[0] = "nick", nameArray[1] = "george", nameArray[2] = "james"

Get String (Text) before next upper letter

I have the following:
string test = "CustomerNumber";
or
string test2 = "CustomerNumberHello";
the result should be:
string result = "Customer";
The first word from the string is the result, the first word goes until the first uppercase letter, here 'N'
I already tried some things like this:
var result = string.Concat(s.Select(c => char.IsUpper(c) ? " " + c.ToString() : c.ToString()))
.TrimStart();
But without success, hope someone could offer me a small and clean solution (without RegEx).
The following should work:
var result = new string(
test.TakeWhile((c, index) => index == 0 || char.IsLower(c)).ToArray());
You could just go through the string to see which values (ASCII) are below 97 and remove the end. Not the prettiest or LINQiest way, but it works...
string test2 = "CustomerNumberHello";
for (int i = 1; i < test2.Length; i++)
{
if (test2[i] < 97)
{
test2 = test2.Remove(i, test2.Length - i);
break;
}
}
Console.WriteLine(test2); // Prints Customer
Try this
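private static string GetFirstWord(string source)
{
// Look for the next uppercase letter, skipping the first character.
int index = source.IndexOfAny("ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray(), 1);
// No further uppercase letter: the whole string is the first word.
return index < 0 ? source : source.Substring(0, index);
}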
Use the regex [A-Z][a-z]+. It splits the string into substrings that each start with a capital letter. Here is an example:
string regex = "[A-Z][a-z]+";
MatchCollection mc = Regex.Matches(richTextBox1.Text, regex);
foreach (Match match in mc)
if (!match.ToString().Equals(""))
Console.WriteLine(match.ToString() + "\n");
I have tested, this works:
string cust = "CustomerNumberHello";
string[] str = System.Text.RegularExpressions.Regex.Split(cust, @"[a-z]+");
string str2 = cust.Remove(cust.IndexOf(str[1], 1));
