How to find the capital substring of a string? [closed] - c#

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I am trying to find the capitalized portion of a string, to then insert two characters that represent the Double Capital sign in the Braille language. My intention for doing this is to design a translator that can translate from regular text to Braille.
I'll give an example below.
English String: My variable is of type IEnumerable.
Braille: ,My variable is of type ,,IE-numerable.
I also want the dash in IE-numerable to only break words that have upper and lower case, but not in front of punctuation marks, white spaces, numbers or other symbols.
Thanks a lot in advance for your answers.

I had never heard of a "Double Capital" sign, so I read up on it here. From what I can tell, this should suit your needs.
You can use this to find any sequence of two or more uppercase (majuscule) Latin letters or hyphens in your string:
var matches = Regex.Matches(input, "[A-Z-]{2,}");
You can use this to insert the double-capital sign:
var result = Regex.Replace(input, "[A-Z-]{2,}", ",,$0");
For example:
var input = "this is a TEST";
var result = Regex.Replace(input, "[A-Z-]{2,}", ",,$0"); // this is a ,,TEST
You can use this to handle single and double capitals:
var input = "McGRAW-HILL";
var result = Regex.Replace(input, "[A-Z-]([A-Z-]+)?",
m => (m.Groups[1].Success ? ",," : ",") + m.Value); // ,Mc,,GRAW-HILL
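One thing to watch with the patterns above: because the character class is [A-Z-], a hyphen on its own (as in "well-known") also counts as a match. The following is a minimal sketch of a stricter variant, assuming a hyphen should only count when it joins two runs of capitals. The class and method names (BrailleCapitals, MarkCapitals) are made up for illustration, and the sketch does not attempt the IE-numerable dash from the question.

```csharp
using System.Text.RegularExpressions;

public static class BrailleCapitals
{
    // One or more uppercase letters, optionally joined by single hyphens (e.g. GRAW-HILL).
    // Assumption: a hyphen only belongs to a capital run when capitals surround it.
    private static readonly Regex CapitalRun = new Regex(@"[A-Z]+(?:-[A-Z]+)*");

    // Prefixes a lone capital with "," and a run of two or more characters with ",,".
    public static string MarkCapitals(string input)
    {
        return CapitalRun.Replace(input, m => (m.Length > 1 ? ",," : ",") + m.Value);
    }
}
```

With this, MarkCapitals("McGRAW-HILL") gives ,Mc,,GRAW-HILL as before, while "a well-known TEST" becomes "a well-known ,,TEST" with the lowercase hyphen left alone.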

You can find them with a simple regex:
using System.Text.RegularExpressions;
// ..snip..
Regex r = new Regex("[A-Z]"); // This will capture only upper case characters
Match m = r.Match(input, 0);
The variable m of type System.Text.RegularExpressions.Match will contain a collection of captures. If only the first match matters, you can check its Index property directly.
Now you can insert the characters you want in that position, using String.Insert:
input = input.Insert(m.Index, doubleCapitalSign);

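This code can solve your problem:
string x = "abcdEFghijkl";
string capitalized = string.Empty;
for (int i = 0; i < x.Length; i++)
{
    // char.IsUpper is true only for uppercase letters; comparing against
    // ToUpper() would also accept digits, spaces and punctuation.
    if (char.IsUpper(x[i]))
        capitalized += x[i];
}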

Have you tried using the Char.IsUpper method?
http://msdn.microsoft.com/en-us/library/9s91f3by.aspx
This is another similar question that uses that method to solve a similar problem
Get the Index of Upper Case letter from a String

If you just want to find the first index of an uppercase letter:
var firstUpperCharIndex = text // <-- a string
.Select((chr, index) => new { chr, index })
.FirstOrDefault(x => Char.IsUpper(x.chr));
if (firstUpperCharIndex != null)
{
text = text.Insert(firstUpperCharIndex.index, ",,");
}

Not sure if this is what you are going for?
var inputString = string.Empty; //Your input string here
var output = new StringBuilder();
foreach (var c in inputString.ToCharArray())
{
if (char.IsUpper(c))
{
output.AppendFormat("_{0}_", c);
}
else
{
output.Append(c);
}
}
This loops through each character in inputString. If the character is uppercase, it inserts a _ before and after it (replace that with your desired Braille characters); otherwise it appends the character to the output unchanged.

Related

Splitting a string into characters, but keeping some together [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I have this string: TF'E'
I want to split it into characters, but the ' character should join the character before it.
So it would look like this: T, F' and E'
You could use a regular expression to split the string at each position immediately before a new letter and an optional ':
var input = "TF'E'";
var output = Regex.Split(input, @"(?<!^)(?=\p{L}'?)");
output will now be a string array like ["T", "F'", "E'"]. The lookbehind (?<!^) ensures we never split at the start of the string, whereas the lookahead (?=\p{L}'?) describes one letter \p{L} followed by 0 or 1 '.
You can use a regex to capture "an uppercase character followed optionally by an apostrophe"
var mc = Regex.Matches(input, "(?<x>[A-Z]'?)");
foreach(Match m in mc)
Console.WriteLine(m.Groups["x"].Value);
If you don't like regex, you can use this method:
public static IEnumerable<string> Split(string input)
{
for(int i = 0; i < input.Length; i++)
{
if(i != (input.Length - 1) && input[i+1] == '\'')
{
yield return input[i].ToString() + input[i+1].ToString();
i++;
}
else
{
yield return input[i].ToString();
}
}
}
We loop through the input string. We check if there is a next character and if it is a '. If true, return the current character and the next character and increase the index by one. If false, just return the current character.
Online demo: https://dotnetfiddle.net/sPCftB

string.IndexOf ignoring escape sequences [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 1 year ago.
I'm trying to extract the CN of an LDAP DN string.
Here's an example string that illustrates the problem
var dn = @"CN=Firstname Lastname\, Organization,OU=some ou,DC=company,DC=com";
What I want is the position of the first non-escaped ',' character, which is at position 36.
var pos = dn.IndexOf(',');
returns the first comma, whether it is escaped or not. How can I get IndexOf to skip the escaped comma in the string?
Assuming that \ is itself escaped by doubling (\\ stands for a single \), you can implement a simple finite state machine:
private static int IndexOfUnescaped(string source,
char toFind,
char escapement = '\\') {
if (string.IsNullOrEmpty(source))
return -1;
for (int i = 0; i < source.Length; ++i)
if (source[i] == escapement)
i += 1; // <- skip the next (escaped) character
else if (source[i] == toFind)
return i;
return -1;
}
...
var dn = @"CN=Firstname Lastname\, Organization,OU=some ou,DC=company,DC=com";
var pos = IndexOfUnescaped(dn, ',');
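Since the goal in the question is the CN itself rather than the index, the scan can be paired with Substring. This is a rough sketch under that assumption; DnHelper and FirstRdn are hypothetical names, and IndexOfUnescaped is repeated here only so the example stands alone.

```csharp
using System;

public static class DnHelper
{
    // Same scan as in the answer: a backslash skips the character that follows it.
    public static int IndexOfUnescaped(string source, char toFind, char escapement = '\\')
    {
        if (string.IsNullOrEmpty(source))
            return -1;
        for (int i = 0; i < source.Length; ++i)
        {
            if (source[i] == escapement)
                i += 1;            // skip the escaped character
            else if (source[i] == toFind)
                return i;
        }
        return -1;
    }

    // Returns everything before the first unescaped comma (e.g. "CN=..."),
    // or the whole input when no unescaped comma exists.
    public static string FirstRdn(string dn)
    {
        int pos = IndexOfUnescaped(dn, ',');
        return pos < 0 ? dn : dn.Substring(0, pos);
    }
}
```

For the sample DN, IndexOfUnescaped returns 36 and FirstRdn returns CN=Firstname Lastname\, Organization, with the escaped comma still in place.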
You can use Regex:
string s = @"CN=Firstname Lastname\, Organization,OU=some ou,DC=company,DC=com";
Regex regex = new Regex("(?<!\\\\),", RegexOptions.Compiled);
int firstMatch = regex.Matches(s).FirstOrDefault()?.Index ?? -1;
Demo: https://regex101.com/r/Jxco8K/1
It uses a negative lookbehind, so it checks every comma and matches only those that are not preceded by a backslash.
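A caveat worth keeping in mind: a single-character lookbehind cannot tell an escaped comma (\,) from a real separator that follows an escaped backslash (\\,). The sketch below assumes backslashes escape themselves and counts them in pairs, relying on .NET's support for variable-length lookbehind. UnescapedComma is an illustrative name, not something from the answer.

```csharp
using System.Text.RegularExpressions;

public static class UnescapedComma
{
    // A comma preceded by an even number (possibly zero) of backslashes.
    // (?<!\\)(?:\\\\)* = zero or more backslash pairs that do not themselves
    // follow another backslash.
    private static readonly Regex Pattern = new Regex(@"(?<=(?<!\\)(?:\\\\)*),");

    // Returns the index of the first unescaped comma, or -1 if there is none.
    public static int IndexOf(string s)
    {
        Match m = Pattern.Match(s);
        return m.Success ? m.Index : -1;
    }
}
```

On the DN from the question this still reports index 36. On an input such as CN=a\\,OU=b it reports 6, where the simpler (?<!\\), pattern would find no match at all.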
Colleague of mine whipped up this regex. Not entirely the question, but since I wanted the position to then use SubString it also does the trick.
var CnRegex = new Regex(@"([a-zA-Z_]*)=((?:[^\\,}]|\\.)*)");
var match = CnRegex.Match(input);
if (match.Success)
return match.Value;
return null;
I feared it would come down to a Regex, as in Tim's solution, or 'brute force' as with Dmitry's solution.

Get values between string in C# [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a string value like "%2%,%12%,%115%+%55%,..."
Sample inputs "(%2%+%5%)/5"
Step 1: get the values 2 and 5
Step 2: get the values from table columns column2 and column5
Step 3: from those values, compute (Column2+Column5)/5
How to get the values from that string
2,12,115,55
Also
commas, "+" (symbols)
Thanks in Advance.
I referred these links:
Find a string between 2 known values
How do I extract text that lies between parentheses (round brackets)?
You can replace the % and then split on the , and +:
var value = "%2%,%12%,%115%+%55%,";
value = value.Replace("%", "");
var individualValues = value.Split(new[] {',', '+'});
foreach (var val in individualValues)
Console.WriteLine(val);
If I understand you correctly...
var str = "%2%,%12%,%115%+%55%";
var values = str.Replace("%", "").Replace("+", "").Split(',');
Edit: Actually, I think you want to split on "+" as well, so drop the Replace("+", "") call and use Split(',', '+') instead.
str.Replace("%", "").Replace("+",",").Split(',');
This will do it.
Another regex solution:
foreach (Match m in Regex.Matches(str, @"\d+"))
    Console.WriteLine(m.Value); // m.Value is each of the values you wanted
Using regular expressions:
using System.Text.RegularExpressions;
Regex re = new Regex("\\d+");
MatchCollection matches = re.Matches("%2%,%12%,%115%+%55%,...");
List<int> lst = new List<int>();
foreach (Match m in matches)
{
lst.Add(Convert.ToInt32(m.Value));
}
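One detail the plain \d+ pattern glosses over: in the sample input "(%2%+%5%)/5" it also picks up the trailing divisor 5, which is not a column reference. A possible variation, assuming only numbers wrapped in % signs are wanted, is sketched below; PlaceholderParser and ExtractIds are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlaceholderParser
{
    // Returns the integers that appear between % signs, e.g. "%12%" -> 12.
    // Bare numbers such as the "/5" divisor are ignored.
    public static List<int> ExtractIds(string expression)
    {
        return Regex.Matches(expression, @"%(\d+)%")
                    .Cast<Match>()   // keeps this usable on older frameworks too
                    .Select(m => int.Parse(m.Groups[1].Value))
                    .ToList();
    }
}
```

For "(%2%+%5%)/5" this yields 2 and 5, and for "%2%,%12%,%115%+%55%" it yields 2, 12, 115 and 55.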
It's possible parsing the string with splits and other stuff or through Regex:
[TestMethod]
public void TestGetNumberCommasAndPlusSign()
{
Regex r = new Regex(@"[\d,\+]+");
string result = "";
foreach (Match m in r.Matches("%2%,%12%,%115%+%55%,..."))
result += m.Value;
Assert.AreEqual("2,12,115+55,", result);
}

How can I return a string between two other strings in C#? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I have a string for a page source already created. I need to grab a few lines of text from the string. The string I need is between two other strings. These two strings are "keywords": and ", "
How would I search for a string that has a colon after the quotations such as "keywords":
?
Would I use regex?
Thank you.
In your case, regex is more power than the problem needs. Just use string.IndexOf() and string.Substring(): find the position of the search text, then the position of the nearest comma after it. IndexOf has an overload that lets you specify where the search starts.
Here is a code snippet; it explains this better than I could in words.
var text = "\"keywords\":some text you want,and a text you do not want";
var searchFor = "\"keywords\":";
int firstIndex = text.IndexOf(searchFor);
int secondIndex = text.IndexOf(",", firstIndex);
var result = text.Substring(firstIndex + searchFor.Length, secondIndex - firstIndex - searchFor.Length);
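The snippet above assumes both markers are present. A slightly more defensive version might look like the sketch below; StringExtract and Between are illustrative names, and returning null for a missing marker is just one possible convention.

```csharp
using System;

public static class StringExtract
{
    // Returns the text between the first occurrence of 'start' and the next
    // occurrence of 'end' after it, or null when either marker is missing.
    public static string Between(string text, string start, string end)
    {
        int startIndex = text.IndexOf(start, StringComparison.Ordinal);
        if (startIndex < 0)
            return null;

        int contentStart = startIndex + start.Length;
        int endIndex = text.IndexOf(end, contentStart, StringComparison.Ordinal);
        if (endIndex < 0)
            return null;

        // Substring takes a length, so subtract where the content begins.
        return text.Substring(contentStart, endIndex - contentStart);
    }
}
```

For example, Between("prefix \"keywords\":some text you want,and more", "\"keywords\":", ",") returns "some text you want" even though the marker does not sit at index 0.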
The following Regex will match everything between "keywords" and ",":
Regex r = new Regex("keywords:(.*),");
Match m = r.Match(yourStringHere);
foreach(Group g in m.Groups) {
// do your work here
}
You can try like this, without using Regex
string str = "This is an example string and my data is here";
string first = "keywords:";
string second = ",";
int Start, End;
if (str.Contains(first) && str.Contains(second))
{
Start = str.IndexOf(first, 0) + first.Length;
End = str.IndexOf(second, Start);
return str.Substring(Start, End - Start);
}
else
{
return "";
}
This ought to work across multiple lines.
string input = @"blah blah blah ""keywords"":this is " + Environment.NewLine + "what you want right?, more blah...";
string pattern = @"""keywords"":(.*),";
Match match = Regex.Match(input, pattern, RegexOptions.Singleline);
if (match.Success)
{
string stuff = match.Groups[1].Value;
}
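Note that (.*) is greedy, so when the input holds several commas the capture runs up to the last one. If the value should end at the first comma after "keywords":, a lazy quantifier is one option. The following is a small sketch under that assumption; KeywordsRegex and Extract are made-up names.

```csharp
using System.Text.RegularExpressions;

public static class KeywordsRegex
{
    // Lazy (.*?) stops at the first comma after "keywords": rather than the last.
    // Singleline lets '.' match newlines, so the value may span lines.
    public static string Extract(string input)
    {
        Match match = Regex.Match(input, @"""keywords"":(.*?),", RegexOptions.Singleline);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Given x "keywords":a b,c,d the lazy form returns "a b", whereas the greedy pattern would capture "a b,c".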

Transform title into dashed URL-friendly string [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I would like to write a C# method that would transform any title into a URL friendly string, similar to what Stack Overflow does:
replace spaces with dashes
remove parentheses
etc.
I'm thinking of removing Reserved characters as per RFC 3986 standard (from Wikipedia) but I don't know if that would be enough? It would make links workable, but does anyone know what other characters are being replaced here at stackoverflow? I don't want to end up with %-s in my URLs...
Current implementation
string result = Regex.Replace(value.Trim(), @"[!*'""`();:@&+=$,/\\?%#\[\]<>«»{}_]", "");
return Regex.Replace(result.Trim(), @"[\s*[\-–—\s]\s*]", "-");
My questions
Which characters should I remove?
Should I limit the maximum length of resulting string?
Anyone know which rules are applied on titles here on SO?
Rather than looking for things to replace, the list of unreserved chars is so short, it'll make for a nice clear regex.
return Regex.Replace(value, @"[^A-Za-z0-9_\.~]+", "-");
(Note that I didn't include the dash in the list of allowed chars; that's so it gets gobbled up by the "1 or more" operator [+] so that multiple dashes (in the original or generated or a combination) are collapsed, as per Dominic Rodger's excellent point.)
You may also want to remove common words ("the", "an", "a", etc.), although doing so can slightly change the meaning of a sentence. Probably want to remove any trailing dashes and periods as well.
Also strongly recommend you do what SO and others do, and include a unique identifier other than the title, and then only use that unique ID when processing the URL. So http://example.com/articles/1234567/is-the-pop-catholic (note the missing 'e') and http://example.com/articles/1234567/is-the-pope-catholic resolve to the same resource.
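To illustrate the ID-only lookup, here is a minimal sketch that reads the numeric ID from a path and ignores the slug entirely. The /articles/{id}/{slug} layout comes from the example URLs above; ArticleRoute and TryGetArticleId are hypothetical names, not part of any framework.

```csharp
using System;

public static class ArticleRoute
{
    // Reads the numeric ID from a path like /articles/1234567/any-slug.
    // The slug segment is never inspected, so a misspelled slug still resolves.
    public static bool TryGetArticleId(string path, out long id)
    {
        id = 0;
        string[] parts = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        return parts.Length >= 2
            && parts[0] == "articles"
            && long.TryParse(parts[1], out id);
    }
}
```

Both /articles/1234567/is-the-pop-catholic and /articles/1234567/is-the-pope-catholic produce the same ID, 1234567.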
I would be doing:
string url = title;
url = Regex.Replace(url, @"^\W+|\W+$", "");
url = Regex.Replace(url, @"['""]", "");
url = Regex.Replace(url, @"_", "-");
url = Regex.Replace(url, @"\W+", "-");
Basically what this is doing is it:
strips non-word characters from the beginning and end of the title;
removes single and double quotes (mainly to get rid of apostrophes in the middle of words);
replaces underscores with hyphens (underscores are technically a word character along with digits and letters); and
replaces all groups of non-word characters with a single hyphen.
Most "sluggifiers" (methods for converting to friendly-url type names) tend to do the following:
1. Strip everything except whitespace, dashes, underscores, and alphanumerics.
2. (Optional) Remove "common words" (the, a, an, of, et cetera).
3. Replace spaces and underscores with dashes.
4. (Optional) Convert to lowercase.
As far as I know, StackOverflow's sluggifier does #1, #3, and #4, but not #2.
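The first, third and fourth steps listed above could be sketched roughly as follows. This is an illustration only, not Stack Overflow's actual code: it assumes ASCII-only output, skips the common-words step, and also trims leftover dashes at the ends. Slugger and ToSlug are made-up names.

```csharp
using System.Text.RegularExpressions;

public static class Slugger
{
    public static string ToSlug(string title)
    {
        // First step: keep only ASCII letters, digits, whitespace, dashes and underscores.
        string slug = Regex.Replace(title, @"[^A-Za-z0-9\s_-]", "");

        // Third step: collapse runs of whitespace, underscores and dashes into one dash.
        slug = Regex.Replace(slug, @"[\s_-]+", "-");

        // Fourth step: lowercase, after trimming dashes left at either end.
        return slug.Trim('-').ToLowerInvariant();
    }
}
```

For instance, ToSlug("How to find the capital substring of a string?") returns how-to-find-the-capital-substring-of-a-string.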
How about this:
string FriendlyURLTitle(string pTitle)
{
pTitle = pTitle.Replace(" ", "-");
pTitle = HttpUtility.UrlEncode(pTitle);
return Regex.Replace(pTitle, "%[0-9A-Fa-f]{2}", ""); // drop any percent-encoded bytes
}
This is how I currently slug words:
public static string Slug(this string value)
{
if (value.HasValue())
{
var builder = new StringBuilder();
var slug = value.Trim().ToLowerInvariant();
foreach (var c in slug)
{
switch (c)
{
case ' ':
builder.Append("-");
break;
case '&':
builder.Append("and");
break;
default:
if ((c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') && c != '-')
{
builder.Append(c);
}
break;
}
}
return builder.ToString();
}
return string.Empty;
}
I use this one...
public static string ToUrlFriendlyString(this string value)
{
value = (value ?? "").Trim().ToLower();
var url = new StringBuilder();
foreach (char ch in value)
{
switch (ch)
{
case ' ':
url.Append('-');
break;
default:
url.Append(Regex.Replace(ch.ToString(), @"[^A-Za-z0-9'()\*\\+_~\:\/\?\-\.,;=#\[\]@!$&]", ""));
break;
}
}
return url.ToString();
}
This works for me
string output = Uri.UnescapeDataString(input);
