Related
I want to build a program that analyses sentences, and then for each character/number/symbol that appears in the words of the sentence, record which words the character appears in. (upper/lower case to be ignored, and duplicate entries of a character in a word are ignored).
So if I had the sentence "I wandered lonely as a cow".
After the first word, i'd have a data construct... i - 1; // because "I" occurred in the first word.
after the second word, my data construct would be... i - 1; w - 2; a - 2; n - 2; d - 2; e - 2; r - 2;
after the sixth word... i - 1; w - 2,6; a - 2,4,5; n - 2,3; d - 2; e - 2,3; r - 2; l - 3; o - 3,6; y - 3; s - 4; c - 6;
This is to be in c#. I've considered a 2d array, 26 (for the letters) x 20 (words in a sentence. The issue here is that my array is going to be sparse, and its also going to be hard work keeping track of which element is the next spare one against each letter. I'd want my array for the letter a to be [2,4,5] not [0,2,0,4,5] or [0,0,2,0,4,5], Its also complicated by wanting to cater for other symbols, so the 26 will get bigger quickly. The third of those arrays is the one that is "obvious" how to program, but is the least elegant solution.
static void Main(string[] args)
{
string[] sentence = new string[6] { "i", "wandered", "lonely", "as", "a", "cow" };
string alphabet = "abcdefghijklmnopqrstuvwxyz";
int[,] letterInWord= new int[26, 7];
for (int letterIndex = 0; letterIndex < alphabet.Length; letterIndex++)
{
for (int wordIndex = 0; wordIndex < sentence.Length; wordIndex++)
{
if(sentence[wordIndex].IndexOf(alphabet[letterIndex]) >= 0)
{
letterInWord[letterIndex, wordIndex+1] = wordIndex+1;
}
}
}
// then analyse or just print out (adding 1 to get counting base 1)
for (int letterIndex = 0; letterIndex < alphabet.Length; letterIndex++)
{
Console.Write(alphabet[letterIndex]+ " is in word(s) " );
for (int wordIndex = 1; wordIndex <= sentence.Length; wordIndex++)
{
if (letterInWord[letterIndex, wordIndex] > 0)
{
Console.Write(letterInWord[letterIndex, wordIndex] + " ");
}
}
Console.WriteLine();
}
}
So, that works, but I just don't like it.
Ideally i'd want a list for the sentence called sentenceList, and then for each letter i find (e.g. z), I'd look in sentenceList for a list called listForZ, and if i didnt find it, I'd create a new list called listForZ, add the word number to the List, and add listForZ into the sentenceList.
But that requires programmatically creating the name of the list from the variable I've just found in the word, and I've struggled to understand how that would work. I suppose I could use a factory method pattern which IS aware of all the listnames I could have and creates them appropriately, but again, that seems overkill for what I want.
Any suggested directions?
But that requires programmatically creating the name of the list from
the variable I've just found in the word, and I've struggled to
understand how that would work.
Using a Dictionary, you can associate a key with a value. In your case, the characters in the words are the keys, and the word positions where they occur are the values:
Dictionary<char, List<int>> occurrences = new Dictionary<char, List<int>>();
string sentence = "I wandered lonely as a cow";
string[] words = sentence.ToLower().Split(" ".ToCharArray());
for(int i = 0; i < words.Length; i++)
{
foreach(char c in words[i].ToCharArray().Distinct())
{
if (!occurrences.ContainsKey(c))
{
occurrences.Add(c, new List<int>());
}
occurrences[c].Add(i + 1);
}
}
foreach(KeyValuePair<char, List<int>> kvp in occurrences)
{
Console.WriteLine(kvp.Key.ToString() + " - " + String.Join(",", kvp.Value.ToArray()));
}
Output generated:
i - 1
w - 2,6
a - 2,4,5
n - 2,3
d - 2
e - 2,3
r - 2
l - 3
o - 3,6
y - 3
s - 4
c - 6
Using Regex :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication108
{
class Program
{
static void Main(string[] args)
{
string input = "I wandered lonely as a cow";
string pattern = #"(?'word'\w+)\s*";
string[] words = Regex.Matches(input, pattern).Cast<Match>().Select(x => x.Groups["word"].Value).ToArray();
var results = words
.Select(x => new { word = x, characters = x.ToCharArray().Select((y, i) => new { ch = y, index = i }).GroupBy(y => y.ch).Select(y => y.First()).ToList() }).ToList();
}
}
}
With a help of regular expressions (we have to match words) and Linq to query these words you can implement something like this:
string sentence = "I wandered lonely as a cow";
var result = string.Join("; ", Regex
.Matches(sentence, "[A-Za-z]+") // Word is a sequence of A..Z a..z letters
.OfType<Match>()
.Select((match, index) => new {
word = match.Value.ToLower(), // So we have word, e.g. "lonely"
index + 1 // and its index, e.g. "3"
})
.SelectMany(item => item.word.Select(c => new {
character = c, // for each character
wordNumber = item.index // we have a index of the word(s) where it appears
}))
.GroupBy(item => item.character, item => item.wordNumber) // grouping by character
.Select(chunk => $"{chunk.Key} - {string.Join(",", chunk.Distinct().OrderBy(n => n))}"));
// Let's have a look at the results
Console.Write(result);
Outcome:
i - 1; w - 2,6; a - 2,4,5; n - 2,3; d - 2; e - 2,3; r - 2; l - 3; o - 3,6; y - 3; s - 4; c - 6
I need assistance with Combinations with Repetition. Have searched all over the net and although I found a few examples I can't understand them completely. My goal is simple a function (CombinationsWithRepetiion) receives list with items (in this case integer values) and length (that represents how long each combination can be) and returns a list containing the result.
List<int> input = new List<int>() {1, 2, 3}
CombinationsWithRepetition(input, length);
result:
length = 1: 1, 2, 3
length = 2: 11,12,13,21,22,23,31,32,33
length = 3: 111,112 ....
I hope someone helps me and thank you in advance!
recursion
Ok,
here is the C# version - I walk you through it
static IEnumerable<String> CombinationsWithRepetition(IEnumerable<int> input, int length)
{
if (length <= 0)
yield return "";
else
{
foreach(var i in input)
foreach(var c in CombinationsWithRepetition(input, length-1))
yield return i.ToString() + c;
}
}
First you check the border-cases for the recursion (in this case if length <= 0) - in this case the answer is the empty string (btw: I choose to return strings as you did not say what your really needed - should be easy to change).
In any other case you look at each input i and recursivley take the next-smaller combinations and just plug em together (with String-concatination because I wanted strings).
I hope you understand the IEnumerable/yield stuff - if not say so in the comments please.
Here is a sample output:
foreach (var c in CombinationsWithRepetition(new int[]{1,2,3}, 3))
Console.WriteLine (c);
111
112
113
...
332
333
converting numbers
The following uses the idea I sketched in the comment below and has no problems with stack-overflow exceptions (recursion might for big lenght) - this too assumes strings as they are easier to work with (and I can do a simple PadLeft to simplify things)
static String Convert(string symbols, int number, int totalLen)
{
var result = "";
var len = symbols.Length;
var nullSym = symbols [0];
while (number > 0)
{
var index = number % len;
number = number / len;
result = symbols [index] + result;
}
return result.PadLeft (totalLen, nullSym);
}
static IEnumerable<String> CombinationsWithRepetition(string symbols, int len)
{
for (var i = 0; i < Math.Pow(symbols.Length,len); i++)
yield return Convert (symbols, i, len);
}
string[] items = {"1", "2", "3"};
var query = from i1 in items
from i2 in items
from i3 in items
select i1 + i2 + i3 ;
foreach(var result in query)
Console.WriteLine(result);
Console.ReadKey();
I'm trying to get all combinations in a string in c# with that idea in mind:
Given a string like foo I want to get a List<string> with the values:
f o o
fo o
foo
f oo
As you can see it's not as easy as get all the substring but get ALL the chars in a string separated by spaces.
I've tried doing something like:
List<string> result = new List<string>();
string text = "foo";
for (int i = 1; i < foo.Lenght; i++)
{
//I'm stucked --> everything I think is too stupid and I don't know how to procede or not fast enough. I'm really stuck.
}
EDIT:
There are some correct answers but it's clear that any of them won't do since the strings I am working with have between 55 and 85 chars each one so that means that the best function in the answers will give me something between 2^54 and 2^84 possible combinations and that is just a bit too much.
It is clear now that find all the combinations and afterwards do something with them won't do. I'll have to drop it.
First things first: if the string length is n, you get 2^n strings as output.
So, if you want to treat strings of length 70, you have a problem.
You can use a counter, enumerating from 0 to 2^n, and treat it as a bitwise mask: if the first bit is 1, you put a space between the first and the second char, if it's zero, you don't.
So, an unsigned long of length 64 is barely enough to treat strings of length 65.
An example implementation, without recursion (it's slightly more verbose than the other examples), but should be a lot faster than the other implementations for long inputs:
public IEnumerable<string> GetPartitionedStrings(string s)
{
if (s == null) yield break;
if (s == "")
{
yield return "";
yield break;
}
if (s.Length > 63) throw new ArgumentOutOfRangeException("String too long...");
var arr = s.ToCharArray();
for(ulong i = 0, maxI = 1UL << (s.Length - 1); i < maxI; i++)
{
yield return PutSpaces(arr, i);
}
}
public string PutSpaces(char[] arr, ulong spacesPositions)
{
var sb = new StringBuilder(arr.Length * 2);
sb.Append(arr[0]);
ulong l = 1;
for (int i = 1; i < arr.Length; i++, l <<= 1)
{
if ((spacesPositions & l) != 0UL) sb.Append(" ");
sb.Append(arr[i]);
}
return sb.ToString();
}
Probably you could get away with a bit field, but we are already in the billions of strings, so I would try to reformulate a bit the problem.
Here is one more recursive solution to consider:
private static IEnumerable<string> Permute(string target) {
if (target.Length <= 1) {
yield return target;
yield break;
}
var c = target[0];
foreach (var rest in Permute(target.Remove(0, 1))) {
yield return c + rest;
yield return c + " " + rest;
}
}
For your test string produces desired result. Basically we combine first char + either space or no space + the rest of the string (without first char) recursively.
To get a list, just do Permute("foo").ToList();
For "abcde" string the result is:
abcde
a bcde
ab cde
a b cde
abc de
a bc de
ab c de
a b c de
abcd e
a bcd e
ab cd e
a b cd e
abc d e
a bc d e
ab c d e
a b c d e
A number of answers have suggested recursive solutions, which is fine. But here's a sketch of a non-recursive solution.
Suppose your word has x letters where x is less than 64.
Compute long n = 2(x - 1)
Make a loop where i goes from 0 to n - 1
Decompose i into the (x-1) low bits.
Output the first letter.
If the first bit is set, output a space, otherwise no space.
Output the second letter.
If the second bit is set, output a space, otherwise no space.
And so on.
Can you implement the method given that sketch?
You can do it using recursion, starting with an empty string, you call recursively adding an space and without add it, and adding current character:
static IEnumerable<string> SplitString(string s, int max)
{
return SplitString(s, 0, max, max);
}
private static IEnumerable<string> SplitString(string s, int idx, int available, int maxLength)
{
if (idx == s.Length) yield return string.Empty;
else
{
if (available > 0)
foreach (var item in SplitString(s, idx + 1, available - 1, maxLength))
yield return s[idx] + item;
if (idx > 0)
foreach (var item in SplitString(s, idx + 1, maxLength - 1, maxLength))
yield return " " + s[idx] + item;
}
}
For an input like abcde, SplitString("abcde", 3) get this output:
abc de
abc d e
ab cde
ab cd e
ab c de
ab c d e
a bcd e
a bc de
a bc d e
a b cde
a b cd e
a b c de
a b c d e
You could try something recursive. You start with a string, hello.
For each character that's not a space, if it's not followed by a space, add one at that location in the string and run the function on that string. On the first iteration, you have:
h ello
he llo
hel lo
hell o
hello
Repeat until all characters are followed by spaces. This will create duplicates though.
What you could do is convert the string into a char array like this:
char characters[] = text.toCharArray()
Then in your for loop iterate through this array
for (int i = 1; i < foo.Lenght; i++)
{
System.out.println(characters[i]);
}
I have a list and I need to output each subset of the list
for example a b c d e
would output to
a
b
c
d
e
ab
ac
ad
ae
abc
abd
abe
bcd
bce
....
abcde
I believe the correct term is combination no element should be duplicated on the same line
I was going to attempt this with a series of loops but im not even sure wehre to start
any suggestions?
This will generate the set you want, but in a different order (I sort by alphabet at the end, you'd want to sort by length as well).
You'll end up with:
a
ab
abc
abcd
abcde
abce
...
d
de
e
So, every possible subset (aside from the empty string), while maintaining the order of the original list.
The idea is to add each element to to a growing list. With every new element, add it first, and then add it to all existing elements.
So, start with 'a'.
Go on to 'b'. Add it to the list. We now have {'a', 'b'}.
Add it to existing elements, so we have 'ab'. Now we have {'a', 'b', 'ab'}.
Then 'c', and add it to existing elements to get 'ac', 'bc', 'abc': {'a', 'b', 'ab', 'c', 'ac', 'bc', abc'}. So forth until we're done.
string set = "abcde";
// Init list
List<string> subsets = new List<string>();
// Loop over individual elements
for (int i = 1; i < set.Length; i++)
{
subsets.Add(set[i - 1].ToString());
List<string> newSubsets = new List<string>();
// Loop over existing subsets
for (int j = 0; j < subsets.Count; j++)
{
string newSubset = subsets[j] + set[i];
newSubsets.Add(newSubset);
}
subsets.AddRange(newSubsets);
}
// Add in the last element
subsets.Add(set[set.Length - 1].ToString());
subsets.Sort();
Console.WriteLine(string.Join(Environment.NewLine, subsets));
if all you need are combinations of the elements in your original list, you can transform the problem into the following: you have a bit array of size N, and you want to find all possible choices for the elements of the array. For example, if your original list is
a b c d e
then your array can be
0 1 0 0 0
which results in an output of
b
or the array can be
1 0 1 1 0
which returns
acd
this is a simple recursion problem that can be solved in an O(2^n) time
edit adding pseudo-code for recursion algorithm:
CreateResultSet(List<int> currentResult, int step)
{
if (the step number is greater than the length of the original list)
{
add currentResult to list of all results
return
}
else
{
add 0 at the end of currentResult
call CreateResultSet(currentResult, step+1)
add 1 at the end of currentResult
call CreateResultSet(currentResult, step+1)
}
}
for every item in the list of all results
display the result associated to it (i.e. from 1 0 1 1 0 display acd)
This will work with any collection. I modified #Sapp's answer a little
static List<List<T>> GetSubsets<T>(IEnumerable<T> Set)
{
var set = Set.ToList<T>();
// Init list
List<List<T>> subsets = new List<List<T>>();
subsets.Add(new List<T>()); // add the empty set
// Loop over individual elements
for (int i = 1; i < set.Count; i++)
{
subsets.Add(new List<T>(){set[i - 1]});
List<List<T>> newSubsets = new List<List<T>>();
// Loop over existing subsets
for (int j = 0; j < subsets.Count; j++)
{
var newSubset = new List<T>();
foreach(var temp in subsets[j])
newSubset.Add(temp);
newSubset.Add(set[i]);
newSubsets.Add(newSubset);
}
subsets.AddRange(newSubsets);
}
// Add in the last element
subsets.Add(new List<T>(){set[set.Count - 1]});
//subsets.Sort();
return subsets;
}
**And then if I have a set of strings I will use it as:
Here is some code I made. It constructs a list of all possible strings from an alphabet, of lengths 1 to maxLength: (in other words, we are calculating the powers of the alphabet)
static class StringBuilder<T>
{
public static List<List<T>> CreateStrings(List<T> alphabet, int maxLength)
{
// This will hold all the strings which we create
List<List<T>> strings = new List<List<T>>();
// This will hold the string which we created the previous time through
// the loop (they will all have length i in the loop)
List<List<T>> lastStrings = new List<List<T>>();
foreach (T t in alphabet)
{
// Populate it with the string of length 1 read directly from alphabet
lastStrings.Add(new List<T>(new T[] { t }));
}
// This holds the string we make by appending each element from the
// alphabet to the strings in lastStrings
List<List<T>> newStrings;
// Here we make string2 for each length 2 to maxLength
for (int i = 0; i < maxLength; ++i)
{
newStrings = new List<List<T>>();
foreach (List<T> s in lastStrings)
{
newStrings.AddRange(AppendElements(s, alphabet));
}
strings.AddRange(lastStrings);
lastStrings = newStrings;
}
return strings;
}
public static List<List<T>> AppendElements(List<T> list, List<T> alphabet)
{
// Here we just append an each element in the alphabet to the given list,
// creating a list of new string which are one element longer.
List<List<T>> newList = new List<List<T>>();
List<T> temp = new List<T>(list);
foreach (T t in alphabet)
{
// Append the element
temp.Add(t);
// Add our new string to the collection
newList.Add(new List<T>(temp));
// Remove the element so we can make another string using
// the next element of the alphabet
temp.RemoveAt(temp.Count-1);
}
return newList;
}
}
something on the lines of an extended while loop :
<?
$testarray[0] = "a";
$testarray[1] = "b";
$testarray[2] = "c";
$testarray[3] = "d";
$testarray[4] = "e";
$x=0;
$y = 0;
while($x<=4) {
$subsetOne[$x] .= $testarray[$y];
$subsetOne[$x] .= $testarray[$x];
$subsetTwo[$x] .= $testarray[$y];
$subsetTwo[$x] .= $subsetOne[$x];
$subsetThree[$x] = str_replace("aa","ab",$subsetTwo[$x]);
$x++;
}
?>
I know how to do this in an ugly way, but am wondering if there is a more elegant and succinct method.
I have a string array of e-mail addresses. Assume the string array is of arbitrary length -- it could have a few items or it could have a great many items. I want to build another string consisting of say, 50 email addresses from the string array, until the end of the array, and invoke a send operation after each 50, using the string of 50 addresses in the Send() method.
The question more generally is what's the cleanest/clearest way to do this kind of thing. I have a solution that's a legacy of my VBScript learnings, but I'm betting there's a better way in C#.
You want elegant and succinct, I'll give you elegant and succinct:
var fifties = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/50;
foreach(var fifty in fifties)
Send(string.Join(";", fifty.ToArray());
Why mess around with all that awful looping code when you don't have to? You want to group things by fifties, then group them by fifties.
That's what the group operator is for!
UPDATE: commenter MoreCoffee asks how this works. Let's suppose we wanted to group by threes, because that's easier to type.
var threes = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/3;
Let's suppose that there are nine addresses, indexed zero through eight
What does this query mean?
The Enumerable.Range is a range of nine numbers starting at zero, so 0, 1, 2, 3, 4, 5, 6, 7, 8.
Range variable index takes on each of these values in turn.
We then go over each corresponding addresses[index] and assign it to a group.
What group do we assign it to? To group index/3. Integer arithmetic rounds towards zero in C#, so indexes 0, 1 and 2 become 0 when divided by 3. Indexes 3, 4, 5 become 1 when divided by 3. Indexes 6, 7, 8 become 2.
So we assign addresses[0], addresses[1] and addresses[2] to group 0, addresses[3], addresses[4] and addresses[5] to group 1, and so on.
The result of the query is a sequence of three groups, and each group is a sequence of three items.
Does that make sense?
Remember also that the result of the query expression is a query which represents this operation. It does not perform the operation until the foreach loop executes.
Seems similar to this question: Split a collection into n parts with LINQ?
A modified version of Hasan Khan's answer there should do the trick:
public static IEnumerable<IEnumerable<T>> Chunk<T>(
this IEnumerable<T> list, int chunkSize)
{
int i = 0;
var chunks = from name in list
group name by i++ / chunkSize into part
select part.AsEnumerable();
return chunks;
}
Usage example:
var addresses = new[] { "a#example.com", "b#example.org", ...... };
foreach (var chunk in Chunk(addresses, 50))
{
SendEmail(chunk.ToArray(), "Buy V14gr4");
}
It sounds like the input consists of separate email address strings in a large array, not several email address in one string, right? And in the output, each batch is a single combined string.
string[] allAddresses = GetLongArrayOfAddresses();
const int batchSize = 50;
for (int n = 0; n < allAddresses.Length; n += batchSize)
{
string batch = string.Join(";", allAddresses, n,
Math.Min(batchSize, allAddresses.Length - n));
// use batch somehow
}
Assuming you are using .NET 3.5 and C# 3, something like this should work nicely:
string[] s = new string[] {"1", "2", "3", "4"....};
for (int i = 0; i < s.Count(); i = i + 50)
{
string s = string.Join(";", s.Skip(i).Take(50).ToArray());
DoSomething(s);
}
I would just loop through the array and using StringBuilder to create the list (I'm assuming it's separated by ; like you would for email). Just send when you hit mod 50 or the end.
void Foo(string[] addresses)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i < addresses.Length; i++)
{
sb.Append(addresses[i]);
if ((i + 1) % 50 == 0 || i == addresses.Length - 1)
{
Send(sb.ToString());
sb = new StringBuilder();
}
else
{
sb.Append("; ");
}
}
}
void Send(string addresses)
{
}
I think we need to have a little bit more context on what exactly this list looks like to give a definitive answer. For now I'm assuming that it's a semicolon delimeted list of email addresses. If so you can do the following to get a chunked up list.
public IEnumerable<string> DivideEmailList(string list) {
var last = 0;
var cur = list.IndexOf(';');
while ( cur >= 0 ) {
yield return list.SubString(last, cur-last);
last = cur + 1;
cur = list.IndexOf(';', last);
}
}
public IEnumerable<List<string>> ChunkEmails(string list) {
using ( var e = DivideEmailList(list).GetEnumerator() ) {
var list = new List<string>();
while ( e.MoveNext() ) {
list.Add(e.Current);
if ( list.Count == 50 ) {
yield return list;
list = new List<string>();
}
}
if ( list.Count != 0 ) {
yield return list;
}
}
}
I think this is simple and fast enough.The example below divides the long sentence into 15 parts,but you can pass batch size as parameter to make it dynamic.Here I simply divide using "/n".
private static string Concatenated(string longsentence)
{
const int batchSize = 15;
string concatanated = "";
int chanks = longsentence.Length / batchSize;
int currentIndex = 0;
while (chanks > 0)
{
var sub = longsentence.Substring(currentIndex, batchSize);
concatanated += sub + "/n";
chanks -= 1;
currentIndex += batchSize;
}
if (currentIndex < longsentence.Length)
{
int start = currentIndex;
var finalsub = longsentence.Substring(start);
concatanated += finalsub;
}
return concatanated;
}
This show result of split operation.
var parts = Concatenated(longsentence).Split(new string[] { "/n" }, StringSplitOptions.None);
Extensions methods based on Eric's answer:
public static IEnumerable<IEnumerable<T>> SplitIntoChunks<T>(this T[] source, int chunkSize)
{
var chunks = from index in Enumerable.Range(0, source.Length)
group source[index] by index / chunkSize;
return chunks;
}
public static T[][] SplitIntoArrayChunks<T>(this T[] source, int chunkSize)
{
var chunks = from index in Enumerable.Range(0, source.Length)
group source[index] by index / chunkSize;
return chunks.Select(e => e.ToArray()).ToArray();
}