How to go n levels up from a web address - C#

I have some URLs. How can I go up n levels in a web address, where n is a variable? For example, given http://www.example.com/the/multi/part/path/file.ext and n=3, the result should be http://www.example.com/the/multi.
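One possible reading of the example above is "drop the last n path segments". A minimal sketch in Python (not C#), using only the standard `urllib.parse` module, might look like this. The function name `go_up` is hypothetical, and the choice to discard the query string and fragment is an assumption, since the question does not say what should happen to them.

```python
from urllib.parse import urlsplit, urlunsplit


def go_up(url, n):
    """Return url with its last n path segments removed (assumed semantics)."""
    parts = urlsplit(url)
    # Ignore empty segments produced by leading, trailing or doubled slashes.
    segments = [s for s in parts.path.split("/") if s]
    kept = segments[:max(len(segments) - n, 0)]
    new_path = "/" + "/".join(kept) if kept else "/"
    # Assumption: query string and fragment are dropped when moving up.
    return urlunsplit((parts.scheme, parts.netloc, new_path, "", ""))


print(go_up("http://www.example.com/the/multi/part/path/file.ext", 3))
```

If n is larger than the number of segments, this sketch simply stops at the site root.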

Related

Find matching entries in list which are different

I have two lists. The first one contains entries like
RB Leipzig vs SV Darmstadt 98
Hertha Berlin vs Hoffenheim
..
and the second contains basically the same entries, but they could be written in different forms. For example:
Hertha BSC vs TSG Hoffenheim
RB Leipzig vs Darmstadt 98
..
and so on. Both lists represent the same sport games but they can use alternate team names and don't appear in the same order.
My goal (hehe pun) is to unify both lists to one and match the same entries and discard entries which don't appear in both lists.
I already tried to use the Levenshtein distance and fuzzy search.
I thought about using machine learning but have no idea how to start with that.
I would appreciate any help and ideas!
You can solve this problem using Linear Programming combined with the Levenshtein distance you already mentioned. Linear Programming is a commonly used technique for solving optimization problems like this one. Check this link for an example of how to use Solver Foundation in C#. The example isn't related to your specific problem, but it shows how the library works.
Hints:
You need to build a matrix of distances between each pair of teams/strings from the 2 lists. Let's say both lists have N elements. The i-th row of the matrix has N values, and the j-th value is the Levenshtein distance between the i-th element of the first list and the j-th element of the second list. You also need a matrix of coefficients of the same shape, where coef[i][j] is 1 if the i-th element of the first list is matched with the j-th element of the second list and 0 otherwise. Then you need to set the constraints on the coefficients:
The sum in each row equals 1
The sum in each column equals 1
Each coefficient (matrix entry) is either 0 or 1
The cost function is the sum `sum(coef[i][j] * dist[i][j] for i in [1, n] and for j in [1, n])`. You want to minimize this function, because you want the overall "distance" between the 2 sets after the mapping to be as low as possible.
I solved the same problem a couple of months ago and this approach worked perfectly for me.
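To make the formulation concrete, here is a small Python sketch of the same objective and constraints. It is not the Solver Foundation approach: instead of a linear programming solver it tries every one-to-one assignment by brute force, which only works for short lists of equal length. The helper names `levenshtein` and `best_assignment` are hypothetical.

```python
from itertools import permutations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def best_assignment(list1, list2):
    """Return the permutation p minimising sum(dist[i][p[i]]).

    A permutation encodes exactly the constraints from the text: one
    chosen entry per row and per column, each coefficient 0 or 1.
    Brute force is O(n!), so this is for illustration only.
    """
    n = len(list1)
    dist = [[levenshtein(a, b) for b in list2] for a in list1]
    return min(permutations(range(n)),
               key=lambda p: sum(dist[i][p[i]] for i in range(n)))


first = ["RB Leipzig vs SV Darmstadt 98", "Hertha Berlin vs Hoffenheim"]
second = ["Hertha BSC vs TSG Hoffenheim", "RB Leipzig vs Darmstadt 98"]
for i, j in enumerate(best_assignment(first, second)):
    print(first[i], "<->", second[j])
```

For realistic list sizes, a proper assignment or LP solver would replace the `permutations` loop; the distance matrix and the cost function stay the same.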
You can use a BK-tree (I googled C# implementations and found two: 1, 2). Use the Levenshtein distance as the metric. Optionally, delete the all-uppercase substrings from the names in the lists in order to improve the metric (just be careful that this doesn't accidentally leave you with empty strings for names).
1. Put the names from the first list in the BK-tree
2. Look up the names from the second list in the BK-tree
   a. Assign an integer token to the name pair, stored in a Map<Integer, Tuple<String, String>>
   b. Replace each team name with the token
3. Sort each token pair (so [8 vs 4] becomes [4 vs 8])
4. Sort each list by its first token in the token pair, then by the second token in the token pair (so the list would look like [[1 vs 2], [1 vs 4], [2 vs 4]])
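Steps 1 and 2 could be sketched in Python roughly as follows. This is a minimal hand-rolled BK-tree, not one of the C# implementations mentioned above; the class name `BKTree`, its methods and the `levenshtein` helper are all hypothetical, and the tolerance passed to `search` is something you would have to tune on your own data.

```python
def levenshtein(a, b):
    """Edit distance, used as the BK-tree metric."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class BKTree:
    """Each node is a pair (word, {distance_to_child: child_node})."""

    def __init__(self, distance):
        self.distance = distance
        self.root = None

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = self.distance(word, node[0])
            if d == 0:
                return  # word is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, word, max_dist):
        """Return sorted (distance, stored_word) pairs within max_dist."""
        matches = []
        stack = [self.root] if self.root is not None else []
        while stack:
            candidate, children = stack.pop()
            d = self.distance(word, candidate)
            if d <= max_dist:
                matches.append((d, candidate))
            # Triangle inequality: only these subtrees can hold matches.
            for k, child in children.items():
                if d - max_dist <= k <= d + max_dist:
                    stack.append(child)
        return sorted(matches)


tree = BKTree(levenshtein)
for name in ["RB Leipzig", "SV Darmstadt 98", "Hertha Berlin", "Hoffenheim"]:
    tree.add(name)
print(tree.search("Darmstadt 98", 3))
```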
Now you just iterate through the two lists:
int i1 = 0;
int i2 = 0;
while (i1 < list1.length && i2 < list2.length) {
    if (list1[i1].first == list2[i2].first && list1[i1].second == list2[i2].second) {
        // match: the same game appears in both lists
        i1++;
        i2++;
    } else if (list1[i1].first < list2[i2].first) {
        i1++;
    } else if (list1[i1].first > list2[i2].first) {
        i2++;
    } else if (list1[i1].second < list2[i2].second) {
        // first tokens are equal, so compare the second tokens
        i1++;
    } else {
        i2++;
    }
}
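The same merge-style walk can be written compactly in Python, where tuples already compare by first element and then by second. This is only a sketch of steps 3 and 4 plus the loop above; the function name `matching_games` is hypothetical, and the input is assumed to be lists of (token, token) pairs produced by the earlier steps.

```python
def matching_games(games1, games2):
    """Return the token pairs that occur in both lists."""
    # Step 3: sort inside each pair; step 4: sort the list of pairs.
    a = sorted(tuple(sorted(game)) for game in games1)
    b = sorted(tuple(sorted(game)) for game in games2)
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            matches.append(a[i])  # same game in both lists
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] cannot appear later in b
        else:
            j += 1  # b[j] cannot appear later in a
    return matches


print(matching_games([(8, 4), (1, 2), (2, 4)], [(4, 2), (4, 8), (1, 3)]))
```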

How to equally distribute files to users based on file sizes?

I have N files with various file sizes and also M users.
What I want to do is to use an algorithm in C#, C++ or pseudocode that will equally distribute the files to the users.
If file sizes were not in the game it would be something like N/M files per user. So, I could randomly select N/M files for each user (maybe some users could not take part if M > N and no more files were left). But, now I have the file sizes in the game and I want to auto assign the files to users with file sizes in mind.
A file can be related only with one user. So, when a file is related with a user it cannot be used again.
A user can be related with many files.
If there are fewer files than users (N < M), some users may not take part at all.
All of the cases N < M, N > M and N = M are possible, and the algorithm should distribute the files equally in each of them.
If anyone can help me, I would appreciate it.
Thank you.
If this is homework, it's a stinker!
It's the optimization version of the partition problem, and it's NP-hard (i.e., you're not going to be able to solve it efficiently) even when you have only two users.
There is a greedy algorithm which gives a decent approximation to the optimal arrangement, and does it in O(n log n) time. That is what I would go with if I were you, unless you have a very clear need for perfect optimality. This is the pseudocode, taken from the Wikipedia page I linked to above. It is for two sets (i.e., M=2), but easily generalises. The basic idea is that at each stage, you assign the current file to the user who has the smallest total.
INPUT: A list of integers S
OUTPUT: An attempt at a partition of S into two sets of equal sum
function find_partition(S):
    A ← {}
    B ← {}
    sort S in descending order
    for i in S:
        if sum(A) <= sum(B)
            add element i to set A
        else
            add element i to set B
    return {A, B}
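The generalisation to M users can be sketched in Python as below: keep the users in a min-heap keyed by their current total, and always give the next (largest remaining) file to the user at the top of the heap. The function name `distribute` is hypothetical, and files are represented only by their sizes to keep the sketch short.

```python
import heapq


def distribute(file_sizes, num_users):
    """Greedy approximation: biggest file first, to the least-loaded user."""
    assignments = [[] for _ in range(num_users)]
    # Heap entries are (current_total, user_index); ties go to the lower index.
    heap = [(0, user) for user in range(num_users)]
    heapq.heapify(heap)
    for size in sorted(file_sizes, reverse=True):
        total, user = heapq.heappop(heap)
        assignments[user].append(size)
        heapq.heappush(heap, (total + size, user))
    return assignments


print(distribute([8, 7, 6, 5, 4], 2))
```

When there are more users than files, the users left over simply end up with empty lists.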
Perfect optimality is certainly achievable in principle, but there are two issues to think about.
If nothing else, you could try every possible assignment of files to users. That would be very inefficient, but the problem is known to be NP-hard, which means no polynomial-time exact algorithm is known, so whatever exact method you use is likely to have an exponential running time in the worst case.
It's not absolutely clear what optimal means in a case with more than two users. (It's clear for two, which is why the partition problem is expressed in terms of two.) For instance, suppose you have eight users. Which is the better allocation: [8,4,4,4,4,4,4,0] or [5,5,5,5,3,3,3,3]? You need some well-defined metric that determines the "badness" of an allocation before you can try to minimise it.
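To illustrate why the metric matters, here is a short Python sketch that scores the two allocations above under two candidate "badness" measures. Both measures and their function names are assumptions chosen for illustration; the answer does not prescribe either of them.

```python
def max_load(allocation):
    """Badness = the largest total given to any single user."""
    return max(allocation)


def total_deviation(allocation):
    """Badness = sum of absolute deviations from the mean load."""
    mean = sum(allocation) / len(allocation)
    return sum(abs(x - mean) for x in allocation)


first = [8, 4, 4, 4, 4, 4, 4, 0]
second = [5, 5, 5, 5, 3, 3, 3, 3]
for name, metric in [("max load", max_load), ("total deviation", total_deviation)]:
    print(name, metric(first), metric(second))
```

Under the maximum-load measure the second allocation is better, while under total absolute deviation the two are tied, so the "optimal" answer depends on which measure you pick.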

Java / C# - Array[][] complexity task [duplicate]

This question already has answers here:
Algorithm: how to find a column in matrix filled with all 1, time complexity O(n)?
(5 answers)
Closed 9 years ago.
I'm dealing with a tricky complexity question from my university:
Program input: An n x n Array[][] that is filled with either 0 or 1.
DEFINITION: Define k as a SINK if all the values in row k are 0 and all the values in column k are 1 (except [k][k] itself, which needs to be 0).
Program output: Is there a number k that is a SINK? If so, return k, else return -1.
Example:
On Arr A, k=3 is a SINK; on Arr B there is no SINK, so -1 is returned.
The main problem with this task is that the complexity of the program must be below O(n^2). I have managed to solve it in O(n^2) by going over the diagonal and summing the rows and columns, but I haven't found a way to solve it in O(log n) or O(n). The task also forbids using another Array[] (due to memory complexity). Can anyone shed any light on the matter? Thanks in advance!
Just to make explicit the answer harold links to in the OP's comments: start yourself off with a list of all n indices, S = {0, 1, .., n-1}. These are our candidates for sinks. At each step, we're going to eliminate one of them.
1. Consider the first two elements of S, say i and j.
2. Check whether A[i, j] is 1.
   a. If it is, remove i from S (because the i-th row isn't all 0s, so i can't be our sink)
   b. If it isn't, remove j from S (because the j-th column isn't all 1s, so j can't be our sink)
3. If there are still two or more elements in S, go back to Step 1.
4. When we get to the last element, say k, check whether the k-th row is all zeros and the k-th column (other than A[k,k]) is all ones.
   a. If they are, k is a sink and you can return it.
   b. If they aren't, the matrix does not have a sink and you can return -1.
There are n elements in S to begin with, each step eliminates one of them and each step takes constant time, so it's O(n) overall.
You mention you don't want to use a second array. If that really is strict, you can just use two integers instead, one representing the "survivor" from the last step and one representing how far into the sequence 0, 1, .., n-1 you are.
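The two-integer variant could be sketched in Python like this (0-based indices, so the candidates are 0 .. n-1). The function name `find_sink` is hypothetical; `candidate` plays the role of the "survivor" and the loop variable `j` tracks how far into the sequence we are.

```python
def find_sink(a):
    """Return the index of the sink of square 0/1 matrix a, or -1."""
    n = len(a)
    if n == 0:
        return -1
    candidate = 0
    for j in range(1, n):
        if a[candidate][j] == 1:
            # Row `candidate` is not all zeros, so it cannot be the sink.
            candidate = j
        # Otherwise column j is not all ones, so j is eliminated.
    # Final O(n) verification of the single surviving candidate.
    for i in range(n):
        if a[candidate][i] != 0:
            return -1
        if i != candidate and a[i][candidate] != 1:
            return -1
    return candidate


with_sink = [[0, 1, 1],
             [0, 0, 1],
             [0, 0, 0]]
without_sink = [[0, 1],
                [1, 0]]
print(find_sink(with_sink), find_sink(without_sink))
```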
I've never seen this algorithm before and I'm quite impressed with its simplicity. Cheers.

Find the number of pages in a Word document Section using Interop

I'm trying to find the number of pages in a Section of a Word document using Interop in C#.
The main goal is really to find out whether a header is visible or not. (E.g. a document is only 1 page and DifferentFirstPageHeaderFooter is enabled, so the wdHeaderFooterPrimary exists but is technically not shown, because there's only 1 page and not 2 or more.) So if you can find a different way to figure this out, I'm fine with that too.
Currently, WdInformation.wdActiveEndPageNumber works if there is only 1 section in the document, but if there are 2 and I'm processing the second section, wdActiveEndPageNumber gives me the total number of pages including section 1.
var section = headerFooter.Parent as Section;
int numOfPages = section.Range.Information[WdInformation.wdActiveEndPageNumber];
I don't have the C# for this, but using VBA syntax what you need for "section n" is
a. if n = 1 then you look at
    theDocument.sections[1].Range.Information[WdInformation.wdActiveEndPageNumber]
b. if n > 1 then you establish that section n exists, then look at
    theDocument.sections[n].Range.Information[WdInformation.wdActiveEndPageNumber] -
    theDocument.sections[n-1].Range.Information[WdInformation.wdActiveEndPageNumber]
Note that case (b) can return 0 if you have a continuous section break on the last page of section n. I don't know what that would mean in terms of the headers that you would have, but I'd hope it would mean you just had the first page header.
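The arithmetic itself, separated from Interop, can be sketched in Python as follows. The list `end_pages` is a hypothetical stand-in for the wdActiveEndPageNumber values read from each section's Range in order; nothing here calls the Word object model.

```python
def pages_in_section(end_pages, n):
    """Page count of section n (1-based), per the difference rule above.

    end_pages[k - 1] is assumed to hold the end page number of section k.
    """
    if not 1 <= n <= len(end_pages):
        raise IndexError("section %d does not exist" % n)
    if n == 1:
        return end_pages[0]
    return end_pages[n - 1] - end_pages[n - 2]


# Hypothetical document: section 1 ends on page 1, sections 2 and 3 on page 4.
end_pages = [1, 4, 4]
print([pages_in_section(end_pages, n) for n in (1, 2, 3)])
```

The third section in this made-up example yields 0, which corresponds to the continuous-section-break case mentioned in (b).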

How can I apply Fisher-Yates-Knuth to Shuffling with restrictions? Or is there any other efficient approach?

For example, I would like to shuffle 4 decks of cards and make sure that:
Any 4 consecutive cards won't come from the same deck.
Surely I can do the shuffling first and then filter out bad permutations, but if the restrictions are strong (e.g. any 2 consecutive cards won't come from the same deck), there will be too many failures.
If I don't mind the result being slightly biased (of course, the less bias the better), what should I do?
Edit: Clarification
Yes, I want to pick as uniformly as possible from all full shuffles that satisfy this additional criterion.
I would proceed as follows:
First, shuffle each of the 4 decks (using the FYK algorithm).
Then generate 4 partitions, one per deck (I define "partition" by example below), of the 52 cards in a deck, with the constraint that no element of a partition is greater than 3.
For example:
(1,3,0,3,2,0,1) would be a partition of 10 with this constraint
(1,1,1,1,1,1,1,1,1,1) would be a partition of 10 with this constraint too
Then mix the 4 decks based on these partitions.
For example, if you have:
(3,2,1)
(2,2,2)
you take the first 3 cards of deck 1, then 2 of deck 2, then 2 of deck 1, then 2 of deck 2, then 1 of deck 1, then 2 of deck 2.
Not all partitions are valid, so you need to add one more constraint.
For example, with this method:
(1,2,1,1,1,1,1,1)
(3,3,3)
will end up with more than 3 consecutive cards from deck 1 at the end.
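The mixing step and the problem with the second pair of partitions can be checked with a short Python sketch. The function names `mix` and `longest_run` are hypothetical, and shorter partitions are assumed to be padded with zeros so that the longer ones can finish.

```python
from itertools import groupby, zip_longest


def mix(decks, partitions):
    """Interleave decks: in round r, take partitions[d][r] cards from deck d."""
    positions = [0] * len(decks)
    result = []
    for counts in zip_longest(*partitions, fillvalue=0):
        for d, count in enumerate(counts):
            start = positions[d]
            result.extend(decks[d][start:start + count])
            positions[d] = start + count
    return result


def longest_run(labels):
    """Length of the longest block of equal consecutive labels."""
    return max((len(list(group)) for _, group in groupby(labels)), default=0)


# Cards are represented only by the deck they come from.
good = mix([["A"] * 6, ["B"] * 6], [(3, 2, 1), (2, 2, 2)])
bad = mix([["A"] * 9, ["B"] * 9], [(1, 2, 1, 1, 1, 1, 1, 1), (3, 3, 3)])
print("".join(good), longest_run(good))
print("".join(bad), longest_run(bad))
```

The first pair keeps every run at 3 or fewer, while the second leaves a longer run of deck 1 cards once deck 2 is exhausted.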
So the last partition must satisfy an extra constraint. I wrote a little Python program to generate these partitions.
from random import randint, choice

def randomPartition(maxlength=float('inf'), N=52):
    '''
    maxlength is the last constraint in my explanation:
    you don't want sum(l[maxlength:len(l)]) to be more than 3.
    In that case we randomly add +1 to elements of the list that are less than 3.
    N is the number of elements in the partition.
    '''
    l = []  # the resulting partition
    while sum(l) < N:
        if len(l) > maxlength and sum(l[maxlength:len(l)]) >= 3:  # the tail is getting too long
            while sum(l) < N:
                # indices of the elements to which you can still add one
                availableRange = [i for i in range(len(l)) if l[i] < 3]
                if not availableRange:  # nothing left to increase: no solution
                    print("NO SOLUTION")
                    break
                temp = choice(availableRange)  # randomly pick one of them
                l[temp] += 1
            break
        else:
            # if everything goes well, just append the next element until the sum reaches N
            temp = randint(0, min(N - sum(l), 3))
            l.append(temp)
    return l
Now you can generate the 4 partitions and mix the decks according to them:
def generate4Partitions():
    l1 = randomPartition()
    l2 = randomPartition()
    l3 = randomPartition()
    m = max(len(l1), len(l2), len(l3))
    l4 = randomPartition(m)
    return l1, l2, l3, l4
These 4 partitions should satisfy the constraint described above, based on how they are defined.
Note:
There might be more cases where the result is not admissible, for example:
(3,0,0,3)
(0,3,3,0)
I guess this code needs to be modified a bit to take more constraints into account.
But it should be easy: just delete the unwanted zeros, like this:
(3,0,3)
(0,3,3,0)
Hope it's understandable and that it helps.
