I want to parse some websites and get a list of all pages on the current domain, like:
sample.com/
sample.com/page1/
sample.com/page2.html
But I can't find any samples showing how to build this sitemap or tree using C# and ASP.NET.
I found only one example:
http://www.codeproject.com/Articles/13486/A-Simple-Crawler-Using-C-Sockets
But I can't understand how the author uses this part:
if(Directory.Exists(strUri) == true)
{
//some code
DirectoryInfo dir = new DirectoryInfo(folderName);
FileInfo[] fia = dir.GetFiles("*.txt");
}
When I use this code, the condition in the if is always false. When I use only the GetFiles function, I get an exception:
URI formats are not supported
Does anyone have any ideas?
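Remember that in a web environment you can't read files that way: Directory and DirectoryInfo accept file system paths, not URLs, which is why you get "URI formats are not supported". You need to use
Server.MapPath(url)
to get the physical path of the files; then you can run the loop you are using.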
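Server.MapPath only resolves virtual paths inside your own application, so for a remote site the pages have to be discovered by downloading HTML and following links. Below is a minimal sketch of the link-collecting step only. The class and method names (LinkExtractor, SameDomainLinks) are made up for illustration, and the regex is a simplification; a real HTML parser would be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Captures the href value of <a> tags, stopping at a quote or a '#' fragment.
    private static readonly Regex HrefPattern = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"'#]+)[\"'#]",
        RegexOptions.IgnoreCase);

    // Returns the absolute URLs (without query string) of all links in `html`
    // that point to the same host as `pageUri`.
    public static SortedSet<string> SameDomainLinks(string html, Uri pageUri)
    {
        var links = new SortedSet<string>(StringComparer.Ordinal);
        foreach (Match m in HrefPattern.Matches(html))
        {
            // Resolve relative links such as "/page1/" against the page address.
            if (Uri.TryCreate(pageUri, m.Groups[1].Value, out Uri link)
                && (link.Scheme == Uri.UriSchemeHttp || link.Scheme == Uri.UriSchemeHttps)
                && link.Host == pageUri.Host)
            {
                links.Add(link.GetLeftPart(UriPartial.Path));
            }
        }
        return links;
    }
}
```

A crawler would download each page (for example with HttpClient), pass the HTML to SameDomainLinks, and repeat for every link it has not visited yet.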
Good afternoon, I am using the Microsoft.SharePointOnline.CSOM library to work with SharePoint. I have code where I get a list called MyDocs like this:
using (var context = new ClientContext(url))
{
    var web = context.Web;
    var list = web.Lists.GetByTitle("MyDocs");
}
Then I iterate through all the folders to find a folder with a suitable name and get the files from there. Using file.ServerRelativeUrl, I found the link to the file on SharePoint:
/MyDocs/Documents/Students/Homework/1lesson.pdf
How can I access the Homework folder directly and download all the files from there, without going through every folder in the MyDocs list?
Instead of getting the list, use the GetFolderByServerRelativeUrl method on the context.Web object:
var folder = context.Web.GetFolderByServerRelativeUrl("/MyList/MyFolder");
context.Load(folder);
context.ExecuteQuery();
I am wondering how to remove the version number from a file path in a Windows Form Application.
Currently I want to save some of the user's application data to an .xml file located in the roaming user profile.
To do this I use:
get
{
return Application.UserAppDataPath + "\\FileName.xml";
}
However this returns the following string:
C:\Users\user\AppData\Roaming\folder\subfolder\1.0.0.0\FileName.xml
and I was wondering if there is a non-hack way to remove the version number from the file path so the file path looks like this:
C:\Users\user\AppData\Roaming\folder\subfolder\FileName.xml
Besides parsing the string looking for the last "\", I do not know what to do.
Thanks
Use the Directory.GetParent method for this purpose:
get
{
var dir = Directory.GetParent(Application.UserAppDataPath);
return Path.Combine(dir.FullName, "FileName.xml");
}
Also note that I've used Path.Combine instead of concatenating the paths; it takes care of directory separators for you and avoids a lot of problems. Never concatenate strings to build a path.
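The same idea can be wrapped in a small helper that does not depend on Windows Forms, which makes it easier to try out. This is only a sketch: the names AppDataPaths and FileBesideVersionFolder are made up, and the userAppDataPath parameter stands in for Application.UserAppDataPath.

```csharp
using System;
using System.IO;

public static class AppDataPaths
{
    // Returns the path of `fileName` placed next to the version folder,
    // i.e. in the parent directory of `userAppDataPath`.
    public static string FileBesideVersionFolder(string userAppDataPath, string fileName)
    {
        // A trailing separator would make GetParent return the same folder.
        string trimmed = userAppDataPath.TrimEnd(
            Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        DirectoryInfo parent = Directory.GetParent(trimmed);
        if (parent == null)
        {
            throw new ArgumentException("Path has no parent directory.", nameof(userAppDataPath));
        }
        return Path.Combine(parent.FullName, fileName);
    }
}
```

In the property from the question this would be called as AppDataPaths.FileBesideVersionFolder(Application.UserAppDataPath, "FileName.xml"). Note that it removes the last path segment whatever it is, so it assumes the version number really is the last folder.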
I'm trying to write a function in C# that gets a directory path as parameter and returns a dictionary where the keys are the files directly under that directory and the values are their last modification time.
This is easy to do with Directory.GetFiles() and then File.GetLastWriteTime(). However, this means that every file must be accessed, which is too slow for my needs.
Is there a way to do this while accessing just the directory? Does the file system even support this kind of requirement?
Edit, after reading some answers:
Thank you guys, you are all saying pretty much the same thing: use the FileInfo object. Still, it is just as slow to use Directory.GetFiles() (or Directory.EnumerateFiles()) to get those objects, and I suspect that getting them requires access to every file. If the file system keeps the last modification time of its files only in the files themselves, there can't be a way to extract that info without accessing the files. Is this the case here? Do GetFiles() and EnumerateFiles() of DirectoryInfo access every file, or do they get their info from the directory entry? I know that if I wanted just the file names, I could get them with the Directory class without accessing every file. But getting attributes seems trickier...
Edit, following Henk's response:
It seems that it really is faster to use the FileInfo object. I created the following test:
static void Main(string[] args)
{
    Console.WriteLine(DateTime.Now);
    foreach (string file in Directory.GetFiles(@"\\169.254.78.161\dir"))
    {
        DateTime x = File.GetLastWriteTime(file);
    }
    Console.WriteLine(DateTime.Now);
    DirectoryInfo dirInfo2 = new DirectoryInfo(@"\\169.254.78.161\dir");
    var files2 = from f in dirInfo2.EnumerateFiles()
                 select f;
    foreach (FileInfo file in files2)
    {
        DateTime x = file.LastWriteTime;
    }
    Console.WriteLine(DateTime.Now);
}
For about 800 files, I usually get something like:
31/08/2011 17:14:48
31/08/2011 17:14:51
31/08/2011 17:14:52
I didn't do any timings, but your best bet is:
DirectoryInfo di = new DirectoryInfo(myPath);
FileInfo[] files = di.GetFiles();
I think all the FileInfo attributes are available in the directory's file records, so this should (could) require the minimum I/O.
The only other thing I can think of is using the FileInfo class. As far as I can see, this might help you, or it might read the file as well (read permissions are required).
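For the dictionary the question asks for, a minimal sketch along these lines might look as follows. The names DirectoryTimes and LastWriteTimes are made up for illustration. Whether the timestamps really come from the directory listing without touching each file depends on the platform and the file system, so treat the performance benefit as an assumption to measure.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DirectoryTimes
{
    // Maps the name of every file directly under `directoryPath`
    // to its last modification time (UTC).
    public static Dictionary<string, DateTime> LastWriteTimes(string directoryPath)
    {
        var result = new Dictionary<string, DateTime>();
        // EnumerateFiles yields FileInfo objects lazily; subdirectories are not searched.
        foreach (FileInfo file in new DirectoryInfo(directoryPath).EnumerateFiles())
        {
            result[file.Name] = file.LastWriteTimeUtc;
        }
        return result;
    }
}
```

When comparing this against the File.GetLastWriteTime loop, System.Diagnostics.Stopwatch gives finer-grained timings than printing DateTime.Now.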
(1)
I can list the files in a folder this way:
var parameters = new Dictionary<GetListParameters, string>();
parameters.Add(GetListParameters.Path, "folder1/"); // get items from this specific path
var containerItemList = connection.GetContainerItemList(Settings.ContainerName, parameters);
However, this:
parameters.Add(GetListParameters.Path, "/");
or this:
parameters.Add(GetListParameters.Path, "");
does not work.
How can I query the files in the root folder?
(2)
The code above returns the list of files in a folder.
How can I get the list of folders within a folder? Is there any parameter I can set to get this list?
Note: I know that this is a 'flat' file system, similar to Amazon S3. However, both (cloudfiles and S3) provide a way to work with 'folders'. In S3 it is easy. In cloudfiles (with the .NET API) I could not find out how to do this.
Any hint will be highly appreciated.
This has just been fixed with the latest push, which closes issue #51 on GitHub.
Link to downloadable package
Hope this helps.
I am using Visual Studio and C# to parse an XML document for file locations returned by a local search tool. Specifically, I am using C# to check whether the user has access to certain files and to hide those the user cannot access. Some files should report that access is allowed, but because not all files are local (i.e. some are web files without proper file names), they are reported as inaccessible. The current error is caused by a URL ending in .aspx?i=573. Is there a workaround, or will I have to remove all of these files?
Edit: more info.
Right now I am using:
foreach (XmlNode xn in nodeList)
{
    string url = xn.InnerText;
    //Label1.Text = url;
    try
    {
        using (FileStream fs = File.OpenRead(url)) { }
    }
    catch { i++; Label2.Text = i.ToString(); Label1.Text = url; }
}
The issue is that when it attempts to open files like ....aspx?i=573, they end up in the catch block. If I open the file myself, however, it opens just fine (i.e. I have read access, but because of either the file type or the '?=' part appended to the file name, it is counted as unreadable).
I want everything that is readable, either via URL or via local access, to be displayed; the catch should then collect only the files that really are inaccessible.
I'm not sure exactly what you are trying to do, but if you only want the path of a URI, you can easily drop the query string portion like this:
Uri baseUri = new Uri("http://www.domain.com/");
Uri myUri = new Uri(baseUri, "home/default.aspx?i=573");
Console.WriteLine(myUri.AbsolutePath); // i.e. "/home/default.aspx"
You cannot have ? in file names on Windows, but it is valid in URIs (that is why IE can open it, but Windows cannot).
Alternatively, you could just replace the '?' with some other character if you are converting a URL to a file name.
In fact, thinking about it now, you could just check whether your "document" is a URI or not, and if it isn't, try to open the file on the file system. It sounds like you are trying to open anything and everything that is supplied, but it wouldn't hurt to perform some checks on the data.
private static bool IsLocalPath(string p)
{
return new Uri(p).IsFile;
}
This is from "Check if the path input is URL or Local File"; it looks like exactly what you are looking for.
FileStream reads and writes local files. "?" is not a valid character in a local file name.
It looks like you want to open both local and remote files. If that is what you are trying to do, you should use the appropriate download method for each type, e.g. for HTTP use WebRequest or related classes.
Note: it would be much easier to answer if you said: when url is "...", File.OpenRead(url) fails with an exception, message "...".
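Putting the suggestions above together, a readability check could first decide whether the string is a web address and only fall back to File.OpenRead otherwise. The following is a rough sketch under some assumptions: the names DocumentOpener, IsWebAddress and CanRead are invented, HttpClient is used in place of WebRequest, and "readable" for a URL is taken to mean that the server answers with a success status.

```csharp
using System;
using System.IO;
using System.Net.Http;

public static class DocumentOpener
{
    private static readonly HttpClient Http = new HttpClient();

    // True for absolute http/https addresses; local and relative paths return false.
    public static bool IsWebAddress(string location)
    {
        return Uri.TryCreate(location, UriKind.Absolute, out Uri uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Tries to open the location with the method that fits its type.
    public static bool CanRead(string location)
    {
        try
        {
            if (IsWebAddress(location))
            {
                // Only the response headers are needed to know whether the request succeeded.
                using (HttpResponseMessage response = Http
                    .GetAsync(location, HttpCompletionOption.ResponseHeadersRead)
                    .GetAwaiter().GetResult())
                {
                    return response.IsSuccessStatusCode;
                }
            }
            using (FileStream fs = File.OpenRead(location))
            {
                return true;
            }
        }
        catch (Exception ex) when (ex is IOException || ex is UnauthorizedAccessException
            || ex is HttpRequestException || ex is OperationCanceledException
            || ex is ArgumentException || ex is NotSupportedException)
        {
            return false;
        }
    }
}
```

In the loop from the question, the body of the try block would become a call to DocumentOpener.CanRead(url), with the counter incremented when it returns false. Pages that require the user's Windows credentials would need an HttpClient configured accordingly, which this sketch leaves out.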