Lucene.net folder search - c#

I am new to Lucene.NET. I want to search the contents of a folder that may contain files of any type (.txt, .xls, .pdf, .exe, .ppt, .doc, ...).
When I search for a term, I want to list the path of each matching file together with the matched content inside the file, with the match highlighted.
Any sample code would be appreciated.
Note: I want to use the results in a C# class library.

I haven't used it myself, but you should look into Solr. As far as I know, you cannot host it inside .NET (Solr is a Java application), but you can connect to it from .NET using SolrSharp.

Related

How to upload a folder including subfolders? [duplicate]

I need something like the FileUpload control in asp.net that will allow the user to browse for a folder and enter a file name of a new file to upload.
From what I've seen FileUpload requires a file to be selected. It seems that html input type="file" has the same requirement.
Thanks!
Selecting an entire folder is not possible with the FileUpload control, as it is meant for a single file. You can, however, allow multiple file selection; see "Multiple File Upload User Control".
.NET has a built-in FtpWebRequest class with which you can create folders, upload files, delete files, etc.
If you want to upload folders from a web page, you cannot use this class in the browser, so you will have to use a rich client such as a Java applet, Flash, or a similar plugin.
If you can provide the users with a Windows or Mac client, you can use C# (either .NET or Mono) for the FTP transfer.
ZIP files aren't a problem for ASP.NET or C#, but you still upload only one file (the zip archive), and it is then up to the server to unzip it, e.g. using C#. Look at 7-Zip, which is open source; you might get some ideas from it too.
You could also try the built-in compression library:
http://www.eggheadcafe.com/community/csharp/2/10050636/how-to-compress-and-decompress-file-in-c.aspx
or try this link...
http://www.aurigma.com/docs/iu7/uploading-folders-in-aspnet.htm
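The "upload one zip, unzip on the server" approach from the answer above can be sketched as follows. This is a minimal sketch, assuming .NET Framework 4.5 or later (with a reference to System.IO.Compression.FileSystem) or modern .NET. The class name `UploadedArchive` and method `ExtractTo` are hypothetical names chosen for illustration; only `ZipFile`, `ZipArchive`, and `ExtractToFile` are real framework APIs.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: unpacks an uploaded zip archive on the server.
public static class UploadedArchive
{
    // Extracts zipPath into targetFolder, recreating subfolders.
    // Returns the number of files written.
    public static int ExtractTo(string zipPath, string targetFolder)
    {
        string root = Path.GetFullPath(targetFolder);
        if (!root.EndsWith(Path.DirectorySeparatorChar.ToString()))
            root += Path.DirectorySeparatorChar;

        int count = 0;
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                string destination = Path.GetFullPath(Path.Combine(root, entry.FullName));

                // Refuse entries whose path would land outside the target folder.
                if (!destination.StartsWith(root, StringComparison.Ordinal))
                    throw new InvalidDataException("Entry escapes target folder: " + entry.FullName);

                if (entry.Name.Length == 0)
                {
                    // An empty Name marks a directory entry.
                    Directory.CreateDirectory(destination);
                    continue;
                }

                Directory.CreateDirectory(Path.GetDirectoryName(destination));
                entry.ExtractToFile(destination, true); // overwrite existing files
                count++;
            }
        }
        return count;
    }
}
```

In an ASP.NET handler you would save the posted file to a temporary path first and then call something like `UploadedArchive.ExtractTo(tempZipPath, uploadFolder)`; both path names here are placeholders.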

ASP.NET Server.MapPath and SharePoint Document Library

I have a C# application that needs to populate a list of all the filenames within a particular SharePoint web environment, in which there is a specific document library from which I have to read all the filenames.
Let's say the URL for the document library in question is "http://example.com/lib.aspx".
If I used Server.MapPath like so:
Directory.GetFiles(Server.MapPath("http://example.com/lib.aspx"), SearchOption.TopDirectoryOnly);
This would effectively treat the document library as a physical path and successfully populate an array of filenames, correct?
I don't currently have the ability to test this and I am wondering if this operation would be valid; in other words, the filenames would (most likely) be indexed successfully.
That won't work at all. The documents in the library are not located in the server's file system.
If you're enumerating all files in the library, then you can use the Items property of the library and for each item, use the File property to retrieve the SPFile associated with the item.
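The enumeration described above might look roughly like this. It is only a sketch, assuming the SharePoint server-side object model (the Microsoft.SharePoint assembly), which means it runs only on the SharePoint server itself. `LibraryListing`, `GetFileNames`, and both parameters are hypothetical names; the site URL and library title depend on your environment.

```csharp
using System.Collections.Generic;
using Microsoft.SharePoint; // server-side object model; available only on a SharePoint server

// Hypothetical helper: lists the file names in one document library.
public static class LibraryListing
{
    public static List<string> GetFileNames(string siteUrl, string libraryTitle)
    {
        var names = new List<string>();

        using (SPSite site = new SPSite(siteUrl))
        using (SPWeb web = site.OpenWeb())
        {
            // Look the library up by its display title (an assumption; adjust as needed).
            SPList library = web.Lists[libraryTitle];

            foreach (SPListItem item in library.Items)
            {
                SPFile file = item.File; // the SPFile associated with the item
                if (file != null)
                    names.Add(file.Name);
            }
        }
        return names;
    }
}
```

For very large libraries, iterating `Items` loads every item, so a query that limits the returned fields may be a better fit.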

File system searching (C#) against specified file list

My client wants to add a file system search feature to a browser/server (B/S) application written in C#. The unusual requirement is that the search must be scoped to a specified list of files, not to a whole directory filtered by file extension.
I did some research on the Microsoft Office SharePoint Server Search Service, but couldn't tell whether it supports searching against specific files. I'm currently using it to search PDF files, but that is not the same case as the one I'm asking about.
Can anyone suggest a third-party search service or engine that would fit this requirement?
Thanks.
Elaine
I assume you want full-text indexing of a certain set of files?
Java has the best selection of libraries for this, but there are C# ports as well.
I highly recommend Lucene for indexing and retrieval:
http://incubator.apache.org/lucene.net/
If this is on a server, it might be easiest to run a Solr instance and use C# as the client:
http://crazorsharp.blogspot.com/2010/01/full-text-search-using-solr-lucene-and.html
Lucene has many examples on indexing different document types, but if you use Solr, it will handle that for you.

how to index a folder using lucene.net

I am trying to develop a search engine in ASP.NET using Lucene.NET. I have gone through many tutorials and pages, but I couldn't get the results I need.
I have a folder with some files (doc, ppt, pdf, excel, etc.) and I want to search the contents of that folder only. If no results are found within that folder, the user should be asked to search the web instead.
For example, I have a folder with thousands of files at C:\test,
and if the user searches for "miller", it should search every document. If results are found, it should display them like this:
Searched text   File                    No. of occurrences
miller          C:\test\1\file.doc      5
miller          C:\test\1\11\new.doc    2
Please help; I am not getting the right results.
Lucene / Lucene.NET is just an indexing engine; you still have to extract the text yourself from the file types that you want to support. On Windows you can use the IFilter interface for many file types, and if you have Acrobat Reader 7+ installed there should be built-in IFilter support for PDF files. As for the indexing part itself, there are many, many samples out there.
Also see this thread: "What's a good method for extracting text from a PDF using C# or classic ASP (VBScript)?"
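For the indexing part, here is a minimal sketch, assuming the Lucene.Net 3.0.3 NuGet package (later releases such as 4.8 rename several of these types). To stay self-contained it indexes only plain .txt files; for .doc, .pdf and the rest you would first extract the text (for example through IFilter, as described above) and pass that string in instead of `ReadAllText`. `FolderIndex`, `Build`, `Search`, and the field names "path" and "content" are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using LuceneVersion = Lucene.Net.Util.Version;

// Hypothetical helper, assuming Lucene.Net 3.0.3.
public static class FolderIndex
{
    // Indexes every .txt file under folder (including subfolders) into indexPath.
    public static void Build(string folder, string indexPath)
    {
        var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_30);
        using (FSDirectory dir = FSDirectory.Open(new DirectoryInfo(indexPath)))
        using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            foreach (string path in System.IO.Directory.EnumerateFiles(folder, "*.txt", SearchOption.AllDirectories))
            {
                var doc = new Document();
                // Store the path verbatim so it can be shown in results.
                doc.Add(new Field("path", path, Field.Store.YES, Field.Index.NOT_ANALYZED));
                // Analyze the text for searching; it is not stored in the index.
                doc.Add(new Field("content", System.IO.File.ReadAllText(path), Field.Store.NO, Field.Index.ANALYZED));
                writer.AddDocument(doc);
            }
            writer.Commit();
        }
    }

    // Returns the paths of up to maxHits files whose content matches queryText.
    public static List<string> Search(string indexPath, string queryText, int maxHits)
    {
        var paths = new List<string>();
        var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_30);
        using (FSDirectory dir = FSDirectory.Open(new DirectoryInfo(indexPath)))
        using (var searcher = new IndexSearcher(dir, true)) // true = read-only
        {
            var parser = new QueryParser(LuceneVersion.LUCENE_30, "content", analyzer);
            Query query = parser.Parse(queryText);
            TopDocs hits = searcher.Search(query, maxHits);
            foreach (ScoreDoc hit in hits.ScoreDocs)
                paths.Add(searcher.Doc(hit.Doc).Get("path"));
        }
        return paths;
    }
}
```

This returns matching file paths only. Counting occurrences per file, or highlighting the matched text, needs extra work on top of this, for instance by storing the content and post-processing each hit.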

How Can I Embed a Word Document in Silverlight?

I need to embed a Word document in Silverlight, with all the same functionality as Word itself,
such as Cut, Copy, Paste, Save, Save As, formatting, etc.
How can I achieve this?
Please also suggest some links.
Silverlight 4 comes with COM automation support, which means that if the client machine has Word installed, Silverlight can use it to display a Word document:
http://forums.silverlight.net/forums/p/185680/424357.aspx
http://www.silverlightshow.net/items/MS-Word-Mail-Merge-with-Silverlight-4-COM-Automation.aspx
If you are using Silverlight 3, it will be more daunting; you may have to find a rich text editor (RTE) control in which to display the Word content.
Regards.
The problem with using the COM model to read the file is that you must run the Silverlight app out of the browser with elevated privileges, and the user must have Word installed, so it is not very useful if you want a web app.
However, Word documents in the .docx format are stored as XML files inside a zip archive (rename the file from name.docx to name.zip to see the files), so you could write a class that reads the XML and displays it inside a RichTextBox, then writes it back out as XML after formatting. This will take a lot of effort.
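To illustrate the "zip of XML files" point, the sketch below opens a .docx as a zip archive and pulls the plain text of each paragraph out of word/document.xml. It is written against the full .NET Framework (4.5+) or modern .NET, not the Silverlight runtime, where `System.IO.Compression.ZipArchive` is not available, so treat it as a demonstration of the file layout and not as drop-in Silverlight code. `DocxText` and `ReadParagraphs` are hypothetical names; formatting, tables, and images are ignored.

```csharp
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: reads paragraph text from a .docx file.
public static class DocxText
{
    // WordprocessingML main namespace used by word/document.xml.
    static readonly XNamespace W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static string[] ReadParagraphs(string docxPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found");

            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);
                // Each w:p is a paragraph; its visible text lives in w:t elements.
                return xml.Descendants(W + "p")
                          .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)))
                          .ToArray();
            }
        }
    }
}
```

Writing the document back out after editing would mean regenerating this XML and re-zipping it with the other package parts, which is where most of the effort mentioned above goes.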
