Index PDF documents in Solr from C# client

I'm trying to index Word or PDF documents in Solr and found the ExtractingRequestHandler, but I can't figure out how to write C# code that performs the HTTP POST request shown in the Solr wiki: http://wiki.apache.org/solr/ExtractingRequestHandler.
I've installed Solr 3.4 on Tomcat 7 (7.0.22) using the files from the example/solr directory in the Solr zip, and I haven't altered anything. The ExtractingRequestHandler should be configured out of the box in solrconfig.xml and ready to use, right?
Can someone give a C# (HttpWebRequest) example of how to make the HTTP POST request and upload a PDF file, as is done with curl in the Solr wiki?
I've looked all over this site and many others for an example or a tutorial on how this is done, but haven't found anything.
EDIT:
I finally managed to get it to work using SolrNet!
In order for it to work you need to copy this to a lib-folder in your Solr installation directory from the Solr zip:
apache-solr-cell-3.4.0.jar file from the dist folder
content of contrib\extraction\lib directory
With SolrNet 0.4.0 beta 2, this code does the job:
Startup.Init<IndexDocument>("YOUR-SOLR-SERVICE-PATH");
var solr = ServiceLocator.Current.GetInstance<ISolrOperations<IndexDocument>>();
using (FileStream fileStream = File.OpenRead("FILE-PATH-FOR-THE-FILE-TO-BE-INDEXED"))
{
    var response =
        solr.Extract(
            new ExtractParameters(fileStream, "doc1")
            {
                ExtractFormat = ExtractFormat.Text,
                ExtractOnly = false
            });
}
solr.Commit();
Sorry for the trouble. I hope however that others will find this useful.

I would recommend using the SolrNet client. It supports the ExtractingRequestHandler.
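For readers who want the plain HTTP route the question originally asked about, here is a minimal sketch that does not use SolrNet. It assumes the handler is mapped at /update/extract and that the literal.id and commit=true parameters behave as in the curl example on the Solr wiki; the class name, method names, and base URL are hypothetical. WebClient.UploadFile sends the file as multipart/form-data, much like curl -F.

```csharp
using System;
using System.Net;
using System.Text;

static class SolrExtractSketch
{
    // Builds the extract URL. literal.id supplies the unique key of the new document.
    // Assumption: the handler path is /update/extract, as in the Solr wiki example.
    public static string BuildExtractUrl(string solrBaseUrl, string documentId)
    {
        return solrBaseUrl.TrimEnd('/') + "/update/extract"
             + "?literal.id=" + Uri.EscapeDataString(documentId)
             + "&commit=true";
    }

    // POSTs the file as multipart/form-data and returns Solr's response body as text.
    public static string IndexFile(string solrBaseUrl, string documentId, string filePath)
    {
        using (var client = new WebClient())
        {
            byte[] response = client.UploadFile(BuildExtractUrl(solrBaseUrl, documentId), filePath);
            return Encoding.UTF8.GetString(response);
        }
    }
}
```

A call might look like SolrExtractSketch.IndexFile("http://localhost:8080/solr", "doc1", "FILE-PATH-FOR-THE-FILE-TO-BE-INDEXED"), with the base URL adjusted to wherever Solr is deployed.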

Related

.net Maui android how to retrieve file from external fileserver

I am making an app that lets you open and edit a PDF file on tablets. Because I usually work with .NET, I decided to write it in .NET MAUI. That way I can also target Windows tablets.
It uses iText as its main library to read and edit the PDFs.
I have an external shared file server that anyone can access when they are connected to the Wi-Fi.
I'd like to access that file server from my Android tablet using the iText PdfReader.
How do I achieve this correctly?
Am I missing a library or a package that would allow me to access that file?
Are there options I haven't discovered yet?
This works on windows tablets:
string dest = "\\\\Path\\to\\File\\";
string file = "\\\\Path\\to\\File\\file.pdf";
PdfDocument pdfDoc = new PdfDocument(new PdfReader(file), new PdfWriter(dest));
I have tried:
string file = Environment.GetFolderPath(Environment.SpecialFolder.Windows) + "\\Path\\to\\File\\file.pdf";
string file = "\\\\Path\\to\\File\\file.pdf";
All of them result in "file not found".
I've tried a dozen of the GetFolderPath options, and none of them seem to work.
Thank you for your time.
I ended up solving this by converting the document into a base64 string and sending it through an API that I already had.
I used the classic HTTP request approach, which you can look up and copy from almost anywhere.
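The answer does not show its code, so the following is only a rough sketch of the base64 approach it describes. It assumes a hypothetical endpoint that returns the PDF as a plain-text base64 string; the class and method names are made up for illustration. The resulting stream could then be handed to a PDF library instead of a file path.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static class Base64PdfSketch
{
    // Server side: turn the file's bytes into a base64 string for transport.
    public static string Encode(byte[] pdfBytes)
    {
        return Convert.ToBase64String(pdfBytes);
    }

    // Client side: turn the base64 string back into an in-memory stream.
    public static MemoryStream Decode(string base64)
    {
        return new MemoryStream(Convert.FromBase64String(base64));
    }

    // Fetches the base64 text from the API and decodes it.
    // Assumption: apiUrl is a hypothetical endpoint whose response body is the raw base64 string.
    public static async Task<MemoryStream> FetchPdfAsync(HttpClient http, string apiUrl)
    {
        string base64 = await http.GetStringAsync(apiUrl);
        return Decode(base64.Trim());
    }
}
```

Because the tablet only needs HTTP access to the API, this sidesteps the UNC path entirely, at the cost of roughly a third more data on the wire from the base64 encoding.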

How to upload to OneDrive using Microsoft Graph Api in c#

I have been trying to upload to a OneDrive account and I am hopelessly stuck: I can't upload files of any size, whether smaller or larger than 4MB. I have no issues accessing the drive itself, since I have working functions that create a folder, rename files/folders, and delete files/folders.
https://learn.microsoft.com/en-us/graph/api/driveitem-put-content?view=graph-rest-1.0&tabs=csharp
This documentation on Microsoft Graph API is very friendly to HTTP code, and I believe I am able to fairly "translate" the documentation to C#, but still fail to grab a file and upload to OneDrive. Some places online seem to be using byte arrays? Which I am completely unfamiliar with since my primary language is C++ and we just use ifstream/ofstream. Anyways, here is the portion of code in specific (I hope this is enough):
var item = await _client.Users[userID].Drive.Items[FolderID]//"01YZM7SMVOQ7YVNBXPZFFKNQAU5OB3XA3K"].Content
.ItemWithPath("LessThan4MB.txt")//"D:\\LessThan4MB.txt")
.CreateUploadSession()
.Request()
.PostAsync();
Console.WriteLine("done printing");
As it stands, it uploads a temporary file that has a tilde "~" in the OneDrive (like as if I was only able to open but not import any data from the file onto it). If I swap the name of the file so it includes the file location it throws an error:
Message: Found a function 'microsoft.graph.createUploadSession' on an open property. Functions on open properties are not supported.
Try this approach with memory stream and PutAsync<DriveItem> request:
string path = "D:\\LessThan4MB.txt";
byte[] data = System.IO.File.ReadAllBytes(path);
using (Stream stream = new MemoryStream(data))
{
    var item = await _client.Me.Drive.Items[FolderID]
        .ItemWithPath("LessThan4MB.txt")
        .Content
        .Request()
        .PutAsync<DriveItem>(stream);
}
I am assuming you have already granted Microsoft Graph Files.ReadWrite.All permission. Check your API permission. I tested this code snippet with pretty old Microsoft.Graph library version 1.21.0. Hopefully it will work for you too.

Get last modified date of a remote file

I have an app that, at startup, downloads a file from a remote location (over the internet) and parses its contents.
I am trying to speed up startup, because the bigger the file gets, the slower the app starts.
To speed things up, I thought of getting the last modified date of the remote file and downloading it only if it is newer than the copy on the user's PC.
I have found many ways to do this online, but none of them are in C# (for Windows Store apps). Does anybody here know a way of doing this without downloading the file? If I have to download the file to check it, the process is not sped up at all.
My C# code for downloading the file currently is this
const string fileLocation = "link to dropbox";
var uri = new Uri(fileLocation);
var downloader = new BackgroundDownloader();
StorageFile file = await ApplicationData.Current.LocalFolder.CreateFileAsync("feedlist.txt",CreationCollisionOption.ReplaceExisting);
DownloadOperation download = downloader.CreateDownload(uri, file);
await download.StartAsync();
If it helps the file is stored in dropbox but if any of you guys have a suggestion for another free file hosting service I am open to suggestions
Generally, you can check the file time by sending a HEAD request and looking in the HTTP response headers for a Last-Modified field. The remote server has to support it, and Dropbox does not support this for direct links (only via its API). But Dropbox offers something else: the response headers include an ETag field. Store its value and compare it on the next request. If it has changed, the file has changed too. You can use this tool to check the remote file headers.
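A minimal sketch of that ETag check, assuming System.Net.Http.HttpClient is available to the app and that the server answers HEAD requests with an ETag header. The class and method names are hypothetical, and where the previous ETag is stored between runs is left to the app.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class RemoteFileCheckSketch
{
    // Decides whether to download: true when either tag is missing or the tags differ.
    public static bool NeedsDownload(string storedETag, string currentETag)
    {
        return string.IsNullOrEmpty(storedETag)
            || string.IsNullOrEmpty(currentETag)
            || !string.Equals(storedETag, currentETag, StringComparison.Ordinal);
    }

    // Sends a HEAD request, so only headers are transferred, and returns the ETag.
    // Returns null when the server sends no ETag header.
    public static async Task<string> GetETagAsync(HttpClient http, Uri uri)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, uri))
        using (var response = await http.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return response.Headers.ETag?.Tag;
        }
    }
}
```

At startup the app would call GetETagAsync, pass the result to NeedsDownload together with the tag saved from the last run, and start the existing BackgroundDownloader code only when it returns true.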

How do I import Quick Books data?

I heard that the QuickBooks SDK lets us import QuickBooks data into our own application using C#.
How is this possible?
I am developing a desktop application using Silverlight.
This is a SaaS app (I am allowing customers to connect their QuickBooks files to my app).
Are there any resources to go through (any links, examples)?
Go install the QuickBooks SDK.
After installation, navigate to this directory on your computer:
C:\Program Files (x86)\Intuit\IDN\QBSDK12.0\samples\qbdt\c-sharp
In that directory you will find many examples, provided by Intuit, which show how to do this. In addition, you'll find about 600 pages of PDF documentation included with the SDK, which detail every single aspect of what you're trying to do.
Desktop connections to QuickBooks using C# and the SDK are pretty easy - you basically set up a COM object and feed XML to QuickBooks. QuickBooks processes the XML request and sends you back an XML response.
Here's some QuickBooks C# example code.
rp = new RequestProcessor2();
rp.OpenConnection("", "IDN CustomerAdd C# sample");
ticket = rp.BeginSession("C:\\path\\to\\file.QBW", QBFileModeE.qbFileOpenDoNotCare);
//ticket = rp.BeginSession("C:\\path\\to\\file.QBW", QBFileMode.qbFileOpenDoNotCare);
Random random = new Random();
string input = @"<?xml version=""1.0"" encoding=""utf-8""?>
<?qbxml version=""2.0""?>
<QBXML>
<QBXMLMsgsRq onError=""stopOnError"">
<CustomerAddRq requestID=""15"">
<CustomerAdd>
...
</CustomerAdd>
</CustomerAddRq>
</QBXMLMsgsRq>
</QBXML>";
response = rp.ProcessRequest(ticket, input);
You should refer to the QuickBooks OSR for details on the XML requests you can send. Also included in the SDK is the QBFC library, which allows you to create the XML requests with objects, and marshall the object to an XML string.

reading a url and getting back a csv file

I have a URL, and when I load it in a browser, the browser recognizes it as a CSV file and pops up Excel's "do you want to open" prompt. I want to do this programmatically so that a WinForms app can use that URL and parse the CSV file directly.
What is the quickest way to do this?
EDIT: I tried using WebClient and I am getting the following error:
"The remote server returned an error: (500) Internal Server Error."
I don't see why something like this wouldn't work (in C#):
// Download the file to a specified path. Using the WebClient class we can download
// files directly from a provided url, like in this case.
System.Net.WebClient client = new WebClient();
client.DownloadFile(url, csvPath);
Where the url is your site with the csv file and the csvPath is where you want the actual file to go.
If you have a WinForms app, you can use a System.Net.WebClient to read the data as a string.
It will read the entire csv file as a string, but you can write it out or parse it at will.
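As a sketch of the read-as-a-string approach: WebClient.DownloadString fetches the body, and a deliberately naive parser splits it into rows and fields. The class and method names are hypothetical, and the parser assumes no quoted fields or embedded commas; real-world CSV usually calls for a proper CSV parser.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

static class CsvDownloadSketch
{
    // Naive parse: one row per line, fields separated by commas.
    // Assumption: no quoted fields, escaped quotes, or embedded commas/newlines.
    public static List<string[]> ParseCsv(string csvText)
    {
        var rows = new List<string[]>();
        string[] lines = csvText.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            rows.Add(line.Split(','));
        }
        return rows;
    }

    // Downloads the CSV as a single string and parses it in memory.
    public static List<string[]> DownloadCsv(string url)
    {
        using (var client = new WebClient())
        {
            return ParseCsv(client.DownloadString(url));
        }
    }
}
```

Nothing is written to disk here; the rows can be bound to a grid or processed directly in the WinForms app.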
If you just want to whip something together, I would suggest using a scripting language and some bash. Use wget or something similar to get the file, and a scripting language to parse it. You could even use PHP to parse it once you have the file, because PHP has the following function, which is very nice: http://php.net/manual/en/function.fgetcsv.php
I would suggest doing it this way because it is easier, and it will certainly let you parse the file easily enough. I don't know what you want to do with it from there, but the world's your oyster.
The following code works for me but I am running Open Office. I have not tested it with Excel.
The hacky bit is to rename the local copy of the file to *.xls so that Windows will launch Excel by default. If you leave the file extension as CSV, Windows will launch Notepad by default.
String url = "http://www.example.com/test.csv";
String localfile = "test.xls";
var client = new WebClient();
client.DownloadFile(url, localfile);
System.Diagnostics.Process.Start(localfile);
