GetFile AWS S3 .NET 6 C#

I have these two methods to get a file from S3.
// Method 1: generate a presigned URL
var client = new AmazonS3Client();
var request = new GetPreSignedUrlRequest
{
    BucketName = BucketName,
    Key = fileName,
    Expires = DateTime.UtcNow.AddSeconds(300),
};
var presignedUrlResponse = client.GetPreSignedURL(request);
return presignedUrlResponse;
// Method 2: stream the object back
var client = new AmazonS3Client();
var file = await client.GetObjectAsync(BucketName, fileName);
return File(file.ResponseStream, file.Headers.ContentType);
In method 1, GetPreSignedURL puts the region name in the host of the photo's URL, and I don't want that, because then I can't open the photo in the browser.
Example: https://service-manager-estagio.s3.sa-east-1.amazonaws.com/urlphoto
I want the URL without this sa-east-1 part.
In method 2, on the return File line, I can't use File; the compiler message says it can't be used as a method.
I need help either making method 1 produce a URL without the region name or making method 2 work with File.
But if anyone knows another way to do this GET, that is valid too.
Thanks!!
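For method 2, File(...) is a helper defined on ControllerBase, so the "can't use it as a method" error usually means the code is not inside a controller class (outside one, File resolves to the System.IO.File type). Note also that a presigned URL is signed against the regional endpoint, so stripping sa-east-1 from the host will generally invalidate the signature; streaming the object through a controller, as method 2 does, avoids exposing the regional URL at all. A minimal sketch, assuming an ASP.NET Core controller (the route is illustrative, and the bucket name is taken from the example URL):

using Amazon.S3;
using Microsoft.AspNetCore.Mvc;

public class FilesController : ControllerBase
{
    // Bucket name taken from the question's example URL; adjust as needed.
    private const string BucketName = "service-manager-estagio";

    [HttpGet("files/{fileName}")]
    public async Task<IActionResult> GetFile(string fileName)
    {
        var client = new AmazonS3Client();
        var response = await client.GetObjectAsync(BucketName, fileName);

        // File(...) resolves here because this class derives from ControllerBase.
        return File(response.ResponseStream, response.Headers.ContentType);
    }
}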

Related

Don't know how to transcribe wav file from Google Cloud Storage for LongRunningRecognize conversion to text in C#?

I'm able to convert audio files to text as long as they are under a minute. I need to transcribe longer files. Apparently, you have to have the file in Cloud Storage but I can't figure out if there is one command that does that or if I have to do it separately. What I'm using now is:
var credential = GoogleCredential.FromFile(GoogleCredentials);
var channel = new Grpc.Core.Channel(SpeechClient.DefaultEndpoint.ToString(), credential.ToChannelCredentials());
var speech = SpeechClient.Create(channel);
var response = speech.LongRunningRecognize(
    new RecognitionConfig()
    {
        Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
        LanguageCode = "en",
    },
    RecognitionAudio.FromFile(waveFile));
response = response.PollUntilCompleted();
I know I need to specify a file in Cloud Storage like:
RecognitionAudio.FromStorageUri("gs://ldn-speech/" + waveFile);
But I don't know how to get the file into the gs bucket. Do I have to do that in a separate step or as part of one of the Speech APIs? I'm looking for someone to show me an example.
EDIT: I found that I needed to upload the file separately and could use the credential file I had already been using in the speech recognition process. So all I needed was:
var credential = GoogleCredential.FromFile(GoogleCredentials);
var storage = StorageClient.Create(credential);
using (var f = File.OpenRead(fullFileName))
{
    fileName = Path.GetFileName(fullFileName);
    // Pass the opened stream as the upload source (the original snippet
    // omitted it, so nothing was actually uploaded).
    storage.UploadObject(bucketName, fileName, null, f);
}
There is also another way of going about this.
As stated in your edit, you do indeed need to upload the file separately to your Cloud Storage bucket.
If you are planning on transcribing long audio files (longer than one minute) to text, you may consider using asynchronous speech recognition:
https://cloud.google.com/speech-to-text/docs/async-recognize#speech-async-recognize-gcs-csharp
The code sample uses a Cloud Storage bucket to store the raw audio input for long-running transcription processes. It also requires that you have created and activated a service account.
Here’s an example:
static object AsyncRecognizeGcs(string storageUri)
{
    var speech = SpeechClient.Create();
    var longOperation = speech.LongRunningRecognize(new RecognitionConfig()
    {
        Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
        SampleRateHertz = 16000,
        LanguageCode = "en",
    }, RecognitionAudio.FromStorageUri(storageUri));
    longOperation = longOperation.PollUntilCompleted();
    var response = longOperation.Result;
    foreach (var result in response.Results)
    {
        foreach (var alternative in result.Alternatives)
        {
            Console.WriteLine($"Transcript: {alternative.Transcript}");
        }
    }
    return 0;
}
(1) I found that I did indeed need to upload the file separately to Cloud Storage, and (2) I could use the credential file I had already been using in the speech recognition process. So all I needed was:
var credential = GoogleCredential.FromFile(GoogleCredentials);
var storage = StorageClient.Create(credential);
using (var f = File.OpenRead(fullFileName))
{
    fileName = Path.GetFileName(fullFileName);
    storage.UploadObject(bucketName, fileName, null, f);
}
Once the file was in Cloud Storage, I could transcribe it as I originally thought, and then delete it after the process was complete with:
var credential = GoogleCredential.FromFile(GoogleCredentials);
var storage = StorageClient.Create(credential);
// No need to open the local file just to delete the remote object.
fileName = Path.GetFileName(fullFileName);
storage.DeleteObject(bucketName, fileName);

PDF/TIFF Document Text Detection gcsDestinationBucketName

I'm working on PDF-to-text conversion using the Google Cloud Vision API.
I got initial code help from their side; image-to-text conversion works fine with the JSON key I got through registration and activation.
Here is the code I got for PDF-to-text conversion:
private static object DetectDocument(string gcsSourceUri,
    string gcsDestinationBucketName, string gcsDestinationPrefixName)
{
    var client = ImageAnnotatorClient.Create();
    var asyncRequest = new AsyncAnnotateFileRequest
    {
        InputConfig = new InputConfig
        {
            GcsSource = new GcsSource
            {
                Uri = gcsSourceUri
            },
            // Supported mime_types are: 'application/pdf' and 'image/tiff'
            MimeType = "application/pdf"
        },
        OutputConfig = new OutputConfig
        {
            // How many pages should be grouped into each json output file.
            BatchSize = 2,
            GcsDestination = new GcsDestination
            {
                Uri = $"gs://{gcsDestinationBucketName}/{gcsDestinationPrefixName}"
            }
        }
    };
    asyncRequest.Features.Add(new Feature
    {
        Type = Feature.Types.Type.DocumentTextDetection
    });
    var requests = new List<AsyncAnnotateFileRequest> { asyncRequest };
    var operation = client.AsyncBatchAnnotateFiles(requests);
    Console.WriteLine("Waiting for the operation to finish");
    operation.PollUntilCompleted();
    // Once the request has completed and the output has been
    // written to GCS, we can list all the output files.
    var storageClient = StorageClient.Create();
    // List objects with the given prefix.
    var blobList = storageClient.ListObjects(gcsDestinationBucketName,
        gcsDestinationPrefixName);
    Console.WriteLine("Output files:");
    foreach (var blob in blobList)
    {
        Console.WriteLine(blob.Name);
    }
    // Process the first output file from GCS.
    // Select the first JSON file from the objects in the list.
    var output = blobList.Where(x => x.Name.Contains(".json")).First();
    var jsonString = "";
    using (var stream = new MemoryStream())
    {
        storageClient.DownloadObject(output, stream);
        jsonString = System.Text.Encoding.UTF8.GetString(stream.ToArray());
    }
    var response = JsonParser.Default
        .Parse<AnnotateFileResponse>(jsonString);
    // The actual response for the first page of the input file.
    var firstPageResponses = response.Responses[0];
    var annotation = firstPageResponses.FullTextAnnotation;
    // Here we print the full text from the first page.
    // The response contains more information:
    // annotation/pages/blocks/paragraphs/words/symbols
    // including confidence scores and bounding boxes
    Console.WriteLine($"Full text: \n {annotation.Text}");
    return 0;
}
This function requires 3 parameters:
string gcsSourceUri,
string gcsDestinationBucketName,
string gcsDestinationPrefixName
I don't understand which values I should set for those 3 params.
I have never worked with a third-party API before, so it's a little bit confusing for me.
Suppose you own a GCS bucket named 'giri_bucket' and you put a PDF named 'test.pdf' at the root of the bucket. If you wanted to write the results of the operation to the same bucket, you could set the arguments to be
gcsSourceUri: 'gs://giri_bucket/test.pdf'
gcsDestinationBucketName: 'giri_bucket'
gcsDestinationPrefixName: 'async_test'
When the operation completes, there will be 1 or more output files in your GCS bucket at giri_bucket/async_test.
If you want, you could even write your output to a different bucket. You just need to make sure your gcsDestinationBucketName + gcsDestinationPrefixName is unique.
You can read more about the request format in the docs: AsyncAnnotateFileRequest
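For illustration, a call using those example values would look like this (the bucket, file, and prefix names are the hypothetical ones from above):
DetectDocument(
    "gs://giri_bucket/test.pdf", // gcsSourceUri
    "giri_bucket",               // gcsDestinationBucketName
    "async_test");               // gcsDestinationPrefixName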

How to avoid storing this file when I move from AWS to Azure Data Lake?

I am writing an Azure Function that moves files from AWS S3 to Azure Data Lake. I got the download working and the upload working, but I am struggling to piece the two together, because I don't want to store the file in the intermediate app, so to speak; the Azure Function itself does not need to store the file, just pass it on.
It's not so easy to explain, so please bear with me while I try to explain what I want to do.
When I download from S3 using this code
await client.GetObjectAsync(new GetObjectRequest { BucketName = bucketName, Key = entry.Key });
I don't have a file system to store it on, and I don't want to store it; I want it as some sort of "object" that I can pass directly to the Azure Data Lake writer, which looks like this:
adlsFileSystemClient.FileSystem.UploadFile(adlsAccountName, source, destination, 1, false, true);
The code works fine if I download the file to my local disk and then upload it, but that's not what I want, since the Azure Function has no storage; I want to pass the downloaded object directly to the uploader, so to speak.
How can I achieve this?
**** EDIT ****
// Process the response.
foreach (S3Object entry in response.S3Objects)
{
    Console.WriteLine("key = {0} size = {1}", entry.Key.Split('/').Last(), entry.Size);
    string fileNameOnly = entry.Key.Split('/').Last();
    GetObjectResponse getObjResponse = await client.GetObjectAsync(bucketName, entry.Key);
    MemoryStream stream = new MemoryStream();
    getObjResponse.ResponseStream.CopyTo(stream);
    if (entry.Key.Contains("MerchandiseHierarchy"))
    {
        WriteToAzureDataLake(stream, @"/PIMRAW/MerchandiseHierarchy/" + fileNameOnly);
    }
}
I then pass the memory stream to the Azure method, but I need a stream uploader and cannot find one; the following complains that it cannot convert a stream to a string:
adlsFileSystemClient.FileSystem.UploadFile(adlsAccountName, source, destination, 1, false, true);
**** EDIT2 ****
I changed the upload method as follows, and it creates the file at the destination, but with 0 size, so I am wondering if I am creating the file before the download is done?
static void WriteToAzureDataLake(MemoryStream inputSource, string inputDestination)
{
    // 1. Set the synchronization context
    SynchronizationContext.SetSynchronizationContext(new SynchronizationContext());
    // 2. Create credentials to authenticate requests as an Active Directory application
    var clientCredential = new ClientCredential(clientId, clientSecret);
    var creds = ApplicationTokenProvider.LoginSilentAsync(tenantId, clientCredential).Result;
    // 3. Initialise the Data Lake Store file system client
    adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(creds);
    // 4. Upload a file to the Data Lake Store
    //var source = @"c:\nwsys\source.txt";
    var source = inputSource;
    //var destination = "/PIMRAW/MerchandiseHierarchy/destination.txt";
    var destination = inputDestination;
    //adlsFileSystemClient.FileSystem.UploadFile(adlsAccountName, source, destination, 1, false, true);
    adlsFileSystemClient.FileSystem.Create(adlsAccountName, destination, source);
    // FINISHED
    Console.WriteLine("6. Finished!");
}
Change the upload method as follows and it creates the file at destination but with 0 size
It seems that you need to set the stream position to 0 before writing to the Data Lake:
stream.Position = 0;
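Concretely, in the download loop from the first EDIT, that means rewinding the MemoryStream after the copy and before handing it to the writer. A minimal sketch, reusing the names from the code above:
var stream = new MemoryStream();
getObjResponse.ResponseStream.CopyTo(stream);
// CopyTo leaves the position at the end of the stream; without this rewind,
// Create reads zero bytes, which produces the empty file described above.
stream.Position = 0;
adlsFileSystemClient.FileSystem.Create(adlsAccountName, destination, stream);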

This method or property is not supported after HttpRequest.GetBufferlessInputStream has been invoked

I am trying to browse and upload a file from client to server using AngularJS and Web API. I used an input file type for the user to select a file and post it to the Web API. In the Web API, I get the following error: "This method or property is not supported after HttpRequest.GetBufferlessInputStream has been invoked."
I am using the following code:
public IHttpActionResult UploadForm()
{
    HttpResponseMessage response = new HttpResponseMessage();
    var httpRequest = System.Web.HttpContext.Current.Request;
    if (httpRequest.Files.Count > 0)
    {
        foreach (string file in httpRequest.Files)
        {
            var postedFile = httpRequest.Files[file];
            var filePath = System.Web.HttpContext.Current.Server.MapPath("~/UploadFile/" + postedFile.FileName);
            postedFile.SaveAs(filePath);
        }
    }
    return Json("Document Saved");
}
I get this error when I try to get the files from the HTTP request. Should I update anything in web.config?
Please help me resolve this issue.
Try this; it works fine for me.
// Get the root folder where the file will be stored
string root = HttpContext.Current.Server.MapPath("~/UploadFile");
// Read the form data.
var provider = new MultipartFormDataStreamProvider(root);
await Request.Content.ReadAsMultipartAsync(provider);
if (provider.FileData.Count > 0 && provider.FileData[0] != null)
{
    MultipartFileData file = provider.FileData[0];
    // Clean the file name (strip the surrounding quotes)
    var fileWithoutQuote = file.Headers.ContentDisposition.FileName.Substring(1, file.Headers.ContentDisposition.FileName.Length - 2);
    // Get the current file directory on the server
    var directory = Path.GetDirectoryName(file.LocalFileName);
    if (directory != null)
    {
        // Generate a new random file name (not mandatory)
        var randomFileName = Path.Combine(directory, Path.GetRandomFileName());
        var fileExtension = Path.GetExtension(fileWithoutQuote);
        var newfilename = Path.ChangeExtension(randomFileName, fileExtension);
        // Move the file to rename the uploaded file with the new random file name
        File.Move(file.LocalFileName, newfilename);
    }
}
I also had the same problem, and the solution by @Jean did not work for me.
I needed to upload a CSV file and use it in the controller.
In JavaScript, I used the Fetch API to upload the CSV file.
In the controller, I used this code:
[HttpPost]
[CatchException]
public bool ImportBundlesFromCsv()
{
    var a = Request.Content.ReadAsByteArrayAsync();
    // Convert to a Stream if needed
    Stream stream = new MemoryStream(a.Result); // a.Result is byte[]
    // Convert to a string if needed
    string result = System.Text.Encoding.UTF8.GetString(a.Result);
    // your code
    return true;
}
This worked for me. Hope this helps!

How to get jpg file from CloudFiles on Rackspace using openstack.net

I'm running the code below in my controller in an ASP.NET MVC project. I want to enable the user to view or download the files that I store on Cloud Files on Rackspace.
var identity = new CloudIdentity()
{
    Username = "username",
    APIKey = "apikey"
};
var storage = new CloudFilesProvider(identity);

Stream jpgStream = new MemoryStream();
storage.GetObject("files.container", "1.jpg", jpgStream);

Stream pdfStream = new MemoryStream();
storage.GetObject("files.container", "2.pdf", pdfStream);

var jpgResult = File(jpgStream, "Image/jpg", "1.jpg");
var pdfResult = File(pdfStream, "Application/pdf", "2.pdf");
The above code works when I return pdfResult. I get the correct file. But when I return the jpgResult, the browser downloads 1.jpg as an empty 0KB file.
Am I doing this the right way? Any idea what the problem might be?
Problem solved after I added:
jpgStream.Position = 0;
pdfStream.Position = 0;
before the File() calls, as per the question "File is empty and I don't understand why. Asp.net mvc FileResult".
I don't know why this wasn't an issue with the PDF file.
You can also use the GetObjectSaveToFile method.
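If saving straight to disk is acceptable, that avoids the MemoryStream bookkeeping entirely. A rough sketch (the parameter order shown is my recollection of the openstack.net overload, so verify it against the SDK documentation):
// Downloads the object directly to a local file; no stream rewinding needed.
var storage = new CloudFilesProvider(identity);
storage.GetObjectSaveToFile("files.container", @"C:\temp", "1.jpg");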
