Dynamic output blob storage container names - c#

When configuring the output blob storage container for an Azure Function, is it somehow possible to run code in order to generate the path where the blob will be stored? To be more precise, I would like to use a new GUID within the path every time the function is triggered. Something like this (the code does not work):
[FunctionName("BlobTriggered")]
public static void BlobTriggered(
[BlobTrigger("myContainer/{name}.{extension}")] Stream myBlob,
[Blob("myContainer/{Guid.NewGuid()}", FileAccess.Write)] Stream outputContainer,
string name,
string extension,
TraceWriter log)
{
...
}
In the code above, I am trying to generate the GUID by using Guid.NewGuid(), which doesn't work. Is there a similar way to achieve this?

You can put a variable in {} and declare a matching parameter in the function signature to receive its value in the attribute. But because the parameters of the function declaration must be fixed at compile time, I don't think arbitrary code such as Guid.NewGuid() can be evaluated inside a binding path. You can still achieve what you want, though, by writing the blob yourself. Please have a look at the code below, which uses the Storage Blobs SDK:
using System;
using System.IO;
using Azure.Storage.Blobs;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.Logging;

namespace FunctionApp53
{
    public static class Function1
    {
        [FunctionName("Function1")]
        public static void Run(
            [BlobTrigger("samples-workitems/{name}.{extension}", Connection = "str")] Stream myBlob,
            string name, ILogger log)
        {
            log.LogInformation($"C# Blob trigger function processed blob\n Name: {name}\n Size: {myBlob.Length} bytes");

            // Connect to the storage account and container directly through the SDK.
            string connectionString = "DefaultEndpointsProtocol=https;AccountName=0730bowmanwindow;xxx;EndpointSuffix=core.windows.net";
            BlobServiceClient myClient = new BlobServiceClient(connectionString);
            var container = myClient.GetBlobContainerClient("samples-workitems");

            // Generate a fresh GUID at runtime and use it as the blob name.
            string blobName = Guid.NewGuid().ToString();
            var blockBlob = container.GetBlobClient(blobName);
            blockBlob.Upload(myBlob);
        }
    }
}
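As an aside, a minimal sketch under the assumption that your Functions runtime supports the documented binding expressions: the built-in {rand-guid} token expands to a fresh GUID each time the function fires, which may cover the simple case without dropping down to the SDK (whether it resolves inside a C# Blob attribute path depends on the runtime version):

// Sketch only: {rand-guid} is a documented binding expression that the
// runtime expands to a new GUID per invocation; verify on your runtime.
[FunctionName("BlobTriggered")]
public static void BlobTriggered(
    [BlobTrigger("myContainer/{name}.{extension}")] Stream myBlob,
    [Blob("myContainer/{rand-guid}", FileAccess.Write)] Stream outputBlob,
    string name,
    TraceWriter log)
{
    myBlob.CopyTo(outputBlob);
}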

Related

Get file names from azure blob storage

I'm using Azure Blob Storage in C#. Is there a way, a method, to get the list of files from a given folder?
For example, get all file names under this URL: https://prueba.blob.core.windows.net/simem/UAL/Dato%20de%20archivo%20prueba%20No1/2022/1/16
I know that using container.GetBlobs() I would get all files, but not only those from a specific folder.
Just use:
var results = await container.ListBlobsSegmentedAsync(prefix, true, BlobListingDetails.None, null, null, null, null);
(Note: ListBlobsSegmentedAsync belongs to CloudBlobContainer in the older WindowsAzure.Storage SDK; with the newer Azure.Storage.Blobs SDK, pass a prefix to GetBlobs instead, as in the next answer.)
You can get file names from a specific folder using BlobServiceClient and GetBlobs, as in the C# console app below. I followed the Microsoft documentation and Cindy Pau's answer:
using Azure.Storage.Blobs;
using System;

namespace ConsoleApp4
{
    class Program
    {
        static void Main(string[] args)
        {
            string connectionString = "Connection String of Storage Account";
            string folderPrefix = "test"; // the virtual folder to list

            BlobServiceClient serviceClient = new BlobServiceClient(connectionString);
            BlobContainerClient containerClient = serviceClient.GetBlobContainerClient("pool");

            // GetBlobs with a prefix returns only blobs whose names start with that prefix.
            var blobs = containerClient.GetBlobs(prefix: folderPrefix);
            foreach (var blob in blobs)
            {
                Console.WriteLine(blob.Name);
                Console.ReadLine();
            }
        }
    }
}
(Screenshots omitted: the pool container in the storage account, the contents of the test folder, and the console output.)
Press Enter after every line to get the file names one by one.
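If you only want the immediate children of a virtual folder rather than everything under the prefix, BlobContainerClient also exposes GetBlobsByHierarchy; a minimal sketch, assuming the same containerClient as above:

// List one level of the virtual hierarchy: blobs directly under "test/",
// plus any sub-"folder" prefixes, without recursing into them.
foreach (var item in containerClient.GetBlobsByHierarchy(prefix: "test/", delimiter: "/"))
{
    if (item.IsBlob)
        Console.WriteLine($"File: {item.Blob.Name}");
    else
        Console.WriteLine($"Folder: {item.Prefix}");
}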

C# call method from different class with 'args' as a parameter

I'm writing a Discord bot (Discord.Net) and I found myself needing to access some data in a Google Sheet using their APIs. Since I thought it would be best to separate those two concerns into two different class files, I tried calling the Main method of the Google API code (after renaming it "Sheets") from my Program.cs like this:
using Discord;
using Discord.WebSocket;
using System;
using System.IO;
using System.Threading.Tasks;

namespace WoM_Balance_Bot
{
    public class Program
    {
        public static void Main(string[] args)
        {
            GoogleAPI GSheet = new GoogleAPI();
            GSheet.Sheets();
            new Program().MainAsync().GetAwaiter().GetResult();
        }

        private DiscordSocketClient _client;

        public async Task MainAsync()
        {
            _client = new DiscordSocketClient();
            _client.MessageReceived += CommandHandler;
            _client.Log += Log;
            var token = File.ReadAllText("bot-token.txt");
            await _client.LoginAsync(TokenType.Bot, token);
            await _client.StartAsync();
            // Block this task until the program is closed.
            await Task.Delay(-1);
        } // ... etc.
I tried writing the parameters to pass inside those parentheses, like "string" and "args", but either I got the syntax wrong or I have a very wrong idea about what exactly to pass.
This is the actual content of GoogleAPI.cs, the other class file I created that holds the Google Sheets API code:
using Google.Apis.Auth.OAuth2;
using Google.Apis.Services;
using Google.Apis.Sheets.v4;
using Google.Apis.Sheets.v4.Data;
using Google.Apis.Util.Store;
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

namespace WoM_Balance_Bot
{
    public class GoogleAPI
    {
        // If modifying these scopes, delete your previously saved credentials
        // at ~/.credentials/sheets.googleapis.com-dotnet-quickstart.json
        private static readonly string[] Scopes = { SheetsService.Scope.SpreadsheetsReadonly };
        private static readonly string ApplicationName = "wombankrolls";

        public static void Sheets(string[] args)
        {
            UserCredential credential;
            Console.WriteLine("if you read this then it's good");
            using (var stream =
                new FileStream("credentials.json", FileMode.Open, FileAccess.Read))
            {
                // The file token.json stores the user's access and refresh tokens, and is created
                // automatically when the authorization flow completes for the first time.
                string credPath = "token.json";
                credential = GoogleWebAuthorizationBroker.AuthorizeAsync(
                    GoogleClientSecrets.Load(stream).Secrets,
                    Scopes,
                    "user",
                    CancellationToken.None,
                    new FileDataStore(credPath, true)).Result;
                Console.WriteLine("Credential file saved to: " + credPath);
            }

            // Create Google Sheets API service.
            var service = new SheetsService(new BaseClientService.Initializer()
            {
                HttpClientInitializer = credential,
                ApplicationName = ApplicationName,
            });

            // Define request parameters.
            String spreadsheetId = "16W56LWqt6wDaYAU5xNdTWCdaY_gkuQyl4CE1lPpUui4";
            String range = "Class Data!G163:I";
            SpreadsheetsResource.ValuesResource.GetRequest request =
                service.Spreadsheets.Values.Get(spreadsheetId, range);

            // Prints the names and majors of students in a sample spreadsheet:
            // https://docs.google.com/spreadsheets/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit
            ValueRange response = request.Execute();
            IList<IList<Object>> values = response.Values;
            /*
            if (values != null && values.Count > 0)
            {
                Console.WriteLine("Name, Major");
                foreach (var row in values)
                {
                    // Print columns A and E, which correspond to indices 0 and 4.
                    Console.WriteLine("{0}, {1}", row[0], row[4]);
                }
            }
            else
            {
                Console.WriteLine("No data found.");
            }
            Console.Read();
            */
        }
    }
}
I have modified it from the quickstart given by Google in a way that I thought made sense, but I still get the same error in the end:
There is no argument given that corresponds to the required formal parameter 'args' of 'GoogleAPI.Sheets(string[])'
as the user "David L" wrote in the comments:
As a general rule of thumb, if you do not use an argument, remove it. C# helps enforce this paradigm by throwing a compiler error if your method expects an argument and you do not provide one, which is exactly what is happening here.
It was my mistake; I was under the impression of the exact opposite from an earlier API implementation. I always want to aim for clean code, and keeping stuff I will never use was my error. Thank you, David!
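For completeness, a minimal sketch of the fix implied by that comment: drop the unused args parameter and, since Sheets is static, call it on the class rather than on an instance:

// In GoogleAPI.cs: remove the unused parameter.
public static void Sheets()
{
    // ... body unchanged ...
}

// In Program.cs: call the static method on the type, not on an instance.
public static void Main(string[] args)
{
    GoogleAPI.Sheets();
    new Program().MainAsync().GetAwaiter().GetResult();
}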

Azure function cannot find blob during blob trigger

There is an Azure Function that is triggered when HTML files are placed into Azure Blob Storage. The function opens the HTML file and transforms it into JSON. A small percentage of triggered files (less than 1%) result in the following exception:
Microsoft.WindowsAzure.Storage.StorageException
There does happen to be a second function, triggered by the placement of the same blob, that changes the file's content type, but I am not sure if this is affecting the first function's ability to also open the file.
What can be done to allow the Azure functions to correctly process the HTML files without throwing this type of exception?
Exception properties:
Message: Exception while executing function: [Function name here] The condition specified using HTTP conditional header(s) is not met.
Exception type: Microsoft.WindowsAzure.Storage.StorageException
Failed method: HtmlAgilityPack.HtmlDocument.Load
Function 1 (supporting methods, class, and namespace omitted for brevity):
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;
using HtmlAgilityPack;
using System.Threading.Tasks;

[FunctionName("Function name")]
public static async Task Run(
    [BlobTrigger("container-name/html/{name}", Connection = "ConnectionString")] Stream myBlob,
    ILogger log, Binder binder)
{
    var doc = new HtmlDocument();
    doc.Load(myBlob);
    var form = doc.DocumentNode.SelectSingleNode("//form");
    var elements = form.SelectNodes("//input");
    CustomType MyObject = BuildObject(elements);

    var attributes = new Attribute[]
    {
        new BlobAttribute("container-name/json/" + MyObject.ID + ".json"),
        new StorageAccountAttribute("ConnectionString")
    };
    using (var writer = await binder.BindAsync<TextWriter>(attributes))
    {
        writer.Write(BuildJSON(MyObject));
    }
}
Function 2 has the same trigger but is a different function in its own .cs file. Class and namespace omitted for brevity:
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.Logging;
using Microsoft.WindowsAzure.Storage.Blob;

[FunctionName("Function name")]
public static async Task Run(
    [BlobTrigger("container-name/html/{name}", Connection = "ConnectionString")] ICloudBlob myBlob)
{
    if (myBlob.Properties.ContentType == "text/html; charset=utf-8")
        return;

    myBlob.Properties.ContentType = "text/html; charset=utf-8";
    await myBlob.SetPropertiesAsync();
}
I think your error arises like this: Function 1 retrieves the blob, then Function 2's operation on the same blob changes its ETag. When Function 1 tries to load the blob it retrieved, it finds that the ETag has changed, so the read fails with the exception above.
If the resource is accessed or changed by multiple apps, try to make sure the original file is not modified while it is being read; otherwise the ETag of the blob changes automatically.
Azure Storage blobs use strong ETag validation: the content of two resource representations must be byte-for-byte identical, and all other entity fields (such as Content-Language) must also be unchanged.
Please refer to this: https://www.microsoftpressstore.com/articles/article.aspx?p=2224058&seqNum=12
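One way to remove the race entirely, sketched on the assumption that nothing else requires the two functions to stay separate, is to fold Function 2's content-type fix into Function 1 so a single function both mutates and reads the blob (the function name here is hypothetical):

// Hypothetical consolidated function: set the content type first, then read
// the settled blob fresh, so no second function can change the ETag mid-read.
[FunctionName("HtmlToJson")]
public static async Task Run(
    [BlobTrigger("container-name/html/{name}", Connection = "ConnectionString")] ICloudBlob myBlob,
    ILogger log, Binder binder)
{
    if (myBlob.Properties.ContentType != "text/html; charset=utf-8")
    {
        myBlob.Properties.ContentType = "text/html; charset=utf-8";
        await myBlob.SetPropertiesAsync();
    }

    // Download the content explicitly rather than relying on the conditional
    // stream the trigger bound before the property change.
    var doc = new HtmlDocument();
    using (var stream = new MemoryStream())
    {
        await myBlob.DownloadToStreamAsync(stream);
        stream.Position = 0;
        doc.Load(stream);
    }
    // ... build MyObject and write the JSON blob exactly as in Function 1 ...
}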

Issue reading values from a Dictionary in a Function App hosted in an Azure environment

I have the following Function App set up in an Azure environment. I tried to follow the guidance from the link below and extended it:
https://mitra.computa.asia/articles/msdn-smart-image-re-sizing-azure-functions-and-cognitive-services
Here I am trying to generate two thumbnail images of different sizes whenever an image is uploaded to the container.
I have one input, myBlob, and two different outputs, outputBlobsm and outputBlobmd, added under the Integrate section.
Function App Code:
using System;
using System.Text;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Collections.Generic;

public static void Run(Stream myBlob, string blobname, string blobextension, Stream outputBlobsm, Stream outputBlobmd, TraceWriter log)
{
    bool smartCropping = true;
    log.Info($"C# Blob trigger function Processed blob\n Name:{blobname} \n Extension: {blobextension} extension");

    var sizesm = imageDimensionsTable[ImageSize.Small];
    log.Info($"C# Blob \n Sizesm:{sizesm}");
    log.Info($"C# Blob \n width:{sizesm.item1} \n height: {sizesm.item2}");
    ResizeImage(sizesm.item1, sizesm.item2, smartCropping, myBlob, outputBlobsm);

    var sizemd = imageDimensionsTable[ImageSize.Medium];
    log.Info($"C# Blob \n width:{sizemd.item1} \n height: {sizemd.item2}");
    ResizeImage(sizemd.item1, sizemd.item2, smartCropping, myBlob, outputBlobmd);
}

public void ResizeImage(int width, int height, bool smartCropping, Stream myBlob, Stream outputBlob)
{
    string _apiKey = "xxxxxxxxxxxxxxxxxxxxxxxxx";
    string _apiUrlBase = "xxxxxxxxxxxxxxxxxxx/generateThumbnail";
    using (var httpClient = new HttpClient())
    {
        httpClient.BaseAddress = new Uri(_apiUrlBase);
        httpClient.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", _apiKey);
        using (HttpContent content = new StreamContent(myBlob))
        {
            // Get response
            content.Headers.ContentType = new MediaTypeWithQualityHeaderValue("application/octet-stream");
            var uri = $"{_apiUrlBase}?width={width}&height={height}&smartCropping={smartCropping.ToString()}";
            var response = httpClient.PostAsync(uri, content).Result;
            var responseBytes = response.Content.ReadAsByteArrayAsync().Result;

            // Write to output thumb
            outputBlob.Write(responseBytes, 0, responseBytes.Length);
        }
    }
}

public enum ImageSize { ExtraSmall, Small, Medium }

private static Dictionary<ImageSize, (int, int)> imageDimensionsTable = new Dictionary<ImageSize, (int, int)>()
{
    { ImageSize.ExtraSmall, (320, 200) },
    { ImageSize.Small, (640, 400) },
    { ImageSize.Medium, (800, 600) }
};
On compiling the code, I get the error below:
[Error] run.csx(16,41): error CS1061: '(int, int)' does not contain a definition for 'item1' and no extension method 'item1' accepting a first argument of type '(int, int)' could be found (are you missing a using directive or an assembly reference?)
Can anyone help me fix this issue?
C# is case sensitive. For the new tuples, if you haven't specified names, the fields are Item1, Item2, ....
See the examples here under "Tuples".
Note that you can also specify names for those ints in the declaration:
private static Dictionary<ImageSize, (int Width, int Height)> imageDimensionsTable
which makes
ResizeImage(sizesm.Width, sizesm.Height, smartCropping, myBlob, outputBlobsm);
much more readable.
Oh, and you might also need to make ResizeImage static, since it is called from the static Run method.
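Putting that together, a corrected fragment (using named tuple fields, and assuming the rest of the script stays as posted):

// Named tuple fields replace the misspelled item1/item2 accessors.
private static Dictionary<ImageSize, (int Width, int Height)> imageDimensionsTable =
    new Dictionary<ImageSize, (int Width, int Height)>()
{
    { ImageSize.ExtraSmall, (320, 200) },
    { ImageSize.Small, (640, 400) },
    { ImageSize.Medium, (800, 600) }
};

// Made static so the static Run method can call it.
public static void ResizeImage(int width, int height, bool smartCropping, Stream myBlob, Stream outputBlob)
{
    // ... body unchanged ...
}

// Inside Run:
var sizesm = imageDimensionsTable[ImageSize.Small];
ResizeImage(sizesm.Width, sizesm.Height, smartCropping, myBlob, outputBlobsm);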

Submit C# MapReduce Job Windows Azure HDInsight - Response status code does not indicate success: 500 (Server Error)

I'm trying to submit a MapReduce job to an HDInsight cluster. I didn't write a reduce portion in my job because I don't want to reduce anything. All I want to do is parse each filename and append its values to every line in the file, so that I will have all the data needed inside the file.
My code is
using Microsoft.Hadoop.MapReduce;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace GetMetaDataFromFileName
{
    class Program
    {
        static void Main(string[] args)
        {
            var hadoop = connectAzure();

            // Temp workaround for env variables
            Environment.SetEnvironmentVariable("HADOOP_HOME", @"c:\hadoop");
            Environment.SetEnvironmentVariable("Java_HOME", @"c:\hadoop\jvm");

            var result = hadoop.MapReduceJob.ExecuteJob<MetaDataGetterJob>();
        }

        static IHadoop connectAzure()
        {
            //TODO: Update credentials and other information
            return Hadoop.Connect(
                new Uri("https://sampleclustername.azurehdinsight.net//"),
                "admin",
                "Hadoop",
                "password",
                "blobstoragename.blob.core.windows.net", // Storage account where the log files live
                "AccessKeySample", // Storage account access key
                "logs",            // Container name
                true
            );
        }

        // Hadoop mapper
        public class MetaDataGetter : MapperBase
        {
            public override void Map(string inputLine, MapperContext context)
            {
                try
                {
                    // Get the metadata from the name of the file
                    string[] _fileMetaData = context.InputFilename.Split('_');

                    string _PublicIP = _fileMetaData[0].Trim();
                    string _PhysicalAdapterMAC = _fileMetaData[1].Trim();
                    string _BootID = _fileMetaData[2].Trim();
                    string _ServerUploadTime = _fileMetaData[3].Trim();
                    string _LogType = _fileMetaData[4].Trim();
                    string _MachineUpTime = _fileMetaData[5].Trim();

                    // Generate the CSV portion and prepend it to every row in the file
                    string _RowHeader = string.Format("{0},{1},{2},{3},{4},{5},", _PublicIP, _PhysicalAdapterMAC, _BootID, _ServerUploadTime, _LogType, _MachineUpTime);
                    context.EmitLine(_RowHeader + inputLine);
                }
                catch (ArgumentException)
                {
                    return;
                }
            }
        }

        // Hadoop job definition
        public class MetaDataGetterJob : HadoopJob<MetaDataGetter>
        {
            public override HadoopJobConfiguration Configure(ExecutorContext context)
            {
                // Initiate the job config
                HadoopJobConfiguration config = new HadoopJobConfiguration();
                config.InputPath = "asv://logs@sample.blob.core.windows.net/Input";
                config.OutputFolder = "asv://logs@sample.blob.core.windows.net/Output";
                config.DeleteOutputFolder = true;
                return config;
            }
        }
    }
}
What do you usually think is the reason for a 500 (Server Error)? Am I supplying the wrong credentials? Actually, I didn't really understand the difference between the Username and HadoopUser parameters in the Hadoop.Connect method.
Thank you,
I had approximately the same issue in the past (I was unable to submit a Hive job to the cluster, getting a BadGateway response). I contacted the support team, and in my case the problem was a memory leak at the head node, which means the problem was not on the client's side and appears to be an inherent Hadoop problem.
I solved it by redeploying the cluster.
Have you tried to submit other (simple) jobs? If so, then I suggest contacting the Azure support team, or just redeploying the cluster if that's not too painful for you.
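To try a deliberately simple job first, here is a minimal identity mapper sketched with the same Microsoft.Hadoop.MapReduce API as the question (the storage paths are placeholders taken from the posted code):

// A trivial pass-through job: if even this fails with a 500, the problem is
// almost certainly the cluster or the credentials, not the job code.
public class IdentityMapper : MapperBase
{
    public override void Map(string inputLine, MapperContext context)
    {
        context.EmitLine(inputLine); // echo every input line unchanged
    }
}

public class IdentityJob : HadoopJob<IdentityMapper>
{
    public override HadoopJobConfiguration Configure(ExecutorContext context)
    {
        return new HadoopJobConfiguration
        {
            InputPath = "asv://logs@sample.blob.core.windows.net/Input",
            OutputFolder = "asv://logs@sample.blob.core.windows.net/TestOutput",
            DeleteOutputFolder = true
        };
    }
}

// Submit it with the same connection as before:
// var result = hadoop.MapReduceJob.ExecuteJob<IdentityJob>();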
