Checking if a blob exists in Azure Storage - c#

I've got a very simple question (I hope!) - I just want to find out if a blob (with a name I've defined) exists in a particular container. I'll be downloading it if it does exist, and if it doesn't then I'll do something else.
I've done some searching on the intertubes and apparently there used to be a function called DoesExist or something similar... but as with so many of the Azure APIs, this no longer seems to be there (or if it is, has a very cleverly disguised name).

The new API has the .Exists() function call. Just make sure that you use the GetBlockBlobReference, which doesn't perform the call to the server. It makes the function as easy as:
public static bool BlobExistsOnCloud(CloudBlobClient client,
string containerName, string key)
{
return client.GetContainerReference(containerName)
.GetBlockBlobReference(key)
.Exists();
}

Note: This answer is out of date now. Please see Richard's answer for an easy way to check for existence
No, you're not missing something simple... we did a good job of hiding this method in the new StorageClient library. :)
I just wrote a blog post to answer your question: http://blog.smarx.com/posts/testing-existence-of-a-windows-azure-blob.
The short answer is: use CloudBlob.FetchAttributes(), which does a HEAD request against the blob.

It seems lame that you need to catch an exception to test whether the blob exists.
public static bool Exists(this CloudBlob blob)
{
try
{
blob.FetchAttributes();
return true;
}
catch (StorageClientException e)
{
if (e.ErrorCode == StorageErrorCode.ResourceNotFound)
{
return false;
}
else
{
throw;
}
}
}

If the blob is public you can, of course, just send an HTTP HEAD request -- from any of the zillions of languages/environments/platforms that know how to do that -- and check the response.
The core Azure APIs are RESTful XML-based HTTP interfaces. The StorageClient library is one of many possible wrappers around them. Here's another that Sriram Krishnan did in Python:
http://www.sriramkrishnan.com/blog/2008/11/python-wrapper-for-windows-azure.html
It also shows how to authenticate at the HTTP level.
I've done a similar thing for myself in C#, because I prefer to see Azure through the lens of HTTP/REST rather than through the lens of the StorageClient library. For quite a while I hadn't even bothered to implement an ExistsBlob method. All my blobs were public, and it was trivial to do HTTP HEAD.

The new Windows Azure Storage Library already contains the Exists() method.
It's in the Microsoft.WindowsAzure.Storage.dll.
Available as NuGet Package
Created by: Microsoft
Id: WindowsAzure.Storage
Version: 2.0.5.1
See also msdn

Here's a different solution if you don't like the other solutions:
I am using version 12.4.1 of the Azure.Storage.Blobs NuGet Package.
I get an Azure.Pageable object, which is a list of all of the blobs in a container. I then use LINQ to check whether the Name property of any BlobItem in the container equals the blob name I'm looking for. (If everything is valid, of course.)
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using System.Linq;
using System.Text.RegularExpressions;
public class AzureBlobStorage
{
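private readonly BlobServiceClient _blobServiceClient;
// The constructor assigns this property, so it has to be declared.
public string ConnectionString { get; }
public AzureBlobStorage(string connectionString)
{
this.ConnectionString = connectionString;
_blobServiceClient = new BlobServiceClient(this.ConnectionString);
}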
public bool IsContainerNameValid(string name)
{
return Regex.IsMatch(name, "^[a-z0-9](?!.*--)[a-z0-9-]{1,61}[a-z0-9]$", RegexOptions.Singleline | RegexOptions.CultureInvariant);
}
public bool ContainerExists(string name)
{
return (IsContainerNameValid(name) ? _blobServiceClient.GetBlobContainerClient(name).Exists() : false);
}
public Azure.Pageable<BlobItem> GetBlobs(string containerName, string prefix = null)
{
try
{
return (ContainerExists(containerName) ?
_blobServiceClient.GetBlobContainerClient(containerName).GetBlobs(BlobTraits.All, BlobStates.All, prefix, default(System.Threading.CancellationToken))
: null);
}
catch
{
throw;
}
}
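public bool BlobExists(string containerName, string blobName)
{
// GetBlobs returns null when the container is missing or its name is invalid,
// so guard against that before running the LINQ query.
var blobs = GetBlobs(containerName);
return blobs != null && blobs.Any(b => b.Name == blobName);
}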
}
Hopefully this helps someone in the future.

This is the way I do it. Showing full code for those who need it.
// Parse the connection string and return a reference to the storage account.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("AzureBlobConnectionString"));
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
// Retrieve reference to a previously created container.
CloudBlobContainer container = blobClient.GetContainerReference("ContainerName");
// Retrieve reference to a blob named "test.csv"
CloudBlockBlob blockBlob = container.GetBlockBlobReference("test.csv");
if (blockBlob.Exists())
{
//Do your logic here.
}

If your blob is public and you need just metadata:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "HEAD";
string code = "";
try
{
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
code = response.StatusCode.ToString();
}
catch
{
}
return code; // if "OK" blob exists

If you don't like using the exception method, then the basic C# version of what judell suggests is below. Beware, though, that you really ought to handle other possible responses too; in particular, GetResponse() itself throws a WebException for error statuses such as 404.
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create(url);
myReq.Method = "HEAD";
HttpWebResponse myResp = (HttpWebResponse)myReq.GetResponse();
if (myResp.StatusCode == HttpStatusCode.OK)
{
return true;
}
else
{
return false;
}
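One way to handle the other responses mentioned above is sketched here. The class name PublicBlobCheck and the helper ExistsFromStatus are made up for illustration, and the sketch assumes the blob is publicly readable at the given URL. It treats 200 as "exists", 404 as "missing", and refuses to guess for anything else.

```csharp
using System;
using System.Net;

public static class PublicBlobCheck
{
    // Hypothetical helper: maps the status of a HEAD request to an existence answer.
    public static bool ExistsFromStatus(HttpStatusCode status)
    {
        switch (status)
        {
            case HttpStatusCode.OK:
                return true;
            case HttpStatusCode.NotFound:
                return false;
            default:
                // 403, 500, etc. say nothing reliable about existence.
                throw new InvalidOperationException(
                    "Unexpected status from HEAD request: " + (int)status);
        }
    }

    public static bool BlobExists(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return ExistsFromStatus(response.StatusCode);
            }
        }
        catch (WebException ex)
        {
            // GetResponse throws for 4xx/5xx; the status is on ex.Response.
            var errorResponse = ex.Response as HttpWebResponse;
            if (errorResponse == null)
            {
                throw; // no HTTP response at all (DNS failure, timeout, ...)
            }
            using (errorResponse)
            {
                return ExistsFromStatus(errorResponse.StatusCode);
            }
        }
    }
}
```

Keeping the status mapping in its own method makes the decision easy to adjust, for example if you would rather treat 403 as "missing" for private containers.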

With the updated SDK, once you have the CloudBlobReference you can call Exists() on your reference.
See http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storage.blob.cloudblockblob.exists.aspx

Although most answers here are technically correct, most code samples make synchronous/blocking calls. Unless you're bound by a very old platform or code base, HTTP calls should always be done asynchronously, and the SDK fully supports it in this case. Just use ExistsAsync() instead of Exists().
bool exists = await client.GetContainerReference(containerName)
.GetBlockBlobReference(key)
.ExistsAsync();

With Azure Blob storage library v12, you can use BlobBaseClient.Exists()/BlobBaseClient.ExistsAsync()
Answered on another similar question: https://stackoverflow.com/a/63293998/4865541
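A minimal sketch of the v12 call, assuming the Azure.Storage.Blobs package and a BlobContainerClient you have already constructed (for example from a connection string). The class and method names BlobExistenceV12 and BlobExistsAsync are illustrative only.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;

public static class BlobExistenceV12
{
    public static async Task<bool> BlobExistsAsync(BlobContainerClient container, string blobName)
    {
        if (container == null) throw new ArgumentNullException(nameof(container));

        // GetBlobClient only builds a client-side reference; no request is sent yet.
        BlobClient blob = container.GetBlobClient(blobName);

        // ExistsAsync returns Response<bool>; Value holds the actual answer.
        return (await blob.ExistsAsync()).Value;
    }
}
```

A caller might create the container client with new BlobContainerClient(connectionString, "my-container"), where both values are placeholders, and then await BlobExistsAsync(container, "test.csv").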

Java version for the same ( using the new v12 SDK )
This uses the Shared Key Credential authorization (account access key)
public void downloadBlobIfExists(String accountName, String accountKey, String containerName, String blobName) {
// create a storage client using creds
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClient storageClient = new BlobServiceClientBuilder().credential(credential).endpoint(endpoint).buildClient();
BlobContainerClient container = storageClient.getBlobContainerClient(containerName);
BlobClient blob = container.getBlobClient(blobName);
if (blob.exists()) {
// download blob
} else {
// do something else
}
}

Related

How to check if object exists in Google Cloud Storage using C#?

Does anyone know how to check if an object exists inside a Google Cloud Storage bucket via C#?
To test for the existence of an object without generating an exception when it isn't found, use the ListObjects() method.
The documentation on the prefix argument is a little misleading:
"Only objects with names that start with this string will be returned."
In fact, the full object name can be used and will result in positive matches.
using( var client = StorageClient.Create() )
{
var results = client.ListObjects( "bucketname", "test.jpg" );
var exists = results?.Count() > 0;
return exists;
}
I believe protecting results w/ the nullable test is unnecessary. Even when I get no results with this code results is still not null. That said, I feel safer with that added protection since we are explicitly trying to avoid a try/catch.
Obviously this code could be shortened, but I wanted to demonstrate each step of the process for clarity.
This code was tested in a .net6 Console App using Google.Cloud.Storage.V1 version 3.7.0.
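Because the prefix is a "starts with" filter, an object such as "test.jpg.bak" would also be returned for the prefix "test.jpg". A hedged variant that compares names exactly might look like the sketch below; the names GcsExistence and ObjectExists are illustrative, and it assumes the bucket itself exists and is accessible.

```csharp
using System;
using System.Linq;
using Google.Cloud.Storage.V1;

public static class GcsExistence
{
    public static bool ObjectExists(StorageClient client, string bucketName, string objectName)
    {
        if (client == null) throw new ArgumentNullException(nameof(client));

        // The prefix narrows the listing on the server side; the exact
        // comparison then rules out longer names that share the prefix.
        return client.ListObjects(bucketName, objectName)
                     .Any(o => o.Name == objectName);
    }
}
```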
You can use Google.Cloud.Storage.V1 library
using Google.Cloud.Storage.V1;
public class StorageClass
{
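public bool IsObjectExist(string bucketName, string objectname)
{
var client = StorageClient.Create();
try
{
// GetObject does not return null for a missing object; it throws a GoogleApiException with a 404 status.
client.GetObject(bucketName, objectname);
return true;
}
catch (Google.GoogleApiException e) when (e.HttpStatusCode == System.Net.HttpStatusCode.NotFound)
{
return false;
}
}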
}

Error when attaching file to System.Net.Mail. How do I fix this error?

Trying to send an email with an attachment. However I am getting an error:
"Cannot convert from 'System.Threading.Tasks.Task' to 'System.Net.Mail.Attachment'"
My error occurs in the line Attachments.Add(GetAttachment(attachmentFileName));
I have tried various conversions (see code) but I don't quite see what the issue is. I know the solution is right in front of me but I don't see it.
public class NonFERosterEmail : BaseNotificationEmail<OfferViewModel>
{
public NonFERosterEmail(OfferViewModel vm, string emailList, string attachmentFileName) : base(vm)
{
To.AddRange(GetTo(emailList));
Body = GetBody();
Subject = GetSubject();
//Attachments.Add(new Attachment(GetAttachment(attachmentFileName)));
Attachments.Add(GetAttachment(attachmentFileName));
From = new MailAddress(ConfigurationManager.AppSettings["RedirectEmailTo"]);
}
//public async Task<List<Attachment>> GetAttachment(string attachmentFileName)
public async Task<Attachment> GetAttachment(string attachmentFileName)
{
//var ret = new List<Attachment>();
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["azureStorageAccount"]);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("attachments");
CloudBlockBlob blob = container.GetBlockBlobReference(attachmentFileName);
var contentType = MimeMapping.GetMimeMapping(attachmentFileName);
Stream target = new MemoryStream();
await blob.DownloadToStreamAsync(target);
target.Position = 0;
//ret.Add(new Attachment(target, attachmentFileName, contentType));
Attachment ret = new Attachment(target, attachmentFileName, contentType);
return ret;
}
//remainder of code left out for brevity
}
I expect the GetAttachment to return a correct Attachment object which would be added to the Mail object and sent successfully.
I believe the answers from @SLaks and @Roman Marusyk are correct, but it looks like you are calling GetAttachment from the constructor, which cannot be asynchronous, so you cannot use await there. Try using the Result property of the task returned by GetAttachment, as shown below.
Attachments.Add(GetAttachment(attachmentFileName).Result);
A better solution would be to use .GetAwaiter().GetResult(): as @Roman Marusyk pointed out, and as shown in this post, if the method fails it throws the exception directly rather than wrapping it in an AggregateException.
Attachments.Add(GetAttachment(attachmentFileName).GetAwaiter().GetResult());
To get the value from a Task<T>, you must make your method async and await the task.
You need to await when you call a method that returns a Task, so instead of this
Attachments.Add(GetAttachment(attachmentFileName));
Use:
Attachments.Add(await GetAttachment(attachmentFileName));
or
Attachments.Add(GetAttachment(attachmentFileName).GetAwaiter().GetResult());
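Since a constructor cannot be async, another option is to move the awaiting into a static async factory method. The sketch below is only an illustration under stated assumptions: RosterEmail, CreateAsync and GetAttachmentAsync are hypothetical names, and the blob download from the question is replaced by an in-memory placeholder stream.

```csharp
using System.IO;
using System.Net.Mail;
using System.Text;
using System.Threading.Tasks;

public class RosterEmail
{
    public MailMessage Message { get; } = new MailMessage();

    // Private constructor: callers go through CreateAsync instead.
    private RosterEmail() { }

    public static async Task<RosterEmail> CreateAsync(string attachmentFileName)
    {
        var email = new RosterEmail();
        // Awaiting here yields an Attachment, not a Task<Attachment>.
        email.Message.Attachments.Add(await GetAttachmentAsync(attachmentFileName));
        return email;
    }

    // Stand-in for the blob download in the question (placeholder content).
    private static async Task<Attachment> GetAttachmentAsync(string fileName)
    {
        var stream = new MemoryStream(Encoding.UTF8.GetBytes("placeholder content"));
        await Task.Yield();
        return new Attachment(stream, fileName, "text/plain");
    }
}
```

The caller then writes var email = await RosterEmail.CreateAsync("roster.txt"); from an async method, so nothing blocks on the task.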

How to trigger a ServiceBusTrigger?

I have an Azure WebJob which contains code similar to this:
public class Functions
{
public static void GenerateImagesForViewer(
[QueueTrigger("resize-images-queue")] BlobInformation blobInfo,
[Blob("unprocessed-pdf-storage-container/{BlobName}", FileAccess.Read)] Stream input,
[Blob("unprocessed-pdf-storage-container/{BlobNameWithoutExtention}-pdf.jpg")] CloudBlockBlob outputPdf)
{
//Do something here
string connectionString = "myConnectionString";
TopicClient Client =
TopicClient.CreateFromConnectionString(connectionString, "resize-images-topic");
var topicMessage = new BrokeredMessage(blobInfo);
Client.Send(topicMessage);
}
public static void GenerateImagesForViewerW80(
[ServiceBusTrigger("resize-images-topic", "SizeW80")] BlobInformation blobInfo,
[Blob("unprocessed-pdf-storage-container/{BlobNameWithoutExtention}-pdf.jpg", FileAccess.Read)] Stream input,
[Blob("processed-image-storage-container/{BlobNameWithoutExtention}-h0-w80.jpg")] CloudBlockBlob outputBlob_0_80)
{
// It never comes here
//Do something here
}
}
After uploading data (BlobInformation object) to my Queue there is no problem triggering the first method (GenerateImagesForViewer). But when I try to send data (BlobInformation object) to the topic it never triggers any of the subscribers(GenerateImagesForViewerW80). Is there something wrong in the code, or there is a required configuration in Azure?
In Program.cs, config.UseServiceBus(); is required in order to use the ServiceBus trigger. No warning is shown if there are other triggers or bindings in Functions, as in your case.
See code sample below and check official guidance for more details.
var config = new JobHostConfiguration();
if (config.IsDevelopment)
{
config.UseDevelopmentSettings();
}
config.UseServiceBus();
var host = new JobHost(config);
host.RunAndBlock();
Besides, I see some suspicious blanks in your input and output blob paths. If they are also present in your original code, remove them; otherwise the trigger won't execute the blob-related code correctly.

How can I Lock an Azure Table partition in an Azure Function using IQueryable and IAsyncCollector?

I'm fiddling with Azure Functions, combining it with CQRS and event sourcing. I'm using Azure Table Storage as an Event Store. The code below is a simplified version to not distract from the problem.
I'm not interested in any code tips, since this is not a final version of the code.
public static async Task Run(BrokeredMessage commandBrokeredMessage, IQueryable<DomainEvent> eventsQueryable, IAsyncCollector<IDomainEvent> eventsCollector, TraceWriter log)
{
var command = commandBrokeredMessage.GetBody<FooCommand>();
var committedEvents = eventsQueryable.Where(e => e.PartitionKey == command.AggregateRootId);
var expectedVersion = committedEvents.Max(e => e.Version);
// some domain logic that will result in domain events
var uncommittedEvents = HandleFooCommand(command, committedEvents);
// using(Some way to lock partition)
// {
var currentVersion = eventsQueryable.Where(e => e.PartitionKey == command.AggregateRootId).Max(e => e.Version);
if(expectedVersion != currentVersion)
{
throw new ConcurrencyException("expected version is not the same as current version");
}
var i = currentVersion;
foreach (var domainEvent in uncommittedEvents.OrderBy(e => e.Timestamp))
{
i++;
domainEvent.Version = i;
await eventsCollector.AddAsync(domainEvent);
}
// }
}
public class DomainEvent : TableEntity
{
private string eventType;
public virtual string EventType
{
get { return eventType ?? (eventType = GetType().UnderlyingSystemType.Name); }
set { eventType = value; }
}
public long Version { get; set; }
}
My efforts
To be fair, I could not try anything, because I don't know where to start or whether this is even possible. I did some research which did not solve my problem, but it could help you solve it.
Do Azure Tables support locking?
yes, they do: Managing Concurrency in Microsoft Azure Storage. It's called leasing, but I do not know how to implement this in an Azure Function.
Other sources
Azure Functions triggers and bindings developer reference
Azure Functions C# developer reference
Tips, suggestions, alternatives
I'm always open to any suggestions on how to solve problems, but I cannot accept these as an answer to my question. Unless the answer to my question is "no", I can not mark an alternative as an answer. I'm not seeking for the best way to solve my problem, I want it to work the way I engineered it. I know this is stubborn, but this is practice/fiddling.
Blob leases would indeed work pretty well for what you're trying to accomplish (the Functions runtime actually makes extensive use of that internally).
If, before working on a partition, you acquire a lease on a blob (by convention, a blob named after the partition, or something like that) you'd be able to ensure only a given function is working on that partition.
The article you've linked to does show an example of lease acquisition and release, you can find more information in the documentation.
One thing you want to ensure is that you flush your collector before you leave the lock scope (by calling FlushAsync on it)
I hope this helps!
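A rough sketch of the lease approach described above, assuming the WindowsAzure.Storage client library and a lock blob that already exists (one blob per partition, by whatever naming convention you choose). PartitionLock and RunWithLeaseAsync are hypothetical names, not part of the SDK or the Functions runtime.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public static class PartitionLock
{
    public static async Task RunWithLeaseAsync(CloudBlockBlob lockBlob, Func<Task> criticalSection)
    {
        if (lockBlob == null) throw new ArgumentNullException(nameof(lockBlob));
        if (criticalSection == null) throw new ArgumentNullException(nameof(criticalSection));

        // Finite leases last 15 to 60 seconds. Acquiring fails with a
        // StorageException (HTTP 409) while another holder has the lease.
        string leaseId = await lockBlob.AcquireLeaseAsync(TimeSpan.FromSeconds(30), null);
        try
        {
            await criticalSection();
        }
        finally
        {
            // Release explicitly so other invocations do not wait for expiry.
            await lockBlob.ReleaseLeaseAsync(AccessCondition.GenerateLeaseCondition(leaseId));
        }
    }
}
```

Inside criticalSection you would re-read the current version, add the events to the collector, and call FlushAsync on it before returning, so the writes happen while the lease is still held. If the work can take longer than the lease duration, the lease would need to be renewed or made longer.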

what's best way to check if a S3 object exists?

Currently, I make a GetObjectMetaDataRequest; if getting the GetObjectMetaDataResponse throws an exception, it means the object doesn't exist. Is there a better way to check whether the file exists without downloading it?
You can use the S3FileInfo class and its Exists property, which lets you check whether a file exists without downloading it. See the example below; I used AWSSDK 3.1.6 for .NET 3.5:
public static bool ExistsFile()
{
BasicAWSCredentials basicCredentials = new BasicAWSCredentials("my access key", "my secretkey");
AmazonS3Config configurationClient = new AmazonS3Config();
configurationClient.RegionEndpoint = RegionEndpoint.EUCentral1;
try
{
using (AmazonS3Client clientConnection = new AmazonS3Client(basicCredentials, configurationClient))
{
S3FileInfo file = new S3FileInfo(clientConnection, "mybucket", "FolderNameUniTest680/FileNameUnitTest680");
return file.Exists;//if the file exists return true, in other case false
}
}
catch(Exception ex)
{
return false;
}
}
If you are in a similar situation as myself and are using .Net Core and don't have access to Amazon.S3.IO (and S3FileInfo method), you can do the following using asynchronous GetObjectMetadataRequest method:
static private AmazonS3Client s3Client = new AmazonS3Client();
public static async Task<bool> FileExistsS3Async(string _bucket, string _key)
{
GetObjectMetadataRequest request = new GetObjectMetadataRequest()
{
BucketName = _bucket,
Key = _key
};
try
{
await s3Client.GetObjectMetadataAsync(request);
return true;
}
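catch (AmazonS3Exception exception) when (exception.StatusCode == System.Net.HttpStatusCode.NotFound)
{
// Only a 404 means the object is missing; other failures (e.g. access denied) propagate.
return false;
}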
}
This function has worked for me when calling within a Unity game. You can also call the above function synchronously using the following:
bool exists = Task.Run(()=>FileExistsS3Async(_bucket, _key)).Result;
Try this solution, it works for me.
AmazonS3Client client = new AmazonS3Client(accessKey, secretKey, regionEndpoint);
S3FileInfo s3FileInfo = new S3FileInfo(client, bucketName, fileName);
return s3FileInfo.Exists;
There is no ListObjectRequest, but instead a ListObjectsRequest where you cannot specify the Key. You then have to go through all the objects to find the one you want. I am currently looking in to it since I seem to get time out errors whilst downloading the file. (If anyone has some idea how to solve that feel free to comment).
You could instead try the List Parts Request if you happen to know the upload id.
Other than that I have no idea. Would like to have a chat with the person who wrote the S3 api...
You're probably going to have to use the REST API yourself, as the suggested method internally does exactly the same thing (a try...catch around the request).
You can use this code to check whether an object exists in S3 or not:
public class S3CheckFileExists
{
private readonly IAmazonS3 amazonS3;
public S3CheckFileExists(IAmazonS3 amazonS3)
{
this.amazonS3 = amazonS3;
}
public async Task<bool> S3ObjectExists(string bucketName, string keyLocation)
{
var listS3Objects = await this.amazonS3.ListObjectsV2Async(new ListObjectsV2Request
{
BucketName = bucketName,
Prefix = keyLocation, // eg myfolder/myimage.jpg (no / at start)
MaxKeys = 1
});
if (listS3Objects.S3Objects.Any() == false || listS3Objects.S3Objects.All(x => x.Key != keyLocation))
{
// S3 object doesn't exist
return false;
}
// S3 object exists
return true;
}
}
You'll need to register IAmazonS3 in your IoC (aka services) container though:
services.AddAWSService<IAmazonS3>();
Yes.
You can use a ListObjectsRequest. Use the Marker property, and retrieve only 1 element.
