dynamic query to azure tables - c#

I'm using azure table storage to store blog posts. Each blog post can have different tags.
So I'm going to have three different tables.
One which will store the blog posts.
One to store the tags
One that will store the relation between the tags and posts
So my question is: is it possible to create dynamic search queries? I don't know until run time how many tags I want to search for.
As I understand it, you can only query Azure tables using LINQ. Or can I pass in a query string that I build dynamically?
UPDATE
Here's some example data that's in the blog table
PartitionKey,RowKey,Timestamp,Content,FromUser,Tags
user1, 1, 2012-08-08 13:57:23, "Hello World", "root", "yellow,red"
blogTag table
PartitionKey,RowKey,Timestamp,TagId,TagName
"red", "red", 2012-08-08 11:40:29, 1, red
"yellow", "yellow", 2012-08-08 11:40:29, 2, yellow
relation table
PartitionKey,RowKey,Timestamp,DataId,TagId
1, 1, 2012-08-08 11:40:29, 1, 1
2, 1, 2012-08-08 13:57:23, 1, 2
One usage example of these tables is for example when I want to get all blog post with certain tag.
I have to query the tagId from the blogTag table
There after I need to search in the relation table for the dataId
Lastly I need to search blog table for blog post with that dataId
I'm using LINQ to perform the query and it looks like following
CloudTableQuery<DataTag> tagIds = (from e in ctx2.CreateQuery<DataTag>("datatags")
where e.PartitionKey == tags
select e).AsTableServiceQuery<DataTag>();
I tried Gaurav Mantri's suggestion of using a filter, and it works. But I'm worried about how efficient it will be, and about the limit of 15 discrete comparisons allowed per query.

You can simply build the where clause as a string and pass it to the Where method, for example:
// "or" rather than "and": a single entity cannot match two different partition keys
var whereClause = "(PartitionKey eq 'Key1') or (PartitionKey eq 'Key2')";
CloudStorageAccount storageAccount = CloudStorageAccount.Parse("AccountDetails");
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
CloudTable table = tableClient.GetTableReference("TableName");
table.CreateIfNotExists();
TableQuery<YourAzureTableEntity> query =
new TableQuery<YourAzureTableEntity>()
.Where(whereClause);
var list = table.ExecuteQuery(query).ToList();
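Since the number of tags is only known at run time, the filter string itself can be assembled from a list. The following is a minimal sketch using only the base class library; `FilterBuilder` and `BuildOrFilter` are hypothetical names, not part of the storage SDK, and the quote doubling assumes the OData convention of escaping a single quote by repeating it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; it is not part of the storage SDK.
public static class FilterBuilder
{
    // Joins one "(property eq 'value')" clause per value with "or".
    public static string BuildOrFilter(string propertyName, IEnumerable<string> values)
    {
        var clauses = values.Select(v =>
            // Assumption: single quotes inside a value are escaped by doubling them.
            "(" + propertyName + " eq '" + v.Replace("'", "''") + "')");
        return string.Join(" or ", clauses);
    }
}
```

Calling `FilterBuilder.BuildOrFilter("PartitionKey", new[] { "red", "yellow" })` produces `(PartitionKey eq 'red') or (PartitionKey eq 'yellow')`, which can then be handed to `.Where(...)`. An empty list produces an empty string, so the caller should handle that case before querying.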

I am also facing exactly the same problem. I did find one solution, which I am pasting below:
public static IEnumerable<T> Get<T>(CloudStorageAccount storageAccount, string tableName, string filter)
{
string tableEndpoint = storageAccount.TableEndpoint.AbsoluteUri;
var tableServiceContext = new TableServiceContext(tableEndpoint, storageAccount.Credentials);
// The REST query option is "$filter"; the filter text is URI-escaped before it is appended.
string query = string.Format("{0}{1}()?$filter={2}", tableEndpoint, tableName, Uri.EscapeDataString(filter));
var queryResponse = tableServiceContext.Execute<T>(new Uri(query)) as QueryOperationResponse<T>;
return queryResponse.ToList();
}
Basically it utilizes DataServiceContext's Execute(Uri) method: http://msdn.microsoft.com/en-us/library/cc646700.aspx.
You would need to specify the filter condition as you would do if you're invoking the query functionality through REST API (e.g. PartitionKey eq 'mypk' and RowKey ge 'myrk').
Not sure if this is the best solution :) Looking forward to comments on this.

It is possible, but it may not be a good idea. Adding multiple query parameters like that always results in a table scan. That's probably OK in a small table, but if your tables are going to be large it will be very slow. For large tables, you're better off running a separate query for each key combination.
That said, you can build a dynamic query with some LINQ magic. Here is the helper class I've used for that:
public class LinqBuilder
{
/// <summary>
/// Build a LINQ Expression that roughly matches the SQL IN() operator
/// </summary>
/// <param name="columnValues">The values to filter for</param>
/// <returns>An expression that can be passed to the LINQ .Where() method</returns>
public static Expression<Func<RowType, bool>> BuildListFilter<RowType, ColumnType>(string filterColumnName, IEnumerable<ColumnType> columnValues)
{
ParameterExpression rowParam = Expression.Parameter(typeof(RowType), "r");
MemberExpression column = Expression.Property(rowParam, filterColumnName);
BinaryExpression filter = null;
foreach (ColumnType columnValue in columnValues)
{
BinaryExpression newFilterClause = Expression.Equal(column, Expression.Constant(columnValue));
if (filter != null)
{
filter = Expression.Or(filter, newFilterClause);
}
else
{
filter = newFilterClause;
}
}
return Expression.Lambda<Func<RowType, bool>>(filter, rowParam);
}
public static Expression<Func<RowType, bool>> BuildComparisonFilter<RowType, ColumnType>(string filterColumnName, Func<MemberExpression, BinaryExpression> buildComparison)
{
ParameterExpression rowParam = Expression.Parameter(typeof(RowType), "r");
MemberExpression column = Expression.Property(rowParam, filterColumnName);
BinaryExpression filter = buildComparison(column);
return Expression.Lambda<Func<RowType, bool>>(filter, rowParam);
}
}
You would use it something like this:
// Type arguments must be given explicitly; string is assumed here as the column type
var whereClause = LinqBuilder.BuildListFilter<MyRow, string>(queryColumnName, columnValues);
CloudTableQuery<MyRow> query = (from r in tableServiceContext.CreateQuery<MyRow>("MyTable")
where r.PartitionKey == partitionKey
select r)
.Where(whereClause) //Add in our multiple where clauses
.AsTableServiceQuery(); //Convert to table service query
var results = query.ToList();
Note also that the Table service enforces a maximum number of constraints per query. The documented maximum is 15 per query, but when I last tried this (which was some time ago) the actual maximum was 14.
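Because of that limit, a long list of values has to be split across several queries. Here is a minimal sketch of the splitting step, using only the base class library; `Batching.InBatches` is a hypothetical helper, and the batch size is left to the caller since the effective maximum is uncertain.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class Batching
{
    // Splits values into consecutive batches of at most batchSize items.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> values, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        var batch = new List<T>(batchSize);
        foreach (T value in values)
        {
            batch.Add(value);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        // Emit the final, partially filled batch if there is one.
        if (batch.Count > 0) yield return batch;
    }
}
```

Each batch would then be passed to BuildListFilter and run as its own query, with the results concatenated. Leave room in the batch size for any other comparisons in the same query, such as the PartitionKey comparison in the usage example.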

Building something like this in table storage is quite cumbersome; akin to forcing a square peg in a round hole.
Instead, you could consider using Blob storage to store your blog posts and Lucene.NET to implement your tag search. Lucene would also allow more complex searches like (Tag = "A" and Tag = "B" and Tag != "C"), and would also allow searching over the blog text itself, if you so choose.
http://code.msdn.microsoft.com/windowsazure/Azure-Library-for-83562538

Related

Sorting out Nodetypes after yield in Neo4jClient

I have the following query in Cypher:
Match(n1: Red)
Where n1.Id = "someId"
Call apoc.path.subgraphAll(n1,{ minLevel: 0,maxLevel: 100,relationshipFilter: "link",labelFilter: "+Red|Blue"})
Yield nodes, relationships
Return nodes, relationships
The graph I query has roughly a structure of "Red -> Blue -> Red" where all the edges are of the type "link".
The query yields exactly the expected result in the browser client.
My C# looks like this:
string subgraphAll = "apoc.path.subgraphAll";
object optionsObj = new {
minLevel = 0,
maxLevel = 100,
relationshipFilter = $"{link}",
labelFilter = $"+{Red}|{Blue}",
beginSequenceAtStart = "true",
bfs = true,
filterStartNode = false,
limit = -1,
//endNodes = null,
//terminatorNodes = null,
//whitelistNodes = null,
//blacklistNodes = null,
};
string options = JObject.FromObject(optionsObj).ToString();
var query = client.Cypher
.Match($"(n1:{"Red"})")
.Where((Red n1) => n1.Id == "someId")
.Call($"{subgraphAll}(n1, {options})")
.Yield($"nodes,relationships")
//FigureOut what to do
.Return<Object>("");
var result = query.ResultsAsync.Result;
My question is: How would I write that in C# with the Neo4J client and how do I get typesafe lists at the end (something like List<Red>, List<Blue>, List<Relationship>).
As Red and Blue are different types in C#, I don't see how I can deserialize the mixed "nodes" list from the query.
Note that my examples are a bit simplified. The Nodetypes are not strings but come from Enums in my application to have a safe way to know what node types exist and there are real models behind those types.
I tried to break out the whole parametrization of the stored proc, but the code is untested and I don't know if there is a better solution to do this yet. If there is a better way, please advise on that too.
I am new to cypher, so I need a little help here.
My idea was to split the nodes list into two lists (a Red list and a Blue list) and then output the three lists as properties of an anonymous object (as in the examples). Unfortunately, my Cypher isn't good enough to figure it out yet, and translating to the C# syntax at the same time doesn't help either.
My main concern is that once I deserialize into a list of untyped objects, it will be hell to parse them back into my models. So I want the query to do that sorting out for me.

In my view, if you want to go down the route of parsing the outputs into Red/Blue classes, it's going to be easier to do it in C# than in Cypher.
Unfortunately, also in this case - I think it'll be easier to execute the query using the Neo4j.Driver driver instead of Neo4jClient - and that's because at the moment, Neo4jClient seems to remove the id (etc) properties you'd need to be able to rebuild the graph properly.
With 4.0.3 of the Client you can access the Driver by doing this:
((BoltGraphClient)client).Driver
I have used a 'Movie/Person' example, as it's a dataset I had to hand, but the principles are the same, something like:
var queryStr = @"
Match(n1: Movie)
Where n1.title = 'The Matrix'
Call apoc.path.subgraphAll(n1,{ minLevel: 0,maxLevel: 2,relationshipFilter: 'ACTED_IN',labelFilter: '+Movie|Person'})
Yield nodes, relationships
Return nodes, relationships
";
var movies = new List<Movie>();
var people = new List<People>();
var session = ((BoltGraphClient)client).Driver.AsyncSession();
var res = await session.RunAsync(queryStr);
await res.FetchAsync();
foreach (var node in res.Current.Values["nodes"].As<List<INode>>())
{
//Assumption of one label per node.
switch(node.Labels.Single().ToLowerInvariant()){
case "movie":
movies.Add(new Movie(node));
break;
case "person":
/* similar to above */
break;
default:
throw new ArgumentOutOfRangeException("node", node.Labels.Single(), "Unknown node type");
}
}
With Movie etc defined as:
public class Movie {
public long Id {get;set;}
public string Title {get;set;}
public Movie(){}
public Movie(INode node){
Id = node.Id;
Title = node.Properties["title"].As<string>();
}
}
The client's failure to return ids and similar properties is something I need to look into fixing, but short of that, this is the quickest way to get where you want to be.

DocumentDB filter an array by an array

I have a document that looks essentially like this:
{
"Name": "John Smith",
"Value": "SomethingIneed",
"Tags": ["Tag1", "Tag2", "Tag3"]
}
My goal is to write a query where I find all documents in my database whose Tag property contains all of the tags in a filter.
For example, in the case above, my query might be ["Tag1", "Tag3"]. I want all documents whose tags collection contains Tag1 AND Tag3.
I have done the following:
tried an All Contains type linq query
var tags = new List<string>() {"Test", "TestAccount"};
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"))
.Where(x => x.Tags.All(y => tags.Contains(y)))
.ToList();
Created a user defined function (I couldn't get this to work at all)
var tagString = "'Test', 'TestAccount'";
var req =
Client.CreateDocumentQuery<Contact>(UriFactory.CreateDocumentCollectionUri("db", "collection"),
$"Select c.Name, c.Email, c.id from c WHERE udf.containsAll([{tagString}] , c.Tags)").ToList();
with containsAll defined as:
function arrayContainsAnotherArray(needle, haystack){
for(var i = 0; i < needle.length; i++){
if(haystack.indexOf(needle[i]) === -1)
return false;
}
return true;
}
Use System.Linq.Dynamic to create a predicate from a string
var query = new StringBuilder("ItemType = \"MyType\"");
if (search.CollectionValues.Any())
{
foreach (var searchCollectionValue in search.CollectionValues)
{
query.Append($" and Collection.Contains(\"{searchCollectionValue}\")");
}
}
The third approach actually worked for me, but the query was very expensive (more than 2000 RUs on a collection of 10K documents) and I am getting throttled heavily. The first iteration of my application must be able to support 10K results in a result set. How can I best query for a large number of results with an array of filters?
Thanks.

The UDF could be made to work but it would be a full table scan and so not recommended unless combined with other highly-selective criteria.
I believe the most performant (index-using) approach would be to split it into a series of AND statements. You could do this by building up your query string programmatically (being careful to fully escape any user-provided data for security reasons). The resulting query would look like:
SELECT *
FROM c
WHERE
ARRAY_CONTAINS(c.Tags, "Tag1") AND
ARRAY_CONTAINS(c.Tags, "Tag3")
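One way to sidestep the escaping concern is to emit a named parameter per tag instead of inlining the values. The sketch below only builds the query text and a name-to-value map with the base class library; `TagQuery` and `TagQueryBuilder` are hypothetical names, and copying the result into the SDK's parameterized query types is left out as an assumption about your setup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for the generated query; not an SDK type.
public sealed class TagQuery
{
    public string Text { get; set; }
    public Dictionary<string, string> Parameters { get; set; }
}

public static class TagQueryBuilder
{
    // Builds "SELECT * FROM c WHERE ARRAY_CONTAINS(c.Tags, @tag0) AND ..." with one parameter per tag.
    public static TagQuery Build(IReadOnlyList<string> tags)
    {
        var parameters = new Dictionary<string, string>();
        var clauses = new List<string>();
        for (int i = 0; i < tags.Count; i++)
        {
            string name = "@tag" + i;
            parameters[name] = tags[i];
            clauses.Add("ARRAY_CONTAINS(c.Tags, " + name + ")");
        }
        // With no tags there is nothing to filter on, so the WHERE clause is omitted.
        string where = clauses.Count > 0 ? " WHERE " + string.Join(" AND ", clauses) : "";
        return new TagQuery { Text = "SELECT * FROM c" + where, Parameters = parameters };
    }
}
```

For the tags "Tag1" and "Tag3" this yields `SELECT * FROM c WHERE ARRAY_CONTAINS(c.Tags, @tag0) AND ARRAY_CONTAINS(c.Tags, @tag1)`, with `@tag0` and `@tag1` bound to the two values.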

LINQ to SQL Intersect Query Failing

I have a Frequently Asked Question (FAQ) database with columns: id, Question, Answer, Category, and Keywords.
I want to take input from a user and search my database for matches. I ultimately want to take the search string and retrieve all records where any of the search string is found in either the Question or Keyword columns.
Since I'm relatively new to LINQ to SQL, I'm trying to get just a single-column search working first, and then move on to a two-column search. My code is still failing. I've seen other posts on similar topics, but the recommended solutions do not work exactly for my situation. Attempts to tweak those solutions have failed.
Problematic portion of my code:
private FAQViewModel getSearchResultFAQs(string search)
{
FAQViewModel vm = new FAQViewModel();
vm.isSearch = true;
string[] searchTerms = search.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
//var ques = db.FAQs.Where(q => q.Keywords.Intersect(searchTerms).Any());
var ques = db.FAQs.Where(q => searchTerms.Any(t => q.Keywords.Contains(t)));
List<FAQ> list = new List<FAQ>();
foreach(var qs in ques)
{
FAQ f = new FAQ();
f.Question = qs.Question;
f.Answer = qs.Answer;
f.Category = qs.Category;
f.Keywords = qs.Keywords;
f.id = qs.id;
list.Add(f);
}
CategoryModel cm = new CategoryModel();
cm.faqs = list;
vm.faqs.Add(cm);
return vm;
}
This code fails. q.Keywords is underlined in red stating
string does not contain a definition for 'Intersect'
Any assistance would be greatly appreciated.
EDIT:
I commented out the bad line of code and used Gilad's first recommendation. I now get the following error:
"Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator."
It doesn't like the foreach codeblock:
foreach(var qs in ques)
I'm really out of my depth here, but it's so close to working.
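The error appears at the foreach because that is where the query actually runs: searchTerms is a local array used with Any inside the query, and LINQ to SQL can only translate a local sequence through Contains. One common workaround is to build a single predicate that ORs together one Contains call per term. The following is a sketch only, not tested against this schema; `KeywordFilter.ContainsAny` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper for illustration only.
public static class KeywordFilter
{
    // Builds: row => selector(row).Contains(t1) || selector(row).Contains(t2) || ...
    public static Expression<Func<T, bool>> ContainsAny<T>(
        Expression<Func<T, string>> selector, IEnumerable<string> terms)
    {
        var containsMethod = typeof(string).GetMethod("Contains", new[] { typeof(string) });
        Expression body = null;
        foreach (string term in terms)
        {
            Expression clause = Expression.Call(selector.Body, containsMethod, Expression.Constant(term));
            body = body == null ? clause : Expression.OrElse(body, clause);
        }
        // With no terms, the predicate matches nothing.
        if (body == null) body = Expression.Constant(false);
        return Expression.Lambda<Func<T, bool>>(body, selector.Parameters);
    }
}
```

Usage would look something like `db.FAQs.Where(KeywordFilter.ContainsAny<FAQ>(q => q.Keywords, searchTerms))`. Each term becomes a constant in the expression tree, so no local sequence is left for the provider to translate.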

Parallel Querying Azure Storage

I currently have a query along the lines of:
TableQuery<CloudTableEntity> query = new TableQuery<CloudTableEntity>().Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, PK));
foreach (CloudTableEntity entity in table.ExecuteQuery(query))
{
//Logic
}
I have been researching parallelism, but I cannot find any good code examples of how to use it. I want to be able to query thousands of partition keys, like
CloudTableEntity().Where(PartitionKey == "11" || PartitionKey == "22")
Where I can have around 40000 Partition keys. Is there a good way to do this?

The following sample code will issue multiple partition key queries in parallel:
CloudTable table = tableClient.GetTableReference("xyztable");
List<string> pkList = new List<string>(); // Partition keys to query
pkList.Add("1");
pkList.Add("2");
pkList.Add("3");
Parallel.ForEach(
pkList,
//new ParallelOptions { MaxDegreeOfParallelism = 128 }, // optional: limit threads
pk => { ProcessQuery(table, pk); }
);
Where ProcessQuery is defined as:
static void ProcessQuery(CloudTable table, string pk)
{
string pkFilter = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, pk);
TableQuery<TableEntity> query = new TableQuery<TableEntity>().Where(pkFilter);
var list = table.ExecuteQuery(query).ToList();
foreach (TableEntity entity in list)
{
// Process Entities
}
}
Note that ORing two partition keys in the same query as you listed above will result in a full table scan. To avoid a full table scan, execute individual queries with one partition key per query as the sample code above demonstrates.
For more details on query construction please see http://blogs.msdn.com/b/windowsazurestorage/archive/2010/11/06/how-to-get-most-out-of-windows-azure-tables.aspx

Using table.ExecuteQuerySegmentedAsync will provide better performance
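With around 40000 partition keys, asynchronous queries benefit from a cap on how many run at once. Here is a minimal sketch of that throttling using only the base class library; `BoundedQueries.RunAsync` is a hypothetical helper, and the `queryOnePartition` delegate stands in for whatever per-partition call you use (for example, one built on table.ExecuteQuerySegmentedAsync).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class BoundedQueries
{
    // Runs queryOnePartition for every key, with at most maxConcurrency calls in flight.
    // Results come back in the same order as the keys.
    public static async Task<List<TResult>> RunAsync<TResult>(
        IEnumerable<string> partitionKeys,
        Func<string, Task<TResult>> queryOnePartition,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = partitionKeys.Select(async pk =>
            {
                await gate.WaitAsync();
                try { return await queryOnePartition(pk); }
                finally { gate.Release(); }
            }).ToList();

            TResult[] results = await Task.WhenAll(tasks);
            return results.ToList();
        }
    }
}
```

Each call still targets a single partition key, so the one-key-per-query advice is preserved; only the scheduling changes.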

How do I order a SQL data source of uniqueidentifiers in LINQ by an array of uniqueidentifiers

I have a string list (A) of IndividualProfileIds (GUIDs) that can be in any order (it is used to display personal profiles in a specific order based on user input). It is stored as a string because it is part of the CMS functionality.
I also have an ASP.NET C# Repeater that uses a LinqDataSource to query the individual table. This repeater needs to use the ordered list (A) to display the results in the order specified.
That is what I am having problems with. Does anyone have any ideas?
list(A)
'CD44D9F9-DE88-4BBD-B7A2-41F7A9904DAC',
'7FF2D867-DE88-4549-B5C1-D3C321F8DB9B',
'3FC3DE3F-7ADE-44F1-B17D-23E037130907'
Datasource example
IndividualProfileId Name JobTitle EmailAddress IsEmployee
3FC3DE3F-7ADE-44F1-B17D-23E037130907 Joe Blo Director dsd#ad.com 1
CD44D9F9-DE88-4BBD-B7A2-41F7A9904DAC Maxy Dosh The Boss 1
98AB3AFD-4D4E-4BAF-91CE-A778EB29D959 some one a job 322#wewd.ocm 1
7FF2D867-DE88-4549-B5C1-D3C321F8DB9B Max Walsh CEO 1

There is a very simple (single-line) way of doing this, given that you get the employee results from the database first (so resultSetFromDatabase is just example data, you should have some LINQ query here that gets your results).
var a = new[] { "GUID1", "GUID2", "GUID3"};
var resultSetFromDatabase = new[]
{
new { IndividualProfileId = "GUID3", Name = "Joe Blo" },
new { IndividualProfileId = "GUID1", Name = "Maxy Dosh" },
new { IndividualProfileId = "GUID4", Name = "some one" },
new { IndividualProfileId = "GUID2", Name = "Max Walsh" }
};
var sortedResults = a.Join(resultSetFromDatabase, s => s, e => e.IndividualProfileId, (s, e) => e);
It's impossible to have the datasource get the results directly in the right order, unless you're willing to write some dedicated SQL stored procedure. The problem is that you'd have to tell the database the contents of a. Using LINQ this can only be done via Contains. And that doesn't guarantee any order in the result set.

Turn list (A), which you stated is a string, into an actual list. For example, you could use listAsString.Split(',') and then strip the single quotes from each element. I'll assume the finished list is called list.
Query the database to retrieve the rows that you need, for example:
var data = db.Table.Where(row => list.Contains(row.IndividualProfileId));
From the data returned, create a dictionary keyed by the IndividualProfileId, for example:
var dic = data.ToDictionary(e => e.IndividualProfileId);
Iterate through the list and retrieve the dictionary entry for each item:
var results = list.Select(item => dic[item]).ToList();
Now results will have the records in the same order that the IDs were in list.
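The dictionary steps can be sketched end to end as follows, using in-memory data and only the base class library. `OrderByIdList.Sort` is a hypothetical helper; unlike `dic[item]`, it skips ids that have no matching row instead of throwing, and it compares ids case-insensitively, both of which are assumptions about what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class OrderByIdList
{
    // Returns rows in the order given by orderedIds; ids with no matching row are skipped.
    public static List<TRow> Sort<TRow>(
        IEnumerable<string> orderedIds,
        IEnumerable<TRow> rows,
        Func<TRow, string> getId)
    {
        // Assumption: ids are unique among the rows and compared without regard to case.
        var byId = rows.ToDictionary(getId, StringComparer.OrdinalIgnoreCase);
        var result = new List<TRow>();
        foreach (string id in orderedIds)
        {
            TRow row;
            if (byId.TryGetValue(id, out row)) result.Add(row);
        }
        return result;
    }
}
```

Usage would be along the lines of `OrderByIdList.Sort(list, data.ToList(), e => e.IndividualProfileId.ToString())`, with the database rows fetched first and the ordering applied in memory.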
