Techniques to make autocomplete on a website more responsive - C#

In my website's advanced search screen there are about 15 fields that need an autocomplete field.
Their contents all depend on each other's values (so if one is filled in, the others' contents change based on it).
Most of the fields have a huge number of possibilities (thousands of entries at least).
Currently I make an AJAX call when the user stops typing for half a second. This call hits my Lucene index and returns a bunch of JSON objects. The method itself is really fast, but the connection and the data transfer are too slow.
If I look at other sites (say Facebook), their autocomplete is instant. I figure they put the possible values in their HTML, so they don't have to do a round trip. But I fear that with the amounts of data I'm handling, this is not an option.
Any ideas?

Return only the top X results.
Get some trends about what users are picking, and order based on that, preferably automatically.
Cache results for every URL & keystroke combination, so that you don't have to round-trip if you've already fetched the result before. Share this cache with all autocompletes that use the same URL & keystroke combination.
Of course, enable gzip compression for the JSON, and ensure you're setting your cache headers to cache for some time. The time depends on how quickly your autocomplete data changes.
Optimize the JSON to send down the bare minimum. Don't send down anything you don't need.
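A minimal sketch combining the top-X and minimal-JSON points, assuming an ASP.NET MVC controller and a hypothetical SearchIndex wrapper around the Lucene index:

// ASP.NET MVC action that returns only the top 10 hits as trimmed-down JSON.
// SearchIndex.Query is a hypothetical wrapper around the Lucene index.
[HttpGet]
public JsonResult Suggest(string field, string term)
{
    var hits = SearchIndex.Query(field, term)
        .Take(10)                                   // top X only
        .Select(h => new { n = h.Name, v = h.Id })  // bare-minimum payload
        .ToList();
    return Json(hits, JsonRequestBehavior.AllowGet);
}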

Are you returning ALL results for the possibilities, or just the top 10 as JSON objects?
I notice a lot of people send large numbers of results back to the screen, but then only show the first few. By sending back a small number of results, you can reduce the data transfer.

Return the top "X" results, rather than the whole list, to cut back on the number of options? You might also want to try and put in some trending to track what users pick from the list so you can try and make the top "X" the most used/most relvant. You could always return your most relevant list first, then return the full list if they are still struggling.

In addition to limiting the set of results to a top X, consider enabling caching on the responses of the AJAX requests (which means using GET and keeping the URL simple).
It's amazing how often users will backspace and then end up retyping exactly the same content. Also, by allowing public and server-side caching you could speed up the overall round-trip time.
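With ASP.NET MVC, for example, you could let the output cache handle those repeats by marking the GET action as cacheable and varying on the query string (the 60-second duration here is just an assumption):

// Cache each URL & keystroke combination for 60 seconds, on the server and downstream.
[HttpGet]
[OutputCache(Duration = 60, VaryByParam = "field;term", Location = OutputCacheLocation.Any)]
public JsonResult Suggest(string field, string term)
{
    // LookUpTopResults is a hypothetical method that queries the index.
    return Json(LookUpTopResults(field, term), JsonRequestBehavior.AllowGet);
}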

Cache the results in System.Web.Cache
Use a Lucene cache
Use GET not POST as IE caches this
Only grab a subset of results (10 as people suggest)
Try a decent 3rd party autocomplete widget like the YUI one
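A rough sketch of the first point, caching Lucene results in System.Web.Cache (the Suggestion type and QueryLuceneIndex call are placeholders):

// Check the ASP.NET cache before hitting the Lucene index again for the same term.
public IList<Suggestion> GetSuggestions(string term)
{
    string key = "autocomplete:" + term.ToLowerInvariant();
    var cached = HttpRuntime.Cache[key] as IList<Suggestion>;
    if (cached != null)
        return cached;

    IList<Suggestion> results = QueryLuceneIndex(term); // placeholder for the Lucene lookup
    HttpRuntime.Cache.Insert(key, results, null,
        DateTime.UtcNow.AddMinutes(5), System.Web.Caching.Cache.NoSlidingExpiration);
    return results;
}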

Returning the top-N entries is a good approach. But if you want/have to return all the data, I would try and limit the data being sent and the JSON object itself.
For instance:
"This Here Company With a Long Name" becomes "This Here Company..." (you put the dots in the name client side--again; transfer a minimum of data).
And as far as the JSON object goes:
{n: "This Here Company", v: "1"}
... Where "n" would be the name and "v" would be the value.

Related

Serializing objects bigger than 2MiB to Json in Asp.net

We are currently doing performance tests to determine whether Kendo UI is fast enough for our needs. For that we need to perform tests with a large database (~150 columns and ~100,000 rows).
Table rows should be read by a Kendo UI Grid using AJAX calls, which return the data as a JSON string. With our test data (random strings of 3-10 chars) this works for up to ~700 result rows per request. More, and we hit maxJsonLength, which is already set to Int32.Max-3.
We are not planning on displaying that many rows per page, but there might be binary data attached to the rows. That data could, even with 20 rows, easily go above the 2 MiB restriction implied by having to use an Int32 to set the max size.
So is there any way to serialize objects with a length bigger than 2M?
JSON isn't really designed to transfer large binary data. If you want your UI to be fast and snappy, you should try splitting larger objects into smaller ones and removing binary content from the JSON.
For example, you can refactor the JSON to only carry a link to the binary resource. If that binary resource is actually needed on screen, you can perform a separate request. In fact, you can perform requests in parallel: e.g. load the JSON and display the content, then load the first N entries with binary data and display them. Don't load the rest, as it will slow down your page render time.
We are now using the JSON library from http://www.newtonsoft.com to serialize the objects. It is not bound by the web.config settings and, as far as I know, can handle JSON of unlimited length.
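For instance, serializing with Json.NET yourself and returning the string as a ContentResult bypasses the built-in JavaScriptSerializer and its maxJsonLength check entirely (the action name and data-access call below are only illustrative):

// Write the Json.NET output directly, so the framework's serializer (and maxJsonLength) never runs.
public ContentResult GridData(int page, int pageSize)
{
    var rows = LoadRows(page, pageSize); // hypothetical data access
    string json = Newtonsoft.Json.JsonConvert.SerializeObject(rows);
    return Content(json, "application/json");
}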

Batch get items using Parse API times out

I have an array of Object Ids which I need to retrieve from Parse. The size of the array varies greatly, and sometimes there are duplicates. Up until now, I've been prototyping, so I would use
string[] objectIds = new [] { "xT6...
...WhereContainedIn("objectId", objectIds);
And this would work okay. In real life, though, the size of the objectId array above can reach in the hundreds, and the query returns "operation was slow and timed out". I really have two questions here:
1) There has to be a better way to retrieve an array of objects, if you know the object Ids, but I couldn't find it. Is WhereContainedIn() the only solution here?
2) Are there any guidelines for how/when queries will simply fail? The documentation only mentions a limit of 1000 items to be retrieved, and nothing about the query going in. If it turns out that this query has to be batched, that would be okay, but there are no guidelines for batching, either.
I have never used (or even heard of) Parse, but reading through the documentation I found this text about the limit; maybe it will help:
"You can limit the number of results by calling Limit. By default, results are limited to 100, but anything from 1 to 1000 is a valid limit:"
https://www.parse.com/docs/dotnet_guide#queries-constraints
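If a single WhereContainedIn over hundreds of ids keeps timing out, one thing to try is batching the ids into smaller chunks and running several queries. A rough sketch, where the chunk size of 100 and the class name are guesses and the query methods are the ones from the question and the linked docs:

// Split the ids into chunks of 100 and query each chunk separately (inside an async method).
var results = new List<ParseObject>();
var chunks = objectIds.Distinct()
                      .Select((id, index) => new { id, index })
                      .GroupBy(x => x.index / 100, x => x.id);
foreach (var chunk in chunks)
{
    var query = ParseObject.GetQuery("MyClass")             // "MyClass" is a placeholder class name
                           .WhereContainedIn("objectId", chunk)
                           .Limit(100);
    results.AddRange(await query.FindAsync());
}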

How to centrally maintain a mathematical formula in C# (web) so it can be changed if needed?

We have an application that has a LOT of mathematical checks on the page, and based on the results the user is shown a traffic light (red, green, yellow).
Green = he may continue
Red = don't let him continue
Yellow = allow him to continue but warn
These formulas operate on the various text fields on the page. So, for example, if textbox1 has "10" and textbox2 has "30", the formula might be:
T1 * T2 > 600 ? "GREEN" : "RED"
My question is:
Is it possible to somehow centralize these formulas?
Why do I need it?
Right now, if there is any change in a formula, we have to replicate the change on the server side as well (a violation of DRY, and difficult-to-maintain code).
One option could be to
- store the (simple) formula as text with placeholders in a config(?)
- replace the placeholders with values in javascript as well as server-side code
- use eval() for computation in JS
- use tricks outlined here for C#
The issue with this approach could be different interpretations of the same mathematical string in JS and C#.
Am I making sense, or should this question be reported?! :P
Depending on your application's requirements, it may be acceptable to just do all the validation on the server. Particularly if you have few users or most of them are on a reasonably fast intranet, you can "waste" some network calls to save yourself a maintenance headache.
If the user wants feedback between every field entry (or every few entries, or every few seconds), you could use an AJAX call to ask the server for validation without a full page refresh.
This will, of course, result in more requests than doing the validation entirely on the client, and if many of your users have bad network connections there could be latency in giving them feedback. My guess is the total bandwidth usage is about the same: you use some for every validation round-trip, but those are small, and it may be outweighed by all the validation JS that you're not going to send to clients.
The main benefit is the maintenance and FUD that you'd otherwise have keeping the client and server validation in sync. There's also the time savings in never having to write the validation javascript.
In any case, it may be worth taking a step back and asking what your requirements are.
The Microsoft.CSharp.CSharpCodeProvider provider can compile code on-the-fly. In particular, see CompileAssemblyFromFile.
This would allow you to execute code at runtime from a web.config, for instance; however, use with caution.
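A bare-bones sketch of that idea, assuming the formula text is stored somewhere like web.config (no error handling, and the class/method names are made up for the example):

// using Microsoft.CSharp; using System.CodeDom.Compiler; using System.Reflection;
string formula = "t1 * t2 > 600 ? \"GREEN\" : \"RED\"";   // would come from web.config or a DB
string source =
    @"public static class Rules
      {
          public static string Evaluate(double t1, double t2)
          {
              return " + formula + @";
          }
      }";

var provider = new CSharpCodeProvider();
var options = new CompilerParameters { GenerateInMemory = true };
CompilerResults compiled = provider.CompileAssemblyFromSource(options, source);

MethodInfo evaluate = compiled.CompiledAssembly.GetType("Rules").GetMethod("Evaluate");
string light = (string)evaluate.Invoke(null, new object[] { 10.0, 30.0 }); // "RED" (300 is not > 600)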
You could write C# classes to model your expressions with classes such as Expression, Value, BooleanExpr, etc. (an Abstract Syntax Tree)
e.g.
Expression resultExpression = new ValueOf("T1").Times(new ValueOf("T2")).GreaterThan(600).IfElse("GREEN", "RED")
                              ^Variable          ^Variable               ^Value => BoolExpr ^(Value, Value) => Value
These expressions could then be used for evaluation in C# AND to emit JavaScript for the checks:
String result = resultExpression.bind("T1", 10).bind("T2", 20).evaluate() // returns "RED" (200 is not > 600)
String jsExpression = resultExpression.toJavaScript() // emits T1 * T2 > 600 ? "GREEN" : "RED"
You can make a low level calculator class that uses strings as input and pushes and pops things onto a stack. Look up a "Reverse Polish Calculator". If the number of inputs you are using doesn't change this would be a pretty slick way to store your equations. You would be able to store them in a text file or in a config file very easily.
for instance:
string Equation = "V1,V2,+";
string ParseThis = Equation.Replace("V1", "34").Replace("V2", "45");
foreach(string s in ParseThis.split(',')) {
if (s == "+") {
val1 = stack.Pop();
val2 = stack.Pop();
return int.parse(val1) + int.Parse(val2);
}
else {
stack.Push(s);
}
}
Obviously this gets more complicated with different equations, but it would allow you to store your equations as strings anywhere you want.
Apologies if any of the syntax is incorrect, but this should get you going in the right direction.
The simplest solution would be to implement the formulae once in C# server-side, and use AJAX to evaluate the expressions from the client when changes are made. This might slow down the page.
If you want the formulae evaluated client-side and server-side but written only once, then I think you will need to do something like:
Pull the formulae out into a separate class
For the client-side:
Compile the class to Javascript
Call into the javascript version, passing in the values from the DOM
Update the DOM using the results of the formulae
For the server-side:
Call into the formulae class, passing in the values from the form data (or controls if this is web-forms)
Take the necessary actions using the results of the formulae
... or you could do the converse: write the formulae in JavaScript, and use a JavaScript engine for C# to evaluate that code server-side.
I wouldn't spend time writing a custom expression language, or writing code to translate C# to Javascript, as suggested by some of the other answers. As shown in the questions I linked to, these already exist. This is a distraction from your business problem. I would find an existing solution which fits and benefit from someone else's work.

Fast autocomplete from in-memory collection (.NET)

I have this text input field on a web page. User types in item names for purchase. I'd like to provide a dropdown with possible names, based on letters typed so far.
Question is how to implement the search on the server (ASP.NET MVC). I'll probably load the whole collection of item names (there are over 100 000) in a static variable on app start. How should I implement efficient search for names starting with given one or more characters?
TIA
You can sort the collection by name, then write a modified binary search that returns a range of items.
However, I would recommend first trying a simple sequential search and seeing how it behaves under load.
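A sketch of the sorted-collection idea: a binary search locates where the prefix would be inserted, then you walk forward while items still start with that prefix (names are illustrative, and the list must be pre-sorted with the same comparer):

// names must already be sorted with StringComparer.OrdinalIgnoreCase.
public static IEnumerable<string> PrefixSearch(List<string> names, string prefix, int max)
{
    // When the prefix itself isn't in the list, BinarySearch returns the bitwise
    // complement of the insertion point, which is where the first match would sit.
    int start = names.BinarySearch(prefix, StringComparer.OrdinalIgnoreCase);
    if (start < 0) start = ~start;

    for (int i = start; i < names.Count && i < start + max; i++)
    {
        if (!names[i].StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            yield break; // left the range of matching names
        yield return names[i];
    }
}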
"I'll probably load the whole collection of item names (there are over 100 000) in a static variable on app start. How should I implement efficient search for names starting with given one or more characters?"
By NOT (!) loading them into a static variable. Hit the db server on every request with a "top 101" clause. Finished.
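Something along these lines against the database, as a sketch (the table, column, and variable names are made up):

// Let the database do the prefix match and cap the rows (LINQ to SQL / EF style).
var suggestions = db.Items
    .Where(i => i.Name.StartsWith(prefix))
    .OrderBy(i => i.Name)
    .Take(101)          // "top 101": one more than you show, so you know there are more
    .Select(i => i.Name)
    .ToList();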

Efficient algorithm for finding related submissions

I recently launched my humble side project and would like to add a "related submissions" section when viewing a submission. Exactly like what SO is doing here - see right column, titled "Related"
Considering that each submission has a title and a set of tags, what is the most effective (optimum result) and most efficient (fast, memory-friendly) way to query the database for related submissions?
I can think of one way to do this (which I'll post as an answer) but I'm very interested to see what others have to say. Or perhaps there's already a standard way of achieving this?
Here's my two cent solution:
To achieve the best output, we need to put “weight” on the query results.
To start with, each submission in the database is assumed to have a weight of zero.
Then, if a submission in the "pool" shares one tag with the current submission, we add +3 to the found submission's weight. Hence, if a submission is found that shares two tags with the current submission, we add +6 to its weight.
Next, we split/tokenize the title of the current submission and remove “stop words”.
I’ve seen a list of stop words from google, but for now I’ll define my stop words to be: [“of”, “a”, “the”, “in”]
Example:
Title “The Best Submission of All Times”
The resulting array: [“The”, “Best”, “Submission”, “of”, “All”, “Times”]
Remove stop words: [“Best”, “Submission”, “All”, “Times”]
Then we query the database for submissions whose titles contain any of the remaining words, and for each match we add +2 to the weight.
And finally sort the list descending by weight and take the top N results.
What do you think? (be gentle!)
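A rough LINQ sketch of that weighting, done in memory against a pool of candidate submissions (the Submission type and the current and pool variables are stand-ins; in practice you would push the tag and title matching into the database query):

var stopWords = new HashSet<string> { "of", "a", "the", "in" };
var titleWords = current.Title.Split(' ')
    .Where(w => !stopWords.Contains(w.ToLowerInvariant()))
    .ToList();

var related = pool
    .Where(s => s.Id != current.Id)
    .Select(s => new
    {
        Submission = s,
        Weight = 3 * s.Tags.Intersect(current.Tags).Count()      // +3 per shared tag
               + 2 * titleWords.Count(w => s.Title.Contains(w))  // +2 per shared title word
    })
    .Where(x => x.Weight > 0)
    .OrderByDescending(x => x.Weight)
    .Take(10)
    .Select(x => x.Submission)
    .ToList();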
If I understand correctly, you need a technique to determine whether two posts are "similar" to each other. You may want to use a probabilistic model for that:
http://en.wikipedia.org/wiki/Mutual_information
The idea would be to say that if two posts share a lot of "uncommon" words, they are probably speaking on the same topic. For detecting uncommon words, depending on your application, you may use a general table of frequencies, or maybe better, build it yourself on the universe of the words of your posts (but you will need to have enough of them to have something relevant).
I would not limit myself to title and tags, but I would give them extra weight in the scoring.
This kind of idea is very common in spam filtering. Unfortunately I don't have the time to do a full review, but a quick Google search gives:
http://www.aclweb.org/anthology/P/P04/P04-3024.pdf
karlmicha.googlepages.com/acl2004_poster.pdf
