I'm attempting to implement complete search functionality in my ASP.NET MVC (C#, Linq-to-Sql) website.
The site consists of about 3-4 tables that have about 1-2 columns that I want to search.
This is what I have so far:
public List<SearchResult> Search(string Keywords)
{
string[] split = Keywords.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
List<SearchResult> ret = new List<SearchResult>();
foreach (string s in split)
{
IEnumerable<BlogPost> results = db.BlogPosts.Where(x => x.Text.Contains(s) || x.Title.Contains(s));
foreach (BlogPost p in results)
{
if (ret.Exists(x => x.PostID == p.PostID))
continue;
ret.Add(new SearchResult
{
PostTitle= p.Title,
BlogPostID = p.BlogPostID,
Text=p.Text
});
}
}
return ret;
}
As you can see, I have a foreach for the keywords and an inner foreach that runs over a table (I would repeat it for each table).
This seems inefficent and I wanted to know if theres a better way to create a search method for a database.
Also, what can I do to the columns in the database so that they can be searched faster? I read something about indexing them, is that just the "Full-text indexing" True/False field I see in SQL Management Studio?
Also, what can I do to the columns in
the database so that they can be
searched faster? I read something
about indexing them, is that just the
"Full-text indexing" True/False field
I see in SQL Management Studio?
Yes, enabling full-text indexing will normally go a long way towards improving performance for this scenario. But unfortunately it doesn't work automatically with the LIKE operator (and that's what your LINQ query is generating). So you'll have to use one of the built-in full-text searching functions like FREETEXT, FREETEXTTABLE, CONTAINS, or CONTAINSTABLE.
Just to explain, your original code will be substantially slower than full-text searching as it will typically result in a table scan. For example, if you're searching a varchar field named title with LIKE '%ABC%' then there's no choice but for SQL to scan every single record to see if it contains those characters.
However, the built-in full-text searching will actually index the text of every column you specify to include in the full-text index. And it's that index that drastically speeds up your queries.
Not only that, but full-text searching provides some cool features that the LIKE operator can't give you. It's not as sophisticated as Google, but it has the ability to search for alternate versions of a root word. But one of my favorite features is the ranking functionality where it can return an extra value to indicate relevance which you can then use to sort your results. To use that look into the FREETEXTTABLE or CONTAINSTABLE functions.
Some more resources:
Full-Text Search (SQL Server)
Pro Full-Text Search in SQL Server 2008
The following should do the trick. I can't say off the top of my head whether the let kwa = ... part will actually work or not, but something similar will be required to make the array of keywords available within the context of SQL Server. I haven't used LINQ to SQL for a while (I've been using LINQ to Entities 4.0 and nHibernate for some time now, which have a different set of capabilities). You might need to tweak that part to get it working, but the general principal is sound:
public List<SearchResult> Search(string keywords)
{
var searcResults = from bp in db.BlogPosts
let kwa = keywords.Split(new char[]{' '}, StringSplitOptions.RemoveEmptyEntries);
where kwa.Any(kw => bp.Text.Contains(kw) || bp.Title.Contains(kw))
select new SearchResult
{
PostTitle = bp.Title,
BlogPostID = bp.BlogPostID,
Test = bp.Text
};
return searchResults.ToList();
}
Related
Following the question I asked: Build a dynamic query using neo4j client
I got an answer about how can I return value dynamically using string only.
When I'm trying to use the syntax to return multi values from the query it failed,
I tried the following query:
var resQuery2 = WebApiConfig.GraphClient.Cypher
.Match("(movie:Movie {title:{title}})")
.OptionalMatch("(movie)<-[r]-(person:Person)")
.WithParam("title", title)
.Return(() => Return.As<string>("movie, collect([person.name, head(split(lower(type(r)), '_')), r.roles])"));
I'm getting the following error:
The deserializer is running in single column mode, but the response
included multiple columns which indicates a projection instead. If
using the fluent Cypher interface, use the overload of Return that
takes a lambda or object instead of single string. (The overload with
a single string is for an identity, not raw query text: we can't map
the columns back out if you just supply raw query text.)
Is it possible to return multiple nodes using only strings?
We can't get an output like in the question you asked previously - this is due to the fact that you are asking for a Node (the movie) and a Collection of strings (the collect) and they have no common properties, or even styles of property.
Firstly, let's look at the painful way to do this:
var q = gc.Cypher
.Match("(movie:Movie)")
.OptionalMatch("(movie)<-[r]-(person:Person)")
.Return(() => Return.As<string>("{movie:movie, roles:collect([person.name, head(split(lower(type(r)), '_')), r.roles])}"));
var results = q.Results;
Here we take the query items (movie, r, person) and create a type with them the {} around the results, and cast that to a string.
This will give you a horrible string with the Node data around the movie and then a collection of the roles:
foreach (var m in results)
{
//This is going to be painful to navigate/use
dynamic d = JsonConvert.DeserializeObject<dynamic>(m);
Console.WriteLine(d.movie);
Console.WriteLine(d.roles);
}
You'd be a lot better off doing something like:
var q = gc.Cypher
.Match("(movie:Movie)")
.OptionalMatch("(movie)<-[r]-(person:Person)")
.Return(() => new
{
Movie = Return.As<Node<string>>("movie"),
Roles = Return.As<IEnumerable<string>>("collect([person.name, head(split(lower(type(r)), '_')), r.roles])")
});
var res = q.Results;
You could either JsonConvert.DeserializeObject<dynamic>() the Movie node, at your leisure, or write a strongly typed class.
In terms of a 'dynamic' object, I don't know how you were wanting to interact with the collect part of the return statement, if this doesn't help, you might need to update the question to show a usage expectation.
I have a rather huge (30 mln rows, up to 5–100Kb each) Table on Azure.
Each RowKey is a Guid and PartitionKey is a first Guid part, for example:
PartitionKey = "1bbe3d4b"
RowKey = "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"
Table has 600 reads and 600 writes (updates) per second with an average latency of 60ms. All queries use both PartitionKey and RowKey.
BUT, some reads take up to 3000ms (!). In average, >1% of all reads take more than 500ms and there's no correlation with entity size (100Kb row may be returned in 25ms and 10Kb one – in 1500ms).
My application is an ASP.Net MVC 4 web-site running on 4-5 Large instances.
I have read all MSDN articles regarding Azure Table Storage performance goals and already did the following:
UseNagle is turned Off
Expect100Continue is also disabled
MaxConnections for table client is set to 250 (setting 1000–5000 doesn't make any sense)
Also I checked that:
Storage account monitoring counters have no throttling errors
There are some kind of "waves" in performance, though they does not depend on load
What could be the reason of such performance issues and how to improve it?
I use the MergeOption.NoTracking setting on the DataServiceContext.MergeOption property for extra performance if I have no intention of updating the entity anytime soon. Here is an example:
var account = CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
var tableStorageServiceContext = new AzureTableStorageServiceContext(account.TableEndpoint.ToString(), account.Credentials);
tableStorageServiceContext.RetryPolicy = RetryPolicies.Retry(3, TimeSpan.FromSeconds(1));
tableStorageServiceContext.MergeOption = MergeOption.NoTracking;
tableStorageServiceContext.AddObject(AzureTableStorageServiceContext.CloudLogEntityName, newItem);
tableStorageServiceContext.SaveChangesWithRetries();
Another problem might be that you are retrieving the entire enity with all its properties even though you intend only use one or two properties - this is of course wasteful but can't be easily avoided. However, If you use Slazure then you can use query projections to only retrieve the entity properties that you are interested in from the table storage and nothing more, which would give you better query performance. Here is an example:
using SysSurge.Slazure;
using SysSurge.Slazure.Linq;
using SysSurge.Slazure.Linq.QueryParser;
namespace TableOperations
{
public class MemberInfo
{
public string GetRichMembers()
{
// Get a reference to the table storage
dynamic storage = new QueryableStorage<DynEntity>("UseDevelopmentStorage=true");
// Build table query and make sure it only return members that earn more than $60k/yr
// by using a "Where" query filter, and make sure that only the "Name" and
// "Salary" entity properties are retrieved from the table storage to make the
// query quicker.
QueryableTable<DynEntity> membersTable = storage.WebsiteMembers;
var memberQuery = membersTable.Where("Salary > 60000").Select("new(Name, Salary)");
var result = "";
// Cast the query result to a dynamic so that we can get access its dynamic properties
foreach (dynamic member in memberQuery)
{
// Show some information about the member
result += "LINQ query result: Name=" + member.Name + ", Salary=" + member.Salary + "<br>";
}
return result;
}
}
}
Full disclosure: I coded Slazure.
You could also consider pagination if you are retrieving large data sets, example:
// Retrieve 50 members but also skip the first 50 members
var memberQuery = membersTable.Where("Salary > 60000").Take(50).Skip(50);
Typically, if a specific query requires scanning a large number of rows, that will take longer time. Is the behavior you are seeing specific a query / data? Or, are you seeing the performance varies for the same data and query?
I have framed query to submit to solr which is of following format.
id:95154 OR id:68209 OR id:89482 OR id:94233 OR id:112481 OR id:93843
i want to get records according to order from starting. say i need to get document with id 95154 document first then id 68209 next and so on. but its not happening right now its giving last id 93843 first and some times random.i am using solr in grails 2.1 and my solr version is 1.4.0. here is sample way i am getting documents from solr
def server = solrService.getServer('provider')
SolrQuery sponsorSolrQuery = new SolrQuery(solarQuery)
def queryResponse = server.query(sponsorSolrQuery);
documentsList = queryResponse.getResults()
As #injecteer mentions, there is nothing built-in to Lucene to consider the sequence of clauses in a boolean query, but:
You are able to apply boosts to each term, and as long as the field is a basic field (meaning, not a TextField), the boosts will apply cleanly to give you a decent sort by score.
id:95154^6 OR id:68209^5 OR id:89482^4 OR id:94233^3 OR id:112481^2 OR id:93843
there's no such thing in Lucene (I strongly assume, that in Solr as well). In Lucene you can sort the results based on contents of documents' fields, but not on the order of clauses in a query.
that means, that you have to sort the results yourself:
documentsList = queryResponse.getResults()
def sordedByIdOrder = solarQueryAsList.collect{ id -> documentList.find{ it.id == id } }
I have a domain class Schedule with a property 'days' holding comma separated values like '2,5,6,8,9'.
Class Schedule {
String days
...
}
Schedule schedule1 = new Schedule(days :'2,5,6,8,9')
schedule1.save()
Schedule schedule2 = new Schedule(days :'1,5,9,13')
schedule2.save()
I need to get the list of the schedules having any day from the given list say [2,8,11].
Output: [schedule1]
How do I write the criteria query or HQL for the same. We can prefix & suffix the days with comma like ',2,5,6,8,9,' if that helps.
Thanks,
Hope you have a good reason for such denormalization - otherwise it would be better to save the list to a child table.
Otherwise, querying would be complicated. Like:
def days = [2,8,11]
// note to check for empty days
Schedule.withCriteria {
days.each { day ->
or {
like('username', "$day,%") // starts with "$day"
like('username', "%,$day,%")
like('username', "%,$day") // ends with "$day"
}
}
}
In MySQL there is a SET datatype and FIND_IN_SET function, but I've never used that with Grails. Some databases have support for standard SQL2003 ARRAY datatype for storing arrays in a field. It's possible to map them using hibernate usertypes (which are supported in Grails).
If you are using MySQL, FIND_IN_SET query should work with the Criteria API sqlRestriction:
http://grails.org/doc/latest/api/grails/orm/HibernateCriteriaBuilder.html#sqlRestriction(java.lang.String)
Using SET+FIND_IN_SET makes the queries a bit more efficient than like queries if you care about performance and have a real requirement to do denormalization.
I'm a little stumped on this one. Anyone have any ideas? I'll try to lay out the example as brief as possible.
Creating Silverlight 3.0 application against SQL 2005 database. Using RIA Services and Entity Framework for data access.
I need to be able to populate a grid against a table. However, my grid UI and my table structure is different. Basically my grid needs to turn rows into columns (like a PIVOT table). Here are my challenges / assumptions
I have no idea until runtime which columns I will have on the grid.
Silverlight 3 only supports binding to properties
Silverlight 3 does not allow you to add a row to the grid and manually populate data.
As we all know, Silverlight does not have the System.Data (mainly DataTable) namespace
So, how do I create an object w/ dynamic properties so that I can bind to the grid. Every idea I've had (multi-dimensional arrays, hash tables, etc.) fall apart b/c SL needs a property to bind to, I can't manually add/fill a data row, and I can't figure out a way to add dynamic properties. I've seen an article on a solution involving a linked list but I'm looking for a better alternative. It may come down to making a special "Cody Grid" which will be a bunch of text boxes/labels. Doable for sure but I'll lose some grid functionality that users expect
The ONLY solution I have been able to come up is to create a PIVOT table query in SQL 2005 and use an entity based on that query/view. SQL 2008 would help me with that. I would prefer to do it in Silverlight but if that is the last resort, so be it. If I go the PIVOT route, how do I implement a changing data structure in Entity Framework?
Data Sample.
Table
Name Date Value
Cody 1/1/09 15
Cody 1/2/09 18
Mike 1/1/09 20
Mike 1/8/09 77
Grid UI should look like
Name 1/1/09 1/2/09 1/3/09 .... 1/8/09
Cody 15 18 NULL NULL
Mike 20 NULL NULL 77
Cody
My team came up with a good solution. I'm not sure who deserves the credit but it's somewhere in google land. So far it works pretty good.
Essentially the solution comes down to using reflection to build a dynamic object based on this dynamic data. The function takes in a 2-dimensional array and turns it into a List
object with properties that can be bound. We put this process in a WCF Service and it seems to do exactly what we need so far.
Here is some of the code that builds the object using Reflection
AppDomain myDomain = AppDomain.CurrentDomain;
AssemblyName myAsmName = new AssemblyName("MyAssembly");
AssemblyBuilder myAssembly = myDomain.DefineDynamicAssembly(myAsmName, AssemblyBuilderAccess.Run);
ModuleBuilder myModule = myAssembly.DefineDynamicModule(myAsmName.Name);
TypeBuilder myType = myModule.DefineType("DataSource", TypeAttributes.Public);
string columnName = "whatever";
for (int j = 0; j <= array.GetUpperBound(1); j++)
{
Type properyType = typeof(T);
FieldBuilder exField = myType.DefineField("_" + "columnName" + counter, properyType, FieldAttributes.Private);
//The following line is where I’m passing columnName + counter and getting errors with some strings but not others.
PropertyBuilder exProperty = myType.DefineProperty(columnName + counter.ToString(), PropertyAttributes.None, properyType, Type.EmptyTypes);
//Get
MethodBuilder exGetMethod = myType.DefineMethod("get_" + "columnName" + counter, MethodAttributes.Public, properyType, Type.EmptyTypes); ILGenerator getIlgen = exGetMethod.GetILGenerator();
//IL for a simple getter:
//ldarg.0
//ldfld int32 SilverlightClassLibrary1.Class1::_Age
//ret
getIlgen.Emit(OpCodes.Ldarg_0);
getIlgen.Emit(OpCodes.Ldfld, exField);
getIlgen.Emit(OpCodes.Ret);
exProperty.SetGetMethod(exGetMethod);
//Set
MethodBuilder exSetMethod = myType.DefineMethod("set_" + "columnName" + counter, MethodAttributes.Public, null, new Type[] { properyType }); ILGenerator setIlgen = exSetMethod.GetILGenerator();
//IL for a simple setter:
//ldarg.0
//ldarg.1
//stfld int32 SilverlightClassLibrary1.Class1::_Age
//ret
setIlgen.Emit(OpCodes.Ldarg_0);
setIlgen.Emit(OpCodes.Ldarg_1);
setIlgen.Emit(OpCodes.Stfld, exField); setIlgen.Emit(OpCodes.Ret);
exProperty.SetSetMethod(exSetMethod);
counter++;
}
finished = myType.CreateType();
You can dynamically set columns with their associated bindings (ensuring that AutoGenerateColumns is off):
For instance, the name column:
DataGridTextColumn txtColumn = new DataGridTextColumn();
textColumn.Header = "Name";
textColumn.Binding = new Binding("FirstName");
myDataGrid.Columns.Add(txttColumn);
The ObservableCollection you use to store the data that is queried could possibly be overriden to support pivoting, making sure to change the binding of the DataGrid columns, as shown above.
Note: This is a fair amount of hand waving i'm sure (haven't touched silverlight for over a year); but I hope it's enough to formulate another strategy.
if you are working with two dimensional array then adding columns dynamically as shown above will not work.
The problem is with silverlight it cannot understand the binding of columns to a list.
So we have to create list of rows with row convertor that will represent our two dimensional arrays.
this one worked for me
http://www.scottlogic.co.uk/blog/colin/2010/03/binding-a-silverlight-3-datagrid-to-dynamic-data-via-idictionary-updated/