Problem:
We're querying our database with a lot of entries. The overall application performance is okay. But I think it could be better if our bottleneck, querying a special table, could be done parallel.
We are using for that dynamic finders like:
Object.findAllByObjectCollectionAndParentIsNullAndDeletedByUserIsNull(objectCollection, [sort: 'attr1.number', fetch: [objectType: 'eager']])
I think about this solution: Dividing the query into 2 steps.
The first step loads only the Ids in a sorted way.
The second step (could be done in parallel threads) loads the objects itself by the id and inserts it into the resultset.
I already googled for some hints about that, but find nothing.
Is it possible, that it makes no sense?
Or is there already a adequate solution in grails standard/extensions?
If it makes sense and there is no solution: Can someone give me a hint implementing it? But, we need the ids in that sorted manner, explained in the example.
We're using grails 2.3.11, hibernate:3.6.10.13 with jdk 1.7 under it.
I found a solution!
ArrayList<Long> objectIds = Attribute.executeQuery("select o.id from Object o, ObjectType ot where o.objectCollection = :oc and o.parent is null and o.deletedByUser is null and oc.typeUsage = ot order by ot.nr asc", [oc: objectCollection])
Object[] tmp = new Object[objectIds.size()]
withPool(10, {
objectIds.eachWithIndexParallel {
Long objectId, i ->
Object o = Object.findById(objectId, [cache: true])
tmp[i] = a
}
})
for (Object a : tmp)
a.merge()
Its round about 3 times slower!
I think the reason is, that
we have to create the threads and databaseconnections too.
After loading the objects, they need to be merged into our local thread.
But in the end - we cant use it, beause the lazy loaded fields couldn't be loaded on demand later. So this could be only a solution for read-only Objects
Related
I have the following code
PagedResultList res = myService.getPage(paginateParams, ...)
println res.size() // returns 2
println res.getTotalCount() // returns 1
getPage looks like:
def criteria = MyDomain.createCriteria()
criteria.list(max: paginateParams.max, offset: paginateParams.offset) { // max is 10, offset is 0, sortBy is updatedAt and sortOrder is desc
eq('org', org)
order(paginateParams.sortBy, paginateParams.sortOrder)
}
why do the two method return different values? The documentation doesn't explain the difference, but does mention that getTotalCount is for number of records
currently on grails 2.4.5
edits:
println on res prints out:
res: [
com.<hidden>.MyDomain: 41679f98-a7c5-4193-bba8-601725007c1a,
com.<hidden>.MyDomain: 41679f98-a7c5-4193-bba8-601725007c1a]
Yes, res has a SINGLE object twice - that's the bug I'm trying to fix. How do I know that? I have an primary key on MyDomain's ID, and when I inspect the database, it's also showing one record for this particular org (see my criteria)
edit 2: I found this comment (http://docs.grails.org/2.4.5/ref/Domain%20Classes/createCriteria.html)
listDistinct If subqueries or associations are used, one may end up
with the same row multiple times in the result set. In Hibernate one
would do a "CriteriaSpecification.DISTINCT_ROOT_ENTITY". In Grails one
can do it by just using this method.
Which, if I understand correctly, is their way of saying "list" method doesn't work in this scenario, use listDistinct instead but then they go on to warn:
The listDistinct() method does not work well with the pagination
options maxResult and firstResult. If you need distinct results with
pagination, we currently recommend that you use HQL. You can find out
more information from this blog post.
However, the blog post is a dead link.
Related: GORM createCriteria and list do not return the same results : what can I do?
Not related to actual problem after question edited but this quote seems useful
Generally PagedResultList .size() perform size() on resultList property (in-memory object represent database record), while .getTotalCount() do count query against database. If this two value didn't match your list may contain duplicate.
After viewing related issues (GORM createCriteria and list do not return the same results : what can I do?) I determined that there were several approaches:
Use grails projection groupBy('id') - doesn't work b/c i need the entire object
USe HSQL - Domain.executeQuery - actually this didn't work for my scenario very well because this returns a list, whereas criteria.list returns a PagedResultList from which I previously got totalCount. This solution had me learning HSQL and also made me break up my existing logic into two components - one that returned PagedResultList and one that didn't
Simply keep a set of IDs as I process my PagedResultList and make sure that I didn't have any duplicates.
I ended up going with option 3 because it was quick, didn't require me to learn a new language (HSQL) and I felt that I could easily write the code to do it and I'm not limited by the CPU to do such a unique ID check.
I'm trying to accelerate the performance of my app and wonder if there is a difference between accessing domain property value with instance.name and instance.getName()
If it is, which one is the best in terms of performance ?
Example
class User {
String name
}
User user = User.get(100);
//is it better this way
user.name
//or this way
user.getName()
Thank you
It doesn't matter for the usage you've provided, because user.name uses user.getName() behind scenes. So it's the same. If you want to access property directly you have to use # like this user.#name. See more here
But I don't think this is the way you can speed up your app.
It is very likely you will find a lot easier ways for improving performance of your code. Here are some ideas where to start if you like to improve performance.
A) Number of queries. Try to avoid the the N+1 problem. For example if one user hasMany [events: Event], code like user.events.each { access event.anyPropertyExceptId } will dispatch new queries for each event.
B) Efficiency of queries. Grails per default creates indexes for all gorm associations / other nested domains. However anything you use to search, filter etc. you need to do "manually" for example.
static mapping = {
anyDomainProperty index: 'customIndexName'
}
C) Only query for the data you are interested in, replace for example:
User.all.each { user ->
println user.events.size()
}
with
Event.withCriteria {
projections {
property('user')
countDistinct('id')
groupProperty('user')
}
}
D) If you really need to speed up your groovy code and your problem is rather a single request than general cpu usage, take a look at http://gpars.codehaus.org and http://grails.org/doc/2.3.8/guide/async.html and try to parallize work.
I doubt any performance issues in your application are related to how you are accessing your properties of your domain classes. In fact, if you profile/measure your application I'm sure you will see that is the case.
I've got a domain class, Widget, that I need to delete all instances out of -- clear it out. After that, I will load in fresh data. What do you suggest as a mechanism to do this?
P.S. Note this is not at bootstrap time, but at "run-time".
The easiest way is to use HQL directly:
DomainClass.executeUpdate('delete from DomainClass')
DomainClass.findAll().each { it.delete() }
If you want to avoid any GORM gotchas, such as needing to delete the object immediately and checking to make sure it actually gets deleted, add some arguments.
DomainClass.findAll().each { it.delete(flush:true, failOnError:true) }
Fairly old post, but still actual.
If your table is very large (millions of entries), iterating using findall()*.delete() might not be the best option, as you can run into transaction timeouts (e.g. MySQL innodb_lock_wait_timeout setting) besides potential memory problems stated by GreenGiant.
So at least for MySQL Innodb, much faster is to use TRUNCATE TABLE:
sessionFactory.currentSession
.createSQLQuery("truncate table ${sessionFactory.getClassMetadata(MyDomainClass).tableName}")
.executeUpdate()
This is only useful if your table is not referenced by other objects as a foreign key.
From what I learnt, I agree with #ataylor the below code is fastest IF there are no associations in your domain object (Highly unlikely in any real application):
DomainClass.executeUpdate('delete from DomainClass')
But if you have assiciations with other domains, then the safest way to delete (and also a bit slower than the one mentioned above) would be the following:
def domainObjects = DomainClass.findAll()
domainObjects.each {
it.delete(flush:it==domainObjects.last, failOnError:true)
}
If you have a list of objects and want to delete all elements, you can use * operator.
'*' will split the list and pass its elements as separate arguments.
Example.
List<Book> books = Book.findAllByTitle('grails')
books*.delete()
Is there any way to get a PagedResultList in grails without using criteria? I would like to avoid criteria as they are slightly more complex and make unit testing rather annoying. Code Below
def pagedResultList = MyDomainClass.createCriteria().list(max:10, offset:0)
{ order("id", "asc") }
//Below does not return pagedResultList
def aList = MyDomainClass.list(sort:"id", order:"asc", max:10, offset: 0)
PagedResultList is just used to wrap the results of Criteria-based queries (you can see its use in the source here). If you really want to use it, you could always just invoke the constructor directly, since it will handle any list. Of course the totalCount property (which is probably what you are interested in) will then be unset.
If the idea is to get both the paged results list and the total number of results, I'm not aware of any magic that can get both in one query (even the use of PagedResultList in the source linked above issues two queries).
This seems to break in Grails 2.x because there is no costructor that just takes a list. Compare http://grails.org/doc/2.1.0/api/grails/orm/PagedResultList.html and http://docs.huihoo.com/grails/1.3.7/api/grails/orm/PagedResultList.html
I have no idea if I'm doing this right, but this is how a Get method in my repository looks:
public IQueryable<User> GetUsers(IEnumerable<Expression<Func<User, object>>> eagerLoading)
{
IQueryable<User> query = db.Users.AsNoTracking();
if (eagerLoading != null)
{
foreach (var expression in eagerLoading)
{
query = query.Include(expression);
}
}
return query;
}
Lets say I also have a GeographyRepository that has GetCountries method, which is similar to this.
I have 2 separate service layer classes calling these 2 separate repositories, sharing the same DbContext (EF 4.1 code-first).
So in my controller, I'd do:
myViewModel.User = userService.GetUserById(1);
myViewModel.Countries = geoService.GetCountries();
This is 2 separate calls to the database. If I didn't use these patterns and tie up the interface and database, I'd have 1 call. I guess its something of a performance vs maintainability.
My question is, can this be pushed to 1 database call? Can we merge queries like this when views calls multiple repositories?
I'd say that if performance is the real issue then I'd try and avoid going back to the database altogether. I'm assuming the list returned from geoService.GetCountries() is fairly static, so I'd be inclined to cache it in the service after the initial load and remove the database hit altogether. The fact that you have a service there suggests that it would be the perfect place to abstract away such details.
Generally when asking questions about performance, it's rare that all perf related issues can be tarred with the same brush and you need to analyse each situation and work out an appropriate solution for the specific perf issue you're having.