Normally, when a database retrieve is followed by an aggregate list action, Mendix optimizes this so that the list is not actually retrieved; only the aggregate result is returned (https://docs.mendix.com/howto/logic-business-rules/optimizing-microflow-aggregates).
However, I recently found out that when using a custom range (offset or limit), the retrieve is apparently not optimized and the entire list is retrieved.
Since imo this is a bug, I created a support ticket for it, but instead of fixing it, Mendix updated the documentation :( and suggested creating an idea for this… so here it is ;)
An aggregation over a custom-range retrieve can still be useful. Let's say I have a list of 1000 objects with some decimal attribute, and I want the average of these decimals, ignoring the lowest 100 and the highest 100. I would do a retrieve sorted on this attribute, use a custom range with offset 100 and limit 800, and follow it with an aggregation. I would expect this to be optimized.
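To make the expected result concrete, here is the in-memory equivalent of that microflow in plain Java (all names are illustrative, not Mendix API; ideally the platform would push this down to a single SQL aggregate over the ranged subquery instead of materializing the list):

```java
import java.util.ArrayList;
import java.util.List;

public class TrimmedAverage {
    // In-memory equivalent of: retrieve sorted by attribute,
    // custom range (offset, limit), then an average aggregate.
    static double trimmedAverage(List<Double> values, int offset, int limit) {
        List<Double> sorted = new ArrayList<>(values);
        sorted.sort(null); // natural (ascending) ordering
        List<Double> window = sorted.subList(offset, Math.min(offset + limit, sorted.size()));
        return window.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<Double> values = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) values.add((double) i);
        // Ignore the lowest 100 and highest 100: average of 101..900.
        System.out.println(trimmedAverage(values, 100, 800)); // 500.5
    }
}
```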
Interesting points Rom, I always thought that aggregating in memory was less efficient than doing a separate retrieve, but based on your remarks I did some tests, and actually the opposite is true. Doing a count on a list of about 3700 objects that is already in memory takes 0 ms, and doing a sum or average takes 4 ms; that does not weigh up against doing an additional retrieve, so indeed my 'optimization' was actually doing the opposite. That does indeed render my use case invalid.
I also agree it's difficult to find other actual use cases, but I also cannot think of a reason why Mendix should not support this.
@Bart: see, that's not a proper use case: lists in Java store their size as a field (so List.size() is an O(1) operation, see e.g. https://stackoverflow.com/questions/6540511/time-complexity-for-java-arraylist), and therefore your 'optimization' is actually not an optimization, since your extra call requires database interaction.
And summing: if you're batching, I can't really see the use case for a separate summing retrieve. You're probably using the list anyway, so you might as well sum in memory; it's not horribly inefficient.
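To illustrate the point about in-memory aggregates (plain Java, nothing Mendix-specific):

```java
import java.util.ArrayList;
import java.util.List;

public class InMemoryAggregates {
    // Count: ArrayList stores its size in a field, so size() is O(1) -
    // no iteration, and certainly no database round trip.
    static int count(List<Integer> items) {
        return items.size();
    }

    // Sum: a single O(n) pass over objects already in memory - still far
    // cheaper than issuing a second retrieve just for the aggregate.
    static long sum(List<Integer> items) {
        long total = 0;
        for (int v : items) total += v;
        return total;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 3700; i++) items.add(i);
        System.out.println(count(items) + " " + sum(items));
    }
}
```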
@Rom: one use case (where I encountered the 'bug' in the first place) is batch processing: I do a count of the number of items in each iteration. To make use of this optimization I did two retrieves: one for the count and one for the actual processing, because a separate retrieve with an aggregate should be more efficient.
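A rough sketch of that batching pattern in plain Java (the `retrievePage` helper and the table list are stand-ins for the XPath retrieve with a custom range, not Mendix API) — note the per-iteration count is just the in-memory list size, so the second retrieve turns out to be unnecessary:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Stand-in for "retrieve with custom range": at most `limit` items
    // starting at `offset`. In the real microflow this would be an XPath
    // retrieve with offset/limit against the database.
    static List<String> retrievePage(List<String> table, int offset, int limit) {
        int to = Math.min(offset + limit, table.size());
        return offset >= to ? new ArrayList<>() : new ArrayList<>(table.subList(offset, to));
    }

    // Processes the table in batches. The count per iteration is simply
    // batch.size() on the list already in memory - no separate aggregate
    // retrieve needed.
    static int processAll(List<String> table, int batchSize) {
        int offset = 0, processed = 0;
        while (true) {
            List<String> batch = retrievePage(table, offset, batchSize);
            if (batch.isEmpty()) break;
            processed += batch.size();
            offset += batchSize;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>();
        for (int i = 0; i < 25; i++) table.add("item" + i);
        System.out.println(processAll(table, 10)); // 25
    }
}
```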
But another use case I can think of is doing a sum of the first x records, for example.
What is the use case to do just a count on a subset of a table? Whenever I retrieve something with a limit (and optionally offset), I *do* something with the list besides counting.
To me, it also seems to be a bug. But if they think that it’s a feature, you have my upvote to change it. :-)
Would definitely like this.