Hi Achiel, thank you for your long answer!
I'm aware of this problem, I just assumed that a retrieve in a microflow as a dataset had the same behavior as retrieving a dataset the "normal" way. But obviously this is not the case...
This 10k is more the exception than the rule, but the exceptions have to be possible as well and shouldn't kill the entire server. I realize that what I'm trying to build here is not that easy in Mendix, since you're bound to the framework's limits. Nevertheless, I'm not giving up just yet!
I've had a good night's sleep and have come up with some new ideas. I guess I'll build a custom interface that mimics the default Mendix behavior, because users need to be able to take actions on the recordset, like sorting and exporting. Some smarter retrieve actions and sorting should get the job done. A nice challenge for the day :)
This is a common problem in application development: doing work in the database is a lot more efficient than doing it on the application server (where everything is loaded into memory, as Rom mentioned). Mendix actually makes this quite apparent: the database sits very close to the UI, with a nice security system in between.
There unfortunately is no magic wand that will fix this for you, in any language. However, you do have options:
- Hit the database directly, as you have astutely noted. Maybe you can pre-populate the database with all sorts of values, which would allow you to do the queries via XPaths?
- Limit the queries you're running in your microflows. 10,000 hits sounds like a lot; you can limit retrieves in your microflow to X (where X is a lot less than 10k), which should alleviate the memory usage. Some smart XPaths and sorting should get you quite far.
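The second option can be sketched in plain Java. This is a minimal illustration, not Mendix API code: `fetchPage` here is a hypothetical stand-in for a limited XPath retrieve with an offset, and a plain list plays the role of the database. The point is that each iteration only holds one small batch in memory instead of the full 10k result set:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedRetrieve {
    // Hypothetical stand-in for a limited retrieve: in Mendix this would be
    // an XPath retrieve with a limit and an offset, not a subList call.
    static List<Integer> fetchPage(List<Integer> source, int offset, int limit) {
        int end = Math.min(offset + limit, source.size());
        if (offset >= end) return new ArrayList<>();
        return new ArrayList<>(source.subList(offset, end));
    }

    public static void main(String[] args) {
        // Fake "database" of 10,000 records.
        List<Integer> db = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) db.add(i);

        int limit = 500;  // X: far below the full 10k set
        int offset = 0;
        long sum = 0;
        List<Integer> page;
        while (!(page = fetchPage(db, offset, limit)).isEmpty()) {
            for (int v : page) sum += v;  // process one batch at a time
            offset += limit;
        }
        System.out.println(sum);  // 49995000
    }
}
```

Only `limit` records are live at any moment, which is what keeps the memory usage flat no matter how large the full result set is.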
I hope this points you in the right general direction: the case you're talking about is pretty difficult, especially because we don't know exactly what you're trying to achieve. If you want more specific tips you might want to open a new thread with more specifics?
OK, to be a good citizen, I'll post the solution I've come up with here. I did the following:
I've added a small entity that holds an association to the entity the parsed XPath runs on. This entity (I'll refer to it as SEL in this story) has a field for the record count, a field for the offset, and an association to an entity that holds all the info about the database fields, in this case the field names. When the user runs the parsed XPath, I run a microflow that calls a Java action, which returns the count from an aggregate over the objects retrieved by the XPath. The same microflow creates a new SEL object and stores this count in the record-count field. The offset is set to zero.
The SEL form has two buttons, one forward and one backward, which are just calls to microflows. The form has a grid that gets its data from a microflow. This data-source microflow calls a Java action and passes it the SEL object. The Java action fetches the data using the parsed XPath together with the offset and sort field from the SEL object. I've also set the limit to 25 (the size of the list in the grid), so it returns pretty fast.
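The core of that Java action can be sketched without any Mendix API. This is a minimal, hypothetical version: a plain list stands in for the database, a boolean stands in for the sort field, and `fetchPage` plays the role of the limited XPath retrieve:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PageFetch {
    // Hypothetical stand-in for the Java action: in the real app this would
    // run the parsed XPath against the database with the SEL object's offset
    // and sort settings; here a plain in-memory list plays the database.
    static List<String> fetchPage(List<String> rows, boolean ascending,
                                  int offset, int limit) {
        List<String> sorted = new ArrayList<>(rows);
        sorted.sort(ascending ? Comparator.<String>naturalOrder()
                              : Comparator.<String>reverseOrder());
        int end = Math.min(offset + limit, sorted.size());
        if (offset >= end) return new ArrayList<>();
        // Only one page (here at most `limit` rows) is ever returned.
        return new ArrayList<>(sorted.subList(offset, end));
    }
}
```

With the limit fixed at the grid size (25 in my case), the action returns quickly regardless of how many records match the XPath.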
Now the user sees more or less what he would see with a "normal" data query. When the user pushes the back or forward button, I update the SEL object and increment/decrement the offset by the list limit. Then the grid is refreshed and the next or previous page is shown; some read-only fields with a bit of CSS mimic the "record X of X records" label. When the user chooses another sort field, the grid is refreshed again, so the data source runs the Java action again and returns a fresh list. It now takes less than a second to "select" 20,000+ records. Pretty sweet :)
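The button logic boils down to clamped offset arithmetic. A minimal sketch of what the two microflows do to the SEL object (class and method names here are hypothetical, not from my model):

```java
public class Pager {
    private final int pageSize;
    private final int recordCount;  // the count stored on the SEL object
    private int offset = 0;         // the SEL object's offset field

    public Pager(int recordCount, int pageSize) {
        this.recordCount = recordCount;
        this.pageSize = pageSize;
    }

    // Forward button: advance one page, but never past the last record.
    public void next() {
        if (offset + pageSize < recordCount) offset += pageSize;
    }

    // Back button: go one page back, but never below zero.
    public void prev() {
        offset = Math.max(0, offset - pageSize);
    }

    public int getOffset() { return offset; }

    // What the read-only fields display, mimicking the datagrid's label.
    public String label() {
        int from = Math.min(offset + 1, recordCount);
        int to = Math.min(offset + pageSize, recordCount);
        return "record " + from + " to " + to + " of " + recordCount + " records";
    }
}
```

For example, with 60 records and a page size of 25, two clicks forward land on offset 50 and the label reads "record 51 to 60 of 60 records"; further forward clicks are ignored.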
The only issue now is that I have to create several forms for the different grids. If somebody knows a good way to dynamically load one view into another based on an association, I'd like to hear it. All the buttons work on the SEL object; I just need to load a different grid view into the form... Oh, one more issue: you need several microflows, because you cannot pass an arbitrarily typed object back to Mendix to hook up to the grid. That's a bit of a maintenance headache. It still beats getting the server killed, though!
The retrieve from the database does not actually load all objects into memory: it retrieves a limited number, i.e. the number of rows specified in the data grid.
The retrieve in a microflow, on the other hand, tries to load everything into memory. With enough records, this will give you the out-of-memory error.