Can somebody explain how to control the delete and commit of entities in a PostgreSQL database when a microflow runs in a background task?
Explanation. We have a case with 30 million serial numbers. These numbers are imported into an input entity. They are then processed, written to an output file, and stored in a processed entity. Processing occurs in groups of 450,000 units, and 100 numbers are grouped into a single batch.

We made a microflow that runs in the background to process each group of 450,000 units. This microflow loops and calls another microflow that calculates the results for 100 numbers. See attached figure slide 1 for the concept used. A start and end transaction are used in the loop, with the idea that the delete and commit are done after processing each 100 numbers. This seems to work when the built-in database of the modeler is used, but not when a PostgreSQL database is used.

When we look in the pgAdmin tool and measure the processing times of individual actions, we see fast processing of the first batches of 100, but the processing time increases over time. pgAdmin shows that the tuple inserts and deletes are only performed after the loop has finished, although we thought that the start and end transaction should force them after each 100 numbers. Looking at the processing times in detail, the retrieve of the input entity (retrieve action: get the first 100 of the list) is what slows down (from 10 ms up to 180 ms); all other actions keep the same duration. When we start a new group of 450,000 numbers we see the same behavior: a fast start, then increasing processing time for each new batch of 100.

We suspect that the retrieve time increases because the input entity objects are not actually deleted after each batch of 100 while the 450,000 units are being processed. Can we force a delete in PostgreSQL after every 100 units while running from a microflow in the background?
Robert Jan Gorter
Did you think about using the ProcessQueue for your batch processing? Its benefit is that the batches really run as separate actions in a separate context. This may help.
Hi Robert Jan,
One thing I noticed is the placement of your Start Transaction and Finish Transaction events. When the microflow starts running, a transaction is already started. This means that your Start Transaction event starts a second transaction. Some of the things you describe could happen because of this, because the first transaction is never finished.
To fix this, first finish the transaction and then put a Start Transaction activity right after the Finish Transaction. This way only one transaction is open at any time. Hope this helps.
Do note that starting and ending a transaction only causes a savepoint to be stored, so that a rollback reverts to that savepoint. What you are looking for is a commit in a separate database transaction, because only then are the objects committed to the database and visible to other processes.
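To illustrate the difference between a savepoint and a real commit, here is a minimal sketch in Python. It uses the standard-library sqlite3 module instead of Mendix or PostgreSQL, purely so that it runs on its own; the table name input_entry and the function names are hypothetical. The savepoint semantics shown are standard SQL behavior, but this is not how the Mendix runtime is implemented internally.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    # fetchall() exhausts the cursor, so the read lock is released immediately.
    return conn.execute("SELECT COUNT(*) FROM input_entry").fetchall()[0][0]


def savepoint_vs_commit():
    """Return the row counts a second connection sees at each stage."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: no implicit transactions, we issue BEGIN/COMMIT.
        writer = sqlite3.connect(path, isolation_level=None)
        reader = sqlite3.connect(path, isolation_level=None)
        try:
            writer.execute(
                "CREATE TABLE input_entry (id INTEGER PRIMARY KEY, serial TEXT)"
            )
            writer.execute("BEGIN")
            writer.executemany(
                "INSERT INTO input_entry (serial) VALUES (?)",
                [(f"SN{i}",) for i in range(300)],
            )
            writer.execute("COMMIT")

            seen = {}
            writer.execute("BEGIN")            # the outer transaction
            writer.execute("SAVEPOINT batch")  # nested "start transaction"
            writer.execute("DELETE FROM input_entry WHERE id <= 100")
            writer.execute("RELEASE SAVEPOINT batch")  # nested "end transaction"
            # The outer transaction is still open: the delete is not visible.
            seen["after_savepoint"] = count_rows(reader)
            writer.execute("COMMIT")           # the real database commit
            seen["after_commit"] = count_rows(reader)
            return seen
        finally:
            writer.close()
            reader.close()
```

Calling savepoint_vs_commit() shows that the second connection still counts all 300 rows after the savepoint is released and only sees the delete once the outer transaction commits.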
Maybe sorting the input entity on an indexed attribute and retrieving each batch of 100 using that same sort order might help.
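One possible reading of this suggestion, sketched below in Python with the standard-library sqlite3 module, is to walk the input table in index order and remember the last id processed, so each retrieve starts after the rows already handled instead of re-reading from the start of the table. This is my interpretation rather than a confirmed fix, and it also commits once per batch. The table names input_entry and processed, and both function names, are hypothetical.

```python
import sqlite3


def make_demo_db(n):
    """Create an in-memory database with n input rows (demo data only)."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE input_entry (id INTEGER PRIMARY KEY, serial TEXT)")
    conn.execute("CREATE TABLE processed (id INTEGER PRIMARY KEY, serial TEXT)")
    conn.execute("BEGIN")
    conn.executemany(
        "INSERT INTO input_entry (serial) VALUES (?)",
        [(f"SN{i}",) for i in range(n)],
    )
    conn.execute("COMMIT")
    return conn


def process_in_batches(conn, batch_size=100):
    """Process rows in id order, one committed transaction per batch."""
    last_id = 0
    batches = 0
    while True:
        # Retrieve the next batch using the primary-key index and the same
        # sort order every time, starting after the last processed id.
        rows = conn.execute(
            "SELECT id, serial FROM input_entry "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        batch_end = rows[-1][0]
        conn.execute("BEGIN")
        # Placeholder for the real calculation: copy the serial as-is.
        conn.executemany(
            "INSERT INTO processed (serial) VALUES (?)",
            [(serial,) for _, serial in rows],
        )
        conn.execute(
            "DELETE FROM input_entry WHERE id > ? AND id <= ?",
            (last_id, batch_end),
        )
        conn.execute("COMMIT")  # delete and insert become durable per batch
        last_id = batch_end
        batches += 1
    return batches
```

For example, process_in_batches(make_demo_db(1050)) runs eleven batches: ten full batches of 100 and a final batch of 50.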