Best way to remove objects without performance loss
We have an app that is constantly importing and exporting data, and because of the many XML mappings in our app, we have to clean up old data. Because of the database size, we have been experiencing trouble since we implemented the cleanup.

Our cleanup setup:
- Scheduled events trigger cleanup microflows each hour
- Each microflow cleans up objects of a different module
- Total of 60 objects on top level
- Delete behaviour handles deleting the associated sub-objects of the top-level objects (this is probably causing all the trouble)
- Each microflow has a couple of retrieves to get a list of objects older than x days and deletes those one by one in a loop
- The count of each retrieve is logged in the microflow

All scheduled events are working fine, except the most important one. When we start the app, this microflow runs exactly one time. I can see this by looking at our log. Somewhere in the microflow the logging stops, so I concluded that this microflow hangs. We're also logging all the scheduled events, and I see that this scheduled event is triggered less than once a day, although it should trigger every hour. I tried to reproduce this, but I don't have that much data in my local database.

I'm trying to clean up the objects of the trouble microflow. Maybe there are alternatives:
- Batch remove in Java
- Run a SQL script that deletes the data (I don't like this one)

Maybe you guys have some ideas why my current approach is not working and how I can improve this?
In my experience a removeBatch in Java is faster than deleting the objects one by one in a microflow loop.
It also helps to prevent cascaded deletes by (batch) removing the children yourself before you remove the parents.
If your queries are too complex, you might consider marking your records with a commit without events, and then using a removeBatch in the right order (children first, then parents) to actually remove them.
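To illustrate the ordering idea, here is a minimal sketch in plain Java. It is not Mendix API code: the `BatchCleanup` class, its method names, and the `deleter` callback are all hypothetical, and the callback stands in for whatever batch-remove call your runtime version offers. It only shows splitting the IDs into fixed-size batches and removing the children before the parents, so that the delete behaviour has nothing left to cascade.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of any Mendix API.
class BatchCleanup {

    // Splits the list into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Removes all child batches first, then all parent batches.
    // 'deleter' is an assumed callback wrapping the real batch-remove call.
    // Returns the number of batch calls made, which is handy for logging.
    static <T> int deleteChildrenThenParents(List<T> childIds,
                                             List<T> parentIds,
                                             int batchSize,
                                             Consumer<List<T>> deleter) {
        int calls = 0;
        for (List<T> batch : partition(childIds, batchSize)) {
            deleter.accept(batch);
            calls++;
        }
        for (List<T> batch : partition(parentIds, batchSize)) {
            deleter.accept(batch);
            calls++;
        }
        return calls;
    }
}
```

In a Java action you would pass the IDs you retrieved (or marked) as the two lists and let the callback do the actual removal. The batch size is something to tune against your own database. Small batches keep each transaction short, which may also make it easier to see in the log where a run stalls.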
Why don't you make a backup of the data of the live application and restore it on your local machine? Then you have the data and can debug more easily.
Did you consider building a solution that will clean up directly when objects are no longer needed? This would spread out the load and prevent you from having to analyze your database every --insert magic number here-- interval.