CSV import and processing via SFTP: not all records are processed
Hi guys,

There is an issue that has been bugging me for a while now. When I pull and process a file of around 2 million records from SFTP, some data changes that should happen based on that file do not occur. The missing changes seem to be random.

For example: entity customer <---* order. I import 10 orders per customer, and if a customer has at least one order of more than 100 euros, set customer.level = enum.CustomerLevel.Pro ... If the customer level is already Pro, do not change customer.level. The Order entity has, let's say, the attributes order number, number of articles, and total amount in euros.

After the data pull and processing are done, you look up a customer who should be “pro” level and find the order above 100 euros, so the data itself is fine. Everything is there except that customer.level is still not Pro. If I upload the data manually with the file manager and trigger the same processing microflow, all the data and all the changes are there.

I do keep getting this warning:

WARNING: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

But after some browsing (and considering I get the same message whether I upload the file manually or pull it from SFTP), I think I can presume that this is not the cause of the issue.

Thanks in advance, guys.
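To make the rule concrete, here is a minimal sketch of it in plain Python rather than as a microflow. All names (`Customer`, `Order`, `CustomerLevel.BASIC`, `apply_level_rule`) are hypothetical stand-ins for the entities described above, and the starting level is an assumption since the post does not name one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CustomerLevel(Enum):
    BASIC = "Basic"  # assumed default level; not named in the post
    PRO = "Pro"


PRO_THRESHOLD_EUR = 100.0  # "more than 100 euros"


@dataclass
class Order:
    order_number: str
    article_count: int
    total_eur: float


@dataclass
class Customer:
    customer_id: str
    level: CustomerLevel = CustomerLevel.BASIC
    orders: List[Order] = field(default_factory=list)


def apply_level_rule(customer: Customer) -> bool:
    """Promote the customer to Pro if any order exceeds the threshold.

    Returns True only when the level was actually changed.
    A customer who is already Pro is left untouched.
    """
    if customer.level is CustomerLevel.PRO:
        return False
    if any(o.total_eur > PRO_THRESHOLD_EUR for o in customer.orders):
        customer.level = CustomerLevel.PRO
        return True
    return False
```

Written this way the rule depends only on the orders stored for a customer and is safe to run more than once, so the result should not depend on how the file arrived.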
What I usually do with this kind of import functionality is split it into two steps. First, import all the data into an ImportData entity. Once all the data is imported, start a microflow that processes the imported records and sets a processed flag on each one. This processing should be done in batches, with an end-transaction after each batch.
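As a rough illustration of that two-step pattern outside Mendix, here is a sketch using Python's built-in `sqlite3`. The table names, column names, and `BATCH_SIZE` are assumptions made up for the example; the commit after each batch plays the role of the end-transaction.

```python
import csv
import io
import sqlite3

BATCH_SIZE = 1000  # hypothetical value; tune to your own limits


def create_schema(conn):
    # Staging table (the "ImportData entity") plus the real target table.
    conn.executescript("""
        CREATE TABLE import_data (
            id INTEGER PRIMARY KEY,
            customer_id TEXT,
            order_number TEXT,
            total_eur REAL,
            processed INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE orders (
            customer_id TEXT,
            order_number TEXT,
            total_eur REAL
        );
    """)


def stage_rows(conn, csv_text):
    """Step 1: copy every CSV row into the staging table, unprocessed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["customer_id"], r["order_number"], float(r["total_eur"]))
            for r in reader]
    conn.executemany(
        "INSERT INTO import_data (customer_id, order_number, total_eur) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def process_staged(conn, batch_size=BATCH_SIZE):
    """Step 2: process unprocessed rows in batches, committing per batch."""
    total = 0
    while True:
        batch = conn.execute(
            "SELECT id, customer_id, order_number, total_eur "
            "FROM import_data WHERE processed = 0 ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not batch:
            return total
        for row_id, customer_id, order_number, total_eur in batch:
            conn.execute(
                "INSERT INTO orders (customer_id, order_number, total_eur) "
                "VALUES (?, ?, ?)",
                (customer_id, order_number, total_eur),
            )
            # The processed flag records exactly which rows were handled.
            conn.execute(
                "UPDATE import_data SET processed = 1 WHERE id = ?",
                (row_id,),
            )
        conn.commit()  # end the transaction after each batch
        total += len(batch)
```

Because every staged row carries a processed flag, any rows still unprocessed after a run can be queried directly, which helps narrow down where records go missing.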
When does your processing stop? During the import of the file, or during the processing of the imported entries?
The file is saved in an import data entity and processed in batches. The process saves a part of the file in an import data entity → processes the import data entity in batches → once finished, deletes the import data entity and resumes importing from the last line read. This repeats until the CSV is empty.
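The loop described above could be sketched roughly like this in Python. The function name, the `process_batch` callback, and the chunk and batch sizes are all hypothetical; the point is only the shape of the loop: stage a chunk, process it in batches, clear it, continue from the last line read.

```python
import csv
import io
from itertools import islice


def import_in_chunks(csv_text, chunk_size, process_batch, batch_size):
    """Stage part of the CSV, process it in batches, then resume.

    Returns the number of data lines read, which can be compared with
    the number of lines in the source file.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)  # None when the file is empty
    lines_read = 0
    while True:
        # Stage the next part of the file; the reader keeps its position,
        # so each pass continues from the last line read.
        staged = list(islice(reader, chunk_size))
        if not staged:
            break  # CSV is exhausted
        # Process the staged rows in batches.
        for start in range(0, len(staged), batch_size):
            process_batch(header, staged[start:start + batch_size])
        lines_read += len(staged)
        staged.clear()  # corresponds to deleting the import data entity
    return lines_read
```

Comparing the returned line count for the SFTP run with the count for the manual upload is one cheap way to check whether both runs actually read the same number of lines.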
The head-scratcher for me is that the entire process seems to work fine when I upload the file manually (using the file manager widget and triggering the same microflow). Lines that are only partially processed when the file is pulled from SFTP are processed completely when the CSV is uploaded manually.