There is no standard module for this (at least none that I know of at the moment), but you may want to look at Apache PDFBox. It is already included as a default library, and it is a great library for what you want to do.
And to get you started:
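Here is a minimal sketch of pulling the plain text out of a PDF with PDFBox. I am assuming the bundled version is PDFBox 2.x; in 3.x the document is opened with `Loader.loadPDF(file)` instead of `PDDocument.load(file)`. The class name `PdfTextExtractor` and the command-line handling are just made up for illustration, so adapt them to wherever your PDF comes from.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class, not part of PDFBox itself.
public class PdfTextExtractor {

    // Reads the given PDF and returns all of its text as a single string.
    // Assumes PDFBox 2.x (PDDocument.load); adjust for other versions.
    public static String extractText(File pdfFile) throws IOException {
        // try-with-resources closes the document even if extraction fails
        try (PDDocument document = PDDocument.load(pdfFile)) {
            PDFTextStripper stripper = new PDFTextStripper();
            // Order text by its position on the page rather than by its
            // order in the content stream; often helps with forms and columns.
            stripper.setSortByPosition(true);
            return stripper.getText(document);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: PdfTextExtractor <file.pdf>");
            return;
        }
        System.out.println(extractText(new File(args[0])));
    }
}
```

This only gives you the raw text. It works for PDFs that contain a real text layer; scanned documents are just images, so for those you would need OCR first.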
In the past I did a proof of concept with ABBYY FlexiCapture. The catch is that the PDF documents first have to go through that system, and you then get XML back. It needs training, like an AI system, and it gets better over time. It was quite accurate. There are probably more systems like FlexiCapture out there, but I have not yet found a cloud solution that does this.