The microflow execution logic is exactly as you described: the remaining actions in Microflow A will not execute until (sub)microflow B is complete. Is there more to your question? Could you be more specific?
Additionally, the Task Queue lets you run logic asynchronously, if that is closer to what you are looking for.
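To make the difference concrete, here is a rough analogy in plain Python. It is not Mendix code: microflows are modelled visually, and the function names and the `task_queue` below are hypothetical stand-ins. The first variant calls B directly, so A waits for it; the second only schedules B and carries on, which is roughly the behaviour a task queue gives you.

```python
from collections import deque

# Hypothetical stand-in for a background task queue (not a Mendix API).
task_queue = deque()

def microflow_b(log):
    log.append("B start")
    log.append("B end")

def microflow_a_sync(log):
    log.append("A before B")
    microflow_b(log)                 # direct sub-call: A blocks until B returns
    log.append("A after B")

def microflow_a_queued(log):
    log.append("A before B")
    task_queue.append(microflow_b)   # only schedules B; A does not wait for it
    log.append("A after B")

def run_queue(log):
    # Stand-in for a background worker picking up scheduled tasks later.
    while task_queue:
        task_queue.popleft()(log)

sync_log = []
microflow_a_sync(sync_log)

queued_log = []
microflow_a_queued(queued_log)
run_queue(queued_log)

print(sync_log)
print(queued_log)
```

In the direct-call variant B runs entirely between the two steps of A; in the queued variant A finishes first and B runs afterwards.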
You should use a parallel split in your case and call microflows A and B in separate split paths. In other words, create a third flow, C, and call both A and B inside C, one on each path of the parallel split.
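As a loose analogy for the idea (plain Python, not Mendix; a parallel split is configured visually in the modeler, and all names here are hypothetical), flow C starts A and B on separate paths and continues only once both have finished:

```python
from concurrent.futures import ThreadPoolExecutor

def microflow_a():
    return "A done"

def microflow_b():
    return "B done"

def flow_c():
    # Start A and B on separate "paths", then wait for both before continuing.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(microflow_a)
        future_b = pool.submit(microflow_b)
        return future_a.result(), future_b.result()

results = flow_c()
print(results)
```

Neither A nor B waits for the other here; only C waits, at the point where the paths join.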