It's also very hand-wavy on the details of how to actually use graph convolutional networks to extract structured ID card data. For example, what "bounding box information" is used in your node representations? What is the architecture of your biLSTM?
This reads much more like a promotion for your API than useful information on how to build a system that extracts data from ID cards.
What deep learning gives you that’s really valuable (beyond better OCR) is that you can use graph convolutional networks to automatically parse the OCR output and convert it into structured data. You could hand-write a parser or use a template-matching approach, but you’ll have to create a new parser or template for every ID card type, whereas the GCN approach can be used to learn the parser…
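To make the idea a bit more concrete, here is a rough sketch of what "a GCN over OCR output" could look like. This is not the article's method: the node features (normalized box corners plus a digit ratio), the k-nearest-neighbour graph, and every name below (`ocr_boxes`, `node_features`, `knn_adjacency`, `gcn_layer`) are my own assumptions, and the weights are random rather than trained.

```python
import numpy as np

# Hypothetical OCR output: (text, x_min, y_min, x_max, y_max) in pixels.
ocr_boxes = [
    ("Name", 20, 30, 80, 50),
    ("JANE DOE", 100, 30, 220, 50),
    ("DOB", 20, 70, 70, 90),
    ("1990-01-01", 100, 70, 230, 90),
]
IMAGE_W, IMAGE_H = 320, 200


def node_features(boxes, width, height):
    """One assumed encoding: normalized box corners plus the fraction of digits."""
    feats = []
    for text, x0, y0, x1, y1 in boxes:
        digit_ratio = sum(c.isdigit() for c in text) / len(text)
        feats.append([x0 / width, y0 / height, x1 / width, y1 / height, digit_ratio])
    return np.array(feats, dtype=np.float64)


def knn_adjacency(boxes, k=2):
    """Connect each box to its k nearest boxes (by center), with self-loops."""
    centers = np.array([[(x0 + x1) / 2, (y0 + y1) / 2] for _, x0, y0, x1, y1 in boxes])
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    adj = np.eye(len(boxes))
    for i in range(len(boxes)):
        neighbours = np.argsort(dists[i])[1:k + 1]  # index 0 is the box itself
        adj[i, neighbours] = 1.0
        adj[neighbours, i] = 1.0  # keep the graph undirected
    return adj


def gcn_layer(adj, x, weight):
    """One graph convolution: ReLU(D^-1/2 A D^-1/2 X W), A already has self-loops."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ x @ weight, 0.0)


rng = np.random.default_rng(0)
x = node_features(ocr_boxes, IMAGE_W, IMAGE_H)
adj = knn_adjacency(ocr_boxes, k=2)
w = rng.normal(size=(x.shape[1], 8))  # untrained weights, for illustration only
hidden = gcn_layer(adj, x, w)         # one 8-dimensional embedding per OCR box
```

In a real system you would presumably stack a couple of these layers, add text embeddings to the node features, and train a per-node classifier on top to label each box as "name", "date of birth", and so on. That is the part that replaces the hand-written parser or template.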