Ask HN: Best way to do heavy CSV processing?

I have a couple of big CSV files (~5-10 GB, millions of rows each) that need to be processed (linked to older files, with the data updated, etc.) and then exported to new CSV files. The data follows the relational model, but even updating a single field after loading it into PostgreSQL takes quite some time (an UPDATE on a join), and I'm not sure this is the most effective tool or approach for this kind of work. The only queries that will be run are updates and inserts/appends of new data to existing tables (e.g. the older files). Do you have any suggestions to look into for a workload like this?
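
To make the workload concrete, here is a minimal Python sketch of the update-or-append step, using only the standard library. The function name `merge_csv`, the key column `id`, and the sample columns are hypothetical, and the sketch assumes both files share the same header. It also holds the older file in memory, so it illustrates the logic rather than something that would cope with the full 5-10 GB files as-is.

```python
import csv
import io


def merge_csv(old_file, new_file, out_file, key="id"):
    """Apply rows from new_file onto old_file and write the result to out_file.

    Rows whose key already exists in the older file are updated; rows with
    an unseen key are appended. All three arguments are open text files.
    Assumes (hypothetically) that both inputs have the same columns.
    """
    old_reader = csv.DictReader(old_file)
    fieldnames = old_reader.fieldnames
    # Index the older file by key; dicts preserve insertion order.
    rows = {row[key]: row for row in old_reader}

    updated = inserted = 0
    for row in csv.DictReader(new_file):
        if row[key] in rows:
            rows[row[key]].update(row)  # overwrite fields of the existing row
            updated += 1
        else:
            rows[row[key]] = row  # new key: append at the end
            inserted += 1

    writer = csv.DictWriter(out_file, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows.values())
    return updated, inserted


# Tiny in-memory example standing in for the real files.
old = io.StringIO("id,name,price\n1,apple,1.00\n2,pear,2.00\n")
new = io.StringIO("id,name,price\n2,pear,2.50\n3,plum,3.00\n")
out = io.StringIO()
print(merge_csv(old, new, out))  # (updated, inserted)
print(out.getvalue())
```

In the database version, the same step is the UPDATE on a join followed by an INSERT of the unmatched rows, and that is the part that is slow for me.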