Show HN: Transform a CSV into a JSON and vice versa (jsonmatic.com)

125 points by okumurahata 5 years ago | 110 comments

af3d 5 years ago |

Looks a bit like adware IMO. The library appears to be drenched in analytics. Dependencies include:

https://www.npmjs.com/package/web-vitals/v/0.1.0 https://www.npmjs.com/package/@fingerprintjs/fingerprintjs

Harvesting user's data, most likely...

okumurahata 5 years ago | |

Author here; some clarification. I use fingerprint to get the number of visits (instead of using something invasive like Google Analytics):

https://github.com/erikmartinjordan/jsonmatic/blob/master/sr...

I get the fingerprint as a UID (which is like a random number for me). I don't harvest any user's data. Code is open-source, you can verify what I'm saying if you wish.

koolba 5 years ago | | |

Collecting usage statistics is harvesting data. This is a classic example of why you should never run random NPM modules. Or even install them, as all of this is possible in a post-install script too.

Putting analytics in a deployed app is your prerogative. Putting it in what touts itself as a reusable component is at best frowned upon.

Karliss 5 years ago | |

It looks like there is some confusion about what this is. The content you see on the linked page is the software itself, not a demonstration of how the library's output looks. If it were a library taking in bytes and outputting bytes, I would agree that it shouldn't depend on any analytics, but if it's a website, that's more the author's choice.

ianschmitz 5 years ago | |

There's nothing wrong with web-vitals...

hughcrt 5 years ago | | |

There's nothing wrong with web-vitals, and it's included in create-react-app, which the author used.

croes 5 years ago | | |

But there is much wrong with fingerprinting.

MonaroVXR 5 years ago | |

How did you figure this out?

lcabral 5 years ago | | |

The page has a link at the bottom to the GitHub project, where you can check the dependencies...

jarofgreen 5 years ago |

At work we work on a Python library that does this, and much more:

PyPi: https://pypi.org/project/flattentool/

Source: https://github.com/OpenDataServices/flatten-tool

Docs: https://flatten-tool.readthedocs.io/en/latest/

It converts JSON to CSV and vice versa but also Spreadsheet files, XML ...

It has recently had some work to make it memory efficient for large files.

Work, BTW, is an Open Data Workers Co-op working on data and standards. We use this tool a lot directly, but also as a library in other tools. https://dataquality.threesixtygiving.org/ for instance - this is a website that checks data against the 360 Giving Data Standard [ https://www.threesixtygiving.org/ ].

contravariant 5 years ago | |

What are the advantages of this tool in comparison with e.g. pandas' json_normalize?

jarofgreen 5 years ago | | |

Flatten-tool has more options and functions than just that one pandas function. (But I haven't done a full comparison to all pandas functions. I wasn't around when the tool was started, so I can't say what analysis was done at the time.)

For instance I note with interest their examples on nested data and arrays. We have various different ways you can work with arrays, so you can design user-friendly spreadsheets as you want and still get JSON of the right structure out: https://flatten-tool.readthedocs.io/en/latest/examples/#one-... (Letting people work on data in user-friendly spreadsheets and converting it to JSON when they are done is one of the big use cases we have)

brundolf 5 years ago |

Slightly OT: I've realized that CSVs are dramatically more information-dense than the equivalent JSON, and actually make a pretty reasonable API response format if your dataset is large and fits into the tabular shape. They can be a fraction of the size, mainly because keys aren't duplicated for every item.
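A rough way to check the size claim is to serialize the same rows both ways and compare lengths. The sketch below is only illustrative: the sample rows and field names are made up, and the actual ratio depends on key lengths, value lengths, and whether the response is compressed in transit.

```python
import csv
import io
import json

# Hypothetical sample data: 1000 rows that all share the same three keys.
rows = [{"id": i, "name": f"user{i}", "score": i % 100} for i in range(1000)]

# JSON (array of objects) repeats every key for every row.
as_json = json.dumps(rows, separators=(",", ":"))

# CSV states the keys once, in the header line.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "score"])
writer.writeheader()
writer.writerows(rows)
as_csv = buf.getvalue()

print(f"JSON: {len(as_json)} chars, CSV: {len(as_csv)} chars")
```

The gap comes almost entirely from the repeated keys, so it shrinks for short keys or long values.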

earthboundkid 5 years ago |

I wrote my own converter a few years ago, then ended up needing it again last week. It’s one of those things you don’t always need but it’s handy to have when you do. https://github.com/baltimore-sun-data/csv2json

jabo 5 years ago |

I recently heard about a tool called Miller that helps convert between JSON and CSV among other formats: https://github.com/johnkerl/miller

  mlr --c2j cat documents.csv > documents.jsonl

Converts a CSV file to a JSONL file

th0ma5 5 years ago |

CSV is more of a rumor than a standard, plus JSON can have a tree structure. It is a fun idea to think about and may be useful in some narrow cases, but it will fail in all but the most trivial of structures.

amyjess 5 years ago | |

> CSV is more of a rumor than a standard

This reminds me of something my boss at a previous job would say: "I am morally opposed to CSV."

Why? Because we worked at an NLP company, where we would frequently have tabular data featuring commas, which means if we used CSV we'd have a lot of overhead involving quoting all our CSV data. Instead my boss preferred TSV (T = tab) as our preferred tabular data format, which was much simpler for us to parse since we didn't really deal with any fields that had \t in them.

earthboundkid 5 years ago | | |

Lol, so instead of having an actually working solution (escaping), you had a still broken solution that just didn’t blow up as often so you could ignore it until it caused a crash.
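The disagreement above can be illustrated with Python's standard `csv` module. This is a minimal sketch with a made-up row: quoting lets commas and quotes round-trip, while naive splitting breaks on either delimiter as soon as the data contains it.

```python
import csv
import io

# Made-up row containing a comma, double quotes, and a tab.
row = ["Hello, world", 'She said "hi"', "tab\there"]

# Proper CSV writing quotes fields that need it.
buf = io.StringIO()
csv.writer(buf).writerow(row)
line = buf.getvalue()

# A real CSV parser recovers the original three fields.
parsed = next(csv.reader(io.StringIO(line)))

# Naive splitting on "," miscounts the fields.
naive = line.rstrip("\r\n").split(",")

# Unescaped TSV has the same problem once a field contains a tab.
tsv_naive = "\t".join(row).split("\t")

print(parsed == row, len(naive), len(tsv_naive))
```

In this sample both naive approaches report four fields instead of three, which is the failure mode the comment describes.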

fellowniusmonk 5 years ago | | |

I always try to use ASCII char 31 (unit separator) if I'm computationally generating CSVs for shunting around data internally.
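A small sketch of that idea in Python, assuming the standard `csv` module with its delimiter swapped for the unit separator; the sample rows are made up. Commas and tabs in the data then need no quoting at all.

```python
import csv
import io

US = "\x1f"  # ASCII 31, unit separator

# Made-up rows; the text field contains a comma, a tab, and quotes.
rows = [["id", "text"], ["1", 'commas, tabs\t and "quotes" survive']]

buf = io.StringIO()
csv.writer(buf, delimiter=US).writerows(rows)
data = buf.getvalue()

# Reading back with the same delimiter restores the rows.
back = list(csv.reader(io.StringIO(data), delimiter=US))
print(back == rows)
```

The trade-off is that such files are awkward to view or edit by hand, which is presumably why the comment limits this to internal data.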

hnick 5 years ago | | |

PSV (pipe) is also good; tabs can occasionally show up in some sets of data, like mail addresses, if humans key them in. I usually go with one or the other if I have a choice.

nly 5 years ago | |

Trees are trivially flattened, and it's literally a couple lines of comments or documentation to describe what flavor of CSV you're using.

th0ma5 5 years ago | | |

Flattening is a lot of duplication

nly 5 years ago |

jq's stream and fromstream functions can be used to flatten and unflatten JSON. I use it all the time at work for POs who want to see data in Excel

https://jqplay.org/s/ub-WvXCcPn

... from there it's just a row->column rotation to CSV.
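For readers without jq at hand, the same flatten/unflatten idea can be sketched in Python. This is not jq's implementation; `flatten` and `unflatten` are hypothetical helper names, and `unflatten` assumes list indices arrive in ascending order, as `flatten` produces them.

```python
def flatten(value, path=()):
    """Yield (path, leaf) pairs for every leaf in a nested JSON-like value."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            yield from flatten(child, path + (key,))
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from flatten(child, path + (index,))
    else:
        # Scalars and empty containers are treated as leaves.
        yield path, value


def _new_container(next_key):
    # An integer key means the child is a list, otherwise a dict.
    return [] if isinstance(next_key, int) else {}


def _get(container, key):
    if isinstance(container, list):
        return container[key] if key < len(container) else None
    return container.get(key)


def _assign(container, key, value):
    if isinstance(container, list) and key == len(container):
        container.append(value)
    else:
        container[key] = value


def unflatten(pairs):
    """Rebuild the nested value from (path, leaf) pairs."""
    root = None
    for path, leaf in pairs:
        if not path:
            return leaf  # the whole document was a single leaf
        if root is None:
            root = _new_container(path[0])
        node = root
        for key, next_key in zip(path, path[1:]):
            child = _get(node, key)
            if child is None:
                child = _new_container(next_key)
                _assign(node, key, child)
            node = child
        _assign(node, path[-1], leaf)
    return root


doc = {"a": {"b": [1, {"c": None}], "d": "x"}, "e": []}
pairs = list(flatten(doc))
# Dotted paths can serve as column names for a one-row table.
columns = {".".join(map(str, path)): leaf for path, leaf in pairs}
print(columns)
```

Each dotted path becomes one column, so a single document maps to a single wide row.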

sireat 5 years ago |

CSV<->JSON is fundamentally an unsolvable problem because of mismatch in data hierarchies among them.

Plus you have the type looseness for both and lack of standards for CSV.

A trivial 2-D case is handled well by a Python library such as pandas. OP's tool could be an alternative here.

When I say trivial I mean flat 2 dimensional data, such as you would get from Mockaroo or similar source.

However in real life - data is messy.

As you get into 3, 4 and deeper hierarchies in JSON, you can't really translate that into a nice flat 2-D CSV.

Then you have missing keys and mixed-up types, and you end up rolling your own hand-written converters.
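A hand-written converter of the kind described often starts out like the sketch below. The records, field names, and the `cell` helper are all hypothetical; the choices made here (union of keys as columns, nested values stored as JSON text, missing values as empty cells) are one option among several.

```python
import csv
import io
import json

# Made-up records with missing keys, mixed types, and a nested value.
records = [
    {"id": 1, "name": "Ada", "tags": ["x", "y"]},
    {"id": "2", "email": "bob@example.com"},
    {"id": 3, "name": None},
]

# Column set is the union of all keys, in first-seen order.
fieldnames = list(dict.fromkeys(key for record in records for key in record))


def cell(value):
    """Turn one JSON value into something a CSV cell can hold."""
    if value is None:
        return ""
    if isinstance(value, (dict, list)):
        return json.dumps(value)  # keep nested data as JSON text
    return value


buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")
writer.writeheader()
for record in records:
    writer.writerow({key: cell(value) for key, value in record.items()})
text = buf.getvalue()
print(text)
```

Note what is lost: a missing key, an explicit null, and an empty string all become the same empty cell, and the number 1 and the string "2" are indistinguishable after the trip.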

code-faster 5 years ago |

I have a couple of open source CLI tools to do this:

- https://github.com/tyleradams/json-toolkit/blob/master/csv-t...

- https://github.com/tyleradams/json-toolkit/blob/master/json-...

tyingq 5 years ago |

Wouldn't this need to allow upload/download of CSV to really meet the spirit of the title? Or maybe replace the references of CSV with "HTML Table"?

okumurahata 5 years ago | |

Yes, you are right. I added a button to allow CSV upload instead of only adding data by copying/pasting.

robbiejs 5 years ago |

If anyone is looking for a free tool to quickly edit CSV data in an Excel-like editor, see https://editcsvonline.com

somishere 5 years ago |

Built something similar on codepen quite a few years ago. Not sure where I came up with the format, seems a bit wild looking at the v. nice dot notation used here, but possibly more useful/efficient for variable data models, also takes into account data types:

https://codepen.io/theprojectsomething/pen/OwppWW

Note: click the Toggle Info to read the "spec" (groan) :)

osullip 5 years ago |

I run a software company and we have a challenge when it comes to these types of conversion tools.

If there is any data that is a) not publicly accessible or b) contains personal information, I cannot authorise the use of a web based third party tool. There is just too much risk that some bad actor uses this as a method to soak up data.

I would love to verify /validate that all of the processing is local and have some way to certify if this hasn't changed.

hmsimha 5 years ago |

It would make it much easier for users to visually parse the JSON section if you added `font-family: monospace` to the textarea element

okumurahata 5 years ago | |

Done.

me_bx 5 years ago |

Nice, I like the look of the editable table.

Shameless plug: a similar solution, working entirely client side, not requiring a key as the first column, and with options regarding CSV format.

https://mango-is.com/tools/csv-to-json/

867-5309 5 years ago |

this can turn an HTML table into JSON with the option to download the JSON, or it can turn JSON into an HTML table with no option to download a CSV -- where does CSV come into this?

also, clearly javascript is a bit too ambitious for the job when e.g. PHP could provide the intended functionality with two lines of code:

    foreach($arrays as $values){echo implode(',', $values) . "\n";}
    echo json_encode($arrays);

also, CSV is more for storing rigidly-structured uniform columns and rows, whereas JSON is more for storing loosely-structured varying objects, otherwise you're redeclaring column headings in every array, which wouldn't make much difference for gzipped transport but still wasteful and verbose nonetheless. column headings are usually the first line of a CSV

laumars 5 years ago | |

If you’re using JSON for tables then you’re much better off using jsonlines. It’s got a properly defined specification (unlike the wishy-washy spec of CSV, which every maintainer seems to implement differently), so you’re less likely to garble your data while still having all the benefits that CSVs do. Plus a lot of JSON marshallers will natively support jsonlines despite not advertising that functionality.

Personally I’d recommend jsonlines over regular CSVs these days but I’ve had so many issues with CSV parsers being incompatible over the years that I’d welcome anything which offers stricter formatting rules.

I’d definitely recommend you check it out. https://jsonlines.org
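As a rough illustration of tabular data in JSON Lines, here is a minimal Python sketch using only the standard `json` module. The helper names `dumps_jsonl` and `loads_jsonl` are hypothetical, and the layout (a header array followed by one array per row) is just one way to store a table.

```python
import json


def dumps_jsonl(records):
    """Serialize each record as one JSON value per line."""
    return "".join(
        json.dumps(record, separators=(",", ":")) + "\n" for record in records
    )


def loads_jsonl(text):
    """Parse one JSON value per non-empty line."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# A small table: header row first, then data rows with real numbers.
table = [["foo", "bar"], [1, 2], [3, 4]]
text = dumps_jsonl(table)
print(text, end="")
```

Unlike CSV, the values keep their JSON types, and strings containing commas or newlines are escaped by the JSON encoder rather than by format-specific quoting rules.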

oweiler 5 years ago |

Why doesn't the library transform the CSV into an array of JSON objects?

aae42 5 years ago | |

i was wondering if there would be an option for this in the UI somewhere, but it doesn't look like there is

i wonder if it requires a "key" column to serve as the dictionary key

EDIT: i kind of wouldn't mind a HN discussion on which is better... ruby is my go-to scripting language, so i find this structure very natural, i've found when trying to do the same things with Go, i much prefer things to be structured more like an array of objects

would be interesting to hear the merits of both
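To make the two shapes under discussion concrete, here is a small Python sketch. It does not reflect how jsonmatic works internally; the sample CSV and both function names are made up, and treating the first column as the key is an assumption taken from the guess above.

```python
import csv
import io

# Made-up sample input.
CSV_TEXT = "id,name,city\na1,Ada,London\nb2,Bob,Paris\n"


def as_list_of_objects(text):
    """One object per row; order and duplicate rows are preserved."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def as_keyed_object(text):
    """One object keyed by the first column; later duplicates overwrite."""
    reader = csv.DictReader(io.StringIO(text))
    key_field = reader.fieldnames[0]
    return {
        row[key_field]: {k: v for k, v in row.items() if k != key_field}
        for row in reader
    }


print(as_list_of_objects(CSV_TEXT))
print(as_keyed_object(CSV_TEXT))
```

The keyed form gives direct lookup by id but needs a unique first column; the list form makes no such demand and maps naturally onto typed slices or arrays in languages like Go.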

brianzelip 5 years ago |

FYI, the view on small devices is pretty bad - the demo json output is almost unreadable without an awkward pinch + scroll. Compare this view to the same content on the Readme via GitHub.

okumurahata 5 years ago | |

Fixed.

ddgflorida 5 years ago |

Shameless plug - I wrote convertcsv.com and it supports about everything you can think of as far as format conversions. JSON, XML, YAML, JSON Lines, Fixed Width, ...

AbhyudayaSharma 5 years ago |

You can do this in Powershell

    cat file.csv | ConvertFrom-Csv | ConvertTo-Json

darrenf 5 years ago |

`jq` can transform CSV to JSON and vice-versa, especially for simple/naive data where simply splitting on `,` is good enough - and where you aren't too bothered by types (e.g. if you don't mind numbers ending up as strings).

First attempt is to simply read each line in as raw and split on `,` - it sort of does the job, but it isn't the array of arrays that you might expect:

    $ echo -e "foo,bar,quux\n1,2,3\n4,5,6\n7,8,9" > foo.csv

    $ jq -cR 'split(",")' foo.csv
    ["foo","bar","quux"]
    ["1","2","3"]
    ["4","5","6"]
    ["7","8","9"]

Pipe that back to `jq` in slurp mode, though:

    $ jq -R 'split(",")' foo.csv | jq -cs
    [["foo","bar","quux"],["1","2","3"],["4","5","6"],["7","8","9"]]

And if you prefer objects, this output can be combined with the csv2json recipe from the jq cookbook[0], without requiring `any-json` or any other external tool:

    $ jq -cR 'split(",")' foo.csv | jq -csf csv2json.jq
    [{"foo":1,"bar":2,"quux":3},
     {"foo":4,"bar":5,"quux":6},
     {"foo":7,"bar":8,"quux":9}]

Note that this recipe also keeps numbers as numbers!

In the reverse direction there's a builtin `@csv` format string. This can be used with the second example above to say "turn each array into a CSV row" like so:

    $ jq -R 'split(",")' foo.csv | jq -sr '.[]|@csv'
    "foo","bar","quux"
    "1","2","3"
    "4","5","6"
    "7","8","9"

And to turn the fuller structure from the third example back into CSV, you can pick out the fields, albeit this one is less friendly with quotes and doesn't spit out a header (probably doable by calling `keys` on `.[0]` only...):

    $ jq -cR 'split(",")' foo.csv | jq -csf csv2json.jq | \
    > jq -r '.[]|[.foo,.bar,.quux]|@csv'
    1,2,3
    4,5,6
    7,8,9

I don't consider myself much of a jq power user, but I am a huge admirer of its capabilities.

[0] https://github.com/stedolan/jq/wiki/Cookbook#convert-a-csv-f...
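For comparison, a Python analogue of the round trip above, using the same `foo.csv` contents. This is not a port of the jq cookbook recipe; `coerce`, `csv_to_objects`, and `objects_to_csv` are hypothetical helpers, and the number detection is deliberately simplistic (it tries `int`, then `float`, and otherwise keeps the string).

```python
import csv
import io

CSV_TEXT = "foo,bar,quux\n1,2,3\n4,5,6\n7,8,9\n"


def coerce(text):
    """Return an int or float when the cell parses as one, else the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def csv_to_objects(text):
    """CSV with a header row -> list of dicts with numbers kept as numbers."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: coerce(value) for key, value in row.items()} for row in reader]


def objects_to_csv(objects):
    """List of dicts -> CSV, with the header taken from the first object."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(objects[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(objects)
    return buf.getvalue()


objects = csv_to_objects(CSV_TEXT)
print(objects)
print(objects_to_csv(objects), end="")
```

Unlike the last jq example, the reverse direction here emits the header line, because the field names are read from the first object.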

0x008 5 years ago | |

This is the kind of comment I came here for.

gspr 5 years ago |

I'm sorry, but why is this a website?

AnthonBerg 5 years ago | |

So that we may have this discussion and get out of this strange rut that is the care and feeding of idempotent little formats.

And to educate! To show each other. To make knowledge discoverable. That’s the reason websites like this are honestly a very good thing.

okumurahata 5 years ago | | |

Thanks, AnthonBerg.

me_bx 5 years ago | |

Why not?

Some users don't like to install too many desktop applications and rather use simple web apps...

stevage 5 years ago |

How do you actually load CSVs into it?

okumurahata 5 years ago | |

Feature added (see comment below).

lettergram 5 years ago |

Can this handle uploading csvs?

okumurahata 5 years ago | |

Feature added. You could only add CSV data by copying/pasting on the table, but now you can upload a CSV file as well.

https://github.com/erikmartinjordan/jsonmatic/blob/525b7fbc9...

luming 5 years ago |

You should use outline instead of border in your cell css.

codetrotter 5 years ago | |

Why?

I read https://css-tricks.com/almanac/properties/o/outline/ and it says

> The outline property in CSS draws a line around the outside of an element. It’s similar to border except that:

> 1. It always goes around all the sides, you can’t specify particular sides

> 2. It’s not a part of the box model, so it won’t affect the position of the element or adjacent elements (nice for debugging!)

> […]

> It is often used for accessibility reasons, to emphasize a link when tabbed to without affecting positioning and in a different way than hover.

I guess this is why you said outline should be used instead in this case.

luming 5 years ago | |

And the :focus-within pseudo-class.

    » ps aux | grep root | head -n5 | format jsonl
    ["root","87596","0.0","0.0","4359648","116","??","Ss","10:01am","0:00.02","com.apple.cmio.registerassistantservice"]
    ["root","81777","0.0","0.0","4321932","88","??","Ss","Wed12am","0:00.01","PlugInLibraryService"]
    ["root","71784","0.0","0.1","4365572","10504","??","Ss","Tue11pm","0:25.88","PerfPowerServices"]
    ["root","42906","0.0","0.0","4321572","88","??","Ss","Tue09am","0:00.01","com.apple.ColorSyncXPCAgent"]
    ["root","47415","0.0","0.0","4303172","88","??","Ss","Sat04am","0:00.01","aslmanager"]