JavaScript for Data Science

146 points by mrmagoo17 5 years ago | 75 comments

I know that data science is a broad and somewhat vague term but this -

   We will cover:

    Core features of modern JavaScript
    Programming with callbacks and promises
    Creating objects and classes
    Writing HTML and CSS
    Creating interactive pages with React
    Building data services
    Testing
    Data visualization
    Combining everything to create a three-tier web application

- this isn't data science.

zwaps 5 years ago

I get your point, but as someone doing data science and having no idea about JavaScript, this is actually precisely what I need.

Like, all the stuff "for my data science", such as making a visualization website etc.

11235813213455 5 years ago

out of context it's not data science

zitterbewegung 5 years ago

It's more like Presenting and Serving Models using Javascript for Data Science.

bryanrasmussen 5 years ago

nobody ever writes books assuming you know how to use the language, I suppose it decreases customer base.

rapfaria 5 years ago

While understandable, I hate this. "Here's 100 pages of python before we get to the good stuff", which ends up not even being good.

Publishers should just offer a free e-book of said language, and make it a requirement.

HWR_14 5 years ago

It decreases the amount of boilerplate "how to program in X" text you have to write. Producing text, especially novel text, is expensive in a non-fiction book.

bluishgreen 5 years ago

"JavaScript relies heavily on callback functions: Instead of a function giving us a result immediately, we give it another function that tells it what to do next. Many other languages use them as well, but JavaScript is often the first place that programmers with data science backgrounds encounter them."

That sentence from the book clarifies a lot for me. It is JavaScript for Data Science People. Taken in that context, this is an excellent book written with empathy for the data science user, who is usually making uneasy excursions into JavaScript (which they hope and pray are only temporary) and running back to Python the first time they encounter a Promise or a callback.
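
For anyone who has not met either pattern, here is a minimal sketch of the callback style the quoted passage describes, next to the promise equivalent. The function names and the data are made up for illustration and are not taken from the book; the "slow" data source is simulated with `setTimeout`.

```javascript
// Hypothetical data: pretend these values arrive after a short delay.
const VALUES = [3, 1, 4, 1, 5];

// Callback style: the caller passes in a function that says what to do next.
function loadValuesWithCallback(callback) {
  setTimeout(() => callback(null, VALUES), 10);
}

// Promise style: the function returns an object that stands for the future result.
function loadValuesWithPromise() {
  return new Promise((resolve) => {
    setTimeout(() => resolve(VALUES), 10);
  });
}

// Small helper so both versions do the same work with the data.
const average = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;

loadValuesWithCallback((err, values) => {
  if (err) throw err;
  console.log("callback average:", average(values));
});

loadValuesWithPromise().then((values) => {
  console.log("promise average:", average(values));
});
```

In both versions the code after the call keeps running while the data is "loading", which is the part that tends to surprise people coming from a notebook workflow.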

jhbadger 5 years ago

The book does cover a lot of basic Javascript material, as its target is actual natural scientists who may not have much experience with the language, but towards the end it does cover things like Data-Forge (which is a data science library in Javascript)

nkmnz 5 years ago

The title is „JavaScript for Data Science“, not „Data Science for JavaScript“. It’s like... in a bar: they will serve a beer for you, so they have the beer and you have you. For a book called „JS for DS“, you should have the DS while they bring the JS.

Compare this with: „Data wrangling with JavaScript“ [1]

[1] https://www.amazon.de/Data-Wrangling-JavaScript-Ashley-Davis...

d--b 5 years ago

Well the problem with “data science” is that it costs a shit ton of money but rarely integrates into anything. A book about wiring data science models into real user facing application maybe isn’t data science, but sure is useful...

javierluraschi 5 years ago

I'm glad more people are doing DS/ML/AI with JavaScript, thanks for this book and keep up the great work! -- We are also working in this space, would love to connect, you can find me in javier at hal9.ai

jiofih 5 years ago

I assume it’s aimed at data scientists who want to learn Javascript? No point teaching DS concepts here.

A better name would be “JS for data scientists”

jhgb 5 years ago

Presumably that's why the "JavaScript for" prefix precedes it?

danpalmer 5 years ago

I don’t want to repeat the old and tired JavaScript hate, but this just isn’t a great idea.

I’d suggest that there are 3 important primitives for data science: flexible numeric types, fast math/algorithm libraries, and data manipulation being easy.

JavaScript doesn’t really have any of these. Numbers are 64bit floats only - no integers, no big numbers. There aren’t equivalents to Numpy/Pandas/Scikit Learn, and the lack of standard library and expressiveness in data manipulation in the language makes basic tasks harder.

JavaScript has its uses, but there’s really no reason to force data science be one of them.
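
A small sketch to make the numeric point concrete, using only standard built-ins. It shows where the 64-bit float `Number` type loses integer precision, and two things that do exist in the language and soften the claim somewhat: `BigInt` (standardized in ES2020) for arbitrary-precision integers, and typed arrays for fixed-width integer storage. Neither is a substitute for a NumPy-style numeric stack.

```javascript
// Number is an IEEE 754 double: integers are exact only up to 2^53 - 1.
const big = Number.MAX_SAFE_INTEGER;   // 9007199254740991
console.log(big + 1 === big + 2);      // true: precision is lost past this point

// BigInt gives arbitrary-precision integers, written with an "n" suffix.
// It cannot be mixed with Number in arithmetic without explicit conversion.
const exact = BigInt(big) + 2n;
console.log(exact.toString());         // "9007199254740993"

// Typed arrays store fixed-width integers, here 32-bit signed.
const counts = new Int32Array([1, 2, 3]);
counts[0] = 2 ** 31;                   // out of range, so the value wraps around
console.log(counts[0]);                // -2147483648
```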

czep 5 years ago

To address some of the skepticism about when and where javascript would be appropriate in data science, would you want to fit a logistic regression model in javascript? Probably not, but to build a solver that takes model outputs and visualizes the changes in predicted probabilities based on different combinations of variables? This is definitely where javascript would make sense. Visualization, dashboards, reporting, and exploratory analysis are all ripe domains for developing rich responsive UIs. Basically, any layer where you have a data-to-human interface can be leveraged with javascript.
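
As a rough illustration of that "model outputs in, predicted probabilities out" layer, here is a minimal sketch. It assumes a logistic regression was already fitted elsewhere (in R or Python, say) and only its coefficients were exported. The coefficient values and the variable names `age` and `isMember` are invented for the example.

```javascript
// Hypothetical coefficients, as if exported from a model fitted in another tool.
const model = { intercept: -1.5, coefficients: { age: 0.03, isMember: 0.8 } };

// Logistic function: maps a linear score to a probability in (0, 1).
const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// Score one row of inputs with the exported coefficients.
function predictProbability(fitted, row) {
  let score = fitted.intercept;
  for (const [name, weight] of Object.entries(fitted.coefficients)) {
    score += weight * row[name];
  }
  return sigmoid(score);
}

// Evaluate every combination of a few input values.
const grid = [];
for (const age of [20, 40, 60]) {
  for (const isMember of [0, 1]) {
    grid.push({ age, isMember, p: predictProbability(model, { age, isMember }) });
  }
}

// In a real page this table would feed a chart; here it is just printed.
console.table(grid);
```

The fitting stays in whatever language is good at it; the JavaScript side only has to re-evaluate a cheap formula as the user changes the inputs.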

There is a lot of great work happening in this space already. In the R world for example, shiny makes heavy use of js to the point that you often can't tell where R code ends and javascript begins. Plotly's Dash provides bindings for R, Python, and Julia. Personally, as a data scientist, I have been excitedly learning React because it really rips the landscape wide open for all the use cases I mentioned above. It then makes sense to have libraries that give JS users a good data model and can do most of the same numerical computation that we'd be doing in other languages. Again, you probably don't want to do serious numerical work in js, but remember people said that about Python ten years ago too.

I love the framing of this book, because I want more data scientists to start thinking about the presentation of data and spark some bits of ingenuity to make datasets and model outputs accessible to non-data scientists. Data scientists should be the ones writing the tools that interface data with humans because of their domain knowledge. But this is a different skillset and usually the work of SW engineers. Of course engineers can also have great data intuition too, but I really do encourage data scientists to develop their front end skills, it's well worth it.

tharne 5 years ago

I don't see the point of this. You already have a ubiquitous, easy-to-learn, high-level language that's great for data science, it's called python. If you're a JavaScript developer who wants to get into data science but are too lazy to learn python, you probably weren't that interested in data science in the first place.

Python definitely has some problems, but if you were going to have a new lingua franca for data science, it would probably be something like Julia, certainly not JavaScript.

javierluraschi 5 years ago

My hunch is that there has been 10X more investment in engineering for JavaScript: nodejs, webassembly, webgl, webgpu, react native, deno, typescript, electron, chrome, etc. That will be harder to rewrite in Python than to rewrite TensorFlow and a few math libraries in JavaScript.

la_fayette 5 years ago

Data science is not a standardized term, however I don't get what specifically makes this text relevant for the domain of data science... For some data science projects one could surely use javascript, however in many cases one misses important libraries, for purposes such as statistical analysis, data manipulation, machine learning, ...

genrez 5 years ago

I am a noob to Javascript, so if someone knows better, then please correct me about this, but arrow functions aren't meant to replace normal function syntax, right? From [1], it seems like the main point of arrow syntax is to allow you to inherit the "this" parameter if you are inside a method. Meanwhile, you need normal function syntax if you are creating a constructor, making a method function for a prototype, or making generator functions. (I didn't even know javascript had generator functions until just now :))

So it seems a bit weird to me that they advocate using arrow function syntax instead of the regular syntax. They seem to be advocating using the new class syntax instead, so I guess they don't need the constructor or method creation features of the normal syntax, but I still don't see why they would specifically advocate for arrow function syntax. Is it faster? They say it interferes with other features, but which features?

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

mLuby 5 years ago

I've seen a majority of sources abandon the function keyword entirely in favor of const arrow declarations (and shorthand method syntax).

FWIW I personally like the function keyword, since it's clear what it is to non-JS readers, but primarily because it hoists to the top of its file, so unimportant utility functions can sit unobtrusively at the end of the file, thereby letting readers encounter more important logic earlier in the file.
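
A minimal sketch of the hoisting difference, for readers who have not run into it. The function names are made up; the behavior shown is standard JavaScript.

```javascript
// A function declaration is hoisted: callable before its definition appears.
console.log(range([2, 9, 4])); // 7

// A const arrow function sits in the temporal dead zone until its
// definition has run, so an early call throws a ReferenceError.
let earlyCallFailed = false;
try {
  mean([2, 4, 6]);
} catch (err) {
  earlyCallFailed = err instanceof ReferenceError;
}
console.log(earlyCallFailed); // true

const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
console.log(mean([2, 4, 6])); // 4

// Utility kept at the end of the file, out of the reader's way.
function range(xs) {
  return Math.max(...xs) - Math.min(...xs);
}
```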

genrez 5 years ago

Interesting to know that what the article recommends is indeed the industry standard. I'd forgotten about hoisting until you brought it up!

__jem 5 years ago

Not changing `this` is a huge benefit that shouldn't be ignored. Especially when you're programming in a more functional style, it makes sense to default to arrow functions because you never want to engage in `this` shenanigans anyway. So, yes, I'd say it's a pretty common idiom in the JS community to replace "normal" function declarations.
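
A minimal sketch of what "not changing `this`" means in practice. The `Counter` class is invented for the example; the two methods differ only in the kind of callback they pass to `forEach`.

```javascript
class Counter {
  constructor() {
    this.count = 0;
  }

  // Arrow callback: `this` is inherited from addAllArrow, so it is the Counter.
  addAllArrow(values) {
    values.forEach((v) => {
      this.count += v;
    });
    return this.count;
  }

  // Regular function callback: it gets its own `this`, which is undefined
  // here because class bodies run in strict mode, so this.count throws.
  addAllFunction(values) {
    values.forEach(function (v) {
      this.count += v;
    });
    return this.count;
  }
}

const counter = new Counter();
console.log(counter.addAllArrow([1, 2, 3])); // 6

try {
  counter.addAllFunction([1, 2, 3]);
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```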

genrez 5 years ago

I agree that inheriting the `this` for arrow functions is beneficial. To me it seems like you would want to use the normal syntax for global functions for hoisting and to prevent unintentional re-definitions, the arrow functions where you would use lambda functions in other languages, and the class method syntax for methods.

side-note: Most of my JS experience is writing userscripts for myself, so I definitely do my share of 'this' shenanigans.

talolard 5 years ago

As a data scientist who does more frontend, I think this is a really valuable concept. Having users/stakeholders engage with our work is the way to push it forward in the org, and a dash of frontend can do wonders for getting that message across. It’s wonderful that people are making resources about the frontend for data scientists.

javierluraschi 5 years ago

Glad you also see it this way! Would love to chat with you and get some feedback on a platform we are building at hal9.ai, my email is javier at hal9.ai -- Looking forward to chat.

brianzelip 5 years ago

Just putting this out there: stdlib - a standard library for js, https://stdlib.io/.

mark_l_watson 5 years ago

I thought of writing a JavaScript + tensorflow.js + NLP + web scraping + linked data + etc. book about a year ago. tensorflow.js is especially cool: well documented with great examples. In fact, it was the great tensorflow.js examples and demos that convinced me not to write the book, because I didn't feel like I could add much value on that subject.

splithalf 5 years ago

Data scientists are the new webmasters.

qntty 5 years ago

Could you elaborate?

slt2021 5 years ago

hard pass.

even python is not used for data science, all heavy lifting is done in C/fortran, and python is just a glue

Rainymood 5 years ago

Really cool but no one needs this... as a data scientist learning javascript, teach me how to run data science models using javascript! That's where the real gold is... I'm even thinking of writing articles about this myself... JS is great for making things more tangible and interactive

m00dy 5 years ago

well, I was expecting training a neural network with web-assembly through gpu support in its last chapter :)

temp8964 5 years ago

They use data-forge.js, which has fewer stars than danfo.js.

I can't find any benchmark of how they compare to data.table or pandas.

Without a dominant and high performance data frame library as a foundation, I wouldn't even try.

jason0597 5 years ago

Why on earth would you want to use JavaScript for Data Science?

nesarkvechnep 5 years ago

Because some people are monoglots :(

javierluraschi 5 years ago

A few reasons, https://venturebeat.com/2021/04/23/4-reasons-to-learn-machin...

Personally, I'm excited to build apps that don't require cloud computing and if they do, have access to one of the largest software engineering libraries through NPM. Sure, I'm not doing just Data Science in JavaScript but rather building apps that use DS/ML/AI, but that's still a valid use case. The alternative would be to use Python for prototyping then rewrite for production apps.