Julia and JuliaHub: Advancing Innovation and Growth(info.juliahub.com) |
Julia also integrates with Python through packages like PythonCall.jl. I've gotten everything to work so far, but it hasn't been smooth. The Python code is always the major bottleneck, though, so I try to avoid it.
Overall, Julia is a significantly better language in every single aspect except for the ecosystem and the occasional environment issue, which you'll often get with conda anyway. It's really a shame that practically nobody actually cares about it compared to Python. It supports multi-dimensional arrays as a first-class citizen, which means that each package doesn't have its own array type like torch, numpy, etc., and you don't have to constantly convert between the types.
https://arstechnica.com/science/2020/10/the-unreasonable-eff...
And other people's code is actually a pleasure to read.
One example is a system that I built using three libraries. One was a C library from Postgres for geolocation, another was Uber's H3 library (also C), and a third was a Julia-native library for geodesy. From Julia, I was able to extend the API of the H3 library and the Postgres library so that all three libraries would inter-operate transparently. This extension could be done without any mods to the packages I was importing.
Slightly similar, if you have a magic whizbang way of looking at your data as a strange form of matrix, you can simply implement a few optimized primitive matrix operations and the standard linear algebra libraries will now use your data structure. Normal languages can't really do that.
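A rough Python analogue of that idea, as a minimal sketch and not how Julia's linear algebra libraries actually dispatch: a generic routine (power iteration here) is written against a single primitive, a matrix-vector product, so any structure that supplies that primitive can be passed in. The names `DenseMatrix`, `DiagonalPlusRankOne`, `matvec`, and `power_iteration` are hypothetical and chosen only for illustration.

```python
import math


class DenseMatrix:
    """Plain row-major matrix; matvec costs O(n^2)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def matvec(self, x):
        return [sum(a * b for a, b in zip(row, x)) for row in self.rows]


class DiagonalPlusRankOne:
    """Represents D + u u^T without ever forming it; matvec costs O(n)."""

    def __init__(self, diag, u):
        self.diag = list(diag)
        self.u = list(u)

    def matvec(self, x):
        ux = sum(a * b for a, b in zip(self.u, x))
        return [d * xi + ui * ux for d, xi, ui in zip(self.diag, x, self.u)]

    def to_dense(self):
        # Only used to cross-check the structured version against a dense one.
        n = len(self.diag)
        return DenseMatrix(
            [[(self.diag[i] if i == j else 0.0) + self.u[i] * self.u[j]
              for j in range(n)] for i in range(n)]
        )


def power_iteration(A, n, iters=200):
    """Estimate the dominant eigenvalue using nothing but A.matvec."""
    x = [1.0] * n
    for _ in range(iters):
        y = A.matvec(x)
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    Ax = A.matvec(x)
    return sum(a * b for a, b in zip(x, Ax))  # Rayleigh quotient


structured = DiagonalPlusRankOne([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
lam_structured = power_iteration(structured, 3)
lam_dense = power_iteration(structured.to_dense(), 3)
```

The generic routine never learns which representation it was given. In Julia the same effect comes from multiple dispatch on the array type rather than from duck typing, so this only conveys the flavor.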
More on that second case and the implications in the following video:
That makes it a good candidate for running well on ARM platforms (think embedded data processing at the edge).
Not sure how well Fortran does on ARM.
Julia also has an active, thriving ecosystem and an excellent package manager.
Obviously there are real bright spots too, such as speed, multiple dispatch, and a relatively flourishing ecosystem, but overall I wouldn't pick it up for something new if given the choice. I'd use JAX or C++ extensions for performance and settle on Python for the high level, despite its obvious warts.
Huh? I think Pkg is very good as far as package managers go, exceptionally so. What specifically is your issue with it?
Open sourcing and maintaining some components of things like JuliaSim or JuliaSim Control might expand adoption of Julia for people like me. I will never be able to convince my company to pay for JuliaHub if their pricing is similar to Mathworks.
The future of Python's main open source data science ecosystem, NumFOCUS, does not seem bright. Despite performance improvements, Python will always be a glue language. Python succeeds because the language and its tools are *EASY TO USE*. It has nothing to do with computer science sophistication or academic prowess - it humbly gets the job done and responds to feedback.
In comparison to Mojo/MAX/Modular, the Julia community doesn't seem to be concerned with capturing share from Python or picking off its use cases. That's the real problem. There is room for more than one winner here. However, have the people that wanted to give Julia a shot already done so? I hope not, because there is so much richness to their community under the hood.
1. A very sparse package ecosystem. For example, there's DataFrames.jl, a poor man's implementation of Pandas.
2. Recompiling everything every time. A Julia program in a script would spend ~40 seconds compiling DataFrames and some other important packages.
I think if a language is to replace Python in science, it would need to either be very fast (recompilation on every run breaks this; running Julia in a notebook/shell is interesting, but outside of pure scientific code it should be easier to re-run it), or it should offer better ergonomics. Pandas has very rough corners, especially when you need grouping with nontrivial operations, or grouped window functions. Joins aren't easy either. Any system that makes this more ergonomic could take a bite out of Python. But I don't see one.
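To pin down what a "grouped window function" means here, a minimal plain-Python sketch, deliberately not using Pandas. The function name `grouped_rolling_mean` and the sample rows are made up for illustration: for each row, it computes a rolling mean over the most recent `window` values within that row's group, keeping the original row order.

```python
from collections import defaultdict, deque


def grouped_rolling_mean(rows, key, value, window):
    """Return one rolling mean per row, computed within each row's group.

    The mean covers the current row and up to window - 1 earlier rows
    of the same group, in input order.
    """
    recent = defaultdict(lambda: deque(maxlen=window))
    out = []
    for row in rows:
        buf = recent[row[key]]
        buf.append(row[value])  # deque drops the oldest value automatically
        out.append(sum(buf) / len(buf))
    return out


rows = [
    {"sensor": "a", "reading": 1.0},
    {"sensor": "b", "reading": 10.0},
    {"sensor": "a", "reading": 3.0},
    {"sensor": "b", "reading": 20.0},
    {"sensor": "a", "reading": 5.0},
]
result = grouped_rolling_mean(rows, "sensor", "reading", window=2)
# result == [1.0, 10.0, 2.0, 15.0, 4.0]
```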
https://github.com/TidierOrg/TidierDB.jl https://github.com/TidierOrg/TidierData.jl
Of note: I am biased as a Tidier contributor and the author of TidierDB.jl, but hopefully you might be willing to give it a try.
> recompilation on every run breaks this
Your comment is exceedingly misleading. Whether and when Julia code gets compiled is up to the user.
It looks like Julia has found a few niches though: HPC and numerical modelling among them.
I'm not aware of what the vision is currently tbh
No one bothered fixing it, in great part due to Discourse being the main place of discussion, as far as I know.
It’s almost statically compilable which has almost gotten me to pick it up a few times, but apparently it still can’t compile a lot of the most important ecosystem packages yet.
The metaprogramming has almost gotten me to pick it up a few times, but apparently there aren’t mature static anti-footgun tools, even to the degree of mypy’s pseudo-static analysis, so I wouldn’t really want to use those in prod or even complex toy stuff.
It’s so damned interesting though. I hope it gets some of this eventually.
I know that's not exactly answering your question, but you might be interested in https://chapel-lang.org/ChapelCon/2024/chamberlain-clbg.pdf
Fundamentally, when you keep a tight, purely functional core representation of your language (e.g. jaxprs) and decompose your autograd into two steps (forward mode and a compiler-level transpose operation), you get a system in which it is substantially easier to guarantee correct gradients, which is much more composable, and which even makes it easier to define custom gradients.
Unfortunately, Julia didn’t actually have any proper PLT or compilers people involved at the outset. This is the original sin I see as someone with an interest in autograd. I’m sure someone more focused on type theory has a more cogent criticism of their design decisions in that domain and would identify a different “original sin”.
In the end, I think they’ve made a nice MATLAB alternative but there’s a hard upper bound on what they can reach.
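A toy, pure-Python sketch of that two-step decomposition, under loud assumptions: it is not JAX's implementation, and the names `Dual`, `jvp`, `linearize`, and `transpose_apply` are hypothetical. Step one is forward mode with dual numbers; step two gets the reverse-mode result by transposing the resulting linear map. For clarity this toy materializes the Jacobian as a matrix, whereas JAX transposes the linear program itself without building the matrix.

```python
class Dual:
    """A number carrying a value and a tangent (forward-mode AD)."""

    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val, self.tan + o.tan)

    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        # Product rule for the tangent component.
        return Dual(self.val * o.val, self.tan * o.val + self.val * o.tan)

    __rmul__ = __mul__


def jvp(f, x, v):
    """Step 1 (forward mode): push the tangent v through f at the point x."""
    outs = f([Dual(xi, vi) for xi, vi in zip(x, v)])
    return [o.tan for o in outs]


def linearize(f, x):
    """Build the linear map v -> J v as a matrix, one basis tangent at a time."""
    n = len(x)
    cols = [jvp(f, x, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]


def transpose_apply(J, ct):
    """Step 2 (transpose): map a cotangent ct to J^T ct."""
    return [sum(J[i][j] * ct[i] for i in range(len(J)))
            for j in range(len(J[0]))]


def f(x):
    a, b = x
    return [a * b, a * a + 3.0 * b]


J = linearize(f, [2.0, 5.0])          # [[5.0, 2.0], [4.0, 3.0]]
vjp = transpose_apply(J, [1.0, 1.0])  # [9.0, 5.0]
```

Only the forward rule for each primitive had to be written; the reverse-mode answer falls out of a generic transpose of a linear map, which is the property the comment is pointing at.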
While I don't disagree that JAX currently outshines Julia's autodiff options in many ways, I think comments like this are 1. false, 2. rude, and 3. unnecessary to make your point.
Your first sentence is a scorching hot take, but I don't see how it's justified by your second sentence.
The community always understood that Python is a glue language, which is why the bottleneck interfaces (with IO or between array types) are implemented in lower-level languages or ABIs. The former was originally C but is now often Rust, and Apache Arrow is a great example of the latter.
The strength of using Python is that when you want to do anything beyond pure computation (e.g. networking), the rest of the world has already built a package for that.
For example, the reason why NumFOCUS is so great is that everything was designed to work with numpy as its underlying data structure.
Actually, one of the reasons CUDA won the hearts of researchers over OpenCL is that Khronos never cared for Fortran, and even C++ was late to the party.
I attended one Khronos webinar where the panel was puzzled by a question from the audience about the Fortran support roadmap.
NVIDIA is sponsoring the work on the LLVM Fortran frontend, so the same applies.
https://bsky.app/profile/badphysicist.bsky.social/post/3lhfm...
If your work is well-served by existing libraries, great! There's no need to compete against something that's already working well. But that's frequently not the case for modeling, simulation, differential equations, and SciML.
But I think a reasonably competent Python/JAX programmer can roll out whatever they need relatively easily (especially if you want to use the GPU). I do miss Tullio, though.
Another example: It's frustrating that Flax had to implement its own "lifted" transformations instead of being able to just use JAX transformations -- which makes it impossible to just slot a Flax model into a JAX library that integrates ODEs. Equinox might be better on this front, but that means that all the models now need to be re-implemented in Equinox. The fragmentation and churn in the Python ecosystem is outrageous -- the only reason it doesn't collapse under its own weight is how much funding and manpower ML stakeholders are able to pour into the ecosystem.
Given how much the ecosystem depends on that sponsored effort, the popular frameworks will likely prioritize ML applications, and corollary use cases will be second-class citizens when design tradeoffs come up. E.g., framework overheads matter less when one is trying to use large NN models than when one is trying to use small models or other parametric approaches.
You mean in terms of the ODE stuff Julia provides?
Anything particular in mind?
- I wrote this elsewhere: I find their approach to memory management/mutable arrays really hits the worst of both worlds (manual memory management and garbage collection). You end up trying to preallocate memory but don’t actually have control over memory allocations. I find the dynamic type system exacerbates this.
- It’s a very big language, even in the IR. So proper program transforms like mapping functions or autograd are quite difficult to implement.
- Static compilation is really hard, which makes it a non-starter for a lot of domains where it could have made inroads (robotics, games, etc).
Also, IIRC, it’s not terribly difficult to use Flax with Equinox. It’s just a matter of storing the weight dict and model function in an Equinox module. filter_jit will correctly recognize the weights as a dynamic variable and the Flax model as a static variable.
EDIT: or 2 products: https://news.ycombinator.com/item?id=42962548
For simulations, JAX will choke on very “branchy” computations. But, honestly I’ve had very little success differentiating through those computations in the first place and they don’t run well on the GPU. Thus, I’m generally inclined to use wrappers around C++ (or ideally Rust) for those purposes (my use-case is usually some rigid-body dynamics style simulation).