Impending kOS (archive.vector.org.uk)

323 points by nightTrevors 11 years ago | 231 comments

kiyoto 11 years ago |

>Kdb+ has sharp elbows.

No shit. I used to work as a quant, and while I was an okay quant and mediocre trader at best, I survived for three years in the industry because of my kdb+ proficiency: the firm I was at spent a couple of million dollars on kdb+ only to find out that most people could not wrap their heads around kdb+ let alone debug it effectively.

My (former) colleagues were definitely smart people. In many ways, they were way smarter than myself. But I somehow could get a much better handle of kdb+'s idiosyncrasies, and my ability to stare at dense k/q code (usually no more than a dozen lines) and figure out what's wrong with it earned me the reputation as the "q guy" - and some level of job security.

The firm eventually phased out kdb+ completely after my boss and I left (the two proponents of kdb+).

qfan 11 years ago | |

I've worked in many shops that use kdb+ and the ones who really benefit are the ones who bothered to get some training on it rather than those who just assume they'll wing it somehow. Kx themselves have been running great intro workshops for a couple of years now. Some guys at the next desk attended one and came back buzzing with excitement at how they now saw through the noise. So the take away is - if you didn't bother to learn it, stop complaining that you don't understand it. It's not difficult to get when it is explained well, but you're not gonna get it just by staring at it.

kiyoto 11 years ago | | |

>I've worked in many shops that use kdb+ and the ones who really benefit are the ones who bothered to get some training on it rather than those who just assume they'll wing it somehow.

Yea, I know all about the training and First Derivative. My employer also hired them.

In their defense, every First Derivative KDB+ consultant that I worked with was very sharp and an excellent teacher. They really knew their stuff, and First Derivative is no small part of what has made KDB+ so successful. However, even with their excellent pedagogy, most of my co-workers were totally lost/weren't willing to apply themselves to learn q/kdb+ well.

Here is another way to think about it: many people can't ever get their heads around certain conceptually difficult topics, say, measure theory or quantum physics. I don't think kdb+ is nearly as hard, but it seemed that way looking at my peers who were no slowpokes.

yawaramin 11 years ago | | |

I initially read that as:

> ... came back buzzing with excrement

masklinn 11 years ago | |

Why were you a proponent of kdb+?

kiyoto 11 years ago | | |

For several reasons, some more legitimate than others.

1. kdb+ was (and maybe is) a good solution to the problem that we had: doing complex data manipulation/simple statistical calculations against billions of rows of time series data. Hadoop is the term du jour for data processing, but truth of the matter is that finance doesn't have really huge data. At best, it's a couple of terabytes, and most of the time, you are working with a small subset of it. Running KDB+ on a beefy server or two would usually do the job (rather well).

2. Maybe because I studied math, but I find k/q's vectorial/functional semantics appealing. I think the syntax is horrible, but the semantics is very neat.

3. Finally, because it helped me keep my job. It was rather amazing to me that all these Ph.D. statisticians that I worked with couldn't bring themselves to learn kdb+ effectively. Apparently this stuff can be very hard for even the smartest people (or maybe they thought it was such a niche skill with a low ROI).

viksit 11 years ago |

The last line of TFA reads like the beginning of some sort of movie in the drama/thriller category. "kOS is coming. Nothing will be the same afterwards."

That's what irritated me most.

What I'd like to understand is - what led the author to this particular conclusion? Is it the fact that this language is super expressive and concise? Is it that it routinely [1] outperforms its C counterparts even if it ultimately translates to C? Is the Z graphical interface so superior that it'll blow the pants off Cocoa and Quartz and X.org or Wayland or what have you? Why would one rewrite emacs or vim on it? I don't want some basic 4 line text editor - I would like to be productive. Why would Mozilla spend energy porting firefox to it? Or Google, chrome? Or bash?

Simply talking about the history of K/kdb+ and how brilliant its creator is simply doesn't help the reader understand why they should be excited about it. If that was the intention of this article, then the real points to make should've started after that line.

That would've been much more interesting.

[1] - No pun intended, of course

geocar 11 years ago | |

What you're actually seeing is testimony: people saying they are seeing something amazing, and they aren't very good at explaining what they saw.

Btw: k doesn't translate to C. It's actually a quite simple interpreter. The fact that it outperforms other languages so easily should be saying more about those languages than it should be saying anything about k.

burntsushi 11 years ago | | |

Well, does there exist a technical analysis written by someone who understands k that explains why it is so much faster than X languages?

mwcampbell 11 years ago | | |

So would k be even faster if it were compiled to machine code rather than being interpreted? Would that involve an unacceptable speed versus space trade-off? Or am I missing some crucial reason why k code has to be interpreted?

sz4kerto 11 years ago |

The same story told again -- its aim is to generate this magical atmosphere around a fast db engine and language that's deliberately obfuscated to make people who work in it feel smart, so they try to spread that it's the best. KDB/Q is a nice tool, not the holy grail they make it out to be. And it's not fast because they know something better - it's because it lacks almost any safety measure.

kiyoto 11 years ago | |

I totally agree. That's why I wrote a parser for the language to help me and my coworkers debug =) https://github.com/kiyoto/ungod

As I commented above, the syntax is horrible. But I still think it is an effective tool for certain problems.

MichaelGG 11 years ago | |

I'm very interested in hearing more. What do you mean by lacking safety? As in the DB invariants aren't? Or lack of durability? Or as in, easy-to-misuse query language?

And to help calibrate, what are your preferred languages and styles?

bshimmin 11 years ago |

Here's the text editor they're talking about: http://www.kparc.com/edit.k

The code is, well, not the easiest to understand.

scottlocklin 11 years ago |

The APL family was developed to think about math and linear algebra in particular. Iverson's "Notation as a tool of thought" f'rinstance: http://www.jsoftware.com/papers/tot.htm

If you've worked with such things for your day job, exposure to an APL language is mind blowing in the same way as exposure to Lisp is. You'll rapidly find out that an awful lot of the numerics world is an ad-hoc reinvention of an APL language. Leading thinkers in the numerics world have noticed. Have a look at the Tensor type inside Torch7, or -idx- class in Lush (the same thing): they are, in fact, a sort of APL with a more conventional, aka painfully wordy, notation.

Writing an editor in an array language seems crazy, but then, writing a parallel processing system in a language that was designed to run applications in your web browser also seems crazy. If people had stuck with APL style languages, well, databases, particularly distributed databases (Kx and 1010, both K based systems, scale to petascale, and have for a long time), would suck less, as would CUDA programming. Their revival could make life easier in these problem domains.

zokier 11 years ago |

Googled around, Kuro5hin (that's a name I haven't seen for some time) has a tutorial for K from 2002: http://www.kuro5hin.org/story/2002/11/14/22741/791

The download link at http://www.kparc.com/ asks for a password, so I'm not sure what's going on with that.

sohagan857 11 years ago | |

You have to have a personal invite to download the code atm. The whole article is telling you it is not available yet, but "is coming".

kencausey 11 years ago | | |

I may be mistaken but I believe zokier is referring to a download link for k whereas you are referring to a download for kOS. No?

manish_gill 11 years ago |

The language looks fascinating. Check out Kona, an open source implementation: https://github.com/kevinlawler/kona

radicalbyte 11 years ago |

Interesting article. Found this (http://queue.acm.org/detail.cfm?id=1531242, submitted here https://news.ycombinator.com/item?id=8476120) interview with Arthur Whitney (from 2009) which is also really interesting.

TheOsiris 11 years ago |

> “It is a lot easier to find your errors in four lines of code than in four hundred.”

Looking at his code on http://www.kparc.com/edit.k I'd like to disagree with that statement

lmm 11 years ago |

There's a contradiction I always see in these pieces: they talk a lot about the importance of using the right data structure. But these languages get their incredible conciseness by not giving you any choice about your data structures; their array type is hardcoded into the language, and if you want to use something else then your code balloons.

beagle3 11 years ago | |

K basically has 3 data shapes:

atom (int, float, char, date, symbol, ...)

list (one dimensional array of atoms, dicts, flips or lists)

dict (a map from one list to another)

There's also a flip, which exchanges the first two indexes applied to an item (so, e.g., it effectively transposes a list of lists) but it is just sugar (both syntactic and semantic).

You can trust Whitney that all of these are properly implemented, including appends.

It's not often that you actually need more. I've discovered this after using K for a while, and going back to python.

Back in my pre-K (ha!) C++ and Python days, I had an awful lot of classes everywhere. After using K for a while, my Python and C both have much much fewer (structs in C more often than python, as C is missing python's dict). And the code has gotten much shorter and more efficient. Arguably, more readable as well. And I've essentially dropped C++ for C, because the extra complexity is just not worth it.
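For readers who don't know K, the shapes listed above can be approximated in Python. This is only a sketch: the names `columns`, `flip`, and `table` are made up for illustration, and this `flip` copies the data into rows, whereas the comment above describes K's flip as merely exchanging the first two indexes.

```python
# Hypothetical Python stand-ins for K's data shapes (not K's implementation).

atom = 42                              # atom: a single scalar value
prices = [10.0, 10.5, 11.0]            # list: a one-dimensional sequence
sizes = [100, 200, 300]

# dict: a map from one list (the keys) to another list (the values)
columns = dict(zip(["price", "size"], [prices, sizes]))


def flip(x):
    """Swap the first two index levels of a nested structure.

    A list of lists is transposed; a dict of equal-length columns becomes
    a list of row dicts. Assumption: unlike K, this copies the data.
    """
    if isinstance(x, dict):
        keys = list(x)
        return [dict(zip(keys, row)) for row in zip(*x.values())]
    return [list(col) for col in zip(*x)]


table = flip(columns)                  # row-first view of the column dict
matrix_t = flip([[1, 2, 3], [4, 5, 6]])
```

With this, `columns["price"][1]` and `table[1]["price"]` reach the same value with the two indexes swapped, which is the whole point of a flip.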

ah- 11 years ago |

k/q really doesn't have to be this unreadable, that's just Arthur's style. Here's some code in C by him for comparison: http://kx.com/q/cs107/a.c

nemo1618 11 years ago | |

I tried cleaning it up a bit: https://gist.github.com/lukechampine/f54fce8fd756254cefb2

But the actual meaning of the program is still lost on me. I can only guess it has something to do with parsing files (note the checks for curly braces). Feeding it its own source code produces some output, but I have no idea what it actually modified.

epsylon 11 years ago | | |

It's a solution to CS107 assignment 1 : http://web.stanford.edu/class/cs107/assign1.html

ah- 11 years ago | | |

In case anybody is still interested, it's now up on http://kparc.com/cs107/readme

Torn 11 years ago | | |

The original is 404ing, do you have a mirror?

icsa 11 years ago | |

Here is a style guide for K (http://nsl.com/papers/style.pdf) from 1995. Interestingly, most of the concepts still apply.

I believe the audience is developers using K in a commercial/production environment.

The biggest differences, compared to Arthur's style of writing, are:

* Less code on each line

* A separate comment column on each line

* Nominal use of spaces for readability

sohagan857 11 years ago | |

Q isn't unreadable. It's implemented in standard English words: http://code.kx.com/wiki/Reference. K is considered unreadable by many.

ah- 11 years ago | | |

I was more referring to the almost exclusive use of single letter variable names. If one would use at least short words to name things, the implementation would be a whole lot more understandable.

ANTSANTS 11 years ago | |

Good lord, I need a drink now.

cheez 11 years ago | |

If that's true, why aren't there DSLs that compile to k?

svan99 11 years ago |

I love K/Q and am using it in my startup. Thanks a lot to Kx for recently freeing up the 32-bit version. To use APL-like languages I have to really shift the way I design and model things. Most importantly, K may not be best for lots of developers working on the same thing. Object-oriented languages will fit better in that case. K projects often only involve one or two developers who model things in vector thinking (column-based thinking in the data domain) and know exactly what to do and how to do it. Vector thinking is not suitable for all problems, but works really well when it is. Btw, the new 3.2 version of kdb appears to be even faster than before. It also improves websocket integration and JSON data conversion. Very nice to integrate with nodejs/Qt.

We also use Forth, which I think is really another way of shifting your mindset.
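To make "column based thinking" concrete, here is a small Python sketch that computes a volume-weighted average price two ways. The trade data and the function names `vwap_rows` and `vwap_cols` are invented for this example; it illustrates the modelling shift, not how kdb+ does it internally.

```python
# Illustrative trade data (made up for this sketch).
trades_rows = [
    {"price": 10.0, "size": 100},
    {"price": 10.5, "size": 200},
    {"price": 11.0, "size": 300},
]

# The same data stored as columns: one list per field.
trades_cols = {
    "price": [10.0, 10.5, 11.0],
    "size": [100, 200, 300],
}


def vwap_rows(rows):
    """Row-at-a-time thinking: visit each record and accumulate by hand."""
    notional = 0.0
    volume = 0
    for row in rows:
        notional += row["price"] * row["size"]
        volume += row["size"]
    return notional / volume


def vwap_cols(cols):
    """Column thinking: multiply two whole columns, then divide two sums."""
    price, size = cols["price"], cols["size"]
    return sum(p * s for p, s in zip(price, size)) / sum(size)
```

Both functions return the same number; the column version reads as a single whole-array expression, which is roughly the style that array languages push you toward.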

icsa 11 years ago | |

What kind of computation is done at your startup? I've used k/q for trading, graph analytics and computer vision (scene reconstruction).

ndesaulniers 11 years ago |

I seriously thought that the article, and some of the code examples people are posting [0], were a joke! I need to go rethink my career.

[0] http://www.kparc.com/$/edit.k

avmich 11 years ago | |

That's certainly good - we should learn languages which are different enough to justify learning.

Your code in whatever language you work will benefit from that rethinking.

robfig 11 years ago |

So K is a general purpose programming language? If the claims are true, why don't they submit some entries to the Computer Language Benchmarks Game?

tinco 11 years ago | |

It's proprietary.

xorcist 11 years ago | | |

igouy 11 years ago | |

Perhaps they asked if their K programs were wanted:

http://benchmarksgame.alioth.debian.org/play.html#languagex

icsa 11 years ago | | |

See https://news.ycombinator.com/item?id=8477308

jamesfisher 11 years ago |

Does a good formal introduction exist for K, or Q, or APL, or J, or any other languages in this family? Something with, you know, a syntax definition at least, and any kind of formal definition of the semantics.

The closest I could find is this [1] but "The model is expressed in SHARP APL", so from the start, it's circular.

[1] http://www.jsoftware.com/papers/APLSyntaxSemantics.htm

eggy 11 years ago | |

It's not formal, but it is helpful to watch this series of videos by Martin Saurer on J:

https://www.youtube.com/watch?v=VSJpJt3c11c

It goes from solving some Euler problems to a full-blown web app in J.

avmich 11 years ago | |

For J, IMO, jsoftware.com has good resources. That includes vocabulary (the way to specify the language), a few textbooks (JforC, J Primer), essays, examples of short code... And J forums are pretty helpful.

Coming back to the question, for J its vocabulary on jsoftware.com is a good resource.

zokier 11 years ago |

I wonder what K would look like with a bit more reader-friendly syntax. I tried running a program that supposedly "will take a K expression and produce its English translation" (http://kx.com/a/k/examples/read.k), but either it doesn't work with kdb+/q or I can't figure out how to use it. Does anyone have some example output, or advice?

anonbanker 11 years ago |

A language/kernel/db/ui to watch. Reading the comments here, it seems the language separates the wheat from the chaff; the average brogrammer won't be able to handle this, but many of us are very interested in exploring.

An OS this small is incredibly exciting to me.

sumanthvepa 11 years ago |

Worked at Morgan Stanley's fixed income desk many years ago as my first job after grad school, programming A+. It was a decidedly acquired taste. The language and style were utterly alien to anyone coming from a background in C (and at that time an early version of Java). While A+ was fast for manipulating arrays (much of what the trading floor needed), it seemed that doing GUI development with A+ was really pointless. The speed of the language didn't matter in GUI interactions, and it was very hard to understand other people's code.

tmikaeld 11 years ago |

Um, so what happened?

geocar 11 years ago | |

It's not done yet.

This summer, Pierre and I got kOS to boot directly into g (the graphical interface; formerly called z) with ISR, keymap, modesetting, basic filesystem, etc weighing in around 100 lines of C. That was pretty exciting. Could probably be done with less with some deeper changes to Arthur's code, but it's still very useful to run k under Linux. Oleg made a silly little game in kOS.

Arthur and Oleg did some performance benchmarks staging k against q (current kdb+), Postgres, some "popular RDBMS" (that I can't name), and MongoDB. It was impressive that k is so much faster than q, but it also really underscores the cost of the wrong data structure (and how hard it is to get the right one with SQL or MongoDB).

DennisP 11 years ago | | |

How much are you planning to opensource?

I realize you have a thriving commercial software company and that's cool. But...wow. This is exactly the sort of thing Alan Kay's team has been working on for the past five years, and you guys seem to be beating them to it, with a completely different approach. It would be pretty amazing to be able to dig into it, find out how the whole system works, and contribute.

mwcampbell 11 years ago | | |

What hardware is your team currently targeting with kOS? An x86 virtual machine under something like VirtualBox seems to be a popular choice among developers of alternative operating systems, since it's a way of avoiding the diversity of PC hardware and the need for lots of drivers. So are you doing that? Or sticking to things that are pretty well standardized but outdated, like IDE and VGA as opposed to SATA and modern GPUs? Or are you targeting a particular subset of PC hardware?

keithpeter 11 years ago | | |

Quote from OA

"Whitney sent Oleg and Pierre some of the C code he was working on, and notes on a problem he didn’t know how to solve. They emailed back a solution, coded in his style."

Did Pierre and Oleg think their solution out in standard code first and then make it Whitney-like, or did they find themselves thinking in Whitneyese straight away? I imagine their teacher may have noticed Whitney tendencies and that is what led to the original encouragement to make contact.

mamcx 11 years ago |

So, I read about the fast interpreter and small language. How do you do something like this?

"Whitney’s strategy was to implement a core of the language – including the bits everyone thought most difficult, the operators and nested arrays – and use that to implement the rest of the language. The core was to be written in self-expanding C. As far as I know, the kdb+ interpreter is built the same way.

Unlike the tall skinny C programs in the textbooks, the code for this interpreter spills sideways across the page. It certainly doesn’t look like C."

Does this mean that the code is very unreadable C? Like a kind of code-golf?

How would one replicate it to build a speedy interpreter? And what if I use lua or python instead?
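The "implement a core, then use it to implement the rest" strategy from the quote can at least be sketched in Python. Everything here is hypothetical: `each` and `over` are stand-in primitives chosen for this example, and none of it reflects how Whitney's interpreter is actually written.

```python
from functools import reduce

# Hypothetical "core": two primitives that everything else builds on.


def each(f, xs):
    """Apply f to every item of xs (a map)."""
    return [f(x) for x in xs]


def over(f, xs):
    """Fold the binary function f across a non-empty xs (a reduce)."""
    return reduce(f, xs)


# The "rest of the language", written only in terms of the core above.


def total(xs):
    return over(lambda a, b: a + b, xs)


def count(xs):
    return total(each(lambda _: 1, xs))


def avg(xs):
    return total(xs) / count(xs)


def largest(xs):
    return over(lambda a, b: a if a > b else b, xs)
```

Only `each` and `over` would need a fast native implementation under this scheme; `total`, `count`, `avg`, and `largest` come along for free, which is one plausible reading of why the core can stay small.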

JulianMorrison 11 years ago | |

There are two opposite schools of readable C.

The mainstream says: readable C has function and variable and type names that express meaning, so a function is read like a narrative with verbs, adjectives and nouns. The fact that this narrative scrolls over pages, is unimportant.

The APL/K/J school says: readable C has functions, variables, and types named with single letters, so that the totality of a function is short enough to fit in one glance - preferably, on one line that does not need a scrollbar. The function does exactly what it says, no more and no less; its intent is thus completely clear. To name it descriptively would ruin the ability to grasp the whole thing as a gestalt.
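The two schools are easier to compare side by side. The comment is about C, but the contrast carries over; below is the same running-maximum function in both styles, in Python, with names invented for this illustration.

```python
from itertools import accumulate


def compute_running_maximum(values):
    """Mainstream style: descriptive names, reads top to bottom."""
    running_maximum = []
    current_maximum = None
    for value in values:
        if current_maximum is None or value > current_maximum:
            current_maximum = value
        running_maximum.append(current_maximum)
    return running_maximum


# APL/K/J style: one line, single-letter names, the body is the documentation.
def m(x): return list(accumulate(x, max))
```

Both return the same list for the same input; which one counts as "readable" is exactly the disagreement described above.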

mamcx 11 years ago | | |

Ok, so APL/K/J is for people that read compressed code..

But how does that relate to building a fast interpreter? Faster than C?

anonu 11 years ago |

I hate to love KDB because it's a very expensive closed platform. But if you understand some of the concepts and how easy it is to achieve those concepts with a few lines of q code you can do some brilliant things. Yes, KDB is a great data store and provides very quick methods for crunching that data with its vector-based approach.

However, what's really impressed me with KDB is that you can do so much more with it. In some banks it has effectively become the messaging middleware for connecting hundreds of disparate data sources. In addition to passing messages you get the data storage and analytics tools for free...

duckingtest 11 years ago |

This seems to be a very fun language to write in, in the same way optimizing assembly code in the ancient times was fun; the age of skilled artisans. Unfortunately (for programmers - great for the rest) now is the time of a factory worker. Due to that, I don't think this language and its platform has a long future.

Its continued survival is rather a testament to excellency of marketing (as evidenced by this article!) rather than actual merits.

avmich 11 years ago | |

I think the opposite :) - marketing of R is light years ahead. The language is very obscure, but so good that from time to time you're going to hear wonderful stories from the newly enlightened.

And with the rise of parallel programming we're going to actually switch more and more to a vector way of describing and solving problems. Just like Lisp ideas are spreading everywhere in modern languages, APL ideas are also fruitful.

rsync 11 years ago |

I'm interested in a kOS with a "god says ..." program built in ...

serf 11 years ago | |

get Terry Davis on the job. Is he opposed to porting his holy code to other esoteric platforms?

jeffreyrogers 11 years ago |

> kOS is coming. Nothing will be the same afterwards.

There seems to be this strange idea going around that if we just get the right tool, everything else is going to change forever. I see this a lot with people trying to create IDEs that let non-programmers create programs without really knowing how to code.

But the thing is, most people just don't have anything worth coding. The problem isn't that the tools don't exist. They do, even if they aren't perfect. It's that making something that matters isn't an easy thing to do. And no tool can change that.

tsmith 11 years ago |

Sounds great! Where's the documentation on IPC / mutexes / threading?

> todo

> files, procs, tcp/ip, usb, ..

Oh.

icsa 11 years ago | |

E.g. - (k4/q) http://code.kx.com/wiki/Reference/hopen shows how to open a file/proc.

The rest of the wiki is quite useful, including references and tutorials for q.

k5 uses operators instead of words like q.

Btw, being able to map/reduce w/ 1000 procs on an 8GB Linux vm (using k5) is both useful and fun.

rgbrgb 11 years ago |

Kind of curious to play around with that text editor. Any chance of K running on OSX?

ah- 11 years ago | |

You can get kdb+ for free from here: http://kx.com/software-download.php

However, they've been working on a new version of k that's not publicly available yet and I suppose the kparc stuff requires that.

esya 11 years ago |

Hey Geo, if you read this, congrats :)

Tristan

agarttha 11 years ago |

joke: search jqk on google images

lafar6502 11 years ago |

I don't buy this story. Maybe kdb is fast and great, but the article attempts to describe it as a work of a genius, better than anything else because it's 100 lines of code, doing its own memory management and running on bare metal. But in fact many other programs do their own memory management and can run on bare metal. JVM or .Net do their own mem management, all database servers too, and many, many others. So what's left on the table? 500 lines of C code that's meant to change the world? I really doubt it (or, these guys aren't using line breaks). We've read so many similar stories about Lisp and how it's one language that can do everything in 5 lines of code or how you can build an entire OS and all applications in Lisp, but where's that OS now?

icsa 11 years ago | |

What's left on the table is the efficient use of resources.

When I mentioned to a friend that the current version of K5 was a binary < 100KB, his response was "I don't believe it. I can't write HelloWorld in less than 100 KB!".

Access to more resources does not mean that one should be wasteful.

K benefits from a dedication to avoiding waste and duplication and efficient use of mathematical concepts. The Fundamental New Computing Technologies at VPRI has similar goals (an entire end-user system including "Office" apps in 40 KLOC or less).

Being able to prototype a multi-proc map/reduce algorithm in k with 1000 procs on a laptop with 8 GB RAM is quite nice.

> 500 lines of C code that's meant to change the world? I really doubt it (or, these guys aren't using line breaks).

There are line breaks, undoubtedly. That said, I'm sure the code is concise - much like k code.

K3 was 1200 lines of code and included the language, windows(GUI), database, IPC, REPL (w/ simple debugger), FFI and OS interaction. The Windows executable was 320 KB.

jwatte 11 years ago |

Every ten years, some company comes around and claims that their functional/data flow/columnar/meta shortish is orders of magnitude better. Usually, the one thing they have going for them is that they focus on only a small subproblem. That lets them be small. Every demo is one of no edge cases, no exceptions, and no I/O errors. (Or they cram all that into some "standard" library.)

The real challenge is that, 99% of the time, requirements and integration are what kill you, not raw performance. For the cases where performance (or formal correctness, or whatever) matters, the main challenge is usually to convince the market that it's worth paying for, and then finding the right developer project match.

avmich 11 years ago | |

This is all false for APL. First, APL is right there in the history with Fortran and Lisp - and its "every ten years" became irrelevant already when it was used to analyze IBM 360 hardware and found some bugs which were fixed in time for shipping. Second, these languages are truly for very wide ranges of problems - it's the industry curse that APL isn't used more widely; I guess the reason is it's harder to learn. But you can use it for really everything. Who'd think JavaScript would be the language to write a Linux emulator in? So it's less of a wonder that k is used for OS writing.