For using the 32-bit version (from Limits):
22 Limits
Each database runs in memory and/or disk map-on-demand -- possibly partitioned. There is no limit on the size of a partitioned database but on 32-bit systems the main memory OLTP portion of a database is limited to about 1GB of raw data, i.e. 1/4 of the address space. The raw data of a main memory 64bit process should be limited to about 1/2 of available RAM.
The & "where" operator in raw k has stayed with me over the years as a particularly inspired way to deal with column based data.
I would say the language is very interesting. It is probably not interesting enough to get sued for, though ....
I suspect times have changed - there are implementations that have been out there for years (https://github.com/kevinlawler/kona implements k3 with sprinkles of k4, and http://althenia.net/kuc implements an almost-k4 with a JIT and writable closures).
IIRC, when you did your implementation it was when k4 was still a "technology preview" and not their main product (or was just released) - I remember understanding the panic in those actions, even though I totally disagree with them. (I didn't know about the threats, but I do remember seeing it appear and disappear within a day, and assumed something was happening behind the scenes.)
typedef struct k0{signed char m,a,t;C u;I r;union{G g;H h;I i;J j;E e;F f;S s;struct k0*k;struct{J n;G G0[1];};};}*K;
Sorry I guess I'm just not seeing the "not much there and actually easy to understand"
Whatever a 'H' is
// remove more clutter
#define O printf
#define R return
#define Z static
...
Removes clutter indeed... :)
The q language is very powerful and expressive - an interesting mix of Lisp and APL. You can do really powerful analytics without writing tons of code for it.
You really have to see how fast KDB is compared to most nosql products out there.
I barely did any q / kdb; only made a functional and usable UI, and did some prototyping of new ideas in other languages (Java, Max/MSP, Csound). I spent some time looking into q and was thoroughly baffled. Still am. It was really, really fast, though!
As I vaguely understand and can explain it, the k/q system made it easy to do fuzzy searches and deal with missing pieces of data. If the user missed a note, or our pitch detection failed, or our source data was bad, we were still able to find matches. (Yes, I wish I'd been able to understand this more at the time. Bygones, now...)
Sure, you can do the same in R or python, but the whole process is very quick and easy in q.
To elaborate: K uses one-letter mnemonic codes for all of its basic storage types:
G = General = 8-bit unsigned int
H = sHort = 16-bit signed int
I = Integer = 32-bit signed int
J = bigger integer = 64-bit signed int
(Note how G,H,I,J follow each other?)
E = 32-bit floating point "rEal"
F = 64-bit Floating point
(Again, they are near each other)
S = Symbol
K = "general list type", the central K language type
And that's mostly it; the last unnamed struct in the union (with fields "n" and "G0") is for vectors, n being the length and G0 being the data. The only other field you are ever going to need is "t" for type (saying which union member is actually in use). The rest are internal implementation details, but are also easy to remember: r=reference count; u=flags; m and a have something to do with memory mapping and allocation.
There are a few more basic types: b=boolean, t=time, d=date, z=datetime, m=month - but they are merely different interpretations of the EFGHIJSK members above; to access data from C, all you need is the list given above.
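A minimal C sketch of the layout just described, assuming a simplified header rather than the real k.h. The struct `Vec`, the helpers `new_J`, `as_J` and `sum_J`, and the tag value `TAG_J` are all hypothetical names made up for illustration. It only shows the idea: a type tag "t", a length "n", and raw bytes "G0" that get reinterpreted according to the tag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One-letter type names following the scheme described above. */
typedef uint8_t G;   /* 8-bit unsigned int  */
typedef int16_t H;   /* 16-bit signed int   */
typedef int32_t I;   /* 32-bit signed int   */
typedef int64_t J;   /* 64-bit signed int   */
typedef float   E;   /* 32-bit "rEal"       */
typedef double  F;   /* 64-bit Float        */

enum { TAG_J = 7 };  /* illustrative tag meaning "vector of J" (assumption) */

/* Simplified stand-in for the vector case: NOT the real k.h struct. */
typedef struct vec {
    signed char t;   /* type tag: says how to interpret G0 */
    I r;             /* reference count                    */
    J n;             /* number of elements                 */
    G G0[];          /* raw data bytes                     */
} Vec;

/* View the raw bytes as an array of J. */
J *as_J(Vec *v) { return (J *)v->G0; }

/* Allocate an uninitialised vector of n 64-bit ints; NULL on failure. */
Vec *new_J(J n) {
    assert(n >= 0);
    Vec *v = malloc(sizeof(Vec) + (size_t)n * sizeof(J));
    if (!v) return NULL;
    v->t = TAG_J;
    v->r = 1;
    v->n = n;
    return v;
}

/* Sum the elements, checking the tag before reinterpreting the bytes. */
J sum_J(Vec *v) {
    assert(v->t == TAG_J);
    J s = 0;
    for (J i = 0; i < v->n; i++) s += as_J(v)[i];
    return s;
}
```

The caller fills the vector through `as_J(v)[i]` and releases it with `free(v)`; as far as I understand it, the real header plays the same role with accessor macros and reference counting instead.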
My thoughts on this area have changed a bunch. I think when I was young I was a lot more about cleverness and conciseness.
Now that I'm older and I've worked on a large variety of software systems, I am starting to believe that readability of code is one of the most important values. After all, you read the code a lot more than you write it.
I can say definitively that:
- I have often regretted using single letter variables (outside of loop 'i')
- I have very much regretted using non-descriptive names
- I have never regretted using longer variable/method names
Nowadays, in an IDE environment, longer names don't even carry a typing penalty. Yeah yeah I know Java, but it's a safe language, and in a world where I want to deliver working, correct, bug free code, safety is more important than single letter expressiveness.
After all, I don't think people hold up APL as good code.
I'm almost the other way around. I always valued elegance, which often manifests itself in conciseness (but most conciseness is NOT an example of elegance). Followed by readability, which usually manifests itself in verbosity (though most verbosity is NOT actually readable). And when decision time came, I'd prefer verbose inelegant code to non-verbose concise code.
But then, I spent some time using K. And I realized I need 100-1000 times fewer lines to achieve the same thing, AND it usually works about as fast (despite an interpreter), AND I have fewer bugs AND those bugs tend to be of one kind (off-by-one) and easy to find.
e.g., look at this example by Stevan Apter: http://nsl.com/k/t.k - 14 short lines implement an efficient (fast and memory compact) in-memory database that includes joins, selects, inserts, deletes, aggregates, grouping and sorting.
To the uninitiated, it looks like code golf, but this is actually very readable K if you know K. The definition order:where:{[t;f;o]{x y z x}/[_n;o;t f]} gives two names ("order" and "where") to an idiomatic K expression that takes a table "t", and paired lists "f" (of field names) and "o" (of re-index functions that can be used to filter or sort), and returns the resulting table after reindexing, one by one, applying those functions to their relevant fields. A sorting function would be "desc:>:" (that is, ">:" henceforth also named "desc") which returns a sorting permutation for its argument. A filtering function would be "&3>" (pronounced "where 3 is larger than" and can be written "where 3>" in the Q dialect).
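For readers who don't know K, here is a rough C sketch of what a filter like "&3>" followed by a reindex amounts to on a single integer column. This is not how K is implemented; the function names `greater_than`, `where` and `reindex` and the choice of `long` columns are assumptions made for illustration. The point is only that "where" turns a list of flags into a list of positions, and those positions are then used to index the column.

```c
#include <assert.h>
#include <stddef.h>

/* "3>" applied to a column: flags[i] is 1 when limit > col[i], else 0. */
void greater_than(long limit, const long *col, size_t n, int *flags) {
    for (size_t i = 0; i < n; i++) flags[i] = limit > col[i];
}

/* "&" on a list of flags: write the positions of the nonzero flags
   into out (which must have room for n entries), return how many. */
size_t where(const int *flags, size_t n, size_t *out) {
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        if (flags[i]) out[k++] = i;
    return k;
}

/* Reindex: out[i] = col[idx[i]] for each of the k positions in idx. */
void reindex(const long *col, size_t n, const size_t *idx, size_t k, long *out) {
    for (size_t i = 0; i < k; i++) {
        assert(idx[i] < n);      /* positions must lie inside the column */
        out[i] = col[idx[i]];
    }
}
```

Calling `greater_than`, then `where`, then `reindex` on a column keeps only the rows smaller than the limit. A sort works the same way, except that the positions come from a sorting permutation instead of a filter, which is why one K definition can serve as both "order" and "where".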
Now, this conciseness does NOT come from golfing. It comes from eschewing the now obligatory object oriented programming, sticking with "down to the metal" data types, and, rather than trying to find the minimal base of operations and endless compositions (like most languages do), using a wide base of operations and a precisely chosen set of compositions.
It is true that reading/writing a K line takes 10-50 times as long as reading a C line. But I've been unable to consider modern software engineering anything but ludicrous when a different set of primitives (and experience) can get you the same results for 1/100 or 1/1000 of the size of the executable specification, and everyone puts a SEP field on it.
H is a short in k/q, so H refers to a short here as well. This naming scheme is true for all of the above letters.