InfiniSQL (infinisql.org)
1) I include keystore-like stored procedures in the source. They do get/set with an integer key and a string value. I haven't done thorough benchmarking, but I expect them to outperform the other benchmark I've published, which is a considerably more complex workload.
2) (camus2) agreed, nothing ever dies in IT. But roll back the clock a few years. How much noSQL would have come into existence if there had been a free xyzSQL that scaled across nodes, was fast, etc.? I believe the answer is that there'd be very few network-based noSQL systems for operational workloads if that had been the case.
3) jwatte: Yeah! Jagged edges too!
4) stephen24: Also, I intend to change the license from AGPL to GPL next time I push out some code. No excuse not to try it out.
5) siliconc0w: There's an architectural write-up at High Scalability: http://highscalability.com/blog/2013/11/25/how-to-make-an-in... -- I believe that the actor model architecture is distinct in InfiniSQL.
6) diwu1989: Yes and no. Yes, MemSQL is more mature. No,
(a) I'm not sure how MemSQL scales horizontally (especially since that was a feature added after v1 of their code was released), and,
(b) MemSQL isn't free software
7) itsbits: for now InfiniSQL is mainly for hackers and early adopters--the dependencies are pretty clearly documented but it requires some effort to work with in its current state
(And I really want to try this...)
To be clear, there are no legal reasons I can think of that would ever prevent internal use of LGPL/GPL software.
You mean these companies (Apple, for example) have policies.
Policies like this often change because someone decides the cost vs risk tradeoff is worth it.
Changing a license because of bad policies of certain companies is not a great reason to change a license (in fact, it's, IMHO, an actively bad one).
You really should only change licenses if you find the license you chose does not suit the needs of your users (and policies are not really needs).
Based on FSF feedback, I'm going to modify the license to include a Classpath-like exception. The intention is to allow people to write stored procedures that link against infinisql without triggering the copyleft. Only if the source to infinisql itself is modified (and distributed) will the copyleft apply.
I'm curious to know the rationale against the GPL in general (not just the AGPL), and how those shops allow Linux & gnu toolchains in spite of their rule against the GPL.
Do you have benchmark reports?
I'm currently using MySQL, how similar is the SQL syntax?
The SQL support is documented (http://www.infinisql.org/docs/index/)
I like document stores like MongoDB and RethinkDB and feel they are a great fit for most scenarios. I also feel that caching layers with Redis or Memcached can help...
Cassandra is interesting in the primary storage space as well, and imho has resolved a lot of issues, while others remain. I'm interested to see if this database can get there faster than Cassandra/CQL can get to more parity with traditional SQL systems.
While I appreciate the options, there is no one solution for everything... If you never break 100 simultaneous users, memory-mapped flat files and map/reduce could be sufficient.
> InfiniSQL currently is an in memory database. This means that all records are stored in system memory, and not written to disk. This provides very high performance--but it also means that InfiniSQL currently lacks the property of Durability. If the power goes out, all data is gone. This limitation is temporary.
They do mention that they'll implement persistence, but that's likely to lower performance, as you're limited by how fast the write-ahead log can be written, even if updates to on-disk structures are batched.
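To make the batching trade-off concrete, here is a minimal sketch of group commit: several records share one fsync, so the log's flush rate rather than the record rate bounds throughput. This is not InfiniSQL's design (persistence isn't implemented there yet); `GroupCommitLog`, `batch_size`, and `wal_demo` are hypothetical names made up for illustration.

```python
import os
import tempfile


class GroupCommitLog:
    """Toy write-ahead log that flushes buffered records in batches."""

    def __init__(self, path, batch_size=3):
        self.path = path
        self.batch_size = batch_size  # hypothetical tuning knob
        self.pending = []             # records not yet durable
        self.flush_count = 0          # number of fsync calls so far

    def append(self, record):
        """Buffer a record; flush once the batch is full."""
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all pending records and pay for a single fsync."""
        if not self.pending:
            return
        with open(self.path, "a", encoding="utf-8") as f:
            for record in self.pending:
                f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.flush_count += 1
        self.pending.clear()


def wal_demo():
    """Append 7 records with batch_size=3; return (fsyncs, lines on disk)."""
    with tempfile.TemporaryDirectory() as d:
        log = GroupCommitLog(os.path.join(d, "wal.log"), batch_size=3)
        for i in range(7):
            log.append(f"txn {i}")
        log.flush()  # flush the final partial batch
        with open(log.path, encoding="utf-8") as f:
            lines = f.read().splitlines()
        return log.flush_count, len(lines)
```

The catch is that a record sitting in `pending` is not durable yet, so a real system would have to hold the commit acknowledgement until the batch containing it has been flushed.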
They also mention:
> No sharding is necessary with InfiniSQL: it partitions data automatically across available hardware. Connect to any node, and all of the data is accessible.
I haven't looked at how joins are done across large tables that span multiple nodes (or whether they're even supported), but that's not likely to be fast either, for obvious reasons.
2) No joins are supported yet. However, the benchmark that I performed (on the blog) involves 3 updates across random nodes. I designed InfiniSQL specifically to perform multi-node transactions very well, because that's the Achilles' heel of every other distributed OLTP system. I plan to implement joins, and expect them to perform decently for the workload you describe.
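For anyone wondering what "connect to any node, and all of the data is accessible" could look like mechanically, here is a rough sketch using plain hash partitioning. It is an assumption for illustration only; the quoted docs don't say which partitioning scheme InfiniSQL uses, and `Cluster`, `owner`, `put`, and `get` are hypothetical names.

```python
import zlib


class Cluster:
    """Toy cluster: every node can route a key to the node that owns it."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # One in-memory dict per node stands in for that node's storage.
        self.storage = {name: {} for name in self.node_names}

    def owner(self, key):
        """Pick the owning node with a stable hash of the key."""
        digest = zlib.crc32(str(key).encode("utf-8"))
        return self.node_names[digest % len(self.node_names)]

    def put(self, entry_node, key, value):
        """Accept a write at any node and forward it to the owner."""
        if entry_node not in self.storage:
            raise KeyError(entry_node)
        self.storage[self.owner(key)][key] = value

    def get(self, entry_node, key):
        """Accept a read at any node and fetch it from the owner."""
        if entry_node not in self.storage:
            raise KeyError(entry_node)
        return self.storage[self.owner(key)].get(key)
```

A value written through node "a" can be read back through node "c", because both compute the same owner. The cost is a network hop whenever the entry node isn't the owner, which is why joins over rows spread across nodes are the hard part.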
It practically is on the front page.
It really does scale; check out the benchmark report on the blog: http://www.infinisql.org/blog/2013/1112/benchmarking-infinis...
For deadlock-prone workloads, it will likely not be as good, admittedly.
I'm considering a variation on MVCC that gets around the single transactionid bottleneck, but the current implementation is based on 2PL. http://www.infinisql.org/docs/overview/#ftn.idp37098256
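For readers who haven't met 2PL (two-phase locking), here is a toy, single-threaded sketch of the strict variant: a transaction takes a lock before each write and releases everything only at commit or abort. It is not InfiniSQL's code. The "no-wait" conflict policy and every name below (`TwoPhaseLockingStore`, `LockConflict`) are assumptions chosen to keep the example short.

```python
_MISSING = object()  # marks "key did not exist before this write"


class LockConflict(Exception):
    """Raised when a lock is held by another transaction."""


class TwoPhaseLockingStore:
    """Toy store using strict 2PL with exclusive locks only."""

    def __init__(self):
        self.data = {}
        self.lock_owner = {}  # key -> id of the transaction holding the lock
        self.undo = {}        # transaction id -> list of (key, old value)

    def write(self, txn, key, value):
        """Growing phase: take the lock, then apply the write."""
        holder = self.lock_owner.get(key)
        if holder is not None and holder != txn:
            # No-wait policy (an assumption): fail instead of blocking,
            # which sidesteps deadlock at the cost of more aborts.
            raise LockConflict(f"{key!r} is locked by {holder}")
        self.lock_owner[key] = txn
        old = self.data.get(key, _MISSING)
        self.undo.setdefault(txn, []).append((key, old))
        self.data[key] = value

    def commit(self, txn):
        """Shrinking phase: keep the writes, release every lock at once."""
        self.undo.pop(txn, None)
        self._release(txn)

    def abort(self, txn):
        """Undo writes in reverse order, then release every lock."""
        for key, old in reversed(self.undo.pop(txn, [])):
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self._release(txn)

    def _release(self, txn):
        for key in [k for k, t in self.lock_owner.items() if t == txn]:
            del self.lock_owner[key]
```

With blocking instead of no-wait, two transactions that lock the same keys in opposite order can wait on each other forever, which is the deadlock-prone case mentioned above.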
For concurrency management algorithms, there are no good ones. Only those that are less bad than others in some cases.
More charitably, they don't want to be _legally obligated_ to contribute if they modify it.
Instead, a friend and I have been thinking about how to perhaps modify MVCC to work with distinct transactionid's per partition. Namely, I'm already generating what I call "subtransactionid"'s for each partition involved in a transaction. And those must be ordered for synchronous replication, so I think the way to implement a variation on MVCC may already be mostly there.
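As a very rough illustration of the per-partition idea, the sketch below gives each partition its own counter, so a transaction that touches several partitions gets one ordered id from each and no global counter is involved. This is only a guess at the shape of it. `Partition`, `begin_subtransaction`, and `start_transaction` are hypothetical names, not InfiniSQL's.

```python
import itertools


class Partition:
    """Toy partition that hands out its own ordered subtransaction ids."""

    def __init__(self, name):
        self.name = name
        self._counter = itertools.count(1)
        self.log = []  # (subtransaction id, transaction id), in issue order

    def begin_subtransaction(self, txn_id):
        """Issue the next id for this partition and record it."""
        sub_id = next(self._counter)
        self.log.append((sub_id, txn_id))
        return sub_id


def start_transaction(txn_id, partitions):
    """Give the transaction one subtransaction id per partition touched."""
    return {p.name: p.begin_subtransaction(txn_id) for p in partitions}
```

A replica of a partition could then apply work in that partition's `log` order without coordinating with any other partition, which is the property synchronous replication needs.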
I know I still owe you an architectural doc...fixin' ta, ya know.
Should be quite easy to do equijoins especially if you're joining a couple thousand rows at most at a time; it only gets hairier when you're joining all records of very large tables that don't necessarily fit in memory, which is not very OLTP-y.
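For reference, the easy case described here is essentially an in-memory hash join. A minimal sketch, with rows as plain dicts and `hash_equijoin` as a made-up helper name:

```python
from collections import defaultdict


def hash_equijoin(left, right, left_key, right_key):
    """Join two lists of dict rows where left[left_key] == right[right_key]."""
    # Build phase: index the (ideally smaller) left input by its join key.
    index = defaultdict(list)
    for row in left:
        index[row[left_key]].append(row)
    # Probe phase: look up each right row and emit the merged rows.
    joined = []
    for row in right:
        for match in index.get(row[right_key], []):
            joined.append({**match, **row})
    return joined
```

The build side has to fit in memory, which is fine for a couple thousand rows and is exactly what stops working when both tables are very large.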
With regards to persistence, I'm really curious to hear how you're planning to have durability without writing something to disk on every transaction. It could work if you're relaxing the definition of durable to mean written to memory on at least $n$ nodes, though that's likely to be surprising to someone with a stricter definition of durable.
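To spell out what that relaxed definition would mean, here is a small sketch in which a write counts as "durable" once at least `min_acks` replicas hold it in memory. It is purely illustrative of the definition being questioned, not a description of InfiniSQL's plans. `Replica` and `replicated_write` are hypothetical names.

```python
class Replica:
    """Toy replica that keeps acknowledged writes in memory only."""

    def __init__(self, up=True):
        self.up = up
        self.memory = {}

    def store(self, key, value):
        """Return True if the write was accepted into memory."""
        if not self.up:
            return False
        self.memory[key] = value
        return True


def replicated_write(replicas, key, value, min_acks):
    """Report the write as 'durable' once min_acks replicas hold it."""
    acks = sum(1 for r in replicas if r.store(key, value))
    return acks >= min_acks
```

Nothing here touches disk, so a power loss that hits every replica at once still loses the data, which is the surprise a stricter definition of durable is meant to rule out.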
Edit: By the way, it's really cool that you have a C++ implementation of actors, I'll have to look into it. Have you thought about turning that into a library?
I've thought about having an actor library, or minimally, to have the actor basis of InfiniSQL independent of specific workload, but haven't thought it through entirely. I'd be supportive of any efforts to that effect if you want to work on it!
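For anyone unfamiliar with the actor model, a workload-independent core can be quite small: private state, a mailbox, and one thread draining it. The sketch below is in Python rather than C++ for brevity and is not derived from InfiniSQL's source. `Actor`, `send`, `stop`, and `actor_demo` are hypothetical names.

```python
import queue
import threading


class Actor:
    """Toy actor: private state, one mailbox, one thread draining it."""

    _STOP = object()

    def __init__(self, handler):
        self._handler = handler
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Enqueue a message; the sender never touches actor state."""
        self._mailbox.put(message)

    def stop(self):
        """Ask the actor to finish queued work, then wait for it to exit."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                return
            self._handler(message)


def actor_demo():
    """Send 1..100 to a summing actor and return the total it computed."""
    totals = {"sum": 0}  # mutated only by the actor's own thread

    def add(amount):
        totals["sum"] += amount

    counter = Actor(add)
    for n in range(1, 101):
        counter.send(n)
    counter.stop()
    return totals["sum"]
```

Because only the actor's thread runs the handler, the state needs no locks; all coordination happens through the mailbox.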
Also, the main application is in C++. A python script launches the C++ daemons. Perl scripts are quick and dirty tests and deployment scripts. The main hacking I'm looking for is with C++, and I don't care so much if the other stuff gets re-implemented in some other language.
No API, got it.
Most of the developers I've seen will happily sell you a commercial license if you don't like the software's license. After paying for it enough, most companies start to ask "well, actually, how risky is this, really?", and this is how policies change.
In any case, my other point stands - there are no actual legal reasons to not use LGPL/GPL software internally. It would have zero legal impact.
I actually generally see just the opposite - even automakers, who are traditionally stalwarts about anything, are now starting to use GPL software in cars.
"which is why there is so much work being invested into LLVM/Clang"
This is a weird opinion that I've seen a few times.
This is not why LLVM was/is chosen, AFAIK. LLVM was/is chosen for greater control over destiny, a better platform, and a better community.
If LLVM was GPL there is exactly one company that would theoretically stop contributing (admittedly, it's been about 2 months since I calculated the list of companies that contributed in the past year). I doubt that would actually happen, too (mainly because I asked once if they would).
I was just at an LLVM social this evening, and not a single person there worked for a company that chose LLVM because of "massive commercial pressure against the GPL".
But I'm conflicted about GPL vs Apache (or BSDish) in the sense that I'm getting the message that I have to bend over backwards just a little bit further before somebody, somewhere might be willing to use my software, maybe. Free isn't enough. I also have to let them fork it, keep it proprietary, wrap their own brand around it, before maybe they might consider using it.
That said, I really want people to use it, and of course help me hack on it. But I'm conflicted.
My impression is that the second list exists solely because there exists GPLv2 licensed software with no viable alternatives to it. Unfortunately, your project is not one of them. It's your project so you can do whatever you want, but GPL is an obstacle to adoption in industry.
Apple was frustrated they couldn't get what they wanted out of GCC, and were getting patches and designs constantly rejected. This, combined with getting control over their destiny, and having serious needs for a modular compiler frontend for XCode (and their design for a compiler server/etc for GCC got shot down), plus Chris demonstrating good performance results + trajectory, led to them choosing LLVM.
But what do I know - I was there, in both communities, talking to the people who were involved in these decisions.
Realistically, if the SVP/VP in charge of Apple's developer tools had decided GCC was still the way to go, they would be working on GCC, GPLv3 policy or not.
Policies are not an end unto themselves.
All of this is completely orthogonal to the freezing of the GCC version. They could get what they wanted out of it in the pre-GPLv3 versions, given their future plans were LLVM based anyway, so they didn't make an exception for GCC when they banned GPLv3.
Of course, I'm not going to claim that Apple didn't do other things more for licensing reasons, a lot of which can be explained by the desire to be able to share code between OS X and iOS in some places (and eventually, in a lot of places), where GPLv3 would have disastrous effects if they messed up. They calculated the eng cost, came up with "we have good alternatives, and can rewrite the rest", and did that, and banned GPLv3. However, they were making exceptions for years for certain pieces of software already. So if you had chosen any other example than LLVM, I'd probably agree with you. LLVM is just not a great example of "commercial pushback against GPL".
Apple's dropping of Samba would be a good example, since that is directly the reason they dropped Samba.
[1] One of my GCC friends walked out of this presentation complaining that he was selling them a bill of goods. Of course, he turned out to be wrong, but ...
The conclusion I've drawn from it is that GPLv3 was a significant driver in the decision to seek out and drive forward a non-GPL compiler project. I didn't say it was the only factor, but I stand by my conclusion that it must have been a significant one.
"The conclusion I've drawn from it is that GPLv3 was a significant driver in the decision to seek out and drive forward a non-GPL compiler project. I didn't say it was the only factor, but I stand by my conclusion that it must have been a significant one."
I believe I've completely rebutted this statement with my response. I believe I accurately explained exactly what went into the decision to fund and use LLVM, and "seeking out and driving a non-GPL compiler project" was literally not on the list of things the decision makers (Ted, in this case) cared about.[1] If you have actual historical evidence that contradicts my explanation of what drove the decision to use LLVM, I'd love to hear it. So far what you've put forth is a single data point which, as I already explained, was, AFAIK, completely unrelated to the decision to use LLVM.
Also, Apple/Chris first suggested merging LLVM and GCC (http://gcc.gnu.org/ml/gcc/2005-11/msg00888.html), which would seem an odd strategy if licensing was the huge driver you claim it was.
Historically, the timeline isn't even close to right for your conclusion to be correct. Apple started seriously investing in LLVM in 2005, and the GCC GPLv3 switch didn't happen until 2009.
So, basically, you are welcome to stand by your conclusion, but it's, well, wrong :)
[1] In fact, Ted literally did not care about the licensing at all. They were considering using ICC as well, but this mostly got dropped after the switch to x86.
I own open source licensing policy at one very large company (which doesn't really work like you suggest), and am in contact with about 50-100 other open source counsel on a regular basis, and the only software most ban is AGPL (and a few other licenses which we aren't talking about here, as they are wildly uncommon).
Most companies also do not treat GPLv2 and GPLv3 differently from a licensing perspective, only those that ship embedded devices do.
At least, this is my experience. I'm curious where yours is coming from.
Perhaps you work with companies where adopting technology stacks is more of a top-down decision where legal counsel is always involved. In those situations, GPL doesn't pose a particular barrier because all open source software faces that same barrier. But some companies give more autonomy to their developers, and in those cases there's a difference in overhead when managing GPL compliance.
Interesting. We use about 8000 open source packages, and add roughly 90 a week right now.
"Perhaps you work with companies where adopting technology stacks is more of a top-down decision where legal counsel is always involved. In those situations, GPL doesn't pose a particular barrier because all open source software faces that same barrier. But some companies give more autonomy to their developers, and in those cases there's a difference in overhead when managing GPL compliance."
Actually, I work at a company (Google) where autonomy is given. People are free to use basically anything but AGPL. We simply tell them what will be required of them if they use it, and enforce that this happens.
The overhead of GPL compliance is not any more than the overhead of any other license compliance, for us, in practice.
You still have to do stuff for BSD and MIT anyway, so you need a process that knows what is going into shipping software.
The short version is that:
Overhead is kept low by doing it as part of the same check-in process as any other source code (IE you don't fill out some magical form and send it to lawyers), among other things.
Shipping time is simple verification that nothing changed (and the build system will verify it anyway).
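As an illustration of what a check-in-time license check might look like, here is a deliberately simplified sketch. It is not a description of any company's actual tooling, and the obligations are reduced to two coarse buckets. The identifier sets, the policy itself, and the `check_package` helper are all assumptions for the example.

```python
# Assumed policy, loosely mirroring "anything but AGPL" from the thread.
BANNED = {"AGPL-3.0"}
NEEDS_SOURCE_OFFER = {"GPL-2.0", "GPL-3.0", "LGPL-2.1"}
NEEDS_ATTRIBUTION = {"MIT", "BSD-3-Clause", "Apache-2.0"} | NEEDS_SOURCE_OFFER


def check_package(name, license_id):
    """Return (allowed, notes) for one third-party package."""
    if license_id in BANNED:
        return False, [f"{name}: {license_id} is not allowed"]
    notes = []
    if license_id in NEEDS_ATTRIBUTION:
        notes.append(f"{name}: include attribution and license text")
    if license_id in NEEDS_SOURCE_OFFER:
        notes.append(f"{name}: provide source when distributing")
    if not notes:
        # Unrecognized licenses are held for manual review.
        return False, [f"{name}: unknown license {license_id}, needs review"]
    return True, notes
```

Even the permissive licenses produce an obligation here, so a process that already tracks attribution for MIT/BSD only needs one more rule to handle the GPL family.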
My experience is that companies find GPL compliance overhead higher because they aren't doing the right thing for other licenses anyway. In particular, they never produce correct attribution for MIT/BSD/etc, so having to do "anything at all" is higher overhead.
This experience comes from reviewing a large number of companies for acquisition :)
The second change from GPLv2 to GPLv3 is the DRM clause, or the "you can't use a technical method to bypass the legal requirements". Again, only really relevant if the company uses DRM, but would be willing to use GPLv2. That is a very short list.
I'm willing to concede that GPL is not a disadvantage to adoption by Google or any of the hundreds of startups Google might acquihire in a given quarter :)
So I'll concede that it likely wasn't a direct cause of the move to and support of LLVM. But clang didn't come along until later, after it must have been clear that GCC would eventually be GPLv3 licensed.
God knows GCC's codebase is a rat's nest that few people really want to work with, but if it hadn't been for GPLv3, do you really maintain that Apple wouldn't have stuck with it as a frontend for longer? The early releases of clang (and gcc+llvm before it) were problematic for a lot of Mac developers at the time, after all.
Yes. Absolutely. It's been mentioned numerous times at conferences and other in-person meetings.
Apple did not write clang because of GPLv3. They wrote clang because they needed something that
1. Was faster than GCC
2. Offered better diagnostics
3. Could offer code completion and indexing for XCode.