We built play.questdb.io to make it easy for anyone to try our database. No installation.
There's a Jupyter Lab notebook, data, sample code, queries and graphs.
We'd love to hear what you think.
We debated for hours whether or not to go the notebook route. I'm sure y'all did something similar; would you care to share your reasons for going with the notebook?
That one focuses on SQL queries though.
The notebook in https://play.questdb.io/ offers a more rounded experience to try out any and all of our features.
You can use the notebook to try out data ingestion, dropping partitions and more that is simply not possible in a more sandboxed environment.
The other part is that we value our Python users and wanted to provide an example of how to use our database in conjunction with other tools commonly used in the data science space to slice and dice time series data.
PS. We're also wire-compatible with PostgreSQL.
Eventually all local drives fill up though.
When ingesting data we partition data by time. By default we partition by day. This give you the flexibility to detach partitions, store them somewhere slower and cheaper with more capacity for longer term storage and reattach them later if need be.
Built on top of our open source primary product, we also have a cloud variant of QuestDB which runs on AWS. One of the things that we're building there is cold storage. It will automate this process onto S3 such that if a query ever needs to access this older data it will re-enstate it automatically for you with no admin overhead.
Note that I'm one of the co-founder of QuestDB, but let me try to be as objective and un-biased as possible. Under the hood, InfluxDB and QuestDB are built differently. Both storage engines are column-oriented. InfluxDB's storage engine uses a Time-Structured Merge Tree (TSM), while QuestDB uses a linear data structure (arrays). A linear data structure makes it easier to leverage modern hardware with native support for CPU's SIMD instructions [1]. Running close to the hardware is one of the key differentiators of QuestDB from an architectural standpoint.
Both have a Write-Ahead Log (WAL) that makes the data durable in case of an unexpected failure. Both use the InfluxDB Line Protocol to ingest data efficiently. Hats off to InfluxDB's team, we found the ILP implementation very neat. However, QuestDB's implementation of ILP is over TCP rather than HTTP for performance reasons. QuestDB is Postgres Wire compatible, meaning that you could also ingest via Postgres, although for market data it would not be the recommended way.
One characteristic of QuestDB is that data is always ordered by time on disk, and out-of-order data is dealt with before touching the disk [2]. The data is partitioned by time. For queries spanning time intervals, the relevant time partitions & columns are lifted to memory, while others are left untouched. This makes such queries (downsampling, interval search etc) particularly fast and efficient.
From a developer experience standpoint, one material difference is the language: InfluxDB has got its own native language, Flux [3], while QuestDB uses SQL, with a bunch of native SQL extensions to manipulate time-series data efficiently: SAMPLE BY, LATEST ON, etc [4]. QuestDB also includes SQL Joins and time-series join (ASOF Join) popular for market data. Since QuestDB speaks the postgresql protocol, developers can use their standard Postgres libraries to query from any language.
From a performance perspective, InfluxDB is known to struggle with ingestion and queries alongside high-cardinality datasets [5]. QuestDB deals with such high cardinality datasets better and is particularly good at ingesting data from concurrent sources, with a max throughput can now reach nearly 5M rows/sec on a single machine. Benchmarks on TSBS [6] with the latest version will follow soon.
InfluxDB is a platform, meaning that they provide an exhaustive offering around the database, while QuestDB is less mature. QuestDB is not yet fully compatible with several tools (say a dashboard like metabase for example), as some popular ones have been prioritised instead (Grafana, Kafka, Telegraf, Pandas dataframes). The charting capabilities of InfluxDB's console are excellent, while QuestDB users would mostly rely on Grafana instead.
[Adding this via post edit #1] One area Influx currently has edge is storage overhead. QuestDB does not support compression yet. Time-series data can often be compressed well [7]. Chances are QuestDB will use more disk space to store the same amount of data.
Hope this helps!
[1] https://news.ycombinator.com/item?id=22803504 [2] https://questdb.io/blog/2021/05/10/questdb-release-6-0-tsbs-... [3] https://docs.influxdata.com/influxdb/cloud/query-data/get-st... [4] https://questdb.io/blog/2022/11/23/sql-extensions-time-serie... [5] https://docs.influxdata.com/influxdb/cloud/write-data/best-p... [6] https://github.com/timescale/tsbs [7] https://www.vldb.org/pvldb/vol8/p1816-teller.pdf