Hi John,
>Wow. I knew somebody would think of that one day. ;-)
Yep, here we are :-)
>... assuming it's possible to utilize indexes and coherently analyze big chunks of data in multiple threads.
The very fabric of those engines, columnar storage and an OLAP-centric design, clearly helps with multithreading. On top of that, the dev team comes from a university teaching background (remember the Fox Software founder...) and they hide nothing about how they achieve performance! Take a very traditional problem, sorting for instance, and how multithreading improves it:
https://duckdb.org/2021/08/27/external-sorting.html
If you want to understand Morsel-Driven Parallelism, the paper is online, and the C++ implementation is right there on GitHub.
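For anyone who wants the general idea before reading the paper, here is a rough Python sketch of how I picture it: the input is cut into small fixed-size "morsels", worker threads pull morsels from a shared queue as they become free, each morsel is sorted into a run, and the runs are merged at the end. This is only my illustration, not DuckDB's code. The names `parallel_sort` and `MORSEL_SIZE` are made up, the morsel size is unrealistically small, and in CPython the GIL means these threads will not actually speed up a pure-Python sort, so take it as the shape of the scheduling and nothing more.

```python
import heapq
import queue
import threading

MORSEL_SIZE = 4  # hypothetical value; real engines use far larger morsels


def parallel_sort(rows, workers=4, morsel_size=MORSEL_SIZE):
    """Sort rows by letting worker threads pull morsels from a shared queue."""
    # Cut the input into fixed-size morsels and queue them up.
    morsels = queue.Queue()
    for start in range(0, len(rows), morsel_size):
        morsels.put(rows[start:start + morsel_size])

    runs = []                     # sorted runs produced by the workers
    runs_lock = threading.Lock()  # protects the shared list of runs

    def worker():
        # Each worker keeps grabbing the next available morsel, so a fast
        # thread naturally ends up processing more morsels than a slow one.
        while True:
            try:
                morsel = morsels.get_nowait()
            except queue.Empty:
                return            # no work left, the thread exits
            run = sorted(morsel)
            with runs_lock:
                runs.append(run)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # k-way merge of the sorted runs into one sorted result.
    return list(heapq.merge(*runs))


if __name__ == "__main__":
    print(parallel_sort([9, 3, 7, 1, 8, 2, 6, 4, 5, 0]))
```

The point of pulling from a queue, rather than handing each thread one big fixed slice up front, is that no thread sits idle while another is still grinding through an oversized chunk.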
In my tests there were always cases where performance did not improve, or even got worse. That happens when the data is limited in size: below a couple of thousand records, the old guard (SQLite, B-trees, VFP Rushmore) and others still tend to perform better!