This looks interesting. Handling large datasets is becoming more common, and I’m always on the lookout for useful tools, because pandas requires your dataframe to fit in memory.
…. my rule of thumb for pandas is that you should have 5 to 10 times as much RAM as the size of your dataset. So if you have a 10 GB dataset, you should really have about 64, preferably 128 GB of RAM if you want to avoid memory management problems.
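As a rough sketch of the arithmetic behind that rule of thumb: 5 to 10 times a 10 GB dataset is 50 to 100 GB, and rounding each figure up to the next power of two gives 64 and 128 GB. The rounding step is only my assumption about where those two numbers come from, and the function names below are made up for illustration.

```python
def recommended_ram_gb(dataset_gb, low=5, high=10):
    """Return the (low, high) RAM range in GB implied by the 5x-10x rule of thumb."""
    return dataset_gb * low, dataset_gb * high


def next_power_of_two(n):
    """Smallest power of two >= n.

    Assumption: RAM is usually bought in power-of-two sizes, which would
    explain the 64 GB and 128 GB figures in the quote.
    """
    size = 1
    while size < n:
        size *= 2
    return size


low_gb, high_gb = recommended_ram_gb(10)
print(low_gb, high_gb)  # 50 100
print(next_power_of_two(low_gb), next_power_of_two(high_gb))  # 64 128
```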
With Ibis, a portable dataframe library, you can explore vast datasets (>1 billion rows).
There is an introduction here: https://ibis-project.org/posts/1tbc/ and it is all available on GitHub: https://github.com/ibis-project/ibis