Modin: Speed up your Pandas workflows by changing a single line of code
Unlike other parallel DataFrame systems, the modin.pandas DataFrame is extremely light-weight and robust. Modin transparently distributes the data and computation, so all you need to do is keep using the pandas API as you did before installing Modin. Because it is so light-weight, Modin provides speed-ups of up to 4x on a laptop with 4 physical cores.
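
A minimal sketch of the single-line change, assuming Modin is installed together with one of its execution engines. The fallback to plain pandas and the toy column names are assumptions added for illustration, not part of the Modin description:

```python
# The one-line change: import Modin's drop-in module instead of pandas.
# The fallback keeps this sketch runnable when Modin is not installed.
try:
    import modin.pandas as pd  # replaces: import pandas as pd
except ImportError:
    import pandas as pd

# Hypothetical toy data; everything below is ordinary pandas API.
df = pd.DataFrame({"group": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})

# Same call as in pandas; with Modin the work may be distributed across cores.
totals = df.groupby("group")["value"].sum()
total_a = int(totals["a"])
total_b = int(totals["b"])
```

On data this small the distribution overhead will likely outweigh any gain; the point is only that the rest of the script stays unchanged.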

We have focused heavily on bridging the gap between DataFrame solutions for small data (e.g. pandas) and for large data. Data scientists often need different tools to do the same thing on different sizes of data. The DataFrame solutions that exist for 1KB do not scale to 1TB+, and the overheads of the solutions for 1TB+ are too costly for datasets in the 1KB range. Because Modin is light-weight, robust, and scalable, you get a fast DataFrame for both small and large data. With preliminary cluster and out-of-core support, Modin is a DataFrame library with great single-node performance and high scalability in a cluster.
data  parallel  python  pandas  dataframes  modin  data-science 
18 days ago by jm