Class SubsetDataFrame<Row, Column, Value>

  • Type Parameters:
    Row -
    Column -
    Value -
    All Implemented Interfaces:
    DataFrame<Row, Column, Value>, MutableDataFrame<Row, Column, Value>

    public class SubsetDataFrame<Row, Column, Value>
    extends ReMappedDataFrame<Row, Column, Value>
    ToDo: Could use less memory by compressing all the pointers (int array), especially since direct traversal is not needed and the pointers are accessed infrequently. They could be stored in a gzipped byte array or ByteBuffer, or compressed with one of the techniques from https://github.com/lemire/JavaFastPFOR
    ToDo: It would also be possible to count the unique values in each column in order to select an appropriate data structure, and to determine whether the values could be accessed through narrower pointers (byte, short) that consume less memory.
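    The two ToDo ideas above can be illustrated with a minimal sketch. It is not part of this class or library: PointerCompressionSketch and its methods compress, decompress and bytesPerPointer are hypothetical names, and the sketch assumes the pointers are non-negative row indices held in a plain int[]. It uses only java.util.zip and java.nio from the JDK (InputStream.readAllBytes requires Java 9 or later), not JavaFastPFOR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not part of the documented API.
public class PointerCompressionSketch {

    /** Packs the pointers as 4 big-endian bytes each, then gzips the result. */
    public static byte[] compress(int[] pointers) {
        ByteBuffer raw = ByteBuffer.allocate(pointers.length * Integer.BYTES);
        raw.asIntBuffer().put(pointers); // writes through to raw's backing array
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw.array());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed (and therefore finished) at this point.
        return out.toByteArray();
    }

    /** Reverses compress(): gunzips, then reads the bytes back as ints. */
    public static int[] decompress(byte[] packed) {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] raw = gzip.readAllBytes(); // Java 9+
            int[] pointers = new int[raw.length / Integer.BYTES];
            ByteBuffer.wrap(raw).asIntBuffer().get(pointers);
            return pointers;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Smallest pointer width, in bytes, able to address indices 0..maxIndex
     * when byte and short values are read as unsigned.
     */
    public static int bytesPerPointer(int maxIndex) {
        if (maxIndex < 0) {
            throw new IllegalArgumentException("maxIndex must be non-negative");
        }
        if (maxIndex <= 0xFF) return 1;   // fits an unsigned byte
        if (maxIndex <= 0xFFFF) return 2; // fits an unsigned short
        return 4;                         // needs a full int
    }

    public static void main(String[] args) {
        int[] pointers = new int[10_000];
        for (int i = 0; i < pointers.length; i++) {
            pointers[i] = i * 3; // a regular subset: every third row
        }
        byte[] packed = compress(pointers);
        System.out.println("raw bytes:    " + pointers.length * Integer.BYTES);
        System.out.println("packed bytes: " + packed.length);
        System.out.println("pointer width for max index "
            + pointers[pointers.length - 1] + ": "
            + bytesPerPointer(pointers[pointers.length - 1]) + " byte(s)");
    }
}
```

    Gzip trades memory for access time: reading a single pointer means decompressing the array (or a block of it), which suits the infrequent access pattern the ToDo describes but not hot loops. Narrower pointers keep constant-time access and are the cheaper first step when the underlying data frame has few rows.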
    • Constructor Detail

      • SubsetDataFrame

        public SubsetDataFrame(DataFrame<Row, Column, Value> dataFrame,
                               com.macrofocus.filter.MutableIndexFilter<Row> outputFilter)
      • SubsetDataFrame

        public SubsetDataFrame(DataFrame<Row, Column, Value> dataFrame,
                               com.macrofocus.filter.MutableIndexFilter<Row> outputFilter,
                               com.macrofocus.selection.MutableSelection<Row> selection)