Creating a 'paged' Vec in Rust

#rust #showdev

In this post I'll show you some code I wrote for paginating over a Vec collection in Rust. I needed this for a CLI tool I wrote which was meant to display all the vector entries retrieved from a remote server. In most cases, I expected to receive a lot of results, so to display them in a terminal efficiently, I couldn't reasonably render them all. I decided I would page the results. However, I realised I also wanted the following:

I wanted to track a users selection of an element in the source vector, and,
I wanted to display a users selection on the page for display purposes (e.g. for highlighting their cursor selection).

Intuitively, these indexes would differ. I realised I wanted a data structure to answer the following question:

"What elements are on the same page as element n and what is its index there?".

My approach was to retain the full vector, but to only display one 'page' at a time. That 'page' would be the one where the current index and the users cursor would be sitting.

PagedVec takes a Vec<A> and internally creates a Vec<Vec<&A>> which references all the A's from the source vector. With the impl method page_for, you can optionally get:

a) the 'page' for the source vectors index, and
b) the modulus translated index for that 'page'.

For example:

let some_vec = vec![1, 2, 3, 4, 5, 6, 7, 8];
let paged = PagedVec::from(some_vec, 3);

let page_1 = paged.page_for(0); // Some(0, vec![1, 2, 3])
let page_1 = paged.page_for(1); // Some(1, vec![1, 2, 3])
let page_1 = paged.page_for(2); // Some(2, vec![1, 2, 3])

let page_2 = paged.page_for(3); // Some(0, vec![4, 5, 6])

let page_3 = paged.page_for(7); // Some(1, vec![7, 8])

My original implementation code is below👇. The real magic comes from the standard libs chunks method; it basically does the page partitioning.

pub struct PagedVec<'a, A: 'a> {
    page_length: usize,
    pages: Vec<Vec<&'a A>>,
}

impl<'a, A> PagedVec<'a, A> {
    pub fn from(vec: &'a Vec<A>, page_length: usize) -> PagedVec<'a, A> {
        PagedVec {
            page_length,
            pages: vec.chunks(page_length).map(|slice| {
                slice.iter().collect::<Vec<&'a A>>()
            }).collect::<Vec<Vec<&'a A>>>(),
        }
    }

    pub fn page_for(&'a self, index: usize) -> Option<(usize, &'a Vec<&'a A>)> {
        self.pages.get((index as f32 / self.page_length as f32).floor() as usize)
        .map(|page| {
            (
                index % self.page_length,
                page
            )
        })
    }
}

My favourite part of this implementation: using Rust's referencing and borrowing capabilities, I was able to do this with little additional heap allocation. Quite simply, all the 'pages' are basically pointers to the data that exist in the original vector. If the elements of the vector were huge, the 'pages' wouldn't duplicate the volume of data resident in memory. Kind of like a view in a relational database.

Soon after using this, I realised I could reduce my memory usage even more by not retaining pages I wasn't actually displaying. The decision to use the implementation below was a design choice based on factors particular to my CLI tool.

pub struct PagedVec<'a, A: 'a> {
    page_length: usize,
    vec: &'a Vec<A>,
}

impl<'a, A> PagedVec<'a, A> {
    pub fn from(vec: &'a Vec<A>, page_length: usize) -> PagedVec<'a, A> {
        PagedVec { page_length, vec }
    }

    pub fn page(&'a self, index: usize) -> Option<(usize, Vec<&'a A>)> {
        let mut paged = self.vec.chunks(self.page_length);
        let opt_page = paged.nth((index as f32 / self.page_length as f32)
          .floor() as usize);

        opt_page.map(|page| (index % self.page_length, page.iter().collect::<Vec<&'a A>>()))
    }
}

In addition to chunks, the magic trick here is the use of nth; rather than creating all the pages from the source vector, nth—in conjunction with the lazy evaluation of Rust iterators—allows me to only create the pages up to the selected index. No more wasting time or memory building pages I don't need! 😎

There could be cases where it would be better to use the original implementation to keep all the pages in memory. Which one is better? It depends on your use case 🤷‍♀️.

Update 3/6/2019
After sharing my implementation, some of the Rust devs in the community were able to point out some improvements in my code.

Reddit user 2brainz was able to share a generic and completely allocation free implementation of PagedVec, below:

use std::{cmp::min, marker::PhantomData};

pub struct Paged<'a, T, V> {
    vec: &'a V,
    page_length: usize,
    phantom: PhantomData<&'a T>,
}

impl<'a, T, V> Paged<'a, T, V>
where
    V: AsRef<[T]>,
{
    pub fn new(vec: &'a V, page_length: usize) -> Paged<'a, T, V> {
        Paged {
            vec,
            page_length,
            phantom: PhantomData,
        }
    }

    pub fn page(&self, index: usize) -> Option<(usize, &'a [T])> {
        let slice = self.vec.as_ref();
        let len = slice.len();

        if index < len {
            let page_index = index % self.page_length;
            let start = index - page_index;
            let end = min(len, start + self.page_length);

            slice.get(start..end).map(|s| (page_index, s))
        } else {
            None
        }
    }
}

The savings here are based on simply working with the slice that backs the provided Vec.

Also, I am ashamed to say I did forget some of the fallacies of using floating types in any programming language. A number of Rust devs reminded me that a) with numeric division, flooring is normally implicit, and b) floats are never precise. Cheers for pointing that out Shadow0133
and 2brainz.