Stacking bits

daria-nepriakhina-xY55bL5mZAM-unsplash-reduced.jpg

Photo by Daria Nepriakhina 🇺🇦 on Unsplash

We are digressing a little from our previous path about trees to meet a data structure which has a deep connection with (binary) trees.

I will show a very compact (memory efficient) and fast implementation of a binary stack.

In this post I will also show how to use Rust types to implement failures and success as well as some Test Driven Design example.

For those who want to have a look to previous posts, you can check

The source code for this post is here

what is a stack

A stack is a data structure which allow you to retrieve the last datum you put into.

This is very similar to a stack of books on your desk: the most common operations you can do are:

  • to add a book on the top; also known as push
  • to look at the topmost book; also known as top
  • to remove the topmost book; also known as pop
  • to count how many books are in our stack; size

It is also known as Last In First Out queue, sometime written with the LIFO achronym.

stack_usage.drawio.png

defining a binary stack

Instead of creating a generic stack I’d like to focus on a very specific case: a stack which can only hold boolean values.

The signature of the api could be as follows

// warning: this code does not compile
struct Stack {}

fn push(stack : Stack, value: bool) {}

fn top(stack : Stack) -> bool {}

fn pop(stack : Stack) -> bool {}

fn size(stack : Stack) -> u32 {}

To start this project I chose to create a library:

cargo init --lib .

This creates a lib.rs file in the src directory with some example code

Rust allow to create a trait implementation for this structure; I will add the minimum code needed to compile it

pub struct Stack0 {}

impl Stack0 {

    // adding also a constructor
    pub fn new() -> Stack0 {
        Stack0 {}
    }

    pub fn push(self: & mut Stack0, value: bool) {
    }

    pub fn top(self: & Stack0) -> bool {
        true
    }

    pub fn pop(self: & mut Stack0) -> bool {
        true
    }

    pub fn size(self: & Stack0) -> u32 {
        1
    }
}

test first design

What laws should this data structure obey?

Let’s check a few:

  1. if I push a value, pop should return it
  2. if I push a value the size should increase by one
  3. also top should return the last thing I push -ed
  4. after pop the size should decrease
  5. a new stack should have size 0

Rust has a convenient toolkit to create and execute tests: let’s add a submodule in our library

// this macro is used by compiler to
// conditionally compile the next section
#[cfg(test)]
// this defines a module within current one
mod tests {
    // this statement imports everithing defined
    // in the upper module
    use super::*;

    // this macro identifies a test
    #[test]
    fn empty_when_created() {
        let result = Stack0::new().size();
        assert_eq!(result, 0);
    }


}

by executing

cargo test

we will compile the library with the tests and execute them, a report is printed which may look like this:

running 1 test
test tests::empty_when_created ... FAILED

failures:

---- tests::empty_when_created stdout ----
thread 'tests::empty_when_created' panicked at src/lib.rs:53:9:
assertion `left == right` failed
  left: 1
 right: 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tests::empty_when_created

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

we can add more tests in the same module: e.g.

    #[test]
    fn size_increase_when_push() {
        let mut stack = Stack0::new();
        stack.push(false);
        let result1 = stack.size();
        stack.push(true);
        let result2 = stack.size();
        assert_eq!(result2, result1 + 1);
    }

managing unwanted status

Now I have couple of questions

  1. What happens when I try to get a book from an empty stack?
  2. Can a stack grow forever? Or can I decide it is too tall to grow?

Rust uses the Result enumeration to express this concept; a Result can be either be Ok(some_value) or Err(error).

Being Result<R, E> a generic type, I need to specify both the result type R and the error type E; the result type will be of course a bool while for this example I will choose a simple String for the error.

The stack API can be now modified to tackle our questions:

pub struct Stack1 {}

impl Stack1 {
    pub fn new() -> Stack1 {
        Stack1 {}
    }

    pub fn push(self: & mut Stack1, value: bool) -> Result<bool, String>{
        Ok(value)
    }

    pub fn top(self: & Stack1) -> Result<bool, String> {
        Ok(true)
    }

    pub fn pop(self: & mut Stack1) -> Result<bool, String> {
        Ok(true)
    }

    pub fn size(self: & Stack1) -> u32 {
        1
    }
}

and so our tests must be adapted; we need to add two rules and their type signature.

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn push_and_pop() {
        let mut stack = Stack1::new();
        stack.push(false);
        stack.push(true);
        let result1 = stack.pop();
        let result2 = stack.pop();
        assert_eq!(result1, Ok(true));
        assert_eq!(result2, Ok(false));
    }

    #[test]
    fn size_increase_when_push() {
        let mut stack = Stack1::new();
        stack.push(false);
        let result1 = stack.size();
        stack.push(true);
        let result2 = stack.size();
        assert_eq!(result2, result1 + 1);
    }

    #[test]
    fn push_and_top() {
        let mut stack = Stack1::new();
        stack.push(false);
        let result1 = stack.top();
        stack.push(true);
        let result2 = stack.top();
        assert_eq!(result1, Ok(false));
        assert_eq!(result2, Ok(true));
    }

    #[test]
    fn size_decreases_when_pop() {
        let mut stack = Stack1::new();
        stack.push(false);
        stack.push(true);
        let result1 = stack.size();
        stack.pop();
        let result2 = stack.size();
        assert_eq!(result1, result2 - 1);
    }

    #[test]
    fn empty_when_created() {
        let result = Stack1::new().size();
        assert_eq!(result, 0);
    }

    #[test]
    fn empty_does_not_pop() {
        let mut stack = Stack1::new();
        let result = stack.pop();
        assert_eq!(result, Err("Empty stack".into()));
    }

}

implementing the stack

The idea here is to use an integer value to host a sequence of bits, each representing a value pushed in the stack.

We can emulate the push behavior by executing a left shift of one and adding 1 only if the input value is true

The following example is made with a 4 bit value just to make it easier to read

e.g.

  1. stack status is is 0b0001
  2. stack receive a PUSH true
    1. left shift the status; now is 0b0010
    2. add one to the status; now is 0b0011
  3. stack receive a PUSH false
    1. left shift the status; now is 0b0110

the right shift can be used as pop e.g

  1. stack status is 0b0110
  2. stack receive a POP
    1. evaluate status & 0b0001 and store the result
    2. right shift the status; now is 0b0011
    3. return the stored result (is now false)
  3. stack receive a POP
    1. evaluate status & 0b0001 and store the result
    2. right shift the status; now is 0b0001
    3. return the stored result (is now true)

as 0b0000 >> 1 == 0b000 (shift right 0 returns 0) we need a way to know when our stack is empty; a possible way is to define the empty status equal to 0b0001

I will use an unsigned integer of type usize in this example. The number of bytes of this integer depends on the cpu architecture, it may be 64 or 32 in most cases.

How can we calculate the size of the stack? we use one bit to mark the start point, so if we count all bits after it we have the result.

Rust has two interesting features for this kind of calculation; we know the number of bits of usize from the constant usize::BITS and the function usize::leading_zeros returns the count of the bits before our marker.

here is the full implementation:

pub struct Stack1 {
    stack: usize
}

impl Stack1 {
    pub fn new() -> Stack1 {
        Stack1 { stack: 1 }
    }

    pub fn push(& mut self, value: bool) -> Result<bool, String>{
        self.stack = self.stack << 1;
        if value{
            self.stack += 1;
        }
        Ok(value)
    }

    pub fn top(& self) -> Result<bool, String> {
        if self.stack == 1 {
            return Err("Empty stack".into())
        }
        Ok((self.stack & 1) == 1)
    }

    pub fn pop(& mut self) -> Result<bool, String> {
        if self.stack == 1 {
            return Err("Empty stack".into())
        }
        let result = (self.stack & 1) == 1;
        self.stack = self.stack >> 1;
        Ok(result)
    }

    pub fn size(& self) -> u32 {
        usize::BITS - usize::leading_zeros(self.stack) - 1
    }
}

a last remark on error types and borrow checker rules

It may sound strange to have a very compact and fast implementation like this one with an error type like String; we are actually using constant strings so a type like &str should be enough right?

Well, no.

If we use a signature like

fn pop(& mut self) -> Result<bool, & str>;

we are actually using this lifetime signature

fn pop<'a>(& mut 'a self) -> Result<bool, & 'a str>;

This means that we cannot execute something like

let result1 = stack.pop(true);
let result2 = stack.pop(true);

because the first call may return an Err(& 'a str) which is like a borrow of a mutable pointer that cannot be done twice.

So, can we have a more efficient definition of the error type? Sure we can use an enumeration.

But this is for another post.

marco.p.v.vezzoli

Self taught assembler programming at 11 on my C64 (1983). Never stopped since then -- always looking up for curious things in the software development, data science and AI. Linux and FOSS user since 1994. MSc in physics in 1996. Working in large semiconductor companies since 1997 (STM, Micron) developing analytics and full stack web infrastructures, microservices, ML solutions

You may also like...