Wednesday, June 21, 2017

Some observations on foldLeft with reference to short-circuiting

Have you ever written code such as the following and expected it to quit accessing values of y after it hits the first true value?

    val y = Seq[Boolean](false, true, ..., false)
    val x = y.foldLeft[Boolean](false)(_ || _)

After all, the second parameter of || is call-by-name so, if it's not needed to determine the outcome of the value for x, then that should mean that the recursion of the foldLeft should stop, right?

But it doesn't in practice, and if you look at the definition of foldLeft, you can see why. There is no short-circuit defined.

I therefore created a method in the FP module of my LaScala library to be the short-circuit equivalent of foldLeft. It's called foldLeftShort and it looks like this:

  /**
    * This method performs a foldLeft with short-circuit logic
    *
    * @param xi an iterator of X objects
    * @param y0 the initial value of Y (e.g. 0 if Y is Int)
    * @param q  the quick-return function: if this evaluates to true, given the new y value, that value is immediately returned
    * @param f  the function which accumulates the result: as with foldLeft, it takes a Y and an X and yields a Y
    * @tparam X the underlying type of the iterator
    * @tparam Y the return type
    * @return the aggregate of the given iterator, according to the initial value of y0 and the accumulator function f.
    */
  def foldLeftShort[X, Y](xi: Iterator[X], y0: Y, q: Y => Boolean)(f: (Y, X) => Y): Y = {
    def inner(_y: Y, xo: => Option[X]): Y =
      xo match {
        case None => _y
        case Some(x) =>
          val y = f(_y, x)
          if (q(y)) y
          else
            inner(y, nextOption(xi))
      }

    inner(y0, nextOption(xi))
  }

  /**
    * Method to get the next element from an iterator, as an Option.
    *
    * @param xi the iterator.
    * @tparam X the underlying type.
    * @return Some(x) where x is the next element; or None where the iterator is exhausted.
    */
  def nextOption[X](xi: Iterator[X]): Option[X] = if (xi.hasNext) Some(xi.next) else None

As with foldLeft, the accumulating function is in its own parameter set. But the initial value is joined by the iterator itself and the short-circuit function in the first parameter set. It would be nice to be able to extend the Iterator type to include this.

The inner function isn't defined in the typical way, using the concatenation unapply method (::) because, for some reason that I don't quite understand, you cannot decompose an Iterator this way. Instead, we make use of another FP method: nextOption which returns the next element of the iterator as an Option (see above).

Note the short-circuit mechanism, which really does work. As soon as we have a value by applying f to the current accumulator and the newest value from the iterator, we return if f evaluates to true.

In order to use this as efficiently, as possible, we need to write code something like the following where in this example, we want to save doing a trivial lookup too many times. That's not worth the effort--but just suppose that the lookup was actually an expensive database query instead:

    val lookupTable = Map("a"->false, "b"->true)
    val boi = ns.view.map(x => lookupTable.get(x)).toIterator
    FP.foldLeftShort[Option[Boolean], Boolean](boi, false, b => b)(_ || _.getOrElse(false))

Obviously, if you want to end up with an Option[Boolean], you would write something like this instead:


FP.foldLeftShort[Option[Boolean], Option[Boolean]](boi, Some(false), b => b.getOrElse(false))(optionalOr)

/**
    * NOTE: that we can't use map2 here because we are essentially implementing three-valued (Kleenean) logic
  *
  * @param b1 the accumulator: if b2 is None or Some(false) then b1 is returned unchanged
  * @param b2 the value: if b2 is Some(true) then Some(true) will be returned
  * @return an Optional[Boolean]
  */
def optionalOr(b1: Option[Boolean], b2: Option[Boolean]): Option[Boolean] =
    b2 match {
      case Some(x) => for (b <- b1) yield b || x
      case None => b1
    }

Note that, if you use map2(_ || _) instead of optionalOr, the logic is probably not going to give you exactly what you want, because None will combine with false to give None, whereas you really want None to have no effect on the accumulator value at all.

No comments:

Post a Comment