Scala Prof: December 2016

Thursday, December 29, 2016

Get considered harmful

In Scala, there are several monadic containers for which the get method can throw an exception. The two most obvious such containers are Option and Try. If you have an instance of Option[T] and you invoke get, you will either get a T or a java.util.NoSuchElementException will be thrown; if you have an instance of Try[T] and you invoke get, you will either get a T or the captured exception will be thrown. Future[T] doesn't have a get method, but it has value, which yields an Option[Try[T]]: two layers, each with its own potentially exception-throwing get method.

So, we learn that using get is bad: a code smell. Idiomatic Scala recommends using pattern matching or monadic operations like map, flatMap, etc. for working with values from such containers.
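As a minimal sketch of those alternatives (the object GetAlternatives and its method names are made up purely for illustration), here is how a value can be taken out of a Try or an Option without ever calling get:

```scala
import scala.util.{Failure, Success, Try}

object GetAlternatives {
  // Parse a string as an Int, capturing any exception inside a Try.
  def parse(s: String): Try[Int] = Try(s.trim.toInt)

  // Pattern matching: both outcomes are handled explicitly, nothing is thrown.
  def describe(s: String): String = parse(s) match {
    case Success(n) => s"parsed $n"
    case Failure(e) => s"failed: ${e.getClass.getSimpleName}"
  }

  // Monadic style: map transforms the value only if it is present,
  // and getOrElse supplies a fallback for the empty case.
  def doubled(xo: Option[Int]): Int = xo.map(_ * 2).getOrElse(0)

  // A for-comprehension (flatMap plus map) yields None if either input is None.
  def sum(xo: Option[Int], yo: Option[Int]): Option[Int] =
    for (x <- xo; y <- yo) yield x + y
}
```

In each case the "empty" or "failed" outcome has to be dealt with at the point of use, which is exactly what get lets you skip.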

So, if get is so bad, then why is Map[K,V] designed the way it is? Map[K,V] extends GenMapLike[K,V], which actually defines the following two methods:

abstract def apply(key: K): V
abstract def get(key: K): Option[V]

Thus, effectively, GenMapLike[K,V] extends Function1[K,V]. This class was originally written by Martin Odersky, although not until 2.9. I'm assuming that it was written this way to be compatible with earlier versions. In Scala 2.8, MapLike[K,V] extends Function1[K,V] via PartialFunction[K,V]. Again, Odersky was the author.

If the key is not present in the map, the so-called default method is invoked. Unless it has been overridden, the default method simply throws a java.util.NoSuchElementException (as you'd expect). In other words, apply is the dangerous method (default notwithstanding) and get is the safe method.
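A small sketch of that behavior, using the standard immutable Map (the object name MapLookup and the sample data are invented for the example):

```scala
import scala.util.Try

object MapLookup {
  val ages: Map[String, Int] = Map("alice" -> 30)

  // get is the safe method: a missing key yields None.
  val safeHit: Option[Int] = ages.get("alice")
  val safeMiss: Option[Int] = ages.get("bob")

  // apply is the dangerous method: a missing key invokes default, which throws.
  // Wrapping the call in Try lets us observe the exception without crashing.
  val unsafeMiss: Try[Int] = Try(ages("bob"))

  // withDefaultValue replaces the default mechanism, so apply no longer throws...
  val lenient: Map[String, Int] = ages.withDefaultValue(-1)
  val lenientMiss: Int = lenient("bob")

  // ...but get ignores the default and still reports the key as absent.
  val lenientGet: Option[Int] = lenient.get("bob")
}
```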

So, is it just me, or are these definitions backwards? This is how I feel that the methods in GenMapLike[K,V] should be defined:

abstract def get(key: K): V
abstract def apply(key: K): Option[V]

Thus, GenMapLike[K,V] would effectively extend Function1[K,Option[V]]. What would be wrong with that? It would be so much more consistent. By all means have this version of get invoke the default mechanism. But it would still essentially correspond to the dangerous method.
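For what it's worth, the existing library can already produce a Function1[K,Option[V]] from a map: because a Map is a PartialFunction, its lift method returns the total, Option-returning version. A brief sketch (the object name and data are illustrative only):

```scala
object LiftedLookup {
  val ages: Map[String, Int] = Map("alice" -> 30)

  // lift turns the partial function String => Int into a total
  // function String => Option[Int]; it behaves like get.
  val lookup: String => Option[Int] = ages.lift

  // The lifted function can be passed wherever a K => Option[V] is expected.
  val results: List[Option[Int]] = List("alice", "bob").map(lookup)
}
```

So the safe behavior is available as a function value; it just isn't what apply gives you.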

Obviously, nobody is going to change it now. And, they're just names, right? But it does seem a shame that it got this way in the first place.

Tuesday, December 13, 2016

Scala's pesky "contravariant position" problem

I'm sure this has happened to you. You design a data structure, for example a tree, and because you are thinking chiefly of using this tree to look things up (perhaps it's an AST), you decide to make the tree covariant in its underlying type:

trait Tree[+A] extends Node[A] {
   def children: Seq[Node[A]]
}

Now, it's time to start building this tree up from nodes and branches so you add a method like the following:

  def +:(a: A): Tree[A] = :+(a)

Aargh! It's the dreaded "covariant type A occurs in contravariant position in type A of value a" error. Does this mean that we have to rewrite our Tree trait so that it is invariant (as we would have to do if Tree were a mutable data structure like Array)? Not at all.

There is a simple and pretty much automatic way of refactoring the +: method. Use the fact that you can define a type parameter at the method level and constrain it to be a super-type of A. Thus:

def +:[B >: A](a: B): Tree[B] = :+(a)

Since B is now a super-type of A (or, of course, A itself), we can declare the parameter a as a B (we could rename it b now, of course) and return a Tree[B]. Problem solved.
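Here is a self-contained sketch of the lower-bound technique. It uses a deliberately simplified, list-backed Tree rather than the Tree/Node traits above, and the Animal, Dog and Cat types are hypothetical, there only to show the element type widening:

```scala
object VarianceDemo {
  sealed trait Animal
  final case class Dog(name: String) extends Animal
  final case class Cat(name: String) extends Animal

  // Simplified stand-in for a covariant tree: it just keeps its items in a List.
  sealed trait Tree[+A] {
    def items: List[A]

    // B >: A keeps the covariant A out of the parameter position;
    // the result is a tree of the (possibly wider) type B.
    def +:[B >: A](b: B): Tree[B] = Branch(b :: items)
  }

  case object Empty extends Tree[Nothing] {
    val items: List[Nothing] = Nil
  }

  final case class Branch[+A](items: List[A]) extends Tree[A]

  // Operators ending in a colon bind to the right-hand operand: Empty.+:(Dog("rex")).
  val dogs: Tree[Dog] = Dog("rex") +: Empty

  // Prepending a Cat to a Tree[Dog] widens the element type to Animal.
  val animals: Tree[Animal] = Cat("tom") +: dogs
}
```

Note that nothing is mutated: prepending a Cat does not corrupt the Tree[Dog], it produces a new Tree[Animal], which is why covariance stays sound.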

Note also that if the type of a needs to have one or more context bounds you can declare these too, just as you would expect:

  def +:[B >: A : TreeBuilder](b: B): Tree[B] = :+(b)
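A context bound is shorthand for an extra implicit parameter list. The sketch below shows the two equivalent spellings side by side, using the standard Ordering type class in place of TreeBuilder (whose definition isn't shown here) and a hypothetical Sorted container:

```scala
object ContextBoundDemo {
  final case class Sorted[+A](items: List[A]) {

    // Lower bound plus context bound: an implicit Ordering[B] must be in scope.
    def insert[B >: A : Ordering](b: B): Sorted[B] = {
      val ord = implicitly[Ordering[B]]
      // Keep everything strictly smaller than b in front of it.
      val (smaller, rest) = items.span(ord.lt(_, b))
      Sorted(smaller ::: b :: rest)
    }

    // The same signature with the context bound written out explicitly.
    def insertExplicit[B >: A](b: B)(implicit ord: Ordering[B]): Sorted[B] = {
      val (smaller, rest) = items.span(ord.lt(_, b))
      Sorted(smaller ::: b :: rest)
    }
  }

  val viaBound: Sorted[Int] = Sorted(List(1, 3)).insert(2)
  val viaImplicit: Sorted[Int] = Sorted(List(1, 3)).insertExplicit(2)
}
```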

I should perhaps point out that this problem is not unique to Scala. It will occur (or should occur) in any object-oriented language that allows generic typing (or anywhere the Liskov substitution principle applies). It's only because Java, for example, has no declaration-site variance (generic types there are invariant, with wildcards applied at the use site) that we don't find the problem there too.

Next time I will talk about having more than one context bound for a type.

Tuesday, December 6, 2016

Spying -- or a functional way to log in Scala

All of the logging utilities I've found for Scala suffer from the same problem: they are not written in a functional-programming style. Here's the problem: you want to log the result of a function which, to choose an example more or less at random, looks like this:

  def map2[T, U](to1: Option[T], to2: => Option[T])(f: (T, T) => U): Option[U] = for {t1 <- to1; t2 <- to2} yield f(t1, t2)

Now, we decide that we'd like to log the result of this method (or maybe we just want to print it to the console). To do this the "traditional" way, we would need to write something like this:

  import org.slf4j.LoggerFactory

  def map2[T, U](to1: Option[T], to2: => Option[T])(f: (T, T) => U): Option[U] = {
    val logger = LoggerFactory.getLogger(getClass)
    val r = for {t1 <- to1; t2 <- to2} yield f(t1, t2)
    logger.debug(s"map2: $r")
    r
  }

We have interrupted the flow of the expression, we've had to create a new variable (r), and we've had to wrap it all in braces. Not cool!

So, I looked around for something that could do this the functional way. Unfortunately, I couldn't find anything. There is something in Scalaz but it looked complicated and maybe a little too general. So I decided (as I often do) to write my own and call it Spy.

One of the design decisions that I had to make was this: I don't want the Spy to have to be instantiated for every user invocation, or even every class that an invocation appears in. Yet I want each invocation/class to be able to customize the spying behavior somewhat. That's where implicits come (again) to the rescue. But the spy-function (the one that actually does something with a String formed from the expression's value) needs to be found. The natural type of the spy-function is String=>Unit but it turns out that the implicits mechanism couldn't deal with something so ordinary so I changed it to String=>Spy where the Spy class is essentially just a wrapper that has no real significance. Then, the implicit value for the spy-function (if any) could be found and used.
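To illustrate the mechanism just described, here is a much-reduced sketch. It is not the actual code from the repo: the names SpySketch, Demo, record and seen, and the exact signature of spy, are assumptions made for the example. The essential idea is that the spy-function is an implicit value of type String => Spy, so each call site can supply its own.

```scala
import scala.collection.mutable.ListBuffer

object SpySketch {
  // Wrapper type with no significance of its own; it exists only so that the
  // implicit spy-function has a distinctive type (String => Spy).
  final class Spy

  object Spy {
    // Evaluates x exactly once, passes a message to the implicit spy-function
    // (unless disabled), and returns the value unchanged.
    def spy[X](message: => String, x: => X, enabled: Boolean = true)
              (implicit spyFunc: String => Spy): X = {
      val result = x
      if (enabled) spyFunc(s"$message: $result")
      result
    }
  }

  object Demo {
    // A customized spy-function: record messages in a buffer instead of logging.
    val seen: ListBuffer[String] = ListBuffer.empty[String]
    implicit val record: String => Spy = { s => seen += s; new Spy }

    val result: Int = Spy.spy("sum", 1 + 2)
    // Passing false as the third parameter switches this invocation off.
    val quiet: Int = Spy.spy("quiet", 10, enabled = false)
  }
}
```

Because spy returns its second argument untouched, it can be wrapped around any expression without introducing a temporary variable or braces.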

One example of a need to customize the implicit value is to forget about logging and simply write to the console. I tried to make this as easy as possible. See the specification in the repo (linked at the bottom) for an example of this.

Here, using the default slf4j logging mechanism, is the map2 function with logging. Note that, when you use the default mechanism, you must provide an implicit value of a logger in scope. A convenience method has been provided for this as shown below.

  implicit val logger = Spy.getLogger(getClass)
  def map2[T, U](to1: Option[T], to2: => Option[T])(f: (T, T) => U): Option[U] =
    Spy.spy(s"map2($to1,$to2)", for {t1 <- to1; t2 <- to2} yield f(t1, t2))

We test it using the following specification for map2 (unchanged):

  "map2(Option)" should "succeed" in {
    val one = Some(1)
    val two = Some(2)
    def sum(x: Int, y: Int) = x + y
    map2(one, two)(sum) should matchPattern { case Some(3) => }
    map2(one, None)(sum) should matchPattern { case None => }
  }

And the resulting entries in the log file are:

2016-12-06 22:35:04,123 DEBUG com.phasmid.laScala.fp.FP$ - spy: map2(Some(1),Some(2)): Some(3)
2016-12-06 22:35:04,128 DEBUG com.phasmid.laScala.fp.FP$ - spy: map2(Some(1),None): None

All we had to do, for the default logging behavior, was to ensure there was an implicit value of logger in scope and wrap the expression inside an invocation of Spy.spy. Everything is still purely functional. You can even leave the spy invocation in place if you really want to, and switch it off either explicitly, by adding false as a third parameter, or by turning off debugging in the logger.

If you're interested in using this you can find the Spy class, together with its SpySpec, in my LaScala project.