Sunday, March 13, 2016

Composing with Try

Because scala.util.Try is a monad, we can compose tries together conjunctively. In the excellent Neophyte's Guide to Scala, Daniel Westheide shows some examples of this. Paraphrasing his for-comprehension exemplar slightly we have:

  for {
      url <- Try(new URL(s))
      connection <- Try(url.openConnection())
      is <- Try(connection.getInputStream)
    } yield Source.fromInputStream(is).getLines()

Each try properly depends on the success of its predecessor. That's to say, this is a sequential conjunction.

But suppose we want to compose tries disjunctively? In other words, success is not defined (as above) when all components succeed but when one succeeds. And again, we want our composition to be sequential (the first try is tried first). How should we go about that?

The obvious solution is to use orElse:

  val x1 = "10.0" 
  val x: Try[Any] = Try(x1.toInt) orElse Try(x1.toDouble)

This works just as expected: the value of x is Success(10.0).

Indeed, orElse has the signature:

deforElse[U >: T](default: ⇒ Try[U])Try[U]


Ideally, we'd like another signature something like this:

  def orElseTry[U >: T](alt: => U): Try[U] = orElse(Try(f))

This compares with orElse as does map to flatMap or recover to recoverWith. And we might invoke it thus:

  def toNumber(x: String) = Try(x.toInt) orElseTry(x.toDouble)

This looks pleasantly elegant and simple--and it differs only from the normal way of doing things by one space. Indeed, I was able to implement it quite easily in a copy of scala.util.Try.scala. However, in their infinite wisdom, the fathers of Scala did not provide such a method (actually, Try derives from the folks at Twitter rather than EPFL).

Unlike with the conjunctive set of Try operations that we started with, the disjunctive set is much more likely to be based on a single input, as in the case of the toNumber example above. I therefore set about creating a class to support multiple successive Try operations:

  case class Trial[V, T](f: V=>Try[T]) extends TrialBase[V,T](f)
  abstract class TrialBase[V, T](f: V=>Try[T]) extends (V=>Try[T]) {
    def apply(v: V): Try[T] = f(v)
    private def orElse(f: V=>Try[T], g: V=>Try[T]): V=>Try[T] = {v => f(v) orElse g(v) }
    def |: (g: V => Try[T]): Trial[V,T] = Trial(orElse(g,f))
    def :| (g: V => Try[T]): Trial[V,T] = Trial(orElse(f,g))
    def ^: (g: V => T): Trial[V,T] = Trial(orElse(Lift(g),f))
    def :^ (g: V => T): Trial[V,T] = Trial(orElse(f,Lift(g)))
  }

This case class called Trial takes a function f of type V=>Try[T] such that its instances are composable with other functions of type V=>Try[T] (using |: or :|) or even V=>T functions (using ^: or :^). These latter functions are "lifted" to V=>Try[T]. Such Trial objects can be composed with other functions both leftwards and rightwards using the |: and ^: operators with the colon on the proper side. And, of course we can even use partial functions to allow us to fork the path of control. Let me provide an example taken from a generic CSV parser that I created:

  val defaultParser = Trial.none[String,Any] :|
    {case s @ (date0(_) | date4(_) | date1(_)) => dateParser(s)} :^
    {case quoted(w) => w} :^
    {case whole(s) => s.toInt} :^
    {case truth(w) => true} :^
    {case untruth(w) => false} :^
    {case s: String => s match { case floating(_) | floating(_,_) | floating(_,_,_) => s.toDouble}} :^
    {case s: String => s}
  val dateFormatStrings = Seq("yyyy-MM-dd","yyyy-MM-dd-hh:mm:ss.s"// etc.
  val dateParser = Trial[String,Any]((parseDate _)(dateFormatStrings))
  val date0 = """(\d\d\d\d-\d\d-\d\d)""".r
  val date1 = ??? // omitting for brevity
  def parseDate(dfs: Seq[String])(s: String): Try[DateTime] = ??? // omitting for brevity
  val quoted = """"([^"]*)"""".r
  val whole = """(\d+)""".r
  val floating = """-?(\d+(\.\d*)?|\d*\.\d+)([eE][+-]?\d+)?[fFdD]?""".r
  val truth = """(?i)^([ty]|true|yes)$""".r
  val untruth = """(?i)^([fn]|false|no)$""".r



In this fragment, we define a default element parser for our CsvParser object. Trial.none simply starts the chain of functions (we could also put the date parsing function as the argument to Trial). If the input String is recognized as a valid date format (in reality there are many more possible date formats), then we invoke dateParser with the String as its parameter. Otherwise, we try to interpret the String as a quoted string, an integer, a boolean, etc.

If you're interested in looking at the code you can see it here: Trial.scala, while the specification is here: TrialSpec.scala.

Thursday, March 3, 2016

Scala job interview questions

One of my students from last semester asked me what questions about Scala might arise in an upcoming job interview. Of course, there are such lists available already for example the following:
Naturally, I couldn't resist the idea of writing up some of my own questions/answers:

Question: Name two significant differences between a trait and an abstract class.
Answer: Abstract classes have constructors with zero or more parameters while traits do not; a class can extend any number of traits but only one abstract class; see also To trait or not to trait? for more differences.

Question: What type of construct follows the match keyword?
Answer: A partially-defined function.

Question: What exactly is wrong with a recursive function that is not tail-recursive?
Answer: You run the risk of running out of stack space and thus throwing an exception.

Question: Explain the concept and usage of an extractor.
Answer: An extractor is essentially the opposite of a constructor and is defined by the unapply(t: T) method defined in the companion object of a type T; for a case class, which provides a built-in unapply method, it returns a Option[TupleN] based on the number N of parameters in the constructor (exception: if N=0, then the returned type is Boolean; if N=1, the returned type is Option[T]); extractors are typically used following the case keyword in match expressions. For more information, see The Neophyte's Guide to Scala Part 1: Extractors.

Question: What relatively minor difference in the definitions of their respective non-empty-constructors results in the major distinction between a List and a Stream?
Answer: The tail parameter is defined as call-by-name.