Monday, December 28, 2015

MapReduce with Akka actors

I was creating a lecture on actors, in particular Akka, for the Scala class that I teach. I wanted to do something simple and general. I showed the students an application I had previously written which used actors to manage a hedge fund. But I wanted something even more generic so I thought about implementing map/reduce. After all, actors (especially remote actors) are the preferred means of implementing concurrent processing in Scala, especially where the code to be run in parallel involves representing mutable state.

So, I did the obvious -- I looked on the internet for some code that someone had already written. I didn't want to make a big deal out of it, or spend a lot of time on it. Unfortunately, I couldn't find anything. Well, I couldn't find anything good. There were several half-hearted attempts at doing it but nothing that I would want to use. So, I set out to do it myself. The good news was that I got something (a word count application) working in about an hour. But after sharing it with the students, I realized that it, too, wasn't nearly as good as it could be.

So, I have refined it to a point (after many hours of work) that I am now proud to share it. What, to me, is particularly elegant is the functional API that I created to set up map-reduce stages that can be chained (or piped, rather) together to form a (potentially) multi-stage parallel process. Using this interface actually hides all details of actors (except for creating the actor system and a couple of implicit values) from the application programmer.

If you're interested in taking a look, contributing, or whatever, please refer to Majabigwaduce on Github. The README will explain, hopefully, everything you need to know about using the framework. And, of course, the source code is all there too. As I searched for an appropriate name for something that uses Akka to implement Map/Reduce, I couldn't stop the name Majabigwaduce from popping into my mind. See the README for an explanation.

I have used my own naming convention everywhere in the code. If you're wondering about it, please see my previous blog.

Tuesday, December 8, 2015

Naming of identifiers

One of the conventions recommended by the designers of Scala is to use short identifier names. It's fine to make method names long (especially when they are private methods whose meaning may not be entirely clear otherwise), but for identifiers, context usually gives a pretty good clue to their meaning.

You've probably noticed code like this (taken from the Artima web site: Working with Lists):

def isort(xs: List[Int]): List[Int] =
    if (xs.isEmpty) Nil
    else insert(xs.head, isort(xs.tail))

where xs is used for a sequence of things and pronounced x's (the plural of x).

Here, I want to present a recommendation for generalizing this notion. In the same way that a sequence (Seq) of x is written xs, a Future of x should be written xf. A Try of x might be written xt but I'm going to recommend something slightly different (xy) because t is useful for Tuple. So, the general scheme is:

val ac: C[A]

where C is some container type and A is some element type. Obviously, these can be composed such that if A happens to be a sequence of X (for example) then we end up with:

val xsc: C[Seq[X]]

Note that the order of the container/type definitions is mirrored in the name of the identifier. The advantage of all this is that when a reader (or even yourself) comes to read that code later, it is clear what the type of the identifier is, even if it isn't shown explicitly. If you're not reading the code through an IDE, then having an indication of type will be quite helpful. It also makes debugging a lot easier.

In a for-comprehension, patterns/generators look very simple: for (x <- xs) yield x, for example.

Here are my recommendations for the abbreviations (but of course you should choose your own):


Container                     Abbreviation
Seq/List, etc.                s
Array                         a
Map                           m
Future                        f
Try                           y
Promise                       p
Tuple                         t
Option                        o
Iterator/Iterable             i
Either                        h
RDD                           r

Type                          Abbreviation
My own type                   x
Generic type A, B, T, etc.    a, b, t, etc.
String (line/document)        g
String (word)                 s or w
URL                           u
Double                        d
Int                           n
Function                      f
ActorRef                      a
Exception                     e

Sometimes you might have more than one identifier of the same type, in which case you can follow the single letter with 1, 2, etc. Alternatively, you might use a single uppercase character to add context, for example gD for a document string and gL for a line string. You might also decide to use t2 for a Tuple2, etc. Where we have an Either of the form Either[String,X] or Either[Throwable,X], we simply use the identifier xe, ignoring the left-side types.

Some confusion might arise between a as an Array (i.e. a container) and a as a generic type A (or an ActorRef). This should be clear from position: type abbreviations are found at the start of the identifier string, container abbreviations after it. The same applies to f as a Future versus f as a Function, and to s if you use it both for a String and a Seq.

Here are some examples of code which uses this idea:

  def flatten[X](xsfs: Seq[Future[Seq[X]]])(implicit c: ExecutionContext): Future[Seq[X]] = Future.sequence(xsfs) map {_ flatten}

  def sequence[X](xy: Try[X]): Either[Throwable,X] =
    xy match {
    case Success(s) => Right(s)
    case Failure(e) => Left(e)
  }

  def sequence[X](xos : Seq[Option[X]]): Option[Seq[X]] = (Option(Seq[X]()) /: xos) {
    (xso, xo) => for (xs <- xso; x <- xo ) yield xs :+ x
  }

  val ws = List("http://www.htmldog.com/examples/","http://www.google.com")

  val uys = for ( w <- ws ) yield Try(new URL(w))

Friday, November 20, 2015

Avoiding vars in Scala

One of the difficulties that everyone new to functional programming has is avoiding vars and mutable types. These aren't the same thing of course but they're similar enough for purposes of this discussion.

Of course, the first thing to understand is why such mutable objects are considered harmful--why would we want to avoid them? Without getting into an essay on referential transparency (RT), we should aim to preserve RT in our code for the following reasons: referentially transparent programs are reliably testable and potentially provable.

So, let's say we have some code like the following and we want to avoid using vars and make it more functional (so that we can write unit tests on it):

  def printStatistics(xs: Seq[Double]): Unit = {
    var c = 0
    var s = 0.0
    var v = 0.0
    for (x <- xs) {
      if (x >= 0 && x <= 1) {
        c += 1
        s += x
        v += x*x
      }
    }
    val mu = s/c
    println(s"Mean: $mu, Std. Dev: ${math.sqrt(v/c - mu*mu)}")
  }

I chose this particular bit of code because it involves two of the features that most seem to get in the way of functionalizing the code, viz. calculating more than one quantity and having complex logic. Here, we want to calculate both the mean and the standard deviation of a sequence of values. However, we also want to exclude, for some reason, values that are outliers.

We can deal with the multiplicity of quantities by turning to one of the functional programmer's best friends: the tuple. Let's group our three quantities together in a 3-tuple: (c,s,v) for the purposes of evaluation and then we will return a 2-tuple (mu, sigma) as the result of the method.  Secondly, the logic which was complicating the calculations can in this case be easily refactored: we filter the values of xs.

Finally, there is one aspect of rolling up vars that is definitely tricky: because iteration uses vars we need to map the iteration-with-vars code into recursive code. Again, without getting into a formal proof, we can say that recursion (with immutable objects) is the dual of iteration (with mutable objects).
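To see what that mapping looks like in the raw, here is the loop above rewritten as an explicit tail recursion. This is just a sketch (the accumulate name and the loop helper are mine): the three accumulators travel as parameters instead of vars.

```scala
import scala.annotation.tailrec

// The var-based loop from printStatistics, recast as a tail recursion.
def accumulate(xs: Seq[Double]): (Int, Double, Double) = {
  @tailrec
  def loop(rest: Seq[Double], c: Int, s: Double, v: Double): (Int, Double, Double) =
    rest match {
      case Seq() => (c, s, v)
      case x +: t =>
        if (x >= 0 && x <= 1) loop(t, c + 1, s + x, v + x * x)
        else loop(t, c, s, v)                 // skip outliers, accumulators unchanged
    }
  loop(xs, 0, 0.0, 0.0)
}
```

The higher-order functions discussed next capture exactly this recursive pattern, so we don't have to write it out by hand each time.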

How are we going to turn this particular code fragment into something recursive? What's the appropriate pattern? Well, here another feature of functional programming comes to the rescue: higher-order functions. In particular, there is a family of higher-order functions which provide the necessary recursive pattern: Foldables. A Foldable is an extension of Functor (which defines the map method) and looks something like this:

  trait Foldable[F[_]] extends Functor[F] {
    def foldLeft[A,B](fa: F[A])(z: B)(f: (B, A) => B): B
    def foldRight[A,B](fa: F[A])(z: B)(f: (A, B) => B): B
  }

Where F[_] represents a collection type such as Seq[A] or List[A], and fa is the collection being folded. z is the initial ("zero") value of our accumulator; f combines the current accumulator (first parameter) with the current element. After all elements of our collection have been visited, we return the accumulator as the result. If A happens to be a monoid (a type that has its own zero and combine methods), then we can define a reduce method:

      def reduce[A](fa: F[A])(implicit m: Monoid[A]): A
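Parenthetically, the foldLeft part of this can be made concrete with a small self-contained sketch for List. The trait shapes follow the discussion above, but the names (ListFoldable in particular) are my own, not from any published library:

```scala
// Hypothetical typeclass definitions, mirroring the shape discussed above.
trait Functor[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

trait Foldable[F[_]] extends Functor[F] {
  def foldLeft[A, B](fa: F[A])(z: B)(f: (B, A) => B): B
}

// An instance for List, delegating to the standard library.
object ListFoldable extends Foldable[List] {
  def map[A, B](fa: List[A])(f: A => B): List[B] = fa map f
  def foldLeft[A, B](fa: List[A])(z: B)(f: (B, A) => B): B = fa.foldLeft(z)(f)
}

val total = ListFoldable.foldLeft(List(1, 2, 3))(0)(_ + _)
```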

But foldLeft and foldRight don't actually require monoids because we provide the zero method and the combine method explicitly. This is what enables us, in the present instance, to keep two separate sums going, one for the mean and one for the variance. This is what the final code looks like:

  def getStatistics(xs: Seq[Double]): (Double, Double) = {
      val y = xs filter { x => x >= 0 && x <= 1 }
      val r = y.foldLeft[(Int,Double,Double)]((0,0,0)){case ((c,s,v),x) => (c+1,s+x,v+x*x)}
      val mu = r._2/r._1
      (mu, math.sqrt(r._3/r._1 - mu*mu))
  }

Here, we are explicitly counting the elements of y (although y.size would do just as well). First we filter the input to remove outliers and then we perform our calculations using foldLeft. The initial value is easy, of course: (0,0,0). The combiner function is also quite straightforward: strictly speaking it is a partial function matching the case where the inputs are presented as (c,s,v) [the accumulated values] and x [the current element from the sequence]. The returned value is simply: (c+1,s+x,v+x*x).

Finally, we return a tuple of the mean and standard deviation. We can of course print these values as was done before in the printStatistics method. But we can also test these values in unit tests to ensure that we have our logic right.
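To that end, here is a self-contained copy of the method together with a couple of assertions of the kind you might write (note that I subtract the squared mean inside the square root, which the strict definition of standard deviation requires):

```scala
def getStatistics(xs: Seq[Double]): (Double, Double) = {
  val y = xs filter { x => x >= 0 && x <= 1 }
  val r = y.foldLeft((0, 0.0, 0.0)) { case ((c, s, v), x) => (c + 1, s + x, v + x * x) }
  val mu = r._2 / r._1
  (mu, math.sqrt(r._3 / r._1 - mu * mu))
}

// The outlier 42.0 is filtered out; the remaining values are 0.0, 0.5, 1.0.
val (mu, sigma) = getStatistics(Seq(0.0, 0.5, 1.0, 42.0))
```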

Friday, November 13, 2015

More on Scala's Parser-combinators: building a concordance

I want to continue my discussion of Scala's parser-combinator library from a previous blog post. The ParseResult type gives us both the value that was parsed and the current state of the input. That Input value (called next in the Success case class that extends ParseResult) allows the parser not only to know what input remains for further parsing, but also can yield the current position in the original text (via the pos method). The pos method is invoked in the Failure case, but not in the Success case, which is normally what you want--once you've parsed your input successfully, you normally don't care where the various phrases actually occurred...

...except if your goal is to build a concordance. It took me a little while to figure out how to extract the positions of words so I thought it might be helpful to show how it's done.

So, what should the type of my Parser be? The most obvious type seemed to be a Tuple of Int and String: giving us the index of the word and the word itself. And we want to be able to parse an entire String (or file) of many words so we need something like this grammar rule (in extended BNF):
sentence ::= { word }
Where word is defined using a regular expression to suit our needs (see below for specifics). The name sentence might be a bit misleading here because it's not a normal language sentence: it's simply a repetition of words and, in my definition, can include periods, commas, semi-colons, etc. Your needs might be different.

We can turn all this into Scala code using the rules described in my previous post:

  class ConcordanceParser extends RegexParsers {
    val rWord = """[\w’]+[,;\.\-\?\!\—]?""".r
    def word: Parser[String] = regex(rWord)
    def sentence: Parser[Seq[String]] = rep(word)
  }


The problem with this is that our parser still doesn't return the positional information. We need to take advantage of the method positioned that is defined for Parsers. The signature of this method is:

    def positioned[T <: Positional](p: ⇒ Parser[T]): Parser[T]

So, the type that we need for our parser must extend Positional. Well, that's not too hard, although it seems strange that there isn't one already defined. Let's define a case class:

  case class PositionalString(s: String) extends Positional

Now let's substitute PositionalString for String and pass the original word-parser, with its result mapped to PositionalString, as the parameter to positioned:

  class ConcordanceParser extends RegexParsers {
    val rWord = """[\w’]+[,;\.\-\?\!\—]?""".r
    def word: Parser[PositionalString] = positioned(regex(rWord) ^^ {w => PositionalString(w)})
    def sentence: Parser[Seq[PositionalString]] = rep(word)
  }

That's it. All you have to decide now is the precise form of regex that you wish to use for the type of documents you want to make a concordance for. And how you want to treat the resulting output from the parser. For the complete source code, please go to: https://github.com/rchillyard/Scalaprof/tree/master/Concordance
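Here, for the record, is a hypothetical run of that parser showing how each word's position can be read back via pos (this assumes the scala-parser-combinators module is on the classpath; the sample sentence is mine):

```scala
import scala.util.parsing.combinator.RegexParsers
import scala.util.parsing.input.Positional

case class PositionalString(s: String) extends Positional

class ConcordanceParser extends RegexParsers {
  val rWord = """[\w’]+[,;\.\-\?\!\—]?""".r
  def word: Parser[PositionalString] = positioned(regex(rWord) ^^ { w => PositionalString(w) })
  def sentence: Parser[Seq[PositionalString]] = rep(word)
}

val p = new ConcordanceParser
val ws = p.parseAll(p.sentence, "Hello concordance world").get

// Each parsed word now knows its line and column in the original input.
for (w <- ws) println(s"${w.pos.line}:${w.pos.column} ${w.s}")
```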

[My apologies for the awful formatting, but Blogger seems to have regressed--I simply can't fix it without Blogger putting its bad formatting back]

Saturday, November 7, 2015

Simple, obvious, elegant

I am teaching a Scala for Big Data class for graduate engineering students. As we have been getting into the compositional and pure functional aspects of Scala, I've been trying to help the students solve problems. I have a mantra for these situations: Simple, obvious, elegant (SOE for short). Let's look at an example:

Let's say we are looking for a method which, given a function of type A=>B, would return a function of type F[A]=>F[B] where F is a container of some sort, such as Option, Seq, Try, etc. In particular, we will require that F be a functor because it is obviously going to require the map method. In other words we need a body for the following signature:
def lift[A,B](f: A => B): F[A] => F[B] = ???
What could possibly go in the body? Well, what information do we have? We know about f. That's it! But somehow, we need this to form a function which will operate on the content of the container. That's to say, we will need map. What will the map method operate on? Whatever is in context when it is invoked, something we don't know yet. That sounds like we need a partially applied function. But this time it is the receiver* of the method, not a parameter that is unknown. Therefore the only possible answer that satisfies our requirements (the signature) is:
def lift[A,B](f: A => B): F[A] => F[B] = _ map f
In particular, we can define the following methods, and many others, inside their respective classes/traits:
def liftOption[A,B](f: A => B): Option[A] => Option[B] = _ map f
def liftSeq[A,B](f: A => B): Seq[A] => Seq[B] = _ map f
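As a quick sanity check (the two definitions are repeated here so that the snippet stands alone):

```scala
// lift for two concrete functors: the receiver of map is the unknown,
// so each lifted function is a partially applied map.
def liftOption[A, B](f: A => B): Option[A] => Option[B] = _ map f
def liftSeq[A, B](f: A => B): Seq[A] => Seq[B] = _ map f

val double: Int => Int = _ * 2
val doubledOption = liftOption(double)
val doubledSeq = liftSeq(double)
```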
Here's another example. I've been working on a type of number that is lazily evaluated when a function is applied to it. This will form part of a fuzzy number suite that I'm developing. I could perhaps use Real from Spire but for now I want to keep it simple without bringing in any external libraries. To begin with, my lazy number class is defined thus:
abstract class LazyNumber[X : Fractional](x: X, f: X=>X) extends Valuable[X] with Fractional[LazyNumber[X]] {
  def get = f(x)
  def construct(x: X, f: X=>X): LazyNumber[X]
  def map(g: X=>X): LazyNumber[X] = construct(x,g.compose(f)) 
...}
where Valuable defines the (abstract) get method, construct is implemented by sub-classes (we could call it unit if we liked). So, the value of a lazy number is f(x) and applying a function g to this lazy number consists of composing function g with f and leaving x as is.  What about flatMap? We'd like to be able to construct for-comprehensions with our lazy numbers, wouldn't we? So, we need a body for the following signature:
def flatMap(g: X=>LazyNumber[X]): LazyNumber[X] = ???
Again, what could possibly go in the body? Well, what do we have? We've got x, f and g. What would be simple, obvious and elegant? How about:
def flatMap(g: X=>LazyNumber[X]): LazyNumber[X] = g(f(x))
Could anything possibly be more obvious, simple or elegant? 
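To see the laziness concretely, here is a stripped-down sketch of the same idea (a hypothetical Lazy class of my own, dropping the Valuable and Fractional machinery):

```scala
// A bare-bones lazy number: applying a function composes it; nothing is
// computed until get is called.
case class Lazy[X](x: X, f: X => X) {
  def get: X = f(x)
  def map(g: X => X): Lazy[X] = Lazy(x, g compose f)   // compose, don't evaluate
  def flatMap(g: X => Lazy[X]): Lazy[X] = g(f(x))
}

val n = Lazy[Double](2.0, identity) map (_ + 1) map (_ * 3)
// No arithmetic has happened yet; it all runs here: (2 + 1) * 3
println(n.get)
```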

* We could easily get into a debate here: receiver is really an artifact of object-oriented programming -- from a functional programming point of view, the two parameters to the map function (the pre-parameter and the post-parameter) are just parameters.

Friday, October 30, 2015

Scala's Parser-combinators

There have been many situations where I've had to come up with some sort of parser in order to manipulate objects that can be defined by some domain-specific language (DSL). Examples include specifying rules and aggregators for data ingest; mathematical expression parsers for a fuzzy calculator or for dynamically specifying such things as fitness functions for genetic algorithms; bridge hand notations; reading and interpreting queries for information systems using SQL-like or other grammars; JSON, XML and other deserialization. I've probably written dozens of them in various different languages. I've learned (and forgotten) more than I ever really wanted to know about ASTs and regular expressions.

Scala's parser-combinators appear to be a perfect solution to this general sort of problem. No longer do I have to go looking for open-source libraries to help with parsing. And I can parse functionally without the use of side-effects. SPCs give me all of the goodies that regular expressions can, but the definition of the parser is in Scala -- not just a bunch of special symbols strung together. Not only that but I can convert the matched input to actual types of my choosing.

But finding good documentation on SPCs is really hard. The basics are there in Programming in Scala and there are references to trivial implementations just about everywhere on the web. But these sources simply didn't tell me what I wanted to know in order to implement one and teach about it. The Scala documentation for SPCs is of course the ultimate source but navigating it when you're not quite sure what you're doing is challenging. Given that you will mostly be creating sub-classes of RegexParsers, its API documentation is somewhat helpful (especially the calculator examples). Perhaps the most helpful for detail was Jim McBeath's blog, but he jumps pretty much into the deep end. So here I'd like to present something reasonably brief that covers the basics and ends with an actual implementation.

If you're familiar with the precepts behind functional programming, you'll know that side-effects are a bad thing (they don't allow you to prove your program's correctness and they make testing a lot harder than it should be). So, that rules out having a (mutable) source of characters (or other lexical tokens) on which you make calls to some parse method. Instead, Scala provides three related types: abstract classes Parser[T] and ParseResult[T], in addition to trait Parsers. Of these types:
  • Parser[T] encodes logic concerned only with how to parse a source -- it doesn't know anything about the source itself. Parser[T] is actually a function which takes an Input and returns a ParseResult[T].
  • ParseResult[T] captures both a result of type T and the new state of the input.
  • Parsers provides generic parser combinators. Most usefully it has a subclass called RegexParsers which you will typically extend in order to build your parser application. Not only does this sub-trait define Input specifically as Reader[Char], but it defines a method which brings it all together allowing you to parse a sequence of characters using a specific parser:
    • def parseAll[T](parser: Parser[T], in: CharSequence): ParseResult[T]
    • def parseAll[T](parser: Parser[T], in: Reader[Char]): ParseResult[T]
There are also other forms of parseAll as well as parse (which looks at only a prefix of the input) but this form of parseAll will get you going. There is an even more useful sub-trait called JavaTokenParsers which defines such parsers as floatingPointNumber.

So, and here's where the fun starts, how do we define our own parsers? Let's start by observing how we would define a grammar in BNF:

expr ::= term { "+" term | "-" term }.
term ::= factor { "*" factor | "/" factor }.
factor ::= floatingPointNumber | "(" expr ")".
We can easily convert this into Scala defining three parsers:

def expr: Parser[Any] = term~rep("+"~term|"-"~term);
def term: Parser[Any] = factor~rep("*"~factor|"/"~factor);
def factor: Parser[Any] = floatingPointNumber|"("~expr~")";
The substitutions we used are:
  • newline -> "def";
  • " ::= " -> ": Parser[Any] ="
  • " " -> "~"
  • "{" -> "rep("
  • "}" -> ")"
  • "." -> ";" [Of course, we can discard the semi-colons.]
Now, let's wrap these definitions inside our desired class:

import scala.util.parsing.combinator._
class ArithmeticParser extends JavaTokenParsers {
  def expr: Parser[Any] = term~rep("+"~term|"-"~term)
  def term: Parser[Any] = factor~rep("*"~factor|"/"~factor)
  def factor: Parser[Any] = floatingPointNumber|"("~expr~")"
}
val p = new ArithmeticParser
val r = p.parseAll(p.expr,"1+2")
The type of r is a ParseResult[Any] which isn't particularly useful to us as is. But, we can see what can be done with it from the following ScalaTest code:


r should matchPattern { case p.Success(_,_) => }

r will be an instance of one of the subclasses of ParseResult: either Success, Failure, or Error. But you cannot match on Success as such (it is not imported by the import statement shown): instead you must refer to it as p.Success where p is your parser. Don't confuse it of course with the Success subclass of Try or Future. The get method of ParseResult[Any] returns an Any which is basically the concatenation of the matched tokens. What we really want is to be able to get something out of the parser(s) that conforms to a type of our choosing, such as a Double, for instance, though if we are building a DSL for Xs, we would want to get back an X, whatever that might be.

If you look in the Scala API at Parsers again, you will see among the methods, the following:
p1 ~ p2 // sequencing: must match p1 followed by p2
p1 | p2 // alternation: must match either p1 or p2, with preference given to p1
p1.?    // optionality: may match p1 or not
p1.*    // repetition: matches any number of repetitions of p1
These are the famed parser combinators and this is how they work:
  • If p is a Parser[String], for instance, then the result of p matching String s is simply s.
  • (sequencing): if p is a Parser[P] and q is a Parser[Q] then p~q results in an object of type P~Q, which can also be written ~[P,Q]. In other words, assuming that it matches successfully, it will result in both a P and a Q.
  • (alternation): if p is a Parser[P] and q is a Parser[Q] then p|q results in an object of type P, or an object of type Q. It attempts to match p first and, if that's successful, it doesn't invoke q at all.
  • (optionality):  if p is a Parser[P] then p.? or opt(p) results in an object of type Option[P].
  • (repetition):  if p is a Parser[P] then p.* or rep(p) results in an object of type List[P]. There's a similar method, repsep which takes an additional parameter, the separator.
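These result types are easy to verify with a throwaway parser (a sketch assuming the scala-parser-combinators module is available; the Combinators object is my own):

```scala
import scala.util.parsing.combinator.RegexParsers

object Combinators extends RegexParsers {
  def ab: Parser[String ~ String] = "a" ~ "b"   // sequencing yields a ~[String,String]
  def aOrB: Parser[String] = "a" | "b"          // alternation tries "a" first
  def as: Parser[List[String]] = rep("a")       // repetition yields a List[String]
}

val seqResult = Combinators.parseAll(Combinators.ab, "a b").get
val altResult = Combinators.parseAll(Combinators.aOrB, "b").get
val repResult = Combinators.parseAll(Combinators.as, "a a a").get
```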
We are getting close! There is one more operator that you need to know about. This operator is a method on Parser[T] and is written ^^. It's essentially a "lift" method: it takes as its parameter a function (T)=>U (i.e. T is transformed into U) and has the effect of transforming a Parser[T] into a Parser[U]. Thus, if we have written our parsers as Parser[Any] (as we did) and what we really want is a Parser[Double], then we need to create some functions which transform Any into Double and apply them to each of our parsers using the ^^ operator. This is much harder to explain (or understand) than it actually looks, so let's dive right into a simple example:

package edu.neu.coe.scala.parse
import scala.util.parsing.combinator._
/**
 * @author scalaprof
 */
class Arith extends JavaTokenParsers {
  trait Expression {
    def value: Double
  }
  abstract class Factor extends Expression
  case class Expr(t: Term, ts: List[String~Term]) extends Expression {
    def term(t: String~Term): Double = t match {case "+"~x => x.value; case "-"~x => -x.value }
    def value = ts.foldLeft(t.value)(_ + term(_))
  }
  case class Term(f: Factor, fs: List[String~Factor]) extends Expression {
    def factor(t: String~Factor): Double = t match {case "*"~x => x.value; case "/"~x => 1/x.value }
    def value = fs.foldLeft(f.value)(_ * factor(_))
  }
  case class FloatingPoint(x: Any) extends Factor {
    def value = x match {
      case x: String => x.toDouble
      case _ => throw new RuntimeException("FloatingPoint: logic error: x is not a String")
    }
  }
  case class Parentheses(e: Expr) extends Factor {
    def value = e.value
  }
  def expr: Parser[Expr] = term~rep("+"~term | "-"~term | failure("expr")) ^^ { case t~r => r match {case x: List[String~Term] => Expr(t,x)}}
  def term: Parser[Term] = factor~rep("*"~factor | "/"~factor | failure("term")) ^^ { case f~r => r match {case x: List[String~Factor] => Term(f,x)}} 
  def factor: Parser[Factor] = (floatingPointNumber | "("~expr~")" | failure("factor")) ^^ { case "("~e~")" => e match {case x: Expr => Parentheses(x)}; case s => FloatingPoint(s) }
}
Now our test code will look a little different. We can actually test the value of the result as a Double.
val p = new Arith
val r = p.parseAll(p.expr,"1+2")
r should matchPattern { case p.Success(_,_) => }
r.get.value shouldBe 3.0
That's really all you need to know. It is fairly easy to make this a bit more robust by wrapping the arithmetic operations in Try and to make the types generic. Maybe I will post another blog showing how that's done. 

Friday, September 4, 2015

Newton's Approximation

Way back in 1967 my school started an innovative new program. We could write programs for a computer at Imperial College in London, mail them in, have them run, with the results being returned to us the following week.

The biggest snag was that we didn't have a card-punch machine. So, instead we got cards that were semi-perforated--every second column (starting with 1) was perforated so that little rectangular holes could easily be punched out. Obviously, we had to know the EBCDIC code for all of the letters, numbers, etc.

The programs were written in Fortran so embedded spaces were meaningless. But if a line went on too long (a line was effectively limited to 37 characters) we needed a punch in column 6. We had special continuation cards since we couldn't create those ourselves.

Some of us decided to program a Newton's approximation of some easy function. A couple of the other kids chose cos(x) - x = 0. Because I was a smart-ass I chose cos(x) - sech(x) = 0.

I wanted to show an example of this for the class I'm teaching so, as usual, the first thing I did was to google "scala newton approximation" or something like that. But when I looked at one of the entries I found, it was so un-Scala-like that I decided I would have to write my own. This is what I came up with:

/**
 * @author scalaprof
 * (c) Robin Hillyard (2015)
 */
case class Newton(f: Newtonian, guess: Double, maxTries: Int) {
  def solve: Either[String,Double] = solve(State(guess,maxTries))

  private def step(s: State): Either[State,Either[String,Double]] =
    f(s) match {
      case None => Right(Right(s.x))
      case Some(x) => s(x) match {
        case Left(e) => Right(Left(e))
        case Right(s1) => Left(s1)
      }
    }

  import scala.annotation.tailrec

  @tailrec private def solve(s: State): Either[String,Double] = step(s) match {
    case Right(r) => r
    case Left(l) => solve(l)
  }
}

case class State(x: Double, tries: Int) extends Function1[Double,Either[String,State]] {
  def apply(x: Double) = tries match {
    case 0 => Left("Failed to converge")
    case _ => Right(State(x,tries-1))
  }
}

case class Newtonian(name: String, f: Double=>Double, dfbdx: Double=>Double, threshold: Double) extends Function1[State,Option[Double]] {
  def apply(s: State) = {
    val x = s.x
    val y = f(x)
    if (math.abs(y) > threshold) Some(x - y/dfbdx(x)) else None
  }
}

object Newton {
  def apply(f: Newtonian, start: Double): Newton = apply(f,start,100)

  def main(args: Array[String]): Unit = {
    val f = Newtonian("cos(x)-x (~1E-7)", {x => math.cos(x) - x}, {x => -math.sin(x) - 1}, 1E-7)
    Newton(f,1.0).solve match {
      case Right(x) => println(s"""the solution to "${f.name}" is $x""")
      case Left(m) => println(s"error: $m")
    }
  }
}
Notice that there are no variables in this program! All of the problem domain (including the definition of the function itself, its derivative and an appropriate tolerance value) is encapsulated in the Newtonian class and all of the logic of Newton's method is encapsulated in two classes: Newton which is the main driver and State which holds the current state of the run.

This makes good use of the Scala types Option and Either to cope with the logic aspects of the problem. There are really two places where this sort of logic is required: when running the actual approximation logic itself in method apply of class Newtonian: have we reached a point where we are close enough? Or should we return a new delta(x) value. Thus the return type is Option[Double] which can be either None or Some(delta).

The second place is in the apply method of the State class: if we've exhausted the maximum number of tries, we should return an error message, otherwise we should return a new value of State. This can be handled with an Either[String,State]. The values will thus be Left("Failed to converge") or Right(State(x,tries-1)).

In the solve method of class Newton, we sort all this logic out, returning an Either[String,Double] result.

Finally, we have the singular object Newton which includes the main program. This sets up the Newtonian for our particular problem, that's to say cos(x)-x=0 and tries to solve it. According to the result type, a message is printed on the console.

One other point to mention is that we do use recursion in this code. Programmers have been taught to avoid recursion because it is "less efficient than iteration." However, functional languages like Scala have a way of making recursion exactly as efficient as iteration, provided that the logic is expressed as a "tail" recursion. There is an annotation (@tailrec) on the solve method in Newton to ensure that our code really is tail-recursive.

Friday, August 28, 2015

String Interpolation

Scala 2.10 introduced a nice feature for creating instances of types which can be defined using a String. It's called String Interpolation and its most common application is those strings that you see sprinkled through code like the following:

    throw new Exception(s"$x is too big for Int")

The expression beginning with "$" is replaced in the string by its value (in this case the value of x, whether that is a function call or an identifier). Expressions including spaces require curly braces. If you follow the link referenced at the top, you will see how to customize string interpolation for your own purposes. If it is all completely clear to you by the time you get back here, then just ignore the rest of this blog post.

But if, like me, you find that comments like "A simple (buggy) implementation of this method could be:..." less than super-helpful, then stay right here and see a real example.

The example I'm going to present is the creation of a Rational number using the string form r"n/d" where n and d represent the numerator and denominator respectively. What Rational, I hear you say? There is no Rational class in Scala. Well, that's true (today at least). But we are going to suppose that we do have the following class (where most of the code in the middle has been replaced by an ellipsis).

  case class Rational(n: Long, d: Long) extends Numeric[Rational] {
    ...  
  }

  object Rational {
    def apply(x: Long): Rational = new Rational(x,1)
    def apply(x: String): Rational = {
      val rRat = """^\s*(\d+)\s*(\/\s*(\d+)\s*)?$""".r
      x match {
        // the regex has three groups so all three must appear in the pattern;
        // the whole-number case (where d is null) must come first
        case rRat(n, _, null) => Rational(n.toLong)
        case rRat(n, _, d) => Rational(n.toLong, d.toLong)
        case _ => throw new Exception(s"invalid rational expression: $x")
      }
    }
    ...
  }

We will add the code described in the linked document to the object definition above. However, we will have to work the actual implementation a little better.

  object Rational {
    implicit class RationalHelper(val sc: StringContext) extends AnyVal {
      def r(args: Any*): Rational = {
        val strings = sc.parts.iterator
        val expressions = args.iterator
        val sb = new StringBuffer()
        while(strings.hasNext) {
          val s = strings.next
          if (s.isEmpty) {
            if(expressions.hasNext)
              sb.append(expressions.next)
            else
              throw new Exception("r: logic error: missing expression")
          }
          else
            sb.append(s)
        }
        if(expressions.hasNext)
          throw new Exception(s"r: logic error: ignored: ${expressions.next}")
        else
          Rational(sb.toString)
      }
   }
  ...
  }

You can probably eliminate the rather ugly logic errors because I don't think they can ever happen (although the mechanism which requires you to evaluate the next expression when you get an empty argument string is rather hokey and not well explained).

That's all there is to it. If you actually want to use it in another class file, then you will need to import Rational.RationalHelper.
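If you want a self-contained miniature to experiment with, here is the same pattern applied to a toy Point type (the names are my own invention, and I drop the AnyVal optimization for brevity; I reuse the standard s interpolator to splice any embedded expressions before matching):

```scala
object PointInterpolation {
  case class Point(x: Int, y: Int)

  implicit class PointHelper(sc: StringContext) {
    // pt"3,4" parses to Point(3,4); embedded expressions are spliced in first.
    def pt(args: Any*): Point = {
      val s = sc.s(args: _*)
      val rPoint = """^\s*(-?\d+)\s*,\s*(-?\d+)\s*$""".r
      s match {
        case rPoint(x, y) => Point(x.toInt, y.toInt)
        case _ => throw new Exception(s"invalid point expression: $s")
      }
    }
  }
}

import PointInterpolation._
val origin = pt"0,0"
val p = pt"${1 + 2}, 4"
```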