Monday, May 16, 2016

Learn Java or Scala first?

There is a common question type on Quora that asks whether you need to know Java in order to learn Scala. My experience tells me that it is not necessary. Indeed, I think it may actually be harmful to learn an imperative language (such as Java) first. Such languages are based on the Von Neumann model of computing (which was fine in its day): load-modify-store. But compute power today is so much greater than anything imaginable in the 1940s that a compiler doesn't need us to tell it how to solve a problem. We really only need to be able to describe our problem.

Rather than repeat my entire argument here, I will simply link to an answer I wrote for Quora on the question Should I need Java...?

Here also are some other Quora discussions relevant to this subject:
The consensus appears to be that it would perfectly appropriate, and generally beneficial, to learn Scala first, without having to know Java or, indeed, any other language. Learning Scala first will teach you how to think and describe your problem to the compiler. You shouldn't have to be thinking about the way the machine will ultimately implement your application.

Wednesday, May 4, 2016

Transforming iteration into tail recursion

My attention was caught by a question on Quora: What specific Scala coding skills do I need to master in order to get a job as a Scala dev?
 
Since I had given my students at Northeastern University two questions in the final exam that related to this, I thought I would answer the question and write a series of blogs here to give a little more detail. The two questions that I gave them specifically were the first two bullet points in my answer. So let's start the ball rolling by seeing if there is indeed a general way to transform an iteration into a (tail) recursion.

Here's a typical, if trivial, example of the kind of thing that new Scala programmers want to do:

  val data = List(0.1, 0.4, 0.2, 0.7, -0.1, 1.1, 0.5)

  {
    var sum = 0.0
    for (i <- data.indices; if i % 2 == 0) sum = sum + data(i)
    println(s"sum of even positions: $sum")
  }

This works just as you'd expect. The even numbered elements of data are summed. The only fly in the ointment is that darned var sum. Obviously, we do want something to start out as zero and get successively augmented so that it results in the sum of the data values we're interested in. But how can we do that without iterating and updating a variable?

This is where recursion comes to our rescue. Recursion is really the functional programming counterpart of iteration for imperative programming. But haven't we always been taught to avoid recursion in our programming? Well, yes. But because recursion is so important for functional programming, languages like Scala (and Haskell, etc.) allow a compiler optimization when our process is tail-recursive. What does that mean? Well, if the recursion happens as the very last thing that the recursive method performs (i.e. at the tail end), there's actually nothing of interest that we need to place on the stack. We would essentially just be putting boiler-plate code on the stack--nothing that we really need.  So, the compiler implements tail-recursion every bit as efficiently as iteration.

Since the compiler is better than we are at checking if a method is tail-recursive (well, that's certainly true for beginners), we can assert that our code is tail-recursive using the @tailrec annotation, and the compiler will warn us if we are wrong. Then we can fix it, rather than run the risk of it chewing up the entire stack in a production run.

So, here's our functional-programming style version of the previous code:

  val data = List(0.1, 0.4, 0.2, 0.7, -0.1, 1.1, 0.5)

  {
    import scala.annotation.tailrec
    @tailrec def loop(accumulator: Double, indices: List[Int]): Double = indices match {
      case Nil => accumulator
      case h :: t => loop(accumulator + data(h), t)
    }

    val sum = loop(0.0, (for (i <- data.indices; if i % 2 == 0) yield i).toList)
    println(s"sum of even positions: $sum")
  }

This particular example was so simple that it was easy to transform (refactor) the code. [And, yes, I'm aware that there are even simpler ways to sum these numbers using Scala's collections API.] In particular, we had a generator of indices (the guts of the for-comprehension) that could be invoked ahead of doing the recursion. This won't always be true of course, and in such cases we'll need a third parameter in our loop method. But here loop only needs two things: the list of indices we haven't looked at yet and the accumulated value (accumulator).

It's true that the code is a bit longer than it was. But the big advantage is that our new code is referentially transparent. That's to say we can replace any identifier with its definition and our program will function the same. That's not possible when there are mutable variables about (i.e. vars) because the behavior of the program will depend on exactly when we make the substitution. There are much better discussions of referential transparency out there. My purpose here was to suggest a simple refactoring of a simple var-based iteration.

In the next blog entry, I will talk about dealing with mutable state.