Tuesday, May 18, 2021

Flog: the functional logger

In this series of blogs, we've covered functional versions of comparers, table parsers, and just "code."

It's time to revisit an earlier functional topic: loggers (see Spy). Spy was OK but the syntax was not ideal. In particular, it required an extra set of parentheses which had to be removed when removing the log messages. So, over the last few years, I have created a new functional logger called Flog. You can find it at Flog. Take a look, in particular, at the README.md file, and also under the test directory: FlogExamples.sc as well as other spec files in the test directory.

I have also written about it on Medium.

So, what is the real difference between Spy and Flog? It's mostly down to the use of an implicit class. What's that?

Implicit classes, like the other implicit things (vals, defs, objects) are an especially powerful bit of magic. If some code requires a particular method, say, propertyA (defined for class Y) and you try to invoke it on an instance of class X which does not have such a method, the compiler is going to look for an implicit converter from X to Y. This could be declared as an implicit def name(x: X): Y, or an implicit val of type X=>Y, or as the constructor of an implicit class Y which takes exactly one parameter, an X. This is a similar mechanism that allows for additional methods on String that belong to the StringOps class, such as r for regex.

In the case of Flog, the name (not that it matters) of the implicit class is Flogger. It defines a number of methods such as def !![X: Loggable](x: => X): X which performs info-level logging for X and various compound types of X such as Iterable[X], Option[X], etc. There are similar methods for debug (!?) and trace (!??).

The real utility of the library is in the various implicits that allow for the rendering of different types and types of types. Flog can handle Iterable[X], Option[X], Map[K, V], Future[X], Try[X], as well as any case class (or other Product).

Friday, October 2, 2020

Using: Did they forget something?

I was upgrading one of my open-source projects (TableParser) to Scala 2.13 and I realized that I could now use the new "Using" resource management utility.

But, it seems they missed a necessary signature. Let's review:

The basic signature for normal (single-resource use) is this apply method (I've used a context bound to save a bit of space, and I also renamed the first parameter to r):

def apply[R : ReleasableA](r: => R)(f: (R) => A)Try[A

 = Try { Using.resource(r)(f) }

As you can see, it invokes the other single-resource signature resource:

def resource[R : ReleasableA](r: R)(f: (R) => A)A

The difference between these signatures is somewhat subtle: apply returns a Try[A] whereas resource returns A. But, there's another, more subtle yet more significant difference: the r parameter in apply is call-by-name, whereas in resource, it's call-by-value. This implies that, if an exception is thrown while evaluating r for resource, it will indeed be thrown and not wrapped in a Failure. So, the resource signature is not suitable if there's a possibility of an exception being thrown while evaluating the r parameter.

If that's the case, then you must use the apply signature. However, suppose that your type A is of the form Try[X],  then the result type will be Try[Try[X]]. That's a bit ugly.

What is needed is a method such as the following (this signature is to apply, as flatMap is to map):

def safeResource[R: Releasable, A](resource: => R)(f: R => Try[A]): Try[A] = Using(resource)(f).flatten

An alternative expression for the body of this method would be the following:

try { Using.resource(resource)(f) } catch { case NonFatal(e) => Failure(e) }  

You could, of course, add the flatten method call yourself, but that's not the most elegant. So, I think it would be nice to have this signature added to the Using object in the Scala library.

Tuesday, September 15, 2020

Continued Fractions

I was inspired recently by a wonderful Mathologer video to implement continued fractions in Scala. I chose Scala for two reasons: first (as you would know if you've been here before) is that it's my favorite language and, second, it requires lazy programming.  What are continued fractions? And why are they so interesting?

They've been around, essentially, since Euclid, although they hit their heyday in the 17th century with the likes of Leibniz, Wallis, and Euler (who came a little later), and many other less well known mathematicians. The particular subject of the video (all his videos are great, BTW) is the solution to the "Strand puzzle." This was a puzzle set in the Strand magazine (the one that published the Sherlock Holmes stories) by one of the great puzzle setters of all time: Henry Dudeney.

A man lives at house number n on a long street of N houses (all on the same side), numbered from 1 thru N. He notices that if he sums all the house numbers on one side of his (L) and all of the house numbers on the other side (R), then L = R. What is his house number?

Supposedly, the brilliant mathematician Ramanujan immediately recognized that the solutions were related to the convergents of a particular continued fraction that gives solutions to the equation Y^2 -- 2 X^2 = 1 -- one of the so-called Pell equations, after John Pell (1611-1685). I will let you watch the video to get the full story.

So, the general Pell equation Y^2 -- x X^2 = 1 relates to the following continued fraction:


When we set x = 2 (for the Strand magazine problem), we get successive approximations to the square root of two.

Here's another rather elegant definition of phi, the golden ratio:


So, how do we code this in Scala? The most important class is called ConFrac and is defined thus:

class ConFrac(val b: Long, co: => Option[CF]) {

  lazy val tailOption: Option[CF] = co

...
}

You might notice that this is not a case class. That's because we need the second parameter to be lazily evaluated, i.e. "call by name". Since it's not a case class, we need to ensure that we can reference the first parameter, b, in other places, using the val keyword. This doesn't work, of course, for the second parameter because of course that is, in reality, a function that generates an Option[CF]. Hence the lazy val (or def) defined as tailOption. Obviously, we need to define CF too:

case class CF(a: Long, c: ConFrac)

This is a case class, and defines the numerator a and identifies the next ConFrac in the series as c.

How do we get the value of our continued fraction? Well, it's an infinite series so we can't evaluate the whole thing. But, we can look at the convergents which asymptotically approach the true value. Here's the method for getting the convergents:

  def convergents: LazyList[Rational] = {
    def inner(an_2: BigInt, an_1: BigInt, bn_2: BigInt, bn_1: BigInt, w: => LazyList[Pair]): LazyList[Rational] = w match {
      case LazyList() => LazyList()
      case p #:: tail =>
        val an = p.b * an_1 + p.a * an_2
        val bn = p.b * bn_1 + p.a * bn_2
        Rational(an, bn) #:: inner(an_1, an, bn_1, bn, tail)
    }

    val h #:: z = coefficients
    Rational(h.b) #:: inner(1, h.b, 0, 1, z)
  }

Note that the inner method is not tail-recursive. I don't think there's a way to make it tail-recursive. This method returns a LazyList[Rational], and we can evaluate it as far as we want. Scala fans will note that I used a pattern in the definition val h #:: z = coefficients. This is so much more elegant than any other way of getting the information I needed. Of course, I have to be sure that coefficients is not empty. It makes use of another method coefficients:

  def coefficients: LazyList[Pair] = {
    def inner(_b: Long, a: Long, co: Option[CF]): LazyList[Pair] =
      Pair(_b, a) #:: {
        co match {
          case None => LazyList()
          case Some(cf) => inner(cf.c.b, cf.a, cf.c.tailOption)
        }
      }

    inner(b, 0, tailOption)
  }

This uses another case class Pair which is is simply two Longs: b and a. The result of coefficients is a LazyList of Pairs.  The form of continued fraction described here looks as follows where the nth element of the result is (bn, an) and a0 is always zero:

So, we can get values from a ConFrac. But how do we construct them in the first place? Obviously, declaring new ConFrac(1, new ConFrac(...)) wouldn't be so convenient! Instead, there are methods in the ConFrac companion object as follows:

  def apply(ps: LazyList[Pair]): ConFrac = ps match {
    case p #:: LazyList() => new ConFrac(p.b, None)
    case p #:: tail => new ConFrac(p.b, Some(CF(p.a, ConFrac(tail))))
  }

  def simple(xs: LazyList[Long]): ConFrac = ConFrac(Pair.zip(xs, LazyList.continually(1L)))

Pair.zip is a method in Pair's companion object which zips the two lazy lists together and then maps each resulting tuple to a Pair. The purpose of simple is to make it easier to define a so-called simple continuous fraction (where all of the a terms are unity).

Here are some of the definitions of simple LazyLists:

  • phi (the golden ratio): LazyList.continually(1L)
  • e (Euler's number): 2L #:: LazyList.iterate(Seq(1L, 2, 1)) { case Seq(x, y, z) => Seq(x, y + 2, z) }.flatten

And here is one of the definitions of lazy lists for generalized continued fractions:

  • pi (one of several): Pair.zip(0L +: LazyList.from(0).map(2L * _ + 1), 4L +: LazyList.from(1).map(x => 1L * x * x))
It's been fun working on this project. There is one method that is currently not well implemented. It is the toDouble method. The signature is: def toDouble(epsilon: Double): Option[Double] and it is required to provide an approximation that is correct to within epsilon. Such a method is required if mathematics with known precision is being performed. If you have a required precision of, say, pi to 1E-20, then you need to be able to ensure that the result is true to that precision.

One thing that has tripped me up a few times working with LazyLists is that there are many methods which LazyList inherits through Seq which are not "call by name." An example of this is +: which prepends an element to a LazyList. But, unfortunately, it doesn't do it lazily ;)

I think it will work out to be a good assignment for my Scala class!

Wednesday, April 8, 2020

A functional comparer

One of the aspects of Scala and Java that I've always felt could be improved is the mechanism for comparing things. The basic scheme, inherited from Java, is that two objects, x and y, can be compared and if the result is less than zero, then x is smaller than y, if it's greater than zero, then x is larger than y, otherwise they are equal. This is the kind of code which we need to write for some user-defined class:
case class Date(year: Int, month: Int, day: Int) extends Ordered[Date] {
  def compareTo(that: Date): Int = {
    val cfy = year.compareTo(that.year)
    if (cfy!=0) cfy
    else {
      val cfm = month.compareTo(that.month)
      if (cfm!=0) cfm
      else day.compareTo(that.day)
    }
  }
}
Part of the problem arises from the use of an Int (int in Java) that tries to represent three-valued logic with four billion different values. Java has no easy construct to switch according to ranges of values. Maybe in Java 12, the new switch statement will accommodate this requirement?

In Scala, at least, we could use pattern matching so that the code (above) to discriminate the results could be rewritten:
case class Date(year: Int, month: Int, day: Int) extends Ordered[Date] {
  def compare(that: Date): Int = year.compareTo(that.year) match {
    case 0 => month.compareTo(that.month) match {
      case 0 => day.compareTo(that.day)
      case x => x
    }
    case x => x
  }
}
But it's not really a lot better. Given that Scala is a functional language, I think we can improve things significantly. How about something like the following?
object Date {
  implicit val dateComparer: Comparer[Date] = {
    val cf = implicitly[Comparer[Int]]
    cf.snap(_.year) orElse cf.snap(_.month) orElse cf.snap(_.day)
  }
}

This time, we get an implicit Comparer[Int] (see below) and we compose the three necessary comparisons together.

Comparer[T] is a trait which defines an apply function of type T => T => Comparison. There are various methods for composing comparers, and there are methods which make it easy to compare things using the more usual tupled parameters (rather than the curried apply function). Comparison is a trait which can be evaluated as a Kleenean (three-valued logic type) and ultimately as an Option[Boolean].

We are able to perform the Date comparison (above) because the result of invoking each of the snap methods is one object: a Comparer[Date].  The orElse method composes comparers together such that, if there is a distinction yielded by the left-hand operand, then that is the result, otherwise we look to the right-hand operand, and so on.

We find an implicit value of a type class for the integer comparer, and we make this a variable called cf. The snap method takes a "lens" function as its parameter and transforms the Comparer[Int] into a Comparer[Date]

Actually, we can come up with something rather more elegant than this:
object Date {
  implicit val dateComparer: Comparer[Date] = Comparer.same[Date] :| (_.year) :| (_.month) :| (_.day)
}
The Comparer.same method simply provides a Comparer of the given type which always evaluates to Same. The :| method composes (using orElse) two Comparers where the one on the right is constructed from an implicitly discovered Comparer of the type yielded by the lens function and which is then snapped by the given lens. Here, the lens functions are defined as lambdas (function literals).

There's also a :|! method which works the same except that it invokes the orElseNot method which flips the sense of the Comparer formed from the lens function.

Actually, since in this case the lens functions are all of type Date=>Int and all of the same sense, we can do even better:
object Date {
  implicit val dateComparer: Comparer[Date] = Comparer(_.year, _.month, _.day)
}
Now, isn't that a lot more elegant? The apply method takes a variable list of lens functions but, in this form, they must all be of the same type.

Now, we've got the compiler doing some serious work for us. For each of the lens functions, the compiler will find an implicit Comparer and apply the lens function to it (via snap).

A typical usage of this in a specification might be:
val today = Date(2019, 6, 5)
val tomorrow = Date(2019, 6, 6)
Compare(today, today) shouldBe Same
Compare(today, tomorrow) shouldBe Less
Compare(tomorrow, today) shouldBe More
Well, of course, that's not quite it. We can do even better:
object MyComparers extends Comparers {
  val comparer: Comparer[Date] = comparer3(Date)
}
import MyComparers._
Compare(tomorrow, today) shouldBe More
This time, we didn't have to spell out how to compare the various elements of the Date case class. The compiler figured it out for us using the magic of type inference. Notice that we simply pass in the Date.apply method to the comparer3 method. But, inside comparer3, that method is never actually invoked. The work is all done by the compiler.

Now, we've really got the compiler doing some serious work for us!

As well as comparing case class instances made up of simple types such as String, Int, etc. we can also create case classes based on collections, wrappers, user-defined types, tuples, whatever. Just as long as there is an implicit Comparer[T] of the appropriate underlying type defined in scope.

So, for example, suppose that we have the following bizarre case class:
case class Bizarre(x1: Seq[Int], x2: (Double,Double), x3: Try[Boolean], x4: Option[Int], x5: Unit)
object MyComparers extends Comparers {
  implicit val comparer: Comparer[Bizarre] = comparer5(Bizarre)
}
import MyComparers._
val x1 = Bizarre(Seq(1, 2), (1.0, 0.0), Success(true), Some(4), ())
val x2 = Bizarre(Seq(1, 2), (1.0, 0.0), Success(true), Some(5), ())
Compare(x1,x2) shouldBe Less
We need do no more than declare the comparer implicitly.

Couldn't we have achieved all of this simply by comparing tuples using Ordering? To some extent, yes--if you just want to compare two tuples of the same arity and are happy with the -1, 0, +1 result. However, it's a little awkward and it doesn't happen implicitly for a case class. You must first invoke unapply in order to yield an Option[TupleN...]. And that will only work up to Tuple8. The comparers defined in the Compare package work with up to 11 parameters. And, even then, it would require a lot of code if the case class wasn't defined in decreasing order of significance.

The main purpose of the Comparer package is the elegance and the ease of composition of comparers and the provision of out-of-the-box comparisons for just about every situation. You can find this open-source package on github: https://github.com/rchillyard/Comparer.

Monday, April 15, 2019

Scala table parser

I've been busy over the last year with some new Scala projects in my GitHub space. In this blog I will talk about TableParser. The current release is v1.0.5.

TableParser is a CSV file parser which uses type classes to facilitate the task of programming the parser. I have written elsewhere about the advantages of type classes, but in a nutshell, a type class (which is usually defined as a trait with a single parametric type, e.g. trait T[X]) can allow you to create classes which provide functionality which derives from the combination of the type class T and its underlying type X. The totality of such classes, therefore, is the Cartesian product of all type classes T and all underlying (concrete) types X.

TableParser operates on the assumption that each row in a CSV file represents some "thing," which you can model as a case class. Don't worry if your CSV is basically just a matrix of strings--we can handle that too.

But, what if the data in these rows is of disparate types, some Ints, some Doubles, some Strings? For this, we rely on some magic that derives from Scala's ability to perform type inference. This is an aspect of Scala that not generally emphasized enough, in my opinion. Of course, we like the fact that type inference can check the integrity of our program. But it does more than that--it essentially writes code for us!

As an example, let's think about a CSV file which contains a set of daily hawk count observations (i.e. each has a date, a species name, and a count). Just to keep it simple for this explanation, we will ignore the date. We describe our hawk count with a simple case class:
case class HawkCount(species: String, count: Int)
And now we create an object which extends CellParsers as follows:
object HawkCountParser extends CellParsers {
  implicit val hawkCountParser: CellParser[HawkCount] = cellParser2(HawkCount)
}
The only tricky part here was that we had to count up the number of parameters in HawkCount and use the appropriate cell parser (in this case, cellParser2). Then we had to pass in the name of the case class and we have a parser which knows how to read a species and, more significantly, how to read and convert the values in the count column to Ints.

What we are actually passing to cellParser2 is a function which takes two parameters. It is the function of HawkCount called "apply." It is the type inference of the compiler which now allows us to know how to parse the individual fields (parameters) of the case class. If you have created additional apply methods (or simply have a companion object for your type), you will have to explicitly name the apply method that you want (you can do this using the type--see README file).

Now, all we have to do is to parse our file. Something like this...
import HawkCountParser._
val hty = Table.parse("hawkmountain.csv")

Note that the returned value is a Try[Table[HawkCount]]. A Table is a monad and can easily be transformed into another Table using map or flatMap.

Sometimes, there will be too many columns to be grouped logically into one case class. But, no worries, you can set up a hierarchy of case classes. Just make sure that you define the parsers for the inner case classes before they are referenced by an outer case class.

You could simply print your table by invoking foreach and printing each row. However, if you want a little more control and logic to your output, you have two options: a simple "square" rendering, for which you will set up an output type, for example,

implicit object StringBuilderWriteable extends Writable[StringBuilder] {
  override def writeRaw(o: StringBuilder)(x: CharSequence): StringBuilder = o.append(x.toString)
  override def unit: StringBuilder = new StringBuilder
  override def delimiter: CharSequence = "|"} 
hty.map(_.render)

Alternatively, you could write your table out to a hierarchical format, such as XML or HTML.
For more detail, please see the README file.

Saturday, March 2, 2019

Functionalizing code

I'm not sure if functionalizing is really a word. But let's suppose it is.

Some time ago, I wrote an application to allow me to view the output of the "Results" option that Blackboard Lean (oops: Learn) provides us. Since most answers use HTML, the content of individual answers can be obscured at best. At worst, impossible.

So, I created a filter that takes the CSV file and turns it into an HTML file with a table (aspects of questions are columns, student submissions are rows). I recently upgraded it to allow me to specify a particular set of columns that I was interested in. But I was very dissatisfied when I looked at the HTML class which I had previously used to set up the output tags. This was how it looked:

/**  * Mutable class to form an HTML string  */
class HTML() {
  val content = new StringBuilder("")
  val tagStack: mutable.Stack[String] = mutable.Stack[String]()

  def tag(w: String): StringBuilder = {
    tagStack.push(w)
    content.append(s"<$w>")
  }

  def unTag: StringBuilder = content.append(s"</${tagStack.pop()}>")

  def append(w: String): StringBuilder = content.append(w)

  def close(): Unit = while (tagStack.nonEmpty) {
    unTag
  }

  override def toString: String = content.toString + "\n"
}


And this was how it was used:


def preamble(w: String): String = {
  val result = new HTML
  result.tag("head")
  result.tag("title")
  result.append(w)
  result.close()
  result.toString
}
def parseStreamIntoHTMLTable(ws: Stream[String], title: String): String = {
  val result = new HTML
  result.tag("html")
  result.append(preamble(title))
  result.tag("body")
  result.tag("table")
  ws match {
    case header #:: body =>
      result.append(parseRowIntoHTMLRow(header, header = true))
      for (w <- body) result.append(parseRowIntoHTMLRow(w))
  }
  result.close()
  result.toString
}
 

The rendering of the code here doesn't actually show that the mutable Stack class is deprecated. How could I be using a deprecated class--and for mutation too! Ugh! Well, it was a utility, not an exemplar for my students. So, it was acceptable.
 
But I decided to functionalize it. First, I needed to create a trait which had the basic behavior I needed: 


 
/**
  * Trait Tag to model an Markup Language-type document.
  */
trait Tag {

  /**
    * Method to yield the name of this Tag
    *
    * @return the name, that's to say what goes between &lt; and &gt;
    */
  def name: String

  /**
    * Method to yield the attributes of this Tag.
    *
    * @return a sequence of attributes, each of which is a String
    */
  def attributes: Seq[String]

  /**
    * Method to yield the content of this Tag.
    *
    * @return the content as a String.
    */
  def content: String

  /**
    * Method to yield the child Tags of this Tag.
    *
    * @return a Seq of Tags.
    */
  def tags: Seq[Tag]

  /**
    * Method to add a child to this Tag
    *
    * @param tag the tag to be added
    * @return a new version of this Tag with the additional tag added as a child
    */
  def :+(tag: Tag): Tag

  /**
    * Method to yield the tag names depth-first in a Seq
    *
    * @return a sequence of tag names
    */
  def \\ : Seq[String] = name +: (for (t <- tags; x <- t.\\) yield x)
}
Together with an abstract base class:
abstract class BaseTag(name: String, attributes: Seq[String], content: String, tags: Seq[Tag])(implicit rules: TagRules) extends Tag {

  override def toString: String = s"""\n${tagString()}$content$tagsString${tagString(true)}"""

  private def attributeString(close: Boolean) = if (close || attributes.isEmpty) "" else " " + attributes.mkString(" ")

  private def tagsString = if (tags.isEmpty) "" else tags mkString ""

  private def nameString(close: Boolean = false) = (if (close) "/" else "") + name

  private def tagString(close: Boolean = false) = s"<${nameString(close)}${attributeString(close)}>"
}

And a case class for HTML with its companion object:
/** * Case class to model an HTML document. * @param name the name of the tag at the root of the document. * @param attributes the attributes of the tag. * @param content the content of the tag. * @param tags the child tags. * @param rules the "rules" (currently ignored) but useful in the future to validate documents. */ case class HTML(name: String, attributes: Seq[String], content: String, tags: Seq[Tag])(implicit rules: TagRules) extends BaseTag(name, attributes, content, tags) { /** * Method to add a child to this Tag * * @param tag the tag to be added * @return a new version of this Tag with the additional tag added as a child */ override def :+(tag: Tag): Tag = HTML(name, attributes, content, tags :+ tag) } /** * Companion object to HTML */ object HTML { implicit object HtmlRules extends TagRules def apply(name: String, attributes: Seq[String], content: String): HTML = apply(name, attributes, content, Nil) def apply(name: String, attributes: Seq[String]): HTML = apply(name, attributes, "") def apply(name: String): HTML = apply(name, Nil) def apply(name: String, content: String): HTML = apply(name, Nil, content) }
Here's one of the places the tags in the document are set up:
  def parseStreamProjectionIntoHTMLTable(columns: Seq[String], wss: Stream[Seq[String]], title: String): Try[Tag] = Try {
    val table = HTML("table", Seq("""border="1"""")) :+ parseRowProjectionIntoHTMLRow(columns, header = true)
    val body = HTML("body") :+ wss.foldLeft(table)((tag, ws) => tag :+ parseRowProjectionIntoHTMLRow(ws))
    HTML("html") :+ preamble(title) :+ body
  }
Now, all we have to do is to use .toString on the document to render it for the final HTML source!