Tuesday, June 26, 2007

Search in Haskell

I've been playing around today with search in Haskell, and besides from some overly technical papers and a confusing explanation on the YAHT wiki book, I haven't found anything useful. So I though I'd put up what I could figure out.

First, let's use type classes to get some modularity. Search can be over trees or graphs or whatever, basically we just need states and a function to give us the children of a node. We also are going need to compare states to check if we have reached the goal.

> class (Ord a) => SearchState a where
> children :: a -> [a]

For instance, let's say we are trying to solve the numeric maze given in the haskell wiki. In this problem our state is just an integer.

> newtype NumMaze = NumMaze Integer deriving (Show, Eq, Ord)

And here is the instantiation. Our operations to generate children are (x+2), (x*2), (x/2). So

> instance SearchState NumMaze where
> children (NumMaze n) =
> map NumMaze $
> [n+2, n*2]++[n `div` 2 | 2 `divides` n]
> where divides a b = b `mod` a == 0

Now we can write very generic functions for the search operations. We'll start with a monadic depth first search. Depth first search is very natural for functional programming (if we don't worry about blowing the stack). We dive in with recursion. There is good explanation of this on the haskell wikibook here. Instead of using a graph, we use our search state and simplify the code by using a fold.

> dfs :: (SearchState a,MonadPlus m) => a -> a -> m[a]
> dfs start target
> | start == target = return [start]
> | otherwise =
> foldr mplus mzero $
> [ dfs c target >>= \path -> return (start:path)) |
> c <- children start] The code has two cases. If we have reached the goal, it returns a list of success with the result. Otherwise, in the list comprehension it tries searching each child. If it finds a successful path, it adds it returns it. You might wonder where we handle failure. If there are no children, the fold returns mzero which will propagate failure to its parents. When the parent gets a failure, mplus will try a different path, and we have backtracking. Next we'll try breadth first search. Here is one way to do it by passing a queue as an accumulator. > bfs :: (SearchState a,Monad m) => a -> a -> m [a]
> bfs start target = bfs' [[start]] >>= return . reverse
> where
> bfs' [] = fail "No Path"
> bfs' ((s:r):ns)
> | s == target = return (s:r)
> | otherwise = bfs' (ns ++[c:s:r |c<-children s]) We can make this faster by adding a set to keep track of states that we have already visited. > import Data.Set (empty, notMember, union, fromList)

> bfs2 :: (SearchState a,Monad m) => a -> a -> m [a]
> bfs2 start target = bfs' [[start]] empty >>= return .reverse
> where
> bfs' [] _ = fail "No Path"
> bfs' ((s:r):ns) seen
> | s == target = return (s:r)
> | otherwise = bfs' (ns ++new') seen'
> where
> new' = [c:s:r|
> c <- children s, > c `notMember` seen]
> seen' = seen `union` (fromList $ map head new')

So these functions are nice, and they are how I would write them in O'Caml. But they just feel too ugly for haskell. I know, I'm getting spoiled. So I thought I would give a shot at a lazier version. The rest of this entry is my very basic understanding of the Functional Pearl Breadth-First Combinators.

First, depth first search. I visualize it like this. Imagine, that the states form a tree, and we want to first transform the tree to a stream, and then process it linearly. Haskell makes this absurdly easy. First, imagine that we can linearize all the children searches. This gives us a matrix-like list of streams (since we are doing DFS, we hope they are not infinite, but that does not break the formulation) i.e.

[[Child1, Child1.1, Child1.2 ...],
[Child2, Child2.1, ...],
...
]

Since this is DFS, we want to explore these in order. This means all we need to do in concat them, and we have the full traversal. Bam!

> lazyDFT :: (SearchState a) => a -> [a]
> lazyDFT start = start : (concat $ map lazyDFT $ children start)

In the paper they call this the "and" operator for logical combination.

Now, what about Breadth-first search. Well we want to move one level at a time instead of following a single path to its end. If we know that our tree is balanced, we can simply transpose the matrix and then take the concat

[[Child1, Child1.1, Child1.2 ...],
[Child2, Child2.1, ...],
...
]
-->
[[Child1, Child2, Child3 ...],
[Child1.1, Child2.1, ...],
...
]

Here is the code using Data.List's lazy transpose function.

> lazyBFTBal :: (SearchState a) => a -> [a]
> lazyBFTBal start =
> start:(concat $ transpose $ map lazyBFTBal $ children start)

Hmm... Not sure the best way to proceed here. In all honesty, I find the rest of the combinators paper to be a bit confusing to say the least, but I'll try to fake it. I want to make sure I traverse the tree in order, so I'll preserve the order at each level. The new type signature is .

lazyBFT :: (SearchState a) => a -> [[a]]

and the matrix looks like,

[[[Child1], [Child1.1, Child1.2] ...],
[[Child2], [Child2.1], [Child2.1.1]],
...
]
transposed to
[[[Child1], [Child2] ...],
[[Child1.1, Child1.2], [Child2.1],...],
[ [Child2.1.1],..]
...
]

and then we concat each row,

[[Child1,Child2,...], [Child1.1,Child1.2],[Child2.1.1]]

Score! Back to where we started.

The code looks like,

> lazyBFT start = [start]:(map concat $ transpose $ map lazyBFT $ children start)

If I use it on the problem we started with yields.

*Main> take 3 $ lazyBFT (NumMaze 10)
[[NumMaze 10],[NumMaze 12,NumMaze 5,NumMaze 20],[NumMaze 14,NumMaze 6,NumMaze 24,NumMaze 7,NumMaze 10,NumMaze 22,NumMaze 10,NumMaze 40]]

Now lets add some of the pruning back.

> lazyBFT2 seen start = [start]:(map concat $ transpose $ map (lazyBFT2 seen') $ c)
> where c = [c| c <- children start , c `notMember` seen] > seen' = seen `union` (fromList c)

*Main> take 3 $ lazyBFT2 (singleton $ NumMaze 10) (NumMaze 10)
[[NumMaze 10],[NumMaze 12,NumMaze 5,NumMaze 20],[NumMaze 14,NumMaze 6,NumMaze 24,NumMaze 7,NumMaze 22,NumMaze 40]]



Okay, I'll stop here for now. Pretty cool stuff. I wonder, is there is a lazy way to do this pruning? Maybe next post...

Friday, June 15, 2007

JSON-Static

So I've been getting ready for work by playing with the Facebook web API. I got fed up pretty quickly with the xml interface, so I started using the json. I really like the simplicity of json, particularly for serializing types. I started working on this project in O'Caml, and I was really impressed by Martin Jambon's JSON libraries for O'Caml, json-static . You can just throw in the keyword "json" and it write all the boilerplate code for the serialization. i.e.

type json greeting = Hello | Goodbye

will write code for greeting_of_json and json_of_greeting that do the default conversion, turning Hello to "Hello" and Goodbye to "Goodbye". This is super cool. But, what if you have some weird data type, say

type json weird = {visible: int ; invisible:int}

and you don't want to serialize the invisible type. hmm. Well we can define a special module.

module Weird=
struct
type t = {visible: int ; invisible:int}
let of_json = ...
let to_json = ...
end

This is nice and all, but ... it is a real hack. O'Caml doesn't have this kind of polymorphism and this only works because of camlp4 magic.

However, this is exactly the kind of thing Haskell can do. So in this entry, I'll try to port this idea to Haskell in a neat way. First, we start with a JSON parser. I'm using the one from JSON.hs. This library uses parsec to do the conversion from strings to a JSON data type and from the type back to pretty-printed strings. All we need to do is auto-generate the boilerplate code to convert from the json type to other types.

So let's go. First we define the basic type class.


>module Data.Json where
>import qualified JSON as J

The Json class has two basic functions. toJson converts data types to a JSON data structure.
fromJson tries to convert a JSON type into a Haskell type. The two list functions are defined
so that we can handle the String newtype. The default implementations should handle these functions in the other cases.

>class Json a where
> toJson :: a -> J.Value
> fromJson :: (Monad m) => J.Value -> m a

These follow the list functions from the prelude.

> listToJson :: [a] -> J.Value
> listToJson = J.Array . map toJson
> listFromJson :: (Monad m) => J.Value -> m [a]
> listFromJson (J.Array a) = mapM fromJson a

Now we define a couple instances for the basic data types.

Some are really basic.

>instance Json Double where
> toJson = J.Number
> fromJson (J.Number a) = return a

A String is just a list of characters.

>instance Json Char where
> toJson = J.String . show
> fromJson (J.String a) = return $ head a
> listToJson = J.String
> listFromJson (J.String a) = return a

Lists just call the default list constructors.

>instance Json a => Json [a] where
> toJson = listToJson
> fromJson = listFromJson


This part was the basic stuff. Now we need to derive instances for arbitrary types. It took me forever to figure out the best way to do this. I started with DRiFT, but I found the interface kind of duct-tapey. Next, I tried SYB, which seemed really neat and perfect for this task. Instead of generating code, it allows you to use folds over the types themselves. The one problem with SYB though is that it is really difficult to incorporate type classes. SYB 3 shows you how to do this, but the implementation is not quite there yet.

So I settled on Data.Derive with Template Haskell. Data.Derive provides a bunch of helper methods for writing derive instances for types, and a ton of ways of inserting the derived instances back into the code. So without further ado, here is the code.

{-#OPTIONS_GHC -fth #-}
>module Data.Derive.Json(makeJson ) where

This is the Data.Derive library

>import Language.Haskell.TH.All
>import Language.Haskell.TH.Lib
>import Data.List
>import Control.Monad (liftM)
>import qualified Data.Map as M
>import qualified JSON as JS
>import qualified Data.Json as DJ



>makeJson :: Derivation
>makeJson = derivationQ json "Json"

For each derived element, we define a toJson and fromJson function.

>json :: Dec-> Q [Dec]
>json dat =
> do
> toClauses <- toJson
> fromJ <- fromJson dat
> return $ simple_instance "Json" dat
> [funN "toJson" toClauses,
> funN "fromJson" fromJ]
> where
> toJson = sequence [sclause [ctp ctr 'x'] `liftM` jsonit ctr |
> ctr <-dataCtors dat]

>litQ :: String -> ExpQ
>litQ = return . lit


This is the toJson code generator function. It generates code like this -

data Test = RecordName {field1::String, field2::Integer}

toJson (RecordName x1 x2) =
JS.Object $ M.fromList $ [("field1",toJson x1),("field2",toJson x2)]


>jsonit:: Con -> ExpQ
>jsonit ctr =
> case ctorFields ctr of
> [] -> [|JS.String $(litQ $ ctorName ctr)|]
> fl -> flds fl
> where
> flds :: [String] -> ExpQ
> flds f = [|JS.Object $ M.fromList $
> $( liftM lst $ mapM field $ zip [1..] f)|]
> field :: (Int,String) -> ExpQ
> field (n,f) = [| ( $(litQ f),
> DJ.toJson $(return $ vrn 'x' n)) |]

This code looks a bit confusing, but if you get by the weird Template Haskell syntax, it is pretty straightforward.

The second piece of code fromJson is a bit harder to understand. The first case is when there is a record. JSON.hs parses JS classes as a map from the string of the field to a JSON value. The obvious way to do this would be -

fromJson (J.Object m) = RecordName{ field1 = fromJson $ M.lookup m "field1";
field2 = fromJson $ M.lookup m "field2"}

However, we need to take into account that lookup or fromJson could fail. This means that we need to put it in a monad. I follow the lead of Data.Map and have fromJson return a value wrapped in a generic monad. To construct the record, we use the do syntax.

do
f1<-M.lookup m "field1" >>=fromJson
f2<-M.lookup m "field2" >>=fromJson
return $ RecordName{field1=f1;field2=f2}

I'm not sure how to do this with template haskell, so we'll throw out the syntactic sugar for now.

M.lookup m "field1" >>=fromJson >>=
(\f1-> M.lookup m "field2" >>=fromJson >>=
(\f2 -> return $ RecordName{field1=f1;field2=f2}))

We need a couple of tricks to do this. First, we need unique names for the variables, f1 - fn. We do this by using the Q monad built into Template Haskell. One of the cool things that the Q monad lets us do is create unique variables. So the code -- let f = newName "f" ; do {f1<-f;f2<-f;f3<-f; return (f1,f2,f3)} -- will create three unique names.

The other issue we need to handle is to tie the variable names to the fields. We could just choose arbitary names, but let's find a cooler way to do this. If we were writing a compiler, what would we do? Use CPS! If we use CPS we can mimic the lambdas within code inside the Template Haskell. This way we can pass around an environment of bound variables.

>type Env = [(String,Name)]
>type Conti = Env -> ExpQ

>fromJson :: Dec -> Q [Clause]
>fromJson dat = sequence $
> case dataCtors dat of
> [ctr] ->
> [clause [return $ ConP ('JS.Object) [VarP m]] (normalB $ t sd) []] -- fromJson (JS.Object m) =
> where
> m = mkName "m"

{-M.lookup "str" m >>= DJ.fromJson >>= (\ f1. {cont})-}

> t :: Conti -> ExpQ
> t = foldl (.) (\c->c []) $
> map bindOne $ ctorFields ctr
> where
> bindOne str cont env = do
> var <- newName "f"
> [|M.lookup $(litQ str) $(varE m) >>= DJ.fromJson >>=
> $(lamE [varP var] $ (cont ((str,var):env))) |]

{-Construct the record from the environment.
return $ RecordName{field1=f1;field2=f2}-}

> sd a = [| return $(rec a) |]
> where
> pair (s,n) = return (mkName s,VarE n)
> rec = recConE ( mkName $ ctorName ctr) . map pair

The second case just reads in zero argument constructors as strings. Nothing exciting here.

> ctrs ->
> map (\s -> clause [return $ ConP ('JS.String) [LitP $ StringL s]]
> (normalB [|return $(conE $ mkName s)|]) []) $
> map ctorName ctrs

Anyway. I thought this was pretty cool. It is amazing what you can do with a little bit of template code. Gets you out of writing a ton of annoying processing code.

Friday, May 25, 2007

JGraph in Haskell

Okay, so I was playing around with Jgraph to draw graphs for my thesis, and I really liked its simple interface and the clarity of the graphs it produced. I first used the program in Norman Ramsey's Programming languages class, so I have always associated it with programming languages. Anyway, I've been learning Haskell, so I thought what the heck, I'll try to use Parsec to write a stripped down version of JGraph in Haskell, and get some monad/literate programming practice. Here it is.

>import Text.ParserCombinators.Parsec hiding (label)
>import Text.ParserCombinators.Parsec.Char
>import Text.ParserCombinators.Parsec.Token hiding (lexeme, symbol,natural,reserved)
>import qualified Text.ParserCombinators.Parsec.Token as P
>import Text.ParserCombinators.Parsec.Language (emptyDef)
>import Text.ParserCombinators.Parsec.Error
>import Graphics.HGL hiding (Point,char)
>import Graphics.HGL.Draw.Picture
>import Graphics.HGL.Draw.Text
>import Control.Monad
>import Data.Either

-------------
The Type

Our topmost type is a graph. It has a title, two axes, and many curves.

>data Graph = Graph {title :: String,xaxis::Axis,yaxis::Axis,curves::[Curve]} deriving Show

For now an axis is just labeled with a name. In the future, the axis will have a size.

>data Axis = Axis {axisLab:: String} deriving Show

A curve is a collection of points. Connected tells us if we should draw the connecting line.

>data Curve = Curve { pts::[Point], connected::Bool} deriving Show
>type Point = (Int,Int)

-------------
The Lexer

Parsec does lexing and parsing simultaneously, but we still need to define the lexer. Our lexer just needs to throw away empty space and eat reserved words.

>lexer :: TokenParser ()
>lexer = makeTokenParser
> (emptyDef
> { reservedNames = ["newgraph","newcurve","label","title","connected"]})

These are just shortcuts to this lexer. You can think of lexer like an object. makeTokenParser is like new lexer() and P.natural lexer is like the method lexer.natural.

>lexeme = P.lexeme lexer
>natural =P.natural lexer
>symbol = P.symbol lexer
>reserved = P.reserved lexer
>stringLit = P.stringLiteral lexer

This one is a little more interesting. We eat a string and convert it to a boolean value.

>boolLit = do
> t<-stringLit > case t of
> "true"->return True
> "false"->return False
> _ -> fail "Not boolean"

------------
The Parser

Our grammar in BNF

::= "newgraph" ("title" STRING)? *
::= "xaxis" ("label" STRING)?
::= "yaxis" ("label" STRING)?
::= "newcurve" ("connected" BOOL)? +
::= INT INT

Looks just like our types! Cool. First we introduce two helper functions-
The first just eats a reserved word and continues.

>label :: String -> Parser a -> Parser a
>label s p = do {reserved s; p}

Second one tries a parse, and if it fails, returns a default value.

>def :: a -> Parser a -> Parser a
>def d par = do {par <|> return d}

We start our parsing with points. The Point parser just reads in two natural numbers and returns a point.

::= INT INT

>point :: Parser Point
>point = do
> a<-natural > b<-natural > return (fromInteger a, fromInteger b)

The rest of the parsers are remarkably similar. They all first read in their label, and then their subparts. The function many1 is equivalent to our + notation above, it means read at least one of the items. Similarly The function many is equivalent to *. Finally the function def is our ? operator, it means try to read one, if it fails just return the default value.

::= "newcurve" ("connected" BOOL)? +

>curve :: Parser Curve
>curve = label "newcurve" $ do
> connected <- def False $ label "connected" boolLit > pts<-many1 point > return $ Curve {pts= pts,connected=connected}

::= "yaxis" ("label" STRING)?

>axis :: String -> Parser Axis
>axis s = label s $ do
> lab <- def "" $ label "label" stringLit > return $ Axis lab

::= "newgraph" ("title" STRING)? *

>graph :: Parser Graph
>graph = label "newgraph" $ do
> title <- def "" $ label "title" stringLit > xaxis <- axis "xaxis" > yaxis <- axis "yaxis" > curves <- many curve > return $ Graph{title = title,xaxis=xaxis,yaxis=yaxis,curves=curves}

It is pretty neat how the BNF description and the Parser line up so nicely. This isn't always the case, but we had a prtty simple grammar to parse.

--------------
The Graphics

In JGraph we would draw out to a postscript file, but I'm not sure how to do that yet, so for now we just use HGL.
The program opens a window, draws the graph with the labels of the axes, a title, plots points, and possibly connects them .

Start with some constants.

>size = 300
>midH = size `div` 2
>midV = size `div` 2
>bot = size
>right = size

This is just boilerplate, it creates a window and displays a graphic in the window.

>createW :: Graphic -> IO()
>createW p = withWindow_ "HGraph" (right,bot) $ \w -> do
> drawInWindow w p
> getKey w

Our Drawing functions will look just like the parsing functions. We traverse the Graph structure drawing each part.
First we draw a point.

>drawPoint :: Point -> Graphic
>drawPoint (x,y)= ellipse (x-1,y-1) (x+1,y+1)

Now we draw a curve. A curve is just a series of points. transPts moves the pts from the graph position to the screen position, and then the first line draws them on the screen.

The second line connects the points together. We do this by zipping the points to the points shifted by one. i.e. if our points were [(1,2),(7,3),(4,6)] the code finds [(1,2),(7,3)] and [(7,3),(4,6)] and then runs line (1,2) (7,3), line (7,3) (4,6). I like this trick.

>drawCurve :: Curve -> Graphic
>drawCurve cur = do
> overGraphics $ map drawPoint transPts
> if connected cur then overGraphics $ zipWith line (init transPts) (tail transPts)
> else emptyGraphic
> where transPts = map (\(x,y)->(midH + x,midV - y)) (pts cur)

You might have noticed that we are taking
advantage of the fact that Graphic is a Monad (Yay!) which means we can draw in an imperative style.
For instance do {line (0,0) (1,1);line (0,1) (1,0)} will draw a cross. This is all that is going on an our last function . This just draws a bunch of things on the screen.


>drawGraph :: Graph -> Graphic
>drawGraph graph = do
> setTextAlignment (Center,Top)
> text (midH, 0) $ title graph
> axes
> overGraphics $ map drawCurve (curves graph)
> where axes = do line (midH,0) (midH,bot)
> text (midH,midV-midV `div` 2) $ axisLab $ yaxis graph
> line (0,midV) (right,midV)
> text (midH+midH `div` 2,midV) $ axisLab $ xaxis graph
>

--------------
Main

>main :: IO()
>main = do
> test <-readFile "test" > case parse graph "JGraph" test of
> Left v-> ioError $ userError $ show v
> Right g -> runGraphics $ createW $ drawGraph g


Here is a sample graph.

newgraph title "My Graph" xaxis label "blah2" yaxis label "blah" newcurve connected "true" 0 0 4 28 100 100 title "hello"