2015年9月25日星期五

Haskell的惰性求值可以隐藏不必要的运行时错误

今天在做Multiple Variable Declaration Evaluation时遇到了一个解释不通的现象,百思不得其解,最后经多次尝试,发现是Haskell的惰性求值(lazy evaluation)干的好事。



解释不通的代码是这样的:
data Exp = Num Int
| Add Exp Exp
| Sub Exp Exp
| Mult Exp Exp
| Div Exp Exp
| Var String
| Decl String Exp Exp
| DeclareMulti [(String, Exp)] Exp

type Binding = (String, Int)
type Env = [Binding]

evaluate :: Exp -> Int
evaluate e = eval e [] -- starts with an empty environment
where
eval :: Exp -> Env -> Int
eval (Num x) _ = x
eval (Add e1 e2) env = eval e1 env + eval e2 env
eval (Sub e1 e2) env = eval e1 env - eval e2 env
eval (Mult e1 e2) env = eval e1 env * eval e2 env
eval (Div e1 e2) env = eval e1 env `div` eval e2 env
eval (Var v) env = case lookup v env of
Nothing -> error ("Not in scope: " ++ v)
Just x -> x
eval (Decl v e1 e2) env = eval e2 ((v, eval e1 env):env) -- bindings in the front of the list will shadow the behind ones if they have the same name.
eval (DeclareMulti lst e) env = eval e ((map (\(v, e)->(v, eval e env)) lst) ++ env) -- haskell lazy evaluation!

按照我的预测,当执行evaluate (DeclareMulti [("x", (Num 3)), ("x", (Add (Var "x") (Num 2)))] (Mult (Var "x") (Num 2)))时,应该返回error,因为第二个x=x+2中右端的x还没有被加入env(ironment)中,也就无法运算。但是,真正的结果却是6(即3*2),也就是说第二个declaration中的错误并没有影响到整句的执行并输出error。而如果将这个语句做一个小小的改动,变成evaluate (DeclareMulti [("x", (Add (Var "x") (Num 2))), ("x", (Num 3))] (Mult (Var "x") (Num 2))),即对调两个declaration的位置,就不能正常得到结果,而是返回了我期望的error。

百思不得其解的我尝试了以下语句:
*Interp> map (\(v, x)->(v, 9 `div` x)) [("x", 1), ("x", 8)]
[("x",9),("x",1)]
*Interp> map (\(v, x)->(v, 9 `div` x)) [("x", 1), ("x", 0)]
[("x",9),("x",*** Exception: divide by zero
*Interp> (map (\(v, x)->(v, 9 `div` x)) [("x", 1), ("x", 8)])!!0
("x",9)
*Interp> (map (\(v, x)->(v, 9 `div` x)) [("x", 1), ("x", 8)])!!1
("x",1)
*Interp> (map (\(v, x)->(v, 9 `div` x)) [("x", 1), ("x", 8)])!!0
("x",9)
*Interp> (map (\(v, x)->(v, 9 `div` x)) [("x", 1), ("x", 0)])!!0
("x",9)
*Interp> evaluate (DeclareMulti [("x", (Num 3)), ("y", (Add (Var "x") (Num 2)))] (Var "y"))
*** Exception: Not in scope: x
*Interp> evaluate (DeclareMulti [("x", (Num 3)), ("y", (Add (Var "x") (Num 2)))] (Var "x"))
3

终于我明白了问题所在——惰性求值。这是haskell的一个语言特性,它会只计算为了得出最终结果而真正需要的部分,其他部分即使代码中看起来让它算了它也不会算。为了计算(Mult (Var "x") (Num 2)),只需要environment中最前面的x的binding就可以了(因为前面的会shadow后面的,后面的算出来也用不上),所以就只算两个对于"x"的declaration的前面一个,所以将有错的declaration放在后面并不会被执行也不会有error,放在前面就会被执行并且有error。