## Sunday, May 05, 2013

### Static analysis of multi-staged programs via unstaging translation

Static analysis of multi-staged programs via unstaging translation
W. Choi, B. Aktemur, K. Yi, and M. Tatsuta
POPL 2011

Staged programs are programs that construct and run other programs (written using the same syntax).  For example, Lisp's quote, antiquote and eval operations are well-known examples of .  In the typed programming world, MetaML and MetaOCaml may be the best-known language designs for staged computation.  Work on modal type systems (dating to Davies and Pfenning) established that one can view the modal $\Box$ type as a "code" type, with $run : \Box A \to A$.  Subsequent work has looked at modal types indexed by contexts (e.g. Contextual Modal Type Theory or Taha and Nielsen's environment classifiers).

This paper looks at the problem of static analysis for staged computation.  This is challenging because the values of code types are program fragments that may not be known until runtime, hence, it is difficult (or at least requires significant effort) to adapt classic static analysis techniques.  The authors argue that there has been no significant work on static analyses that are more detailed than type systems (and I don't know of any either).

The authors propose a solution based on translation.  A staged program (in a monomorphic $\lambda$-calculus with $box$, $unbox$ and $run$ operations) is translated to a conventional program in a $\lambda$-calculus with records.  A box type $\Box(\Gamma \vdash \tau)$ is translated to a function type $\tau_\Gamma \to \tau'$, where $\tau_\Gamma$ is a record with one field corresponding to each variable in $\Gamma$.  Boxing and unboxing are translated to function abstraction and application respectively.  Importantly, to preserve evaluation order, unboxing operations are "hoisted" outside of the $\lambda$-abstractions introduced by boxing.

Given a staged program, the translation yields a conventional unstaged program that can be analyzed using conventional techniques.  The results of the translation can be converted back to the staged setting by an inverse translation; the authors carefully define the forward translation to make this reverse translation well-defined.  There are some additional subtleties associated with projecting the results of unstaged static analysis which the paper captures by a safety condition.  An associated technical report (which I haven't read) deals with extensions to handle references (this seems fairly straightforward, though.)

I recommend this paper to anyone interested in staged programming.  It may have applications beyond the static analysis setting the authors consider: for example, in our recent work on LINQ in F# we note that a similar translation is useful for simulating open code quotations using closed code quotations (only the latter are available in F#).

Stray observations:

• It would be interesting to revisit this translation in the context of Rhiger's system (ESOP 2012)
• The authors seem to consider it impossible to develop conventional static analysis or type and effect systems that can do something meaningful with staged computation.  I agree that this seems challenging but the word impossible seems too strong.
• Meta-programming also often involves decomposition of code (e.g. pattern matching against code fragments).  It is non-obvious whether this approach can be extended to handle this feature (but doing this in a type-safe and expressive way still seems to be a significant challenge).
• Formalizing the translation also seems like a nontrivial challenge.