Referee(s)' Comments to Author:

Referee: 1

Comments to the Author

SUMMARY

The paper describes a type-and-effect system for a core language equipped with mutable and immutable algebraic data structures and with a mixture of strict and lazy evaluation. The system allows functions to be polymorphic in the mutability of their arguments and results. This feature is intended to remedy the code duplication that is sometimes observed in ML and Haskell programs. Furthermore, the system keeps track of effects, and requires that only pure computations can be suspended. This feature is intended to facilitate reasoning about programs as well as program optimisation.

EVALUATION

** in favor of the paper

The ideas that form the technical foundation of the paper appear sound and simple. As far as I can see, there are two basic features. First, all algebraic data is allocated in regions, and every region is marked as either mutable or immutable. This choice occurs at region creation time; once a region is created, its status remains fixed. Destructive update is permitted only within mutable regions. Second, the system keeps track of effects. Read and write effects are distinguished, and each effect carries the name of the region that is affected. Effect masking is permitted. (There are in fact two kinds of effect masking: the traditional kind, where any effect on any region r can be masked outside of the "letregion r" construct; and a new kind, where a read effect on an immutable region r can be masked at any time.) The system uses effects to enforce that only pure computations can be suspended.

The above two features are essentially orthogonal. As far as I can see, the only interaction between the two lies in the idea that a read effect on a region r can be masked (that is, considered pure) if and only if this region is immutable. Thus, the second feature takes advantage of the presence of the first feature. This seems a natural and interesting idea.

The paper presents a core language, which is quite verbose, but one expects that a reasonably lightweight surface language could be designed by using constraint-based type inference in the style of Haskell. This may have been done in separate work by the author, which I have not looked up.

** against the paper

Against the paper, I have two main criticisms, plus some minor issues.

* First criticism. The results that are formally established are, I believe, insufficient. The author establishes type soundness, in the form of subject reduction and progress statements. So, "well-typed programs do not crash". This is fine, but somewhat trivial: this result would still hold if the system were simplified by removing the requirement that suspended computations must be pure, or further simplified by no longer keeping track of effects, or even further simplified by viewing all regions as mutable. To put this another way, although the abstract claims that "only computations without visible side effects are suspended", there is no formal version of this statement, and (as a consequence) no formal proof that this property of the system can be exploited to enable interesting program transformations. Section 5.1 shows that the statement is actually false, or that only a certain interpretation of it is true: because the system does not keep track of allocation effects, a computation that allocates fresh mutable memory is considered pure; so it is not the case that pure computations can be shared without altering the program's semantics.
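To make this last point concrete, here is a minimal OCaml sketch (my own illustration, not the paper's language) of why a computation that merely allocates fresh mutable memory cannot be shared without changing the observable result:

    (* "fresh" neither reads nor writes pre-existing state, so a system with no
       allocation effects would deem it pure; yet sharing its result is observable. *)
    let fresh () = ref 0

    let unshared =
      let a = fresh () in
      let b = fresh () in
      a := 1; !b                  (* evaluates to 0: a and b are distinct cells *)

    let shared =
      let a = fresh () in
      let b = a in                (* the two "pure" computations shared *)
      a := 1; !b                  (* evaluates to 1: the observable result changes *)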
Sections 5 and 6, which study program transformations, are informal: no semantics preservation property is proved. To put this another way, I would like a formal answer to the questions: "why not view every region as mutable and forget about effects? what is the benefit of marking certain regions immutable? what is the benefit of keeping track of effects?" Apparently the answer is that this enables more program transformations, but this remains informal in the present paper.

* Second criticism. The author carefully avoids the issue of the interaction between polymorphism and mutable state. He side-steps the issue by adopting the convention that type abstractions suspend evaluation. Although this is technically correct (and the discussion at the end of Section 4.6 is accurate), it is not satisfactory. The programming languages in the ML family traditionally have a true type-erasure semantics, where type abstractions do not suspend evaluation. The paper does not explain how this property could be maintained in the presence of mutability polymorphism.

Roughly speaking, in an ML-like language, one must be careful not to generalize the type of an expression that allocates fresh mutable memory. There are two traditional ways of solving this problem. One (a) is to adopt the value restriction: consider that every expression may allocate fresh mutable memory, unless it is a value; so, generalize only values. The other (b) is to use a more refined type-and-effect system: keep track of allocation effects, and generalize only expressions that do not allocate fresh mutable memory. It is not clear to me how to adapt either of these traditional solutions to the setting of the present paper.

Consider approach (a). Is an application of the data constructor "cons" to values a value? In an ML-like language, the answer would be positive if this data constructor has immutable fields, and negative if it has at least one mutable field. Here, however, we have mutability polymorphism, so the answer would be positive if the region that this data constructor inhabits is immutable, and negative if it is mutable. Because this region is in general a region variable, there is no fixed answer. Approach (b) presents a similar problem. Assume an expression has an allocation effect in region r. Can this expression be generalized? The answer is positive if r is immutable (an allocation effect in an immutable region can be masked), and negative if r is mutable. Again, there is no fixed answer.

An example may best summarize the problem. Consider the expression "Cons (lambda x.x) Nil", that is, a singleton list of the identity function. What should be the type of this list? If it is an immutable list, then its type can be generalized, so one type could be "forall r a, Const r => List r (a -> a)". On the other hand, if we do not generalize, then we have no constraint on r, so one type could be "List r (t -> t)", where r is any fixed region and t is any fixed type. These two type schemes are incomparable, and it is not obvious how to find a single principal type scheme that subsumes these two. I believe that this issue may be the main reason why mutability polymorphism was never introduced in ML. It would be very nice if the author would face this problem and propose a true type-erasure semantics.

* Minor criticism. The paper puts much emphasis on explicit witnesses, or proof terms, for propositions. It is not clear to me why it is important or useful to construct explicit witnesses.
The propositions involved seem simple enough that entailment of propositions is decidable, so proof terms could be reconstructed by the type-checker of the core language instead of incorporated as part of type derivations. Such an approach, it seems, would make the presentation of the core language significantly more lightweight. I would like this issue to be clarified.

* Minor criticism. I believe there might be technical problems in the formulation of some of the typing rules, but this can probably be fixed; detailed comments appear further on.

** bottom line

I have mixed feelings about this paper. I like the simplicity of the basic ideas, but, as explained above, I am disappointed to find that (i) the guarantees offered by the system are not fully spelled out and (ii) the system does not have a true type-erasure semantics (type abstractions suspend evaluation), so it would not be suitable for adoption in an ML-like language. I believe I would be willing to accept the lack of an answer to issue (ii), as the problem is challenging and the author could perhaps attempt to justify why it might be acceptable for type abstractions to suspend evaluation. On the other hand, the paper really feels incomplete without an answer to (i). The paper is about "program optimisation", but never delivers a formal guarantee that types can help optimise programs. In summary, I would like to recommend a major revision of the paper, containing a formal statement of the claim that "only computations without visible side effects are suspended" and an application of this statement to formally establish the correctness of at least one of the program transformations presented in the paper. I would also like issue (ii) to be addressed or at least discussed.

DETAILED COMMENTS

page 2, "only pure function applications should be suspended": why is this important or desirable?

page 2, "in ML, references and arrays [...] are the only ones capable of being destructively updated". In OCaml, there are also strings, which technically are not considered arrays. The fact that OCaml strings are mutable is often a problem: for instance, the compiler cannot hoist string constants out of expressions! One would like to have both mutable and immutable strings, and one would like some functions (such as String.length) to be polymorphic with regard to this. Arrays pose an analogous problem. The problem with algebraic data structures is perhaps somewhat less acute: in practice, there does not seem to be much duplication between (say) a library for purely functional trees and a library for imperative trees.

page 3, "we give integers the type Int r, where r is a region variable, and constrain r to be mutable or constant as needed". A more naive or more natural approach might be to give integers the type "Int r", where r is one of "mutable" or "constant". This approach would lead to more compact types and to a purely unification-based type inference algorithm. Would it work? Would it be less expressive than your approach? It seems that it would amount to working with just two regions, a mutable one and a constant one. It would not allow effect masking, so it would classify fewer expressions as pure. Is that why you reject this approach?

page 5, the constant 5 has type "forall r, Int r". Should we interpret this as a function that waits for a parameter r and allocates a boxed integer 5?
My impression (after reading the entire paper) is that the answer is "yes", but it was not initially clear, since it was not explained up front that type abstractions suspend evaluation. By the way, one of the motivations of the paper is program optimisation; however, the introduction of mutability polymorphism means that integers must be boxed, so there is a price to be paid up front. Of course, as demonstrated in Section 6, one may hope to be able to unbox integers in many situations, but this requires work.

Section 2.1, it seems that a bare arrow stands for an arrow with effect bottom; please mention this. Is updateInt a primitive function? (I believe it is.)

page 4, the discussion in Section 2.2 is somewhat unclear to me. I understand that in Haskell, the conjunction of "Mutable r" and "Const r" does not lead to an immediate type error. My understanding is that in Disciple, it does lead to a type error. Is this correct?

page 4, the syntax used in the definition of the type List is quite surprising, as it seems that the parameter r is unused. At this point, you have not introduced the notion of a "primary region variable". Of course the reader may suspect that something like this is going on behind the scenes; nevertheless the syntax of the definition is disturbing. I would suggest changing the syntax to:

  data List r a = Nil@r | Cons@r a (List r a)

This is of course more verbose, but has the merit of explicitly showing that r is used as the region where Nil and Cons are allocated. This syntax also offers extra flexibility, in that it now becomes possible for Nil and Cons to inhabit different regions (so, if needed, one could be considered mutable and the other immutable).

page 4, "if the list is not a cons, then a run-time error is raised". This is admittedly mostly a matter of taste, but it would be nice if the system did not permit any kind of run-time error. This could be achieved in several ways. One simple way would be to follow OCaml's convention that only records can have mutable fields. (In your algebraic data type formulation, update would be permitted only if there is just one data constructor.) Another, perhaps more satisfactory approach would be to enrich type environments with a distinction between mutable and immutable bindings, in the style of Caml Light, where matching against the pattern "C x" would make "x" a mutable variable if the data constructor "C" was declared to carry a mutable argument. In your system, this might involve populating the type environment with bindings of the form "x : T@r", "x has type T and inhabits region r", and allowing an update to the variable x provided the constraint "Mutable r" holds. I believe this would allow programmers to express updates in a more natural style; plus, the run-time test would be eliminated, and compilation to efficient code would be easier (your current update_Cons,0 operator requires some work to compile efficiently).

page 6, a different design choice would have been to have an explicit "force" operation, as well as a type for suspensions. This would then allow suspending impure computations: the type "lazy a e" would be the type of suspended computations with result type a and effect e. This would be a sound type-and-effect system, I believe. You could still recognize suspended computations that are pure (if and where desired) since they have type "lazy a bottom". Could you explain why you chose not to follow this route?
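To make the alternative concrete, here is a rough OCaml sketch of an explicit suspension type; the phantom parameter 'e stands in for the latent effect, memoization (call-by-need) is omitted, and all names are hypothetical:

    (* A suspension carries its result type 'a and a phantom effect 'e;
       a pure suspension would have type ('a, bot) susp. *)
    type ('a, 'e) susp = Susp of (unit -> 'a)

    let suspend (f : unit -> 'a) : ('a, 'e) susp = Susp f
    let force (Susp f) = f ()

Forcing would then be explicit, and pure suspensions could still be recognized, where desired, by their effect parameter.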
One point that I do understand (I think) is that, as soon as you choose not to have an explicit type for suspensions, then (a) force must be implicit in function applications, case constructs, etc., and (b) one must require that only pure computations be suspended; otherwise the type-and-effect system would be unsound. (By the way, this would be a technical explanation of why you require that only pure computations can be suspended, *if* the type-and-effect system were used for some *other* purpose than making sure that only pure computations can be suspended!)

page 7, on the "phase distinction" between region variables and region handles. In my opinion, this is a sound distinction to make, and I commend you for making it. Region variables are type variables: they have lexical scope, while region handles are freshly allocated at runtime: they have dynamic scope. The distinction is analogous to the by-now-standard distinction between term variables x and memory locations l. Although one could technically insist on confusing region variables and region handles (by viewing the latter as bound by a toplevel letregion), it is clearer to distinguish. A couple of things I do not understand. On page 7, the sentence "region handles are bound by region variables" does not make sense to me. On page 20, the "similarity judgement", which relates region variables and region handles, is mysterious. I see no reason why this is needed: in my experience, region variables are replaced at runtime with region handles, just like term variables are replaced at runtime with (values that contain) memory locations in the standard treatment of ML references. There is usually no "similarity judgement x ~ l"! Perhaps the need for this judgement arises because some typing rules (e.g. letregion) are seemingly non-hygienic? I comment on this further on.

page 7, "This similar to system" -> "This is similar to the system"

page 8, at this point, you have introduced "underlined const", "MkConst", and "Const"; similarly "underlined mutable", "MkMutable", and "Mutable". This is noisy and confusing. Later on, you introduce MkPurify, MkPureBot, MkPureJoin, etc. Simple, but still noisy. Figures 6 and 8 are also (in my humble opinion) mostly noise. As suggested earlier in my review, why not remove witnesses entirely and let the type-checker silently reconstruct them where needed? It seems that this might be possible. You would end up with a constraint-based type system, a pretty standard notion. (You would need to define what it means for a constraint to be satisfiable, and this could be done only with respect to a world, a mapping of region handles to either "mutable" or "immutable". Entailment of constraints would involve a universal quantification over worlds.) On a related note, it would be nice if you could refer to "propositions" and "proofs" instead of "(dependent) kinds" and "types". The terminology would be much less intimidating! I understand that you refer to witnesses as "types" because they do not exist at runtime; but you could also call them "proofs" and indicate that proofs are erased at runtime.

page 9, if the system had a notion of subtyping, then the constraint "Pure e" could be replaced with "e <= bottom". I understand that you do not wish to have subtyping, so perhaps you could just have a construct to convert a pure effect e into bottom. Ah, it later turns out (page 11) that you do have such a construct, "mask". But then, the type of suspend could be simplified to (a -> b) -> (a -> b), could it not?
(Assuming that a bare arrow means an arrow with effect bottom.)

page 9, last sentence, missing dot.

page 10, the example of lazyMap seems more complex than it should be. Since the effect e is constrained to be pure, it could be replaced with bottom. Furthermore, since r1 is constrained to be constant, the read effect Read r1 could be ignored. I understand that the point here is to illustrate the conjunction of effects, but perhaps it would be better to pick an example where no simplifications are possible.

page 10, "guarentees" (use ispell)

page 11, "super-kinds", wow! More terminology escalation. Could you view "propositions" and "kinds" as two distinct syntactic categories, and thereby avoid the need for "super-kinds" altogether?

page 11, "We use use"

figure 1, in the definition of Sigma (store typing), I had difficulty reading the first production "l: tau, rho". I was troubled by the comma. Technically the comma is already used as a separator in the production "Sigma1, Sigma2", so it might be better not to overload it. How about "l: tau@rho"?

page 13, "[in TyLetRegion], we have included the r notin fv(tau) premise anyway because it makes the proof of soundness easier". I am surprised! In my opinion the premise does not just make things "easier", it is required for hygiene. Without the premise, the type tau would be allowed to contain a free region variable r, which would not make sense -- the type tau would be ill-kinded with respect to the environments Gamma and Sigma. Why take the trouble to define well-kindedness if you allow ill-kinded types to appear in judgements? This leads me to ask why you do *not* require the premise "r notin fv(sigma)". Either that, or replace "sigma" in the conclusion with something along the lines of "sigma \ r". Otherwise, again, the effect "sigma" is not well-kinded in the context of the conclusion. This does not make sense to me.

figure 4, there is something I do not understand about TyAbsT and TyAppT. First, since type abstractions suspend evaluation, it seems to me that they should have a latent effect; so the universal type should not take the form "forall a, tau" but "forall a, tau/e". I do not see this in the syntax of types. In TyAbsT, the type variable a may occur in sigma2, it seems, and so it may occur free in the conclusion; again this is unhygienic. Second, in the conclusion of TyAppT, the substitution of phi2 for a is applied not only to the type phi12 (as it should) but also to the effect sigma1. This does not seem to make sense, since the scope of the type variable in the premise did not extend all the way to sigma1 (did it?). Did I misunderstand? Please clarify.

figure 4, rule TyData requires the type phi to be well-kinded, but there is no such requirement for phi'. Yet the other premises do not seem to automatically guarantee that phi' is well-kinded.

figure 5, there seems to be some duplication between the store typing Sigma and the store H: both associate region handles to mutability flags. Later on, this forces you to require Sigma and H to be in agreement. Is there any way of avoiding this duplication and dropping the agreement hypotheses?

page 16, "the second premise of EvLetRegion is always true": if it is literally "always true", why is it there? Surely you meant something slightly different. By the way, the first two premises of EvLetRegion appear somewhat circular, in that the first premise seems to define H' in terms of the Delta_i, while the second premise seems to indicate how to obtain the Delta_i when H' is known.
This is perhaps not technically a problem (these are just equality hypotheses, not definitions), but it is somewhat disturbing.

figure 7, do I read EvSuspendApp correctly? It seems to me that reduction is non-deterministic and that a suspended application can spontaneously beta-reduce at any time. It also seems to me that suspended computations are viewed as values, hence duplicated; they are not shared. In other words, this is not a call-by-need semantics. Would you be able to show that (for well-typed terms only) it is equivalent to a call-by-need semantics?

page 17, "kind expressions can contain types, and hence store locations as well". I do not see how it is possible for a type to contain a store location.

page 19, back to the topic of hygiene for region variables: "if we were to permit this example to be well-typed then we would need to weaken our Preservation theorem"... Why on earth would you do this? Don't. You will keep a simple statement of Preservation and will lose nothing. If you do wish to adapt this example so that it can be well-typed, then I suggest that existential quantification is the tool you are looking for. The sub-expression "succ r1 r1 (23 r1)" has type "Int r1", so it also has type "exists r, Int r". This new type does not mention r1, so it is fine for it to occur in the conclusion of a hygienic TyLetRegion rule. At runtime, the witness for r is a dynamic region handle, which does escape the scope of the "letregion" construct where it was allocated -- but that is fine, since region handles are not lexically scoped.

page 20, I am surprised to see a new typing rule added after type soundness has been stated. Is it sound? Why did you find it difficult to include this rule in the soundness proof? Is it useful? Why did you find it necessary to include this rule in the implementation? More specifically, here are some comments about rule TyObs:

* I observe the following fact: give me a type derivation that uses TyObs; then I can insert a "letregion r" construct just below every use of TyObs, and I obtain a new type derivation, which is still well-formed, since r does not occur free in the conclusion of TyObs.

* thus, TyObs is no more expressive than a hygienic version of TyLetRegion with built-in effect masking; in other words, I suggest removing TyObs and replacing "sigma" with "sigma \ (Read r \/ Write r)" in the conclusion of TyLetRegion; you will obtain a hygienic version of TyLetRegion and remove the problematic TyObs, without any loss of expressiveness.

* the rule TyObs, as it stands, seems useless, because the requirement that r must not appear in Gamma implies that nothing can be known about r; in particular, it cannot be known that r is mutable, or that r is immutable. A combined TyLetRegion/TyObs rule would not have this problem.

page 20, the informal reasoning appears circular. You are trying to justify why it is sound to use TyObs to prove that a certain computation is pure, but the informal argument exploits the fact that "suspended computations must be pure". In short, these lines ("As r does not appear [...] has finished evaluating") do not make much sense to me. Better to incorporate some version of the rule in the formal soundness proof.

page 20, "it does not guarantee that optimising transforms based on masked effect information preserve the meaning of the original program". Where is the emphasis? Do you mean that this property is in fact true, but not guaranteed (because not yet formally proved), or that it is in fact false?
At this point of the paper I thought the former (after all, optimisation is the motivation of the paper). But upon reading Section 5.1 I found that it really seems to be the latter. That is a disappointment!

page 22, so, expressions with an allocation effect in a mutable memory region are not deterministic, yet your system views them as pure. This seems to be a shortcoming of your system. You argue "we do not actually need allocation effects", but I do not understand your one-line argument; if you wish to pursue this argument, please clarify it and make it a formal result. I would suggest instead introducing explicit allocation effects (the theoretical cost seems negligible; the practical cost might be some extra verbosity), together with the rule that an allocation effect on an immutable region can be masked. Then prove a formal result about your system -- computations which the system deems "pure" are actually pure in a certain sense (deterministic?) and can be shared, hoisted, etc.

page 24, "Doing this exposes [...]": missing a colon at the end of this sentence.

page 25, "One potential disadvantage [...]". A possibility might be to explicitly mask the effect Read r1 in the transformed code. Then the witness w1 would still be in use. I don't know if this would help much.

Section 6.1, it is conventional wisdom in the separation logic community that individual witnesses of distinctness are not very convenient, because their number grows quadratically with the number of regions involved. One might wish to adopt some ideas from separation logic and use constraints such as (say) "Mutable r1 * Mutable r2", which (by definition of the star) implies that r1 and r2 are distinct regions. This might take us quite far away from the current system, though. The ideas in Section 6.1 can be traced back at least to the "calculus of capabilities" and "alias types" papers (which are properly acknowledged), so I do not see anything really original here.

page 28, "one of the current limitations [...] is that the results of case alternatives must have the same type". This is reasonable! It is actually so in all type systems that I know of, including systems of dependent types, etc. The key is not to relax this rule but to enrich the type algebra so that the results of case alternatives can be *made* to have the same type. Here, in particular, existential quantification over region variables would be very useful. It would allow you to "choose between a mutable and a constant integer", since both have type "exists r, Int r". You don't need "a notion of subtyping", only an introduction rule for existential quantification (which admittedly can be viewed as a form of subtyping, if desired). Note that existential quantification over regions is a key ingredient in the system of "alias types" and its descendants.

page 28, last paragraph. One may wonder whether it would be possible to extend your system with support for changing mutable data into immutable data. A classic motivating example would be a tail-recursive version of list map in destination-passing style (sketched below): the new list is mutable while it is constructed, but becomes immutable once returned to the client. Of course, converting mutable data to immutable data is unsound if there is sharing. One expects that linearity might help: it should be sound to convert mutable data to immutable data if it is not shared. If "Mutable r" is viewed as an ownership assertion over the region (as in separation logic), then it could be safely turned into "Immutable r".
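To make that example concrete, here is a rough OCaml sketch (my own, with the mutable spine modelled by a ref in the tail position):

    (* Tail-recursive map in destination-passing style: the spine is extended
       in place through a mutable tail while the result is being built. *)
    type 'a cell = Nil | Cons of 'a * 'a cell ref

    let map f xs =
      let rec go xs dst =
        match xs with
        | Nil -> ()
        | Cons (x, tl) ->
            let next = ref Nil in
            dst := Cons (f x, next);   (* fill the current hole *)
            go !tl next                (* tail call into the new hole *)
      in
      let hole = ref Nil in
      go xs hole;
      !hole

The moment the completed list is handed back to the client is exactly where one would like to convert "Mutable r" into "Immutable r".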
Such a mechanism would be quite flexible; in particular, there would be no need to delimit a lexical scope where initialisation is performed; the decision to convert "Mutable r" into "Immutable r" could be taken at any point in the lifetime of a data structure.

----------------------------------------------------------------------

Referee: 2

Comments to the Author

Witnessing Mutability, Purity and Aliasing for Program Optimization

This paper describes a language in which the fields of any constructed data value may, a priori, be mutated. Fields are associated with regions and regions may be constrained to be constant or mutable. Constraints are realized by capabilities/witnesses that are passed around in the operational semantics, with update operations requiring a witness of mutability for the field to be updated. Functions are annotated with effects, specifying the regions that may be read or written, and can be polymorphic in types, regions and effects. Further refinements include witnesses that particular effects can be considered pure (reading from a constant region), selective suspension (only applications of *pure* functions may be suspended) and witnesses for disjointness of regions.

There is a great deal of previous work on effects, capabilities and region-based type systems, so the basic ideas used here are all fairly familiar. The main distinguishing features of this work seem to be the proposal to work with pervasive mutability in a high-level, higher-order language, the emphasis on witness passing and construction (presented using some not-entirely-trivial kinding), and the explicit goal of justifying interesting program transformations, rather than just, say, safe memory management. Interesting though this subject is, I do not think this paper reaches the standard of JFP. My two main criticisms are that the language design is rather unconvincing and that the formal development is extremely weak.

The initial motivation given is that languages like ML separate mutable reference types from value types. As a high-level programmer, I have always regarded this as a - rather major - feature, not a bug! Despite writing a great deal of ML code using both purely functional and quite heavily imperative data representations, I have rarely wished for polymorphism in the mutability of data. I can already make lists of reference cells, for example (see the sketch after this paragraph), and "deciding that we need an operation to destructively insert an element into the middle of a list" is a radical change of data representation, suddenly introducing identity and mutability where there was none before - I'd expect to have to make big changes elsewhere, replacing copying with explicit cloning and so on. I wouldn't expect much code to be usefully polymorphic in mutability, and this seems to be borne out by the examples in the present paper, most of which are predicated on particular assumptions about mutability and constancy and do a lot of work that's specific to those assumptions; not only being careful about copying things, but also passing around and constructing an enormous number of regions and witnesses. Hence one writes different code for different combinations of constraints, which is more straightforward in existing languages.
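As a small OCaml illustration of the point about reference cells (mine, not from the paper):

    (* Element-level mutability is already available by making a list of
       reference cells, without any polymorphism in the mutability of the
       list constructors themselves. *)
    let xs : int ref list = [ ref 1; ref 2; ref 3 ]
    let () = List.iter (fun r -> r := !r + 1) xs   (* update the elements in place *)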
The language seems likely to be inefficient (by default, even "5" has to be compiled as a fresh allocation) and isn't especially expressive about what can be updated (only fields of immutable constructors, so in practice one probably ends up using another unary datatype corresponding to ref to allow mutation on the outside). The heavily-kinded presentation of witnesses is rather baroque and clumsy compared with (a) just using separate syntactic categories for things and (b) using a bit more logic - having all this general-purpose machinery of kinds sits uneasily with the odd syntactic restrictions on uniqueness of region identifiers in the initial program, constructions that can only appear in certain places in the program, and so on.

The only theorem is a preservation and progress result, which does not directly establish a comprehensible property of programs; it merely says something about the extra witnesses that get passed around. There's no *immediately* obvious reason to believe that the presence or absence of these (fairly complex) witnesses implies something useful. Furthermore, the most interesting aspect of effect systems (going all the way back to FX) is masking - rules that allow unobservable side effects to be removed from the effect annotations. Although the paper presents a masking rule, that is not included in the system for which preservation and progress have been shown.

Much of the paper presents a number of plausible-looking optimising transformations based on the extended type system, but this is done via examples, rather than by carefully specifying the general pattern of the transformations, and there are no proofs at all that any of these rewrites preserve observational equivalence. That might be acceptable in a more engineering-focused paper on practical compiler implementation (of the sort that contains performance numbers), but for a metatheory paper, it's well behind the state of the art. The author is commendably clear and upfront about the limitations of what has been proved, but I think the theory would need to be properly developed to justify journal publication.

----------------------------------------------------------------------

Referee: 3

Comments to the Author

The paper presents a region-based type theory for polymorphism over mutability, purity and laziness, and alias analysis. It's hard to know what to make of this work. I don't see anything wrong with it, although I caution that the paper contains no proofs, or even the complete statement of the theorem (since some key definitions are relegated to an appendix, which I haven't seen). However, the work's motivation isn't clear, and the language contains a raft of strange design decisions.

In the first place, it's not clear what the role of the language is. At the outset it seems as though the language is for programmers, opening "Suppose we are writing a library that provides a useful data structure". But the language doesn't seem appropriate for programmers; it involves numerous complications to do even simple tasks. At the end of the paper, it gives the impression that the language is intended as an internal language for a compiler. It seems more appropriate for such a role, but then it's not clear why polymorphism over mutability would be valuable. Moreover, the language hides the real-world cost of laziness, so there is a limit to how low-level it can be. It's also not clear why one would use a region-based type system in the first place.
Regions don't seem necessary for the paper's avowed goal of polymorphism over mutability, and they don't play any role in purity. The regions do seem necessary for the alias analysis, but the paper presents the alias material as an afterthought.

The type theory combines eager and lazy evaluation in an unusual manner. Rather than using types to distinguish between eager and lazy terms, the two are combined into the same type. This weakens the invariant provided by the type, so any operation must be able to accept either a value or a suspension. Consequently, an implementation must use call-by-need, which would hurt performance. Or so it would seem, if we assume that the suspend primitive actually creates suspensions. In the operational semantics (EvSuspendApp) it appears that the suspension primitive doesn't suspend at all.

I'm not sure why the author decided to make the predicates (Const, Mutable, Pure) kinds and their witnesses types. The language would have been simpler if propositions and the proofs thereof were their own syntactic classes. The language also would have been simpler if the predicates had taken their arguments syntactically (i.e., Const(r) instead of Const). This would have eliminated the need for kind application and the arrow superkind. In addition, it is never explained why there is a distinction between box and diamond kinds. If that distinction were also eliminated, the entire syntactic class of superkinds could be discarded.

The KiAll rule is very strange. It allows the body of the quantifier to be an arbitrary constructor, not necessarily a type. I have no idea what a non-type quantifier would mean.

To summarize, I think the paper would benefit greatly from a discussion of the rationale behind the type system's design choices.

Some other comments:

On page 3 the paper describes the effects as a lattice. It appears to me to be a join semi-lattice.

The List example on page 4 is confusing. At first glance it appears as though the first argument is irrelevant, but I gather that the first argument is taken to be the region in which the datatype's values are allocated. It would be helpful if the author could devise a notation that is less misleading.

The citation for LF on page 6 is unusual. Usually one would cite the LF paper (Harper, Honsell, Plotkin).

I'm surprised to see this on page 6: "To avoid problems with variable capture we require all bound variables r in the initial program to be unique." I am very suspicious. This trick rarely works and I'd be surprised if it worked here. The problem is that the property is not preserved under evaluation. I also don't see the benefit of the trick, as opposed to the usual strategy of considering alpha-equivalent expressions identical.

The discussion of region deallocation on page 19 is a bit misleading. The reader might get the impression that the problem is with stack-like deallocation, rather than deallocation in general. More to the point, since reads from immutable regions are considered to generate no effect, immutable regions cannot be deallocated.

I think the discussion at the bottom of page 19 of the r \not\in fv(\tau) requirement misses the point. If the requirement were omitted, region variables could escape their scope. Consequently, the type system would fail to enjoy the regularity property: well-formed terms could have ill-formed types. The particular issue cited of a term changing type as it evaluates is merely a symptom of a non-regular type system.

On page 20 the paper says "Although TyObs is valid, we have omitted it ...
because our soundness proof does not cover it." If the soundness proof does not cover it, in what sense is it valid?