Compiler project: Design, first phase

First target choice

I choose to have as first target .Net’s virtual machine because it is — at least theoretically — made to support various paradigms, it already has support for:

There also is a package on Hackage to write and manipulate code in .Net’s intermediate language. Ah, and it supports unsigned integral types unlike, for example, the JVM.

Type mappings

Haskell’s values will be mapped to Lazy<T> values with the exception of functions values who will be mapped to Func<T, TResult> values and tuples who will be mapped to Tuple<T> values. It should be noted that because of .Net’s Func<T, TResult> delegate implementation, it can’t represent functions of more than 16 variables, so the compiler should automatically curry functions of 16 variables or more when they are used as values. .Net’s tuples going only up to 8-tuples, the last element of a 8-tuple will be used to store a tuple containing the rest of the tuple, that is a one-tuple for 8-tuples, a pair for 9-tuples, a triple for 10-tuples, et cætera.

N.B. : The limitation to 16 variables for a function value will probably stay a limitation of the implementation, but to comply with Haskell 98, I must support tuples up to 15 elements.

Haskell’s numeric types will be mapped to .Net’s numeric types, e.g. Integer will be mapped to BigInteger. Arrays will be mapped to .Net’s arrays, or to a class backed by a .Net array. Character handling will most certainly be a tall order: I would like to have a string type backed by .Net’s native string type but it’s a UTF-16 encoded string type giving only access to enumerators on, length in and indexation of 16-bit entities. StringInfo is better in this aspect, but it gives access to the .Net concept of a text element, something akin to a grapheme.

I will have to find a mean to export type aliases, but I’ll probably implement newtypes as classes extending the base type and adding nothing to the base type, as well as modifying nothing I don’t need to.

Module mappings

Each Haskell module will be mapped to three things:

  • A namespace, for the scoping role of the module
  • An assembly or a netmodule, for the “hiding” role of the module
  • A set of classes (one for each datatype and most probably one for the functions) for the code

In its scoping role, a module named X.Y will correspond to a namespace named Haskell.X.Y. In its role to hide elements, it will probably correspond to an assembly.

Notes & musings

N.B. : The following paragraph contains open questions for the implementation. All suggestions are welcomed.

I’ll perhaps implement typeclasses with a map linking types with dictionaries, but I will need to decide one thing: do I enter only the type without the Lazy<T> wrapping, only the type with the Lazy<T> wrapping or both ? And if I enter both, should I just enter a generic function forcing the value and calling the version without the Lazy<T> wrapping ? Or, at least, should I just do that for user-defined types ? And if I do that, shouldn’t I offer a option (with a pragma perhaps) to do the inverse, i.e. consider that the default version is the one with the Lazy<T> wrapping and the dictionary I create for the strict type just wrap the value and call the version for the lazy type.

There is another delegate family, Action<T>, for functions that do not return a value. Should I use it for function returning () ? And should I use Func<TResult> for functions that only take a () as parameter ? Of course, this paragraph questions are for the case where a function is used as a function value, but they are nonetheless interesting.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s