Last time, we looked at two potential encodings of C's platform dependent type long. Today we'll renew our search, but with the help of opaque types.
Opaque types
Opaque types are a new form of type in Scala 3. They are types that exist only at compile-time, erasing to their underlying type at runtime. Take the following example:
opaque type Meter = Int
object Meter:
  def apply(meters: Int): Meter = meters
This defines an opaque type called Meter that is backed by Int at runtime. Within the scope where an opaque type is defined, it is merely an alias for the type that backs it at runtime. This means that within this file, Meter and Int are the same type, but outside of it, Meter is its own type with no relation to the Int type.
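To make the scoping rule concrete, here is a minimal sketch. The enclosing Units object, the value and plus extensions, and the commented-out line are illustrative assumptions rather than code from this post; wrapping the opaque type in an object simply makes the boundary of its defining scope easy to see.

```scala
object Units:
  opaque type Meter = Int

  object Meter:
    def apply(meters: Int): Meter = meters

  // Inside Units, Meter is just an alias for Int, so Int operations are available.
  extension (m: Meter)
    def value: Int = m
    def plus(other: Meter): Meter = m + other

object UnitsDemo:
  import Units.*

  val total: Meter = Meter(3).plus(Meter(4))
  val asInt: Int = total.value // 7

  // Outside Units the alias is hidden, so this line would not compile:
  // val broken: Int = total
```

Code outside Units can only work with a Meter through the methods that Units chooses to expose.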
Opaque types can be used to enrich basic data types with information. For example, one could define a PositiveInt opaque type, and make it only instantiable after testing that an Int passed in to its apply method is positive:
opaque type PositiveInt = Int
object PositiveInt:
  def apply(int: Int): PositiveInt =
    // zero is not positive, so the check is strict
    if int > 0 then int
    else throw new Error("Not positive!")
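If throwing feels too heavy-handed, the same check can return an Option instead. The following is only a sketch of that alternative: the Numbers wrapper object and the from and toInt names are hypothetical, chosen for illustration.

```scala
object Numbers:
  opaque type PositiveInt = Int

  object PositiveInt:
    // Returns None instead of throwing when the input is zero or negative.
    def from(int: Int): Option[PositiveInt] =
      if int > 0 then Some(int) else None

  extension (p: PositiveInt)
    def toInt: Int = p

object NumbersDemo:
  import Numbers.*

  val ok: Option[Int] = PositiveInt.from(3).map(_.toInt) // Some(3)
  val bad: Option[PositiveInt] = PositiveInt.from(-1)    // None
```

The trade-off is that every caller now has to handle the empty case, which is the same boxing cost that comes up again later with CLong.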
Opaque types are also useful for implementing lightweight new types, acting as an alternative to classes that extend AnyVal in Scala 2. In this case, we want to have types that represent platform dependent types without requiring unboxing to use with MethodHandles generated by java.lang.foreign. Opaque types could be perfect for this because they erase at runtime to their underlying type, and the standard java.lang types work perfectly with foreign's MethodHandles.
CVal
To start with, let's define an opaque type called CVal. This type will act as the base type for our platform dependent types, and will house methods that should be available for all platform dependent types.
opaque type CVal = Matchable
object CVal:
  def apply(a: Matchable): CVal =
    a
In this snippet, CVal is defined as being backed by Matchable because it's the lowest level type that is understandable by the JVM in Scala 3. In Scala 2, Any would've been used, but Any technically includes opaque types and other types that exist outside of the JVM's understanding.
Let's add a method to turn a CVal back into a regular type:
opaque type CVal = Matchable
object CVal:
  def apply(a: Matchable): CVal =
    a
  extension (cval: CVal)
    def as[A <: Matchable]: Option[A] =
      cval match
        case a: A => Some(a)
        case _ => None
When one tries to compile this code, they'll get a compiler warning: "the type test for A cannot be checked at runtime because it refers to an abstract type member or type parameter".
In order to avoid this, one can have the method use a TypeTest, a new type class in Scala 3.
import scala.reflect.TypeTest
opaque type CVal = Matchable
object CVal:
  def apply(a: Matchable): CVal =
    a
  extension (cval: CVal)
    def as[A <: Matchable](using
        TypeTest[Matchable, A]
    ): Option[A] =
      cval match
        case a: A => Some(a)
        case _ => None
Now the as method definition compiles without warnings. This is because a TypeTest carries the information necessary to determine at runtime whether a Matchable is actually an A; the compiler supplies the instance at compile time. However, TypeTest can be an inefficient way to test this, as it creates and tosses away an Option in the process of matching. We may toss it aside for .isInstanceOf and .asInstanceOf later on, but at least we have the ability to check this without using those methods.
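To see where that Option comes from, it helps to call the TypeTest directly. This is a small standalone sketch (the narrow helper is a hypothetical name, not part of CVal): a type pattern such as case a: A is backed by the instance's unapply method, which returns an Option.

```scala
import scala.reflect.TypeTest

// Hypothetical helper: performs the same check as the as extension,
// but calls the TypeTest instance explicitly instead of pattern matching.
def narrow[A](value: Matchable)(using tt: TypeTest[Matchable, A]): Option[A] =
  tt.unapply(value) // wraps the value in a Some when the test succeeds

val asString: Option[String] = narrow[String]("hello") // Some("hello")
val notString: Option[String] = narrow[String](5)      // None
```

Every successful test allocates a Some, which is the overhead a direct isInstanceOf check would avoid.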
At least, all of this works in theory. In reality, it's best to unit test these things to make sure there's no deficiency in our code that the compiler did not catch. I've written the following unit test to check the as method on CVal:
class CValDemonstration
    extends munit.FunSuite:
  test("demo 1") {
    val cval = CVal(5)
    cval.as[5]
    assertEquals(
      cval.as[Int],
      Some(5)
    )
  }
There are no warnings or errors when compiling this code, and when I run the test, Some(5) and cval.as[Int] are recognized as equal to each other. So far so good.
CIntegral
The next type we'll define is CIntegral. We'll make the platform dependent types that are known to be integral on all platforms into subtypes of this type, and it will be home to common methods and logic for integral platform dependent types.
opaque type CIntegral <: CVal =
  CVal
object CIntegral:
  def apply(a: Long): CIntegral =
    CVal(a)
  extension (
      cintegral: CIntegral
  )
    def asLong =
      cintegral.as[Long].get
One thing you might notice is <: CVal = CVal in our opaque type definition. This means that CIntegral erases to a CVal at runtime, but is considered a subtype of CVal at compile time. It also means that unlike CVal, which has no relation to its runtime type of Matchable except in its defining scope, CIntegral has a subtyping relationship with CVal and can be used wherever a CVal is expected.
This subtyping relationship is more for the sharing of method definitions between the two than for the ability to use CVal as the Any of platform dependent types. This is particularly important to keep in mind because it's very difficult if not impossible to retrieve a subtype of an opaque type after you've cast that type information away. Unlike normal JVM types that carry type information around with them at runtime, opaque types cease to exist after compile time, so if their type is not based on some runtime data, you cannot retrieve the child type of an opaque type.
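The one-way nature of this relationship can be sketched in isolation. The names below (Lengths, Length, Meters, raw) are made up for the example; the point is only that an opaque subtype can be passed where its parent is expected, while nothing at runtime records which opaque type a value started out as.

```scala
object Lengths:
  opaque type Length = Double
  // Meters is visible to outside code as a subtype of Length.
  opaque type Meters <: Length = Length

  object Length:
    def apply(d: Double): Length = d

  object Meters:
    def apply(d: Double): Meters = d

  // Defined once on the parent, usable on every opaque subtype.
  extension (l: Length)
    def raw: Double = l

object LengthsDemo:
  import Lengths.*

  val m: Meters = Meters(2.5)
  val widened: Length = m       // fine: Meters <: Length
  val viaParent: Double = m.raw // the parent's extension applies to the subtype

  // Going back is not possible: both are plain Doubles at runtime,
  // so there is nothing to test against.
  // val again: Meters = widened // does not compile
```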
Anyway, this definition of CIntegral allows Long values to be redefined as CIntegrals via the apply method, and allows us to convert any CIntegral into a Long via the asLong extension method. By making the sole "constructor" of CIntegral types only accept a Long, we effectively make CIntegral backed by Long values. Consequently, that makes the assumption that as[Long] will return Some practically certain. This is more of a de facto type safety than the de jure type safety that we'd want; if the definition of our patterns changes then our code can become unsafe without the compiler warning us.
We can try to come up with something safer as our experiments continue, should they show promise.
Third try at CLong
Now that we have CIntegral defined, let's take a stab at implementing CLong. We'll define it as an opaque subtype of CIntegral.
opaque type CLong <: CIntegral = CIntegral
object CLong:
  def apply(i: Int): CLong =
    CIntegral(i.toLong)
In the above snippet I've declared an apply method for CLong which accepts Int values. The reason for this is that long in C is defined to be at least 32 bits wide in the C specification. Since that's the minimum the specification allows, any Int should fit inside a signed long on any platform. Non-conformant implementations of C exist, but we can't program our way around that madness.
Int isn't the only integral we'd likely like to be able to instantiate a CLong from. An apply method for Long would be great if that's safe for the platform we're developing on. However, first we need a concept of platforms for our constructor to reason about:
enum Platform:
  case WinX64
  case MacOSX64
  case LinuxX64
This time, the Platform type can be defined as an enum, since we don't do anything fancy with it. Now, our apply method will take a platform as context, and return a defined Option if the platform supports 64-bit longs:
opaque type CLong <: CIntegral = CIntegral
object CLong:
  def apply(i: Int): CLong =
    CIntegral(i.toLong)
  def apply(l: Long)(using
      p: Platform
  ): Option[CLong] = p match
    case Platform.LinuxX64 |
        Platform.MacOSX64 =>
      Some(CIntegral(l))
    case _ => None
This new apply method for creating CLong from Long types works as defined, but returning an Option can be unsatisfying. This is boxing, and it only exists because the compiler doesn't know what platform we're dealing with. It would be nice to allow users to avoid that boxing if they go through the effort of proving to the compiler what the current platform is. We can try to define methods that return a bare CLong if a specific Platform is present in the call's context:
object CLong:
  def apply(i: Int): CLong =
    CIntegral(i.toLong)
  def apply(l: Long)(using
      p: Platform
  ): Option[CLong] = p match
    case Platform.LinuxX64 |
        Platform.MacOSX64 =>
      Some(CIntegral(l))
    case _ => None
  def apply(l: Long)(using
      p: Platform.LinuxX64.type
  ): CLong = CIntegral(l)
  def apply(l: Long)(using
      p: Platform.MacOSX64.type
  ): CLong = CIntegral(l)
Sadly, this code doesn't compile as written. The compiler complains that the final apply is a double definition: singleton types like Platform.LinuxX64.type don't exist on the JVM at runtime, so after erasure the last two apply overloads have identical signatures and there would be no way to choose between them. However, we can get around this with a new annotation present in Scala 3 called @targetName. @targetName gives the annotated method a different name in the bytecode, meaning the Scala compiler, with its total knowledge of the types in question, can instruct the JVM on which apply method to use:
import scala.annotation.targetName
opaque type CLong <: CIntegral =
  CIntegral
object CLong:
  def apply(i: Int): CLong =
    CIntegral(i.toLong)
  def apply(l: Long)(using
      p: Platform
  ): Option[CLong] = p match
    case Platform.LinuxX64 |
        Platform.MacOSX64 =>
      Some(CIntegral(l))
    case _ => None
  def apply(l: Long)(using
      p: Platform.LinuxX64.type
  ): CLong = CIntegral(l)
  @targetName(
    "certainLongMacOSX"
  )
  def apply(l: Long)(using
      p: Platform.MacOSX64.type
  ): CLong = CIntegral(l)
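The same mechanism can be seen in a smaller, standalone setting. In this sketch (Sums and total are hypothetical names), both overloads take a List after erasure, so without the annotation the compiler would report a double definition; with it, each overload gets its own bytecode name while Scala callers keep using total.

```scala
import scala.annotation.targetName

object Sums:
  def total(xs: List[Int]): Int = xs.sum

  // Erases to the same signature as the overload above,
  // so it needs a distinct name in the bytecode.
  @targetName("totalOfLengths")
  def total(xs: List[String]): Int = xs.map(_.length).sum

val ints: Int = Sums.total(List(1, 2, 3))      // 6
val lengths: Int = Sums.total(List("a", "bc")) // 3
```

Overload resolution still happens in the Scala compiler, based on the static argument types; the annotation only keeps the two methods from colliding in the class file.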
In the CLong companion, @targetName gives the fourth apply method the name "certainLongMacOSX" in the bytecode emitted by the Scala compiler. Thanks to that, our definition works now. Let's test our CLong instantiation with the following unit test:
class CLongDemonstration
    extends munit.FunSuite:
  given Platform =
    Platform.LinuxX64
  test("demo 1") {
    val clong1 = CLong(5)
    val clong2 = CLong(5L)
  }
The definition of clong2 here doesn't compile because the compiler complains of an ambiguous overload. This is a tough problem to deal with, so we won't address it right now. Instead we'll sidestep it by renaming our platform specific applies to certain.
  def certain(l: Long)(using
      Platform.LinuxX64.type
  ): CLong = CIntegral(l)
  @targetName(
    "certainLongMacOSX"
  )
  def certain(l: Long)(using
      Platform.MacOSX64.type
  ): CLong = CIntegral(l)
  test("demo 1") {
    val clong1 = CLong(5)
    val clong2 = CLong(5L)
    assertEquals(
      clong1.as[Long],
      clong2.flatMap(_.as[Long])
    )
  }
  test("demo 2") {
    val clong =
      summon[Platform] match
        case Platform.LinuxX64 =>
          CLong.certain(5L)(using
            Platform.LinuxX64
          )
        case Platform.MacOSX64 =>
          CLong.certain(5L)(using
            Platform.MacOSX64
          )
        case _ => CLong(5)
    assertEquals(
      clong.as[Long],
      Some(5L)
    )
  }
With the fixed method name, we can complete the unit test implementations and run them. Doing so has the assertions come back true, but the match expression and the usage of certain are ugly. However, we can fix those problems by using two new features of Scala 3 - pattern bound given instances and union types:
  test("demo 2") {
    val clong =
      summon[Platform] match
        case given (Platform.LinuxX64.type |
              Platform.MacOSX64.type) =>
          CLong.certain(5L)
        case _ => CLong(5)
    assertEquals(
      clong.as[Long],
      Some(5L)
    )
  }
This new version of "demo 2" is cleaner looking, but now we get a compiler error at the invocation of certain. The compiler is complaining once again that the invocation is ambiguous, because both methods could serve the union given we created. However, that's a hint that certain shouldn't have two definitions in the first place. certain should be available for use in the case that the platform is known to be either LinuxX64 or MacOSX64, and that's what the union Platform.LinuxX64.type | Platform.MacOSX64.type indicates. So let's try to collapse the two definitions into one.
  def certain(l: Long)(using
      Platform.LinuxX64.type |
        Platform.MacOSX64.type
  ): CLong = CIntegral(l)
The cleaner "demo 2" code now compiles and all the assertions pass. However, has our collapse of the two certain methods into one cost us our ability to use certain with a given of the singleton type Platform.LinuxX64.type? Let's write a third demo:
  test("demo 3") {
    val clong =
      summon[Platform] match
        case given Platform.LinuxX64.type =>
          CLong.certain(5L)
        case _ => CLong(5)
    assertEquals(
      clong.as[Long],
      Some(5L)
    )
  }
certain can still be invoked in the case where we only have LinuxX64 or MacOSX64 in context! Things are really starting to look up!
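Stripped of the CLong details, the combination of a union-typed context parameter and pattern-bound givens looks like the sketch below. The longBits, bitsOn and bitsOnLinuxOnly functions are hypothetical and exist only to show the shape; the Platform enum is repeated so that the snippet stands alone.

```scala
enum Platform:
  case WinX64
  case MacOSX64
  case LinuxX64

// Only callable when the context proves we are on Linux or macOS.
def longBits(using
    Platform.LinuxX64.type | Platform.MacOSX64.type
): Int = 64

def bitsOn(p: Platform): Int = p match
  // Binds a given of the union type for this branch only.
  case given (Platform.LinuxX64.type | Platform.MacOSX64.type) =>
    longBits
  case _ => 32

def bitsOnLinuxOnly(p: Platform): Int = p match
  // A given of the narrower singleton type also satisfies the union.
  case given Platform.LinuxX64.type => longBits
  case _ => 32
```

Because a singleton type is a subtype of any union that contains it, one method with a union-typed context parameter covers both the broad and the narrow proof.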
Trouble in paradise
Sadly, there is currently an insufficiency in our code. When describing a C function to java.lang.foreign, one gives it a set of type descriptions corresponding to Byte, Float, Long, Int, Short, Double, and some "foreign" specific types. The MethodHandle generated from that description expects those types to be passed into it. So on Windows 64-bit, if a C function requires a long, then we describe it as taking Int and we must pass in an Int. Therefore we have two choices:
- We can stick with an encoding similar to the one we have today, where integral types are always backed by a Long and we convert them into/out of the type needed when invoking a C function binding
- We just store the type needed inside CLong.
Both approaches have their upsides and downsides. An upside of approach 1 is we always know what type we're dealing with when writing code internal to these integral types. A downside is that we must know what type to convert it to or from without having examples when we need them.
Approach 2 will have us storing an Int in CIntegral types when the platform demands a 32-bit integral, which has the upside that we don't need to convert the data when passing into method handles and when extracting data, but we have to make sure that the data we're passing in is aligned with the platform definitions or type misalignment can happen.
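As a rough sketch of what that alignment means in practice, the descriptor for a C function that takes and returns a long (labs is used here purely as an example) has to be built from different layouts depending on the platform. This assumes JDK 22 or later, where java.lang.foreign is a final API; the Platform enum is repeated so the snippet stands alone, and the labsDescriptor helper is a hypothetical name.

```scala
import java.lang.foreign.{FunctionDescriptor, ValueLayout}

enum Platform:
  case WinX64
  case MacOSX64
  case LinuxX64

// Hypothetical helper: describes "long labs(long)" for the given platform.
def labsDescriptor(p: Platform): FunctionDescriptor = p match
  case Platform.WinX64 =>
    // C's long is 32 bits wide on 64-bit Windows
    FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT)
  case Platform.MacOSX64 | Platform.LinuxX64 =>
    // C's long is 64 bits wide on 64-bit Linux and macOS
    FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.JAVA_LONG)
```

A MethodHandle built from the Windows descriptor expects an Int argument, so a value stored as a Long would have to be converted before the call.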
There are also differences in how math is done on all the different integral types as a side-effect of their boundaries, meaning that to get equivalent math to an Int using Long as a backing for CIntegral would mean converting to an Int and back for some math operations like ordering and division.
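A quick sketch of that difference: the same addition overflows as an Int but not as a Long, and once the results diverge, so do division and comparison. The function names are illustrative only.

```scala
def addAsInt(a: Int, b: Int): Int = a + b
def addAsLong(a: Int, b: Int): Long = a.toLong + b.toLong

val wrapped: Int = addAsInt(Int.MaxValue, 1) // overflows to Int.MinValue
val wide: Long = addAsLong(Int.MaxValue, 1)  // 2147483648

// The divergence carries through later operations.
val halfWrapped: Int = wrapped / 2 // -1073741824
val halfWide: Long = wide / 2      // 1073741824
val sameSign: Boolean = (wrapped > 0) == (wide > 0) // false

// Narrowing the Long result back to Int restores the 32-bit behaviour.
val narrowed: Int = wide.toInt // Int.MinValue
```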
For now, let's pursue approach 2, as it seems to be the simpler of the two. Let's start by modifying CIntegral and CLong to allow them to store any integral type:
opaque type CIntegral <: CVal =
  CVal
object CIntegral:
  def apply(a: AnyVal): CIntegral =
    CVal(a)
  extension (
      cintegral: CIntegral
  )
    def asLong =
      cintegral.as[Long].get
opaque type CLong <: CIntegral =
  CIntegral
object CLong:
  def apply(i: Int)(using
      p: Platform
  ): CLong =
    p match
      case Platform.WinX64 =>
        CIntegral(i)
      case Platform.MacOSX64 |
          Platform.LinuxX64 =>
        CIntegral(i)
  def apply(l: Long)(using
      p: Platform
  ): Option[CLong] = p match
    case Platform.LinuxX64 |
        Platform.MacOSX64 =>
      Some(CIntegral(l))
    case _ => None
  def certain(l: Long)(using
      Platform.LinuxX64.type |
        Platform.MacOSX64.type
  ): CLong = CIntegral(l)
If we run this code through the demonstration unit tests from before, the assertions will fail. That's because there are bugs in the implementation:
- CLong's apply method for Int has the Linux and Mac branch failing to convert the Int to Long, an example of type misalignment
- CIntegral's asLong method assumes that CIntegral is a Long in all cases, which is no longer true
Type classes to the rescue
Type classes can be used for many things in Scala, but one of the more interesting uses for them is as a vehicle for type calculation. Take for example the following type class:
trait MyCalc[A]:
  type B
object MyCalc:
  given MyCalc[Int] with
    type B = Float
  given MyCalc[Float] with
    type B = String
This type class and its instances are a mapping between type A and type B. Using this type class, it's possible to write a function that demands a type B based on what type A is. Let's take a look:
class MyCalcDemonstration
    extends munit.FunSuite:
  def myFun[A](using
      mc: MyCalc[A]
  )(b: mc.B): mc.B = b
  test("compiles") {
    assertNoDiff(
      compileErrors(
        "val f: Float = myFun[Int](5f)"
      ),
      ""
    )
  }
This code compiles because a MyCalc is looked up based upon the type A passed in, and from that MyCalc instance, the input type and result type are determined. In this case, since A was Int, B is Float.
We can extend this approach to create a type class that maps from a Platform and a type like CLong to the underlying type for that platform:
trait TypeRelation[
    P <: Platform,
    A
]:
  type Real <: AnyVal
This type class, TypeRelation, has two type inputs, and an inner type called Real that's a subtype of AnyVal (and therefore of Matchable). P is the corresponding Platform for the mapping, A is the opaque type, and Real is the type that will back the opaque type at runtime. Let's try to put it into usage:
opaque type CLong <: CIntegral =
  CIntegral
object CLong:
  def apply(i: Int)(using
      p: Platform
  ): CLong =
    p match
      case Platform.WinX64 =>
        CIntegral(i)
      case Platform.MacOSX64 |
          Platform.LinuxX64 =>
        CIntegral(i)
  def apply(l: Long)(using
      p: Platform
  ): Option[CLong] = p match
    case Platform.LinuxX64 |
        Platform.MacOSX64 =>
      Some(CIntegral(l))
    case _ => None
  def certain[P <: Platform](
      using
      p: P,
      tr: TypeRelation[P, CLong]
  )(l: tr.Real): CLong =
    CIntegral(l)
  given TypeRelation[
    Platform.LinuxX64.type |
      Platform.MacOSX64.type,
    CLong
  ] with
    type Real = Long
  given TypeRelation[
    Platform.WinX64.type,
    CLong
  ] with
    type Real = Int
TypeRelation here provides the mapping needed when defined as part of CLong's companion. Of note is the redefinition of certain here. First, we move the context parameter section in front of the standard inputs. This is an option that's newly available in Scala 3, and it lets us control what inputs are allowed by first checking the context of the certain invocation. This change empowers certain to only take the integral type that matches CLong on the current platform. As a nice bonus, it also makes certain work on the Windows platform. However, this still hasn't saved us from the bug in apply. Let's see if we can modify CIntegral's apply to use TypeRelation.
opaque type CIntegral <: CVal =
  CVal
object CIntegral:
  def apply[P <: Platform](using
      p: P,
      tr: TypeRelation[P, ?]
  )(a: tr.Real): CIntegral =
    CVal(a)
  extension (
      cintegral: CIntegral
  )
    def asLong =
      cintegral.as[Long].get
opaque type CLong <: CIntegral =
  CIntegral
object CLong:
  def apply(i: Int)(using
      p: Platform
  ): CLong = p match
    case given (Platform.MacOSX64.type |
          Platform.LinuxX64.type) =>
      CIntegral(i)
    case given Platform.WinX64.type =>
      CIntegral(i)
  def apply(l: Long)(using
      p: Platform
  ): Option[CLong] = p match
    case given (Platform.LinuxX64.type |
          Platform.MacOSX64.type) =>
      Some(CIntegral(l))
    case _ => None
  def certain[P <: Platform](
      using
      p: P,
      tr: TypeRelation[P, CLong]
  )(l: tr.Real): CLong =
    CIntegral(l)
  given TypeRelation[
    Platform.LinuxX64.type |
      Platform.MacOSX64.type,
    CLong
  ] with
    type Real = Long
  given TypeRelation[
    Platform.WinX64.type,
    CLong
  ] with
    type Real = Int
Since we have no information about the opaque type being dealt with within CIntegral, we have to use ? as the platform dependent type in its apply method. However, this works fine for our purposes because the requisite TypeRelation definitions are in scope where we're invoking CIntegral's apply. With this definition, as long as a specific Platform context is available, CIntegral knows exactly what type is needed, and only allows that type. In the case where we had a bug before, Int is being promoted to a Long automatically, and we can confirm that via the unit tests all passing once again.
We'll explore further, trying to add some basic functionality to CIntegral and CVal, but that will have to wait until the next blogpost...
The code for this blogpost can be found under the opaque-types-1 folder at this GitHub repository.
'Til next time, and happy Scala hacking!